
# Python Workshop Exercises

This notebook provides self-guided exercises to practice and reinforce the concepts from the workshop. Each exercise includes the class or function it builds on, so you do not need to refer back to the original workshop material.


### 1. Class Creation and Attributes

Understand class attributes and methods.

- **Exercise:** Create an `RNASequence` class that inherits from `DNASequence`. Add methods specific to RNA, such as `reverse_transcribe()`, which returns the DNA sequence corresponding to the RNA.
- **Challenge:** Implement an error-handling mechanism to ensure only RNA-compatible nucleotides are allowed.

**Provided Class Illustration:**

```python
class DNASequence:
    def __init__(self, sequence):
        self.sequence = sequence

    def transcribe(self):
        # Converts DNA to RNA sequence
        return self.sequence.replace('T', 'U')
```
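
As a hint for the challenge, here is a minimal sketch of input validation in a constructor. It is shown for DNA rather than RNA so the exercise stays open. The `ValidatedDNASequence` class and its `ALLOWED` alphabet are hypothetical names introduced for illustration; they are not part of the workshop code.

```python
class DNASequence:
    """Copy of the provided class so this sketch runs on its own."""

    def __init__(self, sequence):
        self.sequence = sequence

    def transcribe(self):
        # Converts DNA to RNA sequence
        return self.sequence.replace('T', 'U')


class ValidatedDNASequence(DNASequence):
    """Hypothetical subclass that rejects characters outside the DNA alphabet."""

    ALLOWED = frozenset("ACGT")  # assumed alphabet; extend it if 'N' should be legal

    def __init__(self, sequence):
        sequence = sequence.upper()
        invalid = set(sequence) - self.ALLOWED
        if invalid:
            raise ValueError(f"Invalid nucleotide(s): {', '.join(sorted(invalid))}")
        super().__init__(sequence)


dna = ValidatedDNASequence("atgc")
print(dna.transcribe())  # AUGC

try:
    ValidatedDNASequence("AUGC")
except ValueError as error:
    print(error)  # Invalid nucleotide(s): U
```

Validating in `__init__` means an invalid object can never exist, so the other methods do not need to re-check the sequence.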


In [None]:
# Your code here


### 2. Inheritance and Overriding Methods

Practice subclassing and method overriding.

- **Exercise:** Create a subclass `ProteinGene` that inherits from `ExpressionGene`. Add an attribute `protein_length` and override `__str__()` to include protein length in the output.
- **Challenge:** Write a `calculate_molecular_weight()` method using average amino acid weights.

**Provided Class Illustration:**

```python
class ExpressionGene:
    def __init__(self, gene_id, sequence, expression_level):
        self.gene_id = gene_id
        self.sequence = sequence
        self.expression_level = expression_level

    def __str__(self):
        return f"Gene {self.gene_id} with expression level {self.expression_level}"
```
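
The subclassing pattern the exercise relies on can be sketched as follows: call `super().__init__()` to reuse the parent's setup, and call `super().__str__()` to extend the parent's text instead of rewriting it. The `AnnotatedGene` class and its `chromosome` attribute are hypothetical stand-ins, chosen so that `ProteinGene` is left for you to write.

```python
class ExpressionGene:
    """Copy of the provided class so this sketch runs on its own."""

    def __init__(self, gene_id, sequence, expression_level):
        self.gene_id = gene_id
        self.sequence = sequence
        self.expression_level = expression_level

    def __str__(self):
        return f"Gene {self.gene_id} with expression level {self.expression_level}"


class AnnotatedGene(ExpressionGene):
    """Hypothetical subclass that adds a chromosome label."""

    def __init__(self, gene_id, sequence, expression_level, chromosome):
        # Let the parent class store the attributes it already knows about.
        super().__init__(gene_id, sequence, expression_level)
        self.chromosome = chromosome

    def __str__(self):
        # Extend the parent's string rather than duplicating it.
        return f"{super().__str__()} on chromosome {self.chromosome}"


gene = AnnotatedGene("BRCA1", "ATGGCC", 7.5, "17")
print(gene)  # Gene BRCA1 with expression level 7.5 on chromosome 17
```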


In [None]:
# Your code here


### 3. Working with Encapsulation

Learn about public, protected, and private attributes.

- **Exercise:** In the `DNASequence` class, add a protected attribute `_sequence_quality` and a private attribute `__accession_number`. Write getter and setter methods for both attributes.
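- **Challenge:** Try to access the private attribute directly from outside the class and observe the resulting `AttributeError` (Python renames double-underscore attributes internally, a mechanism called name mangling). Then modify the code to use the getter method instead.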

**Provided Class Illustration:**

```python
class DNASequence:
    def __init__(self, sequence):
        self.sequence = sequence
        self._sequence_quality = None  # Protected attribute
        self.__accession_number = "ABC123"  # Private attribute
```
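
The sketch below shows what name mangling looks like in practice. The `get_accession_number()` method is a hypothetical name used for illustration; setters and the handling of `_sequence_quality` are left for the exercise.

```python
class DNASequence:
    def __init__(self, sequence):
        self.sequence = sequence
        self._sequence_quality = None  # Protected by convention only
        self.__accession_number = "ABC123"  # Stored as _DNASequence__accession_number

    def get_accession_number(self):
        # Inside the class body, the double-underscore name resolves normally.
        return self.__accession_number


dna = DNASequence("ATCG")

try:
    dna.__accession_number  # No attribute with this exact name exists on the object
except AttributeError as error:
    print("Direct access failed:", error)

print(dna.get_accession_number())  # ABC123

# The mangled name is still reachable, which shows that "private" is a
# convention rather than enforced access control. Avoid this in real code.
print(dna._DNASequence__accession_number)  # ABC123
```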


In [None]:
# Your code here


### 4. Sequence Manipulation Using Biopython

Utilize Biopython for DNA sequence analysis.

- **Exercise:** Write a function using Biopython's `Seq` object that accepts a DNA sequence, transcribes it to mRNA, and then translates it to a protein.
- **Challenge:** Write a function that reads a FASTA file, extracts sequences, and calculates GC content for each.

**Provided Biopython Example:**

```python
from Bio.Seq import Seq

# Example of creating and manipulating a Seq object
seq = Seq("ATGGCC")
print("Transcription:", seq.transcribe())  # DNA -> mRNA: AUGGCC
print("Translation:", seq.translate())     # DNA -> protein (standard table): MA
```


In [None]:
# Your code here


### 5. Modular Code and Imports

Organize code using modules.

- **Exercise:** Split the class definitions and methods for DNA, RNA, and protein sequences into separate Python modules. Import these into a main script to create objects and test functionalities.
- **Challenge:** Add unit tests for each module.

**Illustration of Module Structure:**

1. **dna_sequence.py**
```python
class DNASequence:
    def __init__(self, sequence):
        self.sequence = sequence
```
2. **main.py**
```python
from dna_sequence import DNASequence

dna = DNASequence("ATCG")
print(dna.sequence)
```
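
For the challenge, a minimal sketch of a unit test using the standard library's `unittest` module is shown here. In a real project the test would live in its own file (for example a hypothetical `test_dna_sequence.py`) and import the class from `dna_sequence`; the class is redefined inline only so the sketch is self-contained.

```python
import unittest


class DNASequence:
    """Stands in for `from dna_sequence import DNASequence`."""

    def __init__(self, sequence):
        self.sequence = sequence


class TestDNASequence(unittest.TestCase):
    def test_sequence_is_stored(self):
        dna = DNASequence("ATCG")
        self.assertEqual(dna.sequence, "ATCG")

    def test_instances_are_independent(self):
        first = DNASequence("ATCG")
        second = DNASequence("GGCC")
        self.assertNotEqual(first.sequence, second.sequence)


# Load and run the tests explicitly; this form also works inside a notebook cell.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestDNASequence)
result = unittest.TextTestRunner(verbosity=2).run(suite)
print("All tests passed:", result.wasSuccessful())
```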


In [None]:
# Your code here


### 6. Nucleotide Count and GC Content

Reinforce counting and analyzing sequences.

- **Exercise:** Add a `count_nucleotides()` method to `DNASequence` that uses Python’s `collections.Counter` to tally nucleotide occurrences. Then add a `calculate_gc_content()` method that uses `gc_fraction` from Biopython’s `Bio.SeqUtils` module.
- **Challenge:** Modify the GC content function to handle cases where sequences contain ambiguous nucleotides (e.g., 'N').

**Provided Example for Counting Nucleotides:**

```python
from collections import Counter

# Example of counting nucleotides in a sequence
sequence = "ATCGGCTA"
counts = Counter(sequence)
print(counts)
```
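
One possible approach to the challenge, sketched in plain Python: count only the unambiguous bases and divide by their total, so that 'N' affects neither the numerator nor the denominator. The function name `gc_content_ignoring_ambiguous` is hypothetical, and excluding ambiguous bases is only one of several reasonable conventions.

```python
from collections import Counter


def gc_content_ignoring_ambiguous(sequence):
    """Return the GC fraction computed over A, C, G and T only."""
    counts = Counter(sequence.upper())
    gc = counts["G"] + counts["C"]  # Counter returns 0 for missing keys
    unambiguous = gc + counts["A"] + counts["T"]
    if unambiguous == 0:
        return 0.0  # Avoid division by zero for empty or all-N input
    return gc / unambiguous


print(gc_content_ignoring_ambiguous("ATCGGCTA"))  # 0.5
print(gc_content_ignoring_ambiguous("GCNNGA"))    # 0.75
print(gc_content_ignoring_ambiguous("NNNN"))      # 0.0
```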


In [None]:
# Your code here


### 7. Advanced Sequence Translation

Understand translation and stop codon handling.

- **Exercise:** Implement a method that takes a nucleotide sequence and outputs the translated protein sequence, stopping at the first stop codon.
- **Challenge:** Modify the method to handle non-standard start codons and alternative translation tables.

**Biopython Translation Example:**

```python
from Bio.Seq import Seq

# Translating a DNA sequence into a protein
seq = Seq("ATGGCC")
protein = seq.translate()
print(protein)
```
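
With Biopython, passing `to_stop=True` to `translate()` ends the protein at the first in-frame stop codon. To make the underlying logic visible, the sketch below does the same thing in plain Python using the standard genetic code. The names `CODON_TABLE` and `translate_to_stop` are hypothetical, and the function assumes the sequence contains only A, C, G, T (or U); anything else raises a `KeyError`.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_to_stop(sequence):
    """Translate from the first base, stopping at the first stop codon."""
    sequence = sequence.upper().replace("U", "T")
    protein = []
    # Step through complete codons only; trailing bases are ignored.
    for start in range(0, len(sequence) - 2, 3):
        amino_acid = CODON_TABLE[sequence[start:start + 3]]
        if amino_acid == "*":  # '*' marks a stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate_to_stop("ATGGCCTAAGGG"))  # MA
```

Swapping in a different `AMINO_ACIDS` string is one way to experiment with alternative translation tables for the challenge.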


In [None]:
# Your code here


### 8. Creating and Manipulating `SeqRecord` Objects

Work with metadata in sequences.

- **Exercise:** Create a `SeqRecord` for a gene with attributes for gene ID, sequence, and organism. Print out a well-formatted report of the record.
- **Challenge:** Write a function that reads multiple `SeqRecord` objects from a GenBank file and prints a summary.

**Provided Example with `SeqRecord`:**

```python
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

# Creating a SeqRecord object
record = SeqRecord(Seq("ATGGCC"), id="GeneID", description="Example Gene")
print(record)
```


In [None]:
# Your code here


### 9. GC Content and Codon Usage Analysis

Apply sequence analysis skills.

- **Exercise:** Write a `GeneAnalysis` class with methods to calculate GC content, nucleotide frequencies, and codon usage from a `DNASequence`.
- **Challenge:** Compare codon usage between two different gene sequences.

**Provided GC Content Example with Biopython:**

```python
from Bio.SeqUtils import gc_fraction

# Calculate GC content (gc_fraction is available in Biopython 1.80 and later)
sequence = "ATGGCC"
gc_content = gc_fraction(sequence)  # A fraction between 0 and 1, here 4/6
print("GC Content:", gc_content)
```
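
Codon usage can be tallied with the same `Counter` approach used for nucleotides. The following is a minimal sketch, assuming the sequence is already in the correct reading frame; the function names `codon_usage` and `relative_usage` are hypothetical.

```python
from collections import Counter


def codon_usage(sequence):
    """Count each complete codon, reading from the first base."""
    sequence = sequence.upper()
    codons = [sequence[i:i + 3] for i in range(0, len(sequence) - 2, 3)]
    return Counter(codons)


def relative_usage(counts):
    """Convert raw codon counts into fractions of the total."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: n / total for codon, n in counts.items()}


usage = codon_usage("ATGGCCGCCTAA")
print(usage)                         # Counter({'GCC': 2, 'ATG': 1, 'TAA': 1})
print(relative_usage(usage)["GCC"])  # 0.5
```

Relative frequencies are easier to compare between genes of different lengths than raw counts, which is useful for the challenge.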


In [None]:
# Your code here


### 10. Data Input/Output with Biopython

Practice sequence file handling.

- **Exercise:** Write a script to read a FASTA file, calculate nucleotide counts for each sequence, and output results to a CSV file.
- **Challenge:** Implement the script to work with multiple file formats (e.g., GenBank) using Biopython’s input/output functionalities.

**Provided Example for Reading FASTA with Biopython:**

```python
from Bio import SeqIO

# Reading a FASTA file
for record in SeqIO.parse("example.fasta", "fasta"):
    print(record.id, record.seq)
```
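
The CSV output step can be sketched with the standard library's `csv` module. To keep the example self-contained, the hypothetical `records` list of `(id, sequence)` pairs stands in for the records that `SeqIO.parse` would yield, and an in-memory buffer stands in for the output file.

```python
import csv
import io
from collections import Counter

# Hypothetical (id, sequence) pairs standing in for parsed FASTA records.
records = [("seq1", "ATGGCC"), ("seq2", "ATCGGCTA")]


def write_nucleotide_counts(records, handle):
    """Write one CSV row per sequence with its A, C, G and T counts."""
    writer = csv.DictWriter(handle, fieldnames=["id", "A", "C", "G", "T"])
    writer.writeheader()
    for record_id, sequence in records:
        counts = Counter(sequence.upper())
        row = {"id": record_id}
        row.update({base: counts[base] for base in "ACGT"})
        writer.writerow(row)


buffer = io.StringIO()
write_nucleotide_counts(records, buffer)
print(buffer.getvalue())
```

To write to disk instead, pass a file opened with `open("counts.csv", "w", newline="")`; the file name here is only an example.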


In [None]:
# Your code here
