# Week 1: Python Recap, Git/GitHub, and Introduction to OOP

## Learning Objectives
By the end of this week, you should be able to:
- Write Python functions and use common data structures
- Use basic Git commands for version control
- Understand the fundamentals of Object-Oriented Programming
- Create simple classes with attributes and methods

---

## Part 1: Python Recap

### 1.1 Functions Review

#### What is a function?
A **function** is a reusable block of code that performs a specific task. Think of it like a recipe: you give it ingredients (inputs), it follows steps (the code inside), and produces a dish (output).

#### Key components of a function:
1. **Function definition**: starts with `def` followed by the function name
2. **Parameters**: inputs the function needs (in parentheses)
3. **Docstring**: a description of what the function does (in triple quotes)
4. **Function body**: the code that does the work (indented)
5. **Return statement**: what the function gives back as output

Let's look at an example:

In [None]:
# Example: A simple function
def calculate_bmi(weight_kg, height_m):
    """
    Calculate Body Mass Index.
    
    Parameters:
    weight_kg (float): Weight in kilograms
    height_m (float): Height in meters
    
    Returns:
    float: BMI value
    """
    # ** means "to the power of" (so height_m ** 2 means height squared)
    bmi = weight_kg / (height_m ** 2)
    return bmi  # The 'return' statement sends the result back to whoever called the function

# Test it - we "call" the function by using its name with arguments
result = calculate_bmi(70, 1.75)
print(result)
# When we call this function, 70 becomes weight_kg and 1.75 becomes height_m inside the function

#### Understanding the flow:
1. We **define** the function (the `def` block) - this creates the recipe
2. We **call** the function (`calculate_bmi(70, 1.75)`) - this actually uses the recipe
3. Python substitutes our values (70 and 1.75) into the parameters
4. The function executes and returns a value
5. We store that returned value in `result` and print it
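
To make step 3 more concrete, here is a small sketch using a hypothetical helper, `celsius_to_kelvin` (not part of the course code). It suggests how arguments can be matched to parameters either by position or by name:

```python
def celsius_to_kelvin(temp_celsius):
    """Convert a temperature from Celsius to Kelvin (hypothetical helper)."""
    return temp_celsius + 273.15

# Positional argument: 25 becomes temp_celsius inside the function
room_temp = celsius_to_kelvin(25)

# Keyword argument: naming the parameter makes the match explicit
body_temp = celsius_to_kelvin(temp_celsius=37)

print(f"{room_temp:.2f} K")   # 298.15 K
print(f"{body_temp:.2f} K")   # 310.15 K
```

Both calls run the same function body; only the way the value reaches `temp_celsius` differs.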

**Exercise 1.1:** Write a function `calculate_gc_content(dna_sequence)` that calculates the GC content (percentage of G and C nucleotides) in a DNA sequence string.

**Hints:**
- You can count characters in a string using the `.count()` method: `"ATGC".count("G")` returns 1
- To get the length of a string, use `len()`: `len("ATGC")` returns 4
- GC content formula: (count of G + count of C) / total length × 100
- Remember to return the result!

In [None]:
def calculate_gc_content(dna_sequence):
    """
    Calculate the GC content of a DNA sequence.
    
    Parameters:
    dna_sequence (str): DNA sequence containing A, T, G, C
    
    Returns:
    float: GC content as a percentage
    """
    # Step 1: Count how many G's are in the sequence
    # Step 2: Count how many C's are in the sequence
    # Step 3: Add the two counts together
    # Step 4: Divide by the total length of the sequence
    # Step 5: Multiply by 100 to get a percentage
    # Step 6: Return the result
    
    # Your code here
    pass

# Test your function
test_seq = "ATGCGATCGATCG"
print(f"GC content: {calculate_gc_content(test_seq):.2f}%")
# Expected output: around 53.85%

### 1.2 Data Structures Review

#### Python's main data structures:

**1. Lists** - ordered collections that can be modified
```python
my_list = [1, 2, 3, 4]
my_list[0]  # Access first item (indexing starts at 0!)
my_list.append(5)  # Add item to end
```

**2. Dictionaries** - key-value pairs for fast lookups
```python
my_dict = {"gene1": 100, "gene2": 200}
my_dict["gene1"]  # Access value using key
my_dict["gene3"] = 150  # Add new key-value pair
```

**3. Sets** - unordered collections of unique items
```python
my_set = {1, 2, 3, 3}  # Duplicates automatically removed, becomes {1, 2, 3}
```
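
Sets are handy for removing duplicates and for checking membership. A minimal sketch, assuming a hypothetical list of gene hits that contains repeats:

```python
# Hypothetical list of gene hits with repeated entries
gene_hits = ["TP53", "BRCA1", "TP53", "MYC", "BRCA1"]

unique_genes = set(gene_hits)      # Duplicates are dropped

print(len(gene_hits))              # 5
print(len(unique_genes))           # 3
print("MYC" in unique_genes)       # True
print("EGFR" in unique_genes)      # False
```

Because sets are unordered, do not rely on the order in which the items are printed.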

#### Important list operations:
- `sum(list)` - adds all numbers in the list
- `len(list)` - returns how many items are in the list
- `list.append(item)` - adds item to the end
- You can loop through lists: `for item in my_list:`
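
The operations above can be tried together in a short sketch. The variable `od_readings` and its values are made up for illustration:

```python
# Hypothetical optical density readings
od_readings = [0.12, 0.25, 0.31]

od_readings.append(0.40)        # The list now has 4 items
total = sum(od_readings)        # Adds all readings
count = len(od_readings)        # Number of readings

for reading in od_readings:     # Visit each item in order
    print(reading)

print(f"{count} readings, total = {total:.2f}")  # 4 readings, total = 1.08
```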

**Exercise 1.2:** Given a list of gene expression values, write code to:
1. Calculate the mean expression
2. Find genes with expression above the mean
3. Return a dictionary with gene names as keys and their expression values

**Hints:**
- Mean = sum of all values / number of values
- You can use `sum()` and `len()` functions
- To create a dictionary from two lists, you can use a loop or `zip()`: `dict(zip(list1, list2))`
- To find values above mean, use a loop with an if statement
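
If `zip()` is new to you, the following sketch may help. It uses hypothetical sample IDs and concentrations rather than the exercise data, so you can still solve the exercise yourself:

```python
sample_ids = ["S1", "S2", "S3"]
concentrations = [4.2, 7.9, 5.5]   # Hypothetical values

# zip() pairs items from both lists by position
for sample_id, conc in zip(sample_ids, concentrations):
    print(sample_id, conc)

# dict(zip(...)) turns those pairs into key-value entries
conc_by_sample = dict(zip(sample_ids, concentrations))
print(conc_by_sample)  # {'S1': 4.2, 'S2': 7.9, 'S3': 5.5}
```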

In [None]:
# Sample data
genes = ['BRCA1', 'TP53', 'EGFR', 'MYC', 'KRAS']
expression_values = [120.5, 89.3, 156.7, 45.2, 203.1]

# 1. Calculate mean expression
# Hint: mean = sum of all values / count of values
mean_expression = None  # Your code here (replace None)

# 2. Find genes above mean
# Hint: Create an empty list, then loop through genes and expression_values together
# Use zip() to pair them: for gene, expr in zip(genes, expression_values):
# If expr > mean_expression, add gene to your list
high_expression_genes = None  # Your code here (replace None)

# 3. Create gene expression dictionary
# Hint: Use dict(zip(genes, expression_values))
gene_expression_dict = None  # Your code here (replace None)

print(f"Mean expression: {mean_expression:.2f}")
print(f"Highly expressed genes: {high_expression_genes}")
print(f"Gene expression dictionary: {gene_expression_dict}")

### 1.3 Control Flow Review

#### What is control flow?
Control flow determines the order in which code executes. The two main types are:

**1. Conditional statements (if/elif/else)** - make decisions
```python
if temperature < 0:
    print("Freezing")
elif temperature < 10:
    print("Cold")
else:
    print("Warm")
```

**2. Loops** - repeat code
```python
# For loop - iterate over items
for gene in genes:
    print(gene)

# While loop - repeat while condition is true
count = 0
while count < 5:
    print(count)
    count += 1  # Same as: count = count + 1
```

#### Key comparison operators:
- `<` less than
- `<=` less than or equal to
- `>` greater than
- `>=` greater than or equal to
- `==` equal to (note: two equals signs for comparison!)
- `!=` not equal to
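
Each comparison evaluates to `True` or `False`. The sketch below uses a hypothetical pH value and also shows a chained comparison, which is how a range such as `0 <= temp < 10` can be written in Python:

```python
ph = 7.0  # Hypothetical measurement

is_neutral = ph == 7.0           # == compares; a single = assigns
is_not_neutral = ph != 7.0
in_range = 6.5 <= ph < 7.5       # Same as: (6.5 <= ph) and (ph < 7.5)

print(is_neutral)       # True
print(is_not_neutral)   # False
print(in_range)         # True
```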

**Exercise 1.3:** Write a function `classify_temperature(temp_celsius)` that returns:
- "Freezing" if temp < 0
- "Cold" if 0 <= temp < 10
- "Mild" if 10 <= temp < 20
- "Warm" if 20 <= temp < 30
- "Hot" if temp >= 30

**Hints:**
- Use if/elif/else statements
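- Check the thresholds in ascending order (`< 0` first, then `< 10`, and so on); once the earlier branches have failed, each `elif` only needs to test the upper bound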
- Each branch should return a string
- Remember proper indentation!

In [None]:
def classify_temperature(temp_celsius):
    """
    Classify temperature into categories.
    
    Parameters:
    temp_celsius (float): Temperature in Celsius
    
    Returns:
    str: Temperature classification
    """
    # Your code here
    # Start with: if temp_celsius < 0:
    #     return "Freezing"
    # Then add elif for the other conditions
    pass

# Test cases
test_temps = [-5, 5, 15, 25, 35]
for temp in test_temps:
    print(f"{temp}°C: {classify_temperature(temp)}")

**Exercise 1.4:** Write a function `count_nucleotides(dna_sequence)` that returns a dictionary with counts of each nucleotide (A, T, G, C).

**Hints:**
- Create an empty dictionary: `counts = {}`
- Then choose one of two approaches:
  - **Option 1**: Call `.count()` once per base, e.g. `dna_sequence.count("A")` (shorter to write, but it only counts the bases you list)
  - **Option 2**: Loop through the sequence with `for nucleotide in dna_sequence:` and keep a running count in your dictionary (a single pass that also records unexpected characters such as `N`)
    - Check if nucleotide is already in dictionary: `if nucleotide in counts:`
    - If yes, add 1: `counts[nucleotide] += 1`
    - If no, initialize to 1: `counts[nucleotide] = 1`

In [None]:
def count_nucleotides(dna_sequence):
    """
    Count occurrences of each nucleotide in a DNA sequence.
    
    Parameters:
    dna_sequence (str): DNA sequence
    
    Returns:
    dict: Dictionary with nucleotide counts
    """
    # Your code here
    # Option 1 (easier): Create a dictionary with each base and use .count()
    # counts = {'A': dna_sequence.count('A'), ...}
    
    # Option 2 (better practice): Loop through sequence and count as you go
    pass

# Test
sequence = "ATGCGATCGATCGTAGCTA"
print(count_nucleotides(sequence))
# Expected: {'A': 5, 'T': 5, 'G': 5, 'C': 4}

---

## Part 2: Git/GitHub Introduction

### 2.1 Version Control Concepts

#### What is version control?
Version control is like having an "undo" button for your entire project, plus the ability to:
- See who changed what and when
- Try new ideas without breaking working code
- Collaborate with others without overwriting each other's work
- Go back to any previous version of your project

#### Git vs GitHub - What's the difference?
- **Git** is the software that tracks changes on your computer (like Microsoft Word's "Track Changes")
- **GitHub** is a website that stores your Git projects online (like Google Drive for code)

Think of it this way:
- Git = the tool you use locally
- GitHub = the cloud storage and collaboration platform

**Discussion Questions** (to be discussed in small groups):
1. Why is version control important in scientific computing?
2. What problems does Git solve?
3. Can you think of a situation where you wished you had version control?

### 2.2 Essential Git Commands

Below are the commands you'll practice during the hands-on session. **Do not run these in the notebook** - use your terminal instead!

#### Understanding the Git workflow:
Git has three main "states" your files can be in:
1. **Working Directory** - where you edit files
2. **Staging Area** - files ready to be saved (committed)
3. **Repository** - permanent snapshot of your files

The workflow is: Edit → Add (stage) → Commit (save) → Push (upload)

```bash
# ONE-TIME SETUP: Configure Git (do this once per computer)
git config --global user.name "Your Name"
git config --global user.email "your.email@example.com"
# This tells Git who you are - important for tracking who made changes!

# GETTING A REPOSITORY: Clone a repository (download from GitHub)
git clone <repository-url>
# This creates a folder with all the project files on your computer

# CHECKING STATUS: See what's changed
git status
# Shows: which files you modified, which are staged, which are untracked
# Run this often! It's like asking "what's the current situation?"

# STAGING CHANGES: Add files to staging area (prepare them for committing)
git add <filename>      # Add specific file
git add .               # Add all new and changed files in the current directory and its subfolders
# Think of this as putting items in a box before sealing it

# COMMITTING: Save a snapshot of your staged changes
git commit -m "Descriptive message about your changes"
# The message should explain WHAT you changed and WHY
# Good: "Add function to calculate GC content"
# Bad: "update" or "fix stuff"

# SYNCING WITH GITHUB:
git push                # Upload your commits to GitHub
git pull                # Download changes others made from GitHub
# Always pull before you start working!
# Always push when you're done!

# BRANCHING: Create separate lines of development
git branch <branch-name>              # Create new branch
git checkout <branch-name>            # Switch to that branch
git checkout -b <branch-name>         # Create and switch in one command
# Branches let you try new things without affecting the main code

# VIEWING HISTORY:
git log                 # See all commits (press 'q' to quit)
git log --oneline       # Compact view
```

#### A typical Git workflow:
```bash
# 1. Make sure you have latest version
git pull

# 2. Make your changes to files (edit in your editor)

# 3. Check what changed
git status

# 4. Stage your changes
git add .

# 5. Commit with a good message
git commit -m "Add temperature classification function"

# 6. Upload to GitHub
git push
```

### 2.3 Hands-on Git Practice

**Activity:** You'll practice these commands during class:

1. Clone the course repository
2. Create a new file called `my_functions.py`
3. Add one of the functions you wrote above
4. Use `git status` to see the changes
5. Stage your file with `git add`
6. Commit with a descriptive message
7. (If you have push access) Push to the remote repository

**Common Git mistakes and how to fix them:**
- Forgot to `git pull` first? → `git pull`, then resolve any conflicts
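- Made changes on wrong branch? → `git stash`, switch branches with `git checkout <branch-name>`, then `git stash pop` to reapply your changes
- Want to undo last commit? → `git reset --soft HEAD~1` (keeps your changes staged) or `git reset --hard HEAD~1` (permanently discards the changes, so use it with care)
- Staged the wrong files? → `git reset HEAD <file>` to unstage them (your edits are kept)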

---

## Part 3: Object-Oriented Programming (OOP)

### 3.1 What is OOP and Why Use It?

#### The problem with procedural programming:
Imagine you're tracking experiments. Without OOP, you might have:
```python
experiment_name = "Growth study"
temperature = 25
ph = 7.0
measurements = [1.2, 1.5, 1.3]
```

But what if you have 10 experiments? You'd need:
```python
experiment1_name = "Growth study"
experiment1_temp = 25
experiment2_name = "Stress test"
experiment2_temp = 37
# ... this gets messy fast!
```

#### The OOP solution:
With OOP, you create a "blueprint" (class) for experiments, then create individual experiments (objects) from that blueprint.

**Key OOP concepts:**
- **Class**: A blueprint/template (like a cookie cutter)
- **Object**: A specific instance made from the class (like individual cookies)
- **Attributes**: Data that belongs to an object (the object's characteristics)
- **Methods**: Functions that belong to a class (what the object can do)

#### Real-world analogy:
- **Class** = "Dog" (the concept of dogs in general)
- **Object** = Your specific dog "Max"
- **Attributes** = Max's name, breed, age, color
- **Methods** = bark(), eat(), sleep() - actions Max can perform
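
As a rough preview, the analogy could be written in Python as follows. The breed and age values are invented for illustration, and the syntax is explained in section 3.2:

```python
class Dog:
    """Hypothetical class mirroring the dog analogy."""

    def __init__(self, name, breed, age):
        # Attributes: data that belongs to one particular dog
        self.name = name
        self.breed = breed
        self.age = age

    def bark(self):
        # Method: something every Dog object can do
        return f"{self.name} says woof!"

# Object: one specific dog made from the Dog blueprint
max_the_dog = Dog("Max", "Labrador", 3)
print(max_the_dog.name)    # Max
print(max_the_dog.bark())  # Max says woof!
```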

### 3.2 Creating Your First Class

#### Anatomy of a class:
```python
class ClassName:                    # 1. Class definition (note: class names use CamelCase)
    def __init__(self, param1, param2):   # 2. Constructor (runs when you create an object)
        self.attribute1 = param1    # 3. Attributes (data stored in the object)
        self.attribute2 = param2
    
    def method_name(self):          # 4. Methods (functions that belong to the class)
        # do something
        return something
```

#### Important notes about `self`:
- `self` refers to the specific object you're working with
- It must be the first parameter of every method you write in this course (the name `self` is a strong convention rather than a keyword)
- You don't pass it when calling methods - Python does it automatically
- Think of `self` as "this particular instance"
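
To see what "Python does it automatically" means, consider this sketch with a hypothetical `Counter` class. The two calls below do the same thing; the first is the form you would normally write:

```python
class Counter:
    """Hypothetical class used only to illustrate self."""

    def __init__(self):
        self.count = 0

    def increment(self):
        self.count += 1
        return self.count

tally = Counter()

# Usual form: Python passes tally as self behind the scenes
tally.increment()

# Equivalent explicit form: the object is handed over by hand
Counter.increment(tally)

print(tally.count)  # 2
```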

#### The `__init__` method:
- Called automatically when you create a new object
- Used to set up initial attributes
- Also called a "constructor"

Let's see a complete example:

In [None]:
# Define a class
class Gene:
    """A class to represent a gene."""
    
    def __init__(self, name, chromosome, length):
        """Initialize a gene with its properties."""
        # Store the parameters as attributes
        self.name = name              # Gene name
        self.chromosome = chromosome  # Which chromosome it's on
        self.length = length          # Length in base pairs
    
    def get_info(self):
        """Return a formatted string with gene information."""
        # Access attributes using self.attribute_name
        return f"{self.name} is located on chromosome {self.chromosome} and is {self.length} bp long"
    
    def is_long(self):
        """Check if gene is longer than 10,000 bp."""
        return self.length > 10000

# Create objects (instances) of the class
gene1 = Gene("BRCA1", "17", 81189)    # Create first gene object
gene2 = Gene("TP53", "17", 19149)     # Create second gene object

# Use the objects
print(gene1.get_info())         # Call method on gene1
print(gene2.get_info())         # Call method on gene2
print(f"Is {gene1.name} long? {gene1.is_long()}")  # Access attribute and call method

# You can also access attributes directly
print(f"Gene name: {gene1.name}")
print(f"Chromosome: {gene1.chromosome}")

#### Understanding what happens:
1. `Gene("BRCA1", "17", 81189)` creates a new Gene object
2. Python automatically calls `__init__` with these values
3. The values are stored as attributes in that specific object
4. `gene1` and `gene2` are separate objects with their own data
5. When we call `gene1.get_info()`, it uses `gene1`'s attributes
6. When we call `gene2.get_info()`, it uses `gene2`'s attributes

### 3.3 Practice: Creating Simple Classes

**Exercise 3.1:** Create a `Sample` class for biological samples.

**Requirements:**
- Attributes: `sample_id`, `organism`, `tissue_type`, `collection_date`
- Methods:
  - `get_summary()`: returns a string with all sample information
  - `is_from_organism(organism_name)`: returns True if the sample is from that organism

**Step-by-step approach:**
1. Write the class definition: `class Sample:`
2. Write `__init__` method to initialize all attributes
3. Write `get_summary()` method that returns a formatted string
4. Write `is_from_organism()` method that compares `self.organism` with the parameter
5. Create at least 2 sample objects to test
6. Call methods on your objects and print results

In [None]:
class Sample:
    """A class to represent a biological sample."""
    
    def __init__(self, sample_id, organism, tissue_type, collection_date):
        """Initialize sample with its properties."""
        # Your code here: assign all parameters to self.attributes
        pass
    
    def get_summary(self):
        """Return formatted string with sample information."""
        # Your code here: return a string using self.sample_id, self.organism, etc.
        pass
    
    def is_from_organism(self, organism_name):
        """Check if sample is from specified organism."""
        # Your code here: return True if self.organism equals organism_name
        pass

# Test your class
sample1 = Sample("S001", "Arabidopsis thaliana", "leaf", "2024-01-15")
sample2 = Sample("S002", "Mus musculus", "liver", "2024-01-16")

print(sample1.get_summary())
print(sample2.get_summary())
print(f"Is sample1 from Arabidopsis? {sample1.is_from_organism('Arabidopsis thaliana')}")

### 3.4 Working with Lists as Attributes

#### Attributes can be any data type!
So far we've used simple attributes (strings, numbers). But attributes can be any Python data type, including lists, dictionaries, or even other objects!

**Common pattern:** Initialize an empty list in `__init__`, then add items with methods

```python
class DataCollector:
    def __init__(self, name):
        self.name = name
        self.data = []              # Start with empty list
    
    def add_measurement(self, value):
        self.data.append(value)     # Add to the list
    
    def get_average(self):
        if len(self.data) == 0:
            return 0
        return sum(self.data) / len(self.data)
```

**Exercise 3.2:** Create an `Experiment` class to track experimental measurements.

**Requirements:**
- Attributes: `name`, `temperature`, `ph`, `measurements` (list)
- Methods:
  - `add_measurement(value)`: adds a measurement to the list
  - `get_average()`: returns the average of all measurements
  - `condition_string()`: returns a formatted string like "Temperature: 25°C, pH: 7.0"

**Hints:**
- Initialize `self.measurements` as an empty list `[]` in `__init__`
- In `add_measurement()`, use `self.measurements.append(value)`
- In `get_average()`, check if list is empty first! Use `if len(self.measurements) == 0:`
- Use `sum(self.measurements)` and `len(self.measurements)` to calculate average

In [None]:
class Experiment:
    """A class to track experimental conditions and measurements."""
    
    def __init__(self, name, temperature, ph):
        """Initialize experiment with conditions."""
        self.name = name
        self.temperature = temperature
        self.ph = ph
        self.measurements = []  # Start with empty list
    
    def add_measurement(self, value):
        """Add a measurement to the experiment."""
        # Your code here
        pass
    
    def get_average(self):
        """Calculate average of all measurements."""
        # Your code here
        # Remember to check if list is empty!
        pass
    
    def condition_string(self):
        """Return formatted string with experimental conditions."""
        # Your code here
        pass

# Test your class
exp = Experiment("Growth rate study", 25, 7.0)
exp.add_measurement(1.2)
exp.add_measurement(1.5)
exp.add_measurement(1.3)

print(exp.condition_string())
print(f"Average measurement: {exp.get_average():.2f}")
# Expected: Temperature: 25°C, pH: 7.0
# Expected: Average measurement: 1.33

### 3.5 Multiple Objects - The Power of OOP

The real power of OOP is creating multiple independent objects from the same class. Each object has its own data!

**Key concept:** Objects are independent
```python
exp1 = Experiment("Study A", 25, 7.0)
exp2 = Experiment("Study B", 37, 6.5)

exp1.add_measurement(1.2)  # Only affects exp1!
exp2.add_measurement(2.5)  # Only affects exp2!

# They have completely separate data
print(exp1.get_average())  # 1.2
print(exp2.get_average())  # 2.5
```

This is much cleaner than having separate variables for each experiment!

**Exercise 3.3:** Create a `Protein` class with:
- Attributes: `name`, `sequence` (string of amino acids), `molecular_weight`
- Methods:
  - `get_length()`: returns the number of amino acids (length of sequence)
  - `has_motif(motif)`: returns True if the motif string is found in the sequence
  - `get_info()`: returns a formatted string with all protein information

Then create at least 3 different protein objects and test all methods.

**Hints:**
- For `get_length()`: use `len(self.sequence)`
- For `has_motif(motif)`: use `motif in self.sequence` (returns True/False)
- For `get_info()`: return a formatted string with f-string including all attributes

In [None]:
class Protein:
    """A class to represent a protein."""
    
    def __init__(self, name, sequence, molecular_weight):
        """Initialize protein with its properties."""
        # Your code here
        pass
    
    def get_length(self):
        """Return the number of amino acids."""
        # Your code here
        pass
    
    def has_motif(self, motif):
        """Check if motif is present in sequence."""
        # Your code here
        pass
    
    def get_info(self):
        """Return formatted string with all information."""
        # Your code here
        pass

# Create and test multiple protein objects
protein1 = Protein("Insulin", "MALWMRLLPLLALLALWGPDPAAAFVNQHLCGSHLVEALYLVCGERGFFYTPK", 5808)
protein2 = Protein("Lysozyme", "KVFGRCELAAAMKRHGLDNYRGYSLGNWVCAAKFESNFNTQATNRNTDGSTDYGILQINSRWW", 14300)
protein3 = Protein("Myoglobin", "GLSDGEWQLVLNVWGKVEADIPGHGQEVLIRLFKGHPETLEKFDKFKHLKSEDEMKASEDLKKHGATVLTALGGILK", 17000)

# Test all methods on all proteins
for protein in [protein1, protein2, protein3]:
    print(protein.get_info())
    print(f"Length: {protein.get_length()} amino acids")
    print(f"Has 'GG' motif: {protein.has_motif('GG')}")
    print()

### 3.6 Challenge Exercise - Complex Attributes

Now let's combine everything: classes with dictionary attributes!

**Exercise 3.4:** Create a `GeneExpression` class that:
- Stores gene name and a dictionary of expression values for different conditions
- Has methods to:
  - Add a new condition with its expression value
  - Get the condition with highest expression
  - Get the condition with lowest expression
  - Calculate fold change between two conditions
  - Display all expression data

**Hints:**
- Initialize `self.expression_data` as empty dictionary `{}` in `__init__`
- To add: `self.expression_data[condition] = value`
- To find max: use `max(self.expression_data, key=self.expression_data.get)`
- To find min: use `min(self.expression_data, key=self.expression_data.get)`
- Fold change = condition1_value / condition2_value
- To display: loop through dictionary with `for condition, value in self.expression_data.items():`
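
The `key=` argument in the `max()` and `min()` hints is easy to misread, so here is a small sketch on a plain dictionary with made-up condition names and values. Without `key=`, Python compares the dictionary keys themselves:

```python
# Hypothetical expression values for three conditions
expression = {"light": 100, "dark": 250, "shade": 80}

# Without key=, max() compares the keys as strings (alphabetical order)
print(max(expression))                      # shade

# With key=expression.get, each key is ranked by its value instead
print(max(expression, key=expression.get))  # dark
print(min(expression, key=expression.get))  # shade
```

In both cases the result is a key (a condition name), not the expression value.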

In [None]:
class GeneExpression:
    """A class to track gene expression across different conditions."""
    
    def __init__(self, gene_name):
        """Initialize with gene name."""
        self.gene_name = gene_name
        self.expression_data = {}  # Dictionary: condition -> expression value
    
    def add_condition(self, condition, expression_value):
        """Add expression value for a condition."""
        # Your code here
        pass
    
    def get_max_condition(self):
        """Return the condition with highest expression."""
        # Your code here
        # Hint: max(self.expression_data, key=self.expression_data.get)
        pass
    
    def get_min_condition(self):
        """Return the condition with lowest expression."""
        # Your code here
        pass
    
    def calculate_fold_change(self, condition1, condition2):
        """Calculate fold change between two conditions."""
        # Your code here
        # Hint: return self.expression_data[condition1] / self.expression_data[condition2]
        pass
    
    def display_expression(self):
        """Display all expression data."""
        print(f"Expression data for {self.gene_name}:")
        # Your code here
        # Loop through dictionary and print each condition and its value
        pass

# Test case
gene = GeneExpression("BRCA1")
gene.add_condition("control", 100)
gene.add_condition("heat_stress", 250)
gene.add_condition("cold_stress", 80)
gene.add_condition("drought", 180)

gene.display_expression()
print(f"\nHighest expression: {gene.get_max_condition()}")
print(f"Lowest expression: {gene.get_min_condition()}")
print(f"Fold change (heat/control): {gene.calculate_fold_change('heat_stress', 'control'):.2f}")

---

## Part 4: Integration Exercise

**Exercise 4.1:** Combine everything you've learned today.

Create a `SequenceAnalyzer` class that:
1. Takes a DNA sequence as input
2. Has methods to:
   - Calculate GC content (use your function from Part 1)
   - Count nucleotides (use your function from Part 1)
   - Transcribe DNA to RNA (replace T with U)
   - Get the reverse complement
   - Generate a summary report as a formatted string

**Implementation strategy:**
1. Copy your functions from Part 1 into the class as methods
2. Remember to add `self` as first parameter
3. Use `self.sequence` to access the DNA sequence
4. For transcription: use `self.sequence.replace('T', 'U')`
5. For reverse complement:
   - First reverse: `reversed_seq = self.sequence[::-1]`
   - Then complement each base (A↔T, G↔C)
   - Hint: use a dictionary for the complement mapping

After creating the class, save it as a `.py` file and practice using Git to:
- Add it to your repository
- Commit with a good message
- (If working in groups) Push to your shared repository

In [None]:
class SequenceAnalyzer:
    """A comprehensive DNA sequence analysis tool."""
    
    def __init__(self, sequence):
        """Initialize with DNA sequence."""
        self.sequence = sequence.upper()  # Convert to uppercase for consistency
    
    def calculate_gc_content(self):
        """Calculate GC content percentage."""
        # Your code here (copy from Exercise 1.1, but use self.sequence)
        pass
    
    def count_nucleotides(self):
        """Count each nucleotide."""
        # Your code here (copy from Exercise 1.4)
        pass
    
    def transcribe(self):
        """Convert DNA to RNA (T -> U)."""
        # Your code here
        # Hint: return self.sequence.replace('T', 'U')
        pass
    
    def reverse_complement(self):
        """Get reverse complement of DNA sequence."""
        # Step 1: Define complement mapping
        complement = {'A': 'T', 'T': 'A', 'G': 'C', 'C': 'G'}
        
        # Step 2: Reverse the sequence
        reversed_seq = self.sequence[::-1]
        
        # Step 3: Build complement
        # Your code here
        # Hint: use a loop or list comprehension
        # comp_seq = ''.join([complement[base] for base in reversed_seq])
        pass
    
    def generate_report(self):
        """Generate comprehensive analysis report."""
        # Your code here
        # Call all your methods and format the output nicely
        # Example format:
        # Sequence: ATGC...
        # Length: X bp
        # GC Content: Y%
        # Nucleotide counts: {...}
        # RNA sequence: ...
        # Reverse complement: ...
        pass

# Test your class
seq = SequenceAnalyzer("ATGCGATCGATCGTAGCTA")
print(seq.generate_report())

---

## Reflection and Next Steps

### What we covered today:
✓ Python fundamentals: functions, data structures, control flow  
✓ Version control with Git/GitHub  
✓ OOP basics: classes, objects, attributes, methods  

### Key takeaways:
**Functions:**
- Reusable code blocks that take inputs and produce outputs
- Use clear names and docstrings
- Remember to return values!

**Git:**
- Always `git pull` before starting work
- Commit often with descriptive messages
- `git status` is your friend - use it frequently

**OOP:**
- Classes are blueprints, objects are specific instances
- `self` refers to the current object
- `__init__` sets up initial attributes
- Methods are functions that belong to a class

### For your group project this week:
1. Form your groups (3-4 people)
2. Decide on a project topic related to your field (biology, chemistry, physics)
3. Set up a shared GitHub repository
4. Each member clones the repository
5. Draft a project outline (README.md file)
6. Decide who will work on what components
7. Practice the Git workflow: branch → add → commit → push → pull request

### Questions to discuss with your group:
- What classes might be useful for your project?
- What attributes and methods should each class have?
- How will you divide the work?
- What branching strategy will you use?
- How often will you integrate your code?

### Prepare for next week:
- Review OOP concepts - especially `self` and `__init__`
- Make sure everyone in your group can push/pull from the shared repository
- Start thinking about the data structures your project needs
- Practice creating simple classes on your own

### Common troubleshooting:
**Python:**
- IndentationError? → Check that all code blocks are properly indented
- NameError? → Variable/function not defined or typo in name
- AttributeError? → Trying to access method/attribute that doesn't exist

**Git:**
- Merge conflict? → Manually edit file to resolve, then `git add` and `git commit`
- Permission denied? → Check if you have write access to repository
- Can't push? → Probably need to pull first: `git pull` then `git push`
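
For the Python errors listed above, it can help to trigger them on purpose and read the messages. The sketch below uses a hypothetical `Vial` class and catches each error with `try`/`except` so the cell keeps running:

```python
class Vial:
    """Minimal hypothetical class for the demonstration."""

    def __init__(self, vial_id):
        self.vial_id = vial_id

vial = Vial("V001")
errors_seen = []

try:
    print(vail.vial_id)        # Typo: 'vail' was never defined
except NameError as err:
    errors_seen.append(type(err).__name__)
    print(f"NameError: {err}")

try:
    print(vial.organism)       # This attribute was never set in __init__
except AttributeError as err:
    errors_seen.append(type(err).__name__)
    print(f"AttributeError: {err}")
```

The last line of a real traceback names the error type and usually the misspelled or missing name, which is a good first place to look.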

---

## Additional Practice Exercises

### Bonus Exercise 1: Enhanced Experiment Class

Extend the `Experiment` class to include:
- A method to remove outliers (values that are >2 standard deviations from mean)
- A method to get the standard deviation
- A method to export data to a dictionary format

**Hints for standard deviation:**
```python
import statistics
std_dev = statistics.stdev(self.measurements)
```

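Or calculate manually. Note that `statistics.stdev` computes the *sample* standard deviation, so the variance divides by `n - 1` (dividing by `n` instead gives the population value, which is what `statistics.pstdev` returns). Both versions need at least two measurements:
```python
mean = sum(self.measurements) / len(self.measurements)
variance = sum((x - mean) ** 2 for x in self.measurements) / (len(self.measurements) - 1)
std_dev = variance ** 0.5
```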
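
As a quick check of how a manual calculation relates to the `statistics` module, the sketch below computes both the population and the sample standard deviation for a hypothetical list of measurements and compares them with `statistics.pstdev` and `statistics.stdev`:

```python
import statistics

values = [1.2, 1.5, 1.3, 1.4]  # Hypothetical measurements
mean = sum(values) / len(values)
squared_diffs = sum((x - mean) ** 2 for x in values)

population_sd = (squared_diffs / len(values)) ** 0.5        # Divide by n
sample_sd = (squared_diffs / (len(values) - 1)) ** 0.5      # Divide by n - 1

print(f"population: {population_sd:.4f}  pstdev: {statistics.pstdev(values):.4f}")
print(f"sample:     {sample_sd:.4f}  stdev:  {statistics.stdev(values):.4f}")
```

The sample value is always the larger of the two; whichever you choose, use it consistently when filtering outliers.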

In [None]:
import statistics

class EnhancedExperiment:
    """Extended Experiment class with statistical methods."""
    
    def __init__(self, name, temperature, ph):
        # Your code here
        pass
    
    # Include all methods from original Experiment class
    
    def get_std_dev(self):
        """Calculate standard deviation of measurements."""
        # Your code here
        pass
    
    def remove_outliers(self):
        """Remove measurements more than 2 standard deviations from mean."""
        # Your code here
        # 1. Calculate mean and std_dev
        # 2. Filter measurements: keep only if abs(value - mean) <= 2 * std_dev
        # 3. Update self.measurements with filtered list
        pass
    
    def to_dict(self):
        """Export all data as dictionary."""
        # Your code here
        # Return dictionary with keys: name, temperature, ph, measurements
        pass

### Bonus Exercise 2: Organism Class

Create an `Organism` class that could be useful for ecological studies:

**Requirements:**
- Attributes: `species_name`, `common_name`, `kingdom`, `population_count`, `habitat`
- Methods:
  - `update_population(new_count)`: update population count
  - `is_endangered()`: returns True if population < 1000
  - `get_taxonomic_info()`: returns formatted string with classification
  - `calculate_population_change(new_count)`: returns percentage change

Create a few organisms and track their populations over time.

In [None]:
class Organism:
    """A class to represent an organism for ecological studies."""
    
    def __init__(self, species_name, common_name, kingdom, population_count, habitat):
        # Your code here
        pass
    
    def update_population(self, new_count):
        """Update the population count."""
        # Your code here
        pass
    
    def is_endangered(self):
        """Check if organism is endangered (population < 1000)."""
        # Your code here
        pass
    
    def get_taxonomic_info(self):
        """Return formatted taxonomic information."""
        # Your code here
        pass
    
    def calculate_population_change(self, new_count):
        """Calculate percentage change in population."""
        # Your code here
        # Formula: (new_count - self.population_count) / self.population_count * 100
        pass

# Test your class
# Create organisms and track population changes