# Week 1: Python Recap, Git/GitHub, and Introduction to OOP

## Learning Objectives
By the end of this week, you should be able to:
- Write Python functions and use common data structures
- Use basic Git commands for version control
- Understand the fundamentals of Object-Oriented Programming
- Create simple classes with attributes and methods

---

## Part 1: Python Recap

### 1.1 Functions Review

Let's start with a quick refresher on functions.

In [None]:
# Example: A simple function
def calculate_bmi(weight_kg, height_m):
    """
    Calculate Body Mass Index.
    
    Parameters:
    weight_kg (float): Weight in kilograms
    height_m (float): Height in meters
    
    Returns:
    float: BMI value
    """
    bmi = weight_kg / (height_m ** 2)
    return bmi

# Test it
print(calculate_bmi(70, 1.75))

**Exercise 1.1:** Write a function `calculate_gc_content(dna_sequence)` that calculates the GC content (percentage of G and C nucleotides) in a DNA sequence string.

In [None]:
def calculate_gc_content(dna_sequence):
    """
    Calculate the GC content of a DNA sequence.
    
    Parameters:
    dna_sequence (str): DNA sequence containing A, T, G, C
    
    Returns:
    float: GC content as a percentage
    """
    # Your code here
    pass

# Test your function
test_seq = "ATGCGATCGATCG"
print(f"GC content: {calculate_gc_content(test_seq):.2f}%")
# Expected output: around 53.85%

### 1.2 Data Structures Review

Quick review of lists, dictionaries, and sets.
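
As a brief refresher, the sketch below shows the three structures side by side. All names and values here are made up for illustration and are not the exercise data.

```python
# Illustrative values only; not the exercise data

# List: ordered, mutable, allows duplicates
reads = [12, 7, 12, 30]
reads.append(5)                  # add an item to the end
first_two = reads[:2]            # slicing returns a new list

# Dictionary: maps unique keys to values
codon_to_amino_acid = {"ATG": "Met", "TGG": "Trp"}
codon_to_amino_acid["TAA"] = "Stop"       # insert or update a key
met = codon_to_amino_acid.get("ATG")      # .get() returns None for a missing key

for codon, amino_acid in codon_to_amino_acid.items():
    print(f"{codon} -> {amino_acid}")

# Set: unordered collection of unique items
unique_reads = set(reads)                 # duplicates collapse
shared = {"A", "T", "G", "C"} & {"A", "U", "G", "C"}   # intersection

print(first_two)
print(sorted(unique_reads))
print(sorted(shared))
```

Sets have no guaranteed order, which is why the sketch sorts them before printing.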

**Exercise 1.2:** Given the gene names and expression values below, write code to:
1. Calculate the mean expression
2. Find the genes with expression above the mean
3. Build a dictionary with gene names as keys and their expression values as values

In [None]:
# Sample data
genes = ['BRCA1', 'TP53', 'EGFR', 'MYC', 'KRAS']
expression_values = [120.5, 89.3, 156.7, 45.2, 203.1]

# 1. Calculate mean expression
mean_expression = None  # Your code here

# 2. Find genes above mean
high_expression_genes = None  # Your code here

# 3. Create gene expression dictionary
gene_expression_dict = None  # Your code here

print(f"Mean expression: {mean_expression:.2f}")
print(f"Highly expressed genes: {high_expression_genes}")
print(f"Gene expression dictionary: {gene_expression_dict}")

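# Expected output:
# Mean expression: 122.96
# Highly expressed genes: ['EGFR', 'KRAS']
# Gene expression dictionary: {'BRCA1': 120.5, 'TP53': 89.3, 'EGFR': 156.7, 'MYC': 45.2, 'KRAS': 203.1}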


### 1.3 Control Flow Review
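
Conditions in an `if`/`elif`/`else` chain are checked from top to bottom, and only the first branch that matches runs. A minimal sketch, using a hypothetical `classify_ph` helper that is not part of the exercises:

```python
def classify_ph(ph):
    """Label a pH value (hypothetical helper, for illustration only)."""
    if ph < 7:
        return "Acidic"
    elif ph == 7:
        return "Neutral"
    else:
        # Reached only when both conditions above were False
        return "Basic"


for value in [3.5, 7, 9.2]:
    print(f"pH {value}: {classify_ph(value)}")
```

Because the branches are tested in order, each later condition can assume the earlier ones failed.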

**Exercise 1.3:** Write a function `classify_temperature(temp_celsius)` that returns:
- "Freezing" if temp < 0
- "Cold" if 0 <= temp < 10
- "Mild" if 10 <= temp < 20
- "Warm" if 20 <= temp < 30
- "Hot" if temp >= 30

In [None]:
def classify_temperature(temp_celsius):
    # Your code here
    pass

# Test cases
test_temps = [-5, 5, 15, 25, 35]
for temp in test_temps:
    print(f"{temp}°C: {classify_temperature(temp)}")

**Exercise 1.4:** Write a function `count_nucleotides(dna_sequence)` that returns a dictionary with counts of each nucleotide (A, T, G, C).

In [None]:
def count_nucleotides(dna_sequence):
    """
    Count occurrences of each nucleotide in a DNA sequence.
    
    Parameters:
    dna_sequence (str): DNA sequence
    
    Returns:
    dict: Dictionary with nucleotide counts
    """
    # Your code here
    pass

# Test
sequence = "ATGCGATCGATCGTAGCTA"
print(count_nucleotides(sequence))
# Expected: {'A': 5, 'T': 5, 'G': 5, 'C': 4}

---

## Part 2: Git/GitHub Introduction

### 2.1 Version Control Concepts

**Discussion Questions** (to be discussed in small groups):
1. Why is version control important in scientific computing?
2. What problems does Git solve?
3. What's the difference between Git and GitHub?

### 2.2 Essential Git Commands

Below are the commands you'll practice during the hands-on session. **Do not run these in the notebook** - use your terminal instead!

```bash
# Configure Git (do this once)
git config --global user.name "Your Name"
git config --global user.email "your.email@example.com"

# Clone a repository
git clone <repository-url>

# Check status of your repository
git status

# Add files to staging area
git add <filename>
git add .  # adds all changed files

# Commit changes
git commit -m "Descriptive message about your changes"

# Push changes to remote repository
git push

# Pull changes from remote repository
git pull

# Create a new branch
git branch <branch-name>
git checkout <branch-name>
# Or do both at once:
git checkout -b <branch-name>

# View commit history
git log
```

### 2.3 Git Workflow Exercise

**Hands-on Task** (to be done in terminal):

1. Create a new directory for testing Git
2. Initialize a Git repository with `git init`
3. Create a simple Python file (e.g., `hello.py`)
4. Add and commit the file
5. Make a change to the file
6. Check the status, add, and commit again
7. View your commit history

### 2.4 Reflection Questions

Answer these in the markdown cell below after completing the Git exercises:

1. What does `git add` do, and why is it a separate step from `git commit`?
2. What makes a good commit message?
3. When would you use branches in your group project?

**Your answers here:**

1. 
2. 
3. 

---

## Part 3: Introduction to Object-Oriented Programming

### 3.1 What is OOP and Why Use It?

**Object-Oriented Programming (OOP)** is a programming paradigm that organizes code into **objects** - bundles of data (attributes) and functions (methods) that work together.

**Why use OOP?**
- **Organization**: Group related data and functions together
- **Reusability**: Create templates (classes) to make multiple similar objects
- **Maintainability**: A change to a class definition applies to every object created from it
- **Real-world modeling**: Objects can represent real entities (genes, experiments, organisms)

### 3.2 Classes and Objects/Instances: A Simple Example

Think of a **class** as a blueprint, and an **object** as something built from that blueprint.

In [None]:
# Example: A simple Gene class
class Gene:
    """A class to represent a gene."""
    
    def __init__(self, name, sequence):
        """Initialize a Gene object."""
        self.name = name
        self.sequence = sequence
    
    def get_length(self):
        """Return the length of the gene sequence."""
        return len(self.sequence)
    
    def describe(self):
        """Print information about the gene."""
        print(f"Gene: {self.name}")
        print(f"Length: {self.get_length()} bp")

# Creating objects (instances) of the Gene class
gene1 = Gene("BRCA1", "ATGCGATCGATCG")
gene2 = Gene("TP53", "GCTAGCTAGCTA")

# Using the objects
gene1.describe()
print()
gene2.describe()

### 3.3 Understanding `__init__` and `self`

- **`__init__`**: The initializer method (often called the constructor), run automatically when you create a new object to set up its attributes
- **`self`**: Refers to the specific object instance. It allows the object to access its own attributes and methods

Think of `self` as "this specific object" - it's how each object keeps track of its own data.
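
The sketch below illustrates this with a hypothetical `Counter` class (not part of the exercises): two objects built from the same class each keep their own `count`, and calling `a.increment()` passes `a` in as `self`.

```python
class Counter:
    """Hypothetical class used only to illustrate `self`."""

    def __init__(self, label):
        # Runs automatically when Counter(...) is called
        self.label = label
        self.count = 0

    def increment(self):
        # `self` is whichever object the method was called on
        self.count += 1


a = Counter("colonies_plate_A")
b = Counter("colonies_plate_B")

a.increment()
a.increment()
b.increment()

print(a.label, a.count)   # a has been incremented twice
print(b.label, b.count)   # b has been incremented once

# a.increment() is shorthand for calling the function on the class
# and passing the object explicitly as `self`:
Counter.increment(a)
print(a.count)
```

Incrementing `a` never changes `b`, because each object stores its attributes separately.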

**Exercise 3.1:** Create a `Sample` class to represent a biological sample with the following:
- Attributes: `sample_id`, `organism`, `collection_date`
- Method: `display_info()` that prints all the sample information

In [None]:
class Sample:
    """A class to represent a biological sample."""
    
    def __init__(self, sample_id, organism, collection_date):
        # Your code here
        pass
    
    def display_info(self):
        # Your code here
        pass

# Test your class
sample1 = Sample("S001", "Arabidopsis thaliana", "2024-01-15")
sample1.display_info()

### 3.4 Instance Methods

Methods are functions that belong to a class. They can access and modify the object's attributes using `self`.
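
As a minimal sketch, the hypothetical `Culture` class below (an illustration, not one of the exercises) has one method that modifies an attribute and one that only reads attributes.

```python
class Culture:
    """Hypothetical bacterial culture, for illustration only."""

    def __init__(self, strain, cells_per_ml):
        self.strain = strain
        self.cells_per_ml = cells_per_ml

    def dilute(self, factor):
        """Modify an attribute: divide the concentration by `factor`."""
        self.cells_per_ml = self.cells_per_ml / factor

    def summary(self):
        """Read attributes and return a formatted string."""
        return f"{self.strain}: {self.cells_per_ml:.0f} cells/mL"


culture = Culture("E. coli K-12", 1_000_000)
print(culture.summary())

culture.dilute(10)          # changes the object's stored concentration
print(culture.summary())
```

Returning a string from `summary()` rather than printing inside the method leaves the caller free to print it, log it, or test it.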

**Exercise 3.2:** Create an `Experiment` class with:
- Attributes: `name`, `temperature`, `ph`, `measurements` (initialize as empty list)
- Methods:
  - `add_measurement(value)`: adds a value to the measurements list
  - `get_average()`: returns the average of all measurements
  - `condition_string()`: returns a formatted string with temperature and pH

In [None]:
class Experiment:
    """A class to represent a scientific experiment."""
    
    def __init__(self, name, temperature, ph):
        # Your code here
        pass
    
    def add_measurement(self, value):
        # Your code here
        pass
    
    def get_average(self):
        # Your code here
        pass
    
    def condition_string(self):
        # Your code here
        pass

# Test your class
exp = Experiment("Growth rate study", 25, 7.0)
exp.add_measurement(1.2)
exp.add_measurement(1.5)
exp.add_measurement(1.3)

print(exp.condition_string())
print(f"Average measurement: {exp.get_average():.2f}")
# Expected: Temperature: 25°C, pH: 7.0
# Expected: Average measurement: 1.33

### 3.5 Multiple Objects

The power of OOP is creating multiple objects from the same class, each with their own data.
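
A short sketch of this idea, assuming a hypothetical `Primer` class that is not part of the exercises: several objects are stored in a list, processed in a loop, and changed independently.

```python
class Primer:
    """Hypothetical primer class, for illustration only."""

    def __init__(self, name, sequence):
        self.name = name
        self.sequence = sequence

    def get_length(self):
        return len(self.sequence)


# Three independent objects built from the same class
primers = [
    Primer("fwd_1", "ATGCGT"),
    Primer("rev_1", "TTAGGCAT"),
    Primer("fwd_2", "GGCC"),
]

# The same method works on every object, using that object's own data
for primer in primers:
    print(f"{primer.name}: {primer.get_length()} nt")

# Objects can be compared through their methods
longest = max(primers, key=lambda p: p.get_length())
print(f"Longest primer: {longest.name}")

# Changing one object leaves the others untouched
primers[0].sequence = "ATG"
print(primers[0].get_length(), primers[1].get_length())
```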

**Exercise 3.3:** Create a `Protein` class with:
- Attributes: `name`, `sequence` (string of amino acids), `molecular_weight`
- Methods:
  - `get_length()`: returns the number of amino acids
  - `has_motif(motif)`: returns True if the motif string is found in the sequence
  - `get_info()`: returns a formatted string with all protein information

Then create at least 3 different protein objects and test all methods.

In [None]:
class Protein:
    """A class to represent a protein."""
    
    # Your code here
    pass

# Create and test multiple protein objects
# Your code here

### 3.6 Challenge Exercise

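**Exercise 3.4:** Create a `GeneExpression` class that:
- Stores a gene name and a dictionary of expression values keyed by condition
- Has the following methods (names match the test case below):
  - `add_condition(condition, value)`: add a new condition with its expression value
  - `get_max_condition()`: get the condition with the highest expression
  - `get_min_condition()`: get the condition with the lowest expression
  - `calculate_fold_change(condition_a, condition_b)`: calculate the fold change of the first condition relative to the second
  - `display_expression()`: display all expression data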

In [None]:
class GeneExpression:
    """A class to track gene expression across different conditions."""
    
    # Your code here
    pass

# Test case
gene = GeneExpression("BRCA1")
gene.add_condition("control", 100)
gene.add_condition("heat_stress", 250)
gene.add_condition("cold_stress", 80)
gene.add_condition("drought", 180)

gene.display_expression()
print(f"\nHighest expression: {gene.get_max_condition()}")
print(f"Lowest expression: {gene.get_min_condition()}")
print(f"Fold change (heat/control): {gene.calculate_fold_change('heat_stress', 'control'):.2f}")

---

## Part 4: Integration Exercise

**Exercise 4.1:** Combine everything you've learned today.

Create a `SequenceAnalyzer` class that:
1. Takes a DNA sequence as input
2. Has methods to:
   - Calculate GC content (use your function from Part 1)
   - Count nucleotides (use your function from Part 1)
   - Transcribe DNA to RNA (replace T with U)
   - Get the reverse complement
   - Generate a summary report as a formatted string (`generate_report()`, as called in the test below)

After creating the class, save it as a `.py` file and practice using Git to:
- Add it to your repository
- Commit with a good message
- (If working in groups) Push to your shared repository

In [None]:
class SequenceAnalyzer:
    """A comprehensive DNA sequence analysis tool."""
    
    # Your code here
    pass

# Test your class
seq = SequenceAnalyzer("ATGCGATCGATCGTAGCTA")
print(seq.generate_report())

---

## Reflection and Next Steps

### What we covered today:
✓ Python fundamentals: functions, data structures, control flow  
✓ Version control with Git/GitHub  
✓ OOP basics: classes, objects, attributes, methods  

### For your group project this week:
1. Form your groups (3-4 people)
2. Decide on a project topic
3. Set up a shared GitHub repository
4. Each member clones the repository
5. Draft a project outline (README.md file)
6. Decide who will work on what components
7. Practice the Git workflow: branch → add → commit → push → pull request

### Questions to discuss with your group:
- What classes might be useful for your project?
- How will you divide the work?
- What branching strategy will you use?
- How often will you integrate your code?

### Prepare for next week:
- Review OOP concepts 
- Make sure everyone in your group can push/pull from the shared repository
- Start thinking about the data structures your project needs

---

## Additional Practice Exercises

### Bonus Exercise 1: Enhanced Experiment Class

Extend the `Experiment` class to include:
- A method to remove outliers (values more than 2 standard deviations from the mean)
- A method to get the standard deviation
- A method to export data to a dictionary format
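
As a possible starting point, Python's standard-library `statistics` module provides both the mean and the standard deviation. This is only a hint on a plain list, not the expected solution; which standard deviation (sample or population) the exercise intends is an assumption you will need to make and state.

```python
import statistics

values = [1.2, 1.5, 1.3]  # same numbers as the Exercise 3.2 test

mean = statistics.mean(values)
sample_sd = statistics.stdev(values)        # sample SD, divides by n - 1
population_sd = statistics.pstdev(values)   # population SD, divides by n

print(f"Mean: {mean:.2f}")
print(f"Sample SD: {sample_sd:.3f}")
print(f"Population SD: {population_sd:.3f}")
```

Note that `statistics.stdev` needs at least two values, so a method built on it should handle shorter measurement lists.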

In [None]:
# Your code here



### Bonus Exercise 2: Organism Class

Create an `Organism` class that could be useful for ecological studies:

In [None]:
class Organism:
    """A class to represent an organism for ecological studies."""

    def __init__(self, species_name, common_name, kingdom):
        # Your code here
        pass

