# A primer on Control Flow

- Control flow refers to the order in which the program's code executes.
- It determines how and when the code inside a program runs, which is how programs make decisions, repeat actions, and handle different data inputs.
- The control flow of a program is directed using conditional statements, loops, and function calls.

## If-Else Statements
- **Purpose**: Used for decision-making in a program.
- **Structure**: Consists of `if`, `elif` (else if), and `else` blocks.
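- **Behavior**: Evaluates the conditions in order and executes the block belonging to the first condition that is true. If no condition is true, the `else` block runs (when one is present). At most one block in the `if-else` chain is executed.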
- **Use Case**: Ideal for branching paths in a program, like categorizing data or handling different input cases.

In [None]:
# Example: A function for determining if a pH value is acidic, basic, or neutral
def classify_ph(ph_value):
    if ph_value < 7: # if True, return acidic... 
        return "Acidic" 
    elif ph_value > 7: # if True, return basic...
        return "Basic"
    else:
        return "Neutral" # if all previous are False, return neutral...

# Testing the function
ph_test = 7.4
print(f"pH {ph_test} is {classify_ph(ph_test)}")

### Exercise 1: If Else Decision Statements
**Task**: Create a function that takes the pKa of a weak acid and a pH value of the solution as inputs.
- Test whether the two values are equal.
- If they are equal, print "50% of the weak acid molecules are protonated". Note: use `==` to test equality in a conditional statement; a single `=` assigns a value.
- Else if the pH > pKa, print "< 50% of the weak acid molecules are protonated"
- Else, print "> 50% of the weak acid molecules are protonated"
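
The difference between `=` and `==` mentioned in the task can be seen in a minimal sketch. The variable names (`temperature_c`, `is_boiling`) are illustrative assumptions and are not part of the exercise.

```python
import math

# `=` assigns a value to a name
temperature_c = 100

# `==` compares two values and evaluates to True or False
is_boiling = temperature_c == 100
print(is_boiling)

if temperature_c == 100:
    print("Water boils at this temperature (at 1 atm)")

# Writing `if temperature_c = 100:` instead would raise a SyntaxError.

# Floats that come out of a calculation may not be exactly equal,
# so math.isclose() is one way to compare them with a tolerance.
print(0.1 + 0.2 == 0.3)
print(math.isclose(0.1 + 0.2, 0.3))
```

Values typed in directly, such as a pKa of 4.76 and a pH of 4.76, compare equal with `==`; the tolerance-based comparison matters mainly for computed values.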

In [None]:
# your answer here

## For Loops
- **Purpose**: Designed for iterating over sequences (like lists, tuples, strings) or ranges.
- **Structure**: Defined with a `for` keyword, followed by a variable name and a sequence.
- **Behavior**: Iterates over each item in the sequence, executing the block of code for each item.
- **Use Case**: Useful for operations that require action on every element of a sequence, such as data processing or aggregation tasks.

In [None]:
# Example: Calculating the average molecular weight of a dictionary of amino acids and molecular weights
amino_acids = {'Alanine': 89.1, 'Cysteine': 121.2, 'Aspartic Acid': 133.1}
total_weight = 0
for weight in amino_acids.values():
    total_weight += weight # this is equivalent to total_weight = total_weight + weight
average_weight = total_weight / len(amino_acids)
print("Average Molecular Weight:", average_weight)

### Exercise 2: For Loops
**Task**: Create a dictionary of molecule names and molecular weights.
- For each molecule in the dictionary, test whether the molecular weight is above or below a cutoff value

In [None]:
# Your answer here

## While Loops
- **Purpose**: Executes a block of code repeatedly as long as a condition is true.
- **Structure**: Starts with the `while` keyword, followed by a condition.
- **Behavior**: Continuously executes the code block until the condition becomes false.
- **Use Case**: Best suited for scenarios where the number of iterations is not known in advance, like waiting for a specific event or condition.

In [None]:
# Example: Simulating a reaction until a substrate concentration falls below a threshold
substrate_concentration = 100  # initial concentration
product_concentration = 0
reaction_rate = 0.1  # rate of conversion of substrate to product

while substrate_concentration > 10:
    # Compute the amount converted in this step once, so that the same amount
    # is removed from the substrate and added to the product
    converted = reaction_rate * substrate_concentration
    substrate_concentration -= converted
    product_concentration += converted
    print(f"Substrate: {substrate_concentration:.2f}, Product: {product_concentration:.2f}")

### Exercise 3: While Loops
**Task**: Create a while loop to simulate a reaction that involves two substrates combining to form a single product.
- Include a variable that defines the substrate concentration threshold
- Print only the ratio of substrate/product rather than each value separately

In [None]:
# Your Answer Here

## Quick Review and a Few Additional Points
The previous primer focused on conditional statements and loops. Here we'll add a few additional concepts that are useful for error handling.

**Review**:

1. Conditional Statements (`if`, `elif`, `else`)

- **`if` Statement**: Used to execute a block of code if a certain condition is true.
- **`elif` (else if) Statement**: Follows an `if` statement and is used to check multiple conditions, one after the other.
- **`else` Statement**: Used after `if` and `elif` statements to define a block of code to be executed if all previous conditions are false.

2. Loops

- **`for` Loop**: Used for iterating over a sequence (like a list, tuple, dictionary, set, or string). It visits each item directly, so it behaves like a "for each" loop in other programming languages rather than a counter-based loop.
- **`while` Loop**: Executes a set of statements as long as a condition is true.

**More Advanced Control-Flow Concepts**  

3. Loop Control Statements

- **`break`**: Used to exit the loop entirely.
- **`continue`**: Skips the current iteration of the loop and moves to the next iteration.
- **`pass`**: Does nothing; it's a placeholder to avoid syntax errors when a statement is required syntactically but no code needs to be executed.
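
As a rough illustration of the three statements above, the sketch below filters a list of pH readings. The data and the names `ph_readings`, `valid_readings`, and `calibrate_probe` are hypothetical and chosen only for this example.

```python
# Hypothetical data: None marks a missing reading
ph_readings = [7.2, None, 6.8, 14.5, 7.0]

valid_readings = []
for reading in ph_readings:
    if reading is None:
        continue  # skip this iteration and move on to the next reading
    if reading > 14:
        print(f"Out-of-range reading {reading}; stopping early")
        break  # exit the loop entirely; later readings are not processed
    valid_readings.append(reading)

print(valid_readings)


def calibrate_probe():
    pass  # placeholder body: the function is valid but does nothing yet
```

The final reading (7.0) is never reached because `break` ends the loop at the out-of-range value.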

4. Nested Control Structures

* Control structures can be nested within each other. For example, an `if` statement inside a `for` loop, or a `for` loop inside a `while` loop.
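
A small sketch of nesting, assuming two short made-up sequences and an illustrative (not exhaustive) set of hydrophobic residues: an `if` statement sits inside a `for` loop, which itself sits inside another `for` loop.

```python
sequences = ["MKQLED", "GAVLI"]   # hypothetical example sequences
hydrophobic = set("AVLIMFW")      # illustrative subset of hydrophobic residues

hydrophobic_counts = {}
for sequence in sequences:            # outer loop: one pass per sequence
    count = 0
    for residue in sequence:          # inner loop: one pass per residue
        if residue in hydrophobic:    # conditional nested inside both loops
            count += 1
    hydrophobic_counts[sequence] = count

print(hydrophobic_counts)
```

Each level of nesting adds one level of indentation, which is how Python knows which block a statement belongs to.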

5. Exception Handling

- **`try` and `except` Blocks**: Used to catch and handle exceptions (errors) in Python. The `try` block lets you test a block of code for errors, while the `except` block lets you handle the error. (There will be a separate primer on these)
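
As a brief preview, here is a minimal sketch of `try` and `except`, assuming concentration values arrive as text. The function name `parse_concentration` and the sample values are hypothetical.

```python
def parse_concentration(text):
    """Convert text to a float, or return None if it is not a number."""
    try:
        return float(text)  # raises ValueError for text such as "n/a"
    except ValueError:
        print(f"Could not convert {text!r} to a number")
        return None


raw_values = ["0.5", "1.2", "n/a"]  # hypothetical input, e.g. read from a file
parsed = []
for value in raw_values:
    parsed.append(parse_concentration(value))

print(parsed)
```

Without the `try` block, the first unconvertible value would stop the program with an error; here the loop keeps going and records `None` instead.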

6. Function Calls

- Functions are blocks of reusable code. A typical program defines functions and then builds a workflow that calls those functions to perform the desired tasks.
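
One way to picture such a workflow is sketched below: two small functions are defined and a third calls them in turn. All three function names are assumptions made for this example.

```python
def celsius_to_kelvin(temp_c):
    """Convert a temperature from degrees Celsius to kelvin."""
    return temp_c + 273.15


def is_above_freezing(temp_k):
    """Return True if the temperature (in kelvin) is above water's freezing point."""
    return temp_k > 273.15


def describe_sample(temp_c):
    """Small workflow: convert the temperature, then branch on the result."""
    temp_k = celsius_to_kelvin(temp_c)
    if is_above_freezing(temp_k):
        return f"{temp_k:.2f} K: above freezing"
    return f"{temp_k:.2f} K: at or below freezing"


print(describe_sample(25))
print(describe_sample(-5))
```

Control passes into each function when it is called and returns to the caller at the `return` statement, so function calls are part of a program's control flow.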

### Advanced example
- Let's sew together some of these ideas to do something a little more complex.
- This example showcases a protein size classifying algorithm that uses a `lookup table` of amino acid molecular weights and some size cutoffs to make the classification. 

In [None]:
# Let's assume we have a list of protein sequences
# Here we're inputting them manually, but they could have been imported from a file
protein_sequences = ["MKQLEDKVEELLSKNYHLENEVARLKKLV", "MTEEYKLVVVGAGGVGKSALTIQLIQNHFVDEYDPTIEDSYRKQVVIDGETCLLDILDTAGQEEYSAMRDQYMRTGEGFLCVFAINNTKSFEDIHQYREQIKRVKDSDDVPMVLVGNKCDLPARTVETRQAQDLARSYGIPYIETSAKTRQGVEDAFYTLVREIRQYRMKKLNSSDDGTQGEGENS"]

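# Function to calculate the molecular weight of a protein sequence from a lookup table of AA mol weights
# Note: this simple sum of free amino acid weights ignores the water lost when each
# peptide bond forms, so it overestimates the true mass of the protein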
def calculate_molecular_weight(sequence):
    molecular_weights = {
        'A': 89.1, 'R': 174.2, 'N': 132.1, 'D': 133.1, 'C': 121.2,
        'E': 147.1, 'Q': 146.2, 'G': 75.1, 'H': 155.2, 'I': 131.2,
        'L': 131.2, 'K': 146.2, 'M': 149.2, 'F': 165.2, 'P': 115.1,
        'S': 105.1, 'T': 119.1, 'W': 204.2, 'Y': 181.2, 'V': 117.1
    }
    weight = 0
    for amino_acid in sequence:
        # .get() returns the weight stored for this key, or the default 0 if the letter is not in the table
        weight += molecular_weights.get(amino_acid, 0)
    return weight

# Iterate over each protein sequence
for sequence in protein_sequences:
    # Calculate the molecular weight
    weight = calculate_molecular_weight(sequence)
    
    # Check which size range the molecular weight falls in
    # (:.1f rounds the printed weight to one decimal place)
    if 2000 < weight < 5000:
        print(f"Sequence: {sequence}")
        print(f"Molecular Weight: {weight:.1f} - Medium sized protein")
    elif weight >= 5000:
        print(f"Sequence: {sequence}")
        print(f"Molecular Weight: {weight:.1f} - Large sized protein")
    else:
        print(f"Sequence: {sequence}")
        print(f"Molecular Weight: {weight:.1f} - Small sized protein")

# Example of a while loop to find the first large protein
index = 0
while index < len(protein_sequences): # What do you think the len() function does?
    weight = calculate_molecular_weight(protein_sequences[index])
    if weight >= 5000:
        print(f"First large protein found at index {index}: {protein_sequences[index]}")
        break
    index += 1
else:
    # The else block of a while loop runs only if the loop ended without hitting break
    print("No large protein found in the list.")


## Advanced Exercises  

1. Exercise 1: Protein Sequence Analysis

- Given a list of protein sequences, write a Python function to classify each protein based on its length:
- Short (length < 50)
- Medium (length between 50 and 100)
- Long (length > 100)

2. Exercise 2: Amino Acid Count
   
- Write a function that takes a protein sequence and returns a dictionary with the count of each amino acid in the sequence.

3. Exercise 3: Molecular Weight Calculator Enhancement

- Modify the `calculate_molecular_weight` function from the previous example to print an error message if an unrecognized single-letter amino acid abbreviation is encountered.

4. Exercise 4: Finding the First Large Protein with a While Loop
 
- Using a `while` loop, iterate through a list of protein sequences and print the sequence of the first protein that is classified as 'Large' based on the molecular weight cutoff parameter.


In [None]:
# Your Answers Here; Create new cells as needed

### Solutions

In [None]:
# Exercise 1: Protein Sequence Analysis
def classify_protein_by_length(protein_sequences):
    for sequence in protein_sequences:
        if len(sequence) < 50:
            print(f"Sequence: {sequence} - Short")
        elif 50 <= len(sequence) <= 100:
            print(f"Sequence: {sequence} - Medium")
        else:
            print(f"Sequence: {sequence} - Long")

protein_sequences = ["MKQLEDKVEELLSKNYHLENEVARLKKLV", "MTEEYKLVVVGAGGVGKSALTIQLIQNHFVDEYDPTIEDSYRKQVVIDGETCLLDILDTAGQEEYSAMRDQYMRTGEGFLCVFAINNTKSFEDIHQYREQIKRVKDSDDVPMVLVGNKCDLPARTVETRQAQDLARSYGIPYIETSAKTRQGVEDAFYTLVREIRQYRMKKLNSSDDGTQGEGENS", "MPMILGYGNGKTT"]
classify_protein_by_length(protein_sequences)

# Exercise 2: Amino Acid Count
def amino_acid_count(sequence):
    count_dict = {}
    for amino_acid in sequence:
        if amino_acid in count_dict:
            count_dict[amino_acid] += 1
        else:
            count_dict[amino_acid] = 1
    return count_dict

sequence = "MKQLEDKVEELLSKNYHLENEVARLKKLV"
print(amino_acid_count(sequence))

# Exercise 3: Molecular Weight Calculator Enhancement
def calculate_molecular_weight(sequence):
    molecular_weights = {
        'A': 89.1, 'R': 174.2, 'N': 132.1, 'D': 133.1, 'C': 121.2,
        'E': 147.1, 'Q': 146.2, 'G': 75.1, 'H': 155.2, 'I': 131.2,
        'L': 131.2, 'K': 146.2, 'M': 149.2, 'F': 165.2, 'P': 115.1,
        'S': 105.1, 'T': 119.1, 'W': 204.2, 'Y': 181.2, 'V': 117.1
    }
    weight = 0
    for amino_acid in sequence:
        if amino_acid in molecular_weights:
            weight += molecular_weights[amino_acid]
        else:
            print(f"Unknown amino acid {amino_acid} encountered.")
            return None
    return weight

# Exercise 4: Finding the First Large Protein with a While Loop
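def find_first_large_protein(protein_sequences, cutoff=5000):
    index = 0
    while index < len(protein_sequences):
        # Classify by molecular weight, as the exercise asks, rather than by length
        weight = calculate_molecular_weight(protein_sequences[index])
        # weight is None if the sequence contained an unknown amino acid
        if weight is not None and weight >= cutoff:
            print(f"First large protein found at index {index}: {protein_sequences[index]}")
            return
        index += 1
    print("No large protein found in the list.")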

find_first_large_protein(protein_sequences)
