# Python Level Assessment
*Discover your programming baseline through problem-solving*

## Instructions
Work through these problems at your own pace. **Don't look anything up** - I want to see your natural problem-solving approach. Skip problems that seem too advanced.

This assessment will help me understand your current skills and create a personalized learning plan.

---

## Problem 1: Cell Culture Data (Basic Variables & Types)

A lab tech recorded cell counts from 3 culture dishes:
- Dish A: 1,234,567 cells
- Dish B: 987,321 cells  
- Dish C: 1,456,789 cells

**Tasks:**
1. Store these values in appropriately named variables
2. Calculate the total cell count
3. Find the average cells per dish
4. Determine which dish has the most cells

In [4]:
# Your solution here:
#1.
dish_A = 1234567
dish_B = 987321
dish_C = 1456789

#2.
total_cell_count = dish_A + dish_B + dish_C
print(total_cell_count)

#3.
import numpy as np

avg_cells = np.mean([dish_A, dish_B, dish_C])
print(avg_cells)

#4.
# Use "and" for logical tests; "&" is bitwise and binds tighter than ">".
if dish_A > dish_B and dish_A > dish_C:
    print("dish_A has the most cells")
elif dish_B > dish_A and dish_B > dish_C:
    print("dish_B has the most cells")
else:
    print("dish_C has the most cells")


3678677
1226225.6666666667
dish_C has the most cells
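
A note on combining comparisons: in Python, `&` is the bitwise operator and binds more tightly than `>`, so `a > b & a > c` is parsed as the chained comparison `a > (b & a) > c`, whereas `and` combines the two tests logically. The sketch below illustrates the difference with small made-up numbers and shows one possible way to do tasks 2 to 4 with a dictionary and the built-ins `sum` and `max`. The names `dishes`, `largest`, `logical`, and `bitwise` are illustrative only.

```python
from statistics import mean

# Cell counts from the problem statement, keyed by dish name.
dishes = {"dish_A": 1234567, "dish_B": 987321, "dish_C": 1456789}

total = sum(dishes.values())
average = mean(dishes.values())
largest = max(dishes, key=dishes.get)  # key whose count is highest
print(total, average, largest)

# Precedence check with small hypothetical numbers.
a, b, c = 5, 3, 4
logical = a > b and a > c  # (5 > 3) and (5 > 4)
bitwise = a > b & a > c    # parsed as 5 > (3 & 5) > 4, i.e. 5 > 1 > 4
print(logical, bitwise)
```

Using `max` with a `key` function also scales to any number of dishes without adding more `elif` branches.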


## Problem 2: DNA Sequence Analysis (Strings)

You have a DNA sequence: `"ATCGATCGTAGCTAGCTA"`

**Tasks:**
1. Count how many times each nucleotide (A, T, C, G) appears
2. Calculate the GC content (percentage of G and C nucleotides)
3. Find the complement sequence (A↔T, C↔G)
4. Check if the sequence contains the start codon "ATG"

In [14]:
dna_sequence = "ATCGATCGTAGCTAGCTA"

# Your solution here:
#1.
A = dna_sequence.count('A')
T = dna_sequence.count('T')
G = dna_sequence.count('G')
C = dna_sequence.count('C')

print(f"There are {A} Adenine in {dna_sequence}")
print(f"There are {T} Thymine in {dna_sequence}")
print(f"There are {G} Guanine in {dna_sequence}")
print(f"There are {C} Cytosine in {dna_sequence}")

#2.
dna_length = len(dna_sequence)
GC = ((G+C)/dna_length)*100
print(f"{dna_sequence} has a GC content of {GC}%")

#3.
complement = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A'}
complement_dna_sequence = "".join(complement[base] for base in dna_sequence)
print(complement_dna_sequence)

#4.
if "ATG" in dna_sequence:
    print("Sequence contains ATG")
else:
    print("Sequence does not contain ATG")


There are 5 Adenine in ATCGATCGTAGCTAGCTA
There are 5 Thymine in ATCGATCGTAGCTAGCTA
There are 4 Guanine in ATCGATCGTAGCTAGCTA
There are 4 Cytosine in ATCGATCGTAGCTAGCTA
ATCGATCGTAGCTAGCTA has a GC content of 44.44444444444444%
TAGCTAGCATCGATCGAT
Sequence does not contain ATG
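
As an alternative sketch for tasks 1 to 3, the standard library offers `collections.Counter` for tallying characters and `str.maketrans` / `str.translate` for character-by-character substitution. The variable names below are illustrative, and the sketch assumes the sequence contains only the letters A, T, C, and G.

```python
from collections import Counter

dna_sequence = "ATCGATCGTAGCTAGCTA"

# Tally every nucleotide in a single pass.
counts = Counter(dna_sequence)

# GC content as a percentage of the sequence length.
gc_content = 100 * (counts["G"] + counts["C"]) / len(dna_sequence)

# Build a translation table A<->T, C<->G and apply it.
complement_table = str.maketrans("ATCG", "TAGC")
complement = dna_sequence.translate(complement_table)

print(dict(counts))
print(f"GC content: {gc_content:.1f}%")
print(complement)
```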


## Problem 3: Experimental Data (Lists & Loops)

You measured enzyme activity at different temperatures (°C):
`temperatures = [25, 30, 35, 40, 45, 50, 55]`
`activities = [12, 18, 24, 31, 28, 19, 8]`

**Tasks:**
1. Find the temperature with maximum activity
2. Calculate average activity across all temperatures
3. Count how many temperatures gave activity > 20
4. Create a list of "high activity" temperatures (activity > 25)

In [25]:
temperatures = [25, 30, 35, 40, 45, 50, 55]
activities = [12, 18, 24, 31, 28, 19, 8]

# Your solution here:
#1.
max_activity = max(activities)
max_temp = temperatures[activities.index(max_activity)]
print(f"The maximum activity of {max_activity} is reached at {max_temp}C")

#2. 
import numpy as np
avg_activity = np.mean(activities)
print(f"The average activity is {avg_activity}")

#3.
count = sum(1 for i in activities if i > 20)
print(count)

#4. 
high_temperatures = [temp for temp, act in zip(temperatures, activities) if act > 25]
print(high_temperatures)

The maximum activity of 31 is reached at 40C
The average activity is 20.0
3
[40, 45]


## Problem 4: Growth Curve Function (Functions)

Bacterial growth follows the equation: `population = initial_pop * 2^(time/doubling_time)`

**Tasks:**
1. Write a function called `bacterial_growth` that takes initial population, time, and doubling time as parameters
2. Calculate population after 6 hours with initial=1000, doubling_time=2 hours
3. Modify your function to also return the growth rate (population/time)
4. Use your function to find when population reaches 50,000 (try different times)

In [37]:
# Your solution here:
#1. 
def bacterial_growth(initial, time, doubling_time):
    population = initial * 2**(time/doubling_time)
    return population

#2. 
print(bacterial_growth(1000, 6, 2))

#3.
def bacterial_growth_rate(initial, time, doubling_time):
    population = initial * 2**(time/doubling_time)
    growth_rate = population/time
    return population, growth_rate

#4. 
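# Print (population, growth_rate) for each whole hour from 1 to 20.
for hour in range(1, 21):
    result = bacterial_growth_rate(1000, hour, 2)
    print(result)

# About 45,255 cells at hour 11 and 64,000 at hour 12,
# so the 50,000 threshold falls between those two hours.
print(bacterial_growth(1000, 12, 2))
print("50,000 cells reached after 11 to 12 hours")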

8000.0
(1414.213562373095, 1414.213562373095)
(2000.0, 1000.0)
(2828.42712474619, 942.8090415820634)
(4000.0, 1000.0)
(5656.85424949238, 1131.370849898476)
(8000.0, 1333.3333333333333)
(11313.70849898476, 1616.2440712835373)
(16000.0, 2000.0)
(22627.41699796952, 2514.1574442188357)
(32000.0, 3200.0)
(45254.83399593904, 4114.075817812641)
(64000.0, 5333.333333333333)
(90509.66799187809, 6962.282153221391)
(128000.0, 9142.857142857143)
(181019.33598375617, 12067.955732250412)
(256000.0, 16000.0)
(362038.67196751235, 21296.392468677197)
(512000.0, 28444.444444444445)
(724077.3439350247, 38109.33389131709)
(1024000.0, 51200.0)
64000.0
50,000 cells reached after
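
Instead of trying different times, the growth equation can be solved for time directly: rearranging `population = initial_pop * 2^(time/doubling_time)` gives `time = doubling_time * log2(population / initial_pop)`. The sketch below assumes a hypothetical helper named `time_to_reach`, which is not part of the assessment.

```python
import math


def time_to_reach(target, initial, doubling_time):
    """Return the time at which the population equals `target`."""
    return doubling_time * math.log2(target / initial)


hours = time_to_reach(50_000, 1000, 2)
print(f"{hours:.2f} hours")

# Sanity check: plugging the result back in recovers the target.
check = 1000 * 2 ** (hours / 2)
print(round(check))
```

The result lies between 11 and 12 hours, which agrees with the hourly table above, where the population is still below 50,000 at hour 11 and above it at hour 12.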


## Problem 5: Patient Data Dictionary (Dictionaries)

You have patient genetic test results:
```python
patient_data = {
    "P001": {"age": 34, "gene_variants": ["BRCA1", "TP53"], "risk_score": 0.73},
    "P002": {"age": 45, "gene_variants": ["BRCA2"], "risk_score": 0.56},
    "P003": {"age": 29, "gene_variants": [], "risk_score": 0.12}
}
```

**Tasks:**
1. Find all patients with risk_score > 0.5
2. Calculate average age of high-risk patients
3. Count total unique gene variants across all patients
4. Add a new patient "P004" with your chosen values

In [None]:
patient_data = {
    "P001": {"age": 34, "gene_variants": ["BRCA1", "TP53"], "risk_score": 0.73},
    "P002": {"age": 45, "gene_variants": ["BRCA2"], "risk_score": 0.56},
    "P003": {"age": 29, "gene_variants": [], "risk_score": 0.12}
}

# Your solution here:
#1. 
high_risk_patients = {patient_id: data for patient_id, data in patient_data.items() if data["risk_score"] > 0.5}
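
Tasks 2 to 4 are not attempted above. One possible sketch, under the assumption that at least one patient is high-risk (otherwise the average would divide by zero), is shown below. The values chosen for "P004" are invented purely for illustration.

```python
patient_data = {
    "P001": {"age": 34, "gene_variants": ["BRCA1", "TP53"], "risk_score": 0.73},
    "P002": {"age": 45, "gene_variants": ["BRCA2"], "risk_score": 0.56},
    "P003": {"age": 29, "gene_variants": [], "risk_score": 0.12},
}

# Task 1: keep only patients whose risk score exceeds 0.5.
high_risk = {pid: d for pid, d in patient_data.items() if d["risk_score"] > 0.5}

# Task 2: average age of the high-risk group.
avg_age = sum(d["age"] for d in high_risk.values()) / len(high_risk)

# Task 3: a set removes duplicate variant names automatically.
unique_variants = {v for d in patient_data.values() for v in d["gene_variants"]}

# Task 4: add a new record (made-up values).
patient_data["P004"] = {"age": 52, "gene_variants": ["BRCA1"], "risk_score": 0.41}

print(sorted(high_risk), avg_age, len(unique_variants))
```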

## Problem 6: Species Classification (Classes) - Advanced

**Task:** Create a `Species` class to represent biological organisms

Requirements:
1. Initialize with name, kingdom, and population
2. Method to check if species is endangered (population < 10,000)
3. Method to simulate population growth (multiply by growth factor)
4. Create 2 species instances and test your methods

In [None]:
# Your solution here:
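class Species:
    # Constructor arguments go on __init__, not on the class line.
    def __init__(self, name, kingdom, population):
        self.name = name
        self.kingdom = kingdom
        self.population = population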
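
A fuller sketch covering all four requirements might look like the following. The method names `is_endangered` and `grow`, the class constant `ENDANGERED_THRESHOLD`, and the population figures are illustrative assumptions, not values taken from the assessment.

```python
class Species:
    """A biological species with a name, kingdom, and population size."""

    ENDANGERED_THRESHOLD = 10_000  # from requirement 2

    def __init__(self, name, kingdom, population):
        self.name = name
        self.kingdom = kingdom
        self.population = population

    def is_endangered(self):
        """Return True if the population is below the threshold."""
        return self.population < self.ENDANGERED_THRESHOLD

    def grow(self, growth_factor):
        """Multiply the population by growth_factor and return the new size."""
        self.population = int(self.population * growth_factor)
        return self.population


# Two instances with made-up population figures.
tiger = Species("Panthera tigris", "Animalia", 4_500)
e_coli = Species("Escherichia coli", "Bacteria", 1_000_000)

print(tiger.is_endangered(), e_coli.is_endangered())
print(tiger.grow(1.5), e_coli.grow(2))
```

Keeping the threshold as a class attribute means it can be changed in one place if the definition of "endangered" changes.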

## Problem 7: Data Analysis Challenge (Pandas/NumPy) - Advanced

**Skip if you haven't used pandas/numpy before**

Create some mock gene expression data and analyze it:
1. Generate data for 100 genes across 20 samples
2. Find genes with highest average expression
3. Identify samples with most similar expression patterns
4. Create a simple visualization

In [38]:
# Your solution here (import libraries as needed):
#1. 
import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randint(0,1000, size=(100,20)))
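
Tasks 2 and 3 could be approached along the following lines. This is a sketch under several assumptions: rows are genes and columns are samples, "most similar" is taken to mean the highest Pearson correlation between two sample columns, and the labels `gene_0`, `sample_0`, and so on are invented. A fixed seed is used only to make the mock data reproducible. Task 4 (the plot) is left out because it would need a plotting library such as matplotlib.

```python
import numpy as np
import pandas as pd

# Task 1: mock expression matrix, 100 genes (rows) x 20 samples (columns).
rng = np.random.default_rng(seed=0)
expression = pd.DataFrame(
    rng.integers(0, 1000, size=(100, 20)),
    index=[f"gene_{i}" for i in range(100)],
    columns=[f"sample_{j}" for j in range(20)],
)

# Task 2: the five genes with the highest mean expression across samples.
top_genes = expression.mean(axis=1).nlargest(5)

# Task 3: pairwise Pearson correlation between samples.
corr = expression.corr()
corr_values = corr.to_numpy().copy()
np.fill_diagonal(corr_values, -np.inf)  # ignore each sample's self-correlation
i, j = np.unravel_index(np.argmax(corr_values), corr_values.shape)
most_similar = (corr.index[i], corr.columns[j])

print(top_genes)
print(most_similar)
```

Because the data are random, the correlations will be close to zero; with real expression data the same steps would surface genuinely related samples.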



---

## Reflection Questions

**Answer in the cell below:**
1. Which problems felt most challenging?
2. What Python concepts do you want to improve most?
3. How comfortable are you with biological data analysis?
4. What specific biology topics interest you most?

**Your reflection here:**

1. Most challenging problems: I haven't worked much with dictionaries. I also struggled with classes, as I have never been taught about them. The last advanced topic was also difficult.

2. Python concepts to improve: I want to strengthen the basic concepts first, as needed.

3. Biological data analysis comfort level: Should be okay.

4. Biology topics of interest: Doesn't matter; not all examples have to be related to biology.


---

## Next Steps

After completing this assessment:
1. Save this notebook with your solutions
2. Ask Claude Code to review your work and create your personalized learning plan
3. Start with your first recommended exercise

*Ready to begin learning Python?*