# Python Lists: Organizing Data

In biology, we rarely work with single measurements. We have:
- Multiple samples to track
- Series of concentrations to test
- Collections of gene names
- Sets of experimental conditions

Python lists are a natural fit for this kind of data: a list keeps many values in order under a single name.

## What are Lists?

A list is an ordered collection of items. Think of lists as:
- A 96-well plate with samples in each well
- A gel with multiple lanes
- A lab notebook with a list of reagents
- A spreadsheet column of data

Lists are created with square brackets `[]`, with items separated by commas:

In [None]:
# Creating different types of lists
sample_names = ["Control", "Treatment1", "Treatment2", "Treatment3"]
concentrations = [0, 5, 10, 25, 50, 100]  # μM concentrations
temperatures = [37.0, 42.0, 50.0, 65.0, 95.0]  # °C
dna_bases = ["A", "T", "G", "C"]

print("Sample names:", sample_names)
print("Concentrations (μM):", concentrations)
print("Temperatures (°C):", temperatures)
print("DNA bases:", dna_bases)

## Accessing Items in Lists

**Important**: Python starts counting from 0, not 1!

- First item = index 0
- Second item = index 1
- Third item = index 2
- etc.

In [None]:
# Accessing individual items
samples = ["Control", "Drug_A", "Drug_B", "Drug_C"]

print("First sample (index 0):", samples[0])
print("Second sample (index 1):", samples[1])
print("Fourth sample (index 3):", samples[3])

# Negative indices count from the end
print("Last sample (index -1):", samples[-1])
print("Second to last (index -2):", samples[-2])

## Useful List Properties

In [None]:
# How many items are in a list?
gene_list = ["BRCA1", "TP53", "EGFR", "MYC", "PTEN"]

print("Number of genes:", len(gene_list))
print("First gene:", gene_list[0])
print("Last gene:", gene_list[len(gene_list) - 1])  # Or just gene_list[-1]

# Check if something is in the list
print("Is BRCA1 in our list?", "BRCA1" in gene_list)
print("Is GAPDH in our list?", "GAPDH" in gene_list)
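
Because counting starts at 0, the largest valid index is always `len(my_list) - 1`. The sketch below illustrates this with the same gene names. The `try`/`except` construct is not covered in this notebook; it is used here only so that the error is caught and printed instead of stopping the cell.

```python
# Same gene names as in the cell above
gene_list = ["BRCA1", "TP53", "EGFR", "MYC", "PTEN"]

# 5 items, so the valid indices run from 0 to 4
last_index = len(gene_list) - 1
print("Last valid index:", last_index)
print("Item at last valid index:", gene_list[last_index])

# Asking for index 5 (one past the end) raises an IndexError
try:
    gene_list[len(gene_list)]
    out_of_range = False
except IndexError as error:
    out_of_range = True
    print("IndexError:", error)
```

If you see an `IndexError` in your own code, a good first check is whether the index is larger than `len(my_list) - 1`.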

## Exercise 1: Your First Lists

Create some biological lists and practice accessing them:

In [None]:
# Create a list of amino acids (single letter codes)
amino_acids = []  # YOUR CODE HERE: replace [] with your list

# Create a list of pH values you might test
ph_values = []  # YOUR CODE HERE: replace [] with your list

# Print the first amino acid
# YOUR CODE HERE

# Print the last pH value
# YOUR CODE HERE

# How many amino acids are in your list?
# YOUR CODE HERE

# Is pH 7.0 in your pH list?
# YOUR CODE HERE

## Lists of Lists (2D Data)

For tabular or multi-valued data, we often need lists inside lists. Typical uses:
- Each row in a data table
- Multiple measurements per sample
- Coordinates (x, y pairs)
- Experimental conditions with multiple parameters

In [None]:
# Experiment data: each inner list is [sample_name, drug_concentration, cell_viability]
experiment_results = [
    ["Control", 0, 100],
    ["Drug_A", 1, 89],
    ["Drug_A", 5, 76],
    ["Drug_A", 10, 45],
    ["Drug_B", 1, 95],
    ["Drug_B", 5, 82],
    ["Drug_B", 10, 71]
]

print("All results:")
print(experiment_results)

print("\nFirst experiment:")
print(experiment_results[0])  # Gets the first inner list

print("\nAccessing specific values:")
print("First experiment sample name:", experiment_results[0][0])
print("First experiment concentration:", experiment_results[0][1])
print("First experiment viability:", experiment_results[0][2])
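
One detail worth noting about nested lists: the `in` operator compares against the items of the outer list, which are themselves whole inner lists. The sketch below uses a shortened, hypothetical copy of the data above (named `results`) to show the difference.

```python
# Shortened copy of the experiment data: [sample_name, drug_concentration, cell_viability]
results = [
    ["Control", 0, 100],
    ["Drug_A", 1, 89],
]

# The outer list contains inner lists, not strings, so this is False
name_in_outer = "Control" in results
print('"Control" in results:', name_in_outer)

# A complete inner list can be found in the outer list
row_in_outer = ["Control", 0, 100] in results
print('["Control", 0, 100] in results:', row_in_outer)

# To look for a name, check inside a specific inner list
name_in_row = "Control" in results[0]
print('"Control" in results[0]:', name_in_row)
```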

## More Complex Examples

In [None]:
# DNA primer information: [name, sequence, melting_temp, concentration]
primers = [
    ["GAPDH_F", "GTCAACGGATTTGGTCTGTATT", 58.2, 10],
    ["GAPDH_R", "AGTCTTCTGGGTGGCAGTGAT", 60.1, 10],
    ["ACTB_F", "CATGTACGTTGCTATCCAGGC", 61.5, 5],
    ["ACTB_R", "CTCCTTAATGTCACGCACGAT", 59.8, 5]
]

# Access different pieces of information
print("Primer inventory:")
print(f"First primer name: {primers[0][0]}")
print(f"First primer sequence: {primers[0][1]}")
print(f"First primer Tm: {primers[0][2]}°C")
print(f"First primer concentration: {primers[0][3]} μM")

print(f"\nSecond primer name: {primers[1][0]}")
print(f"Last primer in list: {primers[-1][0]}")
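
A value pulled out of a nested list behaves like any other value, so it can be stored in a variable or passed to a function. As a minimal sketch, the example below copies the first primer from the cell above and measures its sequence length with `len()`; the variable names `first_primer` and `primer_length` are chosen here for illustration.

```python
# First primer from the cell above: [name, sequence, melting_temp, concentration]
first_primer = ["GAPDH_F", "GTCAACGGATTTGGTCTGTATT", 58.2, 10]

# Index 1 holds the sequence string; len() counts its bases
primer_sequence = first_primer[1]
primer_length = len(primer_sequence)

print(f"{first_primer[0]} is {primer_length} bases long")
print(f"First base: {primer_sequence[0]}, last base: {primer_sequence[-1]}")
```

Strings support the same index notation as lists, which is why `primer_sequence[0]` and `primer_sequence[-1]` work here.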

## Exercise 2: Working with 2D Lists

Create and work with complex biological data:

In [None]:
# Create a list of patient data: [patient_id, age, blood_pressure, cholesterol]
patients = [
    # Add at least 4 patients with different values
    # YOUR CODE HERE
]

# Print information about the first patient
print("First patient info:")
print(f"ID: {patients[0][0]}")
print(f"Age: {patients[0][1]} years")
# Add more print statements for blood pressure and cholesterol
# YOUR CODE HERE

# Print the cholesterol level of the last patient
# YOUR CODE HERE

## Exercise 3: Stock Solution Data

Let's work with the reagent data from our lecture:

In [None]:
# Reagent data: [name, molecular_weight, mass_weighed_mg]
reagents = [
    ["MG132", 475.6, 89.5],
    ["Rapamycin", 914.2, 125.3],
    ["Cycloheximide", 281.4, 45.8],
    ["Staurosporine", 466.5, 78.2]
]

# Practice accessing the data:
# 1. Print the name of the second reagent
# YOUR CODE HERE

# 2. Print the molecular weight of the last reagent
# YOUR CODE HERE

# 3. Print all information about the third reagent in a nice format
# YOUR CODE HERE

# 4. Check if we have "MG132" in our reagent list
# Hint: You'll need to check the names (first element of each inner list)
# YOUR CODE HERE

## Adding and Modifying Lists

Lists are mutable, meaning they can be changed after creation:

In [None]:
# Start with a simple list
samples = ["Control", "Treatment1"]
print("Original list:", samples)

# Add new items
samples.append("Treatment2")
samples.append("Treatment3")
print("After adding items:", samples)

# Change an existing item
samples[1] = "Drug_A_5uM"
print("After changing item:", samples)

# Add multiple items at once
new_treatments = ["Drug_B_1uM", "Drug_B_10uM"]
samples.extend(new_treatments)
print("After extending:", samples)
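
A common stumbling block is using `append` where `extend` was intended. The sketch below starts from two identical lists (the names `with_append` and `with_extend` are chosen here for illustration) and shows how the results differ.

```python
with_append = ["Control", "Treatment1"]
with_extend = ["Control", "Treatment1"]
new_treatments = ["Drug_B_1uM", "Drug_B_10uM"]

# append adds its argument as ONE item, so the whole list becomes a nested item
with_append.append(new_treatments)
print("append:", with_append, "-> length", len(with_append))

# extend adds each item of its argument separately
with_extend.extend(new_treatments)
print("extend:", with_extend, "-> length", len(with_extend))
```

Use `append` for a single new item and `extend` when the new items are already collected in another list.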

## Exercise 4: Building an Experiment

Start with an empty list and build up your experiment:

In [None]:
# Start with an empty list of experimental conditions
conditions = []

# Add a control condition
# YOUR CODE HERE

# Add three different drug concentrations
# YOUR CODE HERE

# Print your final list
print("Experimental conditions:", conditions)
print(f"Number of conditions: {len(conditions)}")

# Change the second condition to something else
# YOUR CODE HERE

print("Updated conditions:", conditions)

## Real-World Example: Analyzing Multiple Samples

Let's use lists to organize and inspect absorbance data from a protein assay:

In [None]:
# Absorbance readings from a protein assay
sample_data = [
    ["Standard_0", 0.000],
    ["Standard_25", 0.125],
    ["Standard_50", 0.250],
    ["Standard_100", 0.485],
    ["Sample_A", 0.342],
    ["Sample_B", 0.198],
    ["Sample_C", 0.567]
]

print("Protein Assay Results:")
print("=" * 25)

# Let's examine our data
print(f"Total number of samples: {len(sample_data)}")
print(f"First sample: {sample_data[0][0]} - {sample_data[0][1]} AU")
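# Note: [-1] always means "last item", not "largest value".
# In this dataset the last entry happens to hold the highest reading.
print(f"Last sample (highest reading here): {sample_data[-1][0]} - {sample_data[-1][1]} AU")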

# Check which samples are above a threshold
threshold = 0.300
print(f"\nSamples above {threshold} AU:")

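# We'll manually check individual entries (in the next notebook, we'll use loops!)
# Indices 1, 2, and 5 are skipped here because their readings are below the threshold
if sample_data[0][1] > threshold:  # Standard_0: 0.000 is below the threshold, so nothing prints
    print(f"- {sample_data[0][0]}: {sample_data[0][1]}")
if sample_data[3][1] > threshold:  # Standard_100
    print(f"- {sample_data[3][0]}: {sample_data[3][1]}")
if sample_data[4][1] > threshold:  # Sample_A
    print(f"- {sample_data[4][0]}: {sample_data[4][1]}")
if sample_data[6][1] > threshold:  # Sample_C
    print(f"- {sample_data[6][0]}: {sample_data[6][1]}")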

## Exercise 5: Final Challenge

Create a comprehensive dataset and practice accessing it:

In [None]:
# Create a dataset of cell culture information
# Each entry should be: [cell_line, passage_number, confluency_percent, viability_percent]
cell_cultures = [
    # Add at least 5 different cell culture entries
    # YOUR CODE HERE
]

# Answer these questions using your data:

# 1. How many cell cultures do you have?
# YOUR CODE HERE

# 2. What's the name of the third cell line?
# YOUR CODE HERE

# 3. What's the confluency of the last culture?
# YOUR CODE HERE

# 4. Print complete information about the second culture in a nice format
# YOUR CODE HERE

# 5. Which culture has the highest passage number? (Check manually)
# YOUR CODE HERE

## Summary: What You've Learned

### Lists Basics
- **Create lists**: `my_list = [item1, item2, item3]`
- **Access items**: `my_list[0]` (first), `my_list[-1]` (last)
- **List length**: `len(my_list)`
- **Check membership**: `item in my_list`

### Lists of Lists
- **2D data**: `data = [["name1", value1], ["name2", value2]]`
- **Access nested data**: `data[0][1]` (first list, second item)
- **Real biological applications**: Patient data, experimental results, reagent information

### Modifying Lists
- **Add single item**: `my_list.append(item)`
- **Add multiple items**: `my_list.extend(other_list)`
- **Change item**: `my_list[0] = new_value`
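
As a quick recap, the short sketch below combines the operations listed above on a small nested list. The well IDs and absorbance values are made-up illustration data, not results from this notebook.

```python
# Hypothetical plate readings: [well_id, absorbance]
plate = [["A1", 0.12], ["A2", 0.34]]

plate.append(["A3", 0.56])  # add one new row at the end
plate[0][1] = 0.10          # change a value inside the first row

print("Number of wells:", len(plate))
print("Last well ID:", plate[-1][0])
print("Updated first row:", plate[0])
print("Is the A2 row present?", ["A2", 0.34] in plate)
```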

## Why This Matters

Lists are fundamental to biological data analysis. You will use them to:
- **Organize samples** from experiments
- **Store measurements** from multiple conditions
- **Track reagents** and their properties
- **Prepare data** for analysis and visualization

## Next Up: For Loops!

Now that you can organize data in lists, you'll learn how to **automatically process** every item in a list using for loops. No more checking each sample by hand: Python will do the work!