# L3_N1: Built-in Python Packages for Biology

Python ships with a standard library of useful built-in packages, so no installation is required!
Here are some that are particularly useful for biological research.

---

## 🎲 random - Simulations & Sampling

In [None]:
import random

# Generate random DNA sequence
bases = ['A', 'T', 'G', 'C']
random_dna = ''.join(random.choices(bases, k=20))
print(f"Random DNA: {random_dna}")

# Sample from a population
patients = [f'Patient_{i}' for i in range(1, 101)]
sample = random.sample(patients, 5)
print(f"Random sample: {sample}")
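
Simulations like the ones above give different results on every run. When a result needs to be repeatable, one option is to seed a dedicated generator. This is a minimal sketch; `make_random_dna` is a hypothetical helper name and the seed value is arbitrary.

```python
import random

def make_random_dna(length, seed=None):
    """Hypothetical helper: build a random DNA string from a seeded generator."""
    rng = random.Random(seed)  # local generator, leaves the global random state alone
    return ''.join(rng.choices('ATGC', k=length))

seq_a = make_random_dna(20, seed=42)
seq_b = make_random_dna(20, seed=42)
print(f"Same seed, same sequence: {seq_a == seq_b}")
```

Using `random.Random(seed)` rather than `random.seed(...)` keeps the seeding local to the helper, so other cells that rely on `random` are unaffected.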

## 📊 collections.Counter - Count Everything!

In [None]:
# collections.Counter - perfect for biology!
from collections import Counter

dna = "ATCGATCGATCGTAGC"
nucleotide_counts = Counter(dna)
print(nucleotide_counts)  # Counter({'A': 4, 'T': 4, 'C': 4, 'G': 4})

# Count amino acids in a protein
protein = "MKTAYAALKGKVALVTGAGGVGKSAMTMFYAGVKK"
aa_counts = Counter(protein)
print(f"Most common amino acids: {aa_counts.most_common(3)}")
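
Counts like these are often turned into fractions. As a small sketch, GC content can be computed directly from a `Counter`; `gc_content` is a hypothetical helper name, and the empty-sequence handling is an assumption about what is convenient here.

```python
from collections import Counter

def gc_content(seq):
    """Hypothetical helper: fraction of G and C bases in a DNA string."""
    counts = Counter(seq.upper())
    total = sum(counts.values())
    if total == 0:
        return 0.0  # assumption: treat an empty sequence as 0 rather than raising
    # Missing keys in a Counter return 0, so no KeyError if G or C is absent
    return (counts['G'] + counts['C']) / total

print(f"GC content: {gc_content('ATCGATCGATCGTAGC'):.2%}")  # GC content: 50.00%
```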

## 📈 statistics - Built-in Stats

In [None]:
# statistics - built-in stats without numpy
import statistics

measurements = [23.1, 24.5, 22.8, 23.9, 24.2]
print(f"Mean: {statistics.mean(measurements):.2f}")
print(f"Median: {statistics.median(measurements):.2f}")
print(f"StDev: {statistics.stdev(measurements):.2f}")

# Quick Pearson correlation coefficient (statistics.correlation requires Python 3.10+)
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
correlation = statistics.correlation(x, y)
print(f"Correlation: {correlation:.3f}")
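
Note that `statistics.stdev` is the sample standard deviation (it divides by n - 1). If the measurements are the whole population rather than a sample, `statistics.pstdev` divides by n instead. A short comparison on the same data, as a sketch:

```python
import statistics

measurements = [23.1, 24.5, 22.8, 23.9, 24.2]

sample_sd = statistics.stdev(measurements)       # divides by n - 1
population_sd = statistics.pstdev(measurements)  # divides by n

print(f"Sample StDev:     {sample_sd:.3f}")
print(f"Population StDev: {population_sd:.3f}")
```

For a handful of replicates the sample version is usually the one you want, and it is always the larger of the two.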

## ⏰ datetime - Experiment Tracking

In [None]:
# datetime - experiment tracking
from datetime import datetime, timedelta

experiment_start = datetime.now()
incubation_time = timedelta(hours=24)
harvest_time = experiment_start + incubation_time

print(f"Experiment started: {experiment_start.strftime('%Y-%m-%d %H:%M')}")
print(f"Harvest at: {harvest_time.strftime('%Y-%m-%d %H:%M')}")

# Time between measurements
measurement1 = datetime(2024, 1, 15, 9, 0)
measurement2 = datetime(2024, 1, 15, 15, 30)
time_diff = measurement2 - measurement1
print(f"Time between measurements: {time_diff}")
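
A `timedelta` prints as `H:MM:SS`, but for calculations a plain number of hours is often handier. One way to get it, sketched with the same two timestamps, is to divide by a one-hour `timedelta`:

```python
from datetime import datetime, timedelta

measurement1 = datetime(2024, 1, 15, 9, 0)
measurement2 = datetime(2024, 1, 15, 15, 30)

elapsed = measurement2 - measurement1
elapsed_hours = elapsed / timedelta(hours=1)  # timedelta / timedelta gives a float

print(f"Elapsed: {elapsed_hours:.1f} h")  # Elapsed: 6.5 h
```

`elapsed.total_seconds() / 3600` gives the same value if you prefer to work in seconds.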

## 📁 pathlib - File Management

In [None]:
from pathlib import Path

# Create paths that work on any operating system
data_dir = Path("experiment_data")
results_file = data_dir / "results.csv"
backup_file = data_dir / "backup" / "results_backup.csv"

print(f"Data directory: {data_dir}")
print(f"Results file: {results_file}")
print(f"File extension: {results_file.suffix}")
print(f"File stem: {results_file.stem}")

# Check if files exist (won't create actual files here)
print(f"File exists: {results_file.exists()}")
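
To see `pathlib` work on real files without cluttering your project, the sketch below uses a temporary directory that is deleted afterwards. The file names and contents are made up for illustration.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    tmp_data_dir = Path(tmp) / "experiment_data"
    tmp_data_dir.mkdir(parents=True, exist_ok=True)  # create the folder if needed

    # Illustrative files only; not real experiment data
    (tmp_data_dir / "plate1.csv").write_text("sample,absorbance\nA1,0.123\n")
    (tmp_data_dir / "plate2.csv").write_text("sample,absorbance\nA2,0.245\n")
    (tmp_data_dir / "notes.txt").write_text("incubated 24 h\n")

    # glob() finds files matching a pattern; sorted() makes the order predictable
    csv_names = sorted(p.name for p in tmp_data_dir.glob("*.csv"))

print(csv_names)  # ['plate1.csv', 'plate2.csv']
```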

## 📄 csv - Simple Data Files

In [None]:
import csv
from io import StringIO

# Simulate CSV data for demonstration
csv_data = """sample,concentration,absorbance
A1,0.5,0.123
A2,1.0,0.245
A3,1.5,0.367
A4,2.0,0.489"""

# Read CSV data
csv_file = StringIO(csv_data)
reader = csv.DictReader(csv_file)

print("Sample data:")
for row in reader:
    sample = row['sample']
    conc = float(row['concentration'])
    abs_val = float(row['absorbance'])
    print(f"{sample}: {conc} M → {abs_val} AU")

# Writing CSV is just as easy!
output = StringIO()
writer = csv.writer(output)
writer.writerow(['Gene', 'Expression', 'p-value'])
writer.writerow(['BRCA1', '2.34', '0.001'])
writer.writerow(['TP53', '1.89', '0.025'])

print("\nGenerated CSV:")
print(output.getvalue())
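
When your rows are already dictionaries, `csv.DictWriter` mirrors `csv.DictReader`. The sketch below writes two rows and reads them back; the values reuse the example genes above, and `dict_writer` and `buffer` are just illustrative names.

```python
import csv
from io import StringIO

rows = [
    {'Gene': 'BRCA1', 'Expression': 2.34},
    {'Gene': 'TP53', 'Expression': 1.89},
]

buffer = StringIO(newline='')  # newline='' lets the csv module control line endings
dict_writer = csv.DictWriter(buffer, fieldnames=['Gene', 'Expression'])
dict_writer.writeheader()
dict_writer.writerows(rows)

# Rewind and read the same data back
buffer.seek(0)
parsed = list(csv.DictReader(buffer))
print(parsed[0])
```

Everything read back from a CSV is a string, so numeric columns still need `float(...)` as in the reading example above.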

## 🎯 Summary

These built-in packages are always available and perfect for:

- **random**: Simulations, sampling, randomization
- **collections.Counter**: Counting nucleotides, amino acids, anything!
- **statistics**: Quick stats without NumPy
- **datetime**: Experiment timing and scheduling
- **pathlib**: Cross-platform file path handling
- **csv**: Simple data file I/O

**💡 Pro tip**: Always check if there's a built-in solution before installing external packages!