# ELISA Data Analysis - Interactive Demo

This notebook demonstrates the complete ELISA data processing pipeline, from raw optical density measurements to protein concentration calculations.

## Overview

ELISA (Enzyme-Linked Immunosorbent Assay) is a laboratory technique for detecting and quantifying proteins. This analysis:
1. Processes raw OD readings from duplicate wells
2. Corrects for background signal (blank)
3. Fits a standard curve using 4-parameter logistic regression
4. Calculates protein concentrations in unknown samples

In [None]:
# Import required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
import seaborn as sns
from IPython.display import display  # explicit import so display() also works outside a notebook

# Set plotting style
sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (12, 5)

## Step 1: Load Raw ELISA Data

In [None]:
# Load ELISA sample data
elisa_data = pd.read_csv('data/elisa_data.csv')
print("ELISA Sample Data:")
display(elisa_data)
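
The later cells read the `Sample`, `OD1`, and `OD2` columns and look for a row whose `Sample` is `BLANK`. As a minimal sketch, assuming that layout, the following builds an in-memory file and checks for those columns before any analysis runs. The OD values and the `validate_elisa_frame` helper are hypothetical and only illustrate the expected shape of the input.

```python
import io
import pandas as pd

# Hypothetical file contents; the OD values are made up for illustration.
EXAMPLE_CSV = """Sample,OD1,OD2
BLANK,0.051,0.049
S1,0.812,0.798
S2,1.405,1.431
"""

REQUIRED_COLUMNS = ["Sample", "OD1", "OD2"]

def validate_elisa_frame(df):
    """Raise ValueError if the columns used by later steps are missing."""
    missing = [col for col in REQUIRED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"Missing columns: {missing}")
    if not (df["Sample"] == "BLANK").any():
        raise ValueError("No row labelled 'BLANK' found")
    return df

example = validate_elisa_frame(pd.read_csv(io.StringIO(EXAMPLE_CSV)))
print(example)
```

Failing early on a missing column or blank row gives a clearer error than a `KeyError` or `IndexError` raised several cells later.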

## Step 2: Calculate Average OD from Duplicates

In [None]:
# Calculate average OD
elisa_data['AverageOD'] = (elisa_data['OD1'] + elisa_data['OD2']) / 2

# Calculate coefficient of variation (CV%) for quality control
elisa_data['CV%'] = (
    elisa_data[['OD1', 'OD2']].std(axis=1) / elisa_data['AverageOD'] * 100
)

print("Data with Average OD:")
display(elisa_data)

print(f"\nQuality Check - Average CV%: {elisa_data['CV%'].mean():.2f}%")
print("(Good duplicates typically have CV < 15%)")
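
The mean CV% can hide a single poor pair of wells, so it may help to flag rows individually. The sketch below uses made-up readings and a hypothetical `FlagHighCV` column. It also shows that, for two wells, the sample standard deviation that pandas computes by default (`ddof=1`) reduces to `|OD1 - OD2| / sqrt(2)`.

```python
import numpy as np
import pandas as pd

# Made-up duplicate readings for illustration only.
demo = pd.DataFrame({
    "Sample": ["S1", "S2"],
    "OD1": [1.00, 0.50],
    "OD2": [1.10, 0.80],
})

mean_od = demo[["OD1", "OD2"]].mean(axis=1)
# pandas uses the sample standard deviation (ddof=1) by default.
cv_percent = demo[["OD1", "OD2"]].std(axis=1) / mean_od * 100

# For two wells the sample standard deviation reduces to |OD1 - OD2| / sqrt(2).
shortcut = (demo["OD1"] - demo["OD2"]).abs() / np.sqrt(2) / mean_od * 100

CV_LIMIT = 15.0  # threshold quoted above; adjust to the assay's own criteria
demo["CV%"] = cv_percent
demo["FlagHighCV"] = demo["CV%"] > CV_LIMIT
print(demo)
```

Here `S1` stays under the limit while `S2` is flagged, even though an average over many samples could still look acceptable.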

## Step 3: Blank Correction

In [None]:
# Find the blank (uses the first row labelled 'BLANK') and calculate corrected OD
blank_od = elisa_data[elisa_data['Sample'] == 'BLANK']['AverageOD'].values[0]
elisa_data['CorrectedOD'] = elisa_data['AverageOD'] - blank_od

# The blank's own corrected OD is 0 by definition; set it explicitly
elisa_data.loc[elisa_data['Sample'] == 'BLANK', 'CorrectedOD'] = 0

print(f"Blank OD: {blank_od:.4f}")
print("\nCorrected Data:")
display(elisa_data[['Sample', 'OD1', 'OD2', 'AverageOD', 'CorrectedOD']])
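
If a plate carries more than one blank row, taking `.values[0]` uses only the first of them. As a sketch under that assumption, the variant below averages every row labelled `BLANK` and flags samples that read below the blank. The data and the `BelowBlank` column are hypothetical.

```python
import pandas as pd

# Made-up plate with two blank rows, for illustration only.
plate = pd.DataFrame({
    "Sample": ["BLANK", "BLANK", "S1", "S2"],
    "AverageOD": [0.050, 0.054, 0.600, 0.048],
})

is_blank = plate["Sample"] == "BLANK"
blank_mean = plate.loc[is_blank, "AverageOD"].mean()

plate["CorrectedOD"] = plate["AverageOD"] - blank_mean
# A negative corrected OD means the well read below the blank;
# flag it rather than silently treating it as a real signal.
plate["BelowBlank"] = ~is_blank & (plate["CorrectedOD"] < 0)
print(plate)
```

Samples flagged this way fall below the lower asymptote of most standard curves and would come out as `NaN` in the concentration step.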

## Step 4: Load and Process Standard Curve Data

In [None]:
# Load standard curve data
standard_data = pd.read_csv('data/standard_values.csv')

# Process standards
standard_data['AverageOD'] = (standard_data['OD1'] + standard_data['OD2']) / 2
standard_data['CorrectedOD'] = standard_data['AverageOD'] - blank_od

print("Standard Curve Data:")
display(standard_data)

## Step 5: Fit 4-Parameter Logistic (4PL) Model

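The 4PL model is widely used for ELISA standard curves because it captures the sigmoidal shape of the response, including both plateaus:

$$OD = D + \frac{A - D}{1 + \left(\frac{x}{C}\right)^B}$$

Where:
- x = analyte concentration
- A = response at zero concentration (minimum asymptote for a rising curve, B > 0)
- B = Hill slope (steepness of the curve)
- C = inflection point (EC50), the concentration at which OD is midway between A and D
- D = response at infinite concentration (maximum asymptote for a rising curve)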
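
As a quick check of the parameter roles listed above, the sketch below evaluates the model at zero concentration, at `x = C`, and at a very high concentration. The parameter values are illustrative and not fitted to any data, and `four_pl` is just a local name for the same formula.

```python
import numpy as np

def four_pl(x, A, B, C, D):
    """4PL model in the same form as the equation above."""
    return D + (A - D) / (1 + (x / C) ** B)

# Illustrative parameter values, not fitted to any data.
A, B, C, D = 0.05, 1.2, 40.0, 2.5

at_zero = four_pl(0.0, A, B, C, D)   # equals A when B > 0
at_ec50 = four_pl(C, A, B, C, D)     # midpoint between A and D
at_high = four_pl(1e9, A, B, C, D)   # approaches D at high concentration

print(at_zero, at_ec50, at_high)
```

With `B > 0` the curve rises from `A` to `D`, which is why `A` is described as the minimum and `D` as the maximum asymptote.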

In [None]:
def four_parameter_logistic(x, A, B, C, D):
    """4PL function for ELISA standard curves."""
    return D + (A - D) / (1 + (x / C) ** B)

# Extract x (concentration) and y (OD) values
x = standard_data['Concentration (ng/ml)'].values
y = standard_data['CorrectedOD'].values

# Initial parameter guesses
A = np.min(y)
D = np.max(y)
C = np.median(x)
B = 1.0

# Fit the curve
params, covariance = curve_fit(
    four_parameter_logistic, x, y,
    p0=[A, B, C, D],
    maxfev=10000
)

# Calculate R²
y_pred = four_parameter_logistic(x, *params)
r_squared = 1 - (np.sum((y - y_pred)**2) / np.sum((y - np.mean(y))**2))

print("4PL Curve Parameters:")
print(f"  A (min): {params[0]:.4f}")
print(f"  B (slope): {params[1]:.4f}")
print(f"  C (EC50): {params[2]:.4f}")
print(f"  D (max): {params[3]:.4f}")
print(f"\nR² = {r_squared:.4f}")
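
R² alone says little about how well each parameter is determined. One common complement is the approximate standard error of each parameter, taken from the diagonal of the covariance matrix that `curve_fit` returns. The sketch below illustrates this on synthetic standards generated from known parameters. The data, the `four_pl` name, and the positivity bounds on `B` and `C` are assumptions made for the example, not part of the analysis above.

```python
import numpy as np
from scipy.optimize import curve_fit

def four_pl(x, A, B, C, D):
    return D + (A - D) / (1 + (x / C) ** B)

# Synthetic standards generated from known parameters (illustration only).
true_params = (0.05, 1.3, 50.0, 2.4)
conc = np.array([1.56, 3.125, 6.25, 12.5, 25.0, 50.0, 100.0, 200.0])
rng = np.random.default_rng(0)
od = four_pl(conc, *true_params) + rng.normal(0, 0.01, conc.size)

p0 = [od.min(), 1.0, np.median(conc), od.max()]
# Bounds keep B and C positive so (x / C) ** B stays well defined.
lower = [-np.inf, 0.0, 1e-9, -np.inf]
upper = [np.inf, np.inf, np.inf, np.inf]
fit, cov = curve_fit(four_pl, conc, od, p0=p0, bounds=(lower, upper))

# Approximate one-sigma uncertainties on A, B, C, D.
std_err = np.sqrt(np.diag(cov))
for name, value, err in zip("ABCD", fit, std_err):
    print(f"{name} = {value:.4f} ± {err:.4f}")
```

A large standard error on `D` often indicates that the highest standard does not reach the upper plateau, so concentrations read near the top of the curve deserve extra caution.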

## Step 6: Visualize Standard Curve

In [None]:
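# Generate smooth curve for plotting. Log spacing needs positive concentrations,
# so a 0 ng/ml standard, if present, is left out of the range.
x_positive = x[x > 0]
x_smooth = np.logspace(np.log10(x_positive.min()), np.log10(x_positive.max()), 200)
y_smooth = four_parameter_logistic(x_smooth, *params)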

# Create plot
plt.figure(figsize=(10, 6))
plt.scatter(x, y, s=150, alpha=0.7, color='navy', label='Standards', zorder=5)
plt.plot(x_smooth, y_smooth, 'r-', linewidth=2.5, label='4PL Fit', alpha=0.8)
plt.xlabel('Concentration (ng/ml)', fontsize=13, fontweight='bold')
plt.ylabel('Corrected OD (450 nm)', fontsize=13, fontweight='bold')
plt.title(f'ELISA Standard Curve (R² = {r_squared:.4f})', fontsize=15, fontweight='bold')
plt.xscale('log')
plt.grid(True, alpha=0.3, linestyle='--')
plt.legend(fontsize=11, loc='best')
plt.tight_layout()
plt.show()

## Step 7: Calculate Sample Concentrations

In [None]:
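def calculate_concentration(od, params):
    """Calculate concentration from OD using inverse 4PL."""
    A, B, C, D = params

    # Only ODs strictly between the two asymptotes can be inverted
    low, high = min(A, D), max(A, D)
    if od <= low or od >= high:
        return np.nan  # Out of range

    # Rearranged from OD = D + (A - D) / (1 + (x / C) ** B)
    return C * ((A - D) / (od - D) - 1) ** (1 / B)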

# Calculate concentrations for all samples
concentrations = []
for _, row in elisa_data.iterrows():
    if row['Sample'] == 'BLANK':
        concentrations.append(0)
    else:
        conc = calculate_concentration(row['CorrectedOD'], params)
        concentrations.append(conc)

elisa_data['Concentration (ng/ml)'] = concentrations

# Display results
print("Final Results:")
display(elisa_data[['Sample', 'CorrectedOD', 'Concentration (ng/ml)']])

# Summary statistics
valid_samples = elisa_data[elisa_data['Sample'] != 'BLANK']['Concentration (ng/ml)'].dropna()
print("\nSummary Statistics:")
print(f"  Mean: {valid_samples.mean():.2f} ng/ml")
print(f"  Std Dev: {valid_samples.std():.2f} ng/ml")
print(f"  Range: {valid_samples.min():.2f} - {valid_samples.max():.2f} ng/ml")
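
The inversion follows from rearranging the Step 5 model: isolating the bracket gives `(x / C) ** B = (A - D) / (OD - D) - 1`, and so `x = C * ((A - D) / (OD - D) - 1) ** (1 / B)`. A round-trip check is a cheap way to confirm that an inverse matches its forward model. The sketch below uses illustrative parameters and hypothetical helper names (`four_pl`, `inverse_four_pl`).

```python
import numpy as np

def four_pl(x, A, B, C, D):
    return D + (A - D) / (1 + (x / C) ** B)

def inverse_four_pl(od, A, B, C, D):
    """Solve OD = D + (A - D) / (1 + (x / C) ** B) for x."""
    low, high = min(A, D), max(A, D)
    if not (low < od < high):
        return np.nan  # outside the asymptotes, no real solution
    return C * ((A - D) / (od - D) - 1) ** (1 / B)

# Illustrative parameters, not fitted values.
params = (0.05, 1.2, 40.0, 2.5)
recovered = []
for conc in (5.0, 40.0, 150.0):
    od = four_pl(conc, *params)
    recovered.append(inverse_four_pl(od, *params))
    print(conc, recovered[-1])
```

Each recovered value should match the concentration that produced the OD, up to floating-point error. An OD above the upper asymptote returns `NaN`, which in practice usually means the sample needs further dilution.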

## Step 8: Visualize Sample Concentrations

In [None]:
# Create bar plot of sample concentrations
samples = elisa_data[elisa_data['Sample'] != 'BLANK'].copy()
samples_valid = samples[samples['Concentration (ng/ml)'].notna()]

plt.figure(figsize=(12, 6))
x_pos = np.arange(len(samples_valid))
plt.bar(x_pos, samples_valid['Concentration (ng/ml)'], 
        color='steelblue', alpha=0.7, edgecolor='navy', linewidth=1.5)
plt.xlabel('Sample', fontsize=13, fontweight='bold')
plt.ylabel('Concentration (ng/ml)', fontsize=13, fontweight='bold')
plt.title('Protein Concentrations in Unknown Samples', fontsize=15, fontweight='bold')
plt.xticks(x_pos, samples_valid['Sample'], rotation=45, ha='right')
plt.grid(True, alpha=0.3, linestyle='--', axis='y')
plt.tight_layout()
plt.show()

## Step 9: Export Results

In [None]:
# Save processed data
output_file = 'elisa_result_notebook.csv'
elisa_data.to_csv(output_file, index=False)
print(f"Results saved to: {output_file}")

# Display first few rows of saved data
print("\nSaved data preview:")
display(pd.read_csv(output_file).head())

## Conclusion

This analysis successfully:
- ✅ Processed raw ELISA data with duplicate averaging
- ✅ Corrected for background signal (blank subtraction)
- ✅ Fitted a 4PL standard curve and reported its R² (check the value printed in Step 5 before relying on the results)
- ✅ Calculated protein concentrations in unknown samples
- ✅ Plotted the standard curve and the sample concentrations

The results can now be used for downstream analysis in the study of protein involvement in Disease XX.