# Cell Thalamus - Parallel Simulation on JupyterHub

This notebook runs the full Cell Thalamus Phase 0 simulation using parallel processing.

**Recommended JupyterHub Server Settings:**
- Image: `ijupyterhub-scipy-notebook`
- CPU: 32-64 vCPUs (more = faster)
- Memory: 16-32 GB

**Expected Runtime:**
- Full campaign (2304 wells):
  - 16 CPUs: ~4.5 minutes
  - 32 CPUs: ~2.2 minutes
  - 64 CPUs: ~1.1 minutes

For comparison, a serial run on 1 CPU takes about 2 hours.
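
As a quick sanity check, the runtime estimates above can be converted into throughput. The sketch below uses only the figures quoted in this cell; the names `runtime_min` and `throughput` are illustrative and not part of `cell_os`.

```python
# Runtime estimates quoted above, in minutes, keyed by CPU count
TOTAL_WELLS = 2304
runtime_min = {1: 120.0, 16: 4.5, 32: 2.2, 64: 1.1}


def throughput(cpus):
    """Return (wells per minute, wells per minute per CPU) for a CPU count."""
    wells_per_min = TOTAL_WELLS / runtime_min[cpus]
    return wells_per_min, wells_per_min / cpus


for cpus in sorted(runtime_min):
    total, per_cpu = throughput(cpus)
    print(f"{cpus:>2} CPUs: {total:7.1f} wells/min, {per_cpu:5.1f} wells/min/CPU")
```

The parallel figures work out to roughly 32 wells per minute per CPU, which is consistent with near-linear scaling from 16 to 64 CPUs. The serial figure implies about 19 wells per minute, so the ~2 hour number is probably a rough estimate rather than an exact baseline.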

In [None]:
# Setup - Install cell_os if needed
# If running on JupyterHub, you may need to install the package first
# !pip install -e /path/to/cell_OS

import sys
import os
from multiprocessing import cpu_count
import logging

# Setup logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')

print(f"Available CPUs: {cpu_count()}")
print(f"Python version: {sys.version}")
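
Note that `cpu_count()` reports the CPUs of the host machine, which on a shared JupyterHub node can be more than the server was actually allocated. The sketch below is one way to estimate the usable count. It assumes a Linux container with cgroup v2 mounted at `/sys/fs/cgroup` (cgroup v1 uses different files), and the helper names `parse_cpu_max` and `usable_cpus` are hypothetical.

```python
import math
import os
from pathlib import Path


def parse_cpu_max(text):
    """Parse cgroup v2 cpu.max content: "<quota> <period>" or "max <period>".

    Returns the CPU limit as a float, or None when no quota is set.
    """
    quota, period = text.split()
    if quota == "max":
        return None
    return int(quota) / int(period)


def usable_cpus():
    """Best-effort estimate of how many CPUs this process can actually use."""
    # CPUs this process may be scheduled on (Linux only), else the host count
    if hasattr(os, "sched_getaffinity"):
        n = len(os.sched_getaffinity(0))
    else:
        n = os.cpu_count() or 1

    # Assumption: cgroup v2 layout; silently skipped if the file is missing
    try:
        limit = parse_cpu_max(Path("/sys/fs/cgroup/cpu.max").read_text())
    except (OSError, ValueError):
        limit = None

    if limit is not None:
        n = min(n, max(1, math.floor(limit)))
    return n


print(f"Usable CPUs (estimate): {usable_cpus()}")
```

If this number is lower than `cpu_count()`, using it for the worker count avoids oversubscribing the container.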

In [None]:
# Import parallel runner
from cell_os.cell_thalamus.parallel_runner import run_parallel_simulation

## Configuration

Choose your simulation parameters:

In [None]:
# Simulation parameters
MODE = "full"  # Options: "demo" (4 wells), "quick" (few hundred), "full" (2304 wells)
WORKERS = cpu_count()  # Use all available CPUs (or set a specific number like 32, 64)
CELL_LINES = ['A549', 'HepG2']  # Cell lines to test
DB_PATH = "/home/jovyan/cell_thalamus_results.db"  # Save to your home directory

print(f"Mode: {MODE}")
print(f"Workers: {WORKERS} CPUs")
print(f"Cell lines: {CELL_LINES}")
print(f"Results will be saved to: {DB_PATH}")
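
Before starting a long run, it can help to check the parameters above for obvious mistakes. The following is a minimal sketch: `validate_config` is a hypothetical helper, not part of `cell_os`, and the list of valid modes is taken from the comment on `MODE`.

```python
from pathlib import Path

VALID_MODES = ("demo", "quick", "full")  # the three modes listed for MODE


def validate_config(mode, workers, db_path, available_cpus):
    """Return a list of problems; an empty list means the config looks sane."""
    problems = []
    if mode not in VALID_MODES:
        problems.append(f"unknown mode {mode!r}; expected one of {VALID_MODES}")
    if not isinstance(workers, int) or workers < 1:
        problems.append(f"workers must be a positive integer, got {workers!r}")
    elif workers > available_cpus:
        problems.append(f"workers={workers} exceeds the {available_cpus} available CPUs")
    parent = Path(db_path).parent
    if not parent.is_dir():
        problems.append(f"directory for db_path does not exist: {parent}")
    return problems
```

A typical call would be `validate_config(MODE, WORKERS, DB_PATH, cpu_count())`, printing any returned problems before launching the simulation.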

## Run Simulation

Execute the parallel simulation. Progress is reported every 100 wells.

In [None]:
# Run the simulation
design_id = run_parallel_simulation(
    cell_lines=CELL_LINES,
    compounds=None,  # None = use all 10 compounds
    mode=MODE,
    workers=WORKERS,
    db_path=DB_PATH
)

print(f"\n✓ Simulation complete! Design ID: {design_id}")
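
The internals of `run_parallel_simulation` are not shown in this notebook. As a general illustration of the pattern described above (fan wells out to workers, report progress every 100 results), here is a minimal sketch. The names `collect_with_progress` and `simulate_well` are hypothetical, and this is not the actual `cell_os` implementation.

```python
import logging


def collect_with_progress(simulate_well, wells, imap=map, every=100):
    """Apply simulate_well to each well via imap, logging every `every` results.

    `imap` defaults to the built-in serial map; pass a pool's imap_unordered
    to run the same loop in parallel.
    """
    wells = list(wells)
    results = []
    for done, result in enumerate(imap(simulate_well, wells), start=1):
        results.append(result)
        if done % every == 0 or done == len(wells):
            logging.info("Completed %d/%d wells", done, len(wells))
    return results


# Hypothetical parallel usage (simulate_well must be a picklable,
# module-level function):
# from multiprocessing import Pool
# with Pool(processes=WORKERS) as pool:
#     results = collect_with_progress(simulate_well, wells,
#                                     imap=pool.imap_unordered)
```

Keeping the mapping function as a parameter makes the same loop usable serially for debugging and in parallel for full runs.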

## View Results

Load and explore the results:

In [None]:
# Load results from database
from cell_os.database.cell_thalamus_db import CellThalamusDB
import pandas as pd

db = CellThalamusDB(db_path=DB_PATH)
try:
    results = db.get_results(design_id)
finally:
    db.close()  # Release the database handle even if the query fails

# Convert to pandas DataFrame
df = pd.DataFrame(results)

print(f"Total results: {len(df)}")
print(f"\nColumns: {df.columns.tolist()}")
df.head()

In [None]:
# Summary statistics
print("=" * 60)
print("SUMMARY STATISTICS")
print("=" * 60)
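print(f"Total wells: {len(df)}")
# If is_sentinel comes back as 0/1 integers (as SQLite stores booleans),
# ~ would act bitwise and give wrong counts, so cast to bool first
is_sentinel = df['is_sentinel'].astype(bool)
print(f"Experimental wells: {(~is_sentinel).sum()}")
print(f"Sentinel wells: {is_sentinel.sum()}")
print(f"\nCell lines: {df['cell_line'].unique()}")
print(f"Compounds: {df['compound'].unique()}")
print(f"Timepoints: {df['timepoint_h'].unique()}")
print("\nMorphology channels:")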
for channel in ['morph_er', 'morph_mito', 'morph_nucleus', 'morph_actin', 'morph_rna']:
    print(f"  {channel}: mean={df[channel].mean():.3f}, std={df[channel].std():.3f}")
print(f"\nATP signal: mean={df['atp_signal'].mean():.3f}, std={df['atp_signal'].std():.3f}")

In [None]:
# Export to CSV
csv_path = f"/home/jovyan/cell_thalamus_{design_id}.csv"
df.to_csv(csv_path, index=False)
print(f"Results exported to: {csv_path}")

## Quick Visualizations

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns

# Dose-response for tBHQ
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# ATP response
tbhq_data = df[(df['compound'] == 'tBHQ') & (df['cell_line'] == 'A549')].copy()
axes[0].plot(tbhq_data['dose_uM'], tbhq_data['atp_signal'], 'o', alpha=0.5)
axes[0].set_xlabel('Dose (µM)')
axes[0].set_ylabel('ATP Signal')
axes[0].set_title('tBHQ Dose-Response (A549) - ATP')
axes[0].set_xscale('log')

# Morphology response
axes[1].plot(tbhq_data['dose_uM'], tbhq_data['morph_er'], 'o', alpha=0.5, label='ER')
axes[1].plot(tbhq_data['dose_uM'], tbhq_data['morph_mito'], 's', alpha=0.5, label='Mito')
axes[1].set_xlabel('Dose (µM)')
axes[1].set_ylabel('Morphology Signal')
axes[1].set_title('tBHQ Dose-Response (A549) - Morphology')
axes[1].set_xscale('log')
axes[1].legend()

plt.tight_layout()
plt.show()
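
The scatter plots above show every replicate well. To see the trend more clearly, the wells can be aggregated per dose. The sketch below assumes the column names used earlier in this notebook (`compound`, `cell_line`, `dose_uM`, `atp_signal`); `summarize_dose_response` is a hypothetical helper. Like the plots above, it pools all timepoints, and it drops zero doses because they cannot be placed on a log axis.

```python
import pandas as pd


def summarize_dose_response(df, compound, cell_line, value="atp_signal"):
    """Return mean, std and well count of `value` per positive dose."""
    subset = df[(df["compound"] == compound) & (df["cell_line"] == cell_line)]
    subset = subset[subset["dose_uM"] > 0]  # zero doses have no log-scale position
    return (
        subset.groupby("dose_uM")[value]
        .agg(mean="mean", std="std", n="count")
        .reset_index()
    )
```

For example, `summary = summarize_dose_response(df, 'tBHQ', 'A549')` followed by `plt.errorbar(summary['dose_uM'], summary['mean'], yerr=summary['std'], fmt='o-')` and `plt.xscale('log')` draws a mean curve with error bars.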