# Morphogenetic System Lineage Analysis

This notebook analyzes the telemetry data produced by the adversarial evolution harness, focusing on lineage dynamics, fitness, and population diversity over generations.

In [1]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import json
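from pathlib import Path

from IPython.display import display  # explicit import so display() resolves outside a notebook kernel too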

sns.set_theme(style="whitegrid")

## 1. Load Data

First, we load every per-scenario `*_summary.json` file from the `pitch_demo` output directory and aggregate them into a single DataFrame, one row per scenario.

In [2]:
def load_summaries(run_dir: Path) -> pd.DataFrame:
    """Load every '*_summary.json' file in a run directory into one DataFrame."""
    summaries = []
    for summary_path in sorted(run_dir.glob('*_summary.json')):
        with open(summary_path, 'r') as f:
            data = json.load(f)

        # Missing sections fall back to empty dicts, so absent fields become None.
        metadata = data.get('run_metadata', {})
        annotations = data.get('annotations', {})
        stats = data.get('stats', {})

        summaries.append({
            # The scenario label is the file name without its '_summary.json' suffix.
            'scenario': summary_path.name.replace('_summary.json', ''),
            'scenario_name': metadata.get('scenario_name'),
            'fitness_score': annotations.get('fitness_score'),
            'lineage_pressure': annotations.get('lineage_pressure'),
            'lineage_component': annotations.get('lineage_component'),
            'breach_observed': annotations.get('breach_observed'),
            'final_cell_count': metadata.get('final_cell_count'),
            'total_replications': stats.get('total_replications'),
            'total_signals': stats.get('total_signals'),
        })

    return pd.DataFrame(summaries)

# --- Point this to the output directory of the pitch demo ---
RUN_DIRECTORY = Path("target/pitch_demo")
# ------------------------------------------------------------

# Default to an empty DataFrame so later cells can safely check `df_summary.empty`.
df_summary = pd.DataFrame()

if RUN_DIRECTORY.exists():
    df_summary = load_summaries(RUN_DIRECTORY)
    if not df_summary.empty:
        print(f"Loaded {len(df_summary)} summaries from {RUN_DIRECTORY.as_posix()}")
        display(df_summary.head())
    else:
        print(f"No *_summary.json files found in {RUN_DIRECTORY.as_posix()}")
else:
    print(f"Run directory {RUN_DIRECTORY.as_posix()} not found. Please run 'scripts/pitch_demo.sh' first.")


Run directory target/pitch_demo not found. Please run 'scripts/pitch_demo.sh' first.
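
When the demo output is not available, as in the output above, the loader can still be smoke-tested against placeholder files. The sketch below writes synthetic `*_summary.json` files containing only the keys that `load_summaries` reads. The helper name `write_placeholder_summary`, the zero/`False` values, and the value types are assumptions for illustration; they are not output from the harness and say nothing about what a real run produces.

```python
import json
import tempfile
from pathlib import Path


def write_placeholder_summary(run_dir: Path, label: str) -> Path:
    """Write a synthetic '<label>_summary.json' with the keys load_summaries reads.

    All values are placeholders (assumed types), not real harness results.
    """
    payload = {
        "run_metadata": {"scenario_name": label, "final_cell_count": 0},
        "annotations": {
            "fitness_score": 0.0,
            "lineage_pressure": 0.0,
            "lineage_component": 0.0,
            "breach_observed": False,
        },
        "stats": {"total_replications": 0, "total_signals": 0},
    }
    run_dir.mkdir(parents=True, exist_ok=True)
    path = run_dir / f"{label}_summary.json"
    path.write_text(json.dumps(payload, indent=2))
    return path


# Write two placeholder summaries into a throwaway directory and list them
# with the same glob pattern the loader uses.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp)
    for label in ("baseline", "intense"):
        write_placeholder_summary(demo_dir, label)
    found = sorted(p.name for p in demo_dir.glob("*_summary.json"))

print(found)
```

Pointing `load_summaries` at such a directory should yield one row per placeholder file, which is enough to check the plotting cells for errors before a real run exists.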


## 2. Compare Scenarios

This plot compares the fitness score and lineage pressure of the `baseline` and `intense` scenarios side by side.

In [3]:
if not df_summary.empty:
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 6))
    
    sns.barplot(data=df_summary, x='scenario', y='fitness_score', ax=ax1)
    ax1.set_title('Fitness Score Comparison')

    sns.barplot(data=df_summary, x='scenario', y='lineage_pressure', ax=ax2)
    ax2.set_title('Lineage Pressure Comparison')
    
    fig.suptitle('Comparison of Baseline vs. Intense Scenarios')
    plt.show()
else:
    print("No summary data to plot.")

No summary data to plot.
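
Alongside the bar charts, it can help to see the same comparison as numbers. The following is a minimal sketch, assuming the summary frame has exactly one row per scenario label; `scenario_deltas` is a hypothetical helper, and the demo frame holds made-up placeholder values, not harness results.

```python
import pandas as pd


def scenario_deltas(
    df: pd.DataFrame,
    reference: str = "baseline",
    other: str = "intense",
    metrics=("fitness_score", "lineage_pressure"),
) -> pd.Series:
    """Return `other - reference` for each metric.

    Assumes one row per scenario label in the 'scenario' column.
    """
    values = df.set_index("scenario")[list(metrics)]
    return values.loc[other] - values.loc[reference]


# Placeholder numbers purely to show the shape of the result.
demo = pd.DataFrame(
    {
        "scenario": ["baseline", "intense"],
        "fitness_score": [0.5, 0.75],
        "lineage_pressure": [1.0, 3.0],
    }
)
deltas = scenario_deltas(demo)
print(deltas)
```

With real data this would be called as `scenario_deltas(df_summary)`. A positive value means the metric is higher in the `intense` scenario than in `baseline`.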


## 3. Lineage Diversity Analysis (Intense Scenario)

Now we'll load the detailed lineage data for the `intense` scenario to analyze how the population of cell lineages changes over the course of the run.

In [4]:
def load_lineage_data(run_dir: Path, scenario_label: str = "intense") -> pd.DataFrame:
    """Load '<scenario_label>_lineage.csv' from a run directory, or an empty DataFrame if it is missing."""
    lineage_file = run_dir / f'{scenario_label}_lineage.csv'
    if not lineage_file.exists():
        return pd.DataFrame()
    
    return pd.read_csv(lineage_file)

if RUN_DIRECTORY.exists():
    df_lineage = load_lineage_data(RUN_DIRECTORY, scenario_label="intense")
    if not df_lineage.empty:
        print(f"Loaded {len(df_lineage)} lineage records from 'intense' scenario.")
        display(df_lineage.head())
    else:
        print("No lineage data found for the 'intense' scenario.")
else:
    print(f"Run directory {RUN_DIRECTORY.as_posix()} not found.")
    df_lineage = pd.DataFrame()

Run directory target/pitch_demo not found.
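
The plotting cell that follows reads the `step`, `count`, and `lineage` columns, so a quick check that they are present can give a clearer message than a plotting error. This is a small sketch; `missing_lineage_columns` is a hypothetical helper, and the column list is inferred from the plotting call rather than from a documented schema.

```python
import pandas as pd

# Columns the lineage plot relies on (inferred from the plotting cell).
REQUIRED_LINEAGE_COLUMNS = ("step", "count", "lineage")


def missing_lineage_columns(df: pd.DataFrame, required=REQUIRED_LINEAGE_COLUMNS) -> list:
    """Return the required column names that are absent from df."""
    return [col for col in required if col not in df.columns]


# Demo frame that deliberately lacks the 'count' column.
demo = pd.DataFrame({"step": [0, 1], "lineage": ["lineage_a", "lineage_a"]})
missing = missing_lineage_columns(demo)
print(missing)
```

An empty list means the frame has everything the plot needs.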


In [5]:
if not df_lineage.empty:
    plt.figure(figsize=(14, 7))
    sns.lineplot(data=df_lineage, x='step', y='count', hue='lineage', marker='.')
    plt.title('Lineage Population Over Time (Intense Scenario)')
    plt.xlabel('Simulation Step')
    plt.ylabel('Number of Cells')
    plt.legend(title='Lineage Type', bbox_to_anchor=(1.05, 1), loc='upper left')
    plt.tight_layout()
    plt.show()
else:
    print("No lineage data to plot.")

No lineage data to plot.


### Analysis

This plot shows which cell lineages dominate at each point in the simulation. A successful attack might appear as a rapid decline in defensive lineages or as sustained growth of a specific adversarial lineage.
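
To put a number on dominance and diversity, one option is to compute, per step, the dominant lineage, its share of the population, and the Shannon entropy of the lineage shares. The sketch below assumes the lineage data has the `step`, `count`, and `lineage` columns used in the plot. `lineage_diversity` is a hypothetical helper, Shannon entropy is our choice of diversity measure rather than something the harness reports, and the demo rows are synthetic.

```python
import numpy as np
import pandas as pd


def lineage_diversity(df: pd.DataFrame) -> pd.DataFrame:
    """Per-step dominant lineage, its share, and Shannon entropy (natural log)."""
    # Rows: steps, columns: lineages, values: cell counts (0 where a lineage is absent).
    counts = df.pivot_table(
        index="step", columns="lineage", values="count", aggfunc="sum", fill_value=0
    )
    totals = counts.sum(axis=1)
    # Shares per step; steps with zero total cells become NaN instead of dividing by zero.
    shares = counts.div(totals.where(totals > 0), axis=0)
    # p * log(p), treating 0 * log(0) as 0.
    with np.errstate(divide="ignore", invalid="ignore"):
        plogp = shares * np.log(shares)
    entropy = -plogp.fillna(0.0).sum(axis=1)
    return pd.DataFrame(
        {
            "dominant_lineage": counts.idxmax(axis=1),
            "dominant_share": shares.max(axis=1),
            "shannon_entropy": entropy,
        }
    )


# Synthetic example: an even split at step 0, full takeover by one lineage at step 1.
demo = pd.DataFrame(
    {
        "step": [0, 0, 1, 1],
        "lineage": ["lineage_a", "lineage_b", "lineage_a", "lineage_b"],
        "count": [5, 5, 0, 10],
    }
)
diversity = lineage_diversity(demo)
print(diversity)
```

With real data this would be called as `lineage_diversity(df_lineage)`. Entropy near zero means a single lineage has taken over, so a sharp drop in entropy is one possible signature of the suppression pattern described above.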