# SEIR Model Metaflow Demo

This notebook demonstrates how to use Metaflow to experiment with different parameter settings for an SEIR (Susceptible-Exposed-Infectious-Recovered) compartmental model.
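
For orientation, the dynamics being simulated can be sketched in a few lines. This is a minimal forward-Euler integration of the textbook SEIR equations, not the code in `seir_metaflow_demo.py`: it ignores vaccination, and the function name `simulate_seir`, the step size `dt`, and the values `beta=0.3` and `gamma=0.1` are illustrative assumptions.

```python
def simulate_seir(beta, sigma, gamma, s0, e0, i0, r0, days, dt=0.1):
    """Integrate the basic SEIR equations with forward Euler steps.

    Compartments are population fractions. Returns one (S, E, I, R)
    tuple per day, including the initial state.
    """
    s, e, i, r = s0, e0, i0, r0
    steps_per_day = round(1 / dt)
    history = [(s, e, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i * dt    # S -> E (transmission)
            new_infectious = sigma * e * dt    # E -> I (end of incubation)
            new_recovered = gamma * i * dt     # I -> R (recovery)
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        history.append((s, e, i, r))
    return history


# Illustrative parameters only; beta and gamma here are assumptions.
history = simulate_seir(beta=0.3, sigma=0.2, gamma=0.1,
                        s0=0.85, e0=0.05, i0=0.05, r0=0.05, days=150)
peak_infectious = max(i for _, _, i, _ in history)
print(f"Peak infectious fraction: {peak_infectious:.3f}")
```

Each Euler step only moves mass between compartments, so the four fractions keep summing to 1 up to floating-point error.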

## What is Metaflow?

Metaflow is an open-source Python framework for building and managing data science workflows. It was originally developed at Netflix to streamline the development, deployment, and operation of such projects.

Key features of Metaflow:
- **Local prototyping to production**: Develop locally, then scale to production seamlessly
- **Parallel execution**: Run multiple experiments in parallel
- **Data versioning**: Track data and results across runs
- **Parameter tuning**: Easily experiment with different parameters
- **Dependency management**: Specify dependencies for each step

## Workflow Overview

Our Metaflow workflow for SEIR model experimentation includes the following steps:

1. **Start**: Initialize the workflow and create output directories
2. **Load Data**: Load and preprocess COVID-19 vaccination data
3. **Create Vaccination Model**: Create a model for vaccination rate interpolation
4. **Run Experiments**: Run multiple SEIR model experiments with different parameters in parallel
5. **Join Results**: Collect and compare results from all experiments
6. **Visualize Comparison**: Create visualizations comparing the results
7. **End**: Summarize the results and provide access to the best model
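
Steps 4 and 5 follow a fan-out/join shape, which Metaflow expresses with `foreach` and a join step. As a rough analogy, the same shape can be pictured in plain Python. The parameter sets, the `run_experiment` stand-in, and the "lowest score wins" rule below are all hypothetical; the real flow runs the SEIR simulation and applies its own selection criterion.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical parameter sets; the real ones live in seir_metaflow_demo.py.
experiment_params = [
    {"name": "low_transmission", "beta": 0.2, "gamma": 0.1},
    {"name": "high_transmission", "beta": 0.5, "gamma": 0.1},
]


def run_experiment(params):
    """Stand-in for the SEIR simulation: returns a cheap proxy metric."""
    return {"name": params["name"], "score": params["beta"] / params["gamma"]}


# Fan out: one task per parameter set (the role foreach plays in Metaflow).
with ThreadPoolExecutor() as pool:
    results = list(pool.map(run_experiment, experiment_params))

# Join: gather the branch results and pick a winner (here, the lowest score).
best = min(results, key=lambda row: row["score"])
print(f"Best experiment: {best['name']}")
```

Unlike this sketch, Metaflow also persists the artifacts of every branch, which is what makes the results retrievable after the run finishes.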

## Examining the Metaflow Workflow

Let's look at the Metaflow workflow we've defined in `seir_metaflow_demo.py`:

In [None]:
# Display the workflow code
!cat seir_metaflow_demo.py

## Understanding the Workflow Structure

The workflow is structured as a directed acyclic graph (DAG) of steps. The `show` command prints each step along with the steps it transitions to:

In [None]:
# Show the workflow DAG
!python seir_metaflow_demo.py show

## Running the Workflow with Default Parameters

Now, let's run the workflow with the default parameters:

In [None]:
# Run the workflow
!python seir_metaflow_demo.py run

## Customizing Parameters

Metaflow exposes each `Parameter` defined in a flow as a command-line option. Let's run the workflow with a different start date, simulation length, and initial compartment values:

In [None]:
# Run with custom parameters
!python seir_metaflow_demo.py run \
    --start_date 2021-02-01 \
    --simulation_days 150 \
    --initial_s 0.85 \
    --initial_e 0.05 \
    --initial_i 0.05 \
    --initial_r 0.05
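
Since the plots later in this notebook label the infectious compartment as a fraction, the four initial values are presumably population fractions that should sum to 1. A quick check, assuming that interpretation (the flow itself may or may not validate this):

```python
import math

# Values from the custom run above, treated here as population fractions.
initial_state = {
    "initial_s": 0.85,
    "initial_e": 0.05,
    "initial_i": 0.05,
    "initial_r": 0.05,
}

total = sum(initial_state.values())
# Compare with a tolerance: floating-point addition may not give exactly 1.0.
if not math.isclose(total, 1.0, abs_tol=1e-9):
    raise ValueError(f"Initial compartments sum to {total}, expected 1.0")
print("Initial compartments are consistent")
```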

## Accessing Results from Previous Runs

Metaflow makes it easy to access results from previous runs. Let's see how to do this:

In [None]:
# Import Metaflow
from metaflow import Flow, get_metadata
import pandas as pd
import matplotlib.pyplot as plt

# Show the active metadata provider (this call reads the setting; it does not change it)
print(get_metadata())

In [None]:
# List all runs (materialize the iterator so it can be counted and then looped over)
runs = list(Flow('SEIRModelFlow').runs())
print(f"Found {len(runs)} runs:")
for run in runs:
    print(f"Run ID: {run.id}, Created: {run.created_at}")

In [None]:
# Get the latest successful run
latest_run = Flow('SEIRModelFlow').latest_successful_run
print(f"Latest successful run: {latest_run.id}")

In [None]:
# Access the comparison DataFrame from the latest run
comparison_df = latest_run.data.comparison_df
comparison_df

## Visualizing Results

Let's visualize the results from the latest run:

In [None]:
# Plot peak infectious values
plt.figure(figsize=(10, 6))
df_sorted = comparison_df.sort_values(by="peak_infectious")
plt.bar(df_sorted["name"], df_sorted["peak_infectious"])
plt.xlabel("Experiment")
plt.ylabel("Peak Infectious Fraction")
plt.title("Comparison of Peak Infectious Values Across Experiments")
plt.xticks(rotation=45, ha="right")
plt.tight_layout()
plt.show()

## Comparing Multiple Runs

Because Metaflow stores the artifacts of every run, results can be compared across runs. The cell below collects the best model from each successful run:

In [None]:
# Get all successful runs
successful_runs = [run for run in Flow('SEIRModelFlow').runs() if run.successful]

# Create a DataFrame to compare runs
run_comparison = []
for run in successful_runs:
    # Get the best model from each run
    best_model_name = run.data.best_model_name
    
    # Load the comparison DataFrame once, then find the row for the best model
    comparison = run.data.comparison_df
    best_model_data = comparison[comparison['name'] == best_model_name].iloc[0]
    
    run_comparison.append({
        'run_id': run.id,
        'created_at': run.created_at,
        'best_model': best_model_name,
        'peak_infectious': best_model_data['peak_infectious'],
        'total_infected': best_model_data['total_infected']
    })

run_comparison_df = pd.DataFrame(run_comparison)
run_comparison_df.sort_values(by='created_at', ascending=False)

## Creating a Custom Experiment

Let's extend the workflow with two custom experiments by subclassing the base flow and overriding its `create_vax_model` step:

In [None]:
%%writefile custom_seir_metaflow.py
from seir_metaflow_demo import SEIRModelFlow as BaseFlow
from metaflow import step

class CustomSEIRModelFlow(BaseFlow):
    """
    A custom flow that extends the base SEIR model flow with additional experiments.
    """
    
    @step
    def create_vax_model(self):
        """
        Create a model for vaccination rate interpolation and define experiments.
        """
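        # Call the parent method to create the vaccination model and base experiments.
        # Caveat: Metaflow allows only one self.next() call per step. If the parent step
        # already calls self.next(), move its setup logic into a plain helper method in
        # seir_metaflow_demo.py and call that helper here instead.
        super().create_vax_model()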
        
        # Add custom experiments
        custom_experiments = [
            {
                "name": "Custom_Low_Beta_High_Gamma",
                "beta": 0.15,
                "sigma": 0.2,
                "gamma": 0.25,
                "vax_eff": 0.8
            },
            {
                "name": "Custom_High_Beta_Low_Gamma",
                "beta": 0.6,
                "sigma": 0.2,
                "gamma": 0.05,
                "vax_eff": 0.8
            }
        ]
        
        # Extend the experiment parameters list
        self.experiment_params.extend(custom_experiments)
        
        # Continue to the next step
        self.next(self.run_experiment, foreach='experiment_params')

if __name__ == "__main__":
    CustomSEIRModelFlow()

In [None]:
# Run the custom workflow
!python custom_seir_metaflow.py run
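
To get a feel for how far apart these two custom experiments sit, it helps to compute the basic reproduction number. In the textbook SEIR model without births, deaths, or vaccination, R0 = beta / gamma. The sketch below applies that formula to the values above and ignores `vax_eff`, so it describes the unvaccinated baseline only.

```python
# (beta, gamma) pairs copied from the two custom experiments above.
custom_experiments = {
    "Custom_Low_Beta_High_Gamma": (0.15, 0.25),
    "Custom_High_Beta_Low_Gamma": (0.6, 0.05),
}

# Textbook SEIR without demography or vaccination: R0 = beta / gamma.
r0_values = {
    name: beta / gamma for name, (beta, gamma) in custom_experiments.items()
}

for name, r0 in r0_values.items():
    # R0 > 1 means each case infects more than one other person on average.
    trend = "can grow" if r0 > 1 else "dies out"
    print(f"{name}: R0 = {r0:.1f}, an outbreak {trend} in a fully susceptible population")
```

Under these assumptions the two experiments fall on opposite sides of the R0 = 1 threshold, so their `peak_infectious` values should differ sharply.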

## Deploying to Production

Metaflow is designed so that a flow developed locally can move to production with few code changes. Here's how you might deploy this workflow:

1. **Configure a remote metadata service**: Set up a metadata service to track runs across environments
2. **Configure compute resources**: Specify compute resources for each step
3. **Schedule the workflow**: Set up a scheduler to run the workflow on a regular basis

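Here's an example of how to configure the workflow for production. Note that `@resources` only takes effect when a step runs on a remote compute layer such as AWS Batch or Kubernetes, and `@schedule` only applies once the flow is deployed to a production scheduler: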

In [None]:
%%writefile production_seir_metaflow.py
from seir_metaflow_demo import SEIRModelFlow
from metaflow import step, batch, schedule, resources, retry

# Uncomment to schedule the workflow to run daily
# @schedule(daily=True)
class ProductionSEIRModelFlow(SEIRModelFlow):
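    """
    A production version of the SEIR model flow with resource specifications and error handling.

    Caveat: Metaflow validates each step's self.next() transition from the step's own
    source, so overrides that only delegate to super() may be rejected. If that happens,
    apply the decorators directly to the steps in seir_metaflow_demo.py.
    """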
    
    @resources(memory=1000, cpu=1)
    @retry(times=3)
    @step
    def start(self):
        super().start()
    
    @resources(memory=2000, cpu=2)
    @retry(times=3)
    @step
    def load_data(self):
        super().load_data()
    
    @resources(memory=1000, cpu=1)
    @step
    def create_vax_model(self):
        super().create_vax_model()
    
    @batch(cpu=2, memory=4000)
    @retry(times=3)
    @step
    def run_experiment(self):
        super().run_experiment()
    
    @resources(memory=4000, cpu=2)
    @step
    def join_results(self, inputs):
        super().join_results(inputs)
    
    @resources(memory=2000, cpu=1)
    @step
    def visualize_comparison(self):
        super().visualize_comparison()
    
    @step
    def end(self):
        super().end()

if __name__ == "__main__":
    ProductionSEIRModelFlow()

## Conclusion

In this notebook, we've demonstrated how to use Metaflow to experiment with different parameter settings for an SEIR compartmental model. We've shown how to:

1. Define a workflow with multiple steps
2. Run experiments in parallel
3. Compare results across experiments
4. Access and visualize results from previous runs
5. Extend the workflow with custom experiments
6. Configure the workflow for production

Metaflow lets data scientists iterate on models and parameters locally while keeping a path to production deployment open, which suits parameter-sweep workflows like the SEIR model demonstrated here.