<a class="anchor" id="toc"></a>
# EXPLORE CASE STUDY

This notebook works through the process of exploring the `VESSEL_COLLAPSE` case study simulation set.

---
- [WORKSPACE VARIABLES](#workspace-variables)
- [PARSE SIMULATIONS](#parse-simulations)
- [EXPLORE SIMULATIONS](#explore-simulations)
    - **[DEGRADATION RATE](#explore-simulations-degradation-rate)**
    - **[STABILIZATION FRACTION](#explore-simulations-stabilization-fraction)**
---

This simulation set performs a sensitivity analysis on two parameters: degradation rate (rate at which vessel walls are degraded by adjacent cancerous cells) and stabilization fraction (fraction of vessels at the edge of the tumor that are stabilized).

We quantify the simulations by calculating the total and the average volumetric flow over the tumor area.
Metric summaries are saved as `VESSEL_COLLAPSE.DEGRADATION.csv` and `VESSEL_COLLAPSE.STABILIZED.csv`, which are used as inputs to D3 for the area plots ([go to figure](http://0.0.0.0:8000/figures/case_study.html)).

In [None]:
from scripts.VESSEL_COLLAPSE import VESSEL_COLLAPSE

<a class="anchor" id="workspace-variables"></a>

### WORKSPACE VARIABLES
<span style="float:right;">[back to top](#toc)</span>

Set up workspace variables for analyzing simulations.

- **`DATA_PATH`** is the path to data files (`.tar.xz` files of compressed simulation outputs)
- **`RESULTS_PATH`** is the path to result files (`.pkl` files generated by parsing)
- **`ANALYSIS_PATH`** is the path for analysis files (`.json` and `.csv` files, `.tar.xz` compressed archives)

In [None]:
DATA_PATH = "/path/to/data/files/"
RESULTS_PATH = "/path/to/result/files/"
ANALYSIS_PATH = "/path/to/analysis/files/"
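
Later cells build file paths with f-strings such as `f"{RESULTS_PATH}VESSEL_COLLAPSE/..."`, so each path variable needs to end with a slash. A minimal sketch of a more forgiving alternative, using only the standard library; the helper name `join_path` and the variable `example_path` are hypothetical and not part of the `scripts` package:

```python
import posixpath

# Example base path with the trailing slash deliberately omitted
example_path = "/path/to/result/files"

def join_path(base, *parts):
    """Join path components whether or not base ends with a slash (hypothetical helper)."""
    return posixpath.join(base, *parts)

# Both spellings of the base path resolve to the same file path
with_slash = join_path(example_path + "/", "VESSEL_COLLAPSE", "VESSEL_COLLAPSE_degradation.csv")
without_slash = join_path(example_path, "VESSEL_COLLAPSE", "VESSEL_COLLAPSE_degradation.csv")
print(with_slash == without_slash)
```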

<a class="anchor" id="parse-simulations"></a>

### PARSE SIMULATIONS
<span style="float:right;">[back to top](#toc)</span>

Parse relevant metrics from simulation outputs.

For the case study, we are specifically interested in extracting vessel collapse metrics.
We iterate through each combination of parameter value (degradation rate or stabilization fraction), vascular structure, and seed to extract the flow rate across the entire tissue and across the tumor colony.
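
The iteration over combinations can be sketched with `itertools.product`. The structure names, parameter values, and seeds below are placeholders for illustration only; the actual values are defined on `VESSEL_COLLAPSE` (`STRUCTURES`, `DEGRADATION`, `STABILIZED`, `SEEDS`) and are not reproduced here.

```python
from itertools import product

# Placeholder values for illustration only (not the real simulation set)
structures = ["STRUCTURE_A", "STRUCTURE_B"]
parameter_values = [0, 5, 10]
seeds = [0, 1]

# Every (structure, parameter value, seed) combination, in nested-loop order
combinations = list(product(structures, parameter_values, seeds))

for structure, value, seed in combinations[:3]:
    print(structure, value, seed)

print(len(combinations))
```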

Parsing can take some time, so the parsed `.csv` files for all simulations are provided as `.tar.xz` archives alongside the raw simulation data.

In [None]:
VESSEL_COLLAPSE.analyze_degradation(RESULTS_PATH, DATA_PATH)
VESSEL_COLLAPSE.analyze_stabilized(RESULTS_PATH, DATA_PATH)

We then merge all the files for degradation rate (`VESSEL_COLLAPSE_degradation.csv`) and stabilization fraction (`VESSEL_COLLAPSE_stabilized.csv`).

In [None]:
VESSEL_COLLAPSE.merge_degradation(RESULTS_PATH)
VESSEL_COLLAPSE.merge_stabilized(RESULTS_PATH)

<a class="anchor" id="explore-simulations"></a>

### EXPLORE SIMULATIONS
<span style="float:right;">[back to top](#toc)</span>

Explore simulation data and results.

In [None]:
import numpy as np
import pandas as pd

In [None]:
df_degradation = pd.read_csv(f"{RESULTS_PATH}VESSEL_COLLAPSE/VESSEL_COLLAPSE_degradation.csv")
df_stabilized = pd.read_csv(f"{RESULTS_PATH}VESSEL_COLLAPSE/VESSEL_COLLAPSE_stabilized.csv")

In [None]:
SUMMARY_METRICS = [x + y for x in ["TOTAL", "AVERAGE"] for y in ["", "_POSITIVE", "_NEGATIVE", "_ZERO"]]

def summarize_metrics(df_baseline, df_modified, edges):
    """Summarize vessel collapse metrics for the selected edges ("TUMOR" or "TISSUE")."""
    # Parse total baseline and modified flow for selected edges.
    # Indices are reset so that rows are paired by position when subtracting.
    total_baseline = df_baseline["FLOW_" + edges].reset_index(drop=True)
    total_modified = df_modified["FLOW_" + edges].reset_index(drop=True)

    # Parse average baseline and modified flow for selected edges
    number_baseline = df_baseline["N_" + edges].reset_index(drop=True)
    number_modified = df_modified["N_" + edges].reset_index(drop=True)
    average_baseline = total_baseline/number_baseline
    average_modified = total_modified/number_modified

    # Sum the differences between modified and baseline flow over all rows
    delta_total = np.sum(total_modified - total_baseline)
    delta_average = np.sum(average_modified - average_baseline)
    
    # Binarize total results
    binary_total_positive = 1 if delta_total > 0 else 0
    binary_total_negative = 1 if delta_total < 0 else 0
    binary_total_zero = 1 if delta_total == 0 else 0
    
    # Binarize average results
    binary_average_positive = 1 if delta_average > 0 else 0
    binary_average_negative = 1 if delta_average < 0 else 0
    binary_average_zero = 1 if delta_average == 0 else 0
    
    return (
        delta_total, binary_total_positive, binary_total_negative, binary_total_zero,
        delta_average, binary_average_positive, binary_average_negative, binary_average_zero,
    )
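
To make the calculation in `summarize_metrics` concrete, here is a small worked sketch on toy data. The flow values and edge counts are made up; only the column names `FLOW_TUMOR` and `N_TUMOR` follow the function above. The modified table is given a non-overlapping index to mimic rows filtered out of a larger table, which shows why the indices are reset before subtracting.

```python
import numpy as np
import pandas as pd

# Toy data (made-up values): three rows each for baseline and modified simulations
df_baseline = pd.DataFrame({"FLOW_TUMOR": [10.0, 12.0, 14.0], "N_TUMOR": [2, 2, 2]})
df_modified = pd.DataFrame(
    {"FLOW_TUMOR": [9.0, 12.0, 10.0], "N_TUMOR": [3, 2, 2]},
    index=[3, 4, 5],  # mimics rows taken from further down a larger table
)

# Without resetting, pandas aligns on index labels and no rows match
misaligned = df_modified["FLOW_TUMOR"] - df_baseline["FLOW_TUMOR"]
print(misaligned.isna().all())

# With reset indices, rows are paired by position
total_baseline = df_baseline["FLOW_TUMOR"].reset_index(drop=True)
total_modified = df_modified["FLOW_TUMOR"].reset_index(drop=True)
average_baseline = total_baseline / df_baseline["N_TUMOR"].reset_index(drop=True)
average_modified = total_modified / df_modified["N_TUMOR"].reset_index(drop=True)

# Summed differences in total and average flow
delta_total = np.sum(total_modified - total_baseline)
delta_average = np.sum(average_modified - average_baseline)

# Binarize the sign of the total flow change
binary_total_negative = 1 if delta_total < 0 else 0

print(delta_total, delta_average, binary_total_negative)
```

Here the total flow differences are -1, 0, and -4, and the average flow differences are -2, 0, and -2, so both summed deltas are negative and the simulation counts toward the `_NEGATIVE` indicator.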

<a class="anchor" id="explore-simulations-degradation-rate"></a>

#### DEGRADATION RATE
<span style="float:right;">[back to top](#toc)</span>

Simulations vary degradation rate from 0.0 to 1.0 $\mu$m/day in increments of 0.1.

In [None]:
def summarize_degradation_metrics(df, structure, seed, degradation):
    """Summarize vessel collapse metrics for degradation rate simulation."""
    df_filtered = df[(df["STRUCTURE"] == structure) & (df["SEED"] == int(seed)) & (df["TIMEPOINT"] > 15)]
    df_baseline = df_filtered[df_filtered["DEGRADATION"] == 0]
    df_modified = df_filtered[df_filtered["DEGRADATION"] == int(degradation)]

    tumor_metrics = summarize_metrics(df_baseline, df_modified, "TUMOR")
    tissue_metrics = summarize_metrics(df_baseline, df_modified, "TISSUE")
    
    return tumor_metrics, tissue_metrics
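
The filtering step can be sketched on a toy table. This sketch assumes that seeds and parameter values are passed as zero-padded strings (for example `"050"`) while the merged CSV stores them as integers, which would explain the `int(...)` conversions above; the source does not confirm this, and all values below are made up.

```python
import pandas as pd

# Toy table with made-up values; column names follow the merged CSV used above
df = pd.DataFrame({
    "STRUCTURE": ["A", "A", "A", "A", "B", "B"],
    "SEED": [0, 0, 0, 0, 0, 0],
    "TIMEPOINT": [15, 16, 15, 16, 16, 16],
    "DEGRADATION": [0, 0, 50, 50, 0, 50],
    "FLOW_TUMOR": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})

# Select one structure and seed; the strict inequality excludes timepoint 15
mask = (df["STRUCTURE"] == "A") & (df["SEED"] == int("00")) & (df["TIMEPOINT"] > 15)
df_filtered = df[mask]

# Baseline is the zero-valued parameter; modified is the requested value
df_baseline = df_filtered[df_filtered["DEGRADATION"] == 0]
df_modified = df_filtered[df_filtered["DEGRADATION"] == int("050")]

print(df_baseline["FLOW_TUMOR"].tolist(), df_modified["FLOW_TUMOR"].tolist())
```

If the parameter values were instead fractional floats such as `0.5`, `int(0.5)` would truncate to `0` and the modified selection would silently match the baseline rows, so the conversion is worth checking against the actual contents of the CSV.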

In [None]:
def summarize_degradation(df):
    """Summarize degradation rate simulations."""
    summary = []
    for structure in VESSEL_COLLAPSE.STRUCTURES:
        for degrade in VESSEL_COLLAPSE.DEGRADATION:
            for seed in VESSEL_COLLAPSE.SEEDS:
                tumor, tissue = summarize_degradation_metrics(df, structure, seed, degrade) 
                summary.append([structure, seed, degrade, *tumor, *tissue])


    columns = ["STRUCTURE", "SEED", "DEGRADATION"] + ["TUMOR_" + m for m in SUMMARY_METRICS] + ["TISSUE_" + m for m in SUMMARY_METRICS]
    return pd.DataFrame(summary, columns=columns)

In [None]:
# summarize degradation rate simulations
degradation_summary = summarize_degradation(df_degradation)

In [None]:
# group by degradation rate value
degradation_summary_by_time = degradation_summary.groupby("DEGRADATION").sum()
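
Because the `_POSITIVE`, `_NEGATIVE`, and `_ZERO` columns hold 0/1 indicators, summing within each group turns them into counts of simulations per parameter value. A minimal sketch on made-up data follows. It drops the `STRUCTURE` and `SEED` columns before summing, which is an illustrative choice rather than what the cell above does: the plain `.sum()` call also sums `SEED`, and how it treats a string column such as `STRUCTURE` depends on the installed pandas version.

```python
import pandas as pd

# Made-up summary rows; column names follow the summary table built above
summary = pd.DataFrame({
    "STRUCTURE": ["A", "A", "B", "B"],
    "SEED": [0, 1, 0, 1],
    "DEGRADATION": [10, 10, 10, 20],
    "TUMOR_TOTAL": [-5.0, 2.0, -1.0, 0.0],
    "TUMOR_TOTAL_POSITIVE": [0, 1, 0, 0],
    "TUMOR_TOTAL_NEGATIVE": [1, 0, 1, 0],
    "TUMOR_TOTAL_ZERO": [0, 0, 0, 1],
})

# Drop identifier columns so only metric columns are summed per parameter value
grouped = summary.drop(columns=["STRUCTURE", "SEED"]).groupby("DEGRADATION").sum()

print(grouped)
```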

In [None]:
# save grouped to use as input to figure
degradation_summary_by_time.to_csv(f"{ANALYSIS_PATH}_/VESSEL_COLLAPSE.DEGRADATION.csv")

<a class="anchor" id="explore-simulations-stabilization-fraction"></a>

#### STABILIZATION FRACTION
<span style="float:right;">[back to top](#toc)</span>

Simulations vary stabilization fraction from 0.0 to 1.0 in increments of 0.1.

In [None]:
def summarize_stabilization_metrics(df, structure, seed, stabilized):
    """Summarize vessel collapse metrics for stabilization fraction simulation."""
    df_filtered = df[(df["STRUCTURE"] == structure) & (df["SEED"] == int(seed)) & (df["TIMEPOINT"] > 15)]
    df_baseline = df_filtered[df_filtered["STABILIZED"] == 0]
    df_modified = df_filtered[df_filtered["STABILIZED"] == int(stabilized)]
    
    tumor_metrics = summarize_metrics(df_baseline, df_modified, "TUMOR")
    tissue_metrics = summarize_metrics(df_baseline, df_modified, "TISSUE")
    
    return tumor_metrics, tissue_metrics

In [None]:
def summarize_stabilization(df):
    """Summarize stabilization fraction simulations."""
    summary = []
    for structure in VESSEL_COLLAPSE.STRUCTURES:
        for stabilized in VESSEL_COLLAPSE.STABILIZED:
            for seed in VESSEL_COLLAPSE.SEEDS:
                tumor, tissue = summarize_stabilization_metrics(df, structure, seed, stabilized) 
                summary.append([structure, seed, stabilized, *tumor, *tissue])


    columns = ["STRUCTURE", "SEED", "STABILIZED"] + ["TUMOR_" + m for m in SUMMARY_METRICS] + ["TISSUE_" + m for m in SUMMARY_METRICS]
    return pd.DataFrame(summary, columns=columns)

In [None]:
# summarize stabilization fraction simulations
stabilized_summary = summarize_stabilization(df_stabilized)

In [None]:
# group by stabilization fraction value
stabilized_summary_by_time = stabilized_summary.groupby("STABILIZED").sum()

In [None]:
# save grouped to use as input to figure
stabilized_summary_by_time.to_csv(f"{ANALYSIS_PATH}_/VESSEL_COLLAPSE.STABILIZED.csv")