<a class="anchor" id="toc"></a>
# GENERATE FIGURE INPUTS

This notebook provides functions and scripts for generating input files for visualization with D3.

---
- [WORKSPACE VARIABLES](#workspace-variables)
- **[SYSTEM REPRESENTATION](#system-representation)**
    - [colony borders](#system-representation-colony-borders)
    - [simulation outlines](#system-representation-simulation-outlines)
    - [simulation states](#system-representation-simulation-states)
    - [simulation metrics](#system-representation-simulation-metrics)
- **[CELL VARIABILITY](#cell-variability)**
    - [feature quantiles](#cell-variability-feature-quantiles)
    - [simulation outlines](#cell-variability-simulation-outlines)
    - [simulation states](#cell-variability-simulation-states)
    - [simulation metrics](#cell-variability-simulation-metrics)
- **[NUTRIENT DYNAMICS](#nutrient-dynamics)**
    - [concentration profiles](#nutrient-dynamics-concentration-profiles)
    - [simulation outlines](#nutrient-dynamics-simulation-outlines)
    - [simulation states](#nutrient-dynamics-simulation-states)
    - [simulation metrics](#nutrient-dynamics-simulation-metrics)
---

Each simulation set has a corresponding script of the same name, which contains a class (also of the same name) with the relevant condition variables and methods to iterate through these conditions.
The `loop` method is used most often: it iterates through all conditions, extracts the relevant information, and compiles that information into a single output file.
The `loop` method works with either individual files for each condition (`.csv` or `.json`) or a `.tar.xz` compressed archive of the files for each condition.
For a given figure, not all conditions may be used.
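
As a rough illustration of the pattern described above (not the actual class in the scripts), the sketch below iterates over a product of condition variables, merges each condition into one container, and saves once at the end. The class `ExampleSet`, its condition values, and the `load`/`merge`/`save` callbacks are all hypothetical.

```python
from itertools import product


class ExampleSet:
    """Hypothetical simulation set; names and values are illustrative only."""

    NAME = "EXAMPLE_SET"
    CONTEXTS = ["A", "B"]
    DIMENSIONS = ["2D", "3D"]

    @classmethod
    def conditions(cls, contexts=None, dimensions=None):
        """Yield one key per condition, optionally restricted to a subset."""
        contexts = contexts or cls.CONTEXTS
        dimensions = dimensions or cls.DIMENSIONS
        for context, dimension in product(contexts, dimensions):
            yield f"{cls.NAME}_{context}_{dimension}"

    @classmethod
    def loop(cls, load, merge, save, **subset):
        """Load each condition, merge it into one container, then save once."""
        out = {}
        for key in cls.conditions(**subset):
            merge(key, load(key), out)
        return save(out)
```

Passing a keyword such as `dimensions=["3D"]` restricts the iteration to a subset of conditions, which mirrors how a figure may use only some of them.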

All generated figure inputs are provided in the `analysis` directory.
Wherever possible, files are provided as compressed archives created with the provided `compress.sh` script, which takes three arguments: `[NAME OF SIMULATION SET] [EXTENSION NAME] [json OR csv]`.
The corresponding arguments for each of the following figures are provided in its description.
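
For readers who prefer to stay in Python, a rough equivalent of the compression step might look like the sketch below. It assumes that matching files are named `<NAME>*<EXTENSION>*.<json|csv>` and that the archive is named `<NAME><EXTENSION>.tar.xz`; the actual behavior of `compress.sh` may differ, and the `compress` function here is hypothetical.

```python
import tarfile
from pathlib import Path


def compress(directory, name, extension, suffix):
    """Bundle `<name>*<extension>*.<suffix>` files into `<name><extension>.tar.xz`."""
    directory = Path(directory)
    files = sorted(directory.glob(f"{name}*{extension}*.{suffix}"))
    archive = directory / f"{name}{extension}.tar.xz"

    # "w:xz" writes an LZMA-compressed tar archive.
    with tarfile.open(archive, "w:xz") as tar:
        for path in files:
            tar.add(path, arcname=path.name)  # store without the directory prefix

    return archive, [path.name for path in files]
```

Under these assumptions, `compress(ANALYSIS_PATH, "SYSTEM_REPRESENTATION", ".BORDERS", "csv")` would correspond to the arguments `SYSTEM_REPRESENTATION .BORDERS csv`.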

The `generate.py` file contains functions to generate figure inputs.

In [None]:
from scripts.generate import *

<a class="anchor" id="workspace-variables"></a>

### WORKSPACE VARIABLES

Set up workspace variables for analyzing simulations.

- **`DATA_PATH`** is the path to data files (`.tar.xz` files of compressed simulation outputs)
- **`RESULTS_PATH`** is the path to result files (`.csv` files generated by parsing)
- **`ANALYSIS_PATH`** is the path for analysis files (`.json` and `.csv` files, `.tar.xz` compressed archives)

In [None]:
DATA_PATH = "/path/to/data/files/"
RESULTS_PATH = "/path/to/result/files/"
ANALYSIS_PATH = "/path/to/analysis/files/"

<a class="anchor" id="system-representation"></a>

### SYSTEM REPRESENTATION

Generate `SYSTEM_REPRESENTATION` figure inputs.

In [None]:
from scripts.SYSTEM_REPRESENTATION import SYSTEM_REPRESENTATION

<a class="anchor" id="system-representation-colony-borders"></a>

**colony borders**
<span style="float:right;">
    [go to figure](http://0.0.0.0:8000/figures/colony_borders.html) &#x2022;
    [back to top](#toc)
</span>

The colony borders figure shows an overlay of borders for 2D and different 3D simulation conditions across contexts and geometries for seed 0 at t = 15 days.

The function `get_colony_borders` extracts colony borders for the given timepoint and seed.
Files created by the function can be compressed with `compress.sh` using `SYSTEM_REPRESENTATION .BORDERS csv`.
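
To give a sense of what extracting a border can involve, here is a minimal sketch that assumes a rectangular grid and treats an occupied location as a border location if at least one of its four neighbors is empty. This is an illustration only: `find_border` is a hypothetical helper, and `get_colony_borders` may use a different definition, particularly for other geometries.

```python
def find_border(occupied):
    """Return occupied (x, y) locations with at least one empty 4-neighbor."""
    occupied = set(occupied)
    offsets = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    return {
        (x, y)
        for x, y in occupied
        if any((x + dx, y + dy) not in occupied for dx, dy in offsets)
    }


# A filled 3 x 3 block: every location except the center is on the border.
block = {(x, y) for x in range(3) for y in range(3)}
border = find_border(block)
```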

In [None]:
SYSTEM_REPRESENTATION.run(ANALYSIS_PATH, RESULTS_PATH, get_colony_borders, timepoints=15.0, seeds=0)

<a class="anchor" id="system-representation-simulation-outlines"></a>

**simulation outlines**
<span style="float:right;">
    [go to figure](http://0.0.0.0:8000/figures/simulation_outlines.html) &#x2022;
    [back to top](#toc)
</span>

The simulation outlines figure shows colony outlines for each seed and simulation condition.

The `.OUTLINES` files produced in the basic analysis step are used as input.
Make sure to EXTRACT THE ARCHIVE FILES from `SYSTEM_REPRESENTATION.OUTLINES.tar.xz`.
Only the conditions with extension `.150.csv` are needed.
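
If extracting the whole archive is inconvenient, the sketch below pulls out only the members with a given ending. The helper `extract_matching` is hypothetical and not part of the provided scripts; it writes each matching member directly into the destination directory.

```python
import tarfile
from pathlib import Path


def extract_matching(archive, destination, ending):
    """Extract only archive members whose names end with `ending`."""
    destination = Path(destination)
    destination.mkdir(parents=True, exist_ok=True)
    extracted = []

    with tarfile.open(archive, "r:xz") as tar:
        for member in tar.getmembers():
            if member.isfile() and member.name.endswith(ending):
                # Keep only the base name so files land directly in `destination`.
                target = destination / Path(member.name).name
                target.write_bytes(tar.extractfile(member).read())
                extracted.append(member.name)

    return extracted


# Example (paths are placeholders):
# extract_matching("SYSTEM_REPRESENTATION.OUTLINES.tar.xz", "outlines/", ".150.csv")
```

The same helper can be used for other endings, such as `.150.00.csv` for the `.POSITIONS` archives.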

<a class="anchor" id="system-representation-simulation-states"></a>

**simulation states**
<span style="float:right;">
    [go to figure](http://0.0.0.0:8000/figures/simulation_states.html) &#x2022;
    [back to top](#toc)
</span>

The simulation states figure shows the state of each cell for seed 0 in each simulation condition.

The `.POSITIONS` files produced in the basic analysis step are used as input.
Make sure to EXTRACT THE ARCHIVE FILES from `SYSTEM_REPRESENTATION.POSITIONS.tar.xz`.
Only the conditions with extension `.150.00.csv` are needed.

<a class="anchor" id="system-representation-simulation-metrics"></a>

**simulation metrics**
<span style="float:right;">
    [go to figure](http://0.0.0.0:8000/figures/simulation_metrics.html) &#x2022;
    [back to top](#toc)
</span>

The simulation metrics figure shows emergent metrics over time for populations grouped by selected conditions and context.

The `.METRICS` files produced in the basic analysis step are merged into one file per metric (`SYSTEM_REPRESENTATION.METRICS.*.json`); these merged files are then used as inputs for D3.

For simulations in 3D, we also extract metrics from only the center layer (z = 0).
Files created by the function `get_center_layers` can be compressed with `compress.sh` using `SYSTEM_REPRESENTATION .LAYERS json`.
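
Restricting to the center layer amounts to keeping only rows with z = 0 before computing a metric. The sketch below counts cells per timepoint in the center layer; the column names `time` and `z` and the function `count_center_layer` are assumptions for illustration, not the format used by the provided scripts.

```python
import csv
import io
from collections import Counter


def count_center_layer(csv_text):
    """Count cells per timepoint using only rows in the center layer (z == 0)."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        if int(row["z"]) == 0:
            counts[float(row["time"])] += 1
    return dict(counts)


# Hypothetical positions table with one off-center row at time 0.0.
example = "time,x,y,z\n0.0,0,0,0\n0.0,1,0,1\n15.0,0,0,0\n15.0,1,1,0\n"
center_counts = count_center_layer(example)
```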

In [None]:
dimensions = ["3D"]
SYSTEM_REPRESENTATION.run(ANALYSIS_PATH, RESULTS_PATH, get_center_layers_metrics,
                          dimensions=dimensions)
SYSTEM_REPRESENTATION.run(ANALYSIS_PATH, RESULTS_PATH, get_center_layer_seeds,
                          timepoints=[15.0], dimensions=dimensions)

In [None]:
METRICS = ["GROWTH", "SYMMETRY", "CYCLES"]
for metric in METRICS:
    SYSTEM_REPRESENTATION.loop(ANALYSIS_PATH, merge_metrics, save_metrics, f".METRICS.{metric}", timepoints=[None])

<a class="anchor" id="cell-variability"></a>

### CELL VARIABILITY

Generate `CELL_VARIABILITY` figure inputs.

In [None]:
from scripts.CELL_VARIABILITY import CELL_VARIABILITY

<a class="anchor" id="cell-variability-feature-quantiles"></a>

**feature quantiles**
<span style="float:right;">
    [go to figure](http://0.0.0.0:8000/figures/feature_quantiles.html) &#x2022;
    [back to top](#toc)
</span>

The feature quantiles figure shows distributions of age and volume for selected contexts and conditions.

The function `get_feature_quantiles` calculates quantiles across all seeds and timepoints.
These files are then merged into a single file `CELL_VARIABILITY.QUANTILES.json` that is used as input for D3.
Files created by `get_feature_quantiles` can be compressed with `compress.sh` using `CELL_VARIABILITY .QUANTILES csv`.
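
As a small worked example of the quantile step, the sketch below pools one feature (for example, volume) across seeds and returns the cut points. It assumes values are pooled across seeds for a given timepoint and uses the inclusive interpolation method from the standard library; `feature_quantiles` is a hypothetical helper, and `get_feature_quantiles` may compute quantiles differently.

```python
from statistics import quantiles


def feature_quantiles(values_by_seed, n=4):
    """Pool one feature across seeds and return the n - 1 quantile cut points."""
    pooled = sorted(value for values in values_by_seed.values() for value in values)
    return quantiles(pooled, n=n, method="inclusive")


# Two hypothetical seeds; pooled values are [1, 2, 3, 4, 5].
quartiles = feature_quantiles({0: [1, 2, 3], 1: [4, 5]})
# quartiles == [2.0, 3.0, 4.0]
```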

In [None]:
timepoints = [1, 2, 3, 4, 8, 15]
CELL_VARIABILITY.load(ANALYSIS_PATH, DATA_PATH, get_feature_quantiles, timepoints=timepoints)
CELL_VARIABILITY.loop(ANALYSIS_PATH, merge_feature_quantiles, save_feature_quantiles,
                      ".QUANTILES", timepoints=[None])

<a class="anchor" id="cell-variability-simulation-outlines"></a>

**simulation outlines**
<span style="float:right;">
    [go to figure](http://0.0.0.0:8000/figures/simulation_outlines.html) &#x2022;
    [back to top](#toc)
</span>

The simulation outlines figure shows colony outlines for each seed and simulation condition.

The `.OUTLINES` files produced in the basic analysis step are used as input.
Make sure to EXTRACT THE ARCHIVE FILES from `CELL_VARIABILITY.OUTLINES.tar.xz`.
Only the conditions with extension `.150.csv` are needed.

<a class="anchor" id="cell-variability-simulation-states"></a>

**simulation states**
<span style="float:right;">
    [go to figure](http://0.0.0.0:8000/figures/simulation_states.html) &#x2022;
    [back to top](#toc)
</span>

The simulation states figure shows the state of each cell for seed 0 in each simulation condition.

The `.POSITIONS` files produced in the basic analysis step are used as input.
Make sure to EXTRACT THE ARCHIVE FILES from `CELL_VARIABILITY.POSITIONS.tar.xz`.
Only the conditions with extension `.150.00.csv` are needed.

<a class="anchor" id="cell-variability-simulation-metrics"></a>

**simulation metrics**
<span style="float:right;">
    [go to figure](http://0.0.0.0:8000/figures/simulation_metrics.html) &#x2022;
    [back to top](#toc)
</span>

The simulation metrics figure shows emergent metrics over time for populations grouped by selected conditions and context.

The `.METRICS` files produced in the basic analysis step are merged into one file per metric (`CELL_VARIABILITY.METRICS.*.json`); these merged files are then used as inputs for D3.

In [None]:
METRICS = ["GROWTH", "SYMMETRY", "CYCLES"]
for metric in METRICS:
    CELL_VARIABILITY.loop(ANALYSIS_PATH, merge_metrics, save_metrics, f".METRICS.{metric}", timepoints=[None])

<a class="anchor" id="nutrient-dynamics"></a>

### NUTRIENT DYNAMICS

Generate `NUTRIENT_DYNAMICS` figure inputs.

In [None]:
from scripts.NUTRIENT_DYNAMICS import NUTRIENT_DYNAMICS

<a class="anchor" id="nutrient-dynamics-concentration-profiles"></a>

**concentration profiles**
<span style="float:right;">
    [go to figure](http://0.0.0.0:8000/figures/concentration_profiles.html) &#x2022;
    [back to top](#toc)
</span>

The concentration profiles figure shows timecourse of average glucose concentration at the center of the simulation across seeds for all conditions.

The function `make_concentration_profiles` extracts center concentrations for every timepoint.
Files created by the function can be compressed with `compress.sh` using `NUTRIENT_DYNAMICS .PROFILES csv`.
The `.PROFILES` files produced are merged into a single file (`NUTRIENT_DYNAMICS.PROFILES.csv`) which is used as input for D3.
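
The averaging shown in the figure can be sketched as follows, assuming each seed provides a mapping from timepoint to center glucose concentration. The data layout and the function `average_profiles` are hypothetical; the values are made up for illustration.

```python
from statistics import mean


def average_profiles(profiles_by_seed):
    """Average center concentration across seeds at each timepoint."""
    timepoints = sorted({t for profile in profiles_by_seed.values() for t in profile})
    return {
        t: mean(profile[t] for profile in profiles_by_seed.values() if t in profile)
        for t in timepoints
    }


# Two hypothetical seeds sampled at the same timepoints.
averaged = average_profiles({
    0: {0.0: 4.0, 0.5: 3.0},
    1: {0.0: 6.0, 0.5: 5.0},
})
# averaged == {0.0: 5.0, 0.5: 4.0}
```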

In [None]:
NUTRIENT_DYNAMICS.load(ANALYSIS_PATH, DATA_PATH, make_concentration_profiles, timepoints=None)
NUTRIENT_DYNAMICS.loop(ANALYSIS_PATH, merge_concentration_profiles, save_concentration_profiles,
                       ".PROFILES", timepoints=[None])

<a class="anchor" id="nutrient-dynamics-simulation-outlines"></a>

**simulation outlines**
<span style="float:right;">
    [go to figure](http://0.0.0.0:8000/figures/simulation_outlines.html) &#x2022;
    [back to top](#toc)
</span>

The simulation outlines figure shows colony outlines for each seed and simulation condition.

The `.OUTLINES` files produced in the basic analysis step are used as input.
Make sure to EXTRACT THE ARCHIVE FILES from `NUTRIENT_DYNAMICS.OUTLINES.tar.xz`.
Only the conditions with extension `.150.csv` are needed.

<a class="anchor" id="nutrient-dynamics-simulation-states"></a>

**simulation states**
<span style="float:right;">
    [go to figure](http://0.0.0.0:8000/figures/simulation_states.html) &#x2022;
    [back to top](#toc)
</span>

The simulation states figure shows the state of each cell for seed 0 in each simulation condition.

The `.POSITIONS` files produced in the basic analysis step are used as input.
Make sure to EXTRACT THE ARCHIVE FILES from `NUTRIENT_DYNAMICS.POSITIONS.tar.xz`.
Only the conditions with extension `.150.00.csv` are needed.

<a class="anchor" id="nutrient-dynamics-simulation-metrics"></a>

**simulation metrics**
<span style="float:right;">
    [go to figure](http://0.0.0.0:8000/figures/simulation_metrics.html) &#x2022;
    [back to top](#toc)
</span>

The simulation metrics figure shows emergent metrics over time for populations grouped by selected conditions and context.

The `.METRICS` files produced in the basic analysis step are merged into one file per metric (`NUTRIENT_DYNAMICS.METRICS.*.json`); these merged files are then used as inputs for D3.

In [None]:
METRICS = ["GROWTH", "SYMMETRY", "CYCLES"]
for metric in METRICS:
    NUTRIENT_DYNAMICS.loop(ANALYSIS_PATH, merge_metrics, save_metrics, f".METRICS.{metric}", timepoints=[None])