# GENERATE FIGURE INPUTS

This notebook provides functions and scripts for generating input files for visualization with D3.

---
- [WORKSPACE VARIABLES](#workspace-variables)
- **[SITE ARCHITECTURE](#site-architecture)**
    - [point distances](#site-architecture-point-distances)
    - [type grid](#site-architecture-type-grid)
    - [site scatter](#site-architecture-site-scatter)
    - [outline graph](#site-architecture-outline-graph)
    - [site ladder](#site-architecture-site-ladder)
- **[ESTIMATED HEMODYNAMICS](#estimated-hemodynamics)**
    - [value sensitivity](#estimated-hemodynamics-value-sensitivity)
- **[EXACT HEMODYNAMICS](#exact-hemodynamics)**
    - [capillary density](#exact-hemodynamics-capillary-density)
    - [value sensitivity](#exact-hemodynamics-value-sensitivity)
    - [layout merged](#exact-hemodynamics-layout-merged)
- **[VASCULAR DAMAGE](#vascular-damage)**
    - [value sensitivity](#vascular-damage-value-sensitivity)
- **[VASCULAR FUNCTION](#vascular-function)**
    - [value sensitivity](#vascular-function-value-sensitivity)
    - [layout merged](#vascular-function-layout-merged)
- ***[MULTIPLE SIMULATION SETS](#multiple-simulation-sets)***
    - [pattern compare](#multiple-simulation-sets-pattern-compare)
    - [pattern types](#multiple-simulation-sets-pattern-types)
    - [layout merged](#multiple-simulation-sets-layout-merged)
    - [layout scatter](#multiple-simulation-sets-layout-scatter)
    - [property distribution](#multiple-simulation-sets-property-distribution)
    - [measure ladder](#multiple-simulation-sets-measure-ladder)
---

Each simulation set has a corresponding script of the same name, which contains a class (also of the same name) with the relevant condition variables and methods to iterate through these conditions.
The `loop` method is used most often: it iterates through all conditions, extracts the relevant information, and compiles it into a single output file.
The `loop` method works with either individual files for each condition (`.csv` or `.json`) or a `.tar.xz` compressed archive of the files for each condition.
For a given figure, not all conditions may be used.
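
As a rough illustration of the "individual file or archive" behavior described above, the sketch below loads one condition's JSON from a loose file if present and otherwise from the set's `.tar.xz` archive. This is not the code in the simulation set scripts: the helper name `read_condition` and the file naming (`{name}_{key}{extension}.json` inside `{name}{extension}.tar.xz`) are assumptions made for the example.

```python
import json
import tarfile
from pathlib import Path


def read_condition(path, name, key, extension):
    """Load one condition's JSON from a loose file or from the set's archive.

    Hypothetical helper: the file and archive naming used here are
    assumptions for illustration, not the convention of the actual scripts.
    """
    filename = f"{name}_{key}{extension}.json"
    loose = Path(path) / filename

    # Prefer the individual file when it has already been extracted.
    if loose.is_file():
        with open(loose) as f:
            return json.load(f)

    # Otherwise read the member directly from the compressed archive.
    archive = Path(path) / f"{name}{extension}.tar.xz"
    with tarfile.open(archive, "r:xz") as tar:
        with tar.extractfile(filename) as member:
            return json.load(member)
```

Reading the member in place avoids unpacking the whole archive when only a few conditions are needed for a figure.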

All generated figure inputs are provided in the `analysis` directory.
Where possible, files are provided as compressed archives created with the provided `compress.sh` script, which takes three arguments: `[NAME OF SIMULATION SET] [EXTENSION NAME] [json OR csv]`.
The corresponding arguments for each of the following figures are provided in its description.

The `generate.py` file contains functions to generate figure inputs.

In [None]:
from scripts.generate import *

<a class="anchor" id="workspace-variables"></a>

### WORKSPACE VARIABLES

Set up workspace variables for analyzing simulations.

- **`DATA_PATH`** is the path to data files (`.tar.xz` files of compressed simulation outputs)
- **`RESULTS_PATH`** is the path to result files (`.pkl` files generated by parsing)
- **`ANALYSIS_PATH`** is the path for analysis files (`.json` and `.csv` files, `.tar.xz` compressed archives)

In [None]:
DATA_PATH = "/path/to/data/files/"
RESULTS_PATH = "/path/to/result/files/"
ANALYSIS_PATH = "/path/to/analysis/files/"

<a class="anchor" id="site-architecture"></a>

### SITE ARCHITECTURE

Generate `SITE_ARCHITECTURE` figure inputs.

In [None]:
from scripts.SITE_ARCHITECTURE import SITE_ARCHITECTURE

<a class="anchor" id="site-architecture-point-distances"></a>

**point distances**

The point distances figure shows the maximum distance of cells from the center of the simulation as a function of point source distance from the center.

The function `make_point_distances` calculates this distance for each of the point sources.
These files are then merged into a single file `SITE_ARCHITECTURE.DISTANCES.csv` that is used as input for D3.
Files created by the function `make_point_distances` can be compressed with `compress.sh` using `SITE_ARCHITECTURE .DISTANCES csv`.

In [None]:
sites = ["SOURCE"]
layouts = {
    "SOURCE": [
        "point0","point2","point4","point6","point8","point10",
        "point12","point14","point16","point18","point20",
        "point22","point24","point26","point28","point30",
        "point32","point34","point36","point38",
    ]
}

SITE_ARCHITECTURE.run(ANALYSIS_PATH, RESULTS_PATH, make_point_distances, timepoints=[None], \
                      sites=sites, layouts=layouts)
SITE_ARCHITECTURE.loop(ANALYSIS_PATH, merge_point_distances, save_point_distances, ".DISTANCES", \
                       timepoints=[None], sites=sites, layouts=layouts)

<a class="anchor" id="site-architecture-type-grid"></a>

**type grid**

The type grid figure shows cell type distribution across the colony for different combinations of source grid spacing in the horizontal and vertical directions.

The function `make_type_grid_borders` extracts the farthest border for each of the grid sources across seeds.
These files are then merged into a single file `SITE_ARCHITECTURE.BORDERS.csv` that is used as input for D3.
Files created by the function `make_type_grid_borders` can be compressed with `compress.sh` using `SITE_ARCHITECTURE .BORDERS csv`.

The `.LOCATIONS` files produced in the basic analysis step are also used as input.
Make sure to EXTRACT THE ARCHIVE FILES from `SITE_ARCHITECTURE.LOCATIONS.tar.xz`.
Only the `SOURCE_constant`, `SOURCE_x*y*`, and `SOURCE_grid#` (# = 2, 3, 4, 5) conditions with extension `.LOCATIONS.150.csv` are needed.
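
One way to pull only the needed files out of the archive is sketched below using the standard `tarfile` and `fnmatch` modules. The helper `extract_matching` and the glob patterns in the usage note are hypothetical; the exact member names inside the archive are an assumption and should be checked against its contents.

```python
import fnmatch
import shutil
import tarfile
from pathlib import Path


def extract_matching(archive, destination, patterns):
    """Copy regular-file members matching any glob pattern into `destination`.

    Hypothetical helper. Members are written under their base name only,
    so any directory prefix stored in the archive is dropped.
    """
    destination = Path(destination)
    destination.mkdir(parents=True, exist_ok=True)
    extracted = []
    with tarfile.open(archive, "r:xz") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            if not any(fnmatch.fnmatch(member.name, p) for p in patterns):
                continue
            target = destination / Path(member.name).name
            with tar.extractfile(member) as source, open(target, "wb") as out:
                shutil.copyfileobj(source, out)
            extracted.append(target.name)
    return extracted
```

For the conditions listed above, patterns along the lines of `"*SOURCE_constant*.LOCATIONS.150.csv"`, `"*SOURCE_x?y?*.LOCATIONS.150.csv"`, and `"*SOURCE_grid[2-5]*.LOCATIONS.150.csv"` could be passed, assuming member names contain the condition name followed by the extension.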

In [None]:
sites = ["SOURCE"]
layouts = {
    "SOURCE": [
        "constant","grid2","grid3","grid4","grid5",
        "x1y2","x1y3","x1y4","x1y5",
        "x2y1","x2y3","x2y4","x2y5",
        "x3y1","x3y2","x3y4","x3y5",
        "x4y1","x4y2","x4y3","x4y5",
        "x5y1","x5y2","x5y3","x5y4",
    ]
}

SITE_ARCHITECTURE.run(ANALYSIS_PATH, RESULTS_PATH, make_type_grid_borders, timepoints=[30], \
                      sites=sites, layouts=layouts)
SITE_ARCHITECTURE.loop(ANALYSIS_PATH, merge_type_grid_borders, save_type_grid_borders, ".BORDERS", \
                       timepoints=[None], sites=sites, layouts=layouts)

<a class="anchor" id="site-architecture-site-scatter"></a>

**site scatter**

The site scatter figures show scatter plots of emergent metrics, center concentrations, or cell type distribution as a function of sites (either point sources or grid sources).

The `.SEEDS` and `.CENTERS` files produced in the basic analysis step are merged into single files per metric (`SITE_ARCHITECTURE.SEEDS.*.json` and `SITE_ARCHITECTURE.CENTERS.json`) which are then used as inputs for D3.

In [None]:
METRICS = ["GROWTH", "SYMMETRY", "CYCLES", "ACTIVITY", "TYPES"]
for metric in METRICS:
    SITE_ARCHITECTURE.loop(ANALYSIS_PATH, merge_seeds, save_seeds, f".SEEDS.{metric}", timepoints=[None])
SITE_ARCHITECTURE.loop(ANALYSIS_PATH, merge_centers, save_centers, ".CENTERS", timepoints=[None])

<a class="anchor" id="site-architecture-outline-graph"></a>

**outline graph**

The outline graph figure shows colony outlines of each seed for different graph sources.

The `.OUTLINES` files produced in the basic analysis step are used as input.
Make sure to EXTRACT THE ARCHIVE FILES from `SITE_ARCHITECTURE.OUTLINES.tar.xz`.
Only the `GRAPH`, `PATTERN`, and `SOURCE_constant` conditions with extension `.OUTLINES.150.csv` are needed.

<a class="anchor" id="site-architecture-site-ladder"></a>

**site ladder**

The site ladder figures show ladder plots of emergent metrics and center concentrations for different graph source layouts.

These figures use the same merged files (`SITE_ARCHITECTURE.SEEDS.*.json` and `SITE_ARCHITECTURE.CENTERS.json`) as the **site scatter** figures (see above).

<a class="anchor" id="estimated-hemodynamics"></a>

### ESTIMATED HEMODYNAMICS

Generate `ESTIMATED_HEMODYNAMICS` figure inputs.

In [None]:
from scripts.ESTIMATED_HEMODYNAMICS import ESTIMATED_HEMODYNAMICS

<a class="anchor" id="estimated-hemodynamics-value-sensitivity"></a>

**value sensitivity**

The value sensitivity figures show the sensitivity of emergent metrics or center concentrations to changes in estimated hemodynamic factor weights, along with values from exact hemodynamics simulations.

The `.SEEDS` and `.CENTERS` files produced in the basic analysis step are merged into single files per metric (`ESTIMATED_HEMODYNAMICS.SEEDS.*.json` and `ESTIMATED_HEMODYNAMICS.CENTERS.json`) which are then used as inputs for D3.

These figures include files from the `EXACT_HEMODYNAMICS` set; see corresponding [value sensitivity](#exact-hemodynamics-value-sensitivity) section.

In [None]:
METRICS = ["GROWTH", "SYMMETRY", "CYCLES", "ACTIVITY"]
for metric in METRICS:
    ESTIMATED_HEMODYNAMICS.loop(ANALYSIS_PATH, merge_seeds, save_seeds, f".SEEDS.{metric}", timepoints=[None])
ESTIMATED_HEMODYNAMICS.loop(ANALYSIS_PATH, merge_centers, save_centers, ".CENTERS", timepoints=[None])

<a class="anchor" id="exact-hemodynamics"></a>

### EXACT HEMODYNAMICS

Generate `EXACT_HEMODYNAMICS` figure inputs.

In [None]:
from scripts.EXACT_HEMODYNAMICS import EXACT_HEMODYNAMICS

<a class="anchor" id="exact-hemodynamics-capillary-density"></a>

**capillary density**

The capillary density figure shows capillaries per area for different graph source layouts as a ladder plot.

The function `merge_graph` extracts individual graph metrics from the `.GRAPH` files produced in the basic analysis step.
Capillary density is approximated as the number of capillary edges that cross vertical sections equally spaced across the environment.
Values across conditions and times are merged into a single file `EXACT_HEMODYNAMICS.GRAPH.DENSITY.json` that is used as input for D3.
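
The crossing count described above can be sketched as follows. This is an illustration only, not the implementation in `merge_graph`: the edge representation (pairs of x coordinates for edges already filtered to capillaries), the placement of sections strictly inside the environment, and the name `count_section_crossings` are all assumptions.

```python
def count_section_crossings(edges, width, n_sections):
    """Count edges crossing each of `n_sections` equally spaced vertical sections.

    `edges` is a list of (x_from, x_to) pairs (assumed representation).
    Sections are placed strictly inside the interval (0, width), and an edge
    counts only if the section lies strictly between its two endpoints.
    """
    spacing = width / (n_sections + 1)
    sections = [spacing * (i + 1) for i in range(n_sections)]
    return [
        sum(1 for x0, x1 in edges if min(x0, x1) < x < max(x0, x1))
        for x in sections
    ]


# Three sections at x = 3, 6, 9 across an environment of width 12.
crossings = count_section_crossings([(0, 10), (2, 4), (6, 9)], width=12, n_sections=3)
print(crossings)  # [2, 1, 1]
```

Averaging the per-section counts, or normalizing them by section length, would turn the counts into a density estimate; which normalization the figure uses is not stated here.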

In [None]:
ALL_TIMES = ["010", "020", "030", "040", "050", "060", "070", "080", "090", "100", "110", "120", "130", "140", "150"]
EXACT_HEMODYNAMICS.loop(ANALYSIS_PATH, merge_graph, save_graph, ".GRAPH.DENSITY", timepoints=ALL_TIMES)

<a class="anchor" id="exact-hemodynamics-value-sensitivity"></a>

**value sensitivity**

The value sensitivity figures show the sensitivity of emergent metrics or center concentrations to changes in estimated hemodynamic factor weights, along with values from exact hemodynamics simulations.

The `.SEEDS` and `.CENTERS` files produced in the basic analysis step are merged into single files per metric (`EXACT_HEMODYNAMICS.SEEDS.*.json` and `EXACT_HEMODYNAMICS.CENTERS.json`) which are then used as inputs for D3.

These figures include files from the `ESTIMATED_HEMODYNAMICS` set; see corresponding [value sensitivity](#estimated-hemodynamics-value-sensitivity) section.

In [None]:
METRICS = ["GROWTH", "SYMMETRY", "CYCLES", "ACTIVITY"]
for metric in METRICS:
    EXACT_HEMODYNAMICS.loop(ANALYSIS_PATH, merge_seeds, save_seeds, f".SEEDS.{metric}", timepoints=[None])
EXACT_HEMODYNAMICS.loop(ANALYSIS_PATH, merge_centers, save_centers, ".CENTERS", timepoints=[None])

<a class="anchor" id="exact-hemodynamics-layout-merged"></a>

**layout merged**

The layout merged figures compare emergent metrics or center concentrations for different graph layouts.

The functions `make_layout_merged_metrics` and `make_layout_merged_concentrations` extract the timecourse of metrics or concentrations, respectively, for each seed.
Files created by the functions can be compressed with `compress.sh` using `EXACT_HEMODYNAMICS .MERGED json`.

Files for these figures are combined with files from the `VASCULAR_FUNCTION` set; see corresponding [layout merged](#vascular-function-layout-merged) section and [combined layout merge](#multiple-simulation-sets-layout-merged) section.

In [None]:
EXACT_HEMODYNAMICS.run(ANALYSIS_PATH, RESULTS_PATH, make_layout_merged_metrics)
EXACT_HEMODYNAMICS.load(ANALYSIS_PATH, DATA_PATH, make_layout_merged_concentrations)

<a class="anchor" id="vascular-damage"></a>

### VASCULAR DAMAGE

Generate `VASCULAR_DAMAGE` figure inputs.

In [None]:
from scripts.VASCULAR_DAMAGE import VASCULAR_DAMAGE

<a class="anchor" id="vascular-damage-value-sensitivity"></a>

**value sensitivity**

The value sensitivity figures show the sensitivity of emergent metrics or center concentrations to changes in vascular damage fraction, along with values from simulations with degradation and remodeling.

The `.SEEDS` and `.CENTERS` files produced in the basic analysis step are merged into single files per metric (`VASCULAR_DAMAGE.SEEDS.*.json` and `VASCULAR_DAMAGE.CENTERS.json`) which are then used as inputs for D3.

These figures include files from the `VASCULAR_FUNCTION` set; see corresponding [value sensitivity](#vascular-function-value-sensitivity) section.

In [None]:
METRICS = ["GROWTH", "SYMMETRY", "CYCLES", "ACTIVITY"]
for metric in METRICS:
    VASCULAR_DAMAGE.loop(ANALYSIS_PATH, merge_seeds, save_seeds, f".SEEDS.{metric}", timepoints=[None])
VASCULAR_DAMAGE.loop(ANALYSIS_PATH, merge_centers, save_centers, ".CENTERS", timepoints=[None])

<a class="anchor" id="vascular-function"></a>

### VASCULAR FUNCTION

Generate `VASCULAR_FUNCTION` figure inputs.

In [None]:
from scripts.VASCULAR_FUNCTION import VASCULAR_FUNCTION

<a class="anchor" id="vascular-function-value-sensitivity"></a>

**value sensitivity**

The value sensitivity figures show the sensitivity of emergent metrics or center concentrations to changes in vascular damage fraction, along with values from simulations with degradation and remodeling.

The `.SEEDS` and `.CENTERS` files produced in the basic analysis step are merged into single files per metric (`VASCULAR_FUNCTION.SEEDS.*.json` and `VASCULAR_FUNCTION.CENTERS.json`) which are then used as inputs for D3.

These figures include files from the `VASCULAR_DAMAGE` set; see corresponding [value sensitivity](#vascular-damage-value-sensitivity) section.

In [None]:
METRICS = ["GROWTH", "SYMMETRY", "CYCLES", "ACTIVITY"]
for metric in METRICS:
    VASCULAR_FUNCTION.loop(ANALYSIS_PATH, merge_seeds, save_seeds, f".SEEDS.{metric}", timepoints=[None])
VASCULAR_FUNCTION.loop(ANALYSIS_PATH, merge_centers, save_centers, ".CENTERS", timepoints=[None])

<a class="anchor" id="vascular-function-layout-merged"></a>

**layout merged**

The layout merged figures compare emergent metrics or center concentrations for different graph layouts.

The functions `make_layout_merged_metrics` and `make_layout_merged_concentrations` extract the timecourse of metrics or concentrations, respectively, for each seed.
Files created by the functions can be compressed with `compress.sh` using `VASCULAR_FUNCTION .MERGED json`.

Files for these figures are combined with files from the `EXACT_HEMODYNAMICS` set; see corresponding [layout merged](#exact-hemodynamics-layout-merged) section and [combined layout merge](#multiple-simulation-sets-layout-merged) section.

In [None]:
VASCULAR_FUNCTION.run(ANALYSIS_PATH, RESULTS_PATH, make_layout_merged_metrics)
VASCULAR_FUNCTION.load(ANALYSIS_PATH, DATA_PATH, make_layout_merged_concentrations)

<a class="anchor" id="multiple-simulation-sets"></a>

### MULTIPLE SIMULATION SETS

Generate figure inputs that draw from files from more than one simulation set.

<a class="anchor" id="multiple-simulation-sets-pattern-compare"></a>

**pattern compare**

The pattern compare figures show emergent metrics or center concentrations over time for the pattern layout with varying hemodynamics and damage (using the `EXACT_HEMODYNAMICS`, `VASCULAR_DAMAGE`, and `VASCULAR_FUNCTION` simulation sets).

For emergent metrics, the `.METRICS` files produced in the basic analysis step are merged into single files per metric (`PATTERN_COMPARE.*.json`) which are then used as inputs for D3.
For concentrations, we loop through selected `.tar.xz` data files to extract the glucose and oxygen concentrations at the center of the simulation for all timepoints.

In [None]:
NAMES = ["EXACT_HEMODYNAMICS", "VASCULAR_DAMAGE", "VASCULAR_FUNCTION"]
CONTEXTS = ["C", "CHX"]
CASES = {
    "EXACT_HEMODYNAMICS": [
        [("graphs", "PATTERN")],
    ],
    "VASCULAR_DAMAGE": [
        [("value", "000"), ("frac", "000")],
        [("value", "100"), ("frac", "000")],
        [("value", "000"), ("frac", "100")],
        [("value", "100"), ("frac", "100")],
    ],
    "VASCULAR_FUNCTION": [
        [("graphs", "PATTERN")],
    ],
}

for metric in ["GROWTH", "SYMMETRY", "CYCLES", "ACTIVITY"]:
    make_pattern_compare_metrics(ANALYSIS_PATH, metric, NAMES, CONTEXTS, CASES)
make_pattern_compare_concentrations(DATA_PATH, ANALYSIS_PATH, NAMES, CONTEXTS, CASES)

<a class="anchor" id="multiple-simulation-sets-pattern-types"></a>

**pattern types**

The pattern types figures show cell type distribution across the colony for varying hemodynamics and damage (using the `EXACT_HEMODYNAMICS`, `VASCULAR_DAMAGE`, and `VASCULAR_FUNCTION` simulation sets).

The function `make_pattern_types_locations` merges selected `.LOCATIONS` files produced in the basic analysis step into a single file (`PATTERN_TYPES.LOCATIONS.csv`).
The function `make_pattern_types_borders` extracts the farthest border for each of the combinations across seeds, merged into a single file `PATTERN_TYPES.BORDERS.csv`.
Both files are used as inputs for D3.

In [None]:
NAMES = ["VASCULAR_DAMAGE", "EXACT_HEMODYNAMICS", "VASCULAR_FUNCTION"]
CONTEXTS = ["C", "CHX"]
CASES = {
    "EXACT_HEMODYNAMICS": [
        ("PATTERN", "exact", "no"),
    ],
    "VASCULAR_DAMAGE": [
        ("000_000", "static", "no"),
        ("000_100", "simple", "no"),
        ("100_000", "static", "yes"),
        ("100_100", "simple", "yes"),
    ],
    "VASCULAR_FUNCTION": [
        ("PATTERN", "exact", "yes"),
    ],
}

make_pattern_types_locations(ANALYSIS_PATH, NAMES, CONTEXTS, CASES)
make_pattern_types_borders(RESULTS_PATH, ANALYSIS_PATH, NAMES, CONTEXTS, CASES)

<a class="anchor" id="multiple-simulation-sets-layout-merged"></a>

**layout merged**

The layout merged figures compare emergent metrics or center concentrations for different graph layouts.

The `make_layout_merged` function loops through the simulation sets and contexts to calculate the mean, standard deviation, and confidence interval for the PATTERN layout and all five GRAPH layouts.
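
For reference, these three summary statistics could be computed across seeds roughly as in the sketch below. The type of confidence interval is not stated here, so a 95% normal-approximation interval (`z = 1.96`) is assumed; the helper name `summarize` and the output keys are hypothetical.

```python
import math
import statistics


def summarize(values, z=1.96):
    """Return the mean, sample standard deviation, and CI half-width of `values`.

    The interval is a normal approximation, z * std / sqrt(n), which is an
    assumption of this sketch rather than the documented method.
    """
    mean = statistics.mean(values)
    std = statistics.stdev(values)
    interval = z * std / math.sqrt(len(values))
    return {"mean": mean, "std": std, "interval": interval}


# Each inner list is one seed's timecourse; summarize across seeds per timepoint.
seeds = [[1.0, 2.0, 4.0], [3.0, 2.0, 6.0], [2.0, 2.0, 5.0]]
summary = [summarize(point) for point in zip(*seeds)]
```

With only a handful of seeds, a Student's t multiplier would give a wider and arguably more appropriate interval than the fixed `z` used here.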

Files for these figures use outputs from the `EXACT_HEMODYNAMICS` and `VASCULAR_FUNCTION` sets; see the corresponding `EXACT_HEMODYNAMICS` [layout merged](#exact-hemodynamics-layout-merged) section and `VASCULAR_FUNCTION` [layout merged](#vascular-function-layout-merged) section.

In [None]:
METRICS = ["GROWTH", "SYMMETRY", "CYCLES", "ACTIVITY", "GLUCOSE", "OXYGEN"]
for metric in METRICS:
    make_layout_merged(ANALYSIS_PATH, metric)

<a class="anchor" id="multiple-simulation-sets-layout-scatter"></a>

**layout scatter**

The layout scatter figures show different hemodynamic properties for different layouts and coupling, with overlaid experimental data.

The `make_layout_scatter` function loops through selected `.GRAPH` files produced in the basic analysis step from the `EXACT_HEMODYNAMICS` (uncoupled) and `VASCULAR_FUNCTION` (coupled) sets.
Entries are separated into pattern layout (`PATTERN`) or the root layout (`Lav`, `Lava`, `Lvav`, `Sav`, and `Savav`) outputs.

Note that the figure only shows a randomly selected subset of edges for clarity, but kernel smoothed lines are calculated across all edges.
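
The split between "plot a subset, smooth over everything" can be sketched as below. The smoother used for the figure is not specified here, so a Gaussian-kernel Nadaraya-Watson estimate is assumed purely for illustration; `kernel_smooth`, `subsample`, the bandwidth, and the seed are all hypothetical.

```python
import math
import random


def kernel_smooth(xs, ys, grid, bandwidth):
    """Nadaraya-Watson estimate with a Gaussian kernel, evaluated on `grid`.

    Uses every (x, y) pair. Assumes at least one point lies close enough to
    each grid value that the kernel weights do not all underflow to zero.
    """
    smoothed = []
    for g in grid:
        weights = [math.exp(-0.5 * ((x - g) / bandwidth) ** 2) for x in xs]
        total = sum(weights)
        smoothed.append(sum(w * y for w, y in zip(weights, ys)) / total)
    return smoothed


def subsample(points, n, seed=0):
    """Pick at most `n` points for display, reproducibly for a given seed."""
    rng = random.Random(seed)
    return rng.sample(points, min(n, len(points)))


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0, 2.0, 2.0, 2.0, 2.0]
line = kernel_smooth(xs, ys, grid=[0.5, 2.5], bandwidth=1.0)  # uses all edges
shown = subsample(list(zip(xs, ys)), n=3)                     # plotted subset
```

Fixing the seed keeps the displayed subset stable between runs, so regenerated figure inputs do not change the scatter points.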

In [None]:
make_layout_scatter(ANALYSIS_PATH)

<a class="anchor" id="multiple-simulation-sets-property-distribution"></a>

**property distribution**

The property distribution figures show distribution of different hemodynamic properties for different layouts and coupling as violins.

The `make_property_distribution` function loops through selected `.GRAPH` files produced in the basic analysis step from the `EXACT_HEMODYNAMICS` and `VASCULAR_FUNCTION` sets to extract relevant properties (`RADIUS`, `WALL`, `SHEAR`, `CIRCUM`, `FLOW`, and `PRESSURE`).
Merged files (`PROPERTY_DISTRIBUTION.*.json`) contain dictionaries of binned data keyed on the context and layout type.

The `VASCULAR_FUNCTION` set uses the `C` (colony) and `CH` (tissue) contexts.
The `EXACT_HEMODYNAMICS` set (which does not alter the vascular structure) uses the `C/CH` context, indicating that properties do not differ between contexts.
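
A dictionary of binned data keyed on context and layout type might be assembled along the lines of the sketch below. The key format (`{context}_{layout}`), the equal-width binning, and the helper name `bin_values` are assumptions for illustration and may not match the structure of the actual `PROPERTY_DISTRIBUTION.*.json` files.

```python
from collections import defaultdict


def bin_values(records, lower, upper, n_bins):
    """Count values per (context, layout) key over equal-width bins.

    `records` is an iterable of (context, layout, value) tuples (assumed
    format). Values outside [lower, upper] are skipped, and a value equal
    to `upper` is placed in the last bin.
    """
    width = (upper - lower) / n_bins
    binned = defaultdict(lambda: [0] * n_bins)
    for context, layout, value in records:
        if not lower <= value <= upper:
            continue
        index = min(int((value - lower) / width), n_bins - 1)
        binned[f"{context}_{layout}"][index] += 1
    return dict(binned)


records = [
    ("C", "PATTERN", 0.5),
    ("C", "PATTERN", 1.5),
    ("C", "PATTERN", 2.0),
    ("CH", "GRAPH", 0.1),
]
print(bin_values(records, lower=0, upper=2, n_bins=2))
# {'C_PATTERN': [1, 2], 'CH_GRAPH': [1, 0]}
```

Storing counts per bin rather than raw edge values keeps the merged file small, which matters when every edge in every graph contributes a value.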

In [None]:
make_property_distribution(ANALYSIS_PATH)

<a class="anchor" id="multiple-simulation-sets-measure-ladder"></a>

**measure ladder**

The measure ladder figures show ladder plots of graph measures for different root layouts.

The `make_graph_measures` function loops through selected `.MEASURES` files produced in the basic analysis step from the `EXACT_HEMODYNAMICS` and `VASCULAR_FUNCTION` sets to extract graph measures at day 15 into a single file (`GRAPH_MEASURES.csv`).
The `VASCULAR_FUNCTION` set uses the `C` (colony) and `CH` (tissue) contexts.
The `EXACT_HEMODYNAMICS` set (which does not alter the vascular structure) uses the `C/CH` context, indicating that measures do not differ between contexts.

In [None]:
make_graph_measures(ANALYSIS_PATH)