<a class="anchor" id="toc"></a>
# ANALYZE DATA & RESULTS

This notebook provides functions and scripts for running basic analysis on simulation data and parsed results.

---
- [WORKSPACE VARIABLES](#workspace-variables)
- **[DEFAULT](#default)**
- **[MODULE COMPLEXITY](#module-complexity)**
- **[PARAMETER SENSITIVITY](#parameter-sensitivity)**
- **[GROWTH CONTEXT](#growth-context)**
- **[CELL COMPETITION](#cell-competition)**
- **[POPULATION HETEROGENEITY](#population-heterogeneity)**
---

The following basic analyses are performed for every simulation set:

- `analyze_metrics` extracts the number of cells (`COUNTS`), total volume (`VOLUMES`), average cell cycle length (`CYCLES`), colony diameter (`DIAMETERS`), number of cells of each type (`TYPES`), number of cells in each population (`POPS`), growth rate (`GROWTH`), symmetry (`SYMMETRY`), and activity (`ACTIVITY`) across time
- `analyze_seeds` extracts the above metrics per seed at selected timepoints
- `analyze_locations` extracts cell counts, volumes, types, and populations per location at selected timepoints
- `analyze_distribution` extracts distribution of cell types and populations as a function of radius from the center of the simulation
- `analyze_outlines` extracts colony outline at selected timepoints
- `analyze_concentrations` extracts concentration profiles as a function of radius from the center of the simulation over time
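
As a rough illustration of the kind of result `analyze_distribution` describes, the sketch below counts cell types at each radius. It is not the implementation in `analyze.py`: the `(radius, cell_type)` input layout, the function name `radial_distribution`, and the cell type labels are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def radial_distribution(cells):
    """Count cell types at each radius.

    `cells` is an iterable of (radius, cell_type) pairs. This layout is a
    hypothetical stand-in, not the format of the parsed result files.
    """
    counts = defaultdict(Counter)
    for radius, cell_type in cells:
        counts[radius][cell_type] += 1
    # Sort by radius so the result reads from the center outward.
    return {radius: dict(counts[radius]) for radius in sorted(counts)}


# Hypothetical cells: (radius from center, cell type label).
example_cells = [
    (0, "proliferative"),
    (1, "proliferative"),
    (1, "quiescent"),
    (2, "quiescent"),
]
distribution = radial_distribution(example_cells)
```

Here `distribution` maps each radius to a dictionary of counts per cell type, so the same structure could be built per timepoint or per population.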

Each simulation set has a corresponding script of the same name, which contains a class (also of the same name) with the relevant condition variables and methods to iterate through those conditions.
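
A minimal sketch of what such a class might look like is given below. The class `EXAMPLE_SET`, its condition variables, the file name pattern, and the stand-in function `describe` are hypothetical. Only the call shape `run(output_path, input_path, func, **kwargs)` mirrors how the simulation set classes are called in this notebook.

```python
from itertools import product


class EXAMPLE_SET:
    """Hypothetical simulation set; the name and conditions are illustrative."""

    NAME = "EXAMPLE_SET"
    CONTEXTS = ["C", "CH"]  # assumed condition variable
    POPULATIONS = ["X", "A"]  # assumed condition variable

    @classmethod
    def run(cls, output_path, input_path, func, **kwargs):
        """Call `func` once per combination of condition variables."""
        results = []
        for context, population in product(cls.CONTEXTS, cls.POPULATIONS):
            name = f"{cls.NAME}_{context}_{population}"
            results.append(func(name, input_path, output_path, **kwargs))
        return results


def describe(name, input_path, output_path, timepoints=None):
    """Stand-in for an analysis function; only reports what it was given."""
    return (name, timepoints)


results = EXAMPLE_SET.run(
    "/path/to/analysis/files/",
    "/path/to/result/files/",
    describe,
    timepoints=[2, 16, 30],
)
```

Keeping the iteration inside the class means the same analysis function can be reused across simulation sets without knowing which conditions each set contains.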

The `analyze.py` file contains general functions for running each of the above analyses, as well as specific methods for calculating the metrics and measures listed above.

Analysis can take some time, so all resulting `.json` and `.csv` files are provided in the `analysis` directory.
Note that not all analyses need to be run at the same time.

The analysis files for each simulation set are compressed using the provided `compress.sh` script, which takes the simulation set name as an argument:

```bash
$ ./compress.sh DEFAULT
$ ./compress.sh MODULE_COMPLEXITY
$ ./compress.sh PARAMETER_SENSITIVITY
$ ./compress.sh GROWTH_CONTEXT
$ ./compress.sh CELL_COMPETITION
$ ./compress.sh POPULATION_HETEROGENEITY
```
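
For readers who prefer to stay in Python, the sketch below shows one way to bundle a set's `.json` and `.csv` files into a `.tar.xz` archive with the standard library. It is not a port of `compress.sh`, whose contents are not shown here. The function name `compress_analysis`, the assumption that file names start with the simulation set name, and the archive name are all hypothetical.

```python
import tarfile
from pathlib import Path


def compress_analysis(analysis_path, name):
    """Bundle the .json and .csv files for one simulation set into a .tar.xz.

    Assumes (hypothetically) that analysis file names start with the
    simulation set name; adjust the pattern to the actual naming scheme.
    """
    directory = Path(analysis_path)
    archive = directory / f"{name}.tar.xz"
    # Collect matching files before the archive itself is created.
    files = sorted(
        path
        for path in directory.glob(f"{name}*")
        if path.suffix in {".json", ".csv"}
    )
    with tarfile.open(archive, "w:xz") as tar:
        for path in files:
            # Store files by name only, without the directory prefix.
            tar.add(path, arcname=path.name)
    return archive
```

Calling `compress_analysis(ANALYSIS_PATH, "DEFAULT")` would then write `DEFAULT.tar.xz` into the analysis directory under these assumptions.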

In [None]:
from scripts.analyze import *

<a class="anchor" id="workspace-variables"></a>

### WORKSPACE VARIABLES
<span style="float:right;">[back to top](#toc)</span>

Set up workspace variables for analyzing simulations.

- **`DATA_PATH`** is the path to data files (`.tar.xz` files of compressed simulation outputs)
- **`RESULTS_PATH`** is the path to result files (`.pkl` files generated by parsing)
- **`ANALYSIS_PATH`** is the path for analysis files (`.json` and `.csv` files, `.tar.xz` compressed archives)

In [None]:
DATA_PATH = "/path/to/data/files/"
RESULTS_PATH = "/path/to/result/files/"
ANALYSIS_PATH = "/path/to/analysis/files/"

- **`TIMEPOINTS`** is a list of select timepoint indices
- **`TIMEPOINTS_OFFSET`** is a list of select timepoint indices offset by 1 day
- **`TIMEPOINTS_ADDITIONAL`** is a list of additional select timepoint indices
- **`ALL_TIMEPOINTS`** is a list of all timepoint indices
- **`ALL_TIMEPOINTS_OFFSET`** is a list of all timepoint indices offset by 1 day

In [None]:
TIMEPOINTS = [2, 16, 30]
TIMEPOINTS_OFFSET = [0, 14, 28]
TIMEPOINTS_ADDITIONAL = [2, 10, 16, 20, 30]
ALL_TIMEPOINTS = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30]
ALL_TIMEPOINTS_OFFSET = [0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28]
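
The lists above follow a regular pattern, so they can also be derived rather than typed out. The sketch below rebuilds them under the assumption that a 1 day offset corresponds to a shift of 2 indices, which is what the difference between the plain and offset lists suggests. The lowercase names and the constants `STEP` and `OFFSET` are introduced here for illustration only.

```python
STEP = 2  # spacing between analyzed timepoint indices
OFFSET = 2  # index shift assumed to correspond to 1 day

# All timepoint indices from 2 through 30, and the same indices shifted back.
all_timepoints = list(range(2, 31, STEP))
all_timepoints_offset = [t - OFFSET for t in all_timepoints]

# The selected timepoints and their shifted counterparts.
timepoints = [2, 16, 30]
timepoints_offset = [t - OFFSET for t in timepoints]
```

Deriving the offset lists from the base lists keeps the two in sync if the selected timepoints ever change.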

<a class="anchor" id="default"></a>

### DEFAULT
<span style="float:right;">[back to top](#toc)</span>

Analyze `DEFAULT` simulations.

In [None]:
from scripts.DEFAULT import DEFAULT

In [None]:
DEFAULT.run(ANALYSIS_PATH, RESULTS_PATH, analyze_metrics, timepoints=TIMEPOINTS)
DEFAULT.run(ANALYSIS_PATH, RESULTS_PATH, analyze_seeds, timepoints=TIMEPOINTS)
DEFAULT.run(ANALYSIS_PATH, RESULTS_PATH, analyze_locations, timepoints=TIMEPOINTS)
DEFAULT.run(ANALYSIS_PATH, RESULTS_PATH, analyze_distribution, timepoints=TIMEPOINTS)
DEFAULT.run(ANALYSIS_PATH, RESULTS_PATH, analyze_outlines, timepoints=TIMEPOINTS)
DEFAULT.load(ANALYSIS_PATH, DATA_PATH, analyze_concentrations, timepoints=ALL_TIMEPOINTS)

<a class="anchor" id="module-complexity"></a>

### MODULE COMPLEXITY
<span style="float:right;">[back to top](#toc)</span>

Analyze `MODULE_COMPLEXITY` simulations.

In [None]:
from scripts.MODULE_COMPLEXITY import MODULE_COMPLEXITY

In [None]:
MODULE_COMPLEXITY.run(ANALYSIS_PATH, RESULTS_PATH, analyze_metrics, timepoints=TIMEPOINTS)
MODULE_COMPLEXITY.run(ANALYSIS_PATH, RESULTS_PATH, analyze_seeds, timepoints=TIMEPOINTS)
MODULE_COMPLEXITY.run(ANALYSIS_PATH, RESULTS_PATH, analyze_locations, timepoints=TIMEPOINTS)
MODULE_COMPLEXITY.run(ANALYSIS_PATH, RESULTS_PATH, analyze_distribution, timepoints=TIMEPOINTS)
MODULE_COMPLEXITY.run(ANALYSIS_PATH, RESULTS_PATH, analyze_outlines, timepoints=TIMEPOINTS)
MODULE_COMPLEXITY.load(ANALYSIS_PATH, DATA_PATH, analyze_concentrations, timepoints=ALL_TIMEPOINTS)

<a class="anchor" id="parameter-sensitivity"></a>

### PARAMETER SENSITIVITY
<span style="float:right;">[back to top](#toc)</span>

Analyze `PARAMETER_SENSITIVITY` simulations.

In [None]:
from scripts.PARAMETER_SENSITIVITY import PARAMETER_SENSITIVITY

In [None]:
PARAMETER_SENSITIVITY.run(ANALYSIS_PATH, RESULTS_PATH, analyze_metrics, timepoints=TIMEPOINTS)
PARAMETER_SENSITIVITY.run(ANALYSIS_PATH, RESULTS_PATH, analyze_seeds, timepoints=TIMEPOINTS)
PARAMETER_SENSITIVITY.run(ANALYSIS_PATH, RESULTS_PATH, analyze_locations, timepoints=TIMEPOINTS)
PARAMETER_SENSITIVITY.run(ANALYSIS_PATH, RESULTS_PATH, analyze_distribution, timepoints=TIMEPOINTS)
PARAMETER_SENSITIVITY.run(ANALYSIS_PATH, RESULTS_PATH, analyze_outlines, timepoints=TIMEPOINTS)
PARAMETER_SENSITIVITY.load(ANALYSIS_PATH, DATA_PATH, analyze_concentrations, timepoints=ALL_TIMEPOINTS)

<a class="anchor" id="growth-context"></a>

### GROWTH CONTEXT
<span style="float:right;">[back to top](#toc)</span>

Analyze `GROWTH_CONTEXT` simulations.

In [None]:
from scripts.GROWTH_CONTEXT import GROWTH_CONTEXT

In [None]:
GROWTH_CONTEXT.run(ANALYSIS_PATH, RESULTS_PATH, analyze_metrics, timepoints=TIMEPOINTS)
GROWTH_CONTEXT.run(ANALYSIS_PATH, RESULTS_PATH, analyze_seeds, timepoints=TIMEPOINTS_ADDITIONAL)
GROWTH_CONTEXT.run(ANALYSIS_PATH, RESULTS_PATH, analyze_locations, timepoints=TIMEPOINTS)
GROWTH_CONTEXT.run(ANALYSIS_PATH, RESULTS_PATH, analyze_distribution, timepoints=TIMEPOINTS)
GROWTH_CONTEXT.run(ANALYSIS_PATH, RESULTS_PATH, analyze_outlines, timepoints=TIMEPOINTS)
GROWTH_CONTEXT.load(ANALYSIS_PATH, DATA_PATH, analyze_concentrations, timepoints=ALL_TIMEPOINTS)

<a class="anchor" id="cell-competition"></a>

### CELL COMPETITION
<span style="float:right;">[back to top](#toc)</span>

Analyze `CELL_COMPETITION` simulations.

In [None]:
from scripts.CELL_COMPETITION import CELL_COMPETITION

In [None]:
CELL_COMPETITION.run(ANALYSIS_PATH, RESULTS_PATH, analyze_metrics, timepoints=TIMEPOINTS_OFFSET)
CELL_COMPETITION.run(ANALYSIS_PATH, RESULTS_PATH, analyze_seeds, timepoints=TIMEPOINTS_OFFSET)
CELL_COMPETITION.run(ANALYSIS_PATH, RESULTS_PATH, analyze_locations, timepoints=TIMEPOINTS_OFFSET)
CELL_COMPETITION.run(ANALYSIS_PATH, RESULTS_PATH, analyze_distribution, timepoints=TIMEPOINTS_OFFSET)
CELL_COMPETITION.run(ANALYSIS_PATH, RESULTS_PATH, analyze_outlines, timepoints=TIMEPOINTS_OFFSET)
CELL_COMPETITION.load(ANALYSIS_PATH, DATA_PATH, analyze_concentrations, timepoints=ALL_TIMEPOINTS_OFFSET)

<a class="anchor" id="population-heterogeneity"></a>

### POPULATION HETEROGENEITY
<span style="float:right;">[back to top](#toc)</span>

Analyze `POPULATION_HETEROGENEITY` simulations.

In [None]:
from scripts.POPULATION_HETEROGENEITY import POPULATION_HETEROGENEITY

In [None]:
POPULATION_HETEROGENEITY.run(ANALYSIS_PATH, RESULTS_PATH, analyze_metrics, timepoints=TIMEPOINTS)
POPULATION_HETEROGENEITY.run(ANALYSIS_PATH, RESULTS_PATH, analyze_seeds, timepoints=TIMEPOINTS)
POPULATION_HETEROGENEITY.run(ANALYSIS_PATH, RESULTS_PATH, analyze_locations, timepoints=TIMEPOINTS)
POPULATION_HETEROGENEITY.run(ANALYSIS_PATH, RESULTS_PATH, analyze_distribution, timepoints=TIMEPOINTS)
POPULATION_HETEROGENEITY.run(ANALYSIS_PATH, RESULTS_PATH, analyze_outlines, timepoints=TIMEPOINTS)
POPULATION_HETEROGENEITY.load(ANALYSIS_PATH, DATA_PATH, analyze_concentrations, timepoints=ALL_TIMEPOINTS)