<a class="anchor" id="toc"></a>
# PARSE SIMULATION OUTPUTS

This notebook provides the functions and scripts for parsing simulation output files (`.json`) into pickled NumPy arrays (`.pkl`).

---
- [WORKSPACE VARIABLES](#workspace-variables)
- **[DEFAULT](#default)**
- **[MODULE COMPLEXITY](#module-complexity)**
- **[PARAMETER SENSITIVITY](#parameter-sensitivity)**
- **[GROWTH CONTEXT](#growth-context)**
- **[CELL COMPETITION](#cell-competition)**
- **[POPULATION HETEROGENEITY](#population-heterogeneity)**
---

The main parsing function (`parse_simulations`) iterates through each file in the data path and parses each simulation instance, extracting fields from the simulation setup, cells, and environment.
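As a rough illustration of that loop, the sketch below walks a directory of `.tar.xz` archives and loads every `.json` member it finds. It is not the code in `parse.py`: the helper name `iter_simulation_json` and the assumption that each archive member is a standalone JSON document are illustrative only.

```python
import json
import tarfile
from pathlib import Path


def iter_simulation_json(data_path):
    """Yield (archive name, member name, parsed JSON) for each .json member.

    Hypothetical helper: assumes every .tar.xz archive in data_path holds
    one or more standalone .json files.
    """
    for archive in sorted(Path(data_path).glob("*.tar.xz")):
        # "r:xz" opens an LZMA-compressed tar archive for reading.
        with tarfile.open(archive, "r:xz") as tar:
            for member in tar.getmembers():
                # Skip directories and any non-JSON members.
                if not (member.isfile() and member.name.endswith(".json")):
                    continue
                with tar.extractfile(member) as handle:
                    yield archive.name, member.name, json.load(handle)
```

Reading members directly from the archive avoids unpacking the raw outputs to disk before parsing.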

The parsed arrays are organized as:

```json
{
    "setup": {
        "radius": R,
        "height": H,
        "time": [],
        "pops": [],
        "types": [],
        "coords": []
    },
    "agents": (N seeds) x (T timepoints) x (H height) x (C coordinates) x (P positions),
    "environments": {
        "glucose": (N seeds) x (T timepoints) x (H height) x (R radius),
        "oxygen": (N seeds) x (T timepoints) x (H height) x (R radius),
        "tgfa": (N seeds) x (T timepoints) x (H height) x (R radius)
    }
}
```

where each entry in the `agents` array is a structured record with the following fields:

```
"pop"       int8    population code
"type"      int8    cell type code
"volume"    int16   cell volume (rounded)
"cycle"     int16   average cell cycle length (rounded)
```

General parsing functions are defined in `scripts/parse.py`.

Parsing can take some time, so parsed `.pkl` files for all simulations are provided along with the raw simulation data.
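A minimal sketch of reading one of the parsed files back, assuming each `.pkl` file holds a single dictionary with the layout described above. The helper name `load_parsed`, the file name `DEMO.pkl`, and the array sizes are hypothetical; a small stand-in file is written first so the example runs on its own.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np


def load_parsed(path):
    """Load one parsed .pkl file and return its top-level dictionary."""
    # Only unpickle files from a trusted source; pickle can execute code on load.
    with open(path, "rb") as handle:
        return pickle.load(handle)


# Stand-in for a real parsed file (made-up name and dimensions).
demo = {
    "setup": {"radius": 4, "height": 1, "time": [0.0, 0.5],
              "pops": [], "types": [], "coords": []},
    "environments": {"glucose": np.ones((3, 2, 1, 4))},
}
demo_path = Path(tempfile.mkdtemp()) / "DEMO.pkl"
with open(demo_path, "wb") as handle:
    pickle.dump(demo, handle)

parsed = load_parsed(demo_path)
glucose = parsed["environments"]["glucose"]  # (seeds, timepoints, height, radius)
mean_profile = glucose.mean(axis=0)          # average over seeds
```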

In [None]:
# Import only the function used in this notebook instead of a wildcard import.
from scripts.parse import parse_simulations

<a class="anchor" id="workspace-variables"></a>

### WORKSPACE VARIABLES 
<span style="float:right;">[back to top](#toc)</span>

Set up workspace variables for parsing simulations.

- **`DATA_PATH`** is the path to data files (`.tar.xz` files of compressed simulation outputs)
- **`RESULTS_PATH`** is the path for result files (`.pkl` files generated by parsing)
- **`EXCLUDE`** is a list of seeds to exclude from parsing

In [None]:
DATA_PATH = "/path/to/data/files/"
RESULTS_PATH = "/path/to/result/files/"
EXCLUDE = []
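Before starting a long parsing run, it can help to confirm that the paths are usable. The sketch below is an optional pre-flight check and is not part of `parse.py`: the helper name `check_workspace` is hypothetical, and it assumes the archives sit directly inside the data directory.

```python
from pathlib import Path


def check_workspace(data_path, results_path):
    """Validate the data directory, create the results directory, list archives."""
    data_dir = Path(data_path)
    if not data_dir.is_dir():
        raise FileNotFoundError(f"DATA_PATH does not exist: {data_dir}")

    # Create the results directory (and any missing parents) if needed.
    results_dir = Path(results_path)
    results_dir.mkdir(parents=True, exist_ok=True)

    # Return the names of the compressed archives found in the data directory.
    return sorted(p.name for p in data_dir.glob("*.tar.xz"))
```

Calling `check_workspace(DATA_PATH, RESULTS_PATH)` returns the archive names found, so an empty list is an early hint that `DATA_PATH` points at the wrong directory.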

<a class="anchor" id="default"></a>

### DEFAULT
<span style="float:right;">[back to top](#toc)</span>

Parse `DEFAULT` simulations.

In [None]:
parse_simulations("DEFAULT", DATA_PATH, RESULTS_PATH, EXCLUDE)

<a class="anchor" id="module-complexity"></a>

### MODULE COMPLEXITY
<span style="float:right;">[back to top](#toc)</span>

Parse `MODULE_COMPLEXITY` simulations.

In [None]:
parse_simulations("MODULE_COMPLEXITY", DATA_PATH, RESULTS_PATH, EXCLUDE)

<a class="anchor" id="parameter-sensitivity"></a>

### PARAMETER SENSITIVITY
<span style="float:right;">[back to top](#toc)</span>

Parse `PARAMETER_SENSITIVITY` simulations.

In [None]:
parse_simulations("PARAMETER_SENSITIVITY", DATA_PATH, RESULTS_PATH, EXCLUDE)

<a class="anchor" id="growth-context"></a>

### GROWTH CONTEXT
<span style="float:right;">[back to top](#toc)</span>

Parse `GROWTH_CONTEXT` simulations.

In [None]:
parse_simulations("GROWTH_CONTEXT", DATA_PATH, RESULTS_PATH, EXCLUDE)

<a class="anchor" id="cell-competition"></a>

### CELL COMPETITION
<span style="float:right;">[back to top](#toc)</span>

Parse `CELL_COMPETITION` simulations.

In [None]:
parse_simulations("CELL_COMPETITION", DATA_PATH, RESULTS_PATH, EXCLUDE)

<a class="anchor" id="population-heterogeneity"></a>

### POPULATION HETEROGENEITY
<span style="float:right;">[back to top](#toc)</span>

Parse `POPULATION_HETEROGENEITY` simulations.

In [None]:
parse_simulations("POPULATION_HETEROGENEITY", DATA_PATH, RESULTS_PATH, EXCLUDE)