# PARSE SIMULATION OUTPUTS

This notebook provides the functions and scripts for parsing simulation output files (`.json`) into pickled NumPy arrays (`.pkl`).

---
- [WORKSPACE VARIABLES](#workspace-variables)
- **[SITE ARCHITECTURE](#site-architecture)**
- **[ESTIMATED HEMODYNAMICS](#estimated-hemodynamics)**
- **[EXACT HEMODYNAMICS](#exact-hemodynamics)**
- **[VASCULAR DAMAGE](#vascular-damage)**
- **[VASCULAR FUNCTION](#vascular-function)**
---

The main parsing function (`parse_simulations`) iterates through each file in the data path and parses each simulation instance, extracting fields from the simulation setup, cells, and environment.
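As a rough illustration of that iteration, the sketch below walks a directory of `.tar.xz` archives (the compressed format described under the workspace variables) and loads every `.json` member. The helper name `iter_simulation_json` is hypothetical and is not part of `parse.py`; the actual `parse_simulations` may be organized differently.

```python
import json
import tarfile
from pathlib import Path


def iter_simulation_json(data_path):
    """Yield (archive name, member name, parsed JSON) for each .json member.

    Hypothetical helper for illustration only; not part of parse.py.
    """
    for archive in sorted(Path(data_path).glob("*.tar.xz")):
        with tarfile.open(archive, "r:xz") as tar:
            for member in tar.getmembers():
                # Skip directories and any non-JSON files in the archive.
                if not (member.isfile() and member.name.endswith(".json")):
                    continue
                with tar.extractfile(member) as handle:
                    yield archive.name, member.name, json.load(handle)
```

Because this is a generator, only one simulation file is held in memory at a time, which matters when each archive contains many seeds.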

The parsed arrays are organized as:

```json
{
    "setup": {
        "radius": R,
        "height": H,
        "time": [],
        "pops": [],
        "types": [],
        "coords": []
    },
    "agents": (N seeds) x (T timepoints) x (H height) x (C coordinates) x (P positions),
    "environments": {
        "glucose": (N seeds) x (T timepoints) x (H height) x (R radius),
        "oxygen": (N seeds) x (T timepoints) x (H height) x (R radius),
        "tgfa": (N seeds) x (T timepoints) x (H height) x (R radius)
    }
}
```

where each entry in the `agents` array is a structured record with the following fields:

```
"pop"       int8    population code
"type"      int8    cell type code
"volume"    int16   cell volume (rounded)
"cycle"     int16   average cell cycle length (rounded)
```
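One way to express these fields is a NumPy structured dtype. The sketch below is a minimal example under that assumption: the name `AGENT_DTYPE`, the placeholder dimensions, and the sample values are illustrative and are not taken from `parse.py`.

```python
import numpy as np

# Field names and integer widths follow the listing above.
AGENT_DTYPE = np.dtype([
    ("pop", np.int8),      # population code
    ("type", np.int8),     # cell type code
    ("volume", np.int16),  # cell volume (rounded)
    ("cycle", np.int16),   # average cell cycle length (rounded)
])

# Placeholder dimensions:
# (N seeds, T timepoints, H height, C coordinates, P positions)
agents = np.zeros((2, 3, 1, 5, 4), dtype=AGENT_DTYPE)

# Fill one position with made-up values, then read one field
# back as a plain int16 array of the same shape.
agents[0, 0, 0, 0, 0] = (1, 2, 2250, 1100)
volumes = agents["volume"]
```

Indexing by field name returns a view with the same five dimensions, so per-field statistics can be computed without unpacking individual records.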

The general parsing functions are defined in `parse.py`, which is imported below from the `scripts` package.

Parsing can take some time, so parsed `.pkl` files for all simulations are provided along with the raw simulation data.
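A parsed `.pkl` file can be read back with the standard `pickle` module. The sketch below assumes each file holds a dictionary laid out as described above; the file name `example.pkl` and the stand-in contents are hypothetical, written to a temporary directory only so that the example runs on its own.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np

# Stand-in for a parsed result, following the layout described above.
parsed = {
    "setup": {"radius": 2, "height": 1, "time": [0, 1],
              "pops": [], "types": [], "coords": []},
    "environments": {"glucose": np.zeros((1, 2, 1, 2))},
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "example.pkl"  # hypothetical file name
    with open(path, "wb") as f:
        pickle.dump(parsed, f)
    with open(path, "rb") as f:
        loaded = pickle.load(f)

# Axes: (N seeds, T timepoints, H height, R radius)
glucose = loaded["environments"]["glucose"]
```

As with any pickle, only load files from a source you trust, since unpickling can execute arbitrary code.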

In [None]:
from scripts.parse import *

<a class="anchor" id="workspace-variables"></a>

### WORKSPACE VARIABLES 

Set up workspace variables for parsing simulations.

- **`DATA_PATH`** is the path to data files (`.tar.xz` files of compressed simulation outputs)
- **`RESULTS_PATH`** is the path for result files (`.pkl` files generated by parsing)
- **`EXCLUDE`** is a list of seeds to exclude from parsing

In [None]:
DATA_PATH = "/path/to/data/files/"
RESULTS_PATH = "/path/to/result/files/"
EXCLUDE = []

<a class="anchor" id="site-architecture"></a>

### SITE ARCHITECTURE

Parse `SITE_ARCHITECTURE` simulations.

In [None]:
parse_simulations("SITE_ARCHITECTURE", DATA_PATH, RESULTS_PATH, EXCLUDE)

<a class="anchor" id="estimated-hemodynamics"></a>

### ESTIMATED HEMODYNAMICS

Parse `ESTIMATED_HEMODYNAMICS` simulations.

In [None]:
parse_simulations("ESTIMATED_HEMODYNAMICS", DATA_PATH, RESULTS_PATH, EXCLUDE)

<a class="anchor" id="exact-hemodynamics"></a>

### EXACT HEMODYNAMICS

Parse `EXACT_HEMODYNAMICS` simulations.

In [None]:
parse_simulations("EXACT_HEMODYNAMICS", DATA_PATH, RESULTS_PATH, EXCLUDE)

<a class="anchor" id="vascular-damage"></a>

### VASCULAR DAMAGE

Parse `VASCULAR_DAMAGE` simulations.

In [None]:
parse_simulations("VASCULAR_DAMAGE", DATA_PATH, RESULTS_PATH, EXCLUDE)

<a class="anchor" id="vascular-function"></a>

### VASCULAR FUNCTION

Parse `VASCULAR_FUNCTION` simulations.

In [None]:
parse_simulations("VASCULAR_FUNCTION", DATA_PATH, RESULTS_PATH, EXCLUDE)