# Running ONTraC on a simulated dataset

## Notes

This notebook walks through running ONTraC on a simulated dataset.

## ONTraC installation

We assume that you have installed ONTraC by following the instructions below and have opened this notebook with the installed Python kernel (Python 3.11 (ONTraC)).

In [None]:
conda create -n ONTraC python=3.11
conda activate ONTraC
pip install "ONTraC[analysis]==1.*"
pip install ipykernel
python -m ipykernel install --user --name ONTraC --display-name "Python 3.11 (ONTraC)"

## Running ONTraC on simulated data

ONTraC will run on CPU if CUDA is not available.
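
To see which device is available before choosing a value for `--device`, a quick check along these lines may help. It is a minimal sketch that assumes PyTorch is what provides CUDA support in the active kernel; `pick_device` is a hypothetical helper, not part of ONTraC.

```python
import importlib.util


def pick_device() -> str:
    """Return 'cuda' when PyTorch reports a usable GPU, otherwise 'cpu'."""
    # Assumption: PyTorch is the backend; fall back to CPU if it is not importable.
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(device)
```

The printed value can then be passed to `--device` in the command further down.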


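Download `simulated_dataset_meta_input.csv` and the precomputed results from [Zenodo](https://zenodo.org/records/XXXXXX). The command below passes `full_simulation_data_with_noise.csv` to `--meta-input`; point that option at the file you downloaded if the names differ.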
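
Before launching a run, it can be worth confirming that the meta-input file has the expected layout. The sketch below assumes the five columns `Cell_ID`, `Sample`, `Cell_Type`, `x`, and `y`; both this column list and the helper `missing_columns` are assumptions for illustration, and the two-row table is toy data, not part of the simulated dataset.

```python
import csv
import io

# Assumed meta-input columns; check the ONTraC documentation for your version.
REQUIRED_COLUMNS = ("Cell_ID", "Sample", "Cell_Type", "x", "y")


def missing_columns(handle, required=REQUIRED_COLUMNS):
    """Return the required column names absent from the CSV header."""
    header = next(csv.reader(handle), [])
    return [name for name in required if name not in header]


# Toy stand-in for the real file (hypothetical values).
toy_csv = io.StringIO(
    "Cell_ID,Sample,Cell_Type,x,y\n"
    "cell_1,sim,A,0.0,0.0\n"
    "cell_2,sim,B,1.0,0.5\n"
)
print(missing_columns(toy_csv))  # prints []
```

For the real file, pass `open('simulated_dataset_meta_input.csv', newline='')` in place of `toy_csv`.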

In [None]:
%%bash

source ~/.bash_profile
conda activate ONTraC
ONTraC --meta-input full_simulation_data_with_noise.csv \
    --NN-dir simulation_NN --GNN-dir simulation_GNN --NT-dir simulation_NT \
    --device cuda --epochs 1000 -s 42 --lr 0.03 --hidden-feats 4 -k 6 \
    --modularity-loss-weight 0.3 --regularization-loss-weight 0.1 \
    --purity-loss-weight 300 --beta 0.03 2>&1 | tee simulation.log
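
Once the command finishes, a quick sanity check is to confirm that the three directories passed to `--NN-dir`, `--GNN-dir`, and `--NT-dir` exist. This is a minimal sketch; `missing_dirs` is a hypothetical helper, and it says nothing about whether the files inside those directories are complete.

```python
import os


def missing_dirs(paths):
    """Return the paths that are not existing directories."""
    return [p for p in paths if not os.path.isdir(p)]


# Directory names taken from the ONTraC command above.
expected = ["simulation_NN", "simulation_GNN", "simulation_NT"]
print("Missing output directories:", missing_dirs(expected))
```

An empty list suggests the run wrote all three output directories; otherwise, `simulation.log` is the first place to look.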

## Results visualization

Only two simple examples are shown here; see the [post analysis tutorial](../../tutorials/post_analysis.md) for details and more figures.

### Plotting preparation

In [None]:
import numpy as np
import pandas as pd

import matplotlib as mpl

# Embed fonts as TrueType (Type 42) so text stays editable in exported PDF/PS files
mpl.rcParams['pdf.fonttype'] = 42
mpl.rcParams['ps.fonttype'] = 42
mpl.rcParams['font.family'] = 'Arial'
import matplotlib.pyplot as plt
import seaborn as sns

from ONTraC.analysis.data import AnaData

### Loading ONTraC results

In [None]:
from ONTraC.analysis.data import AnaData
from optparse import Values

options = Values()
options.NN_dir = 'simulation_NN'
options.GNN_dir = 'simulation_GNN'
options.NT_dir = 'simulation_NT'
options.log = 'simulation.log'
options.reverse = True  # Set to False if you do not want the NT score reversed
options.output = None  # Figures are saved manually below
ana_data = AnaData(options)
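
The `reverse` option flips the orientation of the NT score. Assuming scores lie in [0, 1] and that reversal maps each score `s` to `1 - s` (an assumption about ONTraC's behaviour, not something stated in this notebook), the effect can be illustrated on toy values. `reverse_nt_scores` is a hypothetical helper, not an ONTraC function.

```python
def reverse_nt_scores(scores):
    """Flip scores in [0, 1] so that 0 becomes 1 and vice versa (assumed behaviour)."""
    return [1.0 - s for s in scores]


toy_scores = [0.0, 0.25, 1.0]  # hypothetical NT scores
print(reverse_nt_scores(toy_scores))  # prints [1.0, 0.75, 0.0]
```

Reversing only changes which end of the trajectory is labelled 0 and which is labelled 1; the ordering of cells along it is mirrored, not reshuffled.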

### Spatial cell type distribution

In [None]:
from ONTraC.analysis.cell_type import plot_spatial_cell_type_distribution_dataset_from_anadata


cell_type_pal = {'A': '#7CAE00',
                 'B': '#00BC5A',
                 'C': '#00C0B3',
                 'D': '#00B4F0',
                 'E': '#8E92FF',
                 'F': '#EA6AF1',
                 'G': '#FF64B0',
                 'H': '#C42F5D',
                 'I': '#A45900',
                 'J': '#6A7300'}



import os

fig, axes = plot_spatial_cell_type_distribution_dataset_from_anadata(ana_data=ana_data,
                                                                     palette=cell_type_pal)
os.makedirs('figures', exist_ok=True)  # savefig does not create missing directories
fig.savefig('figures/Spatial_cell_type.png', dpi=150)

![spatial cell type distribution](img/simulation_spatial_cell_type.png)

### Cell-level NT score spatial distribution

In [None]:
from ONTraC.analysis.spatial import plot_cell_NT_score_dataset_from_anadata

fig, ax = plot_cell_NT_score_dataset_from_anadata(ana_data=ana_data)
fig.savefig('cell_level_NT_score.png', dpi=300)

![cell-level NT score](img/simulation_cell_level_NT_score.png)