# Running ONTraC on a simulated dataset

## Notes

This notebook walks through running ONTraC on a simulated dataset and plotting two of its results.

## ONTraC installation

We assume that you have installed ONTraC by following the instructions below and have opened this notebook with the installed Python kernel (Python 3.11 (ONTraC)).

In [None]:
conda create -n ONTraC python=3.11
conda activate ONTraC
pip install "ONTraC[analysis]==1.*"
pip install ipykernel
python -m ipykernel install --user --name ONTraC --display-name "Python 3.11 (ONTraC)"

## Running ONTraC on simulated data

ONTraC will run on CPU if CUDA is not available.
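
To see in advance which device is likely to be used, you can query PyTorch directly. This is a minimal sketch that assumes ONTraC's CUDA support comes from PyTorch (the notebook does not say so explicitly); `pick_device` is a hypothetical helper, not part of ONTraC.

```python
def pick_device() -> str:
    """Return 'cuda' if PyTorch reports a usable GPU, otherwise 'cpu'."""
    try:
        import torch  # assumption: PyTorch is installed in the ONTraC environment
    except ImportError:
        # Without PyTorch we cannot detect a GPU, so report the CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(pick_device())
```

If this prints `cpu`, the run should still work, but training may take noticeably longer.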


Download `simulated_dataset_meta_input.csv` and the precomputed results from [Zenodo](https://zenodo.org/records/XXXXXX).

In [None]:
%%bash

source ~/.bash_profile
conda activate ONTraC
ONTraC --meta-input simulated_dataset_meta_input.csv --preprocessing-dir simulated_preprocessing --GNN-dir simulated_GNN --NTScore-dir simulated_NTScore \
       --device cuda --epochs 1000 --batch-size 5 -s 42 --patience 100 --min-delta 0.001 --min-epochs 50 --lr 0.03 --hidden-feats 4 -k 6 \
       --modularity-loss-weight 1 --purity-loss-weight 30 --regularization-loss-weight 0.3 --beta 0.3 --equal-space 2>&1 | tee simulated.log
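
Before moving on to visualization, it can help to confirm that the run produced the three output directories named in the command above. The sketch below only checks that each directory exists and is not empty, because the notebook does not list the files ONTraC writes inside them; `missing_or_empty` is a hypothetical helper.

```python
from pathlib import Path

# Directory names taken from the ONTraC command above.
EXPECTED_DIRS = ("simulated_preprocessing", "simulated_GNN", "simulated_NTScore")


def missing_or_empty(base_dir=".", names=EXPECTED_DIRS):
    """Return the names under base_dir that are absent or contain no entries."""
    problems = []
    for name in names:
        path = Path(base_dir) / name
        # A directory counts as a problem if it is missing or has no contents.
        if not path.is_dir() or not any(path.iterdir()):
            problems.append(name)
    return problems


print(missing_or_empty())
```

An empty list suggests the run wrote to all three directories; any name in the list is worth checking against `simulated.log`.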

## Results visualization

Only two simple examples are shown here; see the [post analysis tutorial](../../tutorials/post_analysis.md) for details and more figures.

### Plotting preparation

In [None]:
import numpy as np
import pandas as pd

import matplotlib as mpl

mpl.rcParams['pdf.fonttype'] = 42
mpl.rcParams['ps.fonttype'] = 42
mpl.rcParams['font.family'] = 'Arial'
import matplotlib.pyplot as plt
import seaborn as sns

from ONTraC.analysis.data import AnaData

### Loading ONTraC results

In [1]:
from optparse import Values

options = Values()
# Adjust these paths to point at your own ONTraC output directories and log file
options.preprocessing_dir = 'train_backup/V2/V1_reproduce_simulated_dataset/simulated_preprocessing/'
options.GNN_dir = 'train_backup/V2/V1_reproduce_simulated_dataset/simulated_GNN/'
options.NTScore_dir = 'train_backup/V2/V1_reproduce_simulated_dataset/simulated_NTScore/'
options.log = 'train_backup/V2/V1_reproduce_simulated_dataset/simulated.log'
options.reverse = True  # Set to False to keep the NT score in its original direction
options.embedding_adjust = False

ana_data = AnaData(options)
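
The plotting cells in this notebook read the columns `x`, `y`, and `Cell_Type` from `ana_data.meta_data`, and `x`, `y`, and `Cell_NTScore` from `ana_data.NT_score`. A quick check that those columns are present can save a confusing error later. The sketch below uses a hypothetical helper, `missing_columns`, and toy column lists so that it runs on its own.

```python
def missing_columns(available, required):
    """Return the required column names that are not in `available`, in order."""
    available = set(available)
    return [name for name in required if name not in available]


# Column names as used by the plotting cells in this notebook.
META_COLUMNS = ("x", "y", "Cell_Type")
NT_COLUMNS = ("x", "y", "Cell_NTScore")

# Toy illustration; in the notebook, pass ana_data.meta_data.columns
# and ana_data.NT_score.columns instead of these lists.
print(missing_columns(["x", "y", "Cell_Type"], META_COLUMNS))  # []
print(missing_columns(["x", "y"], NT_COLUMNS))  # ['Cell_NTScore']
```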

### Spatial cell type distribution

In [None]:
cell_type_pal = {'A': '#7CAE00',
                 'B': '#00BC5A',
                 'C': '#00C0B3',
                 'D': '#00B4F0',
                 'E': '#8E92FF',
                 'F': '#EA6AF1',
                 'G': '#FF64B0',
                 'H': '#C42F5D',
                 'I': '#A45900',
                 'J': '#6A7300'}

fig, ax = plt.subplots(1, 1, figsize=(4, 3))
sns.scatterplot(data=ana_data.meta_data,
                x='x',
                y='y',
                hue='Cell_Type',
                palette=cell_type_pal,
                edgecolor=None,
                s=8,
                ax=ax)
# ax.set_aspect('equal', 'box')  # uncomment to give the x and y axes the same scaling
# ax.set_xticks([])  # uncomment to hide the x coordinates
# ax.set_yticks([])  # uncomment to hide the y coordinates
ax.legend(loc='upper left', bbox_to_anchor=(1,1))


fig.tight_layout()
fig.savefig('simulation_spatial_cell_type.png', dpi=300)

![spatial cell type distribution](img/simulation_spatial_cell_type.png)

### Cell-level NT score spatial distribution

In [None]:
fig, ax = plt.subplots(1, 1, figsize = (3.5, 3))
NT_score = ana_data.NT_score['Cell_NTScore'] if not ana_data.options.reverse else 1 - ana_data.NT_score['Cell_NTScore']
scatter = ax.scatter(x=ana_data.NT_score['x'],
                     y=ana_data.NT_score['y'],
                     c=NT_score,
                     cmap='rainbow',
                     vmin=0,
                     vmax=1,
                     s=1)
# ax.set_aspect('equal', 'box')  # uncomment to give the x and y axes the same scaling
# ax.set_xticks([])  # uncomment to hide the x coordinates
# ax.set_yticks([])  # uncomment to hide the y coordinates
plt.colorbar(scatter)
ax.set_title("Cell-level NT score")


fig.tight_layout()
fig.savefig('simulation_cell_level_NT_score.png', dpi=300)

![cell-level NT score](img/simulation_cell_level_NT_score.png)
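
The plotting cell above computes `1 - Cell_NTScore` when `options.reverse` is `True`. A minimal sketch on made-up values shows the effect: the ordering is flipped while the values stay within [0, 1]. The helper `reverse_scores` and the numbers are illustrative only and are not part of ONTraC.

```python
def reverse_scores(scores):
    """Flip scores in [0, 1] so that 0 becomes 1 and 1 becomes 0."""
    return [1 - s for s in scores]


toy = [0.0, 0.25, 1.0]  # made-up scores, not ONTraC output
print(reverse_scores(toy))  # [1.0, 0.75, 0.0]
```

Applying the reversal twice returns the original values, so the choice only affects which end of the trajectory is labeled 0.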