# Running ONTraC on a Stereo-seq dataset

## Notes

This notebook walks through running ONTraC on Stereo-seq data.

## ONTraC installation

We assume that you have installed ONTraC by following the instructions below and opened this notebook with the installed Python kernel (Python 3.11 (ONTraC)).

In [None]:
conda create -n ONTraC python=3.11
conda activate ONTraC
pip install "ONTraC[analysis]==1.*"
pip install ipykernel
python -m ipykernel install --user --name ONTraC --display-name "Python 3.11 (ONTraC)"

## Running ONTraC on Stereo-seq data

ONTraC will run on CPU if CUDA is not available.
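To see in advance which device a run is likely to use, a minimal check is sketched below. It assumes PyTorch is importable in the ONTraC environment and that `torch.cuda.is_available()` reflects what ONTraC will see at run time; `pick_device` is a hypothetical helper, not part of ONTraC.

```python
def pick_device() -> str:
    """Return 'cuda' when PyTorch can see a GPU, otherwise 'cpu'."""
    try:
        import torch  # assumed to be installed alongside ONTraC
    except ImportError:
        # Without PyTorch we cannot query CUDA, so report the CPU fallback.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Suggested flag: --device {device}")
```

If this prints `cpu`, expect the training step to take noticeably longer than on a GPU.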


Download `stereo_seq_dataset_meta_input.csv` and the precomputed results from [Zenodo](https://zenodo.org/records/XXXXXX).
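Before starting a long run, it can help to confirm that the downloaded file has the columns used later in this notebook. The sketch below is only a sanity check: the column set is an assumption taken from the plotting cells (`Sample`, `Cell_Type`, `x`, `y`), and `missing_columns` is a hypothetical helper rather than an ONTraC function.

```python
import csv
from pathlib import Path

# Columns read by the plotting cells in this notebook (an assumption,
# not an official specification of the ONTraC input format).
REQUIRED_COLUMNS = ("Sample", "Cell_Type", "x", "y")


def missing_columns(csv_path, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the CSV header."""
    with open(csv_path, newline="") as handle:
        header = next(csv.reader(handle), [])
    return [name for name in required if name not in header]


meta_input = Path("stereo_seq_dataset_meta_input.csv")
if meta_input.exists():
    print("Missing columns:", missing_columns(meta_input) or "none")
else:
    print(f"{meta_input} not found; download it first.")
```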

In [1]:
%%bash

source ~/.bash_profile
conda activate ONTraC
ONTraC --meta-input stereo_seq_dataset_meta_input.csv --preprocessing-dir stereo_seq_preprocessing --GNN-dir stereo_seq_GNN --NTScore-dir stereo_seq_NTScore \
       --device cuda --epochs 1000 --batch-size 5 -s 42 --patience 100 --min-delta 0.001 --min-epochs 50 --lr 0.03 --hidden-feats 4 -k 6 \
       --modularity-loss-weight 1 --purity-loss-weight 30 --regularization-loss-weight 0.1 --beta 0.3 --equal-space 2>&1 | tee stereo_seq.log
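
Once the command finishes, a quick way to confirm that it produced something is to check the output locations named on the command line. The sketch below only tests that the three directories exist and are non-empty and that the log file exists; it makes no assumption about which files ONTraC writes inside them. `missing_outputs` is a hypothetical helper.

```python
from pathlib import Path

# Output locations taken from the ONTraC command above.
EXPECTED_DIRS = ("stereo_seq_preprocessing", "stereo_seq_GNN", "stereo_seq_NTScore")
EXPECTED_LOG = "stereo_seq.log"


def missing_outputs(run_dir="."):
    """Return expected directories that are absent or empty, plus the log if it is absent."""
    root = Path(run_dir)
    missing = []
    for name in EXPECTED_DIRS:
        path = root / name
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(name)
    if not (root / EXPECTED_LOG).is_file():
        missing.append(EXPECTED_LOG)
    return missing


print("Missing or empty:", missing_outputs() or "nothing")
```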

## Results visualization

We show only two simple examples here. Please see the [post analysis tutorial](../../tutorials/post_analysis.md) for details and more figures.

### Plotting preparation

In [None]:
import numpy as np
import pandas as pd

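import matplotlib as mpl

# Embed fonts as TrueType (Type 42) so text stays editable in exported PDF/PS files
mpl.rcParams['pdf.fonttype'] = 42
mpl.rcParams['ps.fonttype'] = 42
mpl.rcParams['font.family'] = 'Arial'  # matplotlib falls back to its default font if Arial is not installed
import matplotlib.pyplot as plt
import seaborn as sns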

from ONTraC.analysis.data import AnaData

### Loading ONTraC results

In [1]:
from optparse import Values

options = Values()
# Replace these example paths with your own ONTraC output directories and log file,
# e.g. the ones passed to the ONTraC command above
options.preprocessing_dir = 'train_backup/V2/V1_reproduce_stereo_seq_dataset/stereo_seq_preprocessing/'
options.GNN_dir = 'train_backup/V2/V1_reproduce_stereo_seq_dataset/stereo_seq_GNN/'
options.NTScore_dir = 'train_backup/V2/V1_reproduce_stereo_seq_dataset/stereo_seq_NTScore/'
options.log = 'train_backup/V2/V1_reproduce_stereo_seq_dataset/stereo_seq.log'
options.reverse = True  # set to False to keep the original NT score orientation
options.embedding_adjust = False

ana_data = AnaData(options)
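
The same options object can be built in one step, because `optparse.Values` accepts a dictionary of attribute values. The sketch below derives every path from a single base directory; `base_dir` and its value are placeholders for illustration, and `AnaData` is not called here, so whether it needs trailing slashes on directory paths is left untested.

```python
from optparse import Values
from pathlib import Path

# Hypothetical location of the ONTraC outputs; point this at your own run.
base_dir = Path("path/to/stereo_seq_run")

options = Values({
    "preprocessing_dir": str(base_dir / "stereo_seq_preprocessing"),
    "GNN_dir": str(base_dir / "stereo_seq_GNN"),
    "NTScore_dir": str(base_dir / "stereo_seq_NTScore"),
    "log": str(base_dir / "stereo_seq.log"),
    "reverse": True,            # same meaning as options.reverse above
    "embedding_adjust": False,
})

print(options.GNN_dir)
```

Keeping the base directory in one variable means only one line changes when the results move.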

### Spatial cell type distribution

In [None]:
data_df = ana_data.meta_data
samples = data_df['Sample'].unique()
N = len(samples)
fig, axes = plt.subplots(1, N, figsize = (4 * N, 3))
for i, sample in enumerate(samples):
    sample_df = data_df.loc[data_df['Sample'] == sample]
    ax = axes[i] if N > 1 else axes
    sns.scatterplot(data=sample_df,
                    x='x',
                    y='y',
                    hue='Cell_Type',
                    hue_order=['RGC', 'GlioB', 'NeuB', 'GluNeuB', 'GluNeu', 'GABA', 'Ery', 'Endo', 'Fibro', 'Basal'],  # adjust to your dataset, or remove this line
                    s=8,
                    ax=ax)
    # ax.set_aspect('equal', 'box')  # uncomment to use the same scale on the x and y axes
    # ax.set_xticks([])  # uncomment to hide the x coordinates
    # ax.set_yticks([])  # uncomment to hide the y coordinates
    ax.set_title(f"{sample}")
    ax.legend(loc='upper left', bbox_to_anchor=(1,1))


fig.tight_layout()
fig.savefig('stereo_seq_spatial_cell_type.png', dpi=150)

![spatial cell type distribution](img/stereo_seq_spatial_cell_type.png)
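When adapting `hue_order` to another dataset, it is worth comparing the list against the cell types actually present, since labels absent from `hue_order` may be left out of the plot. The sketch below uses made-up labels; `check_hue_order` is a hypothetical helper that accepts any iterable, so a column such as `data_df['Cell_Type']` should work as input as well.

```python
def check_hue_order(cell_types, hue_order):
    """Return (labels in the data but not in hue_order, labels in hue_order but not in the data)."""
    present = set(cell_types)
    listed = set(hue_order)
    return sorted(present - listed), sorted(listed - present)


hue_order = ['RGC', 'GlioB', 'NeuB', 'GluNeuB', 'GluNeu', 'GABA', 'Ery', 'Endo', 'Fibro', 'Basal']
toy_cell_types = ['RGC', 'GluNeu', 'GluNeu', 'Microglia']  # made-up labels for illustration

unlisted, unused = check_hue_order(toy_cell_types, hue_order)
print("In data but not in hue_order:", unlisted)
print("In hue_order but not in data:", unused)
```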

### Cell-level NT score spatial distribution

In [None]:
samples = ana_data.NT_score['Sample'].unique().tolist()

N = len(samples)
fig, axes = plt.subplots(1, N, figsize=(3.5 * N, 3))
for i, sample in enumerate(samples):
    sample_df = ana_data.NT_score.loc[ana_data.NT_score['Sample'] == sample]
    ax = axes[i] if N > 1 else axes
    NT_score = sample_df['Cell_NTScore'] if not ana_data.options.reverse else 1 - sample_df['Cell_NTScore']
    scatter = ax.scatter(sample_df['x'], sample_df['y'], c=NT_score, cmap='rainbow', vmin=0, vmax=1, s=1)
    # ax.set_aspect('equal', 'box')  # uncomment to use the same scale on the x and y axes
    # ax.set_xticks([])  # uncomment to hide the x coordinates
    # ax.set_yticks([])  # uncomment to hide the y coordinates
    fig.colorbar(scatter, ax=ax)  # attach the colorbar to this sample's axes
    ax.set_title(f"{sample} Cell-level NT Score")

fig.tight_layout()
fig.savefig('stereo_seq_cell_level_NT_score.png', dpi=200)

![cell-level NT score](img/stereo_seq_cell_level_NT_score.png)
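
The plotting cell above flips the score with `1 - Cell_NTScore` when `options.reverse` is set. A minimal sketch of that step on made-up values follows, assuming scores lie in [0, 1] as the `vmin`/`vmax` settings of the plot suggest; `oriented_scores` is a hypothetical helper, not an ONTraC function.

```python
def oriented_scores(scores, reverse):
    """Return the scores unchanged, or flipped to 1 - score when reverse is True."""
    return [1 - s for s in scores] if reverse else list(scores)


toy_scores = [0.0, 0.25, 1.0]  # made-up values in [0, 1]
flipped = oriented_scores(toy_scores, reverse=True)
print(flipped)
```

Reversing keeps every value inside [0, 1] and inverts the ordering, so the two ends of the trajectory swap while the spatial pattern stays the same.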