# mcDETECT Tutorial

Authors: Chenyang Yuan, Krupa Patel, Hongshun Shi, Hsiao-Lin V. Wang, Feng Wang, Ronghua Li, Yangping Li, Victor G. Corces, Hailing Shi, Sulagna Das, Jindan Yu, Peng Jin, Bing Yao* and Jian Hu*

### Outline

1. [Installation](#1-installation)
2. [Import Python modules](#2-import-python-modules)
3. [Read in data](#3-read-in-data)
4. [Parameter settings](#4-parameter-settings)
5. [Synapse detection](#5-synapse-detection)
6. [Spatial domain assignment](#6-spatial-domain-assignment)
7. [Synapse transcriptome profiling](#7-synapse-transcriptome-profiling)
8. [Synapse subtyping](#8-synapse-subtyping)

### 1. Installation

The detailed installation procedure is described in [Installation](../README.md/#installation). Here we install `mcDETECT` directly with pip:

```bash
python3 -m pip install mcDETECT
```

Check the current version:

In [8]:
import mcDETECT
mcDETECT.__version__

'1.0.8'
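If the import above fails, it can help to check whether the package is installed at all before debugging further. The following is a minimal sketch using only the standard library; the helper names `installed_version` and `version_tuple` are made up for this illustration and are not part of `mcDETECT`, and `version_tuple` assumes a purely numeric dotted version string such as the `'1.0.8'` shown above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


def version_tuple(version_string):
    """Turn a purely numeric dotted version such as '1.0.8' into (1, 0, 8)."""
    return tuple(int(part) for part in version_string.split("."))


found = installed_version("mcDETECT")
if found is None:
    print("mcDETECT is not installed in this environment")
else:
    print("mcDETECT", found)
```

Comparing tuples rather than strings avoids the pitfall that `'1.0.10'` sorts before `'1.0.8'` as text.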

### 2. Import Python modules

Running this tutorial requires the following Python packages:

In [9]:
import anndata
import math
import matplotlib.colors as clr
import matplotlib.pyplot as plt
import miniball
import numpy as np
import pandas as pd
import scanpy as sc
import SpaGCN as spg
from mcDETECT import mcDETECT

import warnings
warnings.filterwarnings("ignore")
sc.settings.verbosity = 0

### 3. Read in data

The toy dataset used in this tutorial is part of the isocortex region from [Xenium 5K mouse brain data](https://www.10xgenomics.com/datasets/xenium-prime-fresh-frozen-mouse-brain).

`mcDETECT` requires the following inputs:

* Transcript file (dataframe): records gene identity and 3D spatial coordinates of each mRNA molecule

In [10]:
transcripts = pd.read_parquet("toy_data/transcripts.parquet")

Some columns of the transcript file must be renamed to match the input format. After selecting and renaming the columns, the transcript file looks like this:

In [11]:
transcripts = transcripts[['cell_id', 'overlaps_nucleus', 'feature_name', 'x_location', 'y_location', 'z_location']]
transcripts = transcripts.rename(columns = {"feature_name": "target", "x_location": "global_x", "y_location": "global_y", "z_location": "global_z"})
transcripts.head()

| | cell_id | overlaps_nucleus | target | global_x | global_y | global_z |
|---|---|---|---|---|---|---|
| 163006771 | fgdhmaei-1 | 0 | A1cf | 5994.734375 | 2021.46875 | 15.125 |
| 163006772 | UNASSIGNED | 0 | A2m | 5763.109375 | 2043.625 | 15.78125 |
| 163006773 | UNASSIGNED | 0 | A2m | 5951.984375 | 2085.984375 | 16.578125 |
| 163006774 | hieeideh-1 | 1 | Aatf | 5757.59375 | 2163.453125 | 17.28125 |
| 163006775 | fghnlpdi-1 | 1 | Aatf | 5969.40625 | 2149.40625 | 17.625 |
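The select-and-rename step can be wrapped in a small function that fails early when an expected column is absent. This is a sketch on a tiny synthetic table, not the toy dataset itself; `prepare_transcripts`, `XENIUM_TO_MCDETECT`, `KEPT_COLUMNS`, and the `extra_column` field are hypothetical names introduced for the example, while the column mapping is the one used in the cell above.

```python
import pandas as pd

# Mapping from the Xenium column names to the names used in this tutorial.
XENIUM_TO_MCDETECT = {
    "feature_name": "target",
    "x_location": "global_x",
    "y_location": "global_y",
    "z_location": "global_z",
}
# Columns kept from the raw file (dict keys unpack in insertion order).
KEPT_COLUMNS = ["cell_id", "overlaps_nucleus", *XENIUM_TO_MCDETECT]


def prepare_transcripts(raw):
    """Select and rename columns, raising KeyError if any are absent."""
    missing = [col for col in KEPT_COLUMNS if col not in raw.columns]
    if missing:
        raise KeyError(f"transcript table is missing columns: {missing}")
    return raw[KEPT_COLUMNS].rename(columns=XENIUM_TO_MCDETECT)


# Tiny synthetic stand-in for the raw transcript file.
toy = pd.DataFrame({
    "cell_id": ["fgdhmaei-1", "UNASSIGNED"],
    "overlaps_nucleus": [0, 0],
    "feature_name": ["A1cf", "A2m"],
    "x_location": [5994.734375, 5763.109375],
    "y_location": [2021.46875, 2043.625],
    "z_location": [15.125, 15.78125],
    "extra_column": [1, 2],  # hypothetical column that gets dropped
})
prepared = prepare_transcripts(toy)
print(list(prepared.columns))
```

Raising on a missing column gives a clearer message than the error that would otherwise surface later, when a downstream step first looks up `global_x` or `target`.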


In [None]:
transcripts

* Synaptic markers (list)

In [None]:
syn_genes = ['Snap25', 'Camk2a', 'Slc17a7', 'Vamp2', 'Syp', 'Syn1', 'Dlg4', 'Gria2', 'Gap43', 'Gria1', 'Bsn', 'Slc32a1']
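Before running detection, one may want to confirm that every marker in the list is actually measured in the panel. The sketch below uses plain Python so it runs without the dataset; `split_by_presence` is a hypothetical helper, and the short `measured` list is a made-up stand-in for something like `transcripts["target"].unique()`.

```python
def split_by_presence(wanted_genes, measured_genes):
    """Split wanted_genes into (present, missing), preserving input order."""
    measured = set(measured_genes)
    present = [gene for gene in wanted_genes if gene in measured]
    missing = [gene for gene in wanted_genes if gene not in measured]
    return present, missing


syn_genes = ['Snap25', 'Camk2a', 'Slc17a7', 'Vamp2', 'Syp', 'Syn1',
             'Dlg4', 'Gria2', 'Gap43', 'Gria1', 'Bsn', 'Slc32a1']

# Made-up stand-in for the genes found in the transcript table.
measured = ["Snap25", "Camk2a", "A2m", "Aatf"]

present, missing = split_by_presence(syn_genes, measured)
print("present:", present)
print("missing:", missing)
```

Gene symbols are case-sensitive here, so `snap25` would be reported as missing even if `Snap25` is measured.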

* Negative control markers (list)

In [None]:
nc_genes = pd.read_csv('toy_data/negative_controls.csv')
nc_genes = list(nc_genes['Gene'])
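A gene list read from a CSV can carry stray whitespace or duplicate rows, and it may be worth checking that it does not overlap with the synaptic markers. The following is a sketch of such a sanity check using the standard `csv` module on an in-memory file; whether an overlap is actually a problem for `mcDETECT` is an assumption, `read_gene_column` is a hypothetical helper, and `GeneA` and `GeneB` are placeholder names rather than genes from `negative_controls.csv`.

```python
import csv
import io


def read_gene_column(handle, column="Gene"):
    """Read one CSV column, stripping whitespace and dropping duplicates."""
    seen = set()
    genes = []
    for row in csv.DictReader(handle):
        name = row[column].strip()
        if name and name not in seen:
            seen.add(name)
            genes.append(name)
    return genes


# In-memory stand-in for a file with a 'Gene' header (placeholder names).
example_csv = io.StringIO("Gene\nGeneA\nGeneB \nGeneA\nSnap25\n")
nc_example = read_gene_column(example_csv)

# A few of the synaptic markers from this tutorial, for the overlap check.
syn_example = {"Snap25", "Camk2a", "Syp"}
overlap = [gene for gene in nc_example if gene in syn_example]

print("negative controls:", nc_example)
print("also listed as synaptic markers:", overlap)
```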

### 4. Parameter settings

In [None]:
mc = mcDETECT(type = "Xenium", transcripts = transcripts, syn_genes = syn_genes, nc_genes = nc_genes, eps = 1.5, grid_len = 1, cutoff_prob = 0.95, alpha = 5, low_bound = 3,
              size_thr = 5, in_nucleus_thr = (0.5, 0.5), l = 1, rho = 0.2, s = 1, nc_top = 20, nc_thr = 0.1)
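With this many keyword arguments, it can be convenient to keep the settings in one dictionary so that they can be printed, saved alongside results, and reused across runs. The sketch below only organizes the values from the call above and does not import `mcDETECT`; `detect_params` and `describe` are hypothetical names, and passing the dictionary with `**` (shown in the final comment) assumes the constructor accepts these keywords exactly as in the cell above.

```python
# Values copied from the mcDETECT(...) call in this section.
detect_params = {
    "eps": 1.5,
    "grid_len": 1,
    "cutoff_prob": 0.95,
    "alpha": 5,
    "low_bound": 3,
    "size_thr": 5,
    "in_nucleus_thr": (0.5, 0.5),
    "l": 1,
    "rho": 0.2,
    "s": 1,
    "nc_top": 20,
    "nc_thr": 0.1,
}


def describe(params):
    """Format the settings as one 'name = value' line per parameter."""
    return "\n".join(f"{name} = {value!r}" for name, value in sorted(params.items()))


print(describe(detect_params))

# With the data from the previous section loaded, the same call becomes:
# mc = mcDETECT(type="Xenium", transcripts=transcripts,
#               syn_genes=syn_genes, nc_genes=nc_genes, **detect_params)
```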

### 5. Synapse detection

In [None]:
sphere = mc.detect()

In [None]:
sphere

In [None]:
a, b = mc.construct_grid()

In [None]:
len(b)

In [None]:
aaa = mc.spot_expression(grid_len=50)

### 6. Spatial domain assignment

### 7. Synapse transcriptome profiling

In [None]:
a = mc.profile(sphere)

### 8. Synapse subtyping