GitHub - MihirBafna/clarify: Multi-level Graph Autoencoder (GAE) to clarify cell-cell interactions and gene regulatory network inference from spatially resolved transcriptomics

Article

Check out our ISMB'2023 and Bioinformatics paper here.

Installation & Setup

Make sure to clone this repository along with the submodules as follows:

git clone --recurse-submodules https://github.com/MihirBafna/clarify.git
cd clarify

To install dependencies into a conda environment, follow the instructions in installation.md.

Data

Three datasets were utilized for evaluation:

seqFISH profile of mouse visual cortex (Zhu et al., 2018)
MERFISH profile of mouse hypothalamic preoptic region (Moffitt et al., 2018)
scMultiSim simulated dataset (Li et al., 2022)

All of the preprocessed data are organized into pandas dataframes and are located at ./data. These dataframes can be used directly as input to Clarify.

Demos & Results

To reproduce results, either run Clarify's preprocessing or copy the preprocessed outputs from this Dropbox link. In the latter case, download the directory and copy each "1_preprocessing_output" folder into its respective folder ("./out/seqfish", "./out/merfish", "./out/scmultisim_final") in your cloned repository.
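
The copy step can also be scripted. The following is a minimal sketch, assuming the downloaded directory contains one subfolder per dataset named after the target folders (seqfish, merfish, scmultisim_final), each holding a "1_preprocessing_output" folder. That layout and the helper name copy_preprocessed are assumptions for illustration and are not documented by the repository, so adjust the paths to match what you actually downloaded.

```python
import shutil
from pathlib import Path

# Dataset folder names taken from the ./out/... paths listed above.
DATASETS = ("seqfish", "merfish", "scmultisim_final")


def copy_preprocessed(download_dir, repo_dir):
    """Copy each 1_preprocessing_output folder into <repo_dir>/out/<dataset>/.

    Assumes (hypothetically) that <download_dir>/<dataset>/1_preprocessing_output
    exists for each dataset; datasets that are missing are skipped.
    Returns the list of dataset names that were copied.
    """
    copied = []
    for name in DATASETS:
        src = Path(download_dir) / name / "1_preprocessing_output"
        if not src.is_dir():
            continue  # layout differs or this dataset was not downloaded
        dst = Path(repo_dir) / "out" / name / "1_preprocessing_output"
        dst.parent.mkdir(parents=True, exist_ok=True)
        # dirs_exist_ok=True (Python 3.8+) lets the copy be re-run safely.
        shutil.copytree(src, dst, dirs_exist_ok=True)
        copied.append(name)
    return copied
```

Calling copy_preprocessed("path/to/download", "path/to/clarify") returns the names of the datasets it found, which makes it easy to spot a folder that was not where the sketch expected it.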

To visualize results (from pretrained Clarify models) and play around with demos, use the following interactive notebooks:

Run Clarify

To run Clarify, run main.py and configure parameters based on their definitions below:

usage: main.py [-h] [-m MODE] [-i INPUTDIRPATH] [-o OUTPUTDIRPATH] [-s STUDYNAME] [-t SPLIT] 
               [-n NUMGENESPERCELL] [-k NEARESTNEIGHBORS] [-l LRDATABASE] [--fp FP] [--fn FN] [-a OWNADJACENCYPATH]

The parameters in the first row of the usage string are required:

-m MODE, --mode MODE clarify mode: preprocess,train (pick one or both separated by a comma)
-i INPUTDIRPATH, --inputdirpath Input directory path where ST dataframe is stored
-o OUTPUTDIRPATH, --outputdirpath Output directory path where results will be stored
-s STUDYNAME, --studyname clarify study name to act as identifier for outputs
-t SPLIT, --split ratio of test edges [0,1)

The parameters in the second row have defaults and are optional:

-n NUMGENESPERCELL, --numgenespercell Number of genes in each gene regulatory network (default 45)
-k NEARESTNEIGHBORS, --nearestneighbors Number of nearest neighbors for each cell (default 5)
-l LRDATABASE, --lrdatabase 0/1/2 for which Ligand-Receptor Database to use (default 0 corresponds to mouse DB)
--fp FP (experimentation only) add # of fake edges to train set [0,1)
--fn FN (experimentation only) remove # of real edges from train set [0,1)
-a OWNADJACENCYPATH, --ownadjacencypath Using your own cell level adjacency (give path)
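
To make the interface concrete, here is a minimal argparse sketch that mirrors the flags documented above. It is not the actual main.py source. The defaults 45, 5, and 0 come from the descriptions above, while the argument types, the defaults for --fp, --fn, and -a, the use of required=True, and the parse_modes helper are assumptions. The real usage string shows every flag in brackets, so main.py probably does not enforce the required flags through argparse.

```python
import argparse


def build_parser():
    """Rebuild the documented main.py interface (a sketch, not the real source)."""
    p = argparse.ArgumentParser(prog="main.py")
    # Required group, per the README.
    p.add_argument("-m", "--mode", required=True,
                   help="preprocess,train (one or both, comma-separated)")
    p.add_argument("-i", "--inputdirpath", required=True)
    p.add_argument("-o", "--outputdirpath", required=True)
    p.add_argument("-s", "--studyname", required=True)
    p.add_argument("-t", "--split", type=float, required=True,
                   help="ratio of test edges in [0,1)")
    # Optional group; the defaults 45, 5, and 0 come from the README.
    p.add_argument("-n", "--numgenespercell", type=int, default=45)
    p.add_argument("-k", "--nearestneighbors", type=int, default=5)
    p.add_argument("-l", "--lrdatabase", type=int, default=0, choices=(0, 1, 2))
    # Defaults of 0.0 for --fp/--fn and None for -a are assumptions.
    p.add_argument("--fp", type=float, default=0.0)
    p.add_argument("--fn", type=float, default=0.0)
    p.add_argument("-a", "--ownadjacencypath", default=None)
    return p


def parse_modes(mode):
    """Split the comma-separated mode string and validate each entry."""
    modes = [m.strip() for m in mode.split(",") if m.strip()]
    unknown = set(modes) - {"preprocess", "train"}
    if unknown:
        raise ValueError(f"unknown mode(s): {sorted(unknown)}")
    return modes


args = build_parser().parse_args(
    ["-m", "preprocess,train", "-i", "in.csv", "-o", "out/", "-s", "demo", "-t", "0.3"]
)
print(parse_modes(args.mode), args.split, args.nearestneighbors)
# prints: ['preprocess', 'train'] 0.3 5
```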

For example, if you wanted to run Clarify (both preprocessing and training) on the seqFISH data input with a 70/30 train-test split, then use the following command and set the output folder and studyname accordingly:

python main.py -m preprocess,train -i ../data/seqFISH/seqfish_dataframe.csv -o [OUTPUT FOLDER PATH] -s [STUDYNAME] -t 0.3

Since we have already preprocessed these datasets (link provided above), you can skip preprocessing by running the following command. Note that you should use the specified output folder (not your own), since that is where the preprocessed results are stored. You can still set your desired studyname.

python main.py -m train -i ../data/seqFISH/seqfish_dataframe.csv -o ../out/seqfish/ -s [STUDYNAME] -t 0.3
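
To illustrate what -t 0.3 means in the commands above, the sketch below holds out 30% of a toy edge list as test edges and keeps the remaining 70% for training. It is only an illustration of the ratio. The function split_edges, the seed parameter, and the toy edges are hypothetical, and Clarify's own splitting procedure may differ (for example, in how it treats negative edges).

```python
import random


def split_edges(edges, test_ratio, seed=0):
    """Shuffle edges and hold out a test_ratio fraction as test edges."""
    if not 0 <= test_ratio < 1:
        raise ValueError("test_ratio must be in [0, 1)")
    shuffled = list(edges)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = int(round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


# Toy graph: 10 edges between hypothetical cell indices.
edges = [(i, i + 1) for i in range(10)]
train, test = split_edges(edges, test_ratio=0.3)
print(len(train), len(test))  # prints: 7 3
```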

Cite

@Article{pmid37387180,
   Author="Bafna, M.  and Li, H.  and Zhang, X. ",
   Title="{{C}{L}{A}{R}{I}{F}{Y}: cell-cell interaction and gene regulatory network refinement from spatially resolved transcriptomics}",
   Journal="Bioinformatics",
   Year="2023",
   Volume="39",
   Number="Supplement_1",
   Pages="i484-i493",
   Month="Jun"
}

