GitHub - FunctionLab/mahi: Framework to predict gene essentiality across cell lines.

Mahi is a deep learning framework integrating chromatin features and protein structure with tissue-specific networks for context-dependent gene representation.

Installation

Recommended Installation (using 'environment.yaml')

# clone GitHub repository
git clone https://github.com/FunctionLab/mahi.git
cd mahi

# create Conda environment from YAML
conda env create -f environment.yaml
conda activate mahi

# install PyTorch Geometric dependencies
pip install torch-scatter torch-sparse -f https://data.pyg.org/whl/torch-2.1.0+cu121.html

Manual Installation

# create new Conda environment
conda create --name mahi python=3.10 pytorch=2.1 torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
conda activate mahi

# install dependencies
pip install "numpy<2"
pip install torch-geometric wandb pytorch-lightning ipykernel umap-learn biopython pyfaidx seaborn xgboost
conda install scikit-learn matplotlib pandas -c conda-forge

# install PyTorch Geometric dependencies
pip install torch-scatter torch-sparse -f https://data.pyg.org/whl/torch-2.1.0+cu121.html

# install transformers package
pip install "transformers[torch]"
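
After either installation route, a quick import check can confirm that the environment is complete. The sketch below is not part of the repository. The module list is an assumption derived from the install commands above (for example, the pip package `torch-scatter` is imported as `torch_scatter`), and it only checks that each module can be located, not that the versions match.

```python
from importlib.util import find_spec

# Import names assumed to correspond to the packages installed above.
# This list is illustrative; adjust it to match your environment.
REQUIRED_MODULES = [
    "numpy",
    "torch",
    "torch_geometric",
    "torch_scatter",
    "torch_sparse",
    "pytorch_lightning",
    "transformers",
    "sklearn",
    "xgboost",
]


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules were found.")
```

Run it inside the activated `mahi` environment; any name it reports as missing points to an install step that did not complete.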

Demo: gene essentiality prediction

Start by downloading the data from the following link, then unzip it: https://drive.google.com/drive/folders/1xWfPkC8bs3aQCsI6YMqYpXnSn6f6E1-B?usp=share_link

unzip <data>.zip

This demo runs gene essentiality prediction on one cell line to verify your setup (no GPU needed).

# attach gene essentiality labels to Mahi demo embeddings for lung tissue
python scripts/gene_essentiality/add_labels.py \
  --mahi_root data/demo/mahi_embeddings \
  --data_dir data

# evaluate gene essentiality (5-fold CV + test eval)
python scripts/gene_essentiality/evaluate_mahi_gene_essentiality.py \
  --out_dir outputs/demo \
  --mahi_root data/demo/mahi_embeddings \
  --mapping_file resources/cell_lines.txt \
  --cell_line ACH-000012                         # omit this flag to run on all 1,183 cell lines
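
For readers unfamiliar with the "5-fold CV" mentioned in the command comment, the following standard-library sketch shows the general idea of splitting items into five disjoint validation folds so that every item receives exactly one out-of-fold prediction. It is an illustration only. The function names are hypothetical, and the evaluation script may split its data differently (for example, with stratification).

```python
import random


def k_fold_indices(n_items, k=5, seed=0):
    """Split range(n_items) into k shuffled, disjoint validation folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at position i,
    # so fold sizes differ by at most one.
    return [indices[i::k] for i in range(k)]


def cross_validation_splits(n_items, k=5, seed=0):
    """Yield (train_indices, validation_indices) pairs, one per fold."""
    folds = k_fold_indices(n_items, k, seed)
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


if __name__ == "__main__":
    for fold_id, (train, validation) in enumerate(cross_validation_splits(12)):
        print(f"fold {fold_id}: {len(train)} train, {len(validation)} validation")
```

A model is fit on each `train` split and scored on the matching `validation` split; concatenating the validation predictions gives one held-out prediction per item.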

Optional (HPC/SLURM)

For a much faster runtime on CPUs, you can also submit the demo as a SLURM job:

sbatch demo.slurm

Outputs

outputs/demo/mahi_gene_essentiality_eval/
  ├── mahi.metrics_by_cellline_and_tissue.csv    # summary metrics on training set
  ├── cv_preds/                                  # per-gene out-of-fold predictions
  └── test_preds/                                # per-gene test predictions
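
The column layout of the metrics file is not documented here, so a schema-agnostic preview is a reasonable first look. The sketch below uses only the standard library and makes no assumption about column names; `preview_csv` is a hypothetical helper, not part of the repository.

```python
import csv
from itertools import islice
from pathlib import Path


def preview_csv(path, n_rows=5):
    """Return (column_names, first n_rows rows as dicts) for any CSV file."""
    with Path(path).open(newline="") as handle:
        reader = csv.DictReader(handle)
        rows = list(islice(reader, n_rows))
        # fieldnames is None for an empty file, so fall back to an empty list.
        columns = list(reader.fieldnames or [])
    return columns, rows


if __name__ == "__main__":
    metrics = Path(
        "outputs/demo/mahi_gene_essentiality_eval/"
        "mahi.metrics_by_cellline_and_tissue.csv"
    )
    if metrics.exists():
        columns, rows = preview_csv(metrics)
        print("columns:", columns)
        for row in rows:
            print(row)
    else:
        print(f"{metrics} not found; run the demo first.")
```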

Mahi: End-to-end

Mahi can be run entirely on CPU (unless you are re-training the multigraph GNN). Please download the tissue-specific functional network before running Mahi.

Generate Mahi embeddings

python wt_mahi.py \
  --dir data \
  --tissue lung \
  --checkpoint checkpoints/best-checkpoint.ckpt
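
Since the script takes one tissue per invocation, a small wrapper can queue several runs. This is a minimal sketch that uses only the flags shown in the command above; the helper names are hypothetical, and any tissue name other than `lung` is an assumption that must match the networks you downloaded. By default it only prints the commands (`dry_run=True`).

```python
import subprocess
import sys


def build_wt_command(tissue, data_dir="data",
                     checkpoint="checkpoints/best-checkpoint.ckpt"):
    """Build the argument list for one wt_mahi.py run."""
    return [
        sys.executable, "wt_mahi.py",
        "--dir", data_dir,
        "--tissue", tissue,
        "--checkpoint", checkpoint,
    ]


def run_for_tissues(tissues, dry_run=True):
    """Print (dry run) or execute one wt_mahi.py command per tissue, in order."""
    commands = [build_wt_command(tissue) for tissue in tissues]
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True stops at the first failing run.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    run_for_tissues(["lung"], dry_run=True)
```

Pass `dry_run=False` from the repository root to actually launch the runs one after another.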

Perturbation (gene KO) analysis

python perturb_mahi.py \
  --dir data \
  --gene <Entrez ID> \
  --tissue lung \
  --checkpoint checkpoints/best-checkpoint.ckpt

Rank perturbation effects

python get_top_genes.py \
  --wt data/mahi_embeddings/<tissue>.pkl \
  --ko data/<Entrez ID>/mahi_embeddings/<tissue>.pkl \
  --avg resources/averaged_distances.csv \
  --out data/<Entrez ID>/top_genes_fc.csv \
  --top 1000
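
To make the idea of ranking perturbation effects concrete, here is a toy sketch that is not the logic of `get_top_genes.py`. It assumes each embedding set is a dict mapping a gene ID to a numeric vector, measures the Euclidean distance between each gene's WT and KO vectors, and optionally divides by a per-gene baseline distance to obtain a fold change. The distance metric, the data layout, and the function name are all assumptions made for illustration.

```python
import math


def rank_embedding_shifts(wt, ko, baseline=None, top=10):
    """Rank genes by how far their embedding moves between WT and KO.

    wt, ko:    dicts mapping gene ID -> sequence of floats (assumed layout).
    baseline:  optional dict mapping gene ID -> average distance under random
               perturbations; when given, the score is distance / baseline.
    """
    scores = []
    for gene in wt.keys() & ko.keys():
        score = math.dist(wt[gene], ko[gene])
        if baseline is not None:
            reference = baseline.get(gene)
            if reference is None or reference <= 0:
                continue  # no usable baseline for this gene
            score /= reference
        scores.append((gene, score))
    # Largest shift first; ties are broken by gene ID for reproducibility.
    scores.sort(key=lambda item: (-item[1], item[0]))
    return scores[:top]


if __name__ == "__main__":
    wt = {"1": [0.0, 0.0], "2": [1.0, 1.0], "3": [0.0, 0.0]}
    ko = {"1": [3.0, 4.0], "2": [1.0, 1.0], "3": [0.0, 1.0]}
    print(rank_embedding_shifts(wt, ko))
    # → [('1', 5.0), ('3', 1.0), ('2', 0.0)]
```

Dividing by a baseline highlights genes whose shift is large relative to what random perturbations typically produce, rather than genes that simply move a lot under any perturbation.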

To-Do

Allow processing for multiple tissues at once (WT Mahi & perturb Mahi)
Add code to generate baseline averages across 200 random global perturbations
Add more information about how to run each Mahi step individually for more control
