Lav-i/GNNImpute

GNNImpute

Installation

Option 1: Use a Python virtual environment with conda

Download the source code from GitHub.

git clone git@github.com:Lav-i/GNNImpute.git
cd GNNImpute

Create a Python virtual environment and install the required packages. If your device supports CUDA, you can choose to install a GPU-enabled build of torch.

conda create -n gnnimpute python=3.6
conda activate gnnimpute
pip install -r requirements.txt
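
After installation, a quick check can confirm whether torch sees a GPU. This is a minimal sketch: pick_device is a hypothetical helper, not part of GNNImpute, and it falls back to "cpu" when torch is not installed.

```python
def pick_device():
    """Return "cuda" if torch is installed and sees a GPU, else "cpu"."""
    try:
        import torch  # installed via requirements.txt
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(pick_device())
```

If this prints "cpu" on a machine with a GPU, the installed torch build is probably CPU-only.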

Option 2: Use Docker

Build the image from the Dockerfile, or pull the prebuilt image from Docker Hub.

docker pull razzil/gnnimpute:v0.1.2
docker run --gpus all --rm -it razzil/gnnimpute:v0.1.2

The benchmark data set is already included in the Docker image.

Prepare data

Download

wget https://www.ncbi.nlm.nih.gov/geo/download/\?acc\=GSE65525\&format\=file -O ./data/Klein/GSE65525_RAW.tar
tar xvf ./data/Klein/GSE65525_RAW.tar -C ./data/Klein
wget https://cf.10xgenomics.com/samples/cell-exp/1.1.0/frozen_pbmc_donor_a/frozen_pbmc_donor_a_filtered_gene_bc_matrices.tar.gz -O ./data/PBMC/frozen_pbmc_donor_a_filtered_gene_bc_matrices.tar.gz
tar xvf ./data/PBMC/frozen_pbmc_donor_a_filtered_gene_bc_matrices.tar.gz -C ./data/PBMC
mv ./data/PBMC/filtered_matrices_mex/hg19/* ./data/PBMC
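
Before preprocessing, it can help to verify that the PBMC files landed where the commands above put them. The sketch below assumes the usual 10x Genomics matrix layout (matrix.mtx, genes.tsv, barcodes.tsv); that file list is an assumption, not something this README states, and missing_files is a hypothetical helper.

```python
from pathlib import Path

# Assumed file names for a 10x "filtered_matrices_mex" folder.
EXPECTED_10X_FILES = ("matrix.mtx", "genes.tsv", "barcodes.tsv")


def missing_files(directory, expected=EXPECTED_10X_FILES):
    """Return the expected file names that are absent from `directory`."""
    directory = Path(directory)
    return [name for name in expected if not (directory / name).is_file()]


# An empty list means every expected file was found.
print(missing_files("./data/PBMC"))
```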

Preprocess

Process Klein data set into standard format.

python ./data/Klein/preprocess.py

Process PBMC data set into standard format.

python ./data/PBMC/preprocess.py

The output file (./data/{name}/processed/{name}.h5ad) contains the filtered expression matrix in h5ad format.
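
To inspect a processed file, something like the following may be useful. It is a sketch only: processed_path is a hypothetical helper that mirrors the path pattern above, and the file is loaded only if it exists.

```python
from pathlib import Path


def processed_path(name, root="./data"):
    """Build the path ./data/{name}/processed/{name}.h5ad."""
    return Path(root) / name / "processed" / f"{name}.h5ad"


path = processed_path("Klein")
if path.exists():
    import scanpy as sc  # imported lazily so the path helper works without it

    adata = sc.read_h5ad(path)
    print(adata.shape)       # (number of cells, number of genes)
    print(adata.obs.head())  # per-cell annotations
    print(adata.var.head())  # per-gene annotations
else:
    print(f"{path} not found; run the preprocess script first")
```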

Mask (Get benchmark data)

Mask Klein data set.

python ./data/mask.py --masked_prob=0.1 --dataset=Klein

Mask PBMC data set.

python ./data/mask.py --masked_prob=0.1 --dataset=PBMC

The output folder (./data/{name}/masked/) contains the masked expression matrix in h5ad and csv formats, plus an npz file that records the locations of the simulated dropout events.
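
The idea behind masking can be illustrated with NumPy. This is a sketch of one common scheme, not a copy of mask.py: it assumes a fixed fraction of the non-zero entries is set to zero and that their row and column indices are kept as ground truth. The function name and the toy data are hypothetical.

```python
import numpy as np


def mask_nonzero(X, masked_prob=0.1, seed=0):
    """Zero out a fraction of the non-zero entries of X.

    Returns the masked copy plus the row and column indices of the
    entries that were zeroed (the simulated dropout events).
    """
    rng = np.random.default_rng(seed)
    rows, cols = np.nonzero(X)
    n_mask = int(np.floor(masked_prob * rows.size))
    chosen = rng.choice(rows.size, size=n_mask, replace=False)
    masked = X.copy()
    masked[rows[chosen], cols[chosen]] = 0
    return masked, rows[chosen], cols[chosen]


# Toy count matrix: 20 cells x 10 genes.
counts = np.random.default_rng(1).poisson(1.0, size=(20, 10))
masked, drop_rows, drop_cols = mask_nonzero(counts, masked_prob=0.1)
print(drop_rows.size, "entries masked")
```

Keeping the indices is what makes the masked data usable as a benchmark: imputed values can later be compared with the original values at exactly those positions.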

Usage

Quick Start

import scanpy as sc
from GNNImpute.api import GNNImpute

adata = sc.read_h5ad('./data/Klein/masked/Klein_01.h5ad')

adata = GNNImpute(adata=adata,
                  layer='GATConv',
                  no_cuda=False,
                  epochs=3000,
                  lr=0.001,
                  weight_decay=0.0005,
                  hidden=50,
                  patience=200,
                  fastmode=False,
                  heads=3,
                  use_raw=True,
                  verbose=True)

The returned variable (adata) holds the imputed expression matrix as an AnnData object.
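
With masked benchmark data, imputation quality can be estimated by comparing imputed and original values at the masked positions. The sketch below is an illustration under assumptions: masked_errors is a hypothetical helper, the arrays are toy data, and both matrices are assumed to be on the same scale. The key names inside the npz file are not documented here, so loading it is left out.

```python
import numpy as np


def masked_errors(truth, imputed, rows, cols):
    """Return (MSE, MAE) between truth and imputed at the given positions."""
    t = np.asarray(truth, dtype=float)[rows, cols]
    p = np.asarray(imputed, dtype=float)[rows, cols]
    mse = float(np.mean((t - p) ** 2))
    mae = float(np.mean(np.abs(t - p)))
    return mse, mae


# Toy example: two masked positions, (0, 1) and (1, 0).
truth = np.array([[1.0, 2.0], [3.0, 4.0]])
imputed = np.array([[1.0, 2.5], [2.0, 4.0]])
rows, cols = np.array([0, 1]), np.array([1, 0])
print(masked_errors(truth, imputed, rows, cols))
```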

Tutorials

For more details, please see the Example File.
