This package enables network-based protein activity estimation in Python. It also provides interfaces for scanpy (single-cell RNA-seq analysis in Python). Functions are partly ported from the R packages viper and NaRnEA.
- scanpy for the single-cell pipeline
- pandas (>=1.3.0 & <2.0, due to scanpy incompatibility (issue)) and anndata for data computing and storage
- numpy and scipy for scientific computation
- joblib for parallel computing
- tqdm to show progress bars
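The pandas constraint above can be checked before running an analysis. The following is a minimal sketch using only the standard library. The helper names parse_version and pandas_version_ok are hypothetical and not part of pyviper, and the parser assumes plain numeric release versions such as 1.5.3.

```python
from importlib import metadata


def parse_version(text):
    """Turn a version string such as '1.5.3' into a tuple of ints: (1, 5, 3).

    Assumption: only the leading numeric components matter. Parsing stops at
    the first component that is not purely numeric (e.g. '0rc1'), and the
    result is padded with zeros to at least three components.
    """
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def pandas_version_ok(version_text):
    """Return True if the version lies in the range >=1.3.0 and <2.0."""
    return (1, 3, 0) <= parse_version(version_text) < (2, 0, 0)


if __name__ == "__main__":
    try:
        installed = metadata.version("pandas")
    except metadata.PackageNotFoundError:
        print("pandas is not installed")
    else:
        status = "OK" if pandas_version_ok(installed) else "outside the supported range"
        print(f"pandas {installed}: {status}")
```

Run as a script, this reports whether the installed pandas falls inside the range listed above. Pre-release tags are truncated rather than ordered, so this is a convenience check only.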
pip install viper-in-python
git clone https://github.com/alevax/pyviper/
cd pyviper
pip install -e .
import pandas as pd
import anndata
import pyviper
# Load sample data
ges = anndata.read_text("test/unit_tests/test_1/test_1_inputs/LNCaPWT_gExpr_GES.tsv").T
# Load network
network = pyviper.load.msigdb_regulon("h")
# Translate sample data from ensembl to gene names
ges = pyviper.translate_adata_index(ges, desired_format = "human_symbol")
## Filter targets in the interactome
network.filter_targets(ges.var_names)
# Compute regulon activities
## aREA enrichment
activity = pyviper.viper(gex_data=ges, interactome=network, enrichment="area")
print(activity.to_df())
## NaRnEA enrichment
activity = pyviper.viper(gex_data=ges, interactome=network, enrichment="narnea", eset_filter=False)
print(activity.to_df())
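The target-filtering step in the example above restricts each regulon to genes that are actually measured in the expression data. The idea can be sketched with plain Python containers. The dictionary layout, the toy gene names, and this standalone filter_targets helper are illustrative assumptions, not pyviper's internal data structures or its implementation. Dropping regulators that end up with no targets is likewise a choice made for this sketch.

```python
def filter_targets(regulons, measured_genes):
    """Keep only targets that appear in the measured gene set.

    regulons: dict mapping a regulator name to a dict of {target: weight}.
    measured_genes: iterable of gene names present in the expression data.
    Regulators left with no targets are dropped (an assumption of this sketch).
    """
    measured = set(measured_genes)
    filtered = {}
    for regulator, targets in regulons.items():
        kept = {gene: w for gene, w in targets.items() if gene in measured}
        if kept:
            filtered[regulator] = kept
    return filtered


# Toy data, made up for illustration.
toy_regulons = {
    "REG_A": {"GENE1": 1.0, "GENE2": -1.0, "GENE9": 1.0},
    "REG_B": {"GENE8": 1.0},
}
toy_genes = ["GENE1", "GENE2", "GENE3"]

toy_filtered = filter_targets(toy_regulons, toy_genes)
print(toy_filtered)
```

Here REG_A keeps only its two measured targets, and REG_B disappears because none of its targets are measured.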
- Analyzing scRNA-seq data at the Protein Activity Level
- Inferring Protein Activity from scRNA-seq data from multiple cell populations with the meta-VIPER approach
- Generating Metacells for ARACNe3 network generation and VIPER protein activity analysis (note: to be updated soon)
The main functions available from pyviper are:
- pyviper.viper: "pyviper" function for Virtual Inference of Protein Activity by Enriched Regulon Analysis (VIPER). The function allows using 2 enrichment algorithms, aREA and (matrix)-NaRnEA (see below).
- pyviper.aREA: computes aREA (analytic rank-based enrichment analysis) and meta-aREA
- pyviper.NaRnEA: computes matrix-NaRnEA, a vectorized implementation of NaRnEA
- pyviper.pp.translate_adata_index: for translating between species (e.g. mouse vs human) and between ensembl, entrez and gene symbols
- pyviper.tl.path_enr: computes pathway enrichment
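To give an intuition for rank-based enrichment, the sketch below scores one regulon against one gene expression signature. Genes are ranked, ranks are mapped to normal quantiles, and the targets' quantile scores are combined according to their mode of regulation. This is a deliberately simplified toy in the spirit of rank-based enrichment, not the algorithm implemented in pyviper.aREA. The function names, the toy data, and the normalization are all assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def rank_to_quantile_scores(signature):
    """Map each gene's value to a normal quantile based on its rank.

    signature: dict of {gene: value}. The lowest value gets the most negative
    score and the highest the most positive. Ties keep insertion order.
    """
    ordered = sorted(signature, key=signature.get)
    n = len(ordered)
    normal = NormalDist()
    # rank / (n + 1) keeps the quantiles strictly inside (0, 1).
    return {gene: normal.inv_cdf((i + 1) / (n + 1)) for i, gene in enumerate(ordered)}


def toy_enrichment(signature, targets):
    """Signed, rank-based enrichment score for one regulon.

    targets: dict of {gene: mode}, with mode > 0 for activated targets and
    mode < 0 for repressed targets. Targets missing from the signature are
    ignored.
    """
    scores = rank_to_quantile_scores(signature)
    used = {g: m for g, m in targets.items() if g in scores}
    if not used:
        return 0.0
    total = sum(mode * scores[g] for g, mode in used.items())
    # Dividing by the root of the summed squared modes puts the result on a
    # roughly z-score-like scale in this toy setting.
    return total / sqrt(sum(mode ** 2 for mode in used.values()))


# Toy signature and regulon, made up for illustration.
signature = {"G1": 2.5, "G2": 1.8, "G3": 0.1, "G4": -0.3, "G5": -1.9, "G6": -2.2}
targets = {"G1": 1, "G2": 1, "G6": -1}

score = toy_enrichment(signature, targets)
print(f"toy enrichment score: {score:.3f}")
```

The score is positive because the activated targets sit at the top of the signature and the repressed target sits at the bottom. Flipping every mode flips the sign.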
Other notable functions include:
- pyviper.tl.OncoMatch: computes OncoMatch, an algorithm to assess the overlap in differentially active master regulator (MR) proteins between two sets of samples (e.g. to validate GEMMs as effective models of human samples)
- pyviper.pp.stouffer: computes signatures on a cluster-by-cluster basis using the Cluster integration method for pathway enrichment
- pyviper.pp.viper_similarity: computes the similarity between VIPER signatures
- pyviper.pp.repr_metacells: computes representative metacells (e.g. for ARACNe) using our method to maximize unique sample usage and minimize resampling (users can specify depth, percent data usage, etc.)
- pyviper.pp.repr_subsample: selects a representative subsample of data using our method to ensure a widely distributed sampling
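As the name of pyviper.pp.stouffer suggests, cluster-level signatures can be built with Stouffer's method, which combines several z-scores by summing them and dividing by the square root of their count. The sketch below applies that formula per cluster to toy per-sample scores using only the standard library. The data layout and the stouffer_by_cluster helper are assumptions for illustration and do not reflect pyviper's actual signature or implementation.

```python
from collections import defaultdict
from math import sqrt


def stouffer(z_scores):
    """Combine z-scores with Stouffer's method: sum(z) / sqrt(n)."""
    z_scores = list(z_scores)
    if not z_scores:
        raise ValueError("need at least one z-score")
    return sum(z_scores) / sqrt(len(z_scores))


def stouffer_by_cluster(sample_scores, sample_clusters):
    """Integrate per-sample scores into one signature per cluster.

    sample_scores: dict of {sample: {feature: z_score}}.
    sample_clusters: dict of {sample: cluster_label}.
    Returns {cluster_label: {feature: integrated_z}}.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for sample, scores in sample_scores.items():
        cluster = sample_clusters[sample]
        for feature, z in scores.items():
            grouped[cluster][feature].append(z)
    return {
        cluster: {feature: stouffer(zs) for feature, zs in features.items()}
        for cluster, features in grouped.items()
    }


# Toy per-cell scores and cluster labels, made up for illustration.
toy_scores = {
    "cell1": {"P1": 2.0, "P2": -1.0},
    "cell2": {"P1": 2.0, "P2": -1.0},
    "cell3": {"P1": -0.5, "P2": 3.0},
}
toy_clusters = {"cell1": "c0", "cell2": "c0", "cell3": "c1"}

integrated = stouffer_by_cluster(toy_scores, toy_clusters)
print(integrated)
```

Two concordant cells in cluster c0 yield a stronger integrated score than either cell alone, while the single-cell cluster c1 keeps its original values.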
Additionally, the following submodules are available:
- pyviper.load: submodule containing several utility functions useful for different analyses, including load_msigdb_regulon, load_TFs, etc.
- pyviper.pl: submodule containing pyviper wrappers for scanpy plotting
- pyviper.tl: submodule containing pyviper wrappers for scanpy data transformation
- pyviper.config: submodule allowing users to specify the current species and file paths for regulators
Please report any issues you experience through this repository's "Issues" page.
For any other information or queries, please write to Alessandro Vasciaveo (av2729@cumc.columbia.edu).
pyviper is distributed under an MIT License (see LICENSE).
Manuscript in review