Python implementation of PUMA

Description

Implementation of the Puma algorithm. Python code based on the Panda implementation pypanda from https://github.com/QuackenbushLab/pypanda and https://github.com/davidvi/pypanda.

Links to literature

  • PUMA (PANDA Using MicroRNA Associations)
    Manuscript in preparation, used in PUMA.
    C and MATLAB code: https://github.com/mararie/PUMA

  • PANDA (Passing Attributes between Networks for Data Assimilation)
    Glass K, Huttenhower C, Quackenbush J, Yuan GC. Passing Messages Between Biological Networks to Refine Predicted Interactions, PLoS One, 2013 May 31;8(5):e64832
    Original PANDA C++ code: http://sourceforge.net/projects/panda-net/.

  • LIONESS (Linear Interpolation to Obtain Network Estimates for Single Samples)
    Marieke Lydia Kuijjer, Matthew Tung, GuoCheng Yuan, John Quackenbush, Kimberly Glass. Estimating sample-specific regulatory networks.

LIONESS can be used to estimate single-sample networks using aggregate networks made with any network reconstruction algorithm (http://arxiv.org/pdf/1505.06440.pdf).
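
The interpolation behind LIONESS can be illustrated with a small, dependency-free sketch. It assumes the equation from the LIONESS manuscript (single-sample edge = N × (edge from all samples − edge without the sample) + edge without the sample) and uses Pearson correlation as the aggregate network method. The function names `pearson` and `lioness_edge` are hypothetical and are not part of this package.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def lioness_edge(gene_a, gene_b, q):
    """Estimate the edge weight between two genes for sample index q.

    gene_a and gene_b are lists holding one expression value per sample.
    """
    n = len(gene_a)
    # Aggregate edge estimated from all samples.
    all_samples = pearson(gene_a, gene_b)
    # Aggregate edge estimated from all samples except q.
    without_q = pearson(gene_a[:q] + gene_a[q + 1:], gene_b[:q] + gene_b[q + 1:])
    # Linear interpolation: e_q = N * (e_all - e_without_q) + e_without_q
    return n * (all_samples - without_q) + without_q
```

Because every sample requires rebuilding the aggregate network once, the run time grows with the number of samples.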

Puma algorithm

To find agreement between the three input networks, the responsibility (R) is calculated first.

Thereafter availability (A) is calculated.

Availability and responsibility are combined with the following formula.

Protein cooperativity and gene co-regulatory networks are updated.

P and C are updated to satisfy convergence.

Hamming distance is calculated every iteration.
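
The steps above can be sketched in plain Python. This is a minimal illustration that assumes the PANDA-style update described in Glass et al. (2013): a Tanimoto-like similarity T, a learning rate alpha, and a "Hamming distance" taken here as the mean absolute change of the regulatory network. Normalization of the inputs and any PUMA-specific treatment of miRNAs are left out, and all function names are hypothetical rather than taken from this package.

```python
from math import sqrt


def matmul(x, y):
    """Plain matrix product of two matrices stored as lists of rows."""
    cols = list(zip(*y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in x]


def transpose(x):
    """Swap rows and columns."""
    return [list(col) for col in zip(*x)]


def t_func(x, y):
    """Tanimoto-like similarity between the rows of x and the columns of y."""
    prod = matmul(x, y)
    row_sq = [sum(v * v for v in row) for row in x]
    col_sq = [sum(v * v for v in col) for col in zip(*y)]
    return [[prod[i][j] / sqrt(row_sq[i] + col_sq[j] - abs(prod[i][j]))
             for j in range(len(col_sq))] for i in range(len(row_sq))]


def blend(old, new, alpha):
    """Move every entry of old a fraction alpha towards new."""
    return [[(1 - alpha) * o + alpha * n for o, n in zip(row_old, row_new)]
            for row_old, row_new in zip(old, new)]


def iterate(w, p, c, alpha=0.1):
    """One update of the regulatory (w), cooperativity (p) and co-regulation (c) networks."""
    resp = t_func(p, w)    # responsibility R
    avail = t_func(w, c)   # availability A
    combined = [[0.5 * (r + a) for r, a in zip(row_r, row_a)]
                for row_r, row_a in zip(resp, avail)]
    w_new = blend(w, combined, alpha)
    # Update P and C from the refreshed regulatory network.
    p_new = blend(p, t_func(w_new, transpose(w_new)), alpha)
    c_new = blend(c, t_func(transpose(w_new), w_new), alpha)
    # Mean absolute change of the regulatory network in this iteration.
    n_edges = len(w) * len(w[0])
    hamming = sum(abs(a - b) for row_a, row_b in zip(w_new, w)
                  for a, b in zip(row_a, row_b)) / n_edges
    return w_new, p_new, c_new, hamming
```

In such a scheme, `iterate` would be called repeatedly until the returned distance falls below a chosen threshold.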

Installation

PyPuma runs on Python 2.7. You can either run the run_puma.py script directly (see Usage) or install PyPuma on your system. We recommend the following commands to install PyPuma on UNIX systems:

Using a virtual environment

Using python virtual environments is the cleanest installation method.

Installing virtualenv and cloning the repository:

pip install --user virtualenv   #Make sure you have virtualenv
git clone https://github.com/aless80/PyPuma.git
cd PyPuma

Creating a virtual environment and installing PyPuma:

virtualenv pypumaenv #virtual environment created in a folder inside the git folder 
source pypumaenv/bin/activate
(pypumaenv)$ pip install -r requirements.txt
(pypumaenv)$ python setup.py install --record files.txt

Uninstall PyPuma from the virtual environment:

cat files.txt | xargs rm -rf

Complete removal of the virtual environment and PyPuma:

(pypumaenv)$ deactivate	#Quit virtual environment
rm -rf pypumaenv

Using pip

Never use sudo pip. Instead, install into the user's install directory:

git clone https://github.com/aless80/PyPuma.git
cd PyPuma
python setup.py install --user
#to run from the command line you will need to make PyPuma executable and add the bin directory to your PATH:
cd bin
chmod +x PyPuma
echo "export PATH=\"$(pwd):\$PATH\"" >> ~/.bashrc
source ~/.bashrc

To run PyPuma on Windows (not fully tested), install Git (https://git-scm.com/downloads) and Anaconda Python 2.7 (https://www.continuum.io/downloads), then run the following from the Anaconda prompt:

git clone https://github.com/aless80/PyPuma.git
cd PyPuma
python setup.py install

Usage

Run from terminal

PyPuma can be run directly from the terminal with the following options:

-h help
-e, --expression: expression values
-m, --motif: pair file of motif edges; if not provided, a Pearson correlation matrix is used
-p, --ppi: pair file of PPI edges
-o, --output: output file
-i, --mir: miR data file
-r, --rm_missing
-q, --lioness: output for Lioness single sample networks 

To run PyPuma on toy data:

python run_puma.py -e ./ToyData/ToyExpressionData.txt -m ./ToyData/ToyMotifData.txt -p ./ToyData/ToyPPIData.txt -i ./ToyData/ToyMiRList.txt -o output_puma.txt

To reconstruct single-sample Lioness networks (this can take some time):

python run_puma.py -e ./ToyData/ToyExpressionData.txt -m ./ToyData/ToyMotifData.txt -p ./ToyData/ToyPPIData.txt -i ToyData/ToyMiRList.txt -o output_puma.txt -q output_lioness.txt
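
For orientation, the options listed above could be declared with Python's standard `argparse` module roughly as follows. This is a hypothetical sketch and not the actual contents of run_puma.py; in particular, treating `--rm_missing` as a boolean switch is an assumption, since the option list gives no description for it.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented command-line options."""
    parser = argparse.ArgumentParser(description="Run PUMA (illustrative sketch).")
    parser.add_argument("-e", "--expression", help="expression values")
    parser.add_argument("-m", "--motif", help="pair file of motif edges")
    parser.add_argument("-p", "--ppi", help="pair file of PPI edges")
    parser.add_argument("-o", "--output", help="output file")
    parser.add_argument("-i", "--mir", help="miR data file")
    # Assumption: --rm_missing takes no value and simply switches a behaviour on.
    parser.add_argument("-r", "--rm_missing", action="store_true")
    parser.add_argument("-q", "--lioness",
                        help="output for Lioness single-sample networks")
    return parser


# Parse an example argument list instead of the real command line.
args = build_parser().parse_args(
    ["-e", "ToyData/ToyExpressionData.txt", "-i", "ToyData/ToyMiRList.txt", "-r"]
)
```

With this layout, options that are not supplied (such as `--motif` here) default to `None`, which makes it straightforward to fall back to a Pearson correlation network when no motif file is given.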

Run from python

Start a Python shell or IPython notebook and import the classes from the PyPuma library:

from pypuma.puma import Puma
from pypuma.lioness import Lioness

Run the Puma algorithm. Leave out the motif and PPI data to use a Pearson correlation network instead:

puma_obj = Puma('ToyData/ToyExpressionData.txt', 'ToyData/ToyMotifData.txt', 'ToyData/ToyPPIData.txt', 'ToyData/ToyMiRList.txt')

Save the results:

puma_obj.save_puma_results('Toy_Puma.pairs.txt')

Return a network plot:

puma_obj.top_network_plot(top=70, file='top_genes.png')

Calculate indegrees for further analysis:

indegree = puma_obj.return_puma_indegree()

Calculate outdegrees for further analysis:

outdegree = puma_obj.return_puma_outdegree()
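
As an illustration of what these quantities usually mean for PANDA-type networks, the sketch below sums edge weights per target gene (indegree) and per regulator (outdegree). It assumes the network is available as (regulator, gene, weight) triples. Whether `return_puma_indegree` and `return_puma_outdegree` use exactly this definition is not stated here, and the `degrees` helper and example edges are made up.

```python
from collections import defaultdict


def degrees(edges):
    """Sum edge weights per target gene (indegree) and per regulator (outdegree)."""
    indegree = defaultdict(float)
    outdegree = defaultdict(float)
    for regulator, gene, weight in edges:
        indegree[gene] += weight
        outdegree[regulator] += weight
    return dict(indegree), dict(outdegree)


# Made-up example edges, not taken from the toy data.
edges = [("TF1", "geneA", 1.5), ("TF1", "geneB", -0.5), ("TF2", "geneA", 2.0)]
indegree, outdegree = degrees(edges)
```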

Run the Lioness algorithm for single-sample networks:

lioness_obj = Lioness(puma_obj)

Save Lioness results:

lioness_obj.save_lioness_results('Toy_Lioness.txt')

Return a network plot for one of the Lioness single-sample networks (the AnalyzeLioness class must be imported first):

plot = AnalyzeLioness(lioness_obj)
plot.top_network_plot(column=0, top=100, file='top_100_genes.png')

Toy data

The example gene expression data provided here contain expression profiles with one sample per column. Note that this is only a small subset of a larger gene expression dataset, provided as "toy" data so that users can test the method.

To model gene regulatory networks on your own dataset, use your own expression data as input.
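
When preparing your own expression data, it can help to check that the file parses into one row per gene and one value per sample. The sketch below assumes a tab-separated layout with the gene identifier in the first column and no header row. That layout is an assumption made for illustration, so compare it against the files in ToyData before relying on it. The `read_expression` helper is hypothetical.

```python
import csv
import io


def read_expression(handle):
    """Read rows of 'gene<TAB>value<TAB>value...' into a {gene: [values]} dict."""
    expression = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        expression[row[0]] = [float(value) for value in row[1:]]
    return expression


# Made-up data standing in for a real expression file.
example = io.StringIO("geneA\t1.0\t2.5\t0.3\ngeneB\t4.2\t3.9\t5.1\n")
expression = read_expression(example)

# Every gene should have the same number of samples.
sample_counts = {len(values) for values in expression.values()}
```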