srjun/panfp

PanFP is a Python pipeline to predict pangenome-based functional profiles for microbial communities.

Requirements

PanFP depends on several Python libraries, which are listed in a requirements file so they can be installed in one step. Make sure pip is installed first, then run:

pip3 --version                      # Check if installed
sudo apt-get install python3-pip    # Only if pip is missing; re-run the check above afterwards
pip3 install -r requirements.txt

Installation & Help

Download this repository and run:

python3 setup.py install

You may need to run it with sudo. Once installed, panfp should be available from anywhere in your terminal.

To install the package into a specific directory on your system, pass the --install-lib option followed by the directory path:

python3 setup.py install --install-lib /custom/path/

Example

Running an experiment requires the following arguments:

-d [database of reference genomes with functional annotation] [here]
-a [directory which contains functional profiles of genomes in database] [here]
-i [otu-sample table]

To see additional arguments:

bin/panfp --help

As an example, we include an example script [here] showing a full PanFP workflow, along with an example otu-sample table [here].

Note that the input otu-sample table must be in tab-delimited format, as follows:

#OTU ID S1 S2 ... S10 Lineage
OTU_1 0.0 10.0 ... 2.0 k__Bacteria; p__Proteobacteria; c__Betaproteobacteria; o__MND1; f__
OTU_2 4.0 430.0 ... 24.0 k__Bacteria; p__Proteobacteria; c__Betaproteobacteria; o__; f__; g__; s__
... ... ... ... ... k__Bacteria;p__Cyanobacteria;c__Oxyphotobacteria
OTU_99 1.0 5.0 ... 0.0 k__Bacteria;p__Chloroflexi;c__
OTU_100 0.0 35.0 ... 2.0 k__Bacteria; p__Proteobacteria; c__Gammaproteobacteria; o__Enterobacteriales; f__Enterobacteriaceae; g__Gluconacetobacter; s__liquefaciens

where the first column contains the OTU ids, the numeric columns contain the raw 16S rRNA frequencies for each sample, and the last column contains the lineage of each OTU.
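To make the expected layout concrete, here is a minimal sketch of reading such a table with the Python standard library. It is not PanFP's own parser: the function name `parse_otu_table` is hypothetical, and the sketch assumes the first column holds the OTU id, the last column holds the lineage, and every column in between is a sample. The two data rows reuse values from the table above, trimmed to two samples.

```python
import csv
import io


def parse_otu_table(handle):
    """Parse a tab-delimited otu-sample table (hypothetical helper, not part of PanFP).

    Returns the sample names and a list of (otu_id, counts, lineage) tuples.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    samples = header[1:-1]  # columns between '#OTU ID' and 'Lineage'
    rows = []
    for fields in reader:
        if not fields:
            continue  # skip blank lines
        otu_id = fields[0]
        counts = [float(value) for value in fields[1:-1]]
        # Split on ';' and strip spaces so that "k__A; p__B" and "k__A;p__B"
        # produce the same list of ranks.
        lineage = [rank.strip() for rank in fields[-1].split(";") if rank.strip()]
        rows.append((otu_id, counts, lineage))
    return samples, rows


example = (
    "#OTU ID\tS1\tS2\tLineage\n"
    "OTU_1\t0.0\t10.0\tk__Bacteria; p__Proteobacteria; c__Betaproteobacteria; o__MND1; f__\n"
    "OTU_99\t1.0\t5.0\tk__Bacteria;p__Chloroflexi;c__\n"
)
samples, rows = parse_otu_table(io.StringIO(example))
for otu_id, counts, lineage in rows:
    print(otu_id, counts, lineage)
```

Stripping whitespace around each rank matters because, as the table above shows, lineages may be written with or without a space after each semicolon.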

Output Information:

The following files are generated in the following order:

  • updated_otu_table.txt - otu-sample table with updated taxonomic information according to database lineages [example]
  • lineage_copynum.txt - copy numbers for lineages in an updated otu-sample table [example]
  • k__Bacteria.p__Ignavibacteriae.c__Ignavibacteria.KO.txt (for example) - functional profiles for lineages [example]
  • updated_otu_table_norm_by_copynum.txt - otu-sample table normalized by median copy numbers of lineages [example]
  • updated_otu_table_norm_by_copynum_depth.txt - otu-sample table normalized by sequencing depth [example]
  • lineage_sample_table.txt - lineage-sample table derived from otu-sample table grouping by lineages [example]
  • function_sample_table.txt - function-sample table obtained by multiplying the lineage-sample table and the lineage-function table [example]
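The arithmetic behind the last four files can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not PanFP's implementation: all names and numbers below are invented toy data, the lineage labels `L_a` and `L_b` and the function ids `K1` and `K2` are placeholders, and depth normalization is assumed here to mean dividing each sample column by its total.

```python
from collections import defaultdict

# Toy inputs: every value below is invented for illustration only.
otu_counts = {"OTU_1": [4.0, 8.0], "OTU_2": [2.0, 2.0], "OTU_3": [6.0, 0.0]}
otu_lineage = {"OTU_1": "L_a", "OTU_2": "L_a", "OTU_3": "L_b"}
lineage_copynum = {"L_a": 2.0, "L_b": 3.0}  # stands in for median copy numbers
lineage_function = {
    "L_a": {"K1": 1.0, "K2": 0.5},
    "L_b": {"K1": 0.0, "K2": 2.0},
}
n_samples = 2

# Step 1: normalize each OTU row by the copy number of its lineage.
by_copynum = {
    otu: [c / lineage_copynum[otu_lineage[otu]] for c in counts]
    for otu, counts in otu_counts.items()
}

# Step 2: normalize by sequencing depth (assumed: divide by the column total).
depth = [sum(row[j] for row in by_copynum.values()) for j in range(n_samples)]
by_depth = {
    otu: [row[j] / depth[j] if depth[j] else 0.0 for j in range(n_samples)]
    for otu, row in by_copynum.items()
}

# Step 3: build the lineage-sample table by summing OTUs that share a lineage.
lineage_sample = defaultdict(lambda: [0.0] * n_samples)
for otu, row in by_depth.items():
    for j, value in enumerate(row):
        lineage_sample[otu_lineage[otu]][j] += value

# Step 4: function-sample table = lineage-function (transposed) x lineage-sample.
function_sample = defaultdict(lambda: [0.0] * n_samples)
for lineage, row in lineage_sample.items():
    for function, weight in lineage_function[lineage].items():
        for j, value in enumerate(row):
            function_sample[function][j] += weight * value

for function in sorted(function_sample):
    print(function, function_sample[function])
```

Each entry of the resulting table is a sum over lineages of the lineage's abundance in a sample, weighted by that lineage's value for the function, which is the matrix product described in the last bullet.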

Contact

This project was developed entirely in the Translational Bioinformatics group - Jun Lab.

If you run into a problem at any step of the program, you can use the 'Issues' page of this repository or contact Se-Ran Jun.

License

PanFP is released under the GNU General Public License. Please check LICENSE for further information.

[2020] - Se-Ran Jun - All Rights Reserved
