PDV: an integrative proteomics data viewer

PDV is a lightweight visualization tool that enables intuitive and fast exploration of diverse, large-scale proteomics datasets in different formats on standard desktop computers in both graphical user interface and command line modes. One of the most important functions of PDV is to visualize peptide identification results from different search engines and generate high quality annotated spectra for publication.

Usage

A user's manual is available at http://pdv.zhang-lab.org. You can find some visualization examples in the user's manual or the manuscript of PDV.

Installation

The PDV package can be downloaded at https://github.com/wenbostar/PDV/releases.

Example

PDV supports visualizing N-linked intact glycopeptide identification results (identification file: psm.tsv; MS/MS file: mzML format) generated by MSFragger-Glyco/Philosopher. An example input can be downloaded at MSFragger-Glyco_example.

USI (Universal Spectrum Identifier)

Mirror plot (experimental spectrum vs. spectrum predicted by deep learning)

Top panel: experimental spectrum; bottom panel: spectrum predicted by deep learning.

Database searching:

| Software | Example files |
| --- | --- |
| PepQuery | PepQuery |
| MS-GF+ (v2017.01.13) | mgf:mzid, mzML:mzid, mzXML:mzid |
| X!Tandem (v2017.2.1.2) | mgf:mzid (convert the X!Tandem XML result to an mzid file using MzidLib) |
| MyriMatch (v2.2.10165) | mgf:mzid, mgf:pepXML, mzML:mzid, mzML:pepXML, mzXML:mzid, mzXML:pepXML |
| Comet (v2018.01 rev. 2) | mgf:pepXML, mzML:pepXML, mzXML:pepXML |
| Crux/Tide (v3.2) | mgf:pepXML, mgf:mzid, mzML:pepXML, mzML:mzid, mzXML:pepXML, mzXML:mzid |
| Crux/Tide (v4.1) | mzML:pepXML, mzML:mzid, mzXML:pepXML, mzXML:mzid |
| MS Amanda (v2.0.0.11219) | mgf:csv (MS Amanda format), mzML:csv (MS Amanda format) |
| MSFragger (v20180316) | mzML:pepXML, mzXML:pepXML |
| FragPipe | Manual |
| MaxQuant | version 1.3.0.5, version 1.5.3.30, version 1.5.4.1, version 1.5.7.4, version 1.5.8.3, version 1.6.2.3, version 1.6.5.0 |
| IPeak | mgf:mzid |
| IdentiPy (v0.2) | mgf:pepXML, mzML:pepXML |
| MetaMorpheus (v0.0.286) | mzML:mzid |
| OMSSA (v2.1.9) | mgf:pepXML |
| Mascot (v2.5.1) | mgf:pepXML, mgf:mzid, mzML:pepXML, mzML:mzid, dat |
| pFind (>=v3.1.5) | Only MGF file search results are supported |
| TPP (v5.1.0) | mzML:pepXML (Comet + PeptideProphet + iProphet + PTMProphet) |

De novo sequencing:

| Software | Example files |
| --- | --- |
| Casanovo | Manual |
| Novor | mgf:csv (only Novor results generated through DeNovoGUI are supported) |
| DeepNovo | mgf:txt |
| PepNovo+ | mgf:txt |
| pNovo+ | mgf:txt |

Proteogenomics:

| Type | Example files |
| --- | --- |
| proBAM | ProBAM.tar.gz |
| proBed | ProBed.tar.gz |

One PSM:

Spectrum library:

Spectrum Library Central at PeptideAtlas

MS data:

| Type | Example files |
| --- | --- |
| mzML | SF_200217_U2OS_TiO2_HCD_OT_rep1.mzML.gz |
| mzXML | SF_200217_U2OS_TiO2_HCD_OT_rep1.mzXML.gz |

PRIDE XML:

PRIDE_Exp_Complete_Ac_22028.xml.gz

QC analysis:

Please find an example in this tutorial: QC analysis.

Command line:

PDV provides a command line module to produce figures of annotated spectra or TIC in batch mode. It can be used to generate figures according to a list of peptide sequences or a list of spectrum indexes.

 $ java -jar PDV-1.1.0/PDV-1.1.0.jar -h
usage: Options
 -a <arg>    Error window for MS/MS fragment ion mass values. Unit is Da.
             The default value is 0.5.
 -ah         Whether or not to consider neutral loss of H2O.
 -an         Whether or not to consider neutral loss of NH3.
 -c <arg>    The intensity percentile to consider for annotation. Default
             is 3 (3%), it means that the peaks with intensities >= (3% *
             max intensity) will be annotated.
 -fh <arg>   Figure height. Default is 400
 -ft <arg>   Figure type. Can be png, pdf or tiff.
 -fu <arg>   The units in which ‘height’(fh) and ‘width’(fw) are given.
             Can be cm, mm or px. Default is px
 -fw <arg>   Figure width. Default is 800
 -h          Help
 -help       Help
 -i <arg>    A file containing peptide sequences or spectrum IDs. PDV will
             generate figures for these peptides or spectra.
 -k <arg>    The input data type for parameter -i (Spectrum ID: s, peptide
             sequence: p).
 -o <arg>    Output directory.
 -pw <arg>   Peak width. Default is 1
 -r <arg>    Identification file.
 -rt <arg>   Identification file format (mzIdentML: 1, pepXML: 2, proBAM:
             3, txt: 4, maxQuant: 5, TIC: 6).
 -s <arg>    MS/MS data file
 -st <arg>   MS/MS data format (mgf: 1, mzML: 2, mzXML: 3).
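
The numeric codes that -rt and -st expect, as listed in the help output above, can be looked up from file names. The helpers below are a minimal sketch and are not part of PDV: the function names `pdv_rt_code` and `pdv_st_code` are hypothetical, and the mapping assumes that files carry the extensions `.mzid`, `.pepXML`, `.mgf`, `.mzML`, and `.mzXML`. The remaining -rt formats (proBAM, txt, maxQuant, TIC) are not inferred here and would need to be passed explicitly.

```shell
# Hypothetical helpers (not part of PDV): map a file name to the numeric
# code listed in the PDV help output for -rt and -st.

# Identification file -> -rt code (mzIdentML: 1, pepXML: 2).
pdv_rt_code() {
    case "$1" in
        *.mzid)   echo 1 ;;
        *.pepXML) echo 2 ;;
        *) echo "cannot infer -rt for: $1" >&2; return 1 ;;
    esac
}

# MS/MS data file -> -st code (mgf: 1, mzML: 2, mzXML: 3).
pdv_st_code() {
    case "$1" in
        *.mgf)   echo 1 ;;
        *.mzML)  echo 2 ;;
        *.mzXML) echo 3 ;;
        *) echo "cannot infer -st for: $1" >&2; return 1 ;;
    esac
}
```

For example, `pdv_rt_code results.pepXML` prints 2 and `pdv_st_code run.mzML` prints 2. An unrecognized extension writes a message to standard error and returns a non-zero status, so a calling script can stop before invoking PDV with a wrong code.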

A few examples follow. The example data can be downloaded here: input_data.tar.gz

(1) Input: mgf and mzID

java -jar PDV-1.1.0/PDV-1.1.0.jar -r input_data/SF_200217_U2OS_TiO2_HCD_OT_rep1_myrimatch_mgf.mzid -rt 1 -s input_data/SF_200217_U2OS_TiO2_HCD_OT_rep1.mgf -st 1 -i input_data/spectrum_title.txt -k s -o output -a 0.05 -c 3 -pw 1 -fw 800 -fh 400 -fu px -ft pdf

(2) Input: mzML and mzID

java -jar PDV-1.1.0/PDV-1.1.0.jar -r input_data/SF_200217_U2OS_TiO2_HCD_OT_rep1_myrimatch_mzML.mzid -rt 1 -s input_data/SF_200217_U2OS_TiO2_HCD_OT_rep1.mzML -st 2 -i input_data/spectrum_scan_number.txt -k s -o output -a 0.05 -c 3 -pw 1 -fw 800 -fh 400 -fu px -ft pdf

(3) Input: mgf and pepXML

java -jar PDV-1.1.0/PDV-1.1.0.jar -r input_data/SF_200217_U2OS_TiO2_HCD_OT_rep1_myrimatch_mgf.pepXML -rt 2 -s input_data/SF_200217_U2OS_TiO2_HCD_OT_rep1.mgf -st 1 -i input_data/spectrum_title.txt -k s -o output -a 0.05 -c 3 -pw 1 -fw 800 -fh 400 -fu px -ft pdf

(4) Input: mzML and pepXML

java -jar PDV-1.1.0/PDV-1.1.0.jar -r input_data/SF_200217_U2OS_TiO2_HCD_OT_rep1_myrimatch_mzML.pepXML -rt 2 -s input_data/SF_200217_U2OS_TiO2_HCD_OT_rep1.mzML -st 2 -i input_data/spectrum_scan_number.txt -k s -o output -a 0.05 -c 3 -pw 1 -fw 800 -fh 400 -fu px -ft pdf
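
The four commands above differ only in their input files and format codes, so a small wrapper can cut repetition in batch runs. The following is a sketch under stated assumptions rather than a documented PDV workflow: `run_or_print` and `pdv_all_formats` are hypothetical names, the jar path is the one used in the examples, and each command is only printed unless `PDV_RUN=1` is set. It issues one call per figure type that -ft accepts (png, pdf, tiff), each with its own output directory.

```shell
# Path to the PDV jar, as used in the examples above; override if needed.
PDV_JAR="${PDV_JAR:-PDV-1.1.0/PDV-1.1.0.jar}"

# Print the command by default; execute it only when PDV_RUN=1.
run_or_print() {
    if [ "${PDV_RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Usage: pdv_all_formats <id_file> <rt_code> <msms_file> <st_code> <list_file>
# Builds one PDV call per figure type supported by -ft.
pdv_all_formats() {
    for ft in png pdf tiff; do
        # Precaution (assumption): create the output directory before a real
        # run, since it is not stated whether PDV creates it.
        if [ "${PDV_RUN:-0}" = "1" ]; then
            mkdir -p "output_$ft"
        fi
        run_or_print java -jar "$PDV_JAR" \
            -r "$1" -rt "$2" -s "$3" -st "$4" \
            -i "$5" -k s -o "output_$ft" \
            -a 0.05 -c 3 -pw 1 -fw 800 -fh 400 -fu px -ft "$ft"
    done
}
```

Calling `pdv_all_formats input_data/SF_200217_U2OS_TiO2_HCD_OT_rep1_myrimatch_mgf.mzid 1 input_data/SF_200217_U2OS_TiO2_HCD_OT_rep1.mgf 1 input_data/spectrum_title.txt` prints three `java -jar` commands, one per figure type. Running the same call with `PDV_RUN=1` in the environment executes them. Printing first makes it easy to check the arguments before a long batch run.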

Citation

To cite the PDV package in publications, please use:

Li, K., et al. "PDV: an integrative proteomics data viewer." Bioinformatics, Volume 35, Issue 7, 01 April 2019, Pages 1249–1251. https://doi.org/10.1093/bioinformatics/bty770

List of citations

PDV has been cited or used in the following manuscripts:

  1. Wang X, Codreanu S G, Wen B, et al. Detection of proteome diversity resulted from alternative splicing is limited by trypsin cleavage specificity. Molecular & Cellular Proteomics, 2017: mcp.RA117.000155.
  2. Menschaert G, Wang X, Jones A R, et al. The proBAM and proBed standard formats: enabling a seamless integration of genomics and proteomics data. Genome biology, 2018, 19(1): 12.
  3. Wen, Bo, Xiaojing Wang, and Bing Zhang. "PepQuery enables fast, accurate, and convenient proteomic validation of novel genomic alterations." Genome research 29.3 (2019): 485-493.
  4. Rong, Mingqiang, et al. "PPIP: Automated Software for Identification of Bioactive Endogenous Peptides." Journal of proteome research 18.2 (2018): 721-727.
  5. Zhang X, Huang H, He Y, et al. High-throughput identification of heavy metal binding proteins from the byssus of chinese green mussel (Perna viridis) by combination of transcriptome and proteome sequencing. PloS one, 2019, 14(5): e0216605.
  6. Ren, Zhe, et al. "Improvements to the Rice Genome Annotation Through Large-Scale Analysis of RNA-Seq and Proteomics Data Sets." Molecular & Cellular Proteomics 18.1 (2019): 86-98.

Contribution

Contributions to the package are more than welcome.