JUMPptm

Introduction
Input Data
Sample Data
Basic Installation
JUMPptm Commands
Test Data Exercise
Input and Output Data Organization

Introduction

JUMPptm aims to identify PTM events from spectra left unmatched after conventional peptide analysis of the whole proteome. By default, JUMPptm assumes that the whole proteome data has been analyzed by the JUMP suite, which outputs the identified peptides/proteins and the high-quality (HQ) unmatched spectra with de novo tags. These HQ spectra (input #1) are taken as JUMPptm input and searched against a user-specified list of PTMs (input #2) with a multi-stage database search strategy using Comet. The PTM list can be guided by the results of an open search using MSFragger. JUMPptm exports the PTM peptide identifications with TMT-based quantification. mzXML files can also be used as direct input instead of ms2 files.

NOTE: The ptm_pipeline.py script runs the entire pipeline either on a computational cluster, using Platform LSF to schedule jobs, or in standalone mode (select the appropriate option in the parameter file). As the list of PTMs or the number of mzXML files grows, the cluster mode is preferable to the standalone mode. For other platforms, users may edit the job submission functions.

Note that JUMPptm only supports 64-bit macOS and Linux. JUMPptm can also be run without a tags file: use the ms2 or mzXML files directly and turn off the tag path in the parameter file.


JUMPptm Publication:

If you use JUMPptm as part of a publication, please include this reference.

Poudel, S., Vanderwall, D., Yuan, Z. F., Wu, Z., Peng, J., & Li, Y. (2022). JUMPptm: Integrated software for sensitive identification of post‐translational modifications and its application in Alzheimer's disease study. Proteomics, 2100369.


Input Data

JUMPptm takes the following inputs for PTM analysis:

ms2 files - ideally containing only high-quality spectra; an ms2 or mzXML file containing all spectra can also be used, at lower sensitivity
PTM list (derived from an open search or chosen as PTMs of interest) - set in the parameter file
JUMP-derived de novo tags files (optional; if you do not have tags files from JUMP, turn this input off by setting tags_input_path = 0 in the parameter file)
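
Before a run, it can help to check which spectrum files have a matching tags file. The sketch below is only an illustration and not part of JUMPptm: the function name pair_inputs is hypothetical, and it assumes that a tags file shares its base name with the spectrum file (as with w001.ms2 and w001.tags in sample_data).

```python
from pathlib import Path


def pair_inputs(folder):
    """Map each fraction name to (spectrum file, tags file or None).

    Assumption: tags files are named <fraction>.tags next to the
    <fraction>.ms2 or <fraction>.mzXML file.
    """
    folder = Path(folder)
    pairs = {}
    for spectrum in sorted(folder.iterdir()):
        # Only ms2 and mzXML files are treated as spectrum inputs.
        if spectrum.suffix.lower() not in (".ms2", ".mzxml"):
            continue
        tags = spectrum.with_suffix(".tags")
        pairs[spectrum.stem] = (spectrum, tags if tags.exists() else None)
    return pairs


if __name__ == "__main__":
    for fraction, (spectrum, tags) in pair_inputs(".").items():
        print(fraction, spectrum.name, tags.name if tags else "no tags file")
```

A fraction reported without a tags file can still be searched if the tag input is turned off in the parameter file.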

NOTE: To facilitate subsequent stages of PTM analysis after default peptide analysis, the latest JUMP suite (v1.13.1; https://github.com/JUMPSuite/JUMP) outputs three files after analyzing a whole proteome dataset: i) the accepted unique proteins (.fasta format); ii) HQ unmatched spectra generated by the spectrum QC module (.ms2 format); and iii) de novo amino acid tags for each MS2 spectrum (.tags format). The identified proteins are used to generate a customized database to restrict the search space, and the latter two files are input files for JUMPptm. Note that all the HQ spectra are mass-corrected by the JUMP analysis, allowing narrow mass tolerance (e.g., < 6 ppm) for subsequent PTM identification.
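
To make the mass tolerance concrete, the sketch below computes a mass error in parts per million and checks it against a tolerance. It is a generic illustration of the ppm formula, not JUMPptm code; the function names and the 6 ppm default (taken from the example above) are assumptions.

```python
def ppm_error(observed_mass, theoretical_mass):
    """Signed mass error in parts per million."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def within_tolerance(observed_mass, theoretical_mass, tol_ppm=6.0):
    """True if the absolute ppm error does not exceed tol_ppm."""
    return abs(ppm_error(observed_mass, theoretical_mass)) <= tol_ppm


if __name__ == "__main__":
    # For a 1500 Da peptide, 6 ppm corresponds to about 0.009 Da.
    print(ppm_error(1500.009, 1500.0))
    print(within_tolerance(1500.005, 1500.0))
```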


Sample Data

To evaluate JUMPptm, we provide a data set composed of 2 fractions in the sample_data directory.

File name	Description	Total demo spectra / tags
w001.ms2	High-quality ms2 file	792
w010.ms2	High-quality ms2 file	654
w001.tags	JUMP-derived de novo tags	792
w010.tags	JUMP-derived de novo tags	654
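
The spectrum counts in the table can be checked with a short script. This is a sketch under an assumption: it presumes the sample files follow the common MS2 text format, in which every spectrum starts with a line whose first field is S. If the JUMP-generated files deviate from that layout, the count will differ. The function name count_ms2_spectra is hypothetical.

```python
def count_ms2_spectra(path):
    """Count spectra in an MS2 text file.

    Assumption: each spectrum begins with a scan line whose first
    whitespace-separated field is exactly "S".
    """
    count = 0
    with open(path) as handle:
        for line in handle:
            if line.startswith("S\t") or line.startswith("S "):
                count += 1
    return count


if __name__ == "__main__":
    import sys

    for ms2_file in sys.argv[1:]:
        print(ms2_file, count_ms2_spectra(ms2_file))
```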


Basic Installation

A basic install is sufficient for multicore laptops, desktops, workstations and servers. For a basic install, use the bootstrap.sh script provided in the repo:

Place the JUMPptm distribution source in the desired location (call this <path to JUMPptm>)
Change your working directory to the top level of the JUMPptm install and run bash bootstrap.sh

Obtaining JUMPptm source: You can obtain the latest version of JUMPptm from git; simply clone the git repository:

    git clone https://github.com/surPoudel/JUMP-ptm.git

in the directory where you would like JUMPptm to be installed (call this directory <path to JUMPptm>). Note that JUMPptm does not support out-of-place installs; the JUMPptm git repository is the entire installation.

Bootstrapping: To install the dependencies, we recommend Conda, and we provide a bootstrapping script, bootstrap.sh, that downloads all dependencies and installs them alongside JUMPptm. Execute

    ./bootstrap.sh

and it will create a new directory JUMPptm in the current working directory. The directory JUMPptm will contain the conda environment to be used by JUMPptm. JUMPptm will be set up to use the Perl and Python interpreters in that environment.

Once bootstrap.sh is finished, activate the conda environment

    conda activate $PWD/JUMPptm

Note: Comet version 2021 binaries are included in the repository [1]. To use a different version of Comet, replace the bundled binary with the new one, keeping the exact file name comet_linux_2021 or comet_mac_2021.
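
Because the binary names are fixed, the file that applies to a given machine can be determined from the operating system. The sketch below only illustrates that mapping; how JUMPptm itself selects the binary is not described here, and the function name comet_binary and the example install path are hypothetical.

```python
import platform
from pathlib import Path


def comet_binary(jumpptm_dir, system=None):
    """Return the expected Comet binary path for this operating system.

    The file names come from the repository; the selection logic is
    an assumption made for illustration.
    """
    system = system or platform.system()
    names = {"Linux": "comet_linux_2021", "Darwin": "comet_mac_2021"}
    if system not in names:
        raise RuntimeError(f"JUMPptm supports only macOS and Linux, not {system}")
    return Path(jumpptm_dir) / names[system]


if __name__ == "__main__":
    # "/opt/JUMP-ptm" is a placeholder for <path to JUMPptm>.
    print(comet_binary("/opt/JUMP-ptm", system="Linux"))
```

A replacement Comet build would be copied over the path this function returns.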


JUMPptm Commands

Once the conda environment (JUMPptm) is activated:

Make a working directory
Keep all the ms2 or mzXML files and the tags files (optional) in the same directory
Copy the parameter file (ptm_pipeline.params) from parameterFiles to the same directory
Make the necessary changes to the parameters (including the PTM search stages)
Run the command below

    python <path to JUMPptm>/ptm_pipeline.py ptm_pipeline.params
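
The same invocation can be scripted, for instance when several working directories are processed in turn. This is a minimal sketch, assuming only what the command above shows (the script name and a single parameter-file argument); the helper names and the placeholder paths are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def pipeline_command(jumpptm_dir, params_file="ptm_pipeline.params"):
    """Build the argument list for running ptm_pipeline.py."""
    script = Path(jumpptm_dir) / "ptm_pipeline.py"
    # sys.executable is the interpreter of the active (conda) environment.
    return [sys.executable, str(script), str(params_file)]


def run_pipeline(jumpptm_dir, workdir, params_file="ptm_pipeline.params"):
    """Run the pipeline from inside the working directory."""
    command = pipeline_command(jumpptm_dir, params_file)
    return subprocess.run(command, cwd=workdir, check=True)


if __name__ == "__main__":
    # Placeholder install location; replace with <path to JUMPptm>.
    print(pipeline_command("/opt/JUMP-ptm"))
```

Passing the command as a list avoids shell quoting problems when the install path contains spaces.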


Test Data Exercise

Make a directory called Test
Copy w001.ms2 and w001.tags files from sample_data to Test
Copy ptm_pipeline.params from parameterFiles to Test
Open the parameter file copied in the previous step and change the following parameters
- ms2_input_path = path to the folder that contains the sample data, e.g. /home/spoudel1/JUMP-ptm/sample_data
- tags_input_path = path to the folder that contains the sample data, e.g. /home/spoudel1/JUMP-ptm/sample_data
- database_name = path to the sample database file, e.g. /home/spoudel1/Database_AD_Banner/humanComprehensive_v2_ft_mc2_c57_TMTpro.fasta
- pitfile = path to the sample pitfile, e.g. /home/spoudel1/Database_AD_Banner/humanComprehensive_v2_ft_mc2_c57_TMTpro.pit

Download the test database and pitfile here: database_pitfile
Execute

    python <path to JUMPptm>/ptm_pipeline.py ptm_pipeline.params

For example:

    python /home/spoudel/JUMP-ptm/ptm_pipeline.py ptm_pipeline.params

Note: This will automatically run Stage_1 and Stage_2 (Phosphorylation and Deamidation)
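
The parameter edits for the test run can also be applied programmatically. The sketch below assumes the parameter file consists of "key = value" lines with "#" marking comment lines; that layout is inferred from the parameter names shown above and is not confirmed here. The function name set_params is hypothetical, and any trailing comment on a rewritten line is dropped.

```python
def set_params(text, updates):
    """Return parameter-file text with the given keys set to new values.

    Assumption: parameters are written as "key = value" and lines
    starting with "#" are comments.
    """
    remaining = dict(updates)
    out = []
    for line in text.splitlines():
        key = line.split("=", 1)[0].strip()
        is_comment = line.lstrip().startswith("#")
        if "=" in line and not is_comment and key in remaining:
            out.append(f"{key} = {remaining.pop(key)}")
        else:
            out.append(line)
    if remaining:
        # Fail loudly rather than silently skip a misspelled key.
        raise KeyError(f"parameters not found: {sorted(remaining)}")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    demo = "ms2_input_path = /old/path\ntags_input_path = /old/path\n"
    print(set_params(demo, {"ms2_input_path": "/home/spoudel1/JUMP-ptm/sample_data"}))
```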


Input and Output Data Organization

.
├── Pipeline_Results_OUTPUT_FOLDER          # Output folder that contains pipeline results (prefixed by Pipeline_Results_)
│   ├── comet.params.new          # comet search parameter file (template)
│   ├── ptm_pipeline.log          # ptm pipeline log file
│   ├── ptm_pipeline.params       # ptm pipeline parameter file is copied inside the results folder for record
│   ├── merge_and_consolidation   # Folder that has results after merging and consolidation of PSMs from each stage
│   │   ├── ID.txt      # merged IDs from all stages
│   │   ├── jump_fq_merged.params         # automatically generated quantification parameter file
│   │   ├── publications                  # merged filtering results for peptide identification
│   │   │   ├── id_all_pep.txt  # merged peptides identification (all proteins)
│   │   │   └── id_uni_pep.txt  # merged peptides identification (unique proteins)
│   │   ├── quan_HH_tmt10_human_comet     # quantification results folder
│   │   │   └── publications    # folder containing quantification of peptides
│   │   │       ├── id_all_pep_quan.txt    # peptides mapped to all proteins
│   │   │       └── id_uni_pep_quan.txt    # peptides mapped to unique protein
│   │   └── results_table
│   │       └── Pan_PTM_Quan_Table.xlsx  # Pan PTM output excel file
│   ├── Stage_1                                     # Stage_1 search results based on parameter file description
│   │   ├── jump_fc_Stage_1_FDR_1.params  # filtering parameter file for stage 1
│   │   ├── stage_1_comet.params          # search comet parameter file (customized automatically by program based on parameter file)
│   │   ├── sum_Stage_1_FDR_1             # filtering result folder
│   │   │   ├── ID.txt          # identified PSMs (Target only) -- at given FDR
│   │   │   ├── IDwDecoy.txt    # identified PSMs (Target + Decoy) -- at given FDR
│   │   │   ├── publications    # folder containing tables file for unique peptide and all peptides
│   │   │   │   ├── id_all_pep_1FDR.txt
│   │   │   │   ├── id_all_pep.txt
│   │   │   │   ├── id_all_prot_1FDR.txt
│   │   │   │   ├── id_all_prot.txt
│   │   │   │   ├── id_uni_pep_1FDR.txt
│   │   │   │   ├── id_uni_pep.txt
│   │   │   │   ├── id_uni_prot_1FDR.txt
│   │   │   │   └── id_uni_prot.txt
│   │   │   └── simplified_report
│   │   │       └── id_uni_prot.txt
│   │   └── w001                         # example fraction name searched by the pipeline --  program makes separate folder for each fraction
│   │       ├── comet.params             # search parameter file is copied
│   │       ├── Results_start_scan_0_end_scan_0_min_tag_len_2        # folder containing tags matched intermediate files
│   │       │   ├── expect_minusLog10.pdf
│   │       │   ├── expect_minusLog10.png
│   │       │   ├── spectrum_tag_count.txt
│   │       │   ├── spectrum_unique_tag_table.txt
│   │       │   ├── tag_qc.params
│   │       │   ├── Total_Tag_matched.pdf
│   │       │   ├── Total_Tag_matched.png
│   │       │   ├── w001_reordered_final.pickle
│   │       │   ├── xcorr.pdf
│   │       │   └── xcorr.png
│   │       ├── search_log.txt          # comet search log
│   │       ├── tag_match.log           # tag match program log file
│   │       ├── tag_qc.params           # tag match parameter file
│   │       ├── w001.1.pep.xml          # pep.xml file output with tag match information
│   │       ├── w001.1.txt              # search file in txt file format
│   │       └── w001.ms2 -> /home/spoudel1/conda_work/test/w001.ms2        # input ms2 softlinked to search fraction folder
│   └── Stage_2
│              .
│              .
│              .
│              .
│
├── ptm_pipeline.params                                     # input parameter file
├── w001.1.tags                                             # input tag file
└── w001.ms2                                                # input ms2 file

Pan_PTM_Quan_Table.xlsx --- concatenated Pan PTM output file
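
Given the layout above, the per-stage ID.txt files can be located with a short script. This is a sketch that assumes the folder naming shown in the tree (Stage_N folders, each holding a filtering folder whose name starts with sum_); the function name stage_id_files is hypothetical.

```python
import re
from pathlib import Path


def stage_id_files(results_dir):
    """Map stage number to the ID.txt of that stage's filtering folder.

    Assumption: stage folders are named Stage_<number> and contain
    a sum_* folder with ID.txt, as in the tree above.
    """
    found = {}
    for stage_dir in Path(results_dir).glob("Stage_*"):
        match = re.fullmatch(r"Stage_(\d+)", stage_dir.name)
        if not match or not stage_dir.is_dir():
            continue
        id_files = sorted(stage_dir.glob("sum_*/ID.txt"))
        if id_files:
            found[int(match.group(1))] = id_files[0]
    # Sort numerically so that Stage_10 follows Stage_9.
    return dict(sorted(found.items()))


if __name__ == "__main__":
    import sys

    for stage, path in stage_id_files(sys.argv[1]).items():
        print(stage, path)
```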

NOTE: The ID.txt file in the merge_and_consolidation folder is modified to allow concatenation of the different stages. Each peptide has the letter Z appended at its C-terminus to designate the stage: Z = Stage_1, ZZ = Stage_2, etc. The original peptide sequence is also retained. This allows accurate quantification of peptides using only the PSMs that belong to a specific stage (so that PSMs are unique to each stage).
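
The stage encoding can be decoded by counting the trailing Z letters. The sketch below works on a bare peptide string and is only an illustration: the exact peptide format in ID.txt (for example flanking residues or modification symbols) is not specified here, and the sketch assumes that Z never occurs as the last residue of a real sequence. The function name split_stage_suffix is hypothetical.

```python
def split_stage_suffix(peptide):
    """Split a stage-tagged peptide into (sequence, stage name).

    The number of trailing Z letters gives the stage:
    one Z is Stage_1, two are Stage_2, and so on.
    """
    sequence = peptide.rstrip("Z")
    n_z = len(peptide) - len(sequence)
    if n_z == 0:
        raise ValueError(f"no stage suffix found in {peptide!r}")
    return sequence, f"Stage_{n_z}"


if __name__ == "__main__":
    # Hypothetical peptide strings, for illustration only.
    for tagged in ("PEPTIDERZ", "PEPTIDEKZZ"):
        print(tagged, split_stage_suffix(tagged))
```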


References

[1] Eng, Jimmy K., Tahmina A. Jahan, and Michael R. Hoopmann. "Comet: an open‐source MS/MS sequence database search tool." Proteomics 13.1 (2013): 22-24.


Maintainers

To submit bug reports and feature suggestions, please contact

Suresh Poudel (suresh.poudel@stjude.org)

