Why am I running MSPathFinder slow? #16

Closed
sunyusui opened this issue Jan 19, 2019 · 4 comments

@sunyusui

I read the paper "Informed-Proteomics: open-source software package for top-down proteomics", which describes MSPathFinder as a faster proteoform identification tool. However, my searches run very slowly, and I can't tell which step of my workflow is causing it. First, I used the msconvert tool from the ProteoWizard package to convert the raw spectral file to the .mzML format. Then I ran MSPathFinderT.exe with the following parameters; the total run time was 8d 8h 12m.
SpecFile 2DLC_H3_1.pbf
DatabaseFile human_proteome_database.fasta
FeatureFile 2DLC_H3_1.ms1ft
InternalCleavageMode SingleInternalCleavage
Tag-based search True
Tda Target+Decoy
PrecursorIonTolerancePpm 10
ProductIonTolerancePpm 10
MinSequenceLength 21
MaxSequenceLength 300
MinPrecursorIonCharge 2
MaxPrecursorIonCharge 30
MinProductIonCharge 1
MaxProductIonCharge 20
MinSequenceMass 3000
MaxSequenceMass 50000
ActivationMethod Unknown
MaxDynamicModificationsPerSequence 0

When I used the default parameters below and added a modification file, the search became even slower: after 0d 22h 35.02m it had completed only 0.4%. Where is the problem?

SpecFile 2DLC_H3_1.pbf
DatabaseFile human_proteome_database.fasta
FeatureFile 2DLC_H3_1.ms1ft
InternalCleavageMode SingleInternalCleavage
Tag-based search True
Tda Target+Decoy
PrecursorIonTolerancePpm 10
ProductIonTolerancePpm 10
MinSequenceLength 21
MaxSequenceLength 500
MinPrecursorIonCharge 2
MaxPrecursorIonCharge 50
MinProductIonCharge 1
MaxProductIonCharge 20
MinSequenceMass 3000
MaxSequenceMass 50000
ActivationMethod Unknown
MaxDynamicModificationsPerSequence 4
Modification C(2) H(2) N(0) O(1) S(0),R,opt,Everywhere,Acetyl
Modification C(2) H(2) N(0) O(1) S(0),K,opt,Everywhere,Acetyl
Modification C(1) H(2) N(0) O(0) S(0),R,opt,Everywhere,Methyl
Modification C(1) H(2) N(0) O(0) S(0),K,opt,Everywhere,Methyl
Modification C(2) H(4) N(0) O(0) S(0),R,opt,Everywhere,Dimethyl
Modification C(2) H(4) N(0) O(0) S(0),K,opt,Everywhere,Dimethyl
Modification C(3) H(6) N(0) O(0) S(0),R,opt,Everywhere,Trimethyl
Modification C(0) H(1) N(0) O(3) S(0) P(1),S,opt,Everywhere,Phospho
Modification C(0) H(1) N(0) O(3) S(0) P(1),T,opt,Everywhere,Phospho
Modification C(0) H(1) N(0) O(3) S(0) P(1),Y,opt,Everywhere,Phospho
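
For readers comparing the two runs, here is a minimal Python sketch that parses the two parameter listings above and prints the settings that differ. It is not part of MSPathFinder; the helper names `parse_params` and `diff_params` are made up for illustration, and the parsing rule (the last whitespace-separated token is the value, except on `Modification` lines) is an assumption about this listing format only.

```python
# Parameter listings copied from the two runs in this post.
RUN_1 = """\
SpecFile 2DLC_H3_1.pbf
DatabaseFile human_proteome_database.fasta
FeatureFile 2DLC_H3_1.ms1ft
InternalCleavageMode SingleInternalCleavage
Tag-based search True
Tda Target+Decoy
PrecursorIonTolerancePpm 10
ProductIonTolerancePpm 10
MinSequenceLength 21
MaxSequenceLength 300
MinPrecursorIonCharge 2
MaxPrecursorIonCharge 30
MinProductIonCharge 1
MaxProductIonCharge 20
MinSequenceMass 3000
MaxSequenceMass 50000
ActivationMethod Unknown
MaxDynamicModificationsPerSequence 0
"""

# Second run: same listing with three changed limits plus ten Modification lines.
RUN_2 = RUN_1.replace("MaxSequenceLength 300", "MaxSequenceLength 500") \
             .replace("MaxPrecursorIonCharge 30", "MaxPrecursorIonCharge 50") \
             .replace("MaxDynamicModificationsPerSequence 0",
                      "MaxDynamicModificationsPerSequence 4") + """\
Modification C(2) H(2) N(0) O(1) S(0),R,opt,Everywhere,Acetyl
Modification C(2) H(2) N(0) O(1) S(0),K,opt,Everywhere,Acetyl
Modification C(1) H(2) N(0) O(0) S(0),R,opt,Everywhere,Methyl
Modification C(1) H(2) N(0) O(0) S(0),K,opt,Everywhere,Methyl
Modification C(2) H(4) N(0) O(0) S(0),R,opt,Everywhere,Dimethyl
Modification C(2) H(4) N(0) O(0) S(0),K,opt,Everywhere,Dimethyl
Modification C(3) H(6) N(0) O(0) S(0),R,opt,Everywhere,Trimethyl
Modification C(0) H(1) N(0) O(3) S(0) P(1),S,opt,Everywhere,Phospho
Modification C(0) H(1) N(0) O(3) S(0) P(1),T,opt,Everywhere,Phospho
Modification C(0) H(1) N(0) O(3) S(0) P(1),Y,opt,Everywhere,Phospho
"""


def parse_params(text):
    """Turn 'Name value' lines into a dict; Modification lines collect into a list."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("Modification "):
            params.setdefault("Modification", []).append(line[len("Modification "):])
        else:
            # Split on the last whitespace so "Tag-based search True" keeps its full name.
            name, value = line.rsplit(None, 1)
            params[name] = value
    return params


def diff_params(a, b):
    """Return {name: (value_in_a, value_in_b)} for every setting that differs."""
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


changes = diff_params(parse_params(RUN_1), parse_params(RUN_2))
for name, (old, new) in changes.items():
    if name == "Modification":
        # Report how many modification entries each run has instead of the full lines.
        name, old, new = "Modification entries", len(old or []), len(new or [])
    print(f"{name}: {old} -> {new}")
```

By this comparison the second run raises MaxSequenceLength, MaxPrecursorIonCharge, and MaxDynamicModificationsPerSequence and adds ten optional modifications. Whether those settings account for the slowdown is not established in this thread.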

Processing, 93499 proteins done, 0.4% complete, 80266.5 sec elapsed
Total Progress: 42.58%, 0d 22h 20.02m elapsed, Current Task: Searching the target database
Processing, 93950 proteins done, 0.4% complete, 80566.8 sec elapsed
Total Progress: 42.58%, 0d 22h 25.02m elapsed, Current Task: Searching the target database
Processing, 94352 proteins done, 0.4% complete, 80881.6 sec elapsed
Total Progress: 42.58%, 0d 22h 30.02m elapsed, Current Task: Searching the target database
Processing, 94955 proteins done, 0.4% complete, 81189.5 sec elapsed
Total Progress: 42.58%, 0d 22h 35.02m elapsed, Current Task: Searching the target database
Another question: the .fasta file I used contains 20410 entries, so why does the search report 94352 proteins done?
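
To put a number on the progress shown in the log, here is a rough Python sketch that extracts the counter and the elapsed time from the "Processing" lines above and computes the average rate between the first and last sample. The regular expression is an assumption based only on the four lines quoted here, and the function names are made up for illustration.

```python
import re

# "Processing" lines copied from the console output in this post.
LOG = """\
Processing, 93499 proteins done, 0.4% complete, 80266.5 sec elapsed
Processing, 93950 proteins done, 0.4% complete, 80566.8 sec elapsed
Processing, 94352 proteins done, 0.4% complete, 80881.6 sec elapsed
Processing, 94955 proteins done, 0.4% complete, 81189.5 sec elapsed
"""

# Captures the item counter and the elapsed seconds from each line.
PATTERN = re.compile(
    r"Processing, (\d+) proteins done, [\d.]+% complete, ([\d.]+) sec elapsed"
)


def parse_progress(log):
    """Return a list of (items_done, seconds_elapsed) tuples."""
    return [(int(m.group(1)), float(m.group(2))) for m in PATTERN.finditer(log)]


def average_rate(samples):
    """Items processed per second between the first and last sample."""
    (n0, t0), (n1, t1) = samples[0], samples[-1]
    return (n1 - n0) / (t1 - t0)


samples = parse_progress(LOG)
window = samples[-1][1] - samples[0][1]
print(f"{average_rate(samples):.2f} items/sec over {window:.0f} s")
```

The result describes only the counter labelled "proteins" over these four samples; it says nothing about how many items remain in the search.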

@FarmGeek4Life
Member

What version of MSPathFinder are you running? How many spectra are in the input file? Release 1.0 (if I remember correctly) reports "proteins", but what was really being counted was peptides.

@sunyusui
Author

The input file contains 3460 spectra, and the version I am using is 1.0.6510.1956.

@alchemistmatt
Member

Please use the latest release since it has numerous bug fixes compared to the preview release that you are currently using. See:
https://github.com/PNNL-Comp-Mass-Spec/Informed-Proteomics/releases/tag/v1.0.6619

@sunyusui
Author

I am very grateful to you all for your help. I will download the latest version to complete my experiment.
