Help needed. Empty PSM list stops pipeline. #328

Closed
fstein opened this issue Mar 10, 2022 · 36 comments
@fstein

fstein commented Mar 10, 2022

Hello,

I wanted to analyze a simple label-free experiment (an acid hydrolysis of a gel band). Unfortunately, the philosopher pipeline does not finish because the PSM list is empty. Analyzing this experiment with IsobarQuant or MaxQuant identifies the one desired protein. What did I do wrong? The MS2 readout was done in the ion trap, so I set fragment_mass_tolerance to 0.5. Also, the search needs to be nonspecific because the acid hydrolysis cleaves the protein nonspecifically. Did I set any parameter incorrectly?
Please find below the philosopher output as well as my philosopher.yml file.

Thanks a lot in advance for your help.

Best,

Frank
philosopher.yml.txt
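
For readers comparing the 0.5 setting above with the tolerances printed in the log: the search log reports `fragment_mass_units = 1`, which in MSFragger's parameter convention means ppm (0 means Da), as far as I know. The following is a minimal sketch of the arithmetic only; the function names are made up for illustration and it says nothing about how Philosopher or MSFragger convert the value internally.

```python
def da_to_ppm(tol_da: float, mz: float) -> float:
    """Convert an absolute tolerance in Da to a relative tolerance in ppm at a given m/z."""
    return tol_da / mz * 1e6


def ppm_to_da(tol_ppm: float, mz: float) -> float:
    """Convert a relative tolerance in ppm to an absolute tolerance in Da at a given m/z."""
    return tol_ppm * mz / 1e6


if __name__ == "__main__":
    # Illustrative m/z values only; they are not taken from the data set.
    for mz in (200.0, 500.0, 1000.0):
        print(f"m/z {mz:7.1f}: 0.5 Da = {da_to_ppm(0.5, mz):7.1f} ppm, "
              f"200 ppm = {ppm_to_da(200.0, mz):.3f} Da")
```

Because a ppm tolerance scales with m/z, a fixed 0.5 Da window corresponds to a different ppm value for every fragment, which is worth keeping in mind when reading the "New fragment_mass_tolerance" line in the log.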

@fstein
Author

fstein commented Mar 10, 2022

Here is the output:
philosopher pipeline --config C:\MS_TestRun_gelband/philosopher.yml C:\MS_TestRun_gelband
INFO[10:31:57] Executing Pipeline v4.1.1
INFO[10:31:57] Creating workspace
WARN[10:31:57] A meta data folder was found and will not be overwritten.
INFO[10:31:57] Initiating the workspace on C:\MS_TestRun_gelband
INFO[10:31:57] Creating workspace
WARN[10:31:57] A meta data folder was found and will not be overwritten.
INFO[10:31:57] Annotating the database
INFO[10:31:57] Running the Database Search
MSFragger version MSFragger-3.4
Batmass-IO version 1.23.6
timsdata library version timsdata-2-8-7-1
(c) University of Michigan
RawFileReader reading tool. Copyright (c) 2016 by Thermo Fisher Scientific, Inc. All rights reserved.
System OS: Windows 10, Architecture: AMD64
Java Info: 1.8.0_201, Java HotSpot(TM) 64-Bit Server VM, Oracle Corporation
JVM started with 35 GB memory
Checking database...
Parameter 'search_enzyme_cutafter' was not supplied. Using default value: KR
Parameter 'search_enzyme_butnotafter' was not supplied. Using default value:
Parameter 'search_enzyme_name' was not supplied. Using default value: stricttrypsin
Deisotoping doesn't support low resolution tandem mass spectra. Changing deisotope to 0.
deisotope = 0. Changing deneutralloss to 0.
Checking spectral files...
C:\MS_TestRun_gelband\Zelda_220225_P1990_PH_JS_band01_H_R1.mzML: Scans = 18140
FIRST SEARCH*
Parameters:
num_threads = 6
database_name = C:\MS_TestRun_gelband\2022-03-10-decoys-contam-Ecoli_UP000000625_05142016_4314entries.fasta.fas
decoy_prefix = rev_
precursor_mass_lower = -20.0
precursor_mass_upper = 20.0
precursor_mass_units = 1
data_type = 0
precursor_true_tolerance = 20.0
precursor_true_units = 1
fragment_mass_tolerance = 500.0
fragment_mass_units = 1
calibrate_mass = 2
use_all_mods_in_first_search = false
write_calibrated_mgf = 0
isotope_error = 0
mass_offsets = 0
labile_search_mode = OFF
restrict_deltamass_to = all
precursor_mass_mode = SELECTED
localize_delta_mass = false
delta_mass_exclude_ranges = (-1.5,3.5)
fragment_ion_series = b,y
ion_series_definitions =
search_enzyme_name = stricttrypsin
search_enzyme_sense_1 = C
search_enzyme_cut_1 = KR
search_enzyme_nocut_1 =
allowed_missed_cleavage_1 = 2
num_enzyme_termini = 0
clip_nTerm_M = true
allow_multiple_variable_mods_on_residue = false
max_variable_mods_per_peptide = 3
max_variable_mods_combinations = 5000
output_format = tsv_pepxml_pin
output_report_topN = 1
output_max_expect = 50.0
report_alternative_proteins = false
override_charge = false
precursor_charge_low = 1
precursor_charge_high = 4
digest_min_length = 8
digest_max_length = 15
digest_mass_range_low = 500.0
digest_mass_range_high = 5000.0
max_fragment_charge = 2
deisotope = 0
deneutralloss = false
track_zero_topN = 0
zero_bin_accept_expect = 0.0
zero_bin_mult_expect = 1.0
add_topN_complementary = 0
minimum_peaks = 10
use_topN_peaks = 150
minIonsScoring = 2
min_matched_fragments = 4
minimum_ratio = 0.01
intensity_transform = 0
remove_precursor_peak = 0
remove_precursor_range = -1.5,1.5
clear_mz_range_low = 0.0
clear_mz_range_high = 0.0
excluded_scan_list_file =
mass_diff_to_variable_mod = 0
min_sequence_matches = 2
check_spectral_files = true
variable_mod_02 = 42.01060 [^ 1
add_A_alanine = 0.000000
add_C_cysteine = 57.021464
add_Cterm_peptide = 0.0
add_Cterm_protein = 0.0
add_D_aspartic_acid = 0.000000
add_E_glutamic_acid = 0.000000
add_F_phenylalanine = 0.000000
add_G_glycine = 0.000000
add_H_histidine = 0.000000
add_I_isoleucine = 0.000000
add_K_lysine = 0.000000
add_L_leucine = 0.000000
add_M_methionine = 0.000000
add_N_asparagine = 0.000000
add_Nterm_peptide = 0.0
add_Nterm_protein = 0.0
add_P_proline = 0.000000
add_Q_glutamine = 0.000000
add_R_arginine = 0.000000
add_S_serine = 0.000000
add_T_threonine = 0.000000
add_V_valine = 0.000000
add_W_tryptophan = 0.000000
add_Y_tyrosine = 0.000000
Selected fragment index width 2.50 Da.
446665730 fragments to be searched in 1 slices (6.66 GB total)
Operating on slice 1 of 1:
Fragment index slice generated in 4.34 s
001. Zelda_220225_P1990_PH_JS_band01_H_R1.mzML 1.4 s
[progress: 17628/17628 (100%) - 3773 spectra/s] 4.7s | postprocessing 0.1 s
*FIRST SEARCH DONE IN 0.230 MIN

**MASS CALIBRATION AND PARAMETER OPTIMIZATION
-----|---------------|---------------|---------------|---------------
| MS1 (Old) | MS1 (New) | MS2 (Old) | MS2 (New)
-----|---------------|---------------|---------------|---------------
Run | Median MAD | Median MAD | Median MAD | Median MAD
001 | -1.95 0.72 | -0.24 0.79 | 8.00 69.85 | -3.52 67.43
-----|---------------|---------------|---------------|---------------
Finding the optimal parameters:
-------|-------|-------
MS2 | 200 | 300
-------|-------|-------
Count | 85| 73
-------|-------|-------
-------|-------|-------|-------
Peaks | 150_1 | 100_1 | 75_1
-------|-------|-------|-------
Count | 85| 27| skip rest
-------|-------|-------|-------
-------|-------
Int. | 1
-------|-------
Count | 75
-------|-------
-------|-------
Rm P. | 1
-------|-------
Count | 78
-------|-------
-------|-------
FragChg| 1
-------|-------
Count | 119
-------|-------
New fragment_mass_tolerance = 200 PPM
New use_topN_peaks = 150
New minimum_ratio = 0.010000
New intensity_transform = 0
New remove_precursor_peak = 0
New max_fragment_charge = 1
***MASS CALIBRATION AND PARAMETER OPTIMIZATION DONE IN 0.934 MIN

MAIN SEARCH
output_format = tsv_pepXML_pin but report_alternative_proteins = 0. Change report_alternative_proteins to 1.
Checking database...
Parameter 'search_enzyme_cutafter' was not supplied. Using default value: KR
Parameter 'search_enzyme_butnotafter' was not supplied. Using default value:
Parameter 'allowed_missed_cleavage' was not supplied. Using default value: 2
Parameter 'search_enzyme_name' was not supplied. Using default value: stricttrypsin
Deisotoping doesn't support low resolution tandem mass spectra. Changing deisotope to 0.
deisotope = 0. Changing deneutralloss to 0.
variable_mod_03 has an empty value.
variable_mod_04 has an empty value.
variable_mod_05 has an empty value.
variable_mod_06 has an empty value.
variable_mod_07 has an empty value.
Parameters:
num_threads = 6
database_name = C:\MS_TestRun_gelband\2022-03-10-decoys-contam-Ecoli_UP000000625_05142016_4314entries.fasta.fas
decoy_prefix = rev_
precursor_mass_lower = -20.0
precursor_mass_upper = 20.0
precursor_mass_units = 1
data_type = 0
precursor_true_tolerance = 20.0
precursor_true_units = 1
fragment_mass_tolerance = 200.0
fragment_mass_units = 1
calibrate_mass = 2
use_all_mods_in_first_search = false
write_calibrated_mgf = 0
isotope_error = 0/1/2
mass_offsets = 0
labile_search_mode = OFF
restrict_deltamass_to = all
precursor_mass_mode = SELECTED
localize_delta_mass = false
delta_mass_exclude_ranges = (-1.5,3.5)
fragment_ion_series = b,y
ion_series_definitions =
search_enzyme_name = stricttrypsin
search_enzyme_sense_1 = C
search_enzyme_cut_1 = KR
search_enzyme_nocut_1 =
allowed_missed_cleavage_1 = 2
num_enzyme_termini = 0
clip_nTerm_M = true
allow_multiple_variable_mods_on_residue = false
max_variable_mods_per_peptide = 3
max_variable_mods_combinations = 5000
output_format = tsv_pepxml_pin
output_report_topN = 1
output_max_expect = 50.0
report_alternative_proteins = true
override_charge = false
precursor_charge_low = 1
precursor_charge_high = 4
digest_min_length = 7
digest_max_length = 50
digest_mass_range_low = 500.0
digest_mass_range_high = 5000.0
max_fragment_charge = 2
deisotope = 0
deneutralloss = false
track_zero_topN = 0
zero_bin_accept_expect = 0.0
zero_bin_mult_expect = 1.0
add_topN_complementary = 0
minimum_peaks = 10
use_topN_peaks = 150
minIonsScoring = 2
min_matched_fragments = 4
minimum_ratio = 0.01
intensity_transform = 0
remove_precursor_peak = 0
remove_precursor_range = -1.5,1.5
clear_mz_range_low = 0.0
clear_mz_range_high = 0.0
excluded_scan_list_file =
mass_diff_to_variable_mod = 0
min_sequence_matches = 2
check_spectral_files = true
variable_mod_01 = 15.99490 M 3
variable_mod_02 = 42.01060 [^ 1
add_A_alanine = 0.000000
add_C_cysteine = 57.021464
add_Cterm_peptide = 0.0
add_Cterm_protein = 0.0
add_D_aspartic_acid = 0.000000
add_E_glutamic_acid = 0.000000
add_F_phenylalanine = 0.000000
add_G_glycine = 0.000000
add_H_histidine = 0.000000
add_I_isoleucine = 0.000000
add_K_lysine = 0.000000
add_L_leucine = 0.000000
add_M_methionine = 0.000000
add_N_asparagine = 0.000000
add_Nterm_peptide = 0.0
add_Nterm_protein = 0.0
add_P_proline = 0.000000
add_Q_glutamine = 0.000000
add_R_arginine = 0.000000
add_S_serine = 0.000000
add_T_threonine = 0.000000
add_V_valine = 0.000000
add_W_tryptophan = 0.000000
add_Y_tyrosine = 0.000000
Selected fragment index width 1.00 Da.
11115526066 fragments to be searched in 7 slices (165.63 GB total)
Operating on slice 1 of 7:
Fragment index slice generated in 24.63 s
001. Zelda_220225_P1990_PH_JS_band01_H_R1.mzBIN_calibrated 0.4 s
[progress: 17620/17620 (100%) - 1932 spectra/s] 9.1s
Operating on slice 2 of 7:
Fragment index slice generated in 15.16 s
001. Zelda_220225_P1990_PH_JS_band01_H_R1.mzBIN_calibrated 0.4 s
[progress: 17620/17620 (100%) - 4174 spectra/s] 4.2s
Operating on slice 3 of 7:
Fragment index slice generated in 17.92 s
001. Zelda_220225_P1990_PH_JS_band01_H_R1.mzBIN_calibrated 0.3 s
[progress: 17620/17620 (100%) - 6340 spectra/s] 2.8s
Operating on slice 4 of 7:
Fragment index slice generated in 14.01 s
001. Zelda_220225_P1990_PH_JS_band01_H_R1.mzBIN_calibrated 0.3 s
[progress: 17620/17620 (100%) - 7948 spectra/s] 2.2s
Operating on slice 5 of 7:
Fragment index slice generated in 14.51 s
001. Zelda_220225_P1990_PH_JS_band01_H_R1.mzBIN_calibrated 0.3 s
[progress: 17620/17620 (100%) - 8375 spectra/s] 2.1s
Operating on slice 6 of 7:
Fragment index slice generated in 14.32 s
001. Zelda_220225_P1990_PH_JS_band01_H_R1.mzBIN_calibrated 0.4 s
[progress: 17620/17620 (100%) - 8806 spectra/s] 2.0s
Operating on slice 7 of 7:
Fragment index slice generated in 14.04 s
001. Zelda_220225_P1990_PH_JS_band01_H_R1.mzBIN_calibrated 0.3 s
[progress: 17620/17620 (100%) - 8347 spectra/s] 2.1s | postprocessing 2.5 s
MAIN SEARCH DONE IN 2.800 MIN

TOTAL TIME 3.965 MIN*
INFO[10:35:59] Running the validation and inference on C:\MS_TestRun_gelband
INFO[10:35:59] Executing PeptideProphet on C:\MS_TestRun_gelband
file 1: C:\MS_TestRun_gelband\Zelda_220225_P1990_PH_JS_band01_H_R1.pepXML
processed altogether 14958 results
INFO: Results written to file: C:\MS_TestRun_gelband\interact.pep.xml

  • C:\MS_TestRun_gelband\interact.pep.xml
  • Building Commentz-Walter keyword tree...
  • Searching the tree...
  • Linking duplicate entries...
  • Printing results...

using Accurate Mass Bins
using PPM mass difference
Using Decoy Label "rev_".
Decoy Probabilities will be reported.
Using non-parametric distributions
(X! Tandem) (using Tandem's expectation score for modeling)
adding ACCMASS mixture distribution
using search_offsets in ACCMASS mixture distr: 0
init with X! Tandem stricttrypsin
MS Instrument info: Manufacturer: UNKNOWN, Model: UNKNOWN, Ionization: UNKNOWN, Analyzer: UNKNOWN, Detector: UNKNOWN

INFO: Processing standard MixtureModel ...
PeptideProphet (TPP v5.2.1-dev Flammagenitus, Build 201906281613-exported (Windows_NT-x86_64)) AKeller@ISB
read in 0 1+, 5326 2+, 6021 3+, 2785 4+, 617 5+, 209 6+, and 0 7+ spectra.
Initialising statistical models ...
Found 6688 Decoys, and 8270 Non-Decoys
Iterations: .........10.........20.........30
WARNING: Mixture model quality test failed for charge (1+).
WARNING: Mixture model quality test failed for charge (5+).
WARNING: Mixture model quality test failed for charge (7+).
model complete after 31 iterations
INFO[10:36:22] Running the validation and inference on C:\MS_TestRun_gelband
INFO[10:36:22] Executing ProteinProphet on C:\MS_TestRun_gelband
ProteinProphet (C++) by Insilicos LLC and LabKey Software, after the original Perl by A. Keller (TPP v6.0.0-rc15 Noctilucent, Build 202105101442-exported (Windows_NT-x86_64))
(no FPKM) (using degen pep info)
Reading in C:/MS_TestRun_gelband/interact.pep.xml...
...read in 0 1+, 750 2+, 396 3+, 119 4+, 0 5+, 4 6+, 0 7+ spectra with min prob 0.05

Initializing 1008 peptide weights: 0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%
Calculating protein lengths and molecular weights from database c:/MS_TestRun_gelband/2022-03-10-decoys-contam-Ecoli_UP000000625_05142016_4314entries.fasta.fas
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........1000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........2000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........3000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........4000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........5000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........6000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........7000
.........:.........:.........:.........:.........:.........:.........:.........:.........:.........8000
.........:.........:.........:.........:.........:.........:.........:.........:...... Total: 8862
Computing degenerate peptides for 178 proteins: 0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%
Computing probabilities for 179 proteins. Loop 1: 0%...20%...40%...60%...80%...100% Loop 2: 0%...20%...40%...60%...80%...100%
Computing probabilities for 179 proteins. Loop 1: 0%...20%...40%...60%...80%...100% Loop 2: 0%...20%...40%...60%...80%...100%
Computing 175 protein groups: 0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%
Calculating sensitivity...and error tables...
Computing MU for 179 proteins: 0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%
INFO: mu=5.05591e-05, db_size=5598271
INFO[10:36:23] Executing filter on C:\MS_TestRun_gelband
INFO[10:36:23] Processing peptide identification files
INFO[10:36:23] Printing models
INFO[10:36:26] 1+ Charge profile decoy=0 target=0
INFO[10:36:26] 2+ Charge profile decoy=72 target=678
INFO[10:36:26] 3+ Charge profile decoy=46 target=350
INFO[10:36:26] 4+ Charge profile decoy=13 target=106
INFO[10:36:26] 5+ Charge profile decoy=0 target=0
INFO[10:36:26] 6+ Charge profile decoy=0 target=4
INFO[10:36:26] Database search results ions=1008 peptides=830 psms=1269
INFO[10:36:26] Converged to 0.98 % FDR with 405 PSMs decoy=4 threshold=0.98 total=409
INFO[10:36:26] Converged to 0.73 % FDR with 271 Peptides decoy=2 threshold=0.9865 total=273
INFO[10:36:26] Converged to 0.92 % FDR with 316 Ions decoy=2 threshold=0.9813 total=318
INFO[10:36:26] Protein inference results decoy=89 target=86
INFO[10:36:26] Converged to 50.00 % FDR with 2 Proteins decoy=1 threshold=0.9344 total=3
INFO[10:36:26] Applying sequential FDR estimation ions=328 peptides=298 psms=405
INFO[10:36:26] Converged to 0.00 % FDR with 405 PSMs decoy=0 threshold=0.98 total=405
INFO[10:36:26] Converged to 0.00 % FDR with 298 Peptides decoy=0 threshold=0.9801 total=298
INFO[10:36:26] Converged to 0.00 % FDR with 328 Ions decoy=0 threshold=0.9801 total=328
INFO[10:36:27] Post processing identifications
INFO[10:36:27] Mapping modifications
INFO[10:36:27] Assigning protein identifications to layers
INFO[10:36:27] Processing protein inference
INFO[10:36:27] Synchronizing PSMs and proteins
INFO[10:36:27] Total report numbers after FDR filtering, and post-processing ions=0 peptides=0 proteins=0 psms=0
INFO[10:36:27] Saving
INFO[10:36:27] Executing label-free quantification on C:\MS_TestRun_gelband
INFO[10:36:27] Indexing PSM information
INFO[10:36:27] Reading spectra and tracing peaks
INFO[10:36:27] Assigning intensities to data layers
FATA[10:36:27] Cannot quantify data set. the PSM list is enpty
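
A log of this length is easier to triage with a small script. The sketch below is not part of Philosopher; it assumes the `LEVEL[hh:mm:ss] message key=value ...` line layout seen above, and the 1 % cutoff and all function names are my own choices. It flags any FDR level that converged above the cutoff, an all-zero final report, and fatal lines.

```python
import re

LINE = re.compile(r"^(INFO|WARN|FATA)\[(\d{2}:\d{2}:\d{2})\]\s+(.*)$")
FIELD = re.compile(r"(\w+)=(\S+)")
CONVERGED = re.compile(r"Converged to ([\d.]+) % FDR with (\d+) (\w+)")


def parse_line(line):
    """Split one log line into (level, message, fields); return None for other lines."""
    m = LINE.match(line.strip())
    if m is None:
        return None
    level, _time, rest = m.groups()
    fields = dict(FIELD.findall(rest))
    message = FIELD.sub("", rest).strip()
    return level, message, fields


def summarize(log_text, max_fdr_percent=1.0):
    """Return short human-readable findings for suspicious lines in the log."""
    findings = []
    for raw in log_text.splitlines():
        parsed = parse_line(raw)
        if parsed is None:
            continue
        level, message, fields = parsed
        conv = CONVERGED.match(message)
        if conv and float(conv.group(1)) > max_fdr_percent:
            findings.append(f"{conv.group(3)} FDR converged at {conv.group(1)} %")
        if (message.startswith("Total report numbers") and fields
                and all(v == "0" for v in fields.values())):
            findings.append("final report is empty")
        if level == "FATA":
            findings.append(f"fatal: {message}")
    return findings


# Four lines copied verbatim from the log above.
SAMPLE = """\
INFO[10:36:26] Converged to 0.98 % FDR with 405 PSMs decoy=4 threshold=0.98 total=409
INFO[10:36:26] Converged to 50.00 % FDR with 2 Proteins decoy=1 threshold=0.9344 total=3
INFO[10:36:27] Total report numbers after FDR filtering, and post-processing ions=0 peptides=0 proteins=0 psms=0
FATA[10:36:27] Cannot quantify data set. the PSM list is enpty
"""

if __name__ == "__main__":
    for finding in summarize(SAMPLE):
        print(finding)
```

On these lines the script points at the protein-level filter (50.00 %) while the PSM level converged below 1 %, which narrows down where the identifications disappear without explaining why.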

@fstein fstein changed the title Help need. Empty PSM list stops pipeline. Help needed. Empty PSM list stops pipeline. Mar 10, 2022
@fstein
Author

fstein commented Mar 10, 2022

Also, why does it say "Parameter 'search_enzyme_name' was not supplied. Using default value: stricttrypsin"?
In the parameter.yml file there is only 'search_enzyme_name_1', to which I gave the value 'nonspecific'. This does not seem to be used. Should I also have a parameter called 'search_enzyme_name', although it is not part of the original parameter.yml file? Could this be why it does not do a nonspecific search?
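
One way to see which settings the search engine filled in by itself is to collect every "was not supplied" message from the console output. This is a minimal sketch that relies only on the message wording shown in the log above; the function name is hypothetical and it does not inspect the philosopher.yml file itself.

```python
import re

DEFAULTED = re.compile(
    r"^Parameter '([^']+)' was not supplied\. Using default value:\s*(.*)$"
)


def defaulted_parameters(log_text):
    """Map each parameter reported as 'not supplied' to the default that was used."""
    found = {}
    for line in log_text.splitlines():
        m = DEFAULTED.match(line.strip())
        if m:
            found[m.group(1)] = m.group(2).strip()
    return found


# Lines copied verbatim from the search log above.
SAMPLE = """\
Checking database...
Parameter 'search_enzyme_cutafter' was not supplied. Using default value: KR
Parameter 'search_enzyme_butnotafter' was not supplied. Using default value:
Parameter 'search_enzyme_name' was not supplied. Using default value: stricttrypsin
"""

if __name__ == "__main__":
    for name, value in defaulted_parameters(SAMPLE).items():
        print(f"{name} -> {value!r}")
```

Comparing the reported names with the keys in the configuration file shows quickly whether a key such as `search_enzyme_name_1` was read under a different name than expected.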

@prvst prvst self-assigned this Mar 10, 2022
@prvst prvst added the help wanted Extra attention is needed label Mar 10, 2022
@prvst
Collaborator

prvst commented Mar 10, 2022

@fstein I suggest you look at the FragPipe workflows. The program has automatic configuration for several scenarios, including nonspecific searches. The fragger message might be related to the fact that the program was updated, and you are still using a configuration file from a previous version. Try generating a new parameter file before running again.

@fstein
Author

fstein commented Mar 10, 2022

@prvst I am using the latest versions of Philosopher and MSFragger (3.4), as well as the latest philosopher.yml file. This output comes from running the pipeline command.

@prvst
Collaborator

prvst commented Mar 10, 2022

Thanks, I'll take a look. On a side note, we will push forward the option of running a cmd version of fragpipe in the future. You can look at the program now to get familiar with it.

@fstein
Author

fstein commented Mar 10, 2022

Having a cmd version of fragpipe would be extremely useful for us.

I already tried saving the MSFragger settings for a nonspecific search in a params file and made sure the parameters match the ones I use in the philosopher.yml file. It did not work :-(

@prvst prvst assigned prvst and unassigned prvst Mar 10, 2022
@prvst
Collaborator

prvst commented Mar 10, 2022

@fcyu, can we send him the beta for testing?

@fcyu
Member

fcyu commented Mar 10, 2022

Sure.

@fstein , here (https://www.dropbox.com/s/rppc134jznqvonj/FragPipe-17.2-build27.zip?dl=1) is the link to download the pre-release version.

Best,

Fengchao

@fstein
Author

fstein commented Mar 10, 2022

Thanks a lot...

@prvst
Collaborator

prvst commented Mar 10, 2022

Please submit new tickets to the FragPipe github in case you need help.

@prvst prvst closed this as completed Mar 10, 2022
@fstein
Author

fstein commented Mar 10, 2022

Is there some kind of documentation on how to use it from the command line?
I will comment on the FragPipe github page for future questions.

@fcyu
Member

fcyu commented Mar 10, 2022

There is brief documentation here (Nesvilab/FragPipe#560 (comment)).

@fstein
Author

fstein commented Mar 10, 2022

> Thanks, I'll take a look. On a side note, we will push forward the option of running a cmd version of fragpipe in the future. You can look at the program now to get familiar with it.

So, although you just closed this thread, it would still be important for us to be able to analyze acid hydrolysis data with philosopher using the pipeline command. Let me know if you need any further files or information.

@prvst prvst reopened this Mar 15, 2022
@prvst
Collaborator

prvst commented Mar 15, 2022

Let me add Alexey to this discussion. He might give you a better insight on how to process acid hydrolysis samples.

@prvst
Collaborator

prvst commented Mar 15, 2022

BTW, did you set the enzyme to nonspecific?

@fstein
Author

fstein commented Mar 16, 2022

As you can see from the philosopher.yml file, I chose "search_enzyme_name_1: nonspecific" and "num_enzyme_termini: 0". Because we measured the MS2 spectra in the ion trap, I also set "fragment_mass_tolerance: 0.5".

When I put "nonspecific" as the search_enzyme_name_1, I got this message: "Parameter 'search_enzyme_name' was not supplied. Using default value: stricttrypsin". However, if I leave "search_enzyme_name: trypsin" and still set "num_enzyme_termini: 0", I don't get this message, but there are still no identified proteins ("FATA[10:36:27] Cannot quantify data set. the PSM list is enpty").

I also tried FragPipe with the workflow "nonspecific-peptidome" and still no proteins were identified. I am grateful for any hints about what I might be doing wrong.

@prvst
Collaborator

prvst commented Mar 16, 2022

@fcyu, can you look at the nonspecific-peptidome workflow, and see if you spot any issues? If I understood correctly, @fstein is also having issues running this search using FragPipe.

@fcyu
Member

fcyu commented Mar 16, 2022

Hi @fstein , if you have issues using FragPipe for the nonspecific-peptidome workflow, please send us the log file.

Best,

Fengchao

@fstein
Author

fstein commented Mar 17, 2022

log_2022-03-15_14-00-17.txt

Here it is...
Thanks for looking into this.

@fcyu
Member

fcyu commented Mar 17, 2022

Hi @fstein ,

Your log shows that the task finished without any error. But Felipe @prvst said that

> If I understood correctly, @fstein is also having issues running this search using FragPipe.

Then, I am confused. Do you have any issue running FragPipe?

Best,

Fengchao

@fstein
Author

fstein commented Mar 17, 2022

Dear Fengchao,

I don't have any issues running FragPipe or philosopher. It's just that, in this experiment, no proteins were identified. Analyzing this experiment with IsobarQuant or MaxQuant yields tons of PSMs belonging to one protein (it's an acid hydrolysis of a rather clean gel band after purification of a protein). I just don't know why neither FragPipe nor the philosopher pipeline yields any of these PSMs. For this experiment, we also measured the MS2 spectra in the ion trap. Is it maybe because the MS2 mass tolerance was not properly set to 0.5? Any hint is welcome here.

Best,

Frank

@anesvi

anesvi commented Mar 17, 2022 via email

@fstein
Author

fstein commented Mar 24, 2022

Dear Alexey,

thanks a lot for your comment.
If I leave out the Protein Inference as you suggested (by setting Protein Inference to no in the Steps of the philosopher.yml file), I get the error: FATA[11:25:02] Cannot read file. open interact.prot.xml: The system cannot find the file specified.
If I set Protein Inference and FDR Filtering to no, I get the error: Cannot read file:open .meta\psm.bin: The system cannot find the file specified.
Also, just setting the proteinFDR to 1 in the FDR Filtering tab does not result in any peptides in the psm.tsv file.
So could you maybe be a bit more precise about which parameter in the philosopher.yml file I should set to which value?

When checking the raw_file_name.pin file, I found many PSMs to be identified. How could I check which parameter is responsible for not using any of these PSMs to be reported?

Thanks for your help.
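
To check the .pin file mentioned above for a specific protein, something like the following sketch may help. It assumes the tab-delimited Percolator input layout (a header row with `Label` and `Proteins` columns, `1` for targets and `-1` for decoys, and the protein list occupying the last column or columns). The function name and the three data rows are invented purely for illustration and are not taken from the real file.

```python
def count_pin_rows(pin_text, protein_substring):
    """Count target rows, decoy rows, and target rows that mention a protein."""
    lines = [ln for ln in pin_text.splitlines() if ln.strip()]
    header = lines[0].split("\t")
    label_i = header.index("Label")
    prot_i = header.index("Proteins")
    counts = {"target": 0, "decoy": 0, "matching": 0}
    for ln in lines[1:]:
        cols = ln.split("\t")
        is_target = cols[label_i] == "1"
        counts["target" if is_target else "decoy"] += 1
        # The protein list may spill over into extra trailing columns.
        if is_target and any(protein_substring in p for p in cols[prot_i:]):
            counts["matching"] += 1
    return counts


# Made-up toy rows (not real results); only the protein name comes from this thread.
TOY_PIN = "\n".join("\t".join(row) for row in [
    ["SpecId", "Label", "ScanNr", "Peptide", "Proteins"],
    ["toy.1.1.2", "1", "1", "-.PEPTIDEK.-", "P1990_JS"],
    ["toy.2.2.2", "-1", "2", "-.KEDITPEP.-", "rev_P1990_JS"],
    ["toy.3.3.3", "1", "3", "-.AAAAAAAK.-", "protA", "P1990_JS"],
])

if __name__ == "__main__":
    print(count_pin_rows(TOY_PIN, "P1990_JS"))
```

For a real file, the text would be read with `open(path).read()` first. If the matching count is non-zero while the final reports are empty, the PSMs were found by the search and lost in a later filtering step.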

@anesvi

anesvi commented Mar 24, 2022 via email

@prvst
Collaborator

prvst commented Mar 24, 2022

@fstein this is tricky. Are you working with a FASTA file containing only one protein at all?

@fstein
Author

fstein commented Mar 24, 2022

No, I am working with a FASTA file containing all proteins of E. coli and the protein of interest.

@Nesvilab Nesvilab deleted a comment from fstein Mar 24, 2022
@prvst
Collaborator

prvst commented Mar 24, 2022

I got your files and removed the link for you. I'll take a look.

@fstein
Author

fstein commented Mar 24, 2022

Thanks a lot. The protein is called P1990_JS.

@prvst
Collaborator

prvst commented Mar 28, 2022

Hi @fstein, I found your protein in the sample, here's what I did:

  1. Run ProteinProphet to create the protein inference.
  2. Run the filter with the default options: filter --pepxml interact.pep.xml --protxml interact.prot.xml --razor
  3. Run the freequant to get the precursor intensity
  4. Run the report
INFO[15:48:03] Executing Filter  v4.2.1                     
INFO[15:48:03] Processing peptide identification files      
INFO[15:48:03] Parsing interact.pep.xml                     
INFO[15:48:06] 1+ Charge profile                             decoy=0 target=0
INFO[15:48:06] 2+ Charge profile                             decoy=2260 target=3066
INFO[15:48:06] 3+ Charge profile                             decoy=2743 target=3276
INFO[15:48:06] 4+ Charge profile                             decoy=1283 target=1501
INFO[15:48:06] 5+ Charge profile                             decoy=298 target=319
INFO[15:48:06] 6+ Charge profile                             decoy=102 target=107
INFO[15:48:06] Database search results                       ions=14032 peptides=13712 psms=14955
INFO[15:48:06] Converged to 0.99 % FDR with 392 PSMs         decoy=3 threshold=0.9811 total=395
INFO[15:48:06] Converged to 0.73 % FDR with 274 Peptides     decoy=2 threshold=0.9862 total=276
INFO[15:48:06] Converged to 0.96 % FDR with 311 Ions         decoy=2 threshold=0.9833 total=313
INFO[15:48:06] Protein inference results                     decoy=91 target=85
INFO[15:48:06] Converged to 100.00 % FDR with 1 Proteins     decoy=1 threshold=0.9931 total=2
INFO[15:48:06] 2D FDR estimation: Protein mirror image       decoy=1 target=1
INFO[15:48:06] Second filtering results                      ions=1151 peptides=881 psms=1583
INFO[15:48:06] Converged to 0.99 % FDR with 392 PSMs         decoy=3 threshold=0.9811 total=395
INFO[15:48:06] Converged to 0.73 % FDR with 274 Peptides     decoy=2 threshold=0.9862 total=276
INFO[15:48:06] Converged to 0.96 % FDR with 311 Ions         decoy=2 threshold=0.9833 total=313
INFO[15:48:06] Post processing identifications              
INFO[15:48:07] Assigning protein identifications to layers  
INFO[15:48:07] Processing protein inference                 
INFO[15:48:07] Synchronizing PSMs and proteins              
INFO[15:48:07] Total report numbers after FDR filtering, and post-processing  ions=311 peptides=274 proteins=1 psms=392
INFO[15:48:07] Saving                                       
INFO[15:48:07] Done   


I suggest you try one more time like I did above. I can send you the tables if you want, let me know.
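
The four steps listed in this comment can be written down as a small script for repeatability. Only the `filter` invocation is quoted from the comment; the positional pepXML argument for `proteinprophet` and the `--dir .` flag for `freequant` are assumptions based on common Philosopher usage and should be checked against `philosopher <command> --help`. The sketch only prints the commands and does not run them.

```python
import shlex


def manual_steps(pepxml="interact.pep.xml", protxml="interact.prot.xml"):
    """Return the four manual commands, in order, as argument lists."""
    return [
        # Step 1: protein inference (argument form is an assumption).
        ["philosopher", "proteinprophet", pepxml],
        # Step 2: filter with the default options, as quoted in the comment.
        ["philosopher", "filter", "--pepxml", pepxml, "--protxml", protxml, "--razor"],
        # Step 3: precursor intensities (the --dir flag is an assumption).
        ["philosopher", "freequant", "--dir", "."],
        # Step 4: write the report tables.
        ["philosopher", "report"],
    ]


if __name__ == "__main__":
    for cmd in manual_steps():
        print(shlex.join(cmd))
```

To actually run the steps from an initialized workspace, each list could be passed to `subprocess.run(cmd, check=True)` so that the sequence stops at the first failing command.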

@fstein
Author

fstein commented Mar 29, 2022

This is great news. However, it is important for us that it works with the pipeline command and the philosopher.yml file. It is a little confusing to me why it works for you with the standard parameters. In that case, it should also work with the pipeline command, shouldn't it?
I also checked the interact.pep.xml file that is produced by running the pipeline command. In this file, I also find all the peptides. But I don't understand why the output report files stay empty.
Here, the output files are only produced if I set the protein FDR to 1. If I leave it at 0.01, then I get the error I reported above.
Does the pipeline command work for you? If so, could you send me your philosopher.yml file?

@fstein
Author

fstein commented Mar 29, 2022

In the philosopher.yml file, I have now set only razor to true, and picked, mapMods, models and sequential to false. I got pretty much the same output as you, except for the last line, which reports zero ions, peptides, PSMs and proteins:

INFO[10:04:22] Executing filter on C:\MS_TestRun_gelband
INFO[10:04:22] Processing peptide identification files
INFO[10:04:24] 1+ Charge profile decoy=0 target=0
INFO[10:04:24] 2+ Charge profile decoy=2260 target=3066
INFO[10:04:24] 3+ Charge profile decoy=2743 target=3276
INFO[10:04:24] 4+ Charge profile decoy=1283 target=1501
INFO[10:04:24] 5+ Charge profile decoy=298 target=319
INFO[10:04:24] 6+ Charge profile decoy=102 target=107
INFO[10:04:24] Database search results ions=14032 peptides=13712 psms=14955
INFO[10:04:24] Converged to 0.99 % FDR with 392 PSMs decoy=3 threshold=0.9811 total=395
INFO[10:04:24] Converged to 0.73 % FDR with 274 Peptides decoy=2 threshold=0.9862 total=276
INFO[10:04:24] Converged to 0.96 % FDR with 311 Ions decoy=2 threshold=0.9833 total=313
INFO[10:04:24] Protein inference results decoy=91 target=85
INFO[10:04:24] Converged to 100.00 % FDR with 1 Proteins decoy=1 threshold=0.9931 total=2
INFO[10:04:24] 2D FDR estimation: Protein mirror image decoy=1 target=1
INFO[10:04:24] Second filtering results ions=1151 peptides=881 psms=1583
INFO[10:04:24] Converged to 0.99 % FDR with 392 PSMs decoy=3 threshold=0.9811 total=395
INFO[10:04:24] Converged to 0.73 % FDR with 274 Peptides decoy=2 threshold=0.9862 total=276
INFO[10:04:24] Converged to 0.96 % FDR with 311 Ions decoy=2 threshold=0.9833 total=313
INFO[10:04:24] Post processing identifications
INFO[10:04:24] Assigning protein identifications to layers
INFO[10:04:24] Processing protein inference
INFO[10:04:24] Synchronizing PSMs and proteins
INFO[10:04:24] Total report numbers after FDR filtering, and post-processing ions=0 peptides=0 proteins=0 psms=0
INFO[10:04:24] Saving
INFO[10:04:24] Executing report on C:\MS_TestRun_gelband
INFO[10:04:24] Creating reports
INFO[10:04:24] Done

Any idea what might be the reason?

@prvst
Collaborator

prvst commented Mar 29, 2022

This might be because of the version you are using. We fixed a bug that would affect the output of the results, pretty much like yours. I can send you the current test version for you to try.

@fstein
Author

fstein commented Mar 29, 2022

I am happy to try the new version and give you feedback...

@fstein
Author

fstein commented Mar 30, 2022

Dear Felipe,

Short feedback: the new version solved the issue with the empty summary files. Thanks a lot.

For a nonspecific search there is still one bug remaining.
If you set the following parameter in the philosopher.yml file:
search_enzyme_name_1: nonspecific

It spits out the following error:
"Parameter 'search_enzyme_cutafter' was not supplied. Using default value: KR
Parameter 'search_enzyme_butnotafter' was not supplied. Using default value:
Parameter 'search_enzyme_name' was not supplied. Using default value: stricttrypsin"
So it will do a search with trypsin instead of a nonspecific search.

Apparently "nonspecific" is not a valid search_enzyme_name.
If you leave "search_enzyme_name_1: stricttrypsin" and set "num_enzyme_termini: 0", it will do a nonspecific search nevertheless. And with your new version, I also did not encounter any further issues with this workaround.

But do you have any idea why it does not allow "nonspecific" as a parameter there? If you choose a nonspecific search in FragPipe, it will set "search_enzyme_name_1 = nonspecific" in the MSFragger params file (in case one exports it). So I was assuming that I could also use this parameter as such in the philosopher.yml file, since most of the parameter names match.

Thanks a lot...

@fstein
Author

fstein commented Mar 30, 2022

PS:
Even with this error message mentioned above ("Parameter 'search_enzyme_cutafter' was not supplied. Using default value: KR
Parameter 'search_enzyme_butnotafter' was not supplied. Using default value:
Parameter 'search_enzyme_name' was not supplied. Using default value: stricttrypsin"), it still works and it identifies all peptides.

@prvst
Collaborator

prvst commented Mar 31, 2022

@fstein we are replacing the philosopher pipeline with the FragPipe CMD option, which is why we are putting more effort into adding new functionality to FragPipe than to the philosopher pipeline. I think we can send you the pre-release version for testing, and all these issues you're having will not be a problem.

@fstein fstein closed this as completed May 2, 2022