Setting the Parameters

hayse1 edited this page Aug 19, 2019 · 7 revisions

General Parameters

num_threads

  • Number of CPU threads to use. Should be set to the number of logical processors. A value of 0 makes MSFragger auto-detect the number of processors.
  • Default: 0

database_name

  • Path to the protein database file in FASTA format. The database must contain decoy sequences.

Search Parameters

precursor_mass_lower

  • Lower bound of the precursor mass window.
  • Default: -20

precursor_mass_upper

  • Upper bound of the precursor mass window.
  • Default: 20

precursor_mass_units

  • Precursor mass tolerance units (0 for Da, 1 for ppm).
  • Default: 1
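As a rough illustration of how the three precursor window parameters combine, the sketch below converts a candidate mass into an accepted mass range. The function name `precursor_window` is hypothetical, and the choice to scale ppm bounds by the candidate mass is an assumption made for illustration, not a statement about MSFragger internals.

```python
def precursor_window(mass, lower=-20.0, upper=20.0, units=1):
    """Return the (min, max) mass range accepted around `mass`.

    lower/upper mirror precursor_mass_lower/precursor_mass_upper.
    units mirrors precursor_mass_units: 0 = Da, 1 = ppm.
    Assumption: ppm bounds are scaled by the mass passed in.
    """
    if units == 1:
        # ppm: the absolute width grows with the mass
        lo = mass * lower / 1e6
        hi = mass * upper / 1e6
    elif units == 0:
        # Da: the bounds are absolute offsets
        lo, hi = lower, upper
    else:
        raise ValueError("units must be 0 (Da) or 1 (ppm)")
    return mass + lo, mass + hi


# Narrow-window search with the defaults (-20 to 20 ppm)
narrow = precursor_window(1000.0)
# Open-search style window given in Da (illustrative values)
wide = precursor_window(1000.0, lower=-150.0, upper=500.0, units=0)
```

With the defaults, a 1000 Da candidate is accepted within roughly 0.02 Da on either side; the Da example gives an asymmetric window from 850 to 1500 Da.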

precursor_true_tolerance

  • True precursor mass tolerance (window is +/- this value). Used for tie-breaking of results (in spectrally ambiguous cases) and zero-bin boosting in open searches (0 disables these features), and for mass calibration. This option is STRONGLY recommended for open searches.
  • Default: 20

precursor_true_units

  • True precursor mass tolerance units (0 for Da, 1 for ppm).
  • Default: 1

fragment_mass_tolerance

  • Fragment mass tolerance (window is +/- this value).
  • Default: 20

fragment_mass_units

  • Fragment mass tolerance units (0 for Da, 1 for ppm).
  • Default: 1

calibrate_mass

  • Perform mass calibration (0 for OFF, 1 for ON, 2 for ON and find optimal parameters).
  • Default: 2

decoy_prefix

  • Prefix added to the decoy protein ID.
  • No default value.

isotope_error

  • Isotope correction for MS/MS events triggered on isotopic peaks. Should be set to 0 (disabled) for open search or 0/1/2 for correction of narrow window searches. Shifts the precursor mass window to multiples of this value multiplied by the mass of C13-C12.
  • Default: 0

mass_offsets

  • Creates multiple precursor tolerance windows with specified mass offsets. These values are multiplexed with the isotope error option. For example, mass_offsets = 0/79.966 can be used as a restricted 'open' search that looks for unmodified and phosphorylated peptides. Setting isotope_error to 0/1/2 in combination with this example will create search windows around (0, 1, 2, 79.966, 80.966, 81.966).
  • Default: 0
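The multiplexing of mass_offsets with isotope_error can be sketched as a simple cross product. This is a minimal illustration, not MSFragger's code: the helper names are hypothetical, and the C13-C12 mass difference is taken as approximately 1.00335 Da, so the computed centers are close to, rather than exactly, the rounded values quoted above.

```python
C13_C12_DELTA = 1.00335  # approximate 13C - 12C mass difference in Da


def parse_slash_list(value, cast=float):
    """Parse a slash-separated parameter value such as '0/79.966'."""
    return [cast(item) for item in value.split("/")]


def search_window_centers(mass_offsets, isotope_errors):
    """Combine every mass offset with every isotope error.

    Each center is offset + n * C13_C12_DELTA; duplicates are
    removed and the result is sorted in ascending order.
    """
    centers = {
        round(offset + n * C13_C12_DELTA, 5)
        for offset in mass_offsets
        for n in isotope_errors
    }
    return sorted(centers)


offsets = parse_slash_list("0/79.966")
isotopes = parse_slash_list("0/1/2", cast=int)
centers = search_window_centers(offsets, isotopes)
```

Here `centers` holds six values: three near 0, 1, and 2 Da and three near 79.966, 80.966, and 81.966 Da.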

precursor_mass_mode

  • One of isolated/selected/recalculated. Isolated uses the isolation m/z, selected uses the selected m/z, while recalculated uses a recalculated m/z from .ma files within the same directory. If the desired m/z type is not present for a scan, it will default to whatever m/z is available.
  • Default: selected

localize_delta_mass

  • Generate and use a mass difference fragment index in addition to the regular fragment index for search. This allows shifted fragment ions (fragment ions with mass increased by the calculated mass difference) to be included in scoring.
  • Default: 0

delta_mass_exclude_ranges

  • Exclude mass range for searching with delta mass to remove double counting of fragments in chimeric spectra and instances of monoisotopic error.
  • Default: (-1.5, 3.5)

fragment_ion_series

  • Ion series used in search. Can be any combination of a, b, c, x, y, or z.
  • Default: b,y

precursor_charge

  • When override_charge is set to 0, use the original precursor charge if there is one in the spectral file, or try the charge states in the specified range if there is no precursor charge in the spectral file.
  • When override_charge is set to 1, always try the charge states in the specified range.
  • Default: 1 4

override_charge

  • Ignores precursor charge and uses charge state specified in precursor_charge range (0 or 1).
  • Default: 0
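The interaction between precursor_charge and override_charge described above can be written as a small decision function. This is a sketch of the documented behavior only; the function name `charges_to_try` and the use of `None` for a missing charge are assumptions made for the example.

```python
def charges_to_try(spectrum_charge, precursor_charge=(1, 4), override_charge=0):
    """Return the list of charge states to try for one spectrum.

    spectrum_charge: charge recorded in the spectral file, or None
                     if the file has no precursor charge.
    precursor_charge: inclusive (low, high) range, as in '1 4'.
    override_charge: 0 to trust the file, 1 to always use the range.
    """
    low, high = precursor_charge
    charge_range = list(range(low, high + 1))
    if override_charge == 1 or spectrum_charge is None:
        # Either forced by the user or no charge available in the file
        return charge_range
    return [spectrum_charge]


from_file = charges_to_try(2)                     # charge from the file is used
missing = charges_to_try(None)                    # falls back to the range
forced = charges_to_try(2, override_charge=1)     # range is always used
```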

max_fragment_charge

  • Maximum charge state for theoretical fragments to match (1-4).
  • Default: 2

In Silico Digestion

search_enzyme_name

  • Name of enzyme to be written to the pepXML file. A complete list of enzymes can be found at the bottom of this page. Note that the enzyme name must exactly match for downstream processing to run properly.
  • Default: trypsin
  • For nonspecific searches, use nonspecific as the enzyme name.

search_enzyme_cutafter

  • Residues after which the enzyme cuts (specified as a string of amino acids).
  • Default: KR

search_enzyme_butnotafter

  • Residues that the enzyme will not cut before (misnomer: should really be called butnotbefore).
  • Default: <blank>

num_enzyme_termini

  • Number of enzyme termini (0 for non-enzymatic, 1 for semi-enzymatic, and 2 for fully-enzymatic).
  • Default: 2

allowed_missed_cleavage

  • Allowed number of missed cleavages.
  • Default: 1

digest_min_length

  • Minimum length of peptides to be generated during in silico digestion.
  • Default: 7

digest_max_length

  • Maximum length of peptides to be generated during in silico digestion.
  • Default: 50

digest_mass_range

  • Mass range of peptides to be generated during in silico digestion in Daltons (specified as a space separated range).
  • Default: 500.0 5000.0
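To make the digestion parameters concrete, here is a minimal sketch of a fully enzymatic in silico digestion (the num_enzyme_termini = 2 case). The function `digest` is hypothetical and simplified: it applies the cut rules, missed cleavages, and length limits, but leaves out the digest_mass_range filter and semi-enzymatic or non-enzymatic modes.

```python
def digest(protein, cut_after="KR", but_not_before="",
           missed_cleavages=1, min_len=7, max_len=50):
    """Fully enzymatic digestion of one protein sequence.

    cut_after mirrors search_enzyme_cutafter.
    but_not_before mirrors search_enzyme_butnotafter, which the text
    above describes as residues the enzyme will not cut before.
    """
    # A site at index i means a cut between protein[i-1] and protein[i]
    sites = [0]
    for i in range(1, len(protein)):
        if protein[i - 1] in cut_after and protein[i] not in but_not_before:
            sites.append(i)
    sites.append(len(protein))

    peptides = []
    for i in range(len(sites) - 1):
        # Skipping up to `missed_cleavages` internal sites
        last = min(i + 2 + missed_cleavages, len(sites))
        for j in range(i + 1, last):
            peptide = protein[sites[i]:sites[j]]
            if min_len <= len(peptide) <= max_len:
                peptides.append(peptide)
    return peptides


peptides = digest("AAAAAAAKGGGGGGGRCCCCCCC")
```

Setting `but_not_before="P"` reproduces the familiar trypsin rule of not cleaving before proline.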

Variable Modifications

clip_nTerm_M

  • Specifies the trimming of a protein N-terminal methionine as a variable modification (0 or 1).
  • Default: 1

variable_mod_01 .. 07

  • Sets variable modifications (variable_mod_01 to variable_mod_07). Values are space-separated: the first is the modification mass and the second is the residues it modifies (specified consecutively as a string).
* is used to represent any amino acid
^ is used to represent a terminus
[ is a modifier for protein N-terminal
] is a modifier for protein C-terminal
n is a modifier for peptide N-terminal
c is a modifier for peptide C-terminal
  • Syntax Examples:
15.9949 M (for oxidation on methionine)
79.96633 STY (for phosphorylation)
-17.0265 nQnC (for pyro-Glu or loss of ammonia at peptide N-terminal)
n^ (puts the modification on the peptide N-terminus itself)
n* (puts the modification on any amino acid located at the peptide N-terminus, whereas e.g. -17 nQ puts it on N-terminal Q only. The results may not differ, but the number of peptide candidates generated for scoring differs greatly.)
  • Default:
variable_mod_01 = 15.9949 M
variable_mod_02 = 42.0106 [^
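The value syntax above can be split into a mass and a list of sites with a few lines of code. This is a sketch under an assumption: the tokenizer treats each of `[`, `]`, `n`, and `c` as a modifier that binds to the single character following it, which matches the examples on this page but is not taken from MSFragger's parser. The name `parse_variable_mod` is hypothetical.

```python
def parse_variable_mod(value):
    """Split a variable_mod value into (mass, list of sites).

    Example: '-17.0265 nQnC' -> (-17.0265, ['nQ', 'nC'])
    Assumption: '[', ']', 'n' and 'c' modify the next character.
    """
    mass_text, sites_text = value.split(maxsplit=1)
    sites_text = sites_text.strip()
    sites = []
    i = 0
    while i < len(sites_text):
        ch = sites_text[i]
        if ch in "[]nc" and i + 1 < len(sites_text):
            # Terminal modifier plus the residue or symbol it applies to
            sites.append(ch + sites_text[i + 1])
            i += 2
        else:
            # Plain residue, '*' or '^'
            sites.append(ch)
            i += 1
    return float(mass_text), sites


oxidation = parse_variable_mod("15.9949 M")
acetylation = parse_variable_mod("42.0106 [^")
pyro_glu = parse_variable_mod("-17.0265 nQnC")
```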

allow_multiple_variable_mods_on_residue

  • Allow each amino acid to be modified by multiple variable modifications (0 or 1).
  • Default: 1

max_variable_mods_per_mod

  • Maximum number of residues that can be occupied by each variable modification (maximum of 5).
  • Default: 3

max_variable_mods_combinations

  • Maximum number of variably modified peptides generated from each peptide sequence (maximum of 65534). If more than this number are generated, only the unmodified peptide is considered.
  • Default: 5000

Spectral Processing

minimum_peaks

  • Minimum number of peaks in experimental spectrum for matching.
  • Default: 15

use_topN_peaks

  • Pre-process experimental spectrum to only use top N peaks.
  • Default: 150

minimum_ratio

  • Filters out all peaks in experimental spectrum less intense than this multiple of the base peak intensity.
  • Default: 0.01

clear_mz_range

  • Removes peaks in this m/z range prior to matching. Useful for iTRAQ/TMT experiments (e.g., 0.0 150.0).
  • Default: 0.0 0.0
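The four peak-filtering parameters above can be pictured as a small pipeline. The sketch below is illustrative only: the function name `preprocess`, the order in which the filters run, and the inclusive range bounds are all assumptions, since this page does not specify them.

```python
def preprocess(peaks, minimum_peaks=15, use_top_n=150,
               minimum_ratio=0.01, clear_mz_range=(0.0, 0.0)):
    """Filter a spectrum given as a list of (mz, intensity) tuples.

    Returns the kept peaks sorted by m/z, or None when fewer than
    `minimum_peaks` remain. Filter order is an assumption.
    """
    low, high = clear_mz_range
    if high > low:
        # Drop reporter-ion style regions, e.g. 0.0 to 150.0
        peaks = [(mz, inten) for mz, inten in peaks
                 if not (low <= mz <= high)]
    if not peaks:
        return None

    # Remove peaks weaker than minimum_ratio times the base peak
    base = max(inten for _, inten in peaks)
    peaks = [p for p in peaks if p[1] >= minimum_ratio * base]

    # Keep only the N most intense peaks
    peaks = sorted(peaks, key=lambda p: p[1], reverse=True)[:use_top_n]

    if len(peaks) < minimum_peaks:
        return None
    return sorted(peaks)


spectrum = [(100.0, 50.0), (200.0, 1000.0), (300.0, 5.0), (400.0, 500.0)]
kept = preprocess(spectrum, minimum_peaks=1, use_top_n=2)
```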

excluded_scan_list_file

  • Path to a text file containing scan names. If the path is not empty, MSFragger skips those scans. To leave this parameter unused, comment out or delete its name and value.
  • Default: this parameter is commented out in the fragger.params file.

Open Search

track_zero_topN

  • Track top N unmodified peptide results separately from main results internally for boosting features. Should be set to a number greater than output_report_topN if zero bin boosting is desired.
  • Default: 0

zero_bin_accept_expect

  • Ranks a zero-bin hit above all non-zero-bin hits if it has an expectation value less than this value.
  • Default: 0.0

zero_bin_mult_expect

  • Multiplies expect value of PSMs in the zero-bin during results ordering (set to less than 1 for boosting).
  • Default: 1.00
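One way to picture the two zero-bin boosting parameters is as a sort key applied when ordering candidate hits for a spectrum. The sketch below is a loose interpretation of the descriptions above, not MSFragger's implementation; the function name `order_hits` and the dictionary layout of a hit are hypothetical.

```python
def order_hits(hits, zero_bin_accept_expect=0.0, zero_bin_mult_expect=1.0):
    """Order hits from best to worst with zero-bin boosting.

    hits: list of dicts with keys 'expect' (float) and
          'zero_bin' (True for unmodified, zero mass shift hits).
    """
    def sort_key(hit):
        if hit["zero_bin"]:
            if hit["expect"] < zero_bin_accept_expect:
                # Accepted outright: ranked above every non-zero-bin hit
                return (0, hit["expect"])
            # Otherwise the expect value is scaled before comparison
            return (1, hit["expect"] * zero_bin_mult_expect)
        return (1, hit["expect"])

    return sorted(hits, key=sort_key)


hits = [
    {"id": "modified", "expect": 0.001, "zero_bin": False},
    {"id": "unmodified", "expect": 0.005, "zero_bin": True},
]
plain = order_hits(hits)
boosted = order_hits(hits, zero_bin_mult_expect=0.1)
```

With the defaults no boosting occurs and the modified hit ranks first; a multiplier of 0.1 lets the unmodified hit win.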

add_topN_complementary

  • Inserts complementary ions corresponding to the top N most intense fragments in each experimental spectrum. Useful for recovery of modified peptides near the C-terminus in open search. Should be set to 0 (disabled) otherwise.
  • Default: 0

Modeling And Output

min_fragments_modelling

  • Minimum number of matched peaks in PSM for inclusion in statistical modeling.
  • Default: 2

min_matched_fragments

  • Minimum number of matched peaks for PSM to be reported. We recommend a minimum of 4 for narrow window searching and 6 for open searches.
  • Default: 4

output_file_extension

  • File extension of output files.
  • Default: pepXML

output_format

  • File format of output files (pepXML or tsv).
  • Default: pepXML

output_report_topN

  • Reports top N PSMs per input spectrum.
  • Default: 1

output_max_expect

  • Suppresses reporting of PSM if top hit has expectation greater than this threshold.
  • Default: 50.0

report_alternative_proteins

  • Report alternative proteins for peptides that are found in multiple proteins (0 for no, 1 for yes).
  • Default: 0

Static Modifications

add_Cterm_peptide

  • Statically add mass in Da to C-terminal of peptide.
  • Default: 0.0

add_Nterm_peptide

  • Statically add mass in Da to N-terminal of peptide.
  • Default: 0.0

add_Cterm_protein

  • Statically add mass in Da to C-terminal of protein.
  • Default: 0.0

add_Nterm_protein

  • Statically add mass in Da to N-terminal of protein.
  • Default: 0.0

add_C_cysteine

add_X_usertext

  • Statically add mass to cysteine (or whatever amino acid is specified after ‘add_’).
  • Default: 0.0
  • Examples:
add_C_cysteine = 57.021464
add_K_lysine = 144.1021
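To illustrate how per-residue static modifications apply, the sketch below reads `add_X_usertext = mass` lines and sums the static mass added to a peptide. The helper names are hypothetical, treating `#` as a comment marker is an assumption about the parameter file, and terminal parameters such as add_Cterm_peptide are deliberately ignored here.

```python
def parse_static_mods(lines):
    """Collect per-residue static masses from 'add_X_name = mass' lines.

    Returns a dict mapping a one-letter residue code to its added mass.
    Assumption: text after '#' on a line is a comment.
    """
    mods = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        name, value = (part.strip() for part in line.split("=", 1))
        parts = name.split("_")
        # Only names shaped like add_C_cysteine: a single uppercase letter
        if (len(parts) == 3 and parts[0] == "add"
                and len(parts[1]) == 1 and parts[1].isupper()):
            mods[parts[1]] = float(value)
    return mods


def static_mass_shift(peptide, mods):
    """Total static mass added to a peptide by residue modifications."""
    return sum(mods.get(residue, 0.0) for residue in peptide)


params = [
    "add_C_cysteine = 57.021464",
    "add_K_lysine = 144.1021",
    "add_Cterm_peptide = 0.0",
]
mods = parse_static_mods(params)
shift = static_mass_shift("ACCK", mods)
```

For the peptide ACCK, the shift is two cysteine additions plus one lysine addition.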