Fengchao edited this page Aug 5, 2021 · 12 revisions

Introduction

Welcome to the MSFragger Wiki!

MSFragger is an ultrafast database search tool for peptide identification in mass spectrometry-based proteomics. MSFragger has demonstrated excellent performance across a wide range of datasets and applications. The speed of MSFragger makes it particularly suitable for the analysis of large datasets (including timsTOF data), for enzyme-unconstrained searches, and for ‘open’ database searches (with the precursor mass tolerance set to hundreds of Daltons) for the identification of modified peptides.

MSFragger can be used in three different ways:

  1. With FragPipe
  2. Through ProteomeDiscoverer
  3. As a standalone Java executable (JAR) file on the command line, or through Philosopher.

In each case, you will need the latest MSFragger JAR file. Example parameter files (for closed, open, and non-specific searches) can be found here.
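For standalone use, the command line can be assembled as in the following minimal sketch. It assumes the usual `java -Xmx<N>G -jar <jar> <params> <spectra...>` form; the helper name `build_msfragger_command`, the JAR name `MSFragger.jar`, the parameter file name `fragger.params`, and the sample file names are all hypothetical placeholders, so substitute the files you actually downloaded.

```python
from typing import List


def build_msfragger_command(
    jar_path: str,
    params_path: str,
    spectra: List[str],
    heap_gb: int = 16,
) -> List[str]:
    """Return an argument list for a standalone MSFragger run.

    Assumption: the JAR takes a parameter file followed by one or more
    spectral files. -Xmx is the standard JVM flag for the maximum heap size.
    """
    if heap_gb <= 0:
        raise ValueError("heap_gb must be a positive number of gigabytes")
    if not spectra:
        raise ValueError("at least one spectral file is required")
    return ["java", f"-Xmx{heap_gb}G", "-jar", jar_path, params_path, *spectra]


# Hypothetical file names, for illustration only.
cmd = build_msfragger_command(
    "MSFragger.jar",
    "fragger.params",
    ["sample1.mzML", "sample2.mzML"],
    heap_gb=16,
)
print(" ".join(cmd))
```

The list form can be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems with paths that contain spaces.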

MSFragger is implemented in the cross-platform Java programming language, and is compatible with standard open file formats for mass spectrometry data (mzXML/mzML). The standalone JAR file and the ProteomeDiscoverer node now support reading Thermo RAW files directly. MSFragger writes output in either tabular or pepXML format, making it fully compatible with downstream data analysis pipelines such as the Trans-Proteomic Pipeline and Philosopher.

Requirements

Input files

MSFragger can search Thermo/Sciex .mzML (conversion tutorial here), Thermo .raw, and Bruker timsTOF .d spectral file formats. For compatibility with complete analysis workflows, see the table on this page.
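As a quick pre-flight check, file names can be matched against the formats named on this page before a search is started. This is a minimal sketch that only looks at the extension; the function name `spectral_format` and the example file names are hypothetical, and the check says nothing about whether a file is actually readable.

```python
from pathlib import Path
from typing import Optional

# Extensions taken from this page. .mzXML comes from the introduction,
# the other three from the "Input files" list.
KNOWN_FORMATS = {
    ".mzml": "mzML",
    ".mzxml": "mzXML",
    ".raw": "Thermo RAW",
    ".d": "Bruker timsTOF .d",
}


def spectral_format(path: str) -> Optional[str]:
    """Return the format label for a path, or None if the extension is not listed."""
    return KNOWN_FORMATS.get(Path(path).suffix.lower())


for name in ["run01.mzML", "run02.raw", "run03.d", "notes.txt"]:
    print(name, "->", spectral_format(name))
```

The lookup is case-insensitive, so `RUN01.MZML` and `run01.mzML` are treated alike.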

Hardware

MSFragger requires a substantial amount of memory because it keeps its fragment index in memory. We recommend at least 8-16 GB, but complex closed searches and open searches will require more. Specifying additional variable modifications, or performing semi-tryptic, non-enzymatic, or phospho searches, may increase the requirement further.

The processor requirements of MSFragger depend on the complexity of your search (and your patience in waiting for search results). For an open search (500 Da precursor mass window) using a tryptic digest of the human proteome, a single processor core can search roughly 40,000 MS/MS spectra in under an hour. MSFragger scales well with the number of processor cores, and runtimes of under 2 minutes per file have been achieved using a 28-core workstation. A desktop workstation with a quad-core processor is sufficient for most simple workflows.
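The figures above allow a rough back-of-the-envelope runtime estimate. The sketch below takes the quoted rate of about 40,000 spectra per core-hour for an open search and assumes ideal linear scaling across cores, which is an assumption of this example and not a measured property. Since the quoted rate is "under an hour", the result is best read as a loose upper bound. The function name `estimate_search_minutes` is hypothetical.

```python
def estimate_search_minutes(
    n_spectra: int,
    n_cores: int,
    spectra_per_core_hour: float = 40_000,
) -> float:
    """Estimate wall-clock minutes for a search, assuming linear scaling with cores."""
    if n_cores <= 0:
        raise ValueError("n_cores must be positive")
    if n_spectra < 0:
        raise ValueError("n_spectra must be non-negative")
    hours = n_spectra / (spectra_per_core_hour * n_cores)
    return hours * 60


# One file of 40,000 MS/MS spectra on machines of different sizes.
for cores in (1, 4, 28):
    minutes = estimate_search_minutes(40_000, cores)
    print(f"{cores:>2} cores: about {minutes:.1f} min")
```

For 28 cores the estimate comes out at a little over 2 minutes, which is in line with the reported runtimes of under 2 minutes per file.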

Software

Operating System requirements

MSFragger has been tested on Mac OS X, Windows 7, and a number of Linux distributions. Note that a 64-bit operating system is required to access more than 4 GB of memory.

Java requirements

MSFragger is written using Java 1.8 and requires the Java 8 Runtime Environment. We recommend the Oracle Java 8 Runtime Environment (download and installation instructions are available at www.java.com).
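To confirm which Java runtime is on the PATH, the output of `java -version` can be parsed. The sketch below relies on two well-known conventions: `java -version` writes its banner to standard error, and Java 8 and earlier report themselves as `1.x` (for example `1.8.0_301`), whereas later releases report the major version first (for example `11.0.2`). The function names `parse_java_major` and `installed_java_major` are hypothetical.

```python
import re
import subprocess
from typing import Optional


def parse_java_major(version_output: str) -> Optional[int]:
    """Extract the major Java version from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    # Legacy scheme: "1.8.0_301" means Java 8.
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def installed_java_major() -> Optional[int]:
    """Run `java -version` and return the major version, or None if unavailable."""
    try:
        result = subprocess.run(
            ["java", "-version"], capture_output=True, text=True, check=False
        )
    except FileNotFoundError:
        return None  # no java executable on the PATH
    # The banner normally goes to stderr; stdout is checked as a fallback.
    return parse_java_major(result.stderr + result.stdout)


if __name__ == "__main__":
    major = installed_java_major()
    if major is None:
        print("No Java runtime found on the PATH")
    elif major < 8:
        print(f"Java {major} found; Java 8 is required")
    else:
        print(f"Java {major} found")
```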