Calculation of QC metrics from mass spectrometry data
Data quality assessment is an integral part of preparatory data analysis to ensure sound biological information retrieval.
We present here the MsQuality package, which provides functionality to
calculate quality metrics for mass spectrometry-derived spectral data at the
per-sample level.
MsQuality relies on the framework of quality metrics defined by the Human
Proteome Organization-Proteomics Standards Initiative (HUPO-PSI). These metrics
quantify the quality of spectral raw files using a controlled vocabulary.
The package is aimed especially at users who acquire mass spectrometry data
on a large scale (e.g. data sets from clinical settings consisting of several
thousand samples). While it is easier to control for high-quality data
acquisition in small-scale experiments, which are typically run in one or a
few batches, clinical data sets are often acquired over longer time frames
and are prone to higher technical variation.
MsQuality tries to address this problem by calculating metrics that
can be stored along with the spectral data sets (raw files or
feature-extracted data). MsQuality thus facilitates the tracking of shifts in
data quality and quantifies the quality using multiple metrics. This should
make it easier to identify samples of low quality (high number of missing
values, termination of chromatographic runs, low instrument sensitivity, etc.).
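One way to picture how stored per-sample metrics help to spot low-quality samples is a simple outlier screen. The following is a minimal sketch in base R, not part of MsQuality: the metric table `qc`, its values, the helper `robust_z`, and the cutoff of 3 are all assumptions made for illustration.

```r
## Hypothetical per-sample metric table (values are made up for illustration).
qc <- data.frame(
    sample = paste0("sample_", 1:6),
    tic_sum = c(1.00e6, 1.10e6, 0.95e6, 1.05e6, 0.20e6, 1.02e6),
    rt_duration = c(600, 598, 601, 600, 310, 599),
    stringsAsFactors = FALSE
)

## Hypothetical helper (not an MsQuality function): robust z-score
## based on the median and the median absolute deviation (MAD).
robust_z <- function(v) (v - median(v)) / mad(v)

## Calculate robust z-scores per metric (one column per metric).
z <- sapply(qc[, c("tic_sum", "rt_duration")], robust_z)

## Flag samples where at least one metric deviates by more than 3 units.
flagged <- qc$sample[apply(abs(z) > 3, 1, any)]
flagged
```

In this toy table only the fifth sample, which has a low summed ion current and a shortened run, is flagged. The cutoff is arbitrary and would need to be chosen per metric and per data set.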
The MsQuality package allows the calculation of low-level quality metrics that
require only minimal information on mass spectrometry data: retention time,
m/z values, and associated intensities.
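To illustrate what "low-level" means here, the sketch below derives a few simple summaries from nothing but retention times, m/z values, and intensities. It uses base R only; the toy peak table, the helper `low_level_metrics`, and the metric names are hypothetical and do not reproduce the MsQuality implementation or the mzQC definitions.

```r
## Toy peak table for one sample, one row per peak (hypothetical data).
peaks <- data.frame(
    scan      = c(1L, 1L, 2L, 2L, 3L),
    rtime     = c(10, 10, 20, 20, 40),   # retention time in seconds
    mz        = c(100.1, 250.2, 100.1, 300.3, 150.5),
    intensity = c(200, 300, 1000, 500, 100)
)

## Hypothetical helper; not a function of the MsQuality package.
low_level_metrics <- function(x) {
    ## total ion current (TIC) per scan: sum of the peak intensities
    tic <- tapply(x$intensity, x$scan, sum)
    ## one retention time per scan
    rt <- tapply(x$rtime, x$scan, function(r) r[1])
    c(
        number_spectra = length(tic),        # number of scans
        rt_duration    = max(rt) - min(rt),  # length of the run
        mz_min         = min(x$mz),          # lower end of the m/z range
        mz_max         = max(x$mz),          # upper end of the m/z range
        tic_sum        = sum(tic)            # summed TIC over all scans
    )
}

metrics <- low_level_metrics(peaks)
metrics
```

None of these values needs identification or alignment results, which is why such summaries can be computed directly from the spectral data of a single sample.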
The list of metrics included in the mzQC framework is extensive. It also
includes metrics that rely on higher-level information that might not be
readily accessible from .raw or .mzML files (e.g. pump pressure mean) or
that rely on alignment results (e.g. retention time mean shift,
signal-to-noise ratio, precursor errors (ppm)).
The MsQuality package is built upon the Spectra and the MsExperiment packages.
Metrics are calculated based on the information stored in a Spectra object;
the spectral data of each sample should therefore be stored in a
Spectra object. The MsExperiment object serves as a container to
store the mass spectral data of multiple samples.
MsQuality enables the user to calculate quality metrics both on Spectra and
on MsExperiment objects.
You are welcome to
- write a mail to <thomasnaake (at) googlemail (dot) com>
- submit suggestions and issues: https://github.com/tnaake/MsQuality/issues
- send a pull request: https://github.com/tnaake/MsQuality/issues
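if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
if (!requireNamespace("remotes", quietly = TRUE))
    install.packages("remotes")
Install the MsQuality package then via
## to install from Bioconductor
BiocManager::install("MsQuality")
## to install the development version from GitHub
remotes::install_github("tnaake/MsQuality")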