MascotQCexample

A single-script QC dashboard for Mascot Server search results, designed to run as a Mascot Daemon external task after each search completes.

This project was developed as a demonstration of AI-assisted development with Mascot using Claude Code and the msparser SDK. See the companion mascot-parser-skill repository for the Claude skill that drives Mascot Parser usage, and AI_DEVELOPMENT_GUIDE.md for the prompt-by-prompt account of how this script was built.

[Image: QC dashboard preview]

What it produces

A single PNG dashboard per search containing:

| Panel | Description |
| --- | --- |
| ID count vs RT | Number of significant PSMs per retention-time bin |
| TIC strip | 1-D gel-style precursor ion current across the gradient |
| RT vs m/z scatter | Every query plotted; identified PSMs coloured by Percolator score (-10 log10 PEP), unidentified shown as pale ghosts |
| Intensity vs score hexbin | Precursor intensity vs Percolator score density map (NPG palette) |
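
The "ID count vs RT" panel amounts to a histogram over retention time. The following is a minimal sketch of that binning using only the standard library. The function name, the 60-second default bin width, and the use of seconds as the unit are assumptions for illustration, not taken from mascot_qc_report.py.

```python
from collections import Counter


def count_ids_per_rt_bin(retention_times, bin_width=60.0):
    """Count PSMs per retention-time bin.

    retention_times: iterable of RT values for significant PSMs
                     (assumed here to be in seconds).
    bin_width:       bin size in the same unit (hypothetical default).
    Returns a dict mapping the start of each non-empty bin to its count.
    """
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    # Integer bin index for each RT, then tally the indices.
    counts = Counter(int(rt // bin_width) for rt in retention_times)
    return {index * bin_width: n for index, n in sorted(counts.items())}
```

Empty bins are omitted from the result, so a plotting routine would need to fill them with zeros if a continuous bar chart is wanted.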

When Percolator / MS2Rescore output is available the script loads it via the published ms_peptidesummary + MSPEPSUM_PERCOLATOR API path and filters PSMs by q-value. Without Percolator it falls back to raw Mascot ions scores.
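
The selection logic described above can be sketched as follows. This is an illustration under stated assumptions rather than the script's actual code: the `Psm` record and `select_psms` function are hypothetical, and the fallback is modelled on the `--rank1` option (keep every rank-1 match and score it by raw ions score).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Psm:
    """Hypothetical record for one peptide-spectrum match."""
    query: int
    rank: int
    score: float                     # Percolator score or raw ions score
    q_value: Optional[float] = None  # None when no Percolator output exists


def select_psms(psms: List[Psm], fdr: float = 0.01) -> List[Psm]:
    """Keep PSMs at or below the q-value cutoff when q-values are present;
    otherwise fall back to all rank-1 matches (an assumed fallback rule)."""
    have_percolator = any(p.q_value is not None for p in psms)
    if have_percolator:
        return [p for p in psms if p.q_value is not None and p.q_value <= fdr]
    return [p for p in psms if p.rank == 1]
```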

Requirements

  • Python 3.9+
  • msparser Python bindings
  • numpy, matplotlib, seaborn
  • mpl-scatter-density (optional, for --render density mode)

Installing msparser (Python)

Download the msparser SDK from https://www.matrixscience.com/msparser_download.html

Extract the archive and copy the Python bindings into your Python environment's site-packages directory:

Windows:

<site-packages>\
    msparser.py
    _msparser.pyd

Linux:

<site-packages>/
    msparser.py
    _msparser.so

Find your site-packages path with:

python -c "import site; print(site.getsitepackages()[0])"

Alternatively, set the PYTHONPATH environment variable to point to the msparser bindings directory.
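
To confirm that Python can see the bindings after either approach, a quick check along these lines may help. It uses only the standard library; `locate_module` is a hypothetical helper name, and the check only shows where the module would be loaded from, not that the native library loads successfully.

```python
import importlib.util


def locate_module(name):
    """Return the file a top-level module would be imported from,
    or None if Python cannot find it on the current search path."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Example: print(locate_module("msparser"))
# None means msparser.py is not on sys.path / PYTHONPATH.
```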

Windows DLL dependencies: msparser requires the Microsoft Visual C++ Redistributable. If you encounter DLL loading errors, install the latest VC++ Redistributable from Microsoft.

Installing Python dependencies

pip install numpy matplotlib seaborn mpl-scatter-density

Usage

From the command line

python mascot_qc_report.py <result_url_or_path> <output_directory>

Example:

python mascot_qc_report.py \
  "http://localhost/mascot/cgi/master_results_2.pl?file=..%2Fdata%2F20260413%2FF019423.msr;MS2Rescore.ms2pip_model=HCD2019;percolate=1" \
  "C:\QC_output"
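
The result URL in the example carries the result file path and the MS2Rescore parameters in a semicolon-separated query string. A minimal sketch of pulling those apart with the standard library is shown below. `parse_result_url` is a hypothetical helper, and how the real script interprets each parameter is not specified here.

```python
import re
from urllib.parse import unquote, urlsplit


def parse_result_url(url):
    """Split a Mascot results URL query into a {name: value} dict.

    Accepts both ';' and '&' as separators and percent-decodes
    names and values.
    """
    query = urlsplit(url).query
    params = {}
    for pair in re.split(r"[;&]", query):
        if not pair:
            continue
        key, _, value = pair.partition("=")
        params[unquote(key)] = unquote(value)
    return params
```

Applied to the example URL above, the `file` entry decodes to `../data/20260413/F019423.msr`, a path relative to the Mascot CGI directory.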

Mascot Server filesystem access

The script needs filesystem access to the Mascot Server's data directory to open .msr result files, read mascot.dat, and load Percolator .pop cache files. This is a limitation of the msparser createResfile() API, which accepts only local file paths.

In practice this means one of:

  • Same machine: Mascot Daemon and Mascot Server run on the same host (the most common deployment).
  • Network share: The Mascot Server data directory is mapped as a network drive or UNC path on the Daemon machine.

Paths are resolved from the MASCOT_HOME environment variable or individual overrides:

| Variable | Default (Windows) | Default (Linux) | Description |
| --- | --- | --- | --- |
| MASCOT_HOME | C:\inetpub\mascot | /var/www/mascot | Root of the Mascot Server installation |
| MASCOT_DATA_DIR | <MASCOT_HOME>\data | | Result file directory |
| MASCOT_CACHE_DIR | <MASCOT_HOME>\data\cache | | Percolator pip/pop file cache |
| MASCOT_CGI_DIR | <MASCOT_HOME>\cgi | | CGI directory (for msparser path resolution) |
| MASCOT_DAT | <MASCOT_HOME>\config\mascot.dat | | Path to mascot.dat |

For a network-share deployment, set MASCOT_HOME to the UNC path or mapped drive letter, e.g. MASCOT_HOME=\\mascotserver\mascot.
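
The resolution order described above (individual override first, then a default derived from MASCOT_HOME) could look roughly like this. It is a sketch under assumptions: `resolve_mascot_paths` is a hypothetical name, and the Linux sub-directory defaults are assumed to mirror the Windows layout, which the documentation above does not state explicitly.

```python
import ntpath
import posixpath


def resolve_mascot_paths(env, windows):
    """Resolve Mascot paths from an environment mapping.

    env:     mapping such as os.environ.
    windows: True to use Windows path rules and defaults.
    """
    path = ntpath if windows else posixpath
    default_home = r"C:\inetpub\mascot" if windows else "/var/www/mascot"
    home = env.get("MASCOT_HOME") or default_home
    # Assumption: the same relative layout applies on both platforms.
    defaults = {
        "MASCOT_DATA_DIR": path.join(home, "data"),
        "MASCOT_CACHE_DIR": path.join(home, "data", "cache"),
        "MASCOT_CGI_DIR": path.join(home, "cgi"),
        "MASCOT_DAT": path.join(home, "config", "mascot.dat"),
    }
    resolved = {"MASCOT_HOME": home}
    for name, default in defaults.items():
        # An explicitly set variable wins over the derived default.
        resolved[name] = env.get(name) or default
    return resolved
```

In a real script this would typically be called as `resolve_mascot_paths(os.environ, os.name == "nt")`.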

From Mascot Daemon (Execute after search)

| Field | Value |
| --- | --- |
| Program | C:\path\to\python.exe |
| Arguments | "C:\path\to\mascot_qc_report.py" "<resulturl>" "<task_directory>" |

Daemon expands <resulturl> to the full results URL (including any MS2Rescore adapter parameters) and <task_directory> to the task folder where the MGF and other task files reside. The QC PNG is saved there as <resultfile>_qc.png.

Options

| Flag | Default | Description |
| --- | --- | --- |
| --significant | yes | Filter by Percolator q-value (requires .target.pop in cache) |
| --rank1 | | Plot all rank-1 matches using raw ions score |
| --fdr N | 0.01 | q-value cutoff for --significant mode |
| --palette NAME | lancet | Colour palette for the main score scale (see below) |
| --render MODE | scatter | scatter, density, or hexbin for the main panel |
| --adapter-param | from URL | ML adapter parameter, repeatable |
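
For readers who want to see how these options fit together, here is a hedged argparse sketch that reproduces the documented flags and defaults. The internal names (`mode`, `adapter_param`) and the choice to make --significant and --rank1 mutually exclusive are assumptions. The real script may be organised differently.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented command-line interface."""
    parser = argparse.ArgumentParser(prog="mascot_qc_report.py")
    parser.add_argument("result", help="result URL or local path")
    parser.add_argument("output_directory", help="where the PNG is written")

    # Assumption: the two filtering modes cannot be combined.
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("--significant", dest="mode", action="store_const",
                      const="significant", help="filter by Percolator q-value")
    mode.add_argument("--rank1", dest="mode", action="store_const",
                      const="rank1", help="plot all rank-1 matches")
    parser.set_defaults(mode="significant")

    parser.add_argument("--fdr", type=float, default=0.01, metavar="N",
                        help="q-value cutoff for --significant mode")
    parser.add_argument("--palette", default="lancet", metavar="NAME")
    parser.add_argument("--render", default="scatter", metavar="MODE",
                        choices=["scatter", "density", "hexbin"])
    # None means "take adapter parameters from the URL".
    parser.add_argument("--adapter-param", action="append", default=None,
                        metavar="PARAM", help="repeatable")
    return parser
```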

Available palettes

Journal palettes (from ggsci): npg, aaas, nejm, lancet, jama, jco, bmj, frontiers

Colour-blind safe: okabe_ito (Wong, Nature Methods 8, 441, 2011)

Perceptual (matplotlib built-in): viridis, magma, plasma, cividis

Percolator integration

The script follows the published msparser workflow for loading Percolator scores (see msparser documentation, "Using Percolator scores"):

  1. chdir to the Mascot CGI directory (msparser resolves paths relative to CWD).
  2. Open the result file with a writable cache directory.
  3. Call setPercolatorFeatures(mascot_options, '', adapter_params) after setting PercolatorExeFlags via getPercolatorRtFlags().
  4. Retrieve expected .pip / .pop filenames from getPercolatorFileNames().
  5. If the hash-derived filenames do not match the files the server generated (option drift between runs), copy the real cache files to the expected names -- the workaround recommended in the msparser docs.
  6. Create ms_peptidesummary with MSPEPSUM_PERCOLATOR in flags2.
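
Steps 1 and 5 involve no msparser calls and can be sketched with the standard library alone. The helpers below are hypothetical: `working_directory` changes into the CGI directory and restores the previous one afterwards, and `ensure_expected_cache_file` copies an existing cache file to the name msparser expects. How the "real" cache file is located is left to the caller, since that detail is not described here.

```python
import os
import shutil
from contextlib import contextmanager


@contextmanager
def working_directory(path):
    """Temporarily chdir to `path`, restoring the old CWD on exit."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)


def ensure_expected_cache_file(actual_path, expected_path):
    """Copy a server-generated cache file to the hash-derived name.

    Returns True if a copy was made, False if the expected file
    already existed. Raises FileNotFoundError if neither exists.
    """
    if os.path.exists(expected_path):
        return False
    if not os.path.exists(actual_path):
        raise FileNotFoundError(actual_path)
    shutil.copy2(actual_path, expected_path)  # preserves timestamps
    return True
```

Copying rather than renaming leaves the server's own cache untouched, which seems the safer choice when the cache directory is shared with a running Mascot Server.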

With the flag set, pep.getIonsScore() returns -10 * log10(PEP) and the identity threshold is a fixed 13 (-10 * log10(0.05)).
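
The arithmetic behind that threshold is easy to reproduce. The sketch below converts a posterior error probability (PEP) to the score scale described above. The helper names are hypothetical, and treating a score equal to the threshold as identified is an assumption.

```python
import math

IDENTITY_THRESHOLD = 13.0  # fixed threshold quoted above


def pep_to_score(pep):
    """Convert a posterior error probability to -10 * log10(PEP)."""
    if not 0.0 < pep <= 1.0:
        raise ValueError("PEP must be in (0, 1]")
    return -10.0 * math.log10(pep)


def is_identified(pep):
    """True when the converted score reaches the identity threshold
    (inclusive comparison is an assumption)."""
    return pep_to_score(pep) >= IDENTITY_THRESHOLD
```

A PEP of 0.05 maps to a score of about 13.01, which is where the fixed threshold of 13 comes from. A PEP of 0.01 maps to 20.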

Colour conventions

The dashboard uses two distinct colour families so that "high colour" is unambiguous:

  • Main scatter (score): Lancet palette (navy - cyan - green - peach - red)
  • Hexbin density (PSM count): NPG palette (dark blue - cyan - teal - salmon - red)

Both palettes are sourced from the ggsci R package, whose palettes are modelled on the colour schemes of scientific journals (The Lancet and Nature Publishing Group, respectively).
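
To illustrate how a handful of discrete journal colours can become a continuous scale, here is a pure-Python sketch of piecewise-linear interpolation between colour stops. The hex values are the NPG colours as recalled from ggsci, in the order listed above, and should be checked against the package before reuse. The script itself presumably builds a matplotlib colormap instead, so this shows the idea only.

```python
# Assumed NPG stops: dark blue, cyan, teal, salmon, red (verify against ggsci).
NPG_STOPS = ["#3C5488", "#4DBBD5", "#00A087", "#F39B7F", "#E64B35"]


def hex_to_rgb(value):
    """Convert '#RRGGBB' to an (r, g, b) tuple of ints in 0..255."""
    value = value.lstrip("#")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


def interpolate_palette(stops, t):
    """Map t in [0, 1] to an RGB colour by linear interpolation
    between evenly spaced stops (at least two are required)."""
    if len(stops) < 2:
        raise ValueError("need at least two colour stops")
    t = min(max(t, 0.0), 1.0)  # clamp out-of-range values
    colours = [hex_to_rgb(s) for s in stops]
    position = t * (len(colours) - 1)
    index = min(int(position), len(colours) - 2)
    fraction = position - index
    low, high = colours[index], colours[index + 1]
    return tuple(round(a + (b - a) * fraction) for a, b in zip(low, high))
```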

License

Copyright (C) 2026 Matrix Science Limited. All Rights Reserved.

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

About

An example QC plot that is generated from a DDA or DIA search result using the mascot-parser skill.
