MultiQC_NGI is a plugin for MultiQC, providing additional tools which are specific to the National Genomics Infrastructure at the Science for Life Laboratory in Stockholm, Sweden.
For more information about NGI, see http://www.scilifelab.se/platforms/ngi/
For more information about MultiQC, see http://multiqc.info
This plugin provides two extra templates, ngi and genstat.
ngi is a
stand-alone template, much like the default MultiQC template but with additional
genstat produces a template suitable for tornado, allowing reports to be directly integrated into our internal sample tracking website, Genomics Status. Both templates can print data specific to this plugin (see below).
Pulling from StatusDB
This plugin connects MultiQC to our internal sample tracking database, statusdb.
Firstly, it retrieves information from statusdb to put into the report:
- Looks at sample names in the General Stats table for something that looks like an NGI project number (e.g. P1234). Bails if none, or more than one, is found.
- Connects to statusdb and searches projects for this. Bails if not found.
- Retrieves project level information to be printed at the top of the report
- Goes through general stats table looking for sample identifiers (
- Searches statusdb for each of these and tries to pull interesting fields if possible:
  - Initial QC RIN score
  - Amount of sample taken for library prep
  - Concentration of prepared library
  - NB: These are skipped if multiple library preps are found.
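The project-detection step in the list above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: the function name `find_project_id` is hypothetical, and the pattern (a capital `P` followed by digits) is an assumption based only on the `P1234` example.

```python
import re

# Assumed pattern: a capital "P" followed by digits (e.g. "P1234"),
# not preceded by another letter or digit. The real format may differ.
PROJECT_ID_RE = re.compile(r"(?<![A-Za-z0-9])P\d+")


def find_project_id(sample_names):
    """Return the single project ID found in sample_names, or None.

    Gives up (returns None) if no project ID is found, or if more
    than one distinct project ID is found.
    """
    found = set()
    for name in sample_names:
        found.update(PROJECT_ID_RE.findall(name))
    if len(found) != 1:
        return None
    return found.pop()
```

Collecting matches into a set means that many samples from the same project still count as a single project, while a report that mixes projects is rejected.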
Pushing to StatusDB
As well as retrieving data, MultiQC_NGI can push data back to statusdb. This allows us to do cross-project meta-analyses, tracking the bioinformatics statistics across everything we run.
- If pulling data has worked, we already know the project and sample IDs
- Either pushes or updates records in the analysis database, using data saved by all MultiQC modules available in
This is dependent on either
doesn't run by default.
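The "push or update" behaviour can be illustrated with a small sketch. Everything here is hypothetical: a plain dictionary stands in for the database, and the function name, key layout, and return values are assumptions made for illustration rather than the plugin's real interface.

```python
def push_or_update(db, project_id, sample_id, stats):
    """Create a record for a sample, or update the existing one.

    `db` is a plain dict standing in for the real database (an
    assumption for this sketch); keys are (project_id, sample_id).
    Returns "created" or "updated" to say which path was taken.
    """
    key = (project_id, sample_id)
    if key in db:
        # A record already exists: merge the new statistics into it.
        db[key].update(stats)
        return "updated"
    # No record yet: store a copy so later changes to `stats` don't leak in.
    db[key] = dict(stats)
    return "created"
```

Running a report twice for the same sample then refreshes the stored statistics instead of creating a duplicate record.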
Saving reports to a server
Once the MultiQC report is complete and has been saved to disk, MultiQC_NGI can transfer the report to a remote server by using the scp command. We use this to store reports in a central backed up location. Once there, we are able to integrate them into our sample tracking website.
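A transfer of this kind could be wrapped in Python along the following lines. This is a sketch under stated assumptions, not the plugin's implementation: the function names, the host `reports.example.org`, and the remote directory are all made up for illustration.

```python
import subprocess


def build_scp_command(report_path, remote_host, remote_dir):
    """Build the argument list for copying a report with scp."""
    # scp expects the destination in the form "host:path".
    return ["scp", report_path, "{}:{}".format(remote_host, remote_dir)]


def transfer_report(report_path, remote_host, remote_dir):
    """Copy the report to the remote server.

    Raises subprocess.CalledProcessError if scp exits with an error.
    """
    subprocess.check_call(build_scp_command(report_path, remote_host, remote_dir))
```

For example, `transfer_report("multiqc_report.html", "reports.example.org", "/srv/reports/")` would run `scp multiqc_report.html reports.example.org:/srv/reports/`. Passing the arguments as a list avoids shell quoting problems with unusual file names.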
To run this tool, you must have MultiQC installed. You can install both MultiQC and this package with the following command:
pip install multiqc git+https://github.com/ewels/MultiQC_NGI.git
To use the new templates, specify their name with the -t flag in MultiQC:
multiqc -t ngi .
The plugin introduces the following new command line flags:
- Specify a Project ID number, instead of automatically searching for one in sample names
- Override the config file default for whether to push results to StatusDB.
- Specify a JSON file to use for testing instead of StatusDB (--test-db), for example the one in the MultiQC_TestData repository
- Disable the MultiQC_NGI plugin for this run
The MultiQC_NGI plugin has some configuration options which you can add to the main
MultiQC config files (
The available config options with some suggested values can be found in
The new templates are held in
multiqc_ngi/. The code that interacts
with statusdb is in
multiqc_ngi/multiqc_ngi.py and the new command line options
are defined in
The way that all of these plugin functions work is defined in
setup.py, in the
If you're developing this code, you'll want to clone it locally and install it manually instead of using pip:
git clone git@github.com:ewels/MultiQC_NGI.git
cd MultiQC_NGI
python setup.py develop
Note that you can use test data specifically for MultiQC_NGI, found within the MultiQC_TestData repository.
This dataset includes a JSON file with contents that emulate statusdb, so that these features can be developed locally. To use this, tell MultiQC where to find it using the --test-db flag:
multiqc data -t ngi --test-db ngi_db_data.json