MultiQC_NGI is a plugin for MultiQC, providing additional tools which are specific to the National Genomics Infrastructure at the Science for Life Laboratory in Stockholm, Sweden.
For more information about NGI, see http://www.scilifelab.se/platforms/ngi/
For more information about MultiQC, see http://multiqc.info
This plugin provides two extra templates: ngi and genstat.
The ngi template is a
stand-alone template, much like the default MultiQC template but with additional
genstat produces a template suitable for tornado, allowing reports to be directly integrated into our internal sample tracking website, Genomics Status. Both templates are able to print data specific to this plugin (see below).
This module reads data files produced by a custom R script in our RNA pipeline (process sample_correlation). It plots a heatmap of sample similarity distances and an MDS plot.
This module picks up summary statistics from featureCounts about the overlap counts of different gene biotypes (including rRNA). These files are generated by our RNA pipeline.
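As a rough illustration of how an rRNA figure can be derived from per-biotype counts, here is a minimal sketch. The dictionary layout, the biotype names and the `percent_rrna` helper are assumptions made for this example; they are not the file format written by the pipeline or the code used by the module.

```python
def percent_rrna(biotype_counts):
    """Return the percentage of counted reads assigned to the rRNA biotype.

    biotype_counts is a hypothetical mapping of biotype name -> read count.
    """
    total = sum(biotype_counts.values())
    if total == 0:
        # Avoid dividing by zero when nothing was counted
        return 0.0
    return 100.0 * biotype_counts.get("rRNA", 0) / total


# Made-up counts for a single sample
counts = {"protein_coding": 9000, "rRNA": 500, "lincRNA": 500}
print(f"{percent_rrna(counts):.1f}% rRNA")  # prints "5.0% rRNA"
```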
Pulling from StatusDB
This plugin connects MultiQC to our internal sample tracking database, statusdb.
Firstly, it retrieves information from statusdb to put into the report:
- Looks at sample names in the General Stats table for something that looks like an NGI project number (e.g. P1234). Bails if none or more than one are found.
- Connects to statusdb and searches projects for this. Bails if not found.
- Retrieves project level information to be printed at the top of the report
- Goes through general stats table looking for sample identifiers (
- Searches statusdb for each of these and tries to pull interesting fields if possible:
  - Initial QC RIN score
  - Amount of sample taken for library prep
  - Concentration of prepared library
  - NB: These are skipped if multiple library preps are found.
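The first step of the list above (finding a single project number in the sample names) can be sketched roughly as follows. This is an illustration only: the regular expression, the `find_project_id` helper and the sample names are assumptions based on the P1234 example, not the plugin's actual code.

```python
import re

# Assumed pattern: a "P" followed by digits, not preceded by another
# letter or digit (based on the P1234 example above).
PROJECT_ID_RE = re.compile(r"(?<![A-Za-z0-9])P\d+")


def find_project_id(sample_names):
    """Return the single project ID found in sample_names, or None.

    Returns None ("bails") if no project ID is found, or if more than one
    distinct project ID is found.
    """
    found = set()
    for name in sample_names:
        found.update(PROJECT_ID_RE.findall(name))
    if len(found) != 1:
        return None
    return found.pop()


# Hypothetical sample names
print(find_project_id(["P1234_101", "P1234_102"]))  # P1234
print(find_project_id(["P1234_101", "P5678_101"]))  # None
```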
Pushing to StatusDB
As well as retrieving data, MultiQC_NGI can push data back to statusdb. This is helpful as it allows us to do cross-project meta analyses, tracking the bioinformatics statistics across everything we run.
- If pulling data has worked, we already know the project and sample IDs
- Either pushes or updates records in the analysis database, using data saved by all MultiQC modules available in
This is dependent on either
doesn't run by default.
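The "push or update" behaviour described in this section is essentially an upsert. The sketch below shows the idea with a plain dictionary standing in for the database. Everything in it is hypothetical (the `upsert_record` helper, the key layout and the field names); it makes no real statusdb calls and does not reflect the plugin's document schema.

```python
def upsert_record(db, project_id, sample_id, stats):
    """Create a record for the sample, or update it if one already exists.

    db is a plain dict used here as a stand-in for the real database.
    Returns "created" or "updated" so the caller can log what happened.
    """
    key = (project_id, sample_id)
    if key in db:
        # Record exists: merge the new statistics into it
        db[key].update(stats)
        return "updated"
    # No record yet: store a copy of the statistics
    db[key] = dict(stats)
    return "created"


# Hypothetical usage with made-up values
fake_db = {}
print(upsert_record(fake_db, "P1234", "P1234_101", {"percent_rRNA": 5.0}))  # created
print(upsert_record(fake_db, "P1234", "P1234_101", {"percent_dups": 12.3}))  # updated
```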
Saving reports to a server
Once the MultiQC report is complete and has been saved to disk, MultiQC_NGI can transfer the report to a remote server using the scp command. We use this to store reports in a central, backed-up location. Once there, we are able to integrate them into our sample tracking website.
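A minimal sketch of such a transfer step is shown below, assuming the report is copied by calling scp through Python's `subprocess` module. The helper names, host and paths are hypothetical, and the plugin's own implementation and config options may differ.

```python
import subprocess


def build_scp_command(report_path, remote_host, remote_dir):
    """Build the scp argument list for copying a report to a remote server."""
    return ["scp", report_path, f"{remote_host}:{remote_dir}"]


def copy_report(report_path, remote_host, remote_dir):
    """Run scp; raises subprocess.CalledProcessError if the copy fails."""
    subprocess.run(build_scp_command(report_path, remote_host, remote_dir), check=True)


# Hypothetical host and destination directory; nothing is copied here,
# the command is only printed.
cmd = build_scp_command("multiqc_report.html", "user@reports.example.org", "/srv/reports/")
print(" ".join(cmd))  # scp multiqc_report.html user@reports.example.org:/srv/reports/
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with unusual file names.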
To run this tool, you must have MultiQC installed. You can install both MultiQC and this package with the following command:
pip install multiqc git+https://github.com/ewels/MultiQC_NGI.git
To use the new templates, specify the template name with the -t flag in MultiQC:
multiqc -t ngi .
The plugin introduces the following new command line flags:
- Specify a Project ID number, instead of automatically searching for one in sample names
- Override the config file default for whether to push results to StatusDB.
- Specify a JSON file to use for testing instead of StatusDB. For example, this one
- Disable the MultiQC_NGI plugin for this run
The MultiQC_NGI plugin has some configuration options which you can add to the main
MultiQC config files (
The available config options with some suggested values can be found in
The new modules and templates are held in
multiqc_ngi/. The code that interacts
with statusdb is in
multiqc_ngi/multiqc_ngi.py and the new command line options
are defined in
The way that all of these plugin functions work is defined in
setup.py, in the
If you're developing this code, you'll want to clone it locally and install
it manually instead of using
git clone https://github.com/ewels/MultiQC_NGI.git
cd MultiQC_NGI
python setup.py develop
v0.3 - 2016-09-27
- New dupRadar module
- Takes output from dupRadar script in the NGI-RNAseq pipeline
- New featureCounts biotype plot / rRNA in General Stats table
- Takes output from the NGI-RNAseq pipeline where featureCounts sums the counts for each biotype and plots this
- Reports now handle multiple projects
- No header is added to the top of the report, but other functions (e.g. sample name swapping) now work
- Added functionality to copy reports to an external server via scp on report completion
- New General Stats table column - NGI name
- New command line flag
v0.2.2 - 2016-07-08
- Another bugfix release to handle even more missing information in statusdb
v0.2.1 - 2016-07-06
- Minor bugfix release to handle missing information in statusdb
v0.2 - 2016-07-05
- Ability to test using a static JSON file instead of statusdb
- Compatibility with new MultiQC features (e.g. ENV vars)
- WGS-specific cleaning of reports (remove FastQC and FastQ Screen from general stats table)
- Got the RNA-seq MDS plot to work
- Made code more tolerant of missing statusdb values
- Lots of minor bug fixes
v0.1 - 2016-05-17
- Module for NextFlow RNA-Seq BP pipeline
- Heatmap of sample correlations
- MDS plot
- Automatically find Project ID from report, or specify with
- Pull project and sample metadata from StatusDB
- NGI template shows project metadata at head of report, plus NGI logo
- General Stats columns added for
Library Amount Taken
- Push MultiQC report data to StatusDB
- Ability to disable StatusDB interactions with
- genstat barebones template started, but not complete.