# MultiQC Jupyter Notebook Example

This notebook contains example code showing how MultiQC can be used within an interactive analysis environment, such as a Jupyter Notebook.

MultiQC is written in Python, so it must be used in a Python environment. If possible, Python 3.5 or later is recommended. MultiQC can be installed in a variety of ways: see [the documentation](https://multiqc.info/docs/#installing-multiqc) for more information. Note that MultiQC must be installed into the environment used by the notebook _kernel_.

First, let's install MultiQC using pip. Note the `%pip` magic, which installs the package into the environment of the running kernel rather than the environment that Jupyter itself runs in.

> NB: Support for imports is only available from v1.8 of MultiQC. At the time of writing this is not yet released, so we will install the development version directly from GitHub.

In [1]:
%pip install --force-reinstall --upgrade git+https://github.com/ewels/MultiQC.git

Collecting git+https://github.com/ewels/MultiQC.git
  Cloning https://github.com/ewels/MultiQC.git to /private/var/folders/tk/k7tjvpqs0tbfd0bzvt4htrzh0000gn/T/pip-req-build-f624acme
  Running command git clone -q https://github.com/ewels/MultiQC.git /private/var/folders/tk/k7tjvpqs0tbfd0bzvt4htrzh0000gn/T/pip-req-build-f624acme
  Running command git submodule update --init --recursive -q
Collecting click (from multiqc==1.8.dev0)
  Using cached https://files.pythonhosted.org/packages/fa/37/45185cb5abbc30d7257104c434fe0b07e5a195a6847506c074527aa599ec/Click-7.0-py2.py3-none-any.whl
Collecting coloredlogs (from multiqc==1.8.dev0)
  Using cached https://files.pythonhosted.org/packages/08/0f/7877fc42fff0b9d70b6442df62d53b3868d3a6ad1b876bdb54335b30ff23/coloredlogs-10.0-py2.py3-none-any.whl
Collecting future>0.14.0 (from multiqc==1.8.dev0)
Collecting jinja2>=2.9 (from multiqc==1.8.dev0)
  Using cached https://files.pythonhosted.org/packages/65/e0/eb35e762802015cab1ccee04e8a277b03f1d8e53da3ec31

> NB: You will probably need to restart the notebook kernel after installing MultiQC.
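
To confirm that the package landed in the kernel's environment, you can ask the running interpreter where (or whether) it finds `multiqc`. This is a minimal sketch using only the standard library; the helper name `describe_install` is made up for this example and is not part of MultiQC.

```python
import importlib.util
import sys


def describe_install(package_name):
    """Report which interpreter is running and whether it can find the package."""
    spec = importlib.util.find_spec(package_name)
    return {
        "interpreter": sys.executable,  # the Python binary behind this kernel
        "installed": spec is not None,
        "location": spec.origin if spec is not None else None,
    }


# `describe_install` is a hypothetical helper, not a MultiQC function.
print(describe_install("multiqc"))
```

If `installed` is `False` after a successful `%pip install`, restarting the kernel and re-running the cell is the first thing to try.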

Now let's import the `multiqc` package into the notebook:

In [2]:
import multiqc

Great! Now let's check that it's working properly by printing the version that we're using:

In [3]:
print(multiqc.__version__)

1.8.dev0 (73b81af)
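
Because importing MultiQC as a package needs v1.8 or later, it can be handy to check the version in code rather than by eye. The sketch below parses a version string in the format printed above; the function names are hypothetical, and treating a `1.8.dev0` development build as new enough is an assumption based on the note earlier in this notebook.

```python
import re


def parse_version(version_string):
    """Extract the leading major and minor numbers, ignoring any dev or commit suffix."""
    match = re.match(r"(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError("Unrecognised version string: {!r}".format(version_string))
    return int(match.group(1)), int(match.group(2))


def supports_imports(version_string, minimum=(1, 8)):
    """Assumption: any 1.8 build, including development builds, supports imports."""
    return parse_version(version_string) >= minimum


print(supports_imports("1.8.dev0 (73b81af)"))  # prints True
```

In the notebook you would pass `multiqc.__version__` instead of the literal string.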


Before we can use any outputs from MultiQC, we must first run it on some data. The [MultiQC website](https://multiqc.info/) offers the log files behind each of the example reports on its homepage as a download, so let's grab the files for the RNA-seq report.

In [4]:
!wget https://multiqc.info/examples/rna-seq/data.zip
!unzip -o data.zip
!rm data.zip

--2019-11-11 16:32:13--  https://multiqc.info/examples/rna-seq/data.zip
Resolving multiqc.info (multiqc.info)... 91.238.163.174
Connecting to multiqc.info (multiqc.info)|91.238.163.174|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 18549188 (18M) [application/zip]
Saving to: ‘data.zip’


2019-11-11 16:32:15 (8.40 MB/s) - ‘data.zip’ saved [18549188/18549188]

Archive:  data.zip
   creating: data/
  inflating: data/fastqc_theoretical_gc_hg38_txome.txt  
  inflating: data/SRR3192396_1.fastq.gz_trimming_report.txt  
  inflating: data/SRR3192396_1_fastqc.html  
  inflating: data/SRR3192396_1_fastqc.zip  
  inflating: data/SRR3192396_1_star_aligned.bam_counts.txt.summary  
  inflating: data/SRR3192396_1_val_1_fastqc.html  
  inflating: data/SRR3192396_1_val_1_fastqc.zip  
  inflating: data/SRR3192396_1Log.final.out  
  inflating: data/SRR3192396_1Log.out  
  inflating: data/SRR3192396_1Log.progress.out  
  inflating: data/SRR3192396_1Log.std.out  
  inflating: data

You should now see a folder called `data/` in your notebook's working directory, containing log files from a typical RNA-seq analysis run.
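
The shell commands above assume that `wget` and `unzip` are available on your system. If they are not, the same three steps (download, unpack, delete the archive) can be sketched with the Python standard library. The function names here are hypothetical; only the URL comes from the cell above.

```python
import urllib.request
import zipfile
from pathlib import Path

DATA_URL = "https://multiqc.info/examples/rna-seq/data.zip"  # same URL as the wget call


def extract_archive(zip_path, dest="."):
    """Unpack zip_path into dest and return the sorted member names."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return sorted(archive.namelist())


def fetch_example_data(url=DATA_URL, zip_path="data.zip", dest="."):
    """Download the archive, unpack it, then delete the zip (wget, unzip, rm)."""
    urllib.request.urlretrieve(url, zip_path)
    try:
        return extract_archive(zip_path, dest)
    finally:
        Path(zip_path).unlink()  # clean up even if extraction fails
```

Calling `fetch_example_data()` in a cell should leave the same `data/` folder as the shell version, and the returned list gives a quick view of what was unpacked.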

Now let's run MultiQC on those files.

In [5]:
multiqc.run('./data/')

[INFO   ]         multiqc : This is MultiQC v1.8.dev0 (73b81af)
[INFO   ]         multiqc : Template    : default
[INFO   ]         multiqc : Searching   : /Users/philewels/GitHub/MultiQC_Notebook/data
[ERROR  ]         multiqc : Oops! The 'seqyclean' MultiQC module broke... 
  Please copy the following traceback and report it at https://github.com/ewels/MultiQC/issues 
  If possible, please include a log file that triggers the error - the last file found was:
    None
Module seqyclean raised an exception: Traceback (most recent call last):
  File "/Users/philewels/miniconda2/envs/altair/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2451, in resolve
    return functools.reduce(getattr, self.attrs, module)
AttributeError: module 'multiqc.modules.seqyclean' has no attribute 'MultiqcModule'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/philewels/GitHub/MultiQC/multiqc/multiqc.py", line 541, in run
   

SystemExit: 1

  warn("To exit: use 'exit', 'quit', or Ctrl-D.", stacklevel=1)


Once the run has finished and written `multiqc_report.html`, we can show the report inside the notebook.

> Note that we use `IFrame` and not `HTML`. This is because Jupyter has a lot of its own CSS and JavaScript, which doesn't play well with MultiQC.

In [6]:
from IPython.display import IFrame
IFrame('./multiqc_report.html', '100%', 600)
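
If the MultiQC run did not complete, the `IFrame` will point at a file that does not exist. A small guard can make that case explicit. This is a sketch with a hypothetical helper name, `show_report`; the default path and dimensions are the ones used in the cell above.

```python
from pathlib import Path


def show_report(report_path="./multiqc_report.html", width="100%", height=600):
    """Return an IFrame for the report, or None if the file does not exist."""
    if not Path(report_path).is_file():
        print("No report found at {}".format(report_path))
        return None
    # Imported lazily so the existence check works even outside IPython.
    from IPython.display import IFrame
    return IFrame(report_path, width, height)
```

Ending a cell with `show_report()` displays the report when it exists and prints a short message otherwise.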