___
# Jovian analysis report
___

**Please visualize the report by pressing `Cell` in the toolbar and then selecting `Run All`. This can take a couple of minutes (depending on the size of your dataset).**  
<br> 
*N.B. The total number of reads in this report will not add up to the total number of reads supplied as input. This is because 1) human reads are removed, and 2) PCR duplicates may be removed, depending on the chosen configuration. By default, PCR duplicates are not removed.*  
<br>
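
As a rough illustration of the note above, the sketch below shows why the totals differ. The helper name `reads_in_report` and all read counts are made up for this example; they are not produced by Jovian.

```python
def reads_in_report(total_input, human_reads, pcr_duplicates=0, remove_duplicates=False):
    """Hypothetical helper: reads left over after the removals described above."""
    # Human reads are always removed; PCR duplicates only when configured.
    removed = human_reads + (pcr_duplicates if remove_duplicates else 0)
    return total_input - removed


# Made-up counts, for illustration only.
default_run = reads_in_report(1_000_000, human_reads=150_000, pcr_duplicates=50_000)
dedup_run = reads_in_report(
    1_000_000, human_reads=150_000, pcr_duplicates=50_000, remove_duplicates=True
)
print(default_run, dedup_run)
```

With the default configuration only the human reads are subtracted; enabling duplicate removal subtracts the PCR duplicates as well.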

In [None]:
%%html
<style>
div.output_subarea {
    padding-top: 0 !important;
    padding-bottom: 0 !important;
}

.standardheader {
    font-size: 30px !important;
    padding: 0px;
}

.standardanchor {
    font-size: 14px !important;
    margin-left: 5px;
}
.standardtext {
    font-family: sans-serif;
    font-size: 14px !important;
    margin-left: 5px;
}

.rendered_html {
    margin-left: 5px;
}
</style>

In [None]:
######################################
# Required packages for this script  #
######################################
import pandas as pd
import qgrid
import os
from IPython.display import display as dp
from IPython.display import Markdown as md
from IPython.display import IFrame as fr
from IPython.display import HTML as ht

### Standard Qgrid options
grid_options = {
    "fullWidthRows": True,
    "syncColumnCellResize": True,
    "forceFitColumns": False,
    "defaultColumnWidth": 100,
    "rowHeight": 23,
    "enableColumnReorder": True,
    "enableTextSelectionOnCells": True,
    "editable": True,
    "autoEdit": False,
    "explicitInitialization": True,
    "maxVisibleRows": 20,
    "minVisibleRows": 8,
    "sortable": True,
    "filterable": True,
    "highlightSelectedCell": True,
    "highlightSelectedRow": True,
    "show_toolbar": False,
}

goback = "[Go back to the top](#Jovian-analysis-report)  "


######################################
# ACTUAL CONTENT                     #
######################################

mqc_cell = """
___
## Quality control metrics report  
[Open MultiQC graph as a dedicated page by clicking here](results/multiqc.html)  
"""

composition_cell = """
___
### Read-based composition of analyzed samples:  
[Open the barchart as a dedicated page by clicking here](results/Sample_composition_graph.html)  
<br>
**Low-quality** reads are those that did not meet the stringency settings as specified in the config file.  
**Unclassified** reads are those that could not be assigned to a taxon.  
**Remaining** reads are those that could not be assembled into contigs longer than the user-specified minimum contig length.
"""

krona_cell = """
___
## Metagenomics:
___
<br>
<br>

### Interactive metagenomics overview (Krona):
[Open Krona graph as a dedicated page by clicking here](results/krona.html)  
"""

heatmap_cell_header = """
### Heatmaps:
"""

heatmap_cell_sup = """
#### Superkingdom heatmap  
Open superkingdoms heatmap as a dedicated page by clicking [here](results/heatmaps/Superkingdoms_heatmap.html).  
"""

heatmap_cell_vir = """
#### Virus heatmaps
Open virus heatmap as a dedicated page by clicking [here](results/heatmaps/Virus_heatmap.html).  
<br>
**Please note, many viruses have no "`order`" taxonomic rank so always check the "`family`" taxonomic rank.**
"""

heatmap_cell_pha = """
#### Phage heatmaps
Open phage heatmap as a dedicated page by clicking [here](results/heatmaps/Phage_heatmap.html).  
<br>
**Please note, many viruses have no "`order`" taxonomic rank so always check the "`family`" taxonomic rank.**
"""

heatmap_cell_bac = """
#### Bacteria heatmaps
Open bacteria heatmap as a dedicated page by clicking [here](results/heatmaps/Bacteria_heatmap.html).  
"""

clas_scf_cell = """
___
### Classified scaffolds:
"""

unclas_scf_cell = """
___
### Unclassified scaffolds ("Dark Matter"):
"""

unclas_byNoLCA_cell = """
___
### Ambiguous LCA scaffolds:
These scaffolds were deemed ambiguous by the lowest common ancestor (LCA) analysis; please inspect them manually. This often occurs for bacteria and (pro)phages.  
"""

virhosts_cell = """
___
## Predicted virus hosts:
___
"""

virtyping_cell = """
___
## Virus typing results:
___
"""

virtyping_nov_cell = """
### Norovirus typing tool output:  
[Link to the norovirus typing tool](https://www.rivm.nl/mpf/typingtool/norovirus/)  
"""

virtyping_rva_cell = """
### Rotavirus A typing tool output:  
[Link to the Rotavirus A typing tool](https://www.rivm.nl/mpf/typingtool/rotavirusa/)  
"""

virtyping_env_cell = """
### Enterovirus typing tool output:  
[Link to the enterovirus typing tool](https://www.rivm.nl/mpf/typingtool/enterovirus/)  
"""

virtyping_hav_cell = """
### Hepatitis A typing tool output:  
[Link to the hepatitis A typing tool](https://www.rivm.nl/mpf/typingtool/hav/)  
"""

virtyping_hev_cell = """
### Hepatitis E typing tool output:  
[Link to the hepatitis E typing tool](https://www.rivm.nl/mpf/typingtool/hev/)  
"""

virtyping_hpv_cell = """
### Human Papillomavirus typing tool output:  
[Link to the HPV typing tool](https://www.rivm.nl/mpf/typingtool/papillomavirus/)  
"""

virtyping_fla_cell = """
### Flavivirus typing tool output:  
[Link to the flavivirus typing tool](https://www.rivm.nl/mpf/typingtool/flavivirus/)  
"""

scf_viewer_cell_ILM_META = """
___
## Scaffold viewer:
**Containing: SNPs and minority variants (quasispecies), predicted ORFs, depth of coverage graph, GC content graph**
___
N.B. Depending on the depth of coverage of the selected contig, the viewer can be <b>(very) slow, or even crash your browser</b>. This is a <b>client-side</b> problem: the computer running the browser does not have enough resources to render the data.  
Open the scaffold viewer as a dedicated page by clicking [here](results/igv.html).  
"""


snp_table_cell = """
___
## Minority variant table:
___
"""

snp_table_empty_cell = """
Either no SNPs were called, perhaps because the minimum allele frequency is set too high, or something went wrong. Please double-check the log files below:
    **logs/SNP_calling_[sample_name].log**
    **logs/Concat_filtered_SNPs.log**
"""

logging_header_cell = """
___
# Logging and audit-trail: 
___
"""

snakemake_report_cell = """
### Snakemake summary statistics
[Open Snakemake summary statistics as a dedicated page by clicking here](results/snakemake_report.html)
"""

logfile_index = """
<script>
function goBack() {
    window.history.back()
}
</script>

<button onclick="goBack()">Click this button to go back</button>

<div style="text-align: center">
    <iframe src="results/logfiles_index.html" width=100% height=980></iframe>
</div>
"""

acknowledgements_header_cell = """
___
# Acknowledgements:
___
"""

######################################
# WRITE-FUNCTIONS                    #
######################################


def multiqc():
    if os.path.exists("results/multiqc.html"):
        dp(md(mqc_cell))
        dp(fr("results/multiqc.html", "100%", "980px"))
        dp(md(goback))


def composition():
    if os.path.exists("results/Sample_composition_graph.html"):
        dp(md(composition_cell))
        dp(fr("results/Sample_composition_graph.html", "100%", "980px"))
        dp(md(goback))


def krona():
    if os.path.exists("results/krona.html"):
        dp(md(krona_cell))
        dp(fr("results/krona.html", "100%", "980px"))
        dp(md(goback))


def heatmaps():
    if os.path.exists("results/heatmaps"):
        dp(md(heatmap_cell_header))
        if os.path.exists("results/heatmaps/Superkingdoms_heatmap.html"):
            dp(md(heatmap_cell_sup))
            dp(fr("results/heatmaps/Superkingdoms_heatmap.html", "100%", "700px"))
            dp(md(goback))
        if os.path.exists("results/heatmaps/Virus_heatmap.html"):
            dp(md(heatmap_cell_vir))
            dp(fr("results/heatmaps/Virus_heatmap.html", "100%", "700px"))
            dp(md(goback))
        if os.path.exists("results/heatmaps/Phage_heatmap.html"):
            dp(md(heatmap_cell_pha))
            dp(fr("results/heatmaps/Phage_heatmap.html", "100%", "700px"))
            dp(md(goback))
        if os.path.exists("results/heatmaps/Bacteria_heatmap.html"):
            dp(md(heatmap_cell_bac))
            dp(fr("results/heatmaps/Bacteria_heatmap.html", "100%", "700px"))
            dp(md(goback))


def scaffoldviewer():
    if os.path.exists("results/igv.html"):
        dp(md(scf_viewer_cell_ILM_META))
        dp(fr("results/igv.html", "100%", "980px"))
        dp(md(goback))


def audit():
    if os.path.exists("logs/"):
        dp(md(logging_header_cell))
        dp(md("### Sample sheet"))
        # Use a context manager so the file handle is closed after reading
        with open("results/samplesheet.yaml", "r") as sheet:
            print(sheet.read())
        dp(md(snakemake_report_cell))
        if os.path.exists("results/snakemake_report.html"):
            dp(fr("results/snakemake_report.html", "100%", "980px"))
        else:
            print(
                """
We couldn't find a valid Snakemake report.
This indicates that something went wrong while the workflow was running. Please run your analysis again.
            """
            )
        dp(md("### All log-files:"))
        dp(ht(logfile_index))
        dp(md('### Full software list in "Jovian" environment:'))
        # Context managers close each log file after it has been printed
        with open("results/log_conda.txt", "r") as conda_log:
            print(conda_log.read())
        dp(md("### Database versions:"))
        with open("results/log_db.txt", "r") as db_log:
            print(db_log.read())
        dp(md('### Unique methodological "fingerprint":'))
        with open("results/log_git.txt", "r") as id_log:
            print(id_log.read())
        dp(md("### Snakemake config files:"))
        with open("results/log_config.txt", "r") as conf_log:
            print(conf_log.read())
        dp(md(goback))


def acks():
    dp(md(acknowledgements_header_cell))
    with open("/opt/Jovian/files/acknowledgements.md", "r") as ack_base:
        ack_list = ack_base.read()
    dp(md(ack_list))
    with open("/opt/Jovian/files/authors.md", "r") as ath_base:
        ath_list = ath_base.read()
    dp(md(ath_list))


### interactive tables and/or grids
class Classified_grid:
    def dataframe(clas):
        if os.path.exists("results/all_taxClassified.tsv") and os.path.getsize("results/all_taxClassified.tsv") > 0:
            clas.df = pd.read_csv("results/all_taxClassified.tsv", sep="\t")
            clas.grid = qgrid.show_grid(clas.df, grid_options=grid_options)
            return

    def draw(clas):
        clas.dataframe()
        if os.path.exists("results/all_taxClassified.tsv") and os.path.getsize("results/all_taxClassified.tsv") > 0:
            dp(md(clas_scf_cell))
            dp(clas.grid)
            dp(md(goback))


class Unclassified_grid:
    def dataframe(unclas):
        if os.path.exists("results/all_taxUnclassified.tsv") and os.path.getsize("results/all_taxUnclassified.tsv") > 0:
            unclas.df = pd.read_csv("results/all_taxUnclassified.tsv", sep="\t")
            unclas.grid = qgrid.show_grid(unclas.df, grid_options=grid_options)
            return

    def draw(unclas):
        unclas.dataframe()
        if os.path.exists("results/all_taxUnclassified.tsv") and os.path.getsize("results/all_taxUnclassified.tsv") > 0:
            dp(md(unclas_scf_cell))
            dp(unclas.grid)
            dp(md(goback))


class Unclassified_byNoLCA_grid:
    def dataframe(noLCAclas):
        if os.path.exists("results/all_noLCA.tsv") and os.path.getsize("results/all_noLCA.tsv") > 0:
            noLCAclas.df = pd.read_csv("results/all_noLCA.tsv", sep="\t")
            noLCAclas.grid = qgrid.show_grid(noLCAclas.df, grid_options=grid_options)
            return

    def draw(noLCAclas):
        noLCAclas.dataframe()
        if os.path.exists("results/all_noLCA.tsv") and os.path.getsize("results/all_noLCA.tsv") > 0:
            dp(md(unclas_byNoLCA_cell))
            dp(noLCAclas.grid)
            dp(md(goback))


class Predict_virhosts:
    def dataframe(predicts):
        if os.path.exists("results/all_virusHost.tsv") and os.path.getsize("results/all_virusHost.tsv") > 0:
            predicts.df = pd.read_csv("results/all_virusHost.tsv", sep="\t")
            predicts.grid = qgrid.show_grid(predicts.df, grid_options=grid_options)
            return

    def draw(predicts):
        predicts.dataframe()
        if os.path.exists("results/all_virusHost.tsv") and os.path.getsize("results/all_virusHost.tsv") > 0:
            dp(md(virhosts_cell))
            dp(predicts.grid)
            dp(md(goback))


def virtypingheader():
    if os.path.exists("results/typingtools"):
        dp(md(virtyping_cell))


class nov_typing:
    def dataframe(nov):
        if os.path.exists("results/typingtools/all_nov-TT.csv") and os.path.getsize("results/typingtools/all_nov-TT.csv") > 0:
            nov.df = pd.read_csv("results/typingtools/all_nov-TT.csv")
            nov.grid = qgrid.show_grid(nov.df, grid_options=grid_options)
            return

    def draw(nov):
        nov.dataframe()
        if os.path.exists("results/typingtools/all_nov-TT.csv") and os.path.getsize("results/typingtools/all_nov-TT.csv") > 0:
            dp(md(virtyping_nov_cell))
            dp(nov.grid)
            dp(md(goback))


class rva_typing:
    def dataframe(rva):
        if os.path.exists("results/typingtools/all_rva-TT.csv") and os.path.getsize("results/typingtools/all_rva-TT.csv") > 0:
            rva.df = pd.read_csv("results/typingtools/all_rva-TT.csv")
            rva.grid = qgrid.show_grid(rva.df, grid_options=grid_options)
            return

    def draw(rva):
        rva.dataframe()
        if os.path.exists("results/typingtools/all_rva-TT.csv") and os.path.getsize("results/typingtools/all_rva-TT.csv") > 0:
            dp(md(virtyping_rva_cell))
            dp(rva.grid)
            dp(md(goback))


class env_typing:
    def dataframe(env):
        if os.path.exists("results/typingtools/all_ev-TT.csv") and os.path.getsize("results/typingtools/all_ev-TT.csv") > 0:
            env.df = pd.read_csv("results/typingtools/all_ev-TT.csv")
            env.grid = qgrid.show_grid(env.df, grid_options=grid_options)
            return

    def draw(env):
        env.dataframe()
        if os.path.exists("results/typingtools/all_ev-TT.csv") and os.path.getsize("results/typingtools/all_ev-TT.csv") > 0:
            dp(md(virtyping_env_cell))
            dp(env.grid)
            dp(md(goback))


class hav_typing:
    def dataframe(hav):
        if os.path.exists("results/typingtools/all_hav-TT.csv") and os.path.getsize("results/typingtools/all_hav-TT.csv") > 0:
            hav.df = pd.read_csv("results/typingtools/all_hav-TT.csv")
            hav.grid = qgrid.show_grid(hav.df, grid_options=grid_options)
            return

    def draw(hav):
        hav.dataframe()
        if os.path.exists("results/typingtools/all_hav-TT.csv") and os.path.getsize("results/typingtools/all_hav-TT.csv") > 0:
            dp(md(virtyping_hav_cell))
            dp(hav.grid)
            dp(md(goback))


class hev_typing:
    def dataframe(hev):
        if os.path.exists("results/typingtools/all_hev-TT.csv") and os.path.getsize("results/typingtools/all_hev-TT.csv") > 0:
            hev.df = pd.read_csv("results/typingtools/all_hev-TT.csv")
            hev.grid = qgrid.show_grid(hev.df, grid_options=grid_options)
            return

    def draw(hev):
        hev.dataframe()
        if os.path.exists("results/typingtools/all_hev-TT.csv") and os.path.getsize("results/typingtools/all_hev-TT.csv") > 0:
            dp(md(virtyping_hev_cell))
            dp(hev.grid)
            dp(md(goback))


class hpv_typing:
    def dataframe(hpv):
        if os.path.exists("results/typingtools/all_pv-TT.csv") and os.path.getsize("results/typingtools/all_pv-TT.csv") > 0:
            hpv.df = pd.read_csv("results/typingtools/all_pv-TT.csv")
            hpv.grid = qgrid.show_grid(hpv.df, grid_options=grid_options)
            return

    def draw(hpv):
        hpv.dataframe()
        if os.path.exists("results/typingtools/all_pv-TT.csv") and os.path.getsize("results/typingtools/all_pv-TT.csv") > 0:
            dp(md(virtyping_hpv_cell))
            dp(hpv.grid)
            dp(md(goback))


class fla_typing:
    def dataframe(fla):
        if os.path.exists("results/typingtools/all_flavi-TT.csv") and os.path.getsize("results/typingtools/all_flavi-TT.csv") > 0:
            fla.df = pd.read_csv("results/typingtools/all_flavi-TT.csv")
            fla.grid = qgrid.show_grid(fla.df, grid_options=grid_options)
            return

    def draw(fla):
        fla.dataframe()
        if os.path.exists("results/typingtools/all_flavi-TT.csv") and os.path.getsize("results/typingtools/all_flavi-TT.csv") > 0:
            dp(md(virtyping_fla_cell))
            dp(fla.grid)
            dp(md(goback))


class snp_variants:
    def dataframe(snps):
        if os.path.exists("results/all_filtered_SNPs.tsv") and os.path.getsize("results/all_filtered_SNPs.tsv") > 0:
            snps.df = pd.read_csv("results/all_filtered_SNPs.tsv", sep="\t")
            snps.grid = qgrid.show_grid(snps.df, grid_options=grid_options)
            return

    def draw(snps):
        snps.dataframe()
        if os.path.exists("results/all_filtered_SNPs.tsv") and os.path.getsize("results/all_filtered_SNPs.tsv") > 0:
            dp(md(snp_table_cell))
            dp(snps.grid)
            dp(md(goback))


classified_scaffolds = Classified_grid()
unclassified_scaffolds = Unclassified_grid()
unclassified_byNoLCA = Unclassified_byNoLCA_grid()
predicted_virushosts = Predict_virhosts()
virus_typing_nov = nov_typing()
virus_typing_rva = rva_typing()
virus_typing_env = env_typing()
virus_typing_hav = hav_typing()
virus_typing_hev = hev_typing()
virus_typing_hpv = hpv_typing()
virus_typing_fla = fla_typing()
minority_variant_table = snp_variants()

### actually draw the contents

multiqc()
composition()
krona()
heatmaps()

classified_scaffolds.draw()
unclassified_scaffolds.draw()
unclassified_byNoLCA.draw()

predicted_virushosts.draw()

virtypingheader()
virus_typing_nov.draw()
virus_typing_rva.draw()
virus_typing_env.draw()
virus_typing_hav.draw()
virus_typing_hev.draw()
virus_typing_hpv.draw()
virus_typing_fla.draw()

scaffoldviewer()

minority_variant_table.draw()

audit()
acks()

___
Jovian is available on [GitHub](https://github.com/DennisSchmitz/Jovian) under an [AGPL license](https://www.gnu.org/licenses/agpl-3.0). The virus-typing tools are public services hosted by the [RIVM](https://www.rivm.nl/en) and developed independently of Jovian.
___
*This study was financed under the European Union’s Horizon 2020 grants COMPARE and VEO (grant nos. 643476 and 874735).*
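
Developer note: every table section in the main code cell applies the same guard before drawing, namely that the result file exists and is not empty. The sketch below shows one way that guard could be shared instead of repeated. The names `has_content`, `TABLES` and `tables_to_draw`, and the idea of keeping the paths in a list, are assumptions made for this illustration and are not part of Jovian. It uses only the standard library so it can run without `pandas` or `qgrid`.

```python
import os
import tempfile


def has_content(path):
    """Return True when `path` exists and holds at least one byte."""
    return os.path.exists(path) and os.path.getsize(path) > 0


# (relative path, column separator) pairs copied from the report code.
# Keeping them in a list is an assumption of this sketch.
TABLES = [
    ("results/all_taxClassified.tsv", "\t"),
    ("results/all_taxUnclassified.tsv", "\t"),
    ("results/typingtools/all_nov-TT.csv", ","),
]


def tables_to_draw(tables, root="."):
    """Keep only the tables whose file is present and non-empty under `root`."""
    return [(path, sep) for path, sep in tables if has_content(os.path.join(root, path))]


# Demo in a throwaway directory: one filled file, one empty file, one missing file.
with tempfile.TemporaryDirectory() as demo_root:
    os.makedirs(os.path.join(demo_root, "results"))
    with open(os.path.join(demo_root, "results", "all_taxClassified.tsv"), "w") as handle:
        handle.write("scaffold\ttaxon\n")  # made-up column names
    open(os.path.join(demo_root, "results", "all_taxUnclassified.tsv"), "w").close()
    drawable = tables_to_draw(TABLES, root=demo_root)

print(drawable)
```

In the demo only the classified-scaffolds table passes the check, because the second file is empty and the third does not exist. In the notebook itself, each surviving entry could then be passed to `pd.read_csv` and `qgrid.show_grid` as the existing classes do.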

In [None]:
dp(md(goback))

In [None]:
%%javascript
$('<div id="toc"></div>').css({position: 'fixed', top: '120px', left: 0}).appendTo(document.body);
$.getScript('https://kmahelona.github.io/ipython_notebook_goodies/ipython_notebook_toc.js');