# antiSMASH 
Summary of antiSMASH results for: `[{{ project().name }}]`

## Description
> antiSMASH allows the rapid genome-wide identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genomes. It integrates and cross-links with a large number of in silico secondary metabolite analysis tools that have been [published earlier](https://pubmed.ncbi.nlm.nih.gov/?term=16221976%2C19297688%2C17506888%2C17400247%2C12691745%2C19360130%2C17913739%2C20462861%2C18950525%2C15980457%2C18978015%5Buid%5D).

In [1]:
import pandas as pd
from pathlib import Path
from IPython.display import display, Markdown, HTML
import json
import altair as alt

import warnings
warnings.filterwarnings('ignore')

from itables import to_html_datatable as DT
import itables.options as opt
opt.classes = ["display", "compact"]
opt.lengthMenu = [5, 10, 20, 50, 100, 200, 500]


report_dir = Path("../")

In [2]:
antismash_table = report_dir / "tables/df_antismash_6.1.1_summary.csv"
gtdb_table = report_dir / "tables/df_gtdb_meta.csv"

df_antismash = pd.read_csv(antismash_table, index_col=0)
df_gtdb = pd.read_csv(gtdb_table, index_col=0)

df_raw = pd.concat([df_antismash, df_gtdb], axis=1)
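# Copy the selected columns so the link rewrite below does not modify df_raw.
df = df_raw.loc[:, ["genome_id", "source", "Organism", "strain", "bgcs_count", "bgcs_on_contig_edge"]].copy()
# Plain string, not an f-string: the {{ ... }} placeholder must stay literal here.
server_path = "<a href='{{ project().file_server() }}/antismash/6.1.1/"
for i in df.index:
    gid = df.loc[i, "genome_id"]
    # Wrap each genome id in a link to its antiSMASH report, opened in a new tab.
    df.loc[i, "genome_id"] = server_path + f"{gid}/index.html?view' target='_blank'>{gid}</a>"
df = df.reset_index(drop=True)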
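
The link construction in the cell above can also be factored into a small helper, which makes the generated HTML easy to check in isolation. This is a minimal sketch, not the report's own implementation: `genome_link`, the `base_url` value, and the genome id are hypothetical, the `<genome_id>/index.html` layout is copied from the cell above, and the call to `html.escape` assumes that genome ids could contain characters that need escaping.

```python
from html import escape


def genome_link(genome_id: str, base_url: str) -> str:
    """Return an HTML anchor that opens a genome's antiSMASH report in a new tab.

    `base_url` is a hypothetical stand-in for the file-server prefix used in the
    report; the `<genome_id>/index.html` layout mirrors the cell above.
    """
    # Escape the id so quotes or angle brackets cannot break the markup.
    safe_id = escape(str(genome_id), quote=True)
    href = f"{base_url.rstrip('/')}/{safe_id}/index.html"
    return f"<a href='{href}' target='_blank'>{safe_id}</a>"


# Example with a made-up genome id and server address.
link = genome_link("genome_A", "http://localhost:8001/antismash/6.1.1/")
print(link)
```

With a pandas column, the same helper could be applied in one step, for example `df["genome_id"].map(lambda g: genome_link(g, base_url))`.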

## Result Summary

In [3]:
region = df_antismash.bgcs_count
incomplete = df_antismash.bgcs_on_contig_edge
complete_fraction = 1 - incomplete.sum() / region.sum()
text = (
    f"antiSMASH detected {int(region.sum())} BGCs across {len(region)} genomes, "
    f"with a median of {int(region.median())} BGCs per genome. "
    f"Of these, {complete_fraction:.2%} are not on a contig edge and are deemed complete."
)
display(Markdown(text))

antiSMASH detected 206 BGCs across 5 genomes, with a median of 38 BGCs per genome. Of these, 99.51% are not on a contig edge and are deemed complete.
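
The completeness figure in the summary above is one minus the share of BGCs that lie on a contig edge, pooled over all genomes. The sketch below reproduces that arithmetic with the standard library only, so it can be checked by hand. The function name `summarize_bgcs` and the input values are illustrative assumptions and are not taken from the report data.

```python
from statistics import median


def summarize_bgcs(bgcs_count, bgcs_on_contig_edge):
    """Return (total BGCs, genomes, median per genome, complete fraction)."""
    total = sum(bgcs_count)
    on_edge = sum(bgcs_on_contig_edge)
    # A BGC on a contig edge may be truncated, so it is counted as incomplete.
    complete_fraction = 1 - on_edge / total if total else float("nan")
    return total, len(bgcs_count), median(bgcs_count), complete_fraction


# Toy values for three genomes, not taken from the report.
total, n_genomes, med, frac = summarize_bgcs([10, 20, 30], [1, 0, 2])
print(f"{total} BGCs across {n_genomes} genomes, median {med}, {frac:.2%} complete")
```

Because the fraction is pooled over all BGCs rather than averaged per genome, genomes with many BGCs weigh more heavily in the result.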

In [4]:
source = df_raw

chart = alt.Chart(source).mark_circle().encode(
    x = 'bgcs_count',
    y = 'bgcs_on_contig_edge',
    color='Genus',
    tooltip=['genome_id', 'bgcs_count', 'bgcs_on_contig_edge', 'protoclusters_count', 'cand_clusters_count']
).properties(
    width=400,
    height=400,
    title = "BGC distribution overview",
).interactive()

chart = chart.configure_title(fontSize=20, offset=10, orient='top', anchor='middle')

chart

## Summary Table
Click a genome ID to open its antiSMASH result.

[Download Table]({{ project().file_server() }}/tables/df_antismash_6.1.1_summary.csv){:target="_blank" .md-button}

In [5]:
display(HTML(DT(df, columnDefs=[{"className": "dt-center", "targets": "_all"}],)))



## References
<font size="2">
{% for i in project().rule_used['antismash']['references'] %}
- {{ i }} 
{% endfor %}
</font>