feat(about): Updated the about page to contain more up-to-date information.
aaronmussig committed Mar 11, 2022
1 parent 0c4231f commit ad3f878
Showing 1 changed file with 43 additions and 51 deletions.
94 changes: 43 additions & 51 deletions pages/about.vue
@@ -11,57 +11,49 @@

<v-card-text class="text--primary">

The Genome Taxonomy Database (GTDB) is an initiative to establish a standardised microbial
taxonomy based on genome phylogeny, primarly funded by an <a
href="https://www.uq.edu.au/news/article/2015/06/uq-leads-nation-prestigious-arc-laureate-fellowships"
target="_blank">Australian
Research Council Laureate Fellowship</a>. <br>
<br>
The genomes used to construct the phylogeny are obtained from
<a href="https://www.ncbi.nlm.nih.gov/refseq/" target="_blank">RefSeq</a> and
<a href="https://www.ncbi.nlm.nih.gov/genbank/" target="_blank">Genbank</a>, and GTDB releases are indexed
to
RefSeq releases, starting with release 76.
Importantly and increasingly, this dataset includes draft genomes of uncultured
microorganisms
obtained from metagenomes and single cells, ensuring improved genomic representation of the
microbial world.
All genomes are independently quality controlled using <a
href="https://github.com/Ecogenomics/CheckM/wiki" target="_blank">CheckM</a> before inclusion in GTDB,
see
statistics
<NuxtLink :to="latestStatsPageUrl">here</NuxtLink>
. <br>
<br>
The GTDB taxonomy is based on genome trees inferred with <a
href="http://www.microbesonline.org/fasttree/" target="_blank">FastTree</a> from an aligned concatenated
set of 120 single copy marker proteins for Bacteria, and with IQ-TREE from a concatenated
set of 122 marker proteins for Archaea
(download
page
<NuxtLink to="/downloads">here</NuxtLink>
).
Additional marker sets are also used to cross-validate tree topologies including
concatenated
ribosomal proteins and ribosomal RNA genes.<br>
<br>
<a href="https://www.ncbi.nlm.nih.gov/taxonomy" target="_blank">NCBI taxonomy</a> was initially used to
decorate
the genome tree via <a href="http://tax2tree.sourceforge.net/" target="_blank">tax2tree</a>.
The 16S rRNA-based <a href="http://greengenes.secondgenome.com" target="_blank">Greengenes</a> and
<a href="https://www.arb-silva.de/" target="_blank">SILVA</a> taxonomies are used
to supplement the taxonomy particularly in regions of the tree with no cultured
representatives.
<a href="https://lpsn.dsmz.de/" target="_blank">LPSN</a> is used as the primary taxonomic authority for
establishing naming priorities.
Taxonomic ranks are normalised using <a
href="https://github.com/dparks1134/PhyloRank" target="_blank">phylorank</a> and the taxonomy manually
curated to remove polyphyletic groups.
Polyphyly and rank evenness can be visualised in phylorank
<NuxtLink :to="latestStatsPageUrl">plots</NuxtLink>
.<br>
<br>
The GTDB taxonomy can be queried and downloaded through a number of tools on this website.
<p>
The Genome Taxonomy Database (GTDB) is an initiative to establish a standardised microbial taxonomy
based on genome phylogeny, primarily funded by the <a href="https://www.arc.gov.au/" target="_blank">Australian Research Council</a> via a
<a href="https://www.uq.edu.au/news/article/2015/06/uq-leads-nation-prestigious-arc-laureate-fellowships" target="_blank">Laureate Fellowship</a>
(<a href="https://app.dimensions.ai/details/grant/grant.5129370" target="_blank">FL150100038</a>) and Discovery Project
(<a href="https://dataportal.arc.gov.au/NCGP/Web/Grant/Grant/DP220100900" target="_blank">DP220100900</a>),
with the welcome assistance of <a href="https://research.uq.edu.au/research-support/research-management/funding-schemes/uq-internal-schemes" target="_blank">strategic funding from
The University of Queensland</a>.
</p>

<p>
The genomes used to construct the phylogeny are obtained from <a href="https://www.ncbi.nlm.nih.gov/refseq/" target="_blank">RefSeq</a>
and <a href="https://www.ncbi.nlm.nih.gov/genbank/" target="_blank">GenBank</a>, and GTDB
releases are indexed to RefSeq releases, starting with release 76. Importantly and increasingly,
this dataset includes draft genomes of uncultured microorganisms obtained from metagenomes and single
cells, ensuring improved genomic representation of the microbial world. All genomes are independently
quality controlled using <a href="https://github.com/Ecogenomics/CheckM/wiki" target="_blank">CheckM</a> before inclusion in GTDB;
see statistics <NuxtLink :to="latestStatsPageUrl">here</NuxtLink>.
</p>

<p>
The GTDB taxonomy is based on genome trees inferred with <a
href="http://www.microbesonline.org/fasttree/" target="_blank">FastTree</a> from an aligned concatenated set of
120 single-copy marker proteins for Bacteria, and with <a href="http://www.iqtree.org/">IQ-TREE</a> from a concatenated set of 53
marker proteins (starting with R07-RS207; 122 prior to R07-RS207) for Archaea (download page <NuxtLink to="/downloads">here</NuxtLink>).
Additional marker sets, including concatenated ribosomal proteins and ribosomal RNA genes, are also used
to cross-validate tree topologies. <a href="https://www.ncbi.nlm.nih.gov/taxonomy" target="_blank">NCBI taxonomy</a> was initially used to decorate the genome tree via
<a href="http://tax2tree.sourceforge.net/" target="_blank">tax2tree</a> and subsequently used as a reference source of new taxonomic opinions including new names.
The 16S rRNA-based <a href="http://greengenes.secondgenome.com" target="_blank">Greengenes</a> and <a href="https://www.arb-silva.de/" target="_blank">SILVA</a> taxonomies were initially used to supplement the taxonomy,
particularly in regions of the tree with no cultured representatives; however, genome assembly
identifiers are now used to create placeholder names for uncultured taxa. <a href="https://lpsn.dsmz.de/" target="_blank">LPSN</a> is used as the
primary nomenclatural reference for establishing naming priorities and nomenclature types. All
taxonomic ranks except species are normalised using <a href="https://github.com/dparks1134/PhyloRank" target="_blank">PhyloRank</a> and the taxonomy is manually curated to
remove polyphyletic groups. Polyphyly and rank evenness can be visualised in PhyloRank <NuxtLink :to="latestStatsPageUrl">plots</NuxtLink>.
Species were initially delineated based on phylogeny and rank normalisation, but this approach was
replaced by an average nucleotide identity (ANI)-based method (starting with R04-RS89) to enable scalable and automated assignment of genomes to species clusters.
</p>

<p>
The GTDB taxonomy can be queried and downloaded through a number of tools on this website. New genomes can be classified within the GTDB framework using
<a href="https://github.com/Ecogenomics/GTDBTk" target="_blank">GTDB-Tk</a>.
</p>


<!-- The team -->
<div class="d-block mt-10 ">