@@ -72,7 +72,7 @@ Shotgun sequencing
7272------------------
7373
7474Qiita currently has one active shotgun metagenomics data analysis pipeline: a per sample, paired-end
75- bowtie2 alignment step with Woltka classification using either the WoLr2 (default) or RS210 databases.
75+ bowtie2 alignment step with Woltka classification using either the WoLr2 (default) or RS225 databases.
7676Below you will find more information about each of these options.
7777
7878.. note ::
@@ -197,6 +197,23 @@ Note that some of these are legacy option but not available for new processing.
197197 - Genera: 6,811
198198 - Species: 12,258
199199
200+ #. RS225: Collection of reference microbial genomes sampled from the NCBI
201+ RefSeq genome database, as of 2024-08-01. This time point corresponds to
202+ RefSeq release 225. RS225 contains 40,987 genomes from NCBI RefSeq and
203+ 11,771 genomes from external sources. The total number of genomes, 52,758,
204+ represents a 78% increase from the previous version of the database.
205+
206+ - Total number of genomes: 52,758
207+ - Total length of genomes (after adding linkers): 170,326,480,530 bp
208+ - Number of genomes by category:
209+ - Archaea: 870
210+ - Bacteria: 32,894
211+ - Fungi: 610
212+ - Protozoa: 93
213+ - Viral: 18,279
214+ - SynDNA Constructs: 12
215+
216+
200217#. RS210: Collection of reference microbial genomes sampled from the NCBI RefSeq
201218 genome database, as of 2022-01-01. This time point corresponds to RefSeq
202219 release 210.
@@ -212,6 +229,7 @@ Note that some of these are legacy option but not available for new processing.
212229 - Protozoa: 93
213230 - Viral: 7,493
214231
232+
215233#. WoLr1 ("Web of Life" release 1): An even representation of microbial diversity, selected using a prototype
216234 selection algorithm based on the MinHash distance matrix among all non-redundant bacterial and archaeal genomes
217235 from NCBI (RefSeq and GenBank, complete and draft), plus several genome quality control criteria. A
@@ -236,6 +254,7 @@ Note that some of these are legacy option but not available for new processing.
236254 - Strains: 89
237255 - Note: Nucleotide sequences per genome were concatenated with a linker of 20 "N"s.
238256
257+
239258#. Rep200: NCBI representative and reference microbial genomes, corresponding to RefSeq release 200 (2020-05-14)
240259
241260 - Genomes: 11,955
@@ -249,6 +268,7 @@ Note that some of these are legacy option but not available for new processing.
249268 - Protozoa: 88
250269 - Viral: 48
251270
271+
252272#. Rep94: NCBI representative and reference microbial genomes, corresponding to RefSeq release 94.
253273
254274 - Domains: Bacteria, Archaea
@@ -266,6 +286,7 @@ Note that some of these are legacy option but not available for new processing.
266286 - Species: 5,636
267287 - Strains: 84
268288
289+
269290#. Rep82: NCBI representative and reference microbial genomes, corresponding to RefSeq release 82.
270291
271292 - Not available anymore for new processing
@@ -284,6 +305,7 @@ Note that some of these are legacy option but not available for new processing.
284305 - Species: 11,852
285306 - Strains: 4,263
286307
308+
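
The RS225 figures listed above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of Qiita or Woltka: the variable names are illustrative assumptions, and the numbers are copied from the RS225 entry. It checks that the per-category counts and the per-source counts both add up to the reported total.

```python
# Figures copied from the RS225 entry; all names here are illustrative only.
rs225_by_category = {
    "Archaea": 870,
    "Bacteria": 32_894,
    "Fungi": 610,
    "Protozoa": 93,
    "Viral": 18_279,
    "SynDNA Constructs": 12,
}
rs225_by_source = {"NCBI RefSeq": 40_987, "external sources": 11_771}
reported_total = 52_758

# Sum each breakdown independently, then compare both with the reported total.
category_total = sum(rs225_by_category.values())
source_total = sum(rs225_by_source.values())
totals_agree = category_total == source_total == reported_total

print(f"by category: {category_total:,}")
print(f"by source:   {source_total:,}")
print(f"consistent with reported total: {totals_agree}")
```

A check like this is cheap to rerun whenever the counts in a database entry are updated.
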
287309Metatranscriptome processing
288310----------------------------
289311