PR2 version 5.0.0

@vaulot vaulot released this 06 Apr 11:16
· 35 commits to master since this release

Minor updates since release 5.0.0

2023-10-23 - emu database files added

  • A database file for emu has been added: pr2_version_5.0.0_emu_db.tar.gz. Emu can be used to obtain the composition of protist communities sequenced with long-read technologies such as Nanopore or PacBio.

2023-05-15 - version 5.0.1

The 5.0.0 assets have been updated without changing the version number

  • Sequences removed: 6
  • Sequences updated: 79 (mostly Burkholderia_sp. which were wrongly labelled)

Main changes

  • Upgrade taxonomy from 8 levels to 9 levels
  • Add link to EukRibo database (Berney et al. 2022)
  • Add link to Mixoplankton Database (Mitra et al. 2023)

Major groups for which taxonomy has been updated

  • Bacteria and Archaea
  • New organelle sequences (plastids and mitochondria)
  • Stramenopiles
    • Diatoms
    • Chrysophyceae
  • Alveolates
    • Ciliates
    • Dinoflagellates
    • Perkinsea
  • Fungi
    • Chytrids
  • Amoebozoa
    • Myxogastria
  • New supergroup added: Provora

Sequences

Changes

  • Added: 5,718
  • Taxonomy Updated: 17,954
  • Quarantined/Removed: 100

Detailed changes

Taxonomy structure

  • We moved from 8 levels (kingdom to species) to 9 levels (domain to species) with a new level subdivision. The changes are explained here.
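
To make the nine-level structure concrete, here is a minimal sketch that splits a semicolon-separated lineage into nine named ranks. Only "domain", "subdivision" and "species" are named in these notes; the other rank names, the separator, and the example lineage are assumptions made for illustration.

```r
# Nine ranks, from domain down to species.
# Only "domain", "subdivision" and "species" come from the release notes;
# the remaining rank names are assumptions for this sketch.
pr2_ranks <- c("domain", "supergroup", "division", "subdivision",
               "class", "order", "family", "genus", "species")

# Split a semicolon-separated lineage into a named character vector.
split_lineage <- function(lineage, ranks = pr2_ranks, sep = ";") {
  parts <- strsplit(lineage, sep, fixed = TRUE)[[1]]
  parts <- parts[nzchar(parts)]  # drop empty fields, if any
  if (length(parts) != length(ranks)) {
    stop("expected ", length(ranks), " ranks, got ", length(parts))
  }
  stats::setNames(parts, ranks)
}

# Hypothetical lineage with placeholder names, not a real PR2 entry.
lineage <- "D1;SG1;DV1;SD1;C1;O1;F1;G1;G1_sp."
taxo <- split_lineage(lineage)
taxo[["subdivision"]]
```

A lineage with only eight fields (the previous structure) would raise an error here, which is one simple way to catch files that mix the old and new formats.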

Taxonomic groups updated

Obazoa

  • Fungi
    • Microsporidia - Metchnikovellida:
      • Add 6 sequences
    • Chytrids
      • taxonomy completely revised, following in particular Tedersoo et al.: 191 taxa edited
      • Add 34 sequences

TSAR

  • Alveolata
    • Genus Alphamonas corrected from Aphamonas
    • Dinoflagellates
      • Suessiaceae, Borghiellaceae
      • Gonyaulacales
      • Kareniaceae, Warnowiaceae
    • Ciliates
      • Spirotrichea updated to follow EukRef annotations more closely. In particular:
        • added new sequences, made some corrections, and updated names (sensu Adl et al. 2019).
          • removed these artificial groups:
            • Leegardiellidae_A and _B: replaced by Leegardiellidae
            • Strobilidiidae_A to J: replaced by Strobilidiidae
            • Strombidiidae_A to R: replaced by Strombidiidae
            • Tontoniidae_A and B: replaced by Tontoniidae
            • Strombidiida_A to H: replaced by Strombidiida
          • For tintinnids, we included both the order and suborder (Choreotrichida-Tintinnina) in the order column (best compromise, hopefully acceptable)
      • Taxonomy of the following families updated
        • Discotrichidae
        • Plagiocampidae
        • Urotrichidae
        • Protocruziidae
    • Perkinsea
      • 10 sequences removed
      • 19 sequences reassigned
      • 614 new sequences added
  • Stramenopiles
    • Olisthodiscophyceae: new class created in Barcytė et al. (2021)
    • Pseudochattonella: spelling corrected.
    • Diatom taxonomy has been updated with the three recognized classes following AlgaeBase
      • Bacillariophyceae
      • Coscinodiscophyceae
      • Mediophyceae
      • Diatomea_X (class) is used for taxa that are not assigned to one of these three classes (e.g. Phaeodactylum)
    • Diatom genera Anaulus, Asterionellopsis, Ceratanaulus, Eunotogramma, Plagiogramma updated, plus 2 new sequences assigned
    • Chrysophyceae taxonomy completely revised following in particular Scoble and Cavalier-Smith 2014 and Charvet et al. 2012.
      • Sequences updated: 1577
      • Sequences added: 160
      • Taxa updated: 428
      • Taxa added: 46
  • Rhizaria
    • Radiolaria
      • Spumellaria: 67 new sequences annotated

Archaeplastida

  • Picozoa
    • Now classified within Archaeplastida.

Excavata

  • 14 percolomonad sequences added

Amoebozoa

  • Myxogastria: taxonomy updated

Provora

  • New supergroup added

Organelles

  • 16S plastid: 252 new sequences added (some are shorter than 500 bp)
  • 16S mitochondrion: 1818 sequences from original PR2 database
  • 18S nucleomorph: 12 sequences from original PR2 database

Bacteria, Archaea

  • Supergroup added
  • Cyanobacteria: supergroup replaced by Bacteria_X (previously Terrabacteria)

Link PR2 to other databases

EukRibo database version 2.0

See Berney et al. 2022.

  • Add sequences that are not present in PR2: 510
  • Add taxa that are not present in PR2: 938
  • Update sequences taxonomy for all sequences that had no species assigned: 1257
  • Fields from EukRibo added to PR2 metadata for 48,136 sequences
    • eukribo_UniEuk_taxonomy_string: taxonomy annotation from EukRibo (number of levels is variable)
    • eukribo_V4: whether the sequence contains the V4 region
    • eukribo_V9: whether the sequence contains the V9 region
  • See tutorial: EukRibo database

Mixoplankton database

See Mitra et al. 2023 (https://doi.org/10.1111/jeu.12972) and database at DOI 10.5281/zenodo.7560582

  • Added one column, mixoplankton, to the metadata
  • Use filter(!is.na(mixoplankton)) to select mixotrophic species
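
The filter expression above is dplyr syntax. Below is a minimal sketch of how it might be applied, using a small made-up data frame in place of the real PR2 metadata. Only the `mixoplankton` column name comes from these notes; the `species` column, all values, and the object names are placeholders.

```r
library(dplyr)

# Toy stand-in for the PR2 metadata table (all values are invented placeholders).
pr2_meta <- data.frame(
  species      = c("Species_a", "Species_b", "Species_c"),
  mixoplankton = c("type_1", NA, "type_2"),
  stringsAsFactors = FALSE
)

# Keep only rows where the mixoplankton column is filled in.
mixo <- pr2_meta %>%
  filter(!is.na(mixoplankton))

mixo$species
```

Rows with a missing `mixoplankton` value are dropped, so the result contains only the species that have an entry in that column.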

WoRMS database

The WoRMS database is an authoritative and comprehensive list of names of marine organisms, including information on synonymy. The content of WoRMS is controlled by taxonomic and thematic experts, not by database managers. For species in PR2 that have an entry in WoRMS, we have now added a link to the AphiaID (worms_id field) as well as information on the distribution of the species (marine, brackish, freshwater, terrestrial).

Sequences uploaded but not yet annotated

  • 35,884 18S rRNA sequences added from GenBank - 2021-03-23 to 2023-02-20
  • Only 17,504 of these 35,884 pass the current criteria for PR2 (length >= 500 bp, etc.)
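
As an illustration of the length criterion only, the sketch below keeps sequences of at least 500 bp from a toy table. The other PR2 criteria are not listed in these notes and are not implemented here; the identifiers and sequences are placeholders.

```r
# Toy sequences standing in for 18S rRNA sequences downloaded from GenBank.
seqs <- data.frame(
  accession = c("seq_1", "seq_2", "seq_3"),  # placeholder identifiers
  sequence  = c(strrep("ACGT", 150),         # 600 bp
                strrep("ACGT", 100),         # 400 bp
                strrep("ACGT", 125)),        # 500 bp
  stringsAsFactors = FALSE
)

min_length <- 500  # threshold stated in the notes

# Compute sequence lengths and keep those at or above the threshold.
seqs$length <- nchar(seqs$sequence)
passing <- seqs[seqs$length >= min_length, ]

passing$accession
```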

Sequences annotated automatically

  • 19,405 18S rRNA sequences from GenBank originating from strains corresponding to 2279 new species in taxonomy table

Sequences removed

  • Chimeras: 1259 from the initial version of PR2
  • AY745555.1.1854_U, AY745597.1.1844_U, EF209781.1.1956_U, EF209774.1.1835_U, EF209794.1.1834_U, which no longer exist in GenBank.
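
To check such identifiers against GenBank, it can help to extract the accession part. The sketch below assumes the identifiers follow an `accession.start.end_suffix` layout; that layout is inferred from the five identifiers listed above and is not documented in these notes.

```r
ids <- c("AY745555.1.1854_U", "AY745597.1.1844_U", "EF209781.1.1956_U",
         "EF209774.1.1835_U", "EF209794.1.1834_U")

# Assumed layout: accession, start, end, then a suffix after an underscore.
pattern <- "^([A-Za-z0-9]+)\\.([0-9]+)\\.([0-9]+)_([A-Za-z]+)$"
m <- regmatches(ids, regexec(pattern, ids))

# Each match holds the full string followed by the four captured groups.
parsed <- data.frame(
  accession = vapply(m, `[`, character(1), 2),
  start     = as.integer(vapply(m, `[`, character(1), 3)),
  end       = as.integer(vapply(m, `[`, character(1), 4)),
  stringsAsFactors = FALSE
)

parsed$accession
```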

Database structure changes

R package

Scripts

These scripts only illustrate some of the procedures used to update the PR2 database. Do not try to run them; they will not work because they require access to the MySQL PR2 database.

List of scripts

References

Taxonomy structure

  • Burki, Fabien, Andrew J. Roger, Matthew W. Brown, and Alastair G.B. Simpson. 2020. "The New Tree of Eukaryotes." Trends in Ecology and Evolution 35 (1): 43‑55. https://doi.org/10.1016/j.tree.2019.08.008.

Linked databases

  • Berney, Cédric, Nicolas Henry, Frédéric Mahé, Daniel J. Richter, and Colomban de Vargas. 2022. "EukRibo: A Manually Curated Eukaryotic 18S rDNA Reference Database to Facilitate Identification of New Diversity." Preprint. bioRxiv. https://doi.org/10.1101/2022.11.03.515105.
  • Mitra, Aditee, David A. Caron, Emile Faure, Kevin J. Flynn, Suzana Gonçalves Leles, Per J. Hansen, George B. McManus, et al. 2023. "The Mixoplankton Database – Diversity of Photo-Phago-Trophic Plankton in Form, Function and Distribution across the Global Ocean." Journal of Eukaryotic Microbiology: e12972. https://doi.org/10.1111/jeu.12972.

Excavata - Percolomonads

  • Hohlfeld, Manon, Claudia Meyer, Alexandra Schoenle, Frank Nitsche, and Hartmut Arndt. 2023. "Biogeography, Autecology, and Phylogeny of Percolomonads Based on Newly Described Species." Journal of Eukaryotic Microbiology 70 (1): e12930. https://doi.org/10.1111/jeu.12930.

TSAR - Stramenopiles

  • Barcytė, D., W. Eikrem, A. Engesmo, S. Seoane, J. Wohlmann, A. Horák, T. Yurchenko, and M. Eliáš. 2021. "Olisthodiscus Represents a New Class of Ochrophyta." Journal of Phycology 57 (4): 1094‑1118. https://doi.org/10.1111/jpy.13155.
  • Charvet, Sophie, Warwick F. Vincent, and Connie Lovejoy. 2012. "Chrysophytes and Other Protists in High Arctic Lakes: Molecular Gene Surveys, Pigment Signatures and Microscopy." Polar Biology 35 (5): 733‑48. https://doi.org/10.1007/s00300-011-1118-7.
  • Scoble, Josephine Margaret, and Thomas Cavalier-Smith. 2014. "Scale Evolution in Paraphysomonadida (Chrysophyceae): Sequence Phylogeny and Revised Taxonomy of Paraphysomonas, New Genus Clathromonas, and 25 New Species." European Journal of Protistology 50 (5): 551‑92. https://doi.org/10.1016/j.ejop.2014.08.001.

TSAR - Ciliates

  • Boscaro, Vittorio, Luciana F. Santoferrara, Qianqian Zhang, Eleni Gentekaki, Mitchell J. Syberg-Olsen, Javier del Campo, and Patrick J. Keeling. 2018. "EukRef-Ciliophora: A Manually Curated, Phylogeny-Based Database of Small Subunit rRNA Gene Sequences of Ciliates." Environmental Microbiology 20 (6): 2218‑30. https://doi.org/10.1111/1462-2920.14264.
  • Ganser, Maximilian H., Luciana F. Santoferrara, and Sabine Agatha. 2022. "Molecular Signature Characters Complement Taxonomic Diagnoses: A Bioinformatic Approach Exemplified by Ciliated Protists (Ciliophora, Oligotrichea)." Molecular Phylogenetics and Evolution 170 (May): 107433. https://doi.org/10.1016/j.ympev.2022.107433.

Provora

  • Tikhonenkov, Denis V., Kirill V. Mikhailov, Ryan M. R. Gawryluk, Artem O. Belyaev, Varsha Mathur, Sergey A. Karpov, Dmitry G. Zagumyonnyi, et al. 2022. "Microbial Predators Form a New Supergroup of Eukaryotes." Nature 612 (7941): 714‑19. https://doi.org/10.1038/s41586-022-05511-5.