Supervisor: Bérénice Batut
For degree: Master
Status: Open
Keywords: Microbiome, Metagenomics, Galaxy, Assembly, Workflow, Benchmarking
Global Biological/Research context
The microbiome is the collection of all microbes, such as bacteria, fungi, and viruses, along with their genes, that live inside and outside our bodies and in all the environments surrounding us [1]. To investigate microbiomes, researchers use sequencing data and microbiome analyses [2]. Such analyses rely on sophisticated computational approaches: assembly, binning, taxonomic classification, functional profiling, etc. Analysing microbiome data makes it possible to answer the two main questions of most microbiome analyses:
who (microorganisms) are there: by extracting the community from the microbiome reads
what are they doing (and how): by extracting the gene/pathway abundance profile from the metagenomics reads and transcript abundance profiles from the metatranscriptomics reads and combining them
Microbiome sequencing data also make it possible to assemble genomes of organisms that cannot be cultivated individually (e.g. [3,4]). However, building genomes from metagenomics data (called Metagenome-Assembled Genomes, or MAGs) is complex: the data mix sequences from many organisms, and the process requires many steps [5,6] and substantial computational resources.
Few workflows to build MAGs from these data are available (e.g. [7,8]), and most are not openly available, not transparent, or not easy for researchers to use.
Project context
The Freiburg Galaxy team, together with the microGalaxy community, uses Galaxy [9] to build a MAG-building workflow that will be open, transparent, reusable, and accessible.
This workflow has been developed with data from the cloud environment. We would now like to adapt this workflow to data from other microbiome environments, evaluate it using benchmarking data, compare it against other workflows, and document and share it.
Objectives of the project
Evaluate the results of the workflow on the cloud data
Benchmark the workflow on the CAMI challenge benchmarking data [10]
Document and share the workflow
Annotate the workflow
Create the skeleton for a tutorial
Submit the workflow to IWC
Proposed agenda for the project
Bibliography of metagenomic assembly, MAG building, and existing workflows
Get familiar with the implemented MAGs building workflow
Create the skeleton of a tutorial explaining each step and selected parameters
Evaluate the results of the workflow on the cloud data
Aggregate and analyze the different generated quality metrics into a Jupyter notebook
Run extra steps to evaluate the quality of created MAGs
Benchmark the workflow on the CAMI challenge benchmarking data
Run the workflow on the different datasets from the CAMI challenge
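As an illustration of the "aggregate and analyze quality metrics" step in the agenda, here is a minimal Python sketch that could serve as the starting cell of such a notebook. Everything in it is an assumption made for illustration: the table layout, the column names (`bin_id`, `completeness`, `contamination`), and the example values are hypothetical and do not come from the actual workflow outputs. The tiers use only completeness and contamination cut-offs in the style of the MIMAG standard (above 90% and below 5% for "high", at least 50% and below 10% for "medium"), and ignore the rRNA and tRNA criteria that the full standard also requires.

```python
import csv
import io
from collections import Counter

# Hypothetical per-MAG quality table; columns and values are invented for illustration.
EXAMPLE_TSV = """bin_id\tcompleteness\tcontamination
bin_1\t96.2\t1.3
bin_2\t71.0\t4.8
bin_3\t55.4\t12.1
bin_4\t32.7\t0.9
"""


def quality_tier(completeness, contamination):
    """Assign a simplified quality tier from completeness and contamination (in %)."""
    if completeness > 90 and contamination < 5:
        return "high"
    if completeness >= 50 and contamination < 10:
        return "medium"
    return "low"


def summarize(tsv_text):
    """Count MAGs per quality tier in a tab-separated quality report."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    tiers = Counter()
    for row in reader:
        tier = quality_tier(float(row["completeness"]), float(row["contamination"]))
        tiers[tier] += 1
    return dict(tiers)


summary = summarize(EXAMPLE_TSV)
print(summary)
```

In a real notebook, the example string would be replaced by the quality reports produced by the workflow, and the same per-tier summary could be computed for each dataset (cloud data, CAMI datasets) to compare runs.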
References
[1] Blaser, Martin J. "The microbiome revolution." The Journal of Clinical Investigation (2014): 124.
[2] Sharpton, Thomas J. "An introduction to the analysis of shotgun metagenomic data." Frontiers in Plant Science 5 (2014): 209.
[3] Xie, Fei, et al. "An integrated gene catalog and over 10,000 metagenome-assembled genomes from the gastrointestinal microbiome of ruminants." Microbiome 9.1 (2021): 1-20
[4] Nishimura, Yosuke, and Susumu Yoshizawa. "The OceanDNA MAG catalog contains over 50,000 prokaryotic genomes originated from various marine environments." Scientific Data 9.1 (2022): 1-11.
[5] Chen LX, Anantharaman K, Shaiber A, Eren AM, Banfield JF (2020) Accurate and complete genomes from metagenomes. Genome Res 30(3):315–333
[6] Sieber CMK, Probst AJ, Sharrar A, Thomas BC, Hess M, Tringe SG, et al. Recovery of genomes from metagenomes via a dereplication, aggregation and scoring strategy. Nat Microbiol. 2018;3(7):836–43
[7] Kieser, Silas, et al. "ATLAS: a Snakemake workflow for assembly, annotation, and genomic binning of metagenome sequence data." BMC bioinformatics 21.1 (2020): 1-8.
[8] Raguideau, Sebastien, et al. "Novel microbial syntrophies identified by longitudinal metagenomics." bioRxiv (2021).
[9] Enis Afgan, et al. The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2018 update, Nucleic Acids Research, Volume 46, Issue W1, 2 July 2018, Pages W537–W544, doi:10.1093/nar/gky379
[10] Meyer, Fernando, et al. "Critical Assessment of Metagenome Interpretation: the second round of challenges." Nature methods 19.4 (2022): 429-440.