meta_taraoceans

This repository is the result of a master's project in Bioinformatics focused on ocean metagenomics. The analysis uses data from the Tara Oceans Expedition and the open-source MEDUSA pipeline.

To reproduce the results, follow the steps in this README together with the README.md files in the individual directories of this repository.

Repository structure

doc: Tables used in the project.
- tables/original: Tables retrieved directly from the Tara Oceans website.
- tables/used: Tables containing only information about the samples selected for the present work.
data: All material developed in this project.
- pipeline_workflow: MEDUSA pipeline configuration and execution files.
- results: Code and results derived from the output of the MEDUSA pipeline. Each subfolder listed below contains the subfolders "biodiversity", "functional" and "taxonomic", each explained in its own README.md file.
  - results/code: All the code created to build tables, graphs and figures for interpreting the results.
  - results/input: Input files for analyzing the results. These files are the MEDUSA pipeline output files.
  - results/output: Output files with the results obtained in the analyses, such as tables, graphs and figures.
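After cloning, it can be useful to confirm that the directories described above are in place before running anything. The following is a minimal sketch, not part of the repository: the function name check_layout is hypothetical, and the paths assume that "tables/original" and "tables/used" sit under doc and that "pipeline_workflow" and "results" sit under data.

```shell
# Hypothetical helper: report which of the documented directories are missing.
# Usage: check_layout /path/to/meta_taraoceans
check_layout() {
    repo_root="$1"
    missing=0
    for dir in \
        doc/tables/original \
        doc/tables/used \
        data/pipeline_workflow \
        data/results/code \
        data/results/input \
        data/results/output
    do
        # Print each expected directory that does not exist under repo_root.
        if [ ! -d "${repo_root}/${dir}" ]; then
            echo "missing: ${dir}"
            missing=1
        fi
    done
    # Exit status 0 when everything is present, 1 otherwise.
    return "${missing}"
}
```

The function prints nothing and returns 0 when the layout matches the description above.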

Preparation of the working environment

We used MEDUSA, a tool for metagenomic analysis. The code in this project follows the step-by-step guide in the Supplementary Material of the MEDUSA paper.

Go to data/pipeline_workflow/00-default

cd meta_taraoceans/data/pipeline_workflow/00-default

Adjust the paths of the variables in 00-var.sh to match your local machine

nano 00-var.sh

Follow the steps in 00-environment.sh

Pipeline execution

Activate the environment

conda activate medusaPipeline

Execute each step of the pipeline in order. Note: start a step only after the previous one has finished.

nohup bash 01-raw_data.sh &
nohup bash 02-preprocessing.sh &
nohup bash 03-alignment.sh &
nohup bash 04-taxonomic_classification.sh &
nohup bash 05-functional_annotation.sh &
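Because each step must finish before the next one starts, the steps can also be chained in a single wrapper instead of being launched by hand. The following is a minimal sketch under stated assumptions: the function run_steps is hypothetical (not part of MEDUSA or this repository), the steps are assumed to be bash scripts in the current directory, and each step's output is written to a log file named after the script.

```shell
# Hypothetical wrapper: run the given scripts one after another and stop
# at the first one that exits with a non-zero status.
run_steps() {
    for step in "$@"; do
        log="${step%.sh}.log"
        echo "Starting ${step}"
        # Assumption: every step is a bash script; its output goes to the log.
        if ! bash "${step}" > "${log}" 2>&1; then
            echo "Step ${step} failed; see ${log}" >&2
            return 1
        fi
        echo "Finished ${step}"
    done
}

# Example call, left commented out so that sourcing this file has no side effects:
# run_steps 01-raw_data.sh 02-preprocessing.sh 03-alignment.sh \
#     04-taxonomic_classification.sh 05-functional_annotation.sh
```

To keep the whole chain running after logging out, the function can be saved in a script with the example call enabled and that script started once with nohup.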

Results

Taxonomic Classification

cd meta_taraoceans/data/results/code/taxonomic
Rscript taxonomic.R

Biodiversity

cd meta_taraoceans/data/results/code/biodiversity
Rscript biodiversity.R

Functional Annotation

cd meta_taraoceans/data/results/code/functional
Rscript functional.R
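The three analyses above follow the same pattern, so they can be run with one loop. This is a sketch rather than a script shipped with the repository: the function run_analyses and the RSCRIPT override variable are hypothetical, and running the analyses in the order listed above is an assumption, since this README does not state whether they depend on one another.

```shell
# Hypothetical helper: run taxonomic.R, biodiversity.R and functional.R
# from their own directories. Usage: run_analyses /path/to/meta_taraoceans
run_analyses() {
    repo_root="$1"
    # RSCRIPT is an assumed override; it defaults to the Rscript executable.
    runner="${RSCRIPT:-Rscript}"
    for analysis in taxonomic biodiversity functional; do
        # The subshell keeps the directory change local to one analysis.
        ( cd "${repo_root}/data/results/code/${analysis}" \
            && "${runner}" "${analysis}.R" ) || return 1
    done
}
```

The loop stops at the first analysis that fails, so a non-zero exit status means at least one script did not complete.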

For more information about the MEDUSA pipeline, check:

Morais, D.A.A.; Cavalcante, J.V.F.; Monteiro, S.S.; Pasquali, M.A.B.; Dalmolin, R.J.S. MEDUSA: A Pipeline for Sensitive Taxonomic Classification and Flexible Functional Annotation of Metagenomic Shotgun Sequences, 2022. https://doi.org/10.3389/fgene.2022.814437.
