SushiLab/magpipe

magpipe

Like a bagpipe, but for MAGs.

Code associated with the paper 'Biosynthetic potential of the global ocean microbiome' and the Ocean Microbiomics Database.

Link to the preprint and the paper. The Ocean Microbiomics Database is available here: https://www.microbiomics.io/ocean/.

This repo contains the code behind the snakemake pipeline used to generate and analyse the MAGs, along with the R code used for the figures.

Structure

  • configs: config files used by the launchers to run the pipeline.
  • envs: conda environments for the rules.
  • figures: R code behind the figures.
  • launchers: bash scripts to run the pipeline (on SGE, etc.).
  • magpipe: python module for the pipeline.
  • resources: some fixed input information on methods and datasets.
  • rules: the rules of the snakemake pipeline.
  • sandbox: scripts used for postprocessing and analyses.
  • scripts: ad-hoc python scripts used by the pipeline.
  • snakes: snakemake files.

Additional documentation of the different metagenomic analysis steps is available here: https://methods-in-microbiomics.readthedocs.io/en/latest/.

Running the pipeline:

Installation

We highly recommend using conda to create a dedicated environment, as follows:

conda create -n magpipe
conda activate magpipe
conda install snakemake

For the pipeline to work, you need to install the python module as follows:

git clone git@github.com:SushiLab/magpipe.git
cd magpipe/magpipe
python -m pip install -r requirements.txt -e .
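
To confirm that the editable install worked, a quick check from Python can help. The sketch below is only an illustration: it assumes the package is importable under the name `magpipe` (taken from the directory name listed under Structure, not confirmed elsewhere in this README), and `is_installed` is a hypothetical helper, not part of the pipeline.

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python can locate a top-level module with this name."""
    return importlib.util.find_spec(module_name) is not None


# Assumption: the pip install above exposes a module named "magpipe".
if is_installed("magpipe"):
    print("magpipe is importable in this environment")
else:
    print("magpipe was not found; check that the conda environment is active")
```

Run this inside the activated conda environment, since the install is only visible there.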

Configuration

You now need to set up a few things:

  • In resources, update the dataset table: for each metagenomic dataset, add its name and path, along with the paths to the corresponding assemblies and depth files.
  • In configs, specify the paths for the output, the snakemake workdir, and the fast drive (if appropriate).
  • Once everything is configured, start the pipeline from the snakemake command line.
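
Before launching, it may be worth checking that every path entered in the dataset table actually exists. The following is a minimal sketch under stated assumptions: it supposes the table is tab-separated with a header row, and the column names used here (`dataset`, `assemblies`, `depths`) are hypothetical placeholders, as is the helper `missing_paths`. Adapt them to the actual table in resources.

```python
import csv
from pathlib import Path


def missing_paths(table_file, path_columns, name_column="dataset", delimiter="\t"):
    """Return (name, column, path) for every listed path that does not exist.

    Column names are assumptions; pass the ones used in your own table.
    """
    missing = []
    with open(table_file, newline="") as handle:
        for row in csv.DictReader(handle, delimiter=delimiter):
            for column in path_columns:
                value = (row.get(column) or "").strip()
                # An empty cell is reported as missing as well.
                if not value or not Path(value).exists():
                    missing.append((row.get(name_column, ""), column, value))
    return missing


# Example call (hypothetical file and column names):
# for name, column, path in missing_paths("datasets.tsv", ["assemblies", "depths"]):
#     print(f"{name}: {column} path not found: {path!r}")
```

An empty result means every listed path was found; it does not check that the files have the content the pipeline expects.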