rikenbit/mesh-workflow

mesh-workflow

Workflow to build the SQLite files of MeSH.db, MeSH.AOR.db, MeSH.PCR.db, and MeSH.XXX.eg.db-type packages.

Pre-requisites

  • Bash: GNU bash, version 4.2.46(1)-release (x86_64-redhat-linux-gnu)
  • Snakemake: 6.0.5
  • Singularity: 3.5.3

Summary

How to reproduce this workflow

1. Configuration

  • NCBI API Key: For details, see A General Introduction to the E-utilities.
  • USEARCH: This workflow needs the ublast command. Download and install it from the USEARCH download page.
  • config.yaml:
    • UBLAST_PATH: Set the path where you installed USEARCH
    • THIS_YEAR: Update when the year changes
    • METADATA_VERSION: Increment for each update (v001 -> v002 -> ... and so on)
    • MESH_VERSION: Update as needed (check the latest NLM MeSH)
    • BIOC_VERSION: Set the next version of Bioconductor
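
Before starting a long run, it can help to confirm that the two external requirements above are in place. The following is a minimal sketch, not part of the repository: `preflight` is a hypothetical helper, and it assumes that UBLAST_PATH points at the USEARCH executable itself, so you pass that same path as the argument.

```bash
# Hypothetical pre-run check (not part of this repository).
# Usage: preflight /path/to/usearch   (assumed to be the value of UBLAST_PATH)
preflight() {
  local ublast_path="$1"
  local status=0

  # The download step reads the key from the environment.
  if [ -z "${NCBI_API_KEY:-}" ]; then
    echo "NCBI_API_KEY is not set" >&2
    status=1
  fi

  # The ublast step needs a runnable USEARCH binary.
  if [ ! -x "$ublast_path" ]; then
    echo "not executable: $ublast_path" >&2
    status=1
  fi

  return "$status"
}
```

The function reports every problem it finds instead of stopping at the first one, and returns a non-zero status if any check failed.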

2. Run the snakemake commands

The workflow consists of seven snakemake workflows.

On a local machine:

export NCBI_API_KEY=ABCDE12345 # Your API Key
snakemake -s workflow/download.smk -j 4 --use-singularity
snakemake -s workflow/ublast.smk -j 4 --use-singularity
snakemake -s workflow/preprocess.smk -j 4 --use-singularity
snakemake -s workflow/categorize.smk -j 4 --use-singularity
snakemake -s workflow/sqlite.smk -j 4 --use-singularity
snakemake -s workflow/metadata.smk -j 4 --use-singularity
snakemake -s workflow/plot.smk -j 4 --use-singularity
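
The seven commands above can also be driven from a loop that stops at the first failure. This is a sketch under the assumption that the workflows are meant to run in the order listed; `run_all` and the `SNAKEMAKE` override variable are hypothetical names introduced here for illustration.

```bash
# The seven workflow files, in the order listed above.
STEPS=(download ublast preprocess categorize sqlite metadata plot)

# Hypothetical wrapper: run each workflow in turn, stop at the first failure.
# Usage: run_all [jobs]   (jobs defaults to 4, as in the commands above)
# SNAKEMAKE is an assumed override hook, e.g. SNAKEMAKE=echo to preview commands.
run_all() {
  local jobs="${1:-4}"
  local step
  for step in "${STEPS[@]}"; do
    echo "==> workflow/${step}.smk" >&2
    "${SNAKEMAKE:-snakemake}" -s "workflow/${step}.smk" -j "$jobs" --use-singularity || {
      echo "workflow/${step}.smk failed; stopping" >&2
      return 1
    }
  done
}
```

After exporting NCBI_API_KEY, calling `run_all 4` issues the same commands as the listing; `SNAKEMAKE=echo run_all 4` only prints them.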

In a parallel environment (GridEngine):

export NCBI_API_KEY=ABCDE12345
snakemake -s workflow/download.smk -j 4 --cluster "qsub -l nc=4 -p -50 -r yes -q node.q" --latency-wait 600 --use-singularity
snakemake -s workflow/ublast.smk -j 96 --cluster "qsub -l nc=4 -p -50 -r yes -q node.q" --latency-wait 600 --use-singularity
snakemake -s workflow/preprocess.smk -j 96 --cluster "qsub -l nc=4 -p -50 -r yes -q node.q" --latency-wait 600 --use-singularity
snakemake -s workflow/categorize.smk -j 96 --cluster "qsub -l nc=4 -p -50 -r yes -q node.q" --latency-wait 600 --use-singularity
snakemake -s workflow/sqlite.smk -j 96 --cluster "qsub -l nc=4 -p -50 -r yes -q node.q" --latency-wait 600 --use-singularity
snakemake -s workflow/metadata.smk -j 96 --cluster "qsub -l nc=4 -p -50 -r yes -q node.q" --latency-wait 600 --use-singularity
snakemake -s workflow/plot.smk -j 96 --cluster "qsub -l nc=4 -p -50 -r yes -q node.q" --latency-wait 600 --use-singularity

In a parallel environment (Slurm):

export NCBI_API_KEY=ABCDE12345
snakemake -s workflow/download.smk -j 4 --cluster "sbatch -n 4 --nice=50 --requeue -p node03-06" --latency-wait 600 --use-singularity
snakemake -s workflow/ublast.smk -j 96 --cluster "sbatch -n 4 --nice=50 --requeue -p node03-06" --latency-wait 600 --use-singularity
snakemake -s workflow/preprocess.smk -j 96 --cluster "sbatch -n 4 --nice=50 --requeue -p node03-06" --latency-wait 600 --use-singularity
snakemake -s workflow/categorize.smk -j 96 --cluster "sbatch -n 4 --nice=50 --requeue -p node03-06" --latency-wait 600 --use-singularity
snakemake -s workflow/sqlite.smk -j 96 --cluster "sbatch -n 4 --nice=50 --requeue -p node03-06" --latency-wait 600 --use-singularity
snakemake -s workflow/metadata.smk -j 96 --cluster "sbatch -n 4 --nice=50 --requeue -p node03-06" --latency-wait 600 --use-singularity
snakemake -s workflow/plot.smk -j 96 --cluster "sbatch -n 4 --nice=50 --requeue -p node03-06" --latency-wait 600 --use-singularity
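
The GridEngine and Slurm listings differ only in the string passed to --cluster, and in both the download step uses -j 4 while the other steps use -j 96. A sketch that factors this out is shown below; `run_on_cluster` and the `SNAKEMAKE` override variable are hypothetical names, and the queue and partition names in the usage lines are the site-specific ones from the listings above.

```bash
# Hypothetical wrapper: run all seven workflows with a given submit command.
# Usage: run_on_cluster "<submit command>"
# SNAKEMAKE is an assumed override hook, e.g. SNAKEMAKE=echo to preview commands.
run_on_cluster() {
  local submit="$1"
  local step jobs
  for step in download ublast preprocess categorize sqlite metadata plot; do
    # The listings above use 4 jobs for download and 96 for every other step.
    if [ "$step" = "download" ]; then
      jobs=4
    else
      jobs=96
    fi
    "${SNAKEMAKE:-snakemake}" -s "workflow/${step}.smk" -j "$jobs" \
      --cluster "$submit" --latency-wait 600 --use-singularity || return 1
  done
}

# GridEngine: run_on_cluster "qsub -l nc=4 -p -50 -r yes -q node.q"
# Slurm:      run_on_cluster "sbatch -n 4 --nice=50 --requeue -p node03-06"
```

Keeping the submit command as a single argument means only one line changes when moving between schedulers.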

License

Copyright (c) 2021 Koki Tsuyuzaki and RIKEN Bioinformatics Research Unit. Released under the Artistic License 2.0.

Authors

  • Koki Tsuyuzaki
  • Manabu Ishii
  • Itoshi Nikaido