Issue 129 #151

Merged 11 commits on Oct 22, 2020
Changes from 8 commits
2 changes: 2 additions & 0 deletions .travis.yml
@@ -13,6 +13,7 @@ install:
- docker build -t abricate:1.0.0 abricate/1.0.0
- docker build -t ariba:2.14.4 ariba/2.14.4
- docker build -t ivar:1.2.2_artic20200528 ivar/1.2.2_artic20200528
- docker build -t metaphlan:3.0.3 metaphlan/3.0.3
script:
- bash tests/mash-test.sh
- bash tests/mashtree-test.sh
@@ -23,3 +24,4 @@ script:
- bash tests/abricate.sh
- bash tests/ariba.sh
- bash tests/ivar.sh
- bash tests/metaphlan.sh
1 change: 1 addition & 0 deletions Program_Licenses.md
@@ -37,6 +37,7 @@ The licenses of the open-source software that is contained in these Docker image
| Mash | non-standard license (see link) | https://github.com/marbl/Mash/blob/master/LICENSE.txt |
| mashtree | GNU GPLv3 | https://github.com/lskatz/mashtree/blob/master/LICENSE |
| Medaka | Mozilla Public License 2.0 | https://github.com/nanoporetech/medaka/blob/master/LICENSE.md |
| MetaPhlAn | non-standard license (see link) | https://github.com/biobakery/MetaPhlAn/blob/3.0/license.txt |
| minimap2 | MIT | https://github.com/lh3/minimap2/blob/master/LICENSE.txt |
| mlst | GNU GPLv2 | https://github.com/tseemann/mlst/blob/master/LICENSE |
| Mugsy | Artistic License 2.0 | Archived in: <br/> https://sourceforge.net/projects/mugsy/files/mugsy_x86-64-v1r2.3.tgz |
1 change: 1 addition & 0 deletions README.md
@@ -50,6 +50,7 @@ For many people Docker is not an option, but Singularity is. Most Docker contain
| [Mash](https://hub.docker.com/r/staphb/mash/) <br/> [![docker pulls](https://img.shields.io/docker/pulls/staphb/mash.svg?style=popout)](https://hub.docker.com/r/staphb/mash) | <ul><li>2.1</li><li>2.2</li></ul> | https://github.com/marbl/Mash |
| [mashtree](https://hub.docker.com/r/staphb/mashtree) <br/> [![docker pulls](https://img.shields.io/docker/pulls/staphb/mashtree.svg?style=popout)](https://hub.docker.com/r/staphb/mashtree) | <ul><li>0.52.0</li><li>0.57.0</li><li>1.0.4</li><li>1.2.0</li></ul> | https://github.com/lskatz/mashtree |
| [medaka](https://hub.docker.com/r/staphb/medaka) <br/> [![docker pulls](https://img.shields.io/docker/pulls/staphb/medaka.svg?style=popout)](https://hub.docker.com/r/staphb/medaka) | <ul><li>0.8.1</li><li>1.0.1</li></ul> | https://github.com/nanoporetech/medaka |
| [metaphlan](https://hub.docker.com/r/staphb/metaphlan) <br/> [![docker pulls](https://img.shields.io/docker/pulls/staphb/metaphlan.svg?style=popout)](https://hub.docker.com/r/staphb/metaphlan) | <ul><li>3.0.3-no-db (no database)</li><li>3.0.3 (~3GB db)</li></ul> | https://github.com/biobakery/MetaPhlAn/tree/3.0 |
| [minimap2](https://hub.docker.com/r/staphb/minimap2) <br/> [![docker pulls](https://img.shields.io/docker/pulls/staphb/minimap2.svg?style=popout)](https://hub.docker.com/r/staphb/minimap2) | <ul><li>2.17</li></ul> | https://github.com/lh3/minimap2 |
| [mlst](https://hub.docker.com/r/staphb/mlst) <br/> [![docker pulls](https://img.shields.io/docker/pulls/staphb/mlst.svg?style=popout)](https://hub.docker.com/r/staphb/mlst) | <ul><li>2.16.2</li><li>2.17.6</li><li>2.19.0</li></ul> | https://github.com/tseemann/mlst |
| [Mugsy](https://hub.docker.com/r/staphb/mugsy) <br/> [![docker pulls](https://img.shields.io/docker/pulls/staphb/mugsy.svg?style=popout)](https://hub.docker.com/r/staphb/mugsy) | <ul><li>1r2.3</li></ul> | http://mugsy.sourceforge.net/ |
91 changes: 91 additions & 0 deletions metaphlan/3.0.3-no-db/Dockerfile
@@ -0,0 +1,91 @@
FROM ubuntu:bionic AS builder_metaphlan

# multistage build
# labels associated with final docker image are further down
# label the intermediate image so we can delete later
LABEL stage=builder_metaphlan_nodb

# install python (>3.6) and other dependencies
# R necessary if user wants to run unifrac function in metaphlan

RUN apt-get update && apt-get install -y software-properties-common && \
add-apt-repository ppa:deadsnakes/ppa && \
apt-get update && apt-get install -y --no-install-recommends --no-install-suggests \
gcc \
wget \
python3.7 \
python3.7-dev \
python3-distutils \
python3-setuptools \
python3-pip \
unzip \
r-base=3.4.4-1ubuntu1 && \
python3.7 -m pip install pip --force-reinstall && \
python3.7 -m pip install numpy Cython six --force-reinstall && \
ln -s /usr/bin/python3.7 /usr/bin/python

# bowtie 2 dependency
RUN mkdir /usr/bin/bowtie2 && \
cd /usr/bin/bowtie2 && \
wget https://sourceforge.net/projects/bowtie-bio/files/bowtie2/2.3.3.1/bowtie2-2.3.3.1-linux-x86_64.zip/download && \
unzip download && \
rm download

ENV PATH="$PATH:/usr/bin/bowtie2/bowtie2-2.3.3.1-linux-x86_64" \
LC_ALL=C

# install metaphlan 3
RUN python3.7 -m pip install metaphlan==3.0.3 && \
mkdir /data

# don't install metaphlan database, user will do. see README.md
# RUN metaphlan --install

# build onto first stage
# this will make the final docker image ~0.5 GB smaller
# after the build completes, recommend deleting the intermediate image generated from builder stage
# can do this by filtering on the stage label with the --filter flag:
# docker image ls --filter "label=stage=builder_metaphlan_nodb"
# docker image prune --filter "label=stage=builder_metaphlan_nodb"

FROM ubuntu:bionic

# labels for final image
LABEL base.image="ubuntu:bionic"
LABEL dockerfile.version="1"
LABEL software="MetaPhlAn3.0"
LABEL software.version="3.0.3-no-db"
LABEL description="microbial composition of metagenomes"
LABEL website="https://github.com/biobakery/MetaPhlAn/tree/3.0"
LABEL maintainer="Tara Gallagher"
LABEL maintainer.email="tgallagher@utah.gov"

# copy over necessary bin and packages

COPY --from=builder_metaphlan /usr/local/bin /usr/local/bin
COPY --from=builder_metaphlan /usr/bin/bowtie2/ /usr/bin/bowtie2/
COPY --from=builder_metaphlan /usr/bin/python3.7 /usr/bin/python3.7
COPY --from=builder_metaphlan /usr/lib/python3.7 /usr/lib/python3.7
COPY --from=builder_metaphlan /usr/lib/x86_64-linux-gnu/libexpat.so /usr/lib/x86_64-linux-gnu/libexpat.so
COPY --from=builder_metaphlan /lib/x86_64-linux-gnu/ /lib/x86_64-linux-gnu/
COPY --from=builder_metaphlan /usr/local/lib/python3.7/dist-packages/ /usr/local/lib/python3.7/dist-packages/
COPY --from=builder_metaphlan /usr/lib/python3/dist-packages/* /usr/lib/python3.7/dist-packages/
COPY --from=builder_metaphlan /usr/bin/R /usr/bin/R
COPY --from=builder_metaphlan /usr/bin/Rscript /usr/bin/Rscript
COPY --from=builder_metaphlan /usr/lib/R /usr/lib/R
COPY --from=builder_metaphlan /usr/local/lib/R /usr/local/lib/R
COPY --from=builder_metaphlan /etc/R /etc/R
COPY --from=builder_metaphlan /usr/lib/libR.so /usr/lib/libR.so

ENV PATH="$PATH:/usr/bin/bowtie2/bowtie2-2.3.3.1-linux-x86_64" \
LC_ALL=C

WORKDIR /data


# make dir for metaphlan database
RUN mkdir /usr/local/lib/python3.7/dist-packages/metaphlan/metaphlan_databases && \
ln -s /usr/bin/python3.7 /usr/bin/python

# check locale settings in env by running perl
RUN perl -e 'print'
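
The cleanup recommended in the Dockerfile comments above could be scripted roughly as follows. This is a minimal sketch, not part of the PR: the function names `run_or_print` and `build_and_prune` and the `DRY_RUN` switch are hypothetical. Only the `docker build -t` and `docker image prune --force --filter` invocations come from the Docker CLI.

```shell
# Hypothetical helper (not part of this PR): build an image, then prune the
# leftover builder-stage image by the "stage" label set in the Dockerfile.
# With DRY_RUN=1 the docker commands are printed instead of executed.

run_or_print() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

build_and_prune() {
    # $1 = image tag, $2 = build context directory, $3 = value of the stage label
    run_or_print docker build -t "$1" "$2" &&
    run_or_print docker image prune --force --filter "label=stage=$3"
}
```

For example, `DRY_RUN=1 build_and_prune metaphlan:3.0.3-no-db metaphlan/3.0.3-no-db builder_metaphlan_nodb` prints the two commands without running them. Whether an intermediate image is left behind at all depends on the builder in use, so the prune step may be a no-op on some setups.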
102 changes: 102 additions & 0 deletions metaphlan/3.0.3-no-db/README.md
@@ -0,0 +1,102 @@
# MetaPhlAn3 docker image

Main tool: [MetaPhlAn/3.0](https://github.com/biobakery/MetaPhlAn/tree/3.0)

This docker image contains the metaphlan3 program and its dependencies. It does not contain the metaphlan3 database: you must download the database to your machine and mount it into the docker container (see below).

## Example Usage: Docker

Download docker image:
```bash
$ docker pull staphb/metaphlan:3.0.3-no-db
```

Example usage: downloading the database and running metaphlan in docker

```bash

# pull the metaphlan docker image
$ docker pull staphb/metaphlan:3.0.3-no-db

# this image requires the database to be downloaded to your machine
# download the following files from metaphlan's dropbox or google drive (see https://github.com/biobakery/MetaPhlAn/wiki/MetaPhlAn-3.0#installation)
# recommend putting them in their own directory, e.g. one named "metaphlan_database"
# file_list.txt
# mpa_latest
# mpa_v30_CHOCOPhlAn_201901.tar (or whichever version of database you prefer)
# mpa_v30_CHOCOPhlAn_201901_marker_info.txt.bz2
# mpa_v30_CHOCOPhlAn_201901.md5


# extract database
$ tar -xvf mpa_v30_CHOCOPhlAn_201901.tar
$ bunzip2 mpa_v30_CHOCOPhlAn_201901.fna.bz2
$ ls
file_list.txt
mpa_latest
mpa_v30_CHOCOPhlAn_201901.fna
mpa_v30_CHOCOPhlAn_201901_marker_info.txt.bz2
mpa_v30_CHOCOPhlAn_201901.md5
mpa_v30_CHOCOPhlAn_201901.pkl
mpa_v30_CHOCOPhlAn_201901.tar

# next, use bowtie2 to build and index the database. can use the metaphlan docker for this!
# note this will take ~15 minutes
# change directory to directory with database files
$ cd ./metaphlan_database/
$ docker run -v "$PWD":/usr/local/lib/python3.7/dist-packages/metaphlan/metaphlan_databases/ -u $(id -u):$(id -g) \
--rm \
staphb/metaphlan:3.0.3-no-db \
bowtie2-build /usr/local/lib/python3.7/dist-packages/metaphlan/metaphlan_databases/mpa_v30_CHOCOPhlAn_201901.fna /usr/local/lib/python3.7/dist-packages/metaphlan/metaphlan_databases/mpa_v30_CHOCOPhlAn_201901
# after bowtie2-build completes, should have indexed files in your metaphlan_database directory
$ ls
file_list.txt
mpa_latest
mpa_v30_CHOCOPhlAn_201901.1.bt2
mpa_v30_CHOCOPhlAn_201901.2.bt2
mpa_v30_CHOCOPhlAn_201901.3.bt2
mpa_v30_CHOCOPhlAn_201901.4.bt2
mpa_v30_CHOCOPhlAn_201901.fna
mpa_v30_CHOCOPhlAn_201901_marker_info.txt.bz2
mpa_v30_CHOCOPhlAn_201901.md5
mpa_v30_CHOCOPhlAn_201901.pkl
mpa_v30_CHOCOPhlAn_201901.rev.1.bt2
mpa_v30_CHOCOPhlAn_201901.rev.2.bt2
mpa_v30_CHOCOPhlAn_201901.tar

# to run metaphlan
# for this example, stool metagenomes downloaded from SRA
# if you have SRA toolkit, can do:
# run this from the directory that contains metaphlan_database
$ cd ..
$ fastq-dump --outdir ./data --skip-technical --readids --split-files --clip SRX2474191
$ ls data
SRX2474191_1.fastq

# in this example, 2 directories to mount: ./data contains the .fastq and ./metaphlan_database contains the indexed database
# check to make sure these 2 directories are in current working directory
$ ls
data metaphlan_database

# run docker:
$ docker run -v "$PWD"/metaphlan_database:/usr/local/lib/python3.7/dist-packages/metaphlan/metaphlan_databases/ \
-v "$PWD"/data:/data \
-u $(id -u):$(id -g) \
--rm \
staphb/metaphlan:3.0.3-no-db metaphlan /data/SRX2474191_1.fastq --input_type fastq -o profiled_metagenome.txt


# OUTPUT:
WARNING: The metagenome profile contains clades that represent multiple species merged into a single representant.
An additional column listing the merged species is added to the MetaPhlAn output.
$ head data/profiled_metagenome.txt
#mpa_v30_CHOCOPhlAn_201901
#/usr/local/bin/metaphlan /data/SRX2474191_1.fastq --input_type fastq -o profiled_metagenome.txt
#SampleID Metaphlan_Analysis
#clade_name NCBI_tax_id relative_abundance additional_species
k__Bacteria 2 100.0
k__Bacteria|p__Bacteroidetes 2|976 52.90346
k__Bacteria|p__Firmicutes 2|1239 45.08223
k__Bacteria|p__Actinobacteria 2|201174 2.01431
k__Bacteria|p__Bacteroidetes|c__Bacteroidia 2|976|200643 52.90346
k__Bacteria|p__Firmicutes|c__Clostridia 2|1239|186801 45.08223

```
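
Before mounting the database directory, it can help to confirm that the indexing step produced every file shown in the listing above. The following is a minimal sketch under that assumption: the function name `check_metaphlan_db` is hypothetical, and the list of required files is taken from the example `ls` output in this README, not from MetaPhlAn's documentation.

```shell
# Hypothetical pre-flight check (not part of this PR): confirm that a directory
# holds the .pkl file and the six bowtie2 index files listed in the README.
check_metaphlan_db() (
    # $1 = database directory, $2 = index prefix (defaults to the version used above)
    dir=$1
    prefix=${2:-mpa_v30_CHOCOPhlAn_201901}
    missing=0
    for suffix in pkl 1.bt2 2.bt2 3.bt2 4.bt2 rev.1.bt2 rev.2.bt2; do
        if [ ! -f "$dir/$prefix.$suffix" ]; then
            echo "missing: $dir/$prefix.$suffix" >&2
            missing=1
        fi
    done
    # exit status 0 only when nothing is missing
    [ "$missing" -eq 0 ]
)
```

A call such as `check_metaphlan_db ./metaphlan_database && echo "database looks complete"` reports any missing file on stderr and returns a non-zero status, so it can gate the `docker run` step in a wrapper script.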

88 changes: 88 additions & 0 deletions metaphlan/3.0.3/Dockerfile
@@ -0,0 +1,88 @@
FROM ubuntu:bionic AS builder_metaphlan

# multistage build
# labels associated with final docker image are further down
# label the intermediate image so we can delete later
LABEL stage=builder_metaphlan

# install python (>3.6) and other dependencies
# R necessary if user wants to run unifrac function in metaphlan

RUN apt-get update && apt-get install -y software-properties-common && \
add-apt-repository ppa:deadsnakes/ppa && \
apt-get update && apt-get install -y --no-install-recommends --no-install-suggests \
gcc \
wget \
python3.7 \
python3.7-dev \
python3-distutils \
python3-setuptools \
python3-pip \
unzip \
r-base=3.4.4-1ubuntu1 && \
python3.7 -m pip install pip --force-reinstall && \
python3.7 -m pip install numpy Cython six --force-reinstall && \
ln -s /usr/bin/python3.7 /usr/bin/python

# bowtie 2 dependency
RUN mkdir /usr/bin/bowtie2 && \
cd /usr/bin/bowtie2 && \
wget https://sourceforge.net/projects/bowtie-bio/files/bowtie2/2.3.3.1/bowtie2-2.3.3.1-linux-x86_64.zip/download && \
unzip download && \
rm download

ENV PATH="$PATH:/usr/bin/bowtie2/bowtie2-2.3.3.1-linux-x86_64" \
LC_ALL=C

# install metaphlan 3
RUN python3.7 -m pip install metaphlan==3.0.3

# download metaphlan database
RUN metaphlan --install && \
mkdir /data

# build onto first stage
# this will make the final docker image ~0.5 GB smaller
# after the build completes, recommend deleting the intermediate image generated from builder stage
# can do this by filtering on the stage label with the --filter flag:
# docker image ls --filter "label=stage=builder_metaphlan"
# docker image prune --filter "label=stage=builder_metaphlan"

FROM ubuntu:bionic

# labels for final image
LABEL base.image="ubuntu:bionic"
LABEL dockerfile.version="1"
LABEL software="MetaPhlAn3.0"
LABEL software.version="3.0.3"
LABEL description="microbial composition of metagenomes"
LABEL website="https://github.com/biobakery/MetaPhlAn/tree/3.0"
LABEL maintainer="Tara Gallagher"
LABEL maintainer.email="tgallagher@utah.gov"

# copy over necessary bin and packages

COPY --from=builder_metaphlan /usr/local/bin /usr/local/bin
COPY --from=builder_metaphlan /usr/bin/bowtie2/ /usr/bin/bowtie2/
COPY --from=builder_metaphlan /usr/bin/python3.7 /usr/bin/python3.7
COPY --from=builder_metaphlan /usr/lib/python3.7 /usr/lib/python3.7
COPY --from=builder_metaphlan /usr/lib/x86_64-linux-gnu/libexpat.so /usr/lib/x86_64-linux-gnu/libexpat.so
COPY --from=builder_metaphlan /lib/x86_64-linux-gnu/ /lib/x86_64-linux-gnu/
COPY --from=builder_metaphlan /usr/local/lib/python3.7/dist-packages/ /usr/local/lib/python3.7/dist-packages/
COPY --from=builder_metaphlan /usr/lib/python3/dist-packages/* /usr/lib/python3.7/dist-packages/
COPY --from=builder_metaphlan /usr/bin/R /usr/bin/R
COPY --from=builder_metaphlan /usr/bin/Rscript /usr/bin/Rscript
COPY --from=builder_metaphlan /usr/lib/R /usr/lib/R
COPY --from=builder_metaphlan /usr/local/lib/R /usr/local/lib/R
COPY --from=builder_metaphlan /etc/R /etc/R
COPY --from=builder_metaphlan /usr/lib/libR.so /usr/lib/libR.so

ENV PATH="$PATH:/usr/bin/bowtie2/bowtie2-2.3.3.1-linux-x86_64" \
LC_ALL=C

WORKDIR /data

# link to python & check locale settings in env by running perl
RUN ln -s /usr/bin/python3.7 /usr/bin/python && \
perl -e 'print'

42 changes: 42 additions & 0 deletions metaphlan/3.0.3/README.md
@@ -0,0 +1,42 @@
# MetaPhlAn3 docker image

Main tool: [MetaPhlAn/3.0](https://github.com/biobakery/MetaPhlAn/tree/3.0)

This docker image contains the metaphlan3 database along with the metaphlan3 program and its dependencies. Specifically, the database comprises ~1.1M unique clade-specific marker genes identified from ~100,000 reference genomes (~99,500 bacterial and archaeal and ~500 eukaryotic). The database can be found on MetaPhlAn3's GitHub (https://github.com/biobakery/MetaPhlAn/tree/3.0). To build this image, the database was downloaded on Oct 16 2020.

Please note the size of this docker image with the metaphlan3 database is ~3.6 GB. If downloading this image fails or takes too long, consider using the docker image version without the metaphlan 3 database:
* docker image name and tag: `staphb/metaphlan:3.0.3-no-db` https://hub.docker.com/r/staphb/metaphlan/tags

## Example Usage: Docker

```bash

# Download docker image:
$ docker pull staphb/metaphlan:3.0.3

# for this example, stool metagenomes downloaded from SRA
# if you have SRA toolkit, can do:
$ fastq-dump --outdir . --skip-technical --readids --split-files --clip SRX2474191
$ ls
SRX2474191_1.fastq

# run metaphlan3 via docker, mount $PWD to /data in the container
$ docker run --rm -u $(id -u):$(id -g) -v "$PWD":/data staphb/metaphlan:3.0.3 \
metaphlan SRX2474191_1.fastq --input_type fastq -o profiled_metagenome.txt

# OUTPUT:
WARNING: The metagenome profile contains clades that represent multiple species merged into a single representant.
An additional column listing the merged species is added to the MetaPhlAn output.
$ head profiled_metagenome.txt
#mpa_v30_CHOCOPhlAn_201901
#/usr/local/bin/metaphlan SRX2474191_1.fastq --input_type fastq -o profiled_metagenome.txt
#SampleID Metaphlan_Analysis
#clade_name NCBI_tax_id relative_abundance additional_species
k__Bacteria 2 100.0
k__Bacteria|p__Bacteroidetes 2|976 52.90346
k__Bacteria|p__Firmicutes 2|1239 45.08223
k__Bacteria|p__Actinobacteria 2|201174 2.01431
k__Bacteria|p__Bacteroidetes|c__Bacteroidia 2|976|200643 52.90346
k__Bacteria|p__Firmicutes|c__Clostridia 2|1239|186801 45.08223

```
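
The profile shown above lists every taxonomic rank in one table, so a small filter is handy when only one rank is of interest. The following is a sketch, assuming the output file is tab-separated with the clade name in column 1 and the relative abundance in column 3, as the `#clade_name` header line suggests. The function name `phylum_abundance` is hypothetical.

```shell
# Hypothetical helper (not part of this PR): print phylum-level rows of a
# MetaPhlAn profile as "<phylum><TAB><relative abundance>".
phylum_abundance() {
    # $1 = path to the profile, e.g. profiled_metagenome.txt
    awk -F '\t' '
        /^#/ { next }               # skip the header lines
        $1 ~ /\|p__[^|]*$/ {        # clade name ends at the phylum rank
            sub(/^.*\|p__/, "", $1) # strip everything up to the phylum name
            print $1 "\t" $3
        }
    ' "$1"
}
```

Running `phylum_abundance profiled_metagenome.txt` on the example above would keep only the rows whose clade name ends in a `p__` entry and drop the kingdom- and class-level rows.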
5 changes: 5 additions & 0 deletions tests/metaphlan.sh
@@ -0,0 +1,5 @@
#!/bin/bash
# test for metaphlan container
set -e

docker run --rm metaphlan:3.0.3 metaphlan --help
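
The test above only checks that `metaphlan --help` runs. A slightly broader smoke test could also confirm that the bowtie2 dependency copied into the final image is on the `PATH`. This is a sketch, not the PR's test: the function name `smoke_test` and the overridable `DOCKER` variable are hypothetical, the latter added so the commands can be previewed without a Docker daemon.

```shell
# Hypothetical extension of tests/metaphlan.sh (a sketch, not the PR's test).
# DOCKER can be overridden (e.g. DOCKER=echo) to preview without running containers.
DOCKER=${DOCKER:-docker}

smoke_test() {
    # $1 = image reference, e.g. metaphlan:3.0.3
    "$DOCKER" run --rm "$1" metaphlan --help >/dev/null &&
    "$DOCKER" run --rm "$1" bowtie2 --version >/dev/null &&
    echo "ok: $1"
}
```

Calling `smoke_test metaphlan:3.0.3` prints a single `ok:` line only if both commands succeed inside the container. Any failure yields a non-zero exit status, which a script using `set -e` would treat as a failed test.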