adding parsnp 2.0.5 (#980)
erinyoung committed May 22, 2024
1 parent 024d7f8 commit aa8cb89
Showing 3 changed files with 230 additions and 1 deletion.
2 changes: 1 addition & 1 deletion README.md
@@ -215,7 +215,7 @@ To learn more about the docker pull rate limits and the open source software pro
| [Panaroo](https://hub.docker.com/r/staphb/panaroo) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/panaroo)](https://hub.docker.com/r/staphb/panaroo) | <ul><li>[1.2.10](panaroo/1.2.10/)</li><li>[1.3.4](panaroo/1.3.4/)</li><li>[1.5.0](./panaroo/1.5.0/)</li></ul>| (https://hub.docker.com/r/staphb/panaroo) |
| [Pangolin](https://hub.docker.com/r/staphb/pangolin) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/pangolin)](https://hub.docker.com/r/staphb/pangolin) | <details><summary> Click to see Pangolin v4.2 and older versions! </summary> **Pangolin version & pangoLEARN data release date** <ul><li>1.1.14</li><li>2.0.4 & 2020-07-20</li><li>2.0.5 & 2020-07-20</li><li>2.1.1 & 2020-12-17</li><li>2.1.3 & 2020-12-17</li><li>2.1.6 & 2021-01-06</li><li>2.1.7 & 2021-01-11</li><li>2.1.7 & 2021-01-20</li><li>2.1.8 & 2021-01-22</li><li>2.1.10 & 2021-02-01</li><li>2.1.11 & 2021-02-01</li><li>2.1.11 & 2021-02-05</li><li>2.2.1 & 2021-02-06</li><li>2.2.2 & 2021-02-06</li><li>2.2.2 & 2021-02-11</li><li>2.2.2 & 2021-02-12</li><li>2.3.0 & 2021-02-12</li><li>2.3.0 & 2021-02-18</li><li>2.3.0 & 2021-02-21</li><li>2.3.2 & 2021-02-21</li><li>2.3.3 & 2021-03-16</li><li>2.3.4 & 2021-03-16</li><li>2.3.5 & 2021-03-16</li><li>2.3.6 & 2021-03-16</li><li>2.3.6 & 2021-03-29</li><li>2.3.8 & 2021-04-01</li><li>2.3.8 & 2021-04-14</li><li>2.3.8 & 2021-04-21</li><li>2.3.8 & 2021-04-23</li><li>2.4 & 2021-04-28</li><li>2.4.1 & 2021-04-28</li><li>2.4.2 & 2021-04-28</li><li>2.4.2 & 2021-05-10</li><li>2.4.2 & 2021-05-11</li><li>2.4.2 & 2021-05-19</li><li>3.0.5 & 2021-06-05</li><li>3.1.3 & 2021-06-15</li><li>3.1.5 & 2021-06-15</li><li>3.1.5 & 2021-07-07-2</li><li>3.1.7 & 2021-07-09</li><li>3.1.8 & 2021-07-28</li><li>3.1.10 & 2021-07-28</li><li>3.1.11 & 2021-08-09</li><li>3.1.11 & 2021-08-24</li><li>3.1.11 & 2021-09-17</li><li>3.1.14 & 2021-09-28</li><li>3.1.14 & 2021-10-13</li><li>3.1.16 & 2021-10-18</li><li>3.1.16 & 2021-11-04</li><li>3.1.16 & 2021-11-09</li><li>3.1.16 & 2021-11-18</li><li>3.1.16 & 2021-11-25</li><li>3.1.17 & 2021-11-25</li><li>3.1.17 & 2021-12-06</li><li>3.1.17 & 2022-01-05</li><li>3.1.18 & 2022-01-20</li><li>3.1.19 & 2022-01-20</li><li>3.1.20 & 2022-02-02</li><li>3.1.20 & 2022-02-28</li></ul> **Pangolin version & pangolin-data version** <ul><li>4.0 & 1.2.133</li><li>4.0.1 & 1.2.133</li><li>4.0.2 & 1.2.133</li><li>4.0.3 & 1.2.133</li><li>4.0.4 & 1.2.133</li><li>4.0.5 & 1.3</li><li>4.0.6 & 1.6</li><li>4.0.6 & 1.8</li><li>4.0.6 & 1.9</li><li>4.1.1 & 1.11</li><li>4.1.2 & 1.12</li><li>4.1.2 & 1.13</li><li>4.1.2 & 1.14</li><li>4.1.3 & 1.15.1</li><li>4.1.3 & 1.16</li><li>4.1.3 & 1.17</li><li>4.2 & 1.18</li><li>4.2 & 1.18.1</li><li>4.2 & 1.18.1.1</li><li>4.2 & 1.19</li></ul> </details> **Pangolin version & pangolin-data version** <ul><li>[4.3 & 1.20](pangolin/4.3-pdata-1.20/)</li><li>[4.3 & 1.21](pangolin/4.3-pdata-1.21/)</li><li>[4.3.1 & 1.22](pangolin/4.3.1-pdata-1.22/)</li><li>[4.3.1 & 1.23](pangolin/4.3.1-pdata-1.23/)</li><li>[4.3.1 & 1.23.1](pangolin/4.3.1-pdata-1.23.1/)</li><li>[4.3.1 & 1.23.1 with XDG_CACHE_HOME=/tmp](pangolin/4.3.1-pdata-1.23.1-1/)</li><li>[4.3.1 & 1.24](pangolin/4.3.1-pdata-1.24/)</li><li>[4.3.1 & 1.25.1](pangolin/4.3.1-pdata-1.25.1/)</li><li>[4.3.1 & 1.26](pangolin/4.3.1-pdata-1.26/)</li><li>[4.3.1 & 1.27](pangolin/4.3.1-pdata-1.27/)</li></ul> | https://github.com/cov-lineages/pangolin<br/>https://github.com/cov-lineages/pangoLEARN<br/>https://github.com/cov-lineages/pango-designation<br/>https://github.com/cov-lineages/scorpio<br/>https://github.com/cov-lineages/constellations<br/>https://github.com/cov-lineages/lineages (archived)<br/>https://github.com/hCoV-2019/pangolin (archived) |
| [parallel-perl](https://hub.docker.com/r/staphb/parallel-perl) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/parallel-perl)](https://hub.docker.com/r/staphb/parallel-perl) | <ul><li>20200722</li></ul> | https://www.gnu.org/software/parallel |
-| [parsnp](https://hub.docker.com/r/staphb/parsnp) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/parsnp)](https://hub.docker.com/r/staphb/parsnp) | <ul><li>[1.5.6](./parsnp/1.5.6/)</li><li>[2.0.4](./parsnp/2.0.4/)</li></ul> | https://github.com/marbl/parsnp |
+| [parsnp](https://hub.docker.com/r/staphb/parsnp) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/parsnp)](https://hub.docker.com/r/staphb/parsnp) | <ul><li>[1.5.6](./parsnp/1.5.6/)</li><li>[2.0.4](./parsnp/2.0.4/)</li><li>[2.0.5](./parsnp/2.0.5/)</li></ul> | https://github.com/marbl/parsnp |
| [pasty](https://hub.docker.com/r/staphb/pasty) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/pasty)](https://hub.docker.com/r/staphb/pasty) | <ul><li>1.0.2</li><li>[1.0.3](pasty/1.0.3/)</li></ul> | https://github.com/rpetit3/pasty |
| [pbmm2](https://hub.docker.com/r/staphb/pbmm2) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/pbmm2)](https://hub.docker.com/r/staphb/pbmm2) | <ul><li>[1.13.1](./pbmm2/1.13.1/)</li></ul> | https://github.com/PacificBiosciences/pbmm2 |
| [Pavian](https://hub.docker.com/r/staphb/pavian) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/pavian)](https://hub.docker.com/r/staphb/pavian) | <ul><li>[1.2.1](pavian/1.2.1/)</li></ul> | https://github.com/fbreitwieser/pavian |
177 changes: 177 additions & 0 deletions parsnp/2.0.5/Dockerfile
@@ -0,0 +1,177 @@
ARG PARSNP_VER="2.0.5"
ARG HARVEST_VER="1.3"

FROM ubuntu:jammy as builder

ARG PARSNP_VER
ARG HARVEST_VER

ARG RAXML_VER="8.2.12"
ARG FASTTREE_VER="2.1.11"
ARG MASH_VER="2.3"
ARG FASTANI_VER="1.34"

# Update package index, install packages (ParSNP basic dependencies and packages needed to build from source)
RUN apt-get update && apt-get install -y \
autoconf \
make \
build-essential \
capnproto \
libcapnp-dev \
libgsl0-dev \
libprotobuf-dev \
libssl-dev \
libtool \
protobuf-compiler \
libsqlite3-dev \
wget \
zlib1g-dev \
python3 \
python3-pip \
unzip

# Add /usr/lib to the library path so Mash and HarvestTools can find the capnp libraries
ENV LD_LIBRARY_PATH="/usr/lib:/usr/local/lib"

# Move some static libraries that Mash and HarvestTools demand to where they want to see them
RUN cp /usr/lib/x86_64-linux-gnu/libprotobuf.a /usr/lib/ && \
cp /usr/lib/x86_64-linux-gnu/libcapnp.a /usr/lib/ && \
cp /usr/lib/x86_64-linux-gnu/libkj.a /usr/lib/

RUN pip3 install numpy

# Install FastTree (the download URL is unversioned, so FASTTREE_VER does not pin the binary that is fetched)
# ParSNP expects the FastTree executable to be called 'fasttree'
RUN wget -q http://www.microbesonline.org/fasttree/FastTree && \
chmod +x FastTree && \
mv FastTree /usr/local/bin/fasttree

# Install RAxML: https://cme.h-its.org/exelixis/resource/download/NewManual.pdf
RUN wget -q https://github.com/stamatak/standard-RAxML/archive/refs/tags/v$RAXML_VER.tar.gz && \
tar -xvf v$RAXML_VER.tar.gz && \
cd standard-RAxML-$RAXML_VER && \
make -f Makefile.AVX.PTHREADS.gcc && \
cp /standard-RAxML-$RAXML_VER/raxmlHPC-PTHREADS-AVX /usr/local/bin/raxmlHPC-PTHREADS

# Install Mash: https://github.com/marbl/Mash/blob/master/INSTALL.txt
RUN wget -q https://github.com/marbl/Mash/releases/download/v${MASH_VER}/mash-Linux64-v${MASH_VER}.tar && \
tar -xf mash-Linux64-v${MASH_VER}.tar --no-same-owner && \
rm -rf mash-Linux64-v${MASH_VER}.tar && \
chown root:root /mash-Linux64-v${MASH_VER}/* && \
cp /mash-Linux64-v${MASH_VER}/mash /usr/local/bin/

# Install PhiPack: https://www.maths.otago.ac.nz/~dbryant/software/phimanual.pdf
RUN wget -q https://www.maths.otago.ac.nz/~dbryant/software/PhiPack.tar.gz && \
tar -xvf PhiPack.tar.gz && \
cd /PhiPack/src && \
make && \
cp /PhiPack/Phi /usr/local/bin && \
cp /PhiPack/Profile /usr/local/bin

# Install HarvestTools
RUN wget -q https://github.com/marbl/harvest-tools/releases/download/v${HARVEST_VER}/harvesttools-Linux64-v${HARVEST_VER}.tar.gz && \
tar -vxf harvesttools-Linux64-v${HARVEST_VER}.tar.gz && \
cp harvesttools-Linux64-v${HARVEST_VER}/harvesttools /usr/local/bin/.

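# Install ParSNP: build the bundled MUSCLE library first, then build parsnp against it.
# The rpath uses $ORIGIN so the parsnp binary locates the MUSCLE library relative to its own location.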
RUN wget -q https://github.com/marbl/parsnp/archive/v$PARSNP_VER.tar.gz && \
tar -xvf v$PARSNP_VER.tar.gz && \
rm v$PARSNP_VER.tar.gz && \
cd /parsnp-$PARSNP_VER/muscle && \
./autogen.sh && \
./configure CXXFLAGS='-fopenmp' && \
make install && \
cd /parsnp-$PARSNP_VER && \
./autogen.sh && \
export ORIGIN=\$ORIGIN && \
./configure LDFLAGS='-Wl,-rpath,$$ORIGIN/../muscle/lib' && \
make LDADD=-lMUSCLE-3.7 && \
make install

# install fastani
RUN wget -q https://github.com/ParBLiSS/FastANI/releases/download/v${FASTANI_VER}/fastANI-Linux64-v${FASTANI_VER}.zip && \
unzip fastANI-Linux64-v${FASTANI_VER}.zip -d /usr/local/bin

FROM ubuntu:jammy as app

ARG PARSNP_VER
ARG HARVEST_VER

LABEL base.image="ubuntu:jammy"
LABEL dockerfile.version="1"
LABEL software="ParSNP"
LABEL software.version="${PARSNP_VER}"
LABEL description="ParSNP: Rapid core genome multi-alignment."
LABEL documentation="https://harvest.readthedocs.io/en/latest/content/parsnp.html"
LABEL website="https://github.com/marbl/parsnp"
LABEL license="https://github.com/marbl/parsnp/blob/master/LICENSE"
LABEL maintainer="Erin Young"
LABEL maintainer.email="eriny@utah.gov"

RUN apt-get update && apt-get install -y --no-install-recommends \
wget \
ca-certificates \
procps \
python3 \
python3-pip \
python-is-python3 \
unzip \
libgomp1 && \
apt-get autoclean && rm -rf /var/lib/apt/lists/*

# Copy necessary packages into the production image
COPY --from=builder /parsnp-$PARSNP_VER/ /parsnp/
COPY --from=builder /usr/local/bin/* /usr/local/bin/
COPY --from=builder /usr/local/lib/* /usr/local/lib/

RUN pip install pyspoa numpy biopython tqdm

# Put harvesttools & parsnp in PATH and set LD_LIBRARY_PATH for MUSCLE
ENV PATH="/parsnp/:$PATH" \
LD_LIBRARY_PATH="/usr/local/lib"

CMD ["parsnp", "-h"]

WORKDIR /data

FROM app as test

# IS_GITHUB only applicable for tests run by GitHub action
ARG IS_GITHUB

RUN parsnp -h

# negative control
WORKDIR /test/negative

RUN mkdir input_dir && \
mkdir reference && \
wget -q https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/703/365/GCA_000703365.1_Ec2011C-3609/GCA_000703365.1_Ec2011C-3609_genomic.fna.gz && \
wget -q https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/016/766/575/GCA_016766575.1_PDT000040717.5/GCA_016766575.1_PDT000040717.5_genomic.fna.gz && \
wget -q https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/748/565/GCA_000748565.2_ASM74856v2/GCA_000748565.2_ASM74856v2_genomic.fna.gz && \
gunzip *.gz && \
mv GCA_000703365.1_Ec2011C-3609_genomic.fna reference/. && \
mv *fna input_dir && \
parsnp -d input_dir -o filter --use-fasttree -v -r reference/GCA_000703365.1_Ec2011C-3609_genomic.fna

# positive control
WORKDIR /test/positive

RUN mkdir input_dir && \
mkdir reference && \
wget -q https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/703/365/GCA_000703365.1_Ec2011C-3609/GCA_000703365.1_Ec2011C-3609_genomic.fna.gz && \
wget -q https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/016/766/575/GCA_016766575.1_PDT000040717.5/GCA_016766575.1_PDT000040717.5_genomic.fna.gz && \
wget -q https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/003/018/935/GCA_003018935.1_ASM301893v1/GCA_003018935.1_ASM301893v1_genomic.fna.gz && \
wget -q https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/012/830/055/GCA_012830055.1_PDT000040719.3/GCA_012830055.1_PDT000040719.3_genomic.fna.gz && \
wget -q https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/012/829/335/GCA_012829335.1_PDT000040724.3/GCA_012829335.1_PDT000040724.3_genomic.fna.gz && \
wget -q https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/003/018/775/GCA_003018775.1_ASM301877v1/GCA_003018775.1_ASM301877v1_genomic.fna.gz && \
wget -q https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/012/829/275/GCA_012829275.1_PDT000040726.3/GCA_012829275.1_PDT000040726.3_genomic.fna.gz && \
wget -q https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/016/766/555/GCA_016766555.1_PDT000040728.5/GCA_016766555.1_PDT000040728.5_genomic.fna.gz && \
wget -q https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/012/829/195/GCA_012829195.1_PDT000040729.3/GCA_012829195.1_PDT000040729.3_genomic.fna.gz && \
wget -q https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/012/829/295/GCA_012829295.1_PDT000040727.3/GCA_012829295.1_PDT000040727.3_genomic.fna.gz && \
gunzip *.gz && \
mv GCA_000703365.1_Ec2011C-3609_genomic.fna reference/. && \
mv *.fna input_dir/. && \
parsnp -d input_dir -o outdir_parsnp_raxml -v -c -r reference/GCA_000703365.1_Ec2011C-3609_genomic.fna && \
parsnp -d input_dir -o outdir_parsnp_fasttree --use-fasttree -v -c -r reference/GCA_000703365.1_Ec2011C-3609_genomic.fna && \
harvesttools -i outdir_parsnp_fasttree/parsnp.ggr -S outdir_parsnp_fasttree/snp_alignment.txt && \
harvesttools -i outdir_parsnp_raxml/parsnp.ggr -S outdir_parsnp_raxml/snp_alignment.txt
52 changes: 52 additions & 0 deletions parsnp/2.0.5/README.md
@@ -0,0 +1,52 @@
## ParSNP

This container implements [ParSNP](https://github.com/marbl/parsnp) from the [Harvest suite](https://harvest.readthedocs.io/en/latest/).

### Includes
- ParSNP: `parsnp`
- FastTree: `fasttree` : 2.1.11
- RAxML: `raxmlHPC-PTHREADS` : 8.2.12
- Mash: `mash` : 2.3
- PhiPack: `Phi` : 1.1
- HarvestTools: `harvesttools` : 1.3
- FastANI: `fastANI` : 1.34
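
To confirm that the executables listed above are actually on `PATH`, a small helper along these lines may be useful. This is only a sketch: `check_tools` is a hypothetical name, not something shipped in the image, and it relies on nothing beyond the POSIX `command -v` builtin.

```shell
# Hypothetical helper (not part of the image): report executables missing from PATH.
# Prints one "missing: NAME" line per absent tool and returns non-zero if any are absent.
check_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            status=1
        fi
    done
    return "$status"
}
```

One way to use it is to open a shell in the container (for example `docker run --rm -it staphb/parsnp:latest bash`), paste the function, and call `check_tools parsnp fasttree raxmlHPC-PTHREADS Phi harvesttools`.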

### Requirements
- [Docker](https://docs.docker.com/get-docker/)

### Running a container
Pull the image from Docker Hub.
```
docker pull staphb/parsnp:latest
```
Or clone this repository to build and test the image yourself.
```
git clone git@github.com:StaPH-B/docker-builds.git
cd docker-builds/parsnp/2.0.5
# Run tests
docker build --target=test -t parsnp-test .
# Build production image
docker build --target=app -t parsnp .
```
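
The two build steps above can be wrapped so that the same commands work for any version directory. The following is a minimal sketch, assuming it is run from the root of a `docker-builds` checkout; the function names `run` and `build_parsnp`, the `DRY_RUN` variable, and the `parsnp-test:<version>` / `parsnp:<version>` tags are illustrative choices, not conventions of this repository.

```shell
# Print the command when DRY_RUN is set; otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Build the test stage first, then the app stage, for one version directory.
# The app image is only built if the test stage succeeds.
build_parsnp() {
    version="$1"
    run docker build --target=test -t "parsnp-test:$version" "parsnp/$version" &&
    run docker build --target=app -t "parsnp:$version" "parsnp/$version"
}
```

For example, `DRY_RUN=1 build_parsnp 2.0.5` only prints the two `docker build` commands, which is a cheap way to check the paths before starting a long build.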

### Example data analysis
Set up some input data.
```
mkdir -p parsnp/input_dir
cd parsnp/input_dir
wget \
https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/698/515/GCA_000698515.1_CFSAN000661_01.0/GCA_000698515.1_CFSAN000661_01.0_genomic.fna.gz \
https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/749/005/GCA_000749005.1_CFSAN000669_01.0/GCA_000749005.1_CFSAN000669_01.0_genomic.fna.gz
gunzip *.gz
cd ../
```
Run the container to generate a core genome alignment, call SNPs, and build a phylogeny. Output files are written to `outdir_parsnp`. Passing `-r !` asks ParSNP to pick the reference genome from the input directory.
```
docker run --rm -v $PWD:/data -u $(id -u):$(id -g) staphb/parsnp:latest parsnp \
-d input_dir \
-o outdir_parsnp \
--use-fasttree \
-v \
-c \
-r !
```
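
The Dockerfile's test stage follows a run like this with `harvesttools` to export a SNP alignment from `parsnp.ggr`. A sketch of the same step for the example above is shown here, with a quick record count as a sanity check. The function names `export_snps` and `count_fasta_records` are hypothetical, and the count assumes that the file written by `harvesttools -S` is multi-FASTA.

```shell
# Export a SNP alignment from a ParSNP output directory.
# Mirrors the harvesttools call in the Dockerfile's test stage;
# run it from the directory that was mounted at /data.
export_snps() {
    outdir="$1"
    docker run --rm -v "$PWD":/data -u "$(id -u):$(id -g)" staphb/parsnp:latest \
        harvesttools -i "$outdir/parsnp.ggr" -S "$outdir/snp_alignment.txt"
}

# Count FASTA records (header lines starting with '>') in a file.
count_fasta_records() {
    grep -c '^>' "$1"
}
```

With the example data, `export_snps outdir_parsnp` followed by `count_fasta_records outdir_parsnp/snp_alignment.txt` gives a quick check that every genome made it into the alignment.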
