add fasttree to docker #37

Merged
merged 8 commits into from
Oct 13, 2022
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -10,6 +10,7 @@
- fixed a bug in detransitive, see this [commit](https://github.com/neherlab/pangraph/commit/a9651323aba2822d1b1c380a086fae4216c8030d)
- added snakemake pipeline in the `script` folder to perform the analysis published in our [paper](https://github.com/neherlab/pangraph#citing).
- added `-K` option to the `build` command to control kmer length for mmseqs aligner, see this [commit](https://github.com/neherlab/pangraph/commit/0857c36c7c8d11d53e8efab91cf5d18c35685a6e).
- added `fasttree` to docker container and PanX export to docker tests, see [#37](https://github.com/neherlab/pangraph/pull/37).

## v0.5.0

2 changes: 1 addition & 1 deletion Dockerfile
@@ -14,7 +14,6 @@ RUN set -euxo pipefail \
curl \
mafft \
make \
mash \
>/dev/null \
&& apt-get autoremove --yes >/dev/null \
&& apt-get clean autoclean >/dev/null \
@@ -59,6 +58,7 @@ RUN set -euxo pipefail \
&& apt-get install -qq --no-install-recommends --yes \
mafft \
mash \
fasttree \
>/dev/null \
&& apt-get autoremove --yes >/dev/null \
&& apt-get clean autoclean >/dev/null \
4 changes: 2 additions & 2 deletions README.md
@@ -77,8 +77,8 @@ Moreover, for the compilation to work, it is necessary to have [MAFFT](https://m

### Optional dependencies

**pangraph** can _optionally_ use [mash](https://github.com/marbl/Mash), [MAFFT](https://mafft.cbrc.jp/alignment/software/) or [mmseqs2](https://github.com/soedinglab/MMseqs2), as explained in [the documentation](https://neherlab.github.io/pangraph/#Optional-dependencies).
For full functionality, it is recommended to install these tools and have them available on `$PATH`.
**pangraph** can _optionally_ use [mash](https://github.com/marbl/Mash), [MAFFT](https://mafft.cbrc.jp/alignment/software/), [mmseqs2](https://github.com/soedinglab/MMseqs2) or [fasttree](http://www.microbesonline.org/fasttree/) for additional functionality, as explained in [the documentation](https://neherlab.github.io/pangraph/#Optional-dependencies).
To use this functionality, install these tools and make them available on `$PATH`.

Alternatively, a script `bin/setup-pangraph` is provided to install these tools into `bin/` for Linux-based operating systems.

13 changes: 13 additions & 0 deletions bin/setup-pangraph
@@ -33,6 +33,18 @@ downloadMMseqs()
mv "$name/bin/$name" "../bin/$name"
}

downloadFasttree()
{
name="$1"; shift 1
url="$1"; shift 1

cd "$root"

curl -L -o "$name" "$url"
chmod +x "$name"
mv "$name" "../bin/$name"
}

build()
{
name="$1"; shift 1
@@ -55,5 +67,6 @@ build()

(downloadMash "mash" "https://github.com/marbl/Mash/releases/download" "v2.2" "Linux64")
(downloadMMseqs "mmseqs" "https://github.com/soedinglab/MMseqs2/releases/download" "13-45111" "sse2" "linux")
(downloadFasttree "fasttree" "http://www.microbesonline.org/fasttree/FastTree")
(build "mafft" "https://mafft.cbrc.jp/alignment/software" "7.490")
# rm -r $root
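
The download step above fetches the binary with plain `curl -L`, which exits successfully even on an HTTP error and would save the error page as if it were the executable. A more defensive variant might look like the sketch below. The function name `download_binary`, the `.partial` suffix, and the explicit destination-directory argument are illustrative assumptions, not part of `bin/setup-pangraph`.

```shell
# Hypothetical helper: download a prebuilt binary and install it atomically.
# Arguments: <name> <url> <destination-directory>
download_binary() {
    local name="$1"
    local url="$2"
    local dest_dir="$3"
    local tmp="$dest_dir/$name.partial"

    mkdir -p "$dest_dir"

    # --fail makes curl exit non-zero on HTTP errors (e.g. 404)
    # instead of writing the error page to the output file.
    if ! curl --fail --location --silent --show-error -o "$tmp" "$url"; then
        rm -f "$tmp"
        return 1
    fi

    # Reject empty downloads before marking the file executable.
    if [ ! -s "$tmp" ]; then
        rm -f "$tmp"
        return 1
    fi

    chmod +x "$tmp"
    # Move into place only once the download is complete.
    mv "$tmp" "$dest_dir/$name"
}

# Example call (commented out so that sourcing this file has no side effects):
# download_binary "fasttree" "http://www.microbesonline.org/fasttree/FastTree" "bin"
```

Downloading to a temporary name and renaming at the end means an interrupted run should not leave a truncated `fasttree` in `bin/`.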
30 changes: 15 additions & 15 deletions docs/src/cli/export.md
@@ -4,21 +4,21 @@
Export a pangraph to the chosen file format(s)

## Options
| Name | Type | Short Flag | Long Flag | Description |
| :------------------ | :------ | :--------- | :------------------ | :-------------------------------------------------------------------------------- |
| Edge minimum length | Integer | ell | edge-minimum-length | blocks below this length cutoff will be ignored for edges in graph |
| Edge maximum length | Integer | elu | edge-maximum-length | blocks above this length cutoff will be ignored for edges in graph |
| Edge minimum depth | Integer | edl | edge-minimum-depth | blocks below this depth cutoff will be ignored for edges in graph |
| Edge maximum depth | Integer | edu | edge-maximum-depth | blocks above this depth cutoff will be ignored for edges in graph |
| Minimum length | Integer | ll | minimum-length | blocks below this length cutoff will be ignored for export |
| Maximum length | Integer | lu | maximum-length | blocks above this length cutoff will be ignored for export |
| Minimum depth | Integer | dl | minimum-depth | blocks below this depth cutoff will be ignored for export |
| Maximum depth | Integer | du | maximum-depth | blocks above this depth cutoff will be ignored for export |
| No duplications | Boolean | nd | no-duplications | do not export any block that contains at least one strain repeated more than once |
| Output directory | String | o | output-directory | path to directory where output will be stored (default: `export`) |
| Prefix | String | p | prefix | basename of exported files (default: `pangraph`) |
| GFA | Boolean | ng | no-export-gfa | toggles whether pangraph is exported as GFA. |
| PanX | Boolean | px | export-panX | toggles whether pangraph is exported to panX visualization compatible format. |
| Name | Type | Short Flag | Long Flag | Description |
| :------------------ | :------ | :--------- | :------------------ | :-------------------------------------------------------------------------------------------------- |
| Edge minimum length | Integer | ell | edge-minimum-length | blocks below this length cutoff will be ignored for edges in graph |
| Edge maximum length | Integer | elu | edge-maximum-length | blocks above this length cutoff will be ignored for edges in graph |
| Edge minimum depth | Integer | edl | edge-minimum-depth | blocks below this depth cutoff will be ignored for edges in graph |
| Edge maximum depth | Integer | edu | edge-maximum-depth | blocks above this depth cutoff will be ignored for edges in graph |
| Minimum length | Integer | ll | minimum-length | blocks below this length cutoff will be ignored for export |
| Maximum length | Integer | lu | maximum-length | blocks above this length cutoff will be ignored for export |
| Minimum depth | Integer | dl | minimum-depth | blocks below this depth cutoff will be ignored for export |
| Maximum depth | Integer | du | maximum-depth | blocks above this depth cutoff will be ignored for export |
| No duplications | Boolean | nd | no-duplications | do not export any block that contains at least one strain repeated more than once |
| Output directory | String | o | output-directory | path to directory where output will be stored (default: `export`) |
| Prefix | String | p | prefix | basename of exported files (default: `pangraph`) |
| GFA | Boolean | ng | no-export-gfa | toggles whether pangraph is exported as GFA. |
| PanX | Boolean | pX | export-panX | toggles whether pangraph is exported to panX visualization compatible format. (requires `fasttree`) |

## Arguments
Zero or one pangraph file, which must be formatted as JSON.
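
Because the PanX export depends on `fasttree`, a wrapper script might check for it before calling `pangraph export`. The sketch below reuses the `-ng -pX -o` invocation from this PR's CLI tests. The function name `export_panx` and its argument order are hypothetical.

```shell
# Hypothetical helper: export a pangraph in PanX-compatible format.
# Arguments: <pangraph.json> <output-directory>
export_panx() {
    local graph="$1"
    local outdir="$2"

    # The PanX export requires fasttree to be available on $PATH.
    if ! command -v fasttree >/dev/null 2>&1; then
        echo "fasttree not found on PATH; PanX export unavailable" >&2
        return 1
    fi

    # -ng (no-export-gfa) and -pX (export-panX) are the short flags
    # listed in the options table; -o sets the output directory.
    pangraph export -ng -pX -o "$outdir" "$graph"
}

# Example call (commented out; requires pangraph and fasttree):
# export_panx "test1.json" "export"
```

Failing early with a clear message is a design choice for this sketch. How `pangraph` itself reports a missing `fasttree` is not shown in this diff.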
3 changes: 2 additions & 1 deletion docs/src/index.md
@@ -145,6 +145,7 @@ There are a few **optional** external programs that PanGraph can utilize[^1]:
1. [Mash](https://github.com/marbl/Mash) can be used to construct a guide tree in place of our internal algorithm (see [build](https://neherlab.github.io/pangraph/cli/build/) command options).
2. [MAFFT](https://mafft.cbrc.jp/alignment/software/) can be optionally used to polish block alignments (see [polish](https://neherlab.github.io/pangraph/cli/polish/) command). Only recommended for short alignments.
3. [mmseqs2](https://github.com/soedinglab/MMseqs2) can be used as an alternative alignment kernel to the default *minimap2* (see [build](https://neherlab.github.io/pangraph/cli/build/) command options). It allows merging of more diverged sequences, at the cost of higher computational time.
4. [fasttree](http://www.microbesonline.org/fasttree/) is used to build phylogenetic trees for export in [PanX](https://github.com/neherlab/pan-genome-analysis)-compatible format (see [export](https://neherlab.github.io/pangraph/cli/export/) command options and the [tutorial section](https://neherlab.github.io/pangraph/tutorials/tutorial_3/#Explore-block-alignments-with-the-panX-visualization)).

In order to invoke all functionalities from PanGraph, these tools must be installed and available on `$PATH`.
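
A quick way to see which of these optional tools are visible on `$PATH` is a small shell check. This is a minimal sketch: the function name `require_tools` is hypothetical, and the binary names in the example call are the ones invoked in this PR's CLI tests.

```shell
# Hypothetical helper: report which of the given commands are missing from $PATH.
# Returns 0 if every command is found, 1 otherwise.
require_tools() {
    local missing=""
    local tool

    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the name.
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done

    if [ -n "$missing" ]; then
        echo "missing optional tools:$missing" >&2
        return 1
    fi
    return 0
}

# Example call (commented out so that sourcing this file has no side effects):
# require_tools mash mafft mmseqs fasttree
```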

@@ -153,7 +154,7 @@ It assumes GNU coreutils are available.

These dependencies are already available within the Docker container.

[^1]: We recommend `mmseqs` version `13-45111`, `mash` version `v2.2.2` and `MAFFT` version `v7.475`
[^1]: We recommend `mmseqs` version `13-45111`, `mash` version `v2.2.2`, `MAFFT` version `v7.475` and `fasttree` version `2.1.11`

## User's Guide

10 changes: 7 additions & 3 deletions tests/run-cli-tests.sh
@@ -1,14 +1,15 @@
#!/bin/bash
set -euxo pipefail

# test that mash, mafft and mmseqs are available in path
# test that mash, mafft, mmseqs and fasttree are available in path
echo "mash version:"
mash --version
echo "mafft version:"
mafft --version
echo "mmseqs version:"
mmseqs --help | grep "Version"

echo "fasttree help:"
fasttree -help

# test pangraph commands help
pangraph help build
@@ -46,9 +47,12 @@ pangraph build -c -k mmseqs -K 8 "$TESTDIR/input.fa" > "$TESTDIR/test3.json"
echo "Test pangraph polish"
pangraph polish -c -l 10000 "$TESTDIR/test1.json" > "$TESTDIR/polished.json"

echo "Test pangraph export"
echo "Test pangraph GFA export"
pangraph export -o "$TESTDIR/export" "$TESTDIR/test1.json"

echo "Test pangraph PanX export"
pangraph export -ng -pX -o "$TESTDIR/export" "$TESTDIR/test1.json"

echo "Test pangraph marginalize"
pangraph marginalize -o "$TESTDIR/marginalize" "$TESTDIR/test1.json"
