Big Bakta update #235

Merged
merged 11 commits into from Mar 10, 2023
7 changes: 7 additions & 0 deletions CHANGELOG.md
@@ -7,12 +7,19 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### `Added`

+ - [#235](https://github.com/nf-core/funcscan/pull/235) Added parameter `annotation_bakta_db_light` to be able to switch between downloading either light (1.3 GB) or full (33.1 GB) versions of the Bakta database. The full version is generally recommended for best annotation results (see details in the parameter description). If download bandwidth, storage, memory, or run duration requirements become an issue, the user might want to switch to the light version. (by @jasmezz)

### `Fixed`

- [#237](https://github.com/nf-core/funcscan/pull/237) Reactivate DeepARG automatic database downloading and CI tests as server is now back up. (by @jfy133)
+ - [#235](https://github.com/nf-core/funcscan/pull/235) Improved annotation speed by switching off Bakta annotation steps by default which are irrelevant for the three pipeline workflows. (by @jasmezz)
+ - [#235](https://github.com/nf-core/funcscan/pull/235) Renamed parameter `annotation_bakta_db` to `annotation_bakta_db_local` and updated all occurrences in all files (to disambiguate the difference to `annotation_bakta_db_light`). (by @jasmezz)

### `Dependencies`

+ - [#235](https://github.com/nf-core/funcscan/pull/235) Bumped bakta/bakta 1.6.1 -> 1.7.0 (by @jasmezz)
+ - [#235](https://github.com/nf-core/funcscan/pull/235) Bumped bakta/baktadbdownload 1.6.1 -> 1.7.0 (by @jasmezz)

### `Deprecated`

## v1.0.1 - [2023-02-27]
22 changes: 13 additions & 9 deletions conf/modules.config
@@ -91,6 +91,9 @@ process {
enabled: params.save_databases,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
+ ext.args = [
+ params.annotation_bakta_db_light ? '--type light' : '--type full'
+ ].join(' ').trim()
}

withName: BAKTA_BAKTA {
@@ -107,17 +110,18 @@ process {
params.annotation_bakta_complete ? '--complete' : '',
params.annotation_bakta_renamecontigheaders ? '' : '--keep-contig-headers',
params.annotation_bakta_compliant ? '--compliant' : '',
- params.annotation_bakta_skiptrna ? '--skip-trna' : '',
- params.annotation_bakta_skiptmrna ? '--skip-tmrna' : '',
- params.annotation_bakta_skiprrna ? '--skip-rrna' : '',
- params.annotation_bakta_skipncrna ? '--skip-ncrna' : '',
- params.annotation_bakta_skipncrnaregion ? '--skip-ncrna-region' : '',
- params.annotation_bakta_skipcrispr ? '--skip-crispr' : '',
+ params.annotation_bakta_trna ? '' : '--skip-trna',
+ params.annotation_bakta_tmrna ? '' : '--skip-tmrna',
+ params.annotation_bakta_rrna ? '' : '--skip-rrna',
+ params.annotation_bakta_ncrna ? '' : '--skip-ncrna',
+ params.annotation_bakta_ncrnaregion ? '' : '--skip-ncrna-region',
+ params.annotation_bakta_crispr ? '' : '--skip-crispr',
params.annotation_bakta_skipcds ? '--skip-cds' : '',
- params.annotation_bakta_skippseudo ? '--skip-pseudo' : '',
+ params.annotation_bakta_pseudo ? '' : '--skip-pseudo',
params.annotation_bakta_skipsorf ? '--skip-sorf' : '',
- params.annotation_bakta_skipgap ? '--skip-gap' : '',
- params.annotation_bakta_skipori ? '--skip-ori' : ''
+ params.annotation_bakta_gap ? '' : '--skip-gap',
+ params.annotation_bakta_ori ? '' : '--skip-ori',
+ params.annotation_bakta_activate_plot ? '' : '--skip-plot'
].join(' ').trim()
}
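
The ternaries in this hunk flip several Bakta step parameters from opt-out (`skip*`) to opt-in: a step runs only when its parameter is true, and otherwise the matching `--skip-*` flag is passed to Bakta. A minimal shell sketch of that pattern follows. `bakta_skip_flags` is a hypothetical helper used only for illustration and is not part of the pipeline; it covers only the steps whose parameters were inverted here (`--skip-cds` and `--skip-sorf` keep their original form).

```bash
#!/usr/bin/env bash
# Illustrative only: mirrors the `param ? '' : '--skip-…'` ternaries in plain shell.
# `bakta_skip_flags` is a made-up name, not a funcscan or Bakta command.
# Arguments: the names of the annotation steps the user switched ON.
bakta_skip_flags() {
    # Steps whose parameters became opt-in in this hunk.
    local steps="trna tmrna rrna ncrna ncrna-region crispr pseudo gap ori plot"
    local flags="" step on enabled
    for step in $steps; do
        enabled=false
        for on in "$@"; do
            if [[ "$on" == "$step" ]]; then enabled=true; fi
        done
        # A step that was not requested gets its --skip flag.
        if [[ "$enabled" == false ]]; then
            flags+="${flags:+ }--skip-${step}"
        fi
    done
    echo "$flags"
}

bakta_skip_flags               # nothing enabled: every listed step is skipped
bakta_skip_flags trna crispr   # tRNA and CRISPR detection stay active
```

With no arguments the helper prints all ten `--skip-*` flags, which corresponds to the new defaults of `false` for these parameters.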

4 changes: 2 additions & 2 deletions docs/usage.md
@@ -100,7 +100,7 @@ As a reference, we will describe below where and how you can obtain databases an

nf-core/funcscan offers multiple tools for annotating input sequences. Bakta is a new tool touted as a bacteria-only successor to the well-established Prokka.

- To supply the required Bakta database (and not have the pipeline do that at every new run), use the flag `--annotation_bakta_db`. It must be downloaded from the Bakta Zenodo archive, the link of which can be found on the [Bakta GitHub repository](https://github.com/oschwengers/bakta#database-download).
+ To supply the required Bakta database (and not have the pipeline do that at every new run), use the flag `--annotation_bakta_db_local`. It must be downloaded from the Bakta Zenodo archive, the link of which can be found on the [Bakta GitHub repository](https://github.com/oschwengers/bakta#database-download).

Once downloaded this must be untarred:

@@ -111,7 +111,7 @@ tar xvzf db.tar.gz
And then passed to the pipeline with:

```bash
- --annotation_bakta_db /<path>/<to>/db/
+ --annotation_bakta_db_local /<path>/<to>/db/
```

> ℹ️ The flag `--save_databases` saves the pipeline-downloaded databases in your results directory. You can then move these to a central cache directory of your choice for re-use in the future.
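
For orientation, the renamed flag can be seen in the context of a complete command line. The following is a dry-run sketch that only prints the command. The samplesheet name, output directory, `-profile docker`, and the database path are placeholders, and selecting Bakta via `--annotation_tool bakta` is an assumption about the pipeline's tool-selection parameter. `build_funcscan_cmd` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Dry-run sketch: assembles (but does not execute) a funcscan command line.
# All values are placeholders; `--annotation_tool bakta` is an assumption.
build_funcscan_cmd() {
    local db_dir="$1"   # untarred Bakta database directory
    local cmd=(
        nextflow run nf-core/funcscan
        -profile docker
        --input samplesheet.csv
        --outdir results
        --annotation_tool bakta
        --annotation_bakta_db_local "$db_dir"
    )
    # Print the command on one line; run it yourself once the paths are real.
    echo "${cmd[*]}"
}

build_funcscan_cmd "/data/bakta/db/"
```

Printing rather than executing keeps the sketch safe to try without a Nextflow installation or a downloaded database.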
4 changes: 2 additions & 2 deletions modules.json
@@ -47,12 +47,12 @@
},
"bakta/bakta": {
"branch": "master",
- "git_sha": "eeb194e70c5acc713891a9eb21fdd397cca9dff8",
+ "git_sha": "280c5c86b3da7dfcc92ebd5420584dd6ff26c4a8",
"installed_by": ["modules"]
},
"bakta/baktadbdownload": {
"branch": "master",
- "git_sha": "ade45f05a2659b5c130a483e09f50b7f33d075b2",
+ "git_sha": "280c5c86b3da7dfcc92ebd5420584dd6ff26c4a8",
"installed_by": ["modules"]
},
"bioawk": {
10 changes: 5 additions & 5 deletions modules/nf-core/bakta/bakta/main.nf

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

14 changes: 6 additions & 8 deletions modules/nf-core/bakta/baktadbdownload/main.nf

2 changes: 1 addition & 1 deletion modules/nf-core/bakta/baktadbdownload/meta.yml

22 changes: 12 additions & 10 deletions nextflow.config
@@ -26,24 +26,26 @@ params {
annotation_prodigal_transtable = 11
annotation_prodigal_forcenonsd = false

- annotation_bakta_db = null
+ annotation_bakta_db_local = null
+ annotation_bakta_db_light = null
@louperelo (Contributor), Mar 6, 2023:

I am a bit confused about the new parameter:
`--annotation_bakta_db` expects a path to a local DB (which might be full or light, depending on what the user downloaded themselves),
while `--annotation_bakta_db_light` determines in `modules.config` whether the `ext.args` `--type` of `BAKTA_BAKTADBDOWNLOAD` is set to full or light:
`params.annotation_bakta_db_light ? '--type light' : '--type full'`
Shouldn't this then be a boolean parameter with default `false` instead of `null`? If no path to an existing Bakta DB is given via `--annotation_bakta_db`, then the full Bakta DB is downloaded unless `annotation_bakta_db_light` was set to `true`.
Do we want the light or the full version to be the default when downloading?
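
The behaviour described in this comment can be written down as a small decision function. This is a sketch under stated assumptions, not the pipeline's implementation: `bakta_db_plan` is a hypothetical shell helper, its two arguments stand in for the local-path parameter and the light-database switch, and the full database is treated as the default, as proposed in the thread.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of nf-core/funcscan): mirrors the decision
# discussed in this thread for how the Bakta database is obtained.
#   $1 = path to a local database ("" when none was supplied)
#   $2 = "true" to request the light database; anything else means full
bakta_db_plan() {
    local db_local="${1:-}"
    local db_light="${2:-false}"

    if [[ -n "$db_local" ]]; then
        # A user-supplied database takes precedence; nothing is downloaded.
        echo "use-local ${db_local}"
    elif [[ "$db_light" == "true" ]]; then
        echo "download --type light"
    else
        # Default when the switch is unset or false: full database.
        echo "download --type full"
    fi
}

bakta_db_plan "" false
bakta_db_plan "" true
bakta_db_plan "/data/bakta/db" true
```

Treating the switch as a plain boolean with a `false` default removes the ambiguity of `null`: the download type matters only when no local path is given.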

Reply from the PR author (Collaborator):

Ahh yes, that's why reviews are important :D The new parameter should of course be boolean.
My reason for using `null` was an issue with `nf-core schema build` when `false` was the default, but I forgot to check and fix that.
Default: I would go for the full DB, but I can test whether the output differs much from the light DB. If not, let's use light of course.

Reply (Contributor):

Maybe consider renaming --annotation_bakta_db to make clear that this is for a local installation. Something like --annotation_bakta_db_path or --annotation_bakta_db_local?
I agree with the full db as default. Then the parameter --annotation_bakta_db_light can just stay as it is.

Reply (Member):

I agree, we should disambiguate, and default should be set to a boolean not null. Another suggestion:

--annotation_bakta_downloadlightdb
--annotation_bakta_localdbpath

Or something?

jasmezz marked this conversation as resolved.
annotation_bakta_mincontiglen = 1
annotation_bakta_translationtable = 11
annotation_bakta_gram = '?'
annotation_bakta_complete = false
annotation_bakta_renamecontigheaders = false
annotation_bakta_compliant = false
- annotation_bakta_skiptrna = false
- annotation_bakta_skiptmrna = false
- annotation_bakta_skiprrna = false
- annotation_bakta_skipncrna = false
- annotation_bakta_skipncrnaregion = false
- annotation_bakta_skipcrispr = false
+ annotation_bakta_trna = false
+ annotation_bakta_tmrna = false
+ annotation_bakta_rrna = false
+ annotation_bakta_ncrna = false
+ annotation_bakta_ncrnaregion = false
+ annotation_bakta_crispr = false
annotation_bakta_skipcds = false
- annotation_bakta_skippseudo = false
+ annotation_bakta_pseudo = false
annotation_bakta_skipsorf = false
- annotation_bakta_skipgap = false
- annotation_bakta_skipori = false
+ annotation_bakta_gap = false
+ annotation_bakta_ori = false
+ annotation_bakta_activate_plot = false

annotation_prokka_singlemode = false
annotation_prokka_rawproduct = false