Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: adds --p-n-threads option to trim-alignment #174

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

mikerobeson
Copy link
Collaborator

Fixes #127

Also, made some minor help text and documentation edits.

Copy link
Collaborator

@nbokulich nbokulich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @mikerobeson ! Small comment inline. Have you tested the performance speed-up with multi-threading? I can take a look, but curious how much of a bump we see.

"""
Trim alignment based on primer alignment or explicitly specified
positions. When at least one primer sequence is given, primer-based
trimming will be performed, otherwise position-based trimming is done.
trimming will be performed via the `addfragments` option of mafft,
otherwise position-based trimming is done.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the n_threads param would not be effective as far as I can tell when position-based trimming is done, so this should be mentioned in the param description

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RE param description: Oooh, good catch! Will fix!

RE multi threading: Yes this does scale well. The table below is based on ~5k sequences randomly subsampled from SILVA's 50k position FASTA alignment file, on my M1 Max laptop. Mirroring @colinbrislawn's results.

threads time
8 6m 40.278s
4 9m 20.686s
2 14m 36.920s
1 25m 13.873s

Example command:

qiime rescript trim-alignment \
    --i-aligned-sequences silva-138-1-nr99-aln-dna-ss01.qza \
    --p-primer-fwd GTGYCAGCMGCCGCGGTAA \
    --p-primer-rev CCGYCAATTYMTTTRAGTTT \
    --p-n-threads  8 \
    --o-trimmed-sequences silva-138-1-nr99-aln-dna-ss01-v4v5-trimmed.qza \
    --verbose

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

trim-alignment : expose multithreading
2 participants