Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prepare phylogenetic workflow for Nextclade workflow #57

Merged
merged 2 commits into from
May 24, 2024

Conversation

j23414
Copy link
Contributor

@j23414 j23414 commented May 24, 2024

Description of proposed changes

Following the plan laid out in #21 (comment)

This PR is making some minor updates to the phylogenetic workflow to prepare for the Nextclade workflow.

Related issue(s)

Checklist

  • Checks pass
  • Check a phylogenetic run (see run here)
genome E gene
all all/genome all/E
denv1 denv1/genome denv1/E
denv2 denv2/genome denv2/E
denv3 denv3/genome denv3/E
denv4 denv4/genome denv4/E

@j23414 j23414 requested a review from a team May 24, 2024 17:57
@@ -73,6 +74,7 @@ rule filter:
--group-by {params.group_by} \
--sequences-per-group {params.sequences_per_group} \
--min-length {params.min_length} \
--include {input.include} \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. is it worth putting this closer to the --exclude option to make it clear both are in play?
  2. is it worth a comment explaining how --exclude and --include interact, or is that assumed to be table-stakes knowledge for anybody that might want to modify this?

Copy link
Contributor Author

@j23414 j23414 May 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I meant to put --include next to --exclude. I got faked out by --exclude-where, thanks for flagging!
  2. The comments explaining how --exclude and --include interact is mostly pointing to augur filter docs, aka, not commented here. However, when I get into lineage defining strains, the "include.txt" may need to be split out into DENV1-DENV4 and I haven't figured out the design for that yet.

Potential design 1: Keep separate include_{serotype}.txt files

Potential design 2: Create a snakemake rule that takes a lineage.tsv and creates include_{serotype}.txt and append_clade_membership_{serotype}.tsv... I haven't written this up yet.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assumed to be table-stakes knowledge for anybody that might want to modify this [include.txt]

Actually this is a good point, I went with Potential design 1 as easier to explain to other people. 3ed18d2

Fixup of include next to exclude in c6e589f

Adds a configurable includes_{serotype}.txt to force-include key strains
(e.g. vaccine-related, lineage-defining) for each serotype tree.
@j23414 j23414 force-pushed the 56-prep-phylo-branch-for-nextclade branch from 3ed18d2 to 5af15a1 Compare May 24, 2024 20:26
@j23414 j23414 merged commit 75d9c5f into main May 24, 2024
32 checks passed
@j23414 j23414 deleted the 56-prep-phylo-branch-for-nextclade branch May 24, 2024 21:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants