Discussion: Do we still need the snp_clusters param for diagnostics rule?
#1017
Labels
source: discussion forum
Issue mentioned on Nextstrain Discussions
Context
@huddlej's response to a discussion question prompted me to take a closer look at the diagnostics rule/script. I wanted to understand how the hard-coded params were used in the script:
ncov/workflow/snakemake_rules/main_workflow.smk
Lines 536 to 539 in 44ab71a
I saw that the snp_clusters param is only relevant when there is a snp_clusters column within the input metadata file.
ncov/scripts/diagnostic.py
Lines 84 to 87 in 44ab71a
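To make the behavior concrete, here is a minimal sketch of a check that only uses the threshold when the column is present. It is not the code from diagnostic.py: the function name, the strain and clock_deviation fields, and the threshold value are all hypothetical and chosen for illustration.

```python
# Hypothetical sketch, not the actual diagnostic.py logic.
def flag_by_snp_clusters(records, max_snp_clusters):
    """Return strain names whose snp_clusters count exceeds the threshold.

    `records` is a list of dicts, one per metadata row. Rows without a
    `snp_clusters` key are skipped, so the threshold is never consulted
    when the column is absent.
    """
    flagged = []
    for row in records:
        if "snp_clusters" not in row:
            continue
        if int(row["snp_clusters"]) > max_snp_clusters:
            flagged.append(row["strain"])
    return flagged


# Metadata without a snp_clusters column (the situation described here).
workflow_metadata = [{"strain": "A", "clock_deviation": "2"}]

# Metadata where a user added the column outside the workflow.
user_metadata = [
    {"strain": "A", "snp_clusters": "0"},
    {"strain": "B", "snp_clusters": "3"},
]

no_column_result = flag_by_snp_clusters(workflow_metadata, max_snp_clusters=1)
with_column_result = flag_by_snp_clusters(user_metadata, max_snp_clusters=1)
```

In this sketch, changing max_snp_clusters has no effect on the first call, because no row carries the column. The threshold only matters for the second call.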
This input metadata file is generated by the join-metadata-and-clades script, which does not add a snp_clusters column. Only the following columns from Nextclade are included in the metadata file:
ncov/scripts/join-metadata-and-clades.py
Lines 19 to 41 in 44ab71a
It seems like the snp_clusters param is unused unless users add the column to their own metadata outside the workflow. snp_clusters used to be included as a column in the ncov-ingest produced metadata file, but it has been removed. I think the presence of an unused param here can cause confusion for users (it definitely confused me!).
Possible solution
I'm not familiar with the diagnostics rule, so I wanted to ask what would be an appropriate action here.
- Add a comment to the diagnostics rule that the snp_clusters param is only kept for backwards compatibility, but it is not used in the latest version of the workflow.
- Remove the snp_clusters param from the diagnostics rule.
- Update the diagnostics script to check QC_snp_clusters != "bad" instead of the number of SNP clusters.
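As a rough illustration of the QC_snp_clusters != "bad" check suggested above, here is a small sketch. It assumes the QC_snp_clusters column holds a status string from Nextclade, with values such as "good", "mediocre", or "bad". The function name and the strain field are hypothetical, and this is not a proposed patch to the diagnostics script.

```python
# Hypothetical sketch of a status-based check; not code from the ncov workflow.
def flag_by_qc_status(records, column="QC_snp_clusters"):
    """Return strain names whose QC status in `column` is exactly "bad".

    Rows that lack the column, or have an empty value, are not flagged.
    """
    flagged = []
    for row in records:
        if row.get(column) == "bad":
            flagged.append(row["strain"])
    return flagged


metadata = [
    {"strain": "A", "QC_snp_clusters": "good"},
    {"strain": "B", "QC_snp_clusters": "mediocre"},
    {"strain": "C", "QC_snp_clusters": "bad"},
    {"strain": "D"},  # column missing for this row
]

flagged_strains = flag_by_qc_status(metadata)
```

One consequence of this approach is that no numeric threshold is needed, so the hard-coded snp_clusters param could be dropped from the rule. Whether rows with a missing status should be kept, as they are in this sketch, is a decision for the maintainers.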