Prioritize rather than filter to "complete" genomes #24

trvrb · 2019-02-20T16:48:32Z

Currently, besides reference genomes, select_strains.py only passes through viruses that possess both HA and NA segments (due to our use of --segments ha na in the Snakefile). For time pivots going back about 3-6 months this is okay, and there are usually enough strains with both HA and NA to fill the sampling bins. However, some strains only have HA sequenced. This was especially obvious just now: there are a number of H3s from January with just HA. These tend to be uploaded by groups that are not CCs.

I would propose modifying select_strains.py so that having a "complete" genome (that is, possessing all entries in --segments) becomes another factor in priority rather than a hard constraint.
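To make the proposal concrete, here is a minimal sketch of completeness acting as one additive priority term instead of a filter. It is not the code in select_strains.py: the function names, the `base_priority` mapping, and the `weight` parameter are all hypothetical and only illustrate the idea.

```python
from typing import Dict, List, Set


def completeness_bonus(strain_segments: Set[str], required: List[str],
                       weight: float = 1.0) -> float:
    """Hypothetical score term: fraction of required segments present, times weight."""
    if not required:
        return 0.0
    present = sum(1 for seg in required if seg in strain_segments)
    return weight * present / len(required)


def rank_strains(strains: Dict[str, Set[str]],
                 base_priority: Dict[str, float],
                 required: List[str]) -> List[str]:
    """Order strain names by base priority plus completeness bonus, best first."""
    scored = {
        name: base_priority.get(name, 0.0) + completeness_bonus(segs, required)
        for name, segs in strains.items()
    }
    # Sort by descending score; break ties by name so the order is deterministic.
    return sorted(scored, key=lambda name: (-scored[name], name))


# Assumed toy data: one complete strain and two HA-only strains.
strains = {"A": {"ha", "na"}, "B": {"ha"}, "C": {"ha"}}
base_priority = {"A": 0.0, "B": 0.2, "C": 0.8}
ranking = rank_strains(strains, base_priority, ["ha", "na"])
```

With this toy data an HA-only strain with a high base priority can still be picked ahead of a complete genome, which is the behaviour a hard filter rules out.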

rneher · 2019-02-21T14:10:51Z

I would add a --all-segments flag to force the hard filtering, otherwise prioritize.
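A rough sketch of how such a flag could switch between the two behaviours, assuming an argparse-based command line. Only the --segments and --all-segments option names come from this thread; the `select` helper, its arguments, and the ranking rule are hypothetical and do not reflect the actual script.

```python
import argparse
from typing import Dict, List, Optional, Set


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Sketch of strain selection options")
    parser.add_argument("--segments", nargs="+", default=["ha", "na"],
                        help="segments a strain should ideally have")
    parser.add_argument("--all-segments", action="store_true",
                        help="require every segment (hard filter) instead of prioritizing")
    return parser.parse_args(argv)


def select(strains: Dict[str, Set[str]], segments: List[str],
           all_segments: bool, n: int) -> List[str]:
    """Pick up to n strain names, preferring strains with more of the requested segments."""
    required = set(segments)
    if all_segments:
        # Hard constraint: drop anything missing a requested segment.
        candidates = {name: segs for name, segs in strains.items() if required <= segs}
    else:
        # Soft constraint: keep everything and let the ranking below decide.
        candidates = dict(strains)
    ranked = sorted(candidates, key=lambda name: (-len(required & candidates[name]), name))
    return ranked[:n]


# Assumed toy data: two complete strains and one HA-only strain.
strains = {"s1": {"ha", "na"}, "s2": {"ha"}, "s3": {"ha", "na"}}
soft = parse_args(["--segments", "ha", "na"])
hard = parse_args(["--segments", "ha", "na", "--all-segments"])
soft_pick = select(strains, soft.segments, soft.all_segments, 3)
hard_pick = select(strains, hard.segments, hard.all_segments, 3)
```

Without the flag the HA-only strain still fills the last slot; with --all-segments it is excluded even though a slot is left empty.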

trvrb added the enhancement New feature or request label Feb 20, 2019

trvrb changed the title ~~Prioritize rather than filter "complete" genomes~~ Prioritize rather than filter to "complete" genomes Feb 20, 2019

rneher added a commit that referenced this issue May 26, 2019

select_strains: implement prioritization of complete genomes and make…

3ad6333

… hard completeness constraint optional. fixes issue #24

rneher closed this as completed May 26, 2019
