consider eliminating all of the identifier version removal in lca and tax #1962

Open
ctb opened this issue Apr 19, 2022 · 1 comment
Labels
5.0 issues to address for a 5.0 release 6.0 sourmash v6 changes
ctb commented Apr 19, 2022

when I first implemented the LCA stuff, I was young and foolish and thought there was a point to allowing flexibility in removing identifier versions - e.g. converting GCF_000422665.1 into GCF_000422665 for purposes of lineage mapping.

But in practice this isn't really used, and it has massively complicated the code base.

I think we should remove it completely and just standardize on identifiers for LCA and taxonomy being the first space-separated value in the signature name. We could even make it a SourmashSignature property, .ident. 🤔
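To make the proposal concrete, here is a minimal sketch of what "identifier = first space-separated value in the signature name" could look like, next to the version-stripping behavior that would go away. The helper names `get_ident` and `strip_version` are hypothetical and not part of sourmash; the example name string is made up, and splitting on any whitespace (rather than only a literal space) is an assumption.

```python
def get_ident(name: str) -> str:
    """Return the first whitespace-separated token of a signature name.

    Hypothetical helper: roughly what a `.ident` property might compute.
    """
    parts = name.split(maxsplit=1)
    return parts[0] if parts else ""


def strip_version(ident: str) -> str:
    """Drop a trailing numeric '.N' suffix, if there is one.

    Hypothetical helper illustrating the behavior proposed for removal,
    e.g. GCF_000422665.1 -> GCF_000422665.
    """
    base, sep, version = ident.rpartition(".")
    return base if sep and version.isdigit() else ident


# Illustrative signature name; the text after the accession is a placeholder.
name = "GCF_000422665.1 example organism name"

ident = get_ident(name)          # identifier with its version kept
stripped = strip_version(ident)  # identifier with the version removed
```

With the versions kept everywhere, only something like `get_ident` would be needed, and the lineage spreadsheet would have to use the full versioned identifier.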

A responsible way to do this would:

  • for sourmash v5, make --keep-identifier-versions=True the default everywhere; this would be a breaking change;
  • for sourmash v6, remove support for keep-identifier-versions altogether.

This musing is occasioned by #1808 where the implementation of LCA_SqliteDatabase is made more confusing by this identifier stuff.

@ctb ctb added 5.0 issues to address for a 5.0 release 6.0 sourmash v6 changes labels Apr 19, 2022

ctb commented Aug 15, 2022

plan: fix in tax, switch to using tax taxonomy loading in lca index per #2198, done.

@luizirber luizirber modified the milestones: 5.0, 6.0 Dec 1, 2023