Classification with reads instead of assembled genomes? #26

Open
stitam opened this issue Feb 6, 2024 · 3 comments

stitam commented Feb 6, 2024

In my data set, unfortunately, many Klebsiella genomes have a low confidence call ("None" or "Low") for the CPS type. I understand this may be due to sequencing errors or assembly failures, but I wonder whether there is anything I can do when resequencing the genome is not an option (many of these are publicly available genomes that we do not have in our lab).

For example, errors from the assembly procedure could be eliminated with an assembly-free method based on reads. Is there any plan for Kaptive to have such functionality? Do you know of any related projects that could be interesting? Many thanks.

Collaborator

kelwyres commented Feb 9, 2024

Hi,
Yes, we think the database has pretty good coverage of the capsule diversity among the Klebs pneumo species complex, so most low confidence calls are likely due to assembly issues rather than novel loci, and we are aware that this can apply to quite a lot of genomes. We're working on this right now and hope to have a new release within the next month that will enable more sensitive typing from genome assemblies. We're just finalising the scoring metrics. We also have a read-based version in development, but it has not yet been tested as rigorously.

Just a note that if you are looking outside the Klebs pneumo species complex we do expect many novel loci, so not all low confidence calls will be due to assembly issues. (Klebs oxytoca species complex database coming soon.)

Thanks,
Kelly

Author

stitam commented Feb 9, 2024

Thanks @kelwyres, that's great to hear, we're looking forward to testing the new versions. As I understand it, these dev versions are currently in private repos, so we cannot collaborate on them, right?

Thanks for your note, but we're working within the complex. There are currently about 55k K. pneumoniae genomes on NCBI and out of these about 9-10k have KL confidence "None" or "Low".
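
For anyone who wants to reproduce this kind of tally on their own results, here is a minimal sketch. It assumes a tab-separated results table with one row per assembly and a column holding the confidence call. The column name `confidence`, the function name `confidence_summary`, and the example rows are all hypothetical and are not taken from Kaptive's actual output format, so adjust them to match your file.

```python
import csv
import io
from collections import Counter


def confidence_summary(table, column="confidence", low_levels=("None", "Low")):
    """Tally confidence calls in a tab-separated table.

    `column` and `low_levels` are assumptions; change them to match
    the header and labels used in your own results file.
    Returns (counts per level, fraction of rows in low_levels).
    """
    reader = csv.DictReader(table, delimiter="\t")
    counts = Counter(row[column] for row in reader)
    total = sum(counts.values())
    low = sum(counts[level] for level in low_levels)
    return counts, (low / total if total else 0.0)


# Made-up example rows, for illustration only.
example = io.StringIO(
    "assembly\tconfidence\n"
    "a1\tGood\n"
    "a2\tLow\n"
    "a3\tNone\n"
    "a4\tVery high\n"
)
counts, low_fraction = confidence_summary(example)
print(counts)
print(f"{low_fraction:.0%} of assemblies are None/Low")
```

With a real file, pass an open file handle instead of the `io.StringIO` object.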

@kelwyres
Collaborator

Hi @stitam,
You may have already seen, but we released the new version of Kaptive last week. It has higher sensitivity for genomes with fragmented K loci. Actually, the code has been completely rewritten, making it much faster too, but that does mean it works a little differently, so please check out the docs if you plan to try it. If you do give it a go, we'd be very happy to receive any feedback you have.
Thanks
Kelly
