singlem query doesn't accept hyphens in sequences #17

Closed
ljmmm opened this issue Jun 16, 2017 · 7 comments

ljmmm commented Jun 16, 2017

Hi Ben,

I ran singlem query to assess similarity of ribosomal proteins within a set of genomes:
singlem query --query_otu_table fcpu_genomes_otu.csv --db fcpu_genomes.db/ > fcpu_rp_self_query.tsv

The analysis appeared to run OK; however, I got the following messages on stderr:
CFastaReader: Hyphens are invalid and will be ignored around line 48
CFastaReader: Hyphens are invalid and will be ignored around line 106
CFastaReader: Hyphens are invalid and will be ignored around line 138
etc.

The hyphens in my files were generated by singlem pipe (representing gaps in the alignments). However, it seems CFastaReader doesn't accept them. Is there a CFastaReader-friendly gap character that could be used instead of a hyphen?

Thanks,
Louis.

Owner

wwood commented Jun 16, 2017

Hey, thanks for the report.

That error message comes from BLAST itself. Does it result in query sequences not hitting themselves?

Author

ljmmm commented Jun 16, 2017 via email

wwood commented Jun 16, 2017

The default divergence cutoff is 4, but you can change that with a command line parameter.

For what you are doing, I would set the divergence to be high (say 15) and then analyse those results. I wouldn't bother replacing the hyphens with Ns. The only thing to do here is for me to replace hyphens in the singlem query code.
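As a rough illustration of the kind of fix described above, the sketch below removes gap characters from sequences before they are written out as FASTA, so that a reader such as BLAST's CFastaReader never sees a hyphen. This is not the singlem code: `strip_gaps` and `to_fasta` are hypothetical helper names, and dropping the gaps outright (rather than substituting another character) is an assumption.

```python
def strip_gaps(sequence: str, gap_chars: str = "-") -> str:
    """Return the sequence with alignment gap characters removed."""
    # str.translate with a deletion table drops every character in gap_chars.
    return sequence.translate(str.maketrans("", "", gap_chars))


def to_fasta(records) -> str:
    """Format (name, sequence) pairs as FASTA text with gaps removed.

    Hypothetical helper: the record layout is an assumption made for
    this example, not singlem's internal representation.
    """
    lines = []
    for name, sequence in records:
        lines.append(f">{name}")
        lines.append(strip_gaps(sequence))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example input: one aligned window sequence containing gaps.
    print(to_fasta([("otu1", "ATG--CGT-A")]), end="")
```

Removing gaps shortens the sequence, so any divergence computed against the ungapped form may differ from one computed on the aligned form.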

ljmmm commented Jun 16, 2017 via email

wwood commented Jun 17, 2017

I've updated the query code to report the correct divergence in 2139f93 and 2bf4183 so that issue will be fixed in the next version. BLAST no longer spews that warning either.

wwood commented Jun 17, 2017

and enjoy your weekend also sir.

wwood commented Jan 31, 2018

This is no longer an issue since BLAST is no longer used for querying singlem DBs.

wwood closed this as completed Jan 31, 2018