
Warning: Encountered reference sequence with only gaps #249

Open
wittler-github opened this issue Oct 20, 2022 · 2 comments


wittler-github commented Oct 20, 2022

As you can see in the attached files, I get this warning many times, although centrifuge completes without error. It uses a vast amount of data to build the index, about 40-70 GB I reckon.

Is this a significant issue, such that one should clean up the NCBI .fna files that contain only NNNNN... and no real sequence? The input .fna files were dustmasked with the centrifuge-download -d option.
Or is this just a statistical issue that is negligible given the large amount of data used, rather than something one should rectify?

centrifuge_build.zip

Warning: Encountered reference sequence with only gaps
Warning: Encountered reference sequence with only gaps
.....

Collaborator

mourisl commented Oct 20, 2022

I think it is fine to ignore those sequences. Many such cases come from dustmasker, which removes simple sequences, among others. So even if their original sequences were kept, they would be hard to classify with.

Author

wittler-github commented Oct 21, 2022

I think so too. In this case only a very small fraction of the reference sequences triggered this warning, and the very large input data (about 40-60 GB) was also dustmasked.
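
For anyone who wants to check how small that fraction is in their own data, here is a minimal sketch that lists the FASTA records containing nothing but masked or gap characters. It is not part of centrifuge: the function names are hypothetical, and it assumes that "only gaps" means a record made up entirely of `N`, `n`, or `-`, which may not match exactly how centrifuge-build decides to print the warning.

```python
import io


def iter_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def is_all_gaps(seq, gap_chars="Nn-"):
    """True when no character outside gap_chars is present.

    Assumption: N, n and - are the gap characters. An empty sequence
    also counts as all gaps.
    """
    return all(c in gap_chars for c in seq)


def gap_only_headers(handle):
    """Return the headers of records that consist only of gap characters."""
    return [h for h, s in iter_fasta(handle) if is_all_gaps(s)]


if __name__ == "__main__":
    # Tiny in-memory example; for real data, pass open("some_file.fna").
    demo = io.StringIO(">seq1\nACGTNNNN\n>seq2\nNNNN\nNNNN\n>seq3\nnnnn\n")
    print(gap_only_headers(demo))  # ['seq2', 'seq3']
```

Comparing the length of the returned list with the total number of records gives the affected fraction, which is the figure that matters when deciding whether the warnings can be ignored.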
