graftM create builds an HMM from sequences that have not been deduplicated #121

Closed
caity-s opened this issue Oct 2, 2015 · 6 comments

Comments

@caity-s

caity-s commented Oct 2, 2015

Hello Ben,

graftM create builds the HMM from the sequences exactly as provided, without deduplicating them first. This seems to result in an HMM that is biased towards the duplicated sequences.

Thanks,
Caitlin

@wwood
Collaborator

wwood commented Oct 2, 2015

Also, the HMM is created using sequences that ultimately do not pass the length cutoff.
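A minimal sketch of the two filters raised so far, assuming the input alignment is held as a dict of sequence name to aligned string with `-` marking gaps. The function name `filter_for_hmm` and the `min_aligned_fraction` parameter are hypothetical, chosen for illustration; they are not GraftM's actual API.

```python
def filter_for_hmm(aligned, min_aligned_fraction=0.3):
    """Return the alignment rows that would be used to build the HMM.

    aligned: dict mapping sequence name -> aligned sequence ('-' is a gap).
    Rows whose non-gap fraction is below the cutoff are dropped, then only
    the first row for each distinct aligned sequence is kept.
    Hypothetical helper, not part of GraftM.
    """
    kept = {}
    seen = set()
    for name, seq in aligned.items():
        if not seq:
            continue  # skip empty rows
        fraction = sum(1 for c in seq if c != "-") / len(seq)
        if fraction < min_aligned_fraction:
            continue  # fails the length cutoff
        if seq in seen:
            continue  # exact duplicate of a row already kept
        seen.add(seq)
        kept[name] = seq
    return kept


example = {
    "seqA": "MKV-LT",
    "seqB": "MKV-LT",  # duplicate of seqA
    "seqC": "M-----",  # mostly gaps, fails the cutoff
    "seqD": "MKVALT",
}
print(list(filter_for_hmm(example, min_aligned_fraction=0.5)))
# prints ['seqA', 'seqD']
```

Applying the cutoff before deduplication means a sequence that is going to be discarded anyway never acts as the representative of a duplicate group.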

@geronimp
Owner

Also, the sequences used to build the HMM are not dereplicated at the genus level, which would be a nice feature. I'm working on create now, so leave this one to me.

@wwood
Collaborator

wwood commented Oct 14, 2015

Also, the duplicates are included in the diamond db, so it is possible for a read to map to a sequence that is in the diamond database but not in the tree.

@wwood
Collaborator

wwood commented Oct 14, 2015

One might argue that the diamond sequences shouldn't be deduplicated, because deduplication considers only the positions aligned to the HMM, and diamond doesn't care about HMMs. What do you think, @geronimp?
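To make the distinction concrete, here is a rough sketch, assuming each record carries both its full-length sequence and the region that aligned to the HMM. The function `split_for_tree_and_diamond` and the record layout are hypothetical and do not reflect how GraftM stores its data.

```python
def split_for_tree_and_diamond(records):
    """records: list of (name, full_sequence, aligned_region) tuples.

    Returns (tree_names, diamond_names). Tree tips are deduplicated on the
    HMM-aligned region only; the diamond set keeps every full-length
    sequence. Hypothetical helper, not part of GraftM.
    """
    tree_names = []
    seen_regions = set()
    for name, _full, region in records:
        if region not in seen_regions:
            seen_regions.add(region)
            tree_names.append(name)
    diamond_names = [name for name, _full, _region in records]
    return tree_names, diamond_names


records = [
    ("seqA", "AAAMKVLTGGG", "MKVLT"),
    ("seqB", "CCCMKVLTDDD", "MKVLT"),  # same aligned region, different flanks
    ("seqC", "AAAMRVLTGGG", "MRVLT"),
]
tree, diamond = split_for_tree_and_diamond(records)
missing_from_tree = [n for n in diamond if n not in tree]
print(tree)               # ['seqA', 'seqC']
print(missing_from_tree)  # ['seqB']
```

Here seqB differs from seqA over its full length, so it is a distinct diamond target, yet it has no tip of its own in the tree. That is the situation described in the previous comment.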

@geronimp
Owner

@wwood Yes, I don't think we should deduplicate the diamond sequences. The branch I'm working on now includes sequences that have not been deduplicated, but it "doesn't" remove those which did not pass the min percent align filter.

IMO this is the best way.

@wwood
Collaborator

wwood commented Nov 16, 2015

This issue seems to be fixed now, thanks to Joel.

@wwood wwood closed this as completed Nov 16, 2015