dRep fails with symlinks pointing to the same file #25
Comments
Hi Johannes,
Best,
I don't think it contains duplicate files; they are all unique. I will check whether I can produce a minimal example of the error and send you the log file. It might take a little while, though. Regarding the ANI calculation, that is a good tip! Maybe dRep could automatically skip the MASH filter for smaller bins if the computations are not costly (which I believe they are not for small vs. small bins). Best,
I've checked and can confirm your hypothesis. The files had different names, but some of them were symlinks pointing to the same file, by my mistake. It seems CheckM resolves these links and does not filter duplicates from the list of input files? A simple fix would be to filter the list and emit a warning (just like it does for empty files, for instance).
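A minimal sketch of the filtering step suggested above, assuming the input is a plain list of genome paths. `drop_duplicate_targets` is a hypothetical helper written for illustration; it is not part of dRep or CheckM, and where such a check would actually hook into dRep is not shown here.

```python
import os
import warnings


def drop_duplicate_targets(paths):
    """Keep the first path for each resolved file; warn about the rest.

    Hypothetical helper, not dRep's actual code.
    """
    seen = {}    # resolved target -> first input path that pointed to it
    unique = []
    for path in paths:
        target = os.path.realpath(path)  # follows symlinks to the real file
        if target in seen:
            warnings.warn(
                f"{path} resolves to the same file as {seen[target]}; skipping"
            )
            continue
        seen[target] = path
        unique.append(path)
    return unique
```

Calling it on the genome list before any other step would leave one entry per underlying file, in the original order, and print a warning for each symlink that was dropped.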
Great, thanks for the suggestion. I will add it to my internal "to-do" list.
Hi, dRep produces errors when I try to lower the thresholds. I basically want to cluster/dereplicate all bins, regardless of size and completion level. So I ran
dRep dereplicate -pa 0.8 -sa 0.98 -comp 0 -con 50 -l 20000
which gave the following error in v2.0.5:

It would be great if dRep could also do ANI clustering and representative picking for smaller bins, which usually get sorted out; this is the real challenge in metagenome data.
Best,
Johannes