Need for nonamecheck in bedtools closest #212
Comments
Thanks for reporting this, Michael. @nkindlon can you add
Sure, will do.
To clarify, though, it is exactly because the chromosome names aren't stable or predictable that we opted to die instead of warn in the first place. We'd thought it better to die with an error than to allow possibly / probably wrong results to go through. Warnings can be missed if people redirect stderr and don't check it, or are running in batch mode and not catching it, but I guess there's only so much we can do with that.
Hey Neil, your point about the warning is a fair one, as warnings can easily be missed given the vast output of most bedtools operations. Michael, the concern with not exiting is that users might miss warnings and therefore have incorrect results owing to inconsistent chromosome labelling schemes.
I see - maybe adding a 'use nonamecheck if you want to bypass the filter' hint to the die message would be helpful then. To me it is still not clear why chromosome names not being in the format we expect is a problem - is it related to sort?
Good point - Neil, does "CHR_MG132_PATCH" fail your test for chromosome goodness?
Hi guys, All that being said, this and other user feedback we've gotten suggests that testing the name conventions doesn't work as well in practice as it does in theory. Revising it to cover all possibilities would be rather difficult; perhaps we should consider removing it.
I'm running into a problem with CORRECTION:
Hi @lparsons - does this behavior persist with the
My apologies, this does appear to be working now, I am guessing that Lance On Mon, May 4, 2015 at 3:07 PM, Aaron Quinlan notifications@github.com
So recently a bug with the filter for chromosome name consistency was fixed in bedtools intersect, adding the nonamecheck option to bypass it.
Bedtools closest fails in the same manner and needs a similar option.
It is also my opinion that the filter is just plain harmful - at worst it should emit a warning, not crash the program completely, as chromosome names are neither stable nor predictable.
Here is one example occurring in the current Ensembl mouse annotation:
ERROR: File /tmp/pybedtools.HdzgEa.tmp has inconsistent naming convention for record:
CHR_MG132_PATCH 124291803 124294101 ENSMUSG00000098810 . - protein_coding exon CAAA01180111.1
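To make the die-versus-warn trade-off discussed in this thread concrete, here is a minimal Python sketch of a naming-consistency filter with a bypass switch. It is not bedtools' actual check, which this thread does not show. The prefix rule, the function names `chrom_style` and `check_naming`, and the `strict` parameter are all hypothetical, chosen only to illustrate how a patch name such as CHR_MG132_PATCH could clash with bare names like 1 or X under a naive rule.

```python
import warnings


def chrom_style(name):
    """Classify a chromosome name with a naive, hypothetical prefix rule."""
    # Assumption: any name starting with "chr" (case-insensitive) counts as "prefixed".
    return "prefixed" if name.lower().startswith("chr") else "bare"


def check_naming(names, strict=True):
    """Return the set of naming styles seen in `names`.

    With strict=True, mixed styles raise ValueError (the "die" behaviour).
    With strict=False, mixed styles only emit a warning (the "warn" behaviour).
    """
    styles = {chrom_style(n) for n in names}
    if len(styles) > 1:
        msg = "inconsistent naming convention: " + ", ".join(sorted(styles))
        if strict:
            raise ValueError(msg)
        warnings.warn(msg)
    return styles


# Names resembling an Ensembl-style annotation that also contains a patch scaffold.
example_names = ["1", "2", "X", "CHR_MG132_PATCH"]
```

Calling `check_naming(example_names)` raises under this rule, while `check_naming(example_names, strict=False)` lets the run continue and reports the mismatch on stderr, which is exactly the case where a redirected or unread stderr would hide the problem.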