Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

empty bed files #30

Closed
jpaulLD opened this issue Nov 9, 2012 · 8 comments
Closed

empty bed files #30

jpaulLD opened this issue Nov 9, 2012 · 8 comments

Comments

@jpaulLD
Copy link

jpaulLD commented Nov 9, 2012

bedtools exhibits incorrect behavior when the bed file is empty (i.e. no records), saying that the file cannot be opened. I've chased this down to the use of ios::good() after the file has been opened instead of ios::fail(). This occurs in 4 different places in the code.

I've fixed locally, but don't really want to download/fork the 500MB git repo to give you a pull request. Nonetheless, thought I'd let you know.

@arq5x
Copy link
Owner

arq5x commented Nov 9, 2012

Thanks. Could you post a patch file? Sorry about the repo size...working
on that.

bedtools exhibits incorrect behavior when the bed file is empty (i.e. no
records), saying that the file cannot be opened. I've chased this down to
the use of ios::good() after the file has been opened instead of
ios::fail(). This occurs in 4 different places in the code.

I've fixed locally, but don't really want to download/fork the 500MB git
repo to give you a pull request. Nonetheless, thought I'd let you know.


Reply to this email directly or view it on GitHubhttps://github.com//issues/30.

@jpaulLD
Copy link
Author

jpaulLD commented Nov 9, 2012

Let me know if you get the patch -- attempting to send via email since
issues doesn't support file upload.

In case it doesn't come through: https://gist.github.com/4048664

Actually, I think the correct test is ios::bad(). Patch updated. Sorry for the confusion.

On Fri, Nov 9, 2012 at 1:48 PM, Aaron Quinlan notifications@github.comwrote:

Thanks. Could you post a patch file? Sorry about the repo size...working
on that.

bedtools exhibits incorrect behavior when the bed file is empty (i.e. no
records), saying that the file cannot be opened. I've chased this down
to
the use of ios::good() after the file has been opened instead of
ios::fail(). This occurs in 4 different places in the code.

I've fixed locally, but don't really want to download/fork the 500MB git
repo to give you a pull request. Nonetheless, thought I'd let you know.


Reply to this email directly or view it on GitHub<
https://github.com/arq5x/bedtools/issues/30>.


Reply to this email directly or view it on GitHubhttps://github.com//issues/30#issuecomment-10245840.

@arq5x
Copy link
Owner

arq5x commented Nov 10, 2012

Thanks, I got the gist. Will patch this in soon. Cheers.

@jmarshall
Copy link
Contributor

When the open fails, failbit will be set. When isGzipFile()'s read() encounters an empty file, both eofbit and failbit will be set. Thus fail() can't distinguish these two cases, and bad() can't either and in fact won't fall over when open fails. So neither of them is right!

Certainly Aaron has already realised this... what I mainly came to say is when this is being disentangled, please could you add printing out errno to the error message. So the least-changes fix for this would be:

  • Now that isGzipFile() only checks one character, it could use file->peek() == 0x1f instead of read(). Then isGzipFile() wouldn't affect failbit, because unlike read(), peek() doesn't affect that bit. And when the file open fails, isGzipFile() won't do anything and in particular (probably) won't affect errno.
  • Use the fail() version of your gist, but please while you're there also add some strerror; for example,
-        if ( !(_bedStream->good()) ) {
+        if ( _bedStream->fail() ) {
             cerr << "Error: The requested bed file (" 
                  << bedFile 
-                 << ") could not be opened. Exiting!" << endl;
+                 << ") could not be opened: "
+                << strerror(errno)
+                << ". Exiting!" << endl;

Printing out this sort of detail from the OS can help head off user problems like the "broadPeak file format in BEDTools" one the other day -- if that was indeed partly down to filename extension mistakes.

(The better more-work change would probably be to check for errors straight after each new ifstream(...) and again after each new igzstream reopening. Tedious, but more obviously guaranteed not to disrupt errno before you check for errors and print it out.)

@arq5x
Copy link
Owner

arq5x commented Nov 15, 2012

Thanks for the pointers, John. I am rather swamped at the moment, but will integrate these changes as soon as I have a breather.

@jpaulLD
Copy link
Author

jpaulLD commented Nov 15, 2012

John, you're correct, I didn't fully appreciate the subtleties of this problem. That being said, I think ios::bad() works, at least as I understand it.

There are three bits: bad, fail, and eof. When opening a non-existent file, (at least) bad gets set. It does not get set for an empty file, but isGzipFile sets fail and eof (but not bad). Thus, I think bad() is sufficient. Or perhaps I'm still missing something -- my C++ is a bit rusty.

In either case, I think the non-minimal solution you suggest is the correct one. And it could be factored out and re-used, rather than copied.

@jmarshall
Copy link
Contributor

@jpaulLD: I think you will find that opening a non-existent file causes failbit to be set, not badbit (C++98, 27.8.1.6/2). I agree that this is perhaps slightly unexpected...

@arq5x
Copy link
Owner

arq5x commented Dec 27, 2012

Thanks for the great suggestions. Commit ca50f590f integrates your ideas and resolves the original issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants