Invalid BGZF block header #1131

Closed
alex-kj-chin opened this Issue Jul 20, 2018 · 6 comments

alex-kj-chin commented Jul 20, 2018

I am using JBrowse to visualize data stored on a remote server over ssh port forwarding and have run into an issue with BGZF headers. However, I have also reproduced this issue running my setup locally too.

When I first ran JBrowse on a bam, it seemed to work (I could be wrong, but I don't think anything in the console is cause for concern. I think the two errors in the console are just because I didn't run generate-names.pl, since I'm not using index features, and the warnings are because Chrome doesn't cache things over a MB).
[screenshot: 2018-07-20 at 6:08:08 pm]
However, after going to reads further and further down in the bam, I eventually got the error "invalid BGZF block header, skipping" (there is supposed to be a read in the picture below, but it is not showing up in JBrowse, although it does show up in IGV).
[screenshot: 2018-07-20 at 6:50:18 pm]

Upon trying JBrowse with other bams, I got the same error. Interestingly, some of the reads are displayed, but others are not.

Some reads are displayed:
[screenshot: 2018-07-20 at 6:29:04 pm]
Other reads are not (there is supposed to be a read here; I checked in IGV):
[screenshot: 2018-07-20 at 6:30:46 pm]

According to #552, this error is because of incorrect binary formatting. However, I am able to open the aforementioned bams with samtools, view them in IGV, manipulate them with Bio.bgzf, and I also manually checked that the first two 8-bit ints in the files are 31 and 139 respectively.

Additionally, #552 was closed because of a known Apache fix, but I am using python's simple http server module (python3 -m http.server 8080 is the exact command), so I don't think mime_magic is the issue here.

Possible leads:

  • One thought is that this may have something to do with file size. I have found that reads towards the start of bams usually are displayed but reads towards the ends of bams are not.
  • I used Bio.bgzf to check out the EOF blocks (this is a block that should be in every bam, as explained in section 4.1.2 on page 13 of these specs). Using Bio.bgzf._load_bgzf_block(handle), I got a block size of 28 (this was confirmed with Bio.bgzf.BgzfBlocks), but the data I got was empty instead of \x1f\x8b\x08\x04\x00\x00\x00\x00\x00\xff\x06\x00BC\x02\x00\x1b\x00\x03\x00\x00\x00\x00\x00\x00\x00\x00\x00 as expected. I'm not sure if this is related but thought it should be noted.
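
The two file checks described above (the gzip magic bytes 31 and 139, and the 28-byte EOF block) can also be run against the raw bytes on disk, without any BGZF library. What follows is a minimal sketch, not part of the original report: the names `check_bgzf` and `eof_payload` are made up for illustration, and the marker bytes are the ones quoted in the bullet above. It also suggests why the data came back empty: the EOF block is a valid gzip member whose decompressed payload is zero bytes long, so a function that returns decompressed data would be expected to return an empty result for it (an assumption about what `_load_bgzf_block` returns, not something confirmed in this thread).

```python
import zlib

# The 28-byte BGZF end-of-file marker, as quoted in the report above.
BGZF_EOF = bytes.fromhex(
    "1f8b0804 00000000 00ff0600 42430200"
    "1b000300 00000000 00000000"
)


def check_bgzf(path):
    """Return (has_gzip_magic, has_eof_marker) for the file at `path`.

    Hypothetical helper: it only looks at the first 2 and last 28 raw bytes.
    """
    with open(path, "rb") as f:
        magic = f.read(2)
        f.seek(0, 2)  # seek to the end to learn the file size
        size = f.tell()
        f.seek(max(0, size - len(BGZF_EOF)))
        tail = f.read()
    return magic == b"\x1f\x8b", tail == BGZF_EOF


def eof_payload():
    """Decompress the EOF block; wbits=31 tells zlib to expect a gzip wrapper."""
    return zlib.decompress(BGZF_EOF, 31)
```

Calling `check_bgzf("sample.bam")` (the file name is a placeholder) on a well-formed bam should return `(True, True)`. Passing both checks says the file is intact on disk; it says nothing about how the web server delivers it.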

Contributor

cmdcolin commented Jul 21, 2018

The issue is very likely that python http.server does not handle byte-range requests (HTTP Range header)!

http://gmod.org/wiki/JBrowse_FAQ#What_webserver_is_needed_for_JBrowse
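
One way to confirm this diagnosis is to send a single request with a Range header and look at what comes back. The sketch below is an illustration only, not JBrowse code: `supports_byte_ranges` is a hypothetical helper name, and it assumes a server that honors the header replies with status 206 and a Content-Range header, while a server that ignores it replies 200 with the whole file.

```python
import urllib.request


def supports_byte_ranges(url, start=0, end=99, timeout=10):
    """Return True if the server answers a Range request with a partial response.

    Hypothetical helper for illustration; `url` should point at a file
    larger than the requested range, such as a bam.
    """
    req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        status = resp.status
        content_range = resp.headers.get("Content-Range")
        # Read one byte more than requested so an oversized body is
        # detected without downloading the whole file.
        body = resp.read(end - start + 2)
    wanted = end - start + 1
    return status == 206 and content_range is not None and len(body) <= wanted
```

For the setup in this issue, something like `supports_byte_ranges("http://localhost:8080/sample.bam")` (the path is a placeholder) would be the call to try against the running server.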

Collaborator

rbuels commented Jul 21, 2018

Oh man, it looks like you are fetching huuuuge chunks, probably because the Range header on the HTTP requests is being ignored. It must be dog slow, is it?

rbuels added the user support label Jul 21, 2018

Author

alex-kj-chin commented Jul 23, 2018

Ah yes--don't know how I missed that! Running RangeHTTPServer fixed the issue.

And yeah not having to load hundreds of megabytes of data did make it a lot faster 👍
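
For readers wondering what a range-capable server does differently, here is a rough sketch built on the standard library's `SimpleHTTPRequestHandler`. It is not the RangeHTTPServer implementation, and the names `parse_range`, `RangeHandler`, and `make_server` are invented for this example. It assumes only the simple `bytes=start-end` and `bytes=start-` forms need handling; any other Range header falls back to a normal full response, which HTTP allows.

```python
import functools
import os
import re
from http.server import HTTPServer, SimpleHTTPRequestHandler

RANGE_RE = re.compile(r"bytes=(\d+)-(\d*)$")


def parse_range(header, size):
    """Return an inclusive (start, end) pair, or None if the header is unusable."""
    m = RANGE_RE.match(header or "")
    if not m:
        return None
    start = int(m.group(1))
    end = int(m.group(2)) if m.group(2) else size - 1
    end = min(end, size - 1)  # clamp to the last byte of the file
    if start > end:
        return None
    return start, end


class RangeHandler(SimpleHTTPRequestHandler):
    def do_GET(self):
        path = self.translate_path(self.path)
        header = self.headers.get("Range")
        if not header or not os.path.isfile(path):
            return super().do_GET()
        size = os.path.getsize(path)
        rng = parse_range(header, size)
        if rng is None:
            # Simplification: ignore Range forms this sketch does not support.
            return super().do_GET()
        start, end = rng
        length = end - start + 1
        self.send_response(206)
        self.send_header("Content-Type", self.guess_type(path))
        self.send_header("Content-Range", f"bytes {start}-{end}/{size}")
        self.send_header("Content-Length", str(length))
        self.send_header("Accept-Ranges", "bytes")
        self.end_headers()
        with open(path, "rb") as f:
            f.seek(start)
            self.wfile.write(f.read(length))


def make_server(port=8080, directory="."):
    """Build (but do not start) a local server that serves `directory`."""
    handler = functools.partial(RangeHandler, directory=directory)
    return HTTPServer(("127.0.0.1", port), handler)
```

Starting it would look like `make_server().serve_forever()`. With a handler along these lines, a request for a small slice of a bam returns only that slice, which is what avoids the hundreds of megabytes mentioned above.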

Contributor

cmdcolin commented Jul 23, 2018

Good to hear :)

I created #1134 to try and provide a proper warning when the server appears to be responding to range requests badly

maybe close for now

cmdcolin closed this Jul 23, 2018

Contributor

cmdcolin commented Jul 23, 2018

btw sorry you had to do all that detective work for this! glad to have an easy fix :)

Author

alex-kj-chin commented Jul 23, 2018

Thanks for making #1134! I agree with the closure. And also glad it worked out so easily!
