Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

null exception in remoteFile.js (content-range header) #20

Closed
jrobinso opened this issue Jan 17, 2019 · 18 comments
Closed

null exception in remoteFile.js (content-range header) #20

jrobinso opened this issue Jan 17, 2019 · 18 comments

Comments

@jrobinso
Copy link
Contributor

jrobinso commented Jan 17, 2019

Hey guys, I found a bug that occurs with some webservers when running in a browser. I'm reporting the issue, and wondering if you've seen it in JBrowse? I have some URLs below that reproduce the problem.

The code below is lines 33 and 34 of remoteFile.js. sizeMatch can be null when 'content-range' is not returned in the headers, causing the next line to fail.

  // try to parse out the size of the remote file
  const sizeMatch = /\/(\d+)$/.exec(response.headers.get('content-range'))
  if (sizeMatch[1]) this._stat = { size: parseInt(sizeMatch[1], 10) }

The test file I'm using is

const cramUrl =
  'https://data.broadinstitute.org/igvdata/test/data/cram/NA12878.cram'
const indexUrl =
  'https://data.broadinstitute.org/igvdata/test/data/cram/NA12878.crai'

Note that this error does not occur when running from node, only when running from a browser with a webpack converted bundle. Looking at the network traffic the 'content-range' header is returned by the server.

@jrobinso
Copy link
Contributor Author

Never mind -- CORS problem, the content-range header was not exposed. When implementing BAM in IGV I did not assume I would ever know the size of a remote file and worked around that assumption, consequently, I didn't see this error until now. Knowing the size seems cooked into CRAM, which is probably fine. Its not required for servers to return content-length with a 200 response, but reading the spec I think the content-range header is required with a partial response.

@jrobinso
Copy link
Contributor Author

jrobinso commented Jan 17, 2019

I'll leave this open to be sure you see and are aware of it, but feel free to close after reading.

This is likely to be a common problem as the content-range header is not automatically exposed. I suggest adding a exception, something like this

      const sizeMatch = /\/(\d+)$/.exec(response.headers.get('content-range'))
      if (sizeMatch && sizeMatch[1]) {
        this._stat = { size: parseInt(sizeMatch[1], 10) }
      } else {
        throw new Error(
          'No content-Range header in response. Content-Range is required, check CORS policy of server')
      }

@jrobinso jrobinso reopened this Jan 17, 2019
@cmdcolin
Copy link
Contributor

cmdcolin commented Jan 17, 2019

Actually jbrowse does not use the remoteFile instance given here in the jbrowse code so is not updated with fixes as much. So in this case for example, we noticed this same issue with CORS content-range not being accessible in many instances and made it not required on the jbrowse codebase, but didn't update this remoteFile.js. We can probably add a fix to the cram-js codebase for this or you might alternatively be interested in using the "filehandle" parameters to the CRAM codebase as that provides an abstraction for files you can control (would also be relevant to #21 that you created)

Random note but I did see a user have some weird issues related to requesting range out of bounds http error 416 that was maybe related to content-range not being accessible in CORS but I hadn't seen that in any other cases https://sourceforge.net/p/gmod/mailman/message/36519712/

@cmdcolin
Copy link
Contributor

Note that this 416 was not even relevant to cram though...probably something weird on the jbrowse side of things

@jrobinso
Copy link
Contributor Author

Hi @cmdcolin I also need to supply my own IO File implementation for other reasons, including google API keys and oauth parameters. However, the "stat()" method seems to be required and it looks like it should return {size: }. So I don't know how to get around setting that length. Could you point me to the jbrowse implementation of "file"? Thanks.

You should only get the 416 if you request a range which is completely outside (no overlap) with the file. You can request a range the file as long as the start position is < the file length.

@jrobinso
Copy link
Contributor Author

@cmdcolin It makes sense for clients to supply their own "file" implementation rather than support everything in the cram code, btw, so I'm fine with that. localFile and remoteFile can be for Node users and as examples.

@jrobinso
Copy link
Contributor Author

@cmdcolin back to file size, there is code like this in file.js. If fileSize is undefined it bails. So I don't see how you can avoid setting file size, and hence requiring content-range.

async readBlockHeader(position) {
    const sectionParsers = await this.getSectionParsers()
    const { cramBlockHeader } = sectionParsers
    const { size: fileSize } = await this.file.stat()

    if (position + cramBlockHeader.maxLength >= fileSize) return undefined

@cmdcolin
Copy link
Contributor

The fileSize there is undefined, there so then this check if (position + cramBlockHeader.maxLength >= fileSize) is actually false so it does not return there.

@cmdcolin
Copy link
Contributor

if(10>undefined) { console.log('hello world') }

will not print :)

@rbuels
Copy link
Contributor

rbuels commented Jan 17, 2019 via email

@cmdcolin
Copy link
Contributor

@rbuels it seems to me that the file size is just sanity check in the code, setting it to undefined allows it to work without knowing remote file size from what I can tell in my tests

@jrobinso
Copy link
Contributor Author

@cmdcolin I'll do some testing with undefined as file size and let you know how it goes.

@anderspitman
Copy link

FWIW according to the spec a 206 response is required to include Content-Range: https://tools.ietf.org/html/rfc7233#section-4.1

@cmdcolin
Copy link
Contributor

@anderspitman that is true but even if response includes Content-Range, the javascript may not have access to this header in a CORS scenario

With CORS you need the extra allowance "Access-Control-Expose-Headers: Content-Range" or similar. This does not exist by default on many CORS setups which is what @jrobinso discussed

@jrobinso
Copy link
Contributor Author

@anderspitman Yes the server returns it, but the browser does not expose it to javascript without CORS permission (expose-header)

@anderspitman
Copy link

Right, I realize it can break at either layer. I was mostly reiterating @jrobinso's soft assertion with a reference that HTTP requires it.

@jrobinso
Copy link
Contributor Author

I've tweeked remoteFile as follows which appears to work with no content-header. Previously it would fail on parsing because sizeMatch was null. Setting _stat is neccessary because without it failures happen downstream.

      // try to parse out the size of the remote file
      const sizeMatch = /\/(\d+)$/.exec(response.headers.get('content-range'))
      if (sizeMatch && sizeMatch[1]) {
        this._stat = { size: parseInt(sizeMatch[1], 10)}
      } else {
        this._stat = { size: undefined}
      }

@jrobinso
Copy link
Contributor Author

Everything seems good with size === undefined.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants