PDF load error when Chrome responds to non-range XHR request with 206 response #11132

Closed
robertknight opened this issue Sep 10, 2019 · 4 comments


@robertknight

Configuration:

  • Web browser and its version: Chrome 78.0.3902.4 (Official Build) dev (64-bit)
  • Operating system and its version: macOS 10.4.6
  • PDF.js version: v1.1.215 (but relevant code still exists in latest master)
  • Is a browser extension: No

We have observed an error in our application where PDF.js sometimes fails to load PDFs and displays an error about an unexpected 206 response. Adding logging to PDF.js's networking code shows that the initial XHR request from PDF.js to fetch the file has no Range header set, but Chrome sometimes responds with a 206 (Partial Content) response instead of a 200. However, the byte range in the response covers the entire file.

Our application is admittedly using an old version of PDF.js, but the logic which generates the error still exists in src/display/network.js on the current master branch so I assume the issue still applies there.

I have not yet found a completely reliable way to reproduce the behavior in Chrome, but logs from chrome://net-export suggest that the XHR request from PDF.js gets translated into a request with a Range: 0-<length of file> header and an If-None-Match: <etag> header. This is then resolved with a 206 response whose Content-Length matches the length of the file.
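To make the "206 that covers the whole file" case concrete, here is a minimal sketch of how such a response could be recognized from its Content-Range header. The helper name `coversEntireFile` is hypothetical and is not part of PDF.js. It assumes the header uses the standard `bytes <start>-<end>/<total>` form with a known total length.

```javascript
// Hypothetical helper, not part of PDF.js.
// Returns true when a Content-Range value such as "bytes 0-1023/1024"
// describes the complete resource (starts at byte 0, ends at the last byte).
function coversEntireFile(contentRange) {
  const match = /^bytes (\d+)-(\d+)\/(\d+)$/.exec((contentRange || "").trim());
  if (!match) {
    // Header missing, malformed, or total length unknown ("*").
    return false;
  }
  const start = Number(match[1]);
  const end = Number(match[2]);
  const total = Number(match[3]);
  return start === 0 && end === total - 1;
}
```

A caller could use a check like this to treat a full-coverage 206 the same as a 200, although whether that is an acceptable workaround is a separate question.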

Original downstream issue: hypothesis/lms#890 (comment with details showing Chrome's behavior).

Steps to reproduce the problem:

I'm still trying to create an isolated reproduction, as the issue does not always happen reliably. From the testing I've done so far, it looks like the issue only happens if the PDF has been loaded recently and so Chrome's HTTP cache has some information about it.

What is the expected behavior? (add screenshot)

Assuming this is legitimate behavior from Chrome, the PDF should continue to load when this happens. I asked on Twitter what browsers are allowed to do here, and a Googler responded that he thought it was allowed but that browsers differ in their behavior.

Even if this behavior is eventually deemed a bug in Chrome, there are versions of Chrome out there that exhibit this behavior which I think will need to be worked around.

What went wrong? (add screenshot)

The PDF fails to load and an "Unexpected 206 response" error message is displayed (see the downstream issue for screenshots).

Additional Notes

When PDF.js makes network requests, they may or may not include a Range header. If a Range header is provided, either a 200 or 206 (Partial Content) response is accepted. If no Range header is set, PDF.js produces an error when the XHR completes with a 206 response.
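The rule described above can be sketched roughly as follows. This is an illustration of the described behavior only, not the actual code in src/display/network.js; the function name `isExpectedStatus` and its parameters are hypothetical.

```javascript
// Illustrative sketch only; names are hypothetical, not PDF.js's API.
const OK_RESPONSE = 200;
const PARTIAL_CONTENT_RESPONSE = 206;

// Returns true if the response status is acceptable for the request.
// - Request sent with a Range header: 200 or 206 is accepted.
// - Request sent without a Range header: only 200 is accepted,
//   so a 206 is treated as unexpected.
function isExpectedStatus(status, sentRangeHeader) {
  if (sentRangeHeader) {
    return status === OK_RESPONSE || status === PARTIAL_CONTENT_RESPONSE;
  }
  return status === OK_RESPONSE;
}
```

Under this rule, the case reported here corresponds to `isExpectedStatus(206, false)`, which is rejected even though the response body contains the whole file.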

@Snuffleupagus
Collaborator

Snuffleupagus commented Sep 10, 2019

PDF.js version: v1.1.215 (but relevant code still exists in latest master)

The version you're using is over four years old, and there have been thousands of commits since then (with hundreds of parsing/rendering bugs fixed in addition to many general improvements), hence you really need to update to the latest release as found at https://github.com/mozilla/pdf.js/releases since your version is unsupported.

Our application is admittedly using an old version of PDF.js, but the logic which generates the error still exists in src/display/network.js on the current master branch so I assume the issue still applies there.

Rather than assuming that, please make sure to actually test with the latest version since that's the only one that's supported. (While the code may still exist, note that it's been moved from the worker-thread to the main-thread since then.)

Furthermore, up-to-date versions of the PDF.js library will generally use the Fetch API rather than XMLHttpRequest when possible. Does the problem exist in that case as well?

I'm still trying to create an isolated reproduction, as the issue does not always happen reliably.

Without a reduced test-case it's probably not possible to make any inroads here unfortunately; but please make sure that when you come up with one it's based on the latest PDF.js version.

Even if this behavior is eventually deemed a bug in Chrome, there are versions of Chrome out there that exhibit this behavior which I think will need to be worked around.

Please keep in mind that browser-specific compatibility hacks are generally not allowed in the main PDF.js code-base, and will have to be limited to files such as https://github.com/mozilla/pdf.js/blob/master/src/shared/compatibility.js and/or https://github.com/mozilla/pdf.js/blob/master/src/display/api_compatibility.js

@robertknight
Author

Thanks for the response. I'll try updating our application to the current PDF.js version and see if the problem can still be reproduced.

@robertknight
Author

We are still in the process of updating our application to PDF.js v2.x. In the meantime I was able to reproduce the basic issue in Chrome with an isolated HTML test case: https://gist.github.com/robertknight/e31f6448f3341189a8485c3cb0188aed. I will file this upstream since this looks like a bug in Chrome to me.

This uses fetch and operates on the main thread, so the problem observed in the context of PDF.js v1.x is not specific to the fact that it uses XHR + Web Workers.

@robertknight
Author

The issue fundamentally still exists in Chrome, and they have confirmed it, but so far I've not been able to reproduce it in the context of PDF.js since updating our application to v2.x. There might be something different about the network request pattern which means the issue isn't triggered any longer. I'm tentatively going to close this for now and will let you know if we see it happen again.
