Need to resolve issues with HTTP(S) settings #8

Open
rkwright opened this issue Aug 8, 2018 · 9 comments

Comments

@rkwright
Member

rkwright commented Aug 8, 2018

The files posted at http://readium.org/readium-test-files/ do not have correct settings for HTTPS. This is mainly due to settings for readium.org.

The following notes are from Daniel Weck.

@rkwright
Member Author

rkwright commented Aug 8, 2018

from @danielweck

==========================
4) Out of curiosity (to ensure that Readium's cloud/web reader optimally fetches data from the given links), I checked the HTTP CORS headers in the EPUB URLs (Access-Control-Allow-Origin, etc.), as well as HTTP 1.1 "Accept-Ranges: bytes", the "Content-Type" header, and secure HTTPS support. Here is an example with epub30-test-0201.epub:
curl -I -X GET {LINK_URL}

When {LINK_URL} = http[s]://readium.github.io/readium-test-files/functional/Revised-TS-FXL/epub30-test-0201.epub
(note that the result is the same with HTTPS and HTTP)
=> I imagine there is a DNS CNAME domain redirection for readium.org ==> readium.github.io, because the response is HTTP 301 "permanently moved" to:

{LINK_URL} = http://readium.org/readium-test-files/functional/Revised-TS-FXL/epub30-test-0201.epub
(note that this is not secure HTTPS!)
=> the HTTP response correctly supplies all useful headers.

{LINK_URL} = https://raw.githubusercontent.com/readium/readium-test-files/master/functional/Revised-TS-FXL/epub30-test-0201.epub
(this is GitHub's default "download" URL from their web interface)
=> interestingly, correctly supplies HTTP CORS and range headers, but 'Content-Type' is "application/octet-stream" instead of "application/epub+zip".

{LINK_URL} = https://rawgit.com/readium/readium-test-files/master/functional/Revised-TS-FXL/epub30-test-0201.epub
=> interestingly, RawGit responds with HTTP 301 "permanently moved" to raw.githubusercontent.com (see above).

So, it would seem that the best URL format is:
http://readium.org/readium-test-files/functional/Revised-TS-FXL/epub30-test-02{XX}.epub
...but, read below :(
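
The per-URL comparison above can be repeated mechanically. The following is a minimal sketch, assuming the response headers have already been collected into a mapping (for example from `curl -I` output); the function name `audit_headers` and the canned header set are illustrative and not part of any Readium tooling.

```python
from typing import Mapping

EPUB_MIME = "application/epub+zip"

def audit_headers(headers: Mapping[str, str]) -> dict:
    """Summarise the response headers that matter for remote EPUB fetching."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    h = {name.lower(): value.strip() for name, value in headers.items()}
    # Ignore any parameters after ';' in the Content-Type value.
    content_type = h.get("content-type", "").split(";")[0].strip().lower()
    return {
        "cors": "access-control-allow-origin" in h,
        "byte_ranges": h.get("accept-ranges", "").lower() == "bytes",
        "epub_content_type": content_type == EPUB_MIME,
    }

# Canned headers mirroring the raw.githubusercontent.com observation above
# (illustrative values, not a captured response).
raw_github = {
    "Access-Control-Allow-Origin": "*",
    "Accept-Ranges": "bytes",
    "Content-Type": "application/octet-stream",
}
print(audit_headers(raw_github))
```

For a live check, the mapping could come from `dict(urllib.request.urlopen(url).headers.items())`, although that only works for URLs that pass certificate validation.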

==========================
5) There is an HTTPS certificate problem with https://readium.org:
Error code: SSL_ERROR_BAD_CERT_DOMAIN
readium.org uses an invalid security certificate.
The certificate is only valid for the following names: www.github.com, *.github.io, *.githubusercontent.com, *.github.com, github.com, github.io, githubusercontent.com
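
To see why the browser rejects the certificate, here is a simplified sketch of hostname matching against the names listed in the error. It assumes a leading `*.` wildcard covers exactly one DNS label, which is the usual rule; real TLS libraries apply further checks, and the helper names are made up for illustration.

```python
# Names copied from the SSL_ERROR_BAD_CERT_DOMAIN message above.
CERT_NAMES = [
    "www.github.com", "*.github.io", "*.githubusercontent.com",
    "*.github.com", "github.com", "github.io", "githubusercontent.com",
]

def hostname_matches(hostname: str, pattern: str) -> bool:
    """Simplified match: a leading '*' label stands for exactly one label."""
    host_labels = hostname.lower().split(".")
    pattern_labels = pattern.lower().split(".")
    if len(host_labels) != len(pattern_labels):
        return False
    if pattern_labels[0] == "*":
        return host_labels[1:] == pattern_labels[1:]
    return host_labels == pattern_labels

def covered_by_certificate(hostname: str, names=CERT_NAMES) -> bool:
    return any(hostname_matches(hostname, name) for name in names)

print(covered_by_certificate("readium.org"))        # False
print(covered_by_certificate("readium.github.io"))  # True
```

The custom domain is served by the same GitHub infrastructure, but none of the listed names covers it, hence the domain mismatch.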

This is problematic because the Readium cloud/web reader app (just like any other website) cannot mix secure HTTPS and insecure HTTP, so we cannot use the optimal http[s]://readium.org URL mentioned above. Instead, we have to fall back to https://raw.githubusercontent.com (which serves "application/octet-stream" instead of "application/epub+zip" in the HTTP Content-Type header). Both Content-Types are supported by the Readium web app so this is not a deal-breaker, but it still sucks that we cannot directly use the http[s]://readium.org links (or even https://readium.github.io because of the HTTP 301 redirect to insecure readium.org).

Example of a working Readium web/cloud reader link:
https://readium.firebaseapp.com/?epub=https%3A%2F%2Fraw.githubusercontent.com%2Freadium%2Freadium-test-files%2Fmaster%2Ffunctional%2FRevised-TS-FXL%2Fepub30-test-0201.epub

...also works with RawGit as this service responds with an HTTP 301 redirect to the above regular GitHub URL:
https://readium.firebaseapp.com/?epub=https%3A%2F%2Frawgit.com%2Freadium%2Freadium-test-files%2Fmaster%2Ffunctional%2FRevised-TS-FXL%2Fepub30-test-0201.epub
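
The reader links above carry the EPUB location as a percent-encoded `epub` query parameter. A small sketch of building such a link follows; the helper name `reader_link` is an assumption for illustration, while the base URL and parameter name are taken from the examples above.

```python
from urllib.parse import parse_qs, quote, urlsplit

READER_BASE = "https://readium.firebaseapp.com/"

def reader_link(epub_url: str, reader_base: str = READER_BASE) -> str:
    """Build a cloud reader link with the EPUB URL as the 'epub' parameter."""
    # safe="" makes quote() percent-encode ':' and '/' as well.
    return f"{reader_base}?epub={quote(epub_url, safe='')}"

epub = (
    "https://raw.githubusercontent.com/readium/readium-test-files/master/"
    "functional/Revised-TS-FXL/epub30-test-0201.epub"
)
link = reader_link(epub)
print(link)

# Decoding the query string recovers the original EPUB URL.
print(parse_qs(urlsplit(link).query)["epub"][0] == epub)
```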

==========================
6) Unfortunately, none of the URLs listed above responds with the HTTP CORS header allowing Content-Length to be queried remotely (from another origin). The net result is that the Readium cloud/web reader cannot use HTTP 1.1 byte-range requests (Accept-Ranges), so the app falls back to downloading the entire EPUB into memory instead of fetching byte ranges as needed.
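
For context, byte-range fetching relies on a `Range` request header and a `Content-Range` response header. The sketch below shows the header arithmetic only; it is not the Readium implementation, and both helper names are hypothetical. A ZIP reader generally needs the total size because the archive's central directory sits at the end of the file, which is why an unreadable Content-Length defeats the approach.

```python
import re
from typing import Optional, Tuple

def range_header(start: int, length: int) -> dict:
    """Request `length` bytes starting at offset `start`."""
    if start < 0 or length <= 0:
        raise ValueError("start must be >= 0 and length must be > 0")
    # HTTP byte ranges are inclusive at both ends.
    return {"Range": f"bytes={start}-{start + length - 1}"}

def parse_content_range(value: str) -> Tuple[int, int, Optional[int]]:
    """Parse 'bytes first-last/total'; total is None when the server sends '*'."""
    match = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+|\*)", value.strip())
    if match is None:
        raise ValueError(f"unsupported Content-Range: {value!r}")
    first, last, total = match.groups()
    return int(first), int(last), None if total == "*" else int(total)

print(range_header(0, 1024))
print(parse_content_range("bytes 0-1023/52431"))
```

The size `52431` is a made-up value for the example, not the size of any file mentioned in this thread.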

Note that we currently have the exact same problem with packed/zipped EPUB files hosted at Firebase and Surge, so I will check our current configuration [8] to see if we can apply overrides similar to those we use with the Readium2 NodeJS streamer [9].
curl -I -X GET https://readium.firebaseapp.com/epub_content/internal_link.epub
=> Access-Control-Allow-Origin = "*"
...but missing:
Access-Control-Allow-Methods = "GET, HEAD, OPTIONS" (intentionally excludes POST, DELETE, PUT, PATCH)
Access-Control-Allow-Headers = "Content-Type, Content-Length, Accept-Ranges, Link, Transfer-Encoding"
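
A quick way to spot the gap is to diff the response against the desired header set. This is a rough sketch: the `DESIRED` values are copied from the notes above, and the function name is an assumption for illustration.

```python
from typing import List, Mapping

# Desired CORS headers, as listed in the notes above.
DESIRED = {
    "Access-Control-Allow-Origin": "*",
    "Access-Control-Allow-Methods": "GET, HEAD, OPTIONS",
    "Access-Control-Allow-Headers":
        "Content-Type, Content-Length, Accept-Ranges, Link, Transfer-Encoding",
}

def missing_cors_headers(response_headers: Mapping[str, str]) -> List[str]:
    """Return the desired header names that the response does not carry."""
    present = {name.lower() for name in response_headers}
    return [name for name in DESIRED if name.lower() not in present]

# The Firebase response described above only carried Allow-Origin.
print(missing_cors_headers({"Access-Control-Allow-Origin": "*"}))
```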

[8]
https://github.com/readium/readium-js-viewer/blob/develop/firebase.json

[9]
https://github.com/edrlab/r2-streamer-js/blob/52270ae154cdc4d8d3460d5bd62fa3c7235113b5/src/http/server.ts#L484-L493

@rkwright
Member Author

rkwright commented Aug 8, 2018

from @danielweck

Quick follow-up about HTTP CORS:

With a bit of help from https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS I fixed the Firebase headers configuration:
https://github.com/readium/readium-js-viewer/blob/develop/firebase.json
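
The actual contents of that firebase.json are not quoted in this thread. As a rough illustration only, Firebase Hosting accepts a `headers` list of `source` globs with `key`/`value` pairs, so a fragment along these lines could be generated; the `**/*.epub` glob and the header values are assumptions, not the committed configuration.

```python
import json

def firebase_cors_fragment(source_glob: str = "**/*.epub") -> dict:
    """Build a hosting 'headers' fragment in the shape firebase.json expects."""
    cors = {
        "Access-Control-Allow-Origin": "*",
        "Access-Control-Allow-Methods": "GET, HEAD, OPTIONS",
        "Access-Control-Allow-Headers":
            "Content-Type, Content-Length, Accept-Ranges, Link, Transfer-Encoding",
    }
    return {
        "hosting": {
            "headers": [
                {
                    "source": source_glob,  # assumed glob, adjust to the real layout
                    "headers": [{"key": k, "value": v} for k, v in cors.items()],
                }
            ]
        }
    }

print(json.dumps(firebase_cors_fragment(), indent=2))
```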

CLI test:

curl -I -X GET https://readium.firebaseapp.com/epub_content/internal_link.epub

Readium cloud/web reader test:

https://readium.surge.sh/?epub=https%3A%2F%2Freadium.firebaseapp.com%2Fepub_content%2Finternal_link.epub

@rkwright
Member Author

rkwright commented Aug 8, 2018

from @danielweck

by the way: once HTTPS works with enforcement / auto-redirect (recommended practice nowadays), it might be worth considering setting a canonical URL for the Jekyll website:

https://github.com/readium/readium.github.io/blob/master/_config.yml#L22
(e.g. https://readium.org)

https://github.com/readium/readium.github.io/blob/master/_includes/head.html#L6
(usually <link rel="canonical" href="{{ site.url }}{{ page.url }}" /> but this depends on your config ... right now the generated string is not an absolute URL, so there is a problem somewhere)
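
To illustrate the symptom, the template simply concatenates `site.url` and `page.url`, so an empty `site.url` yields a relative href. The sketch below mimics that concatenation outside Jekyll; the `/about/` page path is a hypothetical example.

```python
from urllib.parse import urlsplit

def canonical_href(site_url: str, page_url: str) -> str:
    """Mimic the Liquid expression {{ site.url }}{{ page.url }}."""
    return f"{site_url}{page_url}"

def is_absolute(href: str) -> bool:
    """An absolute URL needs both a scheme and a host."""
    parts = urlsplit(href)
    return bool(parts.scheme and parts.netloc)

# Without a configured site URL the canonical link is only a path.
print(is_absolute(canonical_href("", "/about/")))                     # False
print(is_absolute(canonical_href("https://readium.org", "/about/")))  # True
```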

@rkwright
Member Author

rkwright commented Aug 8, 2018

from @danielweck

I realize I am digressing a bit in this email thread, but in fairness this kind of erroneous HTTPS configuration does in fact impact Readium.org's ability to host sites, serve files, etc. (especially when handshaking across domains / origins, such as HTTP CORS with the cloud reader).

... anyway, I will just mention these last few debugging things (you may copy/paste for future reference, and/or pass onto web-admin @ Readium Foundation):

HTTPS checks (readium.github.io):
https://mxtoolbox.com/SuperTool.aspx?action=https:readium.github.io&run=toolpage#

HTTPS checks (readium.org, same as above but with a name mismatch):
https://mxtoolbox.com/SuperTool.aspx?action=https:readium.org&run=toolpage#

WHOIS DNS lookup:
https://mxtoolbox.com/SuperTool.aspx?action=whois:readium.org&run=toolpage#

A DNS lookup:
https://mxtoolbox.com/SuperTool.aspx?action=a:readium.org&run=toolpage#

CNAME DNS lookup:
https://mxtoolbox.com/SuperTool.aspx?action=cname:readium.org&run=toolpage#

>>> dig www.readium.org +nostats +nocmd +nocomments

; <<>> DiG 9.8.3-P1 <<>> www.readium.org +nostats +nocmd +nocomments
;; global options: +cmd
;www.readium.org.        IN    A
www.readium.org.    359    IN    A    185.199.108.153

>>> dig readium.org +nostats +nocmd +nocomments

; <<>> DiG 9.8.3-P1 <<>> readium.org +nostats +nocmd +nocomments
;; global options: +cmd
;readium.org.            IN    A
readium.org.        478    IN    A    185.199.109.153
readium.org.        478    IN    A    185.199.110.153
readium.org.        478    IN    A    185.199.111.153
readium.org.        478    IN    A    185.199.108.153

>>> curl -I -X GET http://www.readium.org

HTTP/1.1 301 Moved Permanently
Location: http://readium.org/

>>> curl -I -X GET https://www.readium.org --insecure

HTTP/1.1 301 Moved Permanently
Location: https://readium.org/

>>> curl -I -X GET http://readium.org

HTTP/1.1 200 OK
Server: GitHub.com

>>> curl -I -X GET https://readium.org --insecure

HTTP/1.1 200 OK
Server: GitHub.com

>>> curl -I -X GET http://readium.github.io

HTTP/1.1 301 Moved Permanently
Location: http://readium.org/

>>> curl -I -X GET https://readium.github.io

HTTP/1.1 301 Moved Permanently
Location: http://readium.org/

(note the non-secure HTTP redirect with this last one ... I suspect "HTTPS enforcement" has not been turned on in GitHub?)
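
The downgrade in the last transcript can be detected programmatically. Here is a minimal sketch that works on (requested URL, Location header) pairs copied from the curl output above; the function name is an assumption for illustration.

```python
from urllib.parse import urljoin, urlsplit

def is_https_downgrade(requested: str, location: str) -> bool:
    """True when an HTTPS request is redirected to a plain HTTP target."""
    # Location may be relative, so resolve it against the requested URL first.
    target = urljoin(requested, location)
    return urlsplit(requested).scheme == "https" and urlsplit(target).scheme == "http"

# Redirects observed in the curl transcripts above.
hops = [
    ("http://www.readium.org", "http://readium.org/"),
    ("https://www.readium.org", "https://readium.org/"),
    ("http://readium.github.io", "http://readium.org/"),
    ("https://readium.github.io", "http://readium.org/"),
]
downgrades = [hop for hop in hops if is_https_downgrade(*hop)]
print(downgrades)
```

Only the `https://readium.github.io` hop is flagged, which matches the observation in the note above.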

@rkwright changed the title from "Need to resolve issues with HTTP settings" to "Need to resolve issues with HTTP(S) settings" on Aug 9, 2018
@danielweck
Member

Above excerpts from this email discussion thread (readium-dev Google Group):
https://groups.google.com/forum/#!topic/readium-dev/rAzbKu2Jmtk

@danielweck
Member

@rkwright
Member Author

@danielweck
Given that readium.org now has HTTPS enabled by default, is this issue now moot - other than the fact it is a great piece of documentation on CORS and Readium?

@danielweck
Member

The issue can be closed.
