Hapijs directory route not being scraped by Facebook behind CloudFront #60
I initially filed this at hapijs/hapi#3132 and was directed here.
I wonder if anyone has hit this?
I have this site: https://donate.mozilla.org/en-US/
It is a hapi server. In this case it's serving a static HTML file: https://github.com/mozilla/donate.mozilla.org/blob/master/server.js#L352-L358
It seems to work fine as a file server. However, when it interacts with CloudFront and Facebook's scraper, something breaks. I don't fully understand what's happening, but what I can piece together is:
The hapi server sends the file contents as
CloudFront's documentation then says: "If the viewer makes a Range GET request and the origin returns Transfer-Encoding: chunked, CloudFront returns the entire object to the viewer instead of the requested range." (from http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/RangeGETs.html)
Facebook's scraper then chokes because the response is not the size it expected for the requested range.
You can test that here: https://developers.facebook.com/tools/debug/og/object/
Then click "fetch new scrape information"
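For anyone trying to reproduce this, the failure mode pieced together above (a Range GET answered with the whole object rather than the requested range) can be checked from the response status and headers. This is only a minimal sketch: `classifyRangeResponse` is a hypothetical helper I made up for illustration, not part of hapi, inert, or CloudFront, and it assumes you have already captured the status code and headers of a response to a Range request.

```javascript
// Hypothetical helper (not part of hapi, inert, or CloudFront).
// Given the status code and headers of a response to a Range GET,
// report whether the range was honored and whether the body was chunked.
function classifyRangeResponse(statusCode, headers) {
  // HTTP header names are case-insensitive, so normalize the keys first.
  const normalized = {};
  for (const [name, value] of Object.entries(headers)) {
    normalized[name.toLowerCase()] = String(value);
  }

  const transferEncoding = normalized['transfer-encoding'] || '';
  const chunked = transferEncoding.toLowerCase().includes('chunked');

  // A honored range request is answered with 206 Partial Content
  // and a Content-Range header describing the returned bytes.
  const honored =
    statusCode === 206 && normalized['content-range'] !== undefined;

  return { honored, chunked };
}
```

A response with status 200 and `Transfer-Encoding: chunked` would come back as `{ honored: false, chunked: true }`, which matches the situation the CloudFront documentation describes.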
Facebook has provided me with a curl command that simulates what their scraper does:
It responds with
It also doesn't respond with
If I curl directly to the server without cloudfront:
I get back
Thoughts? Can I just turn off
Thanks for the detailed report. I have investigated further, and it appears that you have encountered a bug in how inert handles range requests for compressed responses.
The response to the request should be a plain