This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

HTTP parse error with amazon.com HPE_INVALID_CONSTANT with test case. #5479

Closed
rustyconover opened this issue May 15, 2013 · 8 comments

Comments

@rustyconover

This test script errors out with a simple request to amazon.com. The number of bytes parsed varies. It seems to be a bug in HTTPParser.

var HTTP = require('http'),
    util = require('util'),
    domain = require('domain');

// Use a domain to catch the parse error raised while the response is read.
var d = domain.create();
d.on('error', function(e) {
    console.log("Got error: " + e.code + "\nBytes parsed: " + e.bytesParsed);
});

d.run(function() {
    var req = HTTP.get('http://www.amazon.com/Dan-Brown/e/B000AP9DSU/ref=s9_pop_gw_al1/186-5993084-4043935?_encoding=UTF8&refinementId=618073011&pf_rd_m=ATVPDKIKX0DER&pf_rd_s=center-2&pf_rd_r=0SHYY5BZXN3KR20BNFAY&pf_rd_t=101&pf_rd_p=1263340922&pf_rd_i=507846', function(resp) {
        // Drain the body; "Finished" prints only if parsing succeeds.
        resp.on('data', function() {});
        resp.on('end', function() {
            console.log("Finished");
        });
    });
});

@mscdex

mscdex commented May 15, 2013

The parse error occurs because the "Connection" header is spelled "Cneonction" in the response.

You can see this with: # curl -i http://www.amazon.com/Dan-Brown/e/B000AP9DSU/ref=s9_pop_gw_al1/186-5993084-4043935?_encoding=UTF8\&refinementId=618073011\&pf_rd_m=ATVPDKIKX0DER\&pf_rd_s=center-2\&pf_rd_r=0SHYY5BZXN3KR20BNFAY\&pf_rd_t=101\&pf_rd_p=1263340922\&pf_rd_i=507846

@redchair123

@mscdex that's a pretty common practice: http://www.nextthing.org/archives/2005/08/07/fun-with-http-headers
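
A misspelled name such as `Cneonction` is still a syntactically valid header field name, which is why it is unlikely to trip a parser on its own. The sketch below checks a name against the RFC 7230 `token` character set; the helper name `isValidHeaderName` is made up for illustration and is not part of Node's API.

```javascript
// Characters allowed in an RFC 7230 "token", which is what a header
// field name must consist of.
var TOKEN = /^[!#$%&'*+\-.^_`|~0-9A-Za-z]+$/;

// Hypothetical helper: true if `name` is a syntactically valid field name.
function isValidHeaderName(name) {
  return typeof name === 'string' && TOKEN.test(name);
}

console.log(isValidHeaderName('Connection'));
console.log(isValidHeaderName('Cneonction'));
console.log(isValidHeaderName('Bad Header'));
```

An unknown but well-formed name is simply a header the client does not recognize.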

@rustyconover
Author

No, that's not the reason. Headers can be anything you want them to be.

Amazon is sensitive to user agents, so it's hard to reproduce in curl. From my tracing with gdb through http_parser_execute, it appears to be a problem with the chunked encoding. I'd suggest Wireshark to see what's being returned over the socket.

@rustyconover
Author

It appears Amazon is returning two zero-length chunks at the end of the response. The parser is setting the error on line 717 of http_parser.c in 0.10.5. ch is set to '0'.
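
To illustrate why a second zero-length chunk is a problem, here is a minimal sketch of chunked decoding. It is not Node's actual parser, and `decodeChunked` is a made-up name. It works on ASCII strings for simplicity. The first zero-length chunk ends the message, so anything after it is left over, and a parser that then expects a new response starting with `HTTP/` would instead see the character `0`.

```javascript
// Decode a chunked body; return the body and any bytes left over after
// the terminating zero-length chunk.
function decodeChunked(raw) {
  var body = '';
  var pos = 0;
  while (true) {
    var lineEnd = raw.indexOf('\r\n', pos);
    if (lineEnd === -1) throw new Error('incomplete chunk-size line');
    // Chunk size is hexadecimal; ignore any chunk extension after ';'.
    var sizeText = raw.slice(pos, lineEnd).split(';')[0].trim();
    if (!/^[0-9a-fA-F]+$/.test(sizeText)) {
      throw new Error('invalid chunk size: ' + sizeText);
    }
    var size = parseInt(sizeText, 16);
    pos = lineEnd + 2;
    if (size === 0) {
      // Last chunk: skip optional trailer lines up to the blank line.
      var end = raw.indexOf('\r\n', pos);
      while (end !== -1 && end !== pos) {
        pos = end + 2;
        end = raw.indexOf('\r\n', pos);
      }
      if (end === -1) throw new Error('incomplete trailer');
      return { body: body, leftover: raw.slice(end + 2) };
    }
    body += raw.slice(pos, pos + size);
    pos += size;
    if (raw.slice(pos, pos + 2) !== '\r\n') {
      throw new Error('missing CRLF after chunk data');
    }
    pos += 2;
  }
}

var ok = decodeChunked('5\r\nhello\r\n0\r\n\r\n');
var doubled = decodeChunked('5\r\nhello\r\n0\r\n\r\n0\r\n\r\n');
console.log(JSON.stringify(ok.leftover));      // ""
console.log(JSON.stringify(doubled.leftover)); // "0\r\n\r\n"
```

A lenient client could discard the leftover bytes, while a strict one reports a parse error.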

@bnoordhuis
Member

What @rustyconover said, the error is on Amazon's side. Closing, not our bug.

@karli2000

Sorry to come back with this issue but:
It is strange that every browser can load this page, yet Node fails on it with any of the known User-Agent strings. I know it is a bug on Amazon's side, but shouldn't Node be able to fetch these pages when given a User-Agent string from a known browser? Who knows what other servers have the same problem?

Thank you,
Max

@bitliner

I support the point from @karli2000.
On one hand it is Amazon's fault, but at the same time Node.js should handle such situations, as other technologies do.
Furthermore, the error is so low-level that a developer cannot handle it. The only reasonable choice seems to be switching languages...
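
One option that may help is to listen for `'error'` on the request object and check the error's `code`, as the original test script does through a domain. The sketch below is an assumption-laden illustration: `isHttpParseError` and `fetchBody` are made-up helpers, and whether the parse error reaches this handler in every Node version is not confirmed by this thread. It only detects the failure; it cannot recover the body.

```javascript
var http = require('http');

// Hypothetical helper: parse errors from Node's HTTP parser carry a
// `code` that starts with "HPE_" (e.g. HPE_INVALID_CONSTANT).
function isHttpParseError(err) {
  return Boolean(err) && typeof err.code === 'string' &&
    err.code.indexOf('HPE_') === 0;
}

// Hypothetical helper: fetch a URL and report the outcome exactly once.
function fetchBody(url, callback) {
  var done = false;
  function finish(err, body) {
    if (!done) {
      done = true;
      callback(err, body);
    }
  }
  var req = http.get(url, function (resp) {
    var chunks = [];
    resp.on('data', function (chunk) { chunks.push(chunk); });
    resp.on('end', function () { finish(null, Buffer.concat(chunks)); });
    resp.on('error', finish);
  });
  req.on('error', function (err) {
    if (isHttpParseError(err)) {
      // The server sent something the parser rejected.
      err.message = 'Malformed HTTP response (' + err.code + '): ' + err.message;
    }
    finish(err);
  });
  return req;
}
```

A caller could then decide to retry, or fall back to a more lenient client, when `isHttpParseError(err)` is true.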

@rustyconover
Author

There was a fix produced:

#5493

It was marked as not needed; feel free to fork node and apply it as necessary for your work.

Rusty
