I was doing some low-level scraping on a few of my devices (cable modem, X10 system, ADT alarm system) and noticed that all of these embedded devices return no headers in their HTTP responses. They return a status code, but no headers. With no headers being sent, the Node HTTP parser fails (silently in 0.4.11 and with a parse error in 0.5.x).
Here is a repro case:
I couldn't find an entry in the spec that "required" headers to be sent from the server, but I feel that the parser shouldn't choke if no headers are returned, since curl/wget/links all work as expected.
Responding from a server with
HTTP/1.1 200 OK\r\n\r\n<html><head><title></title></head><body></body></html>
is invalid HTTP.
When the parser reaches the end of the CRLFCRLF, it has two ways of interpreting this message: either the message has no content and the next character begins the next message, or the user agent should treat the rest of the connection as the body, terminated by EOF.
HTTP/1.1 servers default to Connection: keep-alive, and therefore the user agent considers this to be a zero-length message, soon to be followed by another HTTP response.
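The keep-alive default described above can be sketched as follows. This is an illustrative sketch only, not http-parser's actual code: `isPersistent` is a hypothetical helper, `headers` is assumed to be a plain object with lowercased field names, and the Connection value is treated as a single token rather than a token list.

```javascript
// Hypothetical helper: decide whether a connection stays open after this
// message, based on the HTTP version and an optional Connection header.
function isPersistent(httpVersion, headers) {
  const connection = (headers['connection'] || '').toLowerCase();
  if (httpVersion === '1.1') {
    // HTTP/1.1: persistent unless the peer says "close".
    return connection !== 'close';
  }
  // HTTP/1.0 and older: closed unless the peer opts in to keep-alive.
  return connection === 'keep-alive';
}
```

With no headers at all, `isPersistent('1.1', {})` is true and `isPersistent('1.0', {})` is false, which is why a parser reasoning this way expects another message after a header-less HTTP/1.1 response but reads a header-less HTTP/1.0 response until EOF.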
The relevant branch in http-parser is https://github.com/ry/http-parser/blob/c0ecab0516147401b5fd02a2272ebfb5dce8deb4/http_parser.c#L1564-1571 and here https://gist.github.com/8d854170ac16c39fdc74 is a test I was using to determine this.
@mnot am I correct or should this be interpreted as a zero-length message?
@ry I thought that too, but reading this:
"Both types of message consist of a start-line, zero or more header fields (also known as "headers"), an empty line (i.e., a line with nothing preceding the CRLF) indicating the end of the header fields, and possibly a message-body"
That made me think that headers are optional since it's worded as "zero or more".
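To illustrate the "zero or more" wording, here is a rough sketch that splits a raw response at the empty line. `splitResponse` is a hypothetical helper, not part of Node's http module, and it assumes the whole message is already available as one string.

```javascript
// Hypothetical helper: split a raw HTTP response into its start-line,
// its header fields, and whatever bytes follow the empty line.
function splitResponse(raw) {
  const headEnd = raw.indexOf('\r\n\r\n');
  if (headEnd === -1) {
    throw new Error('incomplete message: no empty line found');
  }
  const headLines = raw.slice(0, headEnd).split('\r\n');
  const startLine = headLines[0];
  // Every line between the start-line and the empty line is a header field.
  const headers = headLines.slice(1).map((line) => {
    const colon = line.indexOf(':');
    if (colon === -1) {
      throw new Error('malformed header field: ' + line);
    }
    return [line.slice(0, colon).trim().toLowerCase(), line.slice(colon + 1).trim()];
  });
  // Skip the four bytes of CRLFCRLF; the remainder may be a body.
  const rest = raw.slice(headEnd + 4);
  return { startLine, headers, rest };
}
```

For the response in question, the header list is simply empty and the HTML ends up in `rest`. Splitting the message is the easy part; whether `rest` is this message's body or the start of the next message is a separate decision.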
@davglass headers are indeed optional, but you will need either a Transfer-Encoding, a Content-Length, or a Connection: close in order to have a response with a body using HTTP/1.1. The response
HTTP/1.1 200 OK\r\n\r\n
The relevant rule is the catch-all;
I.e., it's all body.
See also: http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.4 point 5.
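A minimal sketch of the message-length rules from RFC 2616 section 4.4, in the order the section lists them. `bodyFraming` and its return values are hypothetical names chosen for illustration, and `headers` is assumed to be a plain object with lowercased field names.

```javascript
// Hypothetical helper: decide how a response body is delimited,
// checking the RFC 2616 section 4.4 rules in order.
function bodyFraming(method, statusCode, headers) {
  // Rule 1: responses to HEAD, and 1xx, 204, 304 responses, never have a body.
  if (method === 'HEAD' || (statusCode >= 100 && statusCode < 200) ||
      statusCode === 204 || statusCode === 304) {
    return 'none';
  }
  // Rule 2: a Transfer-Encoding other than "identity" means chunked framing.
  const te = headers['transfer-encoding'];
  if (te !== undefined && te.trim().toLowerCase() !== 'identity') {
    return 'chunked';
  }
  // Rule 3: Content-Length gives the body length in bytes.
  if (headers['content-length'] !== undefined) {
    return 'content-length';
  }
  // Rule 4: multipart/byteranges is self-delimiting.
  const ct = headers['content-type'];
  if (ct !== undefined && ct.trim().toLowerCase().startsWith('multipart/byteranges')) {
    return 'multipart-byteranges';
  }
  // Rule 5 (the catch-all): the body ends when the server closes the connection.
  return 'until-close';
}
```

Under this reading, `bodyFraming('GET', 200, {})` returns `'until-close'`: a header-less 200 response falls through to the last rule and everything up to EOF is the body.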
That's what curl actually does: https://github.com/bagder/curl/blob/master/lib/http.c#L2831
camilo@imacsita:~> curl -vvvv http://localhost:8500
I've got the same problem. Receiving HTTP/0.9 200 OK without headers. curl output:
HTTP/0.9 200 OK
curl -vvvv http://domain.tld:5222
* About to connect() to domain.tld port 5222 (#0)
* Trying x.x.x.x... connected
* Connected to domain.tld (x.x.x.x) port 5222 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.21.4 (universal-apple-darwin11.0) libcurl/7.21.4 OpenSSL/0.9.8r zlib/1.2.5
> Host: domain.tld:5222
> Accept: */*
* Connection #0 to host domain.tld left intact
* Closing connection #0
<?xml version='1.0'?><stream:stream xmlns='jabber:client' xmlns:stream='http://etherx.jabber.org/streams' id='0123456789' from='domain.tld' version='1.0'><stream:error><xml-not-well-formed xmlns='urn:ietf:params:xml:ns:xmpp-streams'/></stream:error></stream:stream>
Error: Parse Error
at Socket.ondata (http.js:1231:22)
at Socket._onReadable (net.js:677:27)
at IOWatcher.onReadable [as callback] (net.js:177:10)
The response "HTTP/1.0 200 Connection established\r\n\r\n" could not be parsed either...
Another issue like this:
Is this still happening with Node.js 0.6.1? I haven't tried it out.
Confirmed. An HTTP/1.0 response body without headers works; an HTTP/0.9 response does not.
test: add 'response body with no headers' http test
HTTP/0.9 - fails with a parse error
HTTP/1.0 - works
HTTP/1.1 - fails with an empty response body
(Failing) test added in 4f38c5e.
Accept HTTP/0.9 responses
Get test-http-response-no-headers.js to pass
Main fix was in 3abebf which added HTTP/0.9 support to http parser.
Changed test because HTTP 1.1 mandates keep-alive when no headers are
Kindly guide me: how can I parse a header response using C++?