net/http: Clarify Content-Type behavior of ResponseWriter.Write #41259
Labels
NeedsInvestigation
Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone
Preface
The current documentation for
http.ResponseWriter.Write
states:What I think this means is that up to 512 bytes of the first call to write will be used to perform Content-Type detection, there is no buffering involved. I think the comment might be better worded to highlight this part as "the initial 512 bytes of written data" could be misread to mean "across multiple calls to write".
Issue with the documentation
I, for one, misread it when I tried to implement this interface and I added a buffer, but @rsc suggested this could be done without buffering, which I think makes sense by reading the docs or skimming over some of the implementations. It was also not immediately clear to @katiehockman and @FiloSottile, who were involved in discussing a fix for #40928, so I am CCing them here to voice their opinion.
Issue with the API
Arguably this is not even a very user-friendly interface as short writes can cause Content-Type to be wrongly set even when the same response is being generated, and this may only happen in very peculiar situations, making it hard to debug (e.g. some short buffers passed to
io.CopyBuffer
or similar things).Issues with the implementation
Current state (summarized in a table below)
We currently have several implementations of this, and it looks like they all behave in slightly different ways (the following list refers to Go 1.15.1):
The cgi implementation only uses the first byte slice passed to
Write
to sniff, and only if Content-Type is not set, but does not care about Transfer-Encoding. (Before 1.15.1 it used to just set "text/html" without sniffing, which caused a vulnerability).fcgi is like cgi, but in addition it removes whatever Content-Type and Transfer-Encoding was set if the status is "204 No Content", even if there is a body (which is invalid HTTP, but we don't check for it)
httptest.ResponseRecorder
will only sniff if both Content-Type and Transfer-Encoding are missing and will use the first slice or string written to sniff Content-Type.The standard, main implementation is a bit more complex as it defaults to writing chunked responses, but the summary is that
a) A non-nil and non-zero-length write will cause it to call the underlying writer implementation, which is a
*bufio.Writer
that wraps achunkWriter
(note that this introduces some buffering that depends onbufferBeforeChunkingSize
, which currently is 2048)b) This will in turn (after some buffering that depends on
bufio
internal implementation) callchunkWriter.Write
, which will do quite a lot, but the CT-relevant part is that if there is no Content-Encoding, no Content-Type and no Transfer-Encoding and something was eventually written to the response body then Content-Type will be sniffed with some data that is the minimum value between these:Note: the content-type that is sniffed will not be visible by accessing
w.Header()
, unlike all the other implementations.To summarize:
Consequences
Inconsistent behavior between tests and cgi
http.Handler
implementation that sets "Transfer-Encoding" and then writes<!DOCTYPE HTML><html><body>Hello world</body></html>
to the given http.ResponseWriter will:Inconsistent behavior between cgi and main
An
http.Handler
implementation that sets no "Content-Type" and then writes<!DOCTYPE HTML><html><body>Hello world</body></html>
to the given http.ResponseWriter but in chunks shorter than 14 will:Inconsistent behavior between cgi, tests and main
An
http.Handler
implementation that sets "Content-Encoding" and then writes<!DOCTYPE HTML><html><body>Hello world</body></html>
to the given http.ResponseWriter but in chunks shorter than 14 will:If chunks are longer tests and CGI will have a content-type of "text/html".
Security implications
Server-Side Content-Type sniffing is generally a bad idea: any server that accepts any user data might fall victim to XSS unless proper sanitization and proper checks are performed during the upload phase, and those checks would be much easier to write if there was a consistent behavior wrt C-T sniffing.
Scenarios
A server allow users to upload a profile picture, and checks with http.DetectContentType if the uploaded file is an image before accepting it, then serves the file with http.ServeFile. If the image was called *.html it will render as html.
A server allow users to upload a profile picture, and checks that the uploaded file mime.TypeByExtension returns "image/*", then serves the file with io.Copy. If the image content was html, it will render as html.
A server allows users to upload videos and forces uploaded files to be (among other types) valid mp4. The implementation relies on C-T sniffing when serving the file. Valid mp4 files can start with
<a
, which we currently sniff as html when serving.To be extra-sure, a server implementation has tests everywhere that all responses are served with "text/plain" content-type. The actual server might still sniff "text/html" just because it buffers more data before making a decision.
Conclusion
This has already caused some issues in the past (#31753 and #40928 come to mind) but there is probably more there to justify this behavior (it feels like most differences are due to incremental patches that only addressed the issues on the affected implementations that were reported).
I would kindly propose to sync the behaviors to all be like the main one (which seems the more coherent) and clarify in the documentation that this is the intended one.
We could even have an "internal" folder in http that will implement C-T sniffing once for all implementations, so we can be sure that they will not go out of sync again.
The main issue I can see is that this might break existing CGI and FCGI applications, and might make implementations of http.ResponseWriter that we don't know about not correct (the interface is fully exported).
/cc @bradfitz @FiloSottile @katiehockman @rsc @kele
Edit
@rsc also pointed out that (in addition to what is listed above) the sniffing will always set a charset of
utf-8
for all html and xml responses, even if served from disk, even if the actual encoding differs. (Note that not setting charset on modern browsers would also default to utf-8 if XCTO nosniff is specified).The text was updated successfully, but these errors were encountered: