net/http: wrong detected MIME type with UTF-8 BOM #20912

Closed
unixpickle opened this issue Jul 5, 2017 · 1 comment

@unixpickle

commented Jul 5, 2017

What version of Go are you using (go version)?

go version go1.8.3 darwin/amd64

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="darwin"
GOOS="darwin"
GOPATH="/Users/alex/Documents/Code/Go"
GORACE=""
GOROOT="/usr/local/Cellar/go/1.8.3/libexec"
GOTOOLDIR="/usr/local/Cellar/go/1.8.3/libexec/pkg/tool/darwin_amd64"
GCCGO="gccgo"
CC="clang"
GOGCCFLAGS="-fPIC -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/23/qy5hclf52mdfnx7xgn1ddk6r0000gn/T/go-build891556908=/tmp/go-build -gno-record-gcc-switches -fno-common"
CXX="clang++"
CGO_ENABLED="1"
PKG_CONFIG="pkg-config"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"

What did you do?

Run this code (also available on the playground):

package main

import (
	"fmt"
	"net/http"
)

func main() {
	document := "<!DOCTYPE html><html></html>"
	bom := "\xef\xbb\xbf"
	fmt.Println(http.DetectContentType([]byte(document)))
	fmt.Println(http.DetectContentType([]byte(bom+document)))
}

What did you expect to see?

text/html; charset=utf-8
text/html; charset=utf-8

What did you see instead?

text/html; charset=utf-8
text/plain; charset=utf-8

It seems from sniff.go that a BOM automatically triggers a text/plain MIME type. Ideally, htmlSig would detect UTF-8 BOMs and skip past them.
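
Until then, a caller-side workaround is to strip the BOM before sniffing. This is only a minimal sketch: `detectSkippingBOM` and `utf8BOM` are hypothetical names of my own, not part of net/http, and it assumes that dropping a leading UTF-8 BOM is acceptable for the data being sniffed.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

// utf8BOM is the three-byte UTF-8 byte order mark.
var utf8BOM = []byte{0xEF, 0xBB, 0xBF}

// detectSkippingBOM is a hypothetical helper (not part of net/http).
// It removes one leading UTF-8 BOM, if present, and then defers to
// http.DetectContentType for the actual sniffing.
func detectSkippingBOM(data []byte) string {
	return http.DetectContentType(bytes.TrimPrefix(data, utf8BOM))
}

func main() {
	document := "<!DOCTYPE html><html></html>"
	bom := "\xef\xbb\xbf"
	fmt.Println(detectSkippingBOM([]byte(document)))       // text/html; charset=utf-8
	fmt.Println(detectSkippingBOM([]byte(bom + document))) // text/html; charset=utf-8
}
```

This leaves http.DetectContentType itself untouched, so input without a BOM is sniffed exactly as before.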

@bradfitz

Member

commented Jul 5, 2017

  1. UTF-8 BOMs are unnecessary and often cause pain for little to no benefit. You should avoid them.

  2. http.DetectContentType implements https://mimesniff.spec.whatwg.org/ which does not seem to suggest that any textual content type can have a UTF-8 BOM in front of it.

So it looks like this is working as intended.

Let me know if I misread the mimesniff spec, though.

@bradfitz bradfitz closed this Jul 5, 2017

@golang golang locked and limited conversation to collaborators Jul 5, 2018
