Description
My main concern is that parsePostForm calls ioutil.ReadAll every time it reads the io.Reader, which then triggers bytes.makeSlice and allocates new space each and every time the function is called. This is very similar to my open PR to @bradfitz's memcache library (bradfitz/gomemcache#45).
Instead, I think using a sync.Pool of *bytes.Buffer would be a better solution.
I've implemented the solution and added a benchmark to show you the performance improvements.
First off, here's the benchmark:
func BenchmarkParsePostForm(b *testing.B) {
	b.ReportAllocs()

	// Create bodies
	bodies := [][]byte{
		make([]byte, 100),
		make([]byte, 1000),
		make([]byte, 10000),
		make([]byte, 100000),
		make([]byte, 1000000),
		make([]byte, 10000000),
	}

	// Run
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			// Create request
			r, e := NewRequest("POST", "/test", bytes.NewReader(bodies[rand.Intn(len(bodies))]))
			if e != nil {
				b.Error(e.Error())
			}

			// Add content type
			r.Header.Add("Content-Type", "application/x-www-form-urlencoded")

			// Parse post form
			e = r.ParseForm()
			if e != nil {
				b.Error(e.Error())
			}
		}
	})
}
Now here's the first run of the benchmark before the improvement:
$ go test ./src/net/http -v -run=^$ -bench=BenchmarkParsePostForm -benchtime=10s -parallel=4 -memprofile=prof.mem
BenchmarkParsePostForm-4 2000 11205322 ns/op 7821503 B/op 25 allocs/op
PASS
ok net/http 32.253s
$ go tool pprof --alloc_space http.test prof.mem
Entering interactive mode (type "help" for commands)
(pprof) top
23140.16MB of 23187.37MB total (99.80%)
Dropped 11 nodes (cum <= 115.94MB)
flat flat% sum% cum cum%
17720.08MB 76.42% 76.42% 17720.08MB 76.42% bytes.makeSlice
5418.08MB 23.37% 99.79% 23141.66MB 99.80% net/http.parsePostForm
2MB 0.0086% 99.80% 17722.08MB 76.43% io/ioutil.readAll
0 0% 99.80% 17720.08MB 76.42% bytes.(*Buffer).ReadFrom
0 0% 99.80% 17722.08MB 76.43% io/ioutil.ReadAll
0 0% 99.80% 23141.66MB 99.80% net/http.(*Request).ParseForm
0 0% 99.80% 23144.16MB 99.81% net/http_test.BenchmarkParsePostForm.func1
0 0% 99.80% 23187.37MB 100% runtime.goexit
0 0% 99.80% 23144.16MB 99.81% testing.(*B).RunParallel.func1
(pprof) list parsePostForm
Total: 22.64GB
ROUTINE ======================== net/http.parsePostForm in /home/asticode/projects/go/go/src/net/http/request.go
5.29GB 22.60GB (flat, cum) 99.80% of Total
. . 884: // RFC 2616, section 7.2.1 - empty type
. . 885: // SHOULD be treated as application/octet-stream
. . 886: if ct == "" {
. . 887: ct = "application/octet-stream"
. . 888: }
. 512.02kB 889: ct, _, err = mime.ParseMediaType(ct)
. . 890: switch {
. . 891: case ct == "application/x-www-form-urlencoded":
. . 892: var reader io.Reader = r.Body
. . 893: maxFormSize := int64(1<<63 - 1)
. . 894: if _, ok := r.Body.(*maxBytesReader); !ok {
. . 895: maxFormSize = int64(10 << 20) // 10 MB is a lot of text.
. . 896: reader = io.LimitReader(r.Body, maxFormSize+1)
. . 897: }
. 17.31GB 898: b, e := ioutil.ReadAll(reader)
. . 899: if e != nil {
. . 900: if err == nil {
. . 901: err = e
. . 902: }
. . 903: break
. . 904: }
. . 905: if int64(len(b)) > maxFormSize {
. . 906: err = errors.New("http: POST too large")
. . 907: return
. . 908: }
5.29GB 5.29GB 909: vs, e = url.ParseQuery(string(b))
. . 910: if err == nil {
. . 911: err = e
. . 912: }
. . 913: case ct == "multipart/form-data":
. . 914: // handled by ParseMultipartForm (which is calling us, or should be)
You can see that ioutil.ReadAll(reader) at line 898 accounts for 17.31GB of allocations.
Now here's the result of the benchmark after the improvement I suggest:
$ go test ./src/net/http -v -run=^$ -bench=BenchmarkParsePostForm -benchtime=10s -parallel=4 -memprofile=prof.mem
BenchmarkParsePostForm-4 2000 7830166 ns/op 1874354 B/op 18 allocs/op
PASS
ok net/http 17.011s
$ go tool pprof --alloc_space http.test prof.mem
Entering interactive mode (type "help" for commands)
(pprof) top
4030.01MB of 4033.51MB total (99.91%)
Dropped 7 nodes (cum <= 20.17MB)
flat flat% sum% cum cum%
3638.12MB 90.20% 90.20% 3998.51MB 99.13% net/http.parsePostForm
359.40MB 8.91% 99.11% 359.40MB 8.91% bytes.makeSlice
31.99MB 0.79% 99.90% 31.99MB 0.79% net/http_test.BenchmarkParsePostForm
0.50MB 0.012% 99.91% 4001.52MB 99.21% net/http_test.BenchmarkParsePostForm.func1
0 0% 99.91% 359.40MB 8.91% bytes.(*Buffer).ReadFrom
0 0% 99.91% 3999.01MB 99.14% net/http.(*Request).ParseForm
0 0% 99.91% 4033.51MB 100% runtime.goexit
0 0% 99.91% 4001.52MB 99.21% testing.(*B).RunParallel.func1
0 0% 99.91% 31.99MB 0.79% testing.(*B).launch
0 0% 99.91% 31.99MB 0.79% testing.(*B).runN
(pprof) list parsePostForm
Total: 3.94GB
ROUTINE ======================== net/http.parsePostForm in /home/asticode/projects/go/go/src/net/http/request.go
3.55GB 3.90GB (flat, cum) 99.13% of Total
. . 895: // RFC 2616, section 7.2.1 - empty type
. . 896: // SHOULD be treated as application/octet-stream
. . 897: if ct == "" {
. . 898: ct = "application/octet-stream"
. . 899: }
. 512.05kB 900: ct, _, err = mime.ParseMediaType(ct)
. . 901: switch {
. . 902: case ct == "application/x-www-form-urlencoded":
. . 903: var reader io.Reader = r.Body
. . 904: maxFormSize := int64(1<<63 - 1)
. . 905: if _, ok := r.Body.(*maxBytesReader); !ok {
. . 906: maxFormSize = int64(10 << 20) // 10 MB is a lot of text.
. . 907: reader = io.LimitReader(r.Body, maxFormSize+1)
. . 908: }
. . 909: buf := bufferPool.Get().(*bytes.Buffer)
. . 910: defer freeBuffer(buf)
. 359.40MB 911: _, e := buf.ReadFrom(reader)
. . 912: if e != nil {
. . 913: if err == nil {
. . 914: err = e
. . 915: }
. . 916: break
. . 917: }
. . 918: if int64(buf.Len()) > maxFormSize {
. . 919: err = errors.New("http: POST too large")
. . 920: return
. . 921: }
3.55GB 3.55GB 922: vs, e = url.ParseQuery(buf.String())
. . 923: if err == nil {
. . 924: err = e
. . 925: }
. . 926: case ct == "multipart/form-data":
. . 927: // handled by ParseMultipartForm (which is calling us, or should be)
The same read, now done by buf.ReadFrom(reader) at line 911, allocates only 359.40MB for the same number of requests.
If you need any extra information please let me know.
If you think this performance proposal is sound, I'll mail the change for review using codereview.
Cheers
Quentin