
net/http: ServeFile 2X slower in Go 1.21 #61530

Closed
r-hang opened this issue Jul 22, 2023 · 30 comments

Labels: NeedsFix (The path to resolution is known, but the work has not been done.), Performance, release-blocker
Milestone: Go1.21

r-hang commented Jul 22, 2023

What version of Go are you using (go version)?

$ go version
go version go1.21rc3 linux/amd64

Does this issue reproduce with the latest release?

Yes. Slower in go1.21.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GOARCH='amd64'
GOBIN='/home/user/gocode/bin'
GOCACHE='/home/user/.cache/go-build'
GOENV='/home/user/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/home/user/gocode/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/home/user/gocode'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/home/user/gocode/src/github.com/golang/go'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/home/user/gocode/src/github.com/golang/go/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.21rc3'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD=''
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build3856449379=/tmp/go-build -gno-record-gcc-switches'

What did you do?

While testing Uber’s go monorepo against go1.21rc3, I noticed a large performance regression in logic that depends on http.ServeFile. git bisect points to this commit (#56480) as the cause.

While we would expect performance benefits from the use of io.Copy, thanks to its ability to upgrade to io.WriterTo or io.ReaderFrom, in the common http.ServeFile case the ReadFrom path that is now taken instead of io.CopyN is noticeably slower for serving relatively small files. Our profiling shows larger GC costs as a result of the new copying implementation.
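
For illustration, a minimal sketch of the two copy paths being compared (this is not the net/http source; the helper names and the file path are made up): io.CopyN wraps the source in an io.LimitReader that carries an exact byte count, while plain io.Copy lets the writer's ReadFrom take over the transfer when the writer implements io.ReaderFrom (and the source has no WriteTo), so no size hint travels with the copy.

```
package main

import (
	"io"
	"os"
)

// copyLikeGo120 mirrors the old ServeContent path: io.CopyN wraps the file in
// an io.LimitReader, so a size-aware writer sees exactly how much to send.
func copyLikeGo120(dst io.Writer, f *os.File, size int64) (int64, error) {
	return io.CopyN(dst, f, size)
}

// copyLikeGo121rc mirrors the Go 1.21rc path: plain io.Copy lets the writer's
// ReadFrom take over the transfer with no byte count attached.
func copyLikeGo121rc(dst io.Writer, f *os.File) (int64, error) {
	return io.Copy(dst, f)
}

func main() {
	f, err := os.Open("/tmp/example") // the JSON file attached to this issue
	if err != nil {
		panic(err)
	}
	defer f.Close()

	fi, err := f.Stat()
	if err != nil {
		panic(err)
	}
	copyLikeGo120(io.Discard, f, fi.Size()) // bounded copy
	f.Seek(0, io.SeekStart)
	copyLikeGo121rc(io.Discard, f) // unbounded copy
}
```

On a real response writer the unbounded path is what ends up reaching TCPConn.ReadFrom and sendfile, as the analysis later in this thread shows.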

Because the user doesn't really have any means of tinkering with the underlying writer or reader used by http.ServeFile, they can't reasonably work around this performance issue.

A reproducible benchmark is attached below; we see roughly a 2x performance regression on http.ServeFile.

package main

import (
	"bufio"
	"io"
	"log"
	"net/http"
	"net/http/httptest"
	"testing"
)

func BenchmarkFileServe(b *testing.B) {
	b.StopTimer()
	ts := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		b.StartTimer()
		http.ServeFile(w, r, "/tmp/example") // the contents of example are attached in the issue (336 lines json)
		b.StopTimer()
	}))
	defer ts.Close()

	for i := 0; i < b.N; i++ {
		res, err := http.Get(ts.URL)
		if err != nil {
			log.Fatal(err)
		}
		// note: we see a noticeable performance regression with both allRead
		// and scanRead.
		err = allRead(res.Body)
		// err = scanRead(res.Body)
		res.Body.Close()
		if err != nil {
			log.Fatal(err)
		}
	}
}

func scanRead(r io.Reader) error {
	scanner := bufio.NewScanner(r)
	for scanner.Scan() { /* scan */ }
	return scanner.Err()
}


func allRead(r io.Reader) error {
	_, err := io.ReadAll(r)
	return err
}

Contents of /tmp/example used in the benchmark (336-line example JSON):
[
  {
    "latlng": {
      "lat": 0,
      "lng": 0
    },
    "region": {
      "name": "gopherland",
      "id": 1
    },
    "golang": {
      "namespace": "gopherland",
      "ids": [
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers"
      ]
    }
  },
  {
    "latlng": {
      "lat": 0,
      "lng": 0
    },
    "region": {
      "name": "gopherland",
      "id": 1
    },
    "golang": {
      "namespace": "gopherland",
      "ids": [
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers",
        "i-like-gophers"
      ]
    }
  }
]

in go1.20.5

$ go test -bench=.
goos: linux
goarch: amd64
pkg: exp/fileserver
cpu: AMD EPYC 7B13
BenchmarkFileServe-96              17074             69007 ns/op
PASS
ok      exp/fileserver       10.056s

in go1.21rc2 (go1.21rc3 test times out for me after 660s so I can't get a clean result).

$ go test -bench=.
goos: linux
goarch: amd64
pkg: exp/fileserver
cpu: AMD EPYC 7B13
BenchmarkFileServe-96              10000            126070 ns/op
PASS
ok      exp/fileserver       416.290s

benchstat

goos: linux
goarch: amd64
pkg: exp/fileserver
cpu: AMD EPYC 7B13
             │   /tmp/rs0   │             /tmp/rs1             │
             │    sec/op    │    sec/op      vs base           │
FileServe-96   69.01µ ± ∞ ¹   126.07µ ± ∞ ¹  ~ (p=1.000 n=1) ²
mauri870 (Member) commented Jul 23, 2023

I can reproduce it; it is extremely slow for me after the change from io.CopyN to io.Copy.

Edit:

Reproducible benchmark that runs to completion:

package main

import (
	"net/http"
	"net/http/httptest"
	"testing"
)

func BenchmarkFileServe(b *testing.B) {
	handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		http.ServeFile(w, r, "./example.json") // the contents of example are attached in the issue (336 lines json)
	})

	r, _ := http.NewRequest("GET", "/file", nil)
	w := httptest.NewRecorder()

	b.ReportAllocs()
	b.ResetTimer()

	for i := 0; i < b.N; i++ {
		handler.ServeHTTP(w, r)
	}
}

qiulaidongfeng (Contributor) commented Jul 23, 2023

I ran the above two benchmarks using go1.20.5 and go1.21rc3. Taking the benchmark from @mauri870 as an example:

            │     120     │                121                 │
            │   sec/op    │   sec/op     vs base               │
FileServe-4   115.2µ ± 3%   122.5µ ± 3%  +6.31% (p=0.001 n=10)

            │     120      │                 121                  │
            │     B/op     │     B/op      vs base                │
FileServe-4   37.54Ki ± 4%   60.68Ki ± 3%  +61.64% (p=0.000 n=10)

            │    120     │                121                │
            │ allocs/op  │ allocs/op   vs base               │
FileServe-4   17.00 ± 0%   16.00 ± 0%  -5.88% (p=0.000 n=10)

mauri870 (Member) commented Jul 23, 2023

It seems like this io.Copy path is allocating much more memory in this benchmark compared to 1.20:

[profiling screenshots attached in the original comment]

seankhliao added the Performance and NeedsInvestigation (Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.) labels Jul 23, 2023
mauri870 added a commit to mauri870/go that referenced this issue Jul 23, 2023
The change to io.Copy in ServeFile caused an exponential increase in
memory allocations if the reader is an os.File.

```
             │   sec/op    │   sec/op     vs base                │
ServeFile-16   9.165µ ± 7%   3.759µ ± 3%  -58.99% (p=0.000 n=10)

             │     B/op     │    B/op     vs base                │
ServeFile-16   33358.0 ± 0%   620.0 ± 0%  -98.14% (p=0.000 n=10)

             │ allocs/op  │ allocs/op     vs base               │
ServeFile-16   14.00 ± 0%   15.00 ± 0%    +7.14% (p=0.000 n=10)
```

Fixes: golang#61530
mauri870 (Member) commented Jul 23, 2023

@qiulaidongfeng @r-hang a fix is on its way; feel free to test it to see if it fixes the issue for you 🙏

https://go-review.googlesource.com/c/go/+/512235

mauri870 (Member) commented Jul 23, 2023

@seankhliao I think we should raise the priority on this one and see if it can be included in the 1.21 release before it gets out of rc ✌️

seankhliao (Member) commented:

cc @neild

seankhliao added this to the Go1.21 milestone Jul 23, 2023
qiulaidongfeng (Contributor) commented:

I tested CL 512235 using the benchmark from #61530 (comment).

Before CL

BenchmarkFileServe-4 8014 126864 ns/op 54669 B/op 16 allocs/op

After CL

BenchmarkFileServe-4 9562 109095 ns/op 28008 B/op 17 allocs/op

AlexanderYastrebov (Contributor) commented:

Related #58452

AlexanderYastrebov (Contributor) commented:

The problem is not io.Copy but, surprisingly, sendFile.

io.CopyN uses io.Copy with a size hint (io.LimitReader):

go/src/io/io.go, line 365 (at 2eca0b1):

written, err = Copy(dst, LimitReader(src, n))

io.Copy uses the internal copyBuffer:

go/src/io/io.go, lines 388-390 (at 2eca0b1):

func Copy(dst Writer, src Reader) (written int64, err error) {
	return copyBuffer(dst, src, nil)
}

which checks whether the response writer (dst) implements io.ReaderFrom:

go/src/io/io.go, lines 414-417 (at 2eca0b1):

// Similarly, if the writer has a ReadFrom method, use it to do the copy.
if rt, ok := dst.(ReaderFrom); ok {
	return rt.ReadFrom(src)
}

which it does:

go/src/net/http/server.go, lines 575-578 (at 2eca0b1):

// ReadFrom is here to optimize copying from an *os.File regular file
// to a *net.TCPConn with sendfile, or from a supported src type such
// as a *net.TCPConn on Linux with splice.
func (w *response) ReadFrom(src io.Reader) (n int64, err error) {

It sniffs the first 512 bytes (note this for later):

go/src/net/http/server.go, lines 591-601 (at 2eca0b1):

// Copy the first sniffLen bytes before switching to ReadFrom.
// This ensures we don't start writing the response before the
// source is available (see golang.org/issue/5660) and provides
// enough bytes to perform Content-Type sniffing when required.
if !w.cw.wroteHeader {
	n0, err := io.CopyBuffer(writerOnly{w}, io.LimitReader(src, sniffLen), buf)
	n += n0
	if err != nil || n0 < sniffLen {
		return n, err
	}
}

and then (if Content-Length is set, which disables chunked encoding) uses the underlying *TCPConn.ReadFrom:

go/src/net/http/server.go, lines 606-612 (at 2eca0b1):

// Now that cw has been flushed, its chunking field is guaranteed initialized.
if !w.cw.chunking && w.bodyAllowed() {
	n0, err := rf.ReadFrom(src)
	n += n0
	w.written += n0
	return n, err
}

TCPConn.ReadFrom then (in my case) uses sendFile:

func (c *TCPConn) readFrom(r io.Reader) (int64, error) {
	if n, err, handled := splice(c.fd, r); handled {
		return n, err
	}
	if n, err, handled := sendFile(c.fd, r); handled {
		return n, err
	}
	return genericReadFrom(c, r)
}

sendFile uses the size hint (io.LimitReader) passed down by io.CopyN to adjust remain from a huge default to the actual remaining size (file size minus the 512 sniffed bytes):

func sendFile(c *netFD, r io.Reader) (written int64, err error, handled bool) {
	var remain int64 = 1<<63 - 1 // by default, copy until EOF
	lr, ok := r.(*io.LimitedReader)
	if ok {
		remain, r = lr.N, lr.R
		if remain <= 0 {
			return 0, nil, true
		}
	}
	f, ok := r.(*os.File)
	if !ok {
		return 0, nil, false
	}

and this huge remain in the io.Copy case is what causes the sendFile slowdown:

var werr error
err = sc.Read(func(fd uintptr) bool {
	written, werr, handled = poll.SendFile(&c.pfd, int(fd), remain)
	return true
})

The problem can be reproduced with the following benchmark:

package go61530

import (
	"bytes"
	"fmt"
	"io"
	"log"
	"net/http"
	"net/http/httptest"
	"os"
	"os/exec"
	"strconv"
	"testing"
)

// A benchmark for profiling the server without the HTTP client code.
// The client code runs in a subprocess.
//
// For use like:
//
//	$ go test ./*.go -c && ./go61530.test -test.run=NONE -test.bench=BenchmarkServerIOCopy$ -test.count=10
func BenchmarkServerIOCopy(b *testing.B) {
	b.ReportAllocs()
	b.StopTimer()

	const size = 1024

	// Child process mode;
	if url := os.Getenv("TEST_BENCH_SERVER_URL"); url != "" {
		n, err := strconv.Atoi(os.Getenv("TEST_BENCH_CLIENT_N"))
		if err != nil {
			panic(err)
		}
		for i := 0; i < n; i++ {
			res, err := http.Get(url)
			if err != nil {
				log.Panicf("Get: %v", err)
			}
			if res.StatusCode != 200 {
				log.Panicf("Status: %d", res.StatusCode)
			}
			n, err := io.Copy(io.Discard, res.Body)
			if err != nil {
				log.Panicf("Copy: %v", err)
			}
			if n != size {
				log.Panicf("Wrong size: %d", n)
			}
			res.Body.Close()
		}
		os.Exit(0)
		return
	}

	file := b.TempDir() + "/foo"
	err := os.WriteFile(file, bytes.Repeat([]byte{0}, size), 0666)
	if err != nil {
		b.Fatal(err)
	}

	ts := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Set Content-Length to disable chunked encoding and allow w.ReadFrom
		w.Header().Set("Content-Length", strconv.FormatInt(size, 10))

		f, err := os.Open(file)
		if err != nil {
			b.Fatal(err)
		}
		defer f.Close()

		//io.Copy(w, f) // slow
		io.CopyN(w, f, size) // fast
	}))
	defer ts.Close()

	cmd := exec.Command(os.Args[0], "-test.run=XXXX", "-test.bench=BenchmarkServerIOCopy$")
	cmd.Env = append([]string{
		fmt.Sprintf("TEST_BENCH_CLIENT_N=%d", b.N),
		fmt.Sprintf("TEST_BENCH_SERVER_URL=%s", ts.URL),
	}, os.Environ()...)

	b.StartTimer()
	out, err := cmd.CombinedOutput()
	if err != nil {
		b.Errorf("Test failure: %v, with output: %s", err, out)
	}
}

With a debug print:

diff --git a/src/net/sendfile_linux.go b/src/net/sendfile_linux.go
index 9a7d005803..4934621fee 100644
--- a/src/net/sendfile_linux.go
+++ b/src/net/sendfile_linux.go
@@ -39,6 +39,7 @@ func sendFile(c *netFD, r io.Reader) (written int64, err error, handled bool) {
 
        var werr error
        err = sc.Read(func(fd uintptr) bool {
+               println(remain)
                written, werr, handled = poll.SendFile(&c.pfd, int(fd), remain)
                return true
        })

the io.Copy version prints 9223372036854775807, while the io.CopyN version prints 512 (the 1024-byte file minus the 512 sniffed bytes)

AlexanderYastrebov (Contributor) commented Jul 24, 2023

Now that I know what to search for, I've found #45256, which apparently describes the same problem.
In particular, it states that the problem is only observed on local connections: #45256 (comment), #45256 (comment)

AlexanderYastrebov (Contributor) commented:

And #41513

rsc changed the title from net/http: go1.21 http.ServeFile performance regression to net/http: ServeFile 2X performance regression in Go 1.21 Jul 24, 2023
rsc changed the title from net/http: ServeFile 2X performance regression in Go 1.21 to net/http: ServeFile 2X slower in Go 1.21 Jul 24, 2023
mauri870 (Member) commented:

Interesting, weird that this just resurfaced now, but it was first reported in 2020. Something has been going on around sendfile/io.Copy for quite a while; there seems to be a more complex underlying issue.

I've also found that the bench for me gets stuck on epoll wait and prints an abnormally large remaining size.

Since this is blocking the release of 1.21, we could just revert the change back to io.CopyN to fix the immediate issue and give us proper time to investigate the real one.

neild (Contributor) commented Jul 24, 2023

Thanks for the really nice analysis @AlexanderYastrebov.

I think reverting https://go.dev/cl/446276 is the right thing for 1.21. Whatever the root problem with sendfile is, it's subtle and I don't want to be trying to figure out the right subtle use of it this close to release.

gopherbot commented:

Change https://go.dev/cl/512615 mentions this issue: Revert "net/http: use Copy in ServeContent if CopyN not needed"

AlexanderYastrebov (Contributor) commented:

@costela FYI

costela (Contributor) commented Jul 24, 2023

@AlexanderYastrebov thanks for the great investigative work! I totally overlooked this case in my original PR's benchmarks 😓

mauri870 (Member) commented:

Seems to be the right thing to do for this release ✌️

BTW we should close this one: https://go-review.googlesource.com/c/go/+/512235

AlexanderYastrebov (Contributor) commented:

@mauri870 You can close the PR and it will abandon the CL.

mauri870 (Member) commented Jul 24, 2023

Just did it, thanks!

Some notes from my investigation:

There seems to be a >200ms delay if io.Copy is used. This delay only happens if you have the server and client running at the same time. If you try to debug the issue and step through it line by line using delve, for example, it does not happen anymore.

If this bit from copyBuffer is removed, it becomes fast again:

// Similarly, if the writer has a ReadFrom method, use it to do the copy.
if rt, ok := dst.(ReaderFrom); ok {
	start := time.Now()
	n, err := rt.ReadFrom(src)
	println("readFrom copyBuffer", time.Since(start).Milliseconds())
	return n, err
}

Edit:

The first sendfile runs fast, the second takes >200ms, and subsequent ones take ~50ms each.

Applying the patch described here fixes the issue: https://go-review.googlesource.com/c/go/+/305229

AlexanderYastrebov (Contributor) commented:

I was also wondering why it uses sendFile instead of splice:

func (c *TCPConn) readFrom(r io.Reader) (int64, error) {
	if n, err, handled := splice(c.fd, r); handled {
		return n, err
	}
	if n, err, handled := sendFile(c.fd, r); handled {
		return n, err
	}
	return genericReadFrom(c, r)
}

and it turns out splice is not implemented for the file-to-socket case:

var s *netFD
if tc, ok := r.(*TCPConn); ok {
	s = tc.fd
} else if uc, ok := r.(*UnixConn); ok {
	if uc.fd.net != "unix" {
		return 0, nil, false
	}
	s = uc.fd
} else {
	return 0, nil, false
}

I don't know much about this low-level stuff; likely there are reasons why it is not implemented yet.

Nevertheless I tried a trivial change:

diff --git a/src/net/splice_linux.go b/src/net/splice_linux.go
index ab2ab70b28..819932374b 100644
--- a/src/net/splice_linux.go
+++ b/src/net/splice_linux.go
@@ -7,6 +7,7 @@ package net
 import (
        "internal/poll"
        "io"
+       "os"
 )
 
 // splice transfers data from r to c using the splice system call to minimize
@@ -24,19 +25,21 @@ func splice(c *netFD, r io.Reader) (written int64, err error, handled bool) {
                }
        }
 
-       var s *netFD
+       var spfd *poll.FD
        if tc, ok := r.(*TCPConn); ok {
-               s = tc.fd
+               spfd = &tc.fd.pfd
        } else if uc, ok := r.(*UnixConn); ok {
                if uc.fd.net != "unix" {
                        return 0, nil, false
                }
-               s = uc.fd
+               spfd = &uc.fd.pfd
+       } else if f, ok := r.(*os.File); ok {
+               spfd = f.PollFD()
        } else {
                return 0, nil, false
        }
 
-       written, handled, sc, err := poll.Splice(&c.pfd, &s.pfd, remain)
+       written, handled, sc, err := poll.Splice(&c.pfd, spfd, remain)
        if lr != nil {
                lr.N -= written
        }
diff --git a/src/os/file_unix.go b/src/os/file_unix.go
index 533a48404b..9e7440f218 100644
--- a/src/os/file_unix.go
+++ b/src/os/file_unix.go
@@ -93,6 +93,10 @@ func (f *File) Fd() uintptr {
        return uintptr(f.pfd.Sysfd)
 }
 
+func (f *File) PollFD() *poll.FD {
+       return &f.pfd
+}
+

and the reproducer benchmark with io.Copy (#61530 (comment)) no longer seems to suffer from the problem and shows numbers on par with io.CopyN+sendFile

Maybe @ianlancetaylor could share some insight here.

jmhodges (Contributor) commented Jul 24, 2023

Is there a new ticket to follow the underlying work on io.Copy since this one has been closed?

(The comments here imply there's more work to be done, but gopherbot closed this ticket)

ianlancetaylor (Contributor) commented:

Thanks for digging into this further. I don't understand what is going on at the kernel level, as discussed at #41513 and #45256. Reverting CL 446276 seems fine for 1.21, but I don't understand what the right long-term fix is. It seems that there is some odd behavior going on with sendfile, but I don't know why. I'm a little reluctant to try to jump in and fix it before we know what the problem is.

@AlexanderYastrebov If we start to use splice for the file-to-socket case then I don't think we need the sendfile case at all. And maybe that is the right move here. But it's a bit weird that splice would work better, since it requires an extra pipe.

The remain value in net/sendfile_linux.go, which is 1<<63 - 1, shouldn't be too relevant, as internal/poll/sendfile_linux.go cuts it to 4 << 20 anyhow.

I wonder whether it makes a difference if net/sendfile_linux.go calls os.File.Stat to get the file size to pass to internal/poll.Sendfile.
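
For illustration only, a rough sketch of that Stat idea (hypothetical; the package, the helper name, and the sentinel check are assumptions, not a proposed change to net/sendfile_linux.go):

```
package sendfilesketch

import (
	"io"
	"os"
)

// statSizeHint illustrates the idea above: if sendFile received no
// io.LimitedReader hint (remain is still the copy-until-EOF sentinel), ask the
// *os.File for its size so the poll layer gets a realistic byte count, at the
// cost of one extra fstat system call per sendfile.
func statSizeHint(r io.Reader, remain int64) int64 {
	const copyUntilEOF = 1<<63 - 1
	f, ok := r.(*os.File)
	if !ok || remain != copyUntilEOF {
		return remain
	}
	if fi, err := f.Stat(); err == nil && fi.Mode().IsRegular() {
		return fi.Size()
	}
	return remain
}
```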

AlexanderYastrebov (Contributor) commented:

The remain value in net/sendfile_linux.go, which is 1<<63 - 1, shouldn't be too relevant, as internal/poll/sendfile_linux.go cuts it to 4 << 20 anyhow.

Yes, in fact the reproducer (#61530 (comment)) showed the undesired behavior even with io.CopyN(w, f, size+1)

I wonder whether it makes a difference if net/sendfile_linux.go calls os.File.Stat to get the file size to pass to internal/poll.Sendfile.

This was declined in https://go-review.googlesource.com/c/go/+/305229

ianlancetaylor (Contributor) commented:

Fair point about CL 305229. I declined it because it seemed to me that we didn't have the data to support doing the extra system call. I guess today I would still like to understand what is going on in the kernel to cause the slowdowns.

r-hang (Author) commented Aug 2, 2023

Hey @neild, it looks like the revert (https://go.dev/cl/512615) didn't make it into go1.21rc4 or release-branch.go1.21. Is there a process or issue I should file to move it along?

sywhang (Contributor) commented Aug 3, 2023

@ianlancetaylor @bcmills just a friendly ping on ^ :-). Can we re-open this until https://go.dev/cl/512615 gets backported? Thank you!

cespare (Contributor) commented Aug 3, 2023

I'll reopen this for @neild to close after cherry-picking CL 512615 to the release branch. (Per this thread we don't need a backport issue.)

cespare reopened this Aug 3, 2023
gopherbot commented:

Change https://go.dev/cl/515795 mentions this issue: [release-branch.go1.21] Revert "net/http: use Copy in ServeContent if CopyN not needed"

gopherbot pushed a commit that referenced this issue Aug 4, 2023: [release-branch.go1.21] Revert "net/http: use Copy in ServeContent if CopyN not needed"

This reverts CL 446276.

Reason for revert: Causing surprising performance regression.

Fixes #61530

Change-Id: Ic970f2e05d875b606ce274ea621f7e4c8c337481
Reviewed-on: https://go-review.googlesource.com/c/go/+/512615
Run-TryBot: Damien Neil <dneil@google.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
(cherry picked from commit df0a129)
Reviewed-on: https://go-review.googlesource.com/c/go/+/515795
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: David Chase <drchase@google.com>
Reviewed-by: Damien Neil <dneil@google.com>
dr2chase (Contributor) commented Aug 4, 2023

Robots seem to be having a slow day; the CL mentioned immediately above says "Fixes #61530", and that was its intent, so I am closing.

dr2chase closed this as completed Aug 4, 2023
dmitshur added the NeedsFix (The path to resolution is known, but the work has not been done.) label and removed the NeedsInvestigation label Aug 4, 2023
gopherbot commented:

Change https://go.dev/cl/555855 mentions this issue: net/http: use Copy in ServeContent if CopyN not needed
