Skip to content

x/net/http2: stream prioritization does not work for small payloads #75936

@nicholashusin

Description

@nicholashusin

While working on #75500, I've found that HTTP/2 stream prioritization does not work properly for small payloads.

Some part of the code used to reproduce this issue is unfortunately still unmerged as #75500 is still in discussion, however, the following test snippet should illustrate the core of the issue:

func testServer_RFC9218_Priority(t testing.TB) {
	st := newServerTester(t, func(w http.ResponseWriter, r *http.Request) {
		for {
			w.Write([]byte("a"))
			if f, ok := w.(http.Flusher); ok {
				f.Flush()
			}
		}
	}, func(s *Server) {
		s.NewWriteScheduler = newPriorityWriteSchedulerRFC9128
	})
	if syncConn, ok := st.cc.(*synctestNetConn); ok {
		syncConn.SetReadBufferSize(1)
	} else {
		t.Fatal("Server connection is not synctestNetConn")
	}
	defer st.Close()
	st.greet()

	st.writeHeaders(HeadersFrameParam{
		StreamID:      1,
		BlockFragment: st.encodeHeader("priority", "u=3"),
		EndStream:     false,
		EndHeaders:    true,
	})
	synctest.Wait()
	st.writeHeaders(HeadersFrameParam{
		StreamID:      3,
		BlockFragment: st.encodeHeader("priority", "u=0"),
		EndStream:     false,
		EndHeaders:    true,
	})
	synctest.Wait()

	for {
		f := st.readFrame()
		if f == nil {
			break
		}
		fmt.Println(f)
	}
}

Given two streams with differing RFC 9218 priorities, @neild and I have found that the frames received via st.readFrame() actually alternate between stream 1 and 3. Had prioritization worked properly, the frames we received should have been for stream 3 as it is more urgent.

The cause of the issue seems to be the following:

  • The request handlers for both stream 1 and 3 call the WriteScheduler's Push() to enqueue their writes.
  • The request handlers would block until said write is complete.
  • The WriteScheduler's Pop() is called by the server, and it tries to write the FrameWriteRequest.
  • Once the write succeeds, the request handlers are no longer blocked, and the server will try to Pop() another FrameWriteRequest. However, this time, Pop() will get a FrameWriteRequest for the lower-priority stream, as the handler for the higher-priority stream hasn't had a chance to enqueue a new frame yet.

Note that the prioritization behavior still works when the write size is larger than the max frame size, as that causes multiple FrameWriteRequest to be pushed to the WriteScheduler. This means that stream prioritization currently does not behave well for payloads less than 16 KB (the default max frame size), although this size can be larger or smaller, as dictated by SETTING frames.

One potential fix here is to let the handlers run ahead of the written frames. Rather than blocking after requesting that a DATA frame be written, we can let handlers start constructing the next frame, up to a certain limit—for example, we can let it enqueue two frames instead of one.

However, this does conflict with existing optimization, where we currently allow ResponseWriter.Write() with a large []byte to send the entire []byte to the write queue (rather than being copied into an internal buffer). In this case, we must block until the write is complete, since the []byte is free to be modified by the caller as soon as ResponseWriter.Write() returns. As such, to be able to let the handlers run ahead of the written frames, we would need to disable this optimization.

There are real possibilities that letting the handlers run ahead of the written frames could get us a significant performance increase: connections would no longer need to wait for the handler to provide another frame after writing the last one. However, a more concrete benchmark would be needed to determine if this is true (and if the []byte copying optimization in ResponseWriter.Write() is worth it).

Metadata

Metadata

Assignees

No one assigned

    Labels

    BugReportIssues describing a possible bug in the Go implementation.NeedsInvestigationSomeone must examine and confirm this is a valid issue and not a duplicate of an existing one.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions