
x/net/http2: window updates on randomWriteScheduler after stream closed cause memory leaks #33812

Open
jared2501 opened this issue Aug 24, 2019 · 2 comments

@jared2501 commented Aug 24, 2019

What version of Go are you using (go version)?

go version go1.12.4 darwin/amd64

Does this issue reproduce with the latest release?

Yes

What did you do? What did you expect to see? What did you see instead?

I've been tracking down creeping response times in a gRPC server that uses HTTP/2.

[image: response-time graph]

The big jump downwards is when I restart the server (purple=median, blue=p95, yellow=max response times).

After much poking and logging, I've traced it down to the entries in randomWriteScheduler.sq growing and never being garbage collected. What I believe is happening is that RST_STREAM frames are being sent on already-closed streams, which causes randomWriteScheduler.Push to allocate new entries in the randomWriteScheduler.sq map (see code here). The queued RST_STREAM frames will eventually be emitted by WriteScheduler.Pop(); however, since the stream has already been closed, randomWriteScheduler.CloseStream will never be called for it, so the entries are never deleted from randomWriteScheduler.sq and the map grows without bound.

The priorityWriteScheduler handles this case by not allocating new nodes, but instead attaching RST_STREAM frames for closed streams to its root node (in fact, it has a nice comment explaining this logic here).

I can look at providing a minimal repro of the issue if required, but it's a bit tricky to isolate since it requires a delay in the stream to reproduce. Hopefully the difference in behaviour between the randomWriteScheduler and the priorityWriteScheduler, together with a reading of the code, is enough to identify and fix the issue here.

@odeke-em odeke-em changed the title http2: window updates on randomWriteScheduler after stream closed cause memory leaks x/net/http2: window updates on randomWriteScheduler after stream closed cause memory leaks Aug 24, 2019

@jared2501 (Author) commented Aug 26, 2019

I've run the gRPC server through a full traffic cycle with the priorityWriteScheduler enabled and it seems to have fixed the issue!

[image: response-time graph after switching to priorityWriteScheduler]

Here's a suggested fix for the leak:

diff --git a/http2/writesched_random.go b/http2/writesched_random.go
index 36d7919..4c34683 100644
--- a/http2/writesched_random.go
+++ b/http2/writesched_random.go
@@ -51,8 +51,14 @@ func (ws *randomWriteScheduler) Push(wr FrameWriteRequest) {
        }
        q, ok := ws.sq[id]
        if !ok {
-               q = ws.queuePool.get()
-               ws.sq[id] = q
+               // id is an idle or closed stream. wr should not be a HEADERS or
+               // DATA frame. However, wr can be a RST_STREAM. In this case, we
+               // push wr onto the zero write queue, rather than creating a new
+               // write queue, since RST_STREAM is tiny.
+               if wr.DataSize() > 0 {
+                       panic("add DATA on non-open stream")
+               }
+               q = &ws.zero
        }
        q.push(wr)
 }

@jared2501 (Author) commented Sep 6, 2019

Quick ping @bradfitz @odeke-em - any chance of landing this in 1.14?
