
runtime: maybe allgs should shrink after peak load #34457

Open
cch123 opened this issue Sep 22, 2019 · 6 comments

Comments


@cch123 cch123 commented Sep 22, 2019

What version of Go are you using (go version)?

$ go version
go version go1.12.4 linux/amd64

Does this issue reproduce with the latest release?

Y

What operating system and processor architecture are you using (go env)?

any

What did you do?

When serving a peak load, the system creates a large number of goroutines, and afterwards the leftover goroutine structures cause extra CPU consumption.

This can be reproduced by:

package main

import (
	"log"
	"net/http"
	_ "net/http/pprof"
	"time"
)

func sayhello(wr http.ResponseWriter, r *http.Request) {}

func main() {
	for i := 0; i < 1000000; i++ {
		go func() {
			time.Sleep(time.Second * 10)
		}()
	}
	http.HandleFunc("/", sayhello)
	err := http.ListenAndServe(":9090", nil)
	if err != nil {
		log.Fatal("ListenAndServe:", err)
	}
}

After 10 seconds, once all the goroutines have exited, the in-use objects still remain the same.
[flame graph: heap profile showing in-use objects allocated by runtime.malg]

What did you expect to see?

The global list of goroutines (allgs) shrinks back to a reasonable size.

What did you see instead?

Many in-use objects created by runtime.malg.

@zboya zboya commented Sep 23, 2019

In fact, allgs is never shrunk, which hurts stability. The runtime should provide a strategy to reduce it; for example, if sysmon observes that more than half of the gs are dead, it could release them.


@agnivade agnivade commented Sep 23, 2019

@changkun

This comment has been hidden.


@aclements aclements commented Sep 25, 2019

Your observation is correct. Currently the runtime never frees the g objects created for goroutines, though it does reuse them. The main reason for this is that the scheduler often manipulates g pointers without write barriers (a lot of scheduler code runs without a P, and hence cannot have write barriers), and this makes it very hard to determine when a g can be garbage collected.

One possible solution is to use an RCU-like reclamation scheme over the Ms that understands when each M's scheduler passes through a quiescent state. Then we could schedule unused gs to be reclaimed after a grace period, when all of the Ms have been in a quiescent state. Unfortunately, we can't simply use STWs to detect this grace period because those stop all Ps, so, just like the write barriers, those won't protect against scheduler instances manipulating gs without a P.

@changkun, I'm not sure what your benchmark is measuring. Calling runtime.GC from within a RunParallel doesn't make sense. The garbage collector is already concurrent, and calling runtime.GC doesn't start another garbage collection until the first one is done. Furthermore, if there are several pending runtime.GC calls, they'll all be coalesced into a single GC. If the intent is just to measure how long a GC takes, call runtime.GC without the RunParallel.

@changkun

This comment has been hidden.


@aclements aclements commented Sep 26, 2019

Calling runtime.GC within a RunParallel does not measure contention on allglock. The GCs are serialized by runtime.GC itself, so they're not fighting over allglock, and they're coalesced by runtime.GC, so calling runtime.GC N times concurrently can result in anywhere from 1 to N GCs depending on vagaries of scheduling.

Benchmark aside, though, I think we're all clear on the issue that allgs is never collected and that impacts GC time and heap size.

Since gs are just heap allocated, it would make the most sense to collect them during GC like other heap allocations. The question is when it's safe to unlink them from allgs and allow them to be collected, given that the normal GC reachability invariants don't apply to gs. (At the same time, we don't want to be over-aggressive about unlinking them from allgs either, since we want the allocation pooling behavior to reduce the cost of starting a goroutine.) This is certainly doable, though it would require a fair amount of care.
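One existing precedent for "pool for reuse, but still let the GC reclaim" is sync.Pool, whose cached entries are dropped within a couple of GC cycles rather than retained forever. A hypothetical sketch of that trade-off (gstub is a stand-in struct, not the runtime's g):

```go
package main

import (
	"fmt"
	"sync"
)

// gstub is a stand-in for a goroutine descriptor (hypothetical).
type gstub struct{ stack [2048]byte }

// gPool reuses idle stubs to cut allocation cost at goroutine start,
// while the GC is still free to empty the pool after a load spike.
var gPool = sync.Pool{New: func() any { return new(gstub) }}

func main() {
	g := gPool.Get().(*gstub) // reuse an idle stub, or allocate a new one
	fmt.Println("stub stack size:", len(g.stack))
	gPool.Put(g) // return it for the next "goroutine"
}
```

The runtime cannot literally use sync.Pool for gs (the write-barrier and no-P constraints above still apply), but it illustrates the desired balance between pooling and eventual reclamation.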
