runtime: trim unnecessary fields from scase #40410

Open
mdempsky opened this issue Jul 26, 2020 · 10 comments

@mdempsky (Member) commented Jul 26, 2020

Currently the runtime.scase structure is 5 words (6 words on 32-bit machines):

type scase struct {
	c           *hchan         // chan
	elem        unsafe.Pointer // data element
	kind        uint16
	pc          uintptr // race pc (for race detector / msan)
	releasetime int64
}

I think this can be trimmed down to just 2: c and elem. Here's how:

  1. The pc field is only needed for race-instrumented builds. Instead of embedding it directly into the scase array (and using stack space even for non-instrumented builds), we can split it out into a separate array that's prepended to the pollorder/lockorder arrays for race-instrumented builds, and omitted entirely for non-instrumented builds. (We'd prepend it because uintptr has stricter alignment than uint16.)

  2. The releasetime field is only needed for the case that actually succeeds. Rather than adding it to each scase, we can just have an extra casreleasetime local variable that parallels cas and casi. (I think: I'm least confident about this as I'm not familiar with the runtime's blocking profiling.)

  3. The kind field currently disambiguates four cases: send, recv, default, and nil. We can eliminate this field with a sequence of three optimizations:

    1. There can be only one default case. So instead of emitting a dummy scase to represent the default case, selectgo can just take an extra boolean parameter indicating whether the call should block or not. When in non-blocking mode, selectgo should just return -1 as the case index, and callers can handle this accordingly. This would simplify selectgo, because it wouldn't need to search for and remember where the caseDefault case is.

    2. caseNil is used internally within selectgo to simplify skipping over send/recv cases where c is nil. But instead of doing it this way, selectgo could simply omit nil-c cases from the pollorder and lockorder arrays when they're constructed. Then the rest of the code will naturally skip over them.

    3. That leaves just send and recv cases. Because the runtime is going to randomize the order that cases will be processed anyway, the compiler can simply arrange that send cases always precede receive cases (or vice versa), and split ncases into nsends and nrecvs. Then selectgo can discern send and recv cases by just comparing the case index (e.g., if sends are ordered before receives, then casi < nsends would indicate cas is a send operation).

      Requiring sends before recvs will be slightly more work for reflect.Select, but it shouldn't be too bad. It already has to translate from []reflect.SelectCase to []runtime.scase (and currently it goes through an extra step and extra heap allocations with runtime.runtimeSelect). It should be easy to construct the latter by placing sends at the front of the slice and receives at the rear (remember: order in []runtime.scase doesn't matter), and just tracking a correspondence of array indices to translate selectgo's return value back.
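
The three steps for dropping kind can be sketched in ordinary Go. This is a toy model under stated assumptions, not the runtime's code: `chan int` and `*int` stand in for `*hchan` and `unsafe.Pointer`, `userCase` stands in for `reflect.SelectCase`, and `arrange`, `pollOrder`, `isSend`, and `pollOnce` are hypothetical names. It also polls in index order, whereas the runtime randomizes the poll order.

```go
package main

import "fmt"

// scase is the proposed two-field case record (stand-in types, see above).
type scase struct {
	c    chan int
	elem *int
}

// userCase is a hypothetical stand-in for reflect.SelectCase.
type userCase struct {
	send bool
	c    chan int
	elem *int
}

// arrange places sends at the front and receives at the rear of a new
// []scase. orig[j] is the index in cases that slot j came from.
func arrange(cases []userCase) (out []scase, orig []int, nsends int) {
	out = make([]scase, len(cases))
	orig = make([]int, len(cases))
	front, rear := 0, len(cases)
	for i, uc := range cases {
		var j int
		if uc.send {
			j = front
			front++
		} else {
			rear--
			j = rear
		}
		out[j] = scase{uc.c, uc.elem}
		orig[j] = i
	}
	return out, orig, front
}

// pollOrder lists the case indices to poll. Nil-channel cases are left
// out here, so no later code needs a caseNil check.
func pollOrder(cases []scase) []int {
	order := make([]int, 0, len(cases))
	for i, cas := range cases {
		if cas.c != nil {
			order = append(order, i)
		}
	}
	return order
}

// isSend distinguishes sends from receives by index alone.
func isSend(casi, nsends int) bool { return casi < nsends }

// pollOnce makes a single non-blocking pass and returns the index of
// the first case that could proceed, or -1 if none could.
func pollOnce(cases []scase, nsends int) int {
	for _, casi := range pollOrder(cases) {
		cas := cases[casi]
		if isSend(casi, nsends) {
			select {
			case cas.c <- *cas.elem:
				return casi
			default:
			}
		} else {
			select {
			case v := <-cas.c:
				*cas.elem = v
				return casi
			default:
			}
		}
	}
	return -1
}

func main() {
	full := make(chan int, 1)
	full <- 7
	ready := make(chan int, 1)
	ready <- 42
	var got int
	val := 1
	sc, orig, nsends := arrange([]userCase{
		{send: false, c: nil, elem: &got},   // nil channel, never ready
		{send: true, c: full, elem: &val},   // buffer full, send not ready
		{send: false, c: ready, elem: &got}, // receive is ready
	})
	if casi := pollOnce(sc, nsends); casi >= 0 {
		fmt.Println("chose original case", orig[casi], "value", got)
	}
}
```

The `orig` slice is the "correspondence of array indices" mentioned above: the caller maps the returned index back through it, and a return of -1 plays the role of the default case.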

/cc @aclements

@josharian (Contributor) commented Jul 26, 2020

As it stands, eliminating just one of these will likely help considerably, as it will make this struct SSA-able.

@mdempsky (Member, Author) commented Jul 26, 2020

It looks like per-scase releasetime is intended to help when the select is awoken because of a channel closing. However, it turns out this code has corner cases that would be addressed by replacing it with a single casreleasetime variable like I suggest.

When selectgo is awoken by a channel closing, it doesn't know which sudog triggered the wakeup, because closechan sets gp.param = nil. To handle this, selectgo polls all of the select cases again, knowing it will succeed this time. And to account for this, selectgo also writes each sudog's releasetime back into the corresponding scase.

However, there's no guarantee that the re-poll will actually pick the same case that triggered the wakeup. It's possible that between the channel close wakeup and the goroutine getting to run, other cases might become ready too. If one of these other cases ends up getting selected on the re-poll, we'll never end up recording a block event.

As an extreme but somewhat contrived example (involving selecting on the same channel twice), this program spends a total of 1 second blocked waiting for the select statement, but this doesn't show up in the pprof output at all: https://play.golang.org/p/HoK4YDLwEvb

As a more realistic example, this program also spends 1 second blocked waiting for the select statement, but pprof only reports about 1/4 of the time spent blocked: https://play.golang.org/p/ldCvE6fYYZQ

@gopherbot commented Jul 27, 2020

Change https://golang.org/cl/245019 mentions this issue: runtime: add "success" field to sudog

@gopherbot commented Jul 28, 2020

Change https://golang.org/cl/245124 mentions this issue: runtime: split PCs out of scase

@gopherbot commented Jul 28, 2020

Change https://golang.org/cl/245126 mentions this issue: cmd/compile/internal/gc: cleanup walkselectcases slightly

@gopherbot commented Jul 28, 2020

Change https://golang.org/cl/245123 mentions this issue: runtime: omit nil-channel cases from selectgo's orders

@gopherbot commented Jul 28, 2020

Change https://golang.org/cl/245125 mentions this issue: runtime: eliminate scase.kind field

@gopherbot commented Jul 28, 2020

Change https://golang.org/cl/245122 mentions this issue: runtime: remove scase.releasetime field

@mdempsky (Member, Author) commented Jul 28, 2020

Uploaded a WIP patch series that implements this.

The change that splits the pc field out into a separate array is failing on race detector builds because of a use of select within the runtime. Because we turn off instrumentation when building the runtime, the compiler wasn't actually setting pc for that select before. (So it was a good thing I decided in CL 245124 to be conservative about not trying to optimize argument passing just yet.)

Easy fix is probably to create and populate the pcs array when -race is set, even for packages that normally suppress instrumentation.

@mdempsky (Member, Author) commented Jul 28, 2020

Some notes to self on more cleanup still to do:

  1. The nsends and nrecvs parameters can be uint16 instead. Since nsends/nrecvs/block are all constant values at call sites, this would probably save a couple bytes of instructions at call sites.

  2. runtime.reflect_rselect needs to be tweaked to correctly handle when there's only a default case. (There should probably be a test for this too.)

  3. reflect.runtimeSelect should be updated to match runtime.scase so that reflect.Value.Select can directly call selectgo rather than going through another translation function.
