Go compiler and runtime meeting notes #43930

jeremyfaller · 2021-01-26T20:31:19Z

Google's Go compiler and runtime team meets periodically (roughly weekly) to discuss ongoing development of the compiler and runtime. While not open to the public, there's been desire by the community to learn what the compiler and runtime team is working on. While we learn what is a good format, and what works and doesn't, we will start publishing meeting minutes here.

These compiler and runtime meeting minutes are under development. We welcome feedback on content, format, level of detail, timeliness, and so on. If the minutes are helpful, please let us know. If they are less than helpful, we welcome constructive comments on how to improve them.

Notes on agenda items that address Google specific needs are elided.

This meta-issue records minutes of Google's Go compiler and runtime meetings as issue comments, so that they can be cross-linked easily with the relevant issues. This meta-issue is for minutes only; comments that are not meeting minutes will be deleted.

jeremyfaller · 2021-01-26T20:34:14Z

2021-01-19

We continued a discussion about increasing the bootstrap version of Go from 1.4 to 1.16 after the 1.17 release. We wanted to make sure it wasn't hard for people to build from scratch, but we would like to use features that have been added over time. Consensus was that this would be okay to submit as a formal proposal.
We discussed Austin's benchmark unit proposal. Consensus was hopeful that it would be accepted.
We discussed the weak maps proposal. A number of corner cases, gotchas, and subtleties were discussed. Consensus was that the discussion on the issue was up to date.
After falling off the prior week's agenda, we began a discussion about releasing the compiler and runtime meeting notes. We all felt communication with the community was important, and we would start with an internal publication of meeting notes. No dates were set for a final decision.

jeremyfaller · 2021-01-26T20:37:23Z

2021-01-26

We continued discussing whether or not there were any concerns with publishing compiler and runtime meeting notes. No issues were raised, so it's likely I will go ahead and start publishing the notes.
We discussed the redirecting crash output proposal. No major concerns were raised, but consensus was that a call would be the best approach (as opposed to environment variables, etc.).
We discussed the pain merging the development branches (master -> dev.regabi -> dev.typeparams). The consensus was that without acceptance of the generics proposal, we need to keep living with the pain.
We briefly discussed using the type checker in dev.typeparams as the default type checker. Using that type checker is blocked on either acceptance of generics proposal or a separate proposal to move to a new type checker.

jeremyfaller · 2021-02-02T19:43:15Z

2021-02-02

We continued our discussion of moving to the new default typechecker on dev.typeparams. Engineering effort continues at a breakneck pace, and we're hopeful we'll be able to have this early.
We discussed the issue with spinning M's, and have seen large improvements for unique workloads within Google. Consensus was that this was an excellent find.
We had a brief discussion of integer constant resolution.
We briefly discussed merging the dev.regabi branch back to master when the tree opens. Consensus was that it would be safe to merge when the tree opened.
We discussed some recent additions to dev.typeparams that added support for generic functions and types. It's possible the backend work will happen on the branch's compiler soon.

jeremyfaller · 2021-02-09T19:43:18Z

2021-02-08

We briefly discussed the status of 1.16.
We discussed the tree reopening, and the landing of the dev.regabi and dev.typeparams branches on master. Current plans are to merge dev.regabi during the first week. If both proposals (1, 2) are accepted, dev.typeparams will also be merged during that week.
We discussed the new pacer proposal, and its benefits.
We also discussed the status of the register ABI. While the functionality is gated behind a flag, we don't want to go into the code freeze and only then enable the ABI. We discussed the amount of soak time we think we will need.

jeremyfaller · 2021-02-16T21:21:35Z

2021-02-16

We had a small celebration about the acceptance of the Generics Proposal.
We discussed merging the dev.typeparams branch to master. There are public API typechecking changes on that branch that aren't stable. As a result, we will likely hide the APIs using build tags.
We continued a long-running discussion on making types2 the default typechecker. We still don't need to make a decision here, so back into the oven it goes.
We discussed revising the syntax for typelists in generics. An internally floated proposal is similar to public proposals. More to come.
We discussed adding attendance to entries in this document. As no one objected to having their name added, I will start doing that. I won't start by CC-ing people, just linking to their github profile for now.

Attendees:
Carlos Amedee
David Chase
Austin Clements
Matthew Dempsky
Jeremy Faller
Robert Griesemer
Than McIntosh
Martin Möhrmann
Patrik Nyblom
Michael Pratt
Keith Randall
Dan Scales
Ian Lance Taylor
Cherry Zhang

jeremyfaller · 2021-02-23T22:19:44Z

2021-02-23

We celebrated the reopening of the tree for 1.17 development.
We discussed some of the features we want to land for 1.17, including the register ABI, foundational work for generics, the new gc pacer, improvements to the scheduler, escape analysis improvements, and possibly some phase changes to the compiler. We also had a more in-depth discussion of subtle points in types2.
We discussed the work on register ABI. A hello world is working (in review, not fully committed). More work for return values has also started, and we're having some success there.
A new ABI with G in R14 and a zeroed X15 has shown reasonable promise to offer good runtime performance improvements. Work is ongoing to confirm results, and clean up some poorly understood overhead in wrappers. Also, not all of the places where we use G in assembly have been converted over. More data to come.
We discussed an issue that turning on stack poisoning seems to bork the openbsd-amd64-68 trybot.
We discussed a proposal to add zeroing for cryptographic functions. Consensus was that this is non-trivial.

Attendees:
Carlos Amedee
David Chase
Austin Clements
Matthew Dempsky
Jeremy Faller
Robert Griesemer
Than McIntosh
Martin Möhrmann
Patrik Nyblom
Michael Pratt
Keith Randall
Dan Scales
Ian Lance Taylor
Cherry Zhang

jeremyfaller · 2021-03-03T16:52:14Z

2021-03-02

We discussed the current status of our 1.17 and 1.18 goals, including the register ABI, and generics.
We discussed a small issue on sum types. Likely more external communication to come.
We discussed how best to handle the meeting notes (this issue), and questions and feedback. We decided it would be best to move queries to the mailing list(s) for now.
We discussed moving GOEXPERIMENT, and how important it would be to help the registerABI work to land.
We discussed a subtle issue in the register ABI to simplify the internal representation and requirements for go & defer. Specifically, having the compiler rewrite defer f(x) ⇒ x' := x; defer func() { f(x') }().
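
The rewrite above can be checked by hand. The following is a minimal sketch, not the compiler's actual transformation: `record`, `direct`, and `rewritten` are hypothetical names, and `x1` stands in for the `x'` of the notes. It shows that copying the argument at the defer statement and deferring a closure over the copy observes the same value as the ordinary `defer f(x)` form.

```go
package main

import "fmt"

// seen collects the values observed by the deferred calls.
var seen []int

// record plays the role of f in the notes (hypothetical helper).
func record(v int) { seen = append(seen, v) }

// direct uses the ordinary form: x is evaluated when the defer
// statement executes, not when the deferred call finally runs.
func direct() {
	x := 1
	defer record(x)
	x = 2
}

// rewritten is the hand-written equivalent of the rewrite described
// in the notes: copy the argument first, then defer a closure that
// takes no arguments and calls f with the copy.
func rewritten() {
	x := 1
	x1 := x // x' in the notes
	defer func() { record(x1) }()
	x = 2
}

func main() {
	direct()
	rewritten()
	fmt.Println(seen) // prints [1 1]
}
```

Both forms record 1, because the later assignment `x = 2` does not affect a value that was already evaluated (or copied) at the defer statement. After the rewrite, the deferred function takes no arguments, which is presumably the simplification the notes refer to.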

Attendees:
David Chase
Austin Clements
Matthew Dempsky
Jeremy Faller
Robert Griesemer
Than McIntosh
Patrik Nyblom
Michael Pratt
Keith Randall
Dan Scales
Cherry Zhang

jeremyfaller · 2021-03-09T21:12:07Z

2021-03-09

Again, we discussed the status of our 1.17 and 1.18 goals.
We hope that the "move GOEXPERIMENT knob" proposal is accepted, as the current plan with the register ABI is to use it for turnup.
We discussed the progress on type lists. An internal proposal likely addresses all concerns, and an external document will be available.
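
As background for the type list discussion: the notes do not say what the internal proposal contained, so the following is only a sketch of the two syntaxes a reader is likely to encounter. The generics design draft wrote type lists as `interface { type int, int64, float64 }`, while the syntax that eventually shipped in Go 1.18 uses unions with `~`. `Number`, `Sum`, and `Celsius` are illustrative names, not taken from the minutes.

```go
package main

import "fmt"

// Number is a constraint written in the union syntax that shipped in
// Go 1.18. The earlier design draft would have spelled it roughly as:
//
//	type Number interface { type int, int64, float64 }
type Number interface {
	~int | ~int64 | ~float64
}

// Sum adds up a slice of any type whose underlying type is in Number.
func Sum[T Number](xs []T) T {
	var total T
	for _, x := range xs {
		total += x
	}
	return total
}

// Celsius has underlying type float64, so it satisfies ~float64.
type Celsius float64

func main() {
	fmt.Println(Sum([]int{1, 2, 3}))       // prints 6
	fmt.Println(Sum([]Celsius{1.5, 2.5})) // prints 4
}
```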

Attendees:
David Chase
Austin Clements
Matthew Dempsky
Jeremy Faller
Robert Griesemer
Than McIntosh
Martin Möhrmann
Patrik Nyblom
Michael Pratt
Keith Randall
Dan Scales
Ian Lance Taylor
Cherry Zhang

jeremyfaller · 2021-03-17T16:43:58Z

2021-03-16

These compiler and runtime meeting minutes are a snapshot of what was discussed in an internal Google meeting. Notes on agenda items that address Google specific needs are elided. Any comments on this issue will be removed, and discussion about topics raised should be taken to the appropriate mailing list. Additionally, any feedback or suggestions for additions to the notes should be handled there as well.

We discussed typelists again.
We discussed possibly updating MIPS to MIPS32r2. The thinking is that R2 has been around for a long time (10 years?), and there is significant performance to be gained in the crypto libraries if we adopt it. Previous attempts have failed because of Loongson users. Discussion will likely continue.
We discussed how to address amd64 per-architecture optimizations. Haswell specific optimizations have shown large improvements in certain workloads, and it might be worth providing for these wins. Discussion will likely continue.

Attendees:
The notetaker would like to apologize for leaving off Michael Knyszek from previous attendance. I don't have a good system yet, and it's quite manual.
David Chase
Austin Clements
Matthew Dempsky
Jeremy Faller
Robert Griesemer
Michael Knyszek
Than McIntosh
Martin Möhrmann
Patrik Nyblom
Michael Pratt
Keith Randall
Dan Scales
Ian Lance Taylor
Cherry Zhang

jeremyfaller · 2021-03-24T16:43:12Z

2021-03-23

Most of the meeting was a discussion about 1.18.
We discussed changes to the standard library for generics in 1.18.
We discussed follow-on work for the register ABI (beyond amd64).
We discussed measuring PGO and other GC (Immix) ideas.

Attendees:
I was absent from the meeting, and unable to grab the screenshot.

jeremyfaller · 2021-04-01T14:19:21Z

2021-03-30

We discussed the remaining outstanding work, now that we're 1 month from the code freeze.
- Windows/ARM: completed, status of cgo is unknown.
- RegisterABI: most of the rest of the changes are in review, will be able to start debugging in earnest.
- GC Pacer: Not complicated, but currently unscheduled. Register ABI is taking precedence, and if we get a chance, it'll get in.
- Scheduler Improvements: Most CLs are ready for review, need to do more testing for metrics.
- Improving Escape Analysis: Most CLs are ready for review.
- Haswell Improvements: Needs a proposal first.
- Compiler phase changes: (interleaving front-end and SSA) mostly delayed.
We discussed using GOEXPERIMENT as the one true feature gating system. Register ABI work has found this to be helpful.

Attendees:
Lost to the sands of time. I forgot to take a screenshot.

jeremyfaller · 2021-04-12T17:11:21Z

2021-04-06

We discussed turning up the register ABI. Our current plan is to turn on the five flags that control it one by one, watching the builders/trybots as we go. The date for this turn-up is TBD.
We discussed branching again (during the freeze) for register abi and generics. Our current plan is to use a single branch for both.
We discussed how debug information for generics was going to work. We talked about how other compilers (gcc, swift) do it. Some issues might be brought up in a future meeting with the delve folks.

Attendees:
David Chase
Austin Clements
Matthew Dempsky
Jeremy Faller
Robert Griesemer
Michael Knyszek
Patrik Nyblom
Michael Pratt
Keith Randall
Dmitri Shuralyov
Ian Lance Taylor
Cherry Zhang

jeremyfaller · 2021-04-21T17:51:04Z

2021-04-20

We discussed regabi which has been completely enabled in master. Please let us know if there are any instabilities. Performance (while still changing) is showing ≥ 5% wins, and smaller binaries by 2%, with 10% smaller .TEXT sections.
We discussed the status of types2. It's not in yet, but we intend to get it in before release.

Attendees:
David Chase
Matthew Dempsky
Jeremy Faller
Robert Griesemer
Michael Knyszek
Than McIntosh
Martin Möhrmann
Patrik Nyblom
Michael Pratt
Keith Randall
Dan Scales
Ian Lance Taylor
Cherry Zhang

jeremyfaller · 2021-05-17T15:38:08Z

2021-05-11

I'm a little behind updating these notes. I'll consolidate the last couple of meetings into this one.

The freeze is on. Generics and the register ABI work will continue on the dev.typeparams branch.
The register ABI is holding true on the ~5% performance improvement, and smaller binaries. There are some regressions in some benchmarks, likely related to live variables over loops with register pressure, but we might not feel comfortable fixing these for 1.17.
On May 11, we discussed the Uber blog post. We think it's being adequately discussed on the proposal.
On May 4, we discussed Ian's slices proposal.
We continue to track the 1.17 release milestone. So far, it looks like we're on track for the 1.17 beta.

changkun · 2021-10-28T11:56:41Z

Ping. Is the meeting stopped since May? :)

thanm · 2023-02-21T19:52:34Z

2023-02-21

Discussion relating to shallow/decoupled export data to speed up tools (including google-internal tools)
- Go has ~8x more export data than other languages, relatively speaking. Makes analysis tools much heavier-weight.
- Gopls is switching to an independent tree of shallow export files
- [Matthew]: Still working on a doc. Tools probably want something like what gopls is switching to. x/tools packages to write out export data, which is what gopls is using (via internal/ APIs).
- [Austin]: Distributed builds with shallow export data?
- [Matthew]: bazel has some support for input pruning. Rule lists all possible inputs and can say after the fact what was actually used and that goes into rebuilds.
- [Austin] I think this can only avoid unnecessary rebuilds. I don’t think it can avoid shipping all of the inputs if it does have to rebuild?
- [Than]: Do we know which parts of the export data are "too big"? Or is the reader just taking too long to read them? What fraction is inline bodies vs types?
- [Matthew]: I think inline bodies are a big part of it, and they pull in more transitive type information. Also, the go/types APIs don’t support lazy loading. In the compiler, only the types that are needed get loaded into the type checker.
Discussion of revisiting default link mode for cgo std packages.
- related bugs: cmd/link: linker fails on linux/amd64 when gcc's lto options are used #58619 and cmd/link: linker fails on linux/amd64 when gcc's annobin plugin is used #58620
- Cherry: “... now if user doesn't have a C compiler, they will get non-cgo std packages; if they have a C compiler, they can get cgo-enabled std packages but they can use external linking anyway. So maybe we should default to external linking with cgo even for the std packages?”
- [Ian] making this switch could impact build speed. Right now if you import “net” on Linux, you’ll normally be using a cgo-compiled object, and the Go linker knows enough to link that without the host linker.
- [Austin] Russ thinks we should try doing external linking more of the time.
- [Ian]: It would be nice if we could combine that with the same fix for Linux that Russ did for Darwin where we don’t need to invoke the C compiler at all for the net package.
- [Cherry]: On Linux if you use C libraries at all, you have to work with pthread and that was complicated. On Darwin you just say “I need pthread_create from libSystem or whatever”. On Linux it’s something more complicated.
- [Cherry]: For build speed to be impacted, the user needs A) a reasonable amount of code and B) the only C code in the whole program is the net package, which you can argue doesn't happen that often?
- [Cherry]: I think there is some room to improve the linking speed for external linking, but it will always be slower because you have to shell out to another program.
- [Cherry]: MacOS -race isn’t a problem. It doesn’t import C at all, it’s all just references.
- [Than]: Did we consider using *.syso’s for net?
- [Ian]: We never had to because we always just built it with cgo.
- [Ian]: If you build a binary on Linux, it needs to do the library versioning thing.

Attendees:

Austin Clements
Cherry Mui
David Chase
Ian Lance Taylor
Matthew Dempsky
Michael Knyszek
Michael Pratt
Robert Griesemer
Than McIntosh

thanm · 2023-03-01T18:55:59Z

2023-02-28

Using internal linking less
- discussion of whether/how to reduce use of internal linking for CGO
- internal linking is always a net win from link time perspective, and users want to continue to have short link time, but since moving away from shipping precompiled *.a files, we're running into more situations where user programs that import "net" or "os/user" trigger linker problems due to C compiler flags/features/constructs that the linker doesn't grok (examples: cmd/link: linker fails on linux/amd64 when gcc's lto options are used #58619, cmd/link: linker fails on linux/amd64 when gcc's annobin plugin is used #58620)
- [Cherry]: We still need to use internal linking for syso files, which are not especially simpler. The problem we want to solve is that the linker only knows how to handle certain things, and people may pass weird C compiler flags or their compiler may do weird things. There are other ways to deal with that.
- [Than] Supporting internal linking leads to a steady drip of things we need to change in the linker as host C compilers evolve. It would be great to reduce that as much as we can. But it would be wonderful if we could do flag filtering or something to reduce that stream.
- [Keith]: Can’t our linker report that it sees something it can’t handle and then the Go command fires up the external linker?
- [Cherry]: We could do that. The linker itself can exec the external linker with cmd/go. The downside would be reading the external .o files twice, once to see if we can handle them, and then again to actually process them. Once we load them into the symbol table, it’s complicated to “unload” them.
- [Than]: May also be hard sometimes to tell the difference between "internal linking failed due to weird host object" and "internal linking failed due to Go linker bug"
- [Austin]: How reliably can the Go linker report that it can’t handle something?
- [Cherry]: I think it’s pretty good.
- [Cherry]: Do we want to make the linker do a pre-scan, or just fall back to external linking if we see flags we don’t understand?
- [Austin]: Do we look at flags today?
- [Cherry]: We don’t. There are so many ways to set flags. Maybe the cgo command can tell us.
- [Ian]: The go command does look at the compiler flags to only permit flags it thinks are safe (specifically, not compiler plugins).
- summary: internal linking is going to stay around, but hopefully we can come up with some way to handle weird host objects, e.g. make the linker better at dynamically detecting when it needs to externally link, either through flag scrubbing or pre-scanning
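
The "flag scrubbing" option from the summary can be sketched as follows. This is a minimal illustration under stated assumptions, not how cmd/go or cmd/link actually decide: `chooseLinkMode` and `supportedPrefixes` are hypothetical, and the allowlist is invented for the example. The idea is only that any C compiler flag outside a known-safe set triggers a fallback to external linking.

```go
package main

import (
	"fmt"
	"strings"
)

// supportedPrefixes is a hypothetical allowlist of C compiler flag
// prefixes that the internal linker is assumed to cope with. The real
// toolchain's rules are not described in the notes.
var supportedPrefixes = []string{"-O", "-g", "-I", "-D"}

// chooseLinkMode returns "internal" if every flag matches the
// allowlist. Otherwise it returns "external" together with the first
// flag that forced the fallback.
func chooseLinkMode(cflags []string) (mode, offending string) {
	for _, flag := range cflags {
		ok := false
		for _, prefix := range supportedPrefixes {
			if strings.HasPrefix(flag, prefix) {
				ok = true
				break
			}
		}
		if !ok {
			return "external", flag
		}
	}
	return "internal", ""
}

func main() {
	for _, flags := range [][]string{
		{"-O2", "-g"},
		{"-O2", "-flto"},
		{"-fplugin=annobin"},
	} {
		mode, why := chooseLinkMode(flags)
		fmt.Printf("%v -> %s %s\n", flags, mode, why)
	}
}
```

A flag check like this avoids the cost Cherry mentions of reading the host object files twice, at the price of being conservative: a flag that is harmless but not on the list still forces the slower external link.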
team planning discussion for 2023 objectives/goals
discussion of how much performance improvement we expect to deliver from upcoming projects such as PGO, the inliner overhaul, etc.
- for PGO, we'd been planning that the immediate priorities are -pgo=auto by default, build scalability impact, and diagnostics, over more PGO-based optimizations, but maybe it's not enticing enough without more performance improvement. -pgo=auto by default is a hard requirement, but maybe the next priority should be getting PGO over 5%.
- [Keith]: I agree build speed isn’t that important if people are building with PGO.
- [Austin]: more worried about scalability issues
- [MichaelP]: in some corner cases it can triple or quadruple your build time; we don't have a lot of detailed data on the slowdowns yet
- [Austin]: The danger is people will try it in 1.21 and find the build speed is untenable and back off. But we can recover from that through clear communication.
- [David] Build speed is not what people care most about according to the recent surveys.
- [MichaelP]: If we GA -pgo=auto by default, more people will be exposed to PGO. Though it’s only for binaries, not libraries.
- [Austin]: I’m not too worried about that because you can just delete the profile if the impact is too high.
- [MichaelK]: I'm not sure users really care much about build scalability until something falls over, so I suspect it's harder to ask about.
- [David]: How much does PGO help the compiler itself?
- [Cherry]: about 4.5% at last measurement.

Attendees:

Austin Clements
Carlos Amedee
Cherry Mui
David Chase
Dmitri Shuralyov
Eli Bendersky
Ian Lance Taylor
Keith Randall
Matthew Dempsky
Michael Knyszek
Michael Pratt
Robert Griesemer
Than McIntosh

thanm · 2023-03-07T20:20:10Z

2023-03-07

Go performance on arm64
- Context: google-internal experimental study running workloads on arm64, points out some places where the Go compiler could do better
- We’re really bad about writing things to the stack and then immediately reading them back (and then sometimes writing back to somewhere else on the stack) and it seems ARM machines are less forgiving of this.
  - [MichaelP]: I’ve been experimenting with marking regalloc “overhead” instructions. That might be useful here, too.
- arm64 branch predictor apparently not as strong as amd64, so it’s more sensitive to block layout and we’re not great at putting hot blocks in fall-through order.
  - Maybe CL 418555 (improve basic block layout) would help for arm64? It didn’t help much on amd64, and cost ~2% in binary size.
- [MichaelP]: Do they have specific benchmark targets?
- [MichaelK]: would it help to spin up an ARM64 performance builder? In theory all of the benchmarks should work. The dashboard would need to learn architectures.
- Also a non-ARM-specific inlining bug related to our heuristics for re-exporting inline bodies (which shallow export data would fix).
Status
- PGO GA: -pgo=auto by default
  - Multiple main packages work now. We just need to flip the default
- PGO GA: 5% performance gain
  - [MichaelP]: Prototyping in regalloc. Uber is making progress on devirt, but it’s not clear how close they are.
- LUCI POC
  - We keep adding functionality. So far things are looking “pretty good”. Pushing some functionality to after the go/no-go decision.
- Iterable proposals
  - [Ian]: It doesn’t look like “for range” will settle until 1.22.
- Improved type inference
  - [Ian]: “It’s much better.” We’ve agreed on our approach.
  - [Robert]: Code is much simpler and clearer and I fixed several bugs.
  - [Ian]: We don’t have a spec yet.
- Structured all.bash
  - test/run.go is now a normal Go test
- ARM64 assembler
  - [Cherry]: The CL is fine. The next step is for CL authors to write the code generator that turns their XML into the instruction table. Not blocked on us right now.
- Traceback iterator
  - [Austin]: Rewrite is complete. Test coverage is actually pretty good (yay runtime coverage!). Dealing with performance regression that appears to be from bad code gen (#58905 and an issue with returning medium-sized structs)
  - [Cherry]: I tried an optimization that will remove some of the copies.
- David's aligned/atomic slices/string/interface work:
  - Len/cap swap and 16-byte aligned slices, strings, and interfaces: space (RSS +2%, VM +1%, tile38p50 +2%, tile38p90 -1.7%, tile38p99 -4.9%) and time (geomean +1.5%)

Attendees:

Austin Clements
Carlos Amedee
Cherry Mui
Eli Bendersky
Ian Lance Taylor
Keith Randall
Michael Knyszek
Michael Pratt
Robert Griesemer
Than McIntosh

thanm · 2023-03-14T19:37:51Z

2023-03-14

First PGO result from a google-internal app yields -2.75% CPU 🎉
- used manually collected LBR profile converted to pprof
PII in PGO profiles
- Enterprise users have raised possible concerns about PII in profiles. Google doesn’t seem to think it’s a problem.
- [MichaelP]: Most of these concerns seemed to be coming from folks who hadn’t thought about it at all. Umbrella concern about bringing any data back from prod.
- [Austin]: It seems like the only solution to that is each company has to do their own privacy review.
- [MichaelP]: We could provide a tool to strip PII...? Maybe aggregation is enough?
- [Keith]: Would it help to tell people that Google does this all the time?
- [Austin]: considering reaching out to other google-internal folks on this, to find out how we arrived at our privacy position on continuous profiling
- [Cherry]: concerns seemed to be that they didn’t know what was in the profile. Maybe we just need to explain exactly what’s in there. And if google-internal uses of profiling for PGO went through privacy review, maybe we could use similar language.
- [Austin]: Such a doc would be useful if companies do need to do their own privacy review.
- [MichaelK]: just don’t see how you get PII into a profile. Everything going into it is just coming from the binary.
- [Austin]: You certainly can exfiltrate data in a profile if you try.
- [MichaelP]: Canonical example might be the presence (or relative hotness) of functions that give hints about what data is being processed. Example: “Render English/German/Spanish” functions. Now you can get frequency breakdown. I think it’s still a stretch.
- [MichaelP]: Several people on the EAB felt better when we communicated there’s no argument data in the profile.
- [Than]: Definitely emphasize to customers that we’re not putting anything special in the profile. And also that you can just look at the profile and see what’s in there.
- [Cherry]: Their concern isn’t just checking files into the build system, it’s extracting any data from prod. If we document what’s in the profile, and later want to extend what’s in the profile, that’s a hurdle.
- [MichaelK]: If they can’t get things back from prod, how do they debug things?
- [MichaelP]: At least one user told us they don’t profile in production just because they don’t let people connect to production and they don’t have continuous profiling.
PGO ease of use
- [Austin]: We could make continuous profiling easier.
- [MichaelP]: +1. My impression is that the biggest challenge for continuous profiling is the remote collection of profiles, which feels like something we can’t do a lot about.
Cgo, external linking, and weird flags
- rash of issues filed recently with common theme: building a Go program that depends on cgo-using stdlib packages (e.g. net, os/user), but with CGO_CFLAGS set to some unusual flag, results in link failure (internal linker doesn't understand resulting host objects due to flags)
- Where we are: do internal linking of std unless there are C flags we don’t support
- Are we happy with where we landed on inspecting the C compiler flags? CL 475375
  - [Than]: so far so good, ask again in a week
  - [Than]: There’s already been a request to back-port this [just switch to external linking if there are flags we don’t support] to 1.20. It’s a change to both the linker and cmd/go, so it’s not totally risk-free.
Status
- PGO GA: -pgo=auto by default
- waiting on code review from tools folks
- [Cherry]: CL ran into some issues I don’t quite understand. Asking Bryan to help with debugging. Otherwise in good shape.
PGO GA: 5% performance gain
- [MichaelP]: Prototyping regalloc. Nothing too notable yet. Uber will in theory be sending their prototype for PGO devirt to us this week.
Loop variable scoping GOEXP
- CL 472355 needs review
Structured all.bash
- [Austin]: Going to return to this now that traceback iterator is in
RAM efficiency prototyping
- Interest from David, Michael K (after wrapping up diagnostics)
Traceback iterator
- [Austin]: Done! Fun follow-up: print bottom 50 frames of deep stacks
Slice reorg/alignment work
- Status of 16-byte load/store?
- [Keith]: I have a CL for strings. It would be straightforward to also do for slices and interfaces.
No meeting next week (Go quiet week)

Attendees:

Austin Clements
Carlos Amedee
Cherry Mui
Eli Bendersky
Ian Lance Taylor
Keith Randall
Matthew Dempsky
Michael Knyszek
Michael Pratt
Robert Griesemer
Than McIntosh

thanm · 2023-03-28T18:48:48Z

These compiler and runtime meeting minutes are a snapshot of what was discussed in an internal Google meeting. Notes on agenda items that address Google specific needs are elided. Any comments on this issue will be removed, and discussion about topics raised should be taken to the appropriate mailing list. Additionally, any feedback or suggestions for additions to the notes should be handled there as well.

2023-03-28

FYI
- proposal: all: add GOOS=wasip1 GOARCH=wasm port #58141 accepted
- proposal: build: use a zero for third digit for major release, such as 'go1.21.0' #57631 accepted
- Russ is starting to ask publicly if we should drop Plan 9
  - Making Plan 9 “out of tree” would be ideal, except we don’t really have a way to do that.
  - There are two amd64 versions of Plan 9, and the one we have a builder for is the less-maintained one.
- Optimization guide
  - Austin’s thoughts
    - Give a high level mental model of how Go code executes. Try to avoid details that are likely to be fragile, and be clear that this area is always evolving.
    - How data structures work. E.g., slices and maps are reference data structures, so passing them is O(1) and you don’t have to pass a pointer to a slice or a map.
    - How to use performance tools. Flow chart of performance evaluation (writing this might help us improve the tools, too 🙂). Don’t prematurely optimize, but keep the basic execution model in mind so you don’t have an even layer of performance problems spread across your code.
    - Model of escape analysis to avoid allocations.
    - Export some of our philosophy, e.g., fast/slow path patterns for inlining
  - [MichaelK]: The GC guide takes a stab at this.
  - [MichaelK]: Should we merge the GC optimization guide into whatever new thing we make?
  - [MichaelK]: Should we use this opportunity to merge other scattered sources, like the Wiki pages?
    - https://github.com/golang/go/wiki/Performance (Go developer performance guide by Dmitry); out of date, but it’s a starting point.
    - https://github.com/golang/go/wiki/CompilerOptimizations
    - https://github.com/golang/go/wiki/SliceTricks (maybe?)
    - https://github.com/golang/go/wiki/BoundingResourceUse
    - https://go.dev/doc/diagnostics
  - [Ian]: Should we also be thinking about a debugging guide? We have a bunch of different docs, but nothing unified.
    - [Keith]: That depends a lot on your IDE
  - [MichaelK]: As we work on tracing/etc, I plan to have a guide on how to use those tools.
  - [MichaelK]: I got feedback on the GC guide that people really just want “performance tips”, not a “mental model.”
  - [MichaelP]: “My program is slow and I need to fix it before my deadline.” Talk about the tools a lot and how to use them. That gives people actionable things.
  - [Austin]: Flow chart.
  - [Keith]: Another thing that would be nice about having a guide is that there’s all kinds of misinformation/anecdotes out there. It’s not necessarily something people would read, but something we could point them to.
Status
- PGO GA: -pgo=auto by default
  - Done!
- PGO GA: 5% performance gain
  - [MichaelP]: Working on some experiments. Nothing to send out yet. I found lots of interesting problems with regalloc, but fixing individual ones hasn’t made a big difference.
- Improved type inference
  - [Ian]: We have a proposal (incoming to committee). Robert has implemented it. It’s basically the same as the old type inference, but a new perspective on it. We’re also working on new forms of type inference, which will need a proposal: able to infer type arguments based on assignment.
- Inlining overhaul: Design
  - [Matthew]: Than and I had a meeting to discuss and start on a design doc.
- ARM64 assembler
  - [Cherry]: CL hasn’t been updated since last time.
- init ordering fix
  - issues with flags.Set and code that relies on specific order
  - [Keith]: work needed to clean up google-internal codebase here
  - [Austin]: Can a vet check or API change do anything about this?
    - [Keith]: Existing code will still break with an API change.

Attendees:

Austin Clements
Carlos Amedee
Cherry Mui
Eli Bendersky
Ian Lance Taylor
Keith Randall
Matthew Dempsky
Michael Knyszek
Michael Pratt
Than McIntosh

thanm · 2023-04-04T19:49:28Z

2023-04-04

FYI
- Friction fixit announced for April 17–21.
- Reminder: H1 team summit scheduled for April 25–27.
- Minor releases went out today!
discussion of google-internal bugs relating to health-check failures for some large Go applications
Status
- PGO GA: 5% performance gain
  - [MichaelP]: Still working on various prototypes of PGO stuff and just general performance. Have a CL to send out for PGO. Right now we increase the budget for inlining, but we don't limit the budget if the function is big. Oversight, we didn't intend this. Adding that limit back in gives us back 0.5%.
- Loop variable scoping GOEXP
  - [David]: GOEXPERIMENT is all in. Cleaned up a bunch of things, brought back an optimization. Improved the automated loop finder but we need to decide what to do with it for release. Currently just sitting there.
  - [Michael]: what needs to happen next?
  - [David]: The question is what the interface should be. Down to the CLI. How chatty should it be?
- Improved type inference
  - [Ian]: Proposal's in the review committee. Robert sent out another proposal for type inference. Probably for 1.21. Infers assignment of a generic function to a variable whose type is known. Useful for function calls. https://go.dev/issue/59338
- ARM64 assembler
  - [Cherry]: more CLs have been sent in, unclear what the plan is with the new stuff or how it relates to the refactoring. Will ask the ARM folks what they plan to do.
- watchflakes
  - [Cherry]: Made some improvements, works relatively well now. Not productionized yet.
- Init ordering fix
  - [Than]: blocked on some google-internal changes. Waiting for the google-internal team to give the all clear.
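
The inference described in the type inference item above (assigning a generic function to a variable whose type is known) might look like the following sketch. It assumes a toolchain in which the proposal has landed (Go 1.21 or later); `Identity` and `Apply` are hypothetical names used only for illustration.

```go
package main

import "fmt"

// Identity is a hypothetical generic function used only for illustration.
func Identity[T any](v T) T { return v }

// Apply is a hypothetical helper that takes a non-generic function value.
func Apply(f func(int) int, v int) int { return f(v) }

func main() {
	// Assignment to a variable of known function type:
	// T is inferred as string from the type of f.
	var f func(string) string = Identity
	fmt.Println(f("hello")) // hello

	// Passing a generic function as an argument:
	// T is inferred as int from Apply's parameter type.
	fmt.Println(Apply(Identity, 7)) // 7
}
```

Before this change, both uses would have needed an explicit instantiation such as `Identity[string]`.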

Attendees:

Carlos Amedee
Cherry Mui
David Chase
Dmitri Shuralyov
Ian Lance Taylor
Matthew Dempsky
Michael Knyszek
Michael Pratt
Than McIntosh

thanm · 2023-04-18T19:15:08Z

2023-04-18

FYI
- Accepted: all: add opt-in transparent telemetry to Go toolchain #58894
- Accepted: spec: a general approach to type inference #58650
  - [Ian]: It’s exactly the same as what we had before. Just a better way of thinking about it.
  - [Robert]: And a much cleaner implementation. Implementation complete and landed.
  - [Robert]: Another active proposal: inference of type arguments for functions you’re passing to another function, or assigning, or returning. Any time you’re assigning, you get “reverse” inference. Likely accept, and close to a complete implementation.
- Accepted: runtime: use WER for GOTRACEBACK=wer on Windows #57441
- all: add GOOS=wasip1 GOARCH=wasm port #58141 is kind of working already
- Newly active proposal: unsafe: allow conversion of uintptr to unsafe.Pointer when it points to non-Go memory #58625
- discussion of some performance profiles of the Go linker when linking google-internal programs (may be some speedup opportunities)
Discussion thread
- ~1 month 'til the freeze.
- Active issues
  - problems with additional closure inlining
    - [Than] all.bash has been pretty inadequate for detecting problems in this; the pattern lately has been that builders and trybots pass, but things still fail for google-internal apps
  - [MichaelP] working on a google-internal problem that seems to be scheduler related. The task is getting stuck for a long time.
  - [MichaelP] also working on another issue related to long GC pauses in another google-internal app. Maybe CPU throttling?
    - [MichaelK] The current theory is that a thread holding a sync.Mutex is getting throttled, causing contention metrics in lock profiles to rise.
  - [MichaelP] Keeping M with cgo threads is on its second revert now. The new problem is TSAN thinks there’s a race in runtime C code because it doesn’t see synchronization that happens in Go code. (I’m 90% sure there isn’t a real race.)
- Loop variable scoping GOEXPERIMENT cmd/compile: add GOEXPERIMENT=loopvar #57969
  - Compiler status? Tooling/ecosystem status?
    - [David]: Compiler is done. But we’re not banging hard on the experiment. For the tooling (which can be done partly during the freeze), we have all the parts, but the packaging isn’t up to snuff. We have hash searching. Some people worry this change will be bad for them, either in performance or semantics. We have a thing that matches compiler optimizer logging with performance data (we might need to tweak optimizer logging a bit).
    - [Austin]: Is this going to need IDE integration?
    - [David]: Not sure how bug finder would fit in
    - [Matthew]: Is the bug finder something IDEs could just integrate into however they normally run tests or user commands?
    - [David]: The hash search does work very much like a “go test” wrapper.
- PGO
  - Pprof proposal: add Discriminator field to Line message google/pprof#768
  - Uber CL http://go.dev/cl/484838 to do PGO-based indirect call specialization (thanks mdempsky@ for making a pass)
    - [Matthew]: One limitation was that it looks like they’re trying to rewrite whole statements rather than reaching into expressions. I think we could do it in a cleaner, more general way.
  - Not a lot of progress on reaching 5% 🙁
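
For the loop variable scoping item above, the kind of code whose meaning depends on the experiment is a closure that captures the loop variable. The sketch below uses an explicit per-iteration copy, so it behaves the same with or without GOEXPERIMENT=loopvar; `collect` is a hypothetical name for this example.

```go
package main

import "fmt"

// collect builds one closure per iteration. The explicit copy "i := i" gives
// each closure its own variable, independent of the loop variable semantics
// selected by the toolchain.
func collect(n int) []func() int {
	var fns []func() int
	for i := 0; i < n; i++ {
		i := i // per-iteration copy captured by the closure below
		fns = append(fns, func() int { return i })
	}
	return fns
}

func main() {
	for _, f := range collect(3) {
		fmt.Println(f()) // prints 0, 1, 2 on separate lines
	}
}
```

Without the copy, the closures would share one variable under the old semantics and each get their own under the experiment, which is the class of difference the bisection tooling discussed above is meant to locate.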

Attendees:

Carlos Amedee
Austin Clements
Cherry Mui
David Chase
Dmitri Shuralyov
Eli Bendersky
Ian Lance Taylor
Keith Randall
Matthew Dempsky
Michael Knyszek
Michael Pratt
Robert Griesemer
Than McIntosh

thanm · 2023-05-02T18:51:27Z

2023-05-02

[Note: no meeting last week due to Go team internal summit]
Active issues
- Additional closure inlining (original CL https://go.dev/cl/482356 , since rolled back a couple of times)
  - Is this worth fixing at this point if we’re going to rewrite the inliner?
  - [Than]: It’s making the inliner behave in a more rational way and it’s better to make the inliner less weird, even if we’re rewriting the inliner.
  - [Than]: current blocker is definitely a problem with escape analysis
  - Matthew and/or Cherry can help look
- issue runtime: mayMoreStackPreempt failures #55160
  - mayMoreStack instrumentation is added in obj.
  - perhaps David can run a GOSSAHASH bisection.
Status
- PGO 5%
  - [MichaelP] working through Uber’s specialization CL. I’ve rewritten most of it at this point. It should apply a lot more now.
- Loop variable scoping
  - [David] Compiler work is done.
  - David and Russ working on user bisection tooling. Might need some improvements to compiler diagnostic output.
  - For 1.22: need to key off -lang version
  - [David]: already works with cross-package inlining (you can turn it on with a -gcflag)
- LUCI POC
  - We made the “go” decision on LUCI at the summit.
  - [MichaelK]: We’re focusing on pre-submit first.
- Improved type inference algorithm
  - [Robert]: Spec work needs to be done (probably during the freeze). Ian and I have some idea how to describe it. Still some work on implementation: Today we can’t pass a partially instantiated generic function to another function f(g[T]) where g is g[T,U] and U can be inferred, but I don’t think that’s a show-stopper.
- Inlining overhaul: Design
  - Than still working on closure inlining bug; Matthew has been busy with PGO and helping with type inference.
- Structured all.bash
  - [MichaelK]: The sooner we get this, the easier LUCI test sharding/merging will be. We’re already using -json for the subrepos.
- ARM64 assembler
  - [Cherry]: No updates from contributor, things may be stalled
- watchflakes improvements
  - [Austin] Given LUCI, what’s the future of watchflakes?
  - [Michael]: We should look at the gaps between LUCI Analysis and watchflakes. LUCI Analysis doesn’t integrate at all with GitHub issues, but maybe that doesn’t matter because it has its own UI? We might be able to add GitHub integration.
  - [Cherry]: There are a few things I still want to improve. E.g., subrepos.
- Pinner
  - [MichaelK]: CL looks good and just needs a lot more commenting. There are a lot of almost races that are okay if you do them in the right way. Seems on track to land for 1.21.

Attendees:

Carlos Amedee
Austin Clements
Cherry Mui
David Chase
Ian Lance Taylor
Keith Randall
Matthew Dempsky
Michael Knyszek
Michael Pratt
Robert Griesemer
Than McIntosh

thanm · 2023-05-30T18:57:37Z

2023-05-30

AIs from last week
- init ordering spec update
  - Robert to work on this
- compiler error URLs
  - Carlos + Robert: plan is to delay error URLs until 1.22
  - discussed with Russ, who feels that we should have one page per category of error (rather than for each really specific error)
  - code is in for 1.21 but just flagged off
- Reverse type inference
  - [Robert] spec change work for reverse type inference underway
  - [Robert] Also made another proposal for a separate inference enhancement (interface-common-methods). The proposal is “active” now. Hopefully it will move to likely accept this week and then accept next week.
Please write release notes!
- PGO and PGO-devirt underway
- needm change underway
- GOEXPERIMENT=loopvar and bisect command
  - Russ to write?
- various type inference changes (Robert to write)
Discussion of SCCP, its utility and relation to the prove pass
- [Keith]: CL for it went in, there was a silly problem, new CL will go into 1.22.
- [Keith]: An example where SCCP helps where prove does not:
  - You allocate a constant-size, constant-cap slice of 0 bytes.
  - You append a bunch of constant-sized slices.
  - It eliminates the first growslice, then the phi on the cap,
    and then that lets it eliminate the next growslice, etc.
  - prove can do the first one, but not the rest because it doesn’t do dead code elimination while doing prove.
  - SCCP is somewhat redundant with prove, but it has a niche where it’s useful.
  - also seems pretty cheap to run when it doesn’t find optimizations.
- [Cherry]: What does prove do that SCCP doesn’t?
  - [Keith]: Iteration variable in-range stuff. SCCP only does constants.
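
The source pattern Keith describes (a slice with constant length and capacity, followed by constant-sized appends) could look like the sketch below. This only shows the shape of the input code; whether a given compiler version actually removes the growslice calls here is not something the example demonstrates. `build` is a hypothetical name.

```go
package main

import "fmt"

// build starts from a zero-length slice with a constant capacity and appends
// constant-sized pieces. Every length and capacity along the way is a
// compile-time constant, so in principle none of the appends needs to grow.
func build() []byte {
	b := make([]byte, 0, 16) // constant length 0, constant capacity 16
	b = append(b, "abcd"...) // length 4, still within capacity
	b = append(b, "efgh"...) // length 8
	b = append(b, "ijkl"...) // length 12
	return b
}

func main() {
	b := build()
	fmt.Println(string(b), len(b), cap(b)) // abcdefghijkl 12 16
}
```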
Should we move runtime to internal/runtime so things like reflect and sync can just call functions in it rather than linkname’ing? This would propagate escape information and allow inlining.
- [David]: I think it would improve the type refactoring. There are places where I stopped because I would have had to add more linknames.
- [Keith] The runtime package would have trampolines into the internal APIs. That would require more inlining, but that’s fine.
- [Austin]: A lot of the exported API isn’t called by runtime itself, so it could stay in runtime just like it is now.
- [MichaelP]: There are very few exported APIs in runtime that are performance-sensitive.
- [Matthew]: A lot of external users will complain about linknames breaking.
- [MichaelP]: git rebase handles renames pretty well.
- [Austin]: This is the sort of thing we’d do during early tree open.
- [Austin]: We’d have to leave some symbol names alone. E.g., enter/exitsyscall are linknamed out to x/sys.
- [Austin]: It would affect symbol names in tracebacks. That might increase binary size.
- [Austin]: It’ll cause churn. Is it worth it?
  - [David]: I’m in favor
  - [Cherry]: I’m in favor. I can remove all the weird escape tags from reflect.
PGO PPC compiler bug is still at large
- [David] this is not an easy bug to work on
Closure symbol naming
- [Matthew]: I’m pretty confident we can just reuse the underlying closure definition. I’ve thought about the ways we could optimize them differently.
- [Austin]: out of time, let's talk more about this next meeting

Attendees:

Carlos Amedee
Austin Clements
Cherry Mui
David Chase
Keith Randall
Matthew Dempsky
Michael Pratt
Robert Griesemer
Than McIntosh

thanm · 2023-06-27T19:07:17Z

2023-06-27

Nice improvement to link time and RAM from Cherry’s recently landed changes (approx May 2). Numbers: Frontend Kubelet, similar numbers seen for google internal applications.
Release items
- Review of release-blocking and non-release-blocking Go 1.21 issues.
- Review of pending 1.21 blog posts
  - Type inference improvements
  - Loop var semantics change experiment
  - slog
  - wasip1 target
  - PGO GA (can be a short announcement)
  - Backward/forward compatibility
- Discussion: should we submit CL 503795 for 1.21, the s==s optimization that helps generic string sort?
  - No. Target 1.22.
- [Robert] spec changes in the works.
LUCI
- [Eli]: Heschi feels this isn’t moving along quickly enough. If people have cycles to contribute before EOY, that would help.
ARM64 assembler
- [Cherry] They just sent a HUGE CL. So things are happening it looks like.
watchflakes improvements
- [Cherry]: Basically done. Just “bugs” at this point.
watchflakes: move to cloud
- [Cherry] future of this is unclear with LUCI
Explore MacService integration
- [MichaelP]: They’ve started on this.
- [MichaelK]: We may end up using ChromeLabs Macs.
SwissTable
- [MichaelP]: Google3 may want to do some benchmarking on this. Maps are used heavily in google3. This may be of value to us.
- So this may be worth investing in.
Planning H2 scratch
- Go Core planning 2023
  - Aim to accept iterator proposal in time for 1.22
    - [Ian]: We want to get user-defined for-range loops in first. If that goes in, we’ll do an updated iterator proposal.
  - Consider landing x/exp/slices (and maybe maps too?) in stdlib
    - Done in 1.21.
PGO devirt follow-up
min/max codegen optimization
5% performance improvement
- [MichaelK]: Some things just aren’t going to benefit as much from PGO.
- [MichaelP]: About half of Sweet benchmarks basically don’t use indirect calls, so devirt doesn’t help those.

thanm · 2023-10-24T18:52:08Z

2023-10-24

Go Team Summit (Google-internal) took place last week; lots of good information exchange
we are now <1 month before freeze (freeze begins Nov. 21)
Likely accepted proposal: spec: add range over int, range over func #61405
- Specifically, range-over-int for 1.22, and range-over-func behind GOEXPERIMENT=rangefunc for 1.22.
proposal: testing: add Keep, to force evaluation in benchmarks #61179
- On hold pending a prototype implementation in the compiler.
- Keith: Anyone working on the prototype?
- MichaelK: it is on hold anyway so probably too late for 1.22. We have plenty of time for 1.23.
- Keith: it is possible to just add testing.Keep without any compiler changes. Downside is people may add testing.Keep in too many places.
- MichaelK: should it be two proposals? One for adding testing.Keep, one for auto-Keep?
- Keith(AI): post on the issue to clarify.
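
Keith’s point that testing.Keep could be added without compiler changes can be sketched as follows. This is one possible shape, not the proposal’s implementation: `Keep`, `sink`, and `fib` are hypothetical names, and the approach shown (a non-inlined function that stores its argument in a package-level variable) is an assumption about how such a helper might be written in plain Go.

```go
package main

import "fmt"

// sink is a package-level variable. Storing a result here makes the value
// observable, so the computation that produced it cannot be discarded.
var sink any

// Keep is a hypothetical stand-in for the proposed testing.Keep. It is marked
// noinline so the call itself stays in the benchmark loop.
//
//go:noinline
func Keep[T any](v T) T {
	sink = v
	return v
}

// fib is the function under measurement in this sketch.
func fib(n int) int {
	if n < 2 {
		return n
	}
	return fib(n-1) + fib(n-2)
}

func main() {
	const iterations = 1000 // stands in for b.N in a real benchmark loop
	for i := 0; i < iterations; i++ {
		Keep(fib(10))
	}
	fmt.Println(sink) // 55
}
```

The downside raised in the discussion still applies to a helper like this: nothing stops users from wrapping far more expressions in `Keep` than necessary.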
Discussion of google-internal applications
MichaelK: still working on tracer.
- Close to production ready, one issue with cgo.
- goal is to land in 1.22 behind a GOEXPERIMENT, maybe on by default.
- Already tested against Sweet; works well.
Do we want to look over “os: make use of pidfd for linux #62654”?
- Michael Pratt will review CLs.
- There is a new Linux API to query the process status without a race. The CLs are going to use that.
Range-over-func tasks
- Change GOEXPERIMENT name to “rangefunc”
- Yield checking
- Named result details
  - Matthew: when a return statement updates named result values. There is a straightforward solution and we should do that.
- David: is Russ doing this, or is Russ busy?
- Cherry: how much of this needs to be in for 1.22?
- Matthew: I will handle the named result details.
Q: range-over-int fully implemented?
- Matthew: believe it is done, will double check. go/types support is also done. Maybe go/ssa needs support.
- Matthew: currently behind GOEXPERIMENT=range. Will make it on by default.
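
As background for the range-over-func tasks above, the sketch below calls a push-style iterator function directly, which works on any Go version. Under the proposal, a loop of the form `for v := range seq` would drive `seq` in roughly this way, with `break` corresponding to the yield function returning false. `countTo` and `sumBelow` are hypothetical names for this example.

```go
package main

import "fmt"

// countTo returns a push-style iterator: it calls yield for 0..n-1 and stops
// early as soon as yield returns false.
func countTo(n int) func(yield func(int) bool) {
	return func(yield func(int) bool) {
		for i := 0; i < n; i++ {
			if !yield(i) {
				return
			}
		}
	}
}

// sumBelow drives the iterator by hand and stops at the first value >= limit.
func sumBelow(seq func(func(int) bool), limit int) int {
	sum := 0
	seq(func(v int) bool {
		if v >= limit {
			return false // plays the role of "break" in a range loop
		}
		sum += v
		return true // keep iterating
	})
	return sum
}

func main() {
	fmt.Println(sumBelow(countTo(10), 4)) // 0+1+2+3 = 6
}
```

The yield checking task concerns iterators that keep calling yield after it has returned false, which `countTo` avoids by returning immediately.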
Keith: I have a CL to change GC work buffers from gray to white. Work buffer will mean "to be marked and scanned". The benefit will be saving a lookup.
- David: will this cause more things in the work buffer?
- Keith: yes
- MichaelK: Austin tried something like this a while back and it is somewhat a wash?

Attendees:

Cherry Mui
David Chase
Eli Bendersky
Matthew Dempsky
Michael Knyszek
Michael Pratt
Than McIntosh

thanm · 2023-10-31T18:55:43Z

2023-10-31

Happy Halloween
FYI
- Likely accepted proposal: unique: new package with unique.Handle #62483
- Likely accepted proposal: all: add GORISCV64 environment variable #61476
  - add GORISCV64=rva20u64,rva22u64,rva23u64, not emitting C instructions until dust settles
Range-over-int and range-over-func
- proposal accepted
- Change GOEXPERIMENT name to “rangefunc”, bring range-over-int out of experiment (Cherry to look into this)
- Yield checking for range-over-func (David to look into this)
- Spec change of range-over-int (Robert to look into this)
Status
- PGO
  - Michael Pratt: Prototype for function value devirt. Have it implemented, will send it out. Delayed by LUCI work but going fairly well.
  - Scalable builds: CL from Uber that adds a new command to do preprocessing for ingestion by the go command. Might not get around to landing integration of that for Go 1.22.
  - Method devirt CLs are already in, got 0.5% of performance gain.
  - No numbers yet for function value devirt. The prototype has a bunch of limitations.
- Loop scoping
  - David: Talked to Tim King, working on getting the SSA semantics right in the downstream ssa package. Should that happen before the freeze?
  - Cherry: Maybe doesn't need to be, but pending proposals for go/types and the new go/versions package.
  - Robert: go/types proposal expected to be accepted soon. Needs prose in the spec, I'll work on it.
- Inlining
  - Than McIntosh: Working on tuning the score adjustments to try and improve performance numbers.
  - Ran compilebench over the weekend with the new inliner. The compile time has gotten a good deal worse from new heuristics. Slowdown due to heavy use of ir.StaticValue in new heuristics. Looking at a more efficient replacement.
  - Cherry Mui: What does ir.StaticValue do?
  - Than McIntosh: detects cases where you assign a local variable "x := 1", then never re-assign "x". In such cases if the RHS is invariant, you can then substitute in the RHS where you see uses of the var.
  - Keith: There's probably a flag on variables on whether it has been reassigned. That might help to avoid recomputation.
  - Than McIntosh: concern for the flag is that it could get stale
  - Matthew: Main complexity we've had is that if we insert new IR concepts during inlining, we make changes that could affect StaticValue. So we do a failsafe that checks all the conditions again to make sure everything makes sense. It should be possible to track that more strictly to avoid the checks.
  - Any risk for finishing new inliner before Go 1.22? Are you considering downscoping?
  - Than McIntosh: Doesn't make sense to enable experiment in current state. Still needs work. Compile times are too high and the runtime improvement is too low. 1-2% improvement. Compile time is up 8-9%. For the compiler's SSA package it's 23% slower. The slowdown is mostly not due to additional inlines, but heuristic computation. Pessimistic that we'll ship anything for Go 1.22.
  - Matthew Dempsky: If hypothetically StaticValue didn't have any overhead, would we be good to ship? We should be able to speed it up. If that's what's keeping us from landing a 1-2% improvement then maybe we can just fix it.
  - Than McIntosh: We'll have to check.
  - Matthew Dempsky: Working on inliner scheduling; should have CLs soon. Don't know what the performance impact of this will be. Filippo seems to believe it'll have a high impact.
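
The property Than describes for ir.StaticValue (a local assigned once and never reassigned) can be shown at the source level. The sketch below only illustrates the two situations in user code; it says nothing about how the compiler implements the check. All function names are hypothetical.

```go
package main

import "fmt"

func double(v int) int { return 2 * v }
func triple(v int) int { return 3 * v }

// fixed assigns f exactly once and never reassigns it, so every use of f can
// be treated as a use of double.
func fixed(v int) int {
	f := double
	return f(v)
}

// reassigned may change f after its definition, so uses of f cannot simply be
// replaced by the initial right-hand side.
func reassigned(v int, pick bool) int {
	f := double
	if pick {
		f = triple
	}
	return f(v)
}

func main() {
	fmt.Println(fixed(3))             // 6
	fmt.Println(reassigned(3, false)) // 6
	fmt.Println(reassigned(3, true))  // 9
}
```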
Tracer work
- Michael Knyszek: in good shape, last corner case is cgo callbacks. Getting Michael Pratt's review. Blocked on cmd/trace update, Felix from DataDog is working on cmd/trace integration. Michael Knyszek as a backup. On track for on by default in 1.22.
Allocation headers CLs
- Michael Knyszek: support arenas now, behind a GOEXPERIMENT. Cherry Mui and Keith Randall are reviewing. A concern is that the header is at the beginning of the allocation, shifting the allocated pointer by 8 bytes, so the allocation is no longer 16-byte aligned; unsure whether that is a concern or to what degree.
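
The alignment concern above comes down to simple address arithmetic, sketched here. The slot address is a made-up value and `headerSize`, `objectAddr`, and `aligned` are hypothetical names; only the 8-byte header size is taken from the notes.

```go
package main

import "fmt"

// headerSize is the header size in bytes mentioned in the notes above.
const headerSize = 8

// objectAddr returns the address handed to the user when a header of
// headerSize bytes sits at the start of the slot.
func objectAddr(slot uintptr) uintptr { return slot + headerSize }

// aligned reports whether addr is a multiple of align (a power of two).
func aligned(addr, align uintptr) bool { return addr&(align-1) == 0 }

func main() {
	slot := uintptr(0x1000) // hypothetical 16-byte-aligned slot address
	p := objectAddr(slot)
	fmt.Println(aligned(slot, 16), aligned(p, 16), aligned(p, 8)) // true false true
}
```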

Attendees:

Carlos Amedee
Cherry Mui
David Chase
Eli Bendersky
Matthew Dempsky
Michael Knyszek
Michael Pratt
Robert Griesemer
Than McIntosh

thanm · 2023-11-07T19:47:39Z

2023-11-07

Discussion of Keith's demo on faster type switches and asserts
- Keith is considering doing some version of the talk for Gophercon
- Than: did you use any other benchmarks?
- Keith: no. Other benchmarks don't use type switches much.
FYI
- Likely accepted proposal: cmd/compile: create GOARCH=wasm32 #63131
  - this is adding a new GOARCH, not changing the current GOARCH=wasm
  - there is a 64-bit wasm VM, probably not widely used
  - this is proposed by the WASI folks
  - Matthew: sandboxing 32-bit programs may be easier
- Likely accepted proposal: sync/atomic: add Uint64Pair #61236
  - MichaelK: is this 16-byte aligned? e.g.
    "the compiler and toolchain will provide the necessary alignment automatically"
  - Matthew: pad the type size to at least 32 bytes?
  - MichaelK: the allocation headers will break it (16-byte aligned for object >= 32 bytes), unless it is noscan
  - Matthew: need type checker support. As we do more and more aligned types, it may be a good idea to revisit how we do alignments
  - Cherry: SGTM. also as we may do SIMD in the future
  - Keith: the stack is not aligned, so need to escape to the heap
  - MichaelK: we should consider doing the AlignedTo types soon. Otherwise users would do "_ [0]atomic.Uint64Pair". If there is already a way, we may as well give users a better way to specify it.
  - Cherry: for only atomics, it may not matter much for stack locals, as long as the instruction doesn't fault
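
The zero-length array idiom MichaelK mentions can be demonstrated with an existing type. Since atomic.Uint64Pair is only proposed, the sketch below uses `[0]uint64` instead; the struct and function names are hypothetical. It relies only on the spec’s rules that a struct’s alignment is the largest alignment of its fields and that an array has the alignment of its element type.

```go
package main

import (
	"fmt"
	"unsafe"
)

// plain has only byte fields, so its alignment is that of byte.
type plain struct {
	a, b byte
}

// padded adds a zero-length array field. The field holds no data, but it
// raises the struct's alignment to that of uint64.
type padded struct {
	_    [0]uint64
	a, b byte
}

// The helpers below wrap unsafe.Alignof so callers do not need to import unsafe.
func alignOfPlain() uintptr  { return unsafe.Alignof(plain{}) }
func alignOfPadded() uintptr { return unsafe.Alignof(padded{}) }
func alignOfUint64() uintptr { return unsafe.Alignof(uint64(0)) }

func main() {
	fmt.Println(alignOfPlain())                     // 1
	fmt.Println(alignOfPadded() == alignOfUint64()) // true
}
```

The idiom only borrows whatever alignment the element type already has, so it would help with a 16-byte requirement only if the element type itself were 16-byte aligned, which is the type checker support Matthew refers to.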
1.22 stuff
- Robert: a fun CL 539299, math/big: implement Rat.FloatPrec
  not urgent but want to get in for 1.22
  - Cherry will review
- execution tracer work:
  - MichaelK: everything put up for review is pretty close to landing
  - new tracer will land pretty soon, but enabling by default is blocked by cmd/trace support. cmd/trace code has no abstraction. Working on it. Need review.
  - What about a release exception, just in case? The freeze exception would include enabling new tracer by default.
  - MichaelP: happy to review cmd/trace CLs. Release exception sounds fine, but should be done soon.
  - MichaelK: Felix has a working implementation, but without all features. There is a good chance it will be done by the freeze.
  - Matthew: freeze exception is up to the release team. My impression is that they've been increasingly strict about this.
  - Matthew: could the cmd/trace development be done in a branch or a separate repo, so power users could pull it in?
  - MichaelK: if new tracer is not on by default, it will be a hurdle for people to use it.
- Inliner work
  - Than: CL reviews are a little stalled but will get them reviewed in the next two weeks
  - Matthew will review

Attendees:

Carlos Amedee
Cherry Mui
David Chase
Eli Bendersky
Keith Randall
Matthew Dempsky
Michael Knyszek
Michael Pratt
Robert Griesemer
Than McIntosh

thanm · 2023-11-14T19:58:27Z

2023-11-14

One week before the release freeze(!)
FYI
- Likely accepted proposal: go/types: API changes to support explicit Alias nodes #63223
  - go/types Alias API → alias with type parameter → iterators
  - [Robert]: code is already checked in
- Likely accepted proposal: runtime: software floating point for GOARM=6, 7 (not only GOARM=5) #61588
- Accepted proposal: cmd/go: add support for dealing with flaky tests #62244
1.22 status
- new inliner: should we enable GOEXPERIMENT=newinliner by default to turn on the new inlining heuristics in 1.22?
  - [Than] on positive side, provides some modest performance improvements, with acceptable compiler speed overhead, and google-internal testing runs have been clean, so risk appears low.
  - [Than]: inlining diagnostics messages change, needs to be communicated
  - [Than]: leaning towards enabling by default, with a more "toned down" announcement to moderate user expectations
  - [Matthew]: go ahead and ship it if there is an overall performance improvement. turn it on in RC. if there are many bugs we can turn it back off
  - [David]: need to decide how to handle any changes for the LSP JSON logging users. The plan is to keep the old words in the old version, enable the new words in new version
  - [Matthew]: what sort of stability guarantees are we promising for LSP users?
  - [David]: we are not supposed to surprise users. If we change something, we change the version number
  - [Matthew]: what exactly does "surprise" mean here
  - [David]: we intend to not change the words
  - [Cherry]: version number is specified by the tool?
  - [David]: yes. if/when we let them know and they can change version
  - [MichaelP]: how do we keep the old words? "can inline" means inline everywhere, but that will not be true
  - [David/Than]: this is happening already for large functions (ex: for a callee of size 79 we'll advertise it as inlinable but then not inline it in a big func)
  - [David]: sounds like consensus is to leave JSON as is
  - [Eli]: TGP status for Matthew's CLs?
  - [Matthew]: not yet
- range over function check
  - [David]: once it gets reviewed it can go in. overhead is high in some cases, very small with very cheap iterators. Plan to have a thorough check, no optimization, but have a compiler debug flag to turn the checks off. Performance is generally not bad.
  - CLs are under review
  - semantics for named results
    - [Matthew]: fine for now for experiment, will clarify when turning it on by default
    - [Matthew]: will find test cases
    - [Cherry]: documenting the caveat
- allocation headers
  - [MichaelK]: allocation header is in and enabled. google-internal testing is clean. issue tracker is clean. not yet released into google-internal prod
  - [MichaelK]: tracer: a lot is in, may not need freeze exception

Attendees:

Carlos Amedee
Cherry Mui
David Chase
Eli Bendersky
Keith Randall
Matthew Dempsky
Michael Knyszek
Michael Pratt
Robert Griesemer
Than McIntosh

thanm · 2023-11-21T19:54:53Z

2023-11-21

Go 1.22 freeze today!
performance comparison Go 1.20 vs tip+PGO
- -4.59% geomean on Sweet
- BiogoKrishna +20% regression
- [Matthew]: does Sweet run old/new configs interleaved
- [Michael]: yes, interleaved (but not randomized)
- [Michael]: BiogoKrishna is not a representative benchmark. It is built with some nonpreemptible loops and a lot of bit arithmetic and table lookups. The reason we keep it is that it is a good preemption benchmark. It can be sensitive to code layout.
- [Michael]: cool debug trick on the dashboard
Please write release notes (doc: write release notes for Go 1.22 #61422)
Go 1.22 issues
2024 topics
- KSLO cost reduction
- Go on future platforms (RAM efficiency. NUMA?)
- (maybe) Go-Python interop for AI-powered applications
- [David]: is it a good idea to use cgo for Go-Python interop?
- [Michael]: no. better with pipe or RPC (EDIT: I'm wrong about this. A conversation after the meeting clarified a misunderstanding I had about Go->Python calls specifically. Both Go->Python and Python->Go with cgo may very well be preferable to pipe/RPC in many circumstances. :))
- [Michael]: excited with RAM efficiency work. Not sure if NUMA is the right target. Austin's analysis shows memory bandwidth is going to be an issue.
  User survey: impact vs effort mapping graph of the potential optimizations?
- [Michael]: intended to do another session with Alice with user survey for 2024 priority. Austin has a good start for RAM efficiency things
- [discussions about project prioritization]
- [Michael]: go.dev/fast is under RAM efficiency?
- [Eli] documentation work is its own thing. Sounds important, but don't spend too much time. New tracing work is also in this direction.

thanm · 2023-11-28T19:49:50Z

2023-11-28

Go 1.22 RC1 is scheduled on Dec 12 (2 weeks from now)
Please write release notes (doc: write release notes for Go 1.22 #61422)
- [Keith] there was some discussion previously about revising and/or improving the process for putting together release notes -- has anything happened on this front?
- sounds like not for this cycle on the new process front
- Loop scoping (David or Russ to write relnotes)
- Range-over-int (David to write relnotes)
- Range-over-func under GOEXPERIMENT (David and/or Russ to write relnotes)
- go/types new APIs (Alias, FileVersion) (Robert Griesemer to write relnotes)
- New tracer (Michael Knyszek to write relnotes)
- Bump bootstrap to 1.20 (MichaelK)
- PGO devirtualization changes (Michael Pratt to write relnotes)
- Allocation headers (Michael Knyszek to write notes)
- CL 544195 runtime: profile contended lock calls (Michael Pratt or Rhys)
- CL 534161 runtime/metrics: add STW stopping and total time metrics (Michael Pratt)
- CL 537515 runtime/pprof: include labels for caller of goroutine profile (Michael Knyszek or Michael Pratt)
- CL 511475 cmd/link: allow deriving GNU build ID from Go build ID (Cherry Mui)
- CL 493136 cmd/link: rationalize -s and -w flags (Cherry Mui)
Port specific changes that might require release notes
- CL 461697 default to PIE on darwin/amd64 (Cherry Mui)
- CL 514907 GOARM=softfloat/hardfloat (Keith Randall)
- CL 521790 enable register ABI on Loong64 (David Chase or Loong64 owners)
- CL 521778 enable plugin on Loong64
- [Than] Windows SEH stack unwind (runtime: support SEH stack unwinding on Windows #57302)? Need to ping CL/issue
Robert Griesemer: writing release notes is time consuming because the format is tedious
- Cherry Mui: there was some discussion about moving to markdown
- Michael Knyszek: this is actively being worked on
- Carlos Amedee: good to give feedback to Jonathan
- Matthew Dempsky: similar to api/next, we can do relnotes/next
- Carlos Amedee: this is the direction
Go 1.22 issues
Status
- discussion of range-over-func efficiency
- David Chase: Russ's CL is related
- Michael Knyszek: do we need a freeze exception for anything here?
- David Chase: iter package needs a separate GOEXPERIMENT
- Keith Randall: iter package should land in x/exp?
- Matthew Dempsky: need some coroutine stuff from runtime

Attendees:

Carlos Amedee
Cherry Mui
David Chase
Eli Bendersky
Joedian Reid
Keith Randall
Matthew Dempsky
Michael Knyszek
Michael Pratt
Robert Griesemer
Than McIntosh

thanm · 2023-12-12T19:47:56Z

2023-12-12

Schedule
- go team fixit week (this week)
- go team "quiet" week (next week)
Go 1.22 RC1 is rescheduled to next Tuesday, December 19
- Thank you for writing the release notes!
topic: //go:linkname
- Than: do we need to start making a more concrete plan for disallowing “pull” style go:linknames of runtime/stdlib symbols?
- Than: there are a lot of linknames in open-source code. If we plan to change this, maybe we should start communicating.
- MichaelP: it is still unclear if we can and should change it. If we don't provide an alternative, I don't think we can just turn it off. If we delete it entirely, users may resort to assembly.
- Matthew: assembly code is a good point; users can still call it from assembly code. We need to balance stopping users from poking into the runtime against providing the things they want to do
- Cherry: we can stop assembly code from calling them in the linker
- Keith: we can find the top users of it and try to fix it somehow. The problem is if we change the name of popular things, we can help them
- Than: if we leave things entirely unchanged, it would block future projects (e.g. renaming runtime to internal/runtime)
- Cherry: for the rename, we could provide forwarding linknames
- MichaelK: we can treat the list of existing linknames as a priority list and provide workarounds
  - rand (handled in 1.22)
  - nanotime
  - memhash
- MichaelK: fine to break programs that linkname to stoptheworld
- Matthew: maybe a good starting point is to implement the logic in the linker, plus a vulncheck-style tool that explains why a build might fail, to communicate with users and get feedback from them on which linknames are needed
- MichaelP: discourage linknames by providing a compiler flag to turn it back on. That will strongly discourage uses in libraries
- Robert: we should have a better idea for why users use them
- survey from Matthew, MichaelP shows 700 linknames for packages with 10+ imports
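As an illustration of the "pull" style discussed above, here is a minimal sketch that reaches into the runtime for `nanotime`, one of the commonly linknamed symbols listed in the notes. It relies on runtime internals and is shown only to make the pattern concrete; whether it builds depends on the toolchain's linkname policy.

```go
package main

import (
	"fmt"
	_ "unsafe" // importing unsafe is required to use go:linkname
)

// nanotime has no body here. The directive below asks the toolchain to bind
// it to the unexported runtime.nanotime, which is the "pull" style: the
// referencing package names the symbol without the runtime's cooperation.
//
//go:linkname nanotime runtime.nanotime
func nanotime() int64

func main() {
	start := nanotime()
	elapsed := nanotime() - start
	fmt.Println("elapsed nanoseconds:", elapsed)
}
```

Code like this breaks if the runtime symbol is renamed or moved, which is why forwarding linknames and a priority list of workarounds come up in the discussion.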
Status
- compiler error URLs
  - Robert: Russ thinks we should be doing this with mnemonics, not numbers. Robert Griesemer will look into the mnemonics. Also need to write the documents. Russ vetoed autogenerated docs. The docs could be more explanatory. Carried over to 2024.
- Heap analysis improvements -- viewcore
  - MichaelK: want to work on it this week for fixit, for viewcore.
- PGO devirt
  - MichaelP: done, with minor issues, not clear we want to address them soon
- Traceback iterator: simplify defer
  - Matthew Dempsky: done. will check with Austin.

Attendees:

Carlos Amedee
Cherry Mui
David Chase
Keith Randall
Matthew Dempsky
Michael Knyszek
Michael Pratt
Robert Griesemer
Than McIntosh

thanm · 2023-12-19T19:15:32Z

2023-12-19

No meeting today (Go team "quiet" week)

thanm · 2024-01-02T20:14:47Z

2024-01-02

Allocation headers results
- still evaluating impact of GOEXPERIMENT=allocheaders change on google-internal applications. It is generally positive, but not clear to what degree.
Go developer survey
- “The top requests for improving toolchain warnings and errors were to make the messages more comprehensible and actionable”
- Some of this is also about runtime errors
- mild surprises (at least to attendees):
  - 5% targeting wasm
  - 56% “use” ARM64 (!)
    - “However, Apple hardware isn’t the only factor driving ARM64 adoption: among respondents who don’t develop on macOS at all, 29% still say they develop for ARM64.”
    - rumor has it that ARM just laid off their whole Go team in China
- “improved expressivity [“creature comforts”] (12%), improved error handling (12%), and improved type safety or reliability (9%). Respondents had a variety of ideas for improving expressivity, with the general trend of this feedback being “Here’s a specific thing I write frequently, and I wish it were easier to express this in Go”. The issues with error handling continue to be complaints about the verbosity of this code today, while feedback about type safety most commonly touched on sum types.”
- [MichaelP]: 50% said they use protobufs (mostly via gRPC)
2024 plans
- Finishing in flight (done by end of Q1)
  - LUCI migration
  - Inlining: enable new heuristics
  - Inlining: scheduler
  - PGO: scalable builds
  - Runtime trace overhaul: cleanup (delete old tracer, old cmd/trace, moving old trace parser behind new API)
  - Runtime trace overhaul: reader API
  - go build -json
1.23
- Range-over-func GA
- /e/ URLs?
- Type parameters for type aliases?
  - [Eli]: I think this is very far along. Ask Robert next week.
- sync/atomic And/Or
- sync/atomic Uint64Pair?
- Cost reduction (RAM efficiency)
- Performance analysis
- [MichaelK]: CockroachDB is excited about contributing benchmarks to our suite.
administrivia
- more of the work formerly done by the release team is being transferred over to the core (C+R) team

Attendees:

Austin Clements
David Chase
Dmitri Shuralyov
Eli Bendersky
Michael Knyszek
Michael Pratt
Than McIntosh

randall77 · 2024-01-23T23:19:40Z

2024-01-23

Tree open! 🎉
Upcoming 2024 plans
- MichaelK: Is the iter package going to land in 1.23?
- Cherry Mui: My tooling for cold path escapes could probably also be used to estimate up-stack escape analysis (by excluding returned values)
- Single bit reference counting
  - Matthew Dempsky: https://mdempsky.notion.site/Dynamic-escape-analysis-76bbeecd3ac4440c88d0cb2f722aaf75
  - Michael Knyszek: Unique pointer optimization and freegc
  - We should brainstorm about this.
People are grumpy we dropped Windows 7 support (also without a good error message)
- Dmitri Shuralyov: We dropped support in a major release and updated our installer check
- The minor release change may have been that we now get a dynamic linking error instead of something nice.
- Michael Pratt: This seems to have come from a security fix.
- Michael Pratt: Considering a proposal for Windows 7 as a secondary port. Seems like a reasonable path.
- Dmitri Shuralyov: We’ve never done that before. This is WAI and went through the proposal process. Adding back support would have to be a new proposal.

Attendees:

Austin Clements
Cherry Mui
David Chase
Dmitri Shuralyov
Joedian Reid
Keith Randall
Michael Knyszek
Michael Pratt
Robert Griesemer

thanm · 2024-02-20T19:41:31Z

No C&R meeting today (this is a Google Core "no meetings" quiet week).

thanm · 2024-02-27T20:06:50Z

2024-02-27

Amazing out-of-tree SwissTable results from Peter Mattis (CockroachDB’s CTO)
- TL;DR: Iteration is ~60% faster, gets are ~20% faster (though there are some odd slowdowns with gets of non-existent elements on large maps), puts are ~30–45% faster (~20% faster if they grow the map), deletes are a mixed bag
- Austin’s benchmark analysis, benchstat, benchplots
- Uses extendible hashing to avoid resize latency of earlier implementation
- Missing some optimizations that would be fairly easy to do in a runtime implementation. Mostly because it’s using Go generics so can’t do the type specialization we do.
- Doesn’t currently even use SIMD, though again in the runtime it wouldn’t be hard to add that.
- The implementation is complex, but looks really clean.
- Peter is supportive of getting it into the runtime; it seems clear that we (Go C&R team) should work to make this happen.
- Keith Randall and Michael Pratt to work on bringing this in.
- Michael Pratt: Expecting a long tail of weird performance regressions for google-internal tests and such. Probably needs to be behind a GOEXPERIMENT.
misc wasm discussions
misc meeting-scheduling discussions

Attendees:

Austin Clements
Carlos Amedee
Cherry Mui
David Chase
Dmitri Shuralyov
Keith Randall
Matthew Dempsky
Michael Knyszek
Michael Pratt
Robert Griesemer
Than McIntosh

thanm · 2024-03-20T15:15:05Z

2024-03-12

Lengthy discussion on iter.Pull and LockOSThread semantics
- Case: Both goroutines call LockOSThread
  - 1. G1 is on a locked OS thread
  - 2. G1 calls next, which switches to G2
  - 3. G2 locks its OS thread
  - 4. G2 calls yield
  - Option 1: Step 2 does a thread switch
  - Option 2: Step 2 “donates” the OS thread. Step 3 moves G2 to a new thread
  - Option 3: Step 3 panics
- Case: yield crosses goroutines
  - 1. G1 is on a locked OS thread
  - 2. G1 calls next, which switches to G2
  - 3. G2 passes yield to another goroutine, G3, and continues executing
  - 4. G3 calls yield, which switches to G1
  - If step 2 does a coroutine switch, then it has to “donate” the locked thread to G2. But then in step 4, you have both G1 and G2 that have to run on the same locked thread.
  - Option 1: Step 2 does a thread switch
  - Option 2: Step 4 interrupts G2 (which may take arbitrarily long) in order to run G1
  - Option 3: Step 4 panics because yield was called on another goroutine. (just iter.Pull or iter.Pull and range-over-func).
  - David Chase: G3 can do a lock OS thread, too.
    - Austin Clements: I’m not sure that matters for this case.
    - Michael Pratt: That could be a problem with the first case, where G2 calls lock OS thread, but G1 does not.
- Case: next crosses goroutines
- Michael Knyszek: Cgo callbacks are a pain because that thread is always locked.
- Matthew Dempsky: Deprecate LockOSThread and add RunOnLockedOSThread(f func())
  - Michael Knyszek: Maybe that’s what we should have done in the first place. A lot of UI libraries have a “take this function and run it on the main thread”.
  - Cherry Mui: go.dev/issue/64777
  - Ian Lance Taylor: go.dev/issue/64755
  - Austin Clements: Even if we deprecate LockOSThread, we have to deal with it forever (though the performance may be less of a concern).
  - Michael Pratt: Passing yield to a RunOnLockedOSThread goroutine, same trouble
- Ian Lance Taylor: What’s the downside of making the switch do a full goroutine switch if the thread is locked?
  - Austin Clements: Semantically that works out. Performance is an issue. You don’t necessarily control whether you’re locked, like in the cgo callback case.
  - Ian Lance Taylor: iter.Pull is not going to be that common
  - David Chase: I agree. We have a mechanism that works: goroutines. It’s unfortunate that many of these iterators could easily give you a pull, but we’re not exposing that.
  - Keith Randall: The issue I see is that you may use some code that uses iter.Pull internally. It feels like an abstraction break between the OS thread locking (maybe from a cgo callback) and the iter.Pull.
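For context on the next/yield vocabulary used in the cases above, here is a minimal sketch of `iter.Pull`, assuming Go 1.23 or later where the `iter` package is available. The `countTo` and `sumFirst` helpers are hypothetical. In the notes' terms, the caller of `next` plays the role of G1 and the iterator body that calls `yield` plays the role of G2.

```go
package main

import (
	"fmt"
	"iter"
)

// countTo is a hypothetical push-style iterator that yields 1..n.
func countTo(n int) iter.Seq[int] {
	return func(yield func(int) bool) {
		for i := 1; i <= n; i++ {
			if !yield(i) {
				return
			}
		}
	}
}

// sumFirst converts seq to pull style and adds up at most k values.
// Each call to next switches to the iterator body until it calls yield.
func sumFirst(seq iter.Seq[int], k int) int {
	next, stop := iter.Pull(seq)
	defer stop() // releases the iterator if we stop early

	sum := 0
	for i := 0; i < k; i++ {
		v, ok := next()
		if !ok {
			break // the sequence ended before k values were produced
		}
		sum += v
	}
	return sum
}

func main() {
	fmt.Println(sumFirst(countTo(5), 3))
	fmt.Println(sumFirst(countTo(2), 5))
}
```

The sketch does not call `runtime.LockOSThread`; how the switch should behave when either side is locked to a thread is exactly what the options above are weighing.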
Struct packing/reordering
- David Chase: Struct packing/reordering is a mechanical thing humans shouldn’t be doing. If we had been doing this with protobufs, we’d have huge savings. 3% RAM footprint, ~20% GC cache traffic. We may end up doing GopherPen and being able to reorder struct fields eliminates a lot of the RAM footprint expense.
- David Chase: I was figuring this would work for all structs, but Russ Cox said it should depend on the module language version. But you can do var x *pkgA.T; (*pkgB.U)(x), if T and U have the same underlying type, and if we laid out T and U differently, it’s not clear how we could implement this cast. If this is rare, maybe we can just break it.
  - Matthew Dempsky: A “fragile conversion” is a conversion from *T to *U where T and U’s underlying type literals appeared in source files from different Go modules.
- David Chase: Ran fragilecast ecosystem analysis, only 27 instances in the 21305 10-or-more imports projects. Classification of conversions.
  - David Chase: I’ve been trying to make it module-aware to reduce false positives, but having trouble. Presumably in this case there’s shared ownership, and a single go.mod language version.
- Ian Lance Taylor: What about the case of passing the address of a struct to C code?
  - David Chase: I don’t pick that up. I’m working on a proposal for a signal type that says “follow the platform rules.” Cgo would automatically add this.
- Robert Griesemer: What about per-file versions?
  - David Chase: I think this would ignore those and just use the module version.
- What does “work” mean? Is there a language change here?
  - Austin Clements: Old code keeps working, even if it has unsafe assumptions about struct layout.
  - David Chase: There is a small language change here if we disallow “fragile” casts.
- Matthew Dempsky: The language spec only has “conversion”, not “cast”.
- Matthew Dempsky: Type aliases have a disadvantage for documentation. E.g., if you alias an internal type, the exported type won’t necessarily have documentation.
  - Austin Clements: Russ and I have talked about fixing that in go doc.
- Robert Griesemer: Is reordering based on the structure of the type? Or profile-based? Could it be done in a way that results in the same reordering for the same underlying types?
  - David Chase: I can imagine someone would want to try profiling, but my assumption is that we would use a stable algorithm that worked the same way for structurally identical types. Partly because there’s unsafe code out there.
- Michael Knyszek: How are type aliases represented in documentation today? Should it copy not just the documentation but the definition?
  - Austin Clements: I think you would want to copy the definition (though don’t hide that it’s an alias).
- Matthew Dempsky: So two struct types that are identical would be reordered in the same way
- Keith Randall: Fighting concerns. We don’t want to break everyone’s unsafe code all at once, but we don’t want to disallow cross-module conversions.
- Austin Clements: That’s why David did the ecosystem analysis. These conversions are very rare.
- Keith Randall: What about plain struct conversions? Memcpy now, but across language versions, a reorder.
  - David Chase: My analysis doesn’t look for these right now.
  - Cherry Mui: There’s at least a way to compile this. The pointer case just can’t be fixed.

Attendees:

Austin Clements
Carlos Amedee
Cherry Mui
David Chase
Ian Lance Taylor
Joedian Reid
Keith Randall
Matthew Dempsky
Michael Knyszek
Michael Pratt
Robert Griesemer
Than McIntosh

thanm · 2024-03-20T15:19:32Z

2024-03-19

FYI
- cmd/compile: add go:wasmexport directive #65199 accepted
- proposal: runtime/debug: add SetCrashHeader function #64590 declined
- proposal: spec: support int(bool) conversions #64825 declined
- runtime: CL 561635 changes relative PCs printed in tracebacks #65761
  - Still an open question. If we’re going to do structured tracebacks, it should probably be tied to the new debug.SetCrashOutput API, so we need to at least decide on the API this release.
discussion about ways to automate Go best practices
- Matthew Dempsky: Tangential to that, Austin and I were discussing doing team screencasts. I can feel stuck with my development workflow from 20 years ago and being familiar with workflows other people are using could make us more productive and realize where things could still be improved.
brainstorming ideas for AI-powered apps useful for Go team itself
- Solving our own problems through AI, to get us more familiar with AI applications.
- OSS project use cases
- Build-time use cases
Austin is working on a Go GC efficiency downselection doc

Attendees:

Austin Clements
Carlos Amedee
Cherry Mui
David Chase
Dmitri Shuralyov
Ian Lance Taylor
Joedian Reid
Keith Randall
Matthew Dempsky
Michael Knyszek
Michael Pratt
Robert Griesemer
Than McIntosh

thanm · 2024-04-05T16:55:27Z

2024-03-26

upcoming inliner heuristics review during this Thursday’s Go breakout
Project status updates
- Range-over-func iter.Pull and LockOSThread
  - David: We could always use goroutines.
  - Austin: Russ did have an inbetween version that used goroutines and channels, but with fused channel operations.
  - Michael Knyszek: That might be an easier fallback path than general channels.
  - Cherry: Did Russ’ fused channel operations work with locked OS threads?
  - Austin: I’m not sure.
  - Cherry: It might still be easier to implement the fallback for that with locked OS threads.
- Joedian: Are there things we’re worried about for 1.23?
  - Range-over-func has a lot of open tasks
    - David: Checking semantics will be fine. Worried about defer with named results. Performance probably okay. Worried about weird things for debugger integration because it’s not clear we can do exactly what they want.
  - Cleaning up old tracer
    - Austin: Does this have to happen in 1.23?
    - Michael Knyszek: It will break with things like iter.Pull, which aren’t supported in the old tracer.
  - Structured go build -json doesn’t have to happen for 1.23
  - SwissTables doesn’t have to happen for 1.23, but it would be good for hitting our 5% target.
    - Michael Pratt: I’m hoping to start this after returning from vacation.
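The "we could always use goroutines" fallback from the iter.Pull item above can be sketched with an ordinary goroutine and unbuffered channels. This is a hypothetical illustration, not Russ's fused-channel version and not the runtime's implementation: the producer goroutine starts eagerly rather than on the first `next`, and `next` and `stop` are assumed to be called from a single goroutine.

```go
package main

import (
	"fmt"
	"sync"
)

// upTo is a hypothetical push-style iterator that yields 1..n.
func upTo(n int) func(yield func(int) bool) {
	return func(yield func(int) bool) {
		for i := 1; i <= n; i++ {
			if !yield(i) {
				return
			}
		}
	}
}

// pull adapts a push-style iterator to next/stop using a goroutine.
// Every value crosses an unbuffered channel, so each next costs a full
// goroutine handoff through the scheduler.
func pull(seq func(yield func(int) bool)) (next func() (int, bool), stop func()) {
	values := make(chan int)
	done := make(chan struct{})

	go func() {
		defer close(values) // signals the end of the sequence to next
		seq(func(v int) bool {
			select {
			case values <- v:
				return true
			case <-done:
				return false // consumer stopped early; unwind the iterator
			}
		})
	}()

	var once sync.Once
	next = func() (int, bool) {
		select {
		case <-done:
			return 0, false // stop was already called
		default:
		}
		v, ok := <-values
		return v, ok
	}
	stop = func() { once.Do(func() { close(done) }) }
	return next, stop
}

func main() {
	next, stop := pull(upTo(3))
	defer stop()
	for {
		v, ok := next()
		if !ok {
			break
		}
		fmt.Println(v)
	}
}
```

Because the two sides are ordinary goroutines, each can be locked to its own OS thread without special handling, at the cost of slower switches.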

Attendees:

Austin Clements
Carlos Amedee
Cherry Mui
David Chase
Dmitri Shuralyov
Ian Lance Taylor
Joedian Reid
Keith Randall
Matthew Dempsky
Michael Knyszek
Michael Pratt
Robert Griesemer
Than McIntosh

thanm · 2024-04-08T19:54:12Z

2024-03-28

On Thursday 3/28 we held a design review looking at the new inlining heuristics framework being developed as part of the inlining overhaul effort (this was a separate session from the usual C&R meeting)
- Slides from the talk: https://docs.google.com/presentation/d/1Lf3WoRyCNicS1K3NCuVl_VnJFhvew_6nAQF_Wx--F54/edit#slide=id.p
Notes from questions/discussion during the talk
- Litmus tests
  - Inline sort.Search with a (small) function literal
  - Fast path/slow path inlines the fast path
  - Singleton calls to non-exported functions
  - Or specifically closures. Comes up as a way to do “function literals”, and also to give more control over defers (though that will prevent inlining).
  - Russ gave some examples on the bug
  - A call site weight of 57 makes no sense. We should be able to significantly lower that.
- Parameter heuristics: constant feeds if/switch
  - Michael Pratt: If you could analyze the effect of propagating a constant, you could potentially significantly reduce the size. It would be nice to capture that in the score.
  - Than: We don’t currently look at the value of these constants. Tracking that could significantly bloat the export data.
- Sweet benchmark results
  - The GoBuild (non-Link) benchmarks are measuring the impact of doing more work on compile time, so it’s not directly comparable with other benchmarks. That’s fine, but means the overall geomean isn’t meaningful.
- Difficulties/obstacles
  - Michael Pratt: With PGO, we’ve noticed that about half of the performance effect comes from optimizing the runtime, in particular the garbage collector. We’ve never checked closely if that’s from making the GC faster or the application doing less allocation.
- Robert: Why don’t we let users say “inline this”?
  - Ian: GCC shows that people often get this wrong.
  - Russ: Overfitting to “right now”
  - Ian: OTOH, GCC experience shows that people usually use “don’t inline” correctly.
- Keith: To some extent we train our users to write code for the inliner. Like manual hot/cold splits today. It’s better if users don’t have to jump through hoops to get “the right thing.”
- Robert: If people write lots of small functions, you might get the large function in the end. If people write large functions, we can’t do anything. Maybe we should tell people to write smaller functions.
- Tim: Do we know how well these heuristics correlate with places of user angst?
  - Russ: Perhaps replace angst with “places where inlining actually helps”
  - Than: I think that happens with passing concrete values to interface arguments. People get grumpy when they have to contort their code around that.
- Ian: Heuristic that specifically targets removing escapes?
  - Than: Passing a concrete value to an interface helps, but that’s just one example.
  - Ian: Also feeds into type switch or type assertion. Maybe “likely” interface type at call site? Also unused parameters, though that helps more with interface method calls (e.g., caller does some work to compute a parameter that isn’t used).
Notes from questions/discussion after the talk
- Austin: I would expect the common case to be a constructor function that returns a pointer, but the result doesn’t escape past the call site.
- Than: That’s hard to capture in a heuristic that’s not doing full escape analysis.
- Austin: Long ago, we talked about doing the data flow before inlining, then using that information to drive inlining, then producing final escape summaries.
- Than: That would definitely require an escape analysis expert.
- Russ: Seems like the strategy is to think about heuristics and try them and see what happens, which doesn’t feel very directed. Maybe, turn the inliner up really high, see what that gives you, and then figure out why. Hash bisection may be able to tell you this: write a function that says “is etcd still > 10% faster?” and run hash bisection to get the set of places where it really matters. Then ask what patterns get you those. Then worry about false positives that come with those.
- Than and others: idea from Russ sounds good
  - probably need to finish stack slot merging before we do this, or the costs of inlining may be too dominant to seek out the wins.
  - Than: bisection could be made more difficult due to noisy benchmarks
  - Than: For escape effects, we could measure the allocation count, which should be pretty stable.
- Tim: Range-over-funcs don’t exist in any code, but do you have a sense of how that will interact with these when that does start to exist?
  - David: It resolves things down to flat code pretty often right now. Though if something prevents inlining, the whole house of cards falls apart.
- Austin: “Calls cost 57” is a clear red flag. A call isn’t expensive. Right now this is our backstop against making lots of bad decisions, but it also prevents us from making a lot of good decisions.
- Austin: “Don’t inline here” heuristics, if they’re working right, should have almost no effect on performance, just reduce binary size. Should check binary size effects of those.
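A much simplified sketch of the hash bisection idea Russ describes: give every call site a stable hash, enable aggressive inlining only for the sites whose hash falls in one half, and keep the half for which the "is it still faster?" predicate holds. This is not the Go toolchain's actual bisection implementation. The site names, the `stillFast` predicate, and the assumption that a single site accounts for the whole win are all hypothetical. A real predicate would rebuild and rerun a benchmark, or compare allocation counts as Than suggests.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// hashOf gives each call site a stable 32-bit hash.
func hashOf(site string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(site))
	return h.Sum32()
}

// bisect splits the candidate sites on one hash bit at a time and keeps the
// half for which stillFast holds when only that half is enabled. It assumes
// stillFast(sites) is true to begin with. It stops early if neither half is
// enough on its own, meaning the win needs sites from both halves.
func bisect(sites []string, stillFast func(enabled []string) bool) []string {
	cur := sites
	for bit := uint(0); len(cur) > 1 && bit < 32; bit++ {
		var zero, one []string
		for _, s := range cur {
			if (hashOf(s)>>bit)&1 == 0 {
				zero = append(zero, s)
			} else {
				one = append(one, s)
			}
		}
		switch {
		case len(zero) > 0 && stillFast(zero):
			cur = zero
		case len(one) > 0 && stillFast(one):
			cur = one
		default:
			return cur
		}
	}
	return cur
}

func main() {
	// Hypothetical call sites; only one of them matters for the speedup.
	sites := []string{"pkg.parse", "pkg.hotLoop", "pkg.encode", "pkg.flush"}
	stillFast := func(enabled []string) bool {
		for _, s := range enabled {
			if s == "pkg.hotLoop" {
				return true
			}
		}
		return false
	}
	fmt.Println(bisect(sites, stillFast))
}
```

Each call to the predicate stands for one build-and-measure cycle, so the number of benchmark runs grows with the logarithm of the number of sites rather than linearly. That matters when the benchmark is slow or noisy.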

thanm · 2024-04-08T19:56:36Z

2024-04-02

These compiler and runtime meeting minutes are a snapshot of what was discussed in an internal Google meeting. Notes on agenda items that address Google specific needs are elided. Any comments on this issue will be removed, and discussion about topics raised should be taken to the appropriate mailing list. Additionally, any feedback or suggestions for additions to the notes should be handled there as well.

Discussion about how to help automate and ease the introduction/rollout of Go "best practices" that we use in the Go core team
- Ensure users are getting the benefit of the core team’s work. (For example, IDE integration is great for build-time things, but we don’t have great answers for deploy/run-time.)
  - PGO example: How do developers consume profiles today? Can we integrate with those? (Vendors like DataDog?)
  - How do people coordinate production builds today?
    - “GitOps”: CI/CD & triggers, GitHub Actions
  - Should we partner more closely with other teams? If we did, what could we do better?
  - If we did partner, then what happens? A single integration only does so much, while building a partner relationship keeps on giving.
- Keith: It would be great if this problem didn’t exist. A lot of our performance optimizations just happen when you upgrade. Hopefully the places where we couldn’t just make it automatic are a minority of places.
- Matthew: Go could be missing more of an “application level framework.” Right now if you want to build applications with Go, you have to use a lot of external packages, unless you’re writing a 1970s CLI. (Ian Lance Taylor: Gophers on Rails.) Opinionated Go way.
  - Michael Knyszek: Service weaver?
  - Carlos: go-kit
  - Austin: I don’t know what’s stopping us.
  - Matthew: What’s stopping me is that I’ve never built an application in Go and it seems scary and keeps changing. Go simplicity for building applications that work nicely over time.
- Michael Knyszek: I think there’s room for improving, e.g., net/http/pprof. Gophers Slack was discussing how everyone exposes basic debugging endpoints. Everyone does it differently because net/http/pprof is a security hazard. Where we did try to provide convenience, we’re now, today, doing a bad job. Pprof should just be there, and secure, and working by default. That impedes PGO integration. Could we integrate into the new “hot thing”, OpenTelemetry?
strings.Compare optimization
- Keith: Used for sorting. We have an old “Less” implementation, and strings.Compare builds a 3-way compare on two “Less”es. We’ve pushed back on optimizing strings.Compare, but now it’s the thing to use for sorting because “you should be using the built-in comparisons.”
  - Ian: Seen “if strings.Compare(a, b) < 0” many times because people don’t realize you can just directly compare strings.
- Compare the strings once instead of twice, using internal/bytealg
- CL 532195
- Austin: Can we do the same for cmp.Compare?
  - Ian: No because we can’t specialize it for string. (We could type switch on string, but not ~string. And it’s not clear if the compiler would specialize in that case anyway)
  - Austin: We could teach the compiler about this special case.
- Keith: The other option was to look for the three-way comparison code pattern and compile that specially. That would catch both strings.Compare and cmp.Compare. But it’s a tricky optimization.
  - Austin: It could be pretty specific to the pattern in strings.Compare and cmp.Compare.
- Way forward: just land this CL. We can undo it if we ever do the general optimization.
Austin: LLVM MLIR uses “basic block arguments” instead of “phi functions” and it sounds like that makes it easier to do block graph manipulations. Have we considered doing that sort of thing?
- Matthew: That’s what I was suggesting with reorganizing the compiler around basic blocks that do tail calls instead of functions. This seems to be the popular SSA-like representation these days.
- Austin: It seems like we could do an 80/20 thing where we don’t have to restructure the whole compiler toward a sea of basic blocks. We could still have regular functions, and just make it easier to manipulate the BB graph.
- Austin: What’s the implication for, say, BB1 dominates BB2 but isn’t a direct predecessor, and BB2 uses a variable introduced in BB1. Does that mean that variable has to be threaded through the arguments to the intermediate BBs?
- Matthew: Example: Thorin IR (see Figure 4 on page 4 for syntax/semantics/typing). BB arguments directly correspond to phi functions and you’re allowed to do “implicit capture”. Results in a very concise semantics.
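A toy illustration of the correspondence between block arguments and phi functions discussed above. The `Block` and `Jump` types and the `phiView` helper are hypothetical and do not reflect the data structures of the Go compiler, MLIR, or Thorin. The sketch only shows that each block parameter plays the role of one phi, with every incoming jump supplying the value for its edge.

```go
package main

import (
	"fmt"
	"strings"
)

// Block is a basic block that declares parameters instead of holding phis.
type Block struct {
	Name   string
	Params []string // values defined on entry to the block
}

// Jump is a branch that passes one argument per parameter of Target.
type Jump struct {
	From   string
	Target *Block
	Args   []string
}

// phiView renders the phi functions equivalent to b's parameters: for each
// parameter, the value supplied along every incoming edge.
func phiView(b *Block, in []Jump) ([]string, error) {
	out := make([]string, 0, len(b.Params))
	for i, p := range b.Params {
		var srcs []string
		for _, j := range in {
			if j.Target != b {
				continue
			}
			if len(j.Args) != len(b.Params) {
				return nil, fmt.Errorf("jump from %s passes %d args, %s takes %d",
					j.From, len(j.Args), b.Name, len(b.Params))
			}
			srcs = append(srcs, j.From+": "+j.Args[i])
		}
		out = append(out, fmt.Sprintf("%s = phi(%s)", p, strings.Join(srcs, ", ")))
	}
	return out, nil
}

func main() {
	// A loop header whose induction variable arrives as a block argument.
	header := &Block{Name: "header", Params: []string{"i"}}
	edges := []Jump{
		{From: "entry", Target: header, Args: []string{"zero"}},
		{From: "body", Target: header, Args: []string{"next"}},
	}
	phis, err := phiView(header, edges)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(phis[0]) // i = phi(entry: zero, body: next)
}
```

Because the values travel with the edge, redirecting or deleting a jump updates them in one place. That is the kind of block graph manipulation Austin suggests this representation makes easier.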

Attendees:

Austin Clements
Carlos Amedee
Cherry Mui
David Chase
Dmitri Shuralyov
Ian Lance Taylor
Joedian Reid
Keith Randall
Matthew Dempsky
Michael Knyszek
Michael Pratt
Than McIntosh

jeremyfaller added Proposal umbrella labels Jan 26, 2021

gopherbot added this to the Proposal milestone Jan 26, 2021

ianlancetaylor changed the title ~~proposal: Go compiler and runtime meeting notes~~ Go compiler and runtime meeting notes Jan 26, 2021

ianlancetaylor removed the Proposal label Jan 26, 2021

ianlancetaylor modified the milestones: Proposal, Unplanned Jan 26, 2021

ianlancetaylor added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Jan 26, 2021

golang deleted a comment from mvdan Jan 26, 2021

golang deleted a comment from ianlancetaylor Jan 26, 2021

randall77 mentioned this issue Feb 22, 2021

cmd/go: Unable to build Go 1.16 on i386 with GO386=387 or GO386=softfloat #44500

Open

golang deleted a comment from josharian Mar 3, 2021

golang deleted a comment from changkun Apr 12, 2021

thepudds mentioned this issue Oct 18, 2021

cmd/compile: feedback-guided optimization #28262

Closed

golang unlocked this conversation Jul 19, 2023

golang deleted a comment from ccahoon Oct 19, 2023

golang locked and limited conversation to collaborators Oct 19, 2023
