New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: use strings.Builder #23828

Open
josharian opened this Issue Feb 14, 2018 · 27 comments

Comments

Projects
None yet
8 participants
@josharian
Contributor

josharian commented Feb 14, 2018

The main challenge here is ensuring everything still works when built with 1.4.

Probably build tag protected files, with something like

type buffer struct {
bytes.Buffer
}

for pre-1.10 and

type buffer = strings.Builder

for 1.10+.

@mdempsky

This comment has been minimized.

Member

mdempsky commented Feb 14, 2018

I've been thinking we should add a "cmd/internal/polyfill" (or however we want to spell the bikeshed) package that provides interfaces to sort.Slice, strings.Builder, and other recent stdlib additions that we want to use in the toolchain, while maintaining Go 1.4 compatibility.

@bradfitz

This comment has been minimized.

Member

bradfitz commented Feb 14, 2018

@mdempsky, you could call the package just "util". ducks

@davecheney

This comment has been minimized.

Contributor

davecheney commented Feb 14, 2018

@martisch

This comment has been minimized.

Member

martisch commented Feb 14, 2018

cmd/internal/compat?

@pciet

This comment has been minimized.

Contributor

pciet commented Feb 14, 2018

In cmd/compile/internal/gc there's these cases:

  • fmt.Fprintln/fmt.Fprintf writing to a bytes.Buffer
  • bytes.Buffer plugged into the output of exec.Command
  • buf.WriteString()
  • buf.WriteString(fmt.Sprintf())
  • bufio.NewWriter(&a_bytes.Buffer)

I assume there might be cases of “str” + “str2” or fmt.Sprintf that may be more efficient as a strings.Builder.

#18990 indicates there are no benchmarks for Builder. I see go tool compile -bench that would allow a comparison of compiler things in ns/op with these changes. Without a clear performance win I don't see a reason to do this.

@bradfitz

This comment has been minimized.

Member

bradfitz commented Feb 14, 2018

In any case, the point behind my snark was to make sure that the package does NOT become a "util" package that itself depends on many things the caller of your package may not want.

It might be valid for the first few things, but it'll inevitably grow, and the util-ness bloat will sneak up on us.

It might be best to have N such packages under some "compat" or "future" directory.

@bradfitz

This comment has been minimized.

Member

bradfitz commented Feb 14, 2018

@pciet, there are tests for Builder's allocations. Not allocating as much helps CPU, running the GC less. It's hard to capture that in benchmarks sometimes. (or they end up noisy)

@pciet

This comment has been minimized.

Contributor

pciet commented Feb 14, 2018

cmd/internal/compat/strbuf? Isn't compile the only thing that needs to be built with 1.4 though?

@mdempsky

This comment has been minimized.

Member

mdempsky commented Feb 14, 2018

@bradfitz

In any case, the point behind my snark was to make sure that the package does NOT become a "util" package that itself depends on many things the caller of your package may not want.

I was imagining a package that's limited to just providing current standard library functionality, with backwards compatible implementations for bootstrapping under older Go releases. That is, for non-bootstrap builds, the packages don't do anything but pass through calls to the standard library.

It seems like you're suggesting that cmd shouldn't treat std as a single monolithic dependency, but instead should view each standard library package dependency individually. I understand why this is important within std, but it's not clear to me that it extends to cmd too.

@bradfitz

This comment has been minimized.

Member

bradfitz commented Feb 14, 2018

I understand why this is important within std, but it's not clear to me that it extends to cmd too.

I personally don't care too much. I'm mostly echoing the objections I've heard about "util" packages in the past.

Practically, it matters less here because this is an internal package and we can therefore audit the fixed number of users of the "util" package and verify they're not pulling in anything unnecessary via the util package they wouldn't depend on otherwise via other paths.

@dgryski

This comment has been minimized.

Contributor

dgryski commented Feb 15, 2018

At what point to we bite the bullet and say that you need something newer than 1.4 to build the latest Go release and add an extra step to bootstrapping from C?

@josharian

This comment has been minimized.

Contributor

josharian commented Feb 15, 2018

At what point to we bite the bullet

As late as humanly possible. Go 2, perhaps.

@pciet

This comment has been minimized.

Contributor

pciet commented Feb 15, 2018

I can work on this.

I don’t like “util” packages but importing a few extra compatibility types isn’t going to be a big deal here, and we have to have a package since compile is split into horizontally dependent subpackages. I’d just add cmd/internal/compat and have a compat.StringBuffer.

My previous comment is coming from ignorance, sorry about that. Obviously at least the linker is also required, and adjusting the bootstrap is outside the scope of this issue. I’ll assume other commands need the compatibility types too.

Again from ignorance, I’m not seeing package testing style benchmarking (with the stability feature) for the compiler. I was thinking I’ll write cmd/compile/bench_test.go that sets up by compiling the gc dependencies then benchmark loops on main() targeting package gc.

I only have an amd64/linux computer to benchmark on though, are there resources for trying things on other platforms? I’ll just email golang-dev asking for benchmarks with the change if not.

@josharian

This comment has been minimized.

Contributor

josharian commented Feb 15, 2018

@josharian

This comment has been minimized.

Contributor

josharian commented Feb 15, 2018

Oh, and I assume that the vast majority of compilation occurs on amd64, so I wouldn’t sweat that.

@pciet

This comment has been minimized.

Contributor

pciet commented Feb 15, 2018

I’m not seeing a significant performance difference with compilebench+benchstat. I’ll try to modify the cases where a Reader or .Bytes() call is used next, so far I've only done the direct replacement cases.

@pciet

This comment has been minimized.

Contributor

pciet commented Feb 16, 2018

Based on what I'm reading via Google Search I'll leave these small return concat cases alone:

// fmt.go
return "[]" + tmodeString(t.Elem(), mode, depth)

return t.Etype.String() + "-" + typefmt(t, flag, FErr, depth)

return "[" + strconv.FormatInt(t.NumElem(), 10) + "]" + tmodeString(t.Elem(), mode, depth)

[actually these are where there may be wins with strings.Builder since a new allocation wouldn't have to be made to combine the strings]

@gopherbot

This comment has been minimized.

gopherbot commented Feb 17, 2018

Change https://golang.org/cl/94897 mentions this issue: cmd/compile: use strings.Builder

@pciet

This comment has been minimized.

Contributor

pciet commented Feb 17, 2018

time+alloc benchmarks: #16897

@pciet

This comment has been minimized.

Contributor

pciet commented Feb 19, 2018

I’ve submitted two patch sets that are both off what this change should be but I’ve shown that func tconv in cmd/compile/internal/gc/fmt.go is the only potentially worthwhile place to add in strings.Builder. I’m working on a fresh patch set now [patch set three had better performance but not conclusively positive, and @davecheney says duplicating tconv with a builder implementation isn't maintainable enough which I agree with].

Specifically tconv is the only obvious place in cmd/compile/ (in tconv there's thousands of <1KB width, mostly <250 byte, string allocations made in most compile calls), but there may be worthwhile changes in dependencies outside this directory or in rearchitecting. If I submit another patch it will have higher code quality and a definite performance improvement.

And thanks for letting me work on this even though I've had to learn a lot and required feedback on suboptimal changes. I'll be out most of today through the weekend, I should be able to look at this again next week.

@pciet

This comment has been minimized.

Contributor

pciet commented Mar 1, 2018

Here’s the time spent when compile (devel +1b1c8b3 Feb 17) is called on cmd/compile/internal/gc:

screen shot 2018-03-01 at 11 02 45 am

screen shot 2018-03-01 at 11 03 12 am

screen shot 2018-03-01 at 11 08 27 am

The tconv things (dumpobj1) account for only 6.6% of the compile time in this case, but garbage collection (runtime.gcDrain) accounts for 18.7%. It’s possible that dumpobj recursive string concatenation generates a significant part of the garbage, so next I’ll try to show where garbage is left. gc.bottomUpVisitor.visitcode, gc.compile, and ssa.Compile are other places to look.

@bradfitz

This comment has been minimized.

Member

bradfitz commented Mar 1, 2018

You want to look at memory (allocation) profiles, not CPU profiles, if your goal is to eliminate the creation of garbage. Reducing garbage reduces CPU.

@pciet

This comment has been minimized.

Contributor

pciet commented Mar 1, 2018

@bradfitz for some reason the pprof heap profile doesn’t show any detail into gc.Main which is why I went to CPU, but I’ll keep trying.

@pciet

This comment has been minimized.

Contributor

pciet commented Mar 1, 2018

With debug.SetGCPercent(-1) and runtime.MemProfileRate = 1 compiling cmd/compile/internal/gc:

screen shot 2018-03-01 at 3 40 45 pm

screen shot 2018-03-01 at 3 46 22 pm

I'm not sure why the dumpobj string concats don't end up here.

@bradfitz

This comment has been minimized.

Member

bradfitz commented Mar 1, 2018

I primarily look at -alloc_space. It's space that ultimately triggers collections.

@pciet

This comment has been minimized.

Contributor

pciet commented Mar 9, 2018

Well I’m not convinced making small changes like replacing some bytes.Buffer use is worth it, for larger changes I think I’ll need a lot more time to understand the compiler and in-flight efforts, and I’m in the middle of some life changes.

I’m unable to work on this more for now.

@josharian

This comment has been minimized.

Contributor

josharian commented Mar 11, 2018

Thanks for looking into it.

@bradfitz bradfitz modified the milestones: Go1.11, Unplanned May 29, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment