Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: stack copying performance regressions #28678

Closed
josharian opened this issue Nov 8, 2018 · 13 comments

Comments

Projects
None yet
5 participants
@josharian
Copy link
Contributor

commented Nov 8, 2018

The runtime stack copying benchmarks have regressed significantly since Go 1.11.

A very quick benchmark run shows:

name                old time/op  new time/op   delta
StackCopyPtr-8      88.9ms ± 1%  117.7ms ± 1%  +32.34%  (p=0.000 n=5+17)
StackCopy-8         65.2ms ± 1%   95.8ms ± 1%  +46.99%  (p=0.000 n=5+19)
StackCopyNoCache-8   107ms ± 1%    135ms ± 2%  +26.23%  (p=0.000 n=5+19)

cc @aclements

@josharian josharian added this to the Go1.12 milestone Nov 8, 2018

@aclements

This comment has been minimized.

Copy link
Member

commented Nov 8, 2018

That's surprising. You don't happen to have a bisect handy, do you? (If not, I can run one.)

@josharian

This comment has been minimized.

Copy link
Contributor Author

commented Nov 8, 2018

I was planning to bisect tomorrow; laptop otherwise occupied today. Go for it.

@aclements

This comment has been minimized.

Copy link
Member

commented Nov 8, 2018

Bisect started

cd runtime && mkdir -p issue-28678 && pl benchmany -d issue-28678 -order metric -benchflags '-test.run NONE -test.bench StackCopy$' go1.11..master
@aclements

This comment has been minimized.

Copy link
Member

commented Nov 8, 2018

Unsurprisingly, this was cbafcc5 "cmd/compile,runtime: implement stack objects":

$ benchstat2 bench.log 433496615f cbafcc55e8
name          old time/op  new time/op  delta
StackCopy-12  86.6ms ± 0%  94.4ms ± 1%  +8.99%  (p=0.008 n=5+5)

/cc @randall77

@josharian

This comment has been minimized.

Copy link
Contributor Author

commented Nov 8, 2018

I'm seeing a much larger regression here. But on a hunch, I checked, and the second half of the regression starts at c803ffc, making it a dup of #28595. cc @mknyszek as FYI.

@randall77

This comment has been minimized.

Copy link
Contributor

commented Nov 26, 2018

I'm not seeing the difference reported here. Tip seems faster than 1.11.2.

name              old time/op  new time/op  delta
StackCopyPtr      93.3ms ± 2%  90.8ms ± 2%  -2.77%  (p=0.003 n=8+8)
StackCopy         68.5ms ± 1%  67.4ms ± 2%  -1.54%  (p=0.002 n=8+8)
StackCopyNoCache   118ms ± 5%   120ms ± 2%    ~     (p=0.083 n=8+8)

It would be strange for these benchmarks to be affected by the extra stack object code. None of them has a stack object, so the additional cost is just another funcdata call and an empty for loop per frame.

GC stack scanning might be more expensive for StackCopyPtr as there are a lot of intra-stack pointers, but stack copying won't see that cost.

@josharian

This comment has been minimized.

Copy link
Contributor Author

commented Nov 28, 2018

Tip is faster than 1.11 because https://go-review.googlesource.com/c/go/+/110564 went in, which offset the other slowdown. Maybe that’s fine? Might still be worth a quick look before/after stack objects to make sure there’s no low hanging fruit.

@andybons

This comment has been minimized.

Copy link
Member

commented Nov 28, 2018

Given the offset, removing release-blocker. This needs more investigation.

@aclements

This comment has been minimized.

Copy link
Member

commented Jan 8, 2019

Are we still concerned about this regression?

@odeke-em

This comment has been minimized.

Copy link
Member

commented Feb 4, 2019

Tip is faster than 1.11 because https://go-review.googlesource.com/c/go/+/110564 went in, which offset the other slowdown. Maybe that’s fine? Might still be worth a quick look before/after stack objects to make sure there’s no low hanging fruit.

Hello @josharian, I am doing a bit of issue gardening and wondering if we can punt this to Go1.13 and we can do investigations there. What do you think?

@josharian

This comment has been minimized.

Copy link
Contributor Author

commented Feb 4, 2019

Punt.

@odeke-em

This comment has been minimized.

Copy link
Member

commented Feb 4, 2019

Roger that, thanks @josharian!

@odeke-em odeke-em modified the milestones: Go1.12, Go1.13 Feb 4, 2019

@aclements

This comment has been minimized.

Copy link
Member

commented Jun 25, 2019

I think given the performance offset and that there haven't been any clear problems with this during 1.12, I'm no longer worried about the underlying regression. I'm going to go ahead and close, but feel free to re-open.

@aclements aclements closed this Jun 25, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.