New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

regexp: revert per-Regexp use of sync.Pool #26219

Closed
gottwald opened this Issue Jul 4, 2018 · 14 comments

Comments

Projects
None yet
8 participants
@gottwald

gottwald commented Jul 4, 2018

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go1.11beta1

Does this issue reproduce with the latest release?

Yes, with the latest beta.

What operating system and processor architecture are you using (go env)?

GOOS="linux"
GOARCH="amd64"

What did you do?

Ran all tests from a closed source monorepo with go1.11beta1 and one test fails with the new beta.
The test is somewhat bigger but the core error is that a config struct that holds compiled regexps among other things gets marshaled to json and unmarshaled back.
reflect.DeepEqual is used to test the equality of the result.

Easiest way I could find to reproduce: https://play.golang.org/p/5WB8WJbQG15

go1.10 returns true, while go1.11beta1 returns false

What did you expect to see?

result true for go1.11beta1 as it is in go1.10

What did you see instead?

result false

I don't think reflect.DeepEqual is even a good idea here but it's still a behavior change.

@zegl

This comment has been minimized.

Show comment
Hide comment
@zegl

zegl Jul 4, 2018

Contributor

This is most likely caused by 7dbf9d4.

Contributor

zegl commented Jul 4, 2018

This is most likely caused by 7dbf9d4.

@ALTree ALTree changed the title from reflect.DeepEqual behavior change in Go 1.11beta1 to regexp: reflect.DeepEqual behavior on regexp changed in Go 1.11beta1 Jul 4, 2018

@ALTree ALTree added this to the Go1.11 milestone Jul 4, 2018

@ianlancetaylor

This comment has been minimized.

Show comment
Hide comment
@ianlancetaylor

ianlancetaylor Jul 4, 2018

Contributor

CC @jkohen @bradfitz @dsnet

I think this is a change that should be documented in the release notes. I think it was an unfortunate accident that reflect.DeepEqual ever worked correctly on compiled regexps. I don't see why that is a behavior that we should preserve.

I'm open to counter-arguments.

Contributor

ianlancetaylor commented Jul 4, 2018

CC @jkohen @bradfitz @dsnet

I think this is a change that should be documented in the release notes. I think it was an unfortunate accident that reflect.DeepEqual ever worked correctly on compiled regexps. I don't see why that is a behavior that we should preserve.

I'm open to counter-arguments.

@bradfitz

This comment has been minimized.

Show comment
Hide comment
@bradfitz

bradfitz Jul 4, 2018

Member

I'd just document it.

Member

bradfitz commented Jul 4, 2018

I'd just document it.

@gottwald

This comment has been minimized.

Show comment
Hide comment
@gottwald

gottwald Jul 4, 2018

Would be ok for me as well, just wanted to raise awareness since reflect.DeepEqual gets used / abused a lot in tests in the wild.

gottwald commented Jul 4, 2018

Would be ok for me as well, just wanted to raise awareness since reflect.DeepEqual gets used / abused a lot in tests in the wild.

@dsnet

This comment has been minimized.

Show comment
Hide comment
@dsnet

dsnet Jul 5, 2018

Member

I support documenting it. The regexp.Regexp change also broke a number of tests inside Google. We switched such cases over to the cmp package with a custom comparer to only compare the regexp string:

cmp.Comparer(func(x, y *regexp.Regexp) bool {
    if x == nil || y == nil {
        return x == nil && y == nil
    }
    return x.String() == y.String()
})
Member

dsnet commented Jul 5, 2018

I support documenting it. The regexp.Regexp change also broke a number of tests inside Google. We switched such cases over to the cmp package with a custom comparer to only compare the regexp string:

cmp.Comparer(func(x, y *regexp.Regexp) bool {
    if x == nil || y == nil {
        return x == nil && y == nil
    }
    return x.String() == y.String()
})
@rsc

This comment has been minimized.

Show comment
Hide comment
@rsc

rsc Jul 6, 2018

Contributor

I am concerned about the disregard shown for practical compatibility here.

I support documenting it. The regexp.Regexp change also broke a number of tests inside Google.

So both a bunch of Google tests and some outside-Google tests broke. That makes it an incompatible change. To force an incompatible change through into a release, there has to be a compelling reason, usually a correctness reason. Being a little faster in a niche use case is not sufficient.

Compatibility is more than just type system compatibility. Type system compatibility is just the bare minimum that we can automate. When we accidentally introduce other incompatibility, and we find out via bug reports and the like, then we need to step back and think carefully about whether inflicting the cost of this particular incompatibility on users is worth the cost of the corresponding benefit.

The benefit here is very minimal - again, faster in a niche use case. Performance almost never justifies breaking actual uses.

This change should be rolled back. I will look into whether there is a different way to save some of the performance win without breaking the uses. Either way, I will send a CL restoring the old behavior.

/cc @dsnet @bradfitz @andybons

Contributor

rsc commented Jul 6, 2018

I am concerned about the disregard shown for practical compatibility here.

I support documenting it. The regexp.Regexp change also broke a number of tests inside Google.

So both a bunch of Google tests and some outside-Google tests broke. That makes it an incompatible change. To force an incompatible change through into a release, there has to be a compelling reason, usually a correctness reason. Being a little faster in a niche use case is not sufficient.

Compatibility is more than just type system compatibility. Type system compatibility is just the bare minimum that we can automate. When we accidentally introduce other incompatibility, and we find out via bug reports and the like, then we need to step back and think carefully about whether inflicting the cost of this particular incompatibility on users is worth the cost of the corresponding benefit.

The benefit here is very minimal - again, faster in a niche use case. Performance almost never justifies breaking actual uses.

This change should be rolled back. I will look into whether there is a different way to save some of the performance win without breaking the uses. Either way, I will send a CL restoring the old behavior.

/cc @dsnet @bradfitz @andybons

@rsc rsc changed the title from regexp: reflect.DeepEqual behavior on regexp changed in Go 1.11beta1 to regexp: revert per-Regexp use of sync.Pool Jul 6, 2018

@rsc rsc added release-blocker and removed Documentation labels Jul 6, 2018

@bradfitz

This comment has been minimized.

Show comment
Hide comment
@bradfitz

bradfitz Jul 6, 2018

Member

I don't think it's a niche performance improvement.

If we had a fast regexp implementation we wouldn't need extra API like https://golang.org/pkg/regexp/#Regexp.Copy

Instead we need to push such worries to users.

If I had to pick between people needing to think about when to use Regexp.Copy vs a few tests breaking, I'd still pick a few tests breaking.

Member

bradfitz commented Jul 6, 2018

I don't think it's a niche performance improvement.

If we had a fast regexp implementation we wouldn't need extra API like https://golang.org/pkg/regexp/#Regexp.Copy

Instead we need to push such worries to users.

If I had to pick between people needing to think about when to use Regexp.Copy vs a few tests breaking, I'd still pick a few tests breaking.

@bradfitz

This comment has been minimized.

Show comment
Hide comment
@bradfitz

bradfitz Jul 6, 2018

Member

But given that we're stuck with Regexp.Copy, what if we overload Copy to also mean that it returns a *Regexp that uses the machines *sync.Pool, but by default that field is nil and uses the old slow way?

Then people defining global variables can say:

    var fooPattern = regexp.MustCompile(`....`).Copy()
Member

bradfitz commented Jul 6, 2018

But given that we're stuck with Regexp.Copy, what if we overload Copy to also mean that it returns a *Regexp that uses the machines *sync.Pool, but by default that field is nil and uses the old slow way?

Then people defining global variables can say:

    var fooPattern = regexp.MustCompile(`....`).Copy()
@gopherbot

This comment has been minimized.

Show comment
Hide comment
@gopherbot

gopherbot Jul 6, 2018

Change https://golang.org/cl/122495 mentions this issue: regexp: preserve property that a Regexp can be compared with reflect.DeepEqual

gopherbot commented Jul 6, 2018

Change https://golang.org/cl/122495 mentions this issue: regexp: preserve property that a Regexp can be compared with reflect.DeepEqual

@bradfitz

This comment has been minimized.

Show comment
Hide comment
@bradfitz

bradfitz Jul 6, 2018

Member

Alternatively, https://go-review.googlesource.com/c/go/+/122495 is even simpler: it just lazily initializes the machines field, so DeepEqual still works before the regexp has been used at least, which should fix most tests.

Member

bradfitz commented Jul 6, 2018

Alternatively, https://go-review.googlesource.com/c/go/+/122495 is even simpler: it just lazily initializes the machines field, so DeepEqual still works before the regexp has been used at least, which should fix most tests.

@rsc

This comment has been minimized.

Show comment
Hide comment
@rsc

rsc Jul 6, 2018

Contributor

I still fundamentally object to a per-Regexp sync.Pool. They really don't belong in individual objects. That's a huge amount of allocation to stick in one regexp.

Contributor

rsc commented Jul 6, 2018

I still fundamentally object to a per-Regexp sync.Pool. They really don't belong in individual objects. That's a huge amount of allocation to stick in one regexp.

@rsc

This comment has been minimized.

Show comment
Hide comment
@rsc

rsc Jul 6, 2018

Contributor

If I had to pick between people needing to think about when to use Regexp.Copy vs a few tests breaking, I'd still pick a few tests breaking.

This is a false dichotomy. The choice is almost never (1) faster and break users vs (2) slow and no broken users. It's almost always possible - and always preferable - to find a way to do (3) faster with no broken users, even if that's a little bit more work.

My point of raising the general concern was just to remind everyone to think about "how can we keep the improvement and not break users?" instead of "what words should we write to tell users how we broke their code?" And if there's no time to do the work to not break users, better to roll back the change and try to get the performance win again next cycle.

(But I have a CL that's faster and makes regexp.Regexp completely DeepEqual-safe, more than it was before. Running benchmarks and then will send it.)

Contributor

rsc commented Jul 6, 2018

If I had to pick between people needing to think about when to use Regexp.Copy vs a few tests breaking, I'd still pick a few tests breaking.

This is a false dichotomy. The choice is almost never (1) faster and break users vs (2) slow and no broken users. It's almost always possible - and always preferable - to find a way to do (3) faster with no broken users, even if that's a little bit more work.

My point of raising the general concern was just to remind everyone to think about "how can we keep the improvement and not break users?" instead of "what words should we write to tell users how we broke their code?" And if there's no time to do the work to not break users, better to roll back the change and try to get the performance win again next cycle.

(But I have a CL that's faster and makes regexp.Regexp completely DeepEqual-safe, more than it was before. Running benchmarks and then will send it.)

@gopherbot

This comment has been minimized.

Show comment
Hide comment
@gopherbot

gopherbot Jul 9, 2018

Change https://golang.org/cl/122596 mentions this issue: regexp: revert "use sync.Pool to cache regexp.machine objects"

gopherbot commented Jul 9, 2018

Change https://golang.org/cl/122596 mentions this issue: regexp: revert "use sync.Pool to cache regexp.machine objects"

@gopherbot gopherbot closed this in a41d216 Jul 9, 2018

@gopherbot

This comment has been minimized.

Show comment
Hide comment
@gopherbot

gopherbot Oct 4, 2018

Change https://golang.org/cl/139783 mentions this issue: regexp: add DeepEqual test

gopherbot commented Oct 4, 2018

Change https://golang.org/cl/139783 mentions this issue: regexp: add DeepEqual test

gopherbot pushed a commit that referenced this issue Oct 12, 2018

regexp: add DeepEqual test
This locks in behavior we accidentally broke
and then restored during the Go 1.11 cycle.
See #26219.

It also locks in new behavior that DeepEqual
always works, instead of only usually working.

This CL is the final piece of a series of CLs to make
DeepEqual always work, by eliminating the machine
cache and making other related optimizations.
Overall, this whole sequence of CLs achieves:

name                             old time/op    new time/op    delta
Find-12                             264ns ± 3%     260ns ± 0%   -1.59%  (p=0.000 n=10+9)
FindAllNoMatches-12                 140ns ± 2%     133ns ± 0%   -5.34%  (p=0.000 n=10+7)
FindString-12                       256ns ± 0%     249ns ± 0%   -2.73%  (p=0.000 n=8+8)
FindSubmatch-12                     339ns ± 1%     333ns ± 1%   -1.73%  (p=0.000 n=9+10)
FindStringSubmatch-12               322ns ± 0%     322ns ± 1%     ~     (p=0.450 n=8+10)
Literal-12                          100ns ± 2%      92ns ± 0%   -8.13%  (p=0.000 n=10+10)
NotLiteral-12                      1.50µs ± 0%    1.47µs ± 0%   -1.65%  (p=0.000 n=8+8)
MatchClass-12                      2.18µs ± 0%    2.15µs ± 0%   -1.05%  (p=0.000 n=10+9)
MatchClass_InRange-12              2.12µs ± 0%    2.11µs ± 0%   -0.65%  (p=0.000 n=10+9)
ReplaceAll-12                      1.41µs ± 0%    1.41µs ± 0%     ~     (p=0.254 n=7+10)
AnchoredLiteralShortNonMatch-12    89.8ns ± 0%    81.5ns ± 0%   -9.22%  (p=0.000 n=8+9)
AnchoredLiteralLongNonMatch-12      105ns ± 3%      97ns ± 0%   -7.21%  (p=0.000 n=10+10)
AnchoredShortMatch-12               141ns ± 0%     128ns ± 0%   -9.22%  (p=0.000 n=9+9)
AnchoredLongMatch-12                276ns ± 4%     253ns ± 2%   -8.23%  (p=0.000 n=10+10)
OnePassShortA-12                    620ns ± 0%     587ns ± 0%   -5.26%  (p=0.000 n=10+6)
NotOnePassShortA-12                 575ns ± 3%     547ns ± 1%   -4.77%  (p=0.000 n=10+10)
OnePassShortB-12                    493ns ± 0%     455ns ± 0%   -7.62%  (p=0.000 n=8+9)
NotOnePassShortB-12                 423ns ± 0%     406ns ± 1%   -3.95%  (p=0.000 n=8+10)
OnePassLongPrefix-12                112ns ± 0%     109ns ± 1%   -2.77%  (p=0.000 n=9+10)
OnePassLongNotPrefix-12             405ns ± 0%     349ns ± 0%  -13.74%  (p=0.000 n=8+9)
MatchParallelShared-12              501ns ± 1%      38ns ± 2%  -92.42%  (p=0.000 n=10+10)
MatchParallelCopied-12             39.1ns ± 0%    38.6ns ± 1%   -1.38%  (p=0.002 n=6+10)
QuoteMetaAll-12                    94.6ns ± 0%    94.8ns ± 0%   +0.26%  (p=0.001 n=10+9)
QuoteMetaNone-12                   52.7ns ± 0%    52.7ns ± 0%     ~     (all equal)
Match/Easy0/32-12                  79.1ns ± 0%    72.0ns ± 0%   -8.95%  (p=0.000 n=9+9)
Match/Easy0/1K-12                   307ns ± 1%     297ns ± 0%   -3.32%  (p=0.000 n=10+7)
Match/Easy0/32K-12                 4.65µs ± 2%    4.67µs ± 1%     ~     (p=0.633 n=10+8)
Match/Easy0/1M-12                   234µs ± 0%     234µs ± 0%     ~     (p=0.684 n=10+10)
Match/Easy0/32M-12                 7.98ms ± 1%    7.96ms ± 0%   -0.31%  (p=0.014 n=9+9)
Match/Easy0i/32-12                 1.13µs ± 1%    1.10µs ± 0%   -3.18%  (p=0.000 n=9+10)
Match/Easy0i/1K-12                 32.5µs ± 0%    31.7µs ± 0%   -2.61%  (p=0.000 n=9+9)
Match/Easy0i/32K-12                1.59ms ± 0%    1.26ms ± 0%  -20.71%  (p=0.000 n=9+7)
Match/Easy0i/1M-12                 51.0ms ± 0%    40.4ms ± 0%  -20.68%  (p=0.000 n=10+7)
Match/Easy0i/32M-12                 1.63s ± 0%     1.30s ± 0%  -20.62%  (p=0.001 n=7+7)
Match/Easy1/32-12                  75.1ns ± 1%    67.4ns ± 0%  -10.24%  (p=0.000 n=8+10)
Match/Easy1/1K-12                   861ns ± 0%     879ns ± 0%   +2.18%  (p=0.000 n=8+8)
Match/Easy1/32K-12                 39.2µs ± 1%    34.1µs ± 0%  -13.01%  (p=0.000 n=10+8)
Match/Easy1/1M-12                  1.38ms ± 0%    1.17ms ± 0%  -15.06%  (p=0.000 n=10+8)
Match/Easy1/32M-12                 44.2ms ± 1%    37.5ms ± 0%  -15.15%  (p=0.000 n=10+9)
Match/Medium/32-12                 1.04µs ± 1%    1.03µs ± 0%   -0.64%  (p=0.002 n=9+8)
Match/Medium/1K-12                 31.3µs ± 0%    31.2µs ± 0%   -0.36%  (p=0.000 n=9+9)
Match/Medium/32K-12                1.44ms ± 0%    1.20ms ± 0%  -17.02%  (p=0.000 n=8+7)
Match/Medium/1M-12                 46.1ms ± 0%    38.2ms ± 0%  -17.14%  (p=0.001 n=6+8)
Match/Medium/32M-12                 1.48s ± 0%     1.23s ± 0%  -17.10%  (p=0.000 n=9+7)
Match/Hard/32-12                   1.54µs ± 1%    1.47µs ± 0%   -4.64%  (p=0.000 n=9+10)
Match/Hard/1K-12                   46.4µs ± 1%    44.4µs ± 0%   -4.35%  (p=0.000 n=9+8)
Match/Hard/32K-12                  2.19ms ± 0%    1.78ms ± 7%  -18.74%  (p=0.000 n=8+10)
Match/Hard/1M-12                   70.1ms ± 0%    57.7ms ± 7%  -17.62%  (p=0.000 n=8+10)
Match/Hard/32M-12                   2.24s ± 0%     1.84s ± 8%  -17.92%  (p=0.000 n=8+10)
Match/Hard1/32-12                  8.17µs ± 1%    7.95µs ± 0%   -2.72%  (p=0.000 n=8+10)
Match/Hard1/1K-12                   254µs ± 2%     245µs ± 0%   -3.62%  (p=0.000 n=9+10)
Match/Hard1/32K-12                 9.58ms ± 1%    8.54ms ± 7%  -10.87%  (p=0.000 n=10+10)
Match/Hard1/1M-12                   306ms ± 1%     271ms ± 8%  -11.42%  (p=0.000 n=9+10)
Match/Hard1/32M-12                  9.79s ± 1%     8.58s ± 9%  -12.37%  (p=0.000 n=9+10)
Match_onepass_regex/32-12           808ns ± 0%     716ns ± 1%  -11.39%  (p=0.000 n=8+9)
Match_onepass_regex/1K-12          27.8µs ± 0%    19.9µs ± 2%  -28.51%  (p=0.000 n=8+9)
Match_onepass_regex/32K-12          925µs ± 0%     631µs ± 2%  -31.71%  (p=0.000 n=9+9)
Match_onepass_regex/1M-12          29.5ms ± 0%    20.2ms ± 2%  -31.53%  (p=0.000 n=10+9)
Match_onepass_regex/32M-12          945ms ± 0%     648ms ± 2%  -31.39%  (p=0.000 n=9+9)
CompileOnepass-12                  4.67µs ± 0%    4.60µs ± 0%   -1.48%  (p=0.000 n=10+10)
[Geo mean]                         24.5µs         21.4µs       -12.94%

https://perf.golang.org/search?q=upload:20181004.5

Change-Id: Icb17b306830dc5489efbb55900937b94ce0eb047
Reviewed-on: https://go-review.googlesource.com/c/139783
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment