Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve how reference regexps are built #3566

Conversation

paulcacheux
Copy link
Contributor

This PR reduces the amount of string -> *regexp.Regexp -> string in reference Regexps. This conversion (especially the compilation is quite expensive both in memory and CPU usage.

This PR tries to be as minimal as possible by ensuring that the same regexps are exported (even those that are not really exported but just used in the same package) but I would be happy to change it to anything else if bigger changes are desired.

Signed-off-by: Paul Cacheux paul.cacheux@datadoghq.com

Previous implementation was doing a lot of string -> regexp -> string
conversions

Signed-off-by: Paul Cacheux <paul.cacheux@datadoghq.com>
@paulcacheux paulcacheux marked this pull request as ready for review January 13, 2022 16:06
@codecov-commenter

This comment has been minimized.

@deleteriousEffect
Copy link
Member

Looking at the code, it seems like it would be more efficient, but would you also mind providing a couple of benchmarks, so we can quantify the improvement?

reference/regexp.go Outdated Show resolved Hide resolved
Signed-off-by: Paul Cacheux <paul.cacheux@datadoghq.com>
@paulcacheux
Copy link
Contributor Author

Here is a repo with benchmarks around this PR https://github.com/paulcacheux/distrib-re-benchmark. Since it's benchmarking an init function it's a bit hacky but it works.

Copy link
Member

@milosgajdos milosgajdos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! LGTM.

@deleteriousEffect
Copy link
Member

Here is a repo with benchmarks around this PR https://github.com/paulcacheux/distrib-re-benchmark. Since it's benchmarking an init function it's a bit hacky but it works.

I really appreciate this, thank you!

For posterity, this is the output from my machine:

Previous
10:22 $ go test -bench=. -benchmem
goos: linux
goarch: amd64
pkg: example.com/previous
cpu: Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
BenchmarkPrevious-16                1269           1586451 ns/op         1339721 B/op      11453 allocs/op
PASS
ok      example.com/previous    2.114s
This PR
10:21 $ go test -bench=. -benchmem
goos: linux
goarch: amd64
pkg: example.com/pr
cpu: Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
BenchmarkPR-16              3660            822432 ns/op          485248 B/op       4501 allocs/op
PASS
ok      example.com/pr  3.047s

@deleteriousEffect
Copy link
Member

LGTM 🚀

@deleteriousEffect deleteriousEffect merged commit 91f33cb into distribution:main Jan 18, 2022
vrothberg added a commit to vrothberg/image that referenced this pull request Nov 29, 2022
Improve the code such that a given raw regex string is compiled only
once.

Before:
init 1341880 bytes, 11453 allocs

After:
init 485600 bytes, 4501 allocs

Copied-from: distribution/distribution/pull/3566
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
vrothberg added a commit to vrothberg/image that referenced this pull request Nov 30, 2022
Improve the code such that a given raw regex string is compiled only
once.

Before:
init 1341880 bytes, 11453 allocs

After:
init 485600 bytes, 4501 allocs

Copied-from: distribution/distribution/pull/3566
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants