
groupcache: enable H2C + add benchmarks #5068

Merged: 2 commits merged into thanos-io:main on Jan 18, 2022

Conversation

GiedriusS (Member) commented on Jan 17, 2022

Add h2c benchmarks. Somehow h2c is slower than regular HTTP locally for me, probably because these aren't real-life benchmarks, i.e. establishing TCP connections locally has no cost. And if only one operation is happening at any time, then h2 has to do more work, hence more CPU usage.

With 500 parallel requests going on at the same time, h2 becomes faster than (or is at the same level as) h1 while using a minimal number of TCP connections. Thus, it will work much faster in real-life situations. So, let's enable it.

```
name                                                               time/op
GroupcacheRetrieval/h2c/seq-16                                      227µs ± 1%
GroupcacheRetrieval/h2c/parallel=500-16                            54.8µs ±20%
GroupcacheRetrieval/h1,_max_one_TCP_connection/seq-16               150µs ± 2%
GroupcacheRetrieval/h1,_max_one_TCP_connection/parallel=500-16      146µs ± 1%
GroupcacheRetrieval/h1,_unlimited_TCP_connections/seq-16            145µs ± 3%
GroupcacheRetrieval/h1,_unlimited_TCP_connections/parallel=500-16  52.8µs ± 9%

name                                                               alloc/op
GroupcacheRetrieval/h2c/seq-16                                      183kB ± 0%
GroupcacheRetrieval/h2c/parallel=500-16                             143kB ± 1%
GroupcacheRetrieval/h1,_max_one_TCP_connection/seq-16               161kB ± 0%
GroupcacheRetrieval/h1,_max_one_TCP_connection/parallel=500-16      162kB ± 0%
GroupcacheRetrieval/h1,_unlimited_TCP_connections/seq-16            161kB ± 0%
GroupcacheRetrieval/h1,_unlimited_TCP_connections/parallel=500-16   116kB ± 2%

name                                                               allocs/op
GroupcacheRetrieval/h2c/seq-16                                        283 ± 0%
GroupcacheRetrieval/h2c/parallel=500-16                               256 ± 1%
GroupcacheRetrieval/h1,_max_one_TCP_connection/seq-16                 260 ± 0%
GroupcacheRetrieval/h1,_max_one_TCP_connection/parallel=500-16        262 ± 0%
GroupcacheRetrieval/h1,_unlimited_TCP_connections/seq-16              260 ± 0%
GroupcacheRetrieval/h1,_unlimited_TCP_connections/parallel=500-16     279 ± 1%
```

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
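For orientation, here is a minimal sketch of how the "seq" and "parallel=500" cases above could be structured with Go's standard testing package. The doFetch helper and the wiring are hypothetical, not the actual pkg/cache/groupcache_test.go code; the real benchmark runs the same body against three clients (h2c, h1 limited to one TCP connection, h1 with unlimited connections).

```go
package cache_test

import (
	"net/http"
	"runtime"
	"testing"
)

// doFetch is a hypothetical stand-in for a single groupcache retrieval
// over the given client (e.g. a GET against the test groupcache server).
func doFetch(client *http.Client) {
	_ = client // the real test performs the request and checks the payload
}

// benchmarkRetrieval sketches the sequential and parallel sub-benchmarks.
func benchmarkRetrieval(b *testing.B, client *http.Client) {
	b.Run("seq", func(b *testing.B) {
		// One request at a time: h2 framing is pure overhead here,
		// which is why h2c loses the sequential case.
		for i := 0; i < b.N; i++ {
			doFetch(client)
		}
	})
	b.Run("parallel=500", func(b *testing.B) {
		// SetParallelism multiplies GOMAXPROCS, so divide to get roughly
		// 500 goroutines issuing requests concurrently; this is where h2
		// multiplexing over a few TCP connections pays off.
		b.SetParallelism(500 / runtime.GOMAXPROCS(0))
		b.RunParallel(func(pb *testing.PB) {
			for pb.Next() {
				doFetch(client)
			}
		})
	})
}
```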

GiedriusS (Member, Author) commented:

cc @akanshat @onprem

```
@@ -93,6 +96,12 @@ func NewGroupcacheWithConfig(logger log.Logger, reg prometheus.Registerer, conf
	cfg *CachingBucketConfig) (*Groupcache, error) {
	httpProto := galaxyhttp.NewHTTPFetchProtocol(&galaxyhttp.HTTPOptions{
		BasePath: basepath,
		Transport: &http2.Transport{
			AllowHTTP: true,
			DialTLS: func(network, addr string, cfg *tls.Config) (net.Conn, error) {
```

GiedriusS (Member, Author) commented on this diff:

For now, let's disable TLS, but in the future we can come back to this.
Member replied:

Shall we open an issue for it?

GiedriusS (Member, Author) replied:

I will add it to #5037
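The DialTLS override is where the hunk above is cut off. The usual h2c client pattern, and presumably what this PR does (this completion is an assumption, not the verbatim diff), is to dial plain TCP and ignore the TLS config entirely, which is also why the thread above talks about TLS being disabled for now:

```go
package main

import (
	"crypto/tls"
	"net"
	"net/http"

	"golang.org/x/net/http2"
)

// newH2CTransport returns a client transport speaking HTTP/2 over plaintext
// TCP (h2c). The function name is ours; the PR inlines this transport inside
// NewGroupcacheWithConfig.
func newH2CTransport() http.RoundTripper {
	return &http2.Transport{
		// Allow "http://" URLs instead of requiring "https://".
		AllowHTTP: true,
		// Skip the TLS handshake and open a plain TCP connection.
		DialTLS: func(network, addr string, _ *tls.Config) (net.Conn, error) {
			return net.Dial(network, addr)
		},
	}
}

func main() {
	client := &http.Client{Transport: newH2CTransport()}
	_ = client
}
```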

kakkoyun previously approved these changes on Jan 18, 2022

kakkoyun (Member) left a comment:

Overall LGTM.

Just a couple of testing nits.

Several review threads on pkg/cache/groupcache_test.go were marked resolved (one outdated).
GiedriusS merged commit ba9b36d into thanos-io:main on Jan 18, 2022
GiedriusS deleted the enable_h2c branch on January 18, 2022 15:46
onprem (Member) left a comment:

Thanks, this looks good. I have a minor suggestion which we can take care of in a follow-up.

Sorry for the late review.

```
go func() {
	if err = httpServer.ListenAndServe(); err != nil {
		fmt.Printf("failed to listen: %s\n", err.Error())
```

onprem (Member) commented:

We should do an os.Exit(1) here as well. If we fail to start the http server, we should fail the test here instead of continuing with it.

```
}()
go func() {
	if err = httpServerH2C.ListenAndServe(); err != nil {
		fmt.Printf("failed to listen: %s\n", err.Error())
```

onprem (Member) commented:

Ditto, os.Exit(1) after this.
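A hedged sketch of what applying both suggestions could look like (they were addressed in the follow-up commits referenced below). The server wiring, the addresses, and the use of golang.org/x/net/http2/h2c are assumptions for illustration, not the test's exact setup:

```go
package main

import (
	"fmt"
	"net/http"
	"os"

	"golang.org/x/net/http2"
	"golang.org/x/net/http2/h2c"
)

func main() {
	mux := http.NewServeMux()

	// Hypothetical addresses; the real test chooses its own ports.
	httpServer := &http.Server{Addr: "127.0.0.1:18080", Handler: mux}
	// h2c server: wrap the handler so HTTP/2 without TLS is accepted.
	httpServerH2C := &http.Server{
		Addr:    "127.0.0.1:18081",
		Handler: h2c.NewHandler(mux, &http2.Server{}),
	}

	for _, srv := range []*http.Server{httpServer, httpServerH2C} {
		srv := srv
		go func() {
			// Reviewer's suggestion: don't just log, exit so the benchmark
			// fails immediately instead of running without a server.
			// (A graceful Shutdown returns http.ErrServerClosed, which one
			// would likely want to exclude from this check.)
			if err := srv.ListenAndServe(); err != nil {
				fmt.Printf("failed to listen: %s\n", err.Error())
				os.Exit(1)
			}
		}()
	}

	select {} // block forever; in the test, benchmarks run here instead
}
```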

GiedriusS added a commit to GiedriusS/thanos that referenced this pull request on Jan 20, 2022

Add fixes according to the suggestions here
thanos-io#5068 (comment).

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
GiedriusS added a commit that referenced this pull request Jan 26, 2022
Add fixes according to the suggestions here
#5068 (comment).

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
Nicholaswang pushed a commit to Nicholaswang/thanos that referenced this pull request Mar 6, 2022
Add fixes according to the suggestions here
thanos-io#5068 (comment).

Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@vinted.com>
Signed-off-by: Nicholaswang <wzhever@gmail.com>