chore: Handle too big CH payloads for caching #191

Merged
merged 2 commits into from Sep 9, 2022

Conversation

sigua-cs
Collaborator

@sigua-cs sigua-cs commented Jul 19, 2022

Implements #143

@render

render bot commented Jul 19, 2022

@sigua-cs sigua-cs force-pushed the chore/cache-max-payload-size branch from d4fb3c0 to 75de0c6 Compare July 19, 2022 12:34
Collaborator

@mga-chka mga-chka left a comment

Hi Guram,
I haven't reviewed your PR yet, but you should add a description of this feature to the documentation; otherwise chproxy users won't use it.

@sigua-cs sigua-cs closed this Jul 19, 2022
@sigua-cs sigua-cs reopened this Jul 19, 2022
@render

render bot commented Jul 19, 2022

@sigua-cs sigua-cs marked this pull request as draft July 19, 2022 12:57
@sigua-cs sigua-cs requested review from gontarzpawel and removed request for gontarzpawel July 19, 2022 12:57
@sigua-cs sigua-cs force-pushed the chore/cache-max-payload-size branch 2 times, most recently from c6ec7c6 to 98be969 Compare July 19, 2022 13:31
config/config.go Outdated
@@ -32,6 +32,8 @@ var (
}

defaultExecutionTime = Duration(30 * time.Second)

defaultMaxPayloadSize = ByteSize(100000000)
Collaborator

Don't forget to mention the default max value in the docs.

Collaborator

NB: given what @Garnek20 saw in production (for the chatamart cluster), this default size is still too small. IMHO, if we want a transparent release (meaning that users who were not using this feature don't end up wondering why some queries are no longer cached), we should use a very high, unrealistic value like 1<<50.

Contributor

@sigua-cs Can you add a comment explaining why this value is set so high?
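
For illustration, here is a minimal sketch of the "unrealistically high default" idea together with the kind of explanatory comment requested above. `ByteSize` mirrors the uint64-based type mentioned in this thread; `effectiveMaxPayloadSize` and the "zero means unset" convention are assumptions made for the example, not code from this PR.

```go
package main

import "fmt"

// ByteSize mirrors the uint64-based type discussed in this thread.
type ByteSize uint64

// defaultMaxPayloadSize is deliberately far larger than any realistic
// response (1<<50 bytes, i.e. 1 PiB), so that users who never configure
// the limit keep their previous caching behaviour after upgrading.
const defaultMaxPayloadSize = ByteSize(1 << 50)

// effectiveMaxPayloadSize is a hypothetical helper: it treats zero as
// "not configured" and falls back to the default in that case.
func effectiveMaxPayloadSize(configured ByteSize) ByteSize {
	if configured == 0 {
		return defaultMaxPayloadSize
	}
	return configured
}

func main() {
	fmt.Println(effectiveMaxPayloadSize(0))         // falls back to the default
	fmt.Println(effectiveMaxPayloadSize(100000000)) // explicit value is kept
}
```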

@@ -20,6 +20,8 @@ type AsyncCache struct {
TransactionRegistry

graceTime time.Duration

MaxPayloadSize int64
Collaborator

Since the size must be positive, why not use a uint64 instead of an int64? I'm not sure, but I think the ByteSize type is an alias of the uint64 type.

Collaborator Author

Yes, ByteSize is a custom type based on uint64. In the Cache struct, maxPayloadSize is a ByteSize, and in the config, defaultMaxPayloadSize is a ByteSize as well. In the AsyncCache struct it's an int64 because we need to compare it with the contentLength/bufferedRespWriter, which is an int64. There is a function that checks that contentLength is not greater than maxPayloadSize; if MaxPayloadSize were a ByteSize, we would need to convert it to int64 every time we check contentLength.

Contributor

Still, I agree with @mga-chka: we should not allow building a cache with a negative value.

utils.go Outdated

func isToCache(length int64, s *scope) bool {
	maxPayloadSize := s.user.cache.MaxPayloadSize
	return length <= maxPayloadSize
Collaborator

Open question: should we add some metrics so we know how many queries are not cached because they're too big?
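
As a rough sketch of what such a metric could look like, the example below counts skipped responses per user with a small in-memory counter. `skipCounter` and `shouldCache` are hypothetical stand-ins written with the standard library only; a real implementation would more likely use the project's existing metrics library.

```go
package main

import (
	"fmt"
	"sync"
)

// skipCounter is a hypothetical stand-in for a labelled metrics counter.
type skipCounter struct {
	mu     sync.Mutex
	counts map[string]uint64
}

func newSkipCounter() *skipCounter {
	return &skipCounter{counts: make(map[string]uint64)}
}

// Inc records one skipped response for the given user label.
func (c *skipCounter) Inc(user string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[user]++
}

// Value returns how many responses were skipped for the given user label.
func (c *skipCounter) Value(user string) uint64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[user]
}

// shouldCache reports whether a response fits within the limit and
// records a skip when it does not.
func shouldCache(contentLength, maxPayloadSize int64, user string, skipped *skipCounter) bool {
	if contentLength > maxPayloadSize {
		skipped.Inc(user)
		return false
	}
	return true
}

func main() {
	skipped := newSkipCounter()
	fmt.Println(shouldCache(512, 1024, "default", skipped))  // fits, cached
	fmt.Println(shouldCache(4096, 1024, "default", skipped)) // too big, skipped
	fmt.Println(skipped.Value("default"))
}
```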

@sigua-cs
Collaborator Author

I need to update proxy.go according to the latest fix version.

@sigua-cs sigua-cs force-pushed the chore/cache-max-payload-size branch 3 times, most recently from 279785d to 509048e Compare July 24, 2022 08:26
proxy.go Outdated
@sigua-cs sigua-cs force-pushed the chore/cache-max-payload-size branch 5 times, most recently from baaeb05 to 867d697 Compare August 8, 2022 06:10
}
} else {
// Do not cache responses greater than max payload size.
if contentLength > int64(s.user.cache.MaxPayloadSize) {
cacheSkipped.With(labels).Inc()
Collaborator

Can you add a test that checks that maxPayloadSize takes effect for a payload bigger than the limit and has no effect for a smaller one? One way to check this in the test would be to look at the cacheSkipped counter.

@pull-request-size pull-request-size bot added size/L and removed size/M labels Sep 4, 2022
main_test.go Outdated
@@ -339,6 +374,39 @@ func TestServe(t *testing.T) {
},
startHTTP,
},
{
"http cache max payload size",
Collaborator

This is a good test, but you should also add a test that asserts that, when maxPayloadSize is enabled, a payload smaller than the limit is still cached.
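
To make the requested coverage concrete, here is a small table-driven sketch that exercises both sides of the limit, plus the boundary. `isCacheable` is a simplified stand-in for the size check shown in the diff, and `sizeCase`/`runCases` are hypothetical helpers; the real tests in this PR go through the proxy and the cache rather than a pure function.

```go
package main

import "fmt"

// isCacheable is a simplified stand-in for the size check in the diff:
// a response is cached unless it is larger than the limit.
func isCacheable(contentLength, maxPayloadSize int64) bool {
	return contentLength <= maxPayloadSize
}

// sizeCase describes one expectation about the limit.
type sizeCase struct {
	name          string
	contentLength int64
	wantCached    bool
}

// runCases returns the names of the cases whose outcome differs from
// the expectation; an empty result means every case passed.
func runCases(maxPayloadSize int64, cases []sizeCase) []string {
	var failed []string
	for _, tc := range cases {
		if got := isCacheable(tc.contentLength, maxPayloadSize); got != tc.wantCached {
			failed = append(failed, tc.name)
		}
	}
	return failed
}

func main() {
	const limit = 1024
	cases := []sizeCase{
		{"smaller than limit is cached", 512, true},
		{"equal to limit is cached", 1024, true},
		{"bigger than limit is skipped", 2048, false},
	}
	fmt.Println(len(runCases(limit, cases)), "failing cases")
}
```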

@sigua-cs sigua-cs force-pushed the chore/cache-max-payload-size branch 3 times, most recently from 3b196e8 to c2a7319 Compare September 7, 2022 15:37
@sigua-cs sigua-cs force-pushed the chore/cache-max-payload-size branch 3 times, most recently from 1829ae8 to 3afaba7 Compare September 7, 2022 19:54
main_test.go Outdated
cachedData, err := cc.Get(key)

if cachedData != nil || err == nil {
t.Fatal("skipped response from cache is expected")
Contributor

Suggested change
- t.Fatal("skipped response from cache is expected")
+ t.Fatal("response bigger than maxPayloadSize should not be cached")

main_test.go Outdated
Comment on lines 179 to 182
path := fmt.Sprintf("%s/cache/%s", testDir, key.String())
if _, err := os.Stat(path); err != nil {
	t.Fatalf("err while getting file %q info: %s", path, err)
}
Contributor

I don't understand why we should do this check here. IMO it's irrelevant, given that we retrieve from the cache right after.

Collaborator Author

removed

proxy.go Outdated
// Do not cache responses greater than max payload size.
if contentLength > int64(s.user.cache.MaxPayloadSize) {
cacheSkipped.With(labels).Inc()
log.Debugf("%s: Request will not be cached. Content length (%d) is greater than max payload size (%d)", s, contentLength, s.user.cache.MaxPayloadSize)
Contributor

I think it's worth logging this at info level.

Collaborator Author

Done

}
} else {
// Do not cache responses greater than max payload size.
if contentLength > int64(s.user.cache.MaxPayloadSize) {
Contributor

We should complete the ongoing transaction even if caching is skipped.

Collaborator Author

Done
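
A minimal sketch of the point raised above, under stated assumptions: the transaction for a cache key is closed on every path, including the one where the response is too big to store. `registry`, `txState`, and `handleResponse` are hypothetical simplifications, not the project's actual transaction registry API; the motivation assumed here is that anything checking the key should stop seeing it as pending.

```go
package main

import "fmt"

// txState and registry are hypothetical simplifications of a
// transaction registry keyed by cache key.
type txState int

const (
	txPending txState = iota
	txCompleted
)

type registry struct {
	states map[string]txState
}

func (r *registry) Create(key string)   { r.states[key] = txPending }
func (r *registry) Complete(key string) { r.states[key] = txCompleted }

// handleResponse reports whether the response was stored. The deferred
// Complete runs on both paths, so the transaction never stays pending
// just because the payload exceeded the limit.
func handleResponse(r *registry, key string, contentLength, maxPayloadSize int64, store func(string)) bool {
	defer r.Complete(key)

	if contentLength > maxPayloadSize {
		return false // skipped: too big to cache
	}
	store(key)
	return true
}

func main() {
	r := &registry{states: make(map[string]txState)}
	r.Create("big-query")

	stored := handleResponse(r, "big-query", 2048, 1024, func(string) {})
	fmt.Println(stored, r.states["big-query"] == txCompleted)
}
```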

@gontarzpawel
Contributor

I also don't think distinguishing HTTP vs HTTPS is necessary in the tests. Testing one is enough, given that we only want to check whether the max payload size is exceeded or not @sigua-cs

@sigua-cs
Collaborator Author

sigua-cs commented Sep 9, 2022

> I also don't think distinguishing HTTP vs HTTPS is necessary in the tests. Testing one is enough, given that we only want to check whether the max payload size is exceeded or not @sigua-cs

Removed HTTP tests. Only HTTPS will be used

Contributor

@gontarzpawel gontarzpawel left a comment

LGTM! Thanks @sigua-cs 🎉

@mga-chka mga-chka merged commit d257a95 into master Sep 9, 2022
@mga-chka mga-chka deleted the chore/cache-max-payload-size branch September 9, 2022 12:43