
Improve handling of concurrent identical queries #146

Merged
gontarzpawel merged 8 commits into master from fix/concurrent-identical-queries on Apr 21, 2022
Conversation

@gontarzpawel
Contributor

Description

In this PR I fix #51, with a slight modification to the proposed solution.

Details

Instead of simply refusing the request after the grace_time period, we refuse it once the elapsed time reaches half of the request's max execution time.

The rationale behind this is that either:

  • the identical query failed to register its completion (TransactionRegistry was unavailable), or
  • the configured grace time was too short

Further improvements

For the case of successful transaction completion:

// There are the following options:
// - the identical query failed
// - the operation to save the result in the cache failed
// To fix that behaviour we can add state information in the transaction registry. The operation should be expressed as:
// - created
// - failed
// - completed
// and we should store it with a TTL of at most 2 * max_execution_time

which I propose to handle in another PR, as it's a less likely scenario and requires more work.
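
To make the proposal concrete, here is a minimal sketch of what such a registry could look like. The type, method, and helper names (TransactionState, Create/Fail/Complete, registryTTL) are illustrative assumptions and not code from this PR; it assumes the existing Key type and the time package.

// Illustrative sketch only; names are hypothetical, not the chproxy API.
type TransactionState int

const (
	transactionCreated TransactionState = iota
	transactionFailed
	transactionCompleted
)

// TransactionRegistry records the state of an in-flight query so that
// concurrent identical queries can decide whether to wait, reuse the
// cached result, or fail fast.
type TransactionRegistry interface {
	Create(key *Key) error
	Fail(key *Key) error
	Complete(key *Key) error
	Status(key *Key) (TransactionState, error)
}

// Entries would be kept for at most 2 * max_execution_time so that a
// completed or failed record outlives any concurrent waiter.
func registryTTL(maxExecutionTime time.Duration) time.Duration {
	return 2 * maxExecutionTime
}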

proxy.go Outdated
//
// Because of those 2 options, we will fail the query if elapsed time is bigger than half of the max request timeout.
if transactionResult.ElapsedTime > s.user.maxExecutionTime / 2 {
respondWith(srw, err, http.StatusRequestTimeout)
Contributor

don't you think it may be weird for the end user to receive timeouts after 15 or 30 seconds?

Contributor Author

You're right. Perhaps it should be treated more as an internal server error. What do you think?

Contributor

Yes, and a message like "Won't be able to serve your query within ..."?

Collaborator
@mga-chka mga-chka Feb 23, 2022

Be careful: on new ClickHouse versions (including the one we use at Contentsquare), ClickHouse will time out after a few seconds if the forecasted execution time of the query is above the max_execution_time.

Contributor Author

Very good point. I'd propose that we rely on the idea of a state machine in the transactionRegistry. We would check the state of the query (e.g. Created, Failed, Completed) and make a reliable decision based on that. What do you think?

Collaborator

looks good, as long as the state contains a Timeout state (because the Failed state could be due to other factors)

@pull-request-size pull-request-size bot added size/L and removed size/M labels Mar 25, 2022
@gontarzpawel gontarzpawel force-pushed the fix/concurrent-identical-queries branch from a4b58e4 to eca3260 on April 3, 2022 08:27
@pull-request-size pull-request-size bot added size/XL and removed size/L labels Apr 3, 2022
@gontarzpawel gontarzpawel force-pushed the fix/concurrent-identical-queries branch 5 times, most recently from 879fa3a to fa9522a on April 3, 2022 19:59
@gontarzpawel gontarzpawel requested a review from mga-chka April 4, 2022 07:20
}

state, err := c.TransactionRegistry.Status(key)
if errors.Is(err, ErrMissingTransaction) {
Collaborator

[opinionated comment]
I don't see the relevance of the error ErrMissingTransaction because:

  • you're dealing with it the same way as the state transactionCompleted (and when you return the error ErrMissingTransaction you always return a transactionCompleted state)
  • it doesn't look like an error, since it's the nominal use case

If you want to dissociate ErrMissingTransaction from the state transactionCompleted, you could create a state like transactionNotPresent. Doing so would allow you to remove this part, since it would be handled by the logic of the if !state.IsPending() {...} part.

Contributor Author

Agree with you 👍
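
For reference, a minimal sketch of what that refactor could look like; the transactionNotPresent state and the surrounding code are illustrative assumptions, not the code merged in this PR.

// Hypothetical sketch: Status reports transactionNotPresent instead of
// returning ErrMissingTransaction, so absence is handled via IsPending().
state, err := c.TransactionRegistry.Status(key)
if err != nil {
	return err
}
// transactionNotPresent and transactionCompleted are both non-pending, so
// the dedicated ErrMissingTransaction branch is no longer needed.
if !state.IsPending() {
	// proceed: read the result from cache or execute the query
}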

// mark transaction as failed
// todo: discuss if we should mark it as failed upon timeout. The rationale against it would be to hope that
// partial results of the query are cached and therefore a subsequent execution can succeed
if err = userCache.Fail(key); err != nil {
Collaborator

IMHO you should dissociate timed-out queries from other error reasons, and CHProxy should only deal with timeout errors. Otherwise, let's say there is a transient issue on the cluster (like an AWS network outage for 3 sec): all the queries made during this downtime will be unavailable as long as they're in the transaction registry.

Contributor Author

I think it'll be hard to tell the difference between those types of errors because ClickHouse does not seem to handle it well. I've checked the status code it propagates in case of a socket timeout, and it's HTTP 503 Service Unavailable:

https://github.com/ClickHouse/ClickHouse/blob/f0774ca9ee1c6de7049fa10be0c67eeb822797a6/src/Server/HTTPHandler.cpp#L225

Collaborator

I'm not sure. I did a test using the HTTP client to get 2 different behaviours:

  • a timeout
  • a predicted timeout (so the query is cancelled by CH)

I got an HTTP 500 in both cases (which is weird because, according to the code you showed me, I was expecting a 503), but on the first one I got the answer:
Code: 159. DB::Exception: Timeout exceeded: elapsed 1.00059767 seconds, maximum: 1. (TIMEOUT_EXCEEDED) (version 22.2.2.1)
and on the second one I got:
Code: 160. DB::Exception: Estimated query execution time (220.8800604841333 seconds) is too long. Maximum: 30. Estimated rows to process: 18629040: While executing MergeTreeInOrder. (TOO_SLOW) (version 22.2.2.1)

So maybe we could try to fetch the error code in case of a 500 from CH to check whether it's a 159 or a 160. WDYT?
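
A rough sketch of that idea (not part of this PR): the helper name and regexp are illustrative, and it assumes the regexp and strconv packages are imported.

// Extract the "Code: NNN." error code from a ClickHouse error body and
// check whether it is a real timeout (159, TIMEOUT_EXCEEDED) or a
// predicted timeout (160, TOO_SLOW).
var chErrorCode = regexp.MustCompile(`Code:\s*(\d+)\.`)

func isCHTimeout(body string) bool {
	m := chErrorCode.FindStringSubmatch(body)
	if m == nil {
		return false
	}
	code, err := strconv.Atoi(m[1])
	if err != nil {
		return false
	}
	return code == 159 || code == 160
}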

Collaborator

OK, since this point is not critical we can leave it the way it is now.

@gontarzpawel gontarzpawel force-pushed the fix/concurrent-identical-queries branch from fa9522a to e4938c9 on April 8, 2022 08:04

func (c *AsyncCache) AwaitForConcurrentTransaction(key *Key) (TransactionResult, error) {
startTime := time.Now()
var seenState TransactionState
Collaborator

It might be more readable to initialize seenState to transactionCreated, because without knowing TransactionState you can't understand this part.
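
Something like the following, assuming transactionCreated is the name of the initial state constant:

// Make the starting state explicit so a reader does not need to know the
// zero value of TransactionState to follow the loop below.
seenState := transactionCreated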

proxy.go Outdated
}
// We will assess if elapsed time is bigger than half of the max request timeout.
// This precondition halts execution
if transaction.ElapsedTime > s.user.maxExecutionTime/2 {
Collaborator

After 2nd thought, this code seems risky because on a cluster with long queries we could have the following:

  • the concurrent-query handling was not working (because of the equality on pointers you spotted)
  • people update their CHProxy version and start to have "weird" errors

IMHO we should at least have a new conf option to activate this behavior, so that people know what they're doing when they activate it.

Another thing is that maybe maxExecutionTime/2 is too short because:

  • people may use the default CH 30 sec
  • let's say you have an APP (or function module) with slow queries (like 25 sec)
  • the end user refreshes their UI and fires a new query after 5 sec
  • they will get an error even if the first query was able to finish

Maybe the best would be something like s.user.maxExecutionTime - 2 sec if (maxExecutionTime >= 20 sec) or maxExecutionTime - 1 sec if ....

Contributor Author

For the first point: I don't understand how the pointer or long-query issue can be impacted by this predicate. Keep in mind that it's a branch in which the result of the concurrent query was either not found or exceeded grace_time. Now, it depends on the user how grace_time is defined.

We can indeed be conservative in our choice and set the predicate to:
transaction.ElapsedTime > s.user.maxExecutionTime - 1

Contributor Author
@gontarzpawel gontarzpawel Apr 11, 2022

Actually I have another proposal which would remove this condition. What if we removed the grace_time configuration and used max_exec_time in its place in the code, or at least deprecated it as a first step?

Collaborator

Regarding your first comment, my issue is that grace_time is at 5 sec by default, so someone not aware of this new feature could upgrade CHProxy and get weird failures if their workload is designed for slow (20+ sec) queries (it's maybe an edge case, since I have no insight into how people are using CH).

Regarding your second comment, I agree with you that grace_time should be equal to max_exec_time, and it would avoid this extra logic. Since max_exec_time is optional, we should use a default value of 30 sec (same as CH), wdyt?
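
A minimal sketch of that fallback; the helper name is hypothetical, and the 30-second default mirrors ClickHouse's default max_execution_time.

// effectiveGraceTime is an illustrative helper, not the merged code: if
// grace_time is omitted, fall back to the user's max_execution_time, and
// if that is also unset, use ClickHouse's default of 30 seconds.
func effectiveGraceTime(graceTime, maxExecutionTime time.Duration) time.Duration {
	const defaultMaxExecutionTime = 30 * time.Second
	if graceTime > 0 {
		return graceTime
	}
	if maxExecutionTime > 0 {
		return maxExecutionTime
	}
	return defaultMaxExecutionTime
}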

@gontarzpawel gontarzpawel requested a review from mga-chka April 20, 2022 07:08
// Deprecated: GraceTime is the duration before the expired entry is deleted from the cache.
// In future versions it will be replaced by the user's MaxExecutionTime.
// This is already the case today if the value of GraceTime is omitted.
GraceTime Duration `yaml:"grace_time,omitempty"`
Collaborator

don't forget to deprecate grace_time in the https://github.com/ContentSquare/chproxy/blob/master/config/README.md

func (c *AsyncCache) AwaitForConcurrentTransaction(key *Key) bool {
startTime := time.Now()
type TransactionResult struct {
ElapsedTime time.Duration
Collaborator

I'm being a bit picky, but since the duration is not used (anymore) outside the function AwaitForConcurrentTransaction, there is no need to keep this field.
In fact, we could remove the TransactionResult structure, and AwaitForConcurrentTransaction could return just a TransactionState.

Contributor Author

You're absolutely right!
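
A sketch of the simplified signature; the graceTime field, the transactionCreated constant, and the 100 ms poll interval are assumptions for illustration, not the merged code.

// Return the last observed state directly; the elapsed time is no longer
// needed by callers, so the TransactionResult wrapper can be dropped.
func (c *AsyncCache) AwaitForConcurrentTransaction(key *Key) (TransactionState, error) {
	startTime := time.Now()
	seenState := transactionCreated
	for time.Since(startTime) < c.graceTime {
		state, err := c.TransactionRegistry.Status(key)
		if err != nil {
			return seenState, err
		}
		seenState = state
		if !seenState.IsPending() {
			break
		}
		time.Sleep(100 * time.Millisecond)
	}
	return seenState, nil
}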

// in order to let the cleaner sweep the transaction
time.Sleep(100 * time.Millisecond)
if !transactionResult.State.IsPending() {
time.Sleep(150 * time.Millisecond)
Collaborator
@mga-chka mga-chka Apr 21, 2022

Good catch!
[It's not related to this PR, but it's a bit of a shame that the 100 msec frequency of the cleaner is hardcoded, because it makes the 150 msec choice in this test a bit magical for a newcomer. Moreover, if it were in a variable it would allow us to modify it for the tests and therefore speed up the tests.]

Contributor Author

I'll add a todo to refactor at some point.
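
A possible shape for that todo; the variable, method, and helper names below are hypothetical, not the actual chproxy cleaner code.

// cleanerInterval lives in a variable so tests can shorten it and the
// 150 ms sleep in the test above stops being a magic number.
var cleanerInterval = 100 * time.Millisecond

func (c *AsyncCache) runCleaner(stop <-chan struct{}) {
	ticker := time.NewTicker(cleanerInterval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			c.cleanExpiredTransactions() // hypothetical sweep helper
		case <-stop:
			return
		}
	}
}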

Collaborator
@mga-chka mga-chka left a comment

Overall it looks good.
I left a few comments to look at if you have time, but they're not blocking issues.
