feat: Synchronously return cached charts #15157
Conversation
Codecov Report
```diff
@@           Coverage Diff            @@
##           master   #15157    +/-   ##
==========================================
+ Coverage   77.09%   77.24%   +0.14%
==========================================
  Files         971      971
  Lines       50236    50343    +107
  Branches     6494     6148    -346
==========================================
+ Hits        38729    38887    +158
+ Misses      11302    11251     -51
  Partials      205      205
```
```python
# If the chart query has already been cached, return it immediately.
if already_cached_result:
    return self.send_chart_response(result)
```
can you add a test for this code path?
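A test for this path might look roughly like the sketch below. Note this is illustrative only: `FakeChartHandler`, `get_data`, and `enqueue_async_job` are hypothetical stand-ins, not Superset's actual view or test API; the sketch only shows the shape of the assertion (cached key short-circuits to a synchronous response, uncached key falls through to the async job).

```python
# Hypothetical sketch of a test for the sync cached-result code path.
# None of these names are Superset's real API; they are stand-ins that
# mirror the branch being reviewed above.

class FakeChartHandler:
    """Stand-in for the view handling chart data requests."""

    def __init__(self, cache):
        self.cache = cache

    def send_chart_response(self, result):
        # In the real code this serializes a chart payload; here we
        # just tag the result so the test can tell which path ran.
        return {"status": 200, "data": result}

    def enqueue_async_job(self, query):
        # Async path: return job metadata instead of chart data.
        return {"status": 202, "job_id": "fake-job", "channel_id": "fake-chan"}

    def get_data(self, cache_key, query):
        already_cached_result = self.cache.get(cache_key)
        if already_cached_result:
            # Cached: return synchronously, skipping the async queue.
            return self.send_chart_response(already_cached_result)
        return self.enqueue_async_job(query)


def test_cached_result_returned_synchronously():
    handler = FakeChartHandler(cache={"key1": [{"x": 1}]})
    response = handler.get_data("key1", query=None)
    assert response["status"] == 200
    assert response["data"] == [{"x": 1}]


def test_uncached_result_goes_async():
    handler = FakeChartHandler(cache={})
    response = handler.get_data("key1", query=None)
    assert response["status"] == 202
    assert "job_id" in response
```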
@benjreinhart I believe the alternative of returning already-cached results synchronously was discussed earlier in the GAQ development process, but it was decided to always use the async route. What is the motivation for returning already-cached results synchronously? (I don't oppose the decision, just curious to understand if there are perf or other reasons to do this.)
@villebro yeah, I think from an interface design PoV it's definitely cleaner to have consistent behavior / response structure, so I think you're right to question this change. The reasoning thus far has been performance related. We've noticed in our testing that there are times when the celery queue gets congested quickly and latency increases. A number of those tasks are to run queries in the background, and a subset of those tasks will end up returning the cached result. For the subset that is cached, this change will help take some load off the celery queues and also reduce the chances that users are stuck waiting for celery queues to process a task before they can see their chart render. @robdiciuccio may have some additional reasoning.

That being said, after typing that out, it does feel like the best solution is to get celery into a more predictable state with separate queues and resource allocation. I know @robdiciuccio has been working hard on that, so maybe there's an update there?
```diff
@@ -395,7 +395,16 @@ export function exploreJSON(
     if (isFeatureEnabled(FeatureFlag.GLOBAL_ASYNC_QUERIES)) {
       // deal with getChartDataRequest transforming the response data
       const result = 'result' in response ? response.result[0] : response;
-      return waitForAsyncData(result);
+      const awaitingChartData =
+        'job_id' in result && 'channel_id' in result;
```
Can we switch on HTTP status code instead here?
Yep, good suggestion!
We had this debate early on, whether pre-cached data should be returned synchronously. There are pros and cons here, as @benjreinhart pointed out: we're realizing potential performance gains while making the API slightly more complex.

That said, the responses from the API endpoints use different HTTP status codes for different use cases (200 for cached data, 202 for async job metadata). Personally, I feel it's acceptable for an API endpoint to have different status code responses for different use cases, but this quickly gets pretty subjective and philosophical.

The problem we're trying to solve with this change is avoiding chart/dashboard rendering delays due to (unnecessary) async job workflows for pre-cached data. Running Celery at scale without delays is *cough* difficult, and we're optimizing for user experience. Happy to continue the discussion and get additional viewpoints. I believe @ktmud and @etr2460 had comments on this in the initial implementation.
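As a rough sketch of the status-code convention described in this thread (200 for cached data, 202 for async job metadata): the function below returns plain `(status, body)` tuples so the branching is easy to see. The names, the payload shapes, and the `force_refresh` parameter are illustrative assumptions, not Superset's actual implementation.

```python
# Illustrative sketch only. Returns (status_code, body) tuples rather
# than real HTTP responses, to highlight the 200-vs-202 branching.

def chart_data_response(cache, cache_key, force_refresh=False):
    """Return cached chart data with 200, or async job metadata with 202."""
    if not force_refresh:
        cached = cache.get(cache_key)
        if cached is not None:
            # Pre-cached data: respond synchronously with the result.
            return 200, {"result": cached}
    # Not cached (or a forced refresh): hand off to the async workflow
    # and return the metadata a client needs to await the job.
    return 202, {"result": [{"job_id": "abc123", "channel_id": "chan-1"}]}
```

A client can then branch purely on the status code: 200 means the payload is chart data, 202 means it carries job metadata to poll or subscribe on.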
@benjreinhart @robdiciuccio thanks for the context. Like I said, I don't have a strong opinion here, and I agree that deferring an already-cached result to the async flow seems like an unnecessary hoop to jump through. Having said that, it could be nice to have the option to configure this behavior with a new config key.
These are the existing states we have:
There are also the cache retrieval endpoints, and a configurable transport that implies either more endpoints (polling) or a new service. In other words, there's already quite a bit going on here. It may be pretty small to make the change you're describing, @villebro, but my preference would be to keep things simpler unless it's truly needed, and I don't know that making this behavior configurable adds much value. Happy to consider making the change if you feel strongly or others feel differently.
@benjreinhart no, I don't feel strongly about this, so no problem with restricting the options and keeping configuration simple (it's easy to add this later if the need ever comes up). Btw I will try to actually look at the code tomorrow unless it gets merged by then 😄
/testenv up
@robdiciuccio Ephemeral environment spinning up at http://54.212.107.240:8080.
Code LGTM. I'm going to do some more testing locally with different configurations.
Tested sync and async query operation. LGTM with latest fix for forced cache refresh.
Ephemeral environment shut down and build artifacts deleted.
* feat: Synchronously return cached charts
* Fix lint issue
* Fix python lint error
* Change getChartDataRequest to return response
* Fix lint errors
* Add test
* explore_json: skip cached data check for forced refresh

Co-authored-by: Rob DiCiuccio <rob.diciuccio@gmail.com>
SUMMARY
This is for the global async queries feature. We'd like to return the query results immediately (synchronously) if they are cached, otherwise we'll kick off the background job to run the query and later notify clients.
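The client-side flow this implies can be sketched as follows. This is a hypothetical sketch, not Superset's frontend code: `fetch` and `wait_for_async_data` are stand-ins for the chart data request and the async-result wait, and the payload shapes are assumed from the discussion above (200 carries chart data, 202 carries job metadata).

```python
# Hypothetical client-side flow: render immediately on a cached (200)
# response, otherwise wait for the async job to complete (202).

def load_chart(fetch, wait_for_async_data):
    status, body = fetch()
    if status == 200:
        # Cached result was returned synchronously; use it directly.
        return body["result"]
    # 202: body carries async job metadata; block until the job finishes.
    return wait_for_async_data(body["result"][0])
```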
TESTING INSTRUCTIONS
Manual
ADDITIONAL INFORMATION
cc @robdiciuccio