-
Notifications
You must be signed in to change notification settings - Fork 11.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CloudWatch Logs: Queries in an expression should run synchronously #64443
Conversation
Hello @iwysiu!
Please, if the current pull request addresses a bug fix, label it with the |
Backend code coverage report for PR #64443
|
Frontend code coverage report for PR #64443 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool catch and makes sense to me! Might be good to get someone from plugins platform to weigh-in maybe? It feels like this an interesting edge case that could be at least better documented as part of our api or something? Otherwise I'm not sure how any other plugin dev would know to do this sort of thing you know?
Also lets add tests!
@@ -18,7 +18,7 @@ const ( | |||
alertPollPeriod = time.Second | |||
) | |||
|
|||
func (e *cloudWatchExecutor) executeLogAlertQuery(ctx context.Context, req *backend.QueryDataRequest) (*backend.QueryDataResponse, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that this is used for things beyond alerting should it go in a separate file, seems a bit weird to have it in a file called log_alert probably right?
Also sort of basic on my part but how would one know if they should use a Sync vs Async Query?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
renamed! We have the async queries so that the panels can get and display partial results, which the cloudwatch logs frontend handles and just keeps pulling results until it gets a done message, but the problem is that alerting and expressions want to consume the data for other things and expect the first dataQuery call to return a frame with all the data. We want to return a sync query whenever the first call needs to get the whole result.
@@ -227,6 +229,7 @@ func (dn *DSNode) Execute(ctx context.Context, now time.Time, _ mathexp.Vars, s | |||
}, | |||
Headers: dn.request.Headers, | |||
} | |||
req.Headers[FromExpressionHeaderName] = "true" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how do other datasources deal with this problem? It feels worth asking grafana plugins platform folks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Asked! But honestly I think that Cloudwatch logs, redshift and athena are the only async datasources.
@@ -21,6 +21,8 @@ var ( | |||
logger = log.New("expr") | |||
) | |||
|
|||
const FromExpressionHeaderName = "FromExpression" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is a header, we generally do X-Grafana
, perhaps X-Grafana-From-SSE
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I'm not sure these become HTTP headers, and I guess it matches FromAlert, so okay to leave it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool catch and makes sense to me! Might be good to get someone from plugins platform to weigh-in maybe? It feels like this an interesting edge case that could be at least better documented as part of our api or something? Otherwise I'm not sure how any other plugin dev would know to do this sort of thing you know?
It would probably make sense to document this somewhere. I actually don't think that other people are doing async datasources, at least right now, but other people might. Also, tests added! I needed to make one of the functions mock-able to do it.
@@ -18,7 +18,7 @@ const ( | |||
alertPollPeriod = time.Second | |||
) | |||
|
|||
func (e *cloudWatchExecutor) executeLogAlertQuery(ctx context.Context, req *backend.QueryDataRequest) (*backend.QueryDataResponse, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
renamed! We have the async queries so that the panels can get and display partial results, which the cloudwatch logs frontend handles and just keeps pulling results until it gets a done message, but the problem is that alerting and expressions want to consume the data for other things and expect the first dataQuery call to return a frame with all the data. We want to return a sync query whenever the first call needs to get the whole result.
@@ -227,6 +229,7 @@ func (dn *DSNode) Execute(ctx context.Context, now time.Time, _ mathexp.Vars, s | |||
}, | |||
Headers: dn.request.Headers, | |||
} | |||
req.Headers[FromExpressionHeaderName] = "true" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Asked! But honestly I think that Cloudwatch logs, redshift and athena are the only async datasources.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀
This pull request was removed from the 9.4.5 milestone because 9.4.5 is currently being released. |
… synchronously (#64443)" (#65067) CloudWatch Logs: Revert "Queries in an expression should run synchronously (#64443)" (#65036) Revert "CloudWatch Logs: Queries in an expression should run synchronously (#64443)" This reverts commit 74436d3. (cherry picked from commit 972e611) Co-authored-by: Shirley <4163034+fridgepoet@users.noreply.github.com>
Cloudwatch Logs queries are generally run asynchronously in parts that are managed by the frontend (the frontend sends separate requests to start the query and then to poll for results). However, when a expression is used, it expects the query results to be returned from the first request. This pr adds a flag to the expression query that signals to the cloudwatch datasource that it should run the entire query synchronously in the backend and return the results (which we already handle for alerting).
The fixed bug can be reproed in the external dev as below:
Which issue(s) does this PR fix?:
Fixes #63735
Special notes for your reviewer:
Note that this doesn't solve the separate "wide format" multiple time series issue #63388, that will still require changing how we create data frames.
Also I suspect that this may be an issue in async redshift/athena.