Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle unique query id on server #9648

Merged
merged 1 commit into from Oct 26, 2022

Conversation

Jackie-Jiang
Copy link
Contributor

This PR reverts the special broker negative request id handling for hybrid table introduced in #9443, and move the handling to the server side to avoid all the logging change related to the request id.
On the server side, the unique query id is generated as <brokerId>_<requestId>_(O|R) for OFFLINE and REALTIME request correspondingly. For backward compatibility, when trying to cancel a query without the type suffix, server will look up both OFFLINE and REALTIME query and cancel both of them.
One behavior change is that when looking up the running queries on the server side, the returned query ids will have type suffix. This should be desired to reflect the actual queries running.

@codecov-commenter
Copy link

codecov-commenter commented Oct 25, 2022

Codecov Report

Merging #9648 (d2a481e) into master (b2da310) will decrease coverage by 34.78%.
The diff coverage is 35.48%.

@@              Coverage Diff              @@
##             master    #9648       +/-   ##
=============================================
- Coverage     62.83%   28.05%   -34.79%     
+ Complexity     5162       53     -5109     
=============================================
  Files          1935     1936        +1     
  Lines        103815   103816        +1     
  Branches      15758    15754        -4     
=============================================
- Hits          65236    29123    -36113     
- Misses        33708    71817    +38109     
+ Partials       4871     2876     -1995     
Flag Coverage Δ
integration1 25.88% <25.80%> (+0.14%) ⬆️
integration2 24.43% <35.48%> (-0.18%) ⬇️
unittests1 ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...roker/requesthandler/BaseBrokerRequestHandler.java 60.11% <0.00%> (+0.80%) ⬆️
...ache/pinot/server/api/resources/QueryResource.java 0.00% <0.00%> (ø)
...rg/apache/pinot/core/query/utils/QueryIdUtils.java 25.00% <25.00%> (ø)
...e/pinot/core/query/request/ServerQueryRequest.java 86.66% <80.00%> (-1.14%) ⬇️
...roker/requesthandler/GrpcBrokerRequestHandler.java 86.95% <100.00%> (+1.85%) ⬆️
...core/query/executor/ServerQueryExecutorV1Impl.java 76.34% <100.00%> (-6.52%) ⬇️
...a/org/apache/pinot/core/transport/QueryRouter.java 75.55% <100.00%> (-8.15%) ⬇️
...in/java/org/apache/pinot/spi/utils/BytesUtils.java 0.00% <0.00%> (-100.00%) ⬇️
...java/org/apache/pinot/spi/trace/BaseRecording.java 0.00% <0.00%> (-100.00%) ⬇️
...java/org/apache/pinot/spi/trace/NoOpRecording.java 0.00% <0.00%> (-100.00%) ⬇️
... and 1099 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@@ -201,23 +191,12 @@ public boolean cancelQuery(long requestId, int timeoutMs, Executor executor, Htt
if (queryServers == null) {
return false;
}
// TODO: Use different global query id for OFFLINE and REALTIME table after releasing 0.12.0. See QueryIdUtils for
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we going to log different global query ids(or traceId as mentioned in this PR) for the same query in the broker log as well? That can help triage the queries in different components in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this change, we will always log the same request id on both broker and servers. Query id and trace id are only available within the server, so you won't find them on the broker.

Copy link
Contributor

@jackjlli jackjlli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@jadami10 jadami10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for reverting the breaking stuff here. I'm going to think about this some more and see if there's a more generic solution here. Not a big fan of <broker>_<server>_<O|R> as some magical string, but doing some interface RequestID seems like overkill at the moment

@@ -65,10 +67,20 @@ public class QueryResource {
@ApiResponse(code = 404, message = "Query not found running on the server")
})
public String cancelQuery(
@ApiParam(value = "QueryId as in the format of <brokerId>_<requestId>", required = true) @PathParam("queryId")
String queryId) {
@ApiParam(value = "QueryId as in the format of <brokerId>_<requestId> or <brokerId>_<requestId>_(O|R)",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: it might make more sense for consistency with other APIs to make the offline/realtime part another optional API param. The default behavior has always been to cancel both, so I don't think there's backwards compatibility to worry about

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This API is on the server side, which is meant to be called by the broker. We need the type suffix to separate the OFFLINE and REALTIME query on the server side. Since they are tracked as 2 separate queries, it is better to give finer control.
On the broker side, the query cancellation will take the request id and always cancel both queries. When the server is upgraded, broker can cancel OFFLINE queries and REALTIME queries individually because they might fanout to different servers.

@Jackie-Jiang Jackie-Jiang merged commit 103e6cb into apache:master Oct 26, 2022
@Jackie-Jiang Jackie-Jiang deleted the unique_query_id branch October 26, 2022 22:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants