Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove query status report from BE when query is cancelled normally #1489

Merged

Conversation

morningman
Copy link
Contributor

When query result reach limit, the Coordinator in FE will send a cancel
request to BE to cancel the query. And when being cancelled, BE will report
query status to FE for debug purpose. But actually it is not necessary
and will generate too many logs.

So I add a CancelReason to distinguish the difference between 'normally'
cancellation and 'internal error' cancellation. if 'normally' cancelled,
no status will be reported to FE.

When query reach limit, or user cancel it actively, it is being cancelled 'normally'.
Otherwise, the query is cancelled due to internal error, which will need
a report from BE.

When query result reach limit, the Coordinator in FE will send a cancel
request to BE to cancel the query. And when being cancelled, BE will report
query status to FE for debug purpose. But actually it is not necessary
and will generate too many logs.

So I add a CancelReason to distinguish the difference between 'normally'
cancellation and 'internal error' cancellation. if 'normally' cancelled,
no status will be reported from BE.

When query reach limit, or user cancel it actively, it is being cancelled 'normally'.
Otherwise, the query is cancelled due to internal error, which will need
a report from BE.
@@ -116,6 +116,12 @@ message PExecPlanFragmentResult {

message PCancelPlanFragmentRequest {
required PUniqueId finst_id = 1;
enum CancelReason {
LIMIT_REACH = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reserve 0, start from 1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@@ -116,6 +116,12 @@ message PExecPlanFragmentResult {

message PCancelPlanFragmentRequest {
required PUniqueId finst_id = 1;
enum CancelReason {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to move CancelReason out of PCancelPlanFragmentRequest. Because fragment can be cancelled in other format. For example when Fragment is timed out, cancel thread will cancel this instance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok


// The reason of cancellation of this query.
// This is used for telling BE whether it need to report query status when being cancelled.
public enum CancelReason {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because you have defined in protobuf, we should use it to avoid define one thing in different places.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

std::lock_guard<std::mutex> l(_status_lock);
RETURN_IF_ERROR(_exec_status);
if (reason != PCancelPlanFragmentRequest::INTERNAL_ERROR) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to use equal other than not equal here. Because when we add more types of error later, we should let it report in default.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@@ -111,7 +112,7 @@ private synchronized PBackendService getProxy(TNetworkAddress address) {
}

public Future<PCancelPlanFragmentResult> cancelPlanFragmentAsync(
TNetworkAddress address, TUniqueId finstId) throws RpcException {
TNetworkAddress address, TUniqueId finstId, CancelReason cancelReason) throws RpcException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This cancelReason is not used?

for (BackendExecState backendExecState : backendExecStates) {
TNetworkAddress address = backendExecState.getBackendAddress();
LOG.info("cancelRemoteFragments initiated={} done={} hasCanceled={} ip={} port={} fragment instance id={}",
LOG.info("cancelRemoteFragments initiated={} done={} hasCanceled={} ip={} port={} fragment instance id={}, reason: {}",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this log can be debug level

@@ -114,8 +114,16 @@ message PExecPlanFragmentResult {
required PStatus status = 1;
};

enum PCancelReason {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PFragmentCancelReason?

Copy link
Contributor

@imay imay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@morningman morningman closed this Jul 19, 2019
@morningman morningman reopened this Jul 19, 2019
@morningman morningman merged commit 556299a into apache:master Jul 19, 2019
morningman added a commit to morningman/doris that referenced this pull request Jul 22, 2019
…pache#1489)

When query result reach limit, the Coordinator in FE will send a cancel
request to BE to cancel the query. And when being cancelled, BE will report
query status to FE for debug purpose. But actually it is not necessary
and will generate too many logs.

So I add a CancelReason to distinguish the difference between 'normally'
cancellation and 'internal error' cancellation. if 'normally' cancelled,
no status will be reported from BE.

When query reach limit, or user cancel it actively, it is being cancelled 'normally'.
Otherwise, the query is cancelled due to internal error, which will need
a report from BE.
@imay imay mentioned this pull request Sep 26, 2019
luwei16 added a commit to luwei16/incubator-doris that referenced this pull request Apr 7, 2023
…ster (apache#1489)

When there are pure compute plan nodes, such as join nodes, in a plan, the original impl. may assign a BE to the node which does not belong to the current cloud cluster. It leads to incomplete isolation of clusters.
SWJTU-ZhangLei pushed a commit to SWJTU-ZhangLei/incubator-doris that referenced this pull request Jul 25, 2023
…ster (apache#1489)

When there are pure compute plan nodes, such as join nodes, in a plan, the original impl. may assign a BE to the node which does not belong to the current cloud cluster. It leads to incomplete isolation of clusters.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants