Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix DISTINCT + LIMIT queries #1390

Closed

Conversation

mbasmanova
Copy link
Contributor

@mbasmanova mbasmanova commented Apr 9, 2022

Limit operator finishes early if the limit is reached before the operator
received all the input. In this case, the Driver closes all upstream operators
and finished early. If this is the only Driver in the Task, the Task finishes early
as well. However, when there are multiple pipelines, upstream pipelines need
to be explicitly closed.

This change uses Task::terminate to finish upstream pipelines early when
output pipeline is finished and output has been consumed. This change
applies only to un-grouped execution. With this change
SELECT DISTINCT c FROM t LIMIT 10 Presto queries finish quickly even
when 't' is a very large table. Before this change such queries would finish
only after all splits for 't' are enumerated and sent to workers, which could
take minutes.

A more generic change would be to introduce a mechanism similar to
Task::terminate which can finish early only a subset of pipelines in a Task,
not all pipelines. This mechanism then could be used with grouped execution
to finish early pipelines only for a given split group. It could also be used to
finish early upstream pipelines when a pipeline other than output pipeline
finishes early.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 9, 2022
@facebook-github-bot
Copy link
Contributor

@mbasmanova has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@mbasmanova mbasmanova force-pushed the fix-limit-over-exchange branch 2 times, most recently from 8a371c0 to 45f5602 Compare April 11, 2022 13:48
@facebook-github-bot
Copy link
Contributor

@mbasmanova has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@mbasmanova has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@mbasmanova mbasmanova force-pushed the fix-limit-over-exchange branch 2 times, most recently from c85d9b7 to fbd2e8f Compare April 11, 2022 16:15
@facebook-github-bot
Copy link
Contributor

@mbasmanova has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@mbasmanova mbasmanova marked this pull request as ready for review April 11, 2022 18:54
@facebook-github-bot
Copy link
Contributor

@mbasmanova has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

auto planNodeIdGenerator = std::make_shared<PlanNodeIdGenerator>();
core::PlanNodeId scanNodeId;

CursorParameters params;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I set params.maxDrivers = 4; and it doesn't seem to terminate. Is that expected?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you set params.maxDrivers, then you need to set params.numResultDrivers = 1. Would you try that?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

setting params.numResultDrivers = 1; works.

numRead += vector->size();
}

// Do not send no-more-splits message. Expect the task to finish without
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel we should have a timeout here that throws an error. The current behavior does not terminate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice for tests to have a generic timeout functionality, but looks like GoggleTest doesn't offer that. Some folks suggested using timeout support in ctest.

https://stackoverflow.com/questions/16942154/how-to-add-timeout-to-test-when-using-google-testing-framework

https://cmake.org/cmake/help/v3.8/prop_test/TIMEOUT.html

Should we look into these options?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried setting set_tests_properties(velox_exec_test PROPERTIES TIMEOUT 1) and without params.numResultDrivers = 1;. Strangely it did not timeout.
Probably adding a timeout can be a separate PR.

Summary:
Limit operator finishes early if the limit is reached before the operator
received all the input. In this case, the Driver closes all upstream operators
and finished early. If this is the only Driver in the Task, the Task finishes early
as well. However, when there are multiple pipelines, upstream pipelines need
to be explicitly closed.

This change uses Task::terminate to finish upstream pipelines early when
output pipeline is finished and output has been consumed. This change
applies only to un-grouped execution. With this change
`SELECT DISTINCT c FROM t LIMIT 10` Presto queries finish quickly even
when 't' is a very large table. Before this change such queries would finish
only after all splits for 't' are enumerated and sent to workers, which could
take minutes.

A more generic change would be to introduce a mechanism similar to
Task::terminate which can finish early only a subset of pipelines in a Task,
not all pipelines. This mechanism then could be used with grouped execution
to finish early pipelines only for a given split group. It could also be used to
finish early upstream pipelines when a pipeline other than output pipeline
finishes early.

Pull Request resolved: facebookincubator#1390

Reviewed By: oerling

Differential Revision: D35518213

Pulled By: mbasmanova

fbshipit-source-id: 5b3bc695a1ce6abb6dfd41a665bc77f5ee8d4591
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D35518213

shiyu-bytedance pushed a commit to shiyu-bytedance/velox-1 that referenced this pull request Aug 18, 2022
Summary:
Limit operator finishes early if the limit is reached before the operator
received all the input. In this case, the Driver closes all upstream operators
and finished early. If this is the only Driver in the Task, the Task finishes early
as well. However, when there are multiple pipelines, upstream pipelines need
to be explicitly closed.

This change uses Task::terminate to finish upstream pipelines early when
output pipeline is finished and output has been consumed. This change
applies only to un-grouped execution. With this change
`SELECT DISTINCT c FROM t LIMIT 10` Presto queries finish quickly even
when 't' is a very large table. Before this change such queries would finish
only after all splits for 't' are enumerated and sent to workers, which could
take minutes.

A more generic change would be to introduce a mechanism similar to
Task::terminate which can finish early only a subset of pipelines in a Task,
not all pipelines. This mechanism then could be used with grouped execution
to finish early pipelines only for a given split group. It could also be used to
finish early upstream pipelines when a pipeline other than output pipeline
finishes early.

Pull Request resolved: facebookincubator#1390

Reviewed By: oerling

Differential Revision: D35518213

Pulled By: mbasmanova

fbshipit-source-id: 1cd754b315605d0b959440dff5d34508d0d6668b
marin-ma pushed a commit to marin-ma/velox-oap that referenced this pull request Dec 15, 2023
…ng issue (facebookincubator#1390)

What changes were proposed in this pull request?
(Please fill in changes proposed in this fix)

(Fixes: facebookincubator#1389)

How was this patch tested?
(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)

(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants