Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test NestedLoopJoin without filter in join fuzzer #9923

Closed
wants to merge 1 commit into from

Conversation

kagamiori
Copy link
Contributor

Summary:
A correctness bug was found recently in NestedLoopJoin without filter (#9892).
So this diff extends join fuzzer to cover this case with 10% chance.

Differential Revision: D57761514

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label May 24, 2024
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D57761514

Copy link

netlify bot commented May 24, 2024

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit 9a527c9
🔍 Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/665f59772074300008410b48

kagamiori added a commit to kagamiori/velox that referenced this pull request May 24, 2024
…9923)

Summary:

A correctness bug was found recently in NestedLoopJoin without filter (facebookincubator#9892). 
So this diff extends join fuzzer to cover this case with 10% chance.

Differential Revision: D57761514
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D57761514

kagamiori added a commit to kagamiori/velox that referenced this pull request May 24, 2024
…9923)

Summary:

A correctness bug was found recently in NestedLoopJoin without filter (facebookincubator#9892). 
So this diff extends join fuzzer to cover this case with 10% chance. This diff 
also makes the build side input to contain rows that do not match the probe side.

Differential Revision: D57761514
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D57761514

kagamiori added a commit to kagamiori/velox that referenced this pull request May 24, 2024
…9923)

Summary:

A correctness bug was found recently in NestedLoopJoin without filter (facebookincubator#9892). 
So this diff extends join fuzzer to cover this case with 10% chance. This diff 
also makes the build side input to contain rows that do not match the probe side.

Differential Revision: D57761514
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D57761514

kagamiori added a commit to kagamiori/velox that referenced this pull request May 24, 2024
…9923)

Summary:

A correctness bug was found recently in NestedLoopJoin without filter (facebookincubator#9892). 
So this diff extends join fuzzer to cover this case with 10% chance. This diff 
also makes the build side input to contain rows that do not match the probe side.

Differential Revision: D57761514
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D57761514

kagamiori added a commit to kagamiori/velox that referenced this pull request May 29, 2024
…9923)

Summary:

A correctness bug was found recently in NestedLoopJoin without filter (facebookincubator#9892). 
So this diff extends join fuzzer to cover this case with 10% chance. This diff 
also makes the build side input to contain rows that do not match the probe side.

Differential Revision: D57761514
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D57761514

kagamiori added a commit to kagamiori/velox that referenced this pull request May 29, 2024
…9923)

Summary:

A correctness bug was found recently in NestedLoopJoin without filter (facebookincubator#9892). 
So this diff extends join fuzzer to cover this case with 10% chance. This diff 
also makes the build side input to contain rows that do not match the probe side.

Differential Revision: D57761514
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D57761514

Copy link
Contributor

@xiaoxmeng xiaoxmeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kagamiori LGTM % comments. Thanks!

for (auto i = 0; i < numRows; ++i) {
rowNumbers[i] = randInt(0, probe->size() - 1);
if (vectorFuzzer_.coinToss(0.8) && probe->size() > 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we need the additional randomness processing here? Does it help catch any new bug? thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @xiaoxmeng, yes, when generating a build input row, the original code randomly chooses an existing probe key value as the build key value. Then when the fuzzer tests right join and full join, there is always no unmatched rows from the build side. This PR makes the build input contain existing probe key values for 80% chance, and random build key values for the rest 20% so that there are some unmatched rows in the build input.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe comment this in the code. Thanks!

const std::vector<std::string>& buildKeys,
const std::vector<RowVectorPtr>& probeInput,
const std::vector<RowVectorPtr>& buildInput) {
VELOX_CHECK(probeInput.size() > 0 && buildInput.size() > 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VELOX_CHECK_GE(probeInput.size(), 0);
VELOX_CHECK_GE(buildInput.size(), 0);

VELOX_CHECK(probeInput.size() > 0 && buildInput.size() > 0);
const auto probeType = asRowType(probeInput[0]->type());
const auto buildType = asRowType(buildInput[0]->type());
auto outputColumns =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we generate the output columns as the other join type does?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually the same as how outputColumns is generated at line 1244 that is used for other join types. The code here is simpler because we already know the join type is not LeftSemiProjectJoin, LeftSemiFilterJoin, or AntiJoin. We only call testCrossProduct() when the join type is LeftJoin, FullJoin, or InnerJoin.

buildKeys,
flatProbeInput,
flatBuildInput);
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we want to return here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

testCrossProduct() executes a nested loop join with no join condition. Hash join and merge join cannot execute join with no join condition, so there is no need to continue executing alternative join plans for this iteration.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then shall we remove core::isInnerJoin(joinType) || core::isLeftJoin(joinType) in condition above? Thanks!

kagamiori added a commit to kagamiori/velox that referenced this pull request May 31, 2024
…9923)

Summary:

A correctness bug was found recently in NestedLoopJoin without filter (facebookincubator#9892). 
So this diff extends join fuzzer to cover this case with 10% chance. This diff 
also makes the build side input to contain rows that do not match the probe side.

Differential Revision: D57761514
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D57761514

kagamiori added a commit to kagamiori/velox that referenced this pull request May 31, 2024
…9923)

Summary:

A correctness bug was found recently in NestedLoopJoin without filter (facebookincubator#9892). 
So this diff extends join fuzzer to cover this case with 10% chance. This diff 
also makes the build side input to contain rows that do not match the probe side.

Differential Revision: D57761514
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D57761514

Copy link
Contributor

@xiaoxmeng xiaoxmeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kagamiori LGTM % removing the return. Thanks!

@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D57761514

kagamiori added a commit to kagamiori/velox that referenced this pull request Jun 1, 2024
…9923)

Summary:

A correctness bug was found recently in NestedLoopJoin without filter (facebookincubator#9892). 
So this diff extends join fuzzer to cover this case with 10% chance. This diff 
also makes the build side input to contain rows that do not match the probe side.

Reviewed By: xiaoxmeng

Differential Revision: D57761514
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D57761514

kagamiori added a commit to kagamiori/velox that referenced this pull request Jun 4, 2024
…9923)

Summary:

A correctness bug was found recently in NestedLoopJoin without filter (facebookincubator#9892). 
So this diff extends join fuzzer to cover this case with 10% chance. This diff 
also makes the build side input to contain rows that do not match the probe side.

Reviewed By: xiaoxmeng

Differential Revision: D57761514
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D57761514

kagamiori added a commit to kagamiori/velox that referenced this pull request Jun 4, 2024
…9923)

Summary:

A correctness bug was found recently in NestedLoopJoin without filter (facebookincubator#9892). 
So this diff extends join fuzzer to cover this case with 10% chance. This diff 
also makes the build side input to contain rows that do not match the probe side.

Reviewed By: xiaoxmeng

Differential Revision: D57761514
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D57761514

kagamiori added a commit to kagamiori/velox that referenced this pull request Jun 4, 2024
…9923)

Summary:

A correctness bug was found recently in NestedLoopJoin without filter (facebookincubator#9892). 
So this diff extends join fuzzer to cover this case with 10% chance. This diff 
also makes the build side input to contain rows that do not match the probe side.

Reviewed By: xiaoxmeng

Differential Revision: D57761514
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D57761514

kagamiori added a commit to kagamiori/velox that referenced this pull request Jun 4, 2024
…9923)

Summary:

A correctness bug was found recently in NestedLoopJoin without filter (facebookincubator#9892). 
So this diff extends join fuzzer to cover this case with 10% chance. This diff 
also makes the build side input to contain rows that do not match the probe side.

Reviewed By: xiaoxmeng

Differential Revision: D57761514
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D57761514

kagamiori added a commit to kagamiori/velox that referenced this pull request Jun 4, 2024
…9923)

Summary:

A correctness bug was found recently in NestedLoopJoin without filter (facebookincubator#9892). 
So this diff extends join fuzzer to cover this case with 10% chance. This diff 
also makes the build side input to contain rows that do not match the probe side.

Reviewed By: xiaoxmeng

Differential Revision: D57761514
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D57761514

…9923)

Summary:

A correctness bug was found recently in NestedLoopJoin without filter (facebookincubator#9892). 
So this diff extends join fuzzer to cover this case with 10% chance. This diff 
also makes the build side input to contain rows that do not match the probe side.

Reviewed By: xiaoxmeng

Differential Revision: D57761514
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D57761514

@facebook-github-bot
Copy link
Contributor

This pull request has been merged in cda5d8a.

Copy link

Conbench analyzed the 1 benchmark run on commit cda5d8a2.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details.

Joe-Abraham pushed a commit to Joe-Abraham/velox that referenced this pull request Jun 7, 2024
…9923)

Summary:
Pull Request resolved: facebookincubator#9923

A correctness bug was found recently in NestedLoopJoin without filter (facebookincubator#9892).
So this diff extends join fuzzer to cover this case with 10% chance. This diff
also makes the build side input to contain rows that do not match the probe side.

Reviewed By: xiaoxmeng

Differential Revision: D57761514

fbshipit-source-id: 4a134ac315ea033ec1b66b874533b79c60437e07
Joe-Abraham pushed a commit to Joe-Abraham/velox that referenced this pull request Jun 7, 2024
…9923)

Summary:
Pull Request resolved: facebookincubator#9923

A correctness bug was found recently in NestedLoopJoin without filter (facebookincubator#9892).
So this diff extends join fuzzer to cover this case with 10% chance. This diff
also makes the build side input to contain rows that do not match the probe side.

Reviewed By: xiaoxmeng

Differential Revision: D57761514

fbshipit-source-id: 4a134ac315ea033ec1b66b874533b79c60437e07
deepashreeraghu pushed a commit to deepashreeraghu/velox that referenced this pull request Jun 13, 2024
…9923)

Summary:
Pull Request resolved: facebookincubator#9923

A correctness bug was found recently in NestedLoopJoin without filter (facebookincubator#9892).
So this diff extends join fuzzer to cover this case with 10% chance. This diff
also makes the build side input to contain rows that do not match the probe side.

Reviewed By: xiaoxmeng

Differential Revision: D57761514

fbshipit-source-id: 4a134ac315ea033ec1b66b874533b79c60437e07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. fb-exported Merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants