Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

physicalplan: add support for multi-stage execution of corr, covar_samp, sqrdiff, and regr_count aggregate functions. #76754

Merged
merged 1 commit into from Feb 25, 2022

Conversation

mneverov
Copy link
Contributor

@mneverov mneverov commented Feb 17, 2022

Fixes: #58347.

Release note (performance improvement): corr, covar_samp, sqrdiff, and
regr_count aggregate functions are now evaluated more efficiently in a
distributed setting

@blathers-crl
Copy link

blathers-crl bot commented Feb 17, 2022

Thank you for contributing to CockroachDB. Please ensure you have followed the guidelines for creating a PR.

My owl senses detect your PR is good for review. Please keep an eye out for any test failures in CI.

I have added a few people who may be able to assist in reviewing:

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl blathers-crl bot added O-community Originated from the community X-blathers-triaged blathers was able to find an owner labels Feb 17, 2022
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor Author

@mneverov mneverov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)


pkg/sql/sem/builtins/aggregate_builtins.go, line 3698 at r1 (raw file):

	dd := res.(*tree.DDecimal)
	_, err = tree.DecimalCtx.Round(&dd.Decimal, &a.sqrDiff)

We need to round here to decimal context to get rid of differences caused by reordering (discussed previously).


pkg/sql/sem/builtins/aggregate_builtins.go, line 4046 at r1 (raw file):

		return tree.DNull, nil
	}
	sqrDiff, err := a.agg.intermediateResult()

variance and var_pop use sqrdiff as intermediate calculations, so no rounding here.


pkg/sql/logictest/testdata/logic_test/aggregate, line 1520 at r1 (raw file):


query FFFFF
SELECT covar_pop(y, x), covar_pop(int_y, int_x), covar_pop(y, int_x), covar_pop(int_y, x), round(covar_pop(dy, dx), 7)

I split the covar_pop and covar_samp because logic test failed with fakedist-* configs. To reproduce the select should look like:

select covar_pop(y, x), covar_samp(y, x)
from statistics_agg_test

i.e. only decimal calculations were affected. I tried to combine different functions like regr_sxx and regr_sxy and tests also failed.
Should the tests run for each function separately?

@blathers-crl
Copy link

blathers-crl bot commented Feb 17, 2022

Thank you for updating your pull request.

My owl senses detect your PR is good for review. Please keep an eye out for any test failures in CI.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I think you're almost ready to close that meta issue :)

Reviewed 7 of 9 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mneverov and @yuzefovich)


-- commits, line 5 at r2:
nit: I think these are the last remaining aggregates, so you could use Fixes here to automatically close the issue when the PR merges.


pkg/sql/sem/builtins/aggregate_builtins.go, line 3698 at r1 (raw file):

Previously, mneverov (Max Neverov) wrote…

We need to round here to decimal context to get rid of differences caused by reordering (discussed previously).

nit: consider adding this note as a comment in the code.


pkg/sql/sem/builtins/aggregate_builtins.go, line 256 at r2 (raw file):

	)),

	// The input signature is: SQDIFF, SUM, COUNT

nit: s/SQDIFF/SQRDIFF/.


pkg/sql/logictest/testdata/logic_test/aggregate, line 1520 at r1 (raw file):

Previously, mneverov (Max Neverov) wrote…

I split the covar_pop and covar_samp because logic test failed with fakedist-* configs. To reproduce the select should look like:

select covar_pop(y, x), covar_samp(y, x)
from statistics_agg_test

i.e. only decimal calculations were affected. I tried to combine different functions like regr_sxx and regr_sxy and tests also failed.
Should the tests run for each function separately?

Huh, this is surprising to me. Do you have an idea for the reasons of such behavior?

Copy link
Contributor Author

@mneverov mneverov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)


pkg/sql/logictest/testdata/logic_test/aggregate, line 1520 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Huh, this is surprising to me. Do you have an idea for the reasons of such behavior?

I did some experiments with Reset function as I thought it was the cause, but results were the same. I also tried different combinations of the aggregate functions. I don't have a theory except that there is a bug in the aggregate functions implementation. I'll try to hunt it down.

@blathers-crl
Copy link

blathers-crl bot commented Feb 21, 2022

Thank you for updating your pull request.

Before a member of our team reviews your PR, I have some potential action items for you:

  • We notice you have more than one commit in your PR. We try break logical changes into separate commits, but commits such as "fix typo" or "address review commits" should be squashed into one commit and pushed with --force
  • Please ensure your git commit message contains a release note.
  • When CI has completed, please ensure no errors have appeared.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

Copy link
Contributor Author

@mneverov mneverov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)


pkg/sql/logictest/testdata/logic_test/aggregate, line 1520 at r1 (raw file):

Previously, mneverov (Max Neverov) wrote…

I did some experiments with Reset function as I thought it was the cause, but results were the same. I also tried different combinations of the aggregate functions. I don't have a theory except that there is a bug in the aggregate functions implementation. I'll try to hunt it down.

I found that regressionAccumulatorDecimalBase sxx, syy, and sxy fields get corrupted. To reproduce:

make testlogic FILES=aggregate SUBTESTS='corrupt_combine'

combine BEFORE n1=2 sx1=35 sxx1=112.5 sy1=3 syy1=0.5 sxy1=7.5
combine BEFORE n2=1 sx2=35 sxx2=0 sy2=3 syy2=0 sxy2=0
combine AFTER n1=3 sx1=7E+1 sxx1=316.66666666666666667 sy1=6 syy1=2 sxy1=25
combine AFTER n2=1 sx2=35 sxx2=0 sy2=3 syy2=0 sxy2=0

combine BEFORE n1=2 sx1=35 sxx1=316.66666666666666667 sy1=3 syy1=0.2 sxy1=2.5
combine BEFORE n2=1 sx2=35 sxx2=0 sy2=3 syy2=0 sxy2=0
combine AFTER n1=3 sx1=7E+1 sxx1=520.83333333333333333 sy1=6 syy1=1.7 sxy1=2E+1
combine AFTER n2=1 sx2=35 sxx2=0 sy2=3 syy2=0 sxy2=0

[20:13:15]      -- FAIL
    logic.go:2498:
        expected:
            8.33333333333333 12.5

        but found (query options: "") :
            6.66666666666667  10

Since combine function is executed for each covar_pop, covar_samp their arguments should not affect each other execution, i.e. for both executions it should be:

combine BEFORE n1=2 sx1=35 sxx1=112.5 sy1=3 syy1=0.5 sxy1=7.5
combine BEFORE n2=1 sx2=35 sxx2=0 sy2=3 syy2=0 sxy2=0

, but in the second case sxx=316.6 i.e. the result of the previous execution, and syy=0.2, sxy=2.5 i.e. results of the prev. execution divided by 10.
I couldn't find the reason for that yet. There is no math. operations in aggregate_builtins that suit this pattern as well as far as I can see no reference leakage

Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mneverov and @yuzefovich)


pkg/sql/logictest/testdata/logic_test/aggregate, line 1520 at r1 (raw file):

Previously, mneverov (Max Neverov) wrote…

I found that regressionAccumulatorDecimalBase sxx, syy, and sxy fields get corrupted. To reproduce:

make testlogic FILES=aggregate SUBTESTS='corrupt_combine'

combine BEFORE n1=2 sx1=35 sxx1=112.5 sy1=3 syy1=0.5 sxy1=7.5
combine BEFORE n2=1 sx2=35 sxx2=0 sy2=3 syy2=0 sxy2=0
combine AFTER n1=3 sx1=7E+1 sxx1=316.66666666666666667 sy1=6 syy1=2 sxy1=25
combine AFTER n2=1 sx2=35 sxx2=0 sy2=3 syy2=0 sxy2=0

combine BEFORE n1=2 sx1=35 sxx1=316.66666666666666667 sy1=3 syy1=0.2 sxy1=2.5
combine BEFORE n2=1 sx2=35 sxx2=0 sy2=3 syy2=0 sxy2=0
combine AFTER n1=3 sx1=7E+1 sxx1=520.83333333333333333 sy1=6 syy1=1.7 sxy1=2E+1
combine AFTER n2=1 sx2=35 sxx2=0 sy2=3 syy2=0 sxy2=0

[20:13:15]      -- FAIL
    logic.go:2498:
        expected:
            8.33333333333333 12.5

        but found (query options: "") :
            6.66666666666667  10

Since combine function is executed for each covar_pop, covar_samp their arguments should not affect each other execution, i.e. for both executions it should be:

combine BEFORE n1=2 sx1=35 sxx1=112.5 sy1=3 syy1=0.5 sxy1=7.5
combine BEFORE n2=1 sx2=35 sxx2=0 sy2=3 syy2=0 sxy2=0

, but in the second case sxx=316.6 i.e. the result of the previous execution, and syy=0.2, sxy=2.5 i.e. results of the prev. execution divided by 10.
I couldn't find the reason for that yet. There is no math. operations in aggregate_builtins that suit this pattern as well as far as I can see no reference leakage

I pushed the commit that seems to fix things. TBH not 100% why it works - seems like we're mutating the decimals we don't want to.

Copy link
Contributor Author

@mneverov mneverov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)


pkg/sql/logictest/testdata/logic_test/aggregate, line 1520 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

I pushed the commit that seems to fix things. TBH not 100% why it works - seems like we're mutating the decimals we don't want to.

Thanks for your help! and sorry i introduced the bug 😬

@mneverov mneverov marked this pull request as ready for review February 24, 2022 11:23
@mneverov mneverov requested a review from a team as a code owner February 24, 2022 11:23
@rytaft rytaft requested a review from a team February 24, 2022 12:57
Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Thanks for working on all of these aggregate functions!

Reviewed 2 of 2 files at r6, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @mneverov)


pkg/sql/logictest/testdata/logic_test/aggregate, line 1520 at r1 (raw file):

Previously, mneverov (Max Neverov) wrote…

Thanks for your help! and sorry i introduced the bug 😬

Np!

@yuzefovich
Copy link
Member

bors r=yuzefovich

@yuzefovich
Copy link
Member

There is a merge conflict with #76963, I'll wait for that PR to merge, rebase this one and will bors it then.

sqrdiff, and regr_count aggregate functions.

Fixes cockroachdb#58347.

Release note (performance improvement): corr, covar_samp, sqrdiff, and
regr_count aggregate functions are now evaluated more efficiently in a
distributed setting
@yuzefovich
Copy link
Member

bors r=yuzefovich

@craig craig bot merged commit 2826900 into cockroachdb:master Feb 25, 2022
@craig
Copy link
Contributor

craig bot commented Feb 25, 2022

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
O-community Originated from the community X-blathers-triaged blathers was able to find an owner
Projects
None yet
Development

Successfully merging this pull request may close these issues.

physicalplan: add support for multi-stage execution of aggregate functions
3 participants