-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql/stats: evict stats cache entry if user-defined types have changed #124603
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice job figuring this out! I only have minor nits.
For the third commit I wonder whether we should make it both stronger and backportable - namely, thoughts on making those two checks be CrdbTestBuild
-only and make the errors AssertionFailed
so that we could have a chance to catch bugs like this in randomized tests?
Reviewed 1 of 1 files at r1, 11 of 11 files at r2, 1 of 1 files at r3, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball and @michae2)
pkg/sql/opt_catalog.go
line 1798 at r1 (raw file):
} } if stat.HistogramData != nil && len(stat.ColumnIDs) == 1 {
nit: the check for a single column is here because we don't generate histograms for multi-column stats? Should we reference #49698 around this defensive check to make sure the code is adjusted when we lift that restriction?
pkg/sql/stats/stats_cache.go
line 636 at r2 (raw file):
// parseStats converts the given datums to a TableStatistic object. It might // need to run a query to get user defined type metadata.
nit: mention in the comment that ColumnID is only specified if a single-column statistic is returned. Alternatively, it might be cleaner to not return the ColumnID and fetch it one layer above in getTableStatsFromDB
.
pkg/ccl/logictestccl/testdata/logic_test/multi_region_stats
line 73 at r2 (raw file):
statement ok ALTER DATABASE db124181 ADD REGION "ca-central-1"
nit: should we also check statistics after having changed the enum?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once Yahor's comments are addressed. I agree it makes sense to either not backport the last commit, or hide it behind CrdbTestBuild
.
Reviewed 1 of 1 files at r1, 11 of 11 files at r2, 1 of 1 files at r3, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @michae2)
pkg/sql/opt_catalog.go
line 1806 at r2 (raw file):
// during an ALTER TABLE ALTER COLUMN TYPE statement, which we should // not. Still, never hurts to be defensive. return false, nil
[nit] Should we add a log message or something here, since this is unexpected behavior?
2f42da5
to
746092c
Compare
31abd7b
to
0bef5f4
Compare
In statistics builder we assume that the datums decoded from histogram upper bounds are comparable with datums in spans and constraints. If the histogram column type were ever different from the table column type, however, this assumption would not hold. This should never happen, because histograms are associated with a column ID, and ALTER TABLE should never re-use a column ID during ALTER TABLE ADD COLUMN or ALTER TABLE ALTER COLUMN TYPE. But just to be defensive, add a check to sql.(*optTableStat).init that skips over the TableStatistic if the histogram column type isn't equivalent to the table column type. Also add this same typecheck to ALTER TABLE INJECT STATISTICS to guard against injecting histograms that don't match the column type. As part of this fix I'm removing the testcase added in cockroachdb#46552 which deliberately injects statistics with a histogram of a different type. Informs: cockroachdb#124181 Release note: None
When adding table statistics to the stats cache, we decode histogram upper bounds into datums. If the histogram column uses a user-defined type, we hydrate the type and use this to decode. In statistics builder, these histogram upper bound datums are compared against datums in spans and constraints. The comparisons assume that the datums are of equivalent type, but if the user-defined type has changed sometime after loading the stats cache entry, this might not be true. If the user-defined type has changed, we need to evict and re-load the stats cache entry so that we decode histogram datums with a freshly- hydrated type. (We were already checking UDT versions when building the optTable in sql.(*optCatalog).dataSourceForTable, but the newly-built optTable used the existing table statistics instead of refreshing those, too.) Fixes: cockroachdb#124181 Release note (bug fix): Fix a bug where a change to a user-defined type could cause queries against tables using that type to fail with an error message like: "histogram.go:694: span must be fully contained in the bucket" The change to the user-defined type could come directly from an ALTER TYPE statement, or indirectly from an ALTER DATABASE ADD REGION or DROP REGION statement (which implicitly change the crdb_internal_region type). This bug has existed since UDTs were introduced in v20.2.
Make sure when we're comparing two enum datums that they are, in fact, the same enum type. Informs: cockroachdb#124181 Release note: None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the reviews!
The first commit ran into a testcase added by #46552 which was purposefully injecting stats with a histogram of a different type. I added a similar type check to ALTER TABLE INJECT STATISTICS which prevents injection of stats with a histogram of a different type. The existence of #46552 made me nervous, so I'm not planning to backport the first commit.
For the third commit I wonder whether we should make it both stronger and backportable - namely, thoughts on making those two checks be
CrdbTestBuild
-only and make the errorsAssertionFailed
so that we could have a chance to catch bugs like this in randomized tests?
Thinking about this more, I changed the oid difference to a "normal" unsupported comparison, because that means we're comparing two different ENUMs, and the version difference to an assertion failure because it probably indicates failure to hydrate a type. I'm still leaning toward not backporting the third commit, either. So I'm only planning to backport the second commit (the bug fix).
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball and @yuzefovich)
pkg/sql/opt_catalog.go
line 1798 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: the check for a single column is here because we don't generate histograms for multi-column stats? Should we reference #49698 around this defensive check to make sure the code is adjusted when we lift that restriction?
Done.
pkg/sql/opt_catalog.go
line 1806 at r2 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
[nit] Should we add a log message or something here, since this is unexpected behavior?
Done.
pkg/sql/stats/stats_cache.go
line 636 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: mention in the comment that ColumnID is only specified if a single-column statistic is returned. Alternatively, it might be cleaner to not return the ColumnID and fetch it one layer above in
getTableStatsFromDB
.
Good call, I moved it up to getTableStatsFromDB
.
pkg/ccl/logictestccl/testdata/logic_test/multi_region_stats
line 73 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: should we also check statistics after having changed the enum?
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Backporting only the second commit SGTM.
Reviewed 5 of 17 files at r4, 11 of 11 files at r5, 1 of 1 files at r6, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball)
Thanks! bors r=DrewKimball,yuzefovich |
Build failed: |
bors r=DrewKimball,yuzefovich |
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error creating merge commit from ef24bf7 to blathers/backport-release-23.1-124603: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 23.1.x failed. See errors above. error creating merge commit from ef24bf7 to blathers/backport-release-23.2-124603: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 23.2.x failed. See errors above. error setting reviewers, but backport branch blathers/backport-release-24.1-124603 is ready: POST https://api.github.com/repos/cockroachdb/cockroach/pulls/124810/requested_reviewers: 422 Reviews may only be requested from collaborators. One or more of the teams you specified is not a collaborator of the cockroachdb/cockroach repository. [] Backport to branch 24.1.x failed. See errors above. error creating merge commit from ef24bf7 to blathers/backport-release-23.2.6-rc-124603: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 23.2.6-rc failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
sql: add defensive type check to sql.(*optTableStat).init
In statistics builder we assume that the datums decoded from histogram upper bounds are comparable with datums in spans and constraints. If the histogram column type were ever different from the table column type, however, this assumption would not hold.
This should never happen, because histograms are associated with a column ID, and ALTER TABLE should never re-use a column ID during ALTER TABLE ADD COLUMN or ALTER TABLE ALTER COLUMN TYPE. But just to be defensive, add a check to sql.(*optTableStat).init that skips over the TableStatistic if the histogram column type isn't equivalent to the table column type.
Also add this same typecheck to ALTER TABLE INJECT STATISTICS to guard against injecting histograms that don't match the column type. As part of this fix I'm removing the testcase added in #46552 which deliberately injects statistics with a histogram of a different type.
Informs: #124181
Release note: None
sql/stats: evict stats cache entry if user-defined types have changed
When adding table statistics to the stats cache, we decode histogram upper bounds into datums. If the histogram column uses a user-defined type, we hydrate the type and use this to decode.
In statistics builder, these histogram upper bound datums are compared against datums in spans and constraints. The comparisons assume that the datums are of equivalent type, but if the user-defined type has changed sometime after loading the stats cache entry, this might not be true.
If the user-defined type has changed, we need to evict and re-load the stats cache entry so that we decode histogram datums with a freshly-hydrated type.
(We were already checking UDT versions when building the optTable in sql.(*optCatalog).dataSourceForTable, but the newly-built optTable used the existing table statistics instead of refreshing those, too.)
Fixes: #124181
Release note (bug fix): Fix a bug where a change to a user-defined type could cause queries against tables using that type to fail with an error message like:
The change to the user-defined type could come directly from an ALTER TYPE statement, or indirectly from an ALTER DATABASE ADD REGION or DROP REGION statement (which implicitly change the crdb_internal_region type).
This bug has existed since UDTs were introduced in v20.2.
sem/tree: check oid and version in tree.(*DEnum).CompareError
Make sure when we're comparing two enum datums that they are, in fact, the same enum type.
Informs: #124181
Release note: None