-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
opt: fix histogram type assertion error #46552
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the quick fix!
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @RaduBerinde and @rytaft)
pkg/sql/opt/props/histogram.go, line 467 at r1 (raw file):
// assertion errors below. mismatchedTypes := bucketLowerBound.ResolvedType().Family() != bucketUpperBound.ResolvedType().Family() ||
Should we use Equivalent
instead? It makes some additional checks (for localized strings, tuples, etc).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/sql/opt/props/histogram.go, line 467 at r1 (raw file):
Previously, RaduBerinde wrote…
Should we use
Equivalent
instead? It makes some additional checks (for localized strings, tuples, etc).
Good idea - done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @rytaft)
pkg/sql/opt/props/histogram.go, line 466 at r2 (raw file):
// sizes. This should almost never happen, but we want to avoid type // assertion errors below. mismatchedTypes :=
[supernit] it would be easier to read if we inverted the variable (if !typesMatch { .. }
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @rytaft)
pkg/sql/logictest/testdata/logic_test/prepare, line 1231 at r2 (raw file):
statement ok PREPARE q AS DELETE FROM ts WHERE ts.d <= $1
Do you have more insight about what is happening here exactly? The placeholder should have the timestamp type; if it gets a value with a different type during execution that seems like a bug
It is possible in certain edge cases that we need to filter a histogram using a constraint with a different data type. For example, we may need to filter a histogram of timestamps with a constraint of type date. Prior to this commit, filtering a histogram with mismatched types would cause an assertion error. This commit fixes the issue by avoiding the calculations requiring a type assertion if the types don't match. In order to easily perform this check, this commit includes a small refactoring of the `getFilteredBucket` function in `histogram.go`. Specifically, it pulls out the code for calculating bucket range sizes and for determining whether a data type is discrete into two separate functions. Fixes cockroachdb#46217 Release justification: This change is a low risk, high benefit change to existing functionality. Release note (bug fix): Fixed an internal error that could happen during planning when a column with a histogram was filtered with a predicate of a different data type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR!
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @RaduBerinde)
pkg/sql/logictest/testdata/logic_test/prepare, line 1231 at r2 (raw file):
Previously, RaduBerinde wrote…
Do you have more insight about what is happening here exactly? The placeholder should have the timestamp type; if it gets a value with a different type during execution that seems like a bug
The placeholder has type DATE since the column has type DATE. It's just the histogram that has a different type. I don't think this would likely happen in practice, but I could imagine some scenario where a customer changed the type of a column and then prepared and executed a query before the stats had time to be regenerated.
Not sure if this is exactly what caused the crash in sentry, but at least I was able to reproduce the crash this way.
pkg/sql/opt/props/histogram.go, line 466 at r2 (raw file):
Previously, RaduBerinde wrote…
[supernit] it would be easier to read if we inverted the variable (
if !typesMatch { .. }
)
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained
bors r+ |
Build failed (retrying...) |
Build succeeded |
In statistics builder we assume that the datums decoded from histogram upper bounds are comparable with datums in spans and constraints. If the histogram column type were ever different from the table column type, however, this assumption would not hold. This should never happen, because histograms are associated with a column ID, and ALTER TABLE should never re-use a column ID during ALTER TABLE ADD COLUMN or ALTER TABLE ALTER COLUMN TYPE. But just to be defensive, add a check to sql.(*optTableStat).init that skips over the TableStatistic if the histogram column type isn't equivalent to the table column type. Also add this same typecheck to ALTER TABLE INJECT STATISTICS to guard against injecting histograms that don't match the column type. As part of this fix I'm removing the testcase added in cockroachdb#46552 which deliberately injects statistics with a histogram of a different type. Informs: cockroachdb#124181 Release note: None
124573: release: automatically detect if version is latest r=celiala a=rail Previously, we relied on manually set variable to distinguish latest versions. This has been error-prone. This PR uses the releases DB to detect if the current version is latest. Epic: none Release note: None 124603: sql/stats: evict stats cache entry if user-defined types have changed r=DrewKimball,yuzefovich a=michae2 **sql: add defensive type check to sql.(\*optTableStat).init** In statistics builder we assume that the datums decoded from histogram upper bounds are comparable with datums in spans and constraints. If the histogram column type were ever different from the table column type, however, this assumption would not hold. This should never happen, because histograms are associated with a column ID, and ALTER TABLE should never re-use a column ID during ALTER TABLE ADD COLUMN or ALTER TABLE ALTER COLUMN TYPE. But just to be defensive, add a check to sql.(\*optTableStat).init that skips over the TableStatistic if the histogram column type isn't equivalent to the table column type. Also add this same typecheck to ALTER TABLE INJECT STATISTICS to guard against injecting histograms that don't match the column type. As part of this fix I'm removing the testcase added in #46552 which deliberately injects statistics with a histogram of a different type. Informs: #124181 Release note: None --- **sql/stats: evict stats cache entry if user-defined types have changed** When adding table statistics to the stats cache, we decode histogram upper bounds into datums. If the histogram column uses a user-defined type, we hydrate the type and use this to decode. In statistics builder, these histogram upper bound datums are compared against datums in spans and constraints. The comparisons assume that the datums are of equivalent type, but if the user-defined type has changed sometime after loading the stats cache entry, this might not be true. If the user-defined type has changed, we need to evict and re-load the stats cache entry so that we decode histogram datums with a freshly-hydrated type. (We were already checking UDT versions when building the optTable in sql.(\*optCatalog).dataSourceForTable, but the newly-built optTable used the existing table statistics instead of refreshing those, too.) Fixes: #124181 Release note (bug fix): Fix a bug where a change to a user-defined type could cause queries against tables using that type to fail with an error message like: ``` histogram.go:694: span must be fully contained in the bucket ``` The change to the user-defined type could come directly from an ALTER TYPE statement, or indirectly from an ALTER DATABASE ADD REGION or DROP REGION statement (which implicitly change the crdb_internal_region type). This bug has existed since UDTs were introduced in v20.2. --- **sem/tree: check oid and version in tree.(\*DEnum).CompareError** Make sure when we're comparing two enum datums that they are, in fact, the same enum type. Informs: #124181 Release note: None 124776: schemachanger_test: increase shard count r=rafiss a=rafiss fixes #124714 Release note: None Co-authored-by: Rail Aliiev <rail@iqchoice.com> Co-authored-by: Michael Erickson <michae2@cockroachlabs.com> Co-authored-by: Rafi Shamim <rafi@cockroachlabs.com>
In statistics builder we assume that the datums decoded from histogram upper bounds are comparable with datums in spans and constraints. If the histogram column type were ever different from the table column type, however, this assumption would not hold. This should never happen, because histograms are associated with a column ID, and ALTER TABLE should never re-use a column ID during ALTER TABLE ADD COLUMN or ALTER TABLE ALTER COLUMN TYPE. But just to be defensive, add a check to sql.(*optTableStat).init that skips over the TableStatistic if the histogram column type isn't equivalent to the table column type. Also add this same typecheck to ALTER TABLE INJECT STATISTICS to guard against injecting histograms that don't match the column type. As part of this fix I'm removing the testcase added in #46552 which deliberately injects statistics with a histogram of a different type. Informs: #124181 Release note: None
In statistics builder we assume that the datums decoded from histogram upper bounds are comparable with datums in spans and constraints. If the histogram column type were ever different from the table column type, however, this assumption would not hold. This should never happen, because histograms are associated with a column ID, and ALTER TABLE should never re-use a column ID during ALTER TABLE ADD COLUMN or ALTER TABLE ALTER COLUMN TYPE. But just to be defensive, add a check to sql.(*optTableStat).init that skips over the TableStatistic if the histogram column type isn't equivalent to the table column type. Also add this same typecheck to ALTER TABLE INJECT STATISTICS to guard against injecting histograms that don't match the column type. As part of this fix I'm removing the testcase added in cockroachdb#46552 which deliberately injects statistics with a histogram of a different type. Informs: cockroachdb#124181 Release note: None
It is possible in certain edge cases that we need to filter a histogram
using a constraint with a different data type. For example, we may need
to filter a histogram of timestamps with a constraint of type date.
Prior to this commit, filtering a histogram with mismatched types
would cause an assertion error. This commit fixes the issue by avoiding
the calculations requiring a type assertion if the types don't match.
In order to easily perform this check, this commit includes a small
refactoring of the
getFilteredBucket
function inhistogram.go
.Specifically, it pulls out the code for calculating bucket range sizes
and for determining whether a data type is discrete into two separate
functions.
Fixes #46217
Release justification: This change is a low risk, high benefit change
to existing functionality.
Release note (bug fix): Fixed an internal error that could happen
during planning when a column with a histogram was filtered with a
predicate of a different data type.