Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: fix histogram type assertion error #46552

Merged
merged 1 commit into from
Mar 26, 2020
Merged

Conversation

rytaft
Copy link
Collaborator

@rytaft rytaft commented Mar 25, 2020

It is possible in certain edge cases that we need to filter a histogram
using a constraint with a different data type. For example, we may need
to filter a histogram of timestamps with a constraint of type date.

Prior to this commit, filtering a histogram with mismatched types
would cause an assertion error. This commit fixes the issue by avoiding
the calculations requiring a type assertion if the types don't match.

In order to easily perform this check, this commit includes a small
refactoring of the getFilteredBucket function in histogram.go.
Specifically, it pulls out the code for calculating bucket range sizes
and for determining whether a data type is discrete into two separate
functions.

Fixes #46217

Release justification: This change is a low risk, high benefit change
to existing functionality.

Release note (bug fix): Fixed an internal error that could happen
during planning when a column with a histogram was filtered with a
predicate of a different data type.

@rytaft rytaft requested a review from RaduBerinde March 25, 2020 13:33
@rytaft rytaft requested a review from a team as a code owner March 25, 2020 13:33
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the quick fix!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @RaduBerinde and @rytaft)


pkg/sql/opt/props/histogram.go, line 467 at r1 (raw file):

	// assertion errors below.
	mismatchedTypes :=
		bucketLowerBound.ResolvedType().Family() != bucketUpperBound.ResolvedType().Family() ||

Should we use Equivalent instead? It makes some additional checks (for localized strings, tuples, etc).

Copy link
Collaborator Author

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/sql/opt/props/histogram.go, line 467 at r1 (raw file):

Previously, RaduBerinde wrote…

Should we use Equivalent instead? It makes some additional checks (for localized strings, tuples, etc).

Good idea - done.

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @rytaft)


pkg/sql/opt/props/histogram.go, line 466 at r2 (raw file):

	// sizes. This should almost never happen, but we want to avoid type
	// assertion errors below.
	mismatchedTypes :=

[supernit] it would be easier to read if we inverted the variable (if !typesMatch { .. })

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @rytaft)


pkg/sql/logictest/testdata/logic_test/prepare, line 1231 at r2 (raw file):


statement ok
PREPARE q AS DELETE FROM ts WHERE ts.d <= $1

Do you have more insight about what is happening here exactly? The placeholder should have the timestamp type; if it gets a value with a different type during execution that seems like a bug

It is possible in certain edge cases that we need to filter a histogram
using a constraint with a different data type. For example, we may need
to filter a histogram of timestamps with a constraint of type date.

Prior to this commit, filtering a histogram with mismatched types
would cause an assertion error. This commit fixes the issue by avoiding
the calculations requiring a type assertion if the types don't match.

In order to easily perform this check, this commit includes a small
refactoring of the `getFilteredBucket` function in `histogram.go`.
Specifically, it pulls out the code for calculating bucket range sizes
and for determining whether a data type is discrete into two separate
functions.

Fixes cockroachdb#46217

Release justification: This change is a low risk, high benefit change
to existing functionality.

Release note (bug fix): Fixed an internal error that could happen
during planning when a column with a histogram was filtered with a
predicate of a different data type.
Copy link
Collaborator Author

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @RaduBerinde)


pkg/sql/logictest/testdata/logic_test/prepare, line 1231 at r2 (raw file):

Previously, RaduBerinde wrote…

Do you have more insight about what is happening here exactly? The placeholder should have the timestamp type; if it gets a value with a different type during execution that seems like a bug

The placeholder has type DATE since the column has type DATE. It's just the histogram that has a different type. I don't think this would likely happen in practice, but I could imagine some scenario where a customer changed the type of a column and then prepared and executed a query before the stats had time to be regenerated.

Not sure if this is exactly what caused the crash in sentry, but at least I was able to reproduce the crash this way.


pkg/sql/opt/props/histogram.go, line 466 at r2 (raw file):

Previously, RaduBerinde wrote…

[supernit] it would be easier to read if we inverted the variable (if !typesMatch { .. })

Done.

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained

@rytaft
Copy link
Collaborator Author

rytaft commented Mar 26, 2020

bors r+

@craig
Copy link
Contributor

craig bot commented Mar 26, 2020

Build failed (retrying...)

@craig
Copy link
Contributor

craig bot commented Mar 26, 2020

Build succeeded

@craig craig bot merged commit fdc568c into cockroachdb:master Mar 26, 2020
michae2 added a commit to michae2/cockroach that referenced this pull request May 29, 2024
In statistics builder we assume that the datums decoded from histogram
upper bounds are comparable with datums in spans and constraints. If the
histogram column type were ever different from the table column type,
however, this assumption would not hold.

This should never happen, because histograms are associated with a
column ID, and ALTER TABLE should never re-use a column ID during ALTER
TABLE ADD COLUMN or ALTER TABLE ALTER COLUMN TYPE. But just to be
defensive, add a check to sql.(*optTableStat).init that skips over the
TableStatistic if the histogram column type isn't equivalent to the
table column type.

Also add this same typecheck to ALTER TABLE INJECT STATISTICS to guard
against injecting histograms that don't match the column type. As part
of this fix I'm removing the testcase added in cockroachdb#46552 which deliberately
injects statistics with a histogram of a different type.

Informs: cockroachdb#124181

Release note: None
craig bot pushed a commit that referenced this pull request May 29, 2024
124573: release: automatically detect if version is latest r=celiala a=rail

Previously, we relied on manually set variable to distinguish latest versions. This has been error-prone.

This PR uses the releases DB to detect if the current version is latest.

Epic: none
Release note: None

124603: sql/stats: evict stats cache entry if user-defined types have changed r=DrewKimball,yuzefovich a=michae2

**sql: add defensive type check to sql.(\*optTableStat).init**

In statistics builder we assume that the datums decoded from histogram upper bounds are comparable with datums in spans and constraints. If the histogram column type were ever different from the table column type, however, this assumption would not hold.

This should never happen, because histograms are associated with a column ID, and ALTER TABLE should never re-use a column ID during ALTER TABLE ADD COLUMN or ALTER TABLE ALTER COLUMN TYPE. But just to be defensive, add a check to sql.(\*optTableStat).init that skips over the TableStatistic if the histogram column type isn't equivalent to the table column type.

Also add this same typecheck to ALTER TABLE INJECT STATISTICS to guard against injecting histograms that don't match the column type. As part of this fix I'm removing the testcase added in #46552 which deliberately injects statistics with a histogram of a different type.

Informs: #124181

Release note: None

---

**sql/stats: evict stats cache entry if user-defined types have changed**

When adding table statistics to the stats cache, we decode histogram upper bounds into datums. If the histogram column uses a user-defined type, we hydrate the type and use this to decode.

In statistics builder, these histogram upper bound datums are compared against datums in spans and constraints. The comparisons assume that the datums are of equivalent type, but if the user-defined type has changed sometime after loading the stats cache entry, this might not be true.

If the user-defined type has changed, we need to evict and re-load the stats cache entry so that we decode histogram datums with a freshly-hydrated type.

(We were already checking UDT versions when building the optTable in sql.(\*optCatalog).dataSourceForTable, but the newly-built optTable used the existing table statistics instead of refreshing those, too.)

Fixes: #124181

Release note (bug fix): Fix a bug where a change to a user-defined type could cause queries against tables using that type to fail with an error message like:

```
histogram.go:694: span must be fully contained in the bucket
```

The change to the user-defined type could come directly from an ALTER TYPE statement, or indirectly from an ALTER DATABASE ADD REGION or DROP REGION statement (which implicitly change the crdb_internal_region type).

This bug has existed since UDTs were introduced in v20.2.

---

**sem/tree: check oid and version in tree.(\*DEnum).CompareError**

Make sure when we're comparing two enum datums that they are, in fact, the same enum type.

Informs: #124181

Release note: None

124776: schemachanger_test: increase shard count r=rafiss a=rafiss

fixes #124714
Release note: None

Co-authored-by: Rail Aliiev <rail@iqchoice.com>
Co-authored-by: Michael Erickson <michae2@cockroachlabs.com>
Co-authored-by: Rafi Shamim <rafi@cockroachlabs.com>
blathers-crl bot pushed a commit that referenced this pull request May 29, 2024
In statistics builder we assume that the datums decoded from histogram
upper bounds are comparable with datums in spans and constraints. If the
histogram column type were ever different from the table column type,
however, this assumption would not hold.

This should never happen, because histograms are associated with a
column ID, and ALTER TABLE should never re-use a column ID during ALTER
TABLE ADD COLUMN or ALTER TABLE ALTER COLUMN TYPE. But just to be
defensive, add a check to sql.(*optTableStat).init that skips over the
TableStatistic if the histogram column type isn't equivalent to the
table column type.

Also add this same typecheck to ALTER TABLE INJECT STATISTICS to guard
against injecting histograms that don't match the column type. As part
of this fix I'm removing the testcase added in #46552 which deliberately
injects statistics with a histogram of a different type.

Informs: #124181

Release note: None
nameisbhaskar pushed a commit to nameisbhaskar/cockroach that referenced this pull request May 30, 2024
In statistics builder we assume that the datums decoded from histogram
upper bounds are comparable with datums in spans and constraints. If the
histogram column type were ever different from the table column type,
however, this assumption would not hold.

This should never happen, because histograms are associated with a
column ID, and ALTER TABLE should never re-use a column ID during ALTER
TABLE ADD COLUMN or ALTER TABLE ALTER COLUMN TYPE. But just to be
defensive, add a check to sql.(*optTableStat).init that skips over the
TableStatistic if the histogram column type isn't equivalent to the
table column type.

Also add this same typecheck to ALTER TABLE INJECT STATISTICS to guard
against injecting histograms that don't match the column type. As part
of this fix I'm removing the testcase added in cockroachdb#46552 which deliberately
injects statistics with a histogram of a different type.

Informs: cockroachdb#124181

Release note: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

opt: v19.2.4: type assertion error in histogram
3 participants