sql: support tableoid system column on virtual tables#165727
sql: support tableoid system column on virtual tables#165727trunk-io[bot] merged 2 commits intocockroachdb:masterfrom
Conversation
|
😎 Merged successfully - details. |
|
It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR? 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
pkg/sql/opt_catalog.go
Outdated
| // Add 1, since the 0th index will the primary that we added above. | ||
| // Secondary indexes do not include the tableoid system column in | ||
| // numCols because the Column() method for secondary indexes uses a | ||
| // different offset scheme and would panic on out-of-bounds access. | ||
| ot.indexes[idx.Ordinal()] = optVirtualIndex{ | ||
| tab: ot, | ||
| idx: idx, | ||
| indexOrdinal: idx.Ordinal(), | ||
| // The virtual indexes don't return the bogus PK key? | ||
| numCols: ot.ColumnCount(), | ||
| numCols: numNonSystemCols, |
There was a problem hiding this comment.
Potential bug: panic via IndexJoin on virtual tables.
Setting numCols: numNonSystemCols (excluding tableoid) on secondary indexes while the primary index includes tableoid creates a covering-check divergence. When a query like SELECT tableoid FROM pg_class WHERE oid = <value> is planned, the optimizer finds the secondary index on oid non-covering (tableoid is missing from IndexColumns()), generates an IndexJoinExpr, and execution reaches ConstructIndexJoin at pkg/sql/opt_exec_factory.go:681 which does table.(*optTable).desc — a hard type assertion that panics because virtual tables are *optVirtualTable, not *optTable.
Multiple pg_catalog tables (pg_class, pg_attribute, pg_constraint, pg_proc, etc.) have secondary indexes, making this user-triggerable.
Suggested fixes (pick one):
- (A) Include tableoid in secondary index
numColstoo (setnumCols: ot.ColumnCount()) and update theColumn()method ofoptVirtualIndexto handle tableoid for secondary indexes. - (B) Add a virtual table guard in
GenerateConstrainedScans(pkg/sql/opt/xform/select_funcs.go) to skip IndexJoin generation for virtual tables. - (C) Set
NoIndexJoin = trueonScanPrivate.Flagswhen building virtual table scans.
Option B or C is simplest and safest since virtual tables have never supported index joins.
There was a problem hiding this comment.
This seems like a valid concern - is it possible to have an index join into the primary index of the virtual table?
There was a problem hiding this comment.
Yes, this was a real bug. With numCols: numNonSystemCols on secondary indexes, the optimizer saw tableoid as not covered by secondary indexes and could attempt an index join into the primary index, which fails for virtual tables.
I've fixed this by going with option (A) from the AI review: secondary indexes now include tableoid in their numCols (set to ot.ColumnCount()), and the Column() method on optVirtualIndex is updated to handle the tableoid column after all stored columns for secondary indexes. This way the optimizer sees all indexes as covering for tableoid, so it never considers an index join necessary.
I also added a regression test that queries tableoid from pg_class with a filter on oid which uses a secondary index.
Option (C) to set NoIndexJoin = true on virtual table ScanPrivate.Flags also would fix this. IIUC, virtual tables have never supported index joins and the generator always produces all columns, so this would be safe. But I didn't know if there would be unintended consequences from disabling index joins. That is a one-line change though, so if it's safe to make that change, let me know if you'd prefer that instead.
AI Review: Potential Issue(s) DetectedAn inline comment has been added to the relevant line identifying a potential panic when Summary: This PR introduces a column count divergence between primary indexes (includes tableoid) and secondary indexes (excludes tableoid) on virtual tables. This can cause the optimizer to generate an If helpful: add |
yuzefovich
left a comment
There was a problem hiding this comment.
Nice! I only have the same question as the AI reviewer 😃
@yuzefovich reviewed 3 files and all commit messages, and made 3 comments.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on rafiss and ZhouXing19).
pkg/sql/opt_catalog.go line 2608 at r1 (raw file):
} // Add 1, since the 0th index will the primary that we added above.
nit: Add 1 ... sentence has been stale for a while now, we should remove it.
pkg/sql/opt_catalog.go
Outdated
| // Add 1, since the 0th index will the primary that we added above. | ||
| // Secondary indexes do not include the tableoid system column in | ||
| // numCols because the Column() method for secondary indexes uses a | ||
| // different offset scheme and would panic on out-of-bounds access. | ||
| ot.indexes[idx.Ordinal()] = optVirtualIndex{ | ||
| tab: ot, | ||
| idx: idx, | ||
| indexOrdinal: idx.Ordinal(), | ||
| // The virtual indexes don't return the bogus PK key? | ||
| numCols: ot.ColumnCount(), | ||
| numCols: numNonSystemCols, |
There was a problem hiding this comment.
This seems like a valid concern - is it possible to have an index join into the primary index of the virtual table?
Previously, the `tableoid` system column was only supported on regular tables but not on virtual tables (pg_catalog, information_schema, etc.). This caused `pg_dump` to fail when connecting to CockroachDB because pg_dump queries `SELECT x.tableoid, ... FROM pg_extension x ...` as one of its first introspection queries. This commit adds `tableoid` support for virtual tables by: 1. Adding the `tableoid` column to `optVirtualTable`'s column list in the optimizer catalog, registered as a `cat.System` column (hidden by default, like on regular tables). 2. Handling the `tableoid` column in `constructVirtualScan` by rendering it as a constant value (the table's descriptor ID) after the virtual table scan, since virtual table generators don't produce system column values. 3. Including the `tableoid` system column in all indexes' `numCols` (both primary and secondary) so the optimizer considers every index as covering for tableoid queries. The `Column()` method on `optVirtualIndex` is updated to return the tableoid column after all stored columns for secondary indexes. Release note (sql change): The `tableoid` system column is now supported on virtual tables such as those in `pg_catalog` and `information_schema`. This improves compatibility with PostgreSQL tools like `pg_dump` that reference `tableoid` in their introspection queries. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
rafiss
left a comment
There was a problem hiding this comment.
@rafiss resolved 1 discussion.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on yuzefovich and ZhouXing19).
yuzefovich
left a comment
There was a problem hiding this comment.
@yuzefovich reviewed 3 files and all commit messages, and made 2 comments.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on rafiss and ZhouXing19).
pkg/sql/opt_catalog.go
Outdated
| // Add 1, since the 0th index will the primary that we added above. | ||
| // Secondary indexes do not include the tableoid system column in | ||
| // numCols because the Column() method for secondary indexes uses a | ||
| // different offset scheme and would panic on out-of-bounds access. | ||
| ot.indexes[idx.Ordinal()] = optVirtualIndex{ | ||
| tab: ot, | ||
| idx: idx, | ||
| indexOrdinal: idx.Ordinal(), | ||
| // The virtual indexes don't return the bogus PK key? | ||
| numCols: ot.ColumnCount(), | ||
| numCols: numNonSystemCols, |
When pg_dump queries pg_attrdef with a join like:
SELECT a.tableoid, ... FROM unnest('{...}'::oid[]) AS src(tbloid)
JOIN pg_catalog.pg_attrdef a ON (src.tbloid = a.adrelid)
the virtual table lookup join panics with "index out of range [5]
with length 5". This happens because pushRow iterates lookupCols
which includes the tableoid system column ordinal, but the
looked-up row from the virtual table generator only contains
public columns.
Fix this by detecting the tableoid ordinal in pushRow and producing
its value as a constant from the table descriptor, matching how
constructVirtualScan handles tableoid for regular virtual table
scans.
Informs: cockroachdb#20296
Release note: None
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
|
TFTR! /trunk merge |
Previously, the
tableoidsystem column was only supported on regulartables but not on virtual tables (pg_catalog, information_schema, etc.).
This caused
pg_dumpto fail when connecting to CockroachDB becausepg_dump queries
SELECT x.tableoid, ... FROM pg_extension x ...asone of its first introspection queries.
This commit adds
tableoidsupport for virtual tables by:Adding the
tableoidcolumn tooptVirtualTable's column list inthe optimizer catalog, registered as a
cat.Systemcolumn (hiddenby default, like on regular tables).
Handling the
tableoidcolumn inconstructVirtualScanby renderingit as a constant value (the table's descriptor ID) after the virtual
table scan, since virtual table generators don't produce system
column values.
Including the
tableoidsystem column in all indexes'numCols(both primary and secondary) so the optimizer considers every index
as covering for tableoid queries. The
Column()method onoptVirtualIndexis updated to return the tableoid column after allstored columns for secondary indexes.
informs #20296
Release note (sql change): The
tableoidsystem column is nowsupported on virtual tables such as those in
pg_catalogandinformation_schema. This improves compatibility with PostgreSQLtools like
pg_dumpthat referencetableoidin their introspectionqueries.