sql: remove dead interleave code #70618

postamar · 2021-09-23T13:53:21Z

sql: remove dead interleave code

In the previous release, we disabled interleaved tables and indexes.
Therefore, at this point in time, all code pertaining to this feature is
effectively dead. This commit removes a lot of it by removing the
catalog methods and their callers.

The remaining dead code is now located in the SQL grammar definition and
the plan nodes, and is removed in the subsequent commit.

Release note: None


sql: remove INTERLEAVE IN PARENT from grammar

This commit follows up on the previous one by removing INTERLEAVE
clauses from the SQL grammar itself.

Release note (sql change): Removed INTERLEAVE IN PARENT.

cockroach-teamcity · 2021-09-23T13:53:29Z

This change is

postamar · 2021-09-23T21:09:15Z

This is not ready to merge, but it's ready to review, if anyone dares. One aspect of my change I'm not too sure of is if I was supposed to do something in particular to deprecate the cluster settings instead of outright removing those which I removed.

I appreciate that this diff is hard to review. I've broken it down into two commits but the substance of the change is in the first one. There's a lot of noise in there, but there are also test cases which I outright removed because I felt they no longer made sense, that kind of thing. What I'm trying to say is that this should probably be reviewed quite carefully despite the noise. I tried to break the change down further but everything is quite intertwined.

yuzefovich · 2021-09-23T21:11:59Z

I have #70239 that removes the interleaved logic from the fetchers - it'd be probably good to rebase this PR on top of that one since #70239 has already been approved (I was thinking of waiting till 21.2.0 is out, just in case we need to backport something).

postamar · 2021-09-24T10:54:42Z

Excellent! I'll do just that, once the release is done. I may even hold off until your PR is merged, depending on circumstances.

yuzefovich · 2021-10-27T15:36:15Z

One aspect of my change I'm not too sure of is if I was supposed to do something in particular to deprecate the cluster settings instead of outright removing those which I removed.

All cluster settings that are being removed should be added to retiredSettings in setting/registry.go.

postamar · 2021-10-27T17:49:23Z

Thanks @yuzefovich! Will do.

yuzefovich

Thanks for working on this! It's exciting to see all this code go away.

Reviewed 110 of 111 files at r1, 48 of 48 files at r2, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @postamar, and @tbg)

docs/generated/sql/bnf/BUILD.bazel, line 1 at r2 (raw file):

# There should be an entry in this list for every .bnf in this directory (omit

Seems like this changed more than necessary, what was the reason?

pkg/ccl/backupccl/backupresolver/targets.go, line 635 at r1 (raw file):

	}

	sort.Slice(matched.Descs, func(i, j int) bool { return matched.Descs[i].GetID() < matched.Descs[j].GetID() })

Do we still need to sort here? The old comment implied the sort was only needed because of the interleaved tables.

pkg/clusterversion/cockroach_versions.go, line 394 at r1 (raw file):

	},
	{
		Key:     PreventNewInterleavedTables,

I don't know anything about removing versions, so I'll defer to someone else to look at this. Maybe @irfansharif or @ajwerner.

pkg/sql/backfill.go, line 859 at r1 (raw file):

	}
	for _, idx := range dropped {
		var resume roachpb.Span

IIUC correctly resume can be removed too - this was introduced only to support dropping of the interleaved indexes, and it'll be always unset. Additionally, the for loop also now seems unnecessary since we always will do just a single iteration.

pkg/sql/crdb_internal.go, line 2419 at r1 (raw file):

	})

func showAlterStatement(

nit: I think this method is called in a single place, should we inline it then?

pkg/sql/show_create.go, line 62 at r1 (raw file):

// TABLE statement used to create the given table.
//
// The names of the tables references by foreign keys are prefixed by their own

nit: s/references/referenced/.

pkg/sql/tablewriter_delete.go, line 67 at r1 (raw file):

// zero resume-span. After a chunk is deleted a new resume-span is returned.
//
// limit is a limit on either the number of keys deleted in the operation.

nit: s/on either the number/on the number/.

pkg/sql/catalog/descriptor.go, line 488 at r1 (raw file):

	GetNextFamilyID() descpb.FamilyID

	// HasColumnBackfillMutation returns whether the table has any queued column

nit: why remove the comment?

pkg/sql/catalog/descpb/structured.proto, line 516 at r1 (raw file):

  // Interleave, if it's not the zero value, describes how this index's data is
  // interleaved into another index's data.
  optional InterleaveDescriptor interleave = 11 [(gogoproto.nullable) = false, deprecated = true];

Can we just delete these two fields and mark the corresponding ordinals as reserved? I'm guessing not because of the backups.

pkg/sql/logictest/testdata/logic_test/pg_builtins, line 541 at r1 (raw file):

indexed_b_d_idx  {2,4}  4  2

# information_schema._pg_numeric_precision and information_schema._pg_numeric_precision_radix

Why is this chunk removed?

pkg/sql/logictest/testdata/logic_test/pg_catalog, line 3436 at r1 (raw file):

oid     typname        typnamespace  typowner  typlen  typbyval  typtype

query OTOOIBT colnames

Why is this chunk removed?

pkg/sql/logictest/testdata/logic_test/pg_catalog, line 5420 at r1 (raw file):

regression_70180  NULL

# Testing pg_statistic_ext

ditto

pkg/sql/opt/exec/execbuilder/mutation.go, line 603 at r1 (raw file):

		if maxRows, ok := b.indexConstraintMaxResults(&scan.ScanPrivate, scan.Relational()); ok {
			if maxKeys := maxRows * uint64(tab.FamilyCount()); maxKeys <= row.TableTruncateChunkSize {

nit: an extra new line.

pkg/sql/opt/testutils/opttester/testfixtures/tpcc_schema, line 62 at r1 (raw file):

    index customer_idx (c_w_id, c_d_id, c_last, c_first),
    foreign key (c_w_id, c_d_id) references district (d_w_id, d_id)
) interleave in parent district (c_w_id, c_d_id)

super nit: seems like these couple of changes belong better in the first commit.

pkg/sql/opt/xform/testdata/rules/join, line 255 at r1 (raw file):

  PRIMARY KEY(pid1, cid1)
)
INTERLEAVE IN PARENT parent1 (pid1)

I wonder whether these tests should be converted to use foreign keys instead of interleaving?

pkg/sql/randgen/schema.go, line 80 at r1 (raw file):

	tables := make([]tree.Statement, num)
	for i := 0; i < num; i++ {
		t := RandCreateTableWithColumnIndexNumberGenerator(rng, prefix, i+1, nil /* generateColumnIndexNumber */)

nit: should we use RandCreateTable here?

pkg/sql/row/fetcher.go, line 212 at r1 (raw file):

	// True if the index key must be decoded.
	// If there is more than one table, the index key must always be decoded.

nit: no need to change this comment since row.Fetcher now supports only a single table.

pkg/sql/rowenc/index_encoding.go, line 506 at r1 (raw file):

	// bookkeeping. We could improve it to n with a prefix tree of components.

	interleaves := append(make([]catalog.Index, 0, len(desc.ActiveIndexes())), desc.ActiveIndexes()...)

I think the logic here could be simplified further (maybe the for loop below can be removed entirely?).

pkg/sql/rowenc/index_encoding.go, line 606 at r1 (raw file):

	// interleavedSentinel is actually next, then this key is for a child
	// table.
	if _, ok := encoding.DecodeIfInterleavedSentinel(key); ok {

I think this can be removed. As a result we could remove DecodeIndexKeyWithoutTableIDIndexIDPrefix entirely and just inline DecodeKeyVals instead.

pkg/sql/rowenc/index_encoding.go, line 1527 at r1 (raw file):

	desc catalog.TableDescriptor, index catalog.Index,
) (signatures [][]byte, err error) {
	// signatures contains the slice reference to the signature of every

nit: the comment needs an update.

pkg/sql/rowenc/index_encoding.go, line 1533 at r1 (raw file):

	// fullSignature is the backing byte slice for each individual signature
	// as it buffers each block of table and index IDs.
	// We eagerly allocate 4 bytes for each of the two IDs per ancestor

ditto

pkg/sql/rowenc/index_encoding.go, line 1554 at r1 (raw file):

// if uncertain, pass in true to 'overestimate' the maxKeyTokens.
//
// In general, a key belonging to an interleaved index grandchild is encoded as:

ditto

pkg/sql/span/span_builder.go, line 87 at r1 (raw file):

	// Set up the interstices for encoding interleaved tables later.
	s.interstices[0] = s.KeyPrefix

I think we can simplify these interstices further. Feel free to leave TODO(yuzefovich) about it.

tbg

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @ajwerner, @irfansharif, and @postamar)

pkg/clusterversion/cockroach_versions.go, line 394 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

I don't know anything about removing versions, so I'll defer to someone else to look at this. Maybe @irfansharif or @ajwerner.

In serverless, tenant pods can trail the KV host cluster by one release. So when KV runs 21.2, tenants can still run 21.1. I am wagering a guess that the migrations attached to these versions are entirely on the tenant side, and so things should be fine (a 21.1 tenant can have interleaved tables, but the 21.2 host cluster couldn't care less).

As for removing settings, current master is going to be the 22.1 release and so will only ever be run in cluster versions 21.2 and up. The version here is pre-21.2, so this should be good to remove.

postamar

Thanks for taking time to review! I bet the excitement of seeing this go away is largely tempered by the drudgery of plowing through 10k line diff.

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @ajwerner, @irfansharif, and @yuzefovich)

docs/generated/sql/bnf/BUILD.bazel, line 1 at r2 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Seems like this changed more than necessary, what was the reason?

The reason was make bazel-generate followed by a careless git add. Fixed.

pkg/ccl/backupccl/backupresolver/targets.go, line 635 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Do we still need to sort here? The old comment implied the sort was only needed because of the interleaved tables.

Agreed. I had a look at the call sites just to be sure: I'm reasonably confident that it's of no use now. Removed.

pkg/clusterversion/cockroach_versions.go, line 394 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

In serverless, tenant pods can trail the KV host cluster by one release. So when KV runs 21.2, tenants can still run 21.1. I am wagering a guess that the migrations attached to these versions are entirely on the tenant side, and so things should be fine (a 21.1 tenant can have interleaved tables, but the 21.2 host cluster couldn't care less).

As for removing settings, current master is going to be the 22.1 release and so will only ever be run in cluster versions 21.2 and up. The version here is pre-21.2, so this should be good to remove.

Correct, these migrations are tenant migrations. This seems to be the case for all SQL-related migrations, from what I can tell.

pkg/sql/backfill.go, line 859 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

IIUC correctly resume can be removed too - this was introduced only to support dropping of the interleaved indexes, and it'll be always unset. Additionally, the for loop also now seems unnecessary since we always will do just a single iteration.

Nice catch! Done.

pkg/sql/crdb_internal.go, line 2419 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: I think this method is called in a single place, should we inline it then?

The call site is already quite deeply nested, I'll leave it as it is, if you don't mind.

pkg/sql/show_create.go, line 62 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: s/references/referenced/.

Done.

pkg/sql/tablewriter_delete.go, line 67 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: s/on either the number/on the number/.

Done.

pkg/sql/catalog/descriptor.go, line 488 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: why remove the comment?

No good reason, this was a mistake. Good catch.

pkg/sql/catalog/descpb/structured.proto, line 516 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Can we just delete these two fields and mark the corresponding ordinals as reserved? I'm guessing not because of the backups.

Yes, it's because of the backups. More generally, we need to check dead fields such as this one during descriptor validation to make sure that it's empty. Consequently nothing in structured.proto can ever be deleted. It's not such a big deal as it sounds, this file doesn't change that much these days.

pkg/sql/logictest/testdata/logic_test/pg_builtins, line 541 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Why is this chunk removed?

No good reason. Good catch.

pkg/sql/logictest/testdata/logic_test/pg_catalog, line 3436 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Why is this chunk removed?

No good reason. Good catch.

pkg/sql/logictest/testdata/logic_test/pg_catalog, line 5420 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

ditto

Thanks.

pkg/sql/opt/testutils/opttester/testfixtures/tpcc_schema, line 62 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

super nit: seems like these couple of changes belong better in the first commit.

I'm not sure I agree, but don't feel too strongly about this. I'll leave it in the current commit at least out of laziness, unless someone has a strong opinion.

pkg/sql/opt/xform/testdata/rules/join, line 255 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

I wonder whether these tests should be converted to use foreign keys instead of interleaving?

I'm surprised they weren't broken in the first place, DDLs with INTERLEAVE should trigger errors nowadays.

pkg/sql/randgen/schema.go, line 80 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: should we use RandCreateTable here?

You're right. Done.

pkg/sql/row/fetcher.go, line 212 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: no need to change this comment since row.Fetcher now supports only a single table.

Fixed.

pkg/sql/rowenc/index_encoding.go, line 506 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

I think the logic here could be simplified further (maybe the for loop below can be removed entirely?).

I think so too. I don't know why I didn't do it. Perhaps I was waiting for your changes? In any case, it's done.

pkg/sql/rowenc/index_encoding.go, line 606 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

I think this can be removed. As a result we could remove DecodeIndexKeyWithoutTableIDIndexIDPrefix entirely and just inline DecodeKeyVals instead.

Done.

pkg/sql/rowenc/index_encoding.go, line 1527 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: the comment needs an update.

As a matter of fact, this function isn't used anywhere except in tests, thanks to your recent changes. I removed the lot.

pkg/sql/rowenc/index_encoding.go, line 1533 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

ditto

See above.

pkg/sql/rowenc/index_encoding.go, line 1554 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

ditto

Done.

pkg/sql/span/span_builder.go, line 87 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

I think we can simplify these interstices further. Feel free to leave TODO(yuzefovich) about it.

Done. I don't know what these are, so I don't dare touch them.

pkg/sql/opt/exec/execbuilder/mutation.go, line 603 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: an extra new line.

Fixed.

yuzefovich

LOL

I think last thing is that we need to add the removed cluster setting into retiredSettings map. Otherwise,

Reviewed 1 of 111 files at r1, 76 of 76 files at r3, 49 of 49 files at r4, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @adityamaru, @ajwerner, @irfansharif, and @postamar)

pkg/sql/opt/xform/testdata/rules/join, line 255 at r1 (raw file):

Previously, postamar (Marius Posta) wrote…

I'm surprised they weren't broken in the first place, DDLs with INTERLEAVE should trigger errors nowadays.

This test file probably runs through a special optimizer setup, without really executing the queries, so we weren't hitting those errors.

pkg/sql/rowenc/index_encoding.go, line 1527 at r1 (raw file):

Previously, postamar (Marius Posta) wrote…

As a matter of fact, this function isn't used anywhere except in tests, thanks to your recent changes. I removed the lot.

👍

pkg/sql/rowenc/index_encoding.go, line 509 at r3 (raw file):

	if tableID != desc.GetID() {
		return 0, nil, errors.Errorf(
			"Unexpected table ID %d, expected %d instead.", tableID, desc.GetID())

super nit: I think we use lower-case first word in the main error message.

postamar

Oh, I forgot all about retiredSettings, my bad. Done.

Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @adityamaru, @ajwerner, @irfansharif, and @yuzefovich)

pkg/sql/opt/xform/testdata/rules/join, line 255 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

This test file probably runs through a special optimizer setup, without really executing the queries, so we weren't hitting those errors.

That makes sense.

pkg/sql/rowenc/index_encoding.go, line 509 at r3 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

super nit: I think we use lower-case first word in the main error message.

I'm never sure about this. Done.

In fact in that case I should remove the period also. I'll push that change at the next opportunity.

postamar · 2021-10-29T14:06:06Z

I'll leave this PR up a few more days to allow for further comments. If there are none I'll go ahead and merge towards the end of next week.

Although I'm confident most of the interleave code will be removed, a quick search tells me that there are still plenty of comments mentioning interleaves. I suppose we'll all just have to edit these gradually over time.

In the previous release, we disabled interleaved tables and indexes. Therefore, at this point in time, all code pertaining to this feature is effectively dead. This commit removes a lot of it by removing the catalog methods and their callers. The remaining dead code is now located in the SQL grammar definition and the plan nodes, and is removed in the subsequent commit. Release note: None

This commit follows up on the previous one by removing INTERLEAVE clauses from the SQL grammar itself. Release note (sql change): Removed INTERLEAVE IN PARENT.

postamar · 2021-11-03T23:30:12Z

bors r+

craig · 2021-11-04T00:58:58Z

Build succeeded:

GitHub CI (Cockroach)

postamar · 2021-11-04T13:05:34Z

This is so cathartic. Thanks @yuzefovich !

yuzefovich · 2021-11-04T21:58:25Z

I wonder whether we should close #69150 given that this has been merged?

postamar · 2021-11-04T23:47:24Z

You're right, we should. At the time that I started working on this PR, I didn't intend to remove all remaining interleave code, only that which involved NumAncestors() and other interleave-related catalog descriptor methods, so I didn't issue-link. I since forgot to update.

postamar force-pushed the schema-remove-interleave branch 3 times, most recently from cac6d53 to 649dc18 Compare September 23, 2021 17:01

postamar changed the title ~~[WIP] sql: remove most dead interleave code~~ [DNM] sql: remove most dead interleave code Sep 23, 2021

postamar force-pushed the schema-remove-interleave branch 2 times, most recently from dd50ce7 to 54da9ee Compare September 23, 2021 21:04

postamar marked this pull request as ready for review September 23, 2021 21:09

postamar requested a review from a team September 23, 2021 21:09

postamar requested review from a team as code owners September 23, 2021 21:09

postamar requested a review from a team September 23, 2021 21:09

postamar requested a review from a team as a code owner September 23, 2021 21:09

postamar requested a review from a team September 23, 2021 21:09

postamar requested review from a team as code owners September 23, 2021 21:09

postamar requested review from adityamaru and tbg and removed request for a team September 23, 2021 21:09

postamar added the do-not-merge bors won't merge a PR with this label. label Oct 5, 2021

yuzefovich mentioned this pull request Oct 9, 2021

sql: remove the interleaved logic from fetchers #70239

Merged

postamar marked this pull request as draft October 26, 2021 18:14

postamar changed the title ~~[DNM] sql: remove most dead interleave code~~ sql: remove dead interleave code Oct 26, 2021

postamar force-pushed the schema-remove-interleave branch from 54da9ee to 8b900d8 Compare October 26, 2021 18:17

postamar removed the do-not-merge bors won't merge a PR with this label. label Oct 26, 2021

yuzefovich reviewed Oct 27, 2021

View reviewed changes

tbg requested review from ajwerner and irfansharif October 27, 2021 20:37

tbg reviewed Oct 27, 2021

View reviewed changes

postamar force-pushed the schema-remove-interleave branch from 8b900d8 to a3c15a6 Compare October 28, 2021 19:52

postamar marked this pull request as ready for review October 28, 2021 19:53

postamar force-pushed the schema-remove-interleave branch from a3c15a6 to 4180799 Compare October 28, 2021 20:17

postamar commented Oct 28, 2021

View reviewed changes

yuzefovich approved these changes Oct 28, 2021

View reviewed changes

postamar force-pushed the schema-remove-interleave branch from 4180799 to 95653c1 Compare October 29, 2021 13:53

postamar commented Oct 29, 2021

View reviewed changes

postamar force-pushed the schema-remove-interleave branch from 95653c1 to bb0da77 Compare November 3, 2021 21:02

postamar added 2 commits November 3, 2021 18:22

sql: remove INTERLEAVE IN PARENT from grammar

4c0a5c7

This commit follows up on the previous one by removing INTERLEAVE clauses from the SQL grammar itself. Release note (sql change): Removed INTERLEAVE IN PARENT.

postamar force-pushed the schema-remove-interleave branch from bb0da77 to 4c0a5c7 Compare November 3, 2021 22:24

craig bot merged commit 4493953 into cockroachdb:master Nov 4, 2021

cockroach-teamcity mentioned this pull request Nov 4, 2021

sql: remove dead interleave code cockroachdb/docs#12182

Closed

postamar deleted the schema-remove-interleave branch November 4, 2021 13:05

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sql: remove dead interleave code #70618

sql: remove dead interleave code #70618

postamar commented Sep 23, 2021 •

edited

cockroach-teamcity commented Sep 23, 2021

postamar commented Sep 23, 2021

yuzefovich commented Sep 23, 2021

postamar commented Sep 24, 2021

yuzefovich commented Oct 27, 2021

postamar commented Oct 27, 2021

yuzefovich left a comment

tbg left a comment

postamar left a comment

yuzefovich left a comment

postamar left a comment

postamar commented Oct 29, 2021

postamar commented Nov 3, 2021

craig bot commented Nov 4, 2021

postamar commented Nov 4, 2021

yuzefovich commented Nov 4, 2021

postamar commented Nov 4, 2021

sql: remove dead interleave code #70618

sql: remove dead interleave code #70618

Conversation

postamar commented Sep 23, 2021 • edited

cockroach-teamcity commented Sep 23, 2021

postamar commented Sep 23, 2021

yuzefovich commented Sep 23, 2021

postamar commented Sep 24, 2021

yuzefovich commented Oct 27, 2021

postamar commented Oct 27, 2021

yuzefovich left a comment

Choose a reason for hiding this comment

tbg left a comment

Choose a reason for hiding this comment

postamar left a comment

Choose a reason for hiding this comment

yuzefovich left a comment

Choose a reason for hiding this comment

postamar left a comment

Choose a reason for hiding this comment

postamar commented Oct 29, 2021

postamar commented Nov 3, 2021

craig bot commented Nov 4, 2021

postamar commented Nov 4, 2021

yuzefovich commented Nov 4, 2021

postamar commented Nov 4, 2021

postamar commented Sep 23, 2021 •

edited