Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

descidgen: use system.descriptor_id_seq for all tenants #91205

Merged
merged 1 commit into from Nov 4, 2022

Conversation

postamar
Copy link
Contributor

@postamar postamar commented Nov 3, 2022

Previously, the system tenant used a legacy non-SQL key in the meta range to generate descriptor IDs, while non-system tenants used the system.descriptor_id_seq sequence.

This commit introduces a cluster upgrade step which creates this sequence in the the system tenant's system database and migrates the legacy counter value to it.

While this migration is underway, the descriptor ID generator is not available. This upgrade consists of a relatively brief transaction and descriptor ID generation is a relatively infrequent event, as a result the disruption is deemed tolerable.

Fixes #80294.

Release note (sql change): upgrading a cluster to a major release containing this patch will endow the system tenant's system database with a descriptor_id_seq sequence which will henceforth be used to generate new descriptor IDs, as is already the case for non-system tenants.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 7 of 32 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @postamar)


pkg/sql/catalog/descidgen/generate_id.go line 63 at r1 (raw file):

		// which case descriptor ID generation is made unavailable.
		if cv.IsActive(ctx, clusterversion.V23_1DescIDSequenceForSystemTenant-1) {
			return catid.InvalidDescID, ErrDescIDSequenceMigrationInProgress

I sort of think that if you're going to hit this error, we might as well sit and retry until you won't hit it again and then return a serializable restart. That way the system will automatically retry and succeed and the client drivers will know how to handle it. WDYT? Otherwise, I think this is a nice approach.

@postamar
Copy link
Contributor Author

postamar commented Nov 3, 2022

Thanks! I wondered about retries, and indeed what you suggest would have been ideal as far as I'm concerned, I just don't know how to properly do it.

My initial approach was to do what @nvanbenschoten suggested in #80294 but I rejected it for two reasons:

  • the added complexity of tracking two counters felt more risky (too many opportunities for bugs with severe consequences, too hard to test) than just barfing an error in the user's face,
  • making the generator transactional in all places was going to be hard.

@ajwerner
Copy link
Contributor

ajwerner commented Nov 3, 2022

There's a few options. The most complex pattern to follow which is probably the most by-the-book would be to make your error type marked as retriable here:

func errIsRetriable(err error) bool {
return errors.HasType(err, (*roachpb.TransactionRetryWithProtoRefreshError)(nil)) ||
scerrors.ConcurrentSchemaChangeDescID(err) != descpb.InvalidID ||
errors.Is(err, retriableMinTimestampBoundUnsatisfiableError) ||
descs.IsTwoVersionInvariantViolationError(err)
}

Then, in the txnStateApplyWrapper, do the waiting. You can follow the pattern we did for concurrent schema changes here:

if p, ok := payload.(payloadWithError); ok {
if descID := scerrors.ConcurrentSchemaChangeDescID(p.errorCause()); descID != descpb.InvalidID {
if err := ex.handleWaitingForConcurrentSchemaChanges(ex.Ctx(), descID); err != nil {
return advanceInfo{}, err
}
}
}

Internally this thing aborts the transaction and then waits for there to be no concurrent changes on the descriptor. In your case we'll want to also abort the transaction but instead poll the currently active version.

@postamar
Copy link
Contributor Author

postamar commented Nov 3, 2022

Thanks, this seems to work.

@postamar postamar marked this pull request as ready for review November 3, 2022 16:59
@postamar postamar requested a review from a team November 3, 2022 16:59
@postamar postamar requested review from a team as code owners November 3, 2022 16:59
@postamar postamar requested review from a team and benbardin and removed request for a team and benbardin November 3, 2022 16:59
@postamar postamar marked this pull request as draft November 3, 2022 17:29
Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first time Andrew and I talked about this, we had the follownig idea:

  • use 3 version markers, not just 1. Let's call them versions A, B, C.
  • after version A is reached, we still use the special ID gen but we start incrementing by 2 every time.
  • we make a migration from A to B, which creates a new sequence generator and initializes it with an offset of 1 from the current value of the special ID generator.
  • after version B is reached, the ID gen code uses the new sequence, and also increments it by 2 every time.

At this point, the following properties hold:

  • it's still possible that a txn started under version A is still using the previous gen, however
  • regardless of which gen is used, the IDs generated from them don't overlap

Then the final transition:

  • a migraiton from version B to C, which checks if the old ID gen has a value higher than the (new) sequence, and if so bumps the sequence forward.
  • once version C is reached, the new sequence is used and incremented by 1 every time.

This way, there is never an error because the seq is not available yet, nor a need to retry.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @postamar)

@postamar
Copy link
Contributor Author

postamar commented Nov 3, 2022

The first time Andrew and I talked about this, we had the follownig idea

It's neat. I think the reason why I can't convince myself that it works is that I don't know enough about how kv.DB.Inc() works or how it interacts with cluster.Settings.Version and how that's updated:

  • what if an Inc which started pre-A takes a really long time for whatever reason and only increments by 1 after version A is active?
  • how do we implement the B->C migration when the counters aren't transactional? Is a GetForUpdate enough? What if there's a concurrent kv.DB.Inc on the special gen which reads the old value before the for-update lock is acquired and actually increments it after the txn commits?

I found my either-or approach to be less sensitive to having overlooked something. Still, I'm genuinely curious about the questions above and just how the database works in general. I assume it has to do with raft logs and whatnot, things I'm painfully ignorant about.

@ajwerner
Copy link
Contributor

ajwerner commented Nov 3, 2022

I think I've come around to preferring this. Very few systems use this generator at high throughput. I doubt anybody will notice this migration, and if they do, it's transient and equivalent to contention we'd have seen from other migrations which rewrite descriptors. I like its simplicity.

@knz
Copy link
Contributor

knz commented Nov 3, 2022

No objection to the simplicity.

Marius, to answer your question, the increment is not in a txn but it is now MVCC aware so the last update has a HLC timestamp associated to it. So it's always possible to certainly order an increment with the txn that's bumping the cluster version.

Previously, the system tenant used a legacy non-SQL key in the meta
range to generate descriptor IDs, while non-system tenants used the
system.descriptor_id_seq sequence.

This commit introduces a cluster upgrade step which creates this
sequence in the the system tenant's system database and migrates the
legacy counter value to it.

While this migration is underway, the descriptor ID generator is not
available, which will trigger a retry of any transaction which attempts
to access it. We expect the disruption to be negligible: the migration
should be quick to perform and descriptor ID generation is a rare event,
relatively speaking.

Fixes cockroachdb#80294.

Release note (sql change): upgrading a cluster to a major release
containing this patch will endow the system tenant's system database
with a descriptor_id_seq sequence which will henceforth be used to
generate new descriptor IDs, as is already the case for non-system
tenants.
Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@postamar
Copy link
Contributor Author

postamar commented Nov 4, 2022

Thanks for the reviews!

bors r+

@craig craig bot merged commit c57b340 into cockroachdb:master Nov 4, 2022
@craig
Copy link
Contributor

craig bot commented Nov 4, 2022

Build succeeded:

@knz
Copy link
Contributor

knz commented Nov 6, 2022

thank you 💖

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sql: migrate system tenant to use SQL sequence as DescIDGenerator
4 participants