Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: fix data race in MemberOfWithAdminOption #98469

Closed

Conversation

andyyang890
Copy link
Collaborator

singleflight: mitigate potential data races from context cancellation

This patch ensures that if the leader of a singleflight flight has its
context cancelled while waiting for the result of the flight closure,
it will block until the closure returns to prevent any possible data
races on any data shared with the closure.

Release note: None


sql: change role members cache singleflight to inherit cancellation

This patch changes the role members cache singleflight to inherit
cancellation since the flight closure uses the caller's transaction.

Release note: None


Fixes #95642
Fixes #96539

This patch ensures that if the leader of a singleflight flight has its
context cancelled while waiting for the result of the flight closure,
it will block until the closure returns to prevent any possible data
races on any data shared with the closure.

Release note: None
This patch changes the role members cache singleflight to inherit
cancellation since the flight closure uses the caller's transaction.

Release note: None
@andyyang890 andyyang890 requested review from andreimatei, rafiss, ajwerner and a team March 13, 2023 05:38
@blathers-crl
Copy link

blathers-crl bot commented Mar 13, 2023

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@@ -251,7 +251,13 @@ func (c *call) result(ctx context.Context, leader bool) Result {
case <-c.c:
case <-ctx.Done():
op := fmt.Sprintf("%s:%s", c.opName, c.key)
if !leader {
if leader {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I this that this should only happen if InheritCancellation is true. Another thing that might be good to do here is to log if we wait a long time here. Maybe:

var timer timeutil.Timer
defer timer.Stop()
const slowLogTimeout = time.Second
timer.Reset(slowLogTimeout)
start := timeutil.Now()
select {
case <-c.c:
    return
case <-timer.C:
    timer.Read = true
    log.Infof(ctx, "have been waiting for singleflight %s:%s for %v after cancelation", c.opName, c.key, timeutil.Since(start))
}
<-c.c

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I this that this should only happen if InheritCancellation is true.

There would still be a potential for a data race even if InheritCancelation is false, right? Unless we rely on the fact that all users of the singleflight package set InheritCancelation: true if they're sharing data between the caller and the closure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know. I think I'd argue instead that the singleflights users which share state and want to exit early are abusing the API. Maybe we should just comment that and not build this into the singleflight package.

Copy link
Contributor

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @rafiss)


pkg/util/syncutil/singleflight/singleflight.go line 254 at r2 (raw file):

I this that this should only happen if InheritCancellation is true.
There would still be a potential for a data race even if InheritCancelation is false, right?

It seems to me that the current race is possible both with and without InheritCancelation, but the bad outcome of the race is much more likely without InheritCancelation; with InheritCancelation, both Do and WaitForResult are interrupted at the same time. Without InheritCancelation only WaitForResult is interrupted, thus allowing the closure to continue and past the caller unwinding.

I think I'd argue instead that the singleflights users which share state and want to exit early are abusing the API.

The problem, as I see it, is that with the current API it's kinda hard for a caller to opt out of exiting early in the WaitForResult() call. I think maybe it'd be reasonable to make it easy to opt-out, but it needs to be an option and not forced like in the current patch. But,

This patch changes the role members cache singleflight to inherit
cancellation since the flight closure uses the caller's transaction.

I want to understand if this makes good sense, and in turn if we should worry about "sharing data with the closure" at all.
Can you explain to me whether it's a good thing for the leader's txn to be used in the closure by the leader, seeing how the flight is possibly joined by others, with other transactions, and those guy are presumably just fine with the closure having used an unrelated txn? Can we make the closure use a new txn?
Are there other uses of singleflight, perhaps in descriptor lease acquisitions, where there's a similar pattern?

@ajwerner
Copy link
Contributor

Are there other uses of singleflight, perhaps in descriptor lease acquisitions, where there's a similar pattern?

In descriptor leasing we create a new transaction. I argue that we shouldn't be using a shared transaction in the singleflight. It was done here only because it was expedient, IIRC. I argued against it at the time.

@rafiss
Copy link
Collaborator

rafiss commented Mar 13, 2023

Doesn't it need to use the same transaction so that you can be sure that the singleflight lookup is using the same system.role_members table version as the transaction that is trying to check role membership?

Copy link
Contributor

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the leader needs the same txn, then how come the followers don't?

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @rafiss)

@ajwerner
Copy link
Contributor

I don't think that's the only way. Consider:

  • One option would be to force the caller to do the read themselves if we don't have the older version they need (the singleflight won't get an older version than a waiter already read).
  • another option is to accept a newer version of the cache and say that that's fine. That could lead to weird edge cases I suppose.
  • yet another option would be to make the cache data structure versioned so we don't thrash while changes happen
  • probably the best option would be to make the singleflight operate per version and use the modification time of the descriptor as the transaction timestamp for reading the set of members. This is going to be the exact snapshot of the transaction which published that version. This would avoid any questions about reading the right value.

@rafiss
Copy link
Collaborator

rafiss commented Mar 13, 2023

If the leader needs the same txn, then how come the followers don't?

Because part of the singleflight request key is the system.role_members table version:

fmt.Sprintf("%s-%d", member.Normalized(), tableVersion),

So even though the followers use a different transaction, they know that they will read from the same table version as if they used their own transaction.

@rafiss
Copy link
Collaborator

rafiss commented Mar 13, 2023

probably the best option would be to make the singleflight operate per version and use the modification time of the descriptor as the transaction timestamp for reading the set of members. This is going to be the exact snapshot of the transaction which published that version. This would avoid any questions about reading the right value.

ah ok, so we already do most of this, but you're saying we could also start a different transaction inside of the singleflight and manually set it's transaction timestamp. what's the interface for manually changing the timestamp?

@ajwerner
Copy link
Contributor

cockroach/pkg/kv/txn.go

Lines 1393 to 1401 in 736a67e

// SetFixedTimestamp makes the transaction run in an unusual way, at a "fixed
// timestamp": Timestamp and RefreshedTimestamp are set to ts, there's no clock
// uncertainty, and the txn's deadline is set to ts such that the transaction
// can't be pushed to a different timestamp.
//
// This is used to support historical queries (AS OF SYSTEM TIME queries and
// backups). This method must be called on every transaction retry (but note
// that retries should be rare for read-only queries with no clock uncertainty).
func (txn *Txn) SetFixedTimestamp(ctx context.Context, ts hlc.Timestamp) error {

@andyyang890
Copy link
Collaborator Author

So even though the followers use a different transaction, they know that they will read from the same table version as if they used their own transaction.

I actually had a question about how this works. If the table version is only bumped when the schema changes, wouldn't the table version actually remain the same if new GRANT/REVOKE ROLEs happen? In which case, isn't it possible that the data in the system.role_members table has changed but the key for the singleflight will still be the same?

@rafiss
Copy link
Collaborator

rafiss commented Mar 13, 2023

I actually had a question about how this works. If the table version is only bumped when the schema changes, wouldn't the table version actually remain the same if new GRANT/REVOKE ROLEs happen? In which case, isn't it possible that the data in the system.role_members table has changed but the key for the singleflight will still be the same?

We increment the table version here in the GRANT code:

if err := params.p.BumpRoleMembershipTableVersion(params.ctx); err != nil {

And similar for REVOKE.


So I wonder if something like this would work:

	future, _ := roleMembersCache.populateCacheGroup.DoChan(ctx,
		fmt.Sprintf("%s-%d", member.Normalized(), tableVersion),
		singleflight.DoOpts{
			Stop:               roleMembersCache.stopper,
			InheritCancelation: false,
		},
		func(ctx context.Context) (interface{}, error) {
			ts := txn.KV().ReadTimestamp()
			var m map[username.SQLUsername]bool
			err = execCfg.InternalDB.DescsTxn(ctx, func(ctx context.Context, newTxn descs.Txn) error {
				err := newTxn.KV().SetFixedTimestamp(ctx, ts)
				if err != nil {
					return err
				}
				m, err = resolveMemberOfWithAdminOption(
					ctx, member, newTxn,
					useSingleQueryForRoleMembershipCache.Get(execCfg.SV()),
				)
				return err
			})
			return m, err
		})

@ajwerner
Copy link
Contributor

You don't want ReadTimestamp() I don't think. I think you want to take that descriptor you read for the role_memberships table and take its ModificationTime(). Otherwise, that looks right.

@andreimatei
Copy link
Contributor

If the leader needs the same txn, then how come the followers don't?

Because part of the singleflight request key is the system.role_members table version:

I'll throw more of my 2c to doubt that using something about the query to check permissions doesn't makes any sense; it seems like an obvious security hole (permissions are not supposed to be immutable). I remember pestering Marc about it many years ago, and I kinda assumed that we changed it. @ajwerner pointed to #51861 tracking it.

Copy link
Contributor

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @rafiss)


pkg/util/syncutil/singleflight/singleflight.go line 254 at r2 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

I this that this should only happen if InheritCancellation is true.
There would still be a potential for a data race even if InheritCancelation is false, right?

It seems to me that the current race is possible both with and without InheritCancelation, but the bad outcome of the race is much more likely without InheritCancelation; with InheritCancelation, both Do and WaitForResult are interrupted at the same time. Without InheritCancelation only WaitForResult is interrupted, thus allowing the closure to continue and past the caller unwinding.

I think I'd argue instead that the singleflights users which share state and want to exit early are abusing the API.

The problem, as I see it, is that with the current API it's kinda hard for a caller to opt out of exiting early in the WaitForResult() call. I think maybe it'd be reasonable to make it easy to opt-out, but it needs to be an option and not forced like in the current patch. But,

This patch changes the role members cache singleflight to inherit
cancellation since the flight closure uses the caller's transaction.

I want to understand if this makes good sense, and in turn if we should worry about "sharing data with the closure" at all.
Can you explain to me whether it's a good thing for the leader's txn to be used in the closure by the leader, seeing how the flight is possibly joined by others, with other transactions, and those guy are presumably just fine with the closure having used an unrelated txn? Can we make the closure use a new txn?
Are there other uses of singleflight, perhaps in descriptor lease acquisitions, where there's a similar pattern?

Just in case this was not considered as an option - the caller of WaitForResult can pass in a context without cancelation (e.g. context.TODO()). This will be trivial in the next Go version; for now you gotta build a context by hand and you lose the tracing, etc.

Copy link
Collaborator Author

@andyyang890 andyyang890 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @rafiss)


pkg/util/syncutil/singleflight/singleflight.go line 254 at r2 (raw file):

Just in case this was not considered as an option - the caller of WaitForResult can pass in a context without cancelation (e.g. context.TODO()). This will be trivial in the next Go version; for now you gotta build a context by hand and you lose the tracing, etc.

Yeah, we did consider that and #98376 has an implementation for that solution. Just curious, do you mind elaborating on why this will be trivial in the next Go version?

Copy link
Contributor

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @rafiss)


pkg/util/syncutil/singleflight/singleflight.go line 254 at r2 (raw file):

do you mind elaborating on why this will be trivial in the next Go version?

https://go-review.googlesource.com/c/go/+/459016

@andyyang890
Copy link
Collaborator Author

andyyang890 commented Mar 15, 2023

I've created #98617 with an implementation of the idea discussed above to use a fresh transaction within the singleflight. There are more details on that PR of an issue I've observed with this approach.

@andyyang890
Copy link
Collaborator Author

Closed in favor of #98617

@andyyang890 andyyang890 deleted the fix_singleflight_race branch March 17, 2023 06:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants