Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data Race During --race Test #94

Closed
steve-gray opened this issue Jan 21, 2022 · 8 comments
Closed

Data Race During --race Test #94

steve-gray opened this issue Jan 21, 2022 · 8 comments

Comments

@steve-gray
Copy link

steve-gray commented Jan 21, 2022

Probably needs proper mutexing - this one periodically causes flaking in CI for me. The items hitting it are:
github.com/gocql/gocql.(*tokenAwareHostPolicy).Init() vs github.com/gocql/gocql. (*tokenAwareHostPolicy).updateReplicas()

image

@martin-sucha
Copy link

It seems this is something that should be fixed in upstream gocql/gocql as it seems that upstream is affected too. I'll be happy to merge the fix upstream.

@martin-sucha
Copy link

I've checked the code, but it seems gocql does not spawn goroutines using the policy before calling Init, so there shouldn't be any race. Do you by any chance reuse the policy instance between multiple sessions? Unfortunately I don't see from the screenshot where goroutine 61 was spawned, but it seems that it was spawned somewhere in Session.init, which is called after returning from tokenAwareHostPolicy.Init.

@steve-gray
Copy link
Author

The issue is the assignment of the logger on line 401 during Init() can race with a use of the logger in the updateReplicas where it passes to getStrategy. The updateReplicas is getting called concurrently with the Init() because the schema event listener debouncer starts further up in the sequence, and if you do per-test schema operations, its possible to get into a state where one arrives during the race, because it's seeing stuff like the previous tests keyspace getting torn down.

	s.schemaEvents = newEventDebouncer("SchemaEvents", s.handleSchemaEvent, s.logger)

In short, if a schema update arrives during startup, you can get this. I believe without the race detector, it'll probably nil-pointer as the logger wasn't assigned.

@martin-sucha
Copy link

The issue is the assignment of the logger on line 401 during Init() can race with a use of the logger in the updateReplicas where it passes to getStrategy.

But what leads to that? It seems that Init is called and returns before any connections are created, so how do you reach the Session.handleEvent that ultimately leads to updateReplicas?

Init is called in NewSession:

s.policy.Init(s)

There are no goroutines that gocql spawns until this point.

First connection is made in Session.init:

if err = s.init(); err != nil {

The only explanation I have right now is that the policy instance is used with multiple sessions. Is that the case? Or why do we reach handleEvent? Could you please provide example code that reproduces the issue?

@steve-gray
Copy link
Author

Multiple sessions, but each only created after the previous has been .Closed()'d. However, I can't share the entire body of the application - but it seems like the thrashing during my unit tests (creating and deleting lots of keyspaces, as each test uses its own) makes this exponentially more likely.

@martin-sucha
Copy link

Reusing host selection policies between multiple sessions (even after closing the old session) is not supported as the old session might still be using the policy. Adding a lock to Init won't fix the fact that the old session is still using the policy. I suggest you use a new instance of the policy every time you create a session. I guess we need to document that reusing policies is not supported and panic in tokenAwareHostPolicy.Init if the policy was already initialized.

@steve-gray
Copy link
Author

steve-gray commented Jan 26, 2022 via email

@martin-sucha
Copy link

Sounds like it’s a more along the lines of a defective design pattern - and the strategies should be passed as a factory so that each session the driver summons can get its own.

However that boat has probably sailed in terms keeping things compatible with todays world. It’d be a fairly breaking change to fix.

Yes, I agree. Unfortunately the policy interface is like this for a long time and there could be external implementations, so we can't easily change it.

martin-sucha added a commit to kiwicom/gocql that referenced this issue Jan 27, 2022
Sharing a policy between sessions is not supported because the policy
receives state updates from the session.

Let's update the documentation and add a panic in TokenAwareHostPolicy
constructor. It is better to panic early than to have races and
undefined behavior later.

See also discussion in scylladb#94
martin-sucha added a commit to kiwicom/gocql that referenced this issue Apr 13, 2022
Sharing a policy between sessions is not supported because the policy
receives state updates from the session.

Let's update the documentation and add a panic in TokenAwareHostPolicy
constructor. It is better to panic early than to have races and
undefined behavior later.

See also discussion in scylladb#94
alanmmock pushed a commit to mailgun/gocql that referenced this issue Jul 12, 2022
Sharing a policy between sessions is not supported because the policy
receives state updates from the session.

Let's update the documentation and add a panic in TokenAwareHostPolicy
constructor. It is better to panic early than to have races and
undefined behavior later.

See also discussion in scylladb#94
medasx pushed a commit to kiwicom/gocql that referenced this issue Aug 16, 2022
Sharing a policy between sessions is not supported because the policy
receives state updates from the session.

Let's update the documentation and add a panic in TokenAwareHostPolicy
constructor. It is better to panic early than to have races and
undefined behavior later.

See also discussion in scylladb#94
@sylwiaszunejko sylwiaszunejko closed this as not planned Won't fix, can't repro, duplicate, stale Jun 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants