Skip to content

Fix race in getEndpointsFromStore#750

Merged
mrjana merged 1 commit intomoby:masterfrom
LK4D4:endpoints_race
Nov 12, 2015
Merged

Fix race in getEndpointsFromStore#750
mrjana merged 1 commit intomoby:masterfrom
LK4D4:endpoints_race

Conversation

@LK4D4
Copy link
Copy Markdown
Contributor

@LK4D4 LK4D4 commented Nov 12, 2015

Race can occur between two getEndpointsFromStore or between it and
getNetwork.

@LK4D4
Copy link
Copy Markdown
Contributor Author

LK4D4 commented Nov 12, 2015

tests passed for me locally :/ but probably there is some deadlock
I think that race is pretty bad

@LK4D4
Copy link
Copy Markdown
Contributor Author

LK4D4 commented Nov 12, 2015

However I stress-tested docker daemon with this patch and didn't found any deadlocks.

@mrjana
Copy link
Copy Markdown
Contributor

mrjana commented Nov 12, 2015

It's not your diff. CircleCI some times locks up and times out. Restarted the test. Never seen that local runs.

@mavenugo
Copy link
Copy Markdown
Contributor

LGTM

But, am not sure I understand the race condition here. We have locking in most (must be all) of the
situations where a content can change (such as endpoint list in network or network list in controller,....). But for those objects where the data is guaranteed to be immutable after the object creation, we don't lock that particular operation. But having a lock around that is certainly not a bad idea. But am just curious how ep.network assignment can result in a race.
@mrjana ?

@LK4D4
Copy link
Copy Markdown
Contributor Author

LK4D4 commented Nov 12, 2015

@mavenugo it can be two concurrent assignments to ep.network or also read from getNetwork. Both can produce corrupted ep.network.

@mrjana
Copy link
Copy Markdown
Contributor

mrjana commented Nov 12, 2015

@LK4D4 It is a valid race when we get it from cache i.e not from first time population in which case two go routines can indeed get hold of the same pointer. But this is a case of harmless race since go memory model ensures that the readers either see the old value of ep.network or the new value. Not some corrupted value in between and given that the same value is going to be set there is no issue of reading a stale value either.

Which brings us to the question of why we even need to update it with network value which should already be set for every endpoint. So that assignment is redundant. May be we should just remove that assignment altogther?

BTW, out of curiosity, did this show up in go race detector?

@LK4D4
Copy link
Copy Markdown
Contributor Author

LK4D4 commented Nov 12, 2015

@mrjana Yes, it is from go race detector. I'm not sure about what part of memory model you talking, but it was not once said by Go developers, that races can produce corrupted values if type more than one-word size. Maybe it's changed some time ago?

@LK4D4
Copy link
Copy Markdown
Contributor Author

LK4D4 commented Nov 12, 2015

@mrjana this assignment create multiple races with other functions too. Removing would be perfect.

@mrjana
Copy link
Copy Markdown
Contributor

mrjana commented Nov 12, 2015

@LK4D4 I am referring to https://golang.org/ref/mem

Reads and writes of values larger than a single machine word behave as multiple machine-word-sized operations in an unspecified order.

And this assignment will happen in one machine word whether it is 32-bit or 64-bit. But having said that it's worth it to fix it by removing it altogether. Do you want to do this as part of this PR?

Race detector was angry about that assignment

Signed-off-by: Alexander Morozov <lk4d4@docker.com>
@LK4D4
Copy link
Copy Markdown
Contributor Author

LK4D4 commented Nov 12, 2015

@mrjana yeah, I pushed change.

@aboch
Copy link
Copy Markdown
Contributor

aboch commented Nov 12, 2015

LGTM

@mrjana
Copy link
Copy Markdown
Contributor

mrjana commented Nov 12, 2015

Thanks @LK4D4. LGTM

mrjana added a commit that referenced this pull request Nov 12, 2015
Fix race in getEndpointsFromStore
@mrjana mrjana merged commit 29c413a into moby:master Nov 12, 2015
@LK4D4 LK4D4 deleted the endpoints_race branch November 12, 2015 22:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants