-
Notifications
You must be signed in to change notification settings - Fork 2.4k
Get rid of lock during list containers #2024
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
/ok-to-test |
|
The following files are not properly formatted: |
a1c0974 to
1fdff28
Compare
|
@dashpole |
dashpole
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this! Looks good with one nit.
manager/manager.go
Outdated
| } | ||
| allContainers, err := cont.handler.ListContainers(container.ListRecursive) | ||
| contHandler := cont.handler | ||
| m.containersLock.RUnlock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: rather than locking for all but the ListContainers call, can we instead lock around just the critical sections? I.E. lock and unlock around the := m.containers call above, and then Lock, and defer Unlock before the Determine which were added... section
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, it's looks much better
manager/manager.go
Outdated
| return nil, nil, fmt.Errorf("failed to find container %q while checking for new containers", containerName) | ||
| } | ||
| allContainers, err := cont.handler.ListContainers(container.ListRecursive) | ||
| contHandler := cont.handler |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we still need to assign cont.handler to a new variable here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think, it's better to leave assignation to a new var, but move it under the lock, to avoid possible data race.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With assignation placed under the lock, code looks ugly without defer, but with defer we returning to the first variant. Maybe, it's better to use initial variant?
m.containersLock.RLock()
cont, ok := m.containers[namespacedContainerName{
Name: containerName,
}]
if !ok {
m.containersLock.RUnlock()
return nil, nil, fmt.Errorf("failed to find container %q while checking for new containers", containerName)
}
contHandler := cont.handler
m.containersLock.RUnlock()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not convinced there is a data race in reading the handler attribute of cont. It is not modified once it is initialized. The containersLock is meant to prevent concurrent access to the containers map, which the current design does. I don't think you need to add a new contHandler variable here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, there are no data races in the current codebase, but i have one concern:
When m.containersLock locked, we expect exclusive access to the m.containers and it's members' fields. It's unexpected, that cont.handler could be read by anyone else during lock, so it could lead to tricky data race in future.
What do you think about it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we will need to reassess that when the time comes. Other functions, including getContainer and getSubcontainers, getAllDockerContainers, etc. return references to containerData structs, and could allow concurrent modification of containerData objects, if used in the way you are suggesting it could be. We would need much more than a change here to fix it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. Fixed.
dashpole
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Sometimes, when reading of cgroups tree is too slow, manager.getContainersDiff during global housekeeping could block any other actions due to acquiring mmanager.containersLock for a long time.
To fix it, we could release lock while list cgroups tree and then acquire it again to get diff.