Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for race condition caused by concurrent fsnotify (CREATE and REMOVE) in kubelet/plugin_watcher.go #71440

Merged
merged 1 commit into from
Nov 28, 2018
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
2 changes: 1 addition & 1 deletion pkg/kubelet/util/pluginwatcher/plugin_watcher.go
Original file line number Diff line number Diff line change
Expand Up @@ -111,7 +111,7 @@ func (w *Watcher) Start() error {
//TODO: Handle errors by taking corrective measures

w.wg.Add(1)
go func() {
func() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the goroutine inside traversePluginDir also makes order of events non-deterministic, especially if changes are occurring at the same time the initial scan is being done.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does the handleCreateEvent function verify the created path still exists?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goroutinein traversePluginDir is used to place items on a non-buffered channel, ws.fsWatcher.Events. If must be present to avoid deadlocks.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

handleCreateEvent does not explicitly re-check existence of the dir right before the Driver is delegated to handle the registration:

With the changes in this PR, the fsnotify CREATE/DELETE operations should not occur out of sync. If a dir existed right before Registration, it should not go away until a DELETE event comes right after it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the fsnotify CREATE/DELETE operations should not occur out of sync

That is the expectation from the underlying library. Just to be safe, it will be worth catching any potential issue around delete after create operations and logging it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the changes in this PR, the fsnotify CREATE/DELETE operations should not occur out of sync. If a dir existed right before Registration, it should not go away until a DELETE event comes right after it.

nothing guarantees that, correct?

  1. traversePluginDir is called

  2. traversePluginDir adds a filesystem watch to a particular directory

    case mode.IsDir():
    if w.containsBlacklistedDir(path) {
    return filepath.SkipDir
    }
    if err := w.fsWatcher.Add(path); err != nil {
    return fmt.Errorf("failed to watch %s, err: %v", path, err)
    }

  3. traversePluginDir descends into the directory and adds synthetic Create events for the files found in the dir via a goroutine

    go func() {
    defer w.wg.Done()
    w.fsWatcher.Events <- fsnotify.Event{
    Name: path,
    Op: fsnotify.Create,
    }
    }()

because there is an active watcher registered in step 2 that can immediately start delivering events, and the synthetic create events in step 3 are delivered via a goroutine, they can interleave with actual observed filesystem events in non-deterministic ways. For example, if a driver is being deleted while this runs:

  1. filepath.Walk lists dir, sees driver socket file
  2. driver socket file is deleted
  3. filesystem delete event is observed and queued
  4. filepath.Walk queues synthetic create event

because the delete is handled, then the synthetic create event, could we end up with a registered driver that doesn't actually exist any more?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To guarantee consistency, shouldn't you be enqueuing existing sockets first prior to accepting fsnotify events from the kernel? Imagine a situation where a socket was identified by path traversal, but before traversePluginDir can enqueue a create event, the socket get's deleted and that event gets processed first before the creation event?

Copy link
Member

@liggitt liggitt Nov 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To guarantee consistency, shouldn't you be enqueuing existing sockets first prior to accepting fsnotify events from the kernel?

that's what I expected as well. registering watches on the dirs, processing the contents and enqueuing synthetic create events, then starting processing of the events from the registered watchers

Imagine a situation where a socket was identified by path traversal, but before traversePluginDir can enqueue a create event, the socket get's deleted and that event gets processed first before the creation event?

yes, that's exactly the scenario described above

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@liggitt @vishh I think the code already has HappensBefore and HappensAfter serial properties that you are alluding to. The code seems to have 1to1 parity between observed filesystem event and synthetic queued events. To explain, let's further unpack the scenario that Jordan presented earlier:

  1. filepath.Walk lists dir, sees driver socket file

So let's look at some scenarios

  1. filepath.Walk hits dir, adds watcher for it, continue
  2. filesystem creates socket file (from driver)
    a. But, socket file is immediately deleted from filesystem
    b. According to fswatcher, if the file is removed before it is observed, the Walk will generate an error
  3. filepath.Walk receives error because watcher is missing, returns

Scenario 2

  1. filepath.Walk hits dir, adds watcher for it, continue
  2. filesystem creates socket file (from driver)
  3. filepath.Walk receives socket file info (prior to deletion)
    a. queues synthetic create event
  4. Socket file is from filesystem
  5. filepath.Walk receives deleted file info (after deletion)
    a. enqueues the observed delete event

Because there is a sequentiality between the creation and immediate deletion of the socket files, the observed events will have before/after relationships. Therefore, the synthetic events are generated and placed on the internal event queue (fsWatcher.Events) should also inherit that sequentiality.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the scenario described in #71440 (comment) is still racy

The synthetic create events traversePluginDir sends to the channel (for socket files encountered by filepath.Walk) are independent of (and can race with) create/delete events sent to the channel by the registered filesystem watchers.

That said, if a synthetic create event was processed after an actual observed delete event, handleCreateEvent does verify the created path still exists:

fi, err := os.Stat(event.Name)
if err != nil {
return fmt.Errorf("stat file %s failed: %v", event.Name, err)
}

I still think the raciness should be fixed in a follow up because it makes the event flow hard to understand and relies on compensation in the event handler, but in the context of this PR, it is not unsafe.

defer w.wg.Done()

if event.Op&fsnotify.Create == fsnotify.Create {
Expand Down