inotify: Watcher.Remove may deadlock when multiple fs events occur #195
Have the same issue here. Wrapping Remove in a goroutine really isn't a feasible alternative, since you need to check for errors, and the goroutine that owns the watcher and the list of watched directories (necessary because we have no recursive API) needs to know whether that goroutine succeeded or failed before it can go ahead and delete the path from the watched dirs. This really complicates implementing a recursive watcher :(
Having this problem as well; please see the reference above. What can we do about this?
Several people have reported this issue: if you are using a single goroutine to watch for fs events and you call Remove from that goroutine, it can deadlock. The cause is that Remove was made synchronous by PR fsnotify#73, in order to ensure that the internal maps were no longer leaking. In that PR, IN_IGNORE was used as the event that signals map cleanup. This worked fine when Remove() was called and the next event was IN_IGNORE. But when a different event was received first, the main goroutine that is supposed to be reading from the Events channel would be stuck waiting on the sync.Cond, which would never be signaled: the select would block waiting for someone to receive the non-IN_IGNORE event from the channel before it could proceed to process the IN_IGNORE event waiting behind it in the queue. Deadlock :)

Simply removing the synchronization then created two nasty races: a Remove followed by another Remove would error unnecessarily, and a Remove followed by an Add could result in the maps being cleaned up AFTER the Add call, which means the inotify watch is active but our maps no longer have the values. It then becomes impossible to delete the watch via the fsnotify code, since it checks its local data before calling InotifyRemove.

This code instead uses IN_DELETE_SELF as a means to know when a watch was deleted as part of an unlink(). In that case we did not delete the watch via the fsnotify lib, and we should clean up our maps since that watch no longer exists. This allows us to clean up the maps immediately when calling Remove, since we no longer try to synchronize cleanup using IN_IGNORE as the sync point.

- Fix fsnotify#195
- Fix fsnotify#123
- Fix fsnotify#115
Before reporting an issue, please ensure you are using the latest release of fsnotify.
Which operating system (GOOS) and version are you using?
Distributor ID: Ubuntu
Description: Ubuntu 14.04.1 LTS
Release: 14.04
Codename: trusty
go version: go1.7.3 linux/amd64
Please describe the issue that occurred.
I ran into this issue, and from the issue list other people seem to have a similar problem, so I spent some time digging into the cause.
The code I am using is the latest commit, fd9ec7d.
Assuming the select loop for events is sequential, see the code example below.
Inotify may generate multiple events at the same time. When this happens, if one of them triggers a call to Watcher.Remove on any valid watch path (e.g. for error handling), Remove will block on the Watcher's condition variable (inotify.go:158), waiting for a broadcast on the sender side (inotify.go:299). However, the channel size is 1, so the sender will also block waiting for the select loop to consume events, and that loop is already stuck waiting on the condition variable.
I tried increasing the event channel size, which allows the sender side to proceed; alternatively, one could call Remove in a goroutine to avoid the deadlock. But I am not sure either is ideal.
Are you able to reproduce the issue? Please provide steps to reproduce and a code sample if possible.
The code below reproduces the issue: a test folder "tests" has a file "file1" inside, and the code renames "tests/file1" to "file2" (which generates two events). If Remove is called inside the select loop, it will trigger the deadlock alarm. Changing `watcher.Remove` to `go watcher.Remove`, or enlarging the watcher's Events channel, would fix it.