New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Watching Directories in Inotify - you don't catch all created files in new subdirs. #11

Closed
ufobat opened this Issue Nov 28, 2017 · 6 comments

Comments

Projects
None yet
2 participants
@ufobat

ufobat commented Nov 28, 2017

Hi,

my Bug report applys only when you parse_events => 1 on INotify on linux.

Here https://github.com/mvgrimes/AnyEvent-Filesys-Notify/blob/master/lib/AnyEvent/Filesys/Notify/Role/Inotify2.pm#L83-L94 you're going to create Events for files that might have been created in a new created directory. Unfortunatelly you won't catch them all. As you mentioned in the comment above. This is due the the fact that you add the directory later and new files could have been created there.

  1. I didnt find this in the documentation. :-( Unless i was to stupid to find it, users probably dont know about this. and expect all eventys to be caught.

  2. You should probably add the directory to the watch list, before you scan through it for files. I would also write a PR but I don't know how to change the code. _parse_events isn't allways called and I dont want to break your design.

  3. In order to catch all events the user must scan the file for missed files as well, because when the user gets the "created directory" event it is already on the watch list. So what's happening now is that you scan the direcotry twice. This makes me feel that you're workaround is
    a) just slowing things down, any maybe
    b) leading more into troubles because now you probably notice this behavour "pretty late".

ufobat added a commit to ufobat/AnyEvent-Filesys-Notify that referenced this issue Nov 28, 2017

@mvgrimes

This comment has been minimized.

Show comment
Hide comment
@mvgrimes

mvgrimes Dec 4, 2017

Owner

Thanks for the report and PR. The patch looks promising. The one blocker is that (I think) it fails to apply the filter to the files/subdirs in the newly created directory. Aside from that, it seems to solve the race condition. The trade-off is the increased likelihood of duplicates events being fired. Have you actually experienced the race condition?

I'm not sure it is relevant, but I didn't fully follow you your third point. The current implementation does scan newly created directory (that's why there is no mention in the docs). If anything is missed, it is due to the race condition.

Owner

mvgrimes commented Dec 4, 2017

Thanks for the report and PR. The patch looks promising. The one blocker is that (I think) it fails to apply the filter to the files/subdirs in the newly created directory. Aside from that, it seems to solve the race condition. The trade-off is the increased likelihood of duplicates events being fired. Have you actually experienced the race condition?

I'm not sure it is relevant, but I didn't fully follow you your third point. The current implementation does scan newly created directory (that's why there is no mention in the docs). If anything is missed, it is due to the race condition.

mvgrimes added a commit that referenced this issue Dec 5, 2017

Adds new directories to watch *before* searching them for new entitie…
…s (GH#11)

When using `parse_events => 1` with inotify, there is a race condition
where entities in a new directory could be missed. We need to add the
new directory to the watch list before searching it for new files or
sub-directories.

Reported by @ufobat in GitHub Issue #11. Thanks!
@mvgrimes

This comment has been minimized.

Show comment
Hide comment
@mvgrimes

mvgrimes Dec 5, 2017

Owner

I just pushed a new commit (8324180) that to address the race condition. Let me know if you agree, and I'll release a new version.

I'm not sure, but this might change the behavior when ignoring directories in some rare situations. If someone is ignoring qr{/ignoreme$}, I could envision a situation where ignoreme/sub2/ ends up being watched. This is consistent with parse_events => 0, but it might not have been watched with parse_events => 1. I need to think it through it a bit more.

Owner

mvgrimes commented Dec 5, 2017

I just pushed a new commit (8324180) that to address the race condition. Let me know if you agree, and I'll release a new version.

I'm not sure, but this might change the behavior when ignoring directories in some rare situations. If someone is ignoring qr{/ignoreme$}, I could envision a situation where ignoreme/sub2/ ends up being watched. This is consistent with parse_events => 0, but it might not have been watched with parse_events => 1. I need to think it through it a bit more.

@ufobat

This comment has been minimized.

Show comment
Hide comment
@ufobat

ufobat Dec 5, 2017

Thanks for the report and PR. The patch looks promising. The one blocker is that (I think) it fails to apply the filter to the files/subdirs in the newly created directory.

Really? I tried to read the code and I dont see why. it still calles _parse_event() which applies the filter. What did I miss?
https://github.com/ufobat/AnyEvent-Filesys-Notify/blob/c335290c51eacac61138bda7b41faf5cf626423b/lib/AnyEvent/Filesys/Notify/Role/Inotify2.pm#L141

The trade-off is the increased likelihood of duplicates events being fired. Have you actually experienced the race condition?

Aye this is increased. That should probably be mentioned in the readme/documentation :-)

Have you actually experienced the race condition?

Aye. It happend to me in my daylight job :-) And that's what i tried to explain in my 3rd point. because my workaround in my code is to rescan new created dirs. This fix will make this workaround in my code obsolete.

ufobat commented Dec 5, 2017

Thanks for the report and PR. The patch looks promising. The one blocker is that (I think) it fails to apply the filter to the files/subdirs in the newly created directory.

Really? I tried to read the code and I dont see why. it still calles _parse_event() which applies the filter. What did I miss?
https://github.com/ufobat/AnyEvent-Filesys-Notify/blob/c335290c51eacac61138bda7b41faf5cf626423b/lib/AnyEvent/Filesys/Notify/Role/Inotify2.pm#L141

The trade-off is the increased likelihood of duplicates events being fired. Have you actually experienced the race condition?

Aye this is increased. That should probably be mentioned in the readme/documentation :-)

Have you actually experienced the race condition?

Aye. It happend to me in my daylight job :-) And that's what i tried to explain in my 3rd point. because my workaround in my code is to rescan new created dirs. This fix will make this workaround in my code obsolete.

@ufobat

This comment has been minimized.

Show comment
Hide comment
@ufobat

ufobat Dec 5, 2017

Let me know if you agree, and I'll release a new version.

I don't quite understand why you had to change the order of the calles from _parse_events and _filter_events. is it because of the difference in the behavour depending on the parse_events setting?

I think I found a bug in both of our commits. What if you create a directory parent and within this directory you create subdir. If you just find subdir when you scan the directory parent we dont add a watcher for those dirs, we just create "dummy" events.

There is a debug line: 8324180#diff-95cdfb59f7cb21b7bb990a096301664bR131

ufobat commented Dec 5, 2017

Let me know if you agree, and I'll release a new version.

I don't quite understand why you had to change the order of the calles from _parse_events and _filter_events. is it because of the difference in the behavour depending on the parse_events setting?

I think I found a bug in both of our commits. What if you create a directory parent and within this directory you create subdir. If you just find subdir when you scan the directory parent we dont add a watcher for those dirs, we just create "dummy" events.

There is a debug line: 8324180#diff-95cdfb59f7cb21b7bb990a096301664bR131

@mvgrimes

This comment has been minimized.

Show comment
Hide comment
@mvgrimes

mvgrimes Dec 5, 2017

Owner

Really? I tried to read the code and I dont see why. it still calles _parse_event() which applies the filter. What did I miss?

That line is the callback when a new event is fired. None of the events created is your _parse_events_postprocess are filtered.

I think I found a bug in both of our commits. What if you create a directory parent and within this directory you create subdir. If you just find subdir when you scan the directory parent we dont add a watcher for those dirs, we just create "dummy" events.

I think you're correct. We need a call to _add_events_to_watch in _add_entities_in_subdir.

There is a debug line: 8324180#diff-95cdfb59f7cb21b7bb990a096301664bR131

Good catch.

Owner

mvgrimes commented Dec 5, 2017

Really? I tried to read the code and I dont see why. it still calles _parse_event() which applies the filter. What did I miss?

That line is the callback when a new event is fired. None of the events created is your _parse_events_postprocess are filtered.

I think I found a bug in both of our commits. What if you create a directory parent and within this directory you create subdir. If you just find subdir when you scan the directory parent we dont add a watcher for those dirs, we just create "dummy" events.

I think you're correct. We need a call to _add_events_to_watch in _add_entities_in_subdir.

There is a debug line: 8324180#diff-95cdfb59f7cb21b7bb990a096301664bR131

Good catch.

@ufobat

This comment has been minimized.

Show comment
Hide comment
@ufobat

ufobat Dec 5, 2017

you're right!

ufobat commented Dec 5, 2017

you're right!

@mvgrimes mvgrimes closed this Mar 22, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment