Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Which filesystem events correspond to which event::effect_type values? #39

Closed
saukijan opened this issue Oct 6, 2023 · 18 comments
Closed

Comments

@saukijan
Copy link

saukijan commented Oct 6, 2023

Hi @e-dant, I am a bit unsure as to what event::effect_type values are set for which filesystem events. This might be due to my lack of experience with filesystem events, or due to my debian bookworm setup (I am using Debian Bookworm with KDE Plasma 5.27.5). Could you clarify a few points, please?

  1. When I copy a file into the watched directory with cp new_file watched_dir/ command, 2 events are triggered event::effect_type::create first and then event::effect_type::modify. Is this intended or should only 1 event be set?
  2. When I rename a file in the watched directory with mv old_filename new_filename command, only the event::effect_type::rename with the old path is set. How should I get the new filename? Shouldn't a event::effect_type::create be set?
  3. Using mv new_file watched_dir/ command to move a file into the watched directory does not trigger any events at all. Shouldn't this set event::effect_type::create event?
  4. Using mv filename .. in the watched directory triggers a event::effect_type::rename and not event::effect_type::destroy. Is this intended?
  5. When I modify the watched file content event::effect_type::modify is set twice after file modifications are saved. Is this intended?
  6. When modifying a file with nano, vim, or kate editors a temporary hidden file (.filename.txt.swp or .filename.txt.kate-swp for kate) is created. This file also triggers watch events, it's not hard to filter them out by checking the event::path_name filename content, but shouldn't these files not trigger any watch events?
@e-dant
Copy link
Owner

e-dant commented Oct 6, 2023

I want to look more into 1 and 3 before I respond fully. However, here are my thoughts so far:

For 1, my sense is that the events reported are accurate -- At least from the kernel's perspective. However, there are some platform-specific oddities which we smooth over when we can. Maybe this is one which we can add logic to handle. (For example, on Darwin, we keep a map of seen created paths, and only report new create events after the path has been removed, because that system tends to over-report create events for directories.)

For 3, my sense is that we aren't watching the file, so the event isn't being reported to us. I will need to confirm that there is no information available to us that we are missing.

@e-dant
Copy link
Owner

e-dant commented Oct 6, 2023

2, 4 and 6 are accurate and intentional as-is.

5 is typical, but not necessarily correct. I will look into it. I have also seen plenty of "duplicate" modification events from editors. I should look into whether or not they are valid. One way to test this is to match the events to a system-call trace of the editor, or just tee. However, I have thought of those events as accurate, a reflection of extra work done by the editors.

@e-dant
Copy link
Owner

e-dant commented Oct 6, 2023

The interesting one -- Properly handling rename events -- Has been on my mind for many months now. I would appreciate your thoughts on how to handle them.

One of the design goals of this project is to be simple (for users to use). Representing rename events (same as move events) is, in my mind, tricky to do ergonomically for the user.

Some options:

  • Pass out a cookie to correlate rename events, and let the user match them when they get them. A benefit to this is simplicity in implementation and the easy json representation -- it's just another field. A downside is that anything interesting the user would want to do with the cookie would require them to keep a map of cookies-to-events (or a similar structure), update it on each event (which, if not done efficiently, can have downstream effects on the watcher) -- just to find correlations between events. Sounds like something the library should do. Maybe users are comfortable with that, maybe not.
  • Make the event's path_name field a union (or variant) over {path_name,{from_path_name,to_path_name}} which the user can handle in the case of rename events. Whether to represent this as a union in json (is that typical?), or just render a set of optional fields (is that more common?), for our "default" CLI, is a related but much less pressing issue. The biggest downside to this is that it requires every user to handle the variant, even if they don't care about correlated rename from/rename to events.

@e-dant
Copy link
Owner

e-dant commented Oct 7, 2023

Update on 1.

Impression is that there's either (unlikely?) a bug in how the kernel reports modify events, or (more likely?) one of the syscalls used for opening the file for writing is causing a modify event. The watcher is getting a "duplicate" event here:

$ strace -- wtr.watcher build/out/tmp
...
[pid 52043] epoll_wait(4, [], 1, 16)    = 0
[pid 52043] epoll_wait(4, [{events=EPOLLIN, data={u32=3, u64=3}}], 1, 16) = 1
[pid 52043] read(3, "\1\0\0\0\2\0\0\0\0\0\0\0 \0\0\0a-file-in-watche"..., 4096) = 48
[pid 52043] write(1, "\"1696704698022528947\":{\"effect_t"..., 145"1696704698022528947":{"effect_type":"modify","path_name":"/home/edant/dev/watcher/build/out/tmp/a-file-in-watched-dir.txt","path_type":"file"},
) = 145
[pid 52043] epoll_wait(4, [{events=EPOLLIN, data={u32=3, u64=3}}], 1, 16) = 1
[pid 52043] read(3, "\1\0\0\0\2\0\0\0\0\0\0\0 \0\0\0a-file-in-watche"..., 4096) = 48
[pid 52043] write(1, "\"1696704698022727173\":{\"effect_t"..., 145"1696704698022727173":{"effect_type":"modify","path_name":"/home/edant/dev/watcher/build/out/tmp/a-file-in-watched-dir.txt","path_type":"file"},
) = 145
[pid 52043] epoll_wait(4, [], 1, 16)    = 0

While, from echo+redirection's perspective, it only wrote once:

$ strace sh -c 'echo word > tmp/a-file-in-watched-dir.txt'
execve("/bin/sh", ["sh", "-c", "echo word > tmp/a-file-in-watche"...], 0x7ffe10664a50 /* 34 vars */) = 0
brk(NULL)                               = 0x556bbce00000
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fc234e56000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=80610, ...}, AT_EMPTY_PATH) = 0
mmap(NULL, 80610, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fc234e42000
close(3)                                = 0
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\220s\2\0\0\0\0\0"..., 832) = 832
pread64(3, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784, 64) = 784
newfstatat(3, "", {st_mode=S_IFREG|0755, st_size=1922136, ...}, AT_EMPTY_PATH) = 0
pread64(3, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784, 64) = 784
mmap(NULL, 1970000, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fc234c61000
mmap(0x7fc234c87000, 1396736, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x26000) = 0x7fc234c87000
mmap(0x7fc234ddc000, 339968, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x17b000) = 0x7fc234ddc000
mmap(0x7fc234e2f000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1ce000) = 0x7fc234e2f000
mmap(0x7fc234e35000, 53072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fc234e35000
close(3)                                = 0
mmap(NULL, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fc234c5e000
arch_prctl(ARCH_SET_FS, 0x7fc234c5e740) = 0
set_tid_address(0x7fc234c5ea10)         = 52010
set_robust_list(0x7fc234c5ea20, 24)     = 0
rseq(0x7fc234c5f060, 0x20, 0, 0x53053053) = 0
mprotect(0x7fc234e2f000, 16384, PROT_READ) = 0
mprotect(0x556bbcd94000, 8192, PROT_READ) = 0
mprotect(0x7fc234e88000, 8192, PROT_READ) = 0
prlimit64(0, RLIMIT_STACK, NULL, {rlim_cur=8192*1024, rlim_max=RLIM64_INFINITY}) = 0
munmap(0x7fc234e42000, 80610)           = 0
getuid()                                = 1000
getgid()                                = 1000
getpid()                                = 52010
rt_sigaction(SIGCHLD, {sa_handler=0x556bbcd89dc0, sa_mask=~[RTMIN RT_1], sa_flags=SA_RESTORER, sa_restorer=0x7fc234c9cfd0}, NULL, 8) = 0
geteuid()                               = 1000
getrandom("\x66\x49\x19\x0d\x5a\x95\xad\xdf", 8, GRND_NONBLOCK) = 8
brk(NULL)                               = 0x556bbce00000
brk(0x556bbce21000)                     = 0x556bbce21000
getppid()                               = 52007
newfstatat(AT_FDCWD, "/home/edant/dev/watcher/build/out", {st_mode=S_IFDIR|0755, st_size=4096, ...}, 0) = 0
newfstatat(AT_FDCWD, ".", {st_mode=S_IFDIR|0755, st_size=4096, ...}, 0) = 0
geteuid()                               = 1000
getegid()                               = 1000
rt_sigaction(SIGINT, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGINT, {sa_handler=0x556bbcd89dc0, sa_mask=~[RTMIN RT_1], sa_flags=SA_RESTORER, sa_restorer=0x7fc234c9cfd0}, NULL, 8) = 0
rt_sigaction(SIGQUIT, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGQUIT, {sa_handler=SIG_DFL, sa_mask=~[RTMIN RT_1], sa_flags=SA_RESTORER, sa_restorer=0x7fc234c9cfd0}, NULL, 8) = 0
rt_sigaction(SIGTERM, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGTERM, {sa_handler=SIG_DFL, sa_mask=~[RTMIN RT_1], sa_flags=SA_RESTORER, sa_restorer=0x7fc234c9cfd0}, NULL, 8) = 0
openat(AT_FDCWD, "tmp/a-file-in-watched-dir.txt", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
fcntl(1, F_DUPFD, 10)                   = 10
close(1)                                = 0
fcntl(10, F_SETFD, FD_CLOEXEC)          = 0
dup2(3, 1)                              = 1
close(3)                                = 0
write(1, "word\n", 5)                   = 5
dup2(10, 1)                             = 1
close(10)                               = 0
exit_group(0)                           = ?
+++ exited with 0 +++

In particular, I'm wondering if either the openat and fcntl syscalls (from the echo+redirection process) are producing a modify event (in addition to the write syscall).

@e-dant
Copy link
Owner

e-dant commented Oct 8, 2023

Update on 3.

For 3, my sense is that we aren't watching the file, so the event isn't being reported to us. I will need to confirm that there is no information available to us that we are missing.

It seems that when using the inotify adapter on linux, we don't receive an event for files which are moved-from an unwatched directory into a watched directory. This behavior is different when using fanotify -- We do get the rename event when a file is moved-from an unwatched directory into a watched directory.

To be clear, the library doesn't even seem to be notified that there is an event in this case when we're using inotify.

If you need a more precise watcher on linux, fanotify is generally the better choice. That adapter has unlimited file descriptors and fewer quirks. The downside is that you need to have root privileges. The fanotify adapter will always be used if you have a kernel version greater than or equal to 5.9.0 and you have root privileges (i.e. run with sudo).

I still want to look more into any quirks with inotify that might let us have that event. I'm not sure of any off the top of my head. Maybe there's a way...

@saukijan
Copy link
Author

saukijan commented Oct 9, 2023

Hi @e-dant, thanks for your thorough analysis!

Here are my thoughts on your findings.

For 1, my sense is that the events reported are accurate -- At least from the kernel's perspective. However, there are some platform-specific oddities which we smooth over when we can. Maybe this is one which we can add logic to handle.

That would be appreciated, though it is quite easy to solve from the user side too.

If the logic gets too complicated, it might be best just to leave this for the user to deal with. In that case, I would suggest stating in the documentation that copy commands cause 2 events and not 1.

5 is typical, but not necessarily correct. I will look into it. I have also seen plenty of "duplicate" modification events from editors. I should look into whether or not they are valid. One way to test this is to match the events to a system-call trace of the editor, or just tee. However, I have thought of those events as accurate, a reflection of extra work done by the editors.

Thanks for the info, it might be best just to leave this to the user.

In my case, I will simply keep track of file checksums and compare them on the modify event to double-check if there are any changes that need to be addressed.

Users could also use event timestamps and filter out any duplicate modification events that occur within 10ms of the last modify event.

It might be useful to add to the documentation, that modify events caused by editors will result in duplicates. That way, users know to watch out for such behavior.

The interesting one -- Properly handling rename events -- Has been on my mind for many months now. I would appreciate your thoughts on how to handle them.

I don't think cookies would be very easy to use for the user. As you state yourself, users would have to keep track of them and that's extra work for the user,

Refactoring path_name into a variant is not a bad option. It does clearly tell the user that different events contain different information. However, I think it is a bit of an overkill since at least 1 std::filesystem::path instance will always be in the variant.

I think adding an extra std::optional<std::filesystem::path> field might make more sense. This way the original event structure remains the same, so existing users, that do not need this feature, will not need to modify their code bases when updating. It would also be very simple to represent in json format. If the field exists, the optional type will have a value, otherwise, it will be set to std::nullopt. It shouldn't be too difficult on the watcher implementation either, since this field would only be set when it is needed.

Unless, of course, more information needs to be presented to the user. Does event::effect_type::owner need to provide owner information? If so an extra variant field containing all of the possible extra information would make more sense. And if there is no need for extra information, a std::monostotate could be used to show that. This approach also carries all of the benefits of the previous suggestion.

The fanotify adapter will always be used if you have a kernel version greater than or equal to 5.9.0 and you have root privileges (i.e. run with sudo).

Good to know that. It would be very useful to document when each adapter is used for Linux systems and what the difference between them is.

@e-dant
Copy link
Owner

e-dant commented Oct 21, 2023

I'll update the documentation as you suggest.

I began work on rename-from/rename-to events.

Currently, the implementation is storing a pointer to an associated event (within the existing event object).

That might change. I want to leave room for other kinds of associated events -- Not just rename events.

@e-dant
Copy link
Owner

e-dant commented Oct 29, 2023

I mentioned here: #40 (comment)

That the linux adapters (on the next branch) should be mostly ready for use. They pass all of our existing tests, and I've been running them through some shell-script tests to see how they handle rename events. Things look OK.

It will probably be a few more weeks before this is released and up on Conan.

If it is more convenient for you, using the header off the next branch is an option always available to you.

@e-dant
Copy link
Owner

e-dant commented Oct 29, 2023

Your feedback is welcome about the current implementation, especially from an API perspective. Do you have more thoughts? Does the pointer to an "associated" event seem ergonomic from a user's perspective? Are there any notes or pain-points you might come across?

@saukijan
Copy link
Author

Hi @e-dant, thanks for your work!

I created a super simple executable to play around with your new changes to see what works and what doesn't. You can find it here. I hope it's okay that I copy-pasted it, if you want, I can use git modules to link to your sources, it's just copy-paste was easier.

Anyways, here are my findings:

  1. std::unique_ptr<event> const associated{nullptr}; is a bit clunky to use. You can access it from the base event type directly, but when you need to pass it over to another function (as in my example), you have to use the raw pointer to pass it over. It works, but it would be a lot neater if associated field type was std::shared_ptr<event> instead of std::unique_ptr<event> or if it was not const and allowed the user to use the std::move semantics. That way users can stick with smart pointers.
  2. Shouldn't long long const effect_time{...} be size_t const effect_time{...} ? Can duration::count() return a negative integer?
  3. It would be nice if long long const effect_time{...} was TimePoint const effect_time instead. That way user could decide, if nanoseconds precission is desired, or maybe microseconds is enough. Timestamp formatting would also be a lot easier. A helper method to_nanoseconds() could return the old value, if it is needed.
  4. It would be helpful to have a utility function that prints which underlying API is being used(inotify, fanotify, etc...) at runtime. That way users can know exactly what API is used and how it will work.
  5. When I call mv new_file old_file I get 3 Rename events in total. Rename new_file with associated Rename event old_file and then another separate Rename event with old_file at a later time. Shouldn't this be just 2 (event, with another associated event within it)?
  6. Calling chown new_user new_file does not trigger any event at all. What triggers wtr::event::effect_type::owner?
  7. It would be neater if enum class effect_type and enum class path_type were named enum class EffectType and enum class PathType. That way, you could just use EffectType const effect_type{} and PathType const path_type{} and don't have to redefine the C++ enums as C enums for type recognition to work. It would also simplify type aliasing, changing using PathType = enum wtr::event::path_type; into using PathType = wtr::event::PathType;. If Camel case is not wanted, you could also change the Enum names into enum class effect_type_enum and enum class path_type_enum, it's a bit more verbose, but achieves the same result.

@e-dant
Copy link
Owner

e-dant commented Oct 31, 2023

  1. std::unique_ptr const associated{nullptr}; is a bit clunky to use.

I think a shared pointer sounds good here. The non-const unique pointer would be nice, maybe ideal, except that I don't want users to get tripped us with this:

void f(event&&);

void cb(event ev) {
  f(move(ev.associated));

  // Subtle bug. This will still try to use
  // ev.associated if the underlying
  // pointer is not null. I think it's
  // "unspecified" if the underlying
  // pointer is null here, but I also
  // think most (all?) unique pointer
  // implementations will make it null
  // during a move. Either way --
  // I think we can avoid any oddities here. 
  cout << ev;
}

So, I think a shared pointer seems good.

  1. Shouldn't long long const effect_time{...} be size_t const effect_time{...} ? Can duration::count() return a negative integer?

I thought about that a while ago, but I was unsure if it was possible for a (particularly odd) system to go back in time since before epoch. I'm not sure. The time types are usually signed (a-la time_t, and I think most of the time representations chrono uses as well).

I have no idea when, in a running system without any other issues, it would be possible to go back in time before epoch. Maybe in a container or VM migration the system clocks would be all out of whack, but I'm not even sure about that one.

If it's possible to go back before epoch, It's definitely not common. But, lots of those time types are signed, so I can only assume there's some reason for it.

It would probably be fine to change the type we use to a size_t, or better:

It would be nice if long long const effect_time{...} was TimePoint const effect_time instead. That way user could decide, if nanoseconds precission is desired, or maybe microseconds is enough. Timestamp formatting would also be a lot easier. A helper method to_nanoseconds() could return the old value, if it is needed.

I think we can and should do that.

  1. It would be helpful to have a utility function that prints which underlying API is being used(inotify, fanotify, etc...) at runtime. That way users can know exactly what API is used and how it will work.

I agree, and I think this would be useful in some cases.

Maybe there's room for an alternative API that does give the user more control and inspectability. I don't want that to affect the "fast path" to get a user up and running. Maybe this is an exception, I'm not sure.

In either case, I think we can push that a release or two down the road.

  1. When I call mv new_file old_file I get 3 Rename events in total. Rename new_file with associated Rename event old_file and then another separate Rename event with old_file at a later time. Shouldn't this be just 2 (event, with another associated event within it)?

These are (in all the cases I've seen them) "destructive" rename events; A rename-and-overwrite. (Did old_file exist at the time you called mv new_file old_file? If not, I'll need to do some debugging...)

A very similar pattern happens on Darwin. You can see my full notes about them here. Most of those notes apply to the Linux adapters.

I'm not sure if we should filter out and ignore the special case of a destructive rename, indicate a destroy event immediately before the destructive rename event, or just filter it out.

It might not always possible to get the full picture of a destructive rename event; We can peek ahead for all the events we know about, and we can almost always catch that pattern when we see it -- but it's not guaranteed that very closely timed events end up reported in the same call to read(), or the equivalent mechanisms on Windows and Darwin.

For now, we're just reporting them as-is. What do you think we should do there?

  1. Calling chown new_user new_file does not trigger any event at all. What triggers wtr::event::effect_type::owner?

Sounds like a bug. I think I see it, a missing flag -- TYVM

  1. It would be neater if enum class effect_type and enum class path_type were named enum class EffectType and enum class PathType.

I completely agree. It's a pain point, and I'm not sure how best to handle it. I personally think that CamelCase types make more sense, especially in this context, and I have no problem changing the type names in the future. I am also perfectly happy to rename the fields and values so that they are not shadowing one another. I would prefer to lump that along into a release or two in the future, though.

@e-dant
Copy link
Owner

e-dant commented Oct 31, 2023

Hi @e-dant, thanks for your work!

Thank you very much for your feedback

@saukijan
Copy link
Author

Hi @e-dant, great summary!

Regarding rename events, I try to ensure that it is a pure rename and not rename-and-overwrite. I do that as follows:

  1. Ensure the observed directory is empty.
  2. I first run touch new_file and expect to get Create File event (which I get)
  3. I then run mv new_file old_file and expect to get a a single Rename File event with another associated Rename File event (which I get, but just after that I get another Rename File event that points to old_file)

Here is a bit more detailed explanation:

Click me
  1. cd to test_watcher_changes project
  2. run rm -rf observed to ensure that we have a fresh dir
  3. run mkdir observed && cd observed
  4. run ../build/watcher_test_runner
    get:
Observing /test_watcher_changes/observed
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Event Type: Create
Path Type: Watcher
Path Name: s/self/live@/test_watcher_changes/observed
Effect Time: 1698747739951805717
Associated event: 
 None
================================================================================
  1. create a new console
  2. cd to test_watcher_changes/observed
  3. run touch new_file
    get
Event Type: Create
Path Type: File
Path Name: /test_watcher_changes/observed/new_file
Effect Time: 1698747785194756460
Associated event: 
 None
================================================================================
  1. run mv new_file old_file
    get
Event Type: Rename
Path Type: File
Path Name: /test_watcher_changes/observed/new_file
Effect Time: 1698747811846163594
Associated event: 
 Event Type: Rename
 Path Type: File
 Path Name: /test_watcher_changes/observed/old_file
 Effect Time: 1698747811846152650
 Associated event: 
  None
================================================================================
Event Type: Rename
Path Type: File
Path Name: /test_watcher_changes/observed/old_file
Effect Time: 1698747811846289196
Associated event: 
 None
================================================================================

I ran ./watcher_test_runner thrice, once as a non-sudo, once as a sudo user, and finally as a root user and I always got the same results. I ran these tests on a Debian Bookworm VM, hosted on Windows 10, so that might be the cause. Also, the reason why I suggested a utility function to print/return the API name as a string was because I can not tell for sure, which API inotify, fanotify, epoll, or warthog is being used when I run these tests.

Regarding

For now, we're just reporting them as-is. What do you think we should do there?

I think that is the correct approach. At least from my point of view, this library just reports various filesystem events as they are reported by the system. If the user needs to ensure that the file/directory really did change, then the user should keep a map of monitored files and their hashsums to double-check against.

It's just a bit tricky to know which filesystem events trigger which wtr::event::effect_type event, hence the original request for clarification. It might be good to have a rough explanation in the documentation, what triggers what and when. For example, creating a new file/directory always triggers one wtr::event::effect_type::create event for said file/directory, rename events come with an associated event, and so on.

@e-dant
Copy link
Owner

e-dant commented Dec 21, 2023

Update on this.

The issues arounds rename events should be mostly resolved. I plan to release soon.

Two of the issues I want to look into a bit more and grab your feedback on.

  1. Determining which watcher adapter is running

I went ahead and started work on this along with the groundwork for a user-selectable adapter. The implementation reorganizes the directory tree a bit (to elide some namespaces), changes the watcher's return type to a result-style type, and places all the watch adapter symbols in the final binary, but stubbed out to return an "unsupported" result.

I think that only being able to observe the currently-running would leave some people wanting to configure it.

This adds about 17k to the final binary and adds a tiny bit of overhead scrolling through function pointers to a supported watcher, if one isn't given. Both are probably negligible, but it does feel a bit like feature creep. Not sure about this.

  1. The owner event

This one would require keeping a map of stat structs (or a subset of them) associated with file descriptors to see what changed when we are notified about an attribute change. Kernels don't tell us much about attribute changes, like what exactly changed; just that they changed.

On linux and, perhaps, windows, this is a feasible thing to do. We already have data structures for keeping bits of information on paths. On some linux adapters, for example, we keep file descriptor to path name maps.

I'm worried about the darwin implementation of this. Not because of its feasibility, but safety. I'm convinced there's a bug somewhere in darwin's dispatch implementation, or just fsevents not behaving properly under extreme load, that leads to calling into a queue after it has been released. That's bad, and leads to a segfault. It's an extraordinarily rare issue only observable on some of the very high-throughput performance tests we run, but it exists. The current workaround is a 1-millisecond sleep after we've asked fsevents to stop, which is ugly.

I am in the process of removing the context from the darwin adapter. If it's stateless, that issue from dispatch doesn't exist. Would be hard to get those owner events without a context.

@saukijan
Copy link
Author

saukijan commented Jan 8, 2024

Hi @e-dant! Thanks for the info and sorry that I did not reply for a long while, I did not have my work setup to test your changes during the holidays.

I am glad to confirm that the latest commit on the next branch, 3a45a7f does indeed fix the renaming issue. Now I get one main 'rename' event for the old element name and one associated 'rename' event with the new name in it. However, I noticed that you still use std::unique_ptr<event> type for the associated event. Is there a problem in reworking this into a shared pointer?

Regarding the user selectable adapter. That does sound like feature creep. I would first implement the reporting functionality and only then gauge if users want/need to choose which type of adapter to use at runtime. From my point of view, it should be a compile-time decision rather than a runtime one. I don't see the need to change the adapter at runtime, only logging the type for debug purposes. If it's not hard to do and you want to implement it, my suggestion would be to add a cmake option that enables this at compile time, thus letting the user decide if this feature is desired and warrants the added binary size.

Also, could you link to the code that is suppose to do this?

Regarding the owner events. If you think it's not safe to implement it on darwin, then I would not implement it at all. Instead, I would add a disclaimer in docu/readme that owner events do not report information the same way as rename events do due to added complexity and possible segfaults on darwin. If someone needs this feature, they could always open an issue and ask for it.

Another option would be to have simple owner events (without any associated events attached) for darwin and full owner events for the systems that support it. It would be a bit messier and more confusing to the user, but with proper disclaimers, it could be done.

I am not sure how often a use-case for owner events would be, personally, I can't really think of any, so I would not spend too much time on it and mark it a help-wanted issue or a vote for feature request and just add a disclaimer in all docus.

Finally, do you have a timetable when the next release will go live?

@saukijan
Copy link
Author

saukijan commented Jan 8, 2024

I forgot one thing, love the new wtr::to<string>() converters for the enum types!

One little thing though, I would avoid overcomplicating the documented example.

Instead of declaring the auto s = [](auto a) { return to<string>(a); }; lambda and using it, I would just use the to<string>() function. From personal experience, it's best to keep examples as simple as possible to understand and let the users decide how and what to optimize.

@e-dant
Copy link
Owner

e-dant commented Jan 28, 2024

Hi @e-dant! Thanks for the info and sorry that I did not reply for a long while, I did not have my work setup to test your changes during the holidays.

I am glad to confirm that the latest commit on the next branch, 3a45a7f does indeed fix the renaming issue. Now I get one main 'rename' event for the old element name and one associated 'rename' event with the new name in it. However, I noticed that you still use std::unique_ptr<event> type for the associated event. Is there a problem in reworking this into a shared pointer?

Glad to hear it! And no, I don't recall any issue making it a shared pointer. I think we can do so.

Regarding the user selectable adapter. ...
Also, could you link to the code that is suppose to do this?

It's almost all done during compile-time. There is only one adapter used on Windows and Darwin. These folders contain inline namespaces, so the adapter::watch function, when invoked, selects the function defined once in those files. There is some extra work we do for linux which selects fanotify if we have privileges for it and it's available, inotify otherwise.

Regarding the owner events. If you think it's not safe to implement it on darwin, then I would not implement it at all. Instead, I would add a disclaimer in docu/readme that owner events do not report information the same way as rename events do due to added complexity and possible segfaults on darwin. If someone needs this feature, they could always open an issue and ask for it.

Another option would be to have simple owner events (without any associated events attached) for darwin and full owner events for the systems that support it. It would be a bit messier and more confusing to the user, but with proper disclaimers, it could be done.

I am not sure how often a use-case for owner events would be, personally, I can't really think of any, so I would not spend too much time on it and mark it a help-wanted issue or a vote for feature request and just add a disclaimer in all docus.

Finally, do you have a timetable when the next release will go live?

Two things are holding up the release:

  • Resolving safety issues on Darwin
  • Adding documentation for this library's limitations

The safety issues on Darwin are somewhat outside of our control. The tradeoff currently staged on the next branch reduces how well we pick up events concurrently because it clears any pending events before closing the watcher. That's a global OS property, so it affects everything. I don't like this, but it beats the OS giving us events when nothing exists in memory to handle them. Maybe that's ok.

The documentation just takes time, and it's something I have yet to get around to.

The release was otherwise ready a while ago. I don't have a timeline for the above two items.

@e-dant
Copy link
Owner

e-dant commented Jan 28, 2024

The safety issues on Darwin are somewhat outside of our control. The tradeoff currently staged on the next branch reduces how well we pick up events concurrently because it clears any pending events before closing the watcher. That's a global OS property, so it affects everything. I don't like this, but it beats the OS giving us events when nothing exists in memory to handle them. Maybe that's ok.

I authored and a corresponding PR was merged into notify-rs in re. the Darwin bug. I often hope someone at Apple will look into it, but I can't, their code being closed and whatnot.

@e-dant e-dant closed this as completed May 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants