Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ignored symlinks cause original files to be skipped? #508

Open
naps62 opened this issue Feb 28, 2023 · 8 comments
Open

Ignored symlinks cause original files to be skipped? #508

naps62 opened this issue Feb 28, 2023 · 8 comments
Labels
bug Something's not right! has workaround Bug has a workaround, so it might not need urgent attention

Comments

@naps62
Copy link

naps62 commented Feb 28, 2023

I'm running watchexec over a yarn workspace (the builtin build watcher I'm using was showing some limitations).

  • Watchexec's version: 1.21.1
  • The OS you're using: Arch linux

yarn workspaces are similar to cargo, in that you can have multiple sub-packages depending on each other. However, the way that works is via symlinks: if @my-package/a depends on @my-package/z, then there'll be a symlink node_modules/@my-package/z pointing to ./z where the source code lives

node_modules is gitignore'd, of course.
And apparently, this causes any changes within ./z to be skipped as well

notice that I used z for the problematic package. This seems to only happen if alphabetical order causes node_modules to be listed first. Renaming z to b makes the problem go away

Here's a quick script with which I can reproduce this consistenly:

mkdir foo
cd foo
mkdir a z node_modules
echo "node_modules" > .gitignore
ln -s $PWD/z node_modules/z
watchexec --restart --exts ts --print-events -- "true"

touch a/main.ts # works
touch z/main.ts # fails
@naps62 naps62 added the bug Something's not right! label Feb 28, 2023
@passcod
Copy link
Member

passcod commented Feb 28, 2023

That's super weird, thanks for the short reproduction!

@passcod
Copy link
Member

passcod commented Feb 28, 2023

Okay, this is quite odd but annoyingly I can't might not be able to do anything about it, because I only ever get the symlink's event, not the original path's. To see what I'm seeing, try running with -vvvv (trace mode). You'll get logs looking like

TRACE watchexec::fs: receiving possible event from watcher

which are directly from the notify crate. When I run your example with z symlinked, I only get these events:

  • Modify(Metadata(Any)) at /home/code/rust/foo/node_modules/z/main.ts
  • Access(Close(Write)) at /home/code/rust/foo/node_modules/z/main.ts

which get correctly filtered off.

When I instead symlink a, I get the same events but for /home/code/rust/foo/a/main.ts


However, using inotifywatch -r . from community/inotify-tools returns the expected behaviour:

# with a symlinked, touching a/main.ts
$ inotifywatch -r .                                                     
Establishing watches...                                                  
Finished establishing watches, now collecting statistics.                
^Ctotal  access  attrib  close_write  close_nowrite  open  filename      
11     5       0       0            3              3     ./              
6      1       1       1            1              2     ./a/            
                                                                         
# with z symlinked, touching z/main.ts
$ inotifywatch -r .                                                     
Establishing watches...                                                  
Finished establishing watches, now collecting statistics.                
^Ctotal  access  attrib  close_write  close_nowrite  open  filename      
16     6       0       0            5              5     ./              
6      1       1       1            1              2     ./z/            
                                                                         
# with z symlinked, touching node_modules/z/main.ts
$ inotifywatch -r .                                                     
Establishing watches...                                                  
Finished establishing watches, now collecting statistics.                
^Ctotal  access  attrib  close_write  close_nowrite  open  filename      
48     20      0       0            14             14    ./              
20     7       1       1            5              6     ./z/            
7      3       0       0            2              2     ./node_modules/

which is curious.

@passcod
Copy link
Member

passcod commented Feb 28, 2023

With strace -f -e inotify_add_watch, here's what watchexec watches:

[pid 3652996] inotify_add_watch(9, "/home/code/rust/foo", IN_MODIFY|IN_ATTRIB|IN_CLOSE_WRITE|IN_MOVED_FROM|IN_MOVED_TO|IN_CREATE|IN_DELETE|IN_DELETE_SELF|IN_MOVE_SELF) = 1
[pid 3652996] inotify_add_watch(9, "/home/code/rust/foo/a", IN_MODIFY|IN_ATTRIB|IN_CLOSE_WRITE|IN_MOVED_FROM|IN_MOVED_TO|IN_CREATE|IN_DELETE) = 2
[pid 3652996] inotify_add_watch(9, "/home/code/rust/foo/z", IN_MODIFY|IN_ATTRIB|IN_CLOSE_WRITE|IN_MOVED_FROM|IN_MOVED_TO|IN_CREATE|IN_DELETE) = 3
[pid 3652996] inotify_add_watch(9, "/home/code/rust/foo/node_modules", IN_MODIFY|IN_ATTRIB|IN_CLOSE_WRITE|IN_MOVED_FROM|IN_MOVED_TO|IN_CREATE|IN_DELETE) = 4
[pid 3652996] inotify_add_watch(9, "/home/code/rust/foo/node_modules/z", IN_MODIFY|IN_ATTRIB|IN_CLOSE_WRITE|IN_MOVED_FROM|IN_MOVED_TO|IN_CREATE|IN_DELETE) = 3

and what inotifywatch watches:

inotify_add_watch(3, "./a/", IN_ACCESS|IN_MODIFY|IN_ATTRIB|IN_CLOSE_WRITE|IN_CLOSE_NOWRITE|IN_OPEN|IN_MOVED_FROM|IN_MOVED_TO|IN_CREATE|IN_DELETE|IN_DELETE_SELF|IN_MOVE_SELF) = 1
inotify_add_watch(3, "./z/", IN_ACCESS|IN_MODIFY|IN_ATTRIB|IN_CLOSE_WRITE|IN_CLOSE_NOWRITE|IN_OPEN|IN_MOVED_FROM|IN_MOVED_TO|IN_CREATE|IN_DELETE|IN_DELETE_SELF|IN_MOVE_SELF) = 2
inotify_add_watch(3, "./node_modules/", IN_ACCESS|IN_MODIFY|IN_ATTRIB|IN_CLOSE_WRITE|IN_CLOSE_NOWRITE|IN_OPEN|IN_MOVED_FROM|IN_MOVED_TO|IN_CREATE|IN_DELETE|IN_DELETE_SELF|IN_MOVE_SELF) = 3
inotify_add_watch(3, "./", IN_ACCESS|IN_MODIFY|IN_ATTRIB|IN_CLOSE_WRITE|IN_CLOSE_NOWRITE|IN_OPEN|IN_MOVED_FROM|IN_MOVED_TO|IN_CREATE|IN_DELETE|IN_DELETE_SELF|IN_MOVE_SELF) = 4

The latter watches relative paths and the former absolute but I think that's irrelevant; something that is interesting is that inotifywatch doesn't add a watch on ./node_modules/z; it might be following symlinks and figuring that it's already watched by ./z/ earlier, which probably helps here.

The other glaring difference is the set of events that are being watched. Notable watchexec is missing IN_CLOSE_NOWRITE and IN_OPEN. However, using -e modify -e attrib -e close_write -e moved_from -e moved_to -e create -e delete -e delete_self -e move_self to restrict inotifywatch to watchexec's event set didn't change the behaviour, so it must be the extraneous watch on /home/code/rust/foo/node_modules/z that watchexec does that affects this behaviour.

...and yet, if I get inotifywatch to watch the paths watchexec does, with the set events watchexec asks for, I get the correct behaviour still:

# touching z/main.ts
strace -f -e inotify_add_watch -- inotifywatch -e modify -e attrib -e close_write -e moved_from -e moved_to -e create -e delete -e delete_self -e move_self . a/ z/ node_modules/ node_modules/z/
Establishing watches...
inotify_add_watch(3, ".", IN_MODIFY|IN_ATTRIB|IN_CLOSE_WRITE|IN_MOVED_FROM|IN_MOVED_TO|IN_CREATE|IN_DELETE|IN_DELETE_SELF|IN_MOVE_SELF) = 1
inotify_add_watch(3, "a/", IN_MODIFY|IN_ATTRIB|IN_CLOSE_WRITE|IN_MOVED_FROM|IN_MOVED_TO|IN_CREATE|IN_DELETE|IN_DELETE_SELF|IN_MOVE_SELF) = 2
inotify_add_watch(3, "z/", IN_MODIFY|IN_ATTRIB|IN_CLOSE_WRITE|IN_MOVED_FROM|IN_MOVED_TO|IN_CREATE|IN_DELETE|IN_DELETE_SELF|IN_MOVE_SELF) = 3
inotify_add_watch(3, "node_modules/", IN_MODIFY|IN_ATTRIB|IN_CLOSE_WRITE|IN_MOVED_FROM|IN_MOVED_TO|IN_CREATE|IN_DELETE|IN_DELETE_SELF|IN_MOVE_SELF) = 4
inotify_add_watch(3, "node_modules/z/", IN_MODIFY|IN_ATTRIB|IN_CLOSE_WRITE|IN_MOVED_FROM|IN_MOVED_TO|IN_CREATE|IN_DELETE|IN_DELETE_SELF|IN_MOVE_SELF) = 3
Finished establishing watches, now collecting statistics.
^Cstrace: Process 3660586 detached
total  attrib  close_write  filename
2      1       1            z/

It's also doing it if I use absolute paths.

I am confused.

@passcod
Copy link
Member

passcod commented Feb 28, 2023

As a workaround, polling mode does work.

Also watchexec shouldn't even be watching node_modules in the first place, as it's ignored, so something's weird here too. Forcing it to watch everything but node_modules (with lots of -w, unfortunately there's no direct option for it) does behave properly without polling. If you've got a small amount of packages (or can script that up) that might be worth doing.

@passcod passcod added OS: Linux Affects Linux specifically has workaround Bug has a workaround, so it might not need urgent attention and removed OS: Linux Affects Linux specifically labels Feb 28, 2023
@naps62
Copy link
Author

naps62 commented Feb 28, 2023

@passcod one workaround I found, perhaps simpler is:
watchexec --restart --exts ts --print-events --no-project-ignore --ignore "node_modules/*" -- "true"

at least for my case, adding multiple --ignore is simpler and less error prone than individually watching packages (since I'll be adding more as time goes on)

events on z/ still show up as coming from ./node_modules/z, but that's irrelevant here since the restart trigger is all I care about

@Gelio
Copy link

Gelio commented Oct 25, 2023

I am facing a similar issue in nrwl/nx#19312. It is also related to ignoring files that have a symlink originating from node_modules.

I found that using fs::canonicalize helps follow the symlink out of node_modules to the target, where it no longer includes node_modules, and is no longer ignored.

Ideally, the notify crate would call canonicalize before setting up the watcher. This way any events from inotify already use the canonical target path. When I added .canonicalize() to the code in notify that adds the watchers recursively:

diff --git a/notify/src/inotify.rs b/notify/src/inotify.rs
index ccf8187..0a5ef86 100644
--- a/notify/src/inotify.rs
+++ b/notify/src/inotify.rs
@@ -375,7 +375,15 @@ impl EventLoop {
             .into_iter()
             .filter_map(filter_dir)
         {
-            self.add_single_watch(entry.path().to_path_buf(), is_recursive, watch_self)?;
+            self.add_single_watch(
+                entry
+                    .path()
+                    .to_path_buf()
+                    .canonicalize()
+                    .expect("canonicalize"),
+                is_recursive,
+                watch_self,
+            )?;
             watch_self = false;
         }

I got the correct events from the notify crate:

Ok(Event { kind: Modify(Metadata(Any)), paths: ["/home/voreny/projects/nx-watcher-repro/packages/lib/src/lib.ts"], attr:tracker: None, attr:flag: None, attr:info: None, attr:source: None })

whereas previously it was:

Ok(Event { kind: Modify(Metadata(Any)), paths: ["/home/voreny/projects/watchtest/../nx-watcher-repro/node_modules/lib/src/lib.ts"], attr:tracker: None, attr:flag: None, attr:info: None, attr:source: None })

This would be a change to the notify crate, which I am not sure is reasonable.

If we wanted to fix that in watchexec, we could call canonicalize on the paths that we receive from the notify crate, and before we do the filtering. This would turn the path like /home/voreny/projects/watchtest/../nx-watcher-repro/node_modules/lib/src/lib.ts to /home/voreny/projects/nx-watcher-repro/packages/lib/src/lib.ts.

Looking at the source code, I found that adding canonicalize to

let path = dunce::simplified(path);
helps resolve the path:

  2023-10-25T10:20:50.947724Z TRACE watchexec::action::worker: got event, event: Event { tags: [Source(Filesystem), FileEventKind(Modify(Metadata(Any))), Path { path: "/home/voreny/projects/nx-watcher-repro/node_modules/lib/src/lib.ts", file_type: Some(File) }], metadata: {} }, priority: Normal
    at crates/lib/src/action/worker.rs:77

  2023-10-25T10:20:50.947736Z TRACE watchexec_cli::filterer::globset: check against original event
    at crates/cli/src/filterer/globset.rs:136

  2023-10-25T10:20:50.947753Z TRACE watchexec_filterer_globset: new
    at crates/filterer/globset/src/lib.rs:116
    in watchexec_filterer_globset::filterer_check

  2023-10-25T10:20:50.947768Z TRACE watchexec_filterer_globset: checking internal ignore filterer
    at crates/filterer/globset/src/lib.rs:119
    in watchexec_filterer_globset::filterer_check

  2023-10-25T10:20:50.947784Z TRACE watchexec_filterer_ignore: new
    at crates/filterer/ignore/src/lib.rs:34
    in watchexec_filterer_ignore::filterer_check
    in watchexec_filterer_globset::filterer_check

  2023-10-25T10:20:50.947819Z TRACE watchexec_filterer_ignore: new
    at crates/filterer/ignore/src/lib.rs:41
    in watchexec_filterer_ignore::checking_against_compiled with path: "/home/voreny/projects/nx-watcher-repro/packages/lib/src/lib.ts", file_type: Some(File)
    in watchexec_filterer_ignore::filterer_check
    in watchexec_filterer_globset::filterer_check

  2023-10-25T10:20:50.947846Z TRACE ignore_files::filter: checking against path or parents
    at crates/ignore-files/src/filter.rs:322
    in watchexec_filterer_ignore::checking_against_compiled with path: "/home/voreny/projects/nx-watcher-repro/packages/lib/src/lib.ts", file_type: Some(File)
    in watchexec_filterer_ignore::filterer_check
    in watchexec_filterer_globset::filterer_check

  2023-10-25T10:20:50.947878Z TRACE watchexec_filterer_ignore: no match (pass)
    at crates/filterer/ignore/src/lib.rs:46
    in watchexec_filterer_ignore::checking_against_compiled with path: "/home/voreny/projects/nx-watcher-repro/packages/lib/src/lib.ts", file_type: Some(File)
    in watchexec_filterer_ignore::filterer_check
    in watchexec_filterer_globset::filterer_check

  2023-10-25T10:20:50.947900Z TRACE watchexec_filterer_ignore: close, time.busy: 63.4µs, time.idle: 18.2µs
    at crates/filterer/ignore/src/lib.rs:41
    in watchexec_filterer_ignore::checking_against_compiled with path: "/home/voreny/projects/nx-watcher-repro/packages/lib/src/lib.ts", file_type: Some(File)
    in watchexec_filterer_ignore::filterer_check
    in watchexec_filterer_globset::filterer_check

  2023-10-25T10:20:50.947918Z TRACE watchexec_filterer_ignore: verdict, pass: true
    at crates/filterer/ignore/src/lib.rs:64
    in watchexec_filterer_ignore::filterer_check
    in watchexec_filterer_globset::filterer_check

@passcod
Copy link
Member

passcod commented Oct 26, 2023

Yeah, but canonicalize does I/O, which we don't really want at that point.

If doing the canonicalize in notify fixes it, then perhaps the solution is to do it in watchexec, reimplementing the notify recurse logic ourselves.

@jeremysf
Copy link

This issue is brutal for pnpm managed repositories where there are a ton of symlinks between node_modules directories with similar issues as described by the original poster.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something's not right! has workaround Bug has a workaround, so it might not need urgent attention
Projects
None yet
Development

No branches or pull requests

4 participants