| 1 | +# Changes from upstream `@parcel/watcher` |
| 2 | + |
| 3 | +This Go port started from the C++ |
| 4 | +[`@parcel/watcher`](https://github.com/parcel-bundler/watcher) (v2.5.6, |
| 5 | +`8926bb8`) and has diverged significantly. This document covers API differences, |
| 6 | +simplifications, new features, and bugfixes. |
| 7 | + |
| 8 | +## API differences |
| 9 | + |
| 10 | +### Method naming |
| 11 | + |
| 12 | +| C++ / JS | Go | |
| 13 | +| ---------------------- | ---------------------------------- | |
| 14 | +| `subscribe(dir, fn)` | `WatchDirectory(dir, fn, opts...)` | |
| 15 | +| — | `WatchFile(path, fn)` | |
| 16 | +| `unsubscribe(dir, fn)` | `w.Close()` | |
| 17 | + |
| 18 | +### Recursion default |
| 19 | + |
| 20 | +C++ `subscribe` is always recursive. Go's `WatchDirectory` is **non-recursive by |
| 21 | +default**, watching only direct children. Pass `WithRecursive()` to watch the |
| 22 | +entire tree. This matches TypeScript's `watchDirectory(path, cb, recursive?)` |
| 23 | +where recursive is opt-in. |
| 24 | + |
| 25 | +### Event kinds |
| 26 | + |
| 27 | +C++ has three event kinds: create, update, delete. Go has two: **`EventUpdate`** |
| 28 | +and **`EventDelete`**. File creation is reported as `EventUpdate`. `tsc --watch` |
| 29 | +doesn't distinguish between a file being created and a file being modified; both |
| 30 | +mean "something changed, rebuild." This also sidesteps a C++ FSEvents bug where |
| 31 | +pre-existing files are misclassified as "created" because the internal tree |
| 32 | +starts empty at subscribe time. |
| 33 | + |
| 34 | +### Watch options |
| 35 | + |
| 36 | +Go adds functional options not present in the C++ API: |
| 37 | + |
| 38 | +- **`WithRecursive()`**: opt in to recursive directory tree watching. |
| 39 | +- **`WithIgnore(func(path string) bool)`**: filter events per-subscriber before |
| 40 | + delivery. Return true to drop. |
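The functional-option pattern behind these two options can be sketched as follows. This is a minimal illustration, not the port's actual code: `watchConfig`, `newWatchConfig`, and `dropped` are hypothetical names, and the real option type may carry more state.

```go
package main

import (
	"fmt"
	"strings"
)

// watchConfig is a hypothetical holder for per-subscription settings.
type watchConfig struct {
	recursive bool
	ignore    func(path string) bool
}

// WatchOption mutates a watchConfig; the real option type may differ.
type WatchOption func(*watchConfig)

// WithRecursive opts in to watching the whole directory tree.
func WithRecursive() WatchOption {
	return func(c *watchConfig) { c.recursive = true }
}

// WithIgnore installs a per-subscriber filter; returning true drops the event.
func WithIgnore(fn func(path string) bool) WatchOption {
	return func(c *watchConfig) { c.ignore = fn }
}

// newWatchConfig applies the options on top of the non-recursive default.
func newWatchConfig(opts ...WatchOption) watchConfig {
	var c watchConfig
	for _, opt := range opts {
		opt(&c)
	}
	return c
}

// dropped reports whether an event for path is filtered out before delivery.
func (c watchConfig) dropped(path string) bool {
	return c.ignore != nil && c.ignore(path)
}

func main() {
	cfg := newWatchConfig(
		WithRecursive(),
		WithIgnore(func(p string) bool { return strings.Contains(p, "/node_modules/") }),
	)
	fmt.Println(cfg.recursive, cfg.dropped("/src/node_modules/a.js"), cfg.dropped("/src/a.ts"))
}
```

With no options the zero value applies, which matches the non-recursive default described under "Recursion default".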
| 41 | + |
| 42 | +### File watching |
| 43 | + |
| 44 | +`WatchFile(path, fn)` watches a single file by watching its parent directory |
| 45 | +non-recursively and filtering events to the target path. Multiple file watches |
| 46 | +in the same directory share one OS watch. The C++ API has no equivalent. |
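A rough sketch of the parent-directory approach, under the assumption that file watches are grouped by their parent directory. The `fileWatches` type and its methods are hypothetical, and for brevity the sketch keeps only one callback per file path.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// fileWatches maps a parent directory to the file paths watched inside it.
// One map key stands in for one shared OS watch on that directory.
type fileWatches map[string]map[string]func(path string)

// add registers fn for path and reports whether a new directory watch would
// have to be created (false when the parent directory is already watched).
func (fw fileWatches) add(path string, fn func(path string)) bool {
	path = filepath.Clean(path)
	dir := filepath.Dir(path)
	subs, ok := fw[dir]
	if !ok {
		subs = make(map[string]func(string))
		fw[dir] = subs
	}
	subs[path] = fn
	return !ok
}

// dispatch delivers a directory-level event only to the watch whose target
// path matches; events for sibling files are filtered out.
func (fw fileWatches) dispatch(eventPath string) {
	eventPath = filepath.Clean(eventPath)
	if fn, ok := fw[filepath.Dir(eventPath)][eventPath]; ok {
		fn(eventPath)
	}
}

func main() {
	fw := fileWatches{}
	cb := func(p string) { fmt.Println("changed:", p) }
	fmt.Println(fw.add("/proj/tsconfig.json", cb)) // first watch in /proj
	fmt.Println(fw.add("/proj/package.json", cb))  // reuses the /proj watch
	fw.dispatch("/proj/tsconfig.json")             // delivered
	fw.dispatch("/proj/unrelated.txt")             // filtered out
}
```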
| 47 | + |
| 48 | +### Error delivery |
| 49 | + |
| 50 | +C++ delivers errors via a separate error callback or return value. Go delivers |
| 51 | +errors through the same `WatchCallback(events, err)` with sentinel errors: |
| 52 | + |
| 53 | +- `ErrOverflow`: recoverable, the watch stays active. |
| 54 | +- `ErrWatchTerminated`: terminal, call `Close()` to clean up. |
| 55 | + |
| 56 | +`ErrUnavailable` is returned directly from `WatchDirectory`/`WatchFile` (not |
| 57 | +through the callback) when the watcher is not supported on the current platform. |
| 58 | + |
| 59 | +## Simplifications |
| 60 | + |
| 61 | +### No in-memory directory tree |
| 62 | + |
| 63 | +C++ maintains an in-memory `DirTree` for every subscription on every backend, |
| 64 | +storing path, type, and mtime for every watched file. The tree serves two |
| 65 | +purposes: mtime-based event dedup (suppressing events when the mtime hasn't |
| 66 | +changed) and create-vs-update classification (if a path is in the tree it's an |
| 67 | +update, otherwise it's a create). |
| 68 | + |
| 69 | +Go removes the tree entirely on inotify, fanotify, Windows, and FSEvents. With |
| 70 | +mtime tracking removed and only two event kinds (update and delete), the tree |
| 71 | +became write-only on those backends: populated during setup and event handling |
| 72 | +but never read from. Event classification relies on kernel flags instead of stat |
| 73 | +calls, eliminating O(events) syscalls from the hot path. kqueue needs a |
| 74 | +path-to-fd mapping (kqueue identifies events by fd, not path), but uses a flat |
| 75 | +map holding only path and isDir. |
| 76 | + |
| 77 | +C++ also maintains a separate lazily-populated `DirTree` for FSEvents, used for |
| 78 | +create/update classification. Because the tree starts empty at subscribe time, |
| 79 | +pre-existing files aren't in it, and the first modification of any pre-existing |
| 80 | +file is misclassified as "create" instead of "update." Go's FSEvents backend |
| 81 | +classifies events using only the kernel-provided flags. Pure |
| 82 | +create/remove/modify cases need zero syscalls; only the ambiguous-flags case |
| 83 | +(multiple flags set) does one `Lstat` to check existence. |
| 84 | + |
| 85 | +### No attribute events |
| 86 | + |
| 87 | +C++ watches `IN_ATTRIB` (inotify), `FAN_ATTRIB` (fanotify), and |
| 88 | +`FILE_NOTIFY_CHANGE_ATTRIBUTES` (Windows). Go removes all three from the watch |
| 89 | +masks. `chmod`, `chown`, and other metadata-only changes don't trigger events. |
| 90 | +kqueue still receives `NOTE_ATTRIB` (needed for truncate on some BSDs), but the |
| 91 | +events are delivered as `EventUpdate` without special handling. |
| 92 | + |
| 93 | +### Simpler event coalescing |
| 94 | + |
| 95 | +With only two event kinds (update, delete), the `eventList` coalescing logic is |
| 96 | +simpler: |
| 97 | + |
| 98 | +- `create + delete` within one batch cancels out (the entry is skipped). |
| 99 | +- `delete + create` becomes update (the rapid delete+recreate pattern). |
| 100 | +- `update + delete` yields delete. |
| 101 | +- `delete + update` yields delete (a bare `update` does not resurrect a deleted |
| 102 | + entry; only an explicit `create` does). |
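The four rules above can be modelled as a small per-path fold. This is a sketch of the rules as listed, not the port's `eventList` implementation; the `rawKind`, `outKind`, and `entry` names are hypothetical, and it assumes backends still report a raw create internally even though creation is delivered as an update.

```go
package main

import "fmt"

// rawKind is what a backend reports for a path within one batch.
type rawKind int

const (
	rawCreate rawKind = iota
	rawUpdate
	rawDelete
)

// outKind is what the subscriber would see after coalescing.
type outKind int

const (
	outSkip   outKind = iota // the entry cancelled out; nothing is delivered
	outUpdate                // delivered as an update (covers creation too)
	outDelete                // delivered as a delete
)

func (k outKind) String() string {
	return [...]string{"skip", "update", "delete"}[k]
}

type entry struct {
	out     outKind
	created bool // the path first appeared via a create inside this batch
}

// coalesce folds the raw events for a single path, in arrival order.
func coalesce(events ...rawKind) outKind {
	var e entry
	seen := false
	for _, k := range events {
		switch k {
		case rawCreate:
			if !seen || e.out == outSkip {
				e = entry{out: outUpdate, created: true} // new within this batch
			} else if e.out == outDelete {
				e = entry{out: outUpdate} // delete + create: rapid recreate
			}
		case rawUpdate:
			if !seen {
				e = entry{out: outUpdate}
			}
			// A bare update never resurrects a deleted or skipped entry.
		case rawDelete:
			if seen && (e.created || e.out == outSkip) {
				e = entry{out: outSkip} // create + delete cancels out
			} else {
				e = entry{out: outDelete}
			}
		}
		seen = true
	}
	return e.out
}

func main() {
	fmt.Println(coalesce(rawCreate, rawDelete))
	fmt.Println(coalesce(rawDelete, rawCreate))
	fmt.Println(coalesce(rawUpdate, rawDelete))
	fmt.Println(coalesce(rawDelete, rawUpdate))
}
```

In this model a create, delete, create sequence ends as a single update, since the path exists at the end of the batch.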
| 103 | + |
| 104 | +### Per-backend debouncer |
| 105 | + |
| 106 | +Upstream uses one process-wide `Debounce::getShared()` singleton that batches |
| 107 | +events for every `Watcher` in the process. This is a fine choice for |
| 108 | +parcel-watcher's setting: Node consumers serialize through the libuv event loop |
| 109 | +anyway, so spawning multiple debounce threads wouldn't buy any downstream |
| 110 | +parallelism. |
| 111 | + |
| 112 | +Go can handle concurrent work cheaply, so the Go port creates one debouncer per |
| 113 | +backend (inotify, fanotify, kqueue, fsevents, windows) instead of one per |
| 114 | +process. Each backend's debouncer is created lazily on first subscribe and |
| 115 | +serves only that backend's `dirWatch`es, so a slow user callback on one backend |
| 116 | +can't starve event delivery on any of the others. In practice most callers will |
| 117 | +only ever use one backend (`Default()`), so this mainly matters for processes |
| 118 | +that mix backends, but the cost of the split is essentially nothing. |
| 119 | + |
| 120 | +## New backends |
| 121 | + |
| 122 | +**fanotify** (Linux, kernel ≥ 5.13) is the default on Linux when available. It |
| 123 | +uses FID-based event reporting, avoiding the inotify per-user watch limit |
| 124 | +entirely. It was written from scratch rather than ported from the upstream |
| 125 | +[PR #180](https://github.com/parcel-bundler/watcher/pull/180), which has several |
| 126 | +bugs (see below). The backend runtime-probes `FAN_RENAME` (Linux 5.17+) and |
| 127 | +falls back to `FAN_MOVED_FROM`/`FAN_MOVED_TO`. |
| 128 | + |
| 129 | +## Pure Go, no cgo |
| 130 | + |
| 131 | +The C++ library requires a C++ compiler and platform-specific build |
| 132 | +configuration. The Go port is pure Go on all platforms: |
| 133 | + |
| 134 | +- **macOS FSEvents**: CoreFoundation/CoreServices calls via |
| 135 | + `//go:cgo_import_dynamic` and hand-written assembly trampolines (amd64 and |
| 136 | + arm64), following the pattern from Go's `crypto/x509/internal/macos`. The |
| 137 | + FSEvents C callback runs on a libdispatch (GCD) thread, not a Go goroutine. An |
| 138 | + assembly shim, staying entirely in C calling convention, retains the CFArray |
| 139 | + of paths, allocates a per-callback payload on the C heap, copies the flags |
| 140 | + array into it, and writes the payload pointer to the stream's event pipe, |
| 141 | + waking a dedicated Go event-loop goroutine that classifies the events and |
| 142 | + frees the payload. The shim then returns immediately, so the dispatch thread |
| 143 | + never enters the Go ABI and does not wait for Go-side event classification. Each |
| 144 | + FSEventStream has its own serial GCD dispatch queue and event pipe, so |
| 145 | + callbacks for different streams run concurrently without contention: a stuck |
| 146 | + callback for one stream cannot back up callbacks for any other stream behind |
| 147 | + it. Teardown invalidates the stream and uses a `dispatch_sync_f` barrier on |
| 148 | + the stream's serial queue before closing the pipe, releasing the queue, and |
| 149 | + unpinning the callback state. |
| 150 | +- **Windows**: direct `x/sys/windows` syscalls. |
| 151 | +- **Linux/BSD**: direct `x/sys/unix` syscalls. |
| 152 | + |
| 153 | +Cross-compilation works without cgo: |
| 154 | +`CGO_ENABLED=0 GOOS=darwin GOARCH=arm64 go build ./...` |
| 155 | + |
| 156 | +## Bugfixes from upstream C++ |
| 157 | + |
| 158 | +### 1. Windows: dropped create event when GetFileAttributesEx fails |
| 159 | + |
| 160 | +`ReadDirectoryChangesW` reports `FILE_ACTION_ADDED` for files that may vanish |
| 161 | +before processing. C++ emits the event only when the attribute lookup succeeds, |
| 162 | +so a failed lookup silently drops it. Go always emits the event. |
| 163 | + |
| 164 | +### 2. Windows: race between subscribe and ReadDirectoryChangesW |
| 165 | + |
| 166 | +C++ queues an APC that eventually arms the watch. A filesystem operation between |
| 167 | +`subscribe()` returning and the APC firing is missed. Go arms the first |
| 168 | +`ReadDirectoryChangesW` synchronously before returning. |
| 169 | + |
| 170 | +### 3. kqueue: TOCTOU race and early-return in compareDir |
| 171 | + |
| 172 | +C++ emits a create event before confirming the file can be opened. If the file |
| 173 | +vanishes in between, a phantom create is queued. Additionally, a `watchDir` |
| 174 | +failure returns from the entire `compareDir`, skipping delete detection for other files. |
| 175 | + |
| 176 | +### 4. Event coalescing: create+delete+create yields wrong result |
| 177 | + |
| 178 | +C++ clears `isDeleted` without clearing `isCreated`, so a create+delete+create |
| 179 | +sequence produces a spurious "create" instead of the intended "update." |
| 180 | + |
| 181 | +### 5. Event drain race: getEvents + clear are separate locks |
| 182 | + |
| 183 | +C++ calls `getEvents()` then `clear()`, each independently locking. Events |
| 184 | +inserted between the two calls are silently lost. Go uses an atomic `drain()` |
| 185 | +that snapshots and clears under a single lock. |
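The single-lock drain can be illustrated with a minimal queue. The `eventQueue` type is hypothetical and stores plain strings instead of real events; only the locking shape matters here.

```go
package main

import (
	"fmt"
	"sync"
)

// eventQueue is a hypothetical stand-in for a per-watch pending-event list.
type eventQueue struct {
	mu     sync.Mutex
	events []string
}

// push appends one event under the lock.
func (q *eventQueue) push(path string) {
	q.mu.Lock()
	q.events = append(q.events, path)
	q.mu.Unlock()
}

// drain snapshots the pending events and resets the queue in one critical
// section, so nothing can be inserted between the read and the clear.
func (q *eventQueue) drain() []string {
	q.mu.Lock()
	defer q.mu.Unlock()
	out := q.events
	q.events = nil
	return out
}

func main() {
	var q eventQueue
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			q.push(fmt.Sprintf("file-%d", i))
		}(i)
	}
	total := len(q.drain()) // may run while producers are still pushing
	wg.Wait()
	total += len(q.drain()) // picks up whatever arrived after the first drain
	fmt.Println(total)
}
```

With separate `getEvents` and `clear` calls, a push landing between them would be cleared without ever being returned; here every pushed event shows up in exactly one drain.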
| 186 | + |
| 187 | +### 6. inotify: IN_Q_OVERFLOW silently skipped |
| 188 | + |
| 189 | +C++ skips overflow events without notifying subscribers. Go delivers |
| 190 | +`ErrOverflow` to all active watches. |
| 191 | + |
| 192 | +### 7. inotify: descendant watches not cleaned on directory deletion |
| 193 | + |
| 194 | +C++ only removes exact-match watches when a directory is deleted. Watches for |
| 195 | +descendant paths remain and may receive stale events if watch descriptors are |
| 196 | +reused. |
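One way to clean up descendants as well as the exact match is a prefix sweep over the path-to-descriptor map. This is a sketch under that assumption; `removeTree` and the map layout are hypothetical, and a real backend would also release each removed descriptor with the kernel.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// removeTree deletes the watch for dir and for every path beneath it, and
// returns the removed paths so the caller can release their descriptors.
func removeTree(watches map[string]int, dir string) []string {
	// The trailing slash keeps "/a/bc" from matching a delete of "/a/b".
	prefix := strings.TrimSuffix(dir, "/") + "/"
	var removed []string
	for path := range watches {
		if path == dir || strings.HasPrefix(path, prefix) {
			removed = append(removed, path)
			delete(watches, path) // deleting during range is safe in Go
		}
	}
	sort.Strings(removed)
	return removed
}

func main() {
	watches := map[string]int{
		"/a/b":     1,
		"/a/b/c":   2,
		"/a/b/c/d": 3,
		"/a/bc":    4, // sibling with a shared name prefix, must survive
	}
	fmt.Println(removeTree(watches, "/a/b"), len(watches))
}
```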
| 197 | + |
| 198 | +### 8. kqueue: mtime guard suppresses NOTE_WRITE on coarse-mtime filesystems |
| 199 | + |
| 200 | +C++ guards all `NOTE_WRITE | NOTE_ATTRIB | NOTE_EXTEND` events behind an mtime |
| 201 | +check. On OpenBSD FFS (1-second mtime granularity), rapid writes share the same |
| 202 | +mtime and are suppressed. |
| 203 | + |
| 204 | +### 9. Windows: readTree follows symlinked directories |
| 205 | + |
| 206 | +C++ checks `FILE_ATTRIBUTE_DIRECTORY` without excluding |
| 207 | +`FILE_ATTRIBUTE_REPARSE_POINT`, causing symlinks and junctions to be traversed. |
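A traversal guard that avoids this could look like the sketch below, assuming the fix is simply to skip reparse points. `shouldDescend` is a hypothetical helper; the two attribute values are the Windows SDK constants, defined locally so the example builds on any platform.

```go
package main

import "fmt"

// Attribute bits as defined by the Windows SDK. golang.org/x/sys/windows
// exports the same values as FILE_ATTRIBUTE_DIRECTORY and
// FILE_ATTRIBUTE_REPARSE_POINT.
const (
	fileAttributeDirectory    = 0x10
	fileAttributeReparsePoint = 0x400
)

// shouldDescend reports whether a tree walk should recurse into an entry:
// only real directories, never symlinks or junctions.
func shouldDescend(attrs uint32) bool {
	isDir := attrs&fileAttributeDirectory != 0
	isReparse := attrs&fileAttributeReparsePoint != 0
	return isDir && !isReparse
}

func main() {
	fmt.Println(shouldDescend(fileAttributeDirectory))
	fmt.Println(shouldDescend(fileAttributeDirectory | fileAttributeReparsePoint))
}
```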
| 208 | + |
| 209 | +### 10. kqueue: delete/create coalescing race and fd leak |
| 210 | + |
| 211 | +When a file is deleted and recreated, kqueue may deliver `NOTE_WRITE` on the |
| 212 | +parent before `NOTE_DELETE` on the file. C++ processes these in order, missing |
| 213 | +the create. Separately, deleted fds are erased from the map but never closed. |
| 214 | + |
| 215 | +### 11. kqueue: tryRewatchLocked race for directories |
| 216 | + |
| 217 | +On OpenBSD, `RemoveAll(dir)` can deliver `NOTE_DELETE` for a directory while |
| 218 | +`rmdir` is still in progress. `tryRewatchLocked` sees the directory still exists |
| 219 | +via `Lstat` and emits a spurious "update" instead of "delete." Go skips |
| 220 | +`tryRewatchLocked` for directories entirely. |
| 221 | + |
| 222 | +### 12. FSEvents: empty tree misclassifies updates as creates |
| 223 | + |
| 224 | +C++ maintains a lazily-populated `DirTree` for FSEvents. Pre-existing files |
| 225 | +aren't in the tree at subscribe time, so the first modification is classified as |
| 226 | +"create" instead of "update." |
| 227 | + |
| 228 | +## Bugfixes from upstream fanotify PR |
| 229 | + |
| 230 | +The upstream [PR #180](https://github.com/parcel-bundler/watcher/pull/180) adds |
| 231 | +a fanotify backend to the C++ library. Go's fanotify backend was written from |
| 232 | +scratch and avoids the following issues in the C++ PR: |
| 233 | + |
| 234 | +- **FAN_Q_OVERFLOW silently skipped.** C++ skips the event; Go delivers |
| 235 | + `ErrOverflow`. |
| 236 | +- **Descendant watches not cleaned.** Same exact-match-only bug as inotify. |
| 237 | +- **Unchecked lstat/stat return values.** C++ feeds uninitialized stat data to |
| 238 | + `tree->add()` on rapid create+delete. Go guards all stat calls. |
| 239 | +- **No merged-event disambiguation.** C++ processes `FAN_CREATE` before |
| 240 | + `FAN_DELETE` in an if/else chain, so a merged create+delete always emits a |
| 241 | + spurious create. Go stats the path to determine temporal order. |
| 242 | +- **No runtime FAN_RENAME probing.** C++ uses compile-time `#ifdef`; Go probes |
| 243 | + at runtime and falls back gracefully. |