@sungshik commented Mar 10, 2025

Note

To avoid diff noise, enable "Hide whitespace" when reviewing the code. (I pulled a few methods from an inner class to its outer class, so indentation changed.)

Overview

This PR:

  • Updates the existing JDKRecursiveDirectoryWatch to rely on the new mechanism to auto-handle overflows. These are the minimal changes required to make it work.
  • Adds a new JDKFileTreeWatch as an alternative recursive directory watch. I originally wrote this alternative as a proof-of-concept to better understand/design how auto-handling of overflows could be a standalone thing (i.e., also use it with file/non-recursive directory watches), and how the interaction with recursive directory watches would play out.

Task list

  • We should probably select one of the two watches and remove the other one to avoid unnecessary code growth
  • If we choose JDKRecursiveDirectoryWatch, then it needs to be tidied up a bit (update comments etc.)

Similarities between the types of watches

Both types of recursive directory watches are containers for a collection of internal JDKDirectoryWatch objects (one for each subdirectory in scope), subject to the following principles:

  • When a watch is started, recursively scan the directory and start an internal non-recursive JDKDirectoryWatch for each subdirectory.
  • Between the recursive scan of the directory and the start of the internal watches, CREATED events for new subdirectories may have been missed. So, after starting the internal watches, send an OVERFLOW event to the internal watch of the top-level directory. (It is forwarded to the other internal watches, as described below.) Update: This isn't needed for JDKFileTreeWatch (see the relevant comment in the code).
  • When an event happens in an internal watch for subdirectory x, in addition to calling the event-handler:
    • OVERFLOW: Forward the event to the internal watches for subdirectories y1, y2, ..., of x.
    • CREATED, new subdirectory y: (1) Start an internal watch for y. (2) Between the creation of y and the start of the internal watch, events may have been missed. So, after starting the internal watch, send an OVERFLOW event to it.
    • DELETED, old subdirectory y: Stop the internal watch for y.
    • MODIFIED: No additional bookkeeping.
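
The event-handling rules above can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: `Kind`, `Event`, `RecursiveWatchSketch`, and the injected `isDirectory` predicate are hypothetical names, and the internal watches are reduced to a set of watched paths so that the sketch runs without touching the file system.

```java
import java.nio.file.Path;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical, simplified types for illustration; not the library's real API.
enum Kind { CREATED, MODIFIED, DELETED, OVERFLOW }

// `child` is relative to `directory`; it is null for OVERFLOW events.
record Event(Kind kind, Path directory, Path child) {}

final class RecursiveWatchSketch {
    // Directories that currently have an internal non-recursive watch.
    private final Set<Path> watched = ConcurrentHashMap.newKeySet();
    private final Predicate<Path> isDirectory; // injected so the sketch needs no real files
    private final Consumer<Event> userHandler;

    RecursiveWatchSketch(Predicate<Path> isDirectory, Consumer<Event> userHandler) {
        this.isDirectory = isDirectory;
        this.userHandler = userHandler;
    }

    void startInternalWatch(Path directory) { watched.add(directory); }

    Set<Path> watchedDirectories() { return Set.copyOf(watched); }

    // Called when the internal watch for event.directory() reports an event.
    void onInternalEvent(Event event) {
        userHandler.accept(event); // the user's handler always sees the event
        switch (event.kind()) {
            case OVERFLOW -> {
                // Forward to the internal watches of the direct subdirectories;
                // each of them forwards further down in the same way.
                for (Path dir : List.copyOf(watched)) {
                    if (event.directory().equals(dir.getParent())) {
                        onInternalEvent(new Event(Kind.OVERFLOW, dir, null));
                    }
                }
            }
            case CREATED -> {
                Path created = event.directory().resolve(event.child());
                if (isDirectory.test(created)) {
                    startInternalWatch(created);
                    // Events between creation and watch start may have been missed.
                    onInternalEvent(new Event(Kind.OVERFLOW, created, null));
                }
            }
            case DELETED -> watched.remove(event.directory().resolve(event.child()));
            case MODIFIED -> { /* no bookkeeping needed */ }
        }
    }
}
```

In this sketch, a forwarded OVERFLOW also reaches the user's handler once per subdirectory, and DELETED stops only the watch for the deleted subdirectory itself. Both are assumptions of the sketch rather than statements about the actual implementation.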

Differences

With JDKRecursiveDirectoryWatch, the internal watches are stored "flat" inside a single JDKRecursiveDirectoryWatch object. With JDKFileTreeWatch, the internal watches are stored hierarchically inside a tree of JDKFileTreeWatch objects (one internal watch per JDKFileTreeWatch).

For instance, suppose we have the following (sub)directories:

  • foo
  • foo/bar1
  • foo/bar2
  • foo/bar3
  • foo/bar3/baz1
  • foo/bar3/baz2

The structure of JDKRecursiveDirectoryWatch looks as follows:

graph TB
  subgraph Foo[JDKRecursiveDirectoryWatch]
    direction LR
    FooInner["JDKDirectoryWatch
    /foo"] ~~~
    Bar1Inner["JDKDirectoryWatch
    /foo/bar1"] ~~~
    Bar2Inner["JDKDirectoryWatch
    /foo/bar2"]
    Bar3Inner["JDKDirectoryWatch
    /foo/bar3"] ~~~
    Baz1Inner["JDKDirectoryWatch
    /foo/bar3/baz1"] ~~~
    Baz2Inner["JDKDirectoryWatch
    /foo/bar3/baz2"]
  end

The structure of JDKFileTreeWatch looks as follows:

graph TB
  subgraph Foo[JDKFileTreeWatch]
    FooInner["JDKDirectoryWatch
    /foo"]
  end

  subgraph Bar1[JDKFileTreeWatch]
    Bar1Inner["JDKDirectoryWatch
    /foo/bar1"]
  end

  subgraph Bar2[JDKFileTreeWatch]
    Bar2Inner["JDKDirectoryWatch
    /foo/bar2"]
  end

  subgraph Bar3[JDKFileTreeWatch]
    Bar3Inner["JDKDirectoryWatch
    /foo/bar3"]
  end

  subgraph Baz1[JDKFileTreeWatch]
    Baz1Inner["JDKDirectoryWatch
    /foo/bar3/baz1"]
  end

  subgraph Baz2[JDKFileTreeWatch]
    Baz2Inner["JDKDirectoryWatch
    /foo/bar3/baz2"]
  end

  Foo --> Bar1
  Foo --> Bar2
  Foo --> Bar3
  Bar3 --> Baz1
  Bar3 --> Baz2
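
As a rough illustration of the two layouts in the diagrams, the following sketch stores the same six directories once in a single flat map and once as a tree of nodes. All class names here (`DirectoryWatchStub`, `FlatWatchSketch`, `TreeWatchSketch`, `LayoutDemo`) are hypothetical stand-ins; the real JDKRecursiveDirectoryWatch and JDKFileTreeWatch may be organized differently in detail.

```java
import java.nio.file.Path;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Stand-in for an internal JDKDirectoryWatch; hypothetical, not the real class.
record DirectoryWatchStub(Path directory) {}

// "Flat": one object holds every internal watch in a single map.
final class FlatWatchSketch {
    final Map<Path, DirectoryWatchStub> internalWatches = new ConcurrentHashMap<>();

    void watch(Path directory) {
        internalWatches.put(directory, new DirectoryWatchStub(directory));
    }
}

// "Hierarchical": each node holds one internal watch plus its child nodes.
final class TreeWatchSketch {
    final DirectoryWatchStub internalWatch;
    // Child nodes, keyed by the child's file name (for example, "bar3").
    final Map<Path, TreeWatchSketch> children = new ConcurrentHashMap<>();

    TreeWatchSketch(Path directory) {
        this.internalWatch = new DirectoryWatchStub(directory);
    }

    TreeWatchSketch watchChild(String name) {
        Path child = internalWatch.directory().resolve(name);
        return children.computeIfAbsent(child.getFileName(), k -> new TreeWatchSketch(child));
    }

    // Total number of internal watches in this subtree.
    int size() {
        return 1 + children.values().stream().mapToInt(TreeWatchSketch::size).sum();
    }
}

final class LayoutDemo {
    static FlatWatchSketch flat() {
        FlatWatchSketch flat = new FlatWatchSketch();
        for (String dir : new String[] {
                "/foo", "/foo/bar1", "/foo/bar2", "/foo/bar3", "/foo/bar3/baz1", "/foo/bar3/baz2"}) {
            flat.watch(Path.of(dir));
        }
        return flat;
    }

    static TreeWatchSketch tree() {
        TreeWatchSketch foo = new TreeWatchSketch(Path.of("/foo"));
        foo.watchChild("bar1");
        foo.watchChild("bar2");
        TreeWatchSketch bar3 = foo.watchChild("bar3");
        bar3.watchChild("baz1");
        bar3.watchChild("baz2");
        return foo;
    }
}
```

Both layouts end up with six internal watches. The flat variant keeps all six in one map, while the tree variant keeps three entries in the map of `/foo` and two in the map of `/foo/bar3`.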

codecov bot commented Mar 10, 2025

Codecov Report

Attention: Patch coverage is 75.67568% with 36 lines in your changes missing coverage. Please review.

Project coverage is 81.2%. Comparing base (9b0ba39) to head (3f80e77).
Report is 37 commits behind head on improved-overflow-support-main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...ineering/swat/watch/impl/jdk/JDKFileTreeWatch.java | 73.0% | 20 Missing and 11 partials ⚠️ |
| src/main/java/engineering/swat/watch/Watcher.java | 80.0% | 1 Missing and 1 partial ⚠️ |
| ...c/main/java/engineering/swat/watch/WatchEvent.java | 83.3% | 0 Missing and 1 partial ⚠️ |
| ...neering/swat/watch/impl/jdk/JDKDirectoryWatch.java | 88.8% | 0 Missing and 1 partial ⚠️ |
| ...g/swat/watch/impl/overflows/IndexingRescanner.java | 0.0% | 1 Missing ⚠️ |

Additional details and impacted files
@@                        Coverage Diff                         @@
##             improved-overflow-support-main     #27     +/-   ##
==================================================================
- Coverage                              81.4%   81.2%   -0.3%     
- Complexity                              121     129      +8     
==================================================================
  Files                                    16      16             
  Lines                                   549     532     -17     
  Branches                                 54      54             
==================================================================
- Hits                                    447     432     -15     
+ Misses                                   72      63      -9     
- Partials                                 30      37      +7     

@sungshik changed the title from "Improved overflow support: Alternative recursive directory watch (JDKFileTreeWatch)" to "Improved overflow support: Recursive directory watches" Mar 11, 2025
@sungshik marked this pull request as ready for review March 12, 2025 15:18
@sungshik requested a review from DavyLandman March 12, 2025 15:18

@DavyLandman left a comment

Nice! A lot of removed code. I like the idea of making it an actual tree; that will also reduce the pressure on the map, although it will now cause multiple lookups in smaller maps. It almost makes you consider some kind of sorted map or trie structure to quickly jump to the right child entry.
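
To make the "multiple lookups in smaller maps" point concrete, here is a minimal sketch of how a nested directory might be located in a tree of nodes: one map lookup per path segment, instead of a single lookup in one large map. `WatchNode` and its methods are hypothetical names for illustration and are not taken from the PR.

```java
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical tree node: one watched directory plus its watched children.
final class WatchNode {
    final Path directory;
    // Children keyed by a single path segment (for example, "bar3").
    final Map<Path, WatchNode> children = new HashMap<>();

    WatchNode(Path directory) {
        this.directory = directory;
    }

    // Returns the child node for `name`, creating it if needed.
    WatchNode child(String name) {
        return children.computeIfAbsent(Path.of(name), k -> new WatchNode(directory.resolve(name)));
    }

    // Finds the node for `target`, doing one small-map lookup per segment.
    Optional<WatchNode> find(Path target) {
        if (target.equals(directory)) {
            return Optional.of(this);
        }
        if (!target.startsWith(directory)) {
            return Optional.empty();
        }
        WatchNode current = this;
        for (Path segment : directory.relativize(target)) {
            current = current.children.get(segment);
            if (current == null) {
                return Optional.empty();
            }
        }
        return Optional.of(current);
    }
}
```

Because the nodes are keyed by path segment, this structure already behaves much like a trie over paths; whether a sorted map would be faster in practice is not something this sketch tries to answer.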

Some larger concerns I have with the current proposed changes:

  1. It looks like the updating of the child directories (and especially of missed nested child directories) depends on the overflow handling strategy. I do not think that is a wise dependency. Maybe we should add a test parameter to the torture test that cycles through all 3 enum values of the overflow handler.
  2. When implementing the original code, my concern was to avoid iterating the filesystem more than needed. In profiling, I noticed that it had quite a performance hit, especially in larger directory hierarchies.
  3. A similar point applies to the scheduling of work on the thread pool. My original approach was to get each event to the user as soon as we can, and to do our internal bookkeeping in the background. For an IDE, we don't want the event to end up at the back of the queue, behind our bookkeeping. I might be mistaken, but I have the feeling that the current code chains a lot in andThen clauses without pushing them onto separate work jobs. If the design is that the original event gets pushed to the consumer first, and only after that is done do we handle the bookkeeping, it's a bit better: we might leave some CPU cores idling, but at least we don't delay events.
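
The scheduling idea in point 3 could look roughly like the sketch below: the user's handler and the internal bookkeeping are submitted as two separate jobs, with the user's job submitted first, so that bookkeeping never sits ahead of it in the queue. `EventFirstDispatcher` and its fields are hypothetical and only illustrate the idea; they do not reflect how the library actually schedules work.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.function.Consumer;

// Hypothetical dispatcher: user delivery and bookkeeping are separate jobs.
final class EventFirstDispatcher<E> {
    private final Executor pool;
    private final Consumer<E> userHandler;
    private final Consumer<E> bookkeeping;

    EventFirstDispatcher(Executor pool, Consumer<E> userHandler, Consumer<E> bookkeeping) {
        this.pool = pool;
        this.userHandler = userHandler;
        this.bookkeeping = bookkeeping;
    }

    // Returns a future that completes once both jobs have finished.
    CompletableFuture<Void> dispatch(E event) {
        // Submitted first, so it is queued ahead of the bookkeeping job.
        CompletableFuture<Void> delivered =
                CompletableFuture.runAsync(() -> userHandler.accept(event), pool);
        // A separate job: with spare threads it can run in parallel with delivery.
        CompletableFuture<Void> updated =
                CompletableFuture.runAsync(() -> bookkeeping.accept(event), pool);
        return CompletableFuture.allOf(delivered, updated);
    }
}
```

With a single-threaded executor the user's handler runs strictly before the bookkeeping; with a larger pool the two may overlap, which avoids idle cores but means the bookkeeping must not assume the handler has finished.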

@sungshik left a comment

Thanks Davy! I addressed (and tentatively closed) all comments except this one, which requires more thought...

@sungshik left a comment

Today's series of commits makes the following changes:

  • Primarily, adds the filtering mechanism and revises the test (old one deleted, new one added)
  • Also, fixes the closing issue we encountered (unrelated to filtering)
  • Also, fixes a few small things related to relativization (also unrelated to filtering)

@DavyLandman left a comment

A few small clarification questions.

@sungshik merged commit fd68764 into improved-overflow-support-main Mar 28, 2025
13 checks passed
@sungshik deleted the improved-overflow-support/jdk-file-tree-watch branch March 28, 2025 10:25