Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

--watchfs does not skip the folders from .bazelignore and external workspaces #13569

Open
konste opened this issue Jun 10, 2021 · 11 comments
Open
Labels
area-Windows Windows-specific issues and feature requests P2 We'll consider working on this in future. (Assignee optional) team-OSS Issues for the Bazel OSS team: installation, release processBazel packaging, website type: feature request

Comments

@konste
Copy link

konste commented Jun 10, 2021

Description of the problem / feature request:

With --experimental_windows_watchfs flag enabled Bazel make take up to 1.5 minutes to startup. In our workspace folder we have multiple huge subfolders not relevant to the Bazel build. So we tried to add those subfolders to .bazelignore file and then also added an empty WORKSPACE file in each of them to prevent Bazel from scanning down those folders.
It seem that while Bazel itself respects .bazelignore and WORKSPACE files and does not try to scan the folders, the mechanism which implements --experimental_windows_watchfs flag disregards either of those means and scans the whole tree. And that is what takes 1.5 min startup time on our machines.

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

In any workspace add a folder "blah" and some subfolders. Add the folder "blah" to .bazelignore file. Add empty WORKSPACE file to the folder "blah". Now run Bazel build under Process Monitor and watch how it scans the folder "blah" when run with --experimental_windows_watchfs flag and does not scan it when run without that flag.

What operating system are you running Bazel on?

Windows

What's the output of bazel info release?

4.1.0

Attn: @tomdegoede

@tomdegoede
Copy link
Contributor

From what I remember the last implementation uses the same logic as Unix systems by recursively registering a watcher on each folder. Therefore I suspect the Unix implementation to also disregard the .bazelignore listings albeit performing a lot better than Windows watchers. I'm also not sure what the design expectations are regarding the .bazelignore in this specific context. Maybe someone else can look into this? cc: @meteorcloudy

@meteorcloudy meteorcloudy added area-Windows Windows-specific issues and feature requests type: feature request P3 We're not considering working on this, but happy to review a PR. (No assignee) labels Jun 11, 2021
@meteorcloudy
Copy link
Member

Therefore I suspect the Unix implementation to also disregard the .bazelignore listings albeit performing a lot better than Windows watchers.

Yes, the .bazelignore doesn't affect the watchfs functionality on any platform. And file operation is much slower on Windows compared to Linux.

I'm also not sure what the design expectations are regarding the .bazelignore in this specific context.

.bazelignore was introduced in 33afd3c to address #4888. I believe when it's implemented, it didn't take any consideration on what this meant to watchfs. 😕

@konste
Copy link
Author

konste commented Jun 11, 2021

It would be totally in the spirit of #4888 to prevent watchfs from going down the ignored folders. BUT even without .bazelignore the presence of the WORKSPACE file in the folder clearly indicates it is the root of another workspace and it should stop watcher from going down that path, but it still does.

It is very typical for all kinds of projects to have HUGE folders completely irrelevant to Bazel within Bazel workspaces. At this time there is no way to prevent watchfs from scanning and watching those folders, which at the very least causes annoying performance hit - Bazel startup time goes beyond a minute for no good reason. As the folders to be ignored get bigger watchfs starts to overflow and stops working at all.

@meteorcloudy
Copy link
Member

It would be totally in the spirit of #4888 to prevent watchfs from going down the ignored folders.

@konste I agree, looking at the code, we probably can implement it here:

public FileVisitResult preVisitDirectory(Path path, BasicFileAttributes attrs)
throws IOException {
// Do not traverse the bazel-* convenience symlinks. On windows these are created as
// junctions.
if (isWindows && attrs.isOther()) {

I'll bump this to P2, will revisit this later when having time, but happy to review a PR as well!

@meteorcloudy meteorcloudy added P2 We'll consider working on this in future. (Assignee optional) and removed P3 We're not considering working on this, but happy to review a PR. (No assignee) labels Jun 11, 2021
@sventiffe sventiffe added the team-OSS Issues for the Bazel OSS team: installation, release processBazel packaging, website label Jun 18, 2021
@bsilver8192
Copy link
Contributor

Noting that this still seems to be true. Also the flag is no longer experimental or Windows-specific, so it's now called --watchfs, which would make this easier to find if the title and tags were updated.

I might try fixing this soon, no promises.

@konste konste changed the title --experimental_windows_watchfs does not skip the folders from .bazelignore and external workspaces --watchfs does not skip the folders from .bazelignore and external workspaces Sep 29, 2022
@konste
Copy link
Author

konste commented Sep 29, 2022

Wow, I missed when experimental flag got removed. Title updated. Not sure I have permissions to update tags.
Want to assure you that this issue is still very much actual and important.
"No promises" implied, but thanks for the hope!

@bsilver8192
Copy link
Contributor

I looked at the .bazelignore part of this a bit, and there's some complications because the DiffAwareness code processes Roots, but IgnoredPackagePrefixesFunction looks for a .bazelignore at the root of each repository. I think it can use something related to PathPackageLocator to check if the Root is for the main repository. Or maybe just have SequencedSkyframeExecutor pass the information on which is the main root directly? Once I find the main root, it can just use the .bazelignore there if present, and skip .bazelignore processing for all other entries on the package path. Package path entries aren't repository roots anyways, so they have no .bazelignore.

I'm going to skip ignoring external workspaces for now, that looks harder to get the information where it needs to be.

Interesting note: it looks like the days for the OSX-specific implementation may be numbered, I followed the bug linked in the code and openjdk/jdk#10140 implements the standard Java API well on OSX, and is being actively worked on.

@konste
Copy link
Author

konste commented Sep 30, 2022

I'm going to skip ignoring external workspaces for now

It should be perfectly fine as there is (would be) easy workaround - use .bazelignore to exclude external workspaces.

implements the standard Java API well on OSX

Wow, progress!

@nikhilkalige
Copy link
Contributor

I just moved the output to the workspace directory and did a comparison on the max number of ionotify usage by running bazel build //...

Outside Repo = 38698
Inside Repo = 586552

@sushain97
Copy link
Contributor

@bsilver8192 thanks for writing down your thought process last year! I've opened #18363 based on them.

copybara-service bot pushed a commit that referenced this issue Jul 18, 2023
We attempt a fix to #13569 for Linux/Windows which uses `WatchServiceDiffAwareness`. macOS which uses its own `MacOSXFsEventsDiffAwareness` is not supported here, partially because I didn't want to write a bunch of C++ today.

In our main monorepo, I'm seeing a 54% decrease in inotify watches (370476 → 169706) and a 51% decrease in `handleDiffs` duration (35.6s → 17.3s). This is likely a lower bound for our repo due to my machine not having many `node_modules` installed 🙃

Some outstanding questions:
1. Is there a place that is worth adding extra tests for this? I'm unfamiliar with the conventions around more complex Bazel tests.
2. Should `DiffAwarenessManager` include the list of ignored paths in its key?

cc @alexeagle

Closes #18363.

PiperOrigin-RevId: 549006817
Change-Id: I1b2c60701d722c48a6bfc9f5fcaea129097fa8ba
@pauldraper
Copy link
Contributor

It is very typical for all kinds of projects to have HUGE folders completely irrelevant to Bazel within Bazel workspaces

E.g. node_modules

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-Windows Windows-specific issues and feature requests P2 We'll consider working on this in future. (Assignee optional) team-OSS Issues for the Bazel OSS team: installation, release processBazel packaging, website type: feature request
Projects
None yet
Development

No branches or pull requests

8 participants