"unlinkat ([path]) (Directory not empty)" errors with experimental_worker_for_repo_fetching #22680

Open
jfirebaugh opened this issue Jun 10, 2024 · 4 comments
Labels: P2, team-ExternalDeps, type: bug

@jfirebaugh

Description of the bug:

After upgrading to Bazel 7.2.0 and removing --experimental_worker_for_repo_fetching=off from .bazelrc (because #21803 was fixed), we started to observe failures of the following form:

ERROR: /home/circleci/figma/WORKSPACE.bazel:157:27: fetching node_repositories rule //external:nodejs_linux_amd64: java.io.IOException: unlinkat (/home/circleci/.cache/bazel/_bazel_circleci/a6e8cb0ce75ceb1ea1faa0cca0b0ae0d/external/nodejs_linux_amd64/bin/nodejs) (Directory not empty)
ERROR: no such package '@@nodejs_linux_amd64//': unlinkat (/home/circleci/.cache/bazel/_bazel_circleci/a6e8cb0ce75ceb1ea1faa0cca0b0ae0d/external/nodejs_linux_amd64/bin/nodejs) (Directory not empty)

These errors appear to be linked to repository fetch restarts, specifically those logged as "fetch interrupted due to memory pressure; restarting". In both cases where I have seen this error, the failing repo's fetch had been restarted, judging by either that log message or the JSON trace profile.

Which category does this issue belong to?

No response

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

No response

Which operating system are you running Bazel on?

Linux

What is the output of bazel info release?

release 7.2.0

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse HEAD ?

No response

If this is a regression, please try to identify the Bazel commit where the bug was introduced with bazelisk --bisect.

No response

Have you found anything relevant by searching the web?

Bazel Slack discussion: https://bazelbuild.slack.com/archives/CA31HN1T3/p1718057176146129

Any other information, logs, or outputs that you want to share?

No response

@jfirebaugh
Author

I am presently confirming that --experimental_worker_for_repo_fetching=off mitigates the issue.

@Wyverald added the P2 and team-ExternalDeps labels and removed the untriaged label on Jun 10, 2024
@jfirebaugh
Author

Still on 7.2.0, haven't seen this again with --experimental_worker_for_repo_fetching=off.
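
For reference, the mitigation discussed here amounts to a single line in `.bazelrc`. This is a minimal sketch: the issue only says the flag was set in `.bazelrc`, so the `common` prefix (applying it to every command that accepts the flag) is an assumption.

```
# .bazelrc
# Opt out of the repo-fetching worker, which was mitigating the
# "unlinkat (...) (Directory not empty)" failures on Bazel 7.2.0.
common --experimental_worker_for_repo_fetching=off
```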

@fmeum
Collaborator

fmeum commented Jun 14, 2024

I managed to reproduce this via #22748 (comment):

INFO: repository @@aspect_rules_js~~npm~npm__decode-uri-component__0.2.2' used the following cache hits instead of downloading the corresponding file.
 * Hash '16a51843ef28d79f06c864eb305266b3daa1dc2a932af02a82ab139e42c8f2c2aed34dbca2ba8187134c16415e9f4cc6ca0e9ea40083df6a63564e7c9e0204ad' for https://registry.npmjs.org/decode-uri-component/-/decode-uri-component-0.2.2.tgz
If the definition of 'repository @@aspect_rules_js~~npm~npm__decode-uri-component__0.2.2' was updated, verify that the hashes were also updated.
com.google.devtools.build.lib.rules.repository.RepositoryFunction$RepositoryFunctionException: java.io.IOException: /private/var/tmp/_bazel_fmeum/0465a0857e26a35c072f48ab751a1794/external/aspect_rules_js~~npm~npm__decode-uri-component__0.2.2 (Directory not empty)
	at com.google.devtools.build.lib.rules.repository.RepositoryFunction.setupRepoRoot(RepositoryFunction.java:139)
	at com.google.devtools.build.lib.bazel.repository.starlark.StarlarkRepositoryFunction.lambda$fetch$1(StarlarkRepositoryFunction.java:154)
	at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:131)
	at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:76)
	at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:82)
	at java.base/java.util.concurrent.ThreadPerTaskExecutor$TaskRunner.run(ThreadPerTaskExecutor.java:314)
	at java.base/java.lang.VirtualThread.run(VirtualThread.java:309)
Caused by: java.io.IOException: /private/var/tmp/_bazel_fmeum/0465a0857e26a35c072f48ab751a1794/external/aspect_rules_js~~npm~npm__decode-uri-component__0.2.2 (Directory not empty)
	at com.google.devtools.build.lib.unix.NativePosixFiles.remove(Native Method)
	at com.google.devtools.build.lib.unix.UnixFileSystem.delete(UnixFileSystem.java:444)
	at com.google.devtools.build.lib.vfs.FileSystem.deleteTree(FileSystem.java:232)
	at com.google.devtools.build.lib.vfs.Path.deleteTree(Path.java:594)
	at com.google.devtools.build.lib.rules.repository.RepositoryFunction.setupRepoRoot(RepositoryFunction.java:136)
	... 6 more
ERROR: <builtin>: fetching npm_import_rule rule //:aspect_rules_js~~npm~npm__decode-uri-component__0.2.2: java.io.IOException: /private/var/tmp/_bazel_fmeum/0465a0857e26a35c072f48ab751a1794/external/aspect_rules_js~~npm~npm__decode-uri-component__0.2.2 (Directory not empty)
ERROR: no such package '@@aspect_rules_js~~npm~npm__decode-uri-component__0.2.2//': /private/var/tmp/_bazel_fmeum/0465a0857e26a35c072f48ab751a1794/external/aspect_rules_js~~npm~npm__decode-uri-component__0.2.2 (Directory not empty)
ERROR: /Users/fmeum/git/examples/frontend/BUILD.bazel:10:22: //:.aspect_rules_js/node_modules/decode-uri-component@0.2.2/pkg depends on @@aspect_rules_js~~npm~npm__decode-uri-component__0.2.2//:pkg in repository @@aspect_rules_js~~npm~npm__decode-uri-component__0.2.2 which failed to fetch. no such package '@@aspect_rules_js~~npm~npm__decode-uri-component__0.2.2//': /private/var/tmp/_bazel_fmeum/0465a0857e26a35c072f48ab751a1794/external/aspect_rules_js~~npm~npm__decode-uri-component__0.2.2 (Directory not empty)
Use --verbose_failures to see the command lines of failed build steps.
ERROR: Analysis of target '//react/src:test_lib' failed; build aborted: Analysis failed
INFO: Elapsed time: 38.951s, Critical Path: 0.02s
INFO: 1 process: 1 internal.
ERROR: Build did NOT complete successfully
FAILED:
    Fetching repository @@aspect_rules_swc~~swc~swc_darwin-arm64; fetch interrupted due to memory pressure; restarting. 19s
    Fetching repository @@aspect_rules_js~~npm~npm__buffer-from__1.1.2; fetch interrupted due to memory pressure; restarting. 11s
    Fetching https://registry.npmjs.org/@typescript-eslint/type-utils/-/type-utils-5.44.0.tgz; 11s
    Fetching repository @@aspect_rules_js~~npm~npm__esbuild-windows-64__0.15.18; fetch interrupted due to memory pressure; restarting. 10s
    Fetching repository @@aspect_rules_js~~npm~npm__esbuild-freebsd-arm64__0.15.18; fetch interrupted due to memory pressure; restarting. 8s
    Fetching https://registry.npmjs.org/is-weakset/-/is-weakset-2.0.2.tgz; 7s
    Fetching repository @@aspect_rules_js~~npm~npm__at_next_swc-win32-arm64-msvc__13.0.5; fetch interrupted due to memory pressure; restarting. 6s
    Fetching repository @@aspect_rules_js~~npm~npm__randombytes__2.1.0; fetch interrupted due to memory pressure; restarting. 5s ... (20 fetches)

@meteorcloudy
Member

@bazel-io fork 7.2.1

copybara-service bot pushed a commit that referenced this issue Jun 19, 2024
`StarlarkBaseExternalContext` now implements `AutoCloseable` and, in `close()`:
1. Cancels all pending async tasks.
2. Awaits their termination.
3. Cleans up the working directory (always for module extensions, on failure for repo rules).
4. Fails if there were pending async tasks in an otherwise successful evaluation.

Previously, module extensions didn't do any of those. Repo rules did 1 and 4 and sometimes 3, but not in all cases.

This change required replacing the fixed-size thread pool in `DownloadManager` with virtual threads, thereby resolving a TODO about not using a fixed-size thread pool for the `GrpcRemoteDownloader`.

Work towards #22680
Work towards #22748

Closes #22772.

PiperOrigin-RevId: 644669599
Change-Id: Ib71e5bf346830b92277ac2bd473e11c834cb2624
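
The four `close()` steps listed in the commit message can be sketched in plain Java roughly as follows. This is an illustration of the pattern, not Bazel's actual code: `FetchContext`, `submit`, `markSuccessful`, and `alwaysCleanUp` are hypothetical names, a cached thread pool stands in for the virtual threads the commit mentions, and treating "successful but with pending tasks" as a failure for cleanup purposes is an assumption.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical stand-in for a fetch context; NOT Bazel's StarlarkBaseExternalContext. */
final class FetchContext implements AutoCloseable {
  private final Path workingDir;
  // Assumption: true models a module extension, false models a repo rule.
  private final boolean alwaysCleanUp;
  private final ExecutorService executor = Executors.newCachedThreadPool();
  private final List<Future<?>> asyncTasks = new ArrayList<>();
  private boolean succeeded = false;

  FetchContext(Path workingDir, boolean alwaysCleanUp) {
    this.workingDir = workingDir;
    this.alwaysCleanUp = alwaysCleanUp;
  }

  /** Starts an async task (for example a download) that may write into workingDir. */
  void submit(Runnable task) {
    asyncTasks.add(executor.submit(task));
  }

  /** Called by the evaluation once it has finished without error. */
  void markSuccessful() {
    succeeded = true;
  }

  @Override
  public void close() throws IOException {
    // 1. Cancel all pending async tasks.
    boolean hadPending = false;
    for (Future<?> task : asyncTasks) {
      if (!task.isDone()) {
        hadPending = true;
        task.cancel(true);
      }
    }
    // 2. Await their termination, so nothing is still writing while we delete.
    executor.shutdownNow();
    try {
      if (!executor.awaitTermination(1, TimeUnit.MINUTES)) {
        throw new IOException("async tasks did not terminate");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IOException("interrupted while awaiting async tasks", e);
    }
    // 3. Clean up the working directory: always, or only on failure.
    boolean failed = !succeeded || hadPending;
    if (alwaysCleanUp || failed) {
      deleteTree(workingDir);
    }
    // 4. Fail if tasks were still pending in an otherwise successful evaluation.
    if (succeeded && hadPending) {
      throw new IOException("pending async tasks after successful evaluation");
    }
  }

  private static void deleteTree(Path root) throws IOException {
    if (!Files.exists(root)) {
      return;
    }
    List<Path> ordered;
    try (Stream<Path> paths = Files.walk(root)) {
      // Reverse order lists children before their parent directories.
      ordered = paths.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
    }
    for (Path p : ordered) {
      Files.delete(p);
    }
  }
}
```

The ordering of steps 2 and 3 is the point of the sketch: deleting the tree only after the tasks have terminated avoids a writer recreating files underneath a delete in progress, which is one plausible way to end up with a "Directory not empty" failure. A caller would typically wrap the context in try-with-resources so that `close()` also runs when a fetch is interrupted.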
fmeum added a commit to fmeum/bazel that referenced this issue Jun 20, 2024
github-merge-queue bot pushed a commit that referenced this issue Jun 20, 2024
…22814)

Closes #22775