Skip to content

fix(pm): restore hardlink→copy fallback on clone failure#2801

Merged
elrrrrrrr merged 6 commits intonextfrom
fix/cloner-hardlink-fallback
Apr 17, 2026
Merged

fix(pm): restore hardlink→copy fallback on clone failure#2801
elrrrrrrr merged 6 commits intonextfrom
fix/cloner-hardlink-fallback

Conversation

@elrrrrrrr
Copy link
Copy Markdown
Contributor

Summary

  • hardlink_clone::clone_dir used to try fs::hard_link first and fall back to copy_file_sync on any error. Commit 13c698e (feat(pm): support multi-workspace selection for run command #2770, the multi-workspace refactor) rewrote the loop into an if/else that propagates the hardlink error, so cross-device installs now hard-fail:
    ERROR Failed to clone ... Invalid cross-device link (os error 18)
    
    This reproduces reliably with ut i -g on any machine where $HOME/.cache/nm and /usr/local/lib/node_modules are on different filesystems (e.g. Jenkins pods, some container images).
  • Restore the fallback: on any fs::hard_link failure (EXDEV, EPERM, EMLINK, …), emit one tracing::warn! for the pair that failed and flip force_copy = true so the rest of this clone goes through copy_file_sync. Once one file in a src/dst pair fails the others will fail the same way, so retrying hardlink is wasted work.
  • Renamed the local use_copyforce_copy — the variable now means “we were forced off the hardlink path,” which is a more accurate name once the flag can flip at runtime.
  • Pre-feat(pm): support multi-workspace selection for run command #2770 the fallback was silent. The warn! makes the degraded-path condition visible in logs, which is how we would have caught this regression sooner.

Out of scope: the macOS clonefile path also has no cross-device fallback. The user's report is Linux, and the syscall/failure modes differ — handle as a follow-up.

Test plan

  • cargo fmt -p utoo-pm
  • cargo clippy -p utoo-pm --all-targets -- -D warnings --no-deps
  • cargo test -p utoo-pm util::cloner (16 passed; the fallback branch is unreachable on a single-fs test host, which is expected)
  • Manual cross-device repro on Linux: mount a tmpfs, point cache at it, run ut i -g node-gyp. Expect success with one hardlink failed … falling back to copy for remaining files warning per package.
  • CI (Linux + Windows runners will fully exercise the hardlink_clone module, which is cfg-gated off on the macOS dev host).

🤖 Generated with Claude Code

Commit 13c698e (#2770) refactored `hardlink_clone::clone_dir`'s
"try hardlink, fallback to copy" loop into an if/else that propagates
the hardlink error. Cross-device installs (e.g. `ut i -g` where the
cache lives on one filesystem and `/usr/local` on another) now fail
with `Invalid cross-device link (os error 18)` instead of degrading
to copy.

Restore the fallback: on any `fs::hard_link` error (EXDEV, EPERM,
EMLINK, …) emit one `tracing::warn!` and switch the rest of this
clone to `copy_file_sync`. The old fallback was silent; the warn
log makes the degraded-path condition observable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates the file cloning logic in crates/pm/src/util/cloner.rs to implement a fallback mechanism from hardlinking to copying. When a hardlink operation fails—for instance, due to cross-device link errors—the system now logs a warning and automatically switches to copying for the current and all subsequent files in the batch. I have no feedback to provide.

elrrrrrrr and others added 4 commits April 17, 2026 14:07
When the initial `install_pkgs` hits the 60s timeout mid-unpack (seen on
a libc6 upgrade), dpkg is left in a half-configured state. The retry on
the fallback mirror then fails immediately with:

    E: dpkg was interrupted, you must manually run 'dpkg --configure -a'
       to correct the problem.

Finalise any pending unpack before the retry so the fallback mirror
attempt actually has a chance to succeed. Keep `|| true` — if there is
nothing to configure, dpkg still exits 0, but we don't want a rare
unrelated failure to mask the real `install_pkgs` retry error.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…thers

EXDEV is a property of the src/dst pair — once one file hits it, every
remaining file on the same clone will fail the same way, so it's safe
(and much cheaper) to latch `force_copy` and skip hardlink attempts
for the rest of this clone.

Other hardlink errors (EMLINK when a single inode's link count is
exhausted, EPERM on a specific file) are per-file. Latching the whole
clone to copy on one of those would downgrade every subsequent file
unnecessarily. Handle them by copying just the failing file and
continuing to try hardlink on the next.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two unit tests for `hardlink_clone::clone_dir`:

- `test_clone_dir_per_file_fallback_does_not_latch`: pre-populates the
  destination with a conflicting file so `fs::hard_link` fails with
  `AlreadyExists` (a non-EXDEV ErrorKind). Verifies (a) the failing
  file is recovered by the copy fallback, and (b) a subsequent,
  non-conflicting file is still hardlinked — the per-file fallback
  must not latch the whole clone to copy.

- `test_clone_dir_install_script_forces_copy`: writes
  `_hasInstallScript` in the cache layout so `has_install_script_sync`
  returns true, forcing `force_copy` upfront. Verifies every file is
  copied (distinct inode from src), which is the safety property we
  need so install-script mutations can't leak back into the shared
  cache via hardlinks.

Both tests use `st_ino` to distinguish hardlink from copy, so they
require Unix and are gated `#[cfg(unix)]` inside the existing
`hardlink_clone_tests` module (itself `#[cfg(not(target_os = "macos"))]`).

The EXDEV-latch branch of the fallback requires actual cross-device
mounts to exercise and is left to the manual repro documented in the
PR description.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@elrrrrrrr elrrrrrrr added the A-Pkg Manager Area: Package Manager label Apr 17, 2026
@elrrrrrrr elrrrrrrr marked this pull request as ready for review April 17, 2026 08:23
@elrrrrrrr elrrrrrrr enabled auto-merge (squash) April 17, 2026 08:36
@elrrrrrrr elrrrrrrr merged commit 92a3366 into next Apr 17, 2026
28 of 30 checks passed
@elrrrrrrr elrrrrrrr deleted the fix/cloner-hardlink-fallback branch April 17, 2026 08:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-Pkg Manager Area: Package Manager

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants