Skip to content

Do not share store between builder and queue runner in unit tests#1642

Draft
Ericson2314 wants to merge 4 commits intoNixOS:masterfrom
obsidiansystems:test-multi-store
Draft

Do not share store between builder and queue runner in unit tests#1642
Ericson2314 wants to merge 4 commits intoNixOS:masterfrom
obsidiansystems:test-multi-store

Conversation

@Ericson2314
Copy link
Copy Markdown
Member

No description provided.

@Ericson2314 Ericson2314 force-pushed the test-multi-store branch 5 times, most recently from 94b39c3 to 3e737bb Compare April 7, 2026 18:05
Ok(Some(parse_drv(store.get_store_dir(), drv, &input)?))
let json = store.read_derivation_json(drv).await?;
let drv = serde_json::from_str::<Derivation>(&json)
.map_err(|e| anyhow::anyhow!("failed to parse derivation JSON: {e}"))?;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be .context("failed to parse derivation JSON"), depending on what you want the output to look like.

When Hydra is configured with a local store whose physical location
differs from its logical store dir (e.g. `local?root=X&store=Y` where `Y
!= X/nix/store`), `read_text` on the input's store path fails: the path
is a logical store path but the file lives at the real (physical)
location.

Shell out to `nix store cat` instead, which lets the store resolve the
physical path internally. We already import `Hydra::Helper::Exec`, so
reuse its `captureStdoutStderr` helper rather than pulling in `IPC::Run3`.

Not only will this work for relocated local stores, `nix store cat` will
work for arbitrary store implementations. This sort of decoupling is
generally a good thing to pursue.

In the future, we might extend the Nix Perl bindings so we don't need to
shell out for this. (This script already uses the Nix Perl bindings.)
`registerRoot` short-circuits when the target file already exists, so
calling it on a build whose GC root was already registered is a no-op.
The check is `-e $link`, which follows symlinks.

In the traditional physical = logical case this is fine: if another
process (e.g. `nix-eval-jobs --gc-roots-dir`) has already created a
symlink at `$link` pointing at the store path, the symlink's target is
a real file on disk, so `-e` returns true and we skip correctly.

Once we allow for redirected local stores, where the logical and
physical store directory paths differ, the assumption breaks down. The
symlink target is the *logical* store path, but the store object lives
at the *physical* store path, so `-e` (following the symlink) returns
false. We then fall through to `open(">", $link)`, which also follows
the symlink and tries to create its (non-existent) target — failing
with `No such file or directory`.

The solution is to switch to `readlink` + `isValidPath`. This routes the
existence decision through the store rather than relying on filesystem
checks that assume physical = logical.
Add a `read_derivation` C++ FFI wrapper that calls
`store->readDerivation()` and serialises the result to JSON via
`nlohmann::json`. Expose it in Rust as
`LocalStore::read_derivation_json()`.

Rewrite `query_drv` to:

1. Check the path via `is_valid_path` (which consults the store's own
   database) instead of `fs::try_exists` on the logical store path (the
   logical store path's presence on disk is not a reliable signal —
   only the physical path is).

2. Fetch the derivation contents via the new `read_derivation_json` FFI
   (so the store resolves the physical location internally) instead of
   reading the `.drv` file directly from the filesystem.

3. Deserialise using the harmonia `Derivation` type's
   `serde::Deserialize` impl, matching the JSON format produced on the
   C++ side.

This fixes `query_drv` for stores where the physical store directory
differs from the logical one (e.g. `local?root=X&store=Y` where `Y !=
X/nix/store`): previously `print_store_path` returned the logical path
and the direct filesystem access failed.

In fact, this fixes the code not only for redirected local stores, but
for *any* store implementation.

As an added bonus, with ATerm parsing gone, we drop the
`harmonia-store-aterm` dependency entirely.
A handful of test infrastructure fixes needed once the queue runner and
builder use distinct physical Nix stores sharing the same logical
store dir.

- `HydraTestContext`: create the builder and central roots up front
  with `make_path` (the `local?root=...` store won't initialise
  otherwise), and give the builder its own `nix_state_dir` /
  `nix_log_dir` fields alongside the existing `nix_store_dir`.

- `QueueRunnerBuildOne`:
  - Set `NIX_STORE_DIR` when starting `hydra-builder` with a `TODO`
    note: the builder currently reads this to advertise its logical
    store dir to the queue runner; it should take the value from the
    store URI instead.
  - Pump the queue-runner `IPC::Run` harness alongside the builder
    and flush accumulated stderr on every poll, so when a test hangs
    we actually see the queue runner's logs instead of a blank
    "Builder stderr: ..." wall.

- `queue-runner/notifications.t`: the "prebuild into a binary cache"
  preamble uses `nix-build` / `nix copy` / `nix log` / `nix-store`
  directly, which inherits the central store URI. That store has
  physical != logical, so building derivations whose `$PATH` points
  at the host `/nix/store/...-coreutils/bin` fails with `mkdir: not
  found`. Point all of these at the builder store (which has physical
  = logical) via `--store`, and clean up the builder's log dir at
  the end of the subtest instead of the central one.

  This also makes sure that the test doesn't rely on `nix-build` etc.
  doing things to the central / queue runner's store that we didn't
  realize, so we don't accidentally rely on those things (e.g. the test
  would fail without them.)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants