test(install): make rename-fallback test exercise the actual fallback (#174)#177
Merged
Conversation
…#174) The previous test was named for the install.ps1 rename-fallback path but didn't actually trigger it. It spawned 'Scripts/python.exe -I -c "import conductor; sleep(120)"', expecting that to lock files inside Scripts/. On many uv installs that python.exe is just a ~241 KB launcher; the real interpreter and python3xx.dll live under %APPDATA%/uv/python/..., so nothing in Scripts/ gets locked, uv tool install --force succeeds on the first attempt, Test-LockError never matches, and Move-ConductorToolDirAside is dead code from the test's perspective. Replace _spawn_long_running_conductor with _spawn_file_locker, which uses ctypes to call Win32 CreateFileW(target, GENERIC_READ, FILE_SHARE_READ, ...) — no FILE_SHARE_DELETE — on the seeded venv's python.exe. Any attempt by uv to delete that file then fails with ERROR_SHARING_VIOLATION ('used by another process'), which matches a Test-LockError needle and forces the rename-fallback path the test is named for. The locker subprocess writes a ready-file once the handle is open; _wait_for_lock polls it (or fails fast on early subprocess exit), replacing the time.sleep(2.0) heuristic. Add load-bearing assertions for the two diagnostic messages that prove the fallback ran: 'Install blocked by a file lock' and 'Moved existing install to'. Without these, the test passed regardless of whether Test-LockError or Move-ConductorToolDirAside actually executed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
CI revealed that opening Scripts/python.exe with FILE_SHARE_READ only
was too strict: it correctly triggered Test-LockError ("Install blocked
by a file lock" appeared in the log), but Move-ConductorToolDirAside
itself failed with "used by another process" on every retry. On NTFS,
renaming a parent directory requires every open child handle to grant
FILE_SHARE_DELETE — without it, the rename returns ERROR_SHARING_VIOLATION
even though no handle points at the directory itself.
This is the exact share-mode profile the install.ps1 fallback was
designed for: the production scenario it targets is a running
conductor.exe whose Windows image section pins Scripts/ files. Image
sections always use full share modes (FILE_SHARE_READ | FILE_SHARE_WRITE
| FILE_SHARE_DELETE), so the file is pinned but the parent rename is
permitted.
Switch the locker's CreateFileW share mode to full sharing. The lock
mechanism becomes:
1. Locker opens python.exe with full share modes; the handle stays
alive.
2. uv tool install --force opens the file for DELETE and marks it via
FILE_DISPOSITION_INFO. The mark succeeds (FILE_SHARE_DELETE
granted) and the file enters delete-pending state, but the locker's
handle keeps it on disk.
3. uv calls RemoveDirectory(Scripts/) which fails with
ERROR_DIR_NOT_EMPTY because delete-pending files still appear in
directory listings.
4. uv surfaces this as 'failed to remove directory ...' — matching a
Test-LockError needle in install.ps1.
5. Move-ConductorToolDirAside renames conductor-cli aside. This
succeeds because the locker grants FILE_SHARE_DELETE.
6. Retry of uv tool install --force creates a fresh conductor-cli/.
Also folds in two top-priority items from the silent-failure-hunter
review: narrow _kill's exception handling so leaked subprocesses are
surfaced to stderr instead of being swallowed (C1), and capture child
stdout/stderr in _wait_for_lock's timeout path so 'stuck child' failures
are diagnosable in CI (H2).
Updated the helper and test docstrings to explain the new mechanism and
the reasoning behind the share-mode choice — without the explanation, a
future reader would assume FILE_SHARE_READ-only is the obvious choice
and re-introduce the CI failure.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ise the rename-fallback
The previous CI iterations established that no synthetic Win32 file lock
can simultaneously trigger Test-LockError AND let Move-ConductorToolDirAside
succeed against modern uv on Windows:
* Without FILE_SHARE_DELETE on a child file: blocks uv's open-for-DELETE
("used by another process" matches Test-LockError) but ALSO blocks
the parent dir rename (NTFS rule: rename of a directory requires every
open child handle to grant FILE_SHARE_DELETE — without it MoveFileExW
on the parent returns ERROR_SHARING_VIOLATION).
* With FILE_SHARE_DELETE: Rust ≥ 1.66 uses
FILE_DISPOSITION_FLAG_POSIX_SEMANTICS for std::fs::remove_file, which
immediately unlinks the file from the directory listing. The file
persists on disk while our handle is open but is no longer
enumerable, so RemoveDirectory(Scripts/) succeeds, no lock error
surfaces, and the fallback never fires.
The install.ps1 author's mental-model comment ("rename works as long as
no handles point at the *directory* itself") is incomplete — it only
holds when children grant FILE_SHARE_DELETE, AND modern uv routes around
that case via POSIX-semantics unlinks anyway.
Pivot to Option B from issue #174: install a uv shim that intercepts
the first `uv tool install --force` call and returns a canned
lock-error matching Test-LockError's needles. The shim defers to the
real uv on every other call (including the retry after the rename).
This gives the test a deterministic verification of install.ps1's
control flow:
1. install.ps1 runs uv → shim → canned lock-error.
2. Test-LockError matches → "Install blocked by a file lock" logged.
3. Move-ConductorToolDirAside runs (no real lock, succeeds) →
"Moved existing install to ..." logged.
4. Retry runs uv → shim defers to real uv → install succeeds.
The test now verifies what's actually load-bearing: that install.ps1's
lock-detection-and-rename-fallback flow works correctly given a
lock-shaped error from uv. Whether modern uv produces such an error
under real Windows lock conditions is a separate (and increasingly
unlikely) concern, not install.ps1's responsibility to test.
Implementation:
* _install_uv_shim(sandbox) writes uv-shim.py + uv.bat into a temp dir
and returns (shim_dir, extra_env). Caller prepends shim_dir to PATH
and merges extra_env (CONDUCTOR_TEST_REAL_UV + state file path).
* uv.bat hardcodes sys.executable + uv-shim.py absolute paths so the
shim works regardless of what's on PATH inside install.ps1's env.
* Shim is stateful via a file in sandbox.root — only the first
install --force call returns the canned error.
* Real uv path is captured BEFORE prepending shim_dir to PATH, so the
shim never accidentally calls itself.
* Removed _spawn_file_locker / _wait_for_lock / _drain_subprocess_output
helpers; they're no longer needed.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…pse helper API)
Apply four high-value suggestions from the PR-review code-simplifier pass:
* Extract the embedded uv shim source from a triple-quoted string into
a real Python file (tests/test_integration/_uv_shim.py). The string
embedding required four-backslash escape gymnastics for Windows
paths in the canned error message and put the shim source outside
of ruff/ty coverage. The standalone file is ruff-formatted,
type-checked, syntax-highlighted, and human-readable. Filename
starts with '_' so pytest does not collect it.
* Collapse _install_uv_shim's return type from (Path, dict) to a
single dict[str, str] with PATH already prepended. Every caller had
to do the prepend themselves; pushing it into the helper eliminates
the obligation and the test body shrinks accordingly.
* Drop the dead 'sandbox.python_exe.exists()' assertion that was a
leftover precondition from the abandoned Win32-handle-lock draft.
The shim approach never references python_exe, so the assertion
guards a precondition the test no longer has.
* Drop the redundant 'len(args) >= 3' check in the shim's
is_install_force test — '--force' in args[2:] already implies the
list is long enough.
Net impact: -97 / +34 lines across the test file, plus +52 lines of
clean, type-checked shim source. The test body is now ~5 lines around
the actual install invocation.
Skipped suggestions:
* Reverting _kill to its prior best-effort form. The expanded
diagnostics came from the silent-failure-hunter review (C1) and
surface leaks that the bare 'except Exception: pass' would have
swallowed silently in CI. Worth keeping bundled.
* Cosmetic: pytest.skip vs RuntimeError for missing uv, removing
.strip() on read_text, moving state file inside shim_dir. Stylistic
preferences with no behavioral impact.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #174.
Problem
tests/test_integration/test_install_scripts.py::test_upgrade_with_running_process_uses_rename_fallbackis named for the rename-fallback path ininstall.ps1but didn't actually assert that path runs. It only checkedreturncode == 0andversion == \"0.0.2\", which both hold even whenuv tool install --forcesucceeds on the first attempt without the fallback ever firing.If
Move-ConductorToolDirAsideorTest-LockErrorregressed (or were removed), the test would still pass.Why the fallback wasn't triggered before
The original helper spawned the seeded venv's
python.exe -I -c \"import conductor; sleep(120)\", expecting that to lock files insideScripts/. On many uv installs (verified with uv 0.9.14 + Python 3.13/3.14), thatpython.exeis just a ~241 KB launcher; the real interpreter andpython3xx.dlllive under%APPDATA%/uv/python/cpython-3.X-windows-x86_64-none/. So nothing insideScripts/ends up locked,uv tool install --forcesucceeds on the first attempt,Test-LockErrornever sees a lock-error string, and the rename-fallback is dead code from this test's perspective.The functional outcome (install succeeds while a venv-loaded process runs) was verified — but the fallback path itself wasn't.
Fix
Implements Option A from the issue: hold a real Win32 file lock instead of relying on the launcher to lock anything.
Changes to
tests/test_integration/test_install_scripts.pyReplaced
_spawn_long_running_conductorwith_spawn_file_locker(sandbox, ready_file). It launchessys.executable(the test runner's own Python — sidestepping the launcher problem) running a small ctypes script that calls:on the seeded venv's
Scripts/python.exe. Crucially, noFILE_SHARE_DELETE— any subsequent attempt byuv tool install --forceto delete the file fails withERROR_SHARING_VIOLATION("used by another process"), which matches one ofTest-LockError's needles ininstall.ps1and forces the rename-fallback path. After the handle is open, the child writes a ready-file and sleeps; the parent kills it infinally.Added
_wait_for_lock(proc, ready_file, timeout=10.0). Polls every 50 ms for the ready-file and fails fast (with the child's stdout/stderr) if the subprocess dies before signalling ready, replacing the oldtime.sleep(2.0)heuristic.Added two load-bearing assertions to the test that prove the fallback actually executed:
These are the diagnostic messages emitted by
Write-Warn/Write-Okimmediately aroundMove-ConductorToolDirAside. Without them, the test passes regardless of whetherTest-LockErroror the rename-fallback ran.Added a pre-flight
assert sandbox.python_exe.exists()afterseed_installso a missing seeded venv produces a clearer failure than a CreateFileW WinError from the child.Updated docstring to describe the new mechanism, the assertions, and the rationale (with a pointer to test: rename-fallback integration test doesn't actually exercise the fallback #174).
Verification
make lint(ruff check + ruff format --check on the whole repo) — passes.uv run ty check tests/test_integration/test_install_scripts.py— passes.uv run pytest -m install_scripts --collect-only tests/test_integration/test_install_scripts.py— collects 5 tests cleanly, no import errors.C:\\Users\\…\\python.exe) round-trip correctly throughrepr()into the child'starget = ...assignment with single backslashes (no double-backslash artifact).Cannot run on this machine
The Windows-only test is
@pytest.mark.skipif(not IS_WINDOWS, …)and uses Win32 ctypes APIs, so it can't be exercised on macOS where I'm working. The wholeinstall_scriptssuite is opt-in (-m install_scripts) and excluded from the defaultmake testrun, so this won't disturb the default CI path either.Reviewer-relevant notes
sys.executable(not the seeded venv'spython.exe) for the locker process. This avoids the original "launcher locks nothing" problem and is functionally equivalent — Windows enforces share modes per file regardless of which process opened the handle. The locked file (Scripts/python.exeinside the sandbox) is what production installs lock; that hasn't changed.conductor-cli/, not to the directory itself, and runs withcwd=sandbox.root. SoMove-Itemon theconductor-clidirectory still succeeds — which is exactly why the rename-fallback works.Test-LockError's needle list ininstall.ps1(e.g.os error 32,sharing violation) would harden the production fallback against locale/format changes inuv's error output. That's deliberately out of scope for this PR — file as a follow-up if desired.Severity
Low — the fallback code is unchanged across recent PRs and the install script's other behavior is well-covered. This closes an "if it breaks, we won't know" gap rather than fixing a current bug.