fix(cli): split --provider vs --hardware on slot create (#2) #64
Merged
Close finding #2 from tests/harness/FINDINGS.md. The slot-create CLI flag --backend was always really the provider; the actual hardware backend was hardcoded to vulkan, blocking ROCm + CPU slot creation from the command line.

Most of the split (new --provider / --hardware flags, hidden --backend alias, _detect_default_hardware probe) had already landed; this commit finishes the brief:

- Deprecation warning now goes to stderr via typer.echo(..., err=True) with the exact phrasing the brief calls out, so stdout stays parseable when scripts pipe the success line elsewhere.
- Same stderr routing applied to slot edit's --backend alias for consistency.
- New test: bare 'hal0 slot create primary' on a Strix Halo fixture (AMD iGPU, vulkan_capable=True, compute_capable=False) auto-resolves hardware=vulkan, the platform hal0 v1 most cares about.
- New test: --hardware foo is rejected at the Typer/Click parse layer before the command body runs (no API call made).
- Existing legacy-backend test now asserts the deprecation lands on stderr, not stdout, so re-introducing the old console.print path fails loudly.
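The stderr routing described in this commit message can be illustrated with a minimal, dependency-free sketch. It uses plain `print(..., file=sys.stderr)` rather than the `typer.echo(..., err=True)` call the commit actually uses, and the helper names `resolve_provider`, `slot_create`, and `run_captured`, the success-line format, and the rule that an explicit `--provider` wins over the legacy alias are all assumptions made for illustration, not the project's code.

```python
import contextlib
import io
import sys
from typing import Optional, Tuple

# Wording quoted in the PR description.
DEPRECATION_MSG = (
    "[deprecated] --backend will be renamed to --provider in v0.2; use --provider"
)


def resolve_provider(provider: Optional[str], backend: Optional[str]) -> Optional[str]:
    """Hypothetical helper: fold the legacy --backend value into --provider."""
    if backend is not None:
        # Diagnostics go to stderr so stdout carries only the success line.
        print(DEPRECATION_MSG, file=sys.stderr)
        if provider is None:
            return backend
    # Assumption: an explicit --provider takes precedence over the alias.
    return provider


def slot_create(name: str, provider: Optional[str] = None,
                backend: Optional[str] = None) -> None:
    chosen = resolve_provider(provider, backend)
    # Hypothetical success line; the real CLI output format is not shown in the PR.
    print(f"created slot {name} provider={chosen}")


def run_captured(**kwargs) -> Tuple[str, str]:
    """Run slot_create and return (stdout, stderr) as separate strings."""
    out, err = io.StringIO(), io.StringIO()
    with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
        slot_create("primary", **kwargs)
    return out.getvalue(), err.getvalue()
```

Capturing the two streams separately is what lets a regression test assert that the warning never leaks into stdout, which is the property the updated legacy-backend test checks.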
0c4ce34 to 5458a0d
thinmintdev added a commit that referenced this pull request on May 21, 2026
The hal0-toolbox-vulkan image's ENTRYPOINT is llama-server itself
(packaging/toolbox/vulkan.Dockerfile:154). The smoke-test job was
running `docker run image /bin/bash -c '...'`, which passes /bin/bash
as the first arg to llama-server, producing:
error: invalid argument: /bin/bash
Override the entrypoint to /bin/bash for the proof-of-life check.
Closes the slot-integration CI failure that was inherited by every
PR off main (#64, #67).
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
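To make the entrypoint problem concrete, here is a small sketch that builds the two `docker run` argument lists. The function names, the image tag, and the script string in the usage are hypothetical; only the `--entrypoint` flag and the argument ordering reflect documented `docker run` behavior, where everything after the image name is passed to the entrypoint.

```python
from typing import List


def broken_smoke_test_argv(image: str, script: str) -> List[str]:
    """The failing shape: /bin/bash lands after the image name, so it is
    handed to the image's ENTRYPOINT (llama-server) as its first argument."""
    return ["docker", "run", image, "/bin/bash", "-c", script]


def smoke_test_argv(image: str, script: str) -> List[str]:
    """The fixed shape: --entrypoint replaces the image's ENTRYPOINT, and the
    arguments after the image name ("-c", script) go to /bin/bash instead."""
    return ["docker", "run", "--entrypoint", "/bin/bash", image, "-c", script]
```

A CI job could pass the list from `smoke_test_argv` to `subprocess.run(..., check=True)`; the list form also avoids shell-quoting problems around the script string.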
Closes finding #2 from tests/harness/FINDINGS.md. The CLI's --backend flag was really the provider; the actual hardware backend was hardcoded to vulkan, blocking ROCm + CPU slot creation from the command line.
What
Most of the split had already landed (the --provider / --hardware flags exist, the hidden --backend alias is in place, and the _detect_default_hardware probe walks /etc/hal0/hardware.json). This PR finishes the v1 polish:
- Deprecation warning now goes to stderr via typer.echo("[deprecated] --backend will be renamed to --provider in v0.2; use --provider", err=True). Previously it was a Rich console.print to stdout, which polluted stdout for callers piping the success line into other tools.
- Same stderr routing applied to slot edit's --backend alias for consistency.
- --backend remains a hidden=True deprecated alias (per CONTRIBUTING.md style), no full removal.
- The hardcoded "backend": "vulkan" is gone from the POST body. It was already removed in the earlier landing; this PR preserves and tests that.
Tests
- New test: bare hal0 slot create primary on a Strix Halo fixture (AMD iGPU, vulkan_capable=True, compute_capable=False) auto-resolves hardware=vulkan, the platform hal0 v1 most cares about.
- New test: --hardware foo is rejected at the Typer/Click parse layer before the command body runs (no API call made).
- Existing legacy --backend test now asserts the deprecation lands on result.stderr, not result.output, so re-introducing the old console.print path fails this regression test loudly.
- The test_slot_create_flags.py tests + the broader 27-test CLI suite are green.
Schema
SlotConfig.backend already accepts {vulkan, rocm, flm, moonshine, kokoro, cpu} and SlotConfig.provider already accepts {llama-server, flm, moonshine, kokoro}, so no schema change is needed.
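The default-hardware resolution and the parse-layer rejection can be sketched together as follows. This is not the project's `_detect_default_hardware`: the flat JSON layout with `vulkan_capable` / `compute_capable` keys, the vulkan-before-rocm precedence, and the CPU fallback are assumptions, and stdlib `argparse` with `choices=` stands in for the Typer/Click option validation the PR uses.

```python
import argparse
import json
from pathlib import Path
from typing import List

# Accepted values as listed for SlotConfig.backend in the Schema note.
HARDWARE_CHOICES = ("vulkan", "rocm", "flm", "moonshine", "kokoro", "cpu")


def detect_default_hardware(path: Path) -> str:
    """Hypothetical stand-in for the probe that reads /etc/hal0/hardware.json."""
    try:
        info = json.loads(path.read_text())
    except (OSError, ValueError):
        return "cpu"  # assumption: fall back to CPU when the file is unusable
    if not isinstance(info, dict):
        return "cpu"
    # Assumed key names and precedence; the real file layout may differ.
    if info.get("vulkan_capable"):
        return "vulkan"
    if info.get("compute_capable"):
        return "rocm"
    return "cpu"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="slot-create-sketch")
    parser.add_argument("name")
    # choices= makes the parser reject unknown values before any command
    # body (and therefore any API call) runs.
    parser.add_argument("--hardware", choices=HARDWARE_CHOICES, default=None)
    return parser


def resolve_hardware(argv: List[str], hardware_json: Path) -> str:
    args = build_parser().parse_args(argv)
    # An explicit flag wins; otherwise fall back to the probe.
    return args.hardware or detect_default_hardware(hardware_json)
```

With a fixture file containing `{"vulkan_capable": true, "compute_capable": false}`, `resolve_hardware(["primary"], fixture_path)` resolves to `vulkan`, mirroring the Strix Halo test, while `--hardware foo` makes `argparse` exit during parsing.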