fix(cli): validate boolean cache flags, fix CLI-over-env precedence#70
Merged
Conversation
The boolean cache flags `--prompt-cache-enabled` and `--apc-enabled` accepted a loosely space-separated value, which parsed inconsistently with the `=`-delimited form shown in `--help`. Add `require_equals`, `num_args = 0..=1`, `default_missing_value = "true"`, and `ArgAction::Set` to both flags on both server entrypoints (the `mlxcel-server` binary and the `mlxcel serve` subcommand). The flags now have one well-defined contract: bare `--prompt-cache-enabled` means true, `--prompt-cache-enabled=false` disables it, and a stray `--prompt-cache-enabled false` is rejected with a clear parser error instead of silently leaving the default in place and treating `false` as an unexpected argument.
Restore CLI-over-env precedence for those two flags. A new `long_cli_flag_was_set` helper (defined in `server::cli_input`, re-exported from `server`) scans the process argv for an explicit long flag. Startup previously passed a hardcoded `false` for "was this set on the CLI", so the env-var fallback was always consulted and an explicit `--prompt-cache-enabled=false` / `--apc-enabled=false` could be silently overridden by `MLXCEL_PROMPT_CACHE_ENABLED` / `APC_ENABLED`. Both binaries now pass `long_cli_flag_was_set("prompt-cache-enabled")` and `long_cli_flag_was_set("apc-enabled")`, so an explicit CLI value wins while the compiled-in default stays overridable by env when the flag is absent.
Clean up the `--help` surface and the option doc comments on the touched flags: drop stale internal tracker identifiers, and repoint the two doc links to this repository's layout — `docs/distributed.md` for the pipeline-parallelism workflow and the project README for the model-surgery operations. Add the `pipeline_tensor_parallel` value to the `--node-role` help list so it matches the roles the server actually accepts.
Update the `#[ignore]`d prompt-cache integration tests (`prompt_cache_e2e`, `prompt_cache_prefill_bench`) to pass `--prompt-cache-enabled=true|false` using the now-required `=` syntax; their previous space-separated invocations would fail to parse under the tightened contract.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Tightens the boolean cache-flag CLI contract, fixes a CLI-over-env precedence bug for those flags, and cleans up the affected
--helptext.Changes
Flag validation
--prompt-cache-enabledand--apc-enabled(on both themlxcel-serverbinary and themlxcel servesubcommand) now userequire_equals+num_args = 0..=1+default_missing_value = "true"+ArgAction::Set. This gives one well-defined contract: bare--prompt-cache-enabledmeans true,--prompt-cache-enabled=falsedisables it, and a stray space-separated--prompt-cache-enabled falseis rejected with a clear parser error instead of silently keeping the default and treatingfalseas an unexpected argument.CLI-over-env precedence (bug fix)
A new
long_cli_flag_was_sethelper (defined inserver::cli_input, re-exported fromserver) scans the process argv for an explicit long flag. Startup previously passed a hardcodedfalsefor "was this set on the CLI", so the env-var fallback was always consulted and an explicit--prompt-cache-enabled=false/--apc-enabled=falsecould be silently overridden byMLXCEL_PROMPT_CACHE_ENABLED/APC_ENABLED. Both binaries now passlong_cli_flag_was_set(...), so an explicit CLI value wins while the compiled-in default stays overridable by env when the flag is absent.Help-text cleanup
--helpsurface and option doc comments on the touched flags.docs/distributed.mdfor the pipeline-parallelism workflow and the project README for the model-surgery operations. (One identical pre-existing broken link at the--pp-autooption was corrected in the same file for consistency.)pipeline_tensor_parallelvalue to the--node-rolehelp list so it matches the roles the server accepts.Tests
Updated the
#[ignore]d prompt-cache integration tests (prompt_cache_e2e,prompt_cache_prefill_bench) to invoke--prompt-cache-enabled=true|falsewith the now-required=syntax; their previous space-separated form would fail to parse under the tightened contract.Out of scope
A version bump and the en/ko docs + manpages from the originating change are intentionally excluded: this repository uses a flat
docs/tree with no corresponding files, and the version is left at0.0.29.Verification
cargo fmt --all -- --check— cleancargo clippy --features metal,accelerate --lib --bin mlxcel --bin mlxcel-server --tests -- -D warnings— cleanmlxcel-serverand confirmed parsing:--prompt-cache-enabled=false→ parses (then fails at model load, i.e. past arg parsing)--prompt-cache-enabled false→ rejected:'false' cannot be used with '--prompt-cache-enabled[=<BOOL>]'--prompt-cache-enabled(bare) → parses viadefault_missing_value