fix(arc): use whole /dev/dri pass-through for device-drift resilience #38
Merged
Replaces single-device pattern (${GPU_CARD:-/dev/dri/card1} + GPU_RENDER) with
whole-directory mount /dev/dri:/dev/dri.
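
The fallback in the old pattern uses the `${VAR:-default}` form, which behaves the same way in POSIX shell. A minimal sketch of how it resolves with and without the override; `resolve_gpu_card` is a hypothetical helper for illustration, not part of the repository:

```shell
#!/bin/sh
# Hypothetical helper: mirrors the ${GPU_CARD:-/dev/dri/card1} fallback
# used by the old single-device pattern.
resolve_gpu_card() {
    printf '%s\n' "${GPU_CARD:-/dev/dri/card1}"
}

# No override set: falls back to the hardcoded card1 default.
unset GPU_CARD
resolve_gpu_card    # prints /dev/dri/card1

# Override as described for the .env file: card0 is used instead.
GPU_CARD=/dev/dri/card0
resolve_gpu_card    # prints /dev/dri/card0
```

An empty `GPU_CARD` also falls back to the default, because `:-` treats empty the same as unset.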
Why:
- Original pattern hardcoded card1 as default; only safe with .env override
setting GPU_CARD=/dev/dri/card0. Future installs missing .env override
would hit the bms-ai-cluster-class device-mapping bug (msg ca3d45b4
+ Solution_files validation 2026-05-18).
- Whole-/dev/dri mount exposes all card*, renderD*, by-path/ symlinks,
is resilient to per-boot card-number drift, and works on any host
regardless of which card-N is the GPU.
- Defensive change: doesn't affect ai-stack-on-Phoenix deployment
(card0 was already mounted via .env override). Empirical measurement
on Phoenix shows a 13.75 tok/s 3-run mean vs 14.45 baseline (about
-4.8%, within slight-noise band), consistent with neutral perf impact
plus the defensive benefit.
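
For reference, the relative change between the two throughput figures quoted above can be computed as follows. This is only arithmetic on the numbers in this message; `relative_change_pct` is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: relative change of a new measurement against a
# baseline, in percent, rounded to one decimal place.
relative_change_pct() {
    awk -v new="$1" -v base="$2" \
        'BEGIN { printf "%.1f\n", (new - base) / base * 100 }'
}

# 3-run mean after the change vs the baseline, both in tok/s.
relative_change_pct 13.75 14.45    # prints -4.8
```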
What this fix does NOT include:
- Stage-2 SYCL env-var overrides (GGML_SYCL_DISABLE_OPT=0, etc.) — verified
via 8th-class binary-self-report probe that ai-stack's current
ava-agentone:latest image already has those opts enabled at build time
(no 'DISABLE_OPT: yes' in Build with Macros). Stage-2 overrides are
null-effect on this image build; OLLAMA_NUM_CTX=8192 actively hurts
(-17% measured via KV cache bloat).
- The bms-ai-cluster 'Solution_files' Stage-2 boost applies to OLDER
ipex-llm builds with DISABLE_OPT=yes at compile-time. Doesn't transfer
to ai-stack's current image build.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This was referenced May 19, 2026
Summary
Context
NetYeti fixed his bms-ai-cluster standalone Ollama (separate compose) by switching to a whole-dir mount and adding Stage-2 SYCL env-var overrides. Pod studied Solution_files (10 docs) end-to-end and ran an empirical apply-test on ai-stack.
Finding: NetYeti's full Stage-2 fix does NOT transfer to ai-stack's current image build. The Loom 8th-class probe (binary-self-report-read on `docker logs ollama`) shows that ai-stack's `Build with Macros:` reports only `FORCE_MMQ: no` and `F16: no`; it does NOT have `DISABLE_OPT: yes` at compile time (unlike bms-ai-cluster's older image). The Stage-2 env-var overrides are therefore null-effect on this build, and OLLAMA_NUM_CTX=8192 actively hurts (-17% from KV cache bloat).
This PR captures only the part that's safe + defensive across image builds: the whole-dir mount.
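
The compile-time check described in the finding can be sketched as a small filter over log text. This is a minimal sketch, not the probe actually used: `has_disable_opt` is a hypothetical helper, and the sample string is illustrative, built only from the macro values reported above rather than from captured output.

```shell
#!/bin/sh
# Hypothetical helper: reads log text on stdin and succeeds only if it
# contains the compile-time marker "DISABLE_OPT: yes".
has_disable_opt() {
    grep -q 'DISABLE_OPT: yes'
}

# Illustrative sample only (not captured output): the two macro values
# this PR reports for ai-stack's image.
sample='Build with Macros: FORCE_MMQ: no F16: no'

if printf '%s\n' "$sample" | has_disable_opt; then
    echo "DISABLE_OPT compiled in: Stage-2 overrides may matter"
else
    echo "DISABLE_OPT not reported: Stage-2 overrides expected to be null-effect"
fi
```

Against a running container, the same filter could be fed from `docker logs ollama 2>&1 | has_disable_opt`, assuming the build macros appear in the container's log output as described above.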
Test plan
What this PR does NOT include
Co-Authored-By
Claude Opus 4.7 (1M context) noreply@anthropic.com