
fix: Docker entrypoint arg handling + configurable model directory #402

Merged
ruvnet merged 1 commit into ruvnet:main from voidborne-d:fix/docker-entrypoint-and-model-path on Apr 20, 2026

Conversation

@voidborne-d
Contributor

Summary

Fixes #384 and #399 by replacing the broken Docker entrypoint pattern and making the model scan directory configurable.

Problem

Issue #384: --source: not found when passing flags to docker run

The Dockerfile used ENTRYPOINT ["/bin/sh", "-c"] with a shell-form CMD. When users passed flags:

docker run ruvnet/wifi-densepose:latest --source wifi --tick-ms 500

Docker replaced CMD entirely, producing /bin/sh -c "--source wifi --tick-ms 500" which tries to execute --source as a shell command → --source: not found.

The -- separator also fails: docker run ... -- --source wifi produces /bin/sh -c "-- --source wifi" → --: not found.

Issue #399: --model / --load-rvf parameters ignored in Docker

The model discovery API (GET /api/v1/models, scan_model_files()) hardcodes the path data/models. When Docker users mount models to /app/models/:

docker run -v /path/to/models:/app/models ruvnet/wifi-densepose:latest \
  /app/sensing-server --model /app/models/wiflow-v1.rvf

The --model CLI flag loads the specific file correctly, but GET /api/v1/models scans /app/data/models/ and finds nothing → Discovered 0 model files.

Solution

1. docker/docker-entrypoint.sh (new)

A proper entrypoint script that handles three usage patterns:

# Pattern 1: Use env vars (no args)
docker run -e CSI_SOURCE=esp32 ruvnet/wifi-densepose:latest

# Pattern 2: Pass CLI flags directly
docker run ruvnet/wifi-densepose:latest --source esp32 --tick-ms 500

# Pattern 3: Override command entirely
docker run ruvnet/wifi-densepose:latest /bin/sh

When there are no arguments, or the first argument starts with -, the script runs the server binary with sensible default flags. User-passed flags are appended after the defaults and override them via clap's last-wins behavior. Any other first argument is treated as an explicit command and executed as-is.

The script also sets --bind-addr 0.0.0.0, because the binary defaults to 127.0.0.1, which blocks access from outside the container.
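
The dispatch rule above can be sketched in POSIX sh. This is an illustration under stated assumptions, not the actual docker/docker-entrypoint.sh: the function name resolve_command and the UI_PATH value are hypothetical (the PR only shows that --ui-path is set), while the remaining default flags are taken from the test output in this PR. The sketch prints the resulting command line instead of exec-ing it so it can be run outside a container.

```shell
#!/bin/sh
# Sketch only: mirrors the three usage patterns described in this PR.

SERVER_BIN="/app/sensing-server"   # binary path as it appears in the PR text
UI_PATH="/app/ui"                  # hypothetical placeholder value

resolve_command() {
    # No arguments, or a first argument that starts with "-":
    # run the server with defaults, then append the user's flags.
    if [ "$#" -eq 0 ] || [ "${1#-}" != "$1" ]; then
        set -- "$SERVER_BIN" \
            --source "${CSI_SOURCE:-auto}" \
            --tick-ms 100 \
            --ui-path "$UI_PATH" \
            --http-port 3000 \
            --ws-port 3001 \
            --bind-addr 0.0.0.0 \
            "$@"
    fi
    # Any other first argument is an explicit command and passes through unchanged.
    # A real entrypoint would end with: exec "$@"
    printf '%s\n' "$*"
}

resolve_command                              # pattern 1: defaults only
resolve_command --source wifi --tick-ms 500  # pattern 2: defaults, then user flags
resolve_command /bin/sh                      # pattern 3: explicit command
```

In the second call the user flags land after the defaults, which is the ordering the override behavior described above depends on.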

2. MODELS_DIR environment variable

The model scan path is now configurable:

docker run -v /path/to/models:/app/models -e MODELS_DIR=/app/models \
  ruvnet/wifi-densepose:latest --source esp32

The effective_models_dir() helper reads MODELS_DIR at runtime with a default of data/models, preserving backward compatibility.
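
The helper itself lives in the Rust sources, but the same fallback can be mirrored in shell, for example as a pre-flight check before starting the container. The sketch below is an analogue, not the PR's code: count_model_files is a hypothetical name, scanning only for .rvf files is an assumption based on the file names used in this PR, and treating an empty MODELS_DIR like an unset one is a choice of this sketch.

```shell
# Resolve the model directory the way the PR describes: MODELS_DIR if set,
# otherwise the backward-compatible default data/models.
effective_models_dir() {
    printf '%s\n' "${MODELS_DIR:-data/models}"
}

# Count the .rvf files directly inside the resolved directory (assumed extension).
count_model_files() {
    dir=$(effective_models_dir)
    count=0
    for f in "$dir"/*.rvf; do
        # The glob stays literal when nothing matches, so check each entry.
        if [ -f "$f" ]; then
            count=$((count + 1))
        fi
    done
    printf '%s\n' "$count"
}

echo "scanning $(effective_models_dir): $(count_model_files) model file(s)"
```

A count of 0 here points at the same situation issue #399 reported: the files are mounted somewhere other than the directory being scanned.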

3. Updated Dockerfile.rust

  • Exec-form ENTRYPOINT pointing to the entrypoint script
  • ENV MODELS_DIR=data/models
  • COPY the entrypoint script into the image

4. Updated docker-compose.yml

  • Removed the broken shell-form command
  • Added MODELS_DIR environment variable

Files changed

| File | Change |
| --- | --- |
| docker/docker-entrypoint.sh | New entrypoint script |
| docker/Dockerfile.rust | Exec-form ENTRYPOINT + MODELS_DIR env |
| docker/docker-compose.yml | Remove shell-form command, add MODELS_DIR |
| src/model_manager.rs | models_dir() reads MODELS_DIR env var |
| src/main.rs | effective_models_dir() replaces hardcoded paths |
| tests/test_docker_entrypoint.sh | 17 regression tests |

Tests

$ bash tests/test_docker_entrypoint.sh
=== Docker entrypoint tests ===
Test 1: No arguments (default CSI_SOURCE=auto)
  ✓ includes --source auto
  ✓ includes --tick-ms 100
  ✓ includes --ui-path
  ✓ includes --http-port 3000
  ✓ includes --ws-port 3001
  ✓ includes --bind-addr 0.0.0.0
Test 2: CSI_SOURCE=esp32
  ✓ includes --source esp32
Test 3: User passes --source wifi --tick-ms 500
  ✓ includes --source wifi
  ✓ includes --tick-ms 500
Test 4: CSI_SOURCE unset
  ✓ includes --source auto (default)
Test 5: User passes --model /app/models/my.rvf
  ✓ includes --model
  ✓ also includes default flags
Test 6: CSI_SOURCE=simulated
  ✓ includes --source simulated
Test 7: Explicit command (echo hello)
  ✓ passes through explicit command
  ✓ does not inject sensing-server flags
Test 8: MODELS_DIR env var propagation
  ✓ MODELS_DIR is visible
  ✓ MODELS_DIR defaults to unset
=== Results: 17 passed, 0 failed ===

Fixes ruvnet#384: docker run with --source/--tick-ms flags now works correctly.
Fixes ruvnet#399: model files in mounted volumes are now discoverable via MODELS_DIR env var.

Root cause (issue ruvnet#384):
The Dockerfile used ENTRYPOINT ["/bin/sh", "-c"] with a shell-form CMD.
When users passed flags like `--source wifi --tick-ms 500` as docker run
arguments, Docker replaced CMD entirely, resulting in
`/bin/sh -c "--source wifi --tick-ms 500"` which executes `--source` as
a shell command → `--source: not found`.

Root cause (issue ruvnet#399):
Model directory was hardcoded to the relative path `data/models`. When Docker
users mounted models to `/app/models/`, the scan looked in the wrong place.

Changes:

1. docker/docker-entrypoint.sh (new):
   - Proper entrypoint script that handles both env-var-based defaults and
     user-passed CLI flags
   - No arguments → starts server with CSI_SOURCE env var as --source
   - Flag arguments (start with -) → prepends /app/sensing-server + defaults,
     appends user flags (clap last-wins allows overrides)
   - Non-flag first arg → exec passthrough (e.g., /bin/sh for debugging)
   - Sets --bind-addr 0.0.0.0 (was 127.0.0.1 which blocks container access)

2. docker/Dockerfile.rust:
   - Switch from ENTRYPOINT ["/bin/sh", "-c"] to exec-form entrypoint
   - Add MODELS_DIR env var (default: data/models)
   - COPY the entrypoint script into the image

3. docker/docker-compose.yml:
   - Remove shell-form command (entrypoint handles defaults)
   - Add MODELS_DIR env var

4. model_manager.rs + main.rs:
   - Replace hardcoded `data/models` path with `effective_models_dir()`
     / `models_dir()` that reads MODELS_DIR env var at runtime
   - Docker users can now: docker run -v /host/models:/app/models -e MODELS_DIR=/app/models

5. tests/test_docker_entrypoint.sh (new, 17 tests):
   - Default CSI_SOURCE substitution (6 assertions)
   - Custom CSI_SOURCE propagation
   - User-passed flag arguments (--source, --tick-ms, --model)
   - Unset CSI_SOURCE defaults to auto
   - Explicit command passthrough
   - MODELS_DIR env var propagation
@proffesor-for-testing

Thanks @voidborne-d — really nice, thorough fix. Reproduced the bug on upstream/main with a stub-binary image that mirrors the Dockerfile's ENTRYPOINT structure byte-for-byte, and confirmed docker run image --source wifi errors with /bin/sh: 0: Illegal option --. Your patch resolves it cleanly:

  • no-args injects sensible defaults + --bind-addr 0.0.0.0
  • user flags append and override via clap last-wins
  • -e CSI_SOURCE=esp32 substitutes correctly
  • -e MODELS_DIR=/app/models propagates to every Rust scan/delete path (verified all 7 previously-hardcoded data/models occurrences in main.rs + model_manager.rs now route through effective_models_dir() / models_dir())
  • explicit command passthrough (docker run image echo hi) works

Your tests/test_docker_entrypoint.sh runs 17/17 green locally. Appreciate the tests especially — made this very easy to verify.

@proffesor-for-testing

@ruvnet — recommend merging this PR. Verified locally against upstream/main (8914538b):

  • Bug reproduced on main: docker run image --source wifi → /bin/sh: 0: Illegal option --
  • Fix verified across all 5 scenarios (no-args, flag-args, env-vars, MODELS_DIR propagation, explicit-command passthrough) — all green
  • Author's test suite (tests/test_docker_entrypoint.sh): 17/17 pass
  • Security scan: no blocking issues; shellcheck-clean on the production entrypoint, path-traversal protection in delete_model preserved, no new secrets or attacker-controlled code paths

One awareness item (not blocking): the entrypoint now binds --bind-addr 0.0.0.0 inside the container. This is necessary for Docker port publishing to work at all, but the HTTP/WS API has no auth/TLS (pre-existing), so operators on shared hosts should prefer -p 127.0.0.1:3000:3000 — worth a README line.
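
As a sketch of that recommendation, the command below publishes both ports on the host loopback interface only. Publishing the WebSocket port alongside the HTTP port is an assumption based on the --http-port 3000 and --ws-port 3001 defaults in the test output. The snippet builds and prints the command rather than running it.

```shell
IMAGE="ruvnet/wifi-densepose:latest"
HTTP_PORT=3000
WS_PORT=3001

# Bind both published ports to 127.0.0.1 on the host, so the unauthenticated
# API is reachable from the host itself but not from other machines.
set -- docker run \
    -p "127.0.0.1:${HTTP_PORT}:${HTTP_PORT}" \
    -p "127.0.0.1:${WS_PORT}:${WS_PORT}" \
    "$IMAGE"

printf '%s\n' "$*"   # inspect the command; run it with: "$@"
```

The server still binds 0.0.0.0 inside the container; the restriction is applied by Docker's port publishing on the host side.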

Optional follow-ups:

  1. Add RUN chmod +x /app/docker-entrypoint.sh to the Dockerfile (Windows-checkout resilience if core.filemode=false)
  2. Log resolved MODELS_DIR on startup via info!
  3. Allowlist CSI_SOURCE values in the entrypoint (auto|esp32|wifi|simulated) to fail fast on typos
  4. Fix shellcheck SC2064 in the test script (trap 'rm -rf "$TMPDIR"' EXIT)
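
Follow-up 3 could look roughly like the sketch below. The function name validate_csi_source is hypothetical, and the allowed values are the four listed in the follow-up, not a list confirmed against the server binary.

```shell
# Fail fast on a mistyped CSI_SOURCE before the server is started.
validate_csi_source() {
    case "$1" in
        auto|esp32|wifi|simulated)
            return 0
            ;;
        *)
            echo "unsupported CSI_SOURCE '$1' (expected auto, esp32, wifi, or simulated)" >&2
            return 1
            ;;
    esac
}

# Example check, using the same default as the entrypoint when the variable is unset.
if validate_csi_source "${CSI_SOURCE:-auto}"; then
    echo "CSI_SOURCE ok: ${CSI_SOURCE:-auto}"
fi
```

In an entrypoint the failing branch would typically exit with a non-zero status so that the container stops immediately rather than starting with an unknown source.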

This PR is a superset of #407 — see my comment there for why #407 alone is insufficient.

ruvnet merged commit b816292 into ruvnet:main on Apr 20, 2026
proffesor-for-testing added a commit to proffesor-for-testing/RuView that referenced this pull request Apr 21, 2026
Upstream moved forward with v0.6.2-esp32 (ADR-081 adaptive CSI mesh kernel,
Timer Svc stack fix) and the Docker entrypoint merge of PR ruvnet#402.

Conflicts resolved:

- `firmware/esp32-csi-node/sdkconfig.defaults`: both sides appended a new
  config block. Kept both — `CONFIG_ESP_WIFI_EXTRA_IRAM_OPT=y` (ours,
  defense-in-depth for RuView#396 SPI cache race) AND
  `CONFIG_FREERTOS_TIMER_TASK_STACK_DEPTH=8192` (upstream's ADR-081 Timer
  Svc stack bump). They target different crash modes.

- Applied the same `CONFIG_ESP_WIFI_EXTRA_IRAM_OPT=y` line to
  `sdkconfig.defaults.4mb` and `sdkconfig.defaults.template` for
  consistency — the SPI cache race is not flash-size specific, and the
  4MB / template variants run the same CSI collector code with the same
  MGMT-only callback path.

- `firmware/esp32-csi-node/version.txt`: both sides bumped to 0.6.2.
  Upstream already released v0.6.2-esp32, so our in-flight work bumps
  to 0.6.3.

- `CHANGELOG.md`: auto-merge placed our [Unreleased] ruvnet#397 entries
  (and the older ruvnet#391 / ruvnet#390 entries) inside the newly-cut
  [v0.6.2-esp32] section. Moved them back to [Unreleased] — they
  describe work that has not been released yet.

Auto-merged cleanly: `csi_collector.c`, `csi_collector.h`, `main.c`,
`docker/*`, `README.md`, `docs/user-guide.md`. Verified the PR's
defensive-copy code (`s_node_id_early_set`, `s_filter_mac`,
`CSI_MIN_PROCESS_INTERVAL_US`, `s_early_drop`, the 50 Hz rate gate,
MGMT-only filter, and `csi_collector_set_node_id()` API) is still
present, and that the dropped probe-injection symbols stay absent
(grep confirms 0 / 27 hits).

Validation in this devcontainer:

- ADR-081 host tests built and ran from `firmware/esp32-csi-node/tests/host/`:
  `test_adaptive_controller` 18/18 pass, `test_rv_feature_state` 15/15
  pass, `test_rv_mesh` 27/27 pass — 60/60 total. These exercise the
  merged-in pure-C logic that this PR has no changes against, so
  they're a regression check that the merge didn't corrupt the
  upstream modules.
- `edge_processing.c` still has `const float sample_rate = 10.0f;`.
- Brace balance and dangling-ref checks on `csi_collector.c` pass.

ESP-IDF firmware build, flash, and miniterm soak still deferred to
@ruvnet's COM7 per the original review comment.

Co-Authored-By: claude-flow <ruv@ruv.net>
Development

Successfully merging this pull request may close these issues.

can't start docker container on windows - wifi: 1: --source: not found