feat(phase-13-wave-6): HID PID FFB passthrough — closes ADR-015 Stage 3 path (issue #82) #166

Merged
BANANASJIM merged 4 commits into main from spec/phase-13-wave-6-pidff
Apr 30, 2026

BANANASJIM (Owner) commented Apr 26, 2026

Summary

Implements Phase 13 Wave 6: HID PID force-feedback passthrough through padctl's UHID virtual device path. Builds on Wave 3 (UHID routing) and Wave 5 (canary), and closes the ADR-015 Stage 3 architectural path. Targets racing-wheel FFB support via the kernel's hid-universal-pidff modalias binding.

Architecture

The kernel's hid-universal-pidff driver binds to padctl's UHID device (when clone_vid_pid advertises a wheel-class VID/PID) and emits UHID_OUTPUT events carrying FFB command bytes. padctl forwards those bytes unmodified to the physical wheel's hidraw fd. There is no FFB remapping or interpretation: this is pure passthrough, per the ADR-015 Option (a) constraint.

kernel hid-universal-pidff -> UHID_OUTPUT (FFB cmd bytes)
   -> padctl event_loop drain
   -> FfbForwarder.forward(report)
   -> posix.write(physical_hidraw_fd, report.data)

Tasks landed

  • T1 (commit `d4c5f46` + fix `e060538`): `buildForPid` PID HID descriptor builder. Emits 12 reports total; 8 of them are kernel-mandatory per `drivers/hid/usbhid/hid-pidff.c` `PID_REQUIRED_REPORTS = 8` (`{0x21, 0x77, 0x7d, 0x7f, 0x89, 0x90, 0x96, 0xab}`). Validator catches truncated/incomplete descriptors at build time before `UHID_CREATE2`.
  • T5 (commit `1ef2239`): full `[output.force_feedback]` schema (`backend`, `kind`, `clone_vid_pid`, plus legacy `type`/`max_effects`/`auto_stop` merged) + 6-row validate matrix. 11 tests.
  • T2 (commit `4645675` + fix in `70770eb`): `clone_vid_pid` wiring through `device_instance.zig` (UHID Config vendor/product). Default `false` restores daemon identity `0xFADE:0xC001` for non-PID devices. Per-device `KERNELS=="uhid", SUBSYSTEM=="input", ATTRS{id/vendor}/{id/product}, TAG+="uaccess"` udev rule for cloned VID/PID. 6 tests.
  • T3 (commit `0698fb1` + fix in `70770eb`): `UHID_OUTPUT = 6` UAPI + `UhidOutputReq` extern struct + `pollOutputReport` parse helper + `setOutputCallback` registration on `UhidDevice` + drain loop in `event_loop.zig`. 7 tests.
  • T4 (commit `cf93222` + fix `d08ca0f`): `FfbForwarder` (`src/io/ffb_forwarder.zig`) writes report bytes to physical hidraw fd. EAGAIN → drop+counter; EACCES/ENODEV/EBADF/other unrecognised → `state=.disabled`. Wired in `device_instance.zig` when `backend=uhid + kind=pid`. 5 tests.
  • T6 (commit `4c6d145` + `e7ebdd5`): `src/test/wave6_pidff_e2e_test.zig` — 7 Layer 1 integration tests using pipe2 fixtures (no real `/dev/uhid`). End-to-end UHID_OUTPUT → hidraw forward, descriptor IDs, VID/PID clone passthrough, EAGAIN, closed-fd disable, daemon identity, and TP35 negative gate (no FFB block → no forwarder).
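
The T4 error policy above (EAGAIN drops the report and counts it; any other write failure disables the forwarder) can be sketched as a small state machine. This is a minimal illustration, not the code in `src/io/ffb_forwarder.zig`: `ForwarderSketch`, `WriteError`, and the injected `WriteFn` are hypothetical names, and the hidraw write is replaced by a function pointer so the sketch runs without a device.

```zig
const std = @import("std");

/// Hypothetical stand-in for the errors a hidraw write can surface.
const WriteError = error{ WouldBlock, AccessDenied, NoDevice, BadFd, Unexpected };

const State = enum { active, disabled };

/// Injected writer; the real forwarder writes to the physical hidraw fd.
const WriteFn = *const fn (data: []const u8) WriteError!usize;

const ForwarderSketch = struct {
    state: State = .active,
    writes_total: u64 = 0,
    drops_eagain: u64 = 0,

    fn forward(self: *ForwarderSketch, data: []const u8, write: WriteFn) void {
        if (self.state == .disabled) return; // short-circuit, no write attempted
        _ = write(data) catch |err| switch (err) {
            // Transient back-pressure (EAGAIN): drop this report, stay active.
            error.WouldBlock => {
                self.drops_eagain += 1;
                return;
            },
            // EACCES / ENODEV / EBADF / anything unrecognised: treat as permanent.
            else => {
                self.state = .disabled;
                return;
            },
        };
        self.writes_total += 1;
    }
};

fn writeOk(data: []const u8) WriteError!usize {
    return data.len;
}
fn writeEagain(_: []const u8) WriteError!usize {
    return error.WouldBlock;
}
fn writeEnodev(_: []const u8) WriteError!usize {
    return error.NoDevice;
}

test "EAGAIN drops the report, ENODEV disables the forwarder" {
    var fwd = ForwarderSketch{};
    const report = [_]u8{ 0x01, 0xAA };
    fwd.forward(&report, &writeOk);
    fwd.forward(&report, &writeEagain);
    try std.testing.expectEqual(State.active, fwd.state);
    fwd.forward(&report, &writeEnodev);
    fwd.forward(&report, &writeOk); // ignored: forwarder is disabled
    try std.testing.expectEqual(State.disabled, fwd.state);
    try std.testing.expectEqual(@as(u64, 1), fwd.writes_total);
    try std.testing.expectEqual(@as(u64, 1), fwd.drops_eagain);
}
```

Injecting the writer keeps the policy testable without `/dev/uhid` or hidraw, in the same spirit as the pipe2 fixtures used by T6.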

Probe-derived empirical foundation

Wave 6 probe (separate research at `research/wave6-pidff-probe`, parallel PR #161) ran on kernel 6.18.9 with two virtual-UHID variants:

| Run | VID:PID | bustype acceptance | hid-universal-pidff bind | FFB init |
|---|---|---|---|---|
| 1 | 11FF:F045 (no alias) | accepted | no (alias missing) | hid-generic, no FFB |
| 2 | 11FF:1211 (alias) | accepted | yes (modalias match) | failed (incomplete 7-of-12 desc), kernel OOPS |

Probe Run 2's incomplete descriptor caused `pidff_find_reports -ENODEV` and a kernel NULL-deref. T1's descriptor includes all 8 mandatory reports per kernel mainline to avoid this.

Test plan

  • T1-T5 unit tests pass at compile time (Layer 0)
  • T6 e2e tests pass (Layer 1, pipe2 fixtures)
  • No CJK characters in code/commits/diff
  • Spec branch compiles cleanly (`zig build`)
  • CI runs full test suite (this PR)
  • T7 — real-hardware matrix (USER ACTION REQUIRED before merge): validate FFB on Logitech G29 (kernel 5.15 + hid-lg4ff path), Moza R5/R9 (kernel 6.x + hid-universal-pidff path), Thrustmaster T300. AC8 sub-criteria 1-6 per `openspec/changes/phase-13-wave-6-pidff/test-plan.md`. Until T7 lands a green real-hardware run, this PR is integration-complete but not ADR-015-Stage-3-accepted.

Known limitations / non-goals

  • `clone_vid_pid=false` (default) is a behavior change from Wave 3 (which unconditionally cloned device VID/PID). Wave 5 device TOMLs that need to keep their real VID/PID must add `[output.force_feedback].clone_vid_pid = true` explicitly.
  • `FfbForwarder` uses `devices[0].pollfd().fd` for the physical hidraw fd. For multi-interface devices (vader5 IF1 vs IF0, etc.), the interface index is not yet TOML-controlled. T7 may surface real-hardware cases requiring an explicit `interface_id` field.
  • Vendor-specific PID drivers (`hid-lg4ff`, `hid-tmff`) are not in this PR's scope; only `hid-universal-pidff` default path is exercised. Vendor matrix is a Wave 6 follow-up.
  • FFB rate limit is currently passive (counter only). Per spec R4 risk, if real hardware saturates UHID_OUTPUT rate, a follow-up may need rate ceiling enforcement.

References

Summary by CodeRabbit

  • New Features

    • HID PID force-feedback passthrough with configurable backend/kind and optional vendor/product cloning; per-device opt-in rule emission for cloned VID/PID.
  • Tests

    • New unit and Linux-only e2e suites validating descriptor generation, UHID output handling, byte-faithful forwarding, EAGAIN behavior, and clone_vid_pid semantics.
  • Documentation

    • Added brief, design, tasks, and test-plan detailing Wave 6 execution and verification.
  • Chores

    • Clarified build/check-all documentation and CI invocation semantics.

coderabbitai Bot commented Apr 26, 2026

Walkthrough

Adds Wave 6 UHID PID force‑feedback passthrough: UHID OUTPUT parsing, PID descriptor build/validation, FfbForwarder to forward reports to a physical hidraw FD, event-loop/UHID wiring, clone_vid_pid config/udev changes, exported test modules, docs, and CI step adjustments.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| Documentation & Planning | openspec/changes/phase-13-wave-6-pidff/brief.md, .../design.md, .../tasks.md, .../test-plan.md | New wave‑6 design, task DAG, test plan, and briefs describing feature scope, requirements, tests, and CI probe workflow. |
| Build / CI | build.zig, .github/workflows/ci.yml | Docs comment update for check-all and CI split: replaces a single check-all invocation with three sequential zig build runs (test, test-safe, check-fmt) to avoid Zig cache/flock contention. |
| Configuration Schema | src/config/device.zig | Adds ForceFeedbackConfig (backend/kind/clone_vid_pid), replaces legacy FfConfig usage, and adds validation rules and tests enforcing backend↔kind legality and VID/PID constraints. |
| CLI / Udev rules | src/cli/install.zig | Adds clone_vid_pid to UdevEntry, merges flag via boolean OR during dedupe, reads flag from [output.force_feedback], and emits per-VID/PID KERNELS=="uhid" udev rules with TAG+="uaccess" when true; adds tests. |
| UHID core & parsing | src/io/uhid.zig, src/io/uhid_descriptor.zig | Adds UHID_OUTPUT constant, UhidOutputReq/OutputReport types, callback registration/clearing, pollOutputReport, PID-specific buildForPid, PID report-ID constants, and mandatory‑report validation with an IncompletePidDescriptor error. |
| Forwarder implementation | src/io/ffb_forwarder.zig | New FfbForwarder module that forwards OutputReport bytes to a physical hidraw FD, tracks writes_total/drops_eagain, handles EAGAIN vs permanent errors, and exposes a trampoline callback; includes unit tests. |
| Event loop & integration | src/event_loop.zig, src/device_instance.zig, src/main.zig | Event loop gains UHID output slot, addUhidOutput, and uhid_primary wiring; DeviceInstance integrates FfbForwarder, builds PID descriptors when needed, honors clone_vid_pid, exposes test init option, and main.zig exports new modules/tests. |
| Tests | src/test/uhid_output_dispatch_test.zig, src/test/wave6_pidff_e2e_test.zig, src/test/supervisor_uhid_routing_test.zig | New Linux-only unit and e2e tests covering UHID output dispatch, byte‑faithful forwarding, descriptor completeness, clone_vid_pid behavior, EAGAIN handling, and forwarder disabled transitions. |
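
The boolean-OR merge of `clone_vid_pid` during dedupe, mentioned for `src/cli/install.zig` above, might look roughly like the following. This is a sketch under assumptions: the reduced `UdevEntry` struct and the `dedupe` helper are hypothetical, and OR-merging `block_kernel_drivers` the same way is an assumption rather than something this page confirms.

```zig
const std = @import("std");

/// Reduced, hypothetical version of the PR's UdevEntry.
const UdevEntry = struct {
    vid: u16,
    pid: u16,
    block_kernel_drivers: bool = false,
    clone_vid_pid: bool = false,
};

/// Collapse duplicate VID/PID entries in place and return the deduplicated prefix.
/// Boolean flags are additive: if any duplicate sets a flag, the survivor keeps it.
fn dedupe(entries: []UdevEntry) []UdevEntry {
    var kept: usize = 0;
    outer: for (entries) |e| {
        for (entries[0..kept]) |*k| {
            if (k.vid == e.vid and k.pid == e.pid) {
                k.block_kernel_drivers = k.block_kernel_drivers or e.block_kernel_drivers;
                k.clone_vid_pid = k.clone_vid_pid or e.clone_vid_pid;
                continue :outer;
            }
        }
        entries[kept] = e;
        kept += 1;
    }
    return entries[0..kept];
}

test "a later clone_vid_pid=true override survives dedupe" {
    var entries = [_]UdevEntry{
        .{ .vid = 0x11FF, .pid = 0x1211 },
        .{ .vid = 0x11FF, .pid = 0xF045 },
        .{ .vid = 0x11FF, .pid = 0x1211, .clone_vid_pid = true },
    };
    const out = dedupe(&entries);
    try std.testing.expectEqual(@as(usize, 2), out.len);
    try std.testing.expect(out[0].clone_vid_pid);
    try std.testing.expect(!out[1].clone_vid_pid);
}
```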

Sequence Diagram(s)

sequenceDiagram
    participant EventLoop
    participant UhidDevice as UHID Device
    participant FfbForwarder
    participant Hidraw as Physical Hidraw

    EventLoop->>EventLoop: poll UHID output slot
    EventLoop->>UhidDevice: pollOutputReport()
    UhidDevice-->>EventLoop: OutputReport(report_id, data)
    EventLoop->>FfbForwarder: forwarderCallback(report)
    FfbForwarder->>FfbForwarder: check state (active/disabled)
    alt active
        FfbForwarder->>Hidraw: write(report bytes)
        alt write success
            FfbForwarder->>FfbForwarder: increment writes_total
        else EAGAIN
            FfbForwarder->>FfbForwarder: increment drops_eagain (stay active)
        else EACCES/ENODEV/EBADF
            FfbForwarder->>FfbForwarder: log & transition to disabled
        end
    else disabled
        FfbForwarder->>FfbForwarder: short-circuit (no write)
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues

  • Issue 82 — Implements the same UHID-based PID FFB passthrough features (descriptor building, output forwarding, clone_vid_pid) described in this PR.

Possibly related PRs

  • PR 127 — Prior UHID device and UAPI bindings; this change extends UHID with OUTPUT parsing and callbacks.
  • PR 132 — Related UHID descriptor work; this PR adds PID-specific build/validation on top of descriptor foundations.
  • PR 69 — Earlier event-loop FD/slot changes; this PR further expands the event loop with a UHID output slot and MAX_FDS bump.

Poem

🐇 Hopping through the event loop's light,

UHID whispers, reports take flight,
Descriptors stitched, bytes pass through,
Forwarder writes, counters accrue,
Wave Six hums — a rumbling night.

🚥 Pre-merge checks | ✅ 5 passed

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately captures the main feature: HID PID FFB passthrough implementation that closes ADR-015 Stage 3 path. It is specific, concise, and reflects the primary change throughout the changeset. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 6

♻️ Duplicate comments (1)
openspec/changes/phase-13-wave-6-pidff/test-plan.md (1)

34-41: ⚠️ Potential issue | 🟡 Minor

Same mandatory-set drift as tasks.md — TP3 and TP5 reference the wrong required set.

TP3 ("11-of-12 descriptor") and TP5 ({1,2,3,4,5,6,10,11,12,13,14,15}) inherit the same stale assumption flagged on tasks.md §T1c: the implementation validates 8 kernel-mandatory IDs ({1, 7, 10, 11, 12, 13, 14, 15}), and IDs 2–6 are emitted but optional. As-written, TP3's "strip one report ID" strategy can yield false negatives — stripping any of IDs 2-6 will NOT trigger error.IncompletePidDescriptor. Suggest adjusting these to match the actual 8-ID set, or strip specifically ID 7 / 10 / 11 / etc. for TP3.

No separate action needed beyond the fix on tasks.md; calling it out here so the test-plan stays in sync.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@openspec/changes/phase-13-wave-6-pidff/test-plan.md` around lines 34 - 41,
TP3 and TP5 in the test-plan reference the wrong mandatory Report ID set; update
them to match the actual kernel-mandatory IDs from tasks.md §T1c ({1, 7, 10, 11,
12, 13, 14, 15}): change TP3 so the "strip one report ID" case removes one of
those kernel-mandatory IDs (e.g., ID 7) and assert that validateMandatoryReports
returns error.IncompletePidDescriptor, and change TP5’s byte-scan assertion to
require presence of each NN ∈ {1, 7, 10, 11, 12, 13, 14, 15} instead of the
stale set.
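
The TP3 false negative described in this comment can be reproduced with a small validator sketch. It assumes the 8-ID mandatory set quoted above, recognises Report ID items by the HID short-item prefix `0x85`, and handles short items only. `validateMandatorySketch` and `fixture` are hypothetical names, and the fixture's ID list is illustrative rather than the exact set `buildForPid` emits.

```zig
const std = @import("std");

/// The 8 kernel-mandatory report IDs quoted in the review comment.
const MANDATORY_IDS = [_]u8{ 1, 7, 10, 11, 12, 13, 14, 15 };

/// HID short item: Global type, tag "Report ID", one data byte.
const REPORT_ID_PREFIX: u8 = 0x85;

/// Assumption: the descriptor contains short items only (no 0xFE long items).
fn validateMandatorySketch(desc: []const u8) error{IncompletePidDescriptor}!void {
    var seen = [_]bool{false} ** 256;
    var i: usize = 0;
    while (i < desc.len) {
        const prefix = desc[i];
        // Size code in the low two bits: 0, 1, 2 data bytes, or 4 for code 3.
        const data_len: usize = switch (prefix & 0x03) {
            0 => 0,
            1 => 1,
            2 => 2,
            else => 4,
        };
        if (prefix == REPORT_ID_PREFIX and i + 1 < desc.len) seen[desc[i + 1]] = true;
        i += 1 + data_len;
    }
    for (MANDATORY_IDS) |id| {
        if (!seen[id]) return error.IncompletePidDescriptor;
    }
}

/// Test fixture: emits only `0x85, id` pairs for each ID except `skip`.
fn fixture(buf: []u8, ids: []const u8, skip: u8) []const u8 {
    var n: usize = 0;
    for (ids) |id| {
        if (id == skip) continue;
        buf[n] = REPORT_ID_PREFIX;
        buf[n + 1] = id;
        n += 2;
    }
    return buf[0..n];
}

test "stripping optional ID 3 passes, stripping mandatory ID 7 fails" {
    const ids = [_]u8{ 1, 2, 3, 4, 5, 6, 7, 10, 11, 12, 13, 14, 15 };
    var buf: [32]u8 = undefined;
    try validateMandatorySketch(fixture(&buf, &ids, 3));
    try std.testing.expectError(
        error.IncompletePidDescriptor,
        validateMandatorySketch(fixture(&buf, &ids, 7)),
    );
}
```

A negative test that strips an arbitrary report ID therefore only exercises the validator when the stripped ID is one of the mandatory eight.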
🧹 Nitpick comments (1)
src/io/uhid_descriptor.zig (1)

508-559: buildForPid: clean, kernel-aligned descriptor with end-of-build validation. LGTM.

The end-of-build validateMandatoryReports(buf.items) call (line 557) is the right gate — it fail-closes with error.IncompletePidDescriptor before any malformed bytes reach UHID_CREATE2, which is the exact regression class probe Run 2 hit (kernel OOPS on a 7-of-12 descriptor). The deferred cfg / ffb_cfg parameters are documented at lines 502-507 as intentional shape preservation for the call sites in src/device_instance.zig:266, so the unused-binding _ = cfg; _ = ffb_cfg; reads cleanly rather than as dead code.

One tiny nit (purely optional): hidItemSize at lines 2031-2041 — used by tests at lines 1787, 1835, and 2074 — doesn't special-case the long-form prefix 0xFE the way validateMandatoryReports (lines 993-999) does. Today no PID emitter uses long-form items so this never trips, but if anyone reuses hidItemSize against a third-party descriptor it would silently misread long-form items as size=2.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/io/uhid_descriptor.zig` around lines 508 - 559, hidItemSize currently
doesn't handle the HID long-item prefix 0xFE, so it can miscompute lengths for
long-form items; update hidItemSize to detect a prefix byte 0xFE and, after
bounds-checking the buffer, read the following length byte and return the full
long-item length (header bytes + payload), mirroring how
validateMandatoryReports special-cases long-form items; ensure you handle buffer
overrun safety and update/keep the existing tests (they reference hidItemSize)
accordingly.
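
One way the long-item handling suggested for `hidItemSize` could look is sketched below. The layout (prefix `0xFE`, then a data-size byte, then a long-item tag byte, then the payload) follows the HID specification's long-item format; the name `hidItemSizeSketch` and the choice to return `null` on a truncated buffer are assumptions, not the PR's actual signature.

```zig
const std = @import("std");

const LONG_ITEM_PREFIX: u8 = 0xFE;

/// Total encoded size in bytes of the HID item starting at `buf[0]`,
/// or null if the buffer is too short to hold the whole item.
fn hidItemSizeSketch(buf: []const u8) ?usize {
    if (buf.len == 0) return null;
    const total: usize = if (buf[0] == LONG_ITEM_PREFIX) blk: {
        // Long item: prefix, bDataSize, bLongItemTag, then bDataSize payload bytes.
        if (buf.len < 3) return null;
        break :blk 3 + @as(usize, buf[1]);
    } else switch (buf[0] & 0x03) {
        0 => 1,
        1 => 2,
        2 => 3,
        else => 5, // size code 3 means 4 data bytes
    };
    return if (total <= buf.len) total else null;
}

test "short and long items" {
    // Report ID item: prefix plus one data byte.
    try std.testing.expectEqual(@as(?usize, 2), hidItemSizeSketch(&[_]u8{ 0x85, 0x01 }));
    // Long item with 2 payload bytes: a short-item-only parser would see 0xFE & 3 == 2.
    try std.testing.expectEqual(
        @as(?usize, 5),
        hidItemSizeSketch(&[_]u8{ 0xFE, 0x02, 0x10, 0xAA, 0xBB }),
    );
    // Truncated long item: payload missing.
    try std.testing.expect(hidItemSizeSketch(&[_]u8{ 0xFE, 0x02, 0x10 }) == null);
}
```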
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@openspec/changes/phase-13-wave-6-pidff/design.md`:
- Around line 36-43: The spec and validator should require only the 8
kernel-required PID report IDs (not all 12 emitted reports) when kind == "pid";
update the design text and any validator references so pidff_find_reports() and
downstream checks (evdev_open / hid_hw_open) consider those 8 IDs mandatory and
only return -ENODEV if any of those 8 are missing, while allowing the other four
reports to be optional, and adjust the described acceptance criteria and tests
to match this 8-of-8 requirement (also update wording in the sections referenced
around lines 271-286 to reflect the change).

In `@openspec/changes/phase-13-wave-6-pidff/tasks.md`:
- Around line 102-156: Update the spec to match the implementation: change the
“12 PID mandatory reports” wording to reflect the kernel-mandatory count of 8,
add Block Free Report (ID 7 / usage 0x90) to the T1b checklist and remove IDs
2–6 from the mandatory list, update the T1c pseudocode REQUIRED set to
[_]u8{1,7,10,11,12,13,14,15}, and change any “8-of-12” / 12-report wording in
T1d to the corrected 8-report phrasing so the document aligns with the code
constant PID_MANDATORY_REPORT_IDS and the validateMandatoryReports / buildForPid
validation behavior.

In `@src/cli/install.zig`:
- Around line 1859-1860: The dedupe logic for UdevEntry is currently only
preserving block_kernel_drivers; it must also treat clone_vid_pid as additive so
a later override with clone_vid_pid=true isn't lost. Update the merge/dedup code
(the routine invoked by collectAllDeviceEntries() that collapses duplicate
VID/PID entries) to set the resulting UdevEntry.clone_vid_pid = true if any of
the merged entries have clone_vid_pid == true (i.e., OR the flag across
entries), while still preserving block_kernel_drivers as before.

In `@src/device_instance.zig`:
- Around line 331-340: The code currently guesses the physical hidraw write fd
using devices[0].pollfd().fd when out_cfg.force_feedback -> pid_ffb expects a
supervisor-owned physical writer, which can be wrong for multi-interface
devices; change the logic in the FFB setup (where pid_ffb,
opts.test_physical_hidraw_fd, devices, FfbForwarder.init, primary_uhid.fd, and
loop.addUhidOutput are referenced) to require an explicit physical hidraw fd:
prefer opts.test_physical_hidraw_fd, then a configured physical writer field on
pid_ffb (add one if needed), and only as a last resort attempt to locate a
matching device by inspecting device descriptors/paths (not devices[0]) to find
a supervisor-owned hidraw writer; remove or disable the naive devices[0]
fallback so PID data is always forwarded to the correct physical hidraw fd
before calling FfbForwarder.init and loop.addUhidOutput.

In `@src/io/uhid.zig`:
- Around line 459-476: The code in pollOutputReport reads into buf and then
directly reads UhidOutputReq.size and slices req.data without ensuring the read
returned enough bytes for those fields; to fix, after posix.read and after
checking n >= 4 compute payload_len = n - 4 and if payload_len < the offset of
UhidOutputReq.size + 2 (i.e. the bytes needed to safely read the u16 size)
return null; then compute sz = `@min`(payload_len, `@as`(usize, req.size),
UHID_DATA_MAX) and use buf[4..4+sz] (or req.data[0..sz]) when constructing
OutputReport so you never index uninitialized bytes from UhidOutputReq.size or
req.data; reference symbols: pollOutputReport, posix.read, UhidOutputReq,
UhidOutputReq.size, UHID_EVENT_SIZE, UHID_DATA_MAX, OutputReport.
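
The short-read guard requested for `pollOutputReport` can be illustrated on a plain byte slice. This sketch assumes the packed `uhid_event` layout from `linux/uhid.h` (a `u32` type, then `data[4096]`, then a `u16` size) and a little-endian host; `parseOutputSketch` and `SIZE_OFFSET` are hypothetical names, and the real function reads from the UHID fd rather than taking a slice.

```zig
const std = @import("std");

const UHID_OUTPUT: u32 = 6;
const UHID_DATA_MAX: usize = 4096;
/// Offset of the u16 `size` field: u32 type, then data[UHID_DATA_MAX].
const SIZE_OFFSET: usize = 4 + UHID_DATA_MAX;

/// `ev` is the slice of bytes actually returned by read(); it may be short.
/// Returns the report payload, or null if the event is short or not UHID_OUTPUT.
fn parseOutputSketch(ev: []const u8) ?[]const u8 {
    // Refuse to touch `size` unless the read covered it (assumes little-endian host).
    if (ev.len < SIZE_OFFSET + 2) return null;
    const ev_type = @as(u32, ev[0]) | (@as(u32, ev[1]) << 8) |
        (@as(u32, ev[2]) << 16) | (@as(u32, ev[3]) << 24);
    if (ev_type != UHID_OUTPUT) return null;
    const declared = @as(usize, ev[SIZE_OFFSET]) | (@as(usize, ev[SIZE_OFFSET + 1]) << 8);
    // Clamp so a bogus size can never index past the data array.
    const len = @min(declared, UHID_DATA_MAX);
    return ev[4 .. 4 + len];
}

test "full event parses, short read is rejected" {
    var ev = [_]u8{0} ** (SIZE_OFFSET + 3);
    ev[0] = 6; // UHID_OUTPUT
    ev[4] = 0x01;
    ev[5] = 0x7F;
    ev[SIZE_OFFSET] = 2; // size = 2
    const data = parseOutputSketch(&ev).?;
    try std.testing.expectEqualSlices(u8, &[_]u8{ 0x01, 0x7F }, data);
    try std.testing.expect(parseOutputSketch(ev[0..16]) == null);
}
```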

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0c953343-73ac-48ff-a2b8-9981a801fc2f

📥 Commits

Reviewing files that changed from the base of the PR and between d16471e and e7ebdd5.

📒 Files selected for processing (16)
  • build.zig
  • openspec/changes/phase-13-wave-6-pidff/brief.md
  • openspec/changes/phase-13-wave-6-pidff/design.md
  • openspec/changes/phase-13-wave-6-pidff/tasks.md
  • openspec/changes/phase-13-wave-6-pidff/test-plan.md
  • src/cli/install.zig
  • src/config/device.zig
  • src/device_instance.zig
  • src/event_loop.zig
  • src/io/ffb_forwarder.zig
  • src/io/uhid.zig
  • src/io/uhid_descriptor.zig
  • src/main.zig
  • src/test/supervisor_uhid_routing_test.zig
  • src/test/uhid_output_dispatch_test.zig
  • src/test/wave6_pidff_e2e_test.zig

Comment thread openspec/changes/phase-13-wave-6-pidff/design.md Outdated
Comment thread openspec/changes/phase-13-wave-6-pidff/tasks.md Outdated
Comment thread src/cli/install.zig Outdated
Comment thread src/device_instance.zig Outdated
Comment thread src/device_instance.zig
Comment thread src/io/uhid.zig
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

♻️ Duplicate comments (2)
src/device_instance.zig (1)

351-364: ⚠️ Potential issue | 🟠 Major

Don't bind the forwarder to devices[0]'s current fd.

This captures an init-time interface fd as the FFB target. Besides picking the wrong interface on multi-interface wheels, it also goes stale after closeDeviceIO() / rebindDeviceIO(): self.devices gets replaced, but self.ffb_forwarder.physical_fd still points at the closed original fd, so post-reconnect PID writes will fail or disable forwarding. Please source a stable physical hidraw writer here, or refresh the forwarder whenever the device fds are rebound.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/device_instance.zig` around lines 351 - 364, The forwarder is being
initialized with a transient fd from devices[0].pollfd().fd (phys_fd) which can
be wrong for multi-interface wheels and goes stale after
closeDeviceIO()/rebindDeviceIO(); instead either (A) derive and hold a stable
physical writer (e.g., open the hidraw device path once and use that fd or a
small wrapper that re-opens on demand) and pass that stable writer to
FfbForwarder.init, or (B) keep FfbForwarder.init lightweight and
update/rehydrate it whenever device IO is rebound by calling a refresh method
from rebindDeviceIO()/closeDeviceIO() to replace ffb_forwarder.physical_fd;
change the initialization in device_instance where phys_fd is chosen (currently
using devices[0] and opts.test_physical_hidraw_fd) and add a refresh hook in
closeDeviceIO/rebindDeviceIO to ensure ffb_fwd/ffb_forwarder never holds a
closed/stale fd.
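
Option (B) from this comment, a refresh hook called on rebind, could be sketched as follows. All names here (`ForwarderSketch.rebind`, `InstanceSketch`, the single `device_fd` field) are hypothetical simplifications, and re-arming a disabled forwarder on rebind is a design assumption rather than something the review prescribes.

```zig
const std = @import("std");

const Fd = i32; // stand-in for the platform fd type

const ForwarderSketch = struct {
    physical_fd: Fd,
    disabled: bool = false,

    /// Hypothetical refresh hook: point at the new fd and re-arm forwarding.
    fn rebind(self: *ForwarderSketch, new_fd: Fd) void {
        self.physical_fd = new_fd;
        self.disabled = false;
    }
};

const InstanceSketch = struct {
    device_fd: Fd,
    ffb: ?ForwarderSketch = null,

    /// Whenever device IO is replaced, the forwarder must follow, otherwise
    /// it keeps writing to the closed original fd.
    fn rebindDeviceIO(self: *InstanceSketch, new_fd: Fd) void {
        self.device_fd = new_fd;
        if (self.ffb) |*fwd| fwd.rebind(new_fd);
    }
};

test "forwarder follows the rebound fd" {
    var inst = InstanceSketch{ .device_fd = 5, .ffb = ForwarderSketch{ .physical_fd = 5 } };
    inst.ffb.?.disabled = true; // e.g. after EBADF on the stale fd
    inst.rebindDeviceIO(9);
    try std.testing.expectEqual(@as(Fd, 9), inst.ffb.?.physical_fd);
    try std.testing.expect(!inst.ffb.?.disabled);
}
```

This addresses only the stale-fd half of the finding; choosing the correct interface on multi-interface wheels still needs an explicit source for the physical fd.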
openspec/changes/phase-13-wave-6-pidff/tasks.md (1)

576-580: ⚠️ Potential issue | 🟡 Minor

Use a 7-of-8 mandatory descriptor for the negative case.

This spec now defines the required set as {1, 7, 10, 11, 12, 13, 14, 15} in validateMandatoryReports, so an “11-of-12” fixture can drop an optional report and still pass. Update this to “omit one mandatory report / 7-of-8” so the test contract matches the validator.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@openspec/changes/phase-13-wave-6-pidff/tasks.md` around lines 576 - 580, The
negative test "wave6: incomplete PID descriptor → error.IncompletePidDescriptor"
currently constructs an 11-of-12 fixture which can still satisfy
validateMandatoryReports; change the fixture to omit one of the mandatory
reports so it is a 7-of-8 mandatory case: update the helper (e.g.,
buildForPidIncomplete) or the direct call site to produce a byte sequence
missing one element from the mandatory set {1,7,10,11,12,13,14,15} and assert
error.IncompletePidDescriptor when validateMandatoryReports is invoked; ensure
the test explicitly documents which mandatory ID was removed.
🧹 Nitpick comments (1)
.github/workflows/ci.yml (1)

39-42: LGTM — sequential check target avoids the documented build-cache deadlock.

Switching both the matrix and distro-check jobs to check-all-serial matches the new ordered step chain in build.zig (test → safe → fmt) and the inline comment clearly captures the rationale for future maintainers. One minor consideration worth keeping in mind: serializing the three checks will increase wall-clock time for every matrix cell, so once the Zig incremental-cache contention is resolved upstream it would be worth re-evaluating whether check-all can be restored to recover parallelism.

Also applies to: 138-138

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/ci.yml around lines 39 - 42, The workflow was updated to
run "zig build ... check-all-serial" in one place but the same change must be
mirrored for the other job(s); update the other occurrences (the matrix job and
the distro-check job) to use the serial target instead of check-all so both run
lines invoke "check-all-serial" and keep the explanatory comment about avoiding
the Zig build-cache deadlock; verify the run lines referencing matrix.zig_args
and any job names like matrix and distro-check are changed consistently.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@openspec/changes/phase-13-wave-6-pidff/design.md`:
- Line 231: Update the sentence that reads "the kernel needs a HID Physical
Interface Device collection with all 12" to distinguish between mandatory vs
total report counts: explicitly state how many total report IDs buildForPid
emits (the descriptor table lists 13 IDs: 1,2,3,4,5,6,7,10,11,12,13,14,15) and
call out that only 8 of those are kernel-mandatory while 5 are optional (match
the earlier mentions of 8 mandatory reports). Ensure the wording removes "all 12
mandatory" and instead specifies "X total reports (13), of which 8 are
kernel-mandatory and 5 are optional" or the correct totals if buildForPid emits
a different number.

In `@openspec/changes/phase-13-wave-6-pidff/test-plan.md`:
- Around line 23-36: Update the test-plan items TP1, TP2, and TP3 to assert the
8-kernel-mandatory reports instead of 12: change TP1 to call buildForPid(alloc,
cfg_with_one_axis, ffb_cfg{kind=pid}) and assert
validateMandatoryReports(result) succeeds for the 8 specific report IDs
(1,7,10,11,12,13,14,15); change TP2’s golden-reference expectation to be
byte-equal for a descriptor containing those 8 mandatory reports (plus optional
ones as defined) rather than assuming 12; and change TP3 to feed
validateMandatoryReports an artificially stripped descriptor with 7-of-8
mandatory reports and expect error.IncompletePidDescriptor. Ensure references to
validateMandatoryReports, buildForPid, and error.IncompletePidDescriptor are
updated accordingly.
- Around line 37-42: Update test-plan.md to explicitly state which PID report
IDs are present in the normal 12-report descriptor versus the 13-report stress
case: list the 8 kernel-mandatory IDs {1,7,10,11,12,13,14,15} (per
PID_REQUIRED_REPORTS in drivers/hid/usbhid/hid-pidff.c) and then enumerate the
five optional IDs {2,3,4,5,6}, clarifying which one of those optional IDs is
omitted in the normal 12-report run and that TP4 (stress) includes all five
optional IDs for a total of 13 reports; adjust TP1/TP2/TP3/TP31 wording to say
“12 reports (8 mandatory + 4 optional: specify which optional IDs)” and change
TP4 to reference the 13-report stress case (8 mandatory + 5 optional) so the
document and the descriptor table in design.md are consistent.

In `@src/cli/install.zig`:
- Around line 2394-2397: The parser for clone_vid_pid (inside the
isFieldKey(trimmed, "clone_vid_pid") branch) currently trims the RHS into val
and compares it directly to "true", which fails when an inline TOML comment
follows (e.g., `clone_vid_pid = true # opt-in`). Change the logic that computes
val (the trimmed RHS slice) to strip any inline comment: if std.mem.indexOf(u8,
val, "#") returns a position, re-slice val to val[0..hashPos] and trim again
before comparing; then perform std.mem.eql(u8, cleaned_val, "true") to set
clone_vid_pid. Ensure this update is applied where val is created and compared.
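
The inline-comment fix for `clone_vid_pid` amounts to trimming, cutting at the first `#`, and trimming again before the comparison. A minimal sketch, assuming the right-hand side has already been split off at `=`; `parseBoolRhs` is a hypothetical helper name, and cutting at any `#` is only safe because a TOML boolean is never quoted.

```zig
const std = @import("std");

/// Returns true only when the value, minus any inline comment, is exactly "true".
fn parseBoolRhs(rhs: []const u8) bool {
    var val = std.mem.trim(u8, rhs, " \t\r");
    if (std.mem.indexOfScalar(u8, val, '#')) |hash_pos| {
        // Drop the comment, then trim the whitespace that preceded it.
        val = std.mem.trim(u8, val[0..hash_pos], " \t\r");
    }
    return std.mem.eql(u8, val, "true");
}

test "inline comment does not hide the opt-in" {
    try std.testing.expect(parseBoolRhs(" true # opt-in"));
    try std.testing.expect(parseBoolRhs("true"));
    try std.testing.expect(!parseBoolRhs(" false # true"));
}
```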

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f5245d35-44b4-42f1-843b-cf55189911b5

📥 Commits

Reviewing files that changed from the base of the PR and between e7ebdd5 and fc4937b.

📒 Files selected for processing (9)
  • .github/workflows/ci.yml
  • build.zig
  • openspec/changes/phase-13-wave-6-pidff/brief.md
  • openspec/changes/phase-13-wave-6-pidff/design.md
  • openspec/changes/phase-13-wave-6-pidff/tasks.md
  • openspec/changes/phase-13-wave-6-pidff/test-plan.md
  • src/cli/install.zig
  • src/device_instance.zig
  • src/io/uhid.zig
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/io/uhid.zig
  • build.zig

Comment thread openspec/changes/phase-13-wave-6-pidff/design.md Outdated
Comment on lines +23 to +36
- [ ] **TP1**: `buildForPid(alloc, cfg_with_one_axis, ffb_cfg{kind=pid})`
returns a non-empty descriptor; `validateMandatoryReports(result)`
returns success (12 reports present).
- [ ] **TP2**: Output bytes are byte-equal to an inline golden reference
(Moza R5 case, pinned by implementer on first successful real-hardware
run when kernel logs show `hid-universal-pidff` FFB init without
`Error initialising force feedback`). Probe Run 2 evidence covers
bustype acceptance + driver binding ONLY — it does not supply a
working descriptor (probe Run 2's 7-of-12 descriptor caused FFB init
failure + kernel NULL-deref; the implementer must build a complete
12-report descriptor per USB HID PID 1.01 §4.x).
- [ ] **TP3**: `validateMandatoryReports` on a hand-crafted 11-of-12
descriptor (one report ID stripped) returns
`error.IncompletePidDescriptor`.

⚠️ Potential issue | 🟠 Major

Correct test specifications to align with 8-mandatory-report design.

TP1, TP2, and TP3 reference "12 reports" but the design specification (lines 36-43, 268, 286, 844 in design.md) clearly states that only 8 reports are kernel-mandatory. The validator validateMandatoryReports checks for these 8 specific report IDs (1, 7, 10, 11, 12, 13, 14, 15).

These test cases should validate the 8 mandatory reports, not 12 total reports:

  • TP1 should verify the 8 mandatory reports are present
  • TP3's incomplete descriptor test should use 7-of-8 mandatory reports (not 11-of-12)

Note: TP4 mentions "8-mandatory + 5-optional" which would total 13 reports, further highlighting the confusion about the total count.

📝 Proposed corrections
 - [ ] **TP1**: `buildForPid(alloc, cfg_with_one_axis, ffb_cfg{kind=pid})`
   returns a non-empty descriptor; `validateMandatoryReports(result)`
-  returns success (12 reports present).
+  returns success (8 kernel-mandatory reports present).
 - [ ] **TP2**: Output bytes are byte-equal to an inline golden reference
   (Moza R5 case, pinned by implementer on first successful real-hardware
   run when kernel logs show `hid-universal-pidff` FFB init without
   `Error initialising force feedback`). Probe Run 2 evidence covers
   bustype acceptance + driver binding ONLY — it does not supply a
-  working descriptor (probe Run 2's 7-of-12 descriptor caused FFB init
+  working descriptor (probe Run 2's incomplete descriptor caused FFB init
   failure + kernel NULL-deref; the implementer must build a complete
-  12-report descriptor per USB HID PID 1.01 §4.x).
+  descriptor with 8 kernel-mandatory reports per USB HID PID 1.01 §4.x).
-- [ ] **TP3**: `validateMandatoryReports` on a hand-crafted 11-of-12
-  descriptor (one report ID stripped) returns
+- [ ] **TP3**: `validateMandatoryReports` on a hand-crafted incomplete
+  descriptor (with 7-of-8 mandatory report IDs) returns
   `error.IncompletePidDescriptor`.
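
For reference, the check this comment describes can be sketched in Zig as a walk over the descriptor's HID items that records every Report ID item (prefix `0x85`) and then requires the eight IDs quoted above. This is a minimal sketch, not the code in `src/io/uhid_descriptor.zig`: the `[]const u8` signature, the `MalformedDescriptor` error, and the `mandatory_report_ids` table are assumptions made for illustration. Only the names `validateMandatoryReports` and `error.IncompletePidDescriptor` come from the test plan.

```zig
const std = @import("std");

// Assumption: the eight report IDs quoted in the review comment above.
const mandatory_report_ids = [_]u8{ 1, 7, 10, 11, 12, 13, 14, 15 };

pub const ValidateError = error{ IncompletePidDescriptor, MalformedDescriptor };

/// Walks the HID items in `desc`, records each Report ID item (prefix 0x85),
/// and fails if any mandatory ID was never declared.
pub fn validateMandatoryReports(desc: []const u8) ValidateError!void {
    var seen = [_]bool{false} ** 256;
    var i: usize = 0;
    while (i < desc.len) {
        const prefix = desc[i];
        if (prefix == 0xFE) {
            // Long item: prefix, bDataSize, bLongItemTag, then bDataSize bytes.
            if (i + 2 >= desc.len) return error.MalformedDescriptor;
            const next = i + 3 + @as(usize, desc[i + 1]);
            if (next > desc.len) return error.MalformedDescriptor;
            i = next;
            continue;
        }
        // Short item: the low two bits encode a payload of 0, 1, 2, or 4 bytes.
        const size: usize = switch (prefix & 0x03) {
            0 => 0,
            1 => 1,
            2 => 2,
            else => 4,
        };
        if (i + 1 + size > desc.len) return error.MalformedDescriptor;
        if (prefix == 0x85) seen[desc[i + 1]] = true;
        i += 1 + size;
    }
    for (mandatory_report_ids) |id| {
        if (!seen[id]) return error.IncompletePidDescriptor;
    }
}
```

With this shape, the 7-of-8 case from TP3 is a descriptor that omits exactly one `0x85 <id>` pair from the mandatory set.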
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@openspec/changes/phase-13-wave-6-pidff/test-plan.md` around lines 23 - 36,
Update the test-plan items TP1, TP2, and TP3 to assert the 8-kernel-mandatory
reports instead of 12: change TP1 to call buildForPid(alloc, cfg_with_one_axis,
ffb_cfg{kind=pid}) and assert validateMandatoryReports(result) succeeds for the
8 specific report IDs (1,7,10,11,12,13,14,15); change TP2’s golden-reference
expectation to be byte-equal for a descriptor containing those 8 mandatory
reports (plus optional ones as defined) rather than assuming 12; and change TP3
to feed validateMandatoryReports an artificially stripped descriptor with 7-of-8
mandatory reports and expect error.IncompletePidDescriptor. Ensure references to
validateMandatoryReports, buildForPid, and error.IncompletePidDescriptor are
updated accordingly.

Comment thread openspec/changes/phase-13-wave-6-pidff/test-plan.md Outdated
Comment thread src/cli/install.zig Outdated
Comment on lines +2394 to +2397
if (isFieldKey(trimmed, "clone_vid_pid")) {
    if (std.mem.indexOf(u8, trimmed, "=")) |eq| {
        const val = std.mem.trim(u8, trimmed[eq + 1 ..], " \t");
        clone_vid_pid = std.mem.eql(u8, val, "true");

⚠️ Potential issue | 🟡 Minor

Handle inline TOML comments when reading clone_vid_pid.

`clone_vid_pid = true # opt-in` is valid TOML, but this parser compares the whole RHS to `"true"`, so the flag is read as false and the cloned-VID/PID udev rule is skipped.

💡 Suggested fix
             if (isFieldKey(trimmed, "clone_vid_pid")) {
                 if (std.mem.indexOf(u8, trimmed, "=")) |eq| {
-                    const val = std.mem.trim(u8, trimmed[eq + 1 ..], " \t");
+                    const raw = std.mem.trim(u8, trimmed[eq + 1 ..], " \t");
+                    const end = std.mem.indexOfScalar(u8, raw, '#') orelse raw.len;
+                    const val = std.mem.trim(u8, raw[0..end], " \t");
                     clone_vid_pid = std.mem.eql(u8, val, "true");
                 }
             }
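
A standalone version of the same idea, as a hedged sketch: `parseBoolRhs` is a hypothetical helper (not a function in `src/cli/install.zig`) that cuts the right-hand side at the first `#` before comparing. Cutting at the first `#` is only safe here because a boolean value cannot contain one; a quoted string value would need real TOML parsing.

```zig
const std = @import("std");

/// Hypothetical helper: parses the right-hand side of `key = value` as a
/// TOML boolean, ignoring a trailing `# comment`. Returns null when there is
/// no `=` or the value is not a boolean literal.
fn parseBoolRhs(line: []const u8) ?bool {
    const eq = std.mem.indexOfScalar(u8, line, '=') orelse return null;
    const raw = line[eq + 1 ..];
    // Drop everything from the first '#' onward, then trim whitespace.
    const end = std.mem.indexOfScalar(u8, raw, '#') orelse raw.len;
    const val = std.mem.trim(u8, raw[0..end], " \t");
    if (std.mem.eql(u8, val, "true")) return true;
    if (std.mem.eql(u8, val, "false")) return false;
    return null;
}
```

Returning `?bool` instead of defaulting to `false` lets the caller tell "explicitly false" apart from "unparseable", which may be useful when deciding whether to emit the cloned-VID/PID udev rule.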
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cli/install.zig` around lines 2394 - 2397, The parser for clone_vid_pid
(inside the isFieldKey(trimmed, "clone_vid_pid") branch) currently trims the RHS
into val and compares it directly to "true", which fails when an inline TOML
comment follows (e.g., `clone_vid_pid = true # opt-in`). Change the logic that
computes val (the trimmed RHS slice) to strip any inline comment: if
std.mem.indexOf(u8, val, "#") returns a position, re-slice val to
val[0..hashPos] and trim again before comparing; then perform std.mem.eql(u8,
cleaned_val, "true") to set clone_vid_pid. Ensure this update is applied where
val is created and compared.

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (2)
.github/workflows/release.yml (1)

54-86: Release-path -j1 will measurably slow release builds.

Forcing -j1 on both the native musl build (line 54) and the cross-target matrix build (line 86) means each release tag triggers serial ReleaseSafe builds for x86_64-linux-musl and aarch64-linux-musl. This is fine as a short-term workaround, but unlike CI test runs, release pipelines are user-visible (publish latency) — please ensure the underlying parallel-build issue is tracked so this can be lifted. Same observation as .github/workflows/e2e.yml.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/release.yml around lines 54 - 86, The workflow forces
serial Zig builds by passing "-j1" in the zig build invocations (seen in the
"Generate install artifacts" step that runs "./zig-out/bin/padctl install" and
the "Build" step in the "build" job), which slows release builds; remove the
explicit "-j1" flags (or replace with a parallel value such as "-j$(nproc)" or
omit "-j" to let Zig choose) in both places (the padctl install invocation and
the zig build run in the build matrix) and if needed add a short TODO comment
referencing the underlying parallel-build issue ID so it can be reverted when
fixed.
.github/workflows/e2e.yml (1)

36-41: Add inline comment referencing Zig issue #22453 to explain the -j1 workaround.

The -j1 flag is consistently applied across all workflows to address Zig cache lock contention (Zig issue #22453). This root cause is already documented in ci.yml, but adding the same explanatory comment to e2e.yml and other workflows would improve maintainability and help future contributors understand the trade-off without searching elsewhere.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/e2e.yml around lines 36 - 41, Add an inline comment above
the Zig invocations in this workflow (the lines containing "run: zig build -j1
test-e2e" and "run: zig build -j1 test-integration") that explains the use of
the "-j1" workaround and references Zig issue `#22453`; the comment should mirror
the explanation already in ci.yml (briefly stating it avoids Zig cache lock
contention) so future maintainers understand why parallel builds are disabled.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c5e09596-3caa-41ac-b440-9eccfb51c559

📥 Commits

Reviewing files that changed from the base of the PR and between fc4937b and 0e330b9.

📒 Files selected for processing (8)
  • .github/workflows/ci.yml
  • .github/workflows/docs.yml
  • .github/workflows/e2e.yml
  • .github/workflows/install-flow.yml
  • .github/workflows/release.yml
  • .github/workflows/validate.yml
  • build.zig
  • scripts/coverage.sh
✅ Files skipped from review due to trivial changes (3)
  • .github/workflows/validate.yml
  • scripts/coverage.sh
  • .github/workflows/install-flow.yml
🚧 Files skipped from review as they are similar to previous changes (2)
  • .github/workflows/ci.yml
  • build.zig

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (2)
.github/workflows/ci.yml (1)

131-149: Nit: stale step name in distro-check.

The step is still named `Run check-all in container`, but the container body no longer invokes the `check-all` target; it now runs the same three split processes as `check-matrix`. Consider renaming for accuracy and to mirror the `check-matrix` step name.

✏️ Proposed rename
-      - name: Run check-all in container
+      - name: Run all checks in container (serial via separate processes)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/ci.yml around lines 131 - 149, The GitHub Actions step
named "Run check-all in container" is stale because the docker command now runs
the three split builds like the `check-matrix` step; rename the step to
accurately reflect its behavior (e.g. "Run check-matrix in container" or "Run
split checks in container") in the `.github/workflows/ci.yml` job `distro-check`
so the step label matches the commands that call /tmp/zig... zig build
-Dlibusb=false test/test-safe/check-fmt; update the step name string to the new,
clearer label.
src/test/wave6_pidff_e2e_test.zig (1)

367-379: EAGAIN trigger relies on pipe buffer being ≤128 KiB.

The fill loop caps at 128 KiB, which is comfortably above Linux's default 64 KiB pipe buffer, so this works today. But it silently breaks if the runtime ever sees a larger default (custom pipe-default-size, sysctl tweaks, or a non-default builder env): the loop returns with `filled == 128*1024` but the pipe still has free space, the subsequent 2-byte write succeeds, and `expect(fwd.drops_eagain >= 1)` fails for a non-bug reason.

Filling unconditionally until EAGAIN removes the assumption entirely.

♻️ Proposed refactor
-    // Fill the hidraw pipe buffer to capacity (Linux default 64 KiB).
+    // Fill the hidraw pipe buffer to capacity (write until EAGAIN, regardless of
+    // the kernel's pipe-default-size).
     const chunk = [_]u8{0xFF} ** 4096;
-    var filled: usize = 0;
-    while (filled < 128 * 1024) {
-        const n = posix.write(hidraw_fds[1], &chunk) catch break;
-        filled += n;
-    }
+    while (true) {
+        _ = posix.write(hidraw_fds[1], &chunk) catch |err| switch (err) {
+            error.WouldBlock => break,
+            else => return err,
+        };
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/test/wave6_pidff_e2e_test.zig` around lines 367 - 379, The test currently
assumes the pipe buffer is <=128KiB by looping up to filled < 128*1024; change
the fill loop that writes chunk via posix.write to instead loop until the write
fails with EAGAIN (i.e., keep writing and accumulate filled using the return
value of posix.write, breaking only when posix.write returns an error with errno
== EAGAIN or when write returns 0), so that the pipe is genuinely filled
regardless of system pipe-default-size; update the code around the chunk,
filled, posix.write and hidraw_fds loop so the subsequent
writeUhidOutputEvent/drainAndForward and the expect(fwd.drops_eagain >= 1)
assertion no longer rely on a fixed 128KiB cap.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 52ab7f93-391e-4565-bdea-59a2c79d72b5

📥 Commits

Reviewing files that changed from the base of the PR and between 0e330b9 and 1fa6b5a.

📒 Files selected for processing (2)
  • .github/workflows/ci.yml
  • src/test/wave6_pidff_e2e_test.zig

BANANASJIM pushed a commit that referenced this pull request Apr 27, 2026
…flock deadlock workaround)

build.zig was registering uhid_output_dispatch_tests and wave6_pidff_e2e_tests
as separate b.addTest artifacts sharing src_mod. zig issue #22453 causes a
flock AB/BA deadlock between artifacts on the cache manifest, which manifested
as a 100+ minute CI hang on PR #166.

Inline both test files into src/main.zig's pub const ..._test = @import(...)
namespace following the pattern of interpreter_e2e_test, mapper_e2e_test, etc.
This compiles them into the main test artifact, eliminating cross-artifact
flock contention.

Convert @import("src") module-alias imports to relative path imports matching
the existing test file convention. Update test assertions to align with the
simplified device_instance.zig which no longer carries FfbForwarder or
test_physical_hidraw_fd seam (FFB wiring moved to event_loop, Wave 6 T4).

Bisected over 4 rounds; T3 commit 0698fb1 (uhid_output_dispatch_test artifact)
confirmed as root cause.
@BANANASJIM BANANASJIM force-pushed the spec/phase-13-wave-6-pidff branch from 93a544b to b47e7d3 Compare April 27, 2026 17:38
BANANASJIM pushed a commit that referenced this pull request Apr 27, 2026
build.zig was registering uhid_output_dispatch_tests and wave6_pidff_e2e_tests
as separate b.addTest artifacts sharing src_mod. Zig issue #22453 causes a
flock AB/BA deadlock between artifacts on the cache manifest, manifesting
as a 100+ minute CI hang on PR #166.

Inline both test files into src/main.zig's pub const ..._test = @import(...)
namespace following the existing convention (interpreter_e2e_test, etc.).
Convert the test files' @import("src") module-alias imports to relative
path imports matching src/test/interpreter_e2e_test.zig.

This change is scoped to ONLY 4 files: build.zig (remove 2 addTest blocks),
src/main.zig (add 2 namespace entries), and the 2 test files (rewrite
import header). All Wave 6 production code (FfbForwarder wiring,
clone_vid_pid parsing, supervisor_uhid_routing_test, etc.) is preserved.
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

♻️ Duplicate comments (1)
src/device_instance.zig (1)

351-361: ⚠️ Potential issue | 🟠 Major

Don't infer the physical PID target from devices[0].

Line 360 still falls back to the first opened interface fd. In this file, Lines 35-57 show that interface 0 can be a UsbrawDevice or the wrong HID interface, so multi-interface wheels can forward PID bytes to the wrong fd and silently lose FFB. This path should fail closed unless an explicit physical hidraw writer has been resolved.
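
The fail-closed behavior asked for here could look like the following minimal sketch. `FfbTarget` and `resolveFfbTarget` are hypothetical names, and `explicit_fd` stands in for whatever mechanism ends up resolving the physical hidraw writer; none of this is taken from `src/device_instance.zig`.

```zig
const std = @import("std");
const posix = std.posix;

/// Hypothetical result type: either FFB forwarding is off, or it has an
/// explicitly resolved physical hidraw fd to write to.
pub const FfbTarget = union(enum) {
    disabled,
    hidraw: posix.fd_t,
};

/// Never guesses a target: with no explicit fd (or a negative sentinel),
/// forwarding is disabled instead of falling back to the first interface.
pub fn resolveFfbTarget(explicit_fd: ?posix.fd_t) FfbTarget {
    const fd = explicit_fd orelse return .disabled;
    if (fd < 0) return .disabled;
    return .{ .hidraw = fd };
}
```

Using a tagged union rather than a bare `-1` sentinel forces every caller to handle the "no physical hidraw" case in a `switch`, so a missing target cannot be written to by accident.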

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/device_instance.zig` around lines 351 - 361, The code currently falls
back to using devices[0] for phys_fd which can pick the wrong interface; change
the logic so phys_fd does NOT silently fall back to devices[0]. Set phys_fd to
opts.test_physical_hidraw_fd orelse -1 and treat -1 as a hard "no physical
hidraw" condition (return an error or disable FFB startup) rather than using
devices[0]. Update any callers that expect a valid fd from phys_fd to handle the
-1 sentinel and fail closed when no explicit physical hidraw writer is provided;
reference symbols: devices, phys_fd, opts.test_physical_hidraw_fd.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@openspec/changes/phase-13-wave-6-pidff/brief.md`:
- Line 49: The brief incorrectly claims a rate ceiling for the FfbForwarder;
update the T4 line in brief.md so it does not assert an implemented rate ceiling
— either remove "rate ceiling" from the description or append " (future work)" /
"not yet implemented" to the phrase for `FfbForwarder` in the T4 summary,
referencing the new `src/io/ffb_forwarder.zig` and related
`src/supervisor.zig`/`src/io/hidraw.zig` entries to make it clear the rate limit
is not currently enforced.

In `@openspec/changes/phase-13-wave-6-pidff/tasks.md`:
- Around line 511-603: Update T6 to describe the actual pipe-based Layer‑1 test
harness: remove or de-emphasize the /dev/uhid path and UHID gating, and replace
UHID-specific steps with the pipe2 fixtures used by
src/test/wave6_pidff_e2e_test.zig; explicitly state that the test uses
physical_hidraw pipe descriptors passed to DeviceInstance.init (the write-end
used to simulate device output) rather than
UhidSimulator.injectOutput/UHID_CREATE2, update the test registration in
build.zig accordingly, and delete the PADCTL_TEST_REQUIRE_UHID=1 requirement so
the acceptance criteria match the implemented pipe-based tests.

In `@src/io/ffb_forwarder.zig`:
- Around line 82-97: The test hangs because the writer end fds[1] remains open
while draining the pipe, so a subsequent posix.read can block; after performing
the three forwards (FfbForwarder.init / fwd.forward and checking
fwd.writes_total) explicitly close the writer descriptor (posix.close(fds[1]) or
call FfbForwarder.close/cleanup if provided) before doing the multiple
posix.read calls so the reader sees EOF and reads return without blocking;
ensure defer close calls are adjusted (remove double-close) or guard against
closing twice.

In `@src/io/uhid_descriptor.zig`:
- Around line 502-558: buildForPid currently ignores the device.OutputConfig and
emits only a single 1-byte X axis, causing a mismatch with encodeReport; change
buildForPid (notably the function named buildForPid and its current single-axis
preamble) to actually build the same input layout based on the passed
cfg/OutputConfig (buttons, hats, axes, touchpad) instead of discarding cfg:
remove the `_ = cfg`/`_ = ffb_cfg` stubs and either call or extract the existing
input-descriptor generation logic used by encodeReport (or replicate its
button/axis/hat/touchpad emission) while preserving the PID output reports and
INPUT_REPORT_ID; ensure report ID, sizes, and ordering match encodeReport's
expectations so validateMandatoryReports and HID_MAX_DESCRIPTOR checks remain
valid.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 15838cee-1015-4f56-82ae-c2e7a0bb21cb

📥 Commits

Reviewing files that changed from the base of the PR and between 1fa6b5a and b47e7d3.

📒 Files selected for processing (17)
  • .github/workflows/ci.yml
  • build.zig
  • openspec/changes/phase-13-wave-6-pidff/brief.md
  • openspec/changes/phase-13-wave-6-pidff/design.md
  • openspec/changes/phase-13-wave-6-pidff/tasks.md
  • openspec/changes/phase-13-wave-6-pidff/test-plan.md
  • src/cli/install.zig
  • src/config/device.zig
  • src/device_instance.zig
  • src/event_loop.zig
  • src/io/ffb_forwarder.zig
  • src/io/uhid.zig
  • src/io/uhid_descriptor.zig
  • src/main.zig
  • src/test/supervisor_uhid_routing_test.zig
  • src/test/uhid_output_dispatch_test.zig
  • src/test/wave6_pidff_e2e_test.zig
✅ Files skipped from review due to trivial changes (2)
  • build.zig
  • src/main.zig
🚧 Files skipped from review as they are similar to previous changes (3)
  • .github/workflows/ci.yml
  • src/test/uhid_output_dispatch_test.zig
  • src/cli/install.zig

| T1 | `UhidDescriptorBuilder.buildForPid` — PID output collection, 8 kernel-mandatory + 5 optional reports, defensive 8-mandatory validator, byte-exact golden test | `src/io/uhid_descriptor.zig` | ~250 |
| T2 | VID/PID cloning — `OutputConfig.force_feedback.clone_vid_pid` opt-in; primary UHID card vendor/product overridden from `[device]` when true; udev rules emit per-cloned-VID/PID match | `src/config/device.zig`, `src/device_instance.zig`, `src/cli/install.zig` | ~150 |
| T3 | `UHID_OUTPUT` event handler — `src/io/uhid.zig` adds `pollOutputReport()` + `onOutputReport(report_id, payload)` callback; payload-parse unit tests | `src/io/uhid.zig`, `src/io/uhid_types.zig` | ~120 |
| T4 | `FfbForwarder` — write FFB bytes to physical hidraw fd, EAGAIN/EACCES/ENODEV classification, rate ceiling + dropped-packet counter | `src/io/ffb_forwarder.zig` (new), `src/supervisor.zig`, `src/io/hidraw.zig` | ~250 |

⚠️ Potential issue | 🟡 Minor

Don't claim a rate ceiling that isn't implemented yet.

The T4 summary says the forwarder ships with a rate ceiling, but this PR's stated limitations still say there is no active FFB rate ceiling. Please either drop that phrase here or mark it as future work so the brief doesn't overstate the current behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@openspec/changes/phase-13-wave-6-pidff/brief.md` at line 49, The brief
incorrectly claims a rate ceiling for the FfbForwarder; update the T4 line in
brief.md so it does not assert an implemented rate ceiling — either remove "rate
ceiling" from the description or append " (future work)" / "not yet implemented"
to the phrase for `FfbForwarder` in the T4 summary, referencing the new
`src/io/ffb_forwarder.zig` and related `src/supervisor.zig`/`src/io/hidraw.zig`
entries to make it clear the rate limit is not currently enforced.

Comment thread openspec/changes/phase-13-wave-6-pidff/tasks.md Outdated
Comment thread src/io/ffb_forwarder.zig Outdated
Comment on lines +82 to +97
const fds = try posix.pipe2(.{});
defer posix.close(fds[0]);
defer posix.close(fds[1]);

var fwd = FfbForwarder.init(fds[1]);
const payload = [_]u8{0xAA};
fwd.forward(.{ .report_id = 0xAA, .data = &payload });
fwd.forward(.{ .report_id = 0xAA, .data = &payload });
fwd.forward(.{ .report_id = 0xAA, .data = &payload });
try testing.expectEqual(@as(u64, 3), fwd.writes_total);

// Drain pipe
var buf: [64]u8 = undefined;
_ = posix.read(fds[0], &buf) catch {};
_ = posix.read(fds[0], &buf) catch {};
_ = posix.read(fds[0], &buf) catch {};

⚠️ Potential issue | 🟠 Major

This test can still hang on a blocking pipe.

After three 1-byte writes, the first read() can consume all buffered bytes. The second read then blocks because fds[1] stays open until scope exit. That makes this test flaky/hanging in the same way the PR notes call out elsewhere.

Suggested fix
-    const fds = try posix.pipe2(.{});
+    const fds = try posix.pipe2(.{ .NONBLOCK = true });
@@
-    // Drain pipe
-    var buf: [64]u8 = undefined;
-    _ = posix.read(fds[0], &buf) catch {};
-    _ = posix.read(fds[0], &buf) catch {};
-    _ = posix.read(fds[0], &buf) catch {};
+    var buf: [64]u8 = undefined;
+    while (true) {
+        _ = posix.read(fds[0], &buf) catch |err| switch (err) {
+            error.WouldBlock => break,
+            else => return err,
+        };
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/io/ffb_forwarder.zig` around lines 82 - 97, The test hangs because the
writer end fds[1] remains open while draining the pipe, so a subsequent
posix.read can block; after performing the three forwards (FfbForwarder.init /
fwd.forward and checking fwd.writes_total) explicitly close the writer
descriptor (posix.close(fds[1]) or call FfbForwarder.close/cleanup if provided)
before doing the multiple posix.read calls so the reader sees EOF and reads
return without blocking; ensure defer close calls are adjusted (remove
double-close) or guard against closing twice.
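
The alternative named in this prompt (close the writer, then drain until EOF) differs from the `NONBLOCK` diff above, so a minimal sketch may help. `drainUntilEof` is a hypothetical helper, and the sketch writes to the pipe directly instead of going through `FfbForwarder`; it assumes the same `std.posix` calls the test under review already uses.

```zig
const std = @import("std");
const posix = std.posix;

/// Reads until EOF and returns the number of bytes drained.
/// On a blocking pipe this only terminates once every write end is closed.
fn drainUntilEof(fd: posix.fd_t) !usize {
    var buf: [64]u8 = undefined;
    var total: usize = 0;
    while (true) {
        const n = try posix.read(fd, &buf);
        if (n == 0) return total; // EOF: no writers remain
        total += n;
    }
}

test "close the writer, then drain" {
    const fds = try posix.pipe2(.{});
    defer posix.close(fds[0]);
    _ = try posix.write(fds[1], &[_]u8{ 0xAA, 0xAA, 0xAA });
    // Closed exactly once here, so there is no deferred close for fds[1].
    posix.close(fds[1]);
    try std.testing.expectEqual(@as(usize, 3), try drainUntilEof(fds[0]));
}
```

Either approach removes the hang; closing the writer keeps the pipe blocking, at the cost of dropping the `defer posix.close(fds[1])` so the descriptor is not closed twice.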

Comment thread src/io/uhid_descriptor.zig
BANANASJIM pushed a commit that referenced this pull request Apr 28, 2026
…diagnostics

Wave 6 PR #166's `check` matrix hits a 6-hour runner-level timeout
without producing log artifacts (GHA strips logs from in-progress
cancelled steps). Wrap each `zig build test*` invocation in `timeout 600`
so the step finalizes at 10min with logs uploaded; add `--summary all`
so Zig's test runner prints every passing test name, making the last
test before hang visible.

Forensic-only — no production behavior change.
BANANASJIM added a commit that referenced this pull request Apr 28, 2026
…cs (#182)

* fix(ci): wrap test steps in `timeout 600` + `--summary all` for hang diagnostics

Wave 6 PR #166's `check` matrix hits a 6-hour runner-level timeout
without producing log artifacts (GHA strips logs from in-progress
cancelled steps). Wrap each `zig build test*` invocation in `timeout 600`
so the step finalizes at 10min with logs uploaded; add `--summary all`
so Zig's test runner prints every passing test name, making the last
test before hang visible.

Forensic-only — no production behavior change.

* fix(ci): use zig build check-all instead of split sub-steps

The 3-step split tried zig build fmt --check, but fmt is not a build
step (check-fmt is) and --check is not a valid zig build flag.
Result: 5 CI jobs failed with "unrecognized argument: '--check'".

zig build check-all already chains test + test-safe + check-fmt per
build.zig:165-168. Wrap the single invocation with timeout 600 +
--summary all instead. Single source of truth in build.zig; no drift
risk between workflow and build script.

---------

Co-authored-by: openplan-implementer <openplan-implementer@openplan.local>
@BANANASJIM BANANASJIM force-pushed the spec/phase-13-wave-6-pidff branch from b47e7d3 to 47a784e April 28, 2026 15:23
BANANASJIM pushed a commit that referenced this pull request Apr 28, 2026
…flock deadlock workaround)

build.zig was registering uhid_output_dispatch_tests and wave6_pidff_e2e_tests
as separate b.addTest artifacts sharing src_mod. zig issue #22453 causes a
flock AB/BA deadlock between artifacts on the cache manifest, which manifested
as a 100+ minute CI hang on PR #166.

Inline both test files into src/main.zig's pub const ..._test = @import(...)
namespace following the pattern of interpreter_e2e_test, mapper_e2e_test, etc.
This compiles them into the main test artifact, eliminating cross-artifact
flock contention.

Convert @import("src") module-alias imports to relative path imports matching
the existing test file convention. Update test assertions to align with the
simplified device_instance.zig which no longer carries FfbForwarder or
test_physical_hidraw_fd seam (FFB wiring moved to event_loop, Wave 6 T4).

Bisected over 4 rounds; T3 commit 0698fb1 (uhid_output_dispatch_test artifact)
confirmed as root cause.
BANANASJIM pushed a commit that referenced this pull request Apr 28, 2026
build.zig was registering uhid_output_dispatch_tests and wave6_pidff_e2e_tests
as separate b.addTest artifacts sharing src_mod. Zig issue #22453 causes a
flock AB/BA deadlock between artifacts on the cache manifest, manifesting
as a 100+ minute CI hang on PR #166.

Inline both test files into src/main.zig's pub const ..._test = @import(...)
namespace following the existing convention (interpreter_e2e_test, etc.).
Convert the test files' @import("src") module-alias imports to relative
path imports matching src/test/interpreter_e2e_test.zig.

This change is scoped to ONLY 4 files: build.zig (remove 2 addTest blocks),
src/main.zig (add 2 namespace entries), and the 2 test files (rewrite
import header). All Wave 6 production code (FfbForwarder wiring,
clone_vid_pid parsing, supervisor_uhid_routing_test, etc.) is preserved.
BANANASJIM added a commit that referenced this pull request Apr 28, 2026
…ns (#185)

Y1's timeout 600 instrumentation works but Zig's test runner buffers
stdout; SIGTERM doesn't flush, so the 10min-killed process produces
zero log output (PR #166 commit 47a784e first run had timeout=124 with
empty log between command-start and SIGTERM).

Pipe each instrumented invocation through tee <name>.log so output
accumulates incrementally and survives SIGTERM. Pattern matches the
tsan step which has used 2>&1 | tee tsan-build.log since Wave 6.

set -eo pipefail ensures timeout's exit code propagates correctly
through the pipe.

Co-authored-by: openplan-implementer <openplan-implementer@openplan.local>
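
The `tee` plus `set -eo pipefail` pattern from the commit above can be sketched as follows. This is an assumption-laden illustration rather than the actual workflow step: the log path, the 1-second limit, and the `sh -c` stand-in for `zig build check-all` are all hypothetical, and GNU coreutils `timeout` is assumed. It only shows that output the command has already written reaches the log file, and that the timeout's exit status survives the pipe.

```shell
#!/usr/bin/env bash
# With pipefail, the pipeline's status is that of the rightmost command that
# failed, so a timeout on the left is not masked by tee's success.
set -eo pipefail

log="$(mktemp)"   # hypothetical log path; CI would use something like check-all.log
rc=0

# Stand-in for:
#   timeout 600 zig build check-all --summary all 2>&1 | tee check-all.log
# The command prints one line, then hangs until timeout sends SIGTERM.
# `exec` replaces the shell with sleep so no orphan keeps the pipe open.
timeout 1 sh -c 'echo "test 1 passed"; exec sleep 30' 2>&1 | tee "$log" || rc=$?

echo "pipeline exit code: $rc"
echo "lines captured before the kill:"
cat "$log"
```

Without `pipefail`, the pipeline would report `tee`'s exit status (0), and a timed-out step would look like a pass.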
@BANANASJIM BANANASJIM force-pushed the spec/phase-13-wave-6-pidff branch from 47a784e to 0cc174c April 28, 2026 16:44
… e2e)

Bundles T1-T7 of openspec/changes/phase-13-wave-6-pidff:
- T1: PID HID descriptor builder + 12-of-12 mandatory validator
- T2: VID/PID cloning + per-device udev permission rule (clone_vid_pid)
- T3: parse UHID_OUTPUT events + FfbForwarder dispatch hook
- T4: FfbForwarder hidraw write + EAGAIN drop + dispatch wiring
- T5: [output.force_feedback] schema + validate matrix
- T6: wave6_pidff_e2e_test (descriptor -> UHID_OUTPUT -> forwarder)
- CI: TP33 NONBLOCK, 3-process check-all, Y1 timeout, tee logs

refs: openspec/changes/phase-13-wave-6-pidff/tasks.md
@BANANASJIM BANANASJIM force-pushed the spec/phase-13-wave-6-pidff branch from 0cc174c to 3da1684 April 30, 2026 12:43
@BANANASJIM BANANASJIM enabled auto-merge (squash) April 30, 2026 12:52
@BANANASJIM BANANASJIM merged commit 7a0c4e6 into main Apr 30, 2026
18 of 19 checks passed
@BANANASJIM BANANASJIM deleted the spec/phase-13-wave-6-pidff branch April 30, 2026 14:51
BANANASJIM added a commit that referenced this pull request May 1, 2026
* docs: comprehensive feature-coverage audit + close gaps for 2026-04-28..2026-05-01

Closes the doc backlog for features merged since the last user-facing
drift sync (PR #184, 2026-04-28):

- Wave 6 HID PID FFB passthrough (PR #166): [output.force_feedback]
  backend/kind/clone_vid_pid reference + validation matrix +
  hid-pidff kernel requirement note + racing-wheel TOML example
- padctl status active mapping (PR #199): document the
  STATUS device=... state=... mapping=... response triple
- padctl --validate auto-detect (PR #204): document mapping vs device
  detection by [device]-header heuristic + repeatable flag
- button_group size limit (PR #177): document source.size 1..=8
  and bit_index < size*8 constraints in device-config.md
- layer tap=macro reject (PR #176): document
  error.LayerTapCannotBeMacro constraint in mapping-config.md
- chord_index cross-link: surface chord_switch from mapping-config.md
  for users looking up the field name
- code-contributions.md: 'fixtures are SSOT, e2e tests must use
  parseFile/@embedFile' rule (drift-proofing carry-over from PR #193)

chord_switch (PR #201) docs are tracked in sibling PR
docs/chord-switch-example-and-docs.

* docs: fix H1+M1 from PR #209 reviewer

H1 [Option B]: retire last inline vader5_toml literal in
interpreter_e2e_test.zig — remove parseString-based test that duplicated
the existing parseFile-based test; delete the 28-line vader5_toml const.
code-contributions.md updated to reflect that no inline device literals
remain in the test suite (PR #193 began the cleanup, this PR finishes it).

M1: chord_index doc said 'integer (>=1)' but schema accepts u8 with no
>=1 validation. Relax to 'integer (0-255)'; clarify that 0 or omit means
unselectable via chord, matching actual schema behavior.

* docs: fix Qodo F3 — button_group size is runtime warn-skip not compile reject

Doc claimed 'rejected at compile time'; src/core/interpreter.zig
compileReport emits std.log.warn and returns the report unprocessed.
Reword to match actual runtime behavior.

---------

Co-authored-by: openplan-implementer <openplan-implementer@openplan.local>