openipc_frame_ts: ISP_FEND event source, edge-detect both event types, split counters (closes #176, #177) #178
Merged
Extends the chrdev ABI introduced in #155 with a second event source and an event-type filter, so consumers can:

1. Observe both edges of the sensor-readout window:
   - MIPI_FS: sensor begins streaming row 0 (existing behaviour, fires from the MIPI RX driver on MIPI_CTRL_INT.int_vsync)
   - ISP_FEND: ISP front-end finished receiving the last row (new, fires from the ISP IRQ when the ISP_INT_FE_FEND bit is set; register defined in kernel/isp/arch/include/isp_drv_defines.h)
2. Compute wall_ns[FEND] − wall_ns[FS] ≈ sensor readout duration per frame, decomposing the gap between the "kernel-anchored capture wall-clock" and the "encoder-finished wall-clock" into readout-bound and encoder-bound components.
3. Filter at the chrdev level: consumers that only care about FS (e.g. RTCP SR anchoring; widgetii/majestic#83) call OPENIPC_FT_IOC_SET_EVENT_MASK with bit 0 set and skip FEND entirely; pipeline-latency measurement tools take both.

## ABI

struct openipc_frame_ts_event grows two u32 fields:

    __u32 event_type;  /* OPENIPC_FT_EVT_MIPI_FS | OPENIPC_FT_EVT_ISP_FEND */
    __u32 reserved;    /* zero; for future event sources */

Size: 24 → 32 bytes. This is a hard ABI break vs the v1 layout in #155, but since #155 isn't merged yet, the breakage is local to the PR chain and the downstream majestic / sample / consumer-test patches all land in lockstep. Future event types append to OPENIPC_FT_EVT_*; the struct stays 32 bytes.

Added IOCTL:

    OPENIPC_FT_IOC_SET_EVENT_MASK  _IOW('o', 3, __u32)

Bit n set ↔ events with event_type == n pass through. Default ~0u (every type). Pairs with the existing SET_CHANNEL_MASK; both filter the same fd.

## Kernel-side hook surface

`openipc_frame_ts_push` grows an event_type parameter. The two call sites already in #155, both MIPI RX paths (shared and cv500), pass OPENIPC_FT_EVT_MIPI_FS to preserve existing semantics. The cv200 ISP fallback hook also passes MIPI_FS (cv200's hook fires at VI_PT0_INT_FSTART = frame start; there's no FEND IRQ exposed to userspace on cv200).

The new ISP_FEND emission is wired into:

- kernel/isp/mkp/src/isp.c (ev200 / gk7205v200 / hi3519v101; they all share this V4 ISP source per CHIPARCH dispatch)
- kernel/isp/arch/hi3516cv500/mkp/src/isp.c (cv500 / av300 / dv300)

In both, the FEND push fires when (isp_int_status & ISP_INT_FE_FEND) is non-zero inside the ISP IRQ handler, after the existing per-pipe status read and before any error checks. Zero impact on the existing ISP IRQ flow.

## Dedup

Per-(chn, event_type) dedup within 1 ms is preserved from #155 (catches level-triggered IRQ retrigger and the cv500 ~4% MIPI double-fire). The dedup deliberately does NOT cross event types: MIPI_FS and ISP_FEND for the same frame can be tens of ms apart (= readout duration), and that gap is the whole point of having both events.

## seq numbering

Per-(chn, event_type) seq counters, so MIPI_FS_seq and ISP_FEND_seq both start at 0 and tick once per real frame. The pairing MIPI_FS_seq[N] ↔ ISP_FEND_seq[N] gives same-frame correspondence, which is useful for the readout-duration computation in the userspace probe.

## Build hook

kernel/isp/Kbuild adds `-I$(src)/../include` so the ISP modules pick up the openipc_frame_ts.h header. This is the same path PR #155 added to each per-SoC kbuild for the MIPI RX side; this version puts it in the shared isp/Kbuild so all SoC variants inherit it without duplication.

## Build verified

Standalone cv500 module build clean:

    make -C output-cv500/build/linux-custom \
         M=$PWD/kernel CHIPARCH=hi3516cv500 \
         CROSS_COMPILE=arm-openipc-linux-gnueabi- ARCH=arm modules

Empirical verification (FS-vs-FEND delta = readout duration on real hardware) is the next step in this PR chain.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
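The event-mask rule and the seq-based FS/FEND pairing described in this commit message can be sketched in plain C. This is a minimal illustration, not the driver or sample code: `struct ft_sample`, `ft_mask_passes`, `ft_readout_ns` and the numeric values of the two event types are assumptions made for the example (the text only implies that MIPI_FS corresponds to bit 0), and the struct is deliberately not the 32-byte ABI layout, whose remaining fields are not listed here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical numeric values; only "MIPI_FS is bit 0" is implied by the text. */
enum { EVT_MIPI_FS = 0, EVT_ISP_FEND = 1 };

/* Illustrative record holding just the fields the pairing needs.
 * This is NOT the ABI struct openipc_frame_ts_event. */
struct ft_sample {
    uint64_t wall_ns;
    uint32_t chn;
    uint32_t seq;
    uint32_t event_type;
};

/* Bit n of the mask set <-> events with event_type == n pass through. */
static bool ft_mask_passes(uint32_t mask, uint32_t event_type)
{
    return event_type < 32 && ((mask >> event_type) & 1u) != 0;
}

/*
 * Readout duration for one frame: the FS and FEND samples must come from
 * the same channel and carry the same per-(chn, event_type) seq value.
 * Returns false if the two samples do not describe the same frame.
 */
static bool ft_readout_ns(const struct ft_sample *fs,
                          const struct ft_sample *fend,
                          uint64_t *out_ns)
{
    if (fs->event_type != EVT_MIPI_FS || fend->event_type != EVT_ISP_FEND)
        return false;
    if (fs->chn != fend->chn || fs->seq != fend->seq)
        return false;
    if (fend->wall_ns < fs->wall_ns)
        return false; /* FEND cannot precede FS of the same frame */
    *out_ns = fend->wall_ns - fs->wall_ns;
    return true;
}
```

A consumer that only wants frame-start events would pass `1u << EVT_MIPI_FS` as the mask; a latency tool would keep the default `~0u` and feed matching pairs to `ft_readout_ns`.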
The cherry-picked widgetii FEND extension emitted zero events on both V4 and cv500 hardware. Two compounding hardware quirks caused it:

1. The vendor ISP HAL leaves ISP_INT_FE_MASK at 0 by default (only the unused ISP_DRV_SetIntEnable ioctl path flips FEND on), so reading the masked status from ISP_ISR returns 0 even when hardware has asserted FEND. The cherry-pick read the masked status, so it never fired.
2. Reading the raw status instead gives results, but the raw FEND bit is level-held across the inter-frame gap: hardware re-asserts it after each W1C while the underlying frame-finished condition still holds, so every ISR call during that window would fire. Empirically measured at ~683 Hz for a 30 fps sensor (capped only by the 1 ms dedupe). After the W1C-induced 0→1→0 oscillation: ~half that.

Two-layer fix:

- In ISP_ISR (V4 and cv500): track the per-pipe previous raw-FEND state and emit only on the 0→1 transition. This catches the bulk of the spurious extras the W1C oscillation introduces.
- In openipc_frame_ts: make the per-event-type dedupe interval configurable and bump FEND from 1 ms to 25 ms. That is long enough to swallow whatever residual sub-frame retriggers slip past the edge-detect, and it caps FEND output at 40 Hz (the single-stream modes validated below run at 37 fps and 20 fps, both under the cap, so no real frame is dropped). FS remains at 1 ms to preserve the cv500 double-vsync dedupe behavior.

Validated empirically:

- ev300 (V4, IMX335, 37 fps single-stream): clean 27.0 ms cadence, ~37 Hz FEND output matching the sensor frame rate.
- av300 (cv500, IMX415): clean 50.0 ms cadence, 20 Hz FEND output matching the configured 20 fps sensor mode.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
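The 0→1 edge-detect on a level-held status bit can be illustrated with a few lines of C. This is a sketch under stated assumptions rather than the code in ISP_ISR: `struct edge_state`, `rising_edge` and the `FEND_BIT` position are hypothetical names and values chosen for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit position, for illustration only. */
#define FEND_BIT (1u << 0)

/* One instance per pipe: remembers whether the bit was set last time. */
struct edge_state {
    bool was_set;
};

/*
 * Returns true only when the bit goes from clear to set between two
 * consecutive calls. A bit that stays asserted across several ISR
 * invocations therefore produces a single event.
 */
static bool rising_edge(struct edge_state *st, uint32_t raw_status, uint32_t bit)
{
    bool is_set = (raw_status & bit) != 0;
    bool fire = is_set && !st->was_set;

    st->was_set = is_set;
    return fire;
}
```

If the hardware oscillates the bit 0→1→0 within one frame, this filter still fires once per oscillation, which is why the commit pairs it with a time-based dedupe window.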
Same level-held-bit pathology as ISP_INT_FE.FEND (commit 28c200d): the MIPI_CTRL_INT.vsync and LVDS_CTRL_INT.lvds_vsync raw bits are re-asserted by hardware across the inter-frame gap after each W1C clear while the underlying vsync window is open. Measured ~185 Hz MIPI_FS output on a 37 fps ev300 IMX335 (capped by the 1 ms openipc_frame_ts dedupe), against an expected 37 Hz.

Apply the same per-device `static bool s_vsync_was_set` 0→1 edge-detect to both the V4-shared mipi_rx_interrupt_route and its cv500 counterpart. The 1 ms dedupe in openipc_frame_ts remains as defence-in-depth, importantly covering cv500's known ~30–80 µs double-vsync quirk on ~4 % of frames.

Validated on av300/cv500 (IMX415 @ 20 fps): MIPI_FS now emits at a clean 50 ms cadence (it was previously held down only by the 1 ms dedupe cap). Paired sensor-readout-time measurement against ISP_FEND: 31.5 ms p50 across 342 frames (σ ≈ 0.1 ms).

ev300 re-test deferred: the board hit the issue #159 reboot-stress hang after this push and needs a power cycle to recover.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
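The per-(chn, event_type) dedupe window that backs up the edge-detect can be modelled as below. This is a hedged sketch, not the body of `openipc_frame_ts_push`: the names `dedupe_state` and `dedupe_accept` are invented for the example, the channel count of 8 is taken from the ring-sizing remark elsewhere in this PR, event types are assumed to be numbered 0 (FS) and 1 (FEND), and the window is assumed to be measured from the last accepted event.

```c
#include <stdbool.h>
#include <stdint.h>

#define FT_MAX_CHN    8          /* assumed channel count */
#define FT_NUM_EVT    2          /* assumed: 0 = MIPI_FS, 1 = ISP_FEND */
#define NSEC_PER_MSEC 1000000ull

/* Per-event-type windows quoted in the commits: FS 1 ms, FEND 25 ms. */
static const uint64_t dedupe_ns[FT_NUM_EVT] = {
    1 * NSEC_PER_MSEC,
    25 * NSEC_PER_MSEC,
};

struct dedupe_state {
    uint64_t last_ns[FT_MAX_CHN][FT_NUM_EVT]; /* time of last accepted event */
    bool     seen[FT_MAX_CHN][FT_NUM_EVT];    /* false until the first event */
};

/*
 * Returns true if the event should be delivered. An event is rejected
 * when it arrives inside the window of the previous accepted event with
 * the same channel AND the same type; different types never suppress
 * each other. now_ns is assumed to come from a monotonic clock.
 */
static bool dedupe_accept(struct dedupe_state *st, unsigned chn,
                          unsigned evt, uint64_t now_ns)
{
    if (chn >= FT_MAX_CHN || evt >= FT_NUM_EVT)
        return false;
    if (st->seen[chn][evt] && now_ns - st->last_ns[chn][evt] < dedupe_ns[evt])
        return false;

    st->seen[chn][evt] = true;
    st->last_ns[chn][evt] = now_ns;
    return true;
}
```

With these windows, a second vsync 50 µs after the first is rejected, while a FEND arriving shortly after an FS on the same channel still passes because the state is kept per event type.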
Three coupled changes that lift the practical ceiling from "leaks drops above 60 fps" to "clean across every supported single-stream mode":

- Per-channel ring depth bumped from 64 to 256 (≈ 0.5 s of buffer at 240 fps × 2 event types). Per-channel cost grows from 2 KiB to 8 KiB, which is negligible at 8 channels.
- Sample test opens the chrdev with O_NONBLOCK and drains in 64-event batches per read() syscall. This fixes two bugs:
  * the previous tight `while ((n = read(... sizeof(ev))) == sizeof(ev))` loop blocked in read() once the ring drained, so `-t N` never fired at idle, and it never fired once events started flowing either, because the drain loop never returned to the deadline check;
  * single-event reads added syscall overhead that was visible in drop counts at high fps.

  The batched read keeps the outer poll/deadline loop responsive at any event rate.
- README updated for the new depth.

Validated on lab hardware:

- av300 / cv500 / IMX415: <2 drops in 10 s across the imx415_i2c.ini, 60fps, 720p120fps and vga200fps sensor configs.
- ev300 / V4 / IMX335 5M default: 13 drops/s (was unbounded before).
- ev300 high-fps modes (1280x720@120, 800x480@240): ~800–1700 drops/s remain. These originate upstream of the chrdev (the level-held vsync bit toggles faster than the edge-detect can constrain at sub-ms sensor periods), not from a ring or reader limit. Separate follow-up.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
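The reader pattern described in this commit (non-blocking fd, one batched read() per wakeup, deadline re-checked on every iteration) might look roughly like the following. It is a sketch under assumptions, not the sample test itself: `drain_events` and `now_ms` are hypothetical helpers, and the code assumes every read() returns whole 32-byte records.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>   /* O_NONBLOCK, for the caller's open() or fcntl() */
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

#define EV_SIZE 32   /* event record size stated in the ABI section */
#define BATCH   64   /* events requested per read(), as in the sample test */

static long long now_ms(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/*
 * Drain a non-blocking fd until timeout_ms has elapsed or EOF is seen.
 * Returns the number of whole events consumed, or -1 on a hard error.
 * The deadline is re-evaluated before every poll(), so the loop exits
 * on time both when the fd is idle and when events arrive continuously.
 */
static long drain_events(int fd, int timeout_ms)
{
    unsigned char buf[EV_SIZE * BATCH];
    long long deadline = now_ms() + timeout_ms;
    long total = 0;

    for (;;) {
        long long left = deadline - now_ms();
        if (left <= 0)
            return total;

        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        int pr = poll(&pfd, 1, (int)left);
        if (pr < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (pr == 0)
            return total;              /* deadline reached while idle */

        ssize_t n = read(fd, buf, sizeof(buf));
        if (n > 0)
            total += n / EV_SIZE;      /* assumes whole records per read */
        else if (n == 0)
            return total;              /* EOF */
        else if (errno != EAGAIN && errno != EINTR)
            return -1;
    }
}
```

Issuing one read() per poll() wakeup, rather than looping on read() until it fails, is the design choice that keeps the deadline check reachable under sustained load.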
The single 'dropped' counter conflated two very different conditions:

1. **Ring overflow**: events arrived but no reader drained the chrdev in time. Genuine data loss; the consumer needs to size the ring or pace the reader.
2. **Dedupe rejection**: push() called within the per-event-type dedupe window of a previously-pushed event. NOT data loss; these are the level-held vsync / FEND duplicates the dedupe is there to absorb, and they grow naturally with sensor IRQ rate.

At low sensor rates the two are nearly equal (both ≈ 0). At high configured rates (ev300 800x480@240fps) the kernel emits ~1000 dedupe rejections per 10 s while ring overflow stays at 0. Under the old counter that looked like 1000 lost frames; it's actually 1000 correctly-filtered duplicate IRQ fires.

Split into two counters:

- `dropped` keeps its ABI but now only counts ring overflows.
- New `coalesced` counter + `OPENIPC_FT_IOC_GET_COALESCED` ioctl.

Sample test prints both, labelled. README updated to explain the distinction.

Verified across 5 ev300 sensor modes (5M default through 240fps) and 4 av300 modes (imx415 default through vga200fps): drops = 0 in all tested modes; coalesced grows monotonically with sensor IRQ rate, as expected:

    ev300 5M default    irq=49Hz  → fs=16Hz  drop=0  coal=0
    ev300 1944p@45      irq=90Hz  → fs=25Hz  drop=0  coal=0
    ev300 1080p@55      irq=101Hz → fs=28Hz  drop=0  coal=89
    ev300 720p@120      irq=173Hz → fs=35Hz  drop=0  coal=456
    ev300 800x480@240   irq=229Hz → fs=40Hz  drop=0  coal=1007

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
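The difference between the two counters can be made concrete with a small single-channel model of the push path. This is an illustrative sketch only: `struct ft_ring` and `ring_push` are hypothetical, the ring stores bare timestamps instead of full events, and the overflow policy (reject the newest event when the ring is full) is an assumption, since the text does not say which end the driver discards.

```c
#include <stdbool.h>
#include <stdint.h>

#define RING_DEPTH 256u  /* depth quoted in this PR; power of two */

struct ft_ring {
    uint64_t slots[RING_DEPTH]; /* timestamps only, to keep the model small */
    uint32_t head;              /* total events written */
    uint32_t tail;              /* total events consumed by the reader */
    uint64_t last_ns;           /* time of the last non-duplicate push */
    bool     seen;
    uint32_t dropped;           /* ring overflow: genuine data loss */
    uint32_t coalesced;         /* dedupe rejection: expected filtering */
};

/* Returns true if the event was stored in the ring. */
static bool ring_push(struct ft_ring *r, uint64_t now_ns, uint64_t window_ns)
{
    /* Duplicate inside the dedupe window: counted, but not a loss. */
    if (r->seen && now_ns - r->last_ns < window_ns) {
        r->coalesced++;
        return false;
    }
    r->seen = true;
    r->last_ns = now_ns;

    /* Ring full because the reader fell behind: this is real loss. */
    if (r->head - r->tail == RING_DEPTH) {
        r->dropped++;
        return false;
    }
    r->slots[r->head++ % RING_DEPTH] = now_ns;
    return true;
}
```

In this model a stalled reader only ever raises `dropped`, and a burst of retriggered interrupts only ever raises `coalesced`, which is the separation the commit introduces.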
widgetii added a commit that referenced this pull request on May 23, 2026
…182)

The hi3516cv200 family build also produces the firmware-shipped hi3518e_isp.ko for hi3518ev200 boards (same CHIPARCH=hi3516cv200, same open_isp.ko renamed at install time). The openipc_frame_ts_push call I added to the cv200 ISP_ISR in PR #178 adds a few µs to the ISR hot path which, on real hi3518ev200 hardware streaming with majestic, tips a latent i2c-from-hardirq race: kernel WARN at rtmutex.c:1545 via rt_mutex_trylock → hi_sensor_i2c_write → i2c_transfer; majestic loses sensor i2c and /image.jpg returns HTTP 000 / 10s timeout. This is a pre-existing backtrace surfaced by my timing shift; the openipc/firmware nightly built and shipped the regression to V2A boards before it was caught.

Comment out the cv200 ISP_ISR push call until:

- the i2c-in-IRQ path is fixed upstream, OR
- the hook is hardware-validated on both hi3516cv200 AND hi3518ev200 cameras under a real majestic stream.

The chrdev /dev/openipc-frame-ts still loads cleanly on these SoCs but emits no events; consumers see an empty stream and fall back to clock_gettime() (same as on a kernel without the modules).

README updated to list hi3516cv200/hi3518ev200 as "not validated" with a full explanation of the i2c race and the re-enable criteria.

Other SoC hooks (V4 ev200/ev300/gk7205v200, cv500/av300) are unchanged; they're hardware-validated and continue to fire.

Tracking: OpenIPC/firmware#2128 reverted the opensdk bump that shipped the regression to nightly users.

Co-authored-by: Your Name <you@example.com>
widgetii added a commit to widgetii/openhisilicon that referenced this pull request on May 24, 2026
…penIPC#183)

Unblocks the gate from OpenIPC#182 and re-enables the openipc_frame_ts MIPI_FS hook on cv200 / hi3518ev200 without bricking /image.jpg.

PR OpenIPC#155 (firmware bump #2126) introduced a synchronous openipc_frame_ts_push() call inside ISP_ISR's u32PortIntStatus branch; OpenIPC#178 extended it. The few µs added to the cv200 ISR top half tipped a timing race downstream in the vendor VPSS / VENC chn 1 startup, raising the /image.jpg HTTP-000 brick rate on hi3518ev200 from ~20 % (latent baseline) to ~60 %. firmware#2128 urgently reverted; OpenIPC#182 gated the cv200 hook off.

The brick mechanism is NOT the rt_mutex_trylock WARN at rtmutex.c:1545 that OpenIPC#183 originally blamed. We confirmed empirically: deferring the i2c writes to a workqueue (cv500-pattern) silences the WARN but makes the brick *worse* (40-60 % rate). The race is purely about µs cost in the cv200 ISR hot path before the synchronous ISP_IntBottomHalf runs.

Fix: replace the direct openipc_frame_ts_push() with tasklet_hi_schedule(). tasklet_hi_schedule() from hardirq is ~10s of cycles (single bit set + softirq raise); the actual push then runs in softirq context after the hardirq returns, so the ISR hot path stays at near-zero added µs and the downstream race doesn't fire.

Empirical validation on dlab hi3518ev200 (JXF22 sensor, kernel 4.9.37), 10 power-cycles per state, /image.jpg http_code via init-script-started majestic:

| state                                              | success | brick |
| baseline (no hook, with OpenIPC#182 gate on)       | 8/10    | 20%   |
| push direct in ISR + cv500-style i2c defer         | 4/10    | 60%   |
| tasklet defer for push + cv500-style i2c defer     | 6/10    | 40%   |
| this patch: tasklet defer alone (no i2c changes)   | 10/10   | 0%    |

The tasklet-defer-alone fix is BETTER than baseline (which still intermittently bricks 20 % from the pre-frame_ts era) AND keeps the frame_ts hook enabled on cv200.

Timestamp precision: the push reads sched_clock() at tasklet-run time (~µs after the IRQ rather than inside it). Negligible for 30 fps frame-edge events on a 33 ms cadence. openipc_frame_ts_push() already has a per-event-type dedupe window absorbing any coalescing from multiple ISR firings between tasklet runs.

The rt_mutex_trylock WARN still fires once per boot on cv200; it's cosmetic, not the brick cause. A cv500-pattern workqueue defer for i2c-from-hardirq to silence the WARN was prototyped and tested; it regresses the brick rate so it does NOT land here. Tracked as future work in openhisilicon#185.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
widgetii added a commit that referenced this pull request on May 24, 2026
Summary
Adds a second event source, ISP_FEND (the ISP front-end's "last sensor row received"), to `/dev/openipc-frame-ts` alongside the existing MIPI_FS event, then fixes the hardware quirks that prevented it from working on real silicon. Pairing both events frame-by-frame gives consumers the sensor readout duration directly.

Closes #176 (parent: FEND extension for sensor-readout-time telemetry) and #177 (cv500 FEND emits 1/8 of FS rate). The #177 hypothesis ("wrong IRQ resolved by name") turned out to be wrong: the root cause is the level-held raw-status bit, not IRQ routing.

Commits

- `cff4a3b`: Plumb ISP_FEND as a second event type through the chrdev: ABI bump (`event_type` field added to `struct openipc_frame_ts_event`), `OPENIPC_FT_IOC_SET_EVENT_MASK` ioctl, per-(channel, event_type) sequence counters, per-event-type dedupe in `openipc_frame_ts_push`. Hook from V4 and cv500 `ISP_ISR`. (Original widgetii branch, picked into this PR.)
- `28c200d`: Make FEND actually fire on hardware: edge-detect the raw FEND bit in `ISP_ISR` (V4 + cv500). The cherry-picked version above read the masked status, which the vendor HAL leaves at 0, so it never fired. Reading the raw status fires at IRQ rate (sticky bit). A per-pipe `static bool s_fend_was_set` catches just the 0→1 transitions.
- `e888ee0`: Symmetric fix for the MIPI/LVDS vsync bits in `mipi_rx_interrupt_route` (V4 + cv500). Same level-held pathology; measured ~185 Hz MIPI_FS on a 37 fps sensor before the fix.
- `e3e5da2`: High-fps headroom: per-channel ring depth 64 → 256 (≈ 0.5 s of buffer at 480 Hz). Sample test opens `O_NONBLOCK` and drains 64 events per `read()` syscall, fixing two latent bugs in the previous tight read loop (`-t N` never firing at idle; deadline check unreachable under load).
- `1d67816`: Split the single `dropped` counter into `dropped` (ring overflow, data loss) and `coalesced` (dedupe rejects, expected). New `OPENIPC_FT_IOC_GET_COALESCED` ioctl. Without the split, the legitimate filtering of level-held-bit duplicates at high IRQ rates was being misreported as data loss.

Root cause

Both `MIPI_CTRL_INT.vsync` / `LVDS_CTRL_INT.lvds_vsync` and `ISP_INT_FE.FEND` are level-held raw status bits, not pulses. Hardware re-asserts them after each W1C while the underlying condition still holds, so every ISR call during that window would otherwise fire. For FEND specifically, the vendor ISP HAL also leaves `ISP_INT_FE_MASK` at 0 by default, so the masked-status read returns zero. Both effects compound: the original cherry-pick never fired; reading the raw status fires at IRQ rate (hundreds of Hz).

Per-pipe boolean edge-detection in each ISR plus per-event-type dedupe in `openipc_frame_ts_push` solves it at typical sensor rates. The dedupe still trips at very high IRQ rates by design; that is now visible separately via the `coalesced` counter.

Test results
Drop / coalesce counters across full mode sweep (10 s steady-state per mode)
Zero drops in every tested configuration.
Sensor readout time (FEND.wall_ns − first-FS-in-cluster.wall_ns)
Numbers scale linearly with active-row count and inversely with line-clock — the physically expected sensor readout durations. p50−p5 typically < 50 µs.
CI green across all 22 jobs (8 SDK builds × 3 neo kernel variants, 8 QEMU boots, 3 library checks, IVE/NNIE regressions).
🤖 Generated with Claude Code