Release v3.3.0 - April 30, 2026 #95
jnilo1 announced in 5. 📢 Announcements (Maintainers only)
A reliability release for Thread / OT-RCP users: closes #89 (`HandleRcpTimeout()` after ~1 h at 460800 baud) at the operating-system layer. It also bundles two unrelated `8250` driver hardenings from an out-of-band audit and a substantial refactor of `flash_efr32.sh` that resolves long-standing UX issues around switching radio modes.

## Hardware UART flow control: the actual fix for #89
If you saw repeated `HandleRcpTimeout()` / `Failed to communicate with RCP` errors in `otbr-agent` logs after roughly an hour at 460800 baud, this release fixes that.

Root cause: `S70otbr` was opening `/dev/ttyS1` with default termios (no `CRTSCTS`), so the 8250 kernel driver never engaged its hardware flow-control bit. The gateway therefore ran without RTS/CTS at all. At 460800 baud the 16-byte RX FIFO fills in ~280 µs; under bursty Spinel traffic, Matter commissioning attestation in particular, the kernel could not drain it fast enough and the FIFO overran. Lost bytes corrupted HDLC frames, and `otbr-agent` lost sync and eventually gave up.

@olivluca's 14-h baseline at 115200 baud was clean because at that rate the FIFO budget is ~1.1 ms, enough to absorb burst latency without flow control. That data point split the search space and pointed straight at flow-control configuration as the culprit.
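The FIFO windows quoted above follow from simple line-rate arithmetic. The sketch below reproduces them with a hypothetical helper, `fifo_fill_us`, assuming the quoted figures count 8 bits per byte. Passing 10 bits per byte models 8N1 framing (start and stop bits); that framing is an assumption about the link, not something stated in this release, and it makes each window somewhat longer.

```shell
# fifo_fill_us FIFO_BYTES BAUD BITS_PER_BYTE
# Prints the time, in whole microseconds (rounded down), that a UART RX FIFO
# takes to fill when bytes arrive back to back at line rate.
fifo_fill_us() {
    echo $(( $1 * $3 * 1000000 / $2 ))
}

fifo_fill_us 16 460800 8     # 277  (the "~280 µs" figure)
fifo_fill_us 16 115200 8     # 1111 (the "~1.1 ms" figure)
fifo_fill_us 16 460800 10    # 347  (assuming 8N1 framing)
fifo_fill_us 16 115200 10    # 1388 (assuming 8N1 framing)
```

Either way the ratio between the two baud rates is 4x, which is why the 115200 baseline could absorb latency spikes that overran the FIFO at 460800.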
The fix is one line in `S70otbr`: append `&uart-flow-control=true` to the spinel radio URL. `otbr-agent` now sets `CRTSCTS`, the 8250 core sets `MCR_AFE` (bit 29 of the 32-bit MCR alias on this SoC), and the hardware automatically uses RTS to throttle the EFR32 when the FIFO approaches full. Validated with a continuous run of more than 3 h, with two Matter sleepy devices paired and two consecutive commissioning bursts: zero overruns.

To verify the fix landed on a running v3.3.0 gateway:
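One possible way to check, sketched under assumptions: the helpers `url_has_flow_control` and `uart_overruns` are hypothetical (they are not shipped with the gateway), and the overrun parser assumes the port is listed in `/proc/tty/driver/serial` in the usual Linux serial-core format, where an `oe:N` field appears only once overruns have been counted.

```shell
# url_has_flow_control CMDLINE
# Succeeds if the given otbr-agent command line carries the flow-control flag.
url_has_flow_control() {
    case "$1" in
        *uart-flow-control=true*) return 0 ;;
        *) return 1 ;;
    esac
}

# uart_overruns LINE
# Reads text in /proc/tty/driver/serial format on stdin and prints the overrun
# counter ("oe:N") for the given line number, or 0 if the field is absent.
# Exits non-zero if the line number is not listed at all.
uart_overruns() {
    awk -v line="$1:" '
        $1 == line {
            n = 0
            for (i = 2; i <= NF; i++)
                if ($i ~ /^oe:[0-9]+$/) n = substr($i, 4)
            print n
            found = 1
        }
        END { if (!found) exit 1 }
    '
}

# On the gateway (assumes /proc is mounted and busybox provides pidof and tr):
#   url_has_flow_control "$(tr '\0' ' ' < "/proc/$(pidof otbr-agent)/cmdline")" \
#       && echo "flow control requested"
#   uart_overruns 1 < /proc/tty/driver/serial
```

A count that stays at 0 across a commissioning burst is consistent with the flow control doing its job; a rising count suggests the flag is not taking effect.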
## Kernel `8250_rtl819x.c`: defensive layers + audit hardening

Four changes in the custom RTL8196E UART driver. Two are belt-and-suspenders for #89; two are unrelated audit findings bundled in because we were touching the same file.
Defensive layers for #89. With `MCR_AFE` engaged the FIFO can't overrun by spec, but the path is now hardened so a future config drift can't reintroduce the failure mode silently:

- RX FIFO trigger level set to `UART_FCR_R_TRIG_01`. Gives the kernel ~210 µs of IRQ latency budget at 460800 baud instead of ~140 µs, absorbing scheduling spikes on the single-core 200 MHz Lexra.
- MCR read-modify-write now runs under `port->lock` (irqsave). Closes a race where the 8250 core's byte-level MCR writes (`serial8250_set_mctrl`, `em485_stop_tx`) could clobber the AFE bit between our `readl()` and `writel()`. Audit finding 8250RTL-003.

Audit hardening, orthogonal to #89:
- The `realtek,syscon` device-tree phandle is now mandatory (audit 8250RTL-001). If it's missing at probe time, the driver fails explicitly instead of registering a `ttyS1` that looks usable but has no signal on the EFR32 pins.
- The port is now pinned to `ttyS1` (audit 8250RTL-002). The kernel UART bridge, `S50uart_bridge`, `S70otbr`, and `radio.conf` all assume `/dev/ttyS1`; accepting an opportunistic line would silently mis-wire the bridge.

## `flash_efr32.sh`: switch-mode UX, baud sweep, hardening

Substantial robustness pass on the EFR32 over-the-air flash script. Resolves three UX items that had piled up over the v3.0–v3.2 cycle:
- `radio.conf` reconciliation. Editing `/userdata/etc/radio.conf` to switch modes (Zigbee ↔ OTBR) used to require a manual sysfs rearm before the script would work. Now the script disarms and rearms the in-kernel UART bridge at the new baud and `flow_control` automatically.
- Baud sweep. If the probe at the `radio.conf` baud fails, the script now tries `{115200, 230400, 460800, 691200, 892857}` in turn (skipping the one already tried) before giving up. The previous code only tried 115200, missing the inverse case (`radio.conf`=115200 but the chip really at 460800 from a prior test).
- `./flash_efr32.sh -g IP otrcp` Just Works on a Zigbee-installed gateway. The two changes above, combined with the existing post-flash `radio.conf` write-back, mean no more manual `radio.conf` edits or sysfs gymnastics to switch radio mode.

Plus a handful of smaller robustness wins, bundled here while the script was being touched:
- `--firmware-file PATH` option for explicit GBL selection, useful for testing custom builds outside the repo's `firmware/` tree. Refuses ambiguous matches (the old `ls -t … | head -1` silently picked the most recent by mtime, which could hide a stale or wrong image).
- If `universal-silabs-flasher` in the pinned venv is not 1.0.3, the script reinstalls it before use and aborts if the reinstall didn't take. Prevents the probe-method CLI drift bugs that motivated the venv pin in the first place (Usage: universal-silabs-flasher [OPTIONS] COMMAND [ARGS]... #92).
- `assert_bridge_idle()` race protection: rechecks that no TCP client has grabbed `:8888` between detection and the actual flash.
- `tail -1` defensive parse on the `FIRMWARE_BAUD` lookup, so duplicate keys in `radio.conf` (manual edits, stale migrations) resolve to the last value rather than the first. The same defensive pattern was added to `S70otbr` and `S50uart_bridge`.

## Known issue: `S40button` SIGSEGV (intermittent)

`S40button` may receive a SIGSEGV from busybox after some hours of idle polling. It was observed once in ~1 h 21 m of dev-box uptime and is not reproducible on demand. The shell process dies, and the long-press → `recover_efr32` recovery surface is silently lost until the next reboot. The radio path (`otbr-agent`, `S50uart_bridge`, `S70otbr`) is independent and unaffected. The issue is pre-existing, not introduced by v3.3.0. A follow-up release will add a supervisor that respawns `S40button` if it exits unexpectedly.

If you want to check whether your gateway has lost the long-press handler:
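A minimal sketch of such a check, assuming the handler shows up in `ps` output with `S40button` somewhere in its command line (the exact process name on the gateway is an assumption). The helper `button_handler_running` is hypothetical; it reads a process listing on stdin so it can be run against saved output as well as a live `ps`.

```shell
# button_handler_running
# Reads `ps` output on stdin and succeeds if any line mentions S40button.
# The [S] bracket keeps this grep from matching its own entry in a live listing.
button_handler_running() {
    grep -q '[S]40button'
}

# On the gateway:
#   ps | button_handler_running && echo "long-press handler alive" \
#       || echo "long-press handler gone"
```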
A `reboot` brings it back; in the meantime, recovery is still available via `ssh root@<gw> recover_efr32`, or via `flash_install_rtl8196e.sh` for a full re-flash.

## Upgrade
In-place upgrade: no full flash is required from v3.2.x. The existing `radio.conf` is preserved across the upgrade, so there is no migration friction. The `uart-flow-control=true` flag activates on the first reboot after userdata is updated.

If you'd rather not type the password during an unattended upgrade:
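A hedged sketch of one way to do this with `sshpass -e`, which reads the password from the `SSHPASS` environment variable. The wrapper `run_unattended` and the `GW_PASSWORD` variable are hypothetical names, and the upgrade command itself is left as a placeholder because it depends on your setup.

```shell
# run_unattended COMMAND [ARGS...]
# Runs an ssh-based command through sshpass, passing the gateway password via
# the SSHPASS environment variable (sshpass -e) rather than on the command line.
# GW_PASSWORD is a hypothetical variable holding the gateway root password.
run_unattended() {
    SSHPASS="$GW_PASSWORD" sshpass -e "$@"
}

# Example (placeholder host and command):
#   GW_PASSWORD='...' run_unattended ssh root@<gw> <upgrade command>
```

Keeping the password in an environment variable avoids exposing it in the process list, though it is still stored in plain text wherever you set it.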
(Requires `sudo apt install sshpass`.)

## Acknowledgments
- The `radio.conf` mandatory work (radio.conf not written #93, v3.2.1) made this release simpler to ship.

## Full changelogs