Limitations and Roadmap

Julius Bairaktaris edited this page Jun 21, 2026 · 5 revisions

What this stack deliberately does not do, what is deferred and why, and findings that matter beyond this project.

Not supported (by design, for now)

Hardware bridge offload (nss-bridge-mgr)

Bridged (same-L2) traffic between ports is forwarded by the Linux bridge in software; ECM still accelerates routed/NATed flows. The vendor nss-bridge-mgr was audited and deliberately not ported:

  • it makes ~47 fal_* calls into qca-ssdk (plus vlan-mgr's ~25, which it requires, plus bonding hooks) — none of which exist on the upstream qca_ppe driver, which has no ACL layer;
  • the firmware accepts only one port per VSI (see Architecture), so shared-bridge-VSI semantics need a real redesign through the firmware's bridge interface, not a port of the vendor code.

Software bridging holds up in soak tests at multi-hundred-Mbit rates; revisit hardware bridge offload only if a concrete same-subnet throughput need arises.

Multicast (routed) offload

Routed multicast is not offloaded: ECM is built without ECM_MULTICAST_ENABLE, so IGMP/MLD-driven multicast forwarding stays on the host slow path (visible as v4/v6_mcast_feature_disabled in the ECM exception stats). Unicast routed/NATed flows are unaffected.

The throughput-relevant case for this feature is high-rate routed multicast such as operator IPTV. Enabling it means building ECM with ECM_MULTICAST_ENABLE=y, running a multicast routing daemon (igmpproxy/omcproxy), and validating the firmware multicast connection path (the ppe conn mc * counters, currently unused). The firmware supports it; the remaining work is host-side plumbing plus a validation gate.

Wi-Fi offload (ath11k) — integrated, with bounds

Integrated: the wifili data path runs both Wi-Fi radios (QCN5024 + QCN5054) on the NSS cores; see Architecture and Runtime Operation. Bounds:

  • NSS mesh offload is not wired up — it requires NSS firmware 11.4, and this tree ships 12.5. Plain (host) mesh still works.
  • Changing nss_offload requires re-probing the radio (the bring-up script does a platform unbind/bind); it is not a live toggle.
  • The carried Wi-Fi patch set (from the vendor lineage) was audited against pristine backports 6.18.26: the verifiably obsolete and dead patches have been dropped (89 remain), the mesh chain and the stats/diagnostics group are droppable with quilt regeneration, and a dozen generic ath11k fixes are candidates for upstream submission independent of NSS.

Bonding/LAG, tunnels, IPsec/DTLS, vendor netlink

Compiled out or not packaged. The kernel-side bonding hooks come from QSDK kernel patches this tree does not carry; the tunnel/crypto managers depend on nss-crypto/cfi stacks that were out of scope. PPTP and L2TPv2 ECM interface support is compiled out (no kernel hooks); plain PPPoE offload is fully supported.

On nss-crypto/cfi specifically: their only meaningful consumer is NSS IPsec (ESP) offload, and IPsec/DTLS/TLS are marked unsupported for IPQ807x in the vendor support matrix anyway, so there is no working high-rate IPsec path to feed. The NSS crypto engine accelerates AES/3DES/SHA only — not ChaCha20-Poly1305, so it cannot offload WireGuard. And the Cortex-A53 cores expose the ARMv8 crypto extensions (aes pmull sha1 sha2 in /proc/cpuinfo), so the CPU already does AES/SHA in hardware — for the router's own light TLS and modest crypto, inline CPU crypto beats the descriptor round-trip to an offload engine. Net: nss-crypto/cfi is worth revisiting only for a working, high-throughput IPsec deployment, which this platform/firmware does not currently provide.
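The CPU-side half of that argument can be checked on a given board by looking for the four flags named above in the Features line of /proc/cpuinfo. The following is a minimal sketch: the helper name missing_crypto_flags and the sample text are illustrative and not part of this tree.

```python
# Flags named in the text above; everything else here is illustrative.
REQUIRED_FLAGS = {"aes", "pmull", "sha1", "sha2"}

# Example of what one core's Features line may look like (sample only).
SAMPLE = "processor\t: 0\nFeatures\t: fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid\n"


def missing_crypto_flags(cpuinfo_text):
    """Return the required flags not reported by every core, sorted."""
    present = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "features":
            flags = set(value.split())
            # Keep only flags that every core reports.
            present = flags if present is None else present & flags
    if present is None:
        # No Features line at all: treat every flag as missing.
        return sorted(REQUIRED_FLAGS)
    return sorted(REQUIRED_FLAGS - present)
```

On the device, pass it the contents of /proc/cpuinfo; an empty list means all four extensions are reported on every core.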

ECN marking from the firmware CoDel

The 12.5 firmware does not ECN-mark (verified at the firmware level; see SQM and Shaping). Not fixable host-side.

Known cosmetic issues

  • rmmod qca-nss-drv triggers a stock-driver regulator_put refcount WARN in devres teardown. Harmless; latent upstream bug (the vendor stack never rmmods in production).
  • Under bursty load the MEDIUM profile logs occasional N2H payload-allocation failures (hundreds per tens of millions of packets); no measured throughput effect.
  • A WARN_ON_ONCE in skb_try_coalesce (net/core/skbuff.c) can fire from the NSS RX delivery path (nss_core_handle_napi_queue → GRO), most likely because of a pp_recycle flag mismatch between firmware-delivered skbs and the host coalesce path. It fires at most once per boot, is non-fatal, and has no observed effect on forwarding; noted against the driver's skb handling and tracked for upstream.

Findings relevant to upstream (PR #22381)

Found during this work and relevant even without NSS; tracked for reporting upstream:

  1. qca_edma misc-interrupt handling can storm. Probe leaves EDMA_REG_MISC_INT_MASK = 0x1ff with a handler that reads status but never acks or masks; any sticky misc condition (e.g. flow-control trips) becomes an IRQ storm that wedges the SoC without an oops. Fixed in this tree (mask 0 + self-disarming handler).
  2. edma_irq_disable_all() overruns the TX interrupt register space on IPQ807x (8 TX-interrupt slots; the loop wrote into RXFILL registers). Fixed in this tree (bounded loop).
  3. edma_hw_stop() clears global state (PORT_CTRL, all rings), which goes beyond what the host owns. In this tree it stops host rings only.
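
Finding 1 can be illustrated with a toy model of a level-triggered interrupt line. This is not the qca_edma driver code: the class, the handler names and the iteration cap are hypothetical, and only the 0x1ff mask value comes from the finding. The model shows why a handler that only reads a sticky status keeps being re-entered, and why either masking at probe or a self-disarming handler ends the loop.

```python
class MiscIrq:
    """Toy level-triggered interrupt source with sticky status bits."""

    def __init__(self, mask):
        self.mask = mask    # enabled sources (stands in for EDMA_REG_MISC_INT_MASK)
        self.status = 0     # sticky: a bit stays set until explicitly cleared

    def trip(self, bit):
        """Latch a misc condition, e.g. a flow-control trip."""
        self.status |= 1 << bit

    def asserted(self):
        """The line is asserted while any enabled source is latched."""
        return (self.status & self.mask) != 0


def handler_read_only(irq):
    """Reads status but neither acks nor masks (the problematic shape)."""
    _ = irq.status


def handler_self_disarming(irq):
    """Masks off every source that fired, so the line deasserts."""
    irq.mask &= ~irq.status


def count_invocations(irq, handler, cap=1000):
    """Re-enter the handler while the line is asserted, up to a safety cap."""
    calls = 0
    while irq.asserted() and calls < cap:
        handler(irq)
        calls += 1
    return calls
```

With the mask at 0x1ff and one latched bit, the read-only handler runs until the cap, which is the model's stand-in for a storm. The self-disarming handler runs once, and a mask of 0 never invokes the handler at all.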

Roadmap

  • Final-spec release image for the validated device class (hardened toolchain, LuCI, the SQM stack preconfigured) and a re-run of the full gate suite on it.
  • Wi-Fi patch-set reduction: drop the audited obsolete/no-op patches and the mesh chain; submit the generic ath11k fixes to openwrt:main / linux-wireless.
  • A/B benchmark vs the vendor-driver stack (same device, qosmio tree image) for throughput, SQM grades and Wi-Fi performance.
  • Upstreaming: feed the three qca_edma findings above into PR #22381 review; long-term, the glue module could become a proper in-tree consumer of a stable redirect API.
  • More boards: the DT and drivers cover the whole IPQ807x family; per-board validation reports are the missing piece.
  • Offload coverage vs the vendor NSS support matrix: this build offloads NATed IPv4, routed IPv6, PPPoE(-over-VLAN), 802.1Q VLAN flows, DSCP/mark classification, IGS ingress shaping and both Wi-Fi radios. The vendor extras not carried — routed multicast and hardware bridge offload (both above), plus the tunnel/VPN managers (GRE, L2TPv2, PPTP, MAP-T, 6RD, VXLAN, SIT, IPsec, bonding) — each sit behind their own kernel hooks/managers and are add-on items gated on a real need and a validation pass, not missing core function.