ee/lwip: enable LWIP_NETIF_LOOPBACK + bump MEMP_NUM_TCP_PCB #839
Draft
fjtrujy wants to merge 2 commits into
Bumps the vendored lwIP from STABLE-2_0_3_RELEASE to STABLE-2_2_1_RELEASE
(7+ years of upstream fixes / features) on both the IOP and EE
networking stacks.
ps2sdk-side adjustments to track the upstream API and tune for the
PS2 targets:
iop/tcpip and ee/network/tcpip lwipopts.h:
- LWIP_TCPIP_CORE_LOCKING=1 + LWIP_TCPIP_CORE_LOCKING_INPUT=1 (lwIP
2.2.1 default). Socket / netconn API calls take a binary-sem mutex on
the calling app thread instead of round-tripping through the tcpip
thread mailbox. lwIP releases the lock before any blocking I/O wait
so the tcpip thread + netif input still make progress. Saves a
context switch + sem wait per API call versus message passing.
- DEFAULT_THREAD_STACKSIZE=0x600 on IOP (matches the historical lwIP
2.0.3 budget; the deep socket call chains run on app threads under
core-locking, the tcpip thread only handles timers + tcpip_callback
dispatch, max ~450 B measured via -fstack-usage).
- LWIP_DHCP_DOES_ACD_CHECK=0 + LWIP_ACD=0: no AutoIP, controlled LAN,
the conflict-detection timer/code is dead weight here.
- Pruned three options removed/renamed in upstream:
LWIP_SOCKET_SET_ERRNO (gone since 2017's commit 0ee6ad0a),
DHCP_DOES_ARP_CHECK (renamed to LWIP_DHCP_DOES_ACD_CHECK in 2.2.0),
LWIP_DHCP_CHECK_LINK_UP (no longer referenced anywhere in 2.2.1).
ee/network/tcpip lwipopts.h MEM_ALIGNMENT:
- Drop from 64 (historical "EE cache design" value, attributed to
SP193) to 16, matching the alignment newlib's malloc returns on EE.
With MEM_ALIGNMENT=64, pbuf_alloc(PBUF_RAM) computed
    payload = LWIP_MEM_ALIGN(p + SIZEOF_STRUCT_PBUF + offset)
inside an allocation sized assuming p was already 64-byte aligned;
newlib only guarantees 16, set by newlib/newlib/configure.host:254
    mips64r5900*)
        machine_dir=r5900
        newlib_cflags="${newlib_cflags} -DMALLOC_ALIGNMENT=16"
        ;;
so the payload pointer slid past the allocation by up to 48 bytes
and scribbled into the next chunk's header — TLB misses inside
_malloc_r / _free_r after the first close cycle of any TCP server
(real hw locks up; PCSX2 logs the misses but limps on).
PBUF_POOL pbufs (the only ones touched by IOP->EE DMA + cache
invalidate) come from memp's static pools and are 64-byte aligned
regardless of MEM_ALIGNMENT, so the cache-design invariant the old
comment cited is preserved by memp, not by MEM_ALIGNMENT. The
IOP-side lwipopts has used MEM_ALIGNMENT=4 forever for the same
reason.
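The 48-byte figure above can be checked with a small host-side sketch. It models only the mismatch between the alignment the allocator guarantees and the alignment the consumer rounds up to, and assumes the pbuf header plus offset is already a multiple of the target alignment. The helper names `align_up` and `worst_case_slide` are illustrative and are not part of lwIP or ps2sdk.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (a power of two).
   This is the same kind of rounding LWIP_MEM_ALIGN performs. */
uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* Largest forward move of a pointer when the allocator only guarantees
   base_align but the consumer rounds up to want_align. Every residue a
   base_align-aligned address can have modulo want_align is tried. */
uintptr_t worst_case_slide(uintptr_t base_align, uintptr_t want_align)
{
    uintptr_t worst = 0;
    for (uintptr_t addr = 0; addr < want_align; addr += base_align) {
        uintptr_t slide = align_up(addr, want_align) - addr;
        if (slide > worst) {
            worst = slide;
        }
    }
    return worst;
}
```

Under these assumptions, `worst_case_slide(16, 64)` gives 48 and `worst_case_slide(16, 16)` gives 0, which is why matching MEM_ALIGNMENT to malloc's guarantee removes the overshoot.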
iop/tcpip API:
- tcpip_callback_with_block was promoted to tcpip_callback (the macro
became the real function in lwIP 2.1.0); update exports.tab,
imports.lst, and call sites.
- iop/network/smap/src/imports.lst: corresponding rename for the
netif callbacks the smap driver imports from ps2ip-nm.irx.
- ps2ip.c: define `int errno` in .data so socket error paths have
somewhere to write to (lwIP 2.2.1 always writes errno; the section
attribute was already corrected in the previous commit).
iop/tcpip-base/sys_arch.c:
- Add sys_mbox_trypost_fromisr stub (new in lwIP 2.2.1, called from
tcpip_input when LWIP_TCPIP_CORE_LOCKING_INPUT=0; harmless in our
build but the symbol must be present).
ee/network/tcpip/src/sys_arch.c:
- Replace the previous DI/EI-based sys_arch_protect with a per-thread
recursive semaphore. The DI/EI variant deadlocked the EE: any code
path inside a SYS_ARCH_PROTECT region that ended up calling newlib's
malloc/free would WaitSema on the heap recursive mutex with
interrupts disabled. The sema-based variant lets nested waits work
normally and removes the EE-specific incompatibility between lwIP
and any other library that uses newlib's locks. lwIP allows
SYS_ARCH_PROTECT to nest, so the implementation tracks the owning
thread + a recursion counter and only Wait/Signal on the outermost
transitions.
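A minimal host-side sketch of the owner-plus-counter scheme described above, not the actual ps2sdk code. It assumes pthreads as a stand-in for the EE kernel semaphore (WaitSema/SignalSema) and uses the address of a thread-local variable as a stand-in for the thread ID; the names `sketch_protect`, `sketch_unprotect`, and `self_id` are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

/* Stand-in for the EE semaphore guarding the protected region. */
static pthread_mutex_t protect_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_uintptr_t protect_owner; /* 0 means nobody holds the lock */
static int protect_depth;              /* only touched by the current owner */

/* Stand-in for a thread ID: the address of a thread-local object is
   unique per live thread and never 0. */
static _Thread_local char thread_marker;
static uintptr_t self_id(void) { return (uintptr_t)&thread_marker; }

/* Enter the protected region; returns the nesting depth after entry. */
int sketch_protect(void)
{
    uintptr_t me = self_id();
    if (atomic_load(&protect_owner) != me) {
        /* Outermost entry for this thread: actually wait on the lock. */
        pthread_mutex_lock(&protect_lock);
        atomic_store(&protect_owner, me);
    }
    /* Nested entry by the owner only bumps the counter. */
    return ++protect_depth;
}

/* Leave the protected region; only the outermost exit releases the lock. */
void sketch_unprotect(void)
{
    assert(atomic_load(&protect_owner) == self_id() && protect_depth > 0);
    if (--protect_depth == 0) {
        atomic_store(&protect_owner, 0);
        pthread_mutex_unlock(&protect_lock);
    }
}
```

Because a thread can only observe its own ID in `protect_owner` if it stored it there itself, the unlocked owner check is safe, and a nested call never waits on a lock the caller already holds.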
iop/tcpip/tcpip/Makefile + ps2api_IPV4 list:
- acd.c (new in lwIP 2.2.0) intentionally not added; LWIP_ACD=0 makes
it a no-op TU and we'd rather drop the few KB outright.
Verified on real hardware: ps2link boots, execee + reset cycle works
repeatedly, IOP printf-over-UDP (KPRTTY) coexists cleanly with module
loading. EE-side TCP server (ps2_drivers/samples/tcp_server_ee) and
mongoose-based ps2_http both sustain 100/100 burst tests with 0 TLB
misses, exercising the EE lwipopts MEM_ALIGNMENT and sys_arch.c
adjustments above.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
LWIP_NETIF_LOOPBACK=1 lets the EE-side lwIP route packets sent to a
netif's own IP back through the netif's input path, so an EE app can
act as both client and server to itself for testing without needing
inbound TCP delivery from the host (which PCSX2 Sockets mode doesn't
provide). Required for ps2_drivers' tcp_echo_test_ee sample.

Forces LWIP_HAVE_LOOPIF=0 explicitly: lwIP would otherwise
auto-default it to (LWIP_NETIF_LOOPBACK && !LWIP_SINGLE_NETIF) = 1,
which creates a 127.0.0.1 netif at init and breaks DHCP DISCOVER
routing in PCSX2.

Bumps MEMP_NUM_TCP_PCB from the lwIP default of 5 to 32. The default
is way too small for a server pattern: each completed request leaves
the closing-side pcb in TIME_WAIT for 2*MSL (~60 s) holding a slot,
plus there's always one slot for the listening pcb itself. With only
4 effective slots an HTTP server runs out as soon as a few
back-to-back requests close. 32 leaves headroom for sustained traffic
plus the SYN_RECV in-flight state for a burst.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
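For orientation, the three settings this commit message describes would look roughly like the fragment below in the EE-side lwipopts.h. This is a sketch reconstructed from the description, not the actual diff; the option names are lwIP's own, but placement and comments are assumptions.

```c
/* EE-side lwipopts.h fragment (sketch based on the commit description) */

/* Deliver packets addressed to a netif's own IP back through its input path. */
#define LWIP_NETIF_LOOPBACK 1

/* Set explicitly: left undefined, lwIP would default this to
   (LWIP_NETIF_LOOPBACK && !LWIP_SINGLE_NETIF) and create a 127.0.0.1 netif. */
#define LWIP_HAVE_LOOPIF 0

/* Up from the lwIP default of 5, for headroom while closed pcbs sit in TIME_WAIT. */
#define MEMP_NUM_TCP_PCB 32
```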
Summary
Two small EE-side lwipopts.h tunings, kept separate from the lwIP 2.2.1 upgrade (#838) so each can be evaluated and reverted on its own. Stacked on top of #838.
Changes
LWIP_NETIF_LOOPBACK = 1 (with explicit LWIP_HAVE_LOOPIF = 0)
Enables in-place loopback delivery: when an EE-side app sends to its own netif IP, lwIP loops the packet through the netif's input callback rather than emitting it on the wire. Useful for self-tests where an EE app acts as both client and server (notably ps2_drivers/samples/tcp_echo_test_ee, our diagnostic loopback test).
LWIP_HAVE_LOOPIF is forced to 0 because lwIP would otherwise auto-default it to (LWIP_NETIF_LOOPBACK && !LWIP_SINGLE_NETIF) = 1, which auto-creates a 127.0.0.1 netif at init. We don't need a separate loopback netif when in-place loopback delivery is enough, and forcing it off keeps init lean.
Cost: struct netif gains two pointers (loop_first, loop_last) when LWIP_NETIF_LOOPBACK=1, which is +8 bytes per netif on 32-bit. With one SMAP netif, +8 bytes static. CPU cost is a single "is dst == our netif IP?" check on outbound ip4_output_if_src, which is negligible.
MEMP_NUM_TCP_PCB = 32 (up from lwIP default 5)
Defensive headroom for server workloads where the server does the active TCP close (each closed pcb sits in TIME_WAIT for 2×MSL ≈ 60 s holding a slot). The default of 5 is enough for the common case where the client closes first (e.g. our http_burst.py tests via http.client), but a server-active-close pattern exhausts it after a handful of fast back-to-back requests.
Cost: each struct tcp_pcb is roughly 140 B on the EE-side build, so 27 extra entries cost 27 × 140 B ≈ 3.8 KB of BSS in the MEMP_TCP_PCB pool, which is set up at lwip_init. No CPU cost: memp_malloc is O(1).
Why not in #838?
#838 is the strict upgrade-only PR ("upgrade to lwIP 2.2.1, nothing else"), which is much easier to review and revert if a regression surfaces. These two tunings are:
Asymmetry note
If the PCB bump is desired, a future change should mirror it on the IOP-side lwipopts.h for parity (the IOP-side ps2_http path benefits equally). Held off here to keep this PR focused on a single side.
Test plan
- tcp_echo_test_ee: 100+ self-loopback iterations, all OK.
- ps2_http: 30/30 HTTP 200 with this branch (also passes without it; this just doesn't regress).
🤖 Generated with Claude Code