Wire struct guards#408
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request implements robust safeguards against silent client freezes caused by packet structure size mismatches. By integrating OpenMU's packet definitions directly into the build process and enhancing runtime error reporting, the changes ensure that any drift in packet structure size is caught either at compile-time or clearly logged at runtime, significantly improving system stability and debuggability. Highlights
New Features🧠 You can now enable Memory (public preview) to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Ignored Files
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces automated wire-size validation for network packets by integrating OpenMU's authoritative XML definitions as a submodule and adding a Python script to generate static_assert guards. It also enhances the logging system to ensure MCD_ERROR messages are always captured and improves the safe_cast utility to provide detailed feedback on packet size mismatches. A potential security issue was identified regarding the use of vswprintf, which may not guarantee null-termination if the output exceeds the buffer size, potentially leading to a buffer over-read.
Adds two defenses for the bug class behind PR sven-n#402 / fixed by sven-n#404: 1. static_assert(sizeof == N) on PRECEIVE_CREATE_CHARACTER (19), PRECEIVE_JOIN_MAP_SERVER_EXTENDED (92), and PRECEIVE_REVIVAL_EXTENDED (36). If the #pragma pack(push, 1) ever gets dropped or a field is added/reordered, the build now breaks at the offending struct with a pointer to check the pragma. Wire sizes match OpenMU's XML schema (CharacterInformationExtended, RespawnAfterDeathExtended). 2. Named overload safe_cast<T>(span, "TYPE_NAME") that logs an MCD_ERROR with packet type + received vs expected size on failure. Used at the join-map receive site. The dispatcher also logs the user-visible symptom ("loading screen will appear frozen") so a future drift turns a silent freeze into a one-line console diagnostic.
Older Windows CRTs (msvcrt.dll pre-VS2015) routed via vswprintf do not recognise the C99 %zu length modifier. Cast the size_t arguments to unsigned and use %u; packet sizes always fit in 32 bits on this i686 target. Per Gemini review on PR #66.
Merge the two safe_cast overloads into a single template with an optional packet_type. Now every safe_cast<T>(span) call site -- not just the ones updated to pass a name -- emits an MCD_ERROR on size mismatch. When no name is supplied, typeid(T).name() (mangled but readable) is used as a fallback. Keeps g_ConsoleDebug rather than stderr (stderr is invisible in the GUI client). Addresses follow-up review feedback.
Two coupled fixes that make MCD_ERROR diagnostics actually visible in Release builds: 1. CmuConsoleDebug::GetInstance() now always returns a valid static instance. The previous behaviour (return nullptr unless CSK_LH_DEBUG_CONSOLE was defined) made every g_ConsoleDebug->Write call site a null-pointer member-function call. It "worked" only because the Write body was empty without CONSOLE_DEBUG; any code added there would have crashed in Release. 2. Write() now routes MCD_ERROR to g_ErrorReport.Write (file-based, always compiled) before the existing CONSOLE_DEBUG-gated path. MCD_SEND / MCD_RECEIVE / MCD_NORMAL stay debug-only so production logs don't get flooded. Net effect: any MCD_ERROR call site -- including the safe_cast size mismatch diagnostics added in PR #66 -- now lands in MuError.log even in Release builds.
Replace the three hand-typed static_asserts with a generated header sourced from OpenMU's authoritative packet schema. Same protection against the PR sven-n#402 class of bug, but maintained automatically. Pieces: - third_party/OpenMU: git submodule pinned to 1a56f6dd5. - tools/gen_wire_sizes.py: parses ServerToClientPackets.xml, joins against a hand-maintained packet-name mapping, emits wire_sizes.generated.h with one static_assert(sizeof(X) <= N) per mapped packet. The mapping is currently the three structs PR #66 covered by hand -- adding more is a one-line append. - src/CMakeLists.txt: add_custom_command runs the generator when the XML changes; output goes to ${CMAKE_BINARY_DIR}/generated/, which is added to the include path. Main depends on GenWireSizes. - src/source/Network/Server/WSclient.h: drop the three hand-written asserts; #include the generated header at end-of-file. We assert `<=` rather than `==` because some client structs intentionally don't decode all trailing fields (e.g. PRECEIVE_CREATE_ CHARACTER doesn't decode PreviewData, so its sizeof is 19 vs the wire length 42). The actual safety property is "client struct must fit in wire packet so safe_cast succeeds," which `<=` expresses correctly. To add coverage for another packet: append a tuple to PACKET_MAPPING in tools/gen_wire_sizes.py and rebuild.
The codegen step in src/CMakeLists.txt depends on the OpenMU packet XML, which lives in the third_party/OpenMU submodule added by the previous commit. CI was running actions/checkout@v4 without `submodules: recursive` so the submodule directory was empty and the build broke at the codegen step (cannot find ServerToClientPackets.xml).
The previous codegen commit recorded the submodule pointer at master HEAD at the time of `git submodule add` (d6689f04), not the commit I actually checked out and built against locally (1a56f6dd5). Local builds worked because the working tree was at the right SHA; CI failed because it clones the recorded pointer, which pointed at a tree where ServerToClientPackets.xml is at a different path/state. Pinning to 1a56f6dd5 -- the OpenMU master HEAD on 2026-05-21 -- which matches the wire sizes the static_asserts encode (CharacterInformation Extended = 92, RespawnAfterDeathExtended = 36, CharacterCreation Successful = 42).
Per Gemini review on PR #66 (high-priority security finding). The 3-arg vswprintf(buf, fmt, args) is a Microsoft extension with no buffer-size parameter -- a sufficiently long format expansion can overflow the fixed 256-wchar buffer. Switch the new MCD_ERROR path in CmuConsoleDebug::Write to the C99 4-arg form vswprintf(buf, n, fmt, args), matching the convention already used by CErrorReport::Write. Also cap the type-name field in LogSafeCastSizeMismatch with %.64hs so even a pathological typeid(T).name() expansion cannot bloat the format output. Defense in depth at the format-string level on top of the buffer-size fix. The pre-existing 3-arg vswprintf in the legacy CONSOLE_DEBUG-gated console output is left alone (out of scope for this PR).
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
The wire-size codegen needs OpenMU's authoritative ServerToClientPackets.xml. The previous setup vendored the entire OpenMU repository as a git submodule (third_party/OpenMU) -- hundreds of MB of C# server, tooling, and tests just to read one XML file. The .NET ClientLibrary already pulls MUnique.OpenMU.Network.Packets via NuGet (see MUnique.Client.Library.csproj). That package ships all four packet XMLs (ClientToServer, ServerToClient, ChatServer, ConnectServer) under contentFiles/. Sourcing from there means a single versioned dependency (0.9.9) instead of vendoring the entire OpenMU repo, and removes the "MuMain's wire definitions are coupled to a specific OpenMU server commit" framing in favour of a normal package upgrade flow. Changes: - src/CMakeLists.txt: hoist DOTNET_EXECUTABLE detection, mu_native_path(), and MU_NUGET_CACHE_DIR above the wire-size block so codegen can resolve the package path. The wire-size block then builds OPENMU_PACKETS_XML against the cache and, if missing, triggers a one-shot `dotnet restore` of MUnique.Client.Library.csproj at configure time. dotnet restore is fast on cache-hit so re-configuring stays cheap. WSLENV=NUGET_PACKAGES/w is set so the cache override propagates across the WSL->Windows interop boundary when DOTNET_EXECUTABLE is dotnet.exe. - src/CMakeLists.txt: the Native-AOT cross-OS guard now clears a local _dotnet_for_aot variable instead of DOTNET_EXECUTABLE, so a Linux dotnet targeting Windows can still restore the NuGet package for codegen even though it cannot publish the AOT DLL. - tools/gen_wire_sizes.py: take --xml directly (and an optional --source-label for the generated header's provenance comment) instead of --openmu-root + an internal layout assumption. Cleaner decoupling from any specific source layout. - .gitmodules / third_party/OpenMU: removed. The other submodules (imgui, SDL, SDL_mixer) are unaffected -- they auto-init via mu_ensure_submodule() and don't need `submodules: recursive`. - .github/workflows/mingw-build*.yml: drop `submodules: recursive` (only sven-n#408 added it, only for the OpenMU submodule). Add actions/setup-dotnet@v4 step so the wire-size codegen's restore-on-demand has a dotnet to invoke. Manually verified locally: configure restores the package to .nuget/munique.openmu.network.packets/0.9.9/, wire_sizes.generated.h emits the same three static_asserts (lengths 42, 96, 36), and the mingw-i686 build links cleanly with the editor-enabled target.
ff30020 to
6adc729
Compare
Summary
Defends against the bug class behind PR #402 (silent client freeze on the character-select / loading screen) — the freeze itself was patched by PR #404; this PR adds the safeguards that should prevent a regression and make the next one diagnosable.
Scope grew during review based on what was uncovered. Final shape:
1. Generated wire-size guards from OpenMU XML (codegen Level 1)
third_party/OpenMUgit submodule pinned to1a56f6dd5.tools/gen_wire_sizes.py— Python 3 stdlib, parsesServerToClientPackets.xml, joins against a baked-in mapping (XML name → C++ struct name), emits onestatic_assert(sizeof(X) <= N)per mapped packet.add_custom_commandinsrc/CMakeLists.txtruns the generator only when the XML or script changes; output lands in${CMAKE_BINARY_DIR}/generated/Network/Server/wire_sizes.generated.h, whichWSclient.hincludes at end-of-file.CharacterCreationSuccessful → PRECEIVE_CREATE_CHARACTER,CharacterInformationExtended → PRECEIVE_JOIN_MAP_SERVER_EXTENDED,RespawnAfterDeathExtended → PRECEIVE_REVIVAL_EXTENDED.<=rather than==because some client structs intentionally don't decode all trailing fields (e.g.PreviewDataon character-creation). The actual safety property is "client struct fits in wire packet sosafe_castsucceeds."Adding coverage for another packet is a one-line append to
PACKET_MAPPING.2. Visible safe_cast failures (runtime diagnostics)
safe_cast<T>(span, packet_type = nullptr)template that logs aMCD_ERRORon every size mismatch — not just the call sites updated to pass a name. Fallback name istypeid(T).name()(mangled, but readable enough; RTTI is on).LogSafeCastSizeMismatchdefined inWSclient.cppso the header stays free ofg_ConsoleDebugincludes. Uses bounded 4-argvswprintfand a%.64hsfield-width cap on the type name to avoid any buffer-overflow surface.ReceiveJoinMapServerdispatch site (WSclient.cpp:13207) also emits anMCD_ERRORdescribing the user-visible symptom ("loading screen will appear frozen") so the cause is obvious in the log.3. MCD_ERROR actually fires in Release builds
Without this, the runtime diagnostics from #2 would compile but never reach the user, because the entire
CmuConsoleDebug::Writebody was gated by#ifdef CONSOLE_DEBUG(not defined in any production build).CmuConsoleDebug::GetInstance()now always returns a valid static instance. Previously returnednullptrunlessCSK_LH_DEBUG_CONSOLEwas defined, making everyg_ConsoleDebug->Write(...)call site a null-pointer member-function call. It "worked" only because the body was empty.Write()now routesMCD_ERRORtog_ErrorReport.Write(file-based, always compiled, opensMuError.logon startup) before the existingCONSOLE_DEBUG-gated console path.MCD_SEND / MCD_RECEIVE / MCD_NORMALstay debug-only so production logs aren't flooded.Net effect after merge: a future regression of the PR #402 shape produces a one-line entry like
in
MuError.loginstead of a silent hang. Or, more likely, the build never compiles in the first place because the generatedstatic_asserttrips.4. CI: recurse submodules
actions/checkout@v4in the three MinGW build workflows now usessubmodules: recursive. Required so CI can see the OpenMU XML the codegen step depends on.Test plan
wire_sizes.generated.hcontains the expected three asserts;sizeofchecks compile.Future consideration (not in this PR)
The submodule currently vendors all of OpenMU — hundreds of MB of C# server, ASP.NET tooling, tests, etc. — just to access one XML file. Long-term, it would be cleaner to factor the packet schema into a dedicated repo (e.g.
MUnique/mu-packet-schema) containing only the XML and the XSD, then have both OpenMU and MuMain pull it in as a tiny submodule. That makes the schema the formal contract between the two projects, lets other client implementations consume it without dragging in OpenMU's server tree, and removes the "MuMain's wire definitions are coupled to a specific OpenMU server commit" framing.Out of scope here — requires coordination with @sven-n and the OpenMU side — but worth flagging now since this PR is the first time MuMain takes a hard dependency on OpenMU. Mentioned briefly in the design-doc issue #407.
Scope notes
PRECEIVE_JOIN_MAP_SERVER_EXTENDEDand the surrounding receive structs" — grew based on review feedback (broader safe_cast logging) and a real discovery (MCD_ERROR being a no-op in Release).vswprintfin the legacyCONSOLE_DEBUG-gated console path is not changed. Same bug pattern but out of scope; worth a separate sweep PR if desired.