Rewrite mac@index normalization to avoid UB and a release-build misco…#689
Merged
DigiH merged 1 commit intodevelopmentfrom Apr 26, 2026
Merged
Rewrite mac@index normalization to avoid UB and a release-build misco…#689DigiH merged 1 commit intodevelopmentfrom
DigiH merged 1 commit intodevelopmentfrom
Conversation
…mpile The previous implementation built a std::string from mac_id, then walked it with `for (int x = 0; x < mac_string.length(); x++)` while calling `mac_string.erase(x, 1)` and `mac_string[x] = tolower(mac_string[x])` in the same iteration. Two latent issues: 1. Passing nullptr to std::string's `const char*` constructor (when mac_id is null) is undefined behaviour. 2. Reading `mac_string[x]` immediately after `mac_string.erase(x, 1)` relies on subtle ordering: the erase shifts characters left, so the value at index x changes between the colon check and the tolower read. With aggressive optimization, the compiler is free to cache either the length or the character value across the erase call, which produces a corrupted normalized MAC. Observed in the wild: TheengsApp built with Android NDK Clang 19 (release pipeline: -O3 + LTO + BOLT) silently returned false from the entire mac@index/revmac@index branch for IBT-2X(S) and MB/SW advertisements, even though the Linux/Python build of the same source decoded them correctly. The pattern above is a plausible trigger. Rewrite: - Strip colons + lowercase into a fixed 13-byte stack buffer in a single forward pass over `mac_id`. - Bail out cleanly if `mac_id` is null or contains fewer than 12 hex chars after stripping (a 12-char comparison cannot match anyway). - Reuse `cond_len` (12) for the buffer index limit, the reverse length, and the strncmp length, removing the duplicated magic literal. Verified: - ctest (BLE + BLE_fail) - 553 + 18 cases, no regressions. - TheengsApp Android (Qt 6.10.3, NDK 28.2 / Clang 19): IBT-2X(S) and MB/SW vectors that previously decoded as no-match now publish to MQTT correctly.
Contributor
|
Thanks |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description:
The previous implementation built a std::string from mac_id, then walked it with
for (int x = 0; x < mac_string.length(); x++)while callingmac_string.erase(x, 1)andmac_string[x] = tolower(mac_string[x])in the same iteration. Two latent issues:Passing nullptr to std::string's
const char*constructor (when mac_id is null) is undefined behaviour.Reading
mac_string[x]immediately aftermac_string.erase(x, 1)relies on subtle ordering: the erase shifts characters left, so the value at index x changes between the colon check and the tolower read. With aggressive optimization, the compiler is free to cache either the length or the character value across the erase call, which produces a corrupted normalized MAC.Observed in the wild: TheengsApp built with Android NDK Clang 19 (release pipeline: -O3 + LTO + BOLT) silently returned false from the entire mac@index/revmac@index branch for IBT-2X(S) and MB/SW advertisements, even though the Linux/Python build of the same source decoded them correctly. The pattern above is a plausible trigger.
Rewrite:
mac_id.mac_idis null or contains fewer than 12 hex chars after stripping (a 12-char comparison cannot match anyway).cond_len(12) for the buffer index limit, the reverse length, and the strncmp length, removing the duplicated magic literal.Verified:
Checklist: