Fix sporadic parsing errors with binary ScalarList #762
Merged
Multiple root causes:
1. _parse_token raised FoamFileDecodeError (not ParseError) when binary data
started with ASCII letter + '(' (depth tracking scanned through binary data)
2. _parse_token raised UnicodeDecodeError for quoted strings with non-ASCII bytes
3. _ASCIINumericListParser raised UnicodeDecodeError from decode('ascii') when
comment-pattern regex matched binary data
4. _ASCIINumericListParser/_parse_binary_numeric_list raised ValueError from
reshape when element count wasn't divisible by 3
5. Binary parsers in _parse_standalone_data_entry were chained (int32 first,
float64 only if int32 failed), so int32 could win accidentally when ')' byte
appeared at position count*4 in the float64 data
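Root cause 5 can be reproduced in isolation. The sketch below is not foamlib code: `end_after_binary_list` is a hypothetical helper that only mimics the "read `count` items, then expect ')'" check, and the payload is hand-crafted so that the byte at offset count*4 is 0x29.

```python
import struct


def end_after_binary_list(payload: bytes, count: int, item_size: int):
    """Return the offset just past ')' if `count` items of `item_size` bytes
    are followed by a closing parenthesis, else None (hypothetical helper)."""
    end = count * item_size
    return end + 1 if payload[end:end + 1] == b")" else None


count = 2
# A finite float64 whose first little-endian byte is 0x29, i.e. ')'.
tricky = struct.unpack("<d", b")" + b"\x00" * 6 + b"\x40")[0]
# Two float64 values followed by the real closing parenthesis.
payload = struct.pack("<2d", 1.0, tricky) + b")"

int32_end = end_after_binary_list(payload, count, 4)    # stops inside the float64 data
float64_end = end_after_binary_list(payload, count, 8)  # stops at the real ')'
print(int32_end, float64_end)
```

Both checks succeed here (offsets 9 and 17), so a chain that tries int32 first returns the wrong interpretation without ever attempting float64.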
Fixes:
- _parse_token: raise ParseError instead of FoamFileDecodeError/UnicodeDecodeError
- _ASCIINumericListParser: handle UnicodeDecodeError and ValueError as ParseError
- _parse_binary_numeric_list: handle ValueError from reshape as ParseError
- _parse_standalone_data_entry: try all binary types independently, pick the
one that advances furthest; catch FoamFileDecodeError/UnicodeDecodeError from
_parse_data
Add regression tests in test_issue_binary_scalar_list.py
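The two ideas behind these fixes (translate unexpected exceptions into ParseError, and pick the binary candidate that advances furthest) can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual foamlib implementation: `ParseError` here is a stand-in class, and `decode_ascii`, `parse_binary`, and `parse_furthest` are hypothetical names.

```python
import struct


class ParseError(Exception):
    """Stand-in for the parser's recoverable error type (assumption)."""


def decode_ascii(data: bytes) -> str:
    """Decode ASCII, reporting failure as ParseError so callers can fall back."""
    try:
        return data.decode("ascii")
    except UnicodeDecodeError as e:
        raise ParseError("not ASCII text") from e


def parse_binary(data: bytes, pos: int, count: int, code: str, width: int):
    """Read count*width items of struct type `code`, then require ')'."""
    fmt = f"<{count * width}{code}"
    end = pos + struct.calcsize(fmt)
    if data[end:end + 1] != b")":
        raise ParseError("missing closing parenthesis")
    return struct.unpack_from(fmt, data, pos), end + 1


def parse_furthest(data: bytes, pos: int, count: int):
    """Try every candidate layout independently; keep the one that consumes most."""
    best = None
    # int32, float64, float64 x 3 (vector), mirroring the three dtypes named above.
    for code, width in (("i", 1), ("d", 1), ("d", 3)):
        try:
            candidate = parse_binary(data, pos, count, code, width)
        except ParseError:
            continue
        if best is None or candidate[1] > best[1]:
            best = candidate
    if best is None:
        raise ParseError("no binary layout matched")
    return best


# Float64 payload whose byte at offset count*4 happens to be ')'.
tricky = struct.unpack("<d", b")" + b"\x00" * 6 + b"\x40")[0]
payload = struct.pack("<2d", 1.0, tricky) + b")"
values, end = parse_furthest(payload, 0, 2)
print(values, end)
```

With this payload the int32 candidate also matches, but it ends at offset 9 while the float64 candidate ends at 17, so the float64 result is kept.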
Co-authored-by: gerlero <15150530+gerlero@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix parsing error with binary ScalarList in FoamFile class" to "Fix sporadic parsing errors with binary ScalarList" on Feb 24, 2026
Binary-format scalarList files would sporadically raise FoamFileDecodeError, UnicodeDecodeError, or ValueError depending on the specific float values written, because the parser tried to interpret raw binary bytes as ASCII text before falling back to the binary parser.
Root causes
- _parse_token fatal errors escaping suppression: When binary data begins with an ASCII letter followed by '(' (e.g. "H("), _parse_token's depth-tracking logic scanned through the binary payload and raised FoamFileDecodeError (not a ParseError subclass), bypassing all contextlib.suppress(ParseError) guards. Same issue for UnicodeDecodeError when a binary byte matching '"' triggered the quoted-string path.
- _ASCIINumericListParser decode/reshape errors: The comment-aware regex (_pattern, no re.ASCII flag) could match binary data containing non-ASCII bytes, causing UnicodeDecodeError from data.decode("ascii"). For vector lists, a mismatched element count caused ValueError from reshape().
- int32 binary parser masking float64: Binary parsers were chained: int32 was tried first, float64 only on failure. The byte 0x29 (')') can appear at offset count × 4 within float64 data, making the int32 parser falsely succeed and float64 never be attempted.
Changes
- _parse_token: Raise ParseError instead of FoamFileDecodeError for unmatched parens and unclosed quoted/#{} tokens; wrap all decode() calls to convert UnicodeDecodeError → ParseError.
- _ASCIINumericListParser.__call__ and _parse_ascii_faces_like_list: Wrap data.decode("ascii") and ret.reshape() to convert UnicodeDecodeError/ValueError → ParseError.
- _parse_binary_numeric_list: Wrap ret.reshape() to convert ValueError → ParseError.
- _parse_standalone_data_entry: Try all three binary dtypes (int32, float64, float64×3) independently rather than in a chain, always keeping the result with the highest consumed position. Float64 will always win over int32 for the same count because it consumes twice the bytes. Also catch FoamFileDecodeError/UnicodeDecodeError from _parse_data.
Original prompt
This section details the original issue you should resolve
<issue_title>Parsing error with binary ScalarList</issue_title>
<issue_description>I experienced a strange parsing error when trying to read a scalarList with the FoamFile class when in binary format, in the form:
This error is really sporadic. I can tell that it is not caused by floating point special cases (I had datasets with nan and inf and they worked fine). I tried storing the data in the FoamFile as list and numpy arrays before saving but still got the same problem.
Thank you a lot in advance and let me know if you need further info.
How to reproduce
Here are the functions that I used to write the scalar list, and the one that I am using to read it.
Output: