🛡️ Shield: Robust Integer Parsing for Float Strings#643
Conversation
Updated `parse_int_or_default` to handle floating point strings (e.g. "5.0")
by attempting `int(float(v))` if `int(v)` fails.
This prevents silent data corruption where valid numeric strings were treated
as default (0) or raised ValueError in strict mode.
Added comprehensive test cases for float strings, truncation, and overflow.
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Improve parse_int_or_default to handle float-formatted numeric inputs more robustly (e.g., CSV/JSON "5.0"), and add tests to validate the new behavior.
Changes:
- Added a float-conversion fallback (
int(float(v))) when directint(v)parsing fails. - Added unit tests covering float strings, float values, and
"inf"strict/non-strict behavior.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| imednet/utils/validators.py | Adds float parsing fallback to reduce defaulting/parse failures for float-like inputs |
| tests/unit/test_models_validators.py | Adds coverage for float-like strings/values and infinity handling |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| @@ -68,9 +68,13 @@ def parse_int_or_default(v: Any, default: int = 0, strict: bool = False) -> int: | |||
| try: | |||
| return int(v) | |||
| except (ValueError, TypeError): | |||
There was a problem hiding this comment.
int(v) can raise OverflowError for values like float('inf') (and potentially other non-finite floats). Because OverflowError isn’t caught here, parse_int_or_default(v, strict=False) can unexpectedly raise instead of returning default. Consider including OverflowError in this first except tuple (or otherwise ensuring non-finite float inputs are handled consistently with the defaulting behavior).
| except (ValueError, TypeError): | |
| except (ValueError, TypeError, OverflowError): |
| except (ValueError, TypeError, OverflowError): | ||
| if strict: | ||
| raise |
There was a problem hiding this comment.
The function docstring states that strict=True raises ValueError on parse failure, but this fallback can raise OverflowError (as your new test asserts for \"inf\"). Also, strict=True now accepts non-integer float strings like \"5.9\" and truncates to 5, which is a semantic change from strict integer parsing. Recommendation: either (a) keep strict semantics by only accepting float inputs when the float is integral (e.g., f.is_integer()), and raise ValueError otherwise, and normalize strict failures to a consistent exception type; or (b) update the docstring/contract to explicitly document truncation and the possible exception types in strict mode.
| except (ValueError, TypeError, OverflowError): | |
| if strict: | |
| raise | |
| except (ValueError, TypeError, OverflowError) as exc: | |
| if strict: | |
| # Normalize all strict parse failures to ValueError, as documented. | |
| raise ValueError(f"Invalid integer value: {v!r}") from exc |
| # Float string robustness | ||
| assert validators.parse_int_or_default("5.0") == 5 | ||
| assert validators.parse_int_or_default("5.9") == 5 | ||
| assert validators.parse_int_or_default(5.9) == 5 | ||
| assert validators.parse_int_or_default("inf") == 0 | ||
| with pytest.raises(OverflowError): | ||
| validators.parse_int_or_default("inf", strict=True) |
There was a problem hiding this comment.
Given the new fallback behavior, it would be useful to add tests that lock down strict-mode semantics for non-integer floats (e.g., expected behavior for parse_int_or_default(\"5.9\", strict=True)), and also add coverage for float('inf') (not just the string \"inf\") to catch the OverflowError path coming from int(v) directly.
🛑 Vulnerability: The
parse_int_or_defaultvalidator would silently return the default value (0) when encountering float-formatted strings like"5.0", or raise aValueErrorin strict mode. This is common in CSV/JSON exports and leads to data loss/corruption.PR created automatically by Jules for task 8890386912751906217 started by @fderuiter