Skip to content

fix: enhance numeric parsing to support scientific notation and ambiguous hex#13

Open
rajanarahul93 wants to merge 4 commits intorawBit-io:mainfrom
rajanarahul93:fix-numeric-parsing-scientific-notation
Open

fix: enhance numeric parsing to support scientific notation and ambiguous hex#13
rajanarahul93 wants to merge 4 commits intorawBit-io:mainfrom
rajanarahul93:fix-numeric-parsing-scientific-notation

Conversation

@rajanarahul93
Copy link
Copy Markdown

Description

This PR fixes #10:

  • Ensures that _parse_numeric_exact correctly parses scientific notation (e.g., "1e6", "1e8") as decimal values, not as hexadecimal.
  • Moves Decimal parsing before ambiguous hex detection, as suggested in the issue.
  • Adds and updates tests for scientific notation and ambiguous hex cases.
  • Fixes downstream issues in math_operation and compare_numbers for such inputs.

Checklist

  • All tests pass
  • Added/updated relevant tests
  • Follows project guidelines

Closes #10

@rawBit-io
Copy link
Copy Markdown
Owner

The happy path is fixed — 1e6, 1e8, 2.5e3 now parse correctly, which was the goal. Before I can merge, I'd like the negative cases handled too. Your tests only assert that valid inputs parse correctly; there are no pytest.raises(ValueError) assertions for malformed input, and that's how the cases below slipped through.

What the current PR branch does

I ran each of these against _parse_numeric_exact on your branch:

Input Current result Expected
"1e" 30 (int, 0x1e) ValueError
"e10" 3600 (int, 0xe10) ValueError
"1E" 30 ValueError
"1e1e" 7710 ValueError
"0b10" 2832 ValueError
"NaN" Decimal('NaN') ValueError
"Infinity" Decimal('Infinity') ValueError
"_10" Decimal('10') ValueError
"1__0" Decimal('10') ValueError

The first block (malformed scientific notation + 0b prefix) goes wrong because your hex fallback runs after Decimal(s) fails — anything that looks hex-ish but isn't valid Decimal still slips into int(s, 16). The second block (NaN/Infinity/underscores) goes wrong because Decimal(s) is much more permissive than the strict _INT_DEC_RE you use for the integer branch.

Worse, these propagate into the public API:

compare_numbers(["1e", "<", "31"])         # returns "true" (should raise)
math_operation(["e10", "+", "1"])           # returns "3601" (should raise)
compare_numbers(["NaN", "<", "10"])         # raises decimal.InvalidOperation (should raise ValueError)
math_operation(["Infinity", "+", "1"])      # raises OverflowError from _num_to_str (should raise ValueError)

What I'd like to see before merging

  1. The parser raises ValueError for every input in the table above.
  2. A parametrised negative test in test_calc_func.py pinning that behaviour, e.g.:
@pytest.mark.parametrize("raw", [
    "1e", "e10", "1E", "E10", "1e1e",
    "0b10", "0B10",
    "NaN", "nan", "Infinity", "-Infinity", "sNaN",
    "_10", "1__0", "1_",
])
def test_parse_numeric_exact_rejects_malformed(raw):
    with pytest.raises(ValueError):
        calc._parse_numeric_exact(raw)
  1. A couple of compare_numbers / math_operation cases from the second snippet above, also asserting ValueError.

How you get there is up to you — there are a few ways to structure the checks. Happy to re-review once the negative suite is green.

@rajanarahul93
Copy link
Copy Markdown
Author

Addressed the negative path cases and added the suggested test suite ready for re-review @rawBit-io

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

_parse_numeric_exact misinterprets scientific notation (e.g., 1e6, 1e8) as hexadecimal, causing incorrect calculations

2 participants