@codeflash-ai codeflash-ai bot commented Oct 24, 2025

📄 35% (0.35x) speedup for validate_decimal in src/mistralai/utils/serializers.py

⏱️ Runtime : 1.66 milliseconds → 1.23 milliseconds (best of 67 runs)
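
The headline percentage can be reproduced from the two runtimes as original / optimized - 1. The sketch below shows that arithmetic; the helper name `speedup` is illustrative and not part of Codeflash. The per-test annotations in the generated tests appear to follow the same ratio, with a negative value reported as "slower".

```python
def speedup(original: float, optimized: float) -> float:
    """Relative speedup: positive means faster, negative means slower."""
    return original / optimized - 1.0


# Overall runtimes from the report, in milliseconds.
overall = speedup(1.66, 1.23)
print(f"{overall:.0%}")  # 35%

# A per-test regression from the report (list input, microseconds).
regression = speedup(2.14, 2.55)
print(regression < 0)  # True
```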

📝 Explanation and details

The optimized code achieves a 35% speedup by replacing expensive isinstance() calls with faster type checking strategies:

Key optimizations:

  1. Pre-computed type tuples: Moved (Decimal, Unset) and (str, int, float) outside the function to avoid recreating them on every call.

  2. Direct type comparison for common cases: Uses type(d) is Decimal and type(d) is Unset instead of isinstance() for exact type matching, which bypasses the overhead of inheritance hierarchy checks.

  3. Fast-path for built-in types: Uses type(d) in _STR_INT_FLOAT_TYPES for quick membership testing against the pre-computed tuple of common types.

  4. Fallback isinstance() checks: Maintains backward compatibility by keeping isinstance() checks as fallbacks to handle edge cases like subclasses.
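
The four steps above could be combined roughly as follows. This is a sketch reconstructed from the description, not the actual diff: `Unset` is a local stand-in for `mistralai.types.basemodel.Unset`, `_DECIMAL_UNSET_TYPES` is a hypothetical name, and the error message is illustrative. Only `_STR_INT_FLOAT_TYPES` is named in the description.

```python
from decimal import Decimal


class Unset:
    """Stand-in for mistralai.types.basemodel.Unset so the sketch runs alone."""


# Built once at import time instead of on every call.
_DECIMAL_UNSET_TYPES = (Decimal, Unset)   # hypothetical name
_STR_INT_FLOAT_TYPES = (str, int, float)  # name taken from the description


def validate_decimal(d):
    if d is None:
        return None

    t = type(d)
    # Fast paths: exact-type identity checks.
    if t is Decimal or t is Unset:
        return d
    if t in _STR_INT_FLOAT_TYPES:
        return Decimal(str(d))

    # Fallbacks keep the isinstance() semantics for subclasses
    # (e.g. bool, or a user-defined Decimal subclass).
    if isinstance(d, _DECIMAL_UNSET_TYPES):
        return d
    if not isinstance(d, _STR_INT_FLOAT_TYPES):
        raise ValueError("Expected string, int or float")  # illustrative message
    return Decimal(str(d))
```

Inputs that miss both fast paths pay for the exact-type checks and then for the `isinstance()` fallbacks, which is consistent with the regressions reported for lists, dicts, and Decimal subclasses.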

Why it's faster:

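  • isinstance() against a tuple is comparatively costly: it tests each tuple entry in turn and, when the exact type does not match, may walk the class hierarchy (MRO) or invoke a metaclass __instancecheck__ hook
  • type(d) is X is an identity check: one type lookup and one pointer comparison
  • type(d) in _STR_INT_FLOAT_TYPES compares against at most three entries and skips subclass handling, which is faster than isinstance(d, (str, int, float)) when d is exactly a str, int, or float
  • Pre-computed tuples eliminate repeated tuple creation overhead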
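
A rough way to compare the two checks locally is a `timeit` micro-benchmark like the one below. It is a sketch only: the function names are illustrative, and timings vary by interpreter version and machine, so no numbers are shown. The last line shows why the exact-type check cannot fully replace `isinstance()`: the two disagree for subclasses such as `bool`.

```python
import timeit

_STR_INT_FLOAT_TYPES = (str, int, float)  # built once at import time


def check_isinstance(d):
    # The tuple of builtins is rebuilt on each call; subclasses are accepted.
    return isinstance(d, (str, int, float))


def check_exact(d):
    # Membership test against the precomputed tuple; exact types only.
    return type(d) in _STR_INT_FLOAT_TYPES


for label, fn in (("isinstance", check_isinstance), ("type(d) in", check_exact)):
    seconds = timeit.timeit(lambda: fn("123"), number=50_000)
    print(f"{label:>12}: {seconds:.4f}s for 50,000 calls")

# bool is a subclass of int, so only isinstance() accepts it.
print(check_isinstance(True), check_exact(True))  # True False
```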

Performance by test case type:

  • Best gains (mostly 30-52% faster, up to 66.7%): String conversions like "123", "0.0" benefit most from the fast-path type checking
  • Good gains (roughly 9-43% faster): int inputs gain 28-43%, float inputs 9-30%, and Decimal instances 12-29%
  • Regressions (2-33% slower): Inputs that miss the fast path and reach the isinstance() fallback, namely invalid types such as lists and dicts (error cases) and Decimal subclasses (a valid input, 33.4% slower in test_subclass_of_decimal)

The optimization preserves behavior for all tested inputs while speeding up the common cases (exact str, int, float, and Decimal inputs).
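
To see which inputs fall into the slower category, it can help to classify them by whether an exact-type check would succeed. The helper below is a hypothetical illustration, not code from the PR; `Unset` is left out of the tuple to keep the sketch self-contained.

```python
from decimal import Decimal

# Exact types an exact-type fast path would recognise (Unset omitted here).
_FAST_TYPES = (Decimal, str, int, float)


def takes_fast_path(d) -> bool:
    """Return True if d is None or its exact type is one of the fast types."""
    return d is None or type(d) in _FAST_TYPES


class MyDecimal(Decimal):
    """Subclass used to show that subclasses miss the exact-type check."""


samples = ["123", 42, 3.14, Decimal("1.23"), [1, 2, 3], {"a": 1}, MyDecimal("1"), True]
for s in samples:
    path = "fast path" if takes_fast_path(s) else "fallback"
    print(f"{type(s).__name__:>10} -> {path}")
```

Lists, dicts, `MyDecimal`, and `bool` all land on the fallback, matching the inputs that the report shows as slower or that rely on the retained `isinstance()` checks.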

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 3061 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

🌀 Generated Regression Tests and Runtime
from decimal import Decimal

# imports
import pytest
from mistralai.utils.serializers import validate_decimal


# Simulate the Unset type for testing purposes
class Unset:
    pass

# unit tests

# 1. Basic Test Cases

def test_validate_decimal_with_none():
    """Should return None when input is None."""
    codeflash_output = validate_decimal(None) # 496ns -> 440ns (12.7% faster)

def test_validate_decimal_with_decimal_instance():
    """Should return the Decimal instance unchanged."""
    dec = Decimal("3.14159")
    codeflash_output = validate_decimal(dec) # 604ns -> 470ns (28.5% faster)


def test_validate_decimal_with_int():
    """Should convert integer input to Decimal."""
    codeflash_output = validate_decimal(42) # 3.52μs -> 2.75μs (28.1% faster)

def test_validate_decimal_with_float():
    """Should convert float input to Decimal (as string for precision)."""
    codeflash_output = validate_decimal(3.14) # 3.77μs -> 3.04μs (23.6% faster)

def test_validate_decimal_with_str_int():
    """Should convert string integer to Decimal."""
    codeflash_output = validate_decimal("123") # 2.08μs -> 1.42μs (46.2% faster)

def test_validate_decimal_with_str_float():
    """Should convert string float to Decimal."""
    codeflash_output = validate_decimal("0.001") # 2.05μs -> 1.47μs (40.0% faster)

def test_validate_decimal_with_str_scientific():
    """Should convert scientific notation string to Decimal."""
    codeflash_output = validate_decimal("1e3") # 2.31μs -> 1.62μs (42.4% faster)

# 2. Edge Test Cases

def test_validate_decimal_with_negative_int():
    """Should handle negative integer input."""
    codeflash_output = validate_decimal(-7) # 2.20μs -> 1.57μs (39.7% faster)

def test_validate_decimal_with_negative_float():
    """Should handle negative float input."""
    codeflash_output = validate_decimal(-0.001) # 3.39μs -> 2.87μs (18.0% faster)

def test_validate_decimal_with_empty_string():
    """Should raise an exception for empty string input."""
    with pytest.raises(Exception):
        validate_decimal("") # 2.81μs -> 2.06μs (36.3% faster)

def test_validate_decimal_with_invalid_string():
    """Should raise an exception for non-numeric string input."""
    with pytest.raises(Exception):
        validate_decimal("not_a_number") # 2.75μs -> 2.02μs (36.5% faster)


def test_validate_decimal_with_list():
    """Should raise ValueError for list input."""
    with pytest.raises(ValueError):
        validate_decimal([1,2,3]) # 2.14μs -> 2.55μs (15.9% slower)

def test_validate_decimal_with_dict():
    """Should raise ValueError for dict input."""
    with pytest.raises(ValueError):
        validate_decimal({'a': 1}) # 1.75μs -> 2.05μs (14.7% slower)

def test_validate_decimal_with_tuple():
    """Should raise ValueError for tuple input."""
    with pytest.raises(ValueError):
        validate_decimal((1,2)) # 1.76μs -> 1.87μs (5.56% slower)

def test_validate_decimal_with_bytes():
    """Should raise ValueError for bytes input."""
    with pytest.raises(ValueError):
        validate_decimal(b'123') # 1.69μs -> 1.72μs (1.75% slower)

def test_validate_decimal_with_complex():
    """Should raise ValueError for complex number input."""
    with pytest.raises(ValueError):
        validate_decimal(1+2j) # 1.71μs -> 1.79μs (4.14% slower)

def test_validate_decimal_with_large_float():
    """Should handle very large float input."""
    large_float = 1e100
    codeflash_output = validate_decimal(large_float) # 5.88μs -> 5.40μs (8.98% faster)

def test_validate_decimal_with_small_float():
    """Should handle very small float input."""
    small_float = 1e-100
    codeflash_output = validate_decimal(small_float) # 4.15μs -> 3.63μs (14.1% faster)

def test_validate_decimal_with_leading_trailing_spaces():
    """Should handle string with leading/trailing spaces."""
    codeflash_output = validate_decimal("  123.45  ") # 2.25μs -> 1.59μs (41.2% faster)

def test_validate_decimal_with_zero():
    """Should handle zero in various forms."""
    codeflash_output = validate_decimal(0) # 2.02μs -> 1.49μs (35.3% faster)
    codeflash_output = validate_decimal(0.0) # 1.62μs -> 1.30μs (24.7% faster)
    codeflash_output = validate_decimal("0") # 757ns -> 454ns (66.7% faster)
    codeflash_output = validate_decimal("0.0") # 517ns -> 344ns (50.3% faster)

# 3. Large Scale Test Cases






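def test_validate_decimal_large_mixed_types():
    """Should convert valid items, pass Unset through, and raise for invalid ones."""
    # Use the real Unset (module path as named elsewhere in this report); a
    # local stand-in class is not recognised by validate_decimal.
    from mistralai.types.basemodel import Unset

    unset = Unset()
    valid = [0, 1.1, "2.2", None, unset]
    expected = [Decimal("0"), Decimal("1.1"), Decimal("2.2"), None, unset]
    results = [validate_decimal(item) for item in valid]
    assert results[:4] == expected[:4]
    assert results[4] is unset
    # A non-numeric string raises decimal.InvalidOperation, not ValueError
    with pytest.raises(Exception):
        validate_decimal("bad")
    # Unsupported types raise ValueError
    for item in ([1], {"a": 1}):
        with pytest.raises(ValueError):
            validate_decimal(item)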
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from decimal import Decimal

# imports
import pytest
from mistralai.utils.serializers import validate_decimal


# Dummy Unset class for testing, since mistralai.types.basemodel.Unset is not available
class Unset:
    pass

# unit tests

# ---------------------------
# BASIC TEST CASES
# ---------------------------

def test_none_returns_none():
    # Test that None input returns None
    codeflash_output = validate_decimal(None) # 473ns -> 443ns (6.77% faster)

def test_decimal_passthrough():
    # Test that Decimal input returns the same Decimal
    d = Decimal("1.23")
    codeflash_output = validate_decimal(d) # 628ns -> 529ns (18.7% faster)


def test_int_conversion():
    # Test that integer input is converted to Decimal
    codeflash_output = validate_decimal(42) # 3.65μs -> 2.80μs (30.7% faster)

def test_float_conversion():
    # Test that float input is converted to Decimal (via str)
    # This avoids float precision issues
    codeflash_output = validate_decimal(3.14) # 3.76μs -> 2.93μs (28.3% faster)

def test_str_conversion():
    # Test that string input is converted to Decimal
    codeflash_output = validate_decimal("123.456") # 2.20μs -> 1.54μs (42.8% faster)

def test_str_int_conversion():
    # Test that string representing int is converted to Decimal
    codeflash_output = validate_decimal("100") # 2.01μs -> 1.37μs (47.1% faster)

def test_str_float_scientific():
    # Test that string in scientific notation is converted correctly
    codeflash_output = validate_decimal("1e3") # 2.15μs -> 1.65μs (30.4% faster)

# ---------------------------
# EDGE TEST CASES
# ---------------------------

def test_invalid_type_list_raises():
    # Test that a list input raises ValueError
    with pytest.raises(ValueError):
        validate_decimal([1,2,3]) # 1.55μs -> 2.30μs (32.5% slower)

def test_invalid_type_dict_raises():
    # Test that a dict input raises ValueError
    with pytest.raises(ValueError):
        validate_decimal({"a": 1}) # 1.61μs -> 2.00μs (19.5% slower)


def test_empty_string():
    # Test that empty string input raises decimal.InvalidOperation from Decimal
    with pytest.raises(Exception):
        validate_decimal("") # 4.22μs -> 3.23μs (30.6% faster)

def test_non_numeric_string():
    # Test that non-numeric string input raises decimal.InvalidOperation from Decimal
    with pytest.raises(Exception):
        validate_decimal("not_a_number") # 3.11μs -> 2.34μs (33.1% faster)

def test_str_with_spaces():
    # Test that string with spaces is converted if valid
    codeflash_output = validate_decimal("  77.7 ") # 2.39μs -> 1.64μs (45.8% faster)

def test_str_with_leading_zeros():
    # Test string with leading zeros
    codeflash_output = validate_decimal("000123.45") # 2.14μs -> 1.60μs (33.7% faster)

def test_large_integer_string():
    # Test very large integer string
    s = "9" * 100
    codeflash_output = validate_decimal(s) # 2.66μs -> 1.94μs (37.6% faster)

def test_negative_int():
    # Test negative integer
    codeflash_output = validate_decimal(-100) # 2.12μs -> 1.54μs (37.2% faster)

def test_negative_float():
    # Test negative float
    codeflash_output = validate_decimal(-123.456) # 3.68μs -> 3.02μs (21.9% faster)

def test_negative_str():
    # Test negative number as string
    codeflash_output = validate_decimal("-789.01") # 2.04μs -> 1.41μs (44.6% faster)

def test_float_precision():
    # Test float with many decimal places
    val = 0.1234567890123456
    # Decimal(str(val)) will not be exact, but should match the string conversion
    codeflash_output = validate_decimal(val) # 3.46μs -> 2.87μs (20.4% faster)

def test_str_with_plus_sign():
    # Test string with explicit plus sign
    codeflash_output = validate_decimal("+42.5") # 2.00μs -> 1.42μs (41.5% faster)

def test_str_with_exponent():
    # Test string with exponent notation
    codeflash_output = validate_decimal("1.23e4") # 2.10μs -> 1.57μs (34.1% faster)

def test_str_with_exponent_uppercase():
    # Test string with uppercase exponent notation
    codeflash_output = validate_decimal("1.23E4") # 1.99μs -> 1.50μs (32.5% faster)


def test_str_neg_inf():
    # Test string '-Infinity'
    codeflash_output = validate_decimal("-Infinity"); dec = codeflash_output # 3.17μs -> 2.34μs (35.5% faster)

# ---------------------------
# LARGE SCALE TEST CASES
# ---------------------------

def test_large_list_of_ints():
    # Test conversion of a large number of integers
    for i in range(1000):
        codeflash_output = validate_decimal(i) # 460μs -> 321μs (43.2% faster)

def test_large_list_of_floats():
    # Test conversion of a large number of floats
    for i in range(1000):
        val = i + 0.5
        codeflash_output = validate_decimal(val) # 619μs -> 478μs (29.5% faster)

def test_large_list_of_strs():
    # Test conversion of a large number of numeric strings
    for i in range(1000):
        s = str(i * 3.14159)
        codeflash_output = validate_decimal(s) # 449μs -> 320μs (40.1% faster)

def test_large_string_number():
    # Test conversion of a very large number as string
    s = "1" * 999  # 999-digit number
    codeflash_output = validate_decimal(s) # 6.00μs -> 5.33μs (12.7% faster)

def test_large_decimal_passthrough():
    # Test that a large Decimal is passed through unchanged
    d = Decimal("9" * 999)
    codeflash_output = validate_decimal(d) # 545ns -> 488ns (11.7% faster)


def test_subclass_of_decimal():
    # Test that a subclass of Decimal is passed through unchanged
    class MyDecimal(Decimal):
        pass
    md = MyDecimal("123.45")
    codeflash_output = validate_decimal(md) # 719ns -> 1.08μs (33.4% slower)


def test_zero_int():
    # Test zero integer
    codeflash_output = validate_decimal(0) # 3.59μs -> 2.74μs (31.2% faster)

def test_zero_float():
    # Test zero float
    codeflash_output = validate_decimal(0.0) # 2.94μs -> 2.38μs (23.3% faster)

def test_zero_str():
    # Test zero as string
    codeflash_output = validate_decimal("0") # 2.00μs -> 1.42μs (40.6% faster)

def test_str_with_trailing_dot():
    # Test string with trailing dot
    codeflash_output = validate_decimal("123.") # 2.15μs -> 1.41μs (52.0% faster)

def test_str_with_leading_dot():
    # Test string with leading dot
    codeflash_output = validate_decimal(".456") # 2.00μs -> 1.43μs (39.5% faster)

def test_str_with_multiple_dots():
    # Test string with multiple dots (invalid)
    with pytest.raises(Exception):
        validate_decimal("1.2.3") # 2.75μs -> 2.11μs (30.2% faster)




def test_str_with_comma():
    # Test string with comma (should fail)
    with pytest.raises(Exception):
        validate_decimal("1,234.56") # 4.24μs -> 3.51μs (20.9% faster)

def test_str_with_currency_symbol():
    # Test string with currency symbol (should fail)
    with pytest.raises(Exception):
        validate_decimal("$123.45") # 2.91μs -> 2.09μs (39.4% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run `git checkout codeflash/optimize-validate_decimal-mh4is4sv` and push.

Codeflash

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 24, 2025 07:18
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Oct 24, 2025