
Conversation


@codeflash-ai codeflash-ai bot commented Oct 30, 2025

📄 16% (0.16x) speedup for construct_type_unchecked in src/openai/_models.py

⏱️ Runtime: 29.1 milliseconds → 25.2 milliseconds (best of 110 runs)
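For reference, the headline percentage follows directly from the two runtimes above; a quick sanity check (plain arithmetic, shown only for clarity):

```python
old_ms, new_ms = 29.1, 25.2
speedup = old_ms / new_ms - 1   # ≈ 0.155, i.e. roughly a 15-16% speedup
print(f"{speedup:.1%}")         # 15.5%
```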

📝 Explanation and details

The optimized code achieves a 15% speedup through several key micro-optimizations that reduce redundant operations and function calls:

1. Eliminated redundant get_args() calls: The original code called get_args(type_) multiple times for dict processing (_, items_type = get_args(type_)). The optimized version stores the result once and directly accesses items_type = args[1], avoiding repeated tuple unpacking.
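A minimal sketch of the idea (illustrative only, not the exact diff in `src/openai/_models.py`; the helper name `construct_dict_value` is a placeholder):

```python
from typing import Dict, get_args, get_origin

def construct_dict_value(value: object, type_: object) -> object:
    # Placeholder sketch (not the library's code): call get_args() once,
    # keep the tuple, and index it instead of re-unpacking
    # `_, items_type = get_args(type_)` wherever the value type is needed.
    if get_origin(type_) is dict and isinstance(value, dict):
        args = get_args(type_)
        items_type = args[1]  # the value type of Dict[key_type, value_type]
        # Demo coercion only; the real code recurses into construct_type_unchecked.
        return {key: items_type(item) for key, item in value.items()}
    return value

print(construct_dict_value({"a": 1, "b": 2}, Dict[str, float]))  # {'a': 1.0, 'b': 2.0}
```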

2. Added fast-path for empty containers: For both dict and list processing, the optimized code checks if not value: and returns empty containers immediately ({} or []), avoiding unnecessary comprehension overhead for empty inputs. This is particularly effective as shown in test cases like test_empty_dict() (15.3% faster) and test_empty_list() (12.8% faster).
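Roughly what that fast path looks like in spirit (a hedged sketch; the function and parameter names are illustrative, not the library's):

```python
from typing import Callable

def construct_list_value(value: object, construct_item: Callable[[object], object]) -> object:
    # Illustrative sketch only: bail out before building the comprehension
    # when the input container is empty.
    if not isinstance(value, list):
        return value
    if not value:
        return []  # fast path: no comprehension frame, no per-item dispatch
    return [construct_item(entry) for entry in value]

print(construct_list_value([], float))         # [] via the fast path
print(construct_list_value([1, 2, 3], float))  # [1.0, 2.0, 3.0]
```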

3. Optimized model construction logic: Instead of repeatedly calling getattr(type_, "construct", None) within comprehensions, the optimized code fetches the construct method once and reuses it. It also reordered the expensive is_literal_type() check after the cheaper inspect.isclass() check.
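A hedged sketch of the caching idea (the check reordered against `is_literal_type()` is simplified to `issubclass()` here, and `construct()` is pydantic's validation-skipping constructor); this is not the library's exact branch:

```python
import inspect
from pydantic import BaseModel

class Item(BaseModel):
    a: int
    b: str

def construct_model_list(type_: type, values: list) -> list:
    # Illustrative only. The cheap inspect.isclass() check runs first, and the
    # `construct` attribute is resolved once instead of via getattr() per item.
    if inspect.isclass(type_) and issubclass(type_, BaseModel):
        construct = type_.construct  # validation-free constructor, fetched once
        return [construct(**entry) for entry in values]
    return values

print(construct_model_list(Item, [{"a": 1, "b": "x"}, {"a": 2, "b": "y"}]))
```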

4. Reduced attribute lookups: By caching function references and avoiding repeated dictionary/tuple access patterns, the code minimizes Python's attribute resolution overhead.
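The general principle, shown on an unrelated toy loop (a generic illustration, not code from this PR): binding a method or builtin to a local once keeps CPython from re-resolving the lookup on every iteration.

```python
data = {str(i): i for i in range(1000)}

def slow_copy() -> dict:
    out = {}
    for key in data:
        out[key] = float(data[key])   # re-indexes `data` on every iteration
    return out

def fast_copy() -> dict:
    items = data.items()              # attribute resolved once, pairs iterated directly
    to_float = float                  # local binding avoids a global lookup per item
    return {key: to_float(item) for key, item in items}

assert slow_copy() == fast_copy()
```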

These optimizations are most effective for large-scale data processing scenarios (17-21% speedup on large lists/dicts with 1000+ elements) and container-heavy workloads where dict/list construction dominates runtime. The improvements are consistent across nested structures, making this particularly valuable for API response parsing and data serialization tasks typical in the OpenAI library.

Correctness verification report:

| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 78 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
from datetime import date, datetime, timezone
from typing import (Annotated, Any, Dict, List, Literal, Optional, Type,
                    TypeVar, Union)

# imports
import pytest
from openai._models import construct_type_unchecked
from pydantic import BaseModel

# function to test (from above)
_T = TypeVar("_T")
from openai._models import construct_type_unchecked

# ------------------- UNIT TESTS -------------------

# 1. Basic Test Cases
def test_int_to_float_basic():
    # Basic: int to float conversion
    codeflash_output = construct_type_unchecked(value=5, type_=float) # 8.17μs -> 7.28μs (12.3% faster)

def test_float_to_float_basic():
    # Basic: float to float, should be unchanged
    codeflash_output = construct_type_unchecked(value=3.14, type_=float) # 7.02μs -> 6.04μs (16.1% faster)

def test_list_of_ints_basic():
    # Basic: list of ints to list of floats
    codeflash_output = construct_type_unchecked(value=[1, 2, 3], type_=List[float]) # 16.6μs -> 15.3μs (8.38% faster)

def test_dict_basic():
    # Basic: dict of str->int to dict of str->float
    codeflash_output = construct_type_unchecked(value={"a": 1, "b": 2}, type_=Dict[str, float]) # 15.4μs -> 13.9μs (11.0% faster)

def test_literal_basic():
    # Basic: Literal type
    codeflash_output = construct_type_unchecked(value="foo", type_=Literal["foo", "bar"]) # 6.80μs -> 6.26μs (8.74% faster)
    codeflash_output = construct_type_unchecked(value="bar", type_=Literal["foo", "bar"]) # 2.79μs -> 2.69μs (3.91% faster)

def test_annotated_basic():
    # Basic: Annotated type (should act as its base type)
    codeflash_output = construct_type_unchecked(value=42, type_=Annotated[int, "meta"]) # 9.66μs -> 8.62μs (12.1% faster)


def test_union_basic():
    # Basic: Union type
    codeflash_output = construct_type_unchecked(value=1, type_=Union[int, str]) # 14.3μs -> 13.4μs (6.83% faster)
    codeflash_output = construct_type_unchecked(value="abc", type_=Union[int, str]) # 4.86μs -> 4.65μs (4.65% faster)

def test_datetime_basic():
    # Basic: datetime from string
    dt_str = "2020-01-01T12:00:00Z"
    codeflash_output = construct_type_unchecked(value=dt_str, type_=datetime); result = codeflash_output # 25.0μs -> 23.5μs (6.47% faster)

def test_date_basic():
    # Basic: date from string
    d_str = "2020-01-01"
    codeflash_output = construct_type_unchecked(value=d_str, type_=date); result = codeflash_output # 15.4μs -> 15.4μs (0.364% faster)

# 2. Edge Test Cases

def test_int_to_float_edge():
    # Edge: int that cannot be exactly represented as float
    val = 2**53 + 1
    codeflash_output = construct_type_unchecked(value=val, type_=float); result = codeflash_output # 8.12μs -> 7.84μs (3.49% faster)

def test_list_wrong_type_edge():
    # Edge: value is not a list, but type is list
    codeflash_output = construct_type_unchecked(value="notalist", type_=List[int]); result = codeflash_output # 7.37μs -> 6.96μs (5.84% faster)

def test_dict_wrong_type_edge():
    # Edge: value is not a dict, but type is dict
    codeflash_output = construct_type_unchecked(value="notadict", type_=Dict[str, int]); result = codeflash_output # 7.66μs -> 7.71μs (0.675% slower)

def test_literal_invalid_edge():
    # Edge: value not in Literal options
    codeflash_output = construct_type_unchecked(value="baz", type_=Literal["foo", "bar"]); result = codeflash_output # 6.28μs -> 6.03μs (4.26% faster)

def test_union_no_match_edge():
    # Edge: value does not match any union variant
    codeflash_output = construct_type_unchecked(value=None, type_=Union[int, str]); result = codeflash_output # 51.0μs -> 49.0μs (4.05% faster)

def test_datetime_invalid_edge():
    # Edge: invalid datetime string
    codeflash_output = construct_type_unchecked(value="notadatetime", type_=datetime); result = codeflash_output # 12.4μs -> 11.9μs (4.90% faster)

def test_date_invalid_edge():
    # Edge: invalid date string
    codeflash_output = construct_type_unchecked(value="notadate", type_=date); result = codeflash_output # 11.1μs -> 10.7μs (3.39% faster)


def test_nested_list_dict_edge():
    # Edge: deeply nested list/dict
    val = [{"a": [1, 2]}, {"a": [3, 4]}]
    type_ = List[Dict[str, List[float]]]
    codeflash_output = construct_type_unchecked(value=val, type_=type_); result = codeflash_output # 35.2μs -> 32.8μs (7.23% faster)

def test_annotated_with_metadata_edge():
    # Edge: Annotated type with metadata
    codeflash_output = construct_type_unchecked(value=123, type_=Annotated[int, "meta", "moremeta"]) # 9.32μs -> 8.28μs (12.6% faster)

# 3. Large Scale Test Cases

def test_large_list_of_ints():
    # Large: list of 1000 ints to floats
    vals = list(range(1000))
    codeflash_output = construct_type_unchecked(value=vals, type_=List[float]); result = codeflash_output # 2.07ms -> 1.75ms (18.1% faster)

def test_large_dict_of_ints():
    # Large: dict of 1000 str->int to str->float
    vals = {str(i): i for i in range(1000)}
    codeflash_output = construct_type_unchecked(value=vals, type_=Dict[str, float]); result = codeflash_output # 2.12ms -> 1.80ms (17.4% faster)

def test_large_nested_structure():
    # Large: nested structure (list of dicts of lists)
    vals = [{"a": list(range(10))} for _ in range(100)]
    type_ = List[Dict[str, List[float]]]
    codeflash_output = construct_type_unchecked(value=vals, type_=type_); result = codeflash_output # 2.56ms -> 2.18ms (17.0% faster)



def test_none_type():
    # Edge: value is None, type is Optional[int]
    codeflash_output = construct_type_unchecked(value=None, type_=Optional[int]) # 13.5μs -> 12.8μs (5.60% faster)

def test_empty_list():
    # Edge: empty list
    codeflash_output = construct_type_unchecked(value=[], type_=List[int]) # 8.36μs -> 7.41μs (12.8% faster)

def test_empty_dict():
    # Edge: empty dict
    codeflash_output = construct_type_unchecked(value={}, type_=Dict[str, int]) # 8.67μs -> 7.52μs (15.3% faster)

def test_list_of_none():
    # Edge: list of None values
    codeflash_output = construct_type_unchecked(value=[None, None], type_=List[Optional[int]]); result = codeflash_output # 17.3μs -> 16.6μs (4.31% faster)

def test_dict_of_none():
    # Edge: dict of None values
    codeflash_output = construct_type_unchecked(value={"a": None, "b": None}, type_=Dict[str, Optional[int]]); result = codeflash_output # 16.3μs -> 15.5μs (5.02% faster)

def test_nested_union():
    # Edge: nested union types
    type_ = Union[List[int], Dict[str, int]]
    codeflash_output = construct_type_unchecked(value=[1, 2, 3], type_=type_) # 10.9μs -> 10.4μs (4.78% faster)
    codeflash_output = construct_type_unchecked(value={"a": 1}, type_=type_) # 5.80μs -> 6.29μs (7.79% slower)

def test_union_with_literal():
    # Edge: Union with Literal
    type_ = Union[Literal["foo"], int]
    codeflash_output = construct_type_unchecked(value="foo", type_=type_) # 10.2μs -> 9.72μs (5.08% faster)
    codeflash_output = construct_type_unchecked(value=42, type_=type_) # 5.21μs -> 5.06μs (3.00% faster)

def test_datetime_from_unix_timestamp():
    # Edge: datetime from unix timestamp
    ts = 1609459200  # 2021-01-01T00:00:00Z
    codeflash_output = construct_type_unchecked(value=ts, type_=datetime); result = codeflash_output # 18.7μs -> 18.5μs (1.34% faster)

def test_date_from_unix_timestamp():
    # Edge: date from unix timestamp
    ts = 1609459200  # 2021-01-01T00:00:00Z
    codeflash_output = construct_type_unchecked(value=ts, type_=date); result = codeflash_output # 13.6μs -> 12.7μs (6.81% faster)


#------------------------------------------------
from datetime import date, datetime
from typing import (Annotated, Any, Dict, Generic, List, Literal, Optional,
                    TypeVar, Union)

# imports
import pytest
from openai._models import construct_type_unchecked
from typing_extensions import TypeAliasType

# --- Test Models for Pydantic scenarios ---
try:
    import pydantic

    class SimpleModel(pydantic.BaseModel):
        a: int
        b: str

    class NestedModel(pydantic.BaseModel):
        x: int
        y: SimpleModel

    class DiscriminatedFoo(pydantic.BaseModel):
        kind: Literal["foo"]
        value: str

    class DiscriminatedBar(pydantic.BaseModel):
        kind: Literal["bar"]
        value: int

    DiscriminatedUnion = Union[DiscriminatedFoo, DiscriminatedBar]

except ImportError:
    SimpleModel = None
    NestedModel = None
    DiscriminatedFoo = None
    DiscriminatedBar = None
    DiscriminatedUnion = None

# --- Basic Test Cases ---
def test_basic_int():
    # Should return value unchanged for int type
    codeflash_output = construct_type_unchecked(value=5, type_=int) # 8.31μs -> 7.86μs (5.77% faster)

def test_basic_float_from_int():
    # Should coerce int to float
    codeflash_output = construct_type_unchecked(value=7, type_=float) # 7.19μs -> 7.15μs (0.588% faster)

def test_basic_float_from_float():
    # Should return float unchanged
    codeflash_output = construct_type_unchecked(value=3.14, type_=float) # 6.87μs -> 6.35μs (8.30% faster)

def test_basic_str():
    # Should return string unchanged
    codeflash_output = construct_type_unchecked(value="hello", type_=str) # 6.56μs -> 6.04μs (8.51% faster)

def test_basic_list_of_int():
    # Should construct list of ints
    codeflash_output = construct_type_unchecked(value=[1, 2, 3], type_=List[int]) # 16.1μs -> 15.3μs (5.26% faster)

def test_basic_dict_of_str_int():
    # Should construct dict with int values
    codeflash_output = construct_type_unchecked(value={"a": 1, "b": 2}, type_=Dict[str, int]) # 15.1μs -> 14.4μs (4.42% faster)

def test_basic_union_int_str():
    # Should match first type in Union
    codeflash_output = construct_type_unchecked(value=42, type_=Union[int, str]) # 11.1μs -> 10.9μs (1.77% faster)
    codeflash_output = construct_type_unchecked(value="test", type_=Union[int, str]) # 4.87μs -> 4.57μs (6.61% faster)

def test_basic_literal():
    # Should accept only values in Literal
    codeflash_output = construct_type_unchecked(value="foo", type_=Literal["foo", "bar"]) # 6.24μs -> 5.72μs (9.09% faster)
    codeflash_output = construct_type_unchecked(value="bar", type_=Literal["foo", "bar"]) # 2.79μs -> 2.68μs (4.03% faster)
    codeflash_output = construct_type_unchecked(value="baz", type_=Literal["foo", "bar"]) # 2.24μs -> 2.08μs (7.84% faster)

def test_basic_annotated():
    # Should unwrap Annotated and treat as base type
    codeflash_output = construct_type_unchecked(value=123, type_=Annotated[int, "meta"]) # 9.46μs -> 8.52μs (11.1% faster)


def test_edge_empty_list():
    # Should handle empty list
    codeflash_output = construct_type_unchecked(value=[], type_=List[int]) # 7.56μs -> 6.70μs (12.8% faster)

def test_edge_empty_dict():
    # Should handle empty dict
    codeflash_output = construct_type_unchecked(value={}, type_=Dict[str, int]) # 8.01μs -> 6.73μs (19.0% faster)

def test_edge_list_wrong_type():
    # Should return value as-is if not a list
    codeflash_output = construct_type_unchecked(value="notalist", type_=List[int]) # 6.73μs -> 6.40μs (5.14% faster)

def test_edge_dict_wrong_type():
    # Should return value as-is if not a dict
    codeflash_output = construct_type_unchecked(value="notadict", type_=Dict[str, int]) # 6.91μs -> 6.42μs (7.50% faster)


def test_edge_float_from_non_int():
    # Should return value unchanged if not int or float
    codeflash_output = construct_type_unchecked(value="notanumber", type_=float) # 9.31μs -> 8.31μs (12.1% faster)

def test_edge_datetime_from_str():
    # Should parse valid ISO datetime string
    codeflash_output = construct_type_unchecked(value="2022-01-01T12:34:56Z", type_=datetime); dt = codeflash_output # 24.1μs -> 23.8μs (0.902% faster)

def test_edge_datetime_from_invalid_str():
    # Should return value as-is if string is not valid datetime
    codeflash_output = construct_type_unchecked(value="notadatetime", type_=datetime) # 11.7μs -> 11.1μs (5.47% faster)

def test_edge_date_from_str():
    # Should parse valid ISO date string
    codeflash_output = construct_type_unchecked(value="2022-01-01", type_=date); d = codeflash_output # 15.4μs -> 15.1μs (2.47% faster)

def test_edge_date_from_invalid_str():
    # Should return value as-is if string is not valid date
    codeflash_output = construct_type_unchecked(value="notadate", type_=date) # 11.0μs -> 10.4μs (6.25% faster)

def test_edge_pydantic_model_construct():
    # Should construct pydantic model from dict
    if SimpleModel is not None:
        codeflash_output = construct_type_unchecked(value={"a": 1, "b": "x"}, type_=SimpleModel); obj = codeflash_output # 10.6μs -> 9.26μs (14.9% faster)

def test_edge_pydantic_model_nested():
    # Should construct nested pydantic model
    if NestedModel is not None:
        codeflash_output = construct_type_unchecked(value={"x": 5, "y": {"a": 2, "b": "y"}}, type_=NestedModel); obj = codeflash_output # 9.68μs -> 8.45μs (14.6% faster)

def test_edge_union_discriminated():
    # Should construct correct discriminated union variant
    if DiscriminatedUnion is not None:
        codeflash_output = construct_type_unchecked(value={"kind": "foo", "value": "abc"}, type_=DiscriminatedUnion); foo_obj = codeflash_output # 22.0μs -> 21.3μs (3.03% faster)
        codeflash_output = construct_type_unchecked(value={"kind": "bar", "value": 123}, type_=DiscriminatedUnion); bar_obj = codeflash_output # 6.45μs -> 6.34μs (1.72% faster)

def test_edge_optional_type():
    # Should handle Optional types (Union with None)
    codeflash_output = construct_type_unchecked(value=None, type_=Optional[int]) # 8.41μs -> 8.22μs (2.25% faster)
    codeflash_output = construct_type_unchecked(value=7, type_=Optional[int]) # 3.57μs -> 3.62μs (1.24% slower)

def test_edge_list_of_dicts():
    # Should construct list of dicts
    value = [{"a": 1}, {"a": 2}]
    codeflash_output = construct_type_unchecked(value=value, type_=List[Dict[str, int]]); result = codeflash_output # 22.6μs -> 21.2μs (6.51% faster)

def test_edge_list_of_lists():
    # Should construct list of lists
    value = [[1, 2], [3, 4]]
    codeflash_output = construct_type_unchecked(value=value, type_=List[List[int]]); result = codeflash_output # 23.1μs -> 20.9μs (10.1% faster)

def test_edge_dict_of_lists():
    # Should construct dict of lists
    value = {"x": [1, 2], "y": [3, 4]}
    codeflash_output = construct_type_unchecked(value=value, type_=Dict[str, List[int]]); result = codeflash_output # 24.1μs -> 21.7μs (11.4% faster)

def test_edge_list_of_optional():
    # Should handle list of Optional[int]
    value = [1, None, 3]
    codeflash_output = construct_type_unchecked(value=value, type_=List[Optional[int]]); result = codeflash_output # 19.1μs -> 17.7μs (7.47% faster)

def test_edge_dict_with_non_str_keys():
    # Should return value as-is if dict keys are not strings
    value = {1: "a", 2: "b"}
    codeflash_output = construct_type_unchecked(value=value, type_=Dict[str, str]); result = codeflash_output # 14.6μs -> 13.3μs (10.3% faster)

# --- Large Scale Test Cases ---
def test_large_list_of_ints():
    # Should handle large lists efficiently
    large_list = list(range(1000))
    codeflash_output = construct_type_unchecked(value=large_list, type_=List[int]); result = codeflash_output # 1.98ms -> 1.65ms (20.4% faster)

def test_large_dict_of_ints():
    # Should handle large dicts efficiently
    large_dict = {str(i): i for i in range(1000)}
    codeflash_output = construct_type_unchecked(value=large_dict, type_=Dict[str, int]); result = codeflash_output # 2.04ms -> 1.73ms (17.8% faster)

def test_large_nested_structure():
    # Should handle large nested structures
    large_nested = [{"a": i, "b": str(i)} for i in range(1000)]
    if SimpleModel is not None:
        codeflash_output = construct_type_unchecked(value=large_nested, type_=List[SimpleModel]); result = codeflash_output # 3.10ms -> 2.56ms (21.2% faster)
        for i, obj in enumerate(result):
            pass

def test_large_list_of_lists():
    # Should handle large list of lists
    large_list_of_lists = [[i for i in range(10)] for _ in range(100)]
    codeflash_output = construct_type_unchecked(value=large_list_of_lists, type_=List[List[int]]); result = codeflash_output # 2.21ms -> 1.91ms (15.7% faster)

def test_large_dict_of_lists():
    # Should handle large dict of lists
    large_dict_of_lists = {str(i): [i, i+1] for i in range(500)}
    codeflash_output = construct_type_unchecked(value=large_dict_of_lists, type_=Dict[str, List[int]]); result = codeflash_output # 3.16ms -> 2.75ms (14.9% faster)

def test_large_list_of_dicts():
    # Should handle large list of dicts
    large_list_of_dicts = [{"x": i, "y": i + 1} for i in range(500)]
    codeflash_output = construct_type_unchecked(value=large_list_of_dicts, type_=List[Dict[str, int]]); result = codeflash_output # 3.26ms -> 2.77ms (17.6% faster)


def test_large_list_of_optional():
    # Should handle large list of Optional[int]
    large_list = [i if i % 2 == 0 else None for i in range(1000)]
    codeflash_output = construct_type_unchecked(value=large_list, type_=List[Optional[int]]); result = codeflash_output # 2.03ms -> 1.98ms (2.30% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run `git checkout codeflash/optimize-construct_type_unchecked-mhd0u03v` and push.


@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 30, 2025 06:06
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash (Optimization PR opened by Codeflash AI) and 🎯 Quality: High (Optimization Quality according to Codeflash) labels Oct 30, 2025