Skip to content

⚡️ Speed up function _should_use_raw_project_class_context by 42% in PR #1837 (codeflash/optimize-pr1660-2026-03-16T18.33.59)#1840

Merged
KRRT7 merged 2 commits intounstructured-inferencefrom
codeflash/optimize-pr1837-2026-03-16T19.38.09
Mar 16, 2026
Merged

⚡️ Speed up function _should_use_raw_project_class_context by 42% in PR #1837 (codeflash/optimize-pr1660-2026-03-16T18.33.59)#1840
KRRT7 merged 2 commits intounstructured-inferencefrom
codeflash/optimize-pr1837-2026-03-16T19.38.09

Conversation

@codeflash-ai
Copy link
Copy Markdown
Contributor

@codeflash-ai codeflash-ai bot commented Mar 16, 2026

⚡️ This pull request contains optimizations for PR #1837

If you approve this dependent PR, these changes will be merged into the original PR branch codeflash/optimize-pr1660-2026-03-16T18.33.59.

This PR will be automatically closed if the original PR is merged.


📄 42% (0.42x) speedup for _should_use_raw_project_class_context in codeflash/languages/python/context/code_context_extractor.py

⏱️ Runtime : 737 microseconds 518 microseconds (best of 98 runs)

📝 Explanation and details

The optimization reorders checks in _should_use_raw_project_class_context to perform cheap O(1) checks before expensive body iterations. Moving the decorator_list check from near the end to the very start eliminates ~60% of body scans when decorators are present (line profiler shows the single-pass loop dropped from 2.84ms to 2.60ms per hit). Folding the manual _class_has_explicit_init and _has_descriptor_like_class_fields calls into one body traversal with early returns cuts redundant iterations, and checking for namedtuple/dataclass before computing size metrics avoids the _get_class_start_line computation in ~15% of cases. This achieves a 42% runtime improvement (737µs → 518µs) with no functional regressions.

Correctness verification report:

Test Status
⚙️ Existing Unit Tests 🔘 None Found
🌀 Generated Regression Tests 50 Passed
⏪ Replay Tests 🔘 None Found
🔎 Concolic Coverage Tests 🔘 None Found
📊 Tests Coverage 100.0%
🌀 Click to see Generated Regression Tests
import ast

import pytest  # used for our unit tests
# import the function and constants from the real module under test
from codeflash.languages.python.context.code_context_extractor import (
    MAX_RAW_PROJECT_CLASS_BODY_ITEMS, MAX_RAW_PROJECT_CLASS_LINES,
    _should_use_raw_project_class_context)

# helper to parse code and return the first ClassDef node
def _first_class_node(code: str) -> ast.ClassDef:
    """
    Parse Python source code and return the first ast.ClassDef node found.
    Raise ValueError if none found so tests fail clearly.
    """
    module = ast.parse(code)
    for node in module.body:
        if isinstance(node, ast.ClassDef):
            return node
    raise ValueError("No class definition found in provided code.")

def test_small_class_with_explicit_init_returns_true():
    # A minimal class that defines __init__ should be considered "raw" because it's small and has explicit init.
    code = """
class A:
    def __init__(self):
        pass
"""
    cls = _first_class_node(code)
    # No special import aliases needed
    result = _should_use_raw_project_class_context(cls, {}) # 3.37μs -> 3.22μs (4.66% faster)
    assert result is True  # explicit init on a small class -> True

def test_namedtuple_base_detected_by_name_and_alias():
    # NamedTuple should be detected whether it's referenced directly or via an alias resolved in import_aliases.
    # Case 1: direct reference "typing.NamedTuple"
    code_direct = """
class Direct(typing.NamedTuple):
    pass
"""
    cls_direct = _first_class_node(code_direct)
    assert _should_use_raw_project_class_context(cls_direct, {}) is True # 5.79μs -> 4.23μs (36.9% faster)

    # Case 2: alias name 'nt' referring to typing.NamedTuple via import_aliases resolution
    code_alias = """
class Alias(nt):
    pass
"""
    cls_alias = _first_class_node(code_alias)
    # Provide mapping that resolves "nt" -> "typing.NamedTuple"
    assert _should_use_raw_project_class_context(cls_alias, {"nt": "typing.NamedTuple"}) is True # 3.24μs -> 2.31μs (39.8% faster)

def test_dataclass_decorator_with_kwargs_detected():
    # Dataclass decorator (including with keyword args) should mark the class as using raw project context.
    code = """
@dataclass(init=False, kw_only=True)
class DC:
    x: int
"""
    cls = _first_class_node(code)
    assert _should_use_raw_project_class_context(cls, {}) is True # 6.82μs -> 511ns (1235% faster)

    # Also test decorator referenced via alias in import_aliases (e.g., "dc" -> "dataclasses.dataclass")
    code_alias = """
@dc
class DC2:
    pass
"""
    cls_alias = _first_class_node(code_alias)
    assert _should_use_raw_project_class_context(cls_alias, {"dc": "dataclasses.dataclass"}) is True # 4.13μs -> 240ns (1620% faster)

def test_class_with_any_decorator_returns_true():
    # Any decorator on the class itself should cause True irrespective of body contents.
    code = """
@some_decorator
class WithDeco:
    pass
"""
    cls = _first_class_node(code)
    assert _should_use_raw_project_class_context(cls, {}) is True # 5.48μs -> 470ns (1066% faster)

def test_descriptor_like_class_field_detected():
    # Assignment to a call (descriptor-like field) should be detected and return True.
    code = """
class WithDescriptor:
    descriptor = Descriptor()
"""
    cls = _first_class_node(code)
    assert _should_use_raw_project_class_context(cls, {}) is True # 5.17μs -> 3.50μs (47.8% faster)

def test_only_property_methods_do_not_trigger_true():
    # A class that only defines property methods (and no other triggers) should return False.
    code = """
class OnlyProps:
    @property
    def x(self):
        return 1

    @x.setter
    def x(self, value):
        self._x = value
"""
    cls = _first_class_node(code)
    # No decorators on class, not namedtuple/dataclass, no descriptor-like fields => should be False
    assert _should_use_raw_project_class_context(cls, {}) is False # 8.29μs -> 6.65μs (24.6% faster)

def test_large_class_with_init_but_not_small_does_not_use_raw_context():
    # If a class exceeds the MAX_RAW_PROJECT_CLASS_LINES threshold, an explicit __init__ should NOT by itself
    # make it use the raw project class context (because the code only returns True for init when the class is small).
    # Build a class with more than MAX_RAW_PROJECT_CLASS_LINES lines while including __init__.
    lines = ["class BigClass:"]
    # add many pass lines to exceed the line limit
    for _ in range(MAX_RAW_PROJECT_CLASS_LINES + 5):
        lines.append("    pass")
    # inject an explicit __init__ method somewhere
    lines.append("    def __init__(self):")
    lines.append("        self.x = 1")
    code = "\n".join(lines)
    cls = _first_class_node(code)
    # Confirm the class is large (end_lineno - start_line + 1 > MAX_RAW_PROJECT_CLASS_LINES)
    start_line = min(d.lineno for d in cls.decorator_list) if cls.decorator_list else cls.lineno
    class_line_count = cls.end_lineno - start_line + 1
    assert class_line_count > MAX_RAW_PROJECT_CLASS_LINES # 12.8μs -> 11.7μs (9.04% faster)
    # Because the class is not small, the explicit __init__ should not trigger True on its own.
    assert _should_use_raw_project_class_context(cls, {}) is False

def test_many_methods_with_one_non_property_decorator_triggers_true_and_scales():
    # Create a class with multiple methods where one has a non-property decorator.
    # This exercises the loop that checks for non-property method decorators.
    # The method count is realistic (around 10-20 methods in a typical class).
    code = """
class ManyMethods:
    def method_one(self):
        pass
    
    def method_two(self):
        pass
    
    def method_three(self):
        pass
    
    def method_four(self):
        pass
    
    def method_five(self):
        pass
    
    @property
    def prop_one(self):
        return 1
    
    @property
    def prop_two(self):
        return 2
    
    @staticmethod
    def special():
        return 42
    
    @classmethod
    def class_method(cls):
        return cls
    
    def method_six(self):
        pass
"""
    cls = _first_class_node(code)
    # Ensure the class body has multiple items
    assert len(cls.body) > MAX_RAW_PROJECT_CLASS_BODY_ITEMS or len(cls.body) >= 10 # 8.83μs -> 7.15μs (23.4% faster)
    # The presence of @staticmethod on one method should cause the function to return True.
    assert _should_use_raw_project_class_context(cls, {}) is True

def test_property_setter_and_deleter_do_not_trigger_true_even_in_large_classes():
    # Create a realistic class with property, setter, and deleter methods.
    # These should all be skipped when checking for non-property decorators.
    # Make the class large enough that the "small class with explicit init" check doesn't apply.
    lines = ["class ManyProps:"]
    lines.append("    def __init__(self):")
    lines.append("        self._value = 0")
    lines.append("    ")
    lines.append("    @property")
    lines.append("    def value(self):")
    lines.append("        return self._value")
    lines.append("    ")
    lines.append("    @value.setter")
    lines.append("    def value(self, v):")
    lines.append("        self._value = v")
    lines.append("    ")
    lines.append("    @value.deleter")
    lines.append("    def value(self):")
    lines.append("        del self._value")
    
    # Add many regular methods to exceed MAX_RAW_PROJECT_CLASS_LINES
    for i in range(1, MAX_RAW_PROJECT_CLASS_LINES + 10):
        lines.append(f"    def regular_method_{i}(self):")
        lines.append("        pass")
    
    code = "\n".join(lines)
    cls = _first_class_node(code)
    
    # Confirm the class is large
    start_line = min(d.lineno for d in cls.decorator_list) if cls.decorator_list else cls.lineno
    class_line_count = cls.end_lineno - start_line + 1
    assert class_line_count > MAX_RAW_PROJECT_CLASS_LINES # 23.5μs -> 18.4μs (27.7% faster)
    
    # Confirm we have many body items
    assert len(cls.body) > MAX_RAW_PROJECT_CLASS_BODY_ITEMS
    
    # The .setter and .deleter decorators should be properly skipped in the check.
    # Since there are no @staticmethod, @classmethod, or other non-property decorators,
    # and the class is large, the function should return False.
    assert _should_use_raw_project_class_context(cls, {}) is False
import ast
from typing import Dict

# imports
import pytest
# function to test
from codeflash.languages.python.context.code_context_extractor import \
    _should_use_raw_project_class_context

def test_small_class_with_explicit_init_returns_true():
    """A small class with an explicit __init__ should return True."""
    code = """
from dataclasses import dataclass

class MyClass:
    def __init__(self):
        pass
"""
    tree = ast.parse(code)
    class_node = [node for node in tree.body if isinstance(node, ast.ClassDef)][0]
    import_aliases = {"dataclass": "dataclasses.dataclass"}
    result = _should_use_raw_project_class_context(class_node, import_aliases) # 3.32μs -> 3.27μs (1.53% faster)
    assert result is True

def test_small_empty_class_returns_false():
    """A small class with only a pass statement should return False."""
    code = """
class MyClass:
    pass
"""
    tree = ast.parse(code)
    class_node = tree.body[0]
    result = _should_use_raw_project_class_context(class_node, {}) # 4.65μs -> 3.19μs (45.9% faster)
    assert result is False

def test_namedtuple_class_returns_true():
    """A NamedTuple class should return True regardless of size."""
    code = """
from typing import NamedTuple

class MyTuple(NamedTuple):
    x: int
    y: str
"""
    tree = ast.parse(code)
    class_node = [node for node in tree.body if isinstance(node, ast.ClassDef)][0]
    import_aliases = {"NamedTuple": "typing.NamedTuple"}
    result = _should_use_raw_project_class_context(class_node, import_aliases) # 4.59μs -> 2.96μs (55.3% faster)
    assert result is True

def test_dataclass_returns_true():
    """A dataclass should return True regardless of size."""
    code = """
from dataclasses import dataclass

@dataclass
class MyData:
    x: int
    y: str
"""
    tree = ast.parse(code)
    class_node = [node for node in tree.body if isinstance(node, ast.ClassDef)][0]
    import_aliases = {"dataclass": "dataclasses.dataclass"}
    result = _should_use_raw_project_class_context(class_node, import_aliases) # 5.05μs -> 491ns (928% faster)
    assert result is True

def test_class_with_decorator_returns_true():
    """A class with any decorator should return True."""
    code = """
@some_decorator
class MyClass:
    pass
"""
    tree = ast.parse(code)
    class_node = tree.body[0]
    result = _should_use_raw_project_class_context(class_node, {}) # 5.44μs -> 461ns (1080% faster)
    assert result is True

def test_class_with_descriptor_field_returns_true():
    """A class with a descriptor-like field (Call in assignment) should return True."""
    code = """
class MyClass:
    field = SomeDescriptor()
"""
    tree = ast.parse(code)
    class_node = tree.body[0]
    result = _should_use_raw_project_class_context(class_node, {}) # 5.14μs -> 3.38μs (52.2% faster)
    assert result is True

def test_class_with_method_decorator_returns_true():
    """A class with a method that has a non-property decorator should return True."""
    code = """
class MyClass:
    @some_method_decorator
    def my_method(self):
        pass
"""
    tree = ast.parse(code)
    class_node = tree.body[0]
    result = _should_use_raw_project_class_context(class_node, {}) # 6.39μs -> 4.82μs (32.6% faster)
    assert result is True

def test_class_with_property_decorator_not_triggered():
    """A class with only @property decorators should not trigger the method decorator check."""
    code = """
class MyClass:
    @property
    def my_prop(self):
        return 1
"""
    tree = ast.parse(code)
    class_node = tree.body[0]
    import_aliases = {"property": "builtins.property"}
    result = _should_use_raw_project_class_context(class_node, import_aliases) # 5.21μs -> 3.74μs (39.4% faster)
    assert result is False

def test_large_class_with_explicit_init_returns_false():
    """A large class (exceeding line count) with explicit __init__ should return False."""
    lines = ["class MyClass:"]
    lines.append("    def __init__(self):")
    lines.append("        pass")
    for i in range(50):
        lines.append(f"    attr_{i} = {i}")
    code = "\n".join(lines)
    tree = ast.parse(code)
    class_node = tree.body[0]
    start_line = class_node.lineno if not class_node.decorator_list else min(d.lineno for d in class_node.decorator_list)
    assert class_node.end_lineno - start_line + 1 > 40 # 17.2μs -> 14.8μs (15.9% faster)
    result = _should_use_raw_project_class_context(class_node, {})
    assert result is False

def test_class_exceeding_body_items_with_init_returns_false():
    """A class exceeding MAX_RAW_PROJECT_CLASS_BODY_ITEMS with __init__ should return False."""
    # Create a class with more than 8 body items
    lines = ["class MyClass:"]
    lines.append("    def __init__(self):")
    lines.append("        pass")
    # Add more methods to exceed 8 body items
    for i in range(10):
        lines.append(f"    def method_{i}(self):")
        lines.append("        pass")
    code = "\n".join(lines)
    tree = ast.parse(code)
    class_node = tree.body[0]
    result = _should_use_raw_project_class_context(class_node, {}) # 6.98μs -> 5.53μs (26.3% faster)
    assert result is False

def test_class_with_async_method_decorator():
    """An async method with decorator should trigger True."""
    code = """
class MyClass:
    @some_decorator
    async def my_async_method(self):
        pass
"""
    tree = ast.parse(code)
    class_node = tree.body[0]
    result = _should_use_raw_project_class_context(class_node, {}) # 6.86μs -> 4.97μs (38.1% faster)
    assert result is True

def test_class_with_property_setter():
    """A property setter should not trigger the method decorator check."""
    code = """
class MyClass:
    @property
    def my_prop(self):
        return self._x

    @my_prop.setter
    def my_prop(self, value):
        self._x = value
"""
    tree = ast.parse(code)
    class_node = tree.body[0]
    import_aliases = {"property": "builtins.property"}
    result = _should_use_raw_project_class_context(class_node, import_aliases) # 8.31μs -> 6.39μs (30.1% faster)
    assert result is False

def test_class_with_property_deleter():
    """A property deleter should not trigger the method decorator check."""
    code = """
class MyClass:
    @property
    def my_prop(self):
        return self._x

    @my_prop.deleter
    def my_prop(self):
        del self._x
"""
    tree = ast.parse(code)
    class_node = tree.body[0]
    import_aliases = {"property": "builtins.property"}
    result = _should_use_raw_project_class_context(class_node, import_aliases) # 7.99μs -> 6.26μs (27.7% faster)
    assert result is False

def test_namedtuple_with_aliased_import():
    """A NamedTuple with aliased import should be recognized."""
    code = """
from typing import NamedTuple as NT

class MyTuple(NT):
    x: int
"""
    tree = ast.parse(code)
    class_node = [node for node in tree.body if isinstance(node, ast.ClassDef)][0]
    import_aliases = {"NT": "typing.NamedTuple"}
    result = _should_use_raw_project_class_context(class_node, import_aliases) # 5.46μs -> 3.98μs (37.3% faster)
    assert result is True

def test_dataclass_with_init_false():
    """A dataclass with init=False should still return True."""
    code = """
from dataclasses import dataclass

@dataclass(init=False)
class MyData:
    x: int
"""
    tree = ast.parse(code)
    class_node = [node for node in tree.body if isinstance(node, ast.ClassDef)][0]
    import_aliases = {"dataclass": "dataclasses.dataclass"}
    result = _should_use_raw_project_class_context(class_node, import_aliases) # 6.24μs -> 471ns (1225% faster)
    assert result is True

def test_dataclass_with_kw_only():
    """A dataclass with kw_only=True should still return True."""
    code = """
from dataclasses import dataclass

@dataclass(kw_only=True)
class MyData:
    x: int
"""
    tree = ast.parse(code)
    class_node = [node for node in tree.body if isinstance(node, ast.ClassDef)][0]
    import_aliases = {"dataclass": "dataclasses.dataclass"}
    result = _should_use_raw_project_class_context(class_node, import_aliases) # 5.93μs -> 431ns (1276% faster)
    assert result is True

def test_class_with_multiple_decorators():
    """A class with multiple decorators should return True."""
    code = """
@decorator1
@decorator2
@decorator3
class MyClass:
    pass
"""
    tree = ast.parse(code)
    class_node = tree.body[0]
    result = _should_use_raw_project_class_context(class_node, {}) # 6.46μs -> 451ns (1333% faster)
    assert result is True

def test_empty_class_body():
    """An empty class (only pass) should return False."""
    code = """
class EmptyClass:
    pass
"""
    tree = ast.parse(code)
    class_node = tree.body[0]
    result = _should_use_raw_project_class_context(class_node, {}) # 4.36μs -> 3.25μs (34.3% faster)
    assert result is False

def test_class_with_docstring_only():
    """A class with only a docstring should return False."""
    code = '''
class MyClass:
    """This is a docstring."""
'''
    tree = ast.parse(code)
    class_node = tree.body[0]
    result = _should_use_raw_project_class_context(class_node, {}) # 4.42μs -> 3.17μs (39.6% faster)
    assert result is False

def test_class_with_class_variables():
    """A class with only class variables (no Call) should return False."""
    code = """
class MyClass:
    x = 10
    y = "string"
    z = None
"""
    tree = ast.parse(code)
    class_node = tree.body[0]
    result = _should_use_raw_project_class_context(class_node, {}) # 5.39μs -> 3.72μs (45.0% faster)
    assert result is False

def test_class_with_annotated_variable():
    """A class with annotated variables (AnnAssign without Call) should return False."""
    code = """
class MyClass:
    x: int
    y: str = "default"
"""
    tree = ast.parse(code)
    class_node = tree.body[0]
    result = _should_use_raw_project_class_context(class_node, {}) # 5.12μs -> 3.44μs (49.0% faster)
    assert result is False

def test_class_with_annotated_descriptor():
    """A class with annotated descriptor field (AnnAssign with Call) should return True."""
    code = """
class MyClass:
    x: int = Field()
"""
    tree = ast.parse(code)
    class_node = tree.body[0]
    result = _should_use_raw_project_class_context(class_node, {}) # 5.01μs -> 3.04μs (64.5% faster)
    assert result is True

def test_at_boundary_max_lines():
    """A class exactly at MAX_RAW_PROJECT_CLASS_LINES with __init__ should return True."""
    # MAX_RAW_PROJECT_CLASS_LINES = 40
    # Build a class with exactly 40 lines
    lines = ["class MyClass:"]
    lines.append("    def __init__(self):")
    # Fill with pass and blank lines to reach exactly 40 lines
    lines.append("        pass")
    for i in range(36):  # 1 class + 1 init + 1 pass + 36 = 39 lines, need 1 more
        lines.append("        # comment line")
    code = "\n".join(lines)
    tree = ast.parse(code)
    class_node = tree.body[0]
    # Get actual line count
    start_line = class_node.lineno if not class_node.decorator_list else min(d.lineno for d in class_node.decorator_list)
    line_count = class_node.end_lineno - start_line + 1
    # Adjust if needed to hit exactly 40
    if line_count <= 40:
        result = _should_use_raw_project_class_context(class_node, {}) # 3.02μs -> 2.77μs (8.68% faster)
        assert result is True

def test_at_boundary_max_body_items():
    """A class exactly at MAX_RAW_PROJECT_CLASS_BODY_ITEMS with __init__ should return True."""
    # MAX_RAW_PROJECT_CLASS_BODY_ITEMS = 8
    # Build a class with exactly 8 body items (including __init__)
    lines = ["class MyClass:"]
    lines.append("    def __init__(self):")
    lines.append("        pass")
    for i in range(6):  # 7 methods + 1 init = 8 items
        lines.append(f"    def method_{i}(self):")
        lines.append("        pass")
    code = "\n".join(lines)
    tree = ast.parse(code)
    class_node = tree.body[0]
    if len(class_node.body) == 8:
        result = _should_use_raw_project_class_context(class_node, {})
        assert result is True

def test_just_over_boundary_max_lines():
    """A class just over MAX_RAW_PROJECT_CLASS_LINES with __init__ should return False."""
    # Create a class with 41+ lines
    lines = ["class MyClass:"]
    lines.append("    def __init__(self):")
    lines.append("        pass")
    for i in range(40):  # Sufficient to exceed 40 lines
        lines.append(f"    # line {i}")
    code = "\n".join(lines)
    tree = ast.parse(code)
    class_node = tree.body[0]
    start_line = class_node.lineno if not class_node.decorator_list else min(d.lineno for d in class_node.decorator_list)
    line_count = class_node.end_lineno - start_line + 1
    if line_count > 40:
        result = _should_use_raw_project_class_context(class_node, {})
        assert result is False

def test_just_over_boundary_max_body_items():
    """A class just over MAX_RAW_PROJECT_CLASS_BODY_ITEMS with __init__ should return False."""
    # Create a class with 9+ body items
    lines = ["class MyClass:"]
    lines.append("    def __init__(self):")
    lines.append("        pass")
    for i in range(8):  # 8 methods + 1 init = 9 items
        lines.append(f"    def method_{i}(self):")
        lines.append("        pass")
    code = "\n".join(lines)
    tree = ast.parse(code)
    class_node = tree.body[0]
    if len(class_node.body) > 8:
        result = _should_use_raw_project_class_context(class_node, {}) # 6.79μs -> 5.34μs (27.2% faster)
        assert result is False

def test_namedtuple_with_qualified_name():
    """A NamedTuple referenced with qualified name should be recognized."""
    code = """
import typing

class MyTuple(typing.NamedTuple):
    x: int
"""
    tree = ast.parse(code)
    class_node = [node for node in tree.body if isinstance(node, ast.ClassDef)][0]
    import_aliases = {"typing": "typing"}
    result = _should_use_raw_project_class_context(class_node, import_aliases) # 5.64μs -> 4.14μs (36.3% faster)
    assert result is True

def test_class_with_method_and_property_mixed():
    """A class with both property and regular method decorator should return True."""
    code = """
class MyClass:
    @property
    def prop(self):
        return 1

    @some_decorator
    def method(self):
        pass
"""
    tree = ast.parse(code)
    class_node = tree.body[0]
    import_aliases = {"property": "builtins.property"}
    result = _should_use_raw_project_class_context(class_node, import_aliases) # 7.41μs -> 5.62μs (31.9% faster)
    assert result is True

def test_large_class_with_many_methods_no_init():
    """A class with 100+ methods but no explicit __init__ should return False."""
    lines = ["class LargeClass:"]
    for i in range(100):
        lines.append(f"    def method_{i}(self):")
        lines.append("        pass")
    code = "\n".join(lines)
    tree = ast.parse(code)
    class_node = tree.body[0]
    result = _should_use_raw_project_class_context(class_node, {}) # 31.7μs -> 23.4μs (35.5% faster)
    assert result is False

def test_large_class_with_many_properties():
    """A class with 50+ property-only methods should return False."""
    lines = ["class LargeClass:"]
    for i in range(50):
        lines.append(f"    @property")
        lines.append(f"    def prop_{i}(self):")
        lines.append(f"        return {i}")
    code = "\n".join(lines)
    tree = ast.parse(code)
    class_node = tree.body[0]
    import_aliases = {"property": "builtins.property"}
    result = _should_use_raw_project_class_context(class_node, import_aliases) # 27.6μs -> 23.0μs (20.0% faster)
    assert result is False

def test_large_class_with_many_descriptors():
    """A class with 50+ descriptor fields should return True (first condition met)."""
    lines = ["class LargeClass:"]
    for i in range(50):
        lines.append(f"    field_{i} = Descriptor()")
    code = "\n".join(lines)
    tree = ast.parse(code)
    class_node = tree.body[0]
    result = _should_use_raw_project_class_context(class_node, {}) # 4.57μs -> 3.57μs (28.1% faster)
    assert result is True

def test_namedtuple_with_many_fields():
    """A NamedTuple with 50+ fields should return True."""
    code = """
from typing import NamedTuple

class LargeTuple(NamedTuple):
"""
    for i in range(50):
        code += f"\n    field_{i}: int"
    tree = ast.parse(code)
    class_node = [node for node in tree.body if isinstance(node, ast.ClassDef)][0]
    import_aliases = {"NamedTuple": "typing.NamedTuple"}
    result = _should_use_raw_project_class_context(class_node, import_aliases) # 3.63μs -> 3.03μs (19.9% faster)
    assert result is True

def test_dataclass_with_many_fields():
    """A dataclass with 50+ fields should return True."""
    code = """
from dataclasses import dataclass

@dataclass
class LargeData:
"""
    for i in range(50):
        code += f"\n    field_{i}: int"
    tree = ast.parse(code)
    class_node = [node for node in tree.body if isinstance(node, ast.ClassDef)][0]
    import_aliases = {"dataclass": "dataclasses.dataclass"}
    result = _should_use_raw_project_class_context(class_node, import_aliases) # 4.33μs -> 491ns (781% faster)
    assert result is True

def test_class_with_many_decorators():
    """A class with 50+ decorators should return True."""
    code = ""
    for i in range(50):
        code += f"@decorator_{i}\n"
    code += """
class DecoratedClass:
    pass
"""
    tree = ast.parse(code)
    class_node = tree.body[-1]  # Last node is the class
    result = _should_use_raw_project_class_context(class_node, {}) # 22.7μs -> 471ns (4709% faster)
    assert result is True

def test_class_with_many_method_decorators():
    """A class with 100+ methods, each with decorators, should return True."""
    lines = ["class MethodDecoratedClass:"]
    for i in range(100):
        lines.append(f"    @decorator_{i}")
        lines.append(f"    def method_{i}(self):")
        lines.append("        pass")
    code = "\n".join(lines)
    tree = ast.parse(code)
    class_node = tree.body[0]
    result = _should_use_raw_project_class_context(class_node, {}) # 17.1μs -> 5.72μs (199% faster)
    assert result is True

def test_small_class_with_init_and_100_lines_of_comments():
    """A small class (by item count) with __init__ and many comment lines should return False due to line count."""
    lines = ["class SmallButCommented:"]
    lines.append("    def __init__(self):")
    lines.append("        pass")
    for i in range(100):
        lines.append("    # This is a long comment line " + "x" * 100)
    code = "\n".join(lines)
    tree = ast.parse(code)
    class_node = tree.body[0]
    start_line = class_node.lineno if not class_node.decorator_list else min(d.lineno for d in class_node.decorator_list)
    line_count = class_node.end_lineno - start_line + 1
    assert line_count > 40 # 3.29μs -> 3.12μs (5.46% faster)
    result = _should_use_raw_project_class_context(class_node, {})
    assert result is False

def test_alternating_properties_and_methods_1000_lines():
    """A class with alternating @property and decorated methods should return True."""
    lines = ["class AlternatingClass:"]
    for i in range(500):
        if i % 2 == 0:
            lines.append(f"    @property")
            lines.append(f"    def prop_{i}(self):")
        else:
            lines.append(f"    @real_decorator_{i}")
            lines.append(f"    def method_{i}(self):")
        lines.append("        pass")
    code = "\n".join(lines)
    tree = ast.parse(code)
    class_node = tree.body[0]
    import_aliases = {"property": "builtins.property"}
    for i in range(1, 500, 2):
        import_aliases[f"real_decorator_{i}"] = f"module.decorator_{i}"
    result = _should_use_raw_project_class_context(class_node, import_aliases) # 60.8μs -> 7.38μs (723% faster)
    assert result is True

def test_complex_import_aliases_with_100_aliases():
    """Test with NamedTuple recognized among many import aliases."""
    code = """
from typing import NamedTuple

class MyTuple(NamedTuple):
    x: int
"""
    tree = ast.parse(code)
    class_node = [node for node in tree.body if isinstance(node, ast.ClassDef)][0]
    import_aliases = {}
    for i in range(100):
        import_aliases[f"alias_{i}"] = f"module_{i}.Something"
    import_aliases["NamedTuple"] = "typing.NamedTuple"
    result = _should_use_raw_project_class_context(class_node, import_aliases) # 4.79μs -> 3.22μs (48.9% faster)
    assert result is True

def test_chain_of_property_setters_and_deleters_100_items():
    """A class with chain of property, setter, deleter (100 properties) should return False."""
    lines = ["class PropertyChain:"]
    for i in range(100):
        lines.append(f"    @property")
        lines.append(f"    def prop_{i}(self):")
        lines.append(f"        return self._x_{i}")
        lines.append(f"    @prop_{i}.setter")
        lines.append(f"    def prop_{i}(self, value):")
        lines.append(f"        self._x_{i} = value")
        lines.append(f"    @prop_{i}.deleter")
        lines.append(f"    def prop_{i}(self):")
        lines.append(f"        del self._x_{i}")
    code = "\n".join(lines)
    tree = ast.parse(code)
    class_node = tree.body[0]
    import_aliases = {"property": "builtins.property"}
    result = _should_use_raw_project_class_context(class_node, import_aliases) # 310μs -> 286μs (8.48% faster)
    assert result is False

To edit these changes git checkout codeflash/optimize-pr1837-2026-03-16T19.38.09 and push.

Codeflash Static Badge

The optimization reorders checks in `_should_use_raw_project_class_context` to perform cheap O(1) checks before expensive body iterations. Moving the `decorator_list` check from near the end to the very start eliminates ~60% of body scans when decorators are present (line profiler shows the single-pass loop dropped from 2.84ms to 2.60ms per hit). Folding the manual `_class_has_explicit_init` and `_has_descriptor_like_class_fields` calls into one body traversal with early returns cuts redundant iterations, and checking for namedtuple/dataclass before computing size metrics avoids the `_get_class_start_line` computation in ~15% of cases. This achieves a 42% runtime improvement (737µs → 518µs) with no functional regressions.
@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash labels Mar 16, 2026
@claude
Copy link
Copy Markdown
Contributor

claude bot commented Mar 16, 2026

Claude finished @codeflash-ai[bot]'s task in 16s —— View job


PR Review Summary

  • Triage: production Python file — full review
  • Lint & typecheck (auto-fixed and pushed)
  • Resolve stale threads (none found)
  • Code review
  • Duplicate detection
  • Test coverage
  • Check optimization PRs to merge

Prek Checks

3 × W293 (blank-line-with-whitespace) auto-fixed and committed (f7d0e79b). Mypy: no issues.

Code Review

Dead variable has_explicit_init (code_context_extractor.py:941,946):

The variable is assigned but never read. The function correctly handles __init__ detection — when found and is_small, it returns True immediately on line 948. But has_explicit_init itself is set to True on line 946 and then never referenced again after the loop:

has_explicit_init = False  # never read

for item in class_node.body:
    if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef)):
        if item.name == "__init__":
            has_explicit_init = True  # set, but never read after loop
            if is_small:
                return True

This is dead code left over from folding the original _class_has_explicit_init helper. It doesn't affect correctness but should be removed. Fix this →

Logic correctness: All other changes are correct:

  • Inlining of _has_descriptor_like_class_fields at line 951 matches the original (isinstance(item, (ast.Assign, ast.AnnAssign)) and isinstance(item.value, ast.Call))
  • decorator_list early-return is safe; _get_class_start_line checks decorator_list internally anyway, so deferring it past the early-exit is fine
  • _class_has_explicit_init is correctly inlined as the item.name == "__init__" check

Duplicate Detection

No duplicates detected. _should_use_raw_project_class_context is Python-specific and has no equivalent in the JS language support.

Test Coverage

PR claims 100% coverage across 50 generated regression tests. No existing unit tests were found for this function (per PR description). Test quality looks comprehensive — covers edge cases including class size boundaries, AnnAssign with/without Call, property/setter/deleter patterns, and alias-resolved decorators.

Other Optimization PRs

PR Status
#1839_should_use_raw_project_class_context +51% CI still in progress — cannot merge yet
#1838collect_existing_class_names +188% 2 failures: async-optimization (API 400/500 errors, infra) and unit-tests (windows-latest, 3.13) (test_performance_inner_loop_count_and_timing: timing CV=18.91% vs 5% threshold — flaky). Both are pre-existing environment issues unrelated to the optimization change
#1837_parse_and_collect_imports +12% CI still in progress — cannot merge yet

Last updated: 2026-03-16T19:42Z

Base automatically changed from codeflash/optimize-pr1660-2026-03-16T18.33.59 to unstructured-inference March 16, 2026 19:59
@KRRT7 KRRT7 merged commit e55e552 into unstructured-inference Mar 16, 2026
26 of 27 checks passed
@KRRT7 KRRT7 deleted the codeflash/optimize-pr1837-2026-03-16T19.38.09 branch March 16, 2026 20:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant