⚡️ Speed up function add_codeflash_decorator_to_code by 17% in PR #294 (add-timing-info-to-generated-tests)
#297
⚡️ This pull request contains optimizations for PR #294
If you approve this dependent PR, these changes will be merged into the original PR branch `add-timing-info-to-generated-tests`.

📄 17% (0.17x) speedup for `add_codeflash_decorator_to_code` in `codeflash/benchmarking/instrument_codeflash_trace.py`

⏱️ Runtime: 596 milliseconds → 510 milliseconds (best of 18 runs)

📝 Explanation and details
Here's an optimized version of your program. The main bottleneck is not the construction of `target_functions` (a tiny fraction of the runtime) but the parsing and transformation done with libcst. Still, gathering `target_functions` can be tightened with a set comprehension and tuple unpacking, avoiding repeated attribute lookups.

Most of the time is spent in `module.visit(transformer)` and `cst.parse_module`. If you have control over how the transformer (i.e., `AddDecoratorTransformer`) is written, make it as restrictive and fast as possible, using `visit_`/`leave_` methods that early-exit on non-target nodes. Below, only the requested function is rewritten, minimizing unnecessary slow steps and wasted computation while preserving the code's logic and interface.
Changes:
- Build the `(class_name, function_name)` pairs with a set comprehension, for fewer attribute accesses and tighter bytecode.
- If `target_functions` is empty, return the original code immediately (this prevents any parsing/visiting when there is nothing to decorate).

Notes:
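The two changes combine into a function shaped roughly like the sketch below. The `FunctionToOptimize` fields here are assumptions standing in for the real codeflash types, so the sketch stays self-contained; the libcst parse/visit step is left as a placeholder comment.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FunctionToOptimize:
    """Minimal stand-in for codeflash's real class (fields are assumptions)."""
    function_name: str
    class_name: Optional[str] = None

def add_codeflash_decorator_to_code(code: str, functions: list) -> str:
    # Set comprehension with tuple unpacking: one pass, one attribute
    # lookup per field, and O(1) membership checks for the transformer.
    target_functions = {(f.class_name, f.function_name) for f in functions}
    if not target_functions:
        # Early exit: skip cst.parse_module / module.visit entirely.
        return code
    # ... here the real function parses `code` with libcst and runs
    # AddDecoratorTransformer(target_functions) over the module ...
    return code  # placeholder so the sketch is runnable

# With no targets, the source is returned untouched without any parsing.
print(add_codeflash_decorator_to_code("def foo(): pass", []))  # → def foo(): pass
```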
- `AddDecoratorTransformer` can be restructured for early exits, in its own definition (not here), for further improvement.
- One could check `isinstance(module, cst.Module)` to skip reparsing, but per the signature we always expect a `str`.

If you want even more speed, the next step is to restrict what `AddDecoratorTransformer` does (match by qualified name to avoid visiting subtrees needlessly). Let me know if you want to see transformer optimizations as well!
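The early-exit traversal described above can be illustrated without libcst: a visitor that declines to descend into subtrees that cannot contain a target touches far fewer nodes. A minimal stdlib `ast` sketch of the principle (class and method names are hypothetical):

```python
import ast

# Hypothetical (class_name, function_name) targets.
TARGETS = {("Calculator", "add")}
TARGET_CLASSES = {cls for cls, _ in TARGETS}

class TargetFinder(ast.NodeVisitor):
    """Records target methods while skipping non-target class bodies."""
    def __init__(self):
        self.found = []
        self._class = None

    def visit_ClassDef(self, node):
        if node.name not in TARGET_CLASSES:
            return  # early exit: never visit this class's subtree
        self._class = node.name
        self.generic_visit(node)
        self._class = None

    def visit_FunctionDef(self, node):
        if (self._class, node.name) in TARGETS:
            self.found.append((self._class, node.name))

source = (
    "class Calculator:\n"
    "    def add(self, a, b):\n"
    "        return a + b\n"
    "class Other:\n"
    "    def add(self):\n"
    "        pass\n"
)
finder = TargetFinder()
finder.visit(ast.parse(source))
print(finder.found)  # → [('Calculator', 'add')]
```

In libcst the same effect is achieved by returning `False` from a `visit_` method, which tells the visitor not to descend into that node's children.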
✅ Correctness verification report:
⚙️ Existing Unit Tests Details
🌀 Generated Regression Tests Details
To edit these changes, run `git checkout codeflash/optimize-pr294-2025-06-06T05.50.14` and push.