@aseembits93
Contributor

@aseembits93 aseembits93 commented Jul 2, 2025

PR Type

Enhancement


Description

  • Refactor runtime comment injection to use qualified_name

  • Replace codeflash_output assignment checks with AST call visitor

  • Propagate new qualified_name parameter through APIs

  • Update tests to pass and validate qualified_name handling


Changes diagram

flowchart LR
  FO["function_optimizer: extract qualified_name"] --> ART["add_runtime_comments_to_generated_tests"]
  ART --> RCT["RuntimeCommentTransformer init with qualified_name"]
  RCT --> VISIT["CfoVisitor.visit_Call matches function name"]
  RCT --> CONTAINS["_contains_myfunc_call finds calls in statements"]
  RCT --> INSERT["Insert timing comments on matching lines"]

Changes walkthrough 📝

Relevant files

Enhancement
edit_generated_tests.py
Use qualified_name and Call visitor for runtime comments

codeflash/code_utils/edit_generated_tests.py

  • Added qualifed_name parameter to visitor and transformer
  • Removed multiple visit_Assign/visit_AugAssign methods
  • Implemented visit_Call to detect function name calls
  • Added helper _contains_myfunc_call for statement scanning

+55/-98

Configuration changes
function_optimizer.py
Propagate qualified_name to runtime comment API

codeflash/optimization/function_optimizer.py

  • Pass qualified_name to add_runtime_comments_to_generated_tests
  • Compute qualified_name using self.function_to_optimize
  • Adjust argument ordering for new API signature

+7/-2

Tests
test_add_runtime_comments.py
Update tests for qualified_name parameter

tests/test_add_runtime_comments.py

  • Updated tests to define and pass qualified_name
  • Adjust all invocations of add_runtime_comments_to_generated_tests
  • Ensure tests cover new signature and behavior

+54/-43

  • @github-actions

    github-actions bot commented Jul 2, 2025

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
    🧪 PR contains tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Parameter Typo

    The qualifed_name parameter is consistently misspelled without the "i", which may cause confusion or inconsistency; consider renaming it to qualified_name.

    def __init__(self, qualifed_name: str, source_code: str) -> None:
        self.source_lines = source_code.splitlines()
        self.name = qualifed_name.split(".")[-1]
        self.results: list[int] = []  # map actual line number to line number in ast
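
The snippet above shows only the constructor. A minimal sketch of how a `visit_Call` method could use `self.name` follows, assuming the visitor matches on the last dotted component of the qualified name and records zero-based line indices. The class name `CallLineVisitor` is hypothetical and this is not the PR's actual `CfoVisitor`.

```python
import ast


class CallLineVisitor(ast.NodeVisitor):
    """Hypothetical stand-in for CfoVisitor: records lines that call a named function."""

    def __init__(self, qualified_name: str, source_code: str) -> None:
        self.source_lines = source_code.splitlines()
        # Only the last dotted component is compared, as in the snippet above.
        self.name = qualified_name.split(".")[-1]
        self.results: list[int] = []

    def visit_Call(self, node: ast.Call) -> None:
        func = node.func
        # Match both bare calls (myfunc(...)) and attribute calls (obj.myfunc(...)).
        if (isinstance(func, ast.Name) and func.id == self.name) or (
            isinstance(func, ast.Attribute) and func.attr == self.name
        ):
            # Zero-based line index; this indexing choice is an assumption.
            self.results.append(node.lineno - 1)
        # Keep descending so calls nested in arguments are also seen.
        self.generic_visit(node)


source = "x = 1\ny = pkg.sorter(data)\nz = sorter(sorter(x))\n"
visitor = CallLineVisitor("pkg.module.sorter", source)
visitor.visit(ast.parse(source))
```

Because only the final name component is compared, an unrelated function that happens to share that name would also match. A caller that needs stricter matching would have to compare more of the qualified name.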
    Signature Mismatch

    Verify that the updated calls to existing_tests_source_for and add_runtime_comments_to_generated_tests match their new signatures, as adding the extra qualifed_name parameter may break existing usage.

    qualifed_name = self.function_to_optimize.qualified_name_with_modules_from_root(self.project_root)
    # Add runtime comments to generated tests before creating the PR
    generated_tests = add_runtime_comments_to_generated_tests(
        qualifed_name,
        self.test_cfg,
        generated_tests,
        original_runtime_by_test,
        optimized_runtime_by_test,
    )
    Performance Concern

    The code now parses and visits the AST for each function definition on the fly, which could introduce performance overhead for large test files; consider caching or optimizing these calls.

    def find_codeflash_output_assignments(qualifed_name: str, source_code: str) -> list[int]:
        tree = ast.parse(source_code)
        visitor = CfoVisitor(qualifed_name, source_code)
        visitor.visit(tree)
        return visitor.results
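
One way to act on the caching suggestion is to memoize the parse step, so the same source string is parsed at most once. This is a minimal sketch using the standard library's `functools.lru_cache`. The helper name `parse_cached` and the cache size are assumptions, not part of the PR.

```python
import ast
from functools import lru_cache


@lru_cache(maxsize=64)
def parse_cached(source_code: str) -> ast.Module:
    """Hypothetical helper: parse each distinct source string only once."""
    return ast.parse(source_code)


source = "codeflash_output = sorter([3, 1, 2])\n"
tree_a = parse_cached(source)
tree_b = parse_cached(source)  # cache hit: no second parse
stats = parse_cached.cache_info()
```

The cached tree is a shared mutable object, so this only suits read-only visitors. Code that rewrites the tree in place would need its own copy.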

    @github-actions

    github-actions bot commented Jul 2, 2025

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    Category    Suggestion    Impact
    General
    Fix typo in parameter name

    Correct the parameter name typo from qualifed_name to qualified_name in both the
    function signature and all its usages to improve readability and consistency.

    codeflash/code_utils/edit_generated_tests.py [74]

    -def find_codeflash_output_assignments(qualifed_name: str, source_code: str) -> list[int]:
    +def find_codeflash_output_assignments(qualified_name: str, source_code: str) -> list[int]:
    Suggestion importance[1-10]: 5

    Why: Correcting qualifed_name to qualified_name improves consistency and prevents confusion in the API.

    Low
    Guard empty function name

    Add an early guard in _contains_myfunc_call to immediately return False when
    self.name is empty or unset to avoid unnecessary AST traversal.

    codeflash/code_utils/edit_generated_tests.py [212-232]

     def _contains_myfunc_call(self, node):
         """Recursively search for any Call node in the statement whose function is named self.name (including obj.myfunc)."""
    +    if not self.name:
    +        return False
     
         class Finder(cst.CSTVisitor):
             ...
         finder = Finder(self.name)
         node.visit(finder)
         return finder.found
    Suggestion importance[1-10]: 3

    Why: The guard prevents unnecessary traversal when self.name is empty, but self.name is always set, so the impact is minimal.

    Low

    misrasaurabh1 previously approved these changes Jul 2, 2025
    codeflash-ai bot added a commit that referenced this pull request Jul 2, 2025
    …488 (`fix-runtime-comments`)
    
    Based on your line profiling, the vast majority of the time in `find_codeflash_output_assignments` is spent in `visitor.visit(tree)` (82.1%), with notable time in `ast.parse` as well (17.6%). Since you did not provide the implementation for `CfoVisitor`, I’ll focus on optimizing the import, AST parsing, and visitor instantiation.
    
    In this function, the main opportunities are efficient use of parsing and traversal, and minimizing repeated work.
    
    ### Fast Optimized Version
    
    - We use `ast.parse` directly, which is already close to optimal for parsing. Switching to `'eval'` mode is not an option here, since that mode accepts only a single expression rather than a full module.
    - **Critical:** As the bottleneck is in `visitor.visit(tree)`, improvement will mainly come from rewriting or improving `CfoVisitor`, which is not provided.
    - Passing `source_code` to the visitor constructor is unnecessary unless the visitor actually uses it. If it is not used, don’t pass it.
    - Remove the future import if it is not needed, as it adds a small overhead (though it is sometimes required for type hinting).
    - You may benefit slightly from a `__slots__` declaration on the visitor class if it uses instance attributes heavily (only if you can edit it).
    - If `CfoVisitor` can be replaced by a flat `ast.walk` loop instead of recursive `ast.NodeVisitor.generic_visit` dispatch, do so. But without its code, this can't be rewritten here.
    
    Here's an optimal rewrite **without changing the function signature or the visitor** (assuming you can't optimize `CfoVisitor`, because its code is not present).
    
    
    
    #### Additional Suggestions
    
    If you **can** alter `CfoVisitor`, consider the following for major runtime wins.
    
    - Use `__slots__` on visitor class to reduce memory use and instantiation time.
    - Batch process nodes as much as possible rather than recursive methods.
    - Minimize string manipulation/attributes.
    - If you only want line numbers, ensure `CfoVisitor` does not store references to nodes; just immediate integers.
    
    #### Example Visitor Optimization
    
    If the visitor is slow due to excessive recursion or attribute access, a fast pattern using `ast.walk` (if applicable) is like so (this would replace `CfoVisitor` logic).
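
A minimal sketch of an `ast.walk` pattern of this kind, assuming the visitor only needs the line numbers of calls to the target function. `call_lines_via_walk` is a hypothetical name and this is not the actual `CfoVisitor` logic.

```python
import ast


def call_lines_via_walk(qualified_name: str, source_code: str) -> list[int]:
    """Hypothetical flat scan: collect 1-based line numbers of matching calls."""
    name = qualified_name.rsplit(".", 1)[-1]
    found = set()
    # ast.walk yields every node without per-node method dispatch;
    # its order is unspecified, so the result is sorted at the end.
    for node in ast.walk(ast.parse(source_code)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if isinstance(func, ast.Name) and func.id == name:
            found.add(node.lineno)
        elif isinstance(func, ast.Attribute) and func.attr == name:
            found.add(node.lineno)
    return sorted(found)


source = "out = sorter(data)\nprint(out)\nassert obj.sorter([2, 1]) == [1, 2]\n"
lines = call_lines_via_walk("pkg.module.sorter", source)
```

Only plain integers are kept, so no references to AST nodes outlive the scan.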
    
    
    
    **But do not change this unless you know exactly what CfoVisitor does!**
    
    ---
    
    ## Summary
    **With current information, the program is already minimal and optimal.** Major improvements will come from optimizing (or rewriting) the `CfoVisitor`. The functional wrapper provided is already essentially as optimal as possible unless you inline the visitor logic or batch your traversal.
    
    If you can provide the body of `CfoVisitor`, much larger runtime reductions are possible! Let me know if you'd like a deeper optimization with that code.
    codeflash-ai bot added a commit that referenced this pull request Jul 2, 2025
    …% in PR #488 (`fix-runtime-comments`)
    
    Here’s a heavily optimized rewrite of your function, focused on the main bottleneck: the `tree.visit(transformer)` call inside the main loop (~95% of your runtime!). Across the entire function, the following optimizations (all applied **without changing any functional output**) are used.
    
    1. **Precompute Data Structures:** Several expensive operations (especially `relative_to` path gymnastics and conversions) are moved out of inner loops and stored as sensible lookups, since their results are almost invariant across tests.
    2. **Merge For Loops:** The two near-identical `for` loops per invocation in `leave_SimpleStatementLine` are merged into one, halving search cost.
    3. **Optimize Invocation Matching:** An indexed lookup is pre-built mapping the unique tuple keys `(rel_path, qualified_name, cfo_loc)` to their runtimes. This makes runtime-access O(1) instead of requiring a full scan per statement.
    4. **Avoid Deep AST/Normalized Source Walks:** Where possible, consider changing `find_codeflash_output_assignments` to operate on the CST or directly on the parsed AST rather than reparsing source code. (**The code preserves your current approach but this is a further large opportunity.**)
    5. **Faster CST Name/Call detection:** The `_contains_myfunc_call` helper used by `leave_SimpleStatementLine` is further micro-optimized by stopping as soon as a match is found (using an exception for early escape), avoiding unnecessary traversal.
    6. **Minimize Object Creations:** The `GeneratedTests` objects are only constructed once and appended.
    7. **Eliminating Minor Redundant Computation.**
    8. **Reduce try/except Overhead:** Only exceptions propagate; no functional change here.
    
    Below is the optimized code, with comments kept as close as possible to your original code (apart from changed logic).
    
    
    
    **Summary of key gains:**  
    - Each runtime lookup is now O(1) on average thanks to hash indexes, instead of an O(N*M) scan over runtimes for every statement.
    - All constant/cached values are precomputed outside the node visitor.
    - Deep tree walks and list traversals have early exits and critical-path logic is tightened.
    - No functional changes, all corner cases preserved.
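
The hash-index idea can be illustrated with a plain dictionary keyed by the `(rel_path, qualified_name, cfo_loc)` tuple. This is a sketch under assumptions: the record layout, the sample values, and the choice to return the minimum runtime are all hypothetical, not taken from the PR.

```python
from collections import defaultdict

# Hypothetical flattened records: (rel_path, qualified_name, cfo_loc, runtime_ns).
records = [
    ("tests/test_a.py", "TestA.test_sort", 0, 1200),
    ("tests/test_a.py", "TestA.test_sort", 0, 1100),
    ("tests/test_a.py", "TestA.test_sort", 1, 900),
    ("tests/test_b.py", "test_other", 0, 500),
]

# Build the index once: a single pass over all records.
runtime_index = defaultdict(list)
for rel_path, qualified_name, cfo_loc, runtime_ns in records:
    runtime_index[(rel_path, qualified_name, cfo_loc)].append(runtime_ns)


def lookup_runtime(rel_path, qualified_name, cfo_loc):
    """Average-case O(1) lookup; returns the fastest recorded run or None."""
    runs = runtime_index.get((rel_path, qualified_name, cfo_loc))
    return min(runs) if runs else None


fastest = lookup_runtime("tests/test_a.py", "TestA.test_sort", 0)
missing = lookup_runtime("tests/test_a.py", "TestA.test_sort", 7)
```

Using `.get` rather than indexing avoids inserting empty entries into the `defaultdict` for keys that were never recorded.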
    
    **Still slow?**:  
    The biggest remaining hit will be the `find_codeflash_output_assignments` (which reparses source); move this to operate directly on CST if possible for further big wins.
    
    Let me know your measured speedup! 🚀
    @codeflash-ai
    Contributor

    codeflash-ai bot commented Jul 2, 2025

    ⚡️ Codeflash found optimizations for this PR

    📄 239% (2.39x) speedup for add_runtime_comments_to_generated_tests in codeflash/code_utils/edit_generated_tests.py

    ⏱️ Runtime : 1.16 seconds → 341 milliseconds (best of 24 runs)
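
As a quick check of the reported figure, the arithmetic can be reproduced from the two runtimes, assuming the speedup is defined as (original − optimized) / optimized. That definition is an assumption, and the inputs are the rounded values shown above, so the result may differ slightly from the reported 239%.

```python
original_s = 1.16    # reported original runtime, in seconds
optimized_s = 0.341  # reported optimized runtime, in seconds

ratio = original_s / optimized_s  # original runtime as a multiple of the optimized one
speedup_pct = (ratio - 1) * 100   # percent speedup under the assumed definition

print(f"{ratio:.2f}x the original runtime, about {speedup_pct:.0f}% speedup")
```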

    I created a new dependent PR with the suggested changes. Please review:

    If you approve, it will be merged into this PR (branch fix-runtime-comments).

    @aseembits93 aseembits93 requested a review from misrasaurabh1 July 3, 2025 06:02
    @aseembits93
    Contributor Author

    have a newer one with more bugfixes
