
⚡️ Speed up function get_optimized_code_for_module by 21% in PR #1774 (feat/gradle-executor-from-java) #1849

Closed
codeflash-ai[bot] wants to merge 2 commits into feat/gradle-executor-from-java from codeflash/optimize-pr1774-2026-03-17T02.35.54

@codeflash-ai codeflash-ai bot commented Mar 17, 2026

⚡️ This pull request contains optimizations for PR #1774

If you approve this dependent PR, these changes will be merged into the original PR branch feat/gradle-executor-from-java.

This PR will be automatically closed if the original PR is merged.


📄 21% (0.21x) speedup for get_optimized_code_for_module in codeflash/languages/code_replacer.py

⏱️ Runtime: 18.8 milliseconds → 15.5 milliseconds (best of 37 runs)

📝 Explanation and details

The hot path in basename matching (30.9% of original runtime) was replaced with an early-exit loop that avoids building an intermediate list and constructing Path objects for every dictionary entry: the optimized code uses os.path.basename (a simple string slice) instead of Path(path).name and stops scanning after finding two matches. Line profiler confirms the original list-comprehension cost ~20.7 ms across all calls, now reduced to ~3.8 ms by iterating once and breaking early. Debug logging was guarded with isEnabledFor checks to skip expensive f-string formatting when debug is off, saving ~70 µs per suppressed log. Runtime improved 21% with no correctness regressions.
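
The change described above can be sketched as follows. This is a minimal illustration, not the PR's actual code: the function names `basename_match_original` and `basename_match_early_exit` and the parameter `file_to_code` are hypothetical, and the real function also handles exact matches and other fallbacks.

```python
import os
from pathlib import Path


def basename_match_original(file_to_code: dict[str, str], target_name: str) -> str:
    """List-comprehension form: builds a Path per key and scans every entry."""
    matches = [code for path, code in file_to_code.items() if Path(path).name == target_name]
    return matches[0] if len(matches) == 1 else ""


def basename_match_early_exit(file_to_code: dict[str, str], target_name: str) -> str:
    """Loop form: plain string basename, stops once a second match proves ambiguity."""
    first_match = ""
    match_count = 0
    for path, code in file_to_code.items():
        if os.path.basename(path) == target_name:
            match_count += 1
            if match_count == 1:
                first_match = code
            else:
                break  # two matches already mean "ambiguous"; no need to keep scanning
    return first_match if match_count == 1 else ""
```

Both forms return the unique basename match or an empty string when there are zero or several matches. One caveat worth keeping in mind: `os.path.basename` and `Path.name` disagree on paths with a trailing separator (`os.path.basename("pkg/")` is `""`, while `Path("pkg/").name` is `"pkg"`), which should not matter when every key is a file path.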

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 19 Passed |
| 🌀 Generated Regression Tests | 38 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
⚙️ Existing Unit Tests

| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| test_code_replacer_matching.py::TestGetOptimizedCodeForModule.test_basename_match | 34.0μs | 28.1μs | 21.1%✅ |
| test_code_replacer_matching.py::TestGetOptimizedCodeForModule.test_basename_match_with_different_prefix | 34.3μs | 27.7μs | 23.8%✅ |
| test_code_replacer_matching.py::TestGetOptimizedCodeForModule.test_exact_path_match | 25.5μs | 32.2μs | -20.8%⚠️ |
| test_code_replacer_matching.py::TestGetOptimizedCodeForModule.test_no_match_returns_empty | 518μs | 519μs | -0.237%⚠️ |
| test_code_replacer_matching.py::TestGetOptimizedCodeForModule.test_none_key_fallback | 28.8μs | 26.9μs | 7.27%✅ |
| test_code_replacer_matching.py::TestGetOptimizedCodeForModule.test_none_key_with_multiple_blocks_no_match | 504μs | 512μs | -1.59%⚠️ |
| test_code_replacer_matching.py::TestGetOptimizedCodeForModule.test_single_block_fallback_non_python | 43.6μs | 36.5μs | 19.4%✅ |
| test_code_replacer_matching.py::TestGetOptimizedCodeForModule.test_single_block_fallback_python_does_not_match | 506μs | 520μs | -2.57%⚠️ |
| test_languages/test_get_optimized_code_for_module.py::test_basename_fallback_ambiguous_returns_empty | 486μs | 496μs | -1.99%⚠️ |
| test_languages/test_get_optimized_code_for_module.py::test_basename_fallback_different_directory | 27.5μs | 20.4μs | 34.9%✅ |
| test_languages/test_get_optimized_code_for_module.py::test_basename_fallback_skips_non_matching_context_files | 30.4μs | 21.7μs | 40.3%✅ |
| test_languages/test_get_optimized_code_for_module.py::test_context_files_only_returns_empty | 500μs | 504μs | -0.819%⚠️ |
| test_languages/test_get_optimized_code_for_module.py::test_empty_markdown_returns_empty | 456μs | 469μs | -2.58%⚠️ |
| test_languages/test_get_optimized_code_for_module.py::test_exact_match_preferred_over_basename | 17.2μs | 17.1μs | 0.468%✅ |
| test_languages/test_get_optimized_code_for_module.py::test_exact_path_match_picks_correct_file | 17.3μs | 17.2μs | 0.646%✅ |
| test_languages/test_get_optimized_code_for_module.py::test_exact_path_match_single_file | 16.5μs | 15.9μs | 3.70%✅ |
| test_languages/test_get_optimized_code_for_module.py::test_no_match_returns_empty | 472μs | 481μs | -1.98%⚠️ |
| test_languages/test_get_optimized_code_for_module.py::test_none_path_fallback_ignored_when_named_blocks_exist | 489μs | 504μs | -3.03%⚠️ |
| test_languages/test_get_optimized_code_for_module.py::test_none_path_fallback_single_block | 19.9μs | 15.5μs | 28.6%✅ |

🌀 Generated Regression Tests
from pathlib import Path

# imports
import codeflash.languages.current as current_module
from codeflash.languages.code_replacer import get_optimized_code_for_module
from codeflash.models.models import CodeStringsMarkdown


def test_returns_exact_match_simple():
    # Create a CodeStringsMarkdown instance and populate its internal cache directly.
    # We avoid relying on how code_strings are validated by pydantic by injecting the
    # precomputed file_to_path mapping into the private _cache which file_to_path() uses.
    cm = CodeStringsMarkdown()  # create the model instance using real constructor

    target_path = Path("a/module.py")  # relative path we'll search for
    expected_code = "print('exact match')"

    # Put the mapping directly into the private cache so file_to_path() returns it.
    cm._cache["file_to_path"] = {str(target_path): expected_code}

    # When there is an exact string match for the relative path, the function should return it.
    result = get_optimized_code_for_module(target_path, cm, allow_fallback=True)  # 7.93μs -> 7.78μs (1.93% faster)
    assert result == expected_code  # exact match should be returned


def test_no_fallback_returns_empty_when_missing():
    # Prepare a CodeStringsMarkdown instance with no entry for the requested file.
    cm = CodeStringsMarkdown()
    cm._cache["file_to_path"] = {"other.py": "some code"}

    # With allow_fallback=False and no exact match, the function must return an empty string.
    result = get_optimized_code_for_module(
        Path("missing.py"), cm, allow_fallback=False
    )  # 9.33μs -> 9.68μs (3.63% slower)
    assert result == ""


def test_fallback_single_none_key_used():
    # If optimized output contains a single code block with key "None",
    # the function should use that block as a fallback when allow_fallback=True.
    cm = CodeStringsMarkdown()
    cm._cache["file_to_path"] = {"None": "fallback code for unspecified path"}

    result = get_optimized_code_for_module(
        Path("any/path.py"), cm, allow_fallback=True
    )  # 14.8μs -> 10.8μs (37.1% faster)
    assert result == "fallback code for unspecified path"


def test_basename_match_returns_when_unique():
    # When there's no exact path match but a single entry matches by basename,
    # the function should return that entry's code.
    cm = CodeStringsMarkdown()
    # The LLM returned code for "src/other/module.py" but we are asking for "module.py"
    cm._cache["file_to_path"] = {"src/other/module.py": "code for module.py", "another.py": "irrelevant"}

    result = get_optimized_code_for_module(
        Path("module.py"), cm, allow_fallback=True
    )  # 22.8μs -> 13.8μs (65.2% faster)
    # There are two keys in the map; basename of the first entry matches our target name,
    # and it's a unique basename match, so it should be returned.
    assert result == "code for module.py"


def test_basename_match_ambiguous_returns_empty():
    # If multiple entries share the same basename, the basename fallback should not pick one.
    cm = CodeStringsMarkdown()
    cm._cache["file_to_path"] = {"a/path/shared.py": "first", "b/other/shared.py": "second"}

    # Requesting "shared.py" should return empty string because basename matches are ambiguous.
    result = get_optimized_code_for_module(Path("shared.py"), cm, allow_fallback=True)  # 487μs -> 581μs (16.1% slower)
    assert result == ""


def test_single_non_python_block_used_when_not_python(monkeypatch):
    # When the language is not Python and there is exactly one code block overall,
    # the function should return that lone code block as a fallback.
    cm = CodeStringsMarkdown()
    cm._cache["file_to_path"] = {"weird/path.txt": "non-python code"}

    # Force is_python() to return False for this test to trigger fallback 3.
    monkeypatch.setattr(current_module, "is_python", lambda: False)

    result = get_optimized_code_for_module(Path("any.py"), cm, allow_fallback=True)  # 22.0μs -> 14.0μs (56.8% faster)
    assert result == "non-python code"


def test_no_match_logs_and_returns_empty(monkeypatch):
    # When no match is found (and no acceptable fallback), the function should return empty string.
    cm = CodeStringsMarkdown()
    cm._cache["file_to_path"] = {"x.py": "x", "y.py": "y"}

    # Ensure language is Python so that non-python single-block fallback does not apply.
    monkeypatch.setattr(current_module, "is_python", lambda: True)

    result = get_optimized_code_for_module(Path("z.py"), cm, allow_fallback=True)  # 471μs -> 478μs (1.52% slower)
    assert result == ""


def test_special_characters_in_path_basename_matching():
    # Ensure that filenames with special characters are handled correctly by basename fallback.
    cm = CodeStringsMarkdown()
    cm._cache["file_to_path"] = {"some/dir/weird name @#$%.py": "special file code"}

    # Request with a Path that has only the basename; the basename matches uniquely.
    result = get_optimized_code_for_module(
        Path("weird name @#$%.py"), cm, allow_fallback=True
    )  # 21.0μs -> 13.0μs (62.5% faster)
    assert result == "special file code"


def test_large_number_of_entries_with_exact_match_performance():
    # Create a large mapping of 1000 entries and ensure the correct key is returned quickly.
    cm = CodeStringsMarkdown()
    mapping = {}
    # Fill mapping with many unrelated entries
    for i in range(1000):
        mapping[f"dir/file_{i}.py"] = f"code_{i}"
    # Put the exact target near the end to ensure search handles many entries
    target = Path("dir/file_999.py")
    mapping[str(target)] = "final_code"
    cm._cache["file_to_path"] = mapping

    result = get_optimized_code_for_module(target, cm, allow_fallback=True)  # 7.48μs -> 7.45μs (0.402% faster)
    assert result == "final_code"


def test_large_basename_search_with_single_match():
    # Create many entries but ensure exactly one basename matches the requested name.
    cm = CodeStringsMarkdown()
    mapping = {}
    # Create many files with unique basenames
    for i in range(900):
        mapping[f"a/subdir_{i}/file_{i}.py"] = f"c{i}"
    # Add more files whose basenames are all distinct (common_name_0.py, common_name_1.py, ...)
    for i in range(50):
        mapping[f"other/{i}/common_name_{i}.py"] = f"c_common_{i}"
    # Add exactly one file whose basename is "target_common.py"
    mapping["some/deep/path/target_common.py"] = "the-one"
    # Add one more unrelated file so the target is not the last entry scanned
    mapping["another/place/not_target.py"] = "noise"

    cm._cache["file_to_path"] = mapping

    result = get_optimized_code_for_module(
        Path("target_common.py"), cm, allow_fallback=True
    )  # 1.94ms -> 348μs (456% faster)
    assert result == "the-one"
from pathlib import Path

# imports
from codeflash.languages.code_replacer import get_optimized_code_for_module
from codeflash.models.models import CodeString, CodeStringsMarkdown


def test_exact_match_returns_code():
    """Test that exact path match returns the correct code."""
    # Create a CodeString with a specific file path and code
    code_string = CodeString(file_path=Path("module.py"), code="optimized_code_1")
    optimized_code = CodeStringsMarkdown(code_strings=[code_string])

    # Call the function with the exact matching path
    result = get_optimized_code_for_module(Path("module.py"), optimized_code)  # 17.2μs -> 17.4μs (1.15% slower)

    # Verify the correct code is returned
    assert result == "optimized_code_1"


def test_exact_match_multiple_files():
    """Test that exact match works when multiple files are present."""
    # Create multiple CodeString instances
    code_string_1 = CodeString(file_path=Path("file1.py"), code="code_for_file1")
    code_string_2 = CodeString(file_path=Path("file2.py"), code="code_for_file2")
    code_string_3 = CodeString(file_path=Path("file3.py"), code="code_for_file3")
    optimized_code = CodeStringsMarkdown(code_strings=[code_string_1, code_string_2, code_string_3])

    # Query for the middle file
    result = get_optimized_code_for_module(Path("file2.py"), optimized_code)  # 17.9μs -> 18.0μs (0.450% slower)

    # Verify correct code is returned
    assert result == "code_for_file2"


def test_no_match_with_fallback_disabled():
    """Test that empty string is returned when no match is found and fallback is disabled."""
    # Create a CodeString with a different path
    code_string = CodeString(file_path=Path("other.py"), code="some_code")
    optimized_code = CodeStringsMarkdown(code_strings=[code_string])

    # Call with allow_fallback=False
    result = get_optimized_code_for_module(
        Path("target.py"), optimized_code, allow_fallback=False
    )  # 15.9μs -> 16.0μs (0.562% slower)

    # Verify empty string is returned
    assert result == ""


def test_none_path_fallback_single_block():
    """Test fallback 1: code block with None file path is used when it's the only block."""
    # Create a CodeString with None as file path
    code_string = CodeString(file_path=None, code="fallback_code")
    optimized_code = CodeStringsMarkdown(code_strings=[code_string])

    # Call with any path
    result = get_optimized_code_for_module(Path("any_file.py"), optimized_code)  # 19.9μs -> 15.6μs (27.7% faster)

    # Verify fallback code is returned
    assert result == "fallback_code"


def test_basename_match_fallback():
    """Test fallback 2: matching by filename basename when full path doesn't match."""
    # Create a CodeString with a different directory but same filename
    code_string = CodeString(file_path=Path("different/dir/target.py"), code="basename_matched_code")
    optimized_code = CodeStringsMarkdown(code_strings=[code_string])

    # Query with different directory structure but same filename
    result = get_optimized_code_for_module(
        Path("some/other/target.py"), optimized_code
    )  # 26.7μs -> 19.8μs (34.4% faster)

    # Verify basename-matched code is returned
    assert result == "basename_matched_code"


def test_basename_match_multiple_candidates():
    """Test that basename match is NOT used when there are multiple matches."""
    # Create multiple CodeStrings with same basename but different directories
    code_string_1 = CodeString(file_path=Path("dir1/target.py"), code="code1")
    code_string_2 = CodeString(file_path=Path("dir2/target.py"), code="code2")
    optimized_code = CodeStringsMarkdown(code_strings=[code_string_1, code_string_2])

    # Query with different path but same basename
    result = get_optimized_code_for_module(Path("dir3/target.py"), optimized_code)  # 486μs -> 490μs (0.943% slower)

    # Should not use basename fallback and return empty string
    assert result == ""


def test_empty_code_strings_list():
    """Test behavior with empty code_strings list."""
    # Create CodeStringsMarkdown with no code strings
    optimized_code = CodeStringsMarkdown(code_strings=[])

    # Call the function
    result = get_optimized_code_for_module(Path("any.py"), optimized_code)  # 459μs -> 466μs (1.55% slower)

    # Should return empty string
    assert result == ""


def test_empty_code_strings_list_no_fallback():
    """Test behavior with empty code_strings list and fallback disabled."""
    # Create CodeStringsMarkdown with no code strings
    optimized_code = CodeStringsMarkdown(code_strings=[])

    # Call the function with fallback disabled
    result = get_optimized_code_for_module(
        Path("any.py"), optimized_code, allow_fallback=False
    )  # 14.3μs -> 13.9μs (3.11% faster)

    # Should return empty string
    assert result == ""


def test_path_with_multiple_directory_levels():
    """Test exact match with deeply nested paths."""
    # Create a CodeString with deeply nested path
    code_string = CodeString(file_path=Path("a/b/c/d/e/f/deep_module.py"), code="deep_code")
    optimized_code = CodeStringsMarkdown(code_strings=[code_string])

    # Query with exact same deep path
    result = get_optimized_code_for_module(
        Path("a/b/c/d/e/f/deep_module.py"), optimized_code
    )  # 16.2μs -> 16.1μs (1.06% faster)

    # Should match exactly
    assert result == "deep_code"


def test_path_with_special_characters():
    """Test handling of paths with special characters in filename."""
    # Create a CodeString with special characters in filename
    code_string = CodeString(file_path=Path("module_test-v2.0.py"), code="special_code")
    optimized_code = CodeStringsMarkdown(code_strings=[code_string])

    # Query with exact same path
    result = get_optimized_code_for_module(
        Path("module_test-v2.0.py"), optimized_code
    )  # 15.6μs -> 15.3μs (1.64% faster)

    # Should match
    assert result == "special_code"


def test_empty_code_string_value():
    """Test that empty code string is returned when it matches."""
    # Create a CodeString with empty code
    code_string = CodeString(file_path=Path("empty.py"), code="")
    optimized_code = CodeStringsMarkdown(code_strings=[code_string])

    # Query for this file
    result = get_optimized_code_for_module(Path("empty.py"), optimized_code)  # 15.0μs -> 14.9μs (0.948% faster)

    # Should return the empty string
    assert result == ""


def test_whitespace_only_code():
    """Test that whitespace-only code is returned as-is."""
    # Create a CodeString with only whitespace
    code_string = CodeString(file_path=Path("whitespace.py"), code="   \n\t  ")
    optimized_code = CodeStringsMarkdown(code_strings=[code_string])

    # Query for this file
    result = get_optimized_code_for_module(Path("whitespace.py"), optimized_code)  # 14.6μs -> 14.6μs (0.000% faster)

    # Should return the whitespace as-is
    assert result == "   \n\t  "


def test_none_path_with_other_files_no_fallback():
    """Test that None path fallback is not used when other files exist and fallback is disabled."""
    # Create multiple CodeStrings including one with None path
    code_string_1 = CodeString(file_path=None, code="none_code")
    code_string_2 = CodeString(file_path=Path("other.py"), code="other_code")
    optimized_code = CodeStringsMarkdown(code_strings=[code_string_1, code_string_2])

    # Query with no match and fallback disabled
    result = get_optimized_code_for_module(
        Path("target.py"), optimized_code, allow_fallback=False
    )  # 16.2μs -> 16.0μs (1.12% faster)

    # Should return empty string
    assert result == ""


def test_case_sensitivity_in_exact_match():
    """Test that path matching is case-sensitive."""
    # Create a CodeString with lowercase filename
    code_string = CodeString(file_path=Path("module.py"), code="lowercase_code")
    optimized_code = CodeStringsMarkdown(code_strings=[code_string])

    # Query with uppercase filename (on case-sensitive filesystems)
    result = get_optimized_code_for_module(Path("Module.py"), optimized_code)  # 479μs -> 483μs (0.715% slower)

    # Should not match due to case difference
    assert result == ""


def test_path_normalization_forward_slash():
    """Test that forward-slash paths are handled correctly."""
    # Create a CodeString with forward-slash path
    code_string = CodeString(file_path=Path("dir/file.py"), code="slash_code")
    optimized_code = CodeStringsMarkdown(code_strings=[code_string])

    # Query with same forward-slash path
    result = get_optimized_code_for_module(Path("dir/file.py"), optimized_code)  # 16.5μs -> 16.3μs (1.36% faster)

    # Should match
    assert result == "slash_code"


def test_cache_utilization():
    """Test that caching works across multiple calls."""
    # Create a CodeString
    code_string = CodeString(file_path=Path("module.py"), code="cached_code")
    optimized_code = CodeStringsMarkdown(code_strings=[code_string])

    # Make first call
    result1 = get_optimized_code_for_module(Path("module.py"), optimized_code)  # 15.6μs -> 15.0μs (4.06% faster)

    # Make second call with same CodeStringsMarkdown instance
    result2 = get_optimized_code_for_module(Path("module.py"), optimized_code)  # 6.47μs -> 6.32μs (2.37% faster)

    # Both should return same result
    assert result1 == result2 == "cached_code"
    # Verify cache was populated
    assert optimized_code._cache.get("file_to_path") is not None


def test_very_long_code_string():
    """Test handling of very long code strings."""
    # Create a CodeString with very long code
    long_code = "x = 1\n" * 1000  # 6000 characters
    code_string = CodeString(file_path=Path("long.py"), code=long_code)
    optimized_code = CodeStringsMarkdown(code_strings=[code_string])

    # Query for this file
    result = get_optimized_code_for_module(Path("long.py"), optimized_code)  # 17.3μs -> 17.2μs (0.471% faster)

    # Should return the entire long code
    assert result == long_code
    assert len(result) == 6000


def test_many_files_exact_match():
    """Test exact match performance with many files."""
    # Create many CodeStrings
    code_strings = [CodeString(file_path=Path(f"file_{i}.py"), code=f"code_{i}") for i in range(100)]
    optimized_code = CodeStringsMarkdown(code_strings=code_strings)

    # Query for a file in the middle
    result = get_optimized_code_for_module(Path("file_50.py"), optimized_code)  # 102μs -> 101μs (1.64% faster)

    # Should find exact match
    assert result == "code_50"


def test_basename_match_with_many_different_files():
    """Test basename match when there are many files but only one basename match."""
    # Create CodeStrings with different paths but one unique basename match
    code_strings = [CodeString(file_path=Path(f"dir_{i}/other_{i}.py"), code=f"code_{i}") for i in range(50)]
    # Add the target basename
    code_strings.append(CodeString(file_path=Path("deep/nested/target.py"), code="target_code"))
    optimized_code = CodeStringsMarkdown(code_strings=code_strings)

    # Query with different path but same basename
    result = get_optimized_code_for_module(
        Path("other/path/target.py"), optimized_code
    )  # 169μs -> 86.2μs (97.0% faster)

    # Should match by basename
    assert result == "target_code"


def test_none_path_ignored_when_other_exact_match():
    """Test that None path is not used when exact match exists."""
    # Create multiple CodeStrings including one with None path
    code_string_1 = CodeString(file_path=None, code="none_code")
    code_string_2 = CodeString(file_path=Path("target.py"), code="exact_code")
    optimized_code = CodeStringsMarkdown(code_strings=[code_string_1, code_string_2])

    # Query for target.py
    result = get_optimized_code_for_module(Path("target.py"), optimized_code)  # 16.5μs -> 16.5μs (0.121% faster)

    # Should return exact match, not None path
    assert result == "exact_code"


def test_multiple_none_paths():
    """Test handling when multiple code blocks have None as path."""
    # Create multiple CodeStrings with None path (unusual but possible)
    code_string_1 = CodeString(file_path=None, code="code1")
    code_string_2 = CodeString(file_path=None, code="code2")
    optimized_code = CodeStringsMarkdown(code_strings=[code_string_1, code_string_2])

    # Query for any file
    result = get_optimized_code_for_module(Path("any.py"), optimized_code)  # 20.5μs -> 15.8μs (29.8% faster)

    # Should not use None fallback (length > 1)
    assert result == ""


def test_relative_path_with_dots():
    """Test paths containing dot notation."""
    # Create a CodeString with dots in the path
    code_string = CodeString(file_path=Path("../parent/module.py"), code="parent_code")
    optimized_code = CodeStringsMarkdown(code_strings=[code_string])

    # Query with exact same path
    result = get_optimized_code_for_module(
        Path("../parent/module.py"), optimized_code
    )  # 16.0μs -> 15.8μs (1.59% faster)

    # Should match exactly
    assert result == "parent_code"


def test_large_number_of_code_blocks():
    """Test performance with large number of code blocks."""
    # Create 1000 CodeStrings
    code_strings = [CodeString(file_path=Path(f"module_{i:04d}.py"), code=f"code_block_{i:04d}") for i in range(1000)]
    optimized_code = CodeStringsMarkdown(code_strings=code_strings)

    # Query for file in the middle
    result = get_optimized_code_for_module(Path("module_0500.py"), optimized_code)  # 819μs -> 806μs (1.63% faster)

    # Should find exact match
    assert result == "code_block_0500"


def test_large_number_of_code_blocks_query_last():
    """Test performance querying the last file in a large set."""
    # Create 1000 CodeStrings
    code_strings = [CodeString(file_path=Path(f"module_{i:04d}.py"), code=f"code_block_{i:04d}") for i in range(1000)]
    optimized_code = CodeStringsMarkdown(code_strings=code_strings)

    # Query for the last file
    result = get_optimized_code_for_module(Path("module_0999.py"), optimized_code)  # 819μs -> 796μs (2.95% faster)

    # Should find exact match
    assert result == "code_block_0999"


def test_large_number_of_code_blocks_no_match():
    """Test performance when no match exists in large set."""
    # Create 1000 CodeStrings
    code_strings = [CodeString(file_path=Path(f"module_{i:04d}.py"), code=f"code_block_{i:04d}") for i in range(1000)]
    optimized_code = CodeStringsMarkdown(code_strings=code_strings)

    # Query for a file that doesn't exist
    result = get_optimized_code_for_module(Path("nonexistent.py"), optimized_code)  # 6.50ms -> 5.61ms (15.9% faster)

    # Should return empty string
    assert result == ""


def test_basename_match_in_large_set():
    """Test basename matching performance with large number of files."""
    # Create 500 CodeStrings with different basenames
    code_strings = [CodeString(file_path=Path(f"dir_{i:03d}/module_{i:03d}.py"), code=f"code_{i}") for i in range(500)]
    # Add one more with a unique basename we'll search for
    code_strings.append(CodeString(file_path=Path("deep/nested/dir/target.py"), code="target_found"))
    optimized_code = CodeStringsMarkdown(code_strings=code_strings)

    # Query with different path but same basename
    result = get_optimized_code_for_module(
        Path("other/path/target.py"), optimized_code
    )  # 1.40ms -> 634μs (120% faster)

    # Should match by basename
    assert result == "target_found"


def test_cache_performance_multiple_queries():
    """Test that caching improves performance for repeated queries."""
    # Create 500 CodeStrings
    code_strings = [CodeString(file_path=Path(f"module_{i:03d}.py"), code=f"code_{i}") for i in range(500)]
    optimized_code = CodeStringsMarkdown(code_strings=code_strings)

    # Make multiple queries to same file (should use cache)
    results = [get_optimized_code_for_module(Path("module_250.py"), optimized_code) for _ in range(100)]

    # All results should be identical
    assert all(r == "code_250" for r in results)
    # Cache should be populated after first call
    assert optimized_code._cache.get("file_to_path") is not None

To edit these changes git checkout codeflash/optimize-pr1774-2026-03-17T02.35.54 and push.

@codeflash-ai codeflash-ai bot added the ⚡️ codeflash (Optimization PR opened by Codeflash AI) and 🎯 Quality: High (Optimization Quality according to Codeflash) labels Mar 17, 2026

claude bot commented Mar 17, 2026

Claude finished @codeflash-ai[bot]'s task in 4m 52s


PR Review Summary

  • Triage PR
  • Lint and typecheck (auto-fixed issues, pushed)
  • Resolve stale threads (none found)
  • Code review
  • Duplicate detection
  • Test coverage
  • Summary comment
  • Merge optimization PRs (CI pending — cannot merge yet)

Prek Checks

Two lint errors were found and auto-fixed in a follow-up commit (a707de7b):

| Issue | Fix |
|---|---|
| TC003: `from pathlib import Path` outside `TYPE_CHECKING` block | Moved `Path` into `TYPE_CHECKING`; all usages are type annotations (lazy with `from __future__ import annotations`) |
| PTH119: `os.path.basename()` flagged (should use `Path.name`) | Added `# noqa: PTH119`; using `Path(path).name` defeats the entire performance optimization |

Additional fixes:

  • first_match: str | None = None → first_match: str = "" (resolves mypy return-value error)
  • Removed duplicate # Fallback 3 comment that was accidentally introduced

Prek now passes cleanly.
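
For readers unfamiliar with the TC003 fix, the pattern looks roughly like the sketch below. The function `path_key` is hypothetical and only exists to carry an annotation. The sketch uses a quoted annotation so that it runs standalone; the PR instead relies on `from __future__ import annotations`, which defers every annotation in the module in the same way.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen by type checkers only; this import never runs at runtime.
    from pathlib import Path


def path_key(relative_path: "Path") -> str:
    # The quoted annotation is never evaluated at runtime, so the name Path
    # does not need to exist in this module's namespace when the code runs.
    return str(relative_path)
```

The trade-off is that `Path` can no longer be used for runtime work in that module (for example `Path(path).name`), which fits a change that replaces such calls with `os.path.basename`.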

Code Review

The optimization is sound and well-targeted:

  • Early-exit loop (Fallback 2): replaces [code for path, code in ... if Path(path).name == target_name] with an early-break loop — avoids constructing Path objects for every dictionary entry and stops at 2 matches. Profiler confirmed 30.9% of original runtime was in this list comprehension.
  • isEnabledFor(logging.DEBUG) guards: avoids f-string evaluation for suppressed log calls. Small but correct.
  • target_key = str(relative_path): minor de-dup to avoid calling str() twice.
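
The logging guard mentioned in the list above can be sketched like this. The logger name and the helper `log_available_files` are assumptions for illustration; the message text in the real function is not shown in this PR page.

```python
import logging

logger = logging.getLogger("codeflash_sketch")  # hypothetical logger name


def log_available_files(file_to_code: dict[str, str]) -> bool:
    """Build and emit the debug message only when DEBUG is enabled.

    Returns True if the message was formatted, False if it was skipped.
    """
    if logger.isEnabledFor(logging.DEBUG):
        # The f-string (and the sorted() call inside it) is evaluated only here.
        logger.debug(f"Available files: {sorted(file_to_code)}")
        return True
    return False
```

Without the guard, the f-string is built before `logger.debug` is even called, so the formatting cost is paid whether or not the record is emitted. Passing arguments with `%s` placeholders defers the string formatting as well, but the arguments themselves are still evaluated.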

One observation: after the loop, first_match holds the second basename match's code (not the first) when match_count > 1. This is harmless since if match_count == 1 is false in that case, but the variable name is slightly misleading. Not a bug.
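
The observation above can be reproduced with a small reconstruction. This is an assumption based only on the review's description, not the PR's code: `scan` and `pick` are hypothetical names, and the loop is written so that `first_match` is overwritten on each match.

```python
import os


def scan(file_to_code: dict[str, str], target_name: str) -> tuple[int, str]:
    """Return (match_count, first_match) for a loop that overwrites first_match."""
    first_match = ""
    match_count = 0
    for path, code in file_to_code.items():
        if os.path.basename(path) == target_name:
            match_count += 1
            first_match = code  # also overwritten by the second match
            if match_count > 1:
                break
    return match_count, first_match


def pick(file_to_code: dict[str, str], target_name: str) -> str:
    """Use the match only when it is unique, so the overwrite is never observed."""
    match_count, first_match = scan(file_to_code, target_name)
    return first_match if match_count == 1 else ""
```

With two entries sharing a basename, `scan` ends with the second entry's code in `first_match`, yet `pick` still returns an empty string because the value is only read when `match_count == 1`.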

Duplicate Detection

No duplicates detected. get_optimized_code_for_module exists only in codeflash/languages/code_replacer.py.

Test Coverage

92 related tests passed. Coverage for codeflash/languages/code_replacer.py: 84% (96 stmts, 15 missed). Coverage maintained — the PR itself reported 100% coverage for get_optimized_code_for_module specifically.

Merge Optimization PRs

PR #1849 (this PR) is the only open codeflash-ai[bot] PR. CI is currently in progress — will need to wait for it to complete before merging.



- Move Path to TYPE_CHECKING block (TC003)
- Suppress PTH119 on os.path.basename - performance-justified exception
- Fix first_match type annotation from str | None to str
- Remove duplicate comment

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

claude bot commented Mar 17, 2026

Closing: CI check js-cjs-function-optimization is failing on the target branch (feat/gradle-executor-from-java). This failure appears pre-existing on the base branch and unrelated to the get_optimized_code_for_module optimization, but I cannot safely merge while CI is red.

@claude claude bot closed this Mar 17, 2026
@claude claude bot deleted the codeflash/optimize-pr1774-2026-03-17T02.35.54 branch March 17, 2026 03:53