⚡️ Speed up function `get_optimized_code_for_module` by 21% in PR #1774 (`feat/gradle-executor-from-java`) by codeflash-ai[bot] · Pull Request #1849 · codeflash-ai/codeflash

codeflash-ai · 2026-03-17T02:36:00Z

⚡️ This pull request contains optimizations for PR #1774

If you approve this dependent PR, these changes will be merged into the original PR branch feat/gradle-executor-from-java.

This PR will be automatically closed if the original PR is merged.

📄 21% (0.21x) speedup for `get_optimized_code_for_module` in `codeflash/languages/code_replacer.py`

⏱️ Runtime : 18.8 milliseconds → 15.5 milliseconds (best of 37 runs)
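The headline percentage is consistent with dividing the time saved by the optimized runtime, which also reproduces the per-test figures in the tables below. The sketch assumes that formula; the helper name `speedup` is illustrative and is not taken from the Codeflash codebase.

```python
def speedup(original: float, optimized: float) -> float:
    """Relative speedup: time saved divided by the optimized runtime.

    Assumed formula, inferred from the numbers reported in this PR.
    Both arguments must use the same time unit.
    """
    return (original - optimized) / optimized


# Overall runtime reported above, in milliseconds.
overall = speedup(18.8, 15.5)

# One regression from the existing unit tests table, in microseconds.
exact_path = speedup(25.5, 32.2)

print(f"overall: {overall:.1%}, test_exact_path_match: {exact_path:.1%}")
```

A negative value means the optimized run was slower, which is how the ⚠️ rows in the tables should be read.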

📝 Explanation and details

The hot path in basename matching (30.9% of original runtime) was replaced with an early-exit loop that avoids building an intermediate list and constructing `Path` objects for every dictionary entry: the optimized code uses `os.path.basename` (a simple string slice) instead of `Path(path).name` and stops scanning after finding two matches. Line profiler confirms the original list comprehension cost ~20.7 ms across all calls, now reduced to ~3.8 ms by iterating once and breaking early. Debug logging was guarded with `isEnabledFor` checks to skip expensive f-string formatting when debug is off, saving ~70 µs per suppressed log. Runtime improved 21% with no correctness regressions.
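The change described above can be sketched as follows. This is a minimal illustration, not the code from `code_replacer.py`: the function names and the `file_to_code` parameter are hypothetical, and the real function has further fallbacks that are omitted here.

```python
import os
from pathlib import Path


def match_by_basename_original(file_to_code: dict[str, str], target_name: str) -> str:
    """List-comprehension form: builds a Path for every key and scans everything."""
    matches = [code for path, code in file_to_code.items() if Path(path).name == target_name]
    return matches[0] if len(matches) == 1 else ""


def match_by_basename_early_exit(file_to_code: dict[str, str], target_name: str) -> str:
    """Single pass that stops as soon as the match is known to be ambiguous."""
    first_match = ""
    match_count = 0
    for path, code in file_to_code.items():
        # os.path.basename works on the string directly, with no Path object.
        if os.path.basename(path) == target_name:
            match_count += 1
            if match_count == 1:
                first_match = code
            else:
                break  # a second match makes the result ambiguous
    return first_match if match_count == 1 else ""


mapping = {"src/other/module.py": "code for module.py", "another.py": "irrelevant"}
print(match_by_basename_early_exit(mapping, "module.py"))
```

Note that `os.path.basename` and `Path(...).name` are not interchangeable for every input (for example, a key with a trailing slash), so the sketch assumes keys are ordinary file paths.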

✅ Correctness verification report:

Test	Status
⚙️ Existing Unit Tests	✅ 19 Passed
🌀 Generated Regression Tests	✅ 38 Passed
⏪ Replay Tests	🔘 None Found
🔎 Concolic Coverage Tests	🔘 None Found
📊 Tests Coverage	100.0%

⚙️ Click to see Existing Unit Tests

Test File::Test Function	Original ⏱️	Optimized ⏱️	Speedup
`test_code_replacer_matching.py::TestGetOptimizedCodeForModule.test_basename_match`	34.0μs	28.1μs	21.1%✅
`test_code_replacer_matching.py::TestGetOptimizedCodeForModule.test_basename_match_with_different_prefix`	34.3μs	27.7μs	23.8%✅
`test_code_replacer_matching.py::TestGetOptimizedCodeForModule.test_exact_path_match`	25.5μs	32.2μs	-20.8%⚠️
`test_code_replacer_matching.py::TestGetOptimizedCodeForModule.test_no_match_returns_empty`	518μs	519μs	-0.237%⚠️
`test_code_replacer_matching.py::TestGetOptimizedCodeForModule.test_none_key_fallback`	28.8μs	26.9μs	7.27%✅
`test_code_replacer_matching.py::TestGetOptimizedCodeForModule.test_none_key_with_multiple_blocks_no_match`	504μs	512μs	-1.59%⚠️
`test_code_replacer_matching.py::TestGetOptimizedCodeForModule.test_single_block_fallback_non_python`	43.6μs	36.5μs	19.4%✅
`test_code_replacer_matching.py::TestGetOptimizedCodeForModule.test_single_block_fallback_python_does_not_match`	506μs	520μs	-2.57%⚠️
`test_languages/test_get_optimized_code_for_module.py::test_basename_fallback_ambiguous_returns_empty`	486μs	496μs	-1.99%⚠️
`test_languages/test_get_optimized_code_for_module.py::test_basename_fallback_different_directory`	27.5μs	20.4μs	34.9%✅
`test_languages/test_get_optimized_code_for_module.py::test_basename_fallback_skips_non_matching_context_files`	30.4μs	21.7μs	40.3%✅
`test_languages/test_get_optimized_code_for_module.py::test_context_files_only_returns_empty`	500μs	504μs	-0.819%⚠️
`test_languages/test_get_optimized_code_for_module.py::test_empty_markdown_returns_empty`	456μs	469μs	-2.58%⚠️
`test_languages/test_get_optimized_code_for_module.py::test_exact_match_preferred_over_basename`	17.2μs	17.1μs	0.468%✅
`test_languages/test_get_optimized_code_for_module.py::test_exact_path_match_picks_correct_file`	17.3μs	17.2μs	0.646%✅
`test_languages/test_get_optimized_code_for_module.py::test_exact_path_match_single_file`	16.5μs	15.9μs	3.70%✅
`test_languages/test_get_optimized_code_for_module.py::test_no_match_returns_empty`	472μs	481μs	-1.98%⚠️
`test_languages/test_get_optimized_code_for_module.py::test_none_path_fallback_ignored_when_named_blocks_exist`	489μs	504μs	-3.03%⚠️
`test_languages/test_get_optimized_code_for_module.py::test_none_path_fallback_single_block`	19.9μs	15.5μs	28.6%✅

🌀 Click to see Generated Regression Tests

from pathlib import Path

# imports
import codeflash.languages.current as current_module
from codeflash.languages.code_replacer import get_optimized_code_for_module
from codeflash.models.models import CodeStringsMarkdown


def test_returns_exact_match_simple():
    # Create a CodeStringsMarkdown instance and populate its internal cache directly.
    # We avoid relying on how code_strings are validated by pydantic by injecting the
    # precomputed file_to_path mapping into the private _cache which file_to_path() uses.
    cm = CodeStringsMarkdown()  # create the model instance using real constructor

    target_path = Path("a/module.py")  # relative path we'll search for
    expected_code = "print('exact match')"

    # Put the mapping directly into the private cache so file_to_path() returns it.
    cm._cache["file_to_path"] = {str(target_path): expected_code}

    # When there is an exact string match for the relative path, the function should return it.
    result = get_optimized_code_for_module(target_path, cm, allow_fallback=True)  # 7.93μs -> 7.78μs (1.93% faster)
    assert result == expected_code  # exact match should be returned


def test_no_fallback_returns_empty_when_missing():
    # Prepare a CodeStringsMarkdown instance with no entry for the requested file.
    cm = CodeStringsMarkdown()
    cm._cache["file_to_path"] = {"other.py": "some code"}

    # With allow_fallback=False and no exact match, the function must return an empty string.
    result = get_optimized_code_for_module(
        Path("missing.py"), cm, allow_fallback=False
    )  # 9.33μs -> 9.68μs (3.63% slower)
    assert result == ""


def test_fallback_single_none_key_used():
    # If optimized output contains a single code block with key "None",
    # the function should use that block as a fallback when allow_fallback=True.
    cm = CodeStringsMarkdown()
    cm._cache["file_to_path"] = {"None": "fallback code for unspecified path"}

    result = get_optimized_code_for_module(
        Path("any/path.py"), cm, allow_fallback=True
    )  # 14.8μs -> 10.8μs (37.1% faster)
    assert result == "fallback code for unspecified path"


def test_basename_match_returns_when_unique():
    # When there's no exact path match but a single entry matches by basename,
    # the function should return that entry's code.
    cm = CodeStringsMarkdown()
    # The LLM returned code for "src/other/module.py" but we are asking for "module.py"
    cm._cache["file_to_path"] = {"src/other/module.py": "code for module.py", "another.py": "irrelevant"}

    result = get_optimized_code_for_module(
        Path("module.py"), cm, allow_fallback=True
    )  # 22.8μs -> 13.8μs (65.2% faster)
    # There are two keys in the map; basename of the first entry matches our target name,
    # and it's a unique basename match, so it should be returned.
    assert result == "code for module.py"


def test_basename_match_ambiguous_returns_empty():
    # If multiple entries share the same basename, the basename fallback should not pick one.
    cm = CodeStringsMarkdown()
    cm._cache["file_to_path"] = {"a/path/shared.py": "first", "b/other/shared.py": "second"}

    # Requesting "shared.py" should return empty string because basename matches are ambiguous.
    result = get_optimized_code_for_module(Path("shared.py"), cm, allow_fallback=True)  # 487μs -> 581μs (16.1% slower)
    assert result == ""


def test_single_non_python_block_used_when_not_python(monkeypatch):
    # When the language is not Python and there is exactly one code block overall,
    # the function should return that lone code block as a fallback.
    cm = CodeStringsMarkdown()
    cm._cache["file_to_path"] = {"weird/path.txt": "non-python code"}

    # Force is_python() to return False for this test to trigger fallback 3.
    monkeypatch.setattr(current_module, "is_python", lambda: False)

    result = get_optimized_code_for_module(Path("any.py"), cm, allow_fallback=True)  # 22.0μs -> 14.0μs (56.8% faster)
    assert result == "non-python code"


def test_no_match_logs_and_returns_empty(monkeypatch):
    # When no match is found (and no acceptable fallback), the function should return empty string.
    cm = CodeStringsMarkdown()
    cm._cache["file_to_path"] = {"x.py": "x", "y.py": "y"}

    # Ensure language is Python so that non-python single-block fallback does not apply.
    monkeypatch.setattr(current_module, "is_python", lambda: True)

    result = get_optimized_code_for_module(Path("z.py"), cm, allow_fallback=True)  # 471μs -> 478μs (1.52% slower)
    assert result == ""


def test_special_characters_in_path_basename_matching():
    # Ensure that filenames with special characters are handled correctly by basename fallback.
    cm = CodeStringsMarkdown()
    cm._cache["file_to_path"] = {"some/dir/weird name @#$%.py": "special file code"}

    # Request with a Path that has only the basename; the basename matches uniquely.
    result = get_optimized_code_for_module(
        Path("weird name @#$%.py"), cm, allow_fallback=True
    )  # 21.0μs -> 13.0μs (62.5% faster)
    assert result == "special file code"


def test_large_number_of_entries_with_exact_match_performance():
    # Create a large mapping of 1000 entries and ensure the correct key is returned quickly.
    cm = CodeStringsMarkdown()
    mapping = {}
    # Fill mapping with many unrelated entries
    for i in range(1000):
        mapping[f"dir/file_{i}.py"] = f"code_{i}"
    # Put the exact target near the end to ensure search handles many entries
    target = Path("dir/file_999.py")
    mapping[str(target)] = "final_code"
    cm._cache["file_to_path"] = mapping

    result = get_optimized_code_for_module(target, cm, allow_fallback=True)  # 7.48μs -> 7.45μs (0.402% faster)
    assert result == "final_code"


def test_large_basename_search_with_single_match():
    # Create many entries but ensure exactly one basename matches the requested name.
    cm = CodeStringsMarkdown()
    mapping = {}
    # Create many files with unique basenames
    for i in range(900):
        mapping[f"a/subdir_{i}/file_{i}.py"] = f"c{i}"
    # Add more files whose basenames are all distinct from each other and from the target
    for i in range(50):
        mapping[f"other/{i}/common_name_{i}.py"] = f"c_common_{i}"
    # Add exactly one file whose basename is "target_common.py"
    mapping["some/deep/path/target_common.py"] = "the-one"
    # Add one more unrelated file; only the entry above has the basename we will ask for
    mapping["another/place/not_target.py"] = "noise"

    cm._cache["file_to_path"] = mapping

    result = get_optimized_code_for_module(
        Path("target_common.py"), cm, allow_fallback=True
    )  # 1.94ms -> 348μs (456% faster)
    assert result == "the-one"

from pathlib import Path

# imports
from codeflash.languages.code_replacer import get_optimized_code_for_module
from codeflash.models.models import CodeString, CodeStringsMarkdown


def test_exact_match_returns_code():
    """Test that exact path match returns the correct code."""
    # Create a CodeString with a specific file path and code
    code_string = CodeString(file_path=Path("module.py"), code="optimized_code_1")
    optimized_code = CodeStringsMarkdown(code_strings=[code_string])

    # Call the function with the exact matching path
    result = get_optimized_code_for_module(Path("module.py"), optimized_code)  # 17.2μs -> 17.4μs (1.15% slower)

    # Verify the correct code is returned
    assert result == "optimized_code_1"


def test_exact_match_multiple_files():
    """Test that exact match works when multiple files are present."""
    # Create multiple CodeString instances
    code_string_1 = CodeString(file_path=Path("file1.py"), code="code_for_file1")
    code_string_2 = CodeString(file_path=Path("file2.py"), code="code_for_file2")
    code_string_3 = CodeString(file_path=Path("file3.py"), code="code_for_file3")
    optimized_code = CodeStringsMarkdown(code_strings=[code_string_1, code_string_2, code_string_3])

    # Query for the middle file
    result = get_optimized_code_for_module(Path("file2.py"), optimized_code)  # 17.9μs -> 18.0μs (0.450% slower)

    # Verify correct code is returned
    assert result == "code_for_file2"


def test_no_match_with_fallback_disabled():
    """Test that empty string is returned when no match is found and fallback is disabled."""
    # Create a CodeString with a different path
    code_string = CodeString(file_path=Path("other.py"), code="some_code")
    optimized_code = CodeStringsMarkdown(code_strings=[code_string])

    # Call with allow_fallback=False
    result = get_optimized_code_for_module(
        Path("target.py"), optimized_code, allow_fallback=False
    )  # 15.9μs -> 16.0μs (0.562% slower)

    # Verify empty string is returned
    assert result == ""


def test_none_path_fallback_single_block():
    """Test fallback 1: code block with None file path is used when it's the only block."""
    # Create a CodeString with None as file path
    code_string = CodeString(file_path=None, code="fallback_code")
    optimized_code = CodeStringsMarkdown(code_strings=[code_string])

    # Call with any path
    result = get_optimized_code_for_module(Path("any_file.py"), optimized_code)  # 19.9μs -> 15.6μs (27.7% faster)

    # Verify fallback code is returned
    assert result == "fallback_code"


def test_basename_match_fallback():
    """Test fallback 2: matching by filename basename when full path doesn't match."""
    # Create a CodeString with a different directory but same filename
    code_string = CodeString(file_path=Path("different/dir/target.py"), code="basename_matched_code")
    optimized_code = CodeStringsMarkdown(code_strings=[code_string])

    # Query with different directory structure but same filename
    result = get_optimized_code_for_module(
        Path("some/other/target.py"), optimized_code
    )  # 26.7μs -> 19.8μs (34.4% faster)

    # Verify basename-matched code is returned
    assert result == "basename_matched_code"


def test_basename_match_multiple_candidates():
    """Test that basename match is NOT used when there are multiple matches."""
    # Create multiple CodeStrings with same basename but different directories
    code_string_1 = CodeString(file_path=Path("dir1/target.py"), code="code1")
    code_string_2 = CodeString(file_path=Path("dir2/target.py"), code="code2")
    optimized_code = CodeStringsMarkdown(code_strings=[code_string_1, code_string_2])

    # Query with different path but same basename
    result = get_optimized_code_for_module(Path("dir3/target.py"), optimized_code)  # 486μs -> 490μs (0.943% slower)

    # Should not use basename fallback and return empty string
    assert result == ""


def test_empty_code_strings_list():
    """Test behavior with empty code_strings list."""
    # Create CodeStringsMarkdown with no code strings
    optimized_code = CodeStringsMarkdown(code_strings=[])

    # Call the function
    result = get_optimized_code_for_module(Path("any.py"), optimized_code)  # 459μs -> 466μs (1.55% slower)

    # Should return empty string
    assert result == ""


def test_empty_code_strings_list_no_fallback():
    """Test behavior with empty code_strings list and fallback disabled."""
    # Create CodeStringsMarkdown with no code strings
    optimized_code = CodeStringsMarkdown(code_strings=[])

    # Call the function with fallback disabled
    result = get_optimized_code_for_module(
        Path("any.py"), optimized_code, allow_fallback=False
    )  # 14.3μs -> 13.9μs (3.11% faster)

    # Should return empty string
    assert result == ""


def test_path_with_multiple_directory_levels():
    """Test exact match with deeply nested paths."""
    # Create a CodeString with deeply nested path
    code_string = CodeString(file_path=Path("a/b/c/d/e/f/deep_module.py"), code="deep_code")
    optimized_code = CodeStringsMarkdown(code_strings=[code_string])

    # Query with exact same deep path
    result = get_optimized_code_for_module(
        Path("a/b/c/d/e/f/deep_module.py"), optimized_code
    )  # 16.2μs -> 16.1μs (1.06% faster)

    # Should match exactly
    assert result == "deep_code"


def test_path_with_special_characters():
    """Test handling of paths with special characters in filename."""
    # Create a CodeString with special characters in filename
    code_string = CodeString(file_path=Path("module_test-v2.0.py"), code="special_code")
    optimized_code = CodeStringsMarkdown(code_strings=[code_string])

    # Query with exact same path
    result = get_optimized_code_for_module(
        Path("module_test-v2.0.py"), optimized_code
    )  # 15.6μs -> 15.3μs (1.64% faster)

    # Should match
    assert result == "special_code"


def test_empty_code_string_value():
    """Test that empty code string is returned when it matches."""
    # Create a CodeString with empty code
    code_string = CodeString(file_path=Path("empty.py"), code="")
    optimized_code = CodeStringsMarkdown(code_strings=[code_string])

    # Query for this file
    result = get_optimized_code_for_module(Path("empty.py"), optimized_code)  # 15.0μs -> 14.9μs (0.948% faster)

    # Should return the empty string
    assert result == ""


def test_whitespace_only_code():
    """Test that whitespace-only code is returned as-is."""
    # Create a CodeString with only whitespace
    code_string = CodeString(file_path=Path("whitespace.py"), code="   \n\t  ")
    optimized_code = CodeStringsMarkdown(code_strings=[code_string])

    # Query for this file
    result = get_optimized_code_for_module(Path("whitespace.py"), optimized_code)  # 14.6μs -> 14.6μs (0.000% faster)

    # Should return the whitespace as-is
    assert result == "   \n\t  "


def test_none_path_with_other_files_no_fallback():
    """Test that None path fallback is not used when other files exist and fallback is disabled."""
    # Create multiple CodeStrings including one with None path
    code_string_1 = CodeString(file_path=None, code="none_code")
    code_string_2 = CodeString(file_path=Path("other.py"), code="other_code")
    optimized_code = CodeStringsMarkdown(code_strings=[code_string_1, code_string_2])

    # Query with no match and fallback disabled
    result = get_optimized_code_for_module(
        Path("target.py"), optimized_code, allow_fallback=False
    )  # 16.2μs -> 16.0μs (1.12% faster)

    # Should return empty string
    assert result == ""


def test_case_sensitivity_in_exact_match():
    """Test that path matching is case-sensitive."""
    # Create a CodeString with lowercase filename
    code_string = CodeString(file_path=Path("module.py"), code="lowercase_code")
    optimized_code = CodeStringsMarkdown(code_strings=[code_string])

    # Query with uppercase filename (on case-sensitive filesystems)
    result = get_optimized_code_for_module(Path("Module.py"), optimized_code)  # 479μs -> 483μs (0.715% slower)

    # Should not match due to case difference
    assert result == ""


def test_path_normalization_forward_slash():
    """Test that forward-slash paths are handled correctly."""
    # Create a CodeString with forward-slash path
    code_string = CodeString(file_path=Path("dir/file.py"), code="slash_code")
    optimized_code = CodeStringsMarkdown(code_strings=[code_string])

    # Query with same forward-slash path
    result = get_optimized_code_for_module(Path("dir/file.py"), optimized_code)  # 16.5μs -> 16.3μs (1.36% faster)

    # Should match
    assert result == "slash_code"


def test_cache_utilization():
    """Test that caching works across multiple calls."""
    # Create a CodeString
    code_string = CodeString(file_path=Path("module.py"), code="cached_code")
    optimized_code = CodeStringsMarkdown(code_strings=[code_string])

    # Make first call
    result1 = get_optimized_code_for_module(Path("module.py"), optimized_code)  # 15.6μs -> 15.0μs (4.06% faster)

    # Make second call with same CodeStringsMarkdown instance
    result2 = get_optimized_code_for_module(Path("module.py"), optimized_code)  # 6.47μs -> 6.32μs (2.37% faster)

    # Both should return same result
    assert result1 == result2 == "cached_code"
    # Verify cache was populated
    assert optimized_code._cache.get("file_to_path") is not None


def test_very_long_code_string():
    """Test handling of very long code strings."""
    # Create a CodeString with very long code
    long_code = "x = 1\n" * 1000  # 6000 characters
    code_string = CodeString(file_path=Path("long.py"), code=long_code)
    optimized_code = CodeStringsMarkdown(code_strings=[code_string])

    # Query for this file
    result = get_optimized_code_for_module(Path("long.py"), optimized_code)  # 17.3μs -> 17.2μs (0.471% faster)

    # Should return the entire long code
    assert result == long_code
    assert len(result) == 6000


def test_many_files_exact_match():
    """Test exact match performance with many files."""
    # Create many CodeStrings
    code_strings = [CodeString(file_path=Path(f"file_{i}.py"), code=f"code_{i}") for i in range(100)]
    optimized_code = CodeStringsMarkdown(code_strings=code_strings)

    # Query for a file in the middle
    result = get_optimized_code_for_module(Path("file_50.py"), optimized_code)  # 102μs -> 101μs (1.64% faster)

    # Should find exact match
    assert result == "code_50"


def test_basename_match_with_many_different_files():
    """Test basename match when there are many files but only one basename match."""
    # Create CodeStrings with different paths but one unique basename match
    code_strings = [CodeString(file_path=Path(f"dir_{i}/other_{i}.py"), code=f"code_{i}") for i in range(50)]
    # Add the target basename
    code_strings.append(CodeString(file_path=Path("deep/nested/target.py"), code="target_code"))
    optimized_code = CodeStringsMarkdown(code_strings=code_strings)

    # Query with different path but same basename
    result = get_optimized_code_for_module(
        Path("other/path/target.py"), optimized_code
    )  # 169μs -> 86.2μs (97.0% faster)

    # Should match by basename
    assert result == "target_code"


def test_none_path_ignored_when_other_exact_match():
    """Test that None path is not used when exact match exists."""
    # Create multiple CodeStrings including one with None path
    code_string_1 = CodeString(file_path=None, code="none_code")
    code_string_2 = CodeString(file_path=Path("target.py"), code="exact_code")
    optimized_code = CodeStringsMarkdown(code_strings=[code_string_1, code_string_2])

    # Query for target.py
    result = get_optimized_code_for_module(Path("target.py"), optimized_code)  # 16.5μs -> 16.5μs (0.121% faster)

    # Should return exact match, not None path
    assert result == "exact_code"


def test_multiple_none_paths():
    """Test handling when multiple code blocks have None as path."""
    # Create multiple CodeStrings with None path (unusual but possible)
    code_string_1 = CodeString(file_path=None, code="code1")
    code_string_2 = CodeString(file_path=None, code="code2")
    optimized_code = CodeStringsMarkdown(code_strings=[code_string_1, code_string_2])

    # Query for any file
    result = get_optimized_code_for_module(Path("any.py"), optimized_code)  # 20.5μs -> 15.8μs (29.8% faster)

    # Should not use None fallback (length > 1)
    assert result == ""


def test_relative_path_with_dots():
    """Test paths containing dot notation."""
    # Create a CodeString with dots in the path
    code_string = CodeString(file_path=Path("../parent/module.py"), code="parent_code")
    optimized_code = CodeStringsMarkdown(code_strings=[code_string])

    # Query with exact same path
    result = get_optimized_code_for_module(
        Path("../parent/module.py"), optimized_code
    )  # 16.0μs -> 15.8μs (1.59% faster)

    # Should match exactly
    assert result == "parent_code"


def test_large_number_of_code_blocks():
    """Test performance with large number of code blocks."""
    # Create 1000 CodeStrings
    code_strings = [CodeString(file_path=Path(f"module_{i:04d}.py"), code=f"code_block_{i:04d}") for i in range(1000)]
    optimized_code = CodeStringsMarkdown(code_strings=code_strings)

    # Query for file in the middle
    result = get_optimized_code_for_module(Path("module_0500.py"), optimized_code)  # 819μs -> 806μs (1.63% faster)

    # Should find exact match
    assert result == "code_block_0500"


def test_large_number_of_code_blocks_query_last():
    """Test performance querying the last file in a large set."""
    # Create 1000 CodeStrings
    code_strings = [CodeString(file_path=Path(f"module_{i:04d}.py"), code=f"code_block_{i:04d}") for i in range(1000)]
    optimized_code = CodeStringsMarkdown(code_strings=code_strings)

    # Query for the last file
    result = get_optimized_code_for_module(Path("module_0999.py"), optimized_code)  # 819μs -> 796μs (2.95% faster)

    # Should find exact match
    assert result == "code_block_0999"


def test_large_number_of_code_blocks_no_match():
    """Test performance when no match exists in large set."""
    # Create 1000 CodeStrings
    code_strings = [CodeString(file_path=Path(f"module_{i:04d}.py"), code=f"code_block_{i:04d}") for i in range(1000)]
    optimized_code = CodeStringsMarkdown(code_strings=code_strings)

    # Query for a file that doesn't exist
    result = get_optimized_code_for_module(Path("nonexistent.py"), optimized_code)  # 6.50ms -> 5.61ms (15.9% faster)

    # Should return empty string
    assert result == ""


def test_basename_match_in_large_set():
    """Test basename matching performance with large number of files."""
    # Create 500 CodeStrings with different basenames
    code_strings = [CodeString(file_path=Path(f"dir_{i:03d}/module_{i:03d}.py"), code=f"code_{i}") for i in range(500)]
    # Add one more with a unique basename we'll search for
    code_strings.append(CodeString(file_path=Path("deep/nested/dir/target.py"), code="target_found"))
    optimized_code = CodeStringsMarkdown(code_strings=code_strings)

    # Query with different path but same basename
    result = get_optimized_code_for_module(
        Path("other/path/target.py"), optimized_code
    )  # 1.40ms -> 634μs (120% faster)

    # Should match by basename
    assert result == "target_found"


def test_cache_performance_multiple_queries():
    """Test that caching improves performance for repeated queries."""
    # Create 500 CodeStrings
    code_strings = [CodeString(file_path=Path(f"module_{i:03d}.py"), code=f"code_{i}") for i in range(500)]
    optimized_code = CodeStringsMarkdown(code_strings=code_strings)

    # Make multiple queries to same file (should use cache)
    results = [get_optimized_code_for_module(Path("module_250.py"), optimized_code) for _ in range(100)]

    # All results should be identical
    assert all(r == "code_250" for r in results)
    # Cache should be populated after first call
    assert optimized_code._cache.get("file_to_path") is not None

To edit these changes git checkout codeflash/optimize-pr1774-2026-03-17T02.35.54 and push.


claude · 2026-03-17T02:36:33Z

Claude finished @codeflash-ai[bot]'s task in 4m 52s

PR Review Summary

Prek Checks

Two lint errors were found and auto-fixed in a follow-up commit (a707de7b):

Issue	Fix
`TC003`: `from pathlib import Path` outside `TYPE_CHECKING` block	Moved `Path` into `TYPE_CHECKING` — all usages are type annotations (lazy with `from __future__ import annotations`)
`PTH119`: `os.path.basename()` flagged (should use `Path.name`)	Added `# noqa: PTH119` — using `Path(path).name` defeats the entire performance optimization
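The `TC003` fix relies on annotations not being evaluated at runtime. A minimal sketch of that pattern, assuming `Path` is used only in annotations; the function `describe` is hypothetical and stands in for any function in the module:

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen by type checkers only; skipped at runtime.
    from pathlib import Path


def describe(relative_path: Path) -> str:
    """Path appears only in the annotation, which is stored as a string."""
    return str(relative_path)


# The annotation is kept as the literal string "Path" and never resolved here.
print(describe.__annotations__["relative_path"])
```

If `Path` were also called at runtime, for example `Path(x).name`, the import would have to stay outside the `TYPE_CHECKING` block.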

Additional fixes:

- `first_match: str | None = None` → `first_match: str = ""` (resolves mypy `return-value` error)
- Removed a duplicate `# Fallback 3` comment that was accidentally introduced

Prek now passes cleanly.

Code Review

The optimization is sound and well-targeted:

- Early-exit loop (Fallback 2): replaces `[code for path, code in ... if Path(path).name == target_name]` with an early-break loop, which avoids constructing `Path` objects for every dictionary entry and stops at 2 matches. Profiler confirmed 30.9% of original runtime was in this list comprehension.
- `isEnabledFor(logging.DEBUG)` guards: avoid f-string evaluation for suppressed log calls. Small but correct.
- `target_key = str(relative_path)`: minor de-duplication to avoid calling `str()` twice.
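The logging guard mentioned in the review can be illustrated with a short sketch. The logger name and the `report_candidates` helper are hypothetical; only `Logger.isEnabledFor` and `logging.DEBUG` come from the standard library.

```python
import logging

logger = logging.getLogger("codeflash.example")  # hypothetical logger name


def report_candidates(paths: list[str]) -> bool:
    """Log the candidate paths at DEBUG level; return True if the message was built."""
    if logger.isEnabledFor(logging.DEBUG):
        # The f-string, including the sort and join, is only evaluated
        # when DEBUG output is actually enabled.
        logger.debug(f"Available paths: {', '.join(sorted(paths))}")
        return True
    return False


logger.setLevel(logging.INFO)
print(report_candidates(["b.py", "a.py"]))
```

Without the guard, the f-string would be formatted before `logger.debug` is called, even when the record is then discarded.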

One observation: after the loop, `first_match` holds the code of the second basename match (not the first) when `match_count > 1`. This is harmless because the `if match_count == 1` check is false in that case, so the value is never returned, but the variable name is slightly misleading. Not a bug.
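The observation can be reproduced with a small sketch. It assumes the loop assigns `first_match` on every match before checking the count, which is one shape consistent with the review comment; the actual loop in the PR is not shown here and may differ. The `scan` helper is hypothetical and returns its internal state only for illustration.

```python
import os


def scan(file_to_code: dict[str, str], target_name: str) -> tuple[str, str, int]:
    """Return (result, first_match, match_count) to expose the loop state."""
    first_match = ""
    match_count = 0
    for path, code in file_to_code.items():
        if os.path.basename(path) == target_name:
            first_match = code  # overwritten on every match
            match_count += 1
            if match_count > 1:
                break
    # Only a unique match is ever returned, so the overwrite is never observed.
    result = first_match if match_count == 1 else ""
    return result, first_match, match_count


ambiguous = {"a/path/shared.py": "first", "b/other/shared.py": "second"}
print(scan(ambiguous, "shared.py"))
```

In the ambiguous case the variable ends up holding the second entry's code, but the returned result is still the empty string.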

Duplicate Detection

No duplicates detected. get_optimized_code_for_module exists only in codeflash/languages/code_replacer.py.

Test Coverage

92 related tests passed. Coverage for `codeflash/languages/code_replacer.py`: 84% (96 stmts, 15 missed). Coverage is maintained; the PR itself reported 100% coverage for `get_optimized_code_for_module` specifically.

Merge Optimization PRs

PR #1849 (this PR) is the only open codeflash-ai[bot] PR. CI is currently in progress — will need to wait for it to complete before merging.


- Move Path to TYPE_CHECKING block (TC003)
- Suppress PTH119 on os.path.basename - performance-justified exception
- Fix first_match type annotation from str | None to str
- Remove duplicate comment

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

claude · 2026-03-17T03:53:16Z

Closing: CI check js-cjs-function-optimization is failing on the target branch (feat/gradle-executor-from-java). This failure appears pre-existing on the base branch and unrelated to the get_optimized_code_for_module optimization, but I cannot safely merge while CI is red.

codeflash-ai bot added the `⚡️ codeflash` (Optimization PR opened by Codeflash AI) and `🎯 Quality: High` (Optimization Quality according to Codeflash) labels Mar 17, 2026

codeflash-ai bot mentioned this pull request Mar 17, 2026

feat: add Gradle build tool support for Java optimization #1774

Open

6 tasks

claude bot mentioned this pull request Mar 17, 2026

⚡️ Speed up function _expr_matches_name by 26% in PR #1660 (unstructured-inference) #1850

Merged

claude bot closed this Mar 17, 2026

claude bot deleted the codeflash/optimize-pr1774-2026-03-17T02.35.54 branch March 17, 2026 03:53

claude bot mentioned this pull request Mar 17, 2026

feat: improve function ranking with reference graph and test-based boosting #1660

Merged

4 tasks

codeflash-ai[bot] wants to merge 2 commits into feat/gradle-executor-from-java from codeflash/optimize-pr1774-2026-03-17T02.35.54
