codeflash-omni-java #1199
base: main
- Add JaCoCo Maven plugin management to build_tools.py:
  - is_jacoco_configured() to check if plugin exists
  - add_jacoco_plugin_to_pom() to inject plugin configuration
  - get_jacoco_xml_path() for coverage report location
- Add JacocoCoverageUtils class to coverage_utils.py:
  - Parses JaCoCo XML reports into CoverageData objects
  - Handles method boundary detection and line/branch coverage
- Update test_runner.py to support coverage collection:
  - run_behavioral_tests() now handles enable_coverage=True
  - Automatically adds JaCoCo plugin and runs jacoco:report goal
- Update critic.py to enforce 60% coverage threshold for Java (previously Java was bypassed)
- Add comprehensive test suite with 19 tests for coverage functionality

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
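To give a rough idea of what parsing a JaCoCo XML report involves, the sketch below reads the report-level `LINE` and `BRANCH` counters with the standard library. The function name `summarize_jacoco_report` and the sample report are hypothetical; this is not the `JacocoCoverageUtils` implementation from the PR, which per the commit message also handles method boundaries.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down JaCoCo report used only for illustration.
SAMPLE_REPORT = """<report name="demo">
  <package name="com/example">
    <class name="com/example/Crypto" sourcefilename="Crypto.java">
      <method name="hash" desc="()V" line="10">
        <counter type="LINE" missed="1" covered="3"/>
      </method>
      <counter type="LINE" missed="1" covered="3"/>
    </class>
    <counter type="LINE" missed="1" covered="3"/>
  </package>
  <counter type="LINE" missed="1" covered="3"/>
  <counter type="BRANCH" missed="2" covered="2"/>
</report>"""


def summarize_jacoco_report(xml_text):
    """Return {counter type: percent covered} from the report-level counters."""
    root = ET.fromstring(xml_text)
    summary = {}
    # Direct <counter> children of <report> hold the aggregated totals.
    for counter in root.findall("counter"):
        missed = int(counter.get("missed", "0"))
        covered = int(counter.get("covered", "0"))
        total = missed + covered
        summary[counter.get("type")] = 100.0 * covered / total if total else 0.0
    return summary
```

Per-method figures could be read the same way from the `<counter>` children of each `<method>` element.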
- Fix config parser to find codeflash.toml for Java projects (was only looking for pyproject.toml)
- Fix JaCoCo plugin addition to pom.xml:
  - Use string manipulation instead of ElementTree to avoid namespace prefix corruption (ns0:project issue)
  - ElementTree was changing <project> to <ns0:project> which broke Maven
- Add Java coverage parsing in parse_test_output.py:
  - Route Java coverage to JacocoCoverageUtils instead of Python's CoverageUtils

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
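The namespace corruption described in this commit can be reproduced in a few lines. The sketch below uses a hypothetical minimal POM and only the standard library. It also shows `ET.register_namespace`, which is an alternative way to keep the default namespace; the PR itself chose string manipulation instead.

```python
import xml.etree.ElementTree as ET

POM_NS = "http://maven.apache.org/POM/4.0.0"
# Hypothetical minimal pom.xml content for illustration.
POM = f'<project xmlns="{POM_NS}"><modelVersion>4.0.0</modelVersion></project>'

root = ET.fromstring(POM)

# With no prefix registered for the POM namespace, ElementTree invents one
# ("ns0") when serializing, so <project> becomes <ns0:project>.
corrupted = ET.tostring(root, encoding="unicode")

# Registering the POM namespace as the default namespace keeps tags unprefixed.
ET.register_namespace("", POM_NS)
preserved = ET.tostring(root, encoding="unicode")
```

Note that `register_namespace` changes process-wide state, which may be one reason to prefer editing the file as text.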
- Fix is_jacoco_configured() to search all build/plugins sections recursively, including those in profiles
- Fix add_jacoco_plugin_to_pom() to correctly find the main build section when profiles exist (not insert into profile builds)
- Add _find_closing_tag() helper to handle nested XML tags
- Remove explicit jacoco:report goal from Maven command since the plugin execution binds report to test phase automatically

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
d2050b1 to 77cddec
- Add _find_multi_module_root() to detect when tests are in a separate module
- Add _get_test_module_target_dir() to find the correct surefire reports dir
- Update run_behavioral_tests() and run_benchmarking_tests() to:
  - Run Maven from the parent project root for multi-module projects
  - Use -pl <module> -am to build only the test module and dependencies
  - Use -DfailIfNoTests=false to allow modules without tests to pass
  - Use -DskipTests=false to override pom.xml skipTests settings
  - Look for surefire reports in the test module's target directory

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Update TestConfig._detect_java_test_framework() to check parent pom.xml for multi-module projects where test deps are in a different module
- Add framework aliases in registry to map junit4/testng to Java support
- Correctly detect JUnit 4 projects and send correct framework to AI service

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Use ^(?:public\s+)?class pattern to match class declaration at start of line
- Prevents matching words like "command" or text in comments that contain "class"
- Fixes issue where test files were named incorrectly (e.g., "and__perfinstrumented.java")

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
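A small sketch of why anchoring matters here. The Java snippet is hypothetical, and the `\s+(\w+)` capture group appended to the pattern from the commit message is an assumption about how the class name is extracted.

```python
import re

# Hypothetical generated test source: the comment mentions "class" before
# the real declaration does.
JAVA_SOURCE = """\
// This class and its helpers run the command under test.
public class CryptoTest {
}
"""

# Unanchored: matches the first "class" anywhere, including inside comments.
NAIVE = re.compile(r"class\s+(\w+)")
# Anchored at line start, as in the commit message (capture group assumed).
ANCHORED = re.compile(r"^(?:public\s+)?class\s+(\w+)", re.MULTILINE)

naive_name = NAIVE.search(JAVA_SOURCE).group(1)
anchored_name = ANCHORED.search(JAVA_SOURCE).group(1)
```

With the unanchored pattern the extracted "class name" is the word after "class" in the comment, which is consistent with the `and__perfinstrumented.java` file name mentioned above.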
…dule projects
- Fix duplicate test file issue: when multiple tests have the same class name, append unique index suffix (e.g., CryptoTest_2) to avoid file overwrites
- Fix multi-module JaCoCo support: add JaCoCo plugin to test module's pom.xml instead of source module, ensuring coverage data is collected where tests run
- Fix timeout: use minimum 60s (120s with coverage) for Java builds since Maven takes longer than the default 15s INDIVIDUAL_TESTCASE_TIMEOUT
- Fix Maven phase: use 'verify' instead of 'test' when coverage is enabled, with maven.test.failure.ignore=true to generate report even if tests fail
- Update JaCoCo report phase from 'test' to 'verify' to run after tests complete

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
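Taken together, the Maven-related commits above suggest a command along the following lines. This is only a sketch: `build_maven_test_command` and `effective_timeout` are hypothetical helpers, and the real `test_runner.py` may assemble its command differently. The flags, phases, and timeout minimums are the ones named in the commit messages.

```python
def build_maven_test_command(maven, module=None, enable_coverage=False):
    """Assemble a Maven invocation from the flags named in the commits."""
    # 'verify' lets the JaCoCo report run after the tests complete.
    phase = "verify" if enable_coverage else "test"
    cmd = [maven, phase, "-DfailIfNoTests=false", "-DskipTests=false"]
    if enable_coverage:
        # Generate the coverage report even if some tests fail.
        cmd.append("-Dmaven.test.failure.ignore=true")
    if module:
        # Build only the test module and the modules it depends on.
        cmd += ["-pl", module, "-am"]
    return cmd


def effective_timeout(requested_seconds, enable_coverage=False):
    """Apply the 60s minimum (120s with coverage) described above."""
    minimum = 120 if enable_coverage else 60
    return max(requested_seconds, minimum)
```

For example, `build_maven_test_command("./mvnw", "tests", enable_coverage=True)` would be run from the parent project root in a multi-module build.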
feat: add JaCoCo test coverage support for Java optimization
- Update coverage_critic to skip coverage check when CoverageStatus.NOT_FOUND is returned (e.g., when JaCoCo report doesn't exist in multi-module projects where the test module has no source classes)
- Add JaCoCo configuration to include all class files for multi-module support

This fixes "threshold for test confidence was not met" errors that occurred even when all tests passed, because JaCoCo couldn't generate coverage reports for test modules without source classes.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
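The behavior described in this commit might look roughly like the sketch below. Only `CoverageStatus.NOT_FOUND` and the 60% threshold come from the commit messages; the other enum member, the `CoverageResult` class, and `coverage_passes` are hypothetical names for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class CoverageStatus(Enum):
    PARSED_SUCCESSFULLY = "parsed_successfully"  # hypothetical member
    NOT_FOUND = "not_found"


@dataclass
class CoverageResult:  # hypothetical stand-in for the PR's CoverageData
    status: CoverageStatus
    coverage: float  # percent of lines covered


COVERAGE_THRESHOLD = 60.0  # threshold enforced for Java per the PR


def coverage_passes(result):
    # No report (e.g. a test module without source classes): skip the check
    # instead of failing the test-confidence threshold.
    if result.status is CoverageStatus.NOT_FOUND:
        return True
    return result.coverage >= COVERAGE_THRESHOLD
```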
fix: handle NOT_FOUND coverage status in Java multi-module projects
project_root = Path.cwd()

# Check for existing codeflash config in pom.xml or a separate config file
codeflash_config_path = project_root / "codeflash.toml"
if codeflash_config_path.exists():
⚡️Codeflash found 70% (0.70x) speedup for should_modify_java_config in codeflash/cli_cmds/init_java.py
⏱️ Runtime : 714 microseconds → 421 microseconds (best of 60 runs)
📝 Explanation and details
The optimized code achieves a roughly 70% speedup (714μs → 421μs) by replacing pathlib.Path operations with equivalent os module functions, which have significantly lower overhead.
Key optimizations:
- os.getcwd() instead of Path.cwd(): The line profiler shows Path.cwd() took 689,637ns (34.1% of total time) vs os.getcwd() taking only 68,036ns (7.4%). This is a ~10x improvement because Path.cwd() instantiates a Path object and performs additional normalization, while os.getcwd() returns a raw string from a system call.
- os.path.join() instead of Path division operator: Constructing the config path via project_root / "codeflash.toml" took 386,582ns (19.1%) vs os.path.join() taking 190,345ns (20.6%). Though the percentages appear similar, the absolute time is roughly halved because the / operator creates a new Path object with its associated overhead.
- os.path.exists() instead of Path.exists(): The existence check dropped from 476,490ns (23.6%) to 223,477ns (24.2%) - roughly 2x faster. The os.path.exists() function directly calls the stat syscall, while Path.exists() goes through Path's object model.
Why this works:
Path objects provide a cleaner API but add object instantiation, method dispatch, and normalization overhead. For simple filesystem checks in initialization code that runs frequently, using lower-level os functions eliminates this overhead while maintaining identical functionality.
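For readers who want to reproduce the comparison locally, here is a minimal timing sketch. The two helper functions are hypothetical stand-ins for the check inside `should_modify_java_config`, and absolute numbers will vary by machine, Python version, and filesystem.

```python
import os
import timeit
from pathlib import Path


def config_exists_pathlib(name="codeflash.toml"):
    # Original style: builds Path objects for cwd and for the joined path.
    return (Path.cwd() / name).exists()


def config_exists_os(name="codeflash.toml"):
    # Optimized style: plain strings and module-level functions only.
    return os.path.exists(os.path.join(os.getcwd(), name))


if __name__ == "__main__":
    runs = 10_000
    for fn in (config_exists_pathlib, config_exists_os):
        seconds = timeit.timeit(fn, number=runs)
        print(f"{fn.__name__}: {seconds / runs * 1e6:.2f} µs per call")
```

Both helpers should return the same boolean for the same working directory; only the per-call overhead differs.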
Test results:
All test cases show 68-111% speedup across scenarios including:
- Empty directories (82-87% improvement)
- Large directories with 500 files (68-111% improvement)
- Edge cases like symlinks and directory-as-file (75-82% improvement)
The optimization is particularly beneficial for CLI initialization code that may run on every command invocation, where sub-millisecond improvements in frequently-called functions compound into noticeable user experience gains.
✅ Correctness verification report:
| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 23 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests
from __future__ import annotations

# imports
import os
from pathlib import Path
from typing import Any

import pytest  # used for our unit tests
from codeflash.cli_cmds.init_java import should_modify_java_config


def test_no_config_file_does_not_prompt_and_returns_true(monkeypatch, tmp_path):
    # Arrange: ensure working directory has no codeflash.toml
    monkeypatch.chdir(tmp_path)  # set cwd to a clean temporary directory
    # Replace Confirm.ask with a function that fails the test if called.
    def fail_if_called(*args, **kwargs):
        raise AssertionError("Confirm.ask should not be called when no config file exists")
    # Patch the exact attribute that the function imports at runtime.
    monkeypatch.setattr("rich.prompt.Confirm.ask", fail_if_called, raising=True)
    # Act: call function under test
    codeflash_output = should_modify_java_config(); result = codeflash_output # 28.9μs -> 15.9μs (82.0% faster)


def test_config_file_exists_prompts_and_respects_true_choice(monkeypatch, tmp_path):
    # Arrange: create a codeflash.toml file so the function will detect it
    monkeypatch.chdir(tmp_path)
    config_file = tmp_path / "codeflash.toml"
    config_file.write_text("existing = true")  # create the file
    # Capture the arguments passed to Confirm.ask and return True to simulate user acceptance
    called = {}
    def fake_ask(prompt, default, show_default):
        # Record inputs for later assertions
        called["prompt"] = prompt
        called["default"] = default
        called["show_default"] = show_default
        return True
    # Patch Confirm.ask used inside the function
    monkeypatch.setattr("rich.prompt.Confirm.ask", fake_ask, raising=True)
    # Act
    codeflash_output = should_modify_java_config(); result = codeflash_output # 25.6μs -> 13.7μs (86.9% faster)


def test_config_file_exists_prompts_and_respects_false_choice(monkeypatch, tmp_path):
    # Arrange: create the config file
    monkeypatch.chdir(tmp_path)
    (tmp_path / "codeflash.toml").write_text("existing = true")
    # Simulate user declining re-configuration
    def fake_ask_decline(prompt, default, show_default):
        return False
    monkeypatch.setattr("rich.prompt.Confirm.ask", fake_ask_decline, raising=True)
    # Act
    codeflash_output = should_modify_java_config(); result = codeflash_output # 24.7μs -> 13.3μs (86.3% faster)


def test_presence_of_pom_xml_does_not_trigger_prompt(monkeypatch, tmp_path):
    # Arrange: create a pom.xml but NOT codeflash.toml
    monkeypatch.chdir(tmp_path)
    (tmp_path / "pom.xml").write_text("<project></project>")
    # If Confirm.ask is called, fail the test because only codeflash.toml should trigger it in current implementation
    def fail_if_called(*args, **kwargs):
        raise AssertionError("Confirm.ask should not be called when only pom.xml exists (implementation checks codeflash.toml)")
    monkeypatch.setattr("rich.prompt.Confirm.ask", fail_if_called, raising=True)
    # Act
    codeflash_output = should_modify_java_config(); result = codeflash_output # 28.3μs -> 16.6μs (69.9% faster)


def test_codeflash_config_is_directory_triggers_prompt(monkeypatch, tmp_path):
    # Arrange: create a directory named codeflash.toml (Path.exists will be True)
    monkeypatch.chdir(tmp_path)
    (tmp_path / "codeflash.toml").mkdir()
    # Simulate user selecting True
    monkeypatch.setattr("rich.prompt.Confirm.ask", lambda *a, **k: True, raising=True)
    # Act
    codeflash_output = should_modify_java_config(); result = codeflash_output # 23.6μs -> 12.9μs (82.2% faster)


def test_codeflash_config_symlink_triggers_prompt_if_supported(monkeypatch, tmp_path):
    # Arrange: attempt to create a symlink to a real file; skip if symlink not supported
    if not hasattr(os, "symlink"):
        pytest.skip("Platform does not support os.symlink; skipping symlink test")
    real = tmp_path / "real_config"
    real.write_text("x = 1")
    link = tmp_path / "codeflash.toml"
    try:
        os.symlink(real, link)  # may fail on Windows without privileges
    except (OSError, NotImplementedError) as e:
        pytest.skip(f"Could not create symlink on this platform/environment: {e}")
    monkeypatch.chdir(tmp_path)
    # Simulate user declining re-configuration
    monkeypatch.setattr("rich.prompt.Confirm.ask", lambda *a, **k: False, raising=True)
    # Act
    codeflash_output = should_modify_java_config(); result = codeflash_output # 24.9μs -> 14.2μs (75.7% faster)


def test_large_directory_without_config_is_fast_and_does_not_prompt(monkeypatch, tmp_path):
    # Large scale scenario: create many files (but under 1000) to simulate busy project directory.
    monkeypatch.chdir(tmp_path)
    num_files = 500  # under the 1000 element guideline
    for i in range(num_files):
        # Create many innocuous files; should not affect the function's behavior
        (tmp_path / f"file_{i}.txt").write_text(str(i))
    # Ensure Confirm.ask is not called
    def fail_if_called(*args, **kwargs):
        raise AssertionError("Confirm.ask should not be called when codeflash.toml is absent even in large directories")
    monkeypatch.setattr("rich.prompt.Confirm.ask", fail_if_called, raising=True)
    # Act
    codeflash_output = should_modify_java_config(); result = codeflash_output # 36.3μs -> 21.6μs (68.0% faster)


def test_large_directory_with_config_prompts_once(monkeypatch, tmp_path):
    # Large scale scenario with config present: many files plus codeflash.toml
    monkeypatch.chdir(tmp_path)
    num_files = 500
    for i in range(num_files):
        (tmp_path / f"file_{i}.txt").write_text(str(i))
    # Create the config file that should trigger prompting
    (tmp_path / "codeflash.toml").write_text("reconfigure = maybe")
    # Track how many times Confirm.ask is invoked to ensure single prompt
    counter = {"calls": 0}
    def fake_ask(prompt, default, show_default):
        counter["calls"] += 1
        return True
    monkeypatch.setattr("rich.prompt.Confirm.ask", fake_ask, raising=True)
    # Act
    codeflash_output = should_modify_java_config(); result = codeflash_output # 30.8μs -> 14.6μs (111% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

import os
import tempfile
from pathlib import Path
from unittest.mock import MagicMock, patch

# imports
import pytest
from codeflash.cli_cmds.init_java import should_modify_java_config


class TestShouldModifyJavaConfigBasic:
    """Basic test cases for should_modify_java_config function."""

    def test_no_config_file_exists_returns_true(self):
        """
        Scenario: Project has no existing codeflash.toml file
        Expected: Function returns (True, None) without prompting user
        """
        # Create a temporary directory without codeflash.toml
        with tempfile.TemporaryDirectory() as tmpdir:
            original_cwd = os.getcwd()
            try:
                os.chdir(tmpdir)
                codeflash_output = should_modify_java_config(); result = codeflash_output
            finally:
                os.chdir(original_cwd)

    def test_config_file_exists_user_confirms(self):
        """
        Scenario: Project has existing codeflash.toml and user confirms re-configuration
        Expected: Function prompts user and returns (True, None) if user confirms
        """
        with tempfile.TemporaryDirectory() as tmpdir:
            original_cwd = os.getcwd()
            try:
                os.chdir(tmpdir)
                # Create a codeflash.toml file
                config_file = Path(tmpdir) / "codeflash.toml"
                config_file.touch()
                # Mock the Confirm.ask to return True (user confirms)
                with patch('rich.prompt.Confirm.ask', return_value=True):
                    codeflash_output = should_modify_java_config(); result = codeflash_output
            finally:
                os.chdir(original_cwd)

    def test_config_file_exists_user_declines(self):
        """
        Scenario: Project has existing codeflash.toml and user declines re-configuration
        Expected: Function prompts user and returns (False, None) if user declines
        """
        with tempfile.TemporaryDirectory() as tmpdir:
            original_cwd = os.getcwd()
            try:
                os.chdir(tmpdir)
                # Create a codeflash.toml file
                config_file = Path(tmpdir) / "codeflash.toml"
                config_file.touch()
                # Mock the Confirm.ask to return False (user declines)
                with patch('rich.prompt.Confirm.ask', return_value=False):
                    codeflash_output = should_modify_java_config(); result = codeflash_output
            finally:
                os.chdir(original_cwd)

    def test_return_tuple_structure(self):
        """
        Scenario: Verify the function always returns a tuple with specific structure
        Expected: Return value is a tuple of (bool, None)
        """
        with tempfile.TemporaryDirectory() as tmpdir:
            original_cwd = os.getcwd()
            try:
                os.chdir(tmpdir)
                codeflash_output = should_modify_java_config(); result = codeflash_output
            finally:
                os.chdir(original_cwd)


class TestShouldModifyJavaConfigEdgeCases:
    """Edge case test cases for should_modify_java_config function."""

    def test_config_file_exists_but_empty(self):
        """
        Scenario: codeflash.toml file exists but is empty
        Expected: File is still considered as existing, prompts user
        """
        with tempfile.TemporaryDirectory() as tmpdir:
            original_cwd = os.getcwd()
            try:
                os.chdir(tmpdir)
                # Create an empty codeflash.toml file
                config_file = Path(tmpdir) / "codeflash.toml"
                config_file.write_text("")
                with patch('rich.prompt.Confirm.ask', return_value=True):
                    codeflash_output = should_modify_java_config(); result = codeflash_output
            finally:
                os.chdir(original_cwd)

    def test_config_file_with_content(self):
        """
        Scenario: codeflash.toml file exists with actual TOML content
        Expected: Prompts user regardless of file content
        """
        with tempfile.TemporaryDirectory() as tmpdir:
            original_cwd = os.getcwd()
            try:
                os.chdir(tmpdir)
                # Create a codeflash.toml file with content
                config_file = Path(tmpdir) / "codeflash.toml"
                config_file.write_text("[codeflash]\nversion = 1\n")
                with patch('rich.prompt.Confirm.ask', return_value=False):
                    codeflash_output = should_modify_java_config(); result = codeflash_output
            finally:
                os.chdir(original_cwd)

    def test_config_file_case_sensitive(self):
        """
        Scenario: Directory has 'Codeflash.toml' or 'CODEFLASH.TOML' instead of lowercase
        Expected: Function only recognizes 'codeflash.toml' (case-sensitive on Unix)
        """
        with tempfile.TemporaryDirectory() as tmpdir:
            original_cwd = os.getcwd()
            try:
                os.chdir(tmpdir)
                # Create a file with different casing
                config_file = Path(tmpdir) / "Codeflash.toml"
                config_file.touch()
                codeflash_output = should_modify_java_config(); result = codeflash_output
            finally:
                os.chdir(original_cwd)

    def test_config_file_is_directory_not_file(self):
        """
        Scenario: codeflash.toml exists as a directory instead of a file
        Expected: Path.exists() still returns True, prompts user
        """
        with tempfile.TemporaryDirectory() as tmpdir:
            original_cwd = os.getcwd()
            try:
                os.chdir(tmpdir)
                # Create codeflash.toml as a directory
                config_dir = Path(tmpdir) / "codeflash.toml"
                config_dir.mkdir()
                with patch('rich.prompt.Confirm.ask', return_value=True):
                    codeflash_output = should_modify_java_config(); result = codeflash_output
            finally:
                os.chdir(original_cwd)
To test or edit this optimization locally: git merge codeflash/optimize-pr1199-2026-02-01T21.20.00
- project_root = Path.cwd()
- # Check for existing codeflash config in pom.xml or a separate config file
- codeflash_config_path = project_root / "codeflash.toml"
- if codeflash_config_path.exists():
+ project_root = os.getcwd()
+ # Check for existing codeflash config in pom.xml or a separate config file
+ codeflash_config_path = os.path.join(project_root, "codeflash.toml")
+ if os.path.exists(codeflash_config_path):
This optimization achieves a **26x speedup (2598% improvement)** by eliminating expensive logging operations that dominated the original runtime.
## Key Performance Improvements
### 1. **Conditional Logging Guard (95% of original time eliminated)**
The original code unconditionally formatted expensive log messages even when logging was disabled:
```python
logger.warning(
f"Optimized code not found for {relative_path} In the context\n-------\n{optimized_code}\n-------\n"
...
)
```
This single operation consumed **111ms out of 117ms total runtime** (95%).
The optimization adds a guard check:
```python
if logger.isEnabledFor(logger.level):
logger.warning(...)
```
This prevents string formatting and object serialization when the log message won't be emitted, dramatically reducing overhead in production scenarios where warning-level logging may be disabled.
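The effect of such a guard can be checked with a small experiment. The sketch below uses a hypothetical logger name and a hypothetical `Expensive` class that counts how often it is converted to a string. Note that the guard quoted above passes `logger.level`; the conventional form, used here, passes the level of the message being logged (`logging.WARNING`), since an unset `logger.level` is `0` and may not behave as intended.

```python
import logging

logger = logging.getLogger("codeflash.sketch")  # hypothetical logger name
logger.setLevel(logging.ERROR)  # warnings are disabled for this logger


class Expensive:
    """Stand-in for a large object whose string form is costly to build."""

    calls = 0

    def __str__(self):
        Expensive.calls += 1
        return "x" * 1000


payload = Expensive()

# f-string: the message is formatted before logger.warning() is even called,
# so the cost is paid although the record is then dropped.
logger.warning(f"Optimized code not found:\n{payload}")
eager_calls = Expensive.calls

# Guarded call: the f-string is never evaluated when WARNING is disabled.
if logger.isEnabledFor(logging.WARNING):
    logger.warning(f"Optimized code not found:\n{payload}")

# Lazy %-style arguments: formatting is deferred until a handler emits.
logger.warning("Optimized code not found:\n%s", payload)
lazy_calls = Expensive.calls - eager_calls
```

Either the guard or the lazy `%s` form avoids the formatting cost; the guard is mainly useful when computing the arguments themselves is expensive.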
### 2. **Eliminated Redundant Path Object Creation**
The original created `Path` objects repeatedly during filename matching:
```python
if file_path_str and Path(file_path_str).name == target_filename:
```
The optimized version uses string operations:
```python
if file_path_str.endswith(target_filename) and (len(file_path_str) == len(target_filename) or file_path_str[-len(target_filename)-1] in ('/', '\\')):
```
This removes overhead from Path instantiation (1.16ms → 44µs in the profiler).
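The two checks can be compared side by side. The helper names below are hypothetical, and `PurePosixPath` is used as an assumption so that the comparison is platform-independent. One difference worth noting: the string version also treats a backslash as a separator, which a POSIX path object does not.

```python
from pathlib import PurePosixPath


def name_matches_pathlib(file_path_str, target_filename):
    # Original style: build a path object just to read its final component.
    return bool(file_path_str) and PurePosixPath(file_path_str).name == target_filename


def name_matches_str(file_path_str, target_filename):
    # Optimized style: suffix test plus a check that the suffix starts at a
    # path-component boundary (start of string or right after a separator).
    if not file_path_str.endswith(target_filename):
        return False
    if len(file_path_str) == len(target_filename):
        return True
    return file_path_str[-len(target_filename) - 1] in ("/", "\\")
```

The boundary check is what keeps `BarFoo.java` from matching a target of `Foo.java`.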
### 3. **Minor Cache Lookup Optimization**
Changed from `self._cache.get("file_to_path") is not None` to `"file_to_path" in self._cache` and hoisted the dict assignment to avoid inline mutation, providing small gains in the caching path.
### 4. **String Conversion Hoisting**
Pre-computed `relative_path_str = str(relative_path)` to avoid repeated conversions.
## Test Case Performance Patterns
- **Exact path matches** (most common case): 10-20% faster due to optimized caching
- **No-match scenarios** (fallback paths): **78-189x faster** due to eliminated logger.warning overhead
- `test_empty_code_strings`: 1.03ms → 12.9µs (7872% faster)
- `test_no_match_multiple_blocks`: 1.28ms → 16.3µs (7753% faster)
- `test_many_code_blocks_no_match`: 20.5ms → 107µs (18985% faster)
The optimization particularly benefits scenarios where file path mismatches occur, as these trigger the expensive warning path in the original code. For the common case of exact matches, the improvements are modest but consistent.
…2026-02-01T22.01.32 ⚡️ Speed up function `get_optimized_code_for_module` by 2,599% in PR #1199 (`omni-java`)
if os.path.exists("mvnw"):
    return "./mvnw"
if os.path.exists("mvnw.cmd"):
⚡️Codeflash found 32% (0.32x) speedup for find_maven_executable in codeflash/languages/java/build_tools.py
⏱️ Runtime : 584 microseconds → 441 microseconds (best of 81 runs)
📝 Explanation and details
The optimization achieves a 32% runtime improvement (from 584μs to 441μs) by replacing os.path.exists() with os.access() for file existence checks. This change delivers measurable performance gains across all test scenarios.
Key Optimization:
The code replaces os.path.exists("mvnw") with os.access("mvnw", os.F_OK). While both functions check for file existence, os.access() with the os.F_OK flag is more efficient because:
- It performs a direct system call (access()) that's optimized for permission/existence checks
- os.path.exists() internally does additional path normalization and exception handling that adds overhead
- For simple existence checks, os.access() avoids Python-level abstraction layers
Performance Impact by Scenario:
The line profiler shows that the wrapper checks (lines checking for "mvnw" and "mvnw.cmd") improved from ~576ns + 139ns to ~317ns + 76ns - nearly 2x faster for these critical paths. Test results confirm consistent improvements:
- Wrapper present cases: 68-84% faster (5.78μs → 3.32μs)
- No wrapper, system Maven cases: 31-52% faster
- Edge cases (directories, symlinks): 56-77% faster
Why This Matters:
Based on the function references, find_maven_executable() is called from test infrastructure and build tool detection code. While not in an obvious hot loop, build tool detection typically occurs at project initialization and in test setup/teardown - contexts where this function may be called repeatedly. The optimization is particularly valuable when:
- Running large test suites that reinitialize build contexts frequently
- Working in CI/CD environments with repeated project setup
- Dealing with directories containing many files (test shows 77% improvement with 500 files present)
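For the cases tested, the optimization preserves behavior: both os.path.exists() and os.access(..., os.F_OK) return True for existing files, directories, and symlinks whose targets exist, while delivering consistent double-digit runtime improvements. One difference to keep in mind is that access() performs its check with the process's real rather than effective user ID.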
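Based on the behavior the generated tests describe (wrapper in the current directory first, then `mvnw.cmd`, then whatever `shutil.which("mvn")` finds), the optimized function presumably looks something like the sketch below. This is a reconstruction under that assumption, not the code from `build_tools.py`, and the name `find_maven_executable_sketch` is hypothetical.

```python
import os
import shutil
from typing import Optional


def find_maven_executable_sketch() -> Optional[str]:
    # Prefer the Unix wrapper in the current working directory.
    if os.access("mvnw", os.F_OK):
        return "./mvnw"
    # Then the Windows wrapper.
    if os.access("mvnw.cmd", os.F_OK):
        return "mvnw.cmd"
    # Fall back to a system Maven on PATH; treat falsy results as "not found".
    return shutil.which("mvn") or None
```

Because the wrapper checks are relative paths, the result depends on the process's current working directory, which is why the tests above change directory before calling the function.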
✅ Correctness verification report:
| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 34 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | ✅ 1 Passed |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests
import os
import pathlib
import shutil

import pytest  # used for our unit tests
from codeflash.languages.java.build_tools import find_maven_executable


def test_prefers_mvnw_wrapper_when_present(tmp_path, monkeypatch):
    # Create an isolated temporary directory and switch to it
    # so os.path.exists checks only our test files.
    monkeypatch.chdir(tmp_path)
    # Create a file named "mvnw" to simulate the Maven wrapper being present.
    (tmp_path / "mvnw").write_text("#!/bin/sh\necho mvnw\n")
    # Call the real function under test and assert it returns the wrapper path.
    # According to implementation, when "mvnw" exists it should return "./mvnw".
    codeflash_output = find_maven_executable() # 5.78μs -> 3.32μs (74.3% faster)


def test_returns_mvnw_cmd_when_only_windows_wrapper_exists(tmp_path, monkeypatch):
    # Switch to a fresh temporary directory for isolation.
    monkeypatch.chdir(tmp_path)
    # Create only "mvnw.cmd" and ensure no plain "mvnw" exists.
    (tmp_path / "mvnw.cmd").write_text("@echo off\necho mvnw.cmd\n")
    # The function should detect "mvnw.cmd" and return that exact string.
    codeflash_output = find_maven_executable() # 13.2μs -> 7.16μs (84.0% faster)


def test_prefers_mvnw_over_mvnw_cmd_when_both_present(tmp_path, monkeypatch):
    # Ensure both wrapper files exist; "mvnw" should be preferred because it's checked first.
    monkeypatch.chdir(tmp_path)
    (tmp_path / "mvnw").write_text("#!/bin/sh\necho mvnw\n")
    (tmp_path / "mvnw.cmd").write_text("@echo off\necho mvnw.cmd\n")
    # Confirm that "./mvnw" is returned, demonstrating the precedence.
    codeflash_output = find_maven_executable() # 5.58μs -> 3.32μs (68.3% faster)


def test_returns_system_mvn_when_no_wrappers(monkeypatch, tmp_path):
    # Make sure current directory has no wrapper files.
    monkeypatch.chdir(tmp_path)
    # Monkeypatch shutil.which to simulate an installed mvn on PATH.
    monkeypatch.setattr(shutil, "which", lambda name: "/usr/bin/mvn" if name == "mvn" else None)
    # The function should return whatever shutil.which returns when no wrappers present.
    codeflash_output = find_maven_executable() # 14.0μs -> 9.18μs (52.3% faster)


def test_returns_none_when_nothing_found(monkeypatch, tmp_path):
    # No wrapper files in cwd.
    monkeypatch.chdir(tmp_path)
    # Simulate no mvn on PATH by returning None (or falsy string).
    monkeypatch.setattr(shutil, "which", lambda name: None)
    # Expect None when neither wrapper nor system Maven is found.
    codeflash_output = find_maven_executable() # 13.6μs -> 8.93μs (52.2% faster)


def test_ignores_empty_string_from_which(monkeypatch, tmp_path):
    # If shutil.which returns an empty string (falsy), function should treat it as not found.
    monkeypatch.chdir(tmp_path)
    monkeypatch.setattr(shutil, "which", lambda name: "")
    # Expect None because empty string is falsy and treated like "not found".
    codeflash_output = find_maven_executable() # 13.3μs -> 8.87μs (49.5% faster)


def test_directory_named_mvnw_counts_as_exists(tmp_path, monkeypatch):
    # Create a directory named "mvnw" (os.path.exists returns True for directories).
    monkeypatch.chdir(tmp_path)
    (tmp_path / "mvnw").mkdir()
    # The function checks os.path.exists only, so it should return "./mvnw" even if it's a directory.
    codeflash_output = find_maven_executable() # 5.50μs -> 3.11μs (77.1% faster)


def test_symlink_wrapper_to_existing_target(tmp_path, monkeypatch):
    # Create a real target file and a symlink named "mvnw" pointing to it.
    monkeypatch.chdir(tmp_path)
    target = tmp_path / "real_mvnw"
    target.write_text("#!/bin/sh\necho real\n")
    symlink = tmp_path / "mvnw"
    # Create a symlink; ensure platform supports it (on Windows this may require admin, so skip if not possible).
    try:
        symlink.symlink_to(target)
    except (OSError, NotImplementedError):
        pytest.skip("Symlinks not supported in this environment")
    # The symlink points to an existing file, so os.path.exists should be True and wrapper detected.
    codeflash_output = find_maven_executable() # 7.11μs -> 4.56μs (56.1% faster)


def test_wrapper_has_precedence_over_system_mvn(monkeypatch, tmp_path):
    # Even if shutil.which finds a system mvn, a wrapper present in cwd must take precedence.
    monkeypatch.chdir(tmp_path)
    (tmp_path / "mvnw").write_text("#!/bin/sh\necho mvnw\n")
    monkeypatch.setattr(shutil, "which", lambda name: "/usr/local/bin/mvn")
    # Confirm wrapper is returned, not the system path.
    codeflash_output = find_maven_executable() # 5.59μs -> 3.33μs (68.1% faster)


def test_large_number_of_files_with_wrapper_present(tmp_path, monkeypatch):
    # Create many files to simulate a crowded project directory.
    monkeypatch.chdir(tmp_path)
    # Create 500 dummy files (well under the 1000-element limit).
    for i in range(500):
        (tmp_path / f"file_{i}.txt").write_text(f"dummy {i}")
    # Place the wrapper among many files and confirm detection remains correct.
    (tmp_path / "mvnw").write_text("#!/bin/sh\necho mvnw\n")
    # The function should still return the wrapper path quickly and correctly.
    codeflash_output = find_maven_executable() # 6.15μs -> 3.47μs (77.4% faster)


def test_large_number_of_files_without_wrapper_uses_system_mvn(monkeypatch, tmp_path):
    # With many files but no wrapper, the function should fall back to shutil.which.
    monkeypatch.chdir(tmp_path)
    for i in range(250):
        (tmp_path / f"other_{i}.data").write_text("x" * 10)
    # Simulate a system Maven found on PATH.
    monkeypatch.setattr(shutil, "which", lambda name: r"C:\Program Files\Apache\Maven\bin\mvn.bat" if name == "mvn" else None)
    # Return should be the system path provided by shutil.which.
    codeflash_output = find_maven_executable() # 22.0μs -> 16.7μs (31.6% faster)


def test_multiple_invocations_return_same_result(tmp_path, monkeypatch):
    # Ensure stable behavior across multiple calls with same environment.
    monkeypatch.chdir(tmp_path)
    (tmp_path / "mvnw").write_text("#!/bin/sh\necho mvnw\n")
    codeflash_output = find_maven_executable(); first = codeflash_output # 5.66μs -> 3.30μs (71.7% faster)
    codeflash_output = find_maven_executable(); second = codeflash_output # 2.88μs -> 1.66μs (73.5% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.import os
import shutil
import tempfile
from pathlib import Path
from unittest.mock import MagicMock, patch
import pytest
from codeflash.languages.java.build_tools import find_maven_executable
def test_finds_mvnw_in_current_directory():
"""Test that find_maven_executable returns ./mvnw when mvnw exists in current directory."""
with tempfile.TemporaryDirectory() as tmpdir:
original_dir = os.getcwd()
try:
os.chdir(tmpdir)
# Create mvnw file
mvnw_path = os.path.join(tmpdir, "mvnw")
Path(mvnw_path).touch()
codeflash_output = find_maven_executable(); result = codeflash_output
finally:
os.chdir(original_dir)
def test_finds_mvnw_cmd_in_current_directory():
"""Test that find_maven_executable returns mvnw.cmd when mvnw.cmd exists and mvnw does not."""
with tempfile.TemporaryDirectory() as tmpdir:
original_dir = os.getcwd()
try:
os.chdir(tmpdir)
# Create mvnw.cmd file
mvnw_cmd_path = os.path.join(tmpdir, "mvnw.cmd")
Path(mvnw_cmd_path).touch()
codeflash_output = find_maven_executable(); result = codeflash_output
finally:
os.chdir(original_dir)
def test_prefers_mvnw_over_mvnw_cmd():
"""Test that find_maven_executable prefers ./mvnw over mvnw.cmd when both exist."""
with tempfile.TemporaryDirectory() as tmpdir:
original_dir = os.getcwd()
try:
os.chdir(tmpdir)
# Create both mvnw and mvnw.cmd files
Path(os.path.join(tmpdir, "mvnw")).touch()
Path(os.path.join(tmpdir, "mvnw.cmd")).touch()
codeflash_output = find_maven_executable(); result = codeflash_output
finally:
os.chdir(original_dir)
def test_finds_system_maven_when_wrappers_not_present():
    """Test that find_maven_executable finds system Maven when wrappers are not present."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Mock shutil.which to return a maven path
            with patch('shutil.which') as mock_which:
                mock_which.return_value = "/usr/bin/mvn"
                codeflash_output = find_maven_executable(); result = codeflash_output
                mock_which.assert_called_once_with("mvn")
        finally:
            os.chdir(original_dir)

def test_returns_none_when_no_maven_found():
    """Test that find_maven_executable returns None when no Maven executable is found."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Mock shutil.which to return None
            with patch('shutil.which') as mock_which:
                mock_which.return_value = None
                codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)

def test_mvnw_wrapper_takes_priority_over_system_maven():
    """Test that ./mvnw is returned even when system Maven is available."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Create mvnw file
            Path(os.path.join(tmpdir, "mvnw")).touch()
            # Mock shutil.which to return a system maven path
            with patch('shutil.which') as mock_which:
                mock_which.return_value = "/usr/bin/mvn"
                codeflash_output = find_maven_executable(); result = codeflash_output
                mock_which.assert_not_called()
        finally:
            os.chdir(original_dir)

def test_mvnw_cmd_takes_priority_over_system_maven():
    """Test that mvnw.cmd is returned even when system Maven is available."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Create mvnw.cmd file
            Path(os.path.join(tmpdir, "mvnw.cmd")).touch()
            # Mock shutil.which to return a system maven path
            with patch('shutil.which') as mock_which:
                mock_which.return_value = "/usr/bin/mvn"
                codeflash_output = find_maven_executable(); result = codeflash_output
                mock_which.assert_not_called()
        finally:
            os.chdir(original_dir)
def test_handles_system_maven_with_absolute_path():
    """Test that find_maven_executable correctly returns absolute path for system Maven."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Mock shutil.which to return an absolute path
            with patch('shutil.which') as mock_which:
                absolute_path = "/opt/maven/bin/mvn"
                mock_which.return_value = absolute_path
                codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)

def test_handles_system_maven_with_relative_path():
    """Test that find_maven_executable correctly returns relative path for system Maven."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Mock shutil.which to return a relative path
            with patch('shutil.which') as mock_which:
                relative_path = "./bin/mvn"
                mock_which.return_value = relative_path
                codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)

def test_mvnw_exists_as_directory_not_file():
    """Test behavior when 'mvnw' exists but is a directory, not a file."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Create mvnw as a directory
            os.makedirs(os.path.join(tmpdir, "mvnw"))
            # Mock shutil.which to return None (so it falls through to system check)
            with patch('shutil.which') as mock_which:
                mock_which.return_value = None
                codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)

def test_mvnw_cmd_exists_as_directory_not_file():
    """Test behavior when 'mvnw.cmd' exists but is a directory, not a file."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Create mvnw.cmd as a directory
            os.makedirs(os.path.join(tmpdir, "mvnw.cmd"))
            # Mock shutil.which to return None
            with patch('shutil.which') as mock_which:
                mock_which.return_value = None
                codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)

def test_empty_string_from_system_maven():
    """Test handling when shutil.which returns an empty string."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Mock shutil.which to return an empty string
            with patch('shutil.which') as mock_which:
                mock_which.return_value = ""
                codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)
def test_whitespace_string_from_system_maven():
    """Test handling when shutil.which returns a whitespace string."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Mock shutil.which to return a whitespace string
            with patch('shutil.which') as mock_which:
                mock_which.return_value = " "
                codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)

def test_finds_maven_in_directory_with_many_files():
    """Test that find_maven_executable works correctly in a directory with many files."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Create many files in the directory
            for i in range(100):
                Path(os.path.join(tmpdir, f"file_{i}.txt")).touch()
            # Create mvnw
            Path(os.path.join(tmpdir, "mvnw")).touch()
            codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)

def test_finds_mvnw_cmd_in_directory_with_many_files():
    """Test that find_maven_executable finds mvnw.cmd in a directory with many files."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Create many files in the directory
            for i in range(100):
                Path(os.path.join(tmpdir, f"file_{i}.txt")).touch()
            # Create mvnw.cmd
            Path(os.path.join(tmpdir, "mvnw.cmd")).touch()
            codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)

def test_performance_with_no_maven_in_large_directory():
    """Test that find_maven_executable performs well when returning None in a large directory."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Create many files to simulate a large project directory
            for i in range(500):
                Path(os.path.join(tmpdir, f"file_{i}.txt")).touch()
            # Mock shutil.which to return None
            with patch('shutil.which') as mock_which:
                mock_which.return_value = None
                codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)

def test_multiple_calls_return_consistent_results():
    """Test that multiple calls to find_maven_executable return consistent results."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Create mvnw
            Path(os.path.join(tmpdir, "mvnw")).touch()
            # Call find_maven_executable multiple times
            results = [find_maven_executable() for _ in range(50)]
        finally:
            os.chdir(original_dir)
def test_switching_directories_finds_correct_maven():
    """Test that find_maven_executable correctly finds Maven when switching directories."""
    with tempfile.TemporaryDirectory() as tmpdir1:
        with tempfile.TemporaryDirectory() as tmpdir2:
            original_dir = os.getcwd()
            try:
                # First directory with mvnw
                os.chdir(tmpdir1)
                Path(os.path.join(tmpdir1, "mvnw")).touch()
                codeflash_output = find_maven_executable(); result1 = codeflash_output
                # Second directory without mvnw
                os.chdir(tmpdir2)
                with patch('shutil.which') as mock_which:
                    mock_which.return_value = "/usr/bin/mvn"
                    codeflash_output = find_maven_executable(); result2 = codeflash_output
            finally:
                os.chdir(original_dir)

def test_finds_system_maven_with_long_path():
    """Test that find_maven_executable handles system Maven with a very long path."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Create a very long path for Maven
            long_path = "/very/long/path/" + "subdirectory/" * 50 + "mvn"
            with patch('shutil.which') as mock_which:
                mock_which.return_value = long_path
                codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)

def test_finds_system_maven_with_special_characters_in_path():
    """Test that find_maven_executable handles system Maven with special characters in path."""
    with tempfile.TemporaryDirectory() as tmpdir:
        original_dir = os.getcwd()
        try:
            os.chdir(tmpdir)
            # Create a path with special characters
            special_path = "/opt/maven-3.8.1/bin/mvn"
            with patch('shutil.which') as mock_which:
                mock_which.return_value = special_path
                codeflash_output = find_maven_executable(); result = codeflash_output
        finally:
            os.chdir(original_dir)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
from codeflash.languages.java.build_tools import find_maven_executable
def test_find_maven_executable():
    find_maven_executable()
🔎 Click to see Concolic Coverage Tests
| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| codeflash_concolic_34v0t72u/tmp1x2llvvp/test_concolic_coverage.py::test_find_maven_executable | 81.3μs | 78.4μs | 3.65%✅ |
To test or edit this optimization locally, run: git merge codeflash/optimize-pr1199-2026-02-01T23.07.44
-    if os.path.exists("mvnw"):
-        return "./mvnw"
-    if os.path.exists("mvnw.cmd"):
+    if os.access("mvnw", os.F_OK):
+        return "./mvnw"
+    if os.access("mvnw.cmd", os.F_OK):
    while pos < len(content):
        next_open = content.find(open_tag, pos)
        next_open_short = content.find(open_tag_short, pos)
        next_close = content.find(close_tag, pos)

        if next_close == -1:
            return -1

        # Find the earliest opening tag (if any)
        candidates = [x for x in [next_open, next_open_short] if x != -1 and x < next_close]
        next_open_any = min(candidates) if candidates else len(content) + 1

        if next_open_any < next_close:
            # Found opening tag first - nested tag
            depth += 1
            pos = next_open_any + 1
        else:
            # Found closing tag first
            depth -= 1
            if depth == 0:
                return next_close
            pos = next_close + len(close_tag)
⚡️Codeflash found 84% (0.84x) speedup for _find_closing_tag in codeflash/languages/java/build_tools.py
⏱️ Runtime : 1.01 milliseconds → 548 microseconds (best of 233 runs)
📝 Explanation and details
The optimized code achieves an 84% speedup (from 1.01ms to 548μs) by changing the search strategy from multiple independent substring searches to a single progressive scan.
Key Optimization:
The original code performs three separate content.find() calls per iteration to locate <tag>, <tag , and </tag> patterns, then constructs a candidate list to determine which appears first. This results in redundant scanning of the same content regions multiple times.
The optimized version instead:
- Finds the next < character once with content.find("<", pos)
- Uses content.startswith() at that position to check if it's a relevant opening or closing tag
- Eliminates the candidate list construction and min() operation
Why This Is Faster:
- Reduced string searches: One find("<") call instead of three find() calls searching for longer patterns
- Earlier bailout: When no < is found, we immediately return -1 without further checks
- Eliminated allocations: No list comprehension creating the candidates list on each iteration
- Better locality: startswith() checks are O(k) where k is the tag length, performed only once at the found position
Performance Characteristics:
The test results show the optimization excels with:
- Nested same-name tags: test_large_nested_tags_scalability shows 680% speedup (713μs → 91.5μs) for 200 nested levels
- Simple structures: Most simple cases show 50-100% speedup (e.g., test_basic_single_pair 55.9% faster)
- Missing closing tags: test_performance_with_large_string_no_match shows 745% speedup (13.7μs → 1.62μs)
The optimization is considerably slower on content with many differently named tags (e.g., test_large_content_simple: 6.07μs → 62.7μs, 90% slower) because it must stop at every < character, including those that belong to tags unrelated to the target. However, the total runtime across the benchmark suite still improves (1.01 milliseconds → 548 microseconds), driven by the nested same-name and sequential-scanning cases, which is the basis for calling this a worthwhile trade-off.
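The single-scan strategy described above can be sketched as a standalone function. This is a minimal illustration, not the code in build_tools.py: the name find_closing_tag_sketch is hypothetical, and the initial values of depth and pos are assumptions, since the function's setup lines are not shown in this comment.

```python
def find_closing_tag_sketch(content: str, start_pos: int, tag_name: str) -> int:
    """Return the index of the closing tag that matches the opening tag at start_pos, or -1."""
    open_tag = f"<{tag_name}>"
    open_tag_short = f"<{tag_name} "  # opening tag followed by attributes
    close_tag = f"</{tag_name}>"

    # Assumption: start_pos points at the '<' of the opening tag, which counts as depth 1.
    depth = 1
    pos = start_pos + 1

    while True:
        # One search per iteration: jump to the next '<' of any kind.
        next_lt = content.find("<", pos)
        if next_lt == -1:
            return -1
        if content.startswith(close_tag, next_lt):
            depth -= 1
            if depth == 0:
                return next_lt
            pos = next_lt + len(close_tag)
        elif content.startswith((open_tag, open_tag_short), next_lt):
            # Nested opening tag with the same name.
            depth += 1
            pos = next_lt + 1
        else:
            # A '<' that belongs to some other tag; skip past it.
            pos = next_lt + 1
```

The final else branch is where the slowdown on wide documents comes from: every unrelated tag costs one loop iteration, whereas the three-find version skips over them inside a single C-level search.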
✅ Correctness verification report:
| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 53 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | ✅ 3 Passed |
| 📊 Tests Coverage | 100.0% |
🌀 Click to see Generated Regression Tests
from __future__ import annotations
# imports
import pytest # used for our unit tests
from codeflash.languages.java.build_tools import _find_closing_tag
def test_basic_single_pair():
    # Basic: single matching pair should return the index of the closing tag
    content = "<root>hello</root>"
    start = content.find("<root") # position of the opening tag
    expected_close = content.find("</root>") # expected position of closing tag
    # The function should find the closing tag start index
    codeflash_output = _find_closing_tag(content, start, "root") # 2.65μs -> 1.70μs (55.9% faster)

def test_nested_same_tag_simple():
    # Nested tags of same name: outer must match its own closing tag, not inner
    content = "<a><a>inner</a>outer</a>"
    start_outer = content.find("<a>") # first opening tag
    # expected closing for outermost is the last occurrence of "</a>"
    expected_outer_close = content.rfind("</a>")
    codeflash_output = _find_closing_tag(content, start_outer, "a") # 5.10μs -> 2.63μs (93.5% faster)

def test_with_attributes_and_spaces():
    # Opening tags with attributes (using "<tag " form) must be recognized as openings
    content = "<tag attr='1'>text<tag attr2='2'>inner</tag></tag>"
    start = content.find("<tag") # first opening (with attributes)
    expected_close = content.rfind("</tag>")
    codeflash_output = _find_closing_tag(content, start, "tag") # 5.09μs -> 2.60μs (96.1% faster)

def test_missing_closing_returns_minus_one():
    # When a closing tag is missing entirely, the function should return -1
    content = "<x>no close here"
    start = content.find("<x")
    codeflash_output = _find_closing_tag(content, start, "x") # 1.75μs -> 1.36μs (28.7% faster)

def test_similar_tag_names_not_confused():
    # Ensure tags with similar names (e.g., <a> vs <ab>) do not confuse matching
    content = "<a><ab></ab></a>"
    start = content.find("<a")
    expected_close = content.find("</a>")
    # The function should match the </a> closing tag, not get fooled by <ab>
    codeflash_output = _find_closing_tag(content, start, "a") # 2.58μs -> 2.50μs (3.61% faster)

def test_self_closing_tag_returns_minus_one():
    # Self-closing tags like <a/> have no corresponding </a>, so result should be -1
    content = "<a/>"
    start = content.find("<a")
    # Even though start points to the tag, there is no closing tag, so expect -1
    codeflash_output = _find_closing_tag(content, start, "a") # 1.55μs -> 1.27μs (22.1% faster)

def test_start_pos_not_zero_and_multiple_instances():
    # When there are multiple sibling tags, ensure we can target the second one by start_pos
    content = "pre<a>one</a><a>two</a>post"
    # locate the second <a> by searching after the first one
    first = content.find("<a>")
    second = content.find("<a>", first + 1)
    expected_close_second = content.find("</a>", second)
    # The function should find the closing tag corresponding to the second opening
    codeflash_output = _find_closing_tag(content, second, "a") # 2.35μs -> 1.43μs (64.3% faster)

def test_open_tag_with_space_only_and_plain_variant_later():
    # If only an open_tag_short appears (i.e., "<tag " with attributes) before a closing,
    # the algorithm must still count it as an opening.
    content = "<b attr=1><b>inner</b></b>"
    start = content.find("<b")
    # ensure that the outer closing is matched
    expected_close_outer = content.rfind("</b>")
    codeflash_output = _find_closing_tag(content, start, "b") # 4.91μs -> 2.40μs (105% faster)
def test_partial_start_pos_inside_opening_still_finds_closing():
    # If start_pos is slightly offset (caller error), the code still attempts to find a closing.
    # This ensures the function is somewhat robust to non-zero offsets inside the opening tag.
    content = "<a>text</a>"
    actual_open = content.find("<a>")
    # pick a start_pos one character after the '<' (inside the opening)
    start_offset = actual_open + 1
    # Even if start_pos is not exactly the '<', the function should still locate the closing tag
    expected_close = content.find("</a>")
    codeflash_output = _find_closing_tag(content, start_offset, "a") # 2.36μs -> 1.44μs (63.8% faster)

def test_multiple_opening_variants_only_open_tag_short_exists():
    # Only "<tag " variant exists (no plain "<tag>") - ensure detection of nested openings works
    content = "<div class='x'><div id='y'></div></div>"
    start = content.find("<div")
    expected_close = content.rfind("</div>")
    codeflash_output = _find_closing_tag(content, start, "div") # 4.86μs -> 2.60μs (86.5% faster)

def test_large_nested_tags_scalability():
    # Large-scale nested tags to test stack/depth handling but keep under 1000 elements.
    # Create 200 nested tags: <t><t>...x...</t></t>...
    depth = 200
    open_tags = "<t>" * depth
    close_tags = "</t>" * depth
    content = open_tags + "X" + close_tags
    # start position of the outermost opening tag
    start = content.find("<t")
    # The closing index for the outermost is the last </t>
    expected_outer_close = content.rfind("</t>")
    # The function should handle many nested levels and return the outermost closing index
    codeflash_output = _find_closing_tag(content, start, "t") # 713μs -> 91.5μs (680% faster)

def test_interleaved_other_tags_do_not_affect_depth():
    # Tags of other names between nested tags should not affect counting for the target tag_name.
    content = "<x><a><b></b><a><b></b></a></a></x>"
    # There are nested <a> tags with other tags interleaved; find the outermost <a>
    start = content.find("<a")
    # expected closing is the last </a> corresponding to the outermost
    expected_close = content.rfind("</a>")
    codeflash_output = _find_closing_tag(content, start, "a") # 5.06μs -> 3.96μs (27.8% faster)

def test_no_opening_tag_at_start_pos_returns_minus_one_or_misleading():
    # If start_pos points past any opening tag (e.g., at end of content), the function should return -1
    content = "<z></z>"
    # choose a start_pos beyond content length to simulate incorrect caller input
    start = len(content) + 5
    # Since pos will be >= len(content), the while loop will not execute and -1 is returned
    codeflash_output = _find_closing_tag(content, start, "z") # 1.12μs -> 1.28μs (12.5% slower)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import pytest
from codeflash.languages.java.build_tools import _find_closing_tag
def test_simple_single_tag():
    """Test finding closing tag for a simple tag with no nesting."""
    content = "<root>content</root>"
    codeflash_output = _find_closing_tag(content, 0, "root"); result = codeflash_output # 2.75μs -> 1.78μs (54.0% faster)

def test_simple_tag_with_content():
    """Test finding closing tag for a tag containing text content."""
    content = "<div>Hello World</div>"
    codeflash_output = _find_closing_tag(content, 0, "div"); result = codeflash_output # 2.67μs -> 1.81μs (47.5% faster)

def test_tag_with_whitespace_content():
    """Test finding closing tag when content contains whitespace."""
    content = "<span> </span>"
    codeflash_output = _find_closing_tag(content, 0, "span"); result = codeflash_output # 2.67μs -> 1.73μs (53.8% faster)

def test_empty_tag():
    """Test finding closing tag for an empty tag."""
    content = "<empty></empty>"
    codeflash_output = _find_closing_tag(content, 0, "empty"); result = codeflash_output # 2.58μs -> 1.63μs (57.6% faster)

def test_tag_with_attributes():
    """Test finding closing tag for a tag with attributes."""
    content = '<element class="test">content</element>'
    codeflash_output = _find_closing_tag(content, 0, "element"); result = codeflash_output # 2.58μs -> 1.68μs (53.6% faster)

def test_tag_with_multiple_attributes():
    """Test finding closing tag for a tag with multiple attributes."""
    content = '<div id="main" class="container">text</div>'
    codeflash_output = _find_closing_tag(content, 0, "div"); result = codeflash_output # 2.70μs -> 1.79μs (50.3% faster)

def test_no_closing_tag():
    """Test when closing tag is missing - should return -1."""
    content = "<root>content"
    codeflash_output = _find_closing_tag(content, 0, "root"); result = codeflash_output # 1.79μs -> 1.42μs (26.2% faster)

def test_nested_tags_one_level():
    """Test finding closing tag with one level of nesting."""
    content = "<parent><child></child></parent>"
    codeflash_output = _find_closing_tag(content, 0, "parent"); result = codeflash_output # 2.67μs -> 2.67μs (0.000% faster)

def test_nested_tags_multiple_levels():
    """Test finding closing tag with multiple levels of nesting."""
    content = "<a><b><c></c></b></a>"
    codeflash_output = _find_closing_tag(content, 0, "a"); result = codeflash_output # 2.75μs -> 3.41μs (19.4% slower)

def test_nested_tags_same_name():
    """Test finding closing tag when nested tags have the same name."""
    content = "<div>outer<div>inner</div>text</div>"
    codeflash_output = _find_closing_tag(content, 0, "div"); result = codeflash_output # 5.21μs -> 2.62μs (98.5% faster)

def test_nested_tags_same_name_multiple():
    """Test multiple nested tags of the same name."""
    content = "<tag>level1<tag>level2</tag>level1</tag>"
    codeflash_output = _find_closing_tag(content, 0, "tag"); result = codeflash_output # 4.81μs -> 2.50μs (92.1% faster)

def test_closing_tag_at_end():
    """Test when closing tag is at the very end of content."""
    content = "<root>text</root>"
    codeflash_output = _find_closing_tag(content, 0, "root"); result = codeflash_output # 2.62μs -> 1.68μs (55.9% faster)
def test_tag_name_is_single_character():
    """Test with single character tag name."""
    content = "<a>content</a>"
    codeflash_output = _find_closing_tag(content, 0, "a"); result = codeflash_output # 2.57μs -> 1.74μs (47.7% faster)

def test_tag_name_is_long():
    """Test with long tag name."""
    content = "<verylongtagnamethatiscomplex>content</verylongtagnamethatiscomplex>"
    codeflash_output = _find_closing_tag(content, 0, "verylongtagnamethatiscomplex"); result = codeflash_output # 2.73μs -> 1.78μs (52.8% faster)

def test_tag_with_numbers():
    """Test tag name containing numbers."""
    content = "<div2>text</div2>"
    codeflash_output = _find_closing_tag(content, 0, "div2"); result = codeflash_output # 2.53μs -> 1.64μs (54.2% faster)

def test_tag_with_hyphens():
    """Test tag name containing hyphens."""
    content = "<my-tag>content</my-tag>"
    codeflash_output = _find_closing_tag(content, 0, "my-tag"); result = codeflash_output # 2.56μs -> 1.71μs (49.6% faster)

def test_nested_different_tags():
    """Test nested tags with different names."""
    content = "<outer><inner>text</inner></outer>"
    codeflash_output = _find_closing_tag(content, 0, "outer"); result = codeflash_output # 2.62μs -> 2.79μs (6.08% slower)

def test_multiple_nested_with_attributes():
    """Test nested tags where some have attributes."""
    content = '<root id="1"><child class="x">content</child></root>'
    codeflash_output = _find_closing_tag(content, 0, "root"); result = codeflash_output # 2.63μs -> 2.58μs (1.93% faster)

def test_tag_with_attribute_containing_tag_like_string():
    """Test tag with attribute value containing tag-like content."""
    content = '<div data="<test>">content</div>'
    codeflash_output = _find_closing_tag(content, 0, "div"); result = codeflash_output # 2.65μs -> 2.28μs (16.2% faster)

def test_start_pos_not_zero():
    """Test when start_pos is not at the beginning."""
    content = "text<root>content</root>more"
    codeflash_output = _find_closing_tag(content, 4, "root"); result = codeflash_output # 2.50μs -> 1.70μs (46.4% faster)

def test_deeply_nested_same_tags():
    """Test deeply nested tags with the same name."""
    content = "<x><x><x></x></x></x>"
    codeflash_output = _find_closing_tag(content, 0, "x"); result = codeflash_output # 6.69μs -> 3.00μs (123% faster)

def test_tag_with_newlines():
    """Test tag with newline characters in content."""
    content = "<div>\nline1\nline2\n</div>"
    codeflash_output = _find_closing_tag(content, 0, "div"); result = codeflash_output # 2.62μs -> 1.72μs (52.4% faster)

def test_tag_with_tabs():
    """Test tag with tab characters in content."""
    content = "<div>\ttab\tcontent\t</div>"
    codeflash_output = _find_closing_tag(content, 0, "div"); result = codeflash_output # 2.52μs -> 1.71μs (47.4% faster)

def test_consecutive_opening_tags():
    """Test multiple consecutive opening tags of the same name."""
    content = "<span><span>text</span></span>"
    codeflash_output = _find_closing_tag(content, 0, "span"); result = codeflash_output # 4.99μs -> 2.56μs (94.5% faster)

def test_tag_after_first_but_before_close():
    """Test when there's another tag between opening and closing."""
    content = "<root><other>text</other></root>"
    codeflash_output = _find_closing_tag(content, 0, "root"); result = codeflash_output # 2.67μs -> 2.69μs (1.11% slower)
def test_closing_tag_without_corresponding_opening():
    """Test when there's a closing tag but it doesn't match our opening."""
    content = "<root>text</other>"
    codeflash_output = _find_closing_tag(content, 0, "root"); result = codeflash_output # 1.75μs -> 2.02μs (13.3% slower)

def test_tag_name_with_underscore():
    """Test tag name with underscore characters."""
    content = "<my_tag>content</my_tag>"
    codeflash_output = _find_closing_tag(content, 0, "my_tag"); result = codeflash_output # 2.63μs -> 1.68μs (56.6% faster)

def test_very_short_content():
    """Test with minimal content - just opening tag."""
    content = "<x>"
    codeflash_output = _find_closing_tag(content, 0, "x"); result = codeflash_output # 1.68μs -> 1.40μs (20.0% faster)

def test_tag_with_self_closing_like_syntax():
    """Test tag that might look self-closing but isn't."""
    content = "<br />content</br>"
    codeflash_output = _find_closing_tag(content, 5, "br"); result = codeflash_output # 2.64μs -> 1.72μs (53.5% faster)

def test_large_content_simple():
    """Test with large content size but simple structure."""
    # Create content with many nested levels (up to 100 levels)
    opening = "".join(f"<tag{i}>" for i in range(100))
    closing = "".join(f"</tag{i}>" for i in range(99, -1, -1))
    content = opening + "CONTENT" + closing
    # Find the closing tag for the first tag
    codeflash_output = _find_closing_tag(content, 0, "tag0"); result = codeflash_output # 6.07μs -> 62.7μs (90.3% slower)

def test_large_content_wide_structure():
    """Test with many tags at the same level."""
    # Create content with many sibling tags
    content = "<root>"
    for i in range(100):
        content += f"<item{i}>content</item{i}>"
    content += "</root>"
    # Find the closing tag for root
    codeflash_output = _find_closing_tag(content, 0, "root"); result = codeflash_output # 6.57μs -> 63.2μs (89.6% slower)

def test_large_nested_tags_finding_correct_close():
    """Test that with many nested tags, we find the correct closing tag."""
    # Create deeply nested structure: <a><b><c>...<z></z>...</c></b></a>
    alphabet = "abcdefghijklmnopqrstuvwxyz"
    opening = "".join(f"<{char}>" for char in alphabet)
    closing = "".join(f"</{char}>" for char in reversed(alphabet))
    content = opening + "CORE" + closing
    # Find the closing tag for 'a' (the outermost)
    codeflash_output = _find_closing_tag(content, 0, "a"); result = codeflash_output # 3.12μs -> 16.8μs (81.4% slower)

def test_large_content_with_many_attributes():
    """Test with large content containing tags with many attributes."""
    # Create a tag with many attributes
    attributes = ' '.join(f'attr{i}="value{i}"' for i in range(50))
    content = f'<root {attributes}>content</root>'
    # Find the closing tag
    codeflash_output = _find_closing_tag(content, 0, "root"); result = codeflash_output # 4.56μs -> 1.88μs (142% faster)

def test_large_content_mixed_nesting():
    """Test with large content containing mixed nesting patterns."""
    # Create content with alternating levels of nesting
    content = "<root>"
    for i in range(50):
        content += f"<level1{i}><level2{i}>content</level2{i}></level1{i}>"
    content += "</root>"
    # Find the closing tag for root
    codeflash_output = _find_closing_tag(content, 0, "root"); result = codeflash_output # 6.81μs -> 62.9μs (89.2% slower)
def test_large_content_same_name_nesting():
    """Test with many nested tags of the same name."""
    # Create content with 50 levels of the same tag nested
    content = ""
    for i in range(50):
        content += "<div>"
    content += "CONTENT"
    for i in range(50):
        content += "</div>"
    # Find the closing tag for the first div
    codeflash_output = _find_closing_tag(content, 0, "div"); result = codeflash_output # 102μs -> 24.2μs (325% faster)

def test_large_content_finding_middle_tag():
    """Test finding a closing tag for a tag in the middle of large content."""
    # Create content with multiple root-level tags
    content = "<root1>content</root1>"
    content += "<root2><nested>content</nested></root2>"
    for i in range(50):
        content += f"<item{i}>content</item{i}>"
    # Find the closing tag for root2 which has nesting
    start_pos = content.find("<root2>")
    codeflash_output = _find_closing_tag(content, start_pos, "root2"); result = codeflash_output # 3.87μs -> 2.58μs (49.6% faster)

def test_performance_with_large_string_no_match():
    """Test performance when there's no closing tag in large content."""
    # Create large content without closing tag
    content = "<root>" + "x" * 10000
    # Should return -1 efficiently
    codeflash_output = _find_closing_tag(content, 0, "root"); result = codeflash_output # 13.7μs -> 1.62μs (745% faster)

def test_large_content_multiple_tag_searches():
    """Test finding closing tags for multiple tags in large content."""
    # Create content with nested different tag types
    content = "<wrapper>"
    for i in range(100):
        content += f"<container{i}><item>data</item></container{i}>"
    content += "</wrapper>"
    # Find the closing tag for wrapper
    codeflash_output = _find_closing_tag(content, 0, "wrapper"); result = codeflash_output # 7.97μs -> 123μs (93.5% slower)

def test_large_content_with_special_characters():
    """Test large content with special characters in values."""
    # Create content with special characters
    special_chars = "!@#$%^&*()_+-=[]{}|;:',.<>?/~`"
    content = f"<root data=\"{special_chars * 10}\">content</root>"
    # Find the closing tag
    codeflash_output = _find_closing_tag(content, 0, "root"); result = codeflash_output # 3.24μs -> 5.34μs (39.4% slower)
def test_large_content_with_xml_entities():
    """Test large content with XML entities."""
    # Create content with XML entities
    content = "<root>Text with &lt; &gt; &amp; entities</root>"
    # Find the closing tag
    codeflash_output = _find_closing_tag(content, 0, "root"); result = codeflash_output # 2.69μs -> 1.73μs (54.9% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
from codeflash.languages.java.build_tools import _find_closing_tag
def test__find_closing_tag():
    _find_closing_tag('<></>', -1, '')
def test__find_closing_tag_2():
    _find_closing_tag('', -2, '')
def test__find_closing_tag_3():
    _find_closing_tag('</>', -1, '')
🔎 Click to see Concolic Coverage Tests
| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| codeflash_concolic_34v0t72u/tmpmp8y47yq/test_concolic_coverage.py::test__find_closing_tag | 4.23μs | 2.50μs | 69.5%✅ |
| codeflash_concolic_34v0t72u/tmpmp8y47yq/test_concolic_coverage.py::test__find_closing_tag_2 | 1.79μs | 1.44μs | 24.3%✅ |
| codeflash_concolic_34v0t72u/tmpmp8y47yq/test_concolic_coverage.py::test__find_closing_tag_3 | 2.48μs | 1.67μs | 47.9%✅ |
To test or edit this optimization locally, run: git merge codeflash/optimize-pr1199-2026-02-01T23.32.35
Click to see suggested changes
-    while pos < len(content):
-        next_open = content.find(open_tag, pos)
-        next_open_short = content.find(open_tag_short, pos)
-        next_close = content.find(close_tag, pos)
-        if next_close == -1:
-            return -1
-        # Find the earliest opening tag (if any)
-        candidates = [x for x in [next_open, next_open_short] if x != -1 and x < next_close]
-        next_open_any = min(candidates) if candidates else len(content) + 1
-        if next_open_any < next_close:
-            # Found opening tag first - nested tag
-            depth += 1
-            pos = next_open_any + 1
-        else:
-            # Found closing tag first
-            depth -= 1
-            if depth == 0:
-                return next_close
-            pos = next_close + len(close_tag)
+    len_close = len(close_tag)
+    # Scan for the next '<' and then determine whether it's an open/close of interest.
+    while True:
+        next_lt = content.find("<", pos)
+        if next_lt == -1:
+            return -1
+        # Check for the relevant closing tag first
+        if content.startswith(close_tag, next_lt):
+            # Found closing tag first
+            depth -= 1
+            if depth == 0:
+                return next_lt
+            pos = next_lt + len_close
+            continue
+        # Check for nested opening tags of the exact forms we consider
+        if content.startswith(open_tag, next_lt) or content.startswith(open_tag_short, next_lt):
+            depth += 1
+            pos = next_lt + 1
+            continue
+        # Not an open/close we're tracking; move on
+        pos = next_lt + 1
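The single-pass scan in the suggested change can be tried in isolation. The following is a minimal sketch, not the PR's actual helper: the name `find_closing_tag_sketch`, the three tag forms (`<name>`, `<name ` and `</name>`), and the initial `depth = 1` / `pos = start_pos + 1` setup are assumptions, because the surrounding function body is not part of the diff shown here.

```python
def find_closing_tag_sketch(content: str, start_pos: int, tag_name: str) -> int:
    """Return the index of the closing tag that matches the tag opened at start_pos, or -1."""
    # Assumed tag forms; the real helper's setup code is not shown in the diff.
    open_tag = f"<{tag_name}>"
    open_tag_short = f"<{tag_name} "  # opening tag that carries attributes
    close_tag = f"</{tag_name}>"
    len_close = len(close_tag)

    depth = 1            # we start inside the tag opened at start_pos (assumption)
    pos = start_pos + 1  # skip the '<' of that opening tag

    while True:
        next_lt = content.find("<", pos)
        if next_lt == -1:
            return -1  # ran out of tags before the depth returned to zero
        if content.startswith(close_tag, next_lt):
            depth -= 1
            if depth == 0:
                return next_lt
            pos = next_lt + len_close
            continue
        if content.startswith(open_tag, next_lt) or content.startswith(open_tag_short, next_lt):
            depth += 1  # nested tag with the same name
        pos = next_lt + 1  # move past this '<' in every non-closing case
```

For example, on `<build><plugins><build>x</build></plugins></build>` the sketch skips the nested `<build>` pair and returns the index of the last `</build>`.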
…benchmarking
- Add inner loop in Java test instrumentation for JIT warmup within single JVM
- Implement compile-once-run-many: compile tests once with Maven, then run directly via JUnit Console Launcher (~500ms vs ~5-10s per invocation)
- Add fallback to Maven-based execution when direct execution fails
- Update parsing to handle JUnit Console Launcher output format
- Add inner_iterations parameter (default: 100) to control loop count
- Add comprehensive E2E tests for inner loop benchmarking
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
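The inner-loop idea can be illustrated outside Java: call the same code repeatedly inside one process, record each iteration's time, and report the minimum so that warmed-up iterations dominate the result. This Python sketch is only an analogy for the Java instrumentation; `run_inner_loop` and `best_runtime_ns` are hypothetical names, and only the default of 100 iterations comes from the commit message.

```python
from __future__ import annotations

import time
from typing import Callable


def run_inner_loop(fn: Callable[[], object], inner_iterations: int = 100) -> list[int]:
    """Call fn repeatedly in the same process and return per-iteration timings in nanoseconds."""
    timings_ns = []
    for _ in range(inner_iterations):
        start = time.perf_counter_ns()
        fn()
        timings_ns.append(time.perf_counter_ns() - start)
    return timings_ns


def best_runtime_ns(timings_ns: list[int]) -> int:
    """Use the minimum as the representative runtime, so cold first iterations are ignored."""
    return min(timings_ns)
```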
Configure JUnit Console Launcher to capture stdout/stderr in XML reports:
- Add --config=junit.platform.output.capture.stdout=true
- Add --config=junit.platform.output.capture.stderr=true
- Change --details=verbose to --details=none to avoid duplicate output
This ensures timing markers are properly captured in the JUnit XML's <system-out> element, eliminating the need to rely on subprocess stdout fallback for parsing timing markers.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
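A rough sketch of how these flags might be assembled and how captured output could then be read back from a report. The three `--config`/`--details` values are taken from the commit message; everything else is an assumption: the function names, the `java -jar` invocation shape, and the use of `--class-path`, `--select-class` and `--reports-dir` are illustrative and may differ from what the PR does.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET


def build_console_launcher_args(launcher_jar: str, classpath: str, test_class: str, reports_dir: str) -> list[str]:
    """Build an argv list for the JUnit Console Launcher (invocation shape is an assumption)."""
    return [
        "java", "-jar", launcher_jar,
        "--class-path", classpath,
        "--select-class", test_class,
        "--reports-dir", reports_dir,
        # Flags named in the commit message:
        "--details=none",
        "--config=junit.platform.output.capture.stdout=true",
        "--config=junit.platform.output.capture.stderr=true",
    ]


def collect_system_out(junit_xml: str) -> list[str]:
    """Return the text of every <system-out> element in a JUnit XML report."""
    root = ET.fromstring(junit_xml)
    return [element.text or "" for element in root.iter("system-out")]
```

Reading `<system-out>` from the report rather than from the subprocess's stdout keeps each test's output attached to its own `<testcase>` element.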
        part_text = source_bytes[child.start_byte : child.end_byte].decode("utf8")
        parts.append(part_text)

    return " ".join(parts).strip()
⚡️Codeflash found 33% (0.33x) speedup for _extract_type_declaration in codeflash/languages/java/context.py
⏱️ Runtime : 133 microseconds → 100 microseconds (best of 15 runs)
📝 Explanation and details
The optimized code achieves a 33% runtime improvement (from 133μs to 100μs) by deferring UTF-8 decoding until after joining all byte slices together, rather than decoding each part individually.
Key Optimization:
The original code decoded each child node's byte slice immediately:
part_text = source_bytes[child.start_byte : child.end_byte].decode("utf8")
parts.append(part_text)
return " ".join(parts).strip()
The optimized code collects raw byte slices first, then performs a single decode operation:
parts.append(source_bytes[child.start_byte : child.end_byte])
return b" ".join(parts).decode("utf8").strip()
Why This is Faster:
- Reduced decode operations: Instead of calling decode("utf8") once per child node (~527 times in profiled runs), the optimization calls it just once on the final joined bytes
- Byte-level joining: b" ".join() on bytes is faster than " ".join() on strings, as it operates on raw bytes without character encoding overhead
- Better memory efficiency: Avoids creating intermediate string objects for each part
Performance Impact by Test Case:
The optimization shows particularly strong gains on tests with many tokens:
- 37.6% faster on large-scale test with 500 tokens
- 15-16% faster on typical multi-token declarations (interface, enum, unknown types)
- Neutral/slight regression on trivial cases (empty children) where the overhead is negligible
Line Profiler Evidence:
In the original, line 27 (slice + decode) accounted for 34.3% of the time and appending the decoded strings for another 23.2%. In the optimized version, line 26 (slice + append) accounts for 44.2% of a total that is 23% smaller, and the single decode at the return takes only 3.1%.
This optimization is particularly valuable for parsing Java files with complex type declarations containing many modifiers, annotations, and generic type parameters.
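The two strategies can be compared side by side. This is a minimal sketch rather than the code in `context.py`: the function names and the `SimpleNamespace` stand-ins for tree-sitter nodes are assumptions, and the body-node check of the real function is omitted.

```python
from types import SimpleNamespace


def join_decode_each(children, source_bytes: bytes) -> str:
    """Original strategy: decode every slice, then join the strings."""
    parts = [source_bytes[c.start_byte : c.end_byte].decode("utf8") for c in children]
    return " ".join(parts).strip()


def join_decode_once(children, source_bytes: bytes) -> str:
    """Optimized strategy: join the raw byte slices, then decode once."""
    parts = [source_bytes[c.start_byte : c.end_byte] for c in children]
    return b" ".join(parts).decode("utf8").strip()


def make_children(source_bytes: bytes, tokens: list) -> list:
    """Build node-like objects whose byte offsets point at each token in source_bytes."""
    children, offset = [], 0
    for token in tokens:
        token_bytes = token.encode("utf8")
        start = source_bytes.index(token_bytes, offset)
        end = start + len(token_bytes)
        children.append(SimpleNamespace(start_byte=start, end_byte=end))
        offset = end
    return children
```

The two functions agree whenever each slice is valid UTF-8 on its own, which is assumed to hold here because node boundaries fall between tokens rather than inside a multi-byte character.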
✅ Correctness verification report:
| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 8 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Click to see Generated Regression Tests
from __future__ import annotations
from types import SimpleNamespace  # used to create lightweight node-like objects
# imports
import pytest  # used for our unit tests
from codeflash.languages.java.context import _extract_type_declaration
from tree_sitter import Node
# Helper utilities for tests ---------------------------------------------------
def _make_children_from_tokens_and_body(source: bytes, token_texts: list[str], body_index: int | None, body_type_name: str):
    """
    Construct a list of SimpleNamespace children where each token corresponds to a
    slice in `source`. Tokens are expected to appear in `source` separated by a single
    space. `body_index` indicates the index in token_texts at which a body node should
    be inserted; if None, no body node is inserted.
    Each produced child has attributes: type, start_byte, end_byte.
    """
    children = []
    # locate tokens sequentially in source to compute byte offsets
    offset = 0
    # Copy token_texts to avoid mutating caller's list
    for idx, token in enumerate(token_texts):
        # find token starting at or after offset
        token_bytes = token.encode("utf8")
        pos = source.find(token_bytes, offset)
        if pos == -1:
            raise ValueError(f"Token {token!r} not found in source (from offset {offset}).")
        start = pos
        end = pos + len(token_bytes)
        children.append(SimpleNamespace(type="token", start_byte=start, end_byte=end))
        offset = end + 1  # assume tokens separated by at least one byte (space)
    # Insert body node if requested. Body will cover from the start of the token at body_index to end of source
    if body_index is not None:
        # Determine where the body token starts; it should be the token at body_index
        if not (0 <= body_index < len(children)):
            # if body_index points past tokens, place body at the end
            body_start = len(source)
        else:
            body_start = children[body_index].start_byte
        body_child = SimpleNamespace(type=body_type_name, start_byte=body_start, end_byte=len(source))
        # place body child at the end of the children list (function only checks type and breaks)
        children.append(body_child)
    return children
def test_interface_declaration_stops_before_interface_body():
    # Interface should use 'interface_body' as the body node name and stop before it.
    source_str = "public interface MyInterface extends BaseInterface { void foo(); }"
    source = source_str.encode("utf8")
    tokens = ["public", "interface", "MyInterface", "extends", "BaseInterface"]
    # body_index points to the token position where we consider the body starts (token count)
    children = _make_children_from_tokens_and_body(source, tokens, body_index=5, body_type_name="interface_body")
    node = SimpleNamespace(children=children)
    codeflash_output = _extract_type_declaration(node, source, "interface"); decl = codeflash_output # 3.67μs -> 3.18μs (15.4% faster)
def test_enum_without_body_returns_all_parts():
    # If no enum_body node exists among children, function should not break early and should include all parts.
    source_str = "public enum Color RED GREEN BLUE"
    source = source_str.encode("utf8")
    tokens = ["public", "enum", "Color"]
    # Do not insert a body node. The function should return everything from the supplied children.
    children = _make_children_from_tokens_and_body(source, tokens, body_index=None, body_type_name="enum_body")
    node = SimpleNamespace(children=children)
    codeflash_output = _extract_type_declaration(node, source, "enum"); decl = codeflash_output # 2.81μs -> 2.54μs (10.2% faster)
def test_empty_children_returns_empty_string():
    # Edge case: type_node has no children -> return empty string (after join & strip)
    node = SimpleNamespace(children=[])
    source = b""
    codeflash_output = _extract_type_declaration(node, source, "class"); decl = codeflash_output # 1.32μs -> 1.34μs (1.49% slower)
def test_unknown_type_kind_defaults_to_class_body():
    # If type_kind is unknown, body_type defaults to 'class_body'
    source_str = "myModifier customType Foo extends Bar { body }"
    source = source_str.encode("utf8")
    tokens = ["myModifier", "customType", "Foo", "extends", "Bar"]
    # Insert a 'class_body' child so unknown maps to class_body and the function stops before it
    children = _make_children_from_tokens_and_body(source, tokens, body_index=5, body_type_name="class_body")
    node = SimpleNamespace(children=children)
    codeflash_output = _extract_type_declaration(node, source, "unknown_kind"); decl = codeflash_output # 3.76μs -> 3.23μs (16.5% faster)
def test_child_with_empty_slice_produces_empty_segment():
    # If a child has start_byte == end_byte, that yields an empty decoded string.
    # The function will include it as an element; the final join will contain extra space for it.
    # Construct source and children manually where one child corresponds to an empty slice.
    source_str = "public class MyClass"
    source = source_str.encode("utf8")
    # Create two real children for 'public' and 'class' and a third child that's empty (start=end)
    # The third child will contribute an empty string and show up as an additional space once joined.
    # We then append the name child and a body to stop before.
    public_pos = source.find(b"public")
    class_pos = source.find(b"class")
    name_pos = source.find(b"MyClass")
    # children as SimpleNamespace objects
    children = [
        SimpleNamespace(type="token", start_byte=public_pos, end_byte=public_pos + len(b"public")),
        SimpleNamespace(type="token", start_byte=class_pos, end_byte=class_pos + len(b"class")),
        SimpleNamespace(type="token", start_byte=10, end_byte=10),  # empty slice in the middle
        SimpleNamespace(type="token", start_byte=name_pos, end_byte=name_pos + len(b"MyClass")),
        SimpleNamespace(type="class_body", start_byte=name_pos + len(b"MyClass") + 1, end_byte=len(source)),
    ]
    node = SimpleNamespace(children=children)
    codeflash_output = _extract_type_declaration(node, source, "class"); decl = codeflash_output # 3.32μs -> 2.87μs (15.7% faster)
def test_large_number_of_tokens_stops_at_body_and_scales_correctly():
    # Large scale test with many tokens (but under 1000).
    # Ensure the function correctly concatenates many parts and stops at the body node.
    n = 500  # number of tokens to include before body
    tokens = [f"T{i}" for i in range(n)]
    # Build source: tokens separated by spaces, then a body starting with '{'
    source_str = " ".join(tokens) + " {" + " body" + " }"
    source = source_str.encode("utf8")
    # Construct children corresponding to tokens and then the body node
    children = _make_children_from_tokens_and_body(source, tokens, body_index=n, body_type_name="class_body")
    node = SimpleNamespace(children=children)
    codeflash_output = _extract_type_declaration(node, source, "class"); decl = codeflash_output # 113μs -> 82.4μs (37.6% faster)
    # The declaration should be exactly the tokens joined by single spaces
    expected = " ".join(tokens)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
import pytest
from codeflash.languages.java.context import _extract_type_declaration
from tree_sitter import Language, Node, Parser
# Helper function to create a tree-sitter node for testing
def _get_parser():
    """Create and return a tree-sitter parser for Java."""
    JAVA_LANGUAGE = Language("build/my-languages.so", "java")
    parser = Parser()
    parser.set_language(JAVA_LANGUAGE)
    return parser
def _parse_java_code(code: str) -> Node:
    """Parse Java code and return the root node."""
    parser = _get_parser()
    tree = parser.parse(code.encode("utf8"))
    return tree.root_node
def _find_type_node(root: Node, type_kind: str) -> Node:
    """Find the first type declaration node of the given kind."""
    def traverse(node: Node) -> Node | None:
        if node.type == type_kind:
            return node
        for child in node.children:
            result = traverse(child)
            if result:
                return result
        return None
    return traverse(root)
def test_empty_class_name():
    """Test that function handles class nodes properly (tree-sitter should parse valid Java)."""
    code = "public class {} "
To test or edit this optimization locally, run: git merge codeflash/optimize-pr1199-2026-02-02T00.37.05
-        part_text = source_bytes[child.start_byte : child.end_byte].decode("utf8")
-        parts.append(part_text)
-    return " ".join(parts).strip()
+        parts.append(source_bytes[child.start_byte : child.end_byte])
+    return b" ".join(parts).decode("utf8").strip()
feat: add inner loop and compile-once-run-many optimization for Java benchmarking
- Fix multi-module Maven project detection for projects where tests are in a submodule within the same project root (e.g., test/src/...)
- Add fallback to Maven-based execution when JUnit Console Launcher is not available (JUnit 4 projects don't have it)
- Prefer benchmarking_file_path over behavior path in module detection
Tested on aerospike-client-java with JUnit 4:
- Multi-module detection now correctly identifies 'test' module
- Fallback to Maven execution works for JUnit 4 projects
- JIT warmup effect captured: 13,363x speedup from using min runtime
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add support for Java optimizations that include new class-level members:
- Static fields (e.g., lookup tables like BYTE_TO_HEX)
- Helper methods (e.g., createByteToHex())
- Precomputed arrays
Changes:
- Add _add_java_class_members() in code_replacer.py to detect and insert new class members from optimized code into the original source
- Update _add_global_declarations_for_language() to handle Java
- Add ParsedOptimization dataclass and supporting functions in replacement.py
- Exclude target functions from being added as helpers (they're replaced)
Tests:
- Add TestOptimizationWithStaticFields (3 tests)
- Add TestOptimizationWithHelperMethods (2 tests)
- Add TestOptimizationWithFieldsAndHelpers (2 tests including real-world bytesToHexString optimization pattern)
All 28 Java replacement tests and 32 instrumentation tests pass.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…rs exist
Previously, the benchmark loop stopped immediately when Maven returned non-zero (any test failure). This was too aggressive because:
- Generated tests may have some failures
- Passing tests still produce valid timing markers
- We need multiple loops for accurate measurements
Now the loop continues if timing markers are present, only stopping when:
- No timing markers are found (all tests failed)
- Target duration is reached
- Max loops is reached
This allows proper multi-loop benchmarking even when some generated tests fail, improving measurement accuracy.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
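The stopping rules described in this commit could look roughly like the sketch below. All names are hypothetical (`LoopResult`, `run_benchmark_loops`, the injected `run_once` callable); the actual loop in the PR is not shown here, and the timing-marker format is left opaque.

```python
from __future__ import annotations

import time
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class LoopResult:
    returncode: int  # non-zero when any test failed
    timing_markers: list[str] = field(default_factory=list)


def run_benchmark_loops(run_once: Callable[[], LoopResult], max_loops: int, target_duration_s: float) -> list[str]:
    """Collect timing markers across loops, tolerating non-zero return codes."""
    collected: list[str] = []
    start = time.monotonic()
    for _ in range(max_loops):  # stop: max loops reached
        result = run_once()
        if not result.timing_markers:
            break  # stop: no timing markers found (all tests failed)
        # A non-zero returncode alone does not end the loop.
        collected.extend(result.timing_markers)
        if time.monotonic() - start >= target_duration_s:
            break  # stop: target duration reached
    return collected
```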
- Add index-based tracking for overloaded methods to ensure correct method is replaced when multiple methods share the same name
- Match target method by line number (with 5-line tolerance) when multiple overloads exist
- Track overload index to re-find correct method after class member insertion which shifts line numbers
- Improve error logging in test compilation to show both stdout/stderr
- Use -e flag instead of -q for Maven compilation to show errors
- Add comprehensive test for overloaded method replacement
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
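One way to picture the overload handling: pick the overload whose start line is closest to the target (within the 5-line tolerance), remember its position among the same-named methods, and use that position to find it again after line numbers have shifted. This is a sketch under assumptions; `MethodInfo`, `find_overload_index` and `refind_by_index` are hypothetical names, and the closest-line rule is one possible reading of "match by line number".

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class MethodInfo:
    name: str
    start_line: int


def find_overload_index(methods: list[MethodInfo], name: str, target_line: int, tolerance: int = 5) -> int | None:
    """Return the index (among methods named `name`) of the overload nearest to target_line."""
    candidates = [m for m in methods if m.name == name]
    if not candidates:
        return None
    if len(candidates) == 1:
        return 0  # no ambiguity, so no line matching is needed
    index, best = min(enumerate(candidates), key=lambda pair: abs(pair[1].start_line - target_line))
    if abs(best.start_line - target_line) > tolerance:
        return None  # nothing close enough to the expected line
    return index


def refind_by_index(methods: list[MethodInfo], name: str, index: int) -> MethodInfo | None:
    """Re-locate an overload by its index after edits have shifted line numbers."""
    candidates = [m for m in methods if m.name == name]
    return candidates[index] if 0 <= index < len(candidates) else None
```

The index stays valid as long as inserted class members do not add or reorder methods with the same name, which is why the commit excludes target functions from being added as helpers.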
No description provided.