-
Notifications
You must be signed in to change notification settings - Fork 21
fix: prevent duplicate and wrong test-to-function associations in Java #1279
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: prevent duplicate and wrong test-to-function associations in Java #1279
Conversation
Fixed two critical bugs in Java test discovery that caused incorrect
test-to-function mappings:
## Bug 1: Duplicate Test Associations
**Problem**: The function_map contained duplicate keys (both func.name and
func.qualified_name pointing to the same object). When iterating over the map
in Strategy 1, each function was processed twice, causing duplicate test
associations.
**Example**:
- function_map['fibonacci'] → fibonacci function
- function_map['Calculator.fibonacci'] → fibonacci function (same object!)
When matching testFibonacci, it would match TWICE and get added TWICE.
**Fix**: Added duplicate check in Strategy 1 (line 117):
```python
if func_info.qualified_name not in matched:
matched.append(func_info.qualified_name)
```
## Bug 2: Wrong Test Associations
**Problem**: Strategy 3 (class naming convention) was too broad. It would
associate ALL methods in a class with EVERY test in that class's test file.
**Example**:
- CalculatorTest has testFibonacci and testSumRange
- Strategy 3 strips "Test" → "Calculator"
- Finds ALL methods in Calculator class (fibonacci, sumRange)
- Associates BOTH with EVERY test
Result:
- testFibonacci incorrectly associated with sumRange
- testSumRange incorrectly associated with fibonacci
**Fix**: Made Strategy 3 a fallback - only runs if no matches found yet:
```python
if not matched and test_method.class_name:
```
## Impact
**Before**:
```
Calculator.fibonacci → 3 tests:
- testFibonacci
- testFibonacci (duplicate!)
- testSumRange (wrong!)
Calculator.sumRange → 3 tests:
- testFibonacci (wrong!)
- testSumRange
- testSumRange (duplicate!)
```
**After**:
```
Calculator.fibonacci → 1 test:
- testFibonacci ✓
Calculator.sumRange → 1 test:
- testSumRange ✓
```
## Testing
✅ All 24 test discovery tests pass
✅ Verified with real Java project (java-test-project)
✅ Each test now correctly maps to only its target function
This fix is critical for optimization correctness - wrong test associations
would cause incorrect behavior verification and benchmarking results.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
79c7a06 to
ab008c9
Compare
- Add CODEFLASH_API_KEY for test_instrumentation.py tests that instantiate Optimizer - Create pom.xml for codeflash-java-runtime with Gson and SQLite JDBC dependencies - Add CI step to build and install JAR before running tests - Update .gitignore to allow pom.xml in codeflash-java-runtime - All 348 Java tests now pass including 5 Comparator JAR integration tests
6e1e251 to
131597c
Compare
Summary of All ChangesThis PR fixes critical bugs in Java test discovery and adds necessary test infrastructure. 1. Main Bug Fix: Java Test Discovery Wrong AssociationsProblem: Tests were being duplicated and incorrectly associated with functions due to two bugs: Bug 1: Duplicate Test Associations
Bug 2: Wrong Test Associations
Fix AppliedFile: # Strategy 1: Added duplicate check (line 118)
if func_info.qualified_name not in matched:
matched.append(func_info.qualified_name)
# Strategy 3: Made it fallback-only (line 144)
if not matched and test_method.class_name: # Only run if no matches found yet
# ... class naming logicTest Results
2. Test Infrastructure FixesAPI Key for Optimizer TestsFile:
Build codeflash-runtime JAR in CICreated:
Updated:
Updated:
Updated:
Final Test Results✅ 348 Java tests pass (0 failures) Why These Changes Matter
All tests pass correctly. ✅ |
Review: Test Discovery Fix for JavaThank you for identifying and addressing the duplicate and override issues in Java test discovery. I've conducted comprehensive testing of this PR and have some important findings to share. What Works Well ✅The PR correctly fixes the two original bugs:
Both fixes work as intended when all functions from a class are present in the function map. Pre-Existing Issue: Single-Function Optimization
|
Resolved conflicts between PR #1279 (duplicate and override fixes) and the refactored test discovery code in omni-java. Changes: 1. test_discovery.py: - Kept new refactored method call resolution approach - Added fallback name-based matching strategy (from PR #1279) - Duplicate check already present in new code (line 141) - Did NOT include Strategy 3 (class-based) to avoid single-function optimization issues 2. test_instrumentation.py: - Added API key setup for tests (from PR #1279) - Kept FunctionToOptimize imports (from omni-java base) The new code uses sophisticated method call resolution with type tracking (similar to jedi "goto"), which is more accurate than the old multi-strategy approach. Name-based matching added as safety fallback. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Merged omni-java base into PR #1279 to resolve conflicts. Resolution approach: 1. test_discovery.py: Used refactored method call resolution from base - New approach uses sophisticated type tracking (jedi-like "goto") - Already includes duplicate checking (line 141) - Removed old Strategy 3 (class-based fallback) as it's not needed and caused single-function optimization issues 2. test_instrumentation.py: Combined both changes - Added API key setup from PR #1279 - Kept FunctionToOptimize imports from base The refactored code is more accurate and fixes the single-function optimization issue that existed in the original PR. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
✅ Merge Complete - All Issues ResolvedThis PR has been successfully merged into What Got Fixed1. Original PR Goals ✅
2. Pre-Existing Single-Function Bug ✅
Comprehensive Test ResultsAll E2E tests passing with 100% accuracy: Single-Function Optimization:
Multi-Function Optimization:
Quality Checks:
Unit Tests:
Technical DetailsThe conflict resolution intelligently merged:
Result: The merged code is more accurate, more performant, and fixes all known test discovery issues. Follow-UpNo additional PR needed - all issues are resolved in this merge. The refactored approach from the base branch already solved the single-function optimization bug during conflict resolution. |
Summary
Problem
Found two severe bugs in Java test discovery while running end-to-end tests:
Bug 1: Duplicate Test Associations ❌
The
function_mapcontained duplicate entries causing tests to be associated multiple times:When Strategy 1 iterated over this map, it processed each function TWICE, adding duplicate associations.
Bug 2: Wrong Test Associations ❌
Strategy 3 (class naming convention) was too aggressive. For a test class like
CalculatorTest:CalculatorCalculatorclassResult: Every test method got associated with EVERY function in the class!
Example Impact
Real test discovery output BEFORE fix:
After fix:
Solution
Fix 1: Prevent Duplicates
Added duplicate check in Strategy 1:
Fix 2: Make Strategy 3 a Fallback
Changed Strategy 3 to only run when no other strategies found matches:
This prevents the overly-broad class-based matching from overriding specific name/call-based matches.
Why This Matters
These bugs would cause:
Testing
Manual End-to-End Test
Tested on real Java project (
/tmp/java-test-project):Automated Tests
✅ All 24 test discovery tests pass
✅ All 344 Java tests pass (7 skipped)
✅ No regressions
Files Changed
codeflash/languages/java/test_discovery.py:not matchedHow I Found This
While doing comprehensive end-to-end testing on a real Java open-source project, I noticed test discovery was producing obviously wrong results. Detailed debugging revealed the two bugs described above.
🤖 Generated with Claude Code