chore: sync main into omni-java (batch 3/4) by KRRT7 · Pull Request #1558 · codeflash-ai/codeflash

KRRT7 · 2026-02-20T01:36:13Z

Summary

Merges main up to 6020c4fa (PR #1533, cf-release-0.20.1) into omni-java.

Included

Context extraction refactor to languages/python/context/ (refactor: move context extraction to languages/python/context/ #1498)
pnpm resolution fix (fix: resolve jest-runner from project's node_modules for Jest 30 compatibility #1497)
Bedrock CI (chore: switch Claude workflows from Foundry to AWS Bedrock #1504, fix: use correct Bedrock inference profile ID #1507)
JS capture perf fix for external runner ([FIX][JS] capturePerf shouldn't break when we have an external runner (batch size = 1) #1509)
Path mismatch fix ([FIX] path mismatch for behavior_file_path for original baseline, resulting into empty validated test source #1513)
Async decorator refactor (refactor: inline async decorators to remove codeflash import dependency #1518)
Worktree fix (Fix worktree mode filtering out all diff-discovered functions #1522)
v0.20.1 release (chore: release v0.20.1 #1533)
License format fix (fix: update license format to use license-files #1503)
codeflash-benchmark build fix (fix: update license format to use license-files #1503)

Conflict resolutions

codeflash/code_utils/config_consts.py: Main bumped OPTIMIZATION_CONTEXT_TOKEN_LIMIT and TESTGEN_CONTEXT_TOKEN_LIMIT from 16000 → 48000. Kept omni-java's JAVA_TESTCASE_TIMEOUT = 120 and Python timeout comment. Combined both.
pyproject.toml: Main changed license = {text = "BSL-1.1"} → license-files = ["LICENSE"]. Kept omni-java's tree-sitter-java>=0.23.0 dependency. Combined both.

Remaining batches

Batch 4: merge up to 6346c740 (HEAD of main)

…atibility

The loop-runner was loading jest-runner from codeflash's node_modules (v29) instead of the project's (v30), causing "runtime.enterTestCode is not a function" errors.

This fix:
- Adds recursive search to find jest-runner in any node_modules structure
- Works with npm, yarn, and pnpm (including non-hoisted deps)
- Prefers higher versions when multiple are found
- Removes internal looping in capturePerf when using external loop-runner
- Creates fresh TestRunner per batch to avoid Jest 30 state corruption

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>

…codeflash into fix/jest30-pnpm-resolution

…unner

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>

Consolidate three enricher functions (get_imported_class_definitions, get_external_base_class_inits, get_external_class_inits) into a single enrich_testgen_context that parses code context once. Extract shared helpers, unify prune_cst variants, deduplicate loop bodies, and remove dead UsedNameCollector class.

Move code_context_extractor.py and unused_definition_remover.py from codeflash/context/ to codeflash/languages/python/context/ and update all import sites.

Replace duplicate implementations in extract_code_context() and find_helper_functions() with calls to get_code_optimization_context() and get_function_sources_from_jedi() from the canonical context module.

Update stale context/ paths in mypy_allowlist.txt to match the languages/python/context/ move. Add assert to narrow BaseSuite to IndentedBlock in prune_cst for mypy.

Re-add graceful degradation when context exceeds token limits instead of raising ValueError immediately. Read-only context falls back to removing docstrings then removing entirely. Testgen context falls back to removing docstrings then removing enrichment before raising.
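The fallback order described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `count_tokens`, `fit_read_only_context`, and `fit_testgen_context` are hypothetical names, the whitespace-based token count is a stand-in for a real tokenizer, and the assumption that each variant is pre-rendered by the caller is mine.

```python
TOKEN_LIMIT = 48000  # assumed to mirror the 48000 limit mentioned in this PR


def count_tokens(text: str) -> int:
    # Hypothetical stand-in: counts whitespace-separated words, not real tokens.
    return len(text.split())


def fit_read_only_context(full: str, without_docstrings: str, limit: int = TOKEN_LIMIT) -> str:
    """Return the richest read-only variant that fits, or '' if none does."""
    for candidate in (full, without_docstrings):
        if count_tokens(candidate) <= limit:
            return candidate
    return ""  # last resort: drop the read-only context entirely


def fit_testgen_context(
    full: str, without_docstrings: str, without_enrichment: str, limit: int = TOKEN_LIMIT
) -> str:
    """Try progressively smaller testgen variants and raise only when all fail."""
    for candidate in (full, without_docstrings, without_enrichment):
        if count_tokens(candidate) <= limit:
            return candidate
    raise ValueError("Testgen context exceeds the token limit")
```

The point of the ordering is that the error is only raised after every cheaper degradation has been tried.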

The optimization achieves a **68% runtime improvement** (23.5ms → 14.0ms) by replacing the expensive `ast.walk()` traversal with a targeted recursive collection strategy.

**Key Performance Improvement:**

The original code uses `ast.walk(tree)`, which visits **every single node** in the AST (12,947 hits shown in the line profiler), consuming 71.7% of total runtime. This includes nodes such as expressions, literals, and operators that can never contain `ImportFrom` statements.

The optimized version implements a custom `collect_imports()` function that:

1. **Only traverses module body and control flow structures** where imports can legally appear (function/class definitions, if/while/for blocks, try/except)
2. **Skips irrelevant AST nodes** like expressions, literals, and operators entirely
3. **Recursively processes nested bodies** (body, orelse, finalbody, handlers) in a depth-first manner

**Why This Works:**

In Python, `from X import Y` statements can only appear:

- At module level
- Inside function/class definitions
- Within control flow blocks (if/while/for/try)

By checking `isinstance()` for only these container node types and recursively descending into their body attributes, we avoid traversing the entire AST subtree for each construct. This dramatically reduces the number of nodes visited while maintaining correctness.

**Test Case Performance:**

The optimization excels across all scales:

- **Small imports** (single statements): 60-77% faster
- **Large import lists** (100-500 items): 74-104% faster
- **Many code blocks** (500-1000 lines): 70-77% faster
- **Mixed code/imports** at scale: 70% faster

The performance gain is particularly pronounced when the AST contains large amounts of non-import code (functions, classes, expressions), as shown by the `test_mixed_imports_and_code_large_scale` case improving from 9.31ms to 5.45ms (70.8% faster).

**Impact on Workloads:**

Because the function references show this is used in code context extraction benchmarks, this optimization will significantly speed up any workflow that analyzes Python imports from large codebases or performs repeated import analysis during development.
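A traversal in this spirit might look like the sketch below. It is an illustration under assumptions, not the function from PR #1498: the name `collect_imports_sketch` is hypothetical, and it uses generic `getattr` lookups on statement-list fields rather than per-type `isinstance()` checks. It does not handle `match` statements, whose cases live under `cases` rather than `body`.

```python
import ast


def collect_imports_sketch(tree: ast.Module) -> list[ast.ImportFrom]:
    """Collect ImportFrom nodes by descending only into statement lists."""
    found: list[ast.ImportFrom] = []

    def visit(stmts: list[ast.stmt]) -> None:
        for stmt in stmts:
            if isinstance(stmt, ast.ImportFrom):
                found.append(stmt)
                continue
            # Statement containers expose nested code through these list fields;
            # expressions are never entered, so most of the tree is skipped.
            for field in ("body", "orelse", "finalbody"):
                child = getattr(stmt, field, None)
                if isinstance(child, list):
                    visit(child)
            # try/except keeps its handlers in a separate list of ExceptHandler nodes.
            for handler in getattr(stmt, "handlers", []):
                visit(handler.body)

    visit(tree.body)
    return found
```

Compared with `ast.walk()`, the saving comes from never stepping into expression subtrees such as comprehensions, calls, or literals.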

The optimized code achieves a **350% speedup** (2.36ms → 523μs) by replacing the generic `ast.walk()` traversal with a targeted stack-based iteration that only visits nodes where class definitions can appear.

**Key Performance Improvement:**

The original implementation uses `ast.walk(tree)`, which performs an exhaustive traversal of **every single node** in the AST, including expressions, literals, operators, and other leaf nodes that can never contain class definitions. For a typical Python module, this means checking thousands of irrelevant nodes.

The optimized version uses a stack-based approach that only descends into structural nodes (ClassDef, FunctionDef, If, For, While, With, Try blocks) where classes can actually be defined. This dramatically reduces the number of nodes visited and `isinstance()` checks performed.

**Why This Matters:**

From the test results, we see consistent 200-700% speedups across all scenarios:

- Empty modules: 579% faster (5.37μs → 791ns) - minimal traversal overhead
- Simple cases: 200-400% faster - fewer nodes to check
- Complex nested structures: 405% faster (37.2μs → 7.37μs) - targeted descent pays off
- Large modules (500 classes): 280% faster (869μs → 228μs) - scales better
- Mixed workloads: 558% faster (799μs → 121μs) - avoids non-class nodes

**Impact on Workloads:**

Based on the function references showing this is called from `build_testgen_context`, this optimization benefits test generation workflows that analyze Python code structure. Since class extraction is likely performed repeatedly during code analysis, the roughly 4.5x speedup directly improves overall test generation throughput. The optimization is particularly effective for large codebases with many classes and complex nesting patterns, as demonstrated by the benchmark results.
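The stack-based idea can be sketched as below. This is a hedged illustration rather than the optimized `collect_existing_class_names` itself: the function name, the `_CONTAINERS` tuple, and the choice of fields to scan are assumptions made for the example.

```python
import ast

# Assumed set of structural nodes that can hold a class definition.
_CONTAINERS = (
    ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef,
    ast.If, ast.For, ast.While, ast.With, ast.Try, ast.ExceptHandler,
)


def collect_class_names_sketch(tree: ast.Module) -> set[str]:
    """Collect class names using an explicit stack instead of ast.walk()."""
    names: set[str] = set()
    stack: list[ast.AST] = [tree]
    while stack:
        node = stack.pop()
        if isinstance(node, ast.ClassDef):
            names.add(node.name)
        # Only statement-list fields are scanned; expression nodes are never pushed.
        for field in ("body", "orelse", "finalbody", "handlers"):
            for child in getattr(node, field, ()):
                if isinstance(child, _CONTAINERS):
                    stack.append(child)
    return names
```

An explicit stack also avoids Python recursion limits on deeply nested modules, which may or may not matter for the real workload.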

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>

…2026-02-16T20.53.40 ⚡️ Speed up function `collect_existing_class_names` by 351% in PR #1498 (`cf-simplify-context-extraction`)

The optimized collect_imports missed match/case statements where imports can legally appear. Add hasattr-guarded handling for ast.Match nodes. Co-authored-by: Kevin Turcios <KRRT7@users.noreply.github.com>
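For illustration, a `hasattr`-guarded lookup of `match` case bodies could look like this. The helper name `match_case_bodies` is hypothetical and the actual change in the commit may be structured differently; the only facts relied on are that `ast.Match` exists from Python 3.10 and stores its branches in `cases`, each with a `body` list.

```python
import ast


def match_case_bodies(stmt: ast.stmt) -> list[list[ast.stmt]]:
    """Return the statement list of every case in a match statement, else []."""
    # ast.Match is absent before Python 3.10, so check for it before isinstance().
    if hasattr(ast, "Match") and isinstance(stmt, ast.Match):
        return [case.body for case in stmt.cases]
    return []
```

A traversal that already recurses into `body`, `orelse`, and `finalbody` can feed each returned list through the same recursive step.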

…2026-02-16T20.49.33 ⚡️ Speed up function `_parse_and_collect_imports` by 69% in PR #1498 (`cf-simplify-context-extraction`)

Extract shared helpers and remove dead code across the language support area:
- Extract `is_assignment_used()` and move `recurse_sections` to unused_definition_remover.py, replacing duplicated logic in both context files
- Extract `function_sources_to_helpers()` in support.py to unify identical HelperFunction construction
- Remove dead `get_comment_prefix()` method from protocol and all implementations (comment_prefix property serves all callers)

…tion refactor: move context extraction to languages/python/context/

Replace deprecated license table format with modern license-files array in both main package and codeflash-benchmark subpackage. This resolves the setuptools deprecation warning about TOML table license format.

Changes:
- Use license-files = ["LICENSE"] instead of license = {text = "BSL-1.1"}
- Add LICENSE file to root directory
- Add LICENSE and README.md to codeflash-benchmark/

fix: update license format to use license-files

The cross-region inference profile for Claude Opus 4.6 on Bedrock is `us.anthropic.claude-opus-4-6-v1`, not `us.anthropic.claude-opus-4-6-v1:0`. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix: use correct Bedrock inference profile ID

…ternal-runner [FIX][JS] capturePerf shouldn't break when we have an external runner (batch size = 1)

…s_path_mismatch [FIX] path mismatch for behavior_file_path for original baseline, resulting into empty validated test source

Instead of injecting `from codeflash.code_utils.codeflash_wrap_decorator import ...` into instrumented source files, inject the decorator function definitions directly. This removes the hard dependency on the codeflash package being importable at runtime in the target environment, matching the pattern already used for sync instrumentation.

Replace inline code injection with a helper file approach that writes decorator implementations to a separate codeflash_async_wrapper.py file. This removes the codeflash package import dependency from instrumented source files while keeping line numbers stable (only 1 import + 1 decorator line added, same as before). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…e file Write all three async decorator implementations into one helper file to avoid overwrite issues when switching modes. Clean up the helper file in revert_code_and_helpers and early-exit paths so it doesn't persist in the user's project root after optimization.

…rget The e2e test expects codeflash to detect and fix the intentional use of blocking time.sleep() in an async function. Using asyncio.sleep() removes the optimization opportunity and causes the CI job to fail. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The optimized code removes `import time`, shifting all function lines up by 1. Update expected_lines from [10-20] to [9-19] to match. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

refactor: inline async decorators to remove codeflash import dependency

In --worktree mode, get_git_diff resolves file paths from cwd (the original repo), but module_root/project_root are mirrored to the worktree. This caused filter_functions to reject all diff-discovered functions as "outside module-root". Use the pre-mirror roots for filtering, then remap file paths to the worktree for downstream use.

git_root_dir() searches from CWD (original repo), but in worktree mode file paths have been remapped to the worktree. This caused relative_to() to raise ValueError when creating PRs. Search from module_root instead so root_dir is always in the same path space as the file paths.
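The two worktree fixes share one idea: compare paths within a single path space, then translate. A minimal sketch, assuming hypothetical helper names and using `PurePosixPath` so that it behaves the same on any OS:

```python
from pathlib import PurePosixPath


def remap_to_worktree(
    path: PurePosixPath, original_root: PurePosixPath, worktree_root: PurePosixPath
) -> PurePosixPath:
    """Translate a path under the original repo into the mirrored worktree."""
    return worktree_root / path.relative_to(original_root)


def filter_then_remap(
    diff_files: list[PurePosixPath],
    original_module_root: PurePosixPath,
    original_project_root: PurePosixPath,
    worktree_project_root: PurePosixPath,
) -> list[PurePosixPath]:
    """Filter with the pre-mirror module root, then remap survivors to the worktree."""
    kept = []
    for f in diff_files:
        # Diff paths are resolved from the original repo, so they must be
        # compared against the original (pre-mirror) module root.
        if f.is_relative_to(original_module_root):
            kept.append(remap_to_worktree(f, original_project_root, worktree_project_root))
    return kept
```

Filtering against the mirrored root instead would reject every file, which matches the "outside module-root" symptom described above.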

…t-path Fix worktree mode filtering out all diff-discovered functions

chore: release v0.20.1

claude · 2026-02-20T02:01:15Z

PR Review Summary

Prek Checks

Status: Fixed and passing ✅

Fixed 8 linting issues and 6 formatting issues in commit ea48939:

codeflash/languages/java/context.py: PLR1714 — merged multiple comparisons into in check
codeflash/languages/java/instrumentation.py: N806 — renamed _PRIMITIVE_DEFAULTS to lowercase; SIM113 — replaced manual counter with enumerate()
codeflash/verification/parse_line_profile_test_output.py: UP031 — replaced % format with f-strings (×4); PTH119 — replaced os.path.basename() with Path.name; removed unused TYPE_CHECKING import
codeflash/cli_cmds/console.py, codeflash/cli_cmds/logging_config.py, codeflash/languages/java/replacement.py: ruff format auto-fixes
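For readers unfamiliar with the rule codes, the fixes above follow these general patterns. The functions below are hypothetical illustrations of each rule, not the code changed in commit ea48939.

```python
from pathlib import Path


def is_numeric_type(type_name: str) -> bool:
    # PLR1714: replaces `type_name == "int" or type_name == "long" or ...`
    return type_name in ("int", "long", "double")


def label_lines(lines: list[str]) -> list[str]:
    # SIM113: enumerate() replaces a manually incremented counter.
    # UP031: an f-string replaces "%d: %s" % (index, line).
    return [f"{index}: {line}" for index, line in enumerate(lines, start=1)]


def file_name(path: str) -> str:
    # PTH119: Path(...).name replaces os.path.basename(...).
    return Path(path).name
```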

Mypy: 567 errors across checked files (325 in new Java module files, 242 in existing code). These are pre-existing type issues not introduced by this PR — the new Java module would benefit from stricter typing in a follow-up.

Code Review

No critical issues found. ✅

This is a sync PR merging main into omni-java. Key changes reviewed:

Module relocation (codeflash/context/ → codeflash/languages/python/context/) — all imports properly updated
prune_cst unification (3 separate functions → 1) — correctly handles all code context types via parameter flags
Async decorator refactored to use runtime helper file generation — cleanup handled in all code paths
_current_language default changed from None to Language.PYTHON — safe change
Token limits tripled (16K → 48K) — intentional config change, test updated accordingly
MAX_TRANSITIVE_DEPTH increased from 2 to 5 — may increase processing time for deep type hierarchies (non-critical)

Test Coverage

PR branch: 3113 passed, 29 failed, 58 skipped | Main branch: 2356 passed, 8 failed, 56 skipped
Test count increase is expected — new Java test files added.

Modified Files (PR vs Main)

File	PR	Main	Delta
`code_utils/config_consts.py`	88%	88%	—
`code_utils/instrument_existing_tests.py`	91%	92%	-1%
`languages/base.py`	99%	98%	+1%
`languages/current.py`	100%	95%	+5%
`languages/python/context/code_context_extractor.py`	90%	91%	-1%
`languages/python/context/unused_definition_remover.py`	94%	94%	—
`models/models.py`	79%	78%	+1%
`optimization/function_optimizer.py`	20%	19%	+1%
`optimization/optimizer.py`	20%	19%	+1%
`verification/parse_line_profile_test_output.py`	90%	92%	-2%
`verification/equivalence.py`	87%	87%	—
`verification/verifier.py`	35%	38%	-3%
`setup/config_writer.py`	65%	85%	-20% ⚠️
`setup/detector.py`	72%	86%	-14% ⚠️
`result/critic.py`	70%	70%	—

New Java Files

File	Coverage	Status
`languages/java/parser.py`	98%	✅
`languages/java/test_discovery.py`	90%	✅
`languages/java/line_profiler.py`	90%	✅
`languages/java/config.py`	89%	✅
`languages/java/context.py`	89%	✅
`languages/java/concurrency_analyzer.py`	88%	✅
`languages/java/discovery.py`	88%	✅
`languages/java/import_resolver.py`	88%	✅
`languages/java/remove_asserts.py`	88%	✅
`languages/java/instrumentation.py`	82%	✅
`languages/java/support.py`	72%	⚠️ Below 75%
`languages/java/formatter.py`	64%	⚠️ Below 75%
`languages/java/replacement.py`	60%	⚠️ Below 75%
`languages/java/build_tools.py`	58%	⚠️ Below 75%
`languages/java/comparator.py`	46%	⚠️ Below 75%
`languages/java/test_runner.py`	44%	⚠️ Below 75%

Coverage notes:

Most modified files have stable or slightly improved coverage
setup/config_writer.py (-20%) and setup/detector.py (-14%) have regressions due to new Java config paths not yet fully covered
10 of 16 new Java files meet the ≥75% coverage threshold
6 Java files below threshold (test_runner.py, comparator.py, build_tools.py, replacement.py, formatter.py, support.py) would benefit from additional tests

Last updated: 2026-02-20T01:50Z

codeflash-ai · 2026-02-20T02:02:29Z

⚡️ Codeflash found optimizations for this PR

📄 2,141% (21.41x) speedup for `_extract_public_method_signatures` in `codeflash/languages/java/context.py`

⏱️ Runtime : 27.5 milliseconds → 1.23 milliseconds (best of 5 runs)

A dependent PR with the suggested changes has been created. Please review:

⚡️ Speed up function _extract_public_method_signatures by 2,141% in PR #1558 (sync-main-batch-3) #1559

If you approve, it will be merged into this PR (branch sync-main-batch-3).

mohammedahmed18 and others added 30 commits February 16, 2026 14:31

Merge branch 'main' into fix/jest30-pnpm-resolution

31e7116

Update packages/codeflash/runtime/loop-runner.js

5694135

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>

cleaning up

2fb4b2d

Merge branch 'fix/jest30-pnpm-resolution' of github.com:codeflash-ai/…

066980b

…codeflash into fix/jest30-pnpm-resolution

typo

2d73cf8

debugging for failed workflow

5e25b7f

just for testing

bfe4224

fallback to directly require the jest-runner module inside the loop r…

d13cdb5

…unner

Update packages/codeflash/runtime/loop-runner.js

b4ea8b6

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>

refactor: move context extraction modules to languages/python/context/

547c02e

refactor: delegate PythonSupport context methods to canonical pipeline

b1ec824

fix: update mypy allowlist paths and fix BaseSuite type narrowing

8566cf0

style: auto-fix linting issues

73e71d0

fix: resolve mypy type errors in collect_imports

29c0a66

style: auto-fix linting issues

69d3268

Update codeflash/languages/python/context/code_context_extractor.py

ea14b2f

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>

Merge pull request #1500 from codeflash-ai/codeflash/optimize-pr1498-…

cc77394

…2026-02-16T20.53.40 ⚡️ Speed up function `collect_existing_class_names` by 351% in PR #1498 (`cf-simplify-context-extraction`)

fix: handle ast.Match (Python 3.10+) in collect_imports traversal

bfa55cb

Merge pull request #1499 from codeflash-ai/codeflash/optimize-pr1498-…

82b4002

…2026-02-16T20.49.33 ⚡️ Speed up function `_parse_and_collect_imports` by 69% in PR #1498 (`cf-simplify-context-extraction`)

style: auto-fix linting issues

633acce

Merge pull request #1498 from codeflash-ai/cf-simplify-context-extrac…

805d946

…tion refactor: move context extraction to languages/python/context/

Merge pull request #1503 from codeflash-ai/fix-codeflash-benchmark-build

fc9cdf8

fix: update license format to use license-files

Merge branch 'main' into fix/jest30-pnpm-resolution

c262f3c

aseembits93 and others added 22 commits February 17, 2026 20:34

fix: use correct Bedrock inference profile ID (no :0 suffix)

09c026a

Merge pull request #1507 from codeflash-ai/fix/bedrock-model-id

110e9b8

fix: use correct Bedrock inference profile ID

fix always execute capture perf for external runner

d307409

Merge branch 'main' of github.com:codeflash-ai/codeflash

7308afe

Merge pull request #1509 from codeflash-ai/fix/js-capture-perf-for-ex…

cc678ba

…ternal-runner [FIX][JS] capturePerf shouldn't break when we have an external runner (batch size = 1)

fix path mismatch bug

7b2692f

Merge pull request #1513 from codeflash-ai/fix/gen_test_to_no_of_test…

8cb7209

…s_path_mismatch [FIX] path mismatch for behavior_file_path for original baseline, resulting into empty validated test source

Merge branch 'main' into proper-async

9f80ea6

fix: update expected coverage lines for optimized async e2e code

6c092b5

Merge pull request #1518 from codeflash-ai/proper-async

9d23a0e

refactor: inline async decorators to remove codeflash import dependency

style: auto-fix ruff formatting for long line

d43d9ae

fix: add type assertions for mypy narrowing in worktree path remapping

6a19b9d

Merge pull request #1522 from codeflash-ai/cf-fix-worktree-module-roo…

2bcd91c

…t-path Fix worktree mode filtering out all diff-discovered functions

chore: bump version to 0.20.1

c76acae

Merge pull request #1533 from codeflash-ai/cf-release-0.20.1

6020c4f

chore: release v0.20.1

Merge commit '6020c4fa' into sync-main-batch-3

85d1d4f

github-actions bot added the workflow-modified This PR modifies GitHub Actions workflows label Feb 20, 2026

KRRT7 and others added 2 commits February 19, 2026 20:39

fix: update test import for moved code_context_extractor module

7c7eeb5

style: auto-fix linting and formatting issues

ea48939

codeflash-ai bot mentioned this pull request Feb 20, 2026

⚡️ Speed up function _extract_public_method_signatures by 2,141% in PR #1558 (sync-main-batch-3) #1559

Closed

KRRT7 merged commit 4a45ac5 into omni-java Feb 20, 2026
25 of 31 checks passed

KRRT7 deleted the sync-main-batch-3 branch February 20, 2026 02:07

KRRT7 merged 58 commits into omni-java from sync-main-batch-3


4 participants
