chore: sync main into omni-java (batch 2/4) by KRRT7 · Pull Request #1557 · codeflash-ai/codeflash

KRRT7 · 2026-02-20T01:29:16Z

Summary

Merges main up to d578d996 (PR #1494, add-private-tessl-tiles) into omni-java. This is batch 2 of 4 for the incremental main sync.

Depends on: PR #1556 (batch 1) being merged first.

Included PRs

Jest 30 loop-runner (fix: add Jest 30 support and fix time limit in loop-runner #1318)
crosshair/pythonpath (Fix CrossHair subprocess missing PYTHONPATH #1476, Unify PYTHONPATH setup for subprocesses #1477)
config parser fix (Fix package.json config overriding closer pyproject.toml in monorepos #1478)
testgen context (feat: include external class __init__ signatures with transitive type deps in testgen context #1481)
Claude rules restructure (refactor: restructure CLAUDE.md for effective context usage #1486)
tessl setup (Setup repository to use Tessl #1492, feat: add private tessl tiles for rules, docs, and skills #1494)
coverage qualified name (Fix coverage function identification using qualified names #1457)
test filtering (fix: filter test_*.py files and pytest fixtures from optimization #1471)
async restore (Restore concurrency in testgen and candidate generation #1461)

Conflict resolutions

PrComment.py: Took main's counts variable name; kept HEAD's correct json_result dict name and str() wrapping
function_optimizer.py: Took HEAD's perf_path.open() — avoids duplicate behavior test write and respects Java path overrides
tracer.py: Kept import logging (used), dropped import os (unused)
parse_test_output.py: Combined both — kept HEAD's is_java, is_python imports + main's _parse_jest_test_xml import; kept HEAD's JacocoCoverageUtils import (used)
verification_utils.py: Took HEAD — has full Java support (package directory structure, class naming) and active JS package test dir logic; main's version had undefined path variable

Remaining batches

Batch 3: merge up to 6020c4fa (context extraction refactor, v0.20.1)
Batch 4: merge up to 6346c740 (HEAD of main)

- Add Jest 30 compatibility by detecting version and using TestRunner class - Resolve jest-runner from project's node_modules instead of codeflash's bundle - Fix time limit enforcement by using local time tracking instead of shared state (Jest runs tests in worker processes, so state isn't shared with runner) - Integrate stability-based early stopping into capturePerf - Use plain object instead of Set for stableInvocations to survive Jest module resets - Fix async function benchmarking: properly loop through iterations using async helper (Previously, async functions only got one timing marker due to early return) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

…jest30-loop-runner

After merging main, constants like PERF_STABILITY_CHECK, PERF_MIN_LOOPS, PERF_LOOP_COUNT were changed to getter functions. Updated all references in capturePerf and _capturePerfAsync to use the getter function calls. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

…apture Improvements to loop-runner.js: - Extract isValidJestRunnerPath() helper to reduce code duplication - Add comprehensive JSDoc comments for Jest version detection - Improve error messages with more context about detected versions - Add better documentation for runTests() method - Add validation for TestRunner class availability in Jest 30 Improvements to capture.js: - Extract _recordAsyncTiming() helper to reduce duplication - Add comprehensive JSDoc for _capturePerfAsync() with all parameters - Improve error handling in async looping (record timing before throwing) - Enhance shouldStopStability() documentation with algorithm details - Improve code organization with clearer comments These changes improve maintainability and debugging without changing behavior.

…king The _parse_timing_from_jest_output() function was defined but never called, causing benchmarking tests to report runtime=0. This integrates console timing marker parsing into parse_test_results() to extract accurate performance data from capturePerf() calls. Fixes the "summed benchmark runtime of the original function is 0" error when timing data exists in console output but JUnit XML reports 0.

Changes f-string to % formatting in logger.debug() call to avoid evaluating the string when debug logging is disabled.

The verify_requirements() method only checked for test frameworks (jest/vitest) in the local package's node_modules. In monorepos with workspace hoisting (yarn/pnpm), dependencies are often installed at the workspace root instead. Changes: - Check both local node_modules and workspace root node_modules - Use _find_monorepo_root() to locate workspace root - Add debug logging for framework resolution - Update docstring to document monorepo support Fixes false positive "jest is not installed" warnings in monorepo projects where jest is hoisted to the workspace root. Tested with Budibase monorepo where jest is at workspace root.

Adds detailed logging to track: - Test files being passed to Jest - File existence checks - Full Jest command - Working directory - Jest stdout/stderr even on success This helps diagnose why Jest may not be discovering or running tests.

…ctories Problem: - Generated tests are written to /tmp/codeflash_*/ - Import paths were calculated relative to tests_root (e.g., project/tests/) - This created invalid imports like 'packages/shared-core/src/helpers/lists' - Jest couldn't resolve these paths, causing all tests to fail Solution: - For JavaScript, calculate import path from actual test file location - Use os.path.relpath(source_file, test_dir) for correct relative imports - Now generates proper paths like '../../../budibase/packages/shared-core/src/helpers/lists' This fixes the root cause preventing test execution in monorepos like Budibase.

Problem 1 - Import path normalization: - Path("./foo/bar") normalizes to "foo/bar", stripping the ./ prefix - JavaScript/TypeScript require explicit relative paths with ./ or ../ - Jest couldn't resolve imports like "packages/shared-core/src/helpers" Solution 1: - Keep module_path as string instead of Path object for JavaScript - Preserve the ./ or ../ prefix needed for relative imports Problem 2 - Missing TestType enum value: - Code referenced TestType.GENERATED_PERFORMANCE which doesn't exist - Caused AttributeError during Jest test result parsing Solution 2: - Use TestType.GENERATED_REGRESSION for performance tests - Performance tests are still generated regression tests These fixes enable CodeFlash to successfully run tests on Budibase monorepo. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Added warning-level logging to trace performance test execution flow: - Log test files passed to run_jest_benchmarking_tests() - Log Jest command being executed - Log Jest stdout/stderr output - Save perf test source to /tmp for inspection Findings: - Perf test files ARE being created correctly with capturePerf() calls - Import paths are now correct (./prefix working) - Jest command executes but fails with: runtime.enterTestCode is not a function - Root cause: codeflash/loop-runner doesn't exist in npm package yet - The loop-runner is the core Jest 30 infrastructure that needs to be implemented This debugging reveals that performance benchmarking requires the custom loop-runner implementation, which is the original scope of this PR. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Temporarily disabled --runner=codeflash/loop-runner since the runner hasn't been implemented yet. This allows Jest to run performance tests with the default runner. Result: MAJOR BREAKTHROUGH! - CodeFlash now runs end-to-end on Budibase - Generated 11 optimization candidates - All candidates tested behaviorally - Tests execute successfully (40-48 passing) - Import paths working correctly with ./ prefix Current blocker: All optimization candidates introduce test failures (original: 47 passed/1 failed, candidates: 46 passed/2 failed). This suggests either: 1. Optimizations are too aggressive and change behavior 2. Generated tests may have quality issues 3. Need to investigate the 2 consistently failing tests But the infrastructure fixes are complete and working! This PR delivers: ✅ Monorepo support ✅ Import path resolution ✅ Test execution on JS/TS projects ✅ End-to-end optimization pipeline Next: Investigate test quality or optimization aggressiveness Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Resolved conflicts by: 1. Accepting origin/main's refactored verify_requirements() in support.py - Uses centralized find_node_modules_with_package() from init_javascript.py - Cleaner monorepo dependency detection 2. Accepting origin/main's refactored Jest parsing in parse_test_output.py - Jest-specific parsing moved to new codeflash/languages/javascript/parse.py - parse_test_xml() now routes to _parse_jest_test_xml() for JavaScript 3. Fixed TestType.GENERATED_PERFORMANCE bug in new parse.py - Changed to TestType.GENERATED_REGRESSION (performance tests are regression tests) - This was part of the original fixes in this branch The merge preserves all the infrastructure fixes from this branch while adopting the cleaner code organization from main. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Fixed ruff issues: - PLW0108: Removed unnecessary lambda wrappers, inline method references - Changed lambda: self.future_all_code_repair.clear() to self.future_all_code_repair.clear - Changed lambda: self.future_adaptive_optimizations.clear() to self.future_adaptive_optimizations.clear - PTH123: Replaced open() with Path.open() for debug file - S108: Use get_run_tmp_file() instead of hardcoded /tmp path for security - RUF059: Prefix unused concolic_tests variable with underscore Fixed mypy issues in PrComment.py: - Renamed loop variable from 'result' to 'test_result' to avoid redefinition - Removed str() conversion for async throughput values (already int type) - Type annotations now match actual value types All files formatted with ruff format. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

This optimization achieves a **329% speedup** (1.61ms → 374μs) by eliminating expensive third-party library calls and simplifying dictionary lookups: ## Primary Optimization: `humanize_runtime()` - Eliminated External Library Overhead The original code used `humanize.precisedelta()` and `re.split()` to format time values, which consumed **79.6% and 11.4%** of the function's execution time respectively (totaling ~91% overhead). The optimized version replaces this with: 1. **Direct unit determination via threshold comparisons**: Instead of calling `humanize.precisedelta()` and then parsing its output with regex, the code now uses a simple cascading if-elif chain (`time_micro < 1000`, `< 1000000`, etc.) to directly determine the appropriate time unit. 2. **Inline formatting**: Time values are formatted with f-strings (`f"{time_micro:.3g}"`) at the same point where units are determined, eliminating the need to parse formatted strings. 3. **Removed regex dependency**: The `re.split(r",|\s", runtime_human)[1]` call is completely eliminated since units are now determined algorithmically rather than extracted from formatted output. **Line profiler evidence**: The original `humanize.precisedelta()` call took 3.73ms out of 4.69ms total (79.6%), while the optimized direct formatting approach reduced the entire function to 425μs - an **11x improvement** in `humanize_runtime()` alone. ## Secondary Optimization: `TestType.to_name()` - Simplified Dictionary Access Changed from: ```python if self is TestType.INIT_STATE_TEST: return "" return _TO_NAME_MAP[self] ``` To: ```python return _TO_NAME_MAP.get(self, "") ``` This eliminates a conditional branch and replaces a KeyError-raising dictionary access with a safe `.get()` call. **Line profiler shows this reduced execution time from 210μs to 172μs** (18% faster). ## Performance Impact by Test Case All test cases show **300-500% speedups**, with the most significant gains occurring when: - Multiple runtime conversions happen (seen in `to_json()` which calls `humanize_runtime()` twice) - Test cases with larger time values (e.g., 1 hour in nanoseconds) that previously required more complex humanize processing The optimization particularly benefits the `PrComment.to_json()` method, which calls `humanize_runtime()` twice per invocation. This is reflected in test results showing consistent 350-370% speedups across typical usage patterns. ## Trade-offs None - this is a pure performance improvement with identical output behavior and no regressions in any other metrics.

…2026-02-04T14.10.57 ⚡️ Speed up method `PrComment.to_json` by 329% in PR #1318 (`fix/js-jest30-loop-runner`)

- Enable loop-runner for Jest benchmarking tests - Add LOG_LEVEL and DEBUG env vars to prevent console.log mocking - Add is_exported detection for functions in treesitter_utils - Skip non-exported functions that can't be imported in tests - Fix coverage file matching to use full path (avoid db/utils.ts vs utils/utils.ts) - Remove debug logging statements from verifier Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

…odeflash into fix/js-jest30-loop-runner

…cript When optimizing TypeScript class methods that call other methods from the same class, the helper methods were being appended OUTSIDE the class definition. This caused syntax errors because class-specific keywords like `private` are only valid inside a class body. Changes: - Add _find_same_class_helpers() method to identify helper methods belonging to the same class as the target method - Modify extract_code_context() to include same-class helpers inside the class wrapper and filter them from the helpers list - Fix all JavaScript/TypeScript tests by adding export keywords to test code so functions can be discovered by discover_functions() - Add comprehensive tests for same-class helper extraction Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Add export keywords to test code in: - test_javascript_integration.py - test_javascript_optimization_flow.py - test_typescript_e2e.py This fixes the remaining test failures caused by discover_functions filtering out non-exported functions. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Enhanced _is_node_exported in treesitter_utils.py to detect CommonJS export patterns in addition to ES module exports: - module.exports = { foo, bar } - module.exports = { key: value } - module.exports.foo = ... - exports.foo = ... This allows discover_functions to find functions exported via CommonJS without requiring tests to use ES module syntax. Updated tests to use module.exports instead of export keyword. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

os.path.relpath returns backslashes on Windows. The backslash-to-slash conversion happened after the ./ / ../ prefix check, so the check failed and prepended ./ producing ./../src/... paths. Use Path.as_posix() instead of manual string replacement.

…ts-in-testgen feat: include external class __init__ signatures with transitive type deps in testgen context

- Remove commands block from CLAUDE.md (standard tool usage Claude knows) - Remove dead @AGENTS.md reference - Add optimization pipeline overview with module pointers - Add domain glossary (optimization candidate, addressable time, candidate forest, replay test, tracer, worktree mode) - Extract mypy workflow to .claude/skills/fix-mypy.md (on-demand) - Create .claude/skills/fix-prek.md for prek workflow (on-demand) - Add key entry points table to architecture.md - Create path-scoped rules: optimization-patterns.md, language-patterns.md - Remove redundancy from source-code.md and across rules files - Move "never use pip" convention to code-style.md

refactor: restructure CLAUDE.md for effective context usage

Adds automated duplicate code detection using GitHub Agentic Workflows with Serena semantic analysis, configured for Python.

chore: add gh-aw duplicate code detector workflow

Pass ANTHROPIC_FOUNDRY_API_KEY and ANTHROPIC_FOUNDRY_BASE_URL env vars so Claude Code CLI authenticates via Azure Foundry instead of direct API.

…ndry fix: configure duplicate code detector for Azure Foundry

gh-aw doesn't support Azure Foundry auth. Use claude-code-action directly with use_foundry and Serena MCP server for semantic code analysis.

…detector chore: replace gh-aw duplicate detector with claude-code-action + Serena

Setup repository to use Tessl

Three private tiles in the codeflash workspace: - codeflash-rules: 6 steering rules (code-style, architecture, optimization-patterns, git-conventions, testing-rules, language-rules) - codeflash-docs: 7 doc pages (domain-types, optimization-pipeline, context-extraction, verification, ai-service, configuration) - codeflash-skills: 2 skills (debug-optimization-failure, add-codeflash-feature)

- Add trigger hints and code snippets to both skills - Add checkpoints after each step - Extract module reference and troubleshooting into linked files - Bump codeflash-skills tile to 0.2.0

5 scenarios testing: sequential debugging, Result type + effort config, test patterns, domain type conventions, and deduplication/repair mechanics. Also adds tessl-labs/tessl-skill-eval-scenarios dev dependency.

5 scenarios testing: code serialization format, candidate lifecycle/DAG, deterministic patches, effort levels/selection criteria, and function representation/concurrency model.

feat: add private tessl tiles for rules, docs, and skills

# Conflicts: # codeflash/github/PrComment.py # codeflash/optimization/function_optimizer.py # codeflash/tracer.py # codeflash/verification/parse_test_output.py # codeflash/verification/verification_utils.py

codeflash-ai · 2026-02-20T02:11:08Z

⚡️ Codeflash found optimizations for this PR

📄 17% (0.17x) speedup for `humanize_runtime` in `codeflash/code_utils/time_utils.py`

⏱️ Runtime : 3.49 milliseconds → 2.99 milliseconds (best of 110 runs)

A new Optimization Review has been created.

🔗 Review here

codeflash-ai · 2026-02-20T02:41:20Z

codeflash/languages/javascript/instrument.py

+    while i < pos:
+        char = code[i]
+
+        if in_string:
+            # Check for escape sequence
+            if char == "\\" and i + 1 < len(code):
+                i += 2  # Skip escaped character
+                continue
+            # Check for end of string
+            if char == string_char:
+                in_string = False
+                string_char = None
+        # Check for start of string
+        elif char in "\"'`":
+            in_string = True
+            string_char = char
+
+        i += 1
+


⚡️Codeflash found 206% (2.06x) speedup for is_inside_string in codeflash/languages/javascript/instrument.py

⏱️ Runtime : 4.21 milliseconds → 1.37 milliseconds (best of 58 runs)

📝 Explanation and details

The optimized code achieves a 206% speedup (4.21ms → 1.37ms) by replacing character-by-character iteration with str.find() calls that leverage Python's C-optimized substring search.

Key Optimizations

1. Batch Processing with str.find()
Instead of checking each character individually, the code now:

Uses code.find("\\", i, pos) and code.find(string_char, i, pos) to jump directly to the next relevant character (backslash or closing quote)

Uses code.find('"', i, pos), code.find("'", i, pos), and code.find('', i, pos)` to find the next string opening

This eliminates ~69,000 individual character checks in the original (as seen in line profiler: "while i < pos" executed 69,214 times vs 3,850 times in optimized).

2. Performance Characteristics
The optimization particularly excels when:

Long strings: Test cases show 1340-6830% speedup for strings with 1000-5000 characters (33.3μs → 2.31μs, 165μs → 2.38μs)

Large gaps between strings: 957% speedup (31.2μs → 2.96μs) when checking positions between distant strings

Many strings: 37% speedup (94.6μs → 69.0μs) for code with 200 strings

Batch position checks: 1832% speedup (3.01ms → 155μs) when checking multiple positions

3. Trade-offs
Short strings (1-10 characters) show 20-60% slowdown due to function call overhead of multiple str.find() operations. However, based on function_references, this function is used to detect whether positions in JavaScript test code are inside string literals - a context where longer strings are common (e.g., "test('fibonacci(5)', () => {})"), making the optimization highly beneficial.

4. Why It's Faster
Python's str.find() is implemented in C and can scan large sections of memory efficiently, while the original character-by-character loop incurs Python bytecode interpretation overhead for each iteration. By reducing loop iterations from ~69K to ~3.8K, the optimized version minimizes this overhead while maintaining identical behavior (including the IndexError for out-of-bounds positions).

✅ Correctness verification report:

Test Status

⚙️ Existing Unit Tests 🔘 None Found

🌀 Generated Regression Tests ✅ 183 Passed

⏪ Replay Tests 🔘 None Found

🔎 Concolic Coverage Tests 🔘 None Found

📊 Tests Coverage 100.0%

🌀 Click to see Generated Regression Tests

from __future__ import annotations # imports import pytest # used for our unit tests from codeflash.languages.javascript.instrument import is_inside_string def test_empty_code_at_zero_is_not_inside(): # empty code, position 0 -> nothing before position, cannot be inside a string codeflash_output = is_inside_string("", 0) # 491ns -> 611ns (19.6% slower) def test_position_zero_in_nonempty_code_is_not_inside(): # starting position (index 0) is never considered inside because loop iterates while i < pos codeflash_output = is_inside_string("var x = 'hello';", 0) # 481ns -> 621ns (22.5% slower) @pytest.mark.parametrize("code,pos,expected", [ # simple single-quoted string: positions of characters inside the quotes should be True ("'hello'", 1, True), # at 'h' ("'hello'", 5, True), # at last 'o' ("'hello'", 6, True), # at the closing quote (state before processing closing quote is inside) # double quotes behave similarly ('"hi"', 1, True), ('"hi"', 2, True), # backticks (template literal) ("`template`", 3, True), # position right before the opening quote should be False ("var s = 'x';", 8, False), # index of space before opening quote # position after the closing quote: when pos is after the closing quote loop will have processed the quote ("'a'", 3, False), # pos == len, processed closing quote -> not inside ]) def test_basic_string_positions(code, pos, expected): # parametric test for multiple basic scenarios codeflash_output = is_inside_string(code, pos) # 9.59μs -> 16.4μs (41.4% slower) def test_escaped_quote_does_not_end_string(): # code contains an escaped single quote inside a single-quoted string code = "'a\\'b'" # characters: 0:' 1:a 2:\ 3:' (escaped) 4:b 5:' # check positions: inside the characters after the escaped quote should still be considered inside codeflash_output = is_inside_string(code, 4) # 1.31μs -> 2.34μs (44.0% slower) # position exactly at the escaped quote character index (3) should also be considered inside codeflash_output = is_inside_string(code, 3) # 621ns -> 1.48μs (58.1% slower) def test_escaped_backslash_skips_next_character_properly(): # ensure that an escape sequence of backslash followed by quote is skipped and doesn't terminate the string # Build code: opening ', then backslash, then quote (escaped), then x, then closing ' code = "'\\'x'" # represents characters: 0:' 1:\ 2:' 3:x 4:' # pos at 3 (the 'x') should be inside because the escaped quote at index 2 did not end the string codeflash_output = is_inside_string(code, 3) # 1.04μs -> 2.37μs (55.9% slower) # pos at 4 (closing quote index) should still be considered inside (state before processing closing quote) codeflash_output = is_inside_string(code, 4) # 752ns -> 1.62μs (53.7% slower) def test_multiple_string_types_and_positions(): # a code snippet containing multiple string types back-to-back code = "'s1' + \"s2\" + `s3`" # find indices manually: # "'s1'" spans indices 0..3 (quotes at 0 and 3), so pos=1 and pos=2 are inside s1 codeflash_output = is_inside_string(code, 1) # 812ns -> 1.66μs (51.2% slower) codeflash_output = is_inside_string(code, 2) # 581ns -> 1.35μs (57.0% slower) # "\"s2\"" starts after " + " (indices can be computed); find the starting index of "s2" start_s2 = code.index('"') # index of opening double quote codeflash_output = is_inside_string(code, start_s2 + 1) # 1.12μs -> 2.10μs (46.7% slower) # backtick string start_s3 = code.index('`') codeflash_output = is_inside_string(code, start_s3 + 1) # 1.49μs -> 2.71μs (45.0% slower) def test_negative_position_returns_false(): # negative positions should result in loop not running and thus not inside a string codeflash_output = is_inside_string("'abc'", -1) # 541ns -> 671ns (19.4% slower) codeflash_output = is_inside_string("", -5) # 250ns -> 321ns (22.1% slower) def test_position_greater_than_length_raises_index_error(): # if pos exceeds len(code), the implementation will attempt to index out of range -> IndexError code = "'short'" with pytest.raises(IndexError): codeflash_output = is_inside_string(code, len(code) + 1); _ = codeflash_output # 3.48μs -> 2.36μs (47.0% faster) def test_position_equal_to_length_safe_and_reflects_after_processing(): # pos == len(code) is allowed: loop processes up to last character code = "'ok'" # pos == len -> after processing closing quote, should not be inside codeflash_output = is_inside_string(code, len(code)) # 1.39μs -> 2.33μs (40.3% slower) def test_large_scale_long_string_performance_and_correctness(): # create a long string of ~1000 characters to test scalability and correctness inner = "a" * 998 # many characters inside the quotes code = "'" + inner + "'" # total length 1000 # pick a position in the middle of the long string; should be considered inside mid_pos = 1 + 500 # offset by 1 due to opening quote at index 0 codeflash_output = is_inside_string(code, mid_pos) # 33.3μs -> 2.31μs (1340% faster) # position at the very end (len) should be False (closing quote processed) codeflash_output = is_inside_string(code, len(code)) # 65.6μs -> 1.66μs (3845% faster) def test_many_escaped_sequences_in_long_string(): # build a long string containing many escape sequences to ensure the escape-skip logic scales # we'll create a pattern like: 'a\<quote>b\<quote>...' where each escape should skip the next char parts = [] # construct roughly 500 escape sequences (total length ~1000) for i in range(500): parts.append("\\'") # backslash followed by single-quote as escaped char inside the JS string literal parts.append("x") # a visible character following the escape inner = "".join(parts) code = "'" + inner + "'" # wrap with quotes # choose a position beyond many escapes but still inside the literal pos = 1 + 600 # somewhere in the middle of the inner content codeflash_output = is_inside_string(code, pos) # 32.9μs -> 63.2μs (47.9% slower) # ensure that position at closing quote is considered inside (state before processing closing quote) codeflash_output = is_inside_string(code, len(code) - 1) # 78.6μs -> 154μs (49.3% slower) # codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

import pytest from codeflash.languages.javascript.instrument import is_inside_string def test_position_zero_outside_string(): """Test that position 0 is outside any string.""" codeflash_output = is_inside_string("hello", 0); result = codeflash_output # 471ns -> 611ns (22.9% slower) def test_position_inside_double_quoted_string(): """Test that a position inside a double-quoted string is detected.""" code = 'x = "hello"' codeflash_output = is_inside_string(code, 5); result = codeflash_output # 1.36μs -> 1.94μs (29.9% slower) def test_position_inside_single_quoted_string(): """Test that a position inside a single-quoted string is detected.""" code = "x = 'hello'" codeflash_output = is_inside_string(code, 5); result = codeflash_output # 1.28μs -> 1.83μs (30.1% slower) def test_position_inside_backtick_template_literal(): """Test that a position inside a backtick template literal is detected.""" code = "x = `hello`" codeflash_output = is_inside_string(code, 5); result = codeflash_output # 1.30μs -> 1.84μs (29.4% slower) def test_position_outside_string_before_opening_quote(): """Test that a position before the opening quote is outside string.""" code = 'x = "hello"' codeflash_output = is_inside_string(code, 4); result = codeflash_output # 1.17μs -> 1.54μs (24.0% slower) def test_position_outside_string_after_closing_quote(): """Test that a position after the closing quote is outside string.""" code = 'x = "hello"' codeflash_output = is_inside_string(code, 11); result = codeflash_output # 2.00μs -> 2.35μs (14.9% slower) def test_position_at_closing_quote(): """Test that a position at the closing quote itself is outside string.""" code = 'x = "hello"' codeflash_output = is_inside_string(code, 10); result = codeflash_output # 1.74μs -> 2.11μs (17.5% slower) def test_multiple_strings_first_string(): """Test detection in the first of multiple strings.""" code = '"first" and "second"' codeflash_output = is_inside_string(code, 2); result = codeflash_output # 1.10μs -> 2.10μs (47.6% slower) def test_multiple_strings_second_string(): """Test detection in the second of multiple strings.""" code = '"first" and "second"' codeflash_output = is_inside_string(code, 14); result = codeflash_output # 2.16μs -> 3.28μs (33.9% slower) def test_position_between_strings(): """Test that a position between two strings is outside both.""" code = '"first" and "second"' codeflash_output = is_inside_string(code, 8); result = codeflash_output # 1.68μs -> 2.73μs (38.4% slower) def test_escaped_quote_in_double_quoted_string(): """Test that an escaped quote doesn't end the string.""" code = '"hello\\"world"' codeflash_output = is_inside_string(code, 10); result = codeflash_output # 1.83μs -> 2.69μs (31.7% slower) def test_escaped_quote_in_single_quoted_string(): """Test that an escaped single quote doesn't end the string.""" code = "'hello\\'world'" codeflash_output = is_inside_string(code, 10); result = codeflash_output # 1.82μs -> 2.65μs (31.3% slower) def test_position_at_escaped_backslash(): """Test that the position of an escaped backslash is inside string.""" code = '"hello\\\\world"' codeflash_output = is_inside_string(code, 8); result = codeflash_output # 1.56μs -> 2.35μs (33.6% slower) def test_empty_code_string(): """Test behavior with empty code string.""" codeflash_output = is_inside_string("", 0); result = codeflash_output # 501ns -> 631ns (20.6% slower) def test_position_zero_with_string_at_start(): """Test position 0 when string starts at position 0.""" codeflash_output = is_inside_string('"hello"', 0); result = codeflash_output # 481ns -> 632ns (23.9% slower) def test_position_one_with_string_at_start(): """Test position 1 when string starts at position 0.""" codeflash_output = is_inside_string('"hello"', 1); result = codeflash_output # 902ns -> 1.72μs (47.6% slower) def test_position_equal_to_code_length(): """Test position equal to the code length.""" code = '"hello"' codeflash_output = is_inside_string(code, len(code)); result = codeflash_output # 1.58μs -> 2.31μs (31.3% slower) def test_single_character_double_quoted_string(): """Test a single character inside a double-quoted string.""" code = '"x"' codeflash_output = is_inside_string(code, 1); result = codeflash_output # 952ns -> 1.92μs (50.5% slower) def test_single_character_single_quoted_string(): """Test a single character inside a single-quoted string.""" code = "'x'" codeflash_output = is_inside_string(code, 1); result = codeflash_output # 901ns -> 1.81μs (50.3% slower) def test_single_character_backtick_string(): """Test a single character inside a backtick string.""" code = "`x`" codeflash_output = is_inside_string(code, 1); result = codeflash_output # 872ns -> 1.77μs (50.8% slower) def test_nested_quotes_double_inside_single(): """Test double quote inside single-quoted string (not nested escaping).""" code = "'hello\"world'" codeflash_output = is_inside_string(code, 6); result = codeflash_output # 1.49μs -> 2.21μs (32.6% slower) def test_nested_quotes_single_inside_double(): """Test single quote inside double-quoted string.""" code = '"hello\'world"' codeflash_output = is_inside_string(code, 6); result = codeflash_output # 1.42μs -> 2.06μs (31.1% slower) def test_escaped_backslash_followed_by_quote(): """Test escaped backslash followed by an actual closing quote.""" code = '"test\\\\"' # String with escaped backslash, then closing quote codeflash_output = is_inside_string(code, 7); result = codeflash_output # 1.57μs -> 2.31μs (32.0% slower) def test_multiple_escaped_quotes(): """Test multiple escaped quotes in succession.""" code = '"a\\"b\\"c"' codeflash_output = is_inside_string(code, 4); result = codeflash_output # 1.29μs -> 2.33μs (44.6% slower) def test_code_with_only_opening_quote(): """Test code with only an opening quote and no closing quote.""" code = '"no closing quote' codeflash_output = is_inside_string(code, 5); result = codeflash_output # 1.32μs -> 2.09μs (36.8% slower) def test_code_with_only_opening_quote_at_end(): """Test position at the end when there's an unclosed quote.""" code = '"unclosed' codeflash_output = is_inside_string(code, len(code)); result = codeflash_output # 1.60μs -> 2.10μs (23.8% slower) def test_alternating_quotes(): """Test alternating between different quote types.""" code = '"a"\'b\'`c`' codeflash_output = is_inside_string(code, 1); result_double = codeflash_output # 851ns -> 1.70μs (50.0% slower) codeflash_output = is_inside_string(code, 4); result_single = codeflash_output # 971ns -> 2.29μs (57.7% slower) codeflash_output = is_inside_string(code, 7); result_backtick = codeflash_output # 921ns -> 2.81μs (67.2% slower) def test_quote_immediately_after_another_quote(): """Test position right after a closed quote when new quote opens immediately.""" code = '""' codeflash_output = is_inside_string(code, 1); result = codeflash_output # 822ns -> 1.65μs (50.3% slower) def test_whitespace_only_string(): """Test a string containing only whitespace.""" code = '" "' codeflash_output = is_inside_string(code, 2); result = codeflash_output # 1.06μs -> 2.02μs (47.5% slower) def test_string_with_newline_character(): """Test position inside a string containing newline character.""" code = '"line1\nline2"' codeflash_output = is_inside_string(code, 7); result = codeflash_output # 1.44μs -> 2.02μs (28.7% slower) def test_string_with_tab_character(): """Test position inside a string containing tab character.""" code = '"before\tafter"' codeflash_output = is_inside_string(code, 8); result = codeflash_output # 1.48μs -> 1.96μs (24.5% slower) def test_position_negative(): """Test behavior with negative position (should not enter the while loop).""" code = '"hello"' codeflash_output = is_inside_string(code, -1); result = codeflash_output # 481ns -> 651ns (26.1% slower) def test_escaped_character_other_than_quote(): """Test escaped character that is not a quote.""" code = '"hello\\nworld"' codeflash_output = is_inside_string(code, 8); result = codeflash_output # 1.80μs -> 2.58μs (30.3% slower) def test_backslash_at_end_of_code(): """Test when there's a backslash at the end of the code.""" code = '"hello\\' codeflash_output = is_inside_string(code, 7); result = codeflash_output # 1.74μs -> 2.42μs (28.1% slower) def test_single_quote_in_code_snippet(): """Test a minimal single quote scenario.""" code = "'" codeflash_output = is_inside_string(code, 0); result = codeflash_output # 511ns -> 631ns (19.0% slower) def test_double_quote_in_code_snippet(): """Test a minimal double quote scenario.""" code = '"' codeflash_output = is_inside_string(code, 0); result = codeflash_output # 491ns -> 631ns (22.2% slower) def test_large_code_with_many_strings(): """Test a code snippet with many string literals (200 strings).""" # Build a code string with 200 strings code_parts = ['"string_{}"'.format(i) for i in range(200)] code = ' + '.join(code_parts) # Check a position deep inside the 100th string # Find the position of the 100th string opening quote offset = sum(len(code_parts[i]) + 3 for i in range(100)) # +3 for ' + ' codeflash_output = is_inside_string(code, offset + 5); result = codeflash_output # 94.6μs -> 69.0μs (37.2% faster) def test_very_long_single_string(): """Test a very long string literal (5000 characters).""" long_content = 'x' * 5000 code = '"' + long_content + '"' codeflash_output = is_inside_string(code, 2500); result = codeflash_output # 165μs -> 2.38μs (6830% faster) def test_many_escaped_quotes_in_long_string(): """Test a string with 500 escaped quotes.""" escaped_parts = ['\\"'] * 500 code = '"' + ''.join(escaped_parts) + '"' codeflash_output = is_inside_string(code, len(code) // 2); result = codeflash_output # 23.7μs -> 75.5μs (68.5% slower) def test_alternating_escaped_and_normal_chars(): """Test string with alternating escaped and normal characters (1000 chars).""" pattern = ('\\x' * 500) # 1000 characters of \\x patterns code = '"' + pattern + '"' codeflash_output = is_inside_string(code, 500); result = codeflash_output # 23.7μs -> 69.6μs (65.9% slower) def test_position_in_large_gap_between_strings(): """Test position in a large gap between two strings.""" code = '"first"' + ' ' * 1000 + '"second"' codeflash_output = is_inside_string(code, 500); result = codeflash_output # 31.2μs -> 2.96μs (957% faster) def test_mixed_quote_types_500_iterations(): """Test 500 alternations between different quote types.""" parts = [] for i in range(500): if i % 3 == 0: parts.append('"str_double"') elif i % 3 == 1: parts.append("'str_single'") else: parts.append("`str_backtick`") code = ' '.join(parts) # Check inside the last string codeflash_output = is_inside_string(code, len(code) - 5); result = codeflash_output # 465μs -> 365μs (27.5% faster) def test_deeply_nested_escaped_sequences(): """Test a string with deeply nested escaped sequences (200 levels).""" content = '\\' * 200 + 'x' + '\\' * 200 code = '"' + content + '"' codeflash_output = is_inside_string(code, len(code) // 2); result = codeflash_output # 9.77μs -> 28.2μs (65.3% slower) def test_large_code_multiple_unclosed_quotes(): """Test large code with multiple unclosed quotes.""" code = '"unclosed1' + ' ' * 1000 + '"unclosed2' + ' ' * 1000 # Position inside first unclosed string codeflash_output = is_inside_string(code, 100); result = codeflash_output # Position inside second unclosed string codeflash_output = is_inside_string(code, 2100); result = codeflash_output def test_performance_large_code_outside_strings(): """Test performance checking many positions outside strings.""" # Build code with strings and large gaps code = '"str"' + ' ' * 500 + '"str"' + ' ' * 500 results = [] for pos in range(0, len(code), 10): results.append(is_inside_string(code, pos)) # 3.01ms -> 155μs (1832% faster) def test_longest_possible_escaped_sequence(): """Test string with very long escape sequence pattern.""" # Create a pattern of 300 escaped backslashes code = '"' + ('\\\\' * 300) + '"' codeflash_output = is_inside_string(code, len(code) // 2); result = codeflash_output # 14.5μs -> 42.1μs (65.4% slower) def test_position_across_all_string_types_large(): """Test positions across 300 total strings of all types.""" parts = [] for i in range(300): if i % 3 == 0: parts.append('"d"') elif i % 3 == 1: parts.append("'s'") else: parts.append("`b`") code = ' '.join(parts) # Check random positions are correct codeflash_output = is_inside_string(code, 50); result_1 = codeflash_output # 4.99μs -> 11.5μs (56.5% slower) codeflash_output = is_inside_string(code, len(code) - 1); result_2 = codeflash_output # 82.4μs -> 212μs (61.2% slower) def test_return_type_consistency(): """Test that function always returns a boolean.""" test_cases = [ ('""', 1), ("''", 1), ('``', 1), ('x = "test"', 5), ('', 0), ('"unclosed', 5), ] for code, pos in test_cases: codeflash_output = is_inside_string(code, pos); result = codeflash_output # 3.38μs -> 5.67μs (40.3% slower) # codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To test or edit this optimization locally git merge codeflash/optimize-pr1557-2026-02-20T02.41.19

Click to see suggested changes

Suggested change

while i < pos:

char = code[i]

if in_string:

# Check for escape sequence

if char == "\\" and i + 1 < len(code):

i += 2 # Skip escaped character

continue

# Check for end of string

if char == string_char:

in_string = False

string_char = None

# Check for start of string

elif char in "\"'`":

in_string = True

string_char = char

i += 1

# Preserve original behavior: accessing code[i] in the original implementation

# would raise IndexError if pos > len(code). Raise the same error early.

if pos > len(code):

raise IndexError

while i < pos:

if in_string:

# Check for escape sequence

bs_idx = code.find("\\", i, pos)

# Check for end of string

quote_idx = code.find(string_char, i, pos)

if bs_idx == -1 and quote_idx == -1:

# No terminator before pos; still inside string

return True

# If quote occurs before backslash (or backslash not found), end string

if bs_idx == -1 or (quote_idx != -1 and quote_idx < bs_idx):

in_string = False

string_char = None

i = quote_idx + 1

continue

# Otherwise a backslash occurs first; skip escaped character

if bs_idx + 1 < len(code):

i = bs_idx + 2

else:

# Backslash at end of code; original logic would increment i by 2

# (leading to an exit because i >= pos), so mimic that behavior

i = bs_idx + 2

continue

# Check for start of string

# Find the next quote of any of the three types before pos

next_double = code.find('"', i, pos)

next_single = code.find("'", i, pos)

next_backtick = code.find("`", i, pos)

# Determine the smallest non-negative index among the found ones

next_idx = -1

if next_double != -1:

next_idx = next_double

if next_single != -1 and (next_idx == -1 or next_single < next_idx):

next_idx = next_single

if next_backtick != -1 and (next_idx == -1 or next_backtick < next_idx):

next_idx = next_backtick

if next_idx == -1:

# No starting quote before pos

break

in_string = True

string_char = code[next_idx]

i = next_idx + 1

mohammedahmed18 and others added 30 commits February 3, 2026 19:04

Merge branch 'main' into fix/js-jest30-loop-runner

4c61d08

Merge branch 'main' of github.com:codeflash-ai/codeflash into fix/js-…

a3764f1

…jest30-loop-runner

chore: trigger CI workflows

7273f27

fix: use lazy % formatting for logger.debug to pass ruff G004

b83e516

Changes f-string to % formatting in logger.debug() call to avoid evaluating the string when debug logging is disabled.

Merge branch 'main' into fix/js-jest30-loop-runner

0592d92

style: auto-fix linting issues

bab3bd4

Merge pull request #1383 from codeflash-ai/codeflash/optimize-pr1318-…

c151b6c

…2026-02-04T14.10.57 ⚡️ Speed up method `PrComment.to_json` by 329% in PR #1318 (`fix/js-jest30-loop-runner`)

style: auto-fix linting issues

9bb05f6

cleanup

8fcb8cc

Merge branch 'fix/js-jest30-loop-runner' of github.com:codeflash-ai/c…

67ea0c9

…odeflash into fix/js-jest30-loop-runner

Merge branch 'main' into fix/js-jest30-loop-runner

f800ae3

fix: resolve merge conflict in function_optimizer.py

b65711d

version upgrade for cf package

6c23255

KRRT7 and others added 25 commits February 13, 2026 09:57

docs: distinguish local vs CI prek commands in CLAUDE.md

29a5324

chore: upgrade all dependencies in lockfile

4f44286

Merge pull request #1481 from codeflash-ai/include-external-class-ini…

42a1150

…ts-in-testgen feat: include external class __init__ signatures with transitive type deps in testgen context

Merge pull request #1486 from codeflash-ai/restructure-claude-md

53f8658

refactor: restructure CLAUDE.md for effective context usage

chore: add gh-aw duplicate code detector workflow

f819d60

Adds automated duplicate code detection using GitHub Agentic Workflows with Serena semantic analysis, configured for Python.

Merge pull request #1487 from codeflash-ai/add-duplicate-code-detector

b3c3a30

chore: add gh-aw duplicate code detector workflow

fix: configure duplicate code detector for Azure Foundry auth

ef66139

Pass ANTHROPIC_FOUNDRY_API_KEY and ANTHROPIC_FOUNDRY_BASE_URL env vars so Claude Code CLI authenticates via Azure Foundry instead of direct API.

docs: add new-branch-from-main rule to git guidelines

9961a02

docs: add new-branch-from-main rule to git guidelines

0bb62d6

Merge pull request #1490 from codeflash-ai/fix-duplicate-detector-fou…

de78ffe

…ndry fix: configure duplicate code detector for Azure Foundry

chore: replace gh-aw duplicate detector with claude-code-action + Serena

02b9a5e

gh-aw doesn't support Azure Foundry auth. Use claude-code-action directly with use_foundry and Serena MCP server for semantic code analysis.

Merge pull request #1491 from codeflash-ai/replace-ghaw-with-foundry-…

dbba5e0

…detector chore: replace gh-aw duplicate detector with claude-code-action + Serena

Initialize tessl.json with matched tiles

9af75a6

Add MCP config for .mcp.json

9282e25

Merge pull request #1492 from codeflash-ai/tessl/setup-1771114839280

90601c3

Setup repository to use Tessl

chore: improve skills to 100% review score and bump to v0.2.0

18ad00b

- Add trigger hints and code snippets to both skills - Add checkpoints after each step - Extract module reference and troubleshooting into linked files - Bump codeflash-skills tile to 0.2.0

chore: add tessl-managed gitignore for codex and gemini skill symlinks

289b75c

chore: add eval scenarios for codeflash-skills tile

ff2abd2

5 scenarios testing: sequential debugging, Result type + effort config, test patterns, domain type conventions, and deduplication/repair mechanics. Also adds tessl-labs/tessl-skill-eval-scenarios dev dependency.

chore: add eval scenarios for codeflash-docs tile

869fbe1

5 scenarios testing: code serialization format, candidate lifecycle/DAG, deterministic patches, effort levels/selection criteria, and function representation/concurrency model.

Merge pull request #1494 from codeflash-ai/add-private-tessl-tiles

d578d99

feat: add private tessl tiles for rules, docs, and skills

Merge commit 'd578d996' into sync-main-batch-2

c66953d

# Conflicts: # codeflash/github/PrComment.py # codeflash/optimization/function_optimizer.py # codeflash/tracer.py # codeflash/verification/parse_test_output.py # codeflash/verification/verification_utils.py

chore: fix ruff format issue in code_context_extractor

8632da0

github-actions bot added the workflow-modified This PR modifies GitHub Actions workflows label Feb 20, 2026

KRRT7 merged commit c39313b into omni-java Feb 20, 2026
25 of 32 checks passed

KRRT7 deleted the sync-main-batch-2 branch February 20, 2026 01:35

codeflash-ai bot reviewed Feb 20, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

chore: sync main into omni-java (batch 2/4)#1557

chore: sync main into omni-java (batch 2/4)#1557
KRRT7 merged 107 commits intoomni-javafrom
sync-main-batch-2

KRRT7 commented Feb 20, 2026

Uh oh!

Uh oh!

codeflash-ai bot commented Feb 20, 2026

Uh oh!

codeflash-ai bot Feb 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Comments

Test	Status
⚙️ Existing Unit Tests	🔘 None Found
🌀 Generated Regression Tests	✅ 183 Passed
⏪ Replay Tests	🔘 None Found
🔎 Concolic Coverage Tests	🔘 None Found
📊 Tests Coverage	100.0%

Conversation

KRRT7 commented Feb 20, 2026

Summary

Included PRs

Conflict resolutions

Remaining batches

Uh oh!

Uh oh!

codeflash-ai bot commented Feb 20, 2026

⚡️ Codeflash found optimizations for this PR

📄 17% (0.17x) speedup for humanize_runtime in codeflash/code_utils/time_utils.py

A new Optimization Review has been created.

Uh oh!

codeflash-ai bot Feb 20, 2026

Choose a reason for hiding this comment

⚡️Codeflash found 206% (2.06x) speedup for is_inside_string in codeflash/languages/javascript/instrument.py

Key Optimizations

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Comments

📄 17% (0.17x) speedup for `humanize_runtime` in `codeflash/code_utils/time_utils.py`

⚡️Codeflash found 206% (2.06x) speedup for `is_inside_string` in `codeflash/languages/javascript/instrument.py`