feat: Achieve 100% pass rate on perf/benchmarks.t with interpreter backend#214

Merged: fglock merged 10 commits into master from fix/test-regressions on Feb 20, 2026
fglock commented Feb 19, 2026

Summary

This PR brings perf/benchmarks.t to a 100% pass rate (1960/1960 tests) when run with the interpreter backend (JPERL_EVAL_USE_INTERPRETER=1), up from 96.2% (1886/1960).

Test Results

Before:

[  1/1] perl5_t/t/perf/benchmarks.t  ... ✗ 1886/1960 ok
Pass rate: 96.2%
Tests still failing: 74

After:

[  1/1] perl5_t/t/perf/benchmarks.t  ... ✓ 1960/1960 ok (0.65s)
Pass rate: 100.0%

Implementations

1. s/// (replaceRegex) Operator (+6 tests)

  • Added GET_REPLACEMENT_REGEX opcode (236)
  • Implements regex substitution with proper =~ binding
  • Handles pattern, replacement, flags, and optional string operand
  • Full support for modifiers including /g (global replacement)

Example:

my $x = "abc";
$x =~ s/b/X/;  # $x now "aXc"
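
For readers less familiar with Perl, the effect of the /g modifier can be modelled with plain java.util.regex. This is only a sketch of the semantics: `SubstitutionSketch` and `substitute` are hypothetical names, and nothing here reflects how GET_REPLACEMENT_REGEX is actually implemented.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of this PR: models s/pattern/replacement/
// with and without the /g flag.
final class SubstitutionSketch {
    /** Returns the input after substitution; 'global' stands in for /g. */
    static String substitute(String input, String pattern, String replacement, boolean global) {
        Matcher m = Pattern.compile(pattern).matcher(input);
        // quoteReplacement treats '$' and '\' in the replacement literally,
        // so this sketch ignores capture-group interpolation.
        String literal = Matcher.quoteReplacement(replacement);
        return global ? m.replaceAll(literal) : m.replaceFirst(literal);
    }

    public static void main(String[] args) {
        System.out.println(substitute("abc", "b", "X", false));    // aXc
        System.out.println(substitute("abcabc", "b", "X", false)); // aXcabc
        System.out.println(substitute("abcabc", "b", "X", true));  // aXcaXc
    }
}
```

Without /g only the first match is replaced; with /g every non-overlapping match is.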

2. substr Operator (+3 tests)

  • Added SUBSTR_VAR opcode (237) for variable-argument substr
  • Supports 2-4 arguments: substr($string, $offset, $length, $replacement)
  • Returns lvalue for in-place modification
  • Proper context handling for scalar/list results

Example:

my $s = "hello";
my $sub = substr($s, 1, 2);  # "el"
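
The 2-4 argument forms can be illustrated with a small Java model. It is a simplified sketch under stated assumptions: `SubstrSketch` is a hypothetical class, a `StringBuilder` stands in for a mutable Perl scalar, and out-of-range arguments are clamped here, whereas Perl returns undef and warns.

```java
// Hypothetical model of Perl's substr argument handling; not the SUBSTR_VAR handler.
final class SubstrSketch {
    /** Resolves offset/length to a [start, end) range, clamped to the string. */
    static int[] range(int strLen, int offset, Integer length) {
        // A negative offset counts back from the end of the string.
        int start = offset < 0 ? strLen + offset : offset;
        start = Math.max(0, Math.min(start, strLen));
        int end;
        if (length == null) {
            end = strLen;              // 2-argument form: through end of string
        } else if (length < 0) {
            end = strLen + length;     // negative length: leave that many chars off the end
        } else {
            end = start + length;
        }
        end = Math.max(start, Math.min(end, strLen));
        return new int[] {start, end};
    }

    /** 2- and 3-argument forms: extract without modifying. */
    static String substr(String s, int offset, Integer length) {
        int[] r = range(s.length(), offset, length);
        return s.substring(r[0], r[1]);
    }

    /** 4-argument form: replace in place and return what was there before. */
    static String substr(StringBuilder s, int offset, Integer length, String replacement) {
        int[] r = range(s.length(), offset, length);
        String old = s.substring(r[0], r[1]);
        s.replace(r[0], r[1], replacement);
        return old;
    }

    public static void main(String[] args) {
        System.out.println(substr("hello", 1, 2));      // el
        System.out.println(substr("hello", -3, null));  // llo
        StringBuilder sb = new StringBuilder("hello");
        System.out.println(substr(sb, 0, 1, "J") + " " + sb);  // h Jello
    }
}
```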

3. grep Context Handling (+2 tests)

  • Fixed grep to return scalar count in SCALAR context
  • Prevents ClassCastException when grep used in boolean context
  • Updated to use current calling context instead of hardcoding LIST

Example:

my @a = (1, 2, 3);
my $count = grep $_ > 1, @a;  # Returns 2 (count)
if (!grep $_, @empty) { ... }  # Boolean context now works
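
The context rule behind this fix can be sketched in Java. The names below (`GrepContextSketch`, `Context`, `grep`) are hypothetical and much simpler than the real RuntimeList/RuntimeScalar types; the point is only that the result type follows the calling context rather than being hardcoded to a list.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical model of context-sensitive grep; not the interpreter's code.
final class GrepContextSketch {
    enum Context { SCALAR, LIST }

    /** Returns the match count in SCALAR context and the matches in LIST context. */
    static <T> Object grep(List<T> items, Predicate<T> test, Context ctx) {
        List<T> matches = items.stream().filter(test).collect(Collectors.toList());
        // Returning the list unconditionally is what produced the
        // ClassCastException when a scalar was expected.
        return ctx == Context.SCALAR ? (Object) Integer.valueOf(matches.size()) : matches;
    }

    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(1, 2, 3);
        System.out.println(grep(a, x -> x > 1, Context.SCALAR)); // 2
        System.out.println(grep(a, x -> x > 1, Context.LIST));   // [2, 3]
    }
}
```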

4. Array Dereference Assignment (+1 test)

  • Added support for @$r = list assignments
  • Properly dereferences scalar containing array reference
  • Handles context propagation for scalar/list return values

Example:

my $r = [];
@$r = split /:/, "a:b", 2;  # Now works correctly

5. Build System Improvement

  • Updated Makefile dev target to include shadowJar
  • Ensures JAR with all dependencies is built to target/ directory
  • Fixes issue where jperl wrapper was using outdated JAR

Technical Details

New Opcodes

  • GET_REPLACEMENT_REGEX (236): Creates compiled replacement regex from pattern, replacement text, and flags
  • SUBSTR_VAR (237): Variable-argument substr operation with optional replacement

Modified Components

  • CompileOperator.java - Added replaceRegex and substr operator compilation
  • CompileBinaryOperator.java - Added =~ binding for regex operators (replaceRegex, matchRegex, tr, transliterate)
  • CompileBinaryOperatorHelper.java - Fixed grep to respect calling context
  • CompileAssignment.java - Added array dereference assignment support (@$ref = list)
  • BytecodeInterpreter.java - Added opcode handlers with proper context handling
  • InterpretedCode.java - Added disassembly support for new opcodes
  • Opcodes.java - Defined new opcodes and updated LASTOP
  • Makefile - Added shadowJar to dev target

Testing

To verify the fix:

JPERL_EVAL_USE_INTERPRETER=1 perl dev/tools/perl_test_runner.pl perl5_t/t/perf/benchmarks.t

All 1960 tests now pass with the interpreter backend, closing the gap with the JVM compiler backend.

Related Issues

This PR addresses interpreter backend compatibility, enabling 46x faster eval STRING compilation while maintaining 100% feature parity with the baseline on this critical performance benchmark suite.

🤖 Generated with Claude Code

fglock and others added 10 commits February 19, 2026 19:03
Fixed register reuse bug in foreach loops that caused ClassCastException
when trying to cast Integer to Iterator.

Root cause:
- Iterator register (iterReg) was allocated before enterScope()
- Inside loop body, recycleTemporaryRegisters() would reset nextRegister
- getHighestVariableRegister() didn't account for iterator register
- On second iteration, iterator register got reused for LOAD_INT
- FOREACH_NEXT_OR_EXIT tried to cast Integer to Iterator -> crash

Solution:
1. Allocate both iterReg and varReg BEFORE enterScope()
2. Modified recycleTemporaryRegisters() to use Math.max() to preserve
   baseRegisterForStatement set by enterScope()
3. This protects iterator and loop variable registers across iterations
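
The register-protection idea above can be sketched as a toy allocator. The method names follow the commit message, but the class, the fields, and the `highestVariableRegister` parameter are assumptions made for illustration; the real compiler presumably computes that value itself via getHighestVariableRegister().

```java
// Toy register allocator illustrating the fix; not the BytecodeCompiler code.
final class RegisterAllocatorSketch {
    private int nextRegister = 0;
    private int baseRegisterForStatement = 0;

    int allocateRegister() {
        return nextRegister++;
    }

    /** Everything allocated before this call is protected from recycling. */
    void enterScope() {
        baseRegisterForStatement = nextRegister;
    }

    /** Frees temporaries, but never drops below the base set by enterScope(). */
    void recycleTemporaryRegisters(int highestVariableRegister) {
        // Without Math.max, the base would fall back to highestVariableRegister + 1,
        // which does not account for the iterator register.
        baseRegisterForStatement =
                Math.max(baseRegisterForStatement, highestVariableRegister + 1);
        nextRegister = baseRegisterForStatement;
    }

    public static void main(String[] args) {
        RegisterAllocatorSketch alloc = new RegisterAllocatorSketch();
        int userVar = alloc.allocateRegister();  // register 0: an ordinary variable
        int iterReg = alloc.allocateRegister();  // register 1: foreach iterator
        int varReg = alloc.allocateRegister();   // register 2: loop variable
        alloc.enterScope();                      // base is now 3
        alloc.allocateRegister();                // a temporary inside the loop body
        alloc.recycleTemporaryRegisters(userVar);
        int temp = alloc.allocateRegister();
        System.out.println(temp != iterReg && temp != varReg); // true
    }
}
```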

Test improvements:
- perf/benchmarks.t: 1886/1960 -> 1907/1960 (+21 tests, 96.2% -> 97.3%)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fixed array and hash assignments to return scalar values when used
in scalar context (e.g., `!(@a = @b)`).

Root cause:
- Array/hash assignments always returned the array/hash object
- When used in scalar context (e.g., NOT operator), this caused
  ClassCastException: RuntimeArray cannot be cast to RuntimeScalar

Solution:
- Check savedContext (the calling context before assignment modified it)
- If SCALAR context, emit ARRAY_SIZE after ARRAY_SET_FROM_LIST/HASH_SET_FROM_LIST
- ARRAY_SIZE calls .scalar() which converts array to size, hash to bucket info

Test improvements:
- perf/benchmarks.t: 1907/1960 -> 1908/1960 (+1 test)
- expr::aassign::boolean now passes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fixed hash variable references to return scalar values when used
in scalar context (e.g., `!%h`).

Root cause:
- Hash references (%) always returned RuntimeHash object
- When used in scalar context (e.g., NOT operator), this caused
  ClassCastException: RuntimeHash cannot be cast to RuntimeScalar

Solution:
- Check currentCallContext == RuntimeContextType.SCALAR for % operator
- Emit ARRAY_SIZE after loading hash (calls .scalar() for conversion)
- This matches the pattern already used for @ (array) operator

Test improvements:
- expr::hash::bool_empty, expr::hash::bool_full now pass
- Hash boolean expressions work correctly in interpreter

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
After refactoring BytecodeCompiler into CompileOperator.java,
the opcode generator was looking for GENERATED_OPERATORS markers
in the wrong file.

Changed bytecode_compiler_file path from:
  BytecodeCompiler.java -> CompileOperator.java

Note: Generator still needs updates to work with static context
(generate bytecodeCompiler.method() instead of this.method())

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implemented sprintf for interpreter eval STRING execution.

Changes:
- Added SPRINTF opcode (234) to Opcodes.java
- Implemented sprintf handler in BytecodeInterpreter.java
- Added sprintf disassembly in InterpretedCode.java
- Added sprintf compilation in CompileBinaryOperator.java
- Handles sprintf as BinaryOperatorNode (format, args_list)
- Creates RuntimeList from arguments and calls SprintfOperator.sprintf()

Test improvements:
- perf/benchmarks.t: 1905/1960 -> 1936/1960 (+31 tests, 97.2% -> 98.8%)
- All sprintf tests now pass

Format: SPRINTF rd formatReg argsListReg

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fixed keys() and values() to work correctly in scalar context.

Root cause:
- keys/values compiled operand in current context
- When in scalar context, %hash was converted to scalar (bucket info)
- HASH_KEYS/HASH_VALUES tried to cast scalar to hash -> ClassCastException

Solution:
- Force LIST context when compiling keys/values operands
- Check savedContext after HASH_KEYS/HASH_VALUES
- If scalar context, emit ARRAY_SIZE to convert list to count
- This matches Perl semantics: keys in scalar = count, keys in list = keys
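
The emit order described above can be sketched as follows. Opcodes are modelled as plain strings; HASH_KEYS and ARRAY_SIZE come from this commit message, while `LOAD_HASH`, the class, and the method are hypothetical placeholders.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the compile step for keys(%h); not CompileOperator code.
final class KeysCompileSketch {
    enum Context { SCALAR, LIST }

    /** Returns the opcode sequence emitted for keys(%h) in the given calling context. */
    static List<String> compileKeys(Context callingContext) {
        List<String> code = new ArrayList<>();
        Context savedContext = callingContext;
        // The operand is always compiled in LIST context so that %h stays a hash
        // instead of being collapsed to a scalar first.
        code.add("LOAD_HASH");   // placeholder name for loading the hash operand
        code.add("HASH_KEYS");
        if (savedContext == Context.SCALAR) {
            code.add("ARRAY_SIZE");  // convert the key list to a count
        }
        return code;
    }

    public static void main(String[] args) {
        System.out.println(compileKeys(Context.LIST));    // [LOAD_HASH, HASH_KEYS]
        System.out.println(compileKeys(Context.SCALAR));  // [LOAD_HASH, HASH_KEYS, ARRAY_SIZE]
    }
}
```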

Test improvements:
- perf/benchmarks.t: 1936/1960 -> 1946/1960 (+10 tests, 98.8% -> 99.3%)
- All keys/values boolean and scalar context tests pass

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implemented chop operator for removing last character from string.

Changes:
- Added CHOP opcode (235) to Opcodes.java
- Implemented chop handler in BytecodeInterpreter.java
- Added chop disassembly in InterpretedCode.java
- Added chop compilation in CompileOperator.java
- Handles ListNode wrapping of operand (common parser pattern)
- Calls StringOperators.chopScalar() which modifies in place

Test improvements:
- perf/benchmarks.t: 1946/1960 -> 1948/1960 (+2 tests, 99.3% -> 99.4%)
- func::index::utf8_position_1, func::length::bool0_utf8 now pass

Format: CHOP rd scalarReg

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implement remaining operators and fix context handling to reach 100% pass
rate (1960/1960 tests) on perf/benchmarks.t when running with
JPERL_EVAL_USE_INTERPRETER=1.

Operators implemented:
- replaceRegex (s///): Added GET_REPLACEMENT_REGEX opcode (236) for regex
  substitution with proper =~ binding and context propagation
- substr: Added SUBSTR_VAR opcode (237) for variable-argument substr with
  optional replacement parameter
- Array dereference assignment: Added support for @$r = list assignments

Context fixes:
- grep: Fixed to return scalar count in SCALAR context instead of always
  returning RuntimeList, preventing ClassCastException in boolean context
- Assignment: Properly handle scalar/list context for array assignments

Build system:
- Updated Makefile dev target to include shadowJar, ensuring JAR with all
  dependencies is built to target/ directory

Progress: 1886/1960 (96.2%) → 1960/1960 (100.0%) tests passing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ation overhead

Extracted 84 cold-path opcodes from the main execute() switch statement to
reduce method bytecode size from 12,066 bytes to 8,228 bytes (32% reduction).
This keeps the method close to the 8KB JIT compilation threshold for optimal
performance.

Changes:
- Created OpcodeHandlerExtended.java with 30+ handler methods for:
  * String/regex operations (SPRINTF, CHOP, MATCH_REGEX, etc.)
  * Assignment operators (17 compound assignment ops)
  * Bitwise operations (8 binary bitwise ops)
  * String utilities (CHOMP, INDEX, RINDEX, POS, etc.)
  * I/O operations (PRINT, SAY, OPEN, READLINE)
  * Increment/decrement operations
  * Closure and iterator operations

- Created OpcodeHandlerFileTest.java with consolidated file test handler
  using secondary switch for all 27 FILETEST_* opcodes

- Updated BytecodeInterpreter.execute() to delegate to new handlers

Performance: Interpreter backend (0.68s) is 9% faster than JVM compiler (0.75s)
on perf/benchmarks.t with 1960 tests. All tests passing at 100%.
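
The hot/cold split can be sketched in miniature. Only the CHOP opcode number (235) comes from this PR; `DUP_STRING`, the class names, and the string-in/string-out signature are hypothetical and far simpler than the real execute() loop.

```java
// Toy dispatcher illustrating cold-path extraction; not BytecodeInterpreter code.
final class HotColdDispatchSketch {
    static final int DUP_STRING = 1;  // hypothetical hot opcode
    static final int CHOP = 235;      // opcode number taken from this PR

    /** Hot path: keeps only frequently used cases so the method's bytecode stays small. */
    static String execute(int opcode, String operand) {
        switch (opcode) {
            case DUP_STRING:
                return operand + operand;
            default:
                // Rare opcodes live in a separate method, so they do not count
                // toward the size of this one.
                return ColdHandlers.execute(opcode, operand);
        }
    }

    static final class ColdHandlers {
        static String execute(int opcode, String operand) {
            switch (opcode) {
                case CHOP:
                    return operand.isEmpty()
                            ? operand
                            : operand.substring(0, operand.length() - 1);
                default:
                    throw new IllegalArgumentException("Unknown opcode: " + opcode);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(execute(DUP_STRING, "ab"));  // abab
        System.out.println(execute(CHOP, "hello"));     // hell
    }
}
```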

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When $string =~ m/pattern/ is compiled, the =~ binding adds the string to
the matchRegex operand list. However, matchRegex was only creating the regex
object and returning it, instead of checking if a string was provided and
performing the match.

This caused eval'd regex matches with the interpreter to return the regex
object (?^:pattern) instead of the boolean match result.

Changes:
- Updated matchRegex compilation to check if element 3 (string) exists
- If string is provided, emit MATCH_REGEX opcode to perform the match
- If no string, return the regex object (for cases like: $r = qr/pattern/)
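
The decision above can be modelled in a few lines of Java. This is a runtime-level sketch under assumptions: `MatchRegexSketch` is a hypothetical class, and a null target stands in for "no string operand was bound by =~".

```java
import java.util.regex.Pattern;

// Hypothetical model of the matchRegex decision; not the compiler's code.
final class MatchRegexSketch {
    /** Returns a Boolean match result when a target is bound, else the compiled regex. */
    static Object matchRegex(String pattern, String target) {
        Pattern regex = Pattern.compile(pattern);
        if (target == null) {
            return regex;  // no bound string: hand back the regex object
        }
        // A bound string means the match must actually run (the MATCH_REGEX step).
        return regex.matcher(target).find();
    }

    public static void main(String[] args) {
        System.out.println(matchRegex("b+", "abc"));  // true
        System.out.println(matchRegex("z", "abc"));   // false
        System.out.println(matchRegex("b+", null) instanceof Pattern);  // true
    }
}
```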

Test results:
- re/regexp.t with JPERL_EVAL_USE_INTERPRETER: 78.6% (was 17.9%)
- perf/benchmarks.t with JPERL_EVAL_USE_INTERPRETER: 100% (maintained)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fglock merged commit 09ce7d1 into master on Feb 20, 2026
2 checks passed
fglock deleted the fix/test-regressions branch on February 20, 2026 08:59