Add HASH_DEREF_FETCH and ARRAY_DEREF_FETCH superoperators #304

Merged
fglock merged 10 commits into master from feature/superoperators
Mar 12, 2026

@fglock fglock commented Mar 12, 2026

Summary

This PR adds two superoperators to the bytecode interpreter, each with a strict and a non-strict variant. Each superoperator replaces a three-instruction sequence with a single instruction:

HASH_DEREF_FETCH (opcode 381)

  • Pattern replaced: DEREF_HASH + LOAD_STRING + HASH_GET
  • Format: HASH_DEREF_FETCH rd hashref_reg key_string_idx
  • Optimizes: $hashref->{key} with bareword or string literal keys
  • Impact: Eliminates ~2,498 instruction sequences in ExifTool tests (~7.3% of hash operations)

ARRAY_DEREF_FETCH (opcode 382)

  • Pattern replaced: DEREF_ARRAY + LOAD_INT + ARRAY_GET
  • Format: ARRAY_DEREF_FETCH rd arrayref_reg index_immediate
  • Optimizes: $arrayref->[n] with integer literal indices
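
The compile-time choice between the fused opcode and the original sequence can be sketched roughly as follows. This is a minimal illustration, not the code in BytecodeCompiler.java: the class `SuperopEmitSketch`, its methods, the numeric values of `DEREF_HASH`, `LOAD_STRING` and `HASH_GET`, and the operand layout of those three instructions are all assumptions. Only opcode 381 and the `rd hashref_reg key_string_idx` operand order come from this PR.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical emitter sketch; not the project's BytecodeCompiler.
class SuperopEmitSketch {
    static final short DEREF_HASH = 1;          // assumed value
    static final short LOAD_STRING = 2;         // assumed value
    static final short HASH_GET = 3;            // assumed value
    static final short HASH_DEREF_FETCH = 381;  // from this PR

    final List<Short> code = new ArrayList<>();
    final List<String> stringPool = new ArrayList<>();
    private int nextReg = 0;

    int allocReg() { return nextReg++; }

    // Returns the string-pool index for s, adding it if needed.
    int intern(String s) {
        int i = stringPool.indexOf(s);
        if (i < 0) { stringPool.add(s); i = stringPool.size() - 1; }
        return i;
    }

    void emit(int... shorts) { for (int s : shorts) code.add((short) s); }

    // Emits code for $ref->{key}. literalKey is null when the key is
    // computed at runtime; in that case keyReg already holds the key.
    int emitHashFetch(int refReg, String literalKey, int keyReg, boolean fuse) {
        int rd = allocReg();
        if (fuse && literalKey != null) {
            // Fused form: HASH_DEREF_FETCH rd hashref_reg key_string_idx
            emit(HASH_DEREF_FETCH, rd, refReg, intern(literalKey));
            return rd;
        }
        // Fallback: the original instruction sequence (assumed layout).
        int hashReg = allocReg();
        emit(DEREF_HASH, hashReg, refReg);
        if (literalKey != null) {
            keyReg = allocReg();
            emit(LOAD_STRING, keyReg, intern(literalKey));
        }
        emit(HASH_GET, rd, hashReg, keyReg);
        return rd;
    }

    public static void main(String[] args) {
        SuperopEmitSketch fused = new SuperopEmitSketch();
        fused.emitHashFetch(fused.allocReg(), "key", -1, true);
        SuperopEmitSketch plain = new SuperopEmitSketch();
        plain.emitHashFetch(plain.allocReg(), "key", -1, false);
        System.out.println(fused.code.size() + " vs " + plain.code.size());
    }
}
```

The same shape would apply to ARRAY_DEREF_FETCH, with the integer literal stored as an immediate operand instead of a string-pool index.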

Files Changed

  • Opcodes.java: Added opcode constants 381-384 (strict and non-strict variants)
  • CompileBinaryOperator.java: Pattern detection for -> operator
  • BytecodeCompiler.java: Helper methods and pattern detection
  • BytecodeInterpreter.java: Execution handlers for all superoperators
  • Disassemble.java: Disassembly support
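
As a rough picture of the four constants and how a compiler might pick between them: the opcode numbers and the `_NONSTRICT` names below are taken from this PR's commit log, while the class name `SuperopOpcodesSketch` and the two selector methods are hypothetical and only illustrate the rule "use the NONSTRICT variant when strict refs is not enabled".

```java
// Hypothetical holder class; the real constants live in Opcodes.java.
class SuperopOpcodesSketch {
    static final short HASH_DEREF_FETCH = 381;
    static final short ARRAY_DEREF_FETCH = 382;
    static final short HASH_DEREF_FETCH_NONSTRICT = 383;
    static final short ARRAY_DEREF_FETCH_NONSTRICT = 384;

    // Picks the hash superoperator for the current "strict refs" state.
    static short hashFetchOpcode(boolean strictRefs) {
        return strictRefs ? HASH_DEREF_FETCH : HASH_DEREF_FETCH_NONSTRICT;
    }

    // Picks the array superoperator for the current "strict refs" state.
    static short arrayFetchOpcode(boolean strictRefs) {
        return strictRefs ? ARRAY_DEREF_FETCH : ARRAY_DEREF_FETCH_NONSTRICT;
    }
}
```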

Example

Before (3 instructions):

DEREF_HASH r9 = %{r8}
LOAD_STRING r10 = "key"
HASH_GET r3 = r9{r10}

After (1 instruction):

HASH_DEREF_FETCH r3 = r8->{"key"}
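
To make the equivalence concrete, here is a toy interpreter loop that runs both forms. It is a sketch under several assumptions, not the handler in BytecodeInterpreter.java: hash references are modeled as plain `java.util.Map` objects, registers as an `Object[]`, and the opcode values 1 to 3 and the operand layouts of the three original instructions are invented for illustration. Only opcode 381 and its operand order come from this PR.

```java
import java.util.Map;

// Toy interpreter sketch; not the project's BytecodeInterpreter.
class SuperopInterpSketch {
    static final short DEREF_HASH = 1, LOAD_STRING = 2, HASH_GET = 3; // assumed values
    static final short HASH_DEREF_FETCH = 381;                        // from this PR

    // Stand-in for dereferencing: fails if the register is not a hash ref.
    @SuppressWarnings("unchecked")
    static Map<String, Object> asHash(Object ref) {
        if (!(ref instanceof Map)) {
            throw new IllegalStateException("Not a HASH reference");
        }
        return (Map<String, Object>) ref;
    }

    static void run(short[] code, String[] pool, Object[] regs) {
        int pc = 0;
        while (pc < code.length) {
            short op = code[pc++];
            switch (op) {
                case DEREF_HASH: {       // rd = %{rs}
                    int rd = code[pc++], rs = code[pc++];
                    regs[rd] = asHash(regs[rs]);
                    break;
                }
                case LOAD_STRING: {      // rd = pool[idx]
                    int rd = code[pc++], idx = code[pc++];
                    regs[rd] = pool[idx];
                    break;
                }
                case HASH_GET: {         // rd = hash{key}
                    int rd = code[pc++], h = code[pc++], k = code[pc++];
                    regs[rd] = asHash(regs[h]).get((String) regs[k]);
                    break;
                }
                case HASH_DEREF_FETCH: { // rd = rs->{pool[idx]}, one dispatch
                    int rd = code[pc++], rs = code[pc++], idx = code[pc++];
                    regs[rd] = asHash(regs[rs]).get(pool[idx]);
                    break;
                }
                default:
                    throw new IllegalStateException("Unknown opcode " + op);
            }
        }
    }

    public static void main(String[] args) {
        String[] pool = {"key"};
        Object[] regs = new Object[11];
        regs[8] = Map.of("key", "value");
        run(new short[]{HASH_DEREF_FETCH, 3, 8, 0}, pool, regs);
        System.out.println(regs[3]);
    }
}
```

The fused handler goes through one dispatch instead of three and does not write the two temporary registers (r9 and r10 in the example above).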

Bug Fix: RuntimeList Handling

Fixed a regression where (caller)[0] and similar expressions failed with:

Can't use string ("...") as an ARRAY ref while "strict refs" in use

Root cause: Superoperators were incorrectly used in handleGeneralArrayAccess()
where the input could be a RuntimeList (not a scalar reference).

Fix: Superoperators are now only used in the -> operator handler where the
left side is always compiled in SCALAR context.
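
The rule behind this fix can be stated as a small guard. The sketch below is illustrative only: `SuperopGuardSketch`, `OperandContext` and `canUseSuperoperator` are hypothetical names, and the real check in CompileBinaryOperator.java may be structured differently. It encodes the two conditions the description relies on: the left operand was compiled in SCALAR context, and the subscript is a literal.

```java
// Hypothetical guard; names are not taken from the project.
class SuperopGuardSketch {
    enum OperandContext { SCALAR, LIST }

    // A superoperator expects a scalar holding a reference. A left
    // operand compiled in LIST context (for example the list produced
    // by (caller)) must go through the general instruction sequence.
    static boolean canUseSuperoperator(OperandContext leftContext,
                                       boolean subscriptIsLiteral) {
        return leftContext == OperandContext.SCALAR && subscriptIsLiteral;
    }
}
```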

Test plan

  • Build succeeds: ./gradlew build
  • All existing tests pass: ./gradlew test
  • Hash access works: $hashref->{key} returns correct value
  • Array access works: $arrayref->[n] returns correct value
  • Nested access works: $data->{users}->[0]->{name}
  • (caller)[0] works correctly
  • Getopt::Long module works
  • examples/life_bitpacked.pl works
  • Disassembly shows superoperators being emitted

Generated with Devin

fglock and others added 10 commits March 12, 2026 12:03
These superoperators combine multiple bytecode instructions into single
operations for better interpreter performance:

- HASH_DEREF_FETCH (opcode 381): Combines DEREF_HASH + LOAD_STRING + HASH_GET
  for patterns like $hashref->{key} with bareword or string literal keys

- ARRAY_DEREF_FETCH (opcode 382): Combines DEREF_ARRAY + LOAD_INT + ARRAY_GET
  for patterns like $arrayref->[n] with integer literal indices

Based on bytecode analysis of ExifTool tests, HASH_DEREF_FETCH alone
eliminates ~2,498 instruction sequences (~7.3% of hash operations).

Design document: dev/design/superoperators.md

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <noreply@cognition.ai>
- Add emitHashDerefGet() and emitArrayDerefGet() helpers in BytecodeCompiler
- Refactor handleGeneralHashAccess() and handleGeneralArrayAccess() to use helpers
- Refactor CompileBinaryOperator -> operator handling to use helpers
- Enables superoperators for both $h->{a}{b} (implicit arrows) and $h->{a}->{b} (explicit arrows)
- Reduces code duplication across 3 call sites

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <noreply@cognition.ai>
…ashAccess

The superoperators (ARRAY_DEREF_FETCH, HASH_DEREF_FETCH) expect a scalar
containing a reference, but handleGeneralArrayAccess and handleGeneralHashAccess
can receive a RuntimeList (e.g., from `(caller)[0]`).

This caused `(caller)[0]` and similar expressions to fail with:
  Can't use string ("...") as an ARRAY ref while "strict refs" in use

Fix: Keep superoperators only in the -> operator handler (CompileBinaryOperator)
where the left side is always a scalar reference. For handleGeneralArrayAccess
and handleGeneralHashAccess, use the original DEREF_ARRAY/HASH + ARRAY/HASH_GET
instruction sequence which correctly handles all input types.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <noreply@cognition.ai>
- Add HASH_DEREF_FETCH_NONSTRICT (383) and ARRAY_DEREF_FETCH_NONSTRICT (384)
- Update BytecodeInterpreter with handlers for new opcodes
- Update BytecodeCompiler emitHashDerefGet/emitArrayDerefGet to use
  NONSTRICT variants when strict refs is not enabled
- Add disassembler support for NONSTRICT superoperators
- Fix (expr)[index] handling in CompileBinaryOperator (ListNode transform)

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <noreply@cognition.ai>
…lHashAccess

Now that ListNode cases like (caller)[0] are transformed to array literal
+ arrow deref before reaching these handlers, it is safe to use superoperators.

- handleGeneralArrayAccess: uses emitArrayDerefGet() for chained array access
- handleGeneralHashAccess: uses emitHashDerefGet() for chained hash access
- Removed redundant code that was duplicating the helper logic
- Changed handleGeneralArrayAccess to use SCALAR context (not LIST)

Example improvement for $v[1]{a}{b}{c}->[2]:
- Before: 50 shorts (DEREF_HASH + LOAD_STRING + HASH_GET sequences)
- After: 32 shorts (HASH_DEREF_FETCH_NONSTRICT superoperators)

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <noreply@cognition.ai>
Matches JVM backend behavior (EmitLiteral.java lines 55-56):
'Perl semantics: array literal elements are always evaluated in LIST context'

This fixes regressions in op/bop.t tests 36-38 where (keys %h)[0] was
incorrectly returning the count instead of the first key, because
keys was evaluated in SCALAR context.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <noreply@cognition.ai>
@fglock fglock merged commit 131b29a into master Mar 12, 2026
2 checks passed
@fglock fglock deleted the feature/superoperators branch March 12, 2026 14:26