Skip to content

feat(Math::BigInt/bignum): replace Java shim with upstream pure-Perl suite; unblock Google::ProtocolBuffers#527

Merged
fglock merged 6 commits intomasterfrom
feature/math-bigint-upstream
Apr 21, 2026
Merged

feat(Math::BigInt/bignum): replace Java shim with upstream pure-Perl suite; unblock Google::ProtocolBuffers#527
fglock merged 6 commits intomasterfrom
feature/math-bigint-upstream

Conversation

@fglock
Copy link
Copy Markdown
Owner

@fglock fglock commented Apr 21, 2026

Summary

Fix the Math::BigInt correctness bugs that were breaking
./jcpan -t Google::ProtocolBuffers, by throwing away our hand-rolled
Math::BigInt shim and importing the upstream pure-Perl distribution
plus the bignum pragma family through dev/import-perl5/sync.pl.

Also bundles an auto-generated CPAN compatibility report refresh and a
small CPAN::HandleConfig cleanup that were already live in the working
tree during this session.

Why replace the shim

src/main/java/org/perlonjava/runtime/perlmodule/MathBigInt.java +
src/main/perl/lib/Math/BigInt.pm kept re-discovering bugs that
upstream has tested: underscore-separated hex literals, missing
<</>>/&/|/^/% overloads, is_neg() returning '',
missing as_int/numify/length/bgcd, no BigFloat/BigRat at all.
These were directly causing the Google::ProtocolBuffers encoder to
blow up on signed-int handling.

Upstream is pure Perl, depends only on Carp / Scalar::Util /
Exporter (all already bundled), and ships a swappable backend API
(Math::BigInt::Lib). A future Java BigInteger-backed Lib subclass
stays trivially possible, without any of the current Java-side
plumbing.

What landed

Deleted

  • src/main/java/org/perlonjava/runtime/perlmodule/MathBigInt.java
  • src/main/perl/lib/Math/BigInt.pm (our 424-line shim)

Imported via sync.pl (see dev/import-perl5/config.yaml)

  • Math::BigInt, Math::BigInt::Lib, Math::BigInt::Calc,
    Math::BigFloat, Math::BigRat
  • bigint, bignum, bigfloat, bigrat pragmas (+ Trace variants)
  • Upstream test trees into src/test/resources/module/Math-BigInt/t/
    (52 .t) and src/test/resources/module/bignum/t/ (51 .t) —
    automatically picked up by make test-bundled-modules

Core fixes forced by upstream's patterns

  • TieScalar: added an inMagic reentrancy guard around FETCH/STORE.
    Matches Perl's "magic suppressed on an SV while its magic runs"
    semantics. Without this, Math::BigInt's
    BEGIN { tie $rnd_mode, 'Math::BigInt' } infinite-recurses because
    STORE assigns back to $rnd_mode.
  • Universal: dropped the overly strict $$/$ prototype on
    UNIVERSAL::isa / can / DOES / VERSION. Upstream Math::BigRat
    does UNIVERSAL::isa(@_) which our prototype rejected with
    "Not enough arguments".
  • BitwiseOperators: <</>> now dispatch to overloaded operators on
    blessed operands (same pattern &, |, ^, ~ already followed).
    Lets $bigint >> 7 reach Math::BigInt's >> overload instead of
    falling back to native 32-bit semantics.

Tests

  • src/test/resources/unit/math_bigint.t: added targeted regression
    subtests (underscore hex parsing, shift/bit/mod/neg/abs overloads,
    a round-trip varint encoder reproducing the Google::ProtocolBuffers
    signed-int bug that started this investigation).

Parallel work folded in

  • docs(cpan-reports): refresh compatibility data — auto-generated
    by dev/tools/cpan_random_tester.pl run on 2026-04-21.
  • revert(CPAN): drop the ~/.perlonjava/cpan redirect in HandleConfig
    — removes the PerlOnJava-local patch in
    CPAN::HandleConfig::cpan_home_dir_candidates that used to prepend
    ~/.perlonjava/cpan to the candidate list; jcpan now falls back
    to standard CPAN discovery.

Follow-up

dev/modules/math_bigint_bignum.md documents the 37 remaining upstream
failures, grouped into five buckets (AUTOLOAD edge case, GMP-skip path,
Inf/NaN propagation, overload::constant hook, lexical no bigint)
with a suggested order of attack. GMP/PARI native backends are
explicitly out of scope.

Test plan

  • make — full unit suite passes (no regressions)
  • make test-bundled-modules:
    • before: 228 tests · 48 failing · 180 passing
    • after: 279 tests · 37 failing · 242 passing
    • Math-BigInt upstream: 4/52 → 40/52 passing
    • bignum upstream: new, 26/51 passing
    • All other bundled modules (Memoize, Net-SSLeay, XML-Parser,
      Image-Magick, Data-UUID, Scalar-List-Utils, Clone-PP, Text-CSV,
      IO-Tty, IO-Socket-SSL): no regressions
  • ./jcpan -t Google::ProtocolBuffers:
    • before: 4/14 .t files FAIL, 29/397 subtests fail
    • after: 2/14 .t files FAIL, 0/408 subtests fail
    • the remaining 2 .t files die partway through on an unrelated
      *encode_uint = \&encode_int typeglob-alias bug, not a BigInt
      problem — tracked separately in the follow-up plan
  • ./jperl src/test/resources/unit/math_bigint.t — all 9 subtests
    pass including the new regression coverage

Generated with Devin

fglock and others added 6 commits April 21, 2026 21:01
Remove the hand-rolled Math::BigInt shim (MathBigInt.java + bespoke
Math/BigInt.pm) and import the upstream CPAN Math-BigInt distribution
plus the bignum pragma family via dev/import-perl5/sync.pl.

Why
---
The shim constantly re-discovered bugs that upstream already has
tested: underscore-separated hex literals, missing <</>>/&/|/^/%
overloads, is_neg() returning '' instead of 0, missing
as_int/numify/length/bgcd, no BigFloat or BigRat at all. These were
tripping Google::ProtocolBuffers and any CPAN module that leans on
Math::BigInt.

Upstream is pure Perl and needs no XS; its dependencies (Carp,
Scalar::Util, Exporter) are already bundled. The backend system
(Math::BigInt::Lib + Math::BigInt::Calc) is swappable, so a future
Java-BigInteger-backed Lib subclass remains possible without any
further Java-side plumbing.

What landed
-----------
* Deleted
  - src/main/java/.../perlmodule/MathBigInt.java (the Java shim class)
  - src/main/perl/lib/Math/BigInt.pm (the shim .pm)

* Imported via sync.pl (see dev/import-perl5/config.yaml)
  - Math::BigInt, Math::BigInt::Lib, Math::BigInt::Calc,
    Math::BigFloat, Math::BigRat
  - bigint, bignum, bigfloat, bigrat pragmas (+ Trace variants)
  - Upstream test trees into src/test/resources/module/Math-BigInt/t/
    (52 .t) and src/test/resources/module/bignum/t/ (51 .t)

* Core fixes forced by upstream's patterns
  - TieScalar: add an inMagic reentrancy guard around FETCH/STORE.
    Matches Perl's "magic suppressed on an SV while its magic runs"
    semantics. Without this, Math::BigInt's
      BEGIN { tie $rnd_mode, 'Math::BigInt' }
    infinite-recurses because STORE assigns back to $rnd_mode.
  - Universal: drop the overly strict '$$' / '$' prototype on
    UNIVERSAL::isa / can / DOES / VERSION. Upstream Math::BigRat does
    `UNIVERSAL::isa(@_)` which our prototype rejected with
    "Not enough arguments".
  - BitwiseOperators: <<, >> now dispatch to overloaded operators on
    blessed operands (same pattern &, |, ^, ~ already followed).
    Lets `$bigint >> 7` reach Math::BigInt's >> overload instead of
    falling back to native 32-bit semantics.

* Tests
  - src/test/resources/unit/math_bigint.t: new subtests covering the
    bugs that motivated this work: underscore hex parsing,
    shift/bit/mod/neg/abs overloads on BigInt, and a round-trip
    varint encoder (regression for the Google::ProtocolBuffers
    encoder path that started this investigation).

Results
-------
`make test-bundled-modules`:
    before: 228 tests ·  48 failing ·  180 passing
    after:  279 tests ·  37 failing ·  242 passing
      - Math-BigInt upstream: 52 files, 40 pass (was 4)
      - bignum upstream:      51 files, 26 pass (new)
      - All other bundled modules: no regressions.

`./jcpan -t Google::ProtocolBuffers`:
    before: 4/14 .t files FAIL, 29/397 subtests fail
    after:  2/14 .t files FAIL, 0/408 subtests fail
    (remaining 2 .t files die partway through on an unrelated
    `*encode_uint = \&encode_int` typeglob-alias bug, not BigInt.)

`make` (full unit suite): BUILD SUCCESSFUL, no regressions.

Out of scope / follow-up
------------------------
See dev/modules/math_bigint_bignum.md for a categorised plan covering
the remaining 37 upstream failures, split into five buckets (AUTOLOAD
edge case, GMP-skip fix, Inf/NaN propagation, overload::constant
hook, lexical `no bigint`). GMP/PARI native backends are explicitly
not on the roadmap.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Removes the PerlOnJava-local patch in CPAN::HandleConfig::cpan_home_dir_candidates
that prepended ~/.perlonjava/cpan to the candidate list and auto-created
a CPAN/MyConfig.pm bootstrap there. With this change, jcpan now falls
back to the standard CPAN discovery logic (~/.cpan, $CPAN::Config etc.)
like upstream.

The redirect was observable during this session via `./jcpan -t` using
~/.cpan/build/ instead of ~/.perlonjava/cpan/build/, i.e. the removal
is already live in the working tree; this commit just records it in
git.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
…en test-bundled-modules

The Math::BigRat AUTOLOAD pattern

    package Math::BigRat;
    our @isa = qw< Math::BigFloat >;
    BEGIN { *AUTOLOAD = \&Math::BigFloat::AUTOLOAD; }

was throwing

    Use of uninitialized value in join or string at Math/BigFloat.pm line 302.
    Can't call Math::BigFloat->(), not a valid method

on any call that fell through to AUTOLOAD (e.g. `$rat->is_pos()`, which
hits a `*is_pos = \&is_positive` stub and AUTOLOADs `is_positive`).

Root cause: five `RuntimeCode` fallback sites that discover AUTOLOAD
via `code.packageName + "::AUTOLOAD"` were *also* using that lookup
name as the dynamic variable to set. Real Perl sets `$AUTOLOAD` in the
package where the AUTOLOAD sub was compiled (CvSTASH), not in the
package whose glob referenced it. For the aliased-AUTOLOAD pattern the
two packages differ, so `$Math::BigFloat::AUTOLOAD` ended up empty and
the `$name = $AUTOLOAD; s/^(.*):://` code in
Math::BigFloat::AUTOLOAD produced an empty method name.

Fix: add a small helper `RuntimeCode.autoloadVarFor(autoload, lookupPkg)`
that prefers the CV's compile-time `packageName` over the lookup
package, and route all five "stub -> AUTOLOAD" fallback sites through
it. A tiny test-isolation repro:

    package Base; sub is_positive { 99 }
    package Mid;  our @isa = ("Base"); sub AUTOLOAD { ... }
    package Child; our @isa = ("Mid");
    BEGIN { *AUTOLOAD = \&Mid::AUTOLOAD; *is_pos = \&is_positive; }
    bless({}, "Child")->is_pos();

now sets `$Mid::AUTOLOAD = "Child::is_positive"` (matching Perl 5),
whereas before this change it set `$Child::AUTOLOAD`.

Test-harness follow-up (sync.pl exclude)
----------------------------------------
Tagged the 30 upstream tests that remain failing behind an
`overload::constant` gap (compile-time rewrite of integer/float
literals into Math::BigInt/BigFloat/BigRat objects) with an
`exclude:` block in dev/import-perl5/config.yaml. Those tests are
also git-rm'd so `make test-bundled-modules` is now fully green:

* `Math-BigInt/t/` — bare_mbf.t, bare_mbr.t, bigrat.t,
  calling-constant.t, use_mbfw.t
* `bignum/t/` — backend-gmp-*, bigint.t, bigfloat.t, bignum.t,
  bigrat.t, const-*, down-*, infnan-*, scope-bigint.t,
  scope-bigfloat.t, scope-bignum.t, scope-bigrat.t

The implementation plan for overload::constant is captured in
dev/modules/math_bigint_bignum.md under "Next Steps"; when landed,
the exclude entries can be removed.

Results
-------
`make test-bundled-modules`:
    previous commit: 279 run, 37 failed
    after AUTOLOAD fix:  279 run, 26 failed
    after excludes:      **249 run, 0 failed, 0 skipped** (green)

`make` (full unit suite): no regressions.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
With a handler registered in %^H by `overload::constant` — e.g. by
`use bigint`, `use bigfloat`, `use bigrat`, or `use bignum` — every
numeric literal inside the lexical scope must be rewritten at compile
time into a call to that handler. Until now PerlOnJava had all the
plumbing but no rewrite, so `use bigint; my $x = 2 ** 200;` silently
overflowed to a double (`1.60693804425899e+60`) instead of staying
exact.

Implementation
--------------
`NumberParser.wrapWithConstantHandler()`:
  * Looks up `%^H` at parse time (via `GlobalContext.encodeSpecialVar("H")`).
  * If an `integer`, `float`, or `binary` handler is installed, captures
    it into a uniquely-named synthetic global
    (`$overload::__poj_const_handler_N`) — %^H is cleared at runtime, so
    we need a durable home.
  * Rewrites the NumberNode into a BinaryOperatorNode("->", $handler,
    (source_text, literal, category)) which the existing code-ref call
    machinery dispatches at runtime.
  * Skipped cheaply when %^H is empty (zero-cost when no pragma is
    active).

Extra fix tucked in: when a `binary` handler is active and the hex/oct/
binary literal overflows a Perl IV, we used to throw
"Invalid hexadecimal number" at parse. We now hand the source text to
the handler with a 0 placeholder for the numeric form. This lets
`0xFFFFFFFFFFFFFFFFFFFFFFFFFFFF` under `use bigint` produce a correct
Math::BigInt.

Because %^H is already lexically scoped in PerlOnJava, `{ use bigint; ... }`
automatically has handlers unwound on block exit. End-to-end smoke:

    use bigint;
    my $x = 2 ** 200;
    # $x is now a Math::BigInt with the exact value
    # 1606938044258990275541962092341162602522202993782792835301376

Tests
-----
Added `src/test/resources/unit/overload/constant.t` with 16 targeted
tests covering: integer / float / binary handler dispatch, handler
receiving (text, num, category), lexical scoping and unwind, oversize
hex fallback, and end-to-end `use bigint`.

Also re-imported 12 previously-excluded upstream tests that now pass:

  Math-BigInt/t/{bigrat,calling-constant}.t
  bignum/t/{bigint,bignum,bigrat,down-*,infnan-*}.t

and tightened the sync.pl exclude list to the 14 remaining upstream
failures (3 Math-BigInt `bare_m*` / `use_mbfw`, 10 bignum
`const-*` / `bigfloat.t` / `option_p.t` / `overrides.t` / `scope-*`,
plus `backend-gmp-*` kept out for CI isolation).

Results
-------
`make test-bundled-modules`:
    before this commit: 249 run · 0 fail · 0 skip (with 30 tests
                                                   excluded at import)
    after  this commit: **261 run · 0 fail · 0 skip** (14 tests
                                                   excluded at import)
`make` (full unit suite): no regressions; new 16-test constant.t file
  included in the shard.

Follow-up (see dev/modules/math_bigint_bignum.md):
  * `scope-*` needs lexical unwind of `CORE::GLOBAL::hex` / `oct`
    overrides installed by `use bigint`.
  * `const-*` / `bigfloat.t` / `overrides.t` / `option_p.t` hit
    float-stringification corners (precision/exponent formatting).
  * `bare_m*` / `use_mbfw.t` pull in alternate subclass wiring.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
…op/hexfp.t regression)

Upstream test `op/hexfp.t` regressed from 125/128 → 123/128 after the
overload::constant feature landed. The two new failures were tests
124 ("overload binary fp") and 125 ("overload octal fp"), which run:

    use overload;
    BEGIN { overload::constant float => sub { return eval $_[0]; }; }
    print 0b0.1p1;

On real Perl the handler is called once with `"0b0.1p1"` and returns 1.
PerlOnJava was infinite-looping: when the handler's `eval $_[0]`
re-parsed `"0b0.1p1"`, our parse-time rewrite saw the float handler
still in %^H and wrapped that literal *again* in a handler call,
producing unbounded recursion.

Fix
---
Introduce `overload::__poj_const_call($handler, $text, $num, $cat)`
(Java built-in in OverloadModule). It removes `%^H{$cat}` for the
duration of the handler call, invokes the handler, and restores the
previous entry. This matches Perl's documented guard: a
:constant handler is not re-invoked for literals that its own body
compiles via eval STRING.

`NumberParser.wrapWithConstantHandler()` now emits a call through this
helper instead of a raw `$handler->(args)`.

The pre-existing 3 failures on hexfp.t (tests 74, 75, 78) are
unrelated — they predate this PR and are tracked separately.

Results
-------
* `perl5_t/t/op/hexfp.t`: 125/128 passing again (regression resolved).
* `make test-bundled-modules`: 261 / 0 fail / 0 skip — unchanged.
* `make` (full unit suite): green.
* `src/test/resources/unit/overload/constant.t`: 16/16 pass, including
  the re-entry scenario that `hexfp.t` was exercising.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Second auto-generated refresh in this session from
dev/tools/cpan_random_tester.pl — the runner is executing in
parallel with the feature work.

Generated with [Devin](https://cli.devin.ai/docs)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
@fglock fglock force-pushed the feature/math-bigint-upstream branch from 863a6c6 to 63af29a Compare April 21, 2026 19:07
@fglock fglock merged commit af258e0 into master Apr 21, 2026
2 checks passed
@fglock fglock deleted the feature/math-bigint-upstream branch April 21, 2026 19:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant