[jvm] auto-detect Java SAM interfaces; honor SAM in overload resolution #12901

Merged
Simn merged 10 commits into development from jvm/SAM on May 16, 2026

@kLabz
Contributor

@kLabz kLabz commented May 13, 2026

Two related changes that together make Haxe match javac's lambda-vs-SAM
(Single Abstract Method) conversion rules on the JVM target.

  1. Structural SAM detection in javaModern.ml

Previously, only interfaces carrying @FunctionalInterface were tagged
CFunctionalInterface and so eligible for implicit conversion from a
function value. The annotation is documentation in javac too — what
makes a type lambda-convertible is having exactly one abstract instance
method, per JLS §9.8. The Android SDK rarely annotates its listeners,
which forced every consumer of the JVM target to hand-roll adapter
classes for View.OnClickListener, DialogInterface.OnClickListener,
MediaPlayer.OnCompletionListener and so on.

This patch auto-tags any .jar-imported interface as @:functionalInterface
when it has exactly one abstract instance method, excluding:

  • static / default / private / synthetic methods,
  • constructors,
  • Object members re-declared as abstract (equals/hashCode/toString,
    matched by name+arity since their signatures on j.l.Object are
    unambiguous — JLS §9.8 excludes these from the SAM count).

The explicit @FunctionalInterface annotation path is preserved.
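
As a rough illustration of the rule above, the following Java sketch applies the same count through reflection. It is an analogue only: the patch itself works on class-file records inside javaModern.ml, and the names used here (`StructuralSamSketch`, `isObjectRedeclaration`, `isStructuralSam`) are made up for the example. It also ignores corner cases such as override-equivalent methods inherited from several superinterfaces.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Comparator;
import java.util.List;

final class StructuralSamSketch {
    // Object members an interface may re-declare as abstract, matched by name + arity.
    static boolean isObjectRedeclaration(Method m) {
        int arity = m.getParameterCount();
        switch (m.getName()) {
            case "equals":
                return arity == 1;
            case "hashCode":
            case "toString":
                return arity == 0;
            default:
                return false;
        }
    }

    // True when exactly one abstract instance method remains after the exclusions.
    static boolean isStructuralSam(Class<?> c) {
        if (!c.isInterface()) {
            return false;
        }
        int abstractCount = 0;
        // getMethods() only returns public members, so private interface methods never appear.
        for (Method m : c.getMethods()) {
            if (!Modifier.isAbstract(m.getModifiers())) continue; // static and default methods
            if (m.isSynthetic()) continue;
            if (isObjectRedeclaration(m)) continue;
            abstractCount++;
        }
        return abstractCount == 1;
    }

    public static void main(String[] args) {
        System.out.println(isStructuralSam(Runnable.class));   // true
        System.out.println(isStructuralSam(Comparator.class)); // true: equals is not counted
        System.out.println(isStructuralSam(List.class));       // false: many abstract methods
    }
}
```

Note that `Comparator` only passes because the re-declared `equals` is skipped; counting it would give two abstract methods.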

  2. SAM-aware overload candidate filtering

overloadResolution.ml's unify_cf used strict Type.unify to filter
candidates before the cast layer could fire. A function value passed to
an overload whose parameter is a SAM class would be discarded at the
filter step. Now, when strict unify fails, the filter retries via the
SAM conversion path — unifying the function's TFun against the SAM
method's signature, mirroring what AbstractCast does at cast time. Only
candidates whose SAM signature actually matches the function shape
survive, so this isn't a blind accept.
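
For reference, the javac behaviour that this filter is meant to match can be seen in plain Java. The sketch below uses two hypothetical, unannotated interfaces (`IntBinary`, `StringUnary`) with the same disjoint shapes as the PR's unit test (Int,Int -> Int vs String -> String); it says nothing about how unify_cf is implemented.

```java
// Neither interface carries @FunctionalInterface; both are SAMs structurally.
interface IntBinary {
    int apply(int a, int b);
}

interface StringUnary {
    String apply(String s);
}

final class SamOverloadSketch {
    static String call(IntBinary f) {
        return "IntBinary:" + f.apply(2, 3);
    }

    static String call(StringUnary f) {
        return "StringUnary:" + f.apply("x");
    }

    public static void main(String[] args) {
        // A two-parameter lambda can only fit the two-parameter SAM.
        System.out.println(call((a, b) -> a + b)); // IntBinary:5
        // A one-parameter lambda can only fit the one-parameter SAM.
        System.out.println(call(s -> s + "!"));    // StringUnary:x!
    }
}
```

In both calls only one overload has a SAM method whose shape fits the lambda, so the other candidate is dropped rather than accepted blindly.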

Two supporting fixes round this out:

  • overloads.ml rate_conv: added a TInst(SAM), TFun case so specificity
    ranking works when two SAM-parametered overloads both accept the
    same function. The rate descends through the SAM method's signature
    rather than raising Not_found, which previously discarded the
    candidate during reduce_compatible.

  • tOther.ml get_singular_interface_field and the parallel walker in
    genjvm.ml's check_functional_interface now exclude equals/hashCode/
    toString by name+arity, matching the new exclusion in javaModern so
    interfaces like Comparator (abstract equals re-declared) — and the
    Android SDK's various 'override equals' interfaces — are still
    detected as SAMs at both typer and codegen stages.

Tests:

  • tests/unit/.../IssueFunctionalInterfaceOverload.hx exercises the
    overload disambiguation path with two @:functionalInterface
    interfaces of disjoint shape (Int,Int -> Int vs String -> String).
  • tests/misc/jvm/projects/StructuralSam/ ships a .java file with six
    interface shapes — plain SAM, abstract-equals re-declaration,
    default/static method skipping, multi-abstract non-SAM, generic-arg
    SAM, and overloaded call sites — none annotated @FunctionalInterface.
    Runs the generated jar end-to-end to verify both type-checking and
    runtime closure-adapter dispatch.

Limitations matching javac:

  • Abstract-class SAMs (e.g. android.content.BroadcastReceiver) remain
    unsupported; javac doesn't allow lambda conversion to those either.
  • Two SAM-parametered overloads where both signatures would accept
    the same function value remain ambiguous and require explicit
    disambiguation.
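
The second limitation has a direct Java counterpart. In the sketch below (class name `DisambiguationSketch` is hypothetical), both overloads accept a zero-argument function returning a String, so an uncast `run(() -> "a")` is expected to be rejected by javac as ambiguous; a cast to the intended interface selects the overload. How the disambiguation is spelled on the Haxe side is not shown here.

```java
import java.util.concurrent.Callable;
import java.util.function.Supplier;

final class DisambiguationSketch {
    static String run(Supplier<String> s) {
        return "supplier:" + s.get();
    }

    static String run(Callable<String> c) {
        try {
            return "callable:" + c.call();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // The cast gives the lambda a target type, so only one overload applies.
        System.out.println(run((Supplier<String>) () -> "a")); // supplier:a
        System.out.println(run((Callable<String>) () -> "b")); // callable:b
    }
}
```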

@kLabz kLabz marked this pull request as draft May 14, 2026 17:40
@kLabz kLabz marked this pull request as ready for review May 14, 2026 21:32
@kLabz
Contributor Author

kLabz commented May 15, 2026

@EliteMasterEric would this help you with your jvm project(s)?

Can you check if your project runs into issues with this (either at compile time or runtime) with corresponding nightlies?
https://build.haxe.org/builds/haxe/linux64/haxe_2026-05-14_jvm/SAM_a18560e.tar.gz (or similar path for win/mac)

@kLabz kLabz force-pushed the jvm/SAM branch 2 times, most recently from 819c31b to 8e768fc Compare May 15, 2026 13:02
kLabz added 4 commits May 15, 2026 22:16

Auto-tagging every structurally-SAM interface as @:functionalInterface
swept in sun.reflect.ConstructorAccessor, sun.nio.ch.Interruptible,
sun.reflect.generics.tree.TypeTree and similar JDK-internal classes.
The existing JFI matcher then made every signature-matching closure
declare `implements sun.reflect.ConstructorAccessor`, which compiles
fine against the extern jar but fails to defineClass at runtime on
JDK 9+ — sun.* was moved to jdk.internal.* and is inaccessible to
application code.

This broke unit tests at startup: Issue5556 was the first class to
trigger loading the affected closure adapter, raising
NoClassDefFoundError before any test ran.

User code can't legitimately target these internal packages anyway,
so the fix is to exclude sun.*, com.sun.*, and jdk.internal.* from
structural detection. Classes that actually carry @FunctionalInterface
are still honored (none of the affected internal classes do, so this
is safe). Issue5556 serves as the regression test — its closure
adapter previously implemented sun.reflect.ConstructorAccessor and
now no longer does.
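
The exclusion described in this commit message boils down to a package-prefix test that only applies to structural detection. A minimal Java sketch, with hypothetical names (`InternalPackageFilterSketch`, `isJdkInternal`, `isConversionTarget`) and no claim about how javaModern.ml spells it:

```java
final class InternalPackageFilterSketch {
    // Prefixes taken from the commit message: sun.*, com.sun.*, jdk.internal.*
    static boolean isJdkInternal(String className) {
        return className.startsWith("sun.")
            || className.startsWith("com.sun.")
            || className.startsWith("jdk.internal.");
    }

    // An explicit @FunctionalInterface always wins; structural detection skips internal packages.
    static boolean isConversionTarget(String className, boolean annotated, boolean structurallySam) {
        if (annotated) {
            return true;
        }
        return structurallySam && !isJdkInternal(className);
    }

    public static void main(String[] args) {
        System.out.println(isConversionTarget("sun.reflect.ConstructorAccessor", false, true));   // false
        System.out.println(isConversionTarget("android.view.View$OnClickListener", false, true)); // true
        System.out.println(isConversionTarget("java.lang.Runnable", true, true));                 // true
    }
}
```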

Structural SAM detection tags every single-abstract-method interface on
the --java-lib classpath as @:functionalInterface, and a closure then
implements every one of them whose signature structurally matches. That
includes interfaces transitively loaded from the SDK jar but never named
in user code — e.g. compiling against android-34 drags in
android.os.OutcomeReceiver (API 33) and SplashScreen.OnExitAnimationListener
(API 31). A closure implementing those hard-fails class linking
(NoClassDefFoundError) the moment it loads on an older runtime, crashing
the app at activity instantiation.

Restrict the promiscuous binding to interfaces the program genuinely
converts a function to:

  - new com.functional_interfaces_used set (shared across cloned contexts,
    exposed on Gctx.t for the generator);
  - AbstractCast records each interface as its SAM-conversion branch fires,
    keyed by the @:native path since Native.apply_native_paths rewrites
    cl_path between typing and generation;
  - genjvm's Preprocessor only registers an interface in
    gctx.functional_interfaces when it is in that set, resolved by physical
    class identity in the modules loop (before check_path can rewrite the
    path again).

StructuralSam gains an Unused interface — structurally identical to
OnClick but never used as a conversion target — and asserts a closure
implements OnClick but not Unused.
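
The effect of the restriction can be sketched as a filter over the known SAM interfaces: a closure implements an interface only when the signature matches and the interface is in the used set. The Java below is a simplified model with hypothetical names and string signatures (`UsedInterfacesSketch`, `interfacesToImplement`, `"(I)V"`); it does not reflect genjvm's actual data structures.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class UsedInterfacesSketch {
    // samSignatures maps an interface name to the signature of its single abstract method.
    static List<String> interfacesToImplement(
            String closureSignature, Map<String, String> samSignatures, Set<String> used) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> e : samSignatures.entrySet()) {
            boolean shapeMatches = e.getValue().equals(closureSignature);
            if (shapeMatches && used.contains(e.getKey())) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> sams = new LinkedHashMap<>();
        sams.put("OnClick", "(I)V");
        sams.put("Unused", "(I)V"); // structurally identical, never a conversion target
        Set<String> used = new HashSet<>(Collections.singletonList("OnClick"));

        System.out.println(interfacesToImplement("(I)V", sams, used)); // [OnClick]
    }
}
```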

The functional_interfaces_used set populated by AbstractCast only covers
interfaces reached through an implicit SAM conversion typed from source.
That misses two cases: an explicit `cast` of a closure to a functional
interface bypasses AbstractCast entirely, and an `--hxb-lib` build never
re-runs typing so the set is empty — closures then fail to implement the
interface and hard-cast at runtime (Issue9576, Issue11236,
IssueFunctionalInterfaceOverload under the hxb-lib CI configs).

Additionally scan the AST of non-extern types in genjvm's Preprocessor:
field/static/constructor signatures and their cf_overloads, plus the
etype of every sub-expression (which catches the TFun type of a called
extern method whose parameter is a SAM interface). This is independent
of how the module was loaded. Restricting the scan to non-extern types
keeps incidental SAM interfaces from a --java-lib classpath out of the
set unless user code actually references them.
@kLabz kLabz changed the base branch from jvm/dex-compatible to development May 15, 2026 20:17
@kLabz kLabz requested a review from Simn May 15, 2026 20:17
@Simn
Member

Simn commented May 16, 2026

I understand what we're trying to do here, but it seems a little messy because there's no clear authority on what is and isn't considered a SAM interface. The java loader initially determines it, then the typer basically verifies it via check_functional_interface (which just silently does nothing in the fail-case) and finally the generator checks it again in its own check_functional_interface function. In all three places we dance around this object member special case, which suggests to me that something should be centralized here.

I'm aware that this PR doesn't really introduce this mess and merely amplifies it, but I would still like to look into a cleaner approach here. Having functional_interface_lut on the common context already felt like somewhat of a hack, though it might be a necessary one because we have to store this information somewhere. But adding functional_interfaces_used and relying on the abstract cast filter to populate that? Ugh...

kLabz added 2 commits May 16, 2026 09:11
…_field

The JLS §9.8 SAM rule (exclude default methods and Object members) was
duplicated between tOther and genjvm; the latter now just calls the
shared helper.

AbstractCast was writing into a Hashtbl on the common context to track
which SAM interfaces a program converts a function to, with @:native
re-keying to bridge the typing↔generation path rewrite. The genjvm
preprocessor already walks the AST and collects every SAM-typed TInst
it sees, which strictly subsumes the AbstractCast set (it also catches
explicit casts and --hxb-lib builds). Remove the redundant bookkeeping
and the field on common.context / Gctx.t.
@kLabz
Contributor Author

kLabz commented May 16, 2026

Claude's answer before implementing

Looking at the diff, the mess concentrates in two axes — what counts as a SAM and which SAMs the program uses. Both can be tightened significantly.

What's actually duplicated

SAM-ness rule appears three times, each with its own copy of the JLS §9.8 Object-member exclusion:

  • javaModern.ml — structural detection at load (is_object_redeclaration + abstract/static/etc. filter)
  • tOther.ml::get_singular_interface_field — used by overload rating
  • genjvm.ml::Preprocessor.check_functional_interface — re-verifies and populates the LUT

Usage tracking appears twice:

  • abstractCast.ml writes functional_interfaces_used keyed by @:native path
  • genjvm.ml re-walks the whole AST to compute essentially the same set (by class identity)

Proposed cleanup

1. One module owns "is this a SAM, and what's its method?"

Add src/core/functionalInterface.ml (or fold into tOther) exposing exactly:

val is_object_member : tclass_field -> bool        (* JLS §9.8 *)
val sam_method : tclass -> tclass_field option     (* None unless CFunctionalInterface + exactly one non-default, non-Object-member MethNormal *)

Then:

  • get_singular_interface_field becomes a thin caller (or is deleted in favor of sam_method).
  • check_functional_interface in genjvm becomes match sam_method c with Some cf -> register ... | None -> (). No private is_object_member copy, no re-implementing the count.
  • javaModern calls the same helper at the OCaml level too — except it's working on jf records, not tclass_field. The fix there is smaller: keep the structural detector local, but lift is_object_redeclaration + the JDK-internal-package filter behind named helpers in the same module and document them as "the load-time mirror of FunctionalInterface.sam_method."

2. Drop functional_interfaces_used entirely.

The AST scan added in 605467b is a strict superset of what AbstractCast records — an implicit SAM conversion produces a node whose etype is the SAM TInst, so note_fi_in_type already sees it. That makes the abstractCast bookkeeping, the field on common.context, the field on Gctx.t, and the special clone-sharing comment all dead weight. Remove:

  • Hashtbl.replace ctx.com.functional_interfaces_used ... in abstractCast.ml (and the @:native re-keying it had to do)
  • functional_interfaces_used on common.context and Gctx.t
  • the shared-with-parent clone case

Then genjvm's preprocessor has a single authority: walk the AST, collect TInst whose class has CFunctionalInterface, call sam_method to register them. One pass, one rule, one place.

3. Bonus: the overload-resolution helper already lives in the right shape.

try_functional_interface_match in overloadResolution.ml and the new TFun → TInst rate case in overloads.ml are both doing "given a SAM class and a function type, unify against the SAM method." That's sam_method + apply_params + unify — extract it as FunctionalInterface.unify_as_sam : tclass -> tparams -> t -> bool and call it from both sites.
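
To make the shape of the proposed helpers concrete, here is a Java analogue built on reflection. `SamHelperSketch`, `samMethod` and `matchesAsSam` are hypothetical stand-ins for the proposed sam_method and unify_as_sam; the exact-type comparison below is far cruder than the compiler's unification and is only meant to show one helper answering both questions.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Optional;

final class SamHelperSketch {
    // Object members re-declared on an interface, matched by name + arity.
    static boolean isObjectMember(Method m) {
        int arity = m.getParameterCount();
        String name = m.getName();
        return (name.equals("equals") && arity == 1)
            || (name.equals("hashCode") && arity == 0)
            || (name.equals("toString") && arity == 0);
    }

    // The single abstract method, or empty when there is none or more than one.
    static Optional<Method> samMethod(Class<?> c) {
        if (!c.isInterface()) {
            return Optional.empty();
        }
        Method found = null;
        for (Method m : c.getMethods()) {
            if (!Modifier.isAbstract(m.getModifiers()) || isObjectMember(m)) continue;
            if (found != null) {
                return Optional.empty(); // second abstract method: stop early
            }
            found = m;
        }
        return Optional.ofNullable(found);
    }

    // Does a function of the given shape match the SAM method exactly?
    static boolean matchesAsSam(Class<?> c, Class<?> returnType, Class<?>... paramTypes) {
        return samMethod(c)
            .filter(m -> m.getReturnType() == returnType
                && Arrays.equals(m.getParameterTypes(), paramTypes))
            .isPresent();
    }

    public static void main(String[] args) {
        System.out.println(samMethod(Runnable.class).map(Method::getName).orElse("none")); // run
        System.out.println(matchesAsSam(Comparator.class, int.class, Object.class, Object.class)); // true
        System.out.println(matchesAsSam(Runnable.class, int.class)); // false
    }
}
```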

Net effect

After this, the answer to "what is a SAM?" is one function. The answer to "is this SAM used?" is one AST walk in one place. Java loader tags the flag, typer/overload/generator all query through the same helper, and common.context loses a field instead of gaining one.


What changed:

  • genjvm.ml::check_functional_interface now calls TClass.get_singular_interface_field instead of carrying its own copy of the loop + Object-member rule. JLS §9.8 lives in exactly one place (tOther.ml).
  • functional_interfaces_used is gone from common.context, Gctx.t, the to_gctx mapping, the create/clone constructors, and the abstractCast write site (including the @:native re-keying that only existed because of the typing↔generation path mismatch).
  • genjvm's preprocessor still has the AST scan (the actual source of truth) but no longer has the note_if_used_fi / Hashtbl.mem branch.

The reviewer's three concerns:

  1. No clear authority on what is/isn't a SAM: now one helper, called from all four sites (typeloadFields, overloadResolution, overloads, genjvm).
  2. functional_interface_lut felt like a hack — untouched; it's the cache for the SAM-method lookup. Could be argued either way but it's not amplified.
  3. functional_interfaces_used populated by abstractCast — deleted entirely.

The remaining structural-detection code in javaModern.ml is appropriately local: it works on jf records before tclass_fields exist, so it can't share the helper.

@Simn
Member

Simn commented May 16, 2026

Much better!

One thing I'm slightly worried about is List.memqing into fi_used_classes in genjvm.ml. This is a linear lookup and we do it for every class, so it's technically O(n^2). This might not matter in practice because the element count in fi_used_classes is going to be much lower, but I think it's still better to look this up via a Hashtbl with the cl_path key.

Avoid an O(n*m) lookup over the SAM-used list during the interface
preprocessing loop. The list is finalized into a Hashtbl after loop 1
completes (and therefore after check_path's cl_path rewrites), so
keying by cl_path is stable.
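
The same idea in a small Java sketch: collect the used paths first, freeze them into a hash-based set once, then test each class with a constant-time lookup instead of scanning the list. Names and paths (`UsedPathLookupSketch`, `demo.OnClick`, and so on) are illustrative only and do not come from genjvm.ml.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class UsedPathLookupSketch {
    // Finalize the collected list into a set once, after collection is complete.
    static Set<String> finalizeUsed(List<String> usedPaths) {
        return new HashSet<>(usedPaths);
    }

    // One hash lookup per class instead of one list scan per class.
    static int countRegistered(List<String> classPaths, Set<String> used) {
        int registered = 0;
        for (String path : classPaths) {
            if (used.contains(path)) {
                registered++;
            }
        }
        return registered;
    }

    public static void main(String[] args) {
        Set<String> used = finalizeUsed(Arrays.asList("demo.OnClick", "demo.OnDone"));
        List<String> all = Arrays.asList("demo.OnClick", "demo.Unused", "demo.OnDone", "demo.Other");
        System.out.println(countRegistered(all, used)); // 2
    }
}
```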
@kLabz
Copy link
Copy Markdown
Contributor Author

kLabz commented May 16, 2026

Claude seems to think the whole functional_interface_lut thing is (and was already) pretty much useless.. I tried to make it look at the whole thing instead of focusing only on this PR, but it insists. What do you think?

CI seems ok with removing it, at least (commit / CI run).

Should I cherry-pick that commit here?

@Simn
Copy link
Copy Markdown
Member

Simn commented May 16, 2026

but typeloadModule's @:functionalInterface path never populated it

I don't understand that part because check_functional_interface had ctx.com.functional_interface_lut#add c.cl_path (c,cf).

@kLabz
Copy link
Copy Markdown
Contributor Author

kLabz commented May 16, 2026

It got confused about #11549 somehow, but still insists the lut wasn't doing much:

The LUT memoized get_singular_interface_field per class, populated by
typeloadFields.check_functional_interface and re-checked lazily by
AbstractCast (#11549, for display-mode flows where init_class hadn't
run yet by the time a SAM conversion was queried). It was a global
mutable hashtbl on common.context whose only reader was AbstractCast.

The cached computation is a walk over a SAM interface's field list —
typically under 10 entries, and short-circuits on the second non-default
method, so the perf delta from recomputing is well below noise. Doing
that directly in AbstractCast collapses the two load paths into one and
removes the field from common.context.

Also drops the now-no-op typeloadFields.check_functional_interface
(its only effect was populating the LUT) and the bridge call that
re-invoked it from init_class when the flag was set via @:functionalInterface
meta in typeloadModule.
@kLabz
Copy link
Copy Markdown
Contributor Author

kLabz commented May 16, 2026

I'm getting it to benchmark the change with pessimistic scenarios, and it quickly hit the JVM method-size limit instead 😅

  ┌─────────────────────────────────┬────────────────────────────┬─────────────────────────┬──────────────┐
  │            benchmark            │ before (cbd81b1, with LUT) │ after (ee67399, no LUT) │    ratio     │
  ├─────────────────────────────────┼────────────────────────────┼─────────────────────────┼──────────────┤
  │ single SAM (10k → Runnable)     │ 1.596 ± 0.059 s            │ 1.635 ± 0.146 s         │ 1.02× ± 0.10 │
  ├─────────────────────────────────┼────────────────────────────┼─────────────────────────┼──────────────┤
  │ mixed SAMs (6 distinct classes) │ 1.957 ± 0.024 s            │ 1.987 ± 0.068 s         │ 1.02× ± 0.04 │
  └─────────────────────────────────┴────────────────────────────┴─────────────────────────┴──────────────┘

Both ~2% slower on after, but σ overlap on the single case is huge (±9% on the after run) and the mixed case 
is right at the noise floor. No meaningful regression — exactly what we'd predict from "swap a hash lookup for
a 5-element list walk." Cache wasn't load-bearing.

@Simn
Member

Simn commented May 16, 2026

Yes I think this is fine, the argument just didn't seem accurate.

One last cosmetic change I'd like to make is to have that note_fi stuff in genjvm become its own function (in the Preprocessor module) returning the list instead of being awkwardly added to the preprocess function.

Move the SAM-usage AST scan out of preprocess and into its own
top-level function in the Preprocessor module that returns the
path-keyed hashtbl. The scan now runs after the check_path loop so
cl_paths are stable when we record them.
@kLabz
Copy link
Copy Markdown
Contributor Author

kLabz commented May 16, 2026

Something like that?
kLabz@00d9484

@Simn
Copy link
Copy Markdown
Member

Simn commented May 16, 2026

The standalone diff looks a bit strange but I think that's right.

@kLabz
Contributor Author

kLabz commented May 16, 2026

Looks better in the PR diff indeed :)

@Simn Simn merged commit dce7adf into development May 16, 2026
58 checks passed
@kLabz kLabz deleted the jvm/SAM branch May 16, 2026 14:33
@kLabz kLabz mentioned this pull request May 16, 2026