Merge pull request #1428 from SethTisue/signature-polymorphic-methods
SethTisue committed Jul 17, 2023
2 parents ea17f85 + 260445d commit 3f0b660
Showing 1 changed file with 235 additions and 0 deletions.
235 changes: 235 additions & 0 deletions _posts/2023-07-17-signature-polymorphic-methods.md
@@ -0,0 +1,235 @@
---
category: blog-detail
post-type: blog
by: Seth Tisue, Lightbend
title: "Signature polymorphic methods in Scala"
---

Java 7 introduced a curious and little-known feature to the Java
Virtual Machine: "signature polymorphic" methods. These methods have
strangely malleable types.

This blog post explains the feature and why it exists. We also delve
into how it is specified and implemented in both Scala 2 and Scala 3.

The Scala 3 implementation is new, and that's the occasion for this
blog post. Thanks to this recent work, **Scala 3 users can now access
the entire Java reflection API**, as of Scala 3.3.0.

## Should I keep reading?

Signature polymorphism is admittedly an obscure feature. When you need
it you need it, but the need doesn't arise in ordinary Scala
code. Thus, the remaining material may be of interest primarily to JVM
aficionados, Scala and Java language mavens, and compiler hackers.

## When is signature polymorphism needed?

Compiler support is needed when you use some portions of the Java
reflection API, namely `MethodHandle` (since Java 7) and `VarHandle`
(since Java 9).

`MethodHandle` provides reflective access to methods on JVM classes,
regardless of whether the methods were defined in Java, Scala, or some
other JVM language. `VarHandle` does the same, but for fields.

The polymorphism of these methods makes them more efficient, by
avoiding boxing overhead when primitive values are passed, returned,
or stored.
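As a small taste of `VarHandle`, here is a minimal sketch that reads and writes an element of an `Array[Int]`. It assumes Java 9 or later and a Scala version with the signature polymorphism support described in this post; the names `vh`, `xs`, and `x` are ours, chosen for illustration.

```scala
import java.lang.invoke.MethodHandles

// A VarHandle for the elements of an Array[Int] (an int[] on the JVM).
val vh = MethodHandles.arrayElementVarHandle(classOf[Array[Int]])
val xs = Array(1, 2, 3)

// Both calls are signature polymorphic: the call-site types are taken
// from the actual arguments, so the index and value travel as primitive Ints.
vh.set(xs, 1, 42)

// The type ascription tells the compiler which result type to use.
val x = vh.get(xs, 1): Int
```

After this runs, `x` is `42` and `xs` contains `1, 42, 3`.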

## Is signature polymorphism supported in Scala?

Yes: since Scala 2.11.5, and more fully since Scala 2.12.16. Scala 3
now has the support too, as of Scala 3.3.0.

The initial Scala 2 implementation was done by [Jason Zaugg] in 2014
and refined later by [Lukas Rytz]. The latest version, with all fixes,
landed in Scala 2.12.16 (released June 2022).

Recently, [Dale Wijnand] ported the feature to Scala 3, with the
assistance of [Guillaume Martres] and me, [Seth Tisue].

Jason, Lukas, Dale, and I are members of the Scala compiler team
at [Lightbend]. We maintain Scala 2 and also contribute to Scala 3.
Guillaume has worked on the Scala 3 compiler for some years, previously
at [LAMP] and now at the [Scala Center].

[Jason Zaugg]: https://github.com/retronym
[Lukas Rytz]: https://github.com/lrytz
[Dale Wijnand]: https://github.com/dwijnand
[Seth Tisue]: https://github.com/SethTisue
[Guillaume Martres]: https://github.com/smarter
[Lightbend]: https://lightbend.com
[LAMP]: https://www.epfl.ch/labs/lamp/
[Scala Center]: https://scala.epfl.ch

## What signature polymorphic methods exist?

You may already have run into this feature if you have used the
`MethodHandle` and `VarHandle` classes from the Java reflection API in
the Java standard library.

In fact, `MethodHandle` and `VarHandle` are the _only_ places you
could possibly have run into this feature!

That's because users are not allowed to define their own signature
polymorphic methods. Only the Java standard library can do that, and
so far, the creators of Java have only used the feature in these two
classes.

## What does "signature polymorphism" mean, exactly?

There is a formal description in [JLS 15.12.3], but a more readable
version is in the [Javadoc for
`MethodHandle`](https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/lang/invoke/MethodHandle.html).
It says:

> A signature polymorphic method is one which can operate with any of
> a wide range of call signatures and return types.

and:

> In source code, a call to a signature polymorphic method will
> compile, regardless of the requested symbolic type descriptor. As
> usual, the Java compiler emits an invokevirtual instruction with the
> given symbolic type descriptor against the named method. The unusual
> part is that the symbolic type descriptor is derived from the actual
> argument and return types, not from the method declaration.

Note that generics are not sufficient to express this level of
flexibility, for two reasons:

First, Java generics only work on reference types, not primitive
types. Scala does not have this limitation, but pays for it by
incurring boxing at run-time when primitive types are used in generic
contexts.
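The boxing that Scala incurs in generic code is easy to observe. In this sketch, `runtimeClassName` is a hypothetical helper written only for this illustration; it reports the runtime class of whatever was passed through its type parameter.

```scala
// Hypothetical helper, for illustration only: reports the runtime class
// of a value that was passed through a generic (erased) parameter.
def runtimeClassName[A](a: A): String =
  a.asInstanceOf[AnyRef].getClass.getName

// The Int is boxed on the way in, so the method sees a java.lang.Integer.
val boxedName = runtimeClassName(42)
```

Here `boxedName` is `"java.lang.Integer"`, not `int`: the primitive was boxed at the call.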

Second, methods (in both languages) may only have a fixed number of
type parameters, but we need one varying type for each parameter
of the method we want to call reflectively.

The following example should help make all of this clearer.

[JLS 15.12.3]: https://docs.oracle.com/javase/specs/jls/se17/html/jls-15.html#jls-15.12.3

## How do I call a signature polymorphic method from Scala?

Take `MethodHandle` for example. It provides an `invokeExact`
method. Its signature as seen from Scala is:

    def invokeExact(args: AnyRef*): AnyRef

Signature polymorphism means that the `AnyRef`s here are just
placeholders for types to be supplied later.

To see this work in practice, let's adapt an example from
the Javadoc. From Scala, we'll make a reflective call to the `replace`
method on a `String`:

    import java.lang.invoke._
    val mt = MethodType.methodType(
      classOf[String], classOf[Char], classOf[Char])
    val mh = MethodHandles.lookup.findVirtual(
      classOf[String], "replace", mt)
    val s = mh.invokeExact("daddy", 'd', 'n'): String

If we paste this into the Scala REPL (2 or 3), we see:

    val s: String = nanny

Signature polymorphism helped us here in two ways:

* The arguments `'d'` and `'n'` will not be passed as `Object` or boxed to
`java.lang.Character` at runtime, but will be passed directly as
primitive `Char`s.
* The result comes back as a `String` without needing to be checked
or cast at runtime.
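To see how much the ascribed type matters, here is a sketch that varies the same call. It rebuilds the handle so it can be pasted on its own; the names `mismatch`, `adapted`, and `boxed` are ours. It relies on the documented behavior of `invokeExact`, `invoke`, and `invokeWithArguments` in the Java standard library.

```scala
import java.lang.invoke._
import scala.util.Try

val mt = MethodType.methodType(
  classOf[String], classOf[Char], classOf[Char])
val mh = MethodHandles.lookup().findVirtual(
  classOf[String], "replace", mt)

// Ascribing AnyRef makes the call-site return type Object. That no longer
// matches the handle's type exactly, so invokeExact throws
// WrongMethodTypeException at run time.
val mismatch = Try(mh.invokeExact("daddy", 'd', 'n'): AnyRef)

// invoke is signature polymorphic too, but it adapts the call-site type
// to the handle's type when a conversion exists.
val adapted = mh.invoke("daddy", 'd', 'n'): AnyRef

// invokeWithArguments is an ordinary varargs method: the arguments are
// passed as boxed objects and the result has to be cast.
val boxed = mh.invokeWithArguments(
  "daddy", Char.box('d'), Char.box('n')).asInstanceOf[String]
```

Only the first call fails; `adapted` and `boxed` both end up as `"nanny"`, but neither gets the fully unboxed, cast-free treatment that the exact call does.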

## Are these methods good for anything else?

Great question!

Doesn't it seem puzzling that the designers of Java would go to so
much trouble to make Java reflection faster? If I care so much about
performance, shouldn't I avoid using reflection entirely?

The real reason these methods need to be fast is to aid efficient
implementation of dynamic languages on the JVM. `MethodHandle` was
added to the JVM at the same time as `invokedynamic`, as part of
[JSR-292], which aimed to support efficient implementation of JRuby
and other alternative JVM languages. (`invokedynamic` is additionally
used for implementing lambdas, in both Java and Scala; see [this
writeup on Stack Overflow].)

[JSR-292]: https://www.infoq.com/articles/invokedynamic/
[this writeup on Stack Overflow]: https://stackoverflow.com/questions/30002380/why-are-java-8-lambdas-invoked-using-invokedynamic

## How is this implemented in Scala 2?

Jason Zaugg describes his initial JDK 7 implementation in [PR 4139]
and shows how the resulting bytecode looks.

See also these well-documented followups: [PR 5594] for JDK 9,
[PR 9530] for JDK 11, and [PR 9930] for JDK 17.

[PR 4139]: https://github.com/scala/scala/pull/4139
[PR 5594]: https://github.com/scala/scala/pull/5594
[PR 9530]: https://github.com/scala/scala/pull/9530
[PR 9930]: https://github.com/scala/scala/pull/9930

## What's different in the Scala 3 version?

We had to work harder in Scala 3 because it wasn't enough to have an
in-memory representation for signature polymorphic call sites. The
call sites must also have a representation in TASTy, so we had to add
a new TASTy node type. (Scala 2 pickles only represent method
signatures; in contrast, TASTy represents method bodies too.)

To represent a signature polymorphic call site internally, we
synthesize a method type based on the types at the call site. One can
imagine the original signature-polymorphic method as being infinitely
overloaded, with each individual overload only being brought into
existence as needed.
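One way to picture the "overload" that comes into existence for our earlier `replace` call is to build the same method type by hand with the Java `MethodType` API. This is only an illustration at the JVM level, not the compiler's internal representation; the names `callSiteType`, `descriptor`, and `matches` are ours.

```scala
import java.lang.invoke._

val mt = MethodType.methodType(
  classOf[String], classOf[Char], classOf[Char])
val mh = MethodHandles.lookup().findVirtual(
  classOf[String], "replace", mt)

// The method type implied by the call site
//   mh.invokeExact("daddy", 'd', 'n'): String
// that is, the "overload" taking (String, Char, Char) and returning String.
val callSiteType = MethodType.methodType(
  classOf[String],                                // result type
  classOf[String], classOf[Char], classOf[Char])  // argument types

// The same type, written as a JVM method descriptor.
val descriptor = callSiteType.toMethodDescriptorString

// invokeExact succeeds for this call because the call-site type equals
// the handle's own type (its leading String parameter is the receiver
// of replace).
val matches = mh.`type`() == callSiteType
```

Here `descriptor` is `(Ljava/lang/String;CC)Ljava/lang/String;` and `matches` is `true`. A call site with different argument or result types would imply a different method type, which is the sense in which the method is "infinitely overloaded".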

For details, see [the pull
request](https://github.com/lampepfl/dotty/pull/16225).

### The path not taken

Along the way we explored an alternative approach, suggested by Jason,
which involved rewriting each call site to include a cast to a
structural type containing an appropriately typed method.

In that version, the `replace` call-site in the example above was
rewritten from:

    mh.invokeExact("daddy", 'd', 'n'): String

to:

    mh.asInstanceOf[
      MethodHandle {
        def invokeExact(a0: String, a1: Char, a2: Char): String
      }
    ].invokeExact("daddy", 'd', 'n')

(The actual rewrite was applied to in-memory ASTs, rather than to
source code.)

The transformed code could be written and read as TASTy without
trouble. Later in compilation, we detected which call sites were the
product of this transform, dropped the cast, and emitted the correct
bytecode.

In the end, we didn't go with this approach. As Sébastien Doeraene
pointed out, although this approach avoided adding a new TASTy tag, it
also gave new semantics to existing tags that older compilers wouldn't
understand. Therefore the work still couldn't ship until the next
minor version of the compiler. Besides, avoiding the new tag
complicated the implementation.

## Questions? Discussion?

These are welcome on the Scala Contributors forum thread at:

* (TODO Discourse link, with link back to this post)
