Add support for more types #51

mcimadamore · 2024-04-15T09:24:14Z

This PR adds support for type-variables and wildcard type arguments in the code model JavaType's hierarchy.

This allows the code model to reflect the source types much more accurately, as we no longer need to erase the source type at the first sign of a non-denotable type. Instead, we can use a modified (see below) version of the Types::upwards function (type projection) to compute the closest denotable upper bound to the type found in the source code. This means that the type associated with every op in the model is a (denotable) supertype of the type in the javac AST. The fact that such a type is denotable has three important consequences:

the type can be expressed in the source code (in case the code model needs to be lifted back into Java source)
the type must be expressible in the syntax of bytecode signature attributes (this is important e.g. for the local variable type attribute)
the type can be resolved to its runtime counterpart in j.l.r.Type (not implemented in this PR), as explained below

Some parser changes were required to support this, so that we can serialize and deserialize the new types accordingly.

A new method has been added to JavaType, namely JavaType::erasure, which computes the erasure of a JavaType. This might come in handy when lowering the model into bytecode. Since supporting erasure is crucial, the modelling of types has been chosen so as to facilitate that operation as much as possible. That is why, for example, TypeVariableRef contains the "principal" type-variable bound: it lets us define erasure for type-variables in a straightforward fashion, as the erasure of that bound.
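To make the erasure rule concrete, here is a minimal sketch. The nested types below (SketchType, ClassType, TypeVarRef) are hypothetical stand-ins written for illustration, not the JavaType API added by this PR; the only thing taken from the description above is that a type-variable carries its principal bound and erases to the erasure of that bound.

```java
import java.util.List;

class ErasureSketch {
    // Hypothetical, simplified stand-in for the code model type hierarchy.
    interface SketchType {
        SketchType erasure();
    }

    // A (possibly parameterized) class type, e.g. java.util.List<T>.
    record ClassType(String name, List<SketchType> typeArguments) implements SketchType {
        @Override
        public SketchType erasure() {
            // Erasing a class type drops its type arguments.
            return new ClassType(name, List.of());
        }
    }

    // A type-variable reference that carries its principal bound.
    record TypeVarRef(String name, SketchType bound) implements SketchType {
        @Override
        public SketchType erasure() {
            // The erasure of a type-variable is the erasure of its bound.
            return bound.erasure();
        }
    }

    public static void main(String[] args) {
        SketchType number = new ClassType("java.lang.Number", List.of());
        SketchType t = new TypeVarRef("T", number);
        SketchType listOfT = new ClassType("java.util.List", List.of(t));

        System.out.println(t.erasure());       // the erased bound (java.lang.Number)
        System.out.println(listOfT.erasure()); // java.util.List without type arguments
    }
}
```

Storing the bound directly in the type-variable reference means erasure needs no lookup of the declaring class or method, which is presumably why this modelling was preferred.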

Denotable projections

The code model type associated with an op result is computed by applying a modified version of Types::upwards, that is, the function that implements type projections as specified in JLS 4.10.5. The original projection algorithm is designed to leave intersection types in place. While this is handy, as it maximizes the applicability of the type inferred for local variables declared with var, it is not suitable for the code model, where we'd like to end up with a denotable type (jshell has a similar problem, which was addressed in a more ad-hoc way).

It is generally possible to project an intersection type using only one of its bounds, e.g.

List<A & B>

is projected to:

List<? extends A>

There are, however, problems when projecting intersection types that are on the right of some lower-bounded wildcard - e.g.

List<? super A & B>

In this case, projecting to List<? super A> is not valid, as List<? super A> is not a supertype of List<? super A & B>: for instance, List<B> is a subtype of the latter but, in general, not of the former. For this reason, in these cases we have to fall back to an unbounded wildcard, List<?>.
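The two cases above can be sketched over a toy type model. This is only an illustration of the rule as described here, not the modified Types::upwards from the PR (which operates on javac's internal types and handles many more cases); every name in the block is hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;

class ProjectionSketch {
    // Toy type model; unrelated to javac's internal Type classes.
    interface Type {}
    enum Kind { EXTENDS, SUPER, UNBOUNDED }

    record ClassType(String name, List<Type> args) implements Type {
        @Override public String toString() {
            return args.isEmpty() ? name
                    : name + args.stream().map(Object::toString).collect(Collectors.joining(", ", "<", ">"));
        }
    }

    record Intersection(List<Type> bounds) implements Type {
        @Override public String toString() {
            return bounds.stream().map(Object::toString).collect(Collectors.joining(" & "));
        }
    }

    record Wildcard(Kind kind, Type bound) implements Type {
        @Override public String toString() {
            return switch (kind) {
                case UNBOUNDED -> "?";
                case EXTENDS -> "? extends " + bound;
                case SUPER -> "? super " + bound;
            };
        }
    }

    // Projects a single type argument to something denotable.
    static Type projectArgument(Type arg) {
        if (arg instanceof Intersection i) {
            // List<A & B> becomes List<? extends A>: one bound is enough.
            return new Wildcard(Kind.EXTENDS, i.bounds().get(0));
        }
        if (arg instanceof Wildcard w && w.kind() == Kind.SUPER && w.bound() instanceof Intersection) {
            // List<? super A & B> becomes List<?>: picking one bound would be unsound.
            return new Wildcard(Kind.UNBOUNDED, null);
        }
        return arg; // already denotable in this toy model
    }

    static ClassType project(ClassType t) {
        return new ClassType(t.name(), t.args().stream().map(ProjectionSketch::projectArgument).toList());
    }

    public static void main(String[] args) {
        Type a = new ClassType("A", List.of());
        Type b = new ClassType("B", List.of());
        Type ab = new Intersection(List.of(a, b));
        ClassType list1 = new ClassType("List", List.of(ab));
        ClassType list2 = new ClassType("List", List.of(new Wildcard(Kind.SUPER, ab)));
        System.out.println(list1 + " -> " + project(list1)); // List<A & B> -> List<? extends A>
        System.out.println(list2 + " -> " + project(list2)); // List<? super A & B> -> List<?>
    }
}
```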

Runtime resolution

Support for runtime resolution of elements in the JavaType hierarchy is possible, as there is a subtype of j.l.r.Type for each of the subtypes in JavaType. The main problem is being able to resolve type-variables: in the current modelling, type-variable types only have a name, and names can be ambiguous. That is, a type-variable with the same name could be defined at different levels in the source code:

class Foo<X> { //1
    <X> void test() { ... } // 2
}

To allow for better disambiguation we need to add ownership information to the TypeVariableRef class. This could point either to another JavaType (if the type-variable is a class type-variable), or to a MethodRef (if the type-variable is defined in a method). In this PR I didn't want to tackle the problem of modelling this additional information (that will come in a follow-up PR). Once the proper ownership info is in place, we might add code to enable runtime resolution of JavaTypes.
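The ambiguity is easy to observe with the existing java.lang.reflect API, which already attaches an owner to every type-variable through TypeVariable::getGenericDeclaration. The small demo below uses only standard reflection calls; the class and helper names are made up for the example.

```java
import java.lang.reflect.TypeVariable;

class TypeVarOwnerDemo {
    static class Foo<X> {            // 1: class type-variable X
        <X> void test() { }          // 2: method type-variable X, same name
    }

    static TypeVariable<?> classTypeVar() {
        return Foo.class.getTypeParameters()[0];
    }

    static TypeVariable<?> methodTypeVar() {
        try {
            return Foo.class.getDeclaredMethod("test").getTypeParameters()[0];
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        TypeVariable<?> classX = classTypeVar();
        TypeVariable<?> methodX = methodTypeVar();

        // Same name, but distinct type-variables with distinct owners.
        System.out.println(classX.getName().equals(methodX.getName())); // true
        System.out.println(classX.equals(methodX));                     // false
        System.out.println(classX.getGenericDeclaration());             // the class Foo
        System.out.println(methodX.getGenericDeclaration());            // the method test()
    }
}
```

A name alone cannot tell these two apart, which is the information a TypeVariableRef would need in order to be resolved at runtime.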

Update

After some consideration, I have also added support for ownership info in type-variables. A type variable reference can now have a parent method or class (the source element which declared the type-variable). In the former case, a MethodRef is used, in the latter case a JavaType is used. The string representation for type-variables is a tad convoluted. For class type-variables:

#Foo::X

While for method type-variables:

#Foo::bar()Baz::Z

The parser has been adjusted accordingly.
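As a rough illustration of the two string forms, the sketch below formats and re-parses them by splitting on "::". The TypeVarName record and its methods are hypothetical and much simpler than the real parser and type factory; the method part (e.g. bar()Baz) is kept as an opaque string here.

```java
class TypeVarNameSketch {
    // Hypothetical holder: ownerMethod is null for class type-variables.
    record TypeVarName(String ownerClass, String ownerMethod, String name) {
        boolean isMethodTypeVar() {
            return ownerMethod != null;
        }

        String format() {
            return isMethodTypeVar()
                    ? "#" + ownerClass + "::" + ownerMethod + "::" + name
                    : "#" + ownerClass + "::" + name;
        }
    }

    static TypeVarName parse(String s) {
        if (!s.startsWith("#")) {
            throw new IllegalArgumentException("Not a type-variable: " + s);
        }
        String[] parts = s.substring(1).split("::");
        return switch (parts.length) {
            case 2 -> new TypeVarName(parts[0], null, parts[1]);     // #Foo::X
            case 3 -> new TypeVarName(parts[0], parts[1], parts[2]); // #Foo::bar()Baz::Z
            default -> throw new IllegalArgumentException("Bad type-variable: " + s);
        };
    }

    public static void main(String[] args) {
        System.out.println(parse("#Foo::X"));
        System.out.println(parse("#Foo::bar()Baz::Z"));
    }
}
```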

Progress

Change must not contain extraneous whitespace

Reviewers

Paul Sandoz (@PaulSandoz - Reviewer) ⚠️ Review applies to 8ad6110b

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/babylon.git pull/51/head:pull/51
$ git checkout pull/51

Update a local copy of the PR:
$ git checkout pull/51
$ git pull https://git.openjdk.org/babylon.git pull/51/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 51

View PR using the GUI difftool:
$ git pr show -t 51

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/babylon/pull/51.diff

Webrev

Link to Webrev Comment

(some tests need to be adjusted)

bridgekeeper · 2024-04-15T09:24:49Z

👋 Welcome back mcimadamore! A progress list of the required criteria for merging this PR into code-reflection will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

openjdk · 2024-04-15T09:24:55Z

@mcimadamore This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

Add support for more types

Reviewed-by: psandoz

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 11 new commits pushed to the code-reflection branch:

9f35792: 8324789: Add line number information to code models
628a931: Preserve order of captured value
c1a8964: Determine if lambda operation originates from a method reference
4556061: Temporarily disable TestSmallCorpus tests
ebf319d: Refine support for captured values.
55416a0: BytecodeGenerator cleanup and and types handling fixes
8408a32: Missing conversion for some unary operators
d02e7c6: InnerClassLambdaMetafactory fix of hidden classes handling
0f1e4e1: Issues with captures in quotable lambdas
a0e0953: Implement shift ops
... and 1 more: https://git.openjdk.org/babylon/compare/ff2074f84ab339eee340eeab435a5ae1afc1a7c4...code-reflection

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the code-reflection branch, type /integrate in a new comment.

openjdk · 2024-04-15T09:25:19Z

@mcimadamore this pull request can not be integrated into code-reflection due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout projections
git fetch https://git.openjdk.org/babylon.git code-reflection
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge code-reflection"
git push

mcimadamore · 2024-04-15T09:28:15Z

src/jdk.compiler/share/classes/com/sun/tools/javac/comp/ReflectMethods.java

-            Type quotedReturnType = new ClassType(null,
-                    com.sun.tools.javac.util.List.of(quotedOpType), syms.quotedType.tsym);
-            MethodType mtype = new MethodType(nil, quotedReturnType, nil, syms.methodClass);
+            MethodType mtype = new MethodType(nil, syms.quotedType, nil, syms.methodClass);


This code seemed to try to parameterize the Quoted type, which is no longer a generic type. This was causing a crash in the logic for computing the set of captured variables of a given type (types::captures).

This change is what caused the fixes in the two reflect/code tests, as the tests were also expecting a parameterized Quoted type.

mcimadamore · 2024-04-15T09:29:04Z

src/java.base/share/classes/java/lang/reflect/code/type/TypeVarRef.java

+ */
+public final class TypeVarRef implements JavaType {
+
+    // @@@: how do we encode tvar owner?


As the comment indicates, ideally a type-variable reference should also point to its owner (a type or a method). I'm not 100% sure how to encode that in the TypeElement structure (see also the toplevel PR summary).

This is now handled as part of 52fc6e9

Tweak type projection to eliminate intersections/unions

Tweak tests Add erasure method

mlbridge · 2024-04-23T13:12:50Z

Webrevs

mcimadamore · 2024-04-23T13:13:19Z

src/jdk.compiler/share/classes/com/sun/tools/javac/comp/ReflectMethods.java

@@ -2236,22 +2240,15 @@ FieldRef symbolToFieldRef(Symbol s, Type site) {
            // @@@ Made Gen::binaryQualifier public, duplicate logic?
            // Ensure correct qualifying class is used in the reference, see JLS 13.1
            // https://docs.oracle.com/javase/specs/jls/se20/html/jls-13.html#jls-13.1
-            return symbolToFieldRef(gen.binaryQualifier(s, types.erasure(site)));
+            return symbolToErasedFieldRef(gen.binaryQualifier(s, types.erasure(site)));


I realized there was an issue here: the field reference was not using erased types, so it was incompatible with the binary qualifier used in codegen.

mcimadamore · 2024-04-25T10:23:12Z

src/java.base/share/classes/java/lang/reflect/code/parser/impl/DescParser.java

        while (l.acceptIf(Tokens.TokenKind.DOT)) {
            identifier.append(Tokens.TokenKind.DOT.name);
            t = l.accept(Tokens.TokenKind.IDENTIFIER);
            identifier.append(t.name());
        }

+        if (l.token().kind == TokenKind.COLCOL && isTypeVar) {


If we see #Foo::Bar we might be seeing two things:

a type-variable Bar in class Foo

a method type-variable in some method Foo.Bar

So, we need to disambiguate based on what follows. E.g. if after Bar we see (, then we know we're in the method case.
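A character-level sketch of that lookahead is shown below. It is not the DescParser code (which works on tokens); the classify helper and the Kind enum are hypothetical, and the input is assumed to be a well-formed type-variable identifier.

```java
class TypeVarLookaheadSketch {
    enum Kind { CLASS_TYPE_VAR, METHOD_TYPE_VAR }

    // Decides what "#Owner::name..." denotes by peeking at what follows the name.
    static Kind classify(String s) {
        int sep = s.indexOf("::");
        if (!s.startsWith("#") || sep < 0) {
            throw new IllegalArgumentException("Not a type-variable: " + s);
        }
        // Skip the identifier that follows the first "::".
        int i = sep + 2;
        while (i < s.length() && Character.isJavaIdentifierPart(s.charAt(i))) {
            i++;
        }
        // An opening parenthesis means the identifier was a method name.
        return (i < s.length() && s.charAt(i) == '(')
                ? Kind.METHOD_TYPE_VAR
                : Kind.CLASS_TYPE_VAR;
    }

    public static void main(String[] args) {
        System.out.println(classify("#Foo::X"));           // CLASS_TYPE_VAR
        System.out.println(classify("#Foo::bar()Baz::Z")); // METHOD_TYPE_VAR
    }
}
```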

mcimadamore · 2024-04-25T10:23:51Z

src/java.base/share/classes/java/lang/reflect/code/parser/impl/DescParser.java

+            t = l.accept(TokenKind.IDENTIFIER); // type-var or method name
+            identifier.append(t.name());
+            if (l.token().kind == TokenKind.LPAREN) {
+                FunctionType functionType = parseMethodType(l);


Note that here we parse, then we throw away, as the type definition only wants a string-based identifier, so we'll need to reparse the identifier string again in the type factory.

mcimadamore · 2024-04-25T10:24:43Z

src/java.base/share/classes/java/lang/reflect/code/type/CoreTypeFactory.java

+                if (typeArguments.size() != 1) {
+                    throw new IllegalArgumentException("Bad type-variable bounds: " + tree);
+                }
+                String[] parts = identifier.split("::");


And this is the duplicate parsing logic (although here we already know if it's a method or a class type-variable based on the number of ::)

I am wondering if instead we can check for #, and the parser's job is to dumbly accumulate all valid characters (selected tokens and identifiers) up to but not including the < token. We could even check if there is a quoted string for the type identifier.

Note the special code for arrays in the parser was added only to avoid updating many tests.

I can take a look

Uploaded a new iteration with this simplification (which looks much nicer than what I had):

8ad6110

Note that if we wanted a truly general "quoting" mechanism we'd need both a prefix and a suffix token. Otherwise one can only use quotes if there's some nested type-definition with <>. Your idea of using just strings (e.g. surrounded with ") seems a powerful one (and more robust in the long run), because it would make the desc parsing logic a lot less opinionated (e.g. we wouldn't even need to special case qualified identifiers).

That's much simpler. We can iterate further afterwards if need be. I believe you can now replace identifier.contains("::") with identifier.startsWith("#")?

mcimadamore · 2024-04-25T10:57:45Z

From a modelling perspective, it would be cleaner to have a TypeArgument interface that is not a sub-interface of JavaType. Then ClassType, ArrayType, WildcardType and TypeVariableRef can implement that interface. This would allow us to state clearly in the API that the type arguments of a ClassType must be of type TypeArgument, and that WildcardType is not really a type.
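One way to picture that modelling is sketched below. All names are hypothetical (JavaTypeLike stands in for JavaType) and this is not what the PR implements; it only shows how a separate TypeArgument interface would let the API state that wildcards are type arguments but not types, and that primitives are types but not type arguments.

```java
import java.util.List;

class TypeArgumentSketch {
    interface JavaTypeLike {}   // hypothetical: things that are real types
    interface TypeArgument {}   // hypothetical: things allowed as type arguments

    record PrimitiveType(String name) implements JavaTypeLike {}
    record ClassType(String name, List<TypeArgument> typeArguments) implements JavaTypeLike, TypeArgument {}
    record ArrayType(JavaTypeLike component) implements JavaTypeLike, TypeArgument {}
    record TypeVariableRef(String name) implements JavaTypeLike, TypeArgument {}
    // A wildcard can appear as a type argument but is not itself a type.
    record WildcardType(boolean isExtends, JavaTypeLike bound) implements TypeArgument {}

    public static void main(String[] args) {
        ClassType number = new ClassType("java.lang.Number", List.of());
        // List<? extends Number>: accepted, since WildcardType is a TypeArgument.
        ClassType list = new ClassType("java.util.List", List.of(new WildcardType(true, number)));
        System.out.println(list);
        // new ClassType("java.util.List", List.of(new PrimitiveType("int")))
        // would be rejected by the compiler: PrimitiveType is not a TypeArgument.
    }
}
```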

But doing this in the current world is painful: all types have a uniform structure (identifier + list of type elements), which pushes us towards modelling wildcards using proper types (otherwise parsing becomes very convoluted).

To be honest, with the recent changes to DescParser to parse additional types (esp. type-variables), it seems to me that the distinction between "generic parsing" and "Java-specific parsing" has been lost somewhat (e.g. DescParser has special code which needs to be ready for the specific needs of Java types).

PaulSandoz · 2024-04-25T19:48:06Z

I really like the core principle of projecting upwards to a (or the nearest?) denotable supertype. It really simplifies things and is generally easy to grasp, even if the actual details can be hard to understand, e.g., the set of Java types expressible in the code model is almost the same as the set of types one can express in source code.

I agree with you on having a clearer distinction for modeling type arguments. It may be useful to have a top-level Java type'ish interface covering both Java types and Java type arguments. This seems possible: the Java type factory can create whatever instances it wants based off the type identifier information, e.g.,

            if (identifier.equals("+") || identifier.equals("-")) {
                // wildcard type
                BoundKind kind = identifier.equals("+") ?
                        BoundKind.EXTENDS : BoundKind.SUPER;
                return JavaTypeArgument.wildcard(kind, typeArguments.get(0));

?

mcimadamore · 2024-04-25T21:05:15Z

This seems possible, the Java type factory can create whatever instances it wants based off the type identifier information e.g.,

Yes, TypeDefinition is identifier plus List<TypeDefinition>. So we have some flexibility in there. I was assuming we wanted 1-1 relationship between TypeDefinition and JavaTypes, but that doesn't need to be the case.
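A minimal sketch of that shape follows. TypeDef is a hypothetical stand-in for TypeDefinition, and describe plays the role of a factory that interprets the "+" and "-" identifiers as wildcards (as in the snippet quoted earlier in this thread) rather than mapping every definition one-to-one onto a type.

```java
import java.util.List;
import java.util.stream.Collectors;

class TypeDefinitionSketch {
    // Hypothetical stand-in: an identifier plus nested definitions.
    record TypeDef(String identifier, List<TypeDef> arguments) {}

    // Interprets a definition; "+" and "-" are treated as wildcard markers.
    static String describe(TypeDef def) {
        return switch (def.identifier()) {
            case "+" -> "? extends " + describe(def.arguments().get(0));
            case "-" -> "? super " + describe(def.arguments().get(0));
            default -> def.arguments().isEmpty()
                    ? def.identifier()
                    : def.identifier() + def.arguments().stream()
                            .map(TypeDefinitionSketch::describe)
                            .collect(Collectors.joining(", ", "<", ">"));
        };
    }

    public static void main(String[] args) {
        TypeDef number = new TypeDef("java.lang.Number", List.of());
        TypeDef wildcard = new TypeDef("+", List.of(number));
        TypeDef list = new TypeDef("java.util.List", List.of(wildcard));
        System.out.println(describe(list)); // java.util.List<? extends java.lang.Number>
    }
}
```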

PaulSandoz · 2024-04-25T21:25:07Z

Yes, TypeDefinition is identifier plus List<TypeDefinition>. So we have some flexibility in there. I was assuming we wanted 1-1 relationship between TypeDefinition and JavaTypes, but that doesn't need to be the case.

Right, this enables us to serialize and parse non-Java-based code models with some non-Java-like type descriptions.

openjdk · 2024-04-26T09:50:04Z

@mcimadamore Please do not rebase or force-push to an active PR as it invalidates existing review comments. Note for future reference, the bots always squash all changes into a single commit automatically as part of the integration. See OpenJDK Developers’ Guide for more information.

mcimadamore · 2024-04-26T09:57:54Z

I gave this a try, but I don't think we should pursue this, at least not as part of this patch. Here's some code I put together:

https://github.com/mcimadamore/babylon/compare/projections...mcimadamore:babylon:java_type_argument?expand=1

I think I got the parser working, but then we're greeted with a death-by-a-thousand-cuts situation where most classes use JavaType to mean "type argument" (and they use that to call the JavaType.type factory for parameterized types). If we tweak that factory to take JavaType.Argument instead (as I did in that branch), then several calls to the factory start failing, and we need to add casts instead. The situation is not helped by the fact that the JavaType factories are not always sharp (e.g. the factory type returns just JavaType).

This is aggravated by the fact that there's no type to say "a JavaType that is also a type argument". As a result, JavaType casts too wide a net (because of primitive types), but TypeArgument is too sharp, as it contains stuff (wildcards) that are not JavaType.

Overall it wasn't clear to me that doing this refactoring would be beneficial, especially as part of a PR that is already relatively big - given that the refactoring doesn't seem the "slam dunk" we were looking for.

PaulSandoz · 2024-04-26T16:09:38Z

src/java.base/share/classes/java/lang/reflect/code/type/CoreTypeFactory.java

+                if (typeArguments.size() != 1) {
+                    throw new IllegalArgumentException("Bad type-variable bounds: " + tree);
+                }
+                String[] parts = identifier.split("::");


That's much simpler. We can iterate further afterwards if need be. I believe you can now replace identifier.contains("::") with identifier.startsWith("#")?

mcimadamore · 2024-04-26T17:08:59Z

/integrate

openjdk · 2024-04-26T17:11:45Z

Going to push as commit 6713aca.
Since your change was applied there have been 11 commits pushed to the code-reflection branch:

9f35792: 8324789: Add line number information to code models
628a931: Preserve order of captured value
c1a8964: Determine if lambda operation originates from a method reference
4556061: Temporarily disable TestSmallCorpus tests
ebf319d: Refine support for captured values.
55416a0: BytecodeGenerator cleanup and and types handling fixes
8408a32: Missing conversion for some unary operators
d02e7c6: InnerClassLambdaMetafactory fix of hidden classes handling
0f1e4e1: Issues with captures in quotable lambdas
a0e0953: Implement shift ops
... and 1 more: https://git.openjdk.org/babylon/compare/ff2074f84ab339eee340eeab435a5ae1afc1a7c4...code-reflection

Your commit was automatically rebased without conflicts.

openjdk · 2024-04-26T17:12:01Z

@mcimadamore Pushed as commit 6713aca.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

mcimadamore added 4 commits April 12, 2024 21:51

Inital push

bcf31bc

(some tests need to be adjusted)

Fix tests

477e435

Drop method type support from ReflectMethods::normalize

3eb997a

Add test for union type

7967c37

openjdk bot added the merge-conflict Pull request has merge conflict with target branch label Apr 15, 2024

mcimadamore commented Apr 15, 2024

mcimadamore added 2 commits April 15, 2024 10:35

Merge branch 'code-reflection' into projections

59db7cd

Fix tests after merge

dbcb63e

openjdk bot removed the merge-conflict Pull request has merge conflict with target branch label Apr 15, 2024

mcimadamore added 4 commits April 22, 2024 14:29

Drop intersections and unions

17b14da

Tweak type projection to eliminate intersections/unions

Drop spurious changes

a7cd29a

Add bound to type-variables

3929e04

Tweak tests Add erasure method

Add erasure test

05571c5

mcimadamore changed the title ~~Add support for non-denotable types~~ Add support for more types Apr 23, 2024

mcimadamore marked this pull request as ready for review April 23, 2024 13:08

openjdk bot added ready Pull request is ready to be integrated rfr Pull request is ready for review labels Apr 23, 2024

mcimadamore commented Apr 23, 2024

Add support for type-variable owner

52fc6e9

openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Apr 25, 2024

mcimadamore commented Apr 25, 2024

Remove whitespaces

89a551a

openjdk bot added ready Pull request is ready to be integrated rfr Pull request is ready for review labels Apr 25, 2024

openjdk bot added merge-conflict Pull request has merge conflict with target branch and removed ready Pull request is ready to be integrated labels Apr 26, 2024

mcimadamore force-pushed the projections branch from 1309ab7 to 89a551a Compare April 26, 2024 09:48

openjdk bot added ready Pull request is ready to be integrated and removed merge-conflict Pull request has merge conflict with target branch labels Apr 26, 2024

Simplify descriptor parsing

8ad6110

PaulSandoz approved these changes Apr 26, 2024

Simplify type-var test in java type factory

507074e

openjdk bot added the integrated Pull request has been integrated label Apr 26, 2024

openjdk bot closed this Apr 26, 2024

openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Apr 26, 2024
