
Align Kotlin type model with Java parser output #7364

Merged
jkschneider merged 8 commits into main from kotlin-cross-parser-dedup
Apr 14, 2026

@jkschneider
Member

Why

The Kotlin parser produces JavaType instances that systematically diverge from the Java parser's output for the same underlying JVM classes. This has two painful consequences:

  • Recipes don't work uniformly on Kotlin. A Java-authored recipe like FindSql that matches java.lang.String silently does nothing on Kotlin sources, because the Kotlin parser reports the type as kotlin.String instead of java.lang.String. Since Kotlin ultimately compiles to JVM types, this asymmetry is a usability papercut — the user's code does use java.lang.String at runtime, but the tooling can't see it.
  • Cross-parser dedup doesn't collapse shared classpath types. In mixed Java/Kotlin monorepos (e.g. google/dagger — 1113 Bazel targets), the stitch pipeline sees thousands of structurally-divergent copies of java.lang.Object, java.io.Serializable, java.util.function.Predicate, etc. The variants cache grows linearly with fragment count and OOMs at 6GB+ heap.

What this PR fixes

1. Class flags for interfaces, annotations, and enums

Both the FirClass path (Kotlin source / Kotlin-view of Java) and the BinaryJavaClass path now set the correct JVM ACC flags for special class kinds:

| Type | Before (Kotlin) | After | Java parser |
| --- | --- | --- | --- |
| java.io.Serializable (interface) | 1025 | 1537 | 1537 |
| @IntrinsicCandidate (annotation) | 17 | 1537 | 1537 |
| @jdk.internal.ValueBased | 17 | 1537 | 1537 |

Interface / annotation types now carry ACC_INTERFACE | ACC_ABSTRACT (and drop ACC_FINAL); enum types carry ACC_ENUM.
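To make the flag arithmetic in the table concrete, here is a minimal standalone Java sketch. The constants are the standard JVM access-flag values; the `ClassKind` enum and the `normalize` helper are hypothetical names chosen for illustration and are not taken from `KotlinTypeMapping`.

```java
// Hypothetical sketch of the class-flag adjustment described above.
// It is not the code in KotlinTypeMapping.kt.
final class ClassFlagSketch {
    // Access-flag values from the JVM class file format.
    static final long ACC_PUBLIC = 0x0001;
    static final long ACC_FINAL = 0x0010;
    static final long ACC_INTERFACE = 0x0200;
    static final long ACC_ABSTRACT = 0x0400;
    static final long ACC_ENUM = 0x4000;

    // Illustrative stand-in for the class kinds named in the text.
    enum ClassKind { CLASS, INTERFACE, ANNOTATION_CLASS, ENUM_CLASS }

    static long normalize(long flags, ClassKind kind) {
        switch (kind) {
            case INTERFACE:
            case ANNOTATION_CLASS:
                // Drop ACC_FINAL, then add ACC_INTERFACE | ACC_ABSTRACT.
                return (flags & ~ACC_FINAL) | ACC_INTERFACE | ACC_ABSTRACT;
            case ENUM_CLASS:
                return flags | ACC_ENUM;
            default:
                return flags;
        }
    }
}
```

With these constants, `normalize(1025, INTERFACE)` and `normalize(17, ANNOTATION_CLASS)` both evaluate to 1537, which is the value reported for the Java parser.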

2. Default methods on interfaces

Non-abstract, non-static instance methods on interfaces now carry the Default flag (bit 43) that the Java parser synthesizes, and interface instance methods also carry ACC_ABSTRACT (matching the Java parser's behaviour of marking interface methods abstract regardless of whether they have a default body). Applied across all four method-creation paths: methodDeclarationType(FirFunction), methodDeclarationType(JavaMethod), methodInvocationType, and KotlinIrTypeMapping.methodDeclarationType.
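The rule can be sketched as a small flag transformation. This is an illustrative Java sketch, not the parser's code: `adjust` is a hypothetical helper, and the private-method guard reflects the follow-up commit in this PR that exempts Java 9 private interface methods.

```java
// Hypothetical sketch of the interface-method flag rule.
final class InterfaceMethodFlagSketch {
    static final long PRIVATE = 1L << 1;
    static final long STATIC = 1L << 3;
    static final long ABSTRACT = 1L << 10;
    static final long DEFAULT = 1L << 43; // "bit 43" in the text

    static long adjust(long flags, boolean declaredOnInterface) {
        // Static and private methods keep their flags unchanged.
        if (!declaredOnInterface || (flags & (STATIC | PRIVATE)) != 0) {
            return flags;
        }
        // A method with a body (not abstract in bytecode) is a default method.
        if ((flags & ABSTRACT) == 0) {
            flags |= DEFAULT;
        }
        // Interface instance methods carry Abstract either way.
        return flags | ABSTRACT;
    }
}
```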

3. Annotation-class constructors

Kotlin's FIR emits a synthetic FirConstructor for annotation classes; the Java parser omits constructors for annotation types. Skip the synthesis so java.lang.FunctionalInterface and friends match across parsers.

4. Kotlin builtin remap to JVM FQNs

The core user-visible change. Where Kotlin's FIR reports types using kotlin.* builtins that compile to specific JVM classes, the parser now remaps to the JVM name so the produced JavaType mirrors what the Java parser would produce for the same bytecode:

| Kotlin FQN | JVM FQN |
| --- | --- |
| kotlin.Any | java.lang.Object |
| kotlin.Annotation | java.lang.annotation.Annotation |
| kotlin.CharSequence | java.lang.CharSequence |
| kotlin.Comparable | java.lang.Comparable |
| kotlin.Enum | java.lang.Enum |
| kotlin.Number | java.lang.Number |
| kotlin.String | java.lang.String |
| kotlin.Throwable | java.lang.Throwable |
| kotlin.annotation.Retention | java.lang.annotation.Retention |
| kotlin.annotation.Target | java.lang.annotation.Target |
| kotlin.annotation.MustBeDocumented | java.lang.annotation.Documented |
| kotlin.annotation.Repeatable | java.lang.annotation.Repeatable |

The remap preserves Parameterized wrappers (so kotlin.Enum<Foo> becomes java.lang.Enum<Foo>, not raw java.lang.Enum).

Applied universally — supertypes, interfaces, annotation lists, generic bounds — so a class written as

class MyString(val value: String) : Comparable<MyString>

now shows supertype java.lang.Object, interface java.lang.Comparable<MyString>, and the value property typed as java.lang.String, exactly as the Java parser would have produced for the equivalent Java class.
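A minimal sketch of a remap that keeps type arguments might look like the following. `TypeRef` is a hypothetical stand-in for a possibly parameterized type, the map holds only a subset of the table, and recursing into the type arguments is an assumption of this example rather than something the description states about `remapKotlinBuiltin`.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch; TypeRef is not an OpenRewrite type.
final class BuiltinRemapSketch {
    // Subset of the Kotlin-to-JVM table above.
    static final Map<String, String> KOTLIN_TO_JVM = Map.of(
            "kotlin.Any", "java.lang.Object",
            "kotlin.Comparable", "java.lang.Comparable",
            "kotlin.Enum", "java.lang.Enum",
            "kotlin.String", "java.lang.String");

    static final class TypeRef {
        final String fqn;
        final List<TypeRef> typeArguments;

        TypeRef(String fqn, TypeRef... typeArguments) {
            this.fqn = fqn;
            this.typeArguments = List.of(typeArguments);
        }

        @Override
        public String toString() {
            if (typeArguments.isEmpty()) {
                return fqn;
            }
            return typeArguments.stream().map(TypeRef::toString)
                    .collect(Collectors.joining(", ", fqn + "<", ">"));
        }
    }

    // Swap the name but rebuild the same parameterized shape.
    static TypeRef remap(TypeRef type) {
        TypeRef[] args = type.typeArguments.stream()
                .map(BuiltinRemapSketch::remap)
                .toArray(TypeRef[]::new);
        return new TypeRef(KOTLIN_TO_JVM.getOrDefault(type.fqn, type.fqn), args);
    }
}
```

Here `remap(new TypeRef("kotlin.Enum", new TypeRef("Foo")))` prints as `java.lang.Enum<Foo>`, not raw `java.lang.Enum`.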

5. Generic bounds

  • Explicit <T : Any> in Kotlin source now remaps to T extends java.lang.Object.
  • Implicit kotlin.Any bounds on Java-origin type parameters (e.g. java.util.Optional<T>) are stripped so they match Java's unbounded <T> (which has no bound at all, not a java.lang.Object bound).
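The two bound rules differ only in where the type parameter was declared. A hedged Java sketch follows; `normalizeBounds` and its `javaOrigin` flag are hypothetical and model bounds as plain FQN strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the two bound rules above.
final class BoundSketch {
    static List<String> normalizeBounds(List<String> bounds, boolean javaOrigin) {
        List<String> result = new ArrayList<>();
        for (String bound : bounds) {
            if (bound.equals("kotlin.Any")) {
                if (javaOrigin) {
                    // Implicit bound on a Java type parameter: Java's <T> has none.
                    continue;
                }
                // Explicit <T : Any> written in Kotlin source.
                result.add("java.lang.Object");
            } else {
                result.add(bound);
            }
        }
        return result;
    }
}
```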

Why KotlinTypeUtils?

Remapping Kotlin builtins to JVM FQNs means Java-authored recipes work on Kotlin code out of the box. But it also means the type model no longer carries Kotlin-world names. A recipe author reasoning in Kotlin terms (e.g. in terms of kotlin.collections.List or kotlin.Int) would find that calls to TypeUtils.isOfClassType(type, "kotlin.String") silently return false, because the type model now holds java.lang.String.

KotlinTypeUtils is the compatibility layer that restores the Kotlin-perspective vocabulary without compromising the canonical JVM representation underneath:

  • toJvmFqn / toKotlinFqn — explicit FQN remap between the two worlds.
  • isOfClassType(type, fqn) and isAssignableTo(fqn, type) — accept either the Kotlin or JVM name. KotlinTypeUtils.isOfClassType(x, "kotlin.String") matches a java.lang.String type; so does KotlinTypeUtils.isOfClassType(x, "java.lang.String").
  • isKotlinInt / isKotlinLong / isKotlinBoolean / ... — match Kotlin primitive types regardless of whether they surface as JVM primitives (int) or their boxed forms (java.lang.Integer), since Kotlin's Int is context-dependent on nullability.
  • isKotlinUnit — matches JVM void or kotlin.Unit.
  • isAny — matches java.lang.Object.

The alias table covers the same types the parser remaps, plus the kotlin.collections.* interfaces (List, Map, Set, Iterable, Iterator, Collection, and their mutable variants) which also compile to java.util.*.
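The alias-aware lookup idea can be sketched with two small maps. This is not the `KotlinTypeUtils` implementation: the class below is hypothetical, works on FQN strings where the real helpers take a `JavaType`, and lists only three aliases.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of alias-aware matching over FQN strings.
final class KotlinAliasSketch {
    private static final Map<String, String> KOTLIN_TO_JVM = new HashMap<>();
    private static final Map<String, String> JVM_TO_KOTLIN = new HashMap<>();

    private static void alias(String kotlinFqn, String jvmFqn) {
        KOTLIN_TO_JVM.put(kotlinFqn, jvmFqn);
        // The reverse direction is ambiguous once mutable collection
        // variants are added; this sketch keeps a single Kotlin name.
        JVM_TO_KOTLIN.put(jvmFqn, kotlinFqn);
    }

    static {
        alias("kotlin.Any", "java.lang.Object");
        alias("kotlin.String", "java.lang.String");
        alias("kotlin.collections.List", "java.util.List");
    }

    static String toJvmFqn(String fqn) {
        return KOTLIN_TO_JVM.getOrDefault(fqn, fqn);
    }

    static String toKotlinFqn(String fqn) {
        return JVM_TO_KOTLIN.getOrDefault(fqn, fqn);
    }

    // modelFqn is the JVM name the type model holds; queryFqn may be either name.
    static boolean isOfClassType(String modelFqn, String queryFqn) {
        return modelFqn.equals(toJvmFqn(queryFqn));
    }

    // Kotlin's Int can surface as the primitive or as its boxed form.
    static boolean isKotlinInt(String modelTypeName) {
        return modelTypeName.equals("int") || modelTypeName.equals("java.lang.Integer");
    }
}
```

Normalizing the query to the JVM name, not the model to the Kotlin name, keeps the type model as the single canonical representation.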

Design intent: the parser produces one canonical representation (JVM names). Tooling that needs to answer questions in Kotlin-native vocabulary layers KotlinTypeUtils over it. Kotlin-specific reasoning that genuinely has no JVM equivalent — nullability, kotlin.Unit vs void in non-return positions, source vs compiled origin — would slot in here too.

Test plan

  • ./gradlew :rewrite-kotlin:test passes (1184 tests, was 1184 before)
  • Updated 7 test expectations in KotlinTypeMappingTest that hard-coded the old Kotlin-native FQNs (e.g. "kotlin.Any" → "java.lang.Object") to match the new JVM-native output
  • New KotlinTypeUtilsTest exercises the alias lookup, the Kotlin-FQN variants of isOfClassType / isAssignableTo, and each Kotlin primitive helper against both primitive and boxed forms
  • Validated against moderneinc/moderne-ast-write's JavaKotlinCrossParserDeduplicationTest: javaAndKotlinFragmentsShareClasspathTypeVariants and annotationTypesFlagsAlign now pass; the dedup count for fundamental JDK types collapses from N-per-fragment to 1

The Kotlin parser produces JavaType instances that diverge from the Java
parser's output for the same JDK classpath types, preventing cross-parser
deduplication in mixed Java/Kotlin monorepos (google/dagger scale: 1113
Bazel targets hitting OOM at 6GB+ heap from linear variant growth).

Fixes in KotlinTypeMapping.kt / KotlinIrTypeMapping.kt:

**Class flags for interfaces and annotation types.** Both the FirClass
path and the BinaryJavaClass path now set ACC_INTERFACE and ACC_ABSTRACT
(and clear ACC_FINAL) for INTERFACE and ANNOTATION_CLASS, and set
ACC_ENUM for ENUM_CLASS. Previously `java.io.Serializable` had flags
1025 (Public+Abstract), `@IntrinsicCandidate` / `@ValueBased` had 17
(Public+Final); now all match the Java parser's 1537.

**Default methods on interfaces.** Non-abstract, non-static instance
methods on interfaces now carry the Default flag (bit 43). Applied in
the FirFunction path, the JavaMethod path, the IR path, and
methodInvocationType. Interface instance methods also carry Abstract
(matching Java's parser) regardless of whether they have a default body.

**Annotation classes no longer synthesize a constructor.** Kotlin's FIR
includes a FirConstructor for annotation classes; the Java parser omits
constructors for annotations, so skip them here for cross-parser dedup.

**Remap Kotlin builtins for Java-origin types.** When a class whose
origin is FirDeclarationOrigin.Java has its supertype or interfaces
resolved to kotlin.Any, kotlin.Annotation, etc., remap to the Java FQN
(java.lang.Object, java.lang.annotation.Annotation, etc.). Kotlin-source
classes keep their explicit Kotlin references. Meta-annotations
(kotlin.annotation.Retention/Target/MustBeDocumented) are always remapped
in listAnnotations so Java classes' meta-annotations align.

**Strip kotlin.Any bounds on Java-origin type parameters.** For
`java.util.Optional<T>`, Kotlin's FIR resolves T's bound to kotlin.Any;
the Java parser represents unbounded type parameters with no bounds.
Strip kotlin.Any from bounds only when the containing declaration is
Java-origin so `<T : Any>` in Kotlin source is preserved.

Previously the remap was scoped to Java-origin classes only, leaving
Kotlin source classes with supertype `kotlin.Any` / `kotlin.Enum<...>`
and type parameter bounds like `T extends kotlin.Any`. That meant
Java-authored recipes matching on `java.lang.Object` / `java.lang.String`
failed on Kotlin sources — `FindSql` in rewrite-sql wouldn't light up on
Kotlin code even though the runtime types are identical.

Apply the builtin remap universally so the parser produces the same JVM
FQNs the Java parser would for the same bytecode:

- Supertype and interfaces resolve through `remapKotlinBuiltin`
  regardless of origin.
- Explicit `<T : Any>` in Kotlin source remaps to `T extends
  java.lang.Object`. Implicit `kotlin.Any` bounds on Java-origin type
  parameters (e.g. `java.util.Optional<T>`) are still stripped so they
  match Java's unbounded `<T>`.
- `remapKotlinBuiltin` now preserves `Parameterized` wrappers —
  `kotlin.Enum<Foo>` becomes `java.lang.Enum<Foo>` rather than losing
  its type arguments.

Tests updated to reflect the JVM-native output. Kotlin-specific
semantics (e.g. reasoning about `kotlin.Any?` nullability, or whether a
declaration came from Kotlin source) would belong in a Kotlin-specific
`TypeUtils` layered over this representation, not in the type mapping
itself.

`KotlinTypeMapping` now produces JVM-native FQNs on the type model, so
Java-authored recipes matching on java.lang.Object / java.lang.String
already work over Kotlin sources. But recipes written from the Kotlin
author's perspective may reasonably say "match kotlin.Int" or
"match kotlin.collections.List" — those wouldn't find anything because
the type model carries the JVM name.

`KotlinTypeUtils` layers over `TypeUtils` with Kotlin-aware variants:

- `toJvmFqn` / `toKotlinFqn` — explicit FQN remap across the two worlds.
- `isOfClassType(type, fqn)` — accepts either the Kotlin or JVM name
  and matches against the JVM representation the type model carries.
- `isAssignableTo(fqn, type)` — same aliasing for supertype checks.
- `isKotlinInt` / `isKotlinLong` / `isKotlinBoolean` / ... — match the
  Kotlin primitive types regardless of whether they appear as JVM
  primitives or their boxed equivalents.
- `isKotlinUnit`, `isAny` — convenience for the top-type and unit-return
  semantics.

The alias table covers `kotlin.Any`, `kotlin.Annotation`,
`kotlin.CharSequence`, `kotlin.Comparable`, `kotlin.Enum`,
`kotlin.Number`, `kotlin.String`, `kotlin.Throwable`, the
`kotlin.annotation.*` meta-annotations, and the `kotlin.collections.*`
interfaces.

Four related fixes that let Kotlin-produced JavaType instances dedup
against Java-parser output for common JDK types:

**F-bounded generic resolution.** `BaseStream.sequential()` returns `S`
where `<S extends BaseStream<T, S>>`. The Kotlin parser was resolving
`S` — a `JavaClassifierType` whose classifier is a `JavaTypeParameter` —
through `TypeUtils.asFullyQualified`, which returns null for a
GenericTypeVariable. The fallback then produced `JavaType.Unknown`,
which cascaded through every type whose method signatures referenced
`S`. A JavaClassifierType whose classifier is a type parameter can't
itself be parameterized, so return the GTV directly.

**Java-primitive signature collision.** `KotlinTypeSignatureBuilder`
was using `JavaType.Primitive.Int.className` (`"java.lang.Integer"`) as
the signature for JVM `int`. That collided with the `java.lang.Integer`
Class entry written by `javaClassType`, so once Integer was resolved
(as it is during Number processing) every subsequent primitive-int
lookup got the boxed Class back. Switch to `.keyword` (`"int"`) so the
cache keys are distinct.
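The collision is easy to reproduce with any signature-keyed cache. The sketch below is a toy model in Java: the string values stand in for cached types, and `lookupInt` is a hypothetical helper, not part of `KotlinTypeSignatureBuilder`.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a type cache keyed by signature string.
final class SignatureCacheSketch {
    // Returns what a lookup for JVM int yields once java.lang.Integer
    // has already been cached as a class.
    static String lookupInt(String intSignature) {
        Map<String, String> cache = new HashMap<>();
        cache.put("java.lang.Integer", "Class(java.lang.Integer)");
        // computeIfAbsent returns the existing entry when the key is taken.
        return cache.computeIfAbsent(intSignature, key -> "Primitive(int)");
    }
}
```

Keyed by the class name, `lookupInt("java.lang.Integer")` hands back the boxed class entry; keyed by the keyword, `lookupInt("int")` yields the primitive.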

**ACC_VARARGS → Flag.Varargs.** The JVM reuses bit 7 (`0x0080`) for
`ACC_TRANSIENT` on fields and `ACC_VARARGS` on methods. OpenRewrite's
`Flag` keeps them separate: Transient at bit 7, Varargs at bit 34. When
we read `method.access.toLong()` directly from the class file, the
varargs methods (`Set.of(E[])`, `MethodHandles.argumentsWithCombiner`,
annotation-array parameters on constructors) were being mis-flagged
as Transient. Rewrite the bit in the method / constructor paths.
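The bit rewrite amounts to clearing bit 7 and setting bit 34 when the flags belong to a method. The following is a minimal sketch with a hypothetical `methodFlags` helper; the bit positions are the ones given in the paragraph above.

```java
// Hypothetical sketch of the varargs bit rewrite for method access flags.
final class VarargsFlagSketch {
    static final long ACC_VARARGS = 0x0080;    // bit 7 when read from a method
    static final long FLAG_VARARGS = 1L << 34; // position given for Flag.Varargs

    static long methodFlags(int access) {
        long flags = access; // widen the class-file value to a long
        if ((flags & ACC_VARARGS) != 0) {
            // Move the bit so it is not read back as Transient.
            flags = (flags & ~ACC_VARARGS) | FLAG_VARARGS;
        }
        return flags;
    }
}
```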

**Nested-class FQN dollar-separation.** `BinaryJavaClass.fqName.asString()`
returns `java.util.Map.Entry` — dotted form, indistinguishable from a
package-qualified top-level class named `Entry`. The Java parser emits
the JVM-style `java.util.Map$Entry`. Walk `outerClass` to produce the
dollar-separated FQN so nested types dedup between the two parsers.
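
The walk over `outerClass` can be pictured with a small stand-in model. `ClassInfo` and `jvmFqn` are hypothetical names, not the compiler's `BinaryJavaClass` API; this is only a sketch of the dollar-joining idea.

```java
// Hypothetical stand-in for a class with an optional enclosing class.
final class NestedNameSketch {
    static final class ClassInfo {
        final String packageName;
        final String simpleName;
        final ClassInfo outerClass; // null for top-level classes

        ClassInfo(String packageName, String simpleName, ClassInfo outerClass) {
            this.packageName = packageName;
            this.simpleName = simpleName;
            this.outerClass = outerClass;
        }
    }

    static String jvmFqn(ClassInfo type) {
        StringBuilder name = new StringBuilder(type.simpleName);
        // Prepend each enclosing class, innermost first, joined by '$'.
        for (ClassInfo outer = type.outerClass; outer != null; outer = outer.outerClass) {
            name.insert(0, outer.simpleName + "$");
        }
        return type.packageName.isEmpty() ? name.toString() : type.packageName + "." + name;
    }
}
```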
Previously the parser kept `kotlin.Int` / `kotlin.Boolean` / `kotlin.Unit`
etc. as `JavaType.Class` instances with FQNs in the `kotlin.*` namespace.
Java-authored recipes that reason about JVM primitives — `MethodMatcher`s
with `int` parameters, `TypeUtils.isString`, etc. — silently failed on
Kotlin sources because the primitive was hidden behind a class wrapper.

In `KotlinTypeMapping.type()`, intercept `ConeClassLikeType` /
`FirResolvedQualifier` references whose FQN is a Kotlin built-in
primitive and whose nullability is known to be non-nullable (so they
collapse to a JVM primitive on the JVM rather than to a boxed class).
Return the matching `JavaType.Primitive` directly.
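
As a rough sketch of that interception, the Java snippet below maps the eight Kotlin primitive builtins to JVM keywords only when the reference is known to be non-nullable. `typeName` is a hypothetical helper operating on strings. Leaving every other case unchanged is an assumption of the sketch, since the text does not spell out what nullable references resolve to, and `kotlin.Unit` is left out for the same reason.

```java
import java.util.Map;

// Hypothetical sketch of the non-nullable primitive remap.
final class PrimitiveRemapSketch {
    static final Map<String, String> PRIMITIVES = Map.of(
            "kotlin.Boolean", "boolean",
            "kotlin.Byte", "byte",
            "kotlin.Char", "char",
            "kotlin.Short", "short",
            "kotlin.Int", "int",
            "kotlin.Long", "long",
            "kotlin.Float", "float",
            "kotlin.Double", "double");

    static String typeName(String kotlinFqn, boolean knownNonNullable) {
        String primitive = PRIMITIVES.get(kotlinFqn);
        // Only a known non-nullable builtin collapses to a JVM primitive.
        return (primitive != null && knownNonNullable) ? primitive : kotlinFqn;
    }
}
```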

Class-definition contexts (where we're processing the kotlin.Int class
itself to populate its methods) still go through `classType()`. Method
declaring-type lookups bypass the primitive remap via a new
`asDeclaringType()` helper — methods are members of the kotlin.Int
class, not of the JVM `int` primitive.

`KotlinTypeUtils.isOfClassType(primitive, "kotlin.Int")` now also
returns true so Kotlin-perspective recipes still match.

Existing recipes that depended on the wrapped form updated:
- `EqualsMethodUsage` checks `kotlin.Boolean` returns through
  `KotlinTypeUtils.isOfClassType` so Primitive.Boolean matches.
- `ReplaceCharToIntWithCode`'s template constraint loosened from
  `any(kotlin.Char)` to `any()` since the receiver is now a primitive.

Test expectations updated: `kotlin.Int` → `int`, `kotlin.Boolean` →
`boolean`, `kotlin.Unit` → `void` in the type-string assertions, and
`isInstanceOf(JavaType.Class.class)` checks for primitive types
replaced with `isEqualTo(JavaType.Primitive.X)` equality assertions.

Three related additions that chip further through the cross-parser dedup
cascade for JDK types:

**Enum values()/valueOf() synthesis.** The Java compiler generates
`static T[] values()` and `static T valueOf(String)` on every enum, and
the Java parser surfaces them. Kotlin's BinaryJavaClass only exposes
source-declared methods, so enum types dedup-mismatch with Java over a
missing `values()` and extra/missing `valueOf` overload. Synthesize both
when processing a BinaryJavaClass whose `isEnum` is true — `values()`
with flags ACC_PUBLIC | ACC_STATIC and return type `T[]`, and
`valueOf(String)` with ACC_PUBLIC | ACC_STATIC returning T.
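
The two compiler-generated members can be observed on any compiled enum through reflection. The Java sketch below is not parser code; `describe` is a hypothetical helper that only shows the members the paragraph above says need to be synthesized, using `java.util.concurrent.TimeUnit` as the example enum.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Reflection-based illustration; not how the parser synthesizes methods.
final class EnumSyntheticMethodsSketch {
    static String describe(Class<?> enumType, String name, Class<?>... parameterTypes) {
        try {
            Method method = enumType.getMethod(name, parameterTypes);
            return Modifier.toString(method.getModifiers())
                    + " " + method.getReturnType().getSimpleName()
                    + " " + method.getName();
        } catch (NoSuchMethodException e) {
            return "missing " + name;
        }
    }
}
```

For `TimeUnit`, `describe(TimeUnit.class, "values")` gives `public static TimeUnit[] values` and `describe(TimeUnit.class, "valueOf", String.class)` gives `public static TimeUnit valueOf`.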

**Static flag on nested interfaces and annotation types.** Nested
interfaces and annotation types are always implicitly static on the JVM,
but the ACC_STATIC bit lives in the InnerClasses attribute — not in
BinaryJavaClass.access. Apply it explicitly so
`java.lang.invoke.MethodHandle$PolymorphicSignature` and its peers get
flags 1544 rather than 1536, matching the Java parser. Applied in both
the FirClass path (`classId.isNestedClass`) and the BinaryJavaClass path
(FQN contains `$` after toJvmFqn normalization).
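
The flag difference is a single bit. The sketch below pairs the arithmetic with a reflection check, since `Class.getModifiers()` reports nested interfaces as static. `withImplicitStatic` is a hypothetical helper, not the parser's code.

```java
import java.lang.reflect.Modifier;

// Hypothetical sketch of applying the implicit static bit.
final class NestedStaticSketch {
    static final long ACC_STATIC = 0x0008;

    static long withImplicitStatic(long flags, boolean nested, boolean interfaceOrAnnotation) {
        return (nested && interfaceOrAnnotation) ? flags | ACC_STATIC : flags;
    }

    // What reflection reports for an already loaded class.
    static boolean reflectionSeesStatic(Class<?> type) {
        return Modifier.isStatic(type.getModifiers());
    }
}
```

`withImplicitStatic(1536, true, true)` yields 1544, and `reflectionSeesStatic(java.util.Map.Entry.class)` is true while the top-level `java.util.Map` is not static.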

**SignaturePolymorphic flag.** Methods annotated with
`@java.lang.invoke.MethodHandle.PolymorphicSignature` — e.g.
`MethodHandle.invoke`, `VarHandle.compareAndExchange` — carry bit 46
(Flag.SignaturePolymorphic) in the Java parser. Detect the annotation
and set the bit.

Java 9 introduced private methods on interfaces for refactoring the
default-method helper pattern. These are concrete methods with private
visibility — not abstract, not default. Two code paths (FirFunction and
JavaMethod) were unconditionally marking all instance methods on an
interface as Abstract, and as Default when non-abstract, producing
wrong flags for methods like `java.lang.foreign.SegmentAllocator`'s
private helpers.

Check the Private bit (bit 1) before applying the Abstract/Default
adjustment so private interface methods retain their narrower flag set.

Resolves the remaining cross-parser dedup divergences between the Kotlin
and Java parsers on JDK types like Predicate, Optional, and Properties:

- Strip `java.lang.Object` bounds from wildcards (both JavaWildcardType
  and ConeKotlinTypeProjection paths) and from Java-origin type
  parameters (ConeTypeParameterType path). Kotlin's FIR surfaces the
  bytecode's explicit Object bound as `Generic{? super Object}` or
  `Generic{T extends Object}`; the Java parser elides these as
  `Generic{? super }` / `Generic{T}`. Only strip when the original
  bound wasn't explicitly `kotlin.Any` so Kotlin-source `<T : Any>`
  still surfaces with its author-intended bound.
- Also strip Object bounds in `KotlinIrTypeMapping.generic`.
- Align `javaClassSignature(BinaryJavaClass)` and
  `javaParameterizedSignature(JavaClassifierType)` to produce the
  JVM `Outer$Inner` form for nested classes, matching the Java parser
  and the `toJvmFqn` used for class-cache keys.
- Return the raw `Class` (not the class's own `Parameterized`) for a
  raw `JavaClassifierType` reference. Without this, a raw `Reference`
  field inside a generic `Reference<T>` surfaces as `Reference<T>`
  rather than `Reference`.
- Strip `ACC_FINAL` from constructor flags in both
  `methodDeclarationType(FirFunction)` and `javaConstructorType`.
  Kotlin's FIR synthesizes `FINAL` modality for every constructor on a
  final class; the JVM bytecode carries no ACC_FINAL on constructors
  and the Java parser reads flags from bytecode directly.
- Use the raw `Class` (not its `Parameterized` form) as the
  declaringType/returnType of a constructor in both paths.
- Filter `java.lang.String.serialPersistentFields` to match the Java
  parser's filter for this serialization-specific field.

@jkschneider merged commit 3453c89 into main Apr 14, 2026
1 check passed
@jkschneider deleted the kotlin-cross-parser-dedup branch April 14, 2026 19:52
@github-project-automation bot moved this from In Progress to Done in OpenRewrite Apr 14, 2026