Introduce JavaTypeFactory and extend JavaSourceSet with classpath fast paths #7528

Merged
knutwannheden merged 1 commit into main from java-type-factory-refactor
Apr 30, 2026
Contributor

@knutwannheden knutwannheden commented Apr 30, 2026

Motivation

Parser-time type construction is independently re-implemented in every TypeMapping (Java 8/11/17/21/25, Groovy, Kotlin, KotlinIr, Scala, JavaReflection). Each one caches differently, uses a different idiom for breaking recursive resolution, and accesses JavaTypeCache directly. Recipes that want to influence parser type identity — most importantly, sharing types across a JavaTemplate splice and the surrounding LST — have no abstraction to hook into.

JavaSourceSet exposes a flat List<JavaType.FullyQualified> for the classpath. Recipes that look types up by FQN or by package have been doing linear scans, which doesn't scale on classpaths with thousands of types. There is no SPI for marker producers to plug in a sorted/indexed backing.

TypeTable writes per-artifact classes-directories. When those land on JavaTemplate's parser classpath, javac uses DirectoryContainer, which calls openat per list(). On parsing-heavy recipe runs the syscall cost can dominate wall time.

Summary

JavaTypeFactory

  • New org.openrewrite.java.internal.JavaTypeFactory interface with a compute* family covering the seven parser-constructed JavaType kinds: computeClass, computeMethod, computeVariable, computeArray, computeIntersection, computeGenericTypeVariable, computeParameterized.
  • Each compute* returns the cached instance keyed by signature, or allocates a stub, registers it in the cache before invoking the supplied initializer, then runs the initializer to populate the stub. Registering up front is what makes recursive resolution safe — a recursive lookup for the same signature finds the stub instead of looping.
  • DefaultJavaTypeFactory implements compute* directly. Initializers populate the stub by calling unsafeSet on it.
  • All TypeMappings migrated: ReloadableJava{8,11,17,21,25}TypeMapping, GroovyTypeMapping, KotlinTypeMapping, KotlinIrTypeMapping, ScalaTypeMapping, JavaReflectionTypeMapping.
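The stub-first ordering described above can be illustrated with a minimal sketch. SketchType and SketchTypeFactory are hypothetical stand-ins written for this example and are not the actual JavaTypeFactory or JavaType API; the only thing the sketch is meant to show is that the stub is registered in the cache before the initializer runs.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical stand-in for a parser-constructed type; not the real JavaType.
class SketchType {
    final String signature;
    SketchType selfReference; // populated by the initializer

    SketchType(String signature) {
        this.signature = signature;
    }
}

// Hypothetical factory showing the "register the stub, then initialize" order.
class SketchTypeFactory {
    private final Map<String, SketchType> cache = new HashMap<>();
    int allocations = 0;

    SketchType compute(String signature, Consumer<SketchType> initializer) {
        SketchType cached = cache.get(signature);
        if (cached != null) {
            return cached; // may be a stub that is still being initialized
        }
        SketchType stub = new SketchType(signature);
        allocations++;
        cache.put(signature, stub); // register before running the initializer
        initializer.accept(stub);   // recursive compute() calls now find the stub
        return stub;
    }
}

class StubFirstDemo {
    public static void main(String[] args) {
        SketchTypeFactory factory = new SketchTypeFactory();
        SketchType type = factory.compute("java.lang.Comparable", self -> {
            // Recursive lookup for the same signature returns the stub.
            self.selfReference = factory.compute("java.lang.Comparable", s -> {
                throw new IllegalStateException("initializer must not run twice");
            });
        });
        System.out.println(type.selfReference == type); // prints true
    }
}
```

The sketch uses a plain get/put pair rather than Map.computeIfAbsent, because HashMap.computeIfAbsent does not allow the mapping function to modify the map, which a recursive lookup would do.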

JavaTypeFactory.Provider

  • Parser builders (JavaParser, Java*Parser, GroovyParser, KotlinParser, ScalaParser, GradleParser) accept a Provider. The legacy typeCache(JavaTypeCache) builder is @Deprecated in favor of typeFactory(JavaTypeFactory).
  • JavaTemplateParser consumes the same Provider from the recipe-scoped root cursor (via TYPE_FACTORY_PROVIDER_KEY). RecipeScheduler exposes rootCursorProvider(Supplier<Cursor>) so a recipe-scoped JavaTypeFactory can flow into JavaTemplate parses, keeping splice-time type identity consistent with the surrounding LST.
  • RecipeClassLoader parent-loads JavaTypeFactory so recipe-side and parser-side factory instances unify.

JavaSourceSet classpath fast paths

  • New JavaSourceSet.ClasspathIndex SPI: marker producers can plug in a sorted/indexed backing for the classpath list.
  • findClasspathType(fqn) — sorted-bsearch via ClasspathIndex.findFullyQualified, falls back to linear scan for legacy markers.
  • classpathTypesInPackage(pkg) — sorted-prefix range scan via ClasspathIndex.typesInPackage, same fallback.
  • removeTypesMatching / removeTypesForGav route through ClasspathIndex.withGavsRemoved when the marker backing supports it, so an indexed classpath view is preserved across recipe edits instead of being collapsed into a flat list.
  • ImportLayoutStyle uses the per-package fast path.
  • JavaSourceSetCompat retains the legacy materialization path under @ToBeRemoved(after = "2026-06-30") for recipes still calling JavaSourceSet.build(...) directly.
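As a rough illustration of the two lookups, the sketch below keeps fully qualified names in a sorted list and answers both queries with a binary search. SortedFqnIndex is a hypothetical class for this example only; the real ClasspathIndex SPI operates on JavaType.FullyQualified and its method signatures may differ. The sketch also assumes that package members are exactly the names with the prefix "pkg." and no further dot.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical index over fully qualified names; not the real ClasspathIndex SPI.
class SortedFqnIndex {
    private final List<String> sorted;

    SortedFqnIndex(List<String> fqns) {
        this.sorted = new ArrayList<>(fqns);
        Collections.sort(this.sorted);
    }

    // Binary search for an exact fully qualified name; null when absent.
    String find(String fqn) {
        int i = Collections.binarySearch(sorted, fqn);
        return i >= 0 ? sorted.get(i) : null;
    }

    // Names starting with "pkg." are contiguous in sorted order, so the scan
    // starts at the insertion point of the prefix and stops at the first miss.
    List<String> typesInPackage(String pkg) {
        String prefix = pkg + ".";
        int i = Collections.binarySearch(sorted, prefix);
        int start = i >= 0 ? i : -(i + 1);
        List<String> result = new ArrayList<>();
        for (int j = start; j < sorted.size() && sorted.get(j).startsWith(prefix); j++) {
            String simpleName = sorted.get(j).substring(prefix.length());
            if (simpleName.indexOf('.') < 0) { // skip types in subpackages
                result.add(sorted.get(j));
            }
        }
        return result;
    }
}
```

Both operations cost a logarithmic search plus, for the package query, a scan over the names that share the prefix, instead of a pass over the whole classpath list.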

TypeTable writes per-artifact JARs

  • Output is now one JAR per (groupId, artifactId, version), written via temp file + atomic move.
  • Javac uses ArchiveContainer for these (central directory cached once) instead of DirectoryContainer. On a parsing-heavy recipe run (AssertJ migration on spring-data-commons), worker __open samples drop 921 → 157 (-83%) on async-profiler wall mode; recipe wall time drops ~9%.
  • AnnotationDeserializer, TypeSignature, and TypeTableSink are extracted as public types so writers can produce structured annotation data without string round-tripping.
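The "temp file + atomic move" step can be sketched with the standard library alone. AtomicJarWriter is a hypothetical helper for this example and is not the TypeTable writer; it assumes the entries are already available as a map from entry name to bytes.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Map;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

// Hypothetical helper; illustrates the pattern, not the actual TypeTable code.
class AtomicJarWriter {
    // Writes entries (name -> bytes) so that readers never observe a partial JAR.
    static void write(Path target, Map<String, byte[]> entries) throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        Files.createDirectories(dir);
        // Temp file in the same directory so the move stays on one file system.
        Path tmp = Files.createTempFile(dir, target.getFileName().toString(), ".tmp");
        try {
            try (OutputStream out = Files.newOutputStream(tmp);
                 JarOutputStream jar = new JarOutputStream(out)) {
                for (Map.Entry<String, byte[]> e : entries.entrySet()) {
                    jar.putNextEntry(new JarEntry(e.getKey()));
                    jar.write(e.getValue());
                    jar.closeEntry();
                }
            }
            // Whether an existing target is replaced is file-system specific.
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            Files.deleteIfExists(tmp); // no-op after a successful move
        }
    }
}
```

Writing to a sibling temp file first means a concurrent javac process either sees no JAR or a complete one, never a truncated archive.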

Test plan

  • gw check clean across all modules (53m, 162 tasks, 0 failures)
  • Existing TypeMapping tests pass under each Java/Groovy/Kotlin/Scala module
  • JavaTemplateParserProviderTest covers Provider plumbing through JavaTemplate
  • TypeTableTest updated to assert JAR entries instead of directory contents
  • Legacy JavaSourceSet.build(...) path covered by JavaSourceSetCompat and remaining recipe tests

Surface area for parser-time type construction is now centralized on
JavaTypeFactory and threaded through every TypeMapping (Java 8/11/17/21/25,
Groovy, Kotlin/KotlinIr, Scala, JavaReflection). Each compute* method
returns the cached instance keyed by signature, or allocates a stub,
registers it, and runs the supplied initializer. Registering the stub
before the initializer runs is what makes recursive resolution safe —
a recursive lookup for the same signature finds the stub instead of
looping.

The previous two-phase create*/populate* surface is gone.
DefaultJavaTypeFactory implements compute* directly. Initializers call
unsafeSet on the stub to populate fields, replacing the earlier
typeFactory.populateXxx indirection.

JavaTypeFactory.Provider is the parser-builder hand-off: parsers
(JavaParser, Java*Parser, GroovyParser, KotlinParser, ScalaParser,
GradleParser) and JavaTemplateParser all consume a Provider rather than a
JavaTypeCache directly. JavaTypeCache itself is unchanged, but the typeCache()
configuration on parser builders is deprecated in favor of typeFactory().
RecipeScheduler exposes a rootCursorProvider so a recipe-scoped
JavaTypeFactory can flow into JavaTemplate parses, keeping splice-time
type identity consistent with the surrounding LST.

JavaSourceSet gains two fast-path accessors for recipes that index by
classpath:

- findClasspathType(fqn) — sorted-bsearch on a per-source-set
  ClasspathIndex SPI, falls back to linear scan for legacy markers.
- classpathTypesInPackage(pkg) — sorted-prefix range scan.

Both go through ClasspathIndex.Subset, an SPI marker producers
(serializer V4 lazy backings) implement to short-circuit. ImportLayoutStyle
uses the per-package fast path. removeTypesMatching / removeTypesForGav
also dispatch through ClasspathIndex#withGavsRemoved when the marker
backing supports it, preserving lazy classpath views across recipe edits.

JavaSourceSetCompat retains the legacy materialization path under
@ToBeRemoved(after = "2026-06-30") for recipes still calling
JavaSourceSet.build(...) directly. RecipeClassLoader parent-loads
JavaTypeFactory so recipe-side and parser-side factory instances unify.

TypeTable now writes per-artifact JARs (instead of classes-directories) so
javac uses ArchiveContainer with a cached central directory rather than
DirectoryContainer, which calls openat() per list() during template
parsing. Wall-mode profiling on AssertJ recipe / spring-data-commons:
__open worker samples 921 → 157 (-83%); recipe wall time 1m23s → 1m16s.
TypeTable annotations are now structured: AnnotationDeserializer is a
public class, TypeSignature parses ASM signatures, and TypeTableSink is
the abstract sink interface, so writers avoid a string
round-trip.
@github-project-automation github-project-automation Bot moved this to In Progress in OpenRewrite Apr 30, 2026
@knutwannheden knutwannheden merged commit 23be80c into main Apr 30, 2026
1 check passed
@knutwannheden knutwannheden deleted the java-type-factory-refactor branch April 30, 2026 10:45
@github-project-automation github-project-automation Bot moved this from In Progress to Done in OpenRewrite Apr 30, 2026
@sambsnyd
Member

Nice 👍

steve-aom-elliott added a commit that referenced this pull request May 1, 2026
After #7528 TypeTable materializes artifacts as jar files inside the
version directory rather than as a classes directory. The .tt branch
in gavFromPath used fixed offsets that worked for the legacy directory
layout but mis-sliced GAV components for the new jar layout.

Compute versionIndex/artifactIndex off the tail and shift left by one
when the trailing path component ends with .jar, so the slice points
at the right components in either layout.
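The tail-relative slicing described in this commit message might look roughly like the sketch below. GavSlicer is a hypothetical class, not the actual gavFromPath code, and the two path layouts in the comment are assumptions made for illustration: the legacy entry is taken to end at the version directory, and the new entry at a jar file inside it.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical illustration of tail-relative indexing; not the real gavFromPath.
class GavSlicer {
    // Assumed layouts (illustrative only):
    //   legacy: .../<artifactId>/<version>
    //   jar:    .../<artifactId>/<version>/<name>.jar
    // Returns {artifactId, version}.
    static String[] artifactAndVersion(Path path) {
        int n = path.getNameCount();
        if (n < 2) {
            throw new IllegalArgumentException("path too short: " + path);
        }
        int versionIndex = n - 1;
        int artifactIndex = n - 2;
        if (path.getFileName().toString().endsWith(".jar")) {
            // The jar file is the trailing component, so shift both indices left.
            versionIndex--;
            artifactIndex--;
        }
        if (artifactIndex < 0) {
            throw new IllegalArgumentException("path too short: " + path);
        }
        return new String[]{
                path.getName(artifactIndex).toString(),
                path.getName(versionIndex).toString()
        };
    }
}
```

Indexing from the tail keeps the slice independent of how many leading components (repository root, group directories) precede the artifact.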