Move Builtin Types and Methods to stdlib #3363

hubertp · 2022-03-25T11:23:16Z

Pull Request Description

This PR replaces hard-coded @Builtin_Method and @Builtin_Type nodes in Builtins with an automated solution
that a) collects metadata from such annotations b) generates BuiltinTypes c) registers builtin methods with corresponding
constructors.
The main differences are:

The owner of the builtin method does not necessarily have to be a builtin type
You can now mix regular methods and builtin ones in stdlib
No need to keep track of builtin methods and types in various places and register them by hand (a source of many typos or omissions as it found during the process of this PR)

Related to #181497846
Benchmarks also execute within the margin of error.

Important Notes

The PR got a bit large over time as I was moving various builtin types and finding various corner cases.
Most of the changes however are rather simple c&p from Builtins.enso to the corresponding stdlib module.
Here is the list of the most crucial updates:

engine/runtime/src/main/java/org/enso/interpreter/runtime/builtin/Builtins.java - the core of the changes. We no longer register individual builtin constructors and their methods by hand. Instead, the information about those is read from 2 metadata files generated by annotation processors. When the builtin method is encountered in stdlib, we do not ignore the method. Instead we lookup it up in the list of registered functions (see getBuiltinFunction and IrToTruffle)
engine/runtime/src/main/java/org/enso/interpreter/runtime/callable/atom/AtomConstructor.java has now information whether it corresponds to the builtin type or not.
engine/runtime/src/main/scala/org/enso/compiler/codegen/RuntimeStubsGenerator.scala - when runtime stubs generator encounters a builtin type, based on the @Builtin_Type annotation, it looks up an existing constructor for it and registers it in the provided scope, rather than creating a new one. The scope of the constructor is also changed to the one coming from stdlib, while ensuring that synthetic methods (for fields) also get assigned correctly
engine/runtime/src/main/scala/org/enso/compiler/codegen/IrToTruffle.scala - when a builtin method is encountered in stdlib we don't generate a new function node for it, instead we look it up in the list of registered builtin methods. Note that Integer and Number present a bit of a challenge because they list a whole bunch of methods that don't have a corresponding method (instead delegating to small/big integer implementations).
During the translation new atom constructors get initialized but we don't want to do it for builtins which have gone through the process earlier, hence the exception
lib/scala/interpreter-dsl/src/main/java/org/enso/interpreter/dsl/MethodProcessor.java - @Builtin_Method processor not only generates the actual code fpr nodes but also collects and writes the info about them (name, class, params) to a metadata file that is read during builtins initialization
lib/scala/interpreter-dsl/src/main/java/org/enso/interpreter/dsl/MethodProcessor.java - @Builtin_Method processor no longer generates only (root) nodes but also collects and writes the info about them (name, class, params) to a metadata file that is read during builtins initialization
lib/scala/interpreter-dsl/src/main/java/org/enso/interpreter/dsl/TypeProcessor.java - Similar to MethodProcessor but handles @Builtin_Type annotations. It doesn't, yet, generate any builtin objects. It also collects the names, as present in stdlib, if any, so that we can generate the names automatically (see generated types/ConstantsGen.java)
engine/runtime/src/main/java/org/enso/interpreter/node/expression/builtin - various classes annotated with @BuiltinType to ensure that the atom constructor is always properly registered for the builitn. Note that in order to support types fields in those, annotation takes optional params parameter (comma separated).
engine/runtime/src/bench/scala/org/enso/interpreter/bench/fixtures/semantic/AtomFixtures.scala - drop manual creation of test list which seemed to be a relict of the old design

Checklist

Please include the following checklist in your PR:

The documentation has been updated if necessary.
All code conforms to the Scala, Java, and Rust style guides.
All code has been tested:
- Unit tests have been written where possible.
- If GUI codebase was changed: Enso GUI was tested when built using BOTH ./run dist and ./run watch.

This is WIP. This change does not attempt to handle @Builtin_Type. Yet. During the initialization of Builtins we collect info about root nodes representing generated AST truffle nodes. Such nodes will dynamically replace implementations of @Builtin_Method stubs during IR to Truffle translation. There are still rough edges to work out but checking this in to see if this is the direction we want to take.

Sadly because of the move of the functions to a separate module and @Builtin_Type Polyglot still being in Builtins it's tricky to get auto imports that don't require manual adjustments. The main issue is with `PolyglotBranchNode` in `IrToTruffle` that requires a reference to the polyglot atom constructor. Decided to make this incremental step to do some benchmarking of @Builtin_Method solution and then move on @Builtin_Type. The latter will require a separate pass, hence the additional step. Additionally removed unused warnings for methods annotated with @Builtin_Method.

The change moves another type, Array, out of Builtins.enso. Also introduced a couple of changes to make it possible: - Complex Types cannot erase annotations of its atoms - method lookup algorithm has to take into account situation where Complex Type, it's Atom *and* module use the same name, which makes name resolution slightly ambiguous. The new version ensures that we don't have to use qualified names *every single time* This change introduces some lookup maps for checking if the given atom constructor is a builtin type. Using maps (and strings as keys) is not always necessary or unambiguous, and will work on a more permanent solution.

JaroslavTulach

Moving to declarative - e.g. not imperative - registration is always good. In addition if, ideally, we could:

make all the necessary information available at compile time
read the information step-by-step when needed

It could make the system more "native image" ready. It could speed up initialization even in HotSpot mode.

build.sbt

engine/runtime/src/main/java/org/enso/interpreter/runtime/builtin/Builtins.java

A very simple move of Boolean values to std library. Initially wanted to also fix the type definition of Boolean so that it looks like ``` @Builtin_Type type Boolean type True type False ... methods ``` rather than current type Boolean @Builtin_Type type Builtin_Type ... methods @Builtin_Type type True @Builtin_Type type False ``` but turns out this is a bigger task on its own. For example ComplexTypes pass would have to keep the information Boolean data type, rather than erasing it.

Per PR review: using `reflections` library adds an unnecessary cost at startup. Instead, while generating code from the BuiltinMethod annotations, we write meta files that can be read at compile time, therefore avoiding the performance penalty.

JaroslavTulach

Nicer. In addition it would be good to eliminate also FileSystem creation and walking through it.

engine/runtime/src/main/java/org/enso/interpreter/runtime/builtin/Builtins.java

lib/scala/interpreter-dsl/src/main/java/org/enso/interpreter/dsl/MethodProcessor.java

Demonstrates support for @BuiltinMethod when we are not dealing with special types known to the compiler.

This reverts commit 6938b2d.

In order to be able to locally extend a builtin type, we have to be able to then find it during symbol resolution. Since the symbol is being added to a module in stdlib but resolution would be using Any special builtin type, the latter has to be aware of the former via shadowdefinitions reference.

Mostly c&p, and occassional fix for imports/renaming.

Rather than writing multiple tiny metadata files per each builtin we now generate a single file with all information. This simplifies the logic of reading that data in builtins a lot.

Reading resource as a stream rather than as a file allows us to remove the need to initialize FileSystem.

…497846

Move builtin types Cons & Nil to stdlib List module. Mostly c&p, also preparing for getting rid of temp `builtinConstructors` list for initialization of builtin types.

Use @BuiltinType to mark specific types as BuiltinTypes, and gather necessary metadata for initializing them during Builtins setup. This is WIP, lacks documentation and such, but illustrates steps towards system of builtin objects without hardcoding everything.

kustosz

I like this a lot, and it's going to be a great boon for usability! Reported some problems inline that make this broken though (should be easy enough to fix though).

engine/runtime/src/main/java/org/enso/interpreter/runtime/callable/atom/AtomConstructor.java

...runtime/src/main/java/org/enso/interpreter/runtime/callable/atom/BuiltinAtomConstructor.java

engine/runtime/src/main/java/org/enso/interpreter/runtime/scope/ModuleScope.java

lib/scala/interpreter-dsl/src/main/java/org/enso/interpreter/dsl/BuiltinsMetadataProcessor.java

engine/runtime/src/test/scala/org/enso/std/test/BooleanTest.scala

In separate compilation annotation processor will only process files that were touched. However, since we now create metadata files with bulk information and it is not possible to append/modify exisiting resource files, it will simply overwrite the original resource with a fraction of the original information. This would in turn lead to runtime issues. The only solution was to clean and compile from scratch which is not really acceptable in regular dev. Alternatively, in the past we were creating metadata file per source file. The latter however meant that during Builtins initialization we had to do a bit of File traversing. Instead we will read the previous resource file before overriding it, update entries that changed and append the past contents. It doesn't deal with all cases (renaming/delete) but neither would the alternatives.

The initial work introduced a workaround for a "feature" that existed before builtins work. Consider a module Foo defined `src/Data/Foo.enso`: ``` type Foo type Foo a plus_one = 1 + a new x = Foo x ``` And now you define extension methods on Foo ``` from Standard.Base.Data.Foo import all Foo.two = 2 Foo.Foo.three = 3 main = x = Foo.new 0 x.two // not OK x.there // fine ``` This is how it works even in the current master. The logic in ModuleScope allowed to workaround the (hidden) name resolution ambiguity and just define `Foo.two`. This became apparent when adding methods to types such as Nothing and Any and some tests were failing. This change reverts the workaround but adds a pending test to potentially address this in the future.

engine/runtime/src/main/java/org/enso/interpreter/runtime/builtin/Error.java

engine/runtime/src/main/java/org/enso/interpreter/runtime/callable/atom/AtomConstructor.java

JaroslavTulach · 2022-04-29T14:32:32Z

engine/runtime/src/main/java/org/enso/interpreter/runtime/callable/atom/AtomConstructor.java

+      // Ensure that synthetic methods, such as getters for fields are in the scope
+      // Some scopes won't have any methods at this point, e.g., Nil or Nothing, hence the null
+      // check.
+      Map<String, Function> methods = this.definitionScope.getMethods().get(this);


Possibly guard byCompilerAsserts.neverPartOfCompilation().

JaroslavTulach

228 files! I am tired just scrolling thru them. Grandiose work. Please integrate it.

distribution/lib/Standard/Base/0.0.0-dev/src/Error/Common.enso

BuiltinType annotation now provides an additional parameter that stores a link to the corresponding definition in stdlib. TypesFromProxy provides a single static method `fromTypeSystem` which converts from type-system type names to atoms. It is a proxy because it could easily be placed inside {@link org.enso.interpreter.runtime.builtin.Builtins}. Except that by the time Builtins is compiled it requires ConstantsGen to be present, which is being generated via the TypeProcessor at a late stage. Similarly {@link org.enso.interpreter.runtime.type.Types} requires ConstantsGen to be in the same package which creates a catch-22 situation.

Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org>

Avoiding the `CompilationFinal` logic all over the place by abstracting it to a separate class. Minor cleanups as well.

…497846

Loading stdlib for `org.enso.interpreter.test.instrument.*` tests appears to be the main slowdown with builtins move. As in 25+ seconds vs 1 second, slowdown. Turning IR caching does not appear to affect the results and we are more or less back to what we used to have.

…497846

kustosz

I'm happy with this. Regarding test slowdowns – think about the micro-stdlib approach. Maybe in the next PR?

…497846

hubertp requested review from 4e6, jdunkerley, radeusgd, MichaelMauderer and PabloBuchu as code owners March 25, 2022 11:23

hubertp marked this pull request as draft March 25, 2022 11:23

hubertp added 3 commits March 29, 2022 15:36

Move Ref type + basic test

02ab5b1

JaroslavTulach approved these changes Apr 5, 2022

View reviewed changes

hubertp added 3 commits April 5, 2022 17:00

Use constant values to avoid typos

0f89f44

JaroslavTulach approved these changes Apr 7, 2022

View reviewed changes

engine/runtime/src/main/java/org/enso/interpreter/runtime/builtin/Builtins.java Outdated Show resolved Hide resolved

lib/scala/interpreter-dsl/src/main/java/org/enso/interpreter/dsl/MethodProcessor.java Outdated Show resolved Hide resolved

hubertp added 2 commits April 7, 2022 12:48

Move Meta out of builtins

91c6432

Demonstrates support for @BuiltinMethod when we are not dealing with special types known to the compiler.

Revert "Move Boolean out of Builtins to std library"

43840f2

This reverts commit 6938b2d.

JaroslavTulach mentioned this pull request Apr 9, 2022

Enso Date shall be converted to java.time.LocalDate when passed to Java #3374

Merged

hubertp added 10 commits April 9, 2022 22:40

wip

9822d76

Move Boolean to stdlib

9dbfa06

Mostly c&p, and occassional fix for imports/renaming.

Missing Booleans.enso from last commit

c42b092

Simplify builtins metadata generation

fb10830

Rather than writing multiple tiny metadata files per each builtin we now generate a single file with all information. This simplifies the logic of reading that data in builtins a lot.

Cleanup

fb666c1

Simplify resource reading

1dc9594

Reading resource as a stream rather than as a file allows us to remove the need to initialize FileSystem.

Merge branch 'develop' into wip/hubert/shadow-definition-builtins-181…

8ca36e8

…497846

Move List to stdlib

a588c80

Move builtin types Cons & Nil to stdlib List module. Mostly c&p, also preparing for getting rid of temp `builtinConstructors` list for initialization of builtin types.

kustosz suggested changes Apr 13, 2022

View reviewed changes

hubertp added 2 commits April 28, 2022 22:53

Formatting

7dc9a34

JaroslavTulach reviewed Apr 29, 2022

View reviewed changes

engine/runtime/src/main/java/org/enso/interpreter/runtime/builtin/Error.java Outdated Show resolved Hide resolved

JaroslavTulach reviewed Apr 29, 2022

View reviewed changes

engine/runtime/src/main/java/org/enso/interpreter/runtime/callable/atom/AtomConstructor.java Outdated Show resolved Hide resolved

JaroslavTulach reviewed Apr 29, 2022

View reviewed changes

JaroslavTulach approved these changes Apr 29, 2022

View reviewed changes

radeusgd reviewed Apr 29, 2022

View reviewed changes

distribution/lib/Standard/Base/0.0.0-dev/src/Error/Common.enso Outdated Show resolved Hide resolved

hubertp and others added 5 commits April 29, 2022 18:33

Test adjustments to the previous commit

9414fce

Update distribution/lib/Standard/Base/0.0.0-dev/src/Error/Common.enso

fa4ca5b

Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org>

Make formatter happy again

cba36f3

DRY

7bf6261

Avoiding the `CompilationFinal` logic all over the place by abstracting it to a separate class. Minor cleanups as well.

hubertp requested review from kustosz and radeusgd May 4, 2022 11:12

hubertp added 6 commits May 4, 2022 13:13

Add a requested compiler assert

fe89bc1

Merge branch 'develop' into wip/hubert/shadow-definition-builtins-181…

1c40d25

…497846

Fix tests

6ba1bc4

Increase timeout to avoid FPs

4fde6ba

Merge branch 'develop' into wip/hubert/shadow-definition-builtins-181…

6f27d5e

…497846

kustosz approved these changes May 5, 2022

View reviewed changes

hubertp added 4 commits May 5, 2022 15:40

Fix tests

aeeef17

Merge branch 'develop' into wip/hubert/shadow-definition-builtins-181…

670cad6

…497846

Make disabling IR cache in tests configurable

6c06032

More formatting

c90836d

jdunkerley approved these changes May 5, 2022

View reviewed changes

Merge branch 'develop' into wip/hubert/shadow-definition-builtins-181…

884a7d1

…497846

mwu-tow merged commit 4bbabc0 into develop May 5, 2022

mwu-tow deleted the wip/hubert/shadow-definition-builtins-181497846 branch May 5, 2022 18:18

wdanilo mentioned this pull request Feb 6, 2023

Support for Enso Date across the Java polyglot boundary #4296

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Move Builtin Types and Methods to stdlib #3363

Move Builtin Types and Methods to stdlib #3363

hubertp commented Mar 25, 2022 •

edited

Loading

JaroslavTulach left a comment

JaroslavTulach left a comment

kustosz left a comment

JaroslavTulach Apr 29, 2022

JaroslavTulach left a comment

kustosz left a comment

Move Builtin Types and Methods to stdlib #3363

Move Builtin Types and Methods to stdlib #3363

Conversation

hubertp commented Mar 25, 2022 • edited Loading

Pull Request Description

Important Notes

Checklist

JaroslavTulach left a comment

Choose a reason for hiding this comment

JaroslavTulach left a comment

Choose a reason for hiding this comment

kustosz left a comment

Choose a reason for hiding this comment

JaroslavTulach Apr 29, 2022

Choose a reason for hiding this comment

JaroslavTulach left a comment

Choose a reason for hiding this comment

kustosz left a comment

Choose a reason for hiding this comment

hubertp commented Mar 25, 2022 •

edited

Loading