-
Notifications
You must be signed in to change notification settings - Fork 323
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move Builtin Types and Methods to stdlib #3363
Move Builtin Types and Methods to stdlib #3363
Conversation
This is WIP. This change does not attempt to handle @Builtin_Type. Yet. During the initialization of Builtins we collect info about root nodes representing generated AST truffle nodes. Such nodes will dynamically replace implementations of @Builtin_Method stubs during IR to Truffle translation. There are still rough edges to work out but checking this in to see if this is the direction we want to take.
Sadly because of the move of the functions to a separate module and @Builtin_Type Polyglot still being in Builtins it's tricky to get auto imports that don't require manual adjustments. The main issue is with `PolyglotBranchNode` in `IrToTruffle` that requires a reference to the polyglot atom constructor. Decided to make this incremental step to do some benchmarking of @Builtin_Method solution and then move on @Builtin_Type. The latter will require a separate pass, hence the additional step. Additionally removed unused warnings for methods annotated with @Builtin_Method.
The change moves another type, Array, out of Builtins.enso. Also introduced a couple of changes to make it possible: - Complex Types cannot erase annotations of its atoms - method lookup algorithm has to take into account situation where Complex Type, it's Atom *and* module use the same name, which makes name resolution slightly ambiguous. The new version ensures that we don't have to use qualified names *every single time* This change introduces some lookup maps for checking if the given atom constructor is a builtin type. Using maps (and strings as keys) is not always necessary or unambiguous, and will work on a more permanent solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving to declarative - e.g. not imperative - registration is always good. In addition if, ideally, we could:
- make all the necessary information available at compile time
- read the information step-by-step when needed
It could make the system more "native image" ready. It could speed up initialization even in HotSpot mode.
engine/runtime/src/main/java/org/enso/interpreter/runtime/builtin/Builtins.java
Outdated
Show resolved
Hide resolved
engine/runtime/src/main/java/org/enso/interpreter/runtime/builtin/Builtins.java
Outdated
Show resolved
Hide resolved
engine/runtime/src/main/java/org/enso/interpreter/runtime/builtin/Builtins.java
Show resolved
Hide resolved
A very simple move of Boolean values to std library. Initially wanted to also fix the type definition of Boolean so that it looks like ``` @Builtin_Type type Boolean type True type False ... methods ``` rather than current type Boolean @Builtin_Type type Builtin_Type ... methods @Builtin_Type type True @Builtin_Type type False ``` but turns out this is a bigger task on its own. For example ComplexTypes pass would have to keep the information Boolean data type, rather than erasing it.
Per PR review: using `reflections` library adds an unnecessary cost at startup. Instead, while generating code from the BuiltinMethod annotations, we write meta files that can be read at compile time, therefore avoiding the performance penalty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nicer. In addition it would be good to eliminate also FileSystem
creation and walk
ing through it.
engine/runtime/src/main/java/org/enso/interpreter/runtime/builtin/Builtins.java
Outdated
Show resolved
Hide resolved
lib/scala/interpreter-dsl/src/main/java/org/enso/interpreter/dsl/MethodProcessor.java
Outdated
Show resolved
Hide resolved
Demonstrates support for @BuiltinMethod when we are not dealing with special types known to the compiler.
This reverts commit 6938b2d.
In order to be able to locally extend a builtin type, we have to be able to then find it during symbol resolution. Since the symbol is being added to a module in stdlib but resolution would be using Any special builtin type, the latter has to be aware of the former via shadowdefinitions reference.
Mostly c&p, and occassional fix for imports/renaming.
Rather than writing multiple tiny metadata files per each builtin we now generate a single file with all information. This simplifies the logic of reading that data in builtins a lot.
Reading resource as a stream rather than as a file allows us to remove the need to initialize FileSystem.
Move builtin types Cons & Nil to stdlib List module. Mostly c&p, also preparing for getting rid of temp `builtinConstructors` list for initialization of builtin types.
Use @BuiltinType to mark specific types as BuiltinTypes, and gather necessary metadata for initializing them during Builtins setup. This is WIP, lacks documentation and such, but illustrates steps towards system of builtin objects without hardcoding everything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this a lot, and it's going to be a great boon for usability! Reported some problems inline that make this broken though (should be easy enough to fix though).
engine/runtime/src/main/java/org/enso/interpreter/runtime/callable/atom/AtomConstructor.java
Outdated
Show resolved
Hide resolved
...runtime/src/main/java/org/enso/interpreter/runtime/callable/atom/BuiltinAtomConstructor.java
Outdated
Show resolved
Hide resolved
engine/runtime/src/main/java/org/enso/interpreter/runtime/scope/ModuleScope.java
Outdated
Show resolved
Hide resolved
lib/scala/interpreter-dsl/src/main/java/org/enso/interpreter/dsl/BuiltinsMetadataProcessor.java
Show resolved
Hide resolved
engine/runtime/src/test/scala/org/enso/std/test/BooleanTest.scala
Outdated
Show resolved
Hide resolved
In separate compilation annotation processor will only process files that were touched. However, since we now create metadata files with bulk information and it is not possible to append/modify exisiting resource files, it will simply overwrite the original resource with a fraction of the original information. This would in turn lead to runtime issues. The only solution was to clean and compile from scratch which is not really acceptable in regular dev. Alternatively, in the past we were creating metadata file per source file. The latter however meant that during Builtins initialization we had to do a bit of File traversing. Instead we will read the previous resource file before overriding it, update entries that changed and append the past contents. It doesn't deal with all cases (renaming/delete) but neither would the alternatives.
The initial work introduced a workaround for a "feature" that existed before builtins work. Consider a module Foo defined `src/Data/Foo.enso`: ``` type Foo type Foo a plus_one = 1 + a new x = Foo x ``` And now you define extension methods on Foo ``` from Standard.Base.Data.Foo import all Foo.two = 2 Foo.Foo.three = 3 main = x = Foo.new 0 x.two // not OK x.there // fine ``` This is how it works even in the current master. The logic in ModuleScope allowed to workaround the (hidden) name resolution ambiguity and just define `Foo.two`. This became apparent when adding methods to types such as Nothing and Any and some tests were failing. This change reverts the workaround but adds a pending test to potentially address this in the future.
engine/runtime/src/main/java/org/enso/interpreter/runtime/builtin/Error.java
Outdated
Show resolved
Hide resolved
engine/runtime/src/main/java/org/enso/interpreter/runtime/callable/atom/AtomConstructor.java
Outdated
Show resolved
Hide resolved
// Ensure that synthetic methods, such as getters for fields are in the scope | ||
// Some scopes won't have any methods at this point, e.g., Nil or Nothing, hence the null | ||
// check. | ||
Map<String, Function> methods = this.definitionScope.getMethods().get(this); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly guard byCompilerAsserts.neverPartOfCompilation()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
228 files! I am tired just scrolling thru them. Grandiose work. Please integrate it.
BuiltinType annotation now provides an additional parameter that stores a link to the corresponding definition in stdlib. TypesFromProxy provides a single static method `fromTypeSystem` which converts from type-system type names to atoms. It is a proxy because it could easily be placed inside {@link org.enso.interpreter.runtime.builtin.Builtins}. Except that by the time Builtins is compiled it requires ConstantsGen to be present, which is being generated via the TypeProcessor at a late stage. Similarly {@link org.enso.interpreter.runtime.type.Types} requires ConstantsGen to be in the same package which creates a catch-22 situation.
Co-authored-by: Radosław Waśko <radoslaw.wasko@enso.org>
Loading stdlib for `org.enso.interpreter.test.instrument.*` tests appears to be the main slowdown with builtins move. As in 25+ seconds vs 1 second, slowdown. Turning IR caching does not appear to affect the results and we are more or less back to what we used to have.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy with this. Regarding test slowdowns – think about the micro-stdlib approach. Maybe in the next PR?
Pull Request Description
This PR replaces hard-coded
@Builtin_Method
and@Builtin_Type
nodes in Builtins with an automated solutionthat a) collects metadata from such annotations b) generates
BuiltinTypes
c) registers builtin methods with correspondingconstructors.
The main differences are:
Related to #181497846
Benchmarks also execute within the margin of error.
Important Notes
The PR got a bit large over time as I was moving various builtin types and finding various corner cases.
Most of the changes however are rather simple c&p from Builtins.enso to the corresponding stdlib module.
Here is the list of the most crucial updates:
engine/runtime/src/main/java/org/enso/interpreter/runtime/builtin/Builtins.java
- the core of the changes. We no longer register individual builtin constructors and their methods by hand. Instead, the information about those is read from 2 metadata files generated by annotation processors. When the builtin method is encountered in stdlib, we do not ignore the method. Instead we lookup it up in the list of registered functions (seegetBuiltinFunction
andIrToTruffle
)engine/runtime/src/main/java/org/enso/interpreter/runtime/callable/atom/AtomConstructor.java
has now information whether it corresponds to the builtin type or not.engine/runtime/src/main/scala/org/enso/compiler/codegen/RuntimeStubsGenerator.scala
- when runtime stubs generator encounters a builtin type, based on the @Builtin_Type annotation, it looks up an existing constructor for it and registers it in the provided scope, rather than creating a new one. The scope of the constructor is also changed to the one coming from stdlib, while ensuring that synthetic methods (for fields) also get assigned correctlyengine/runtime/src/main/scala/org/enso/compiler/codegen/IrToTruffle.scala
- when a builtin method is encountered in stdlib we don't generate a new function node for it, instead we look it up in the list of registered builtin methods. Note that Integer and Number present a bit of a challenge because they list a whole bunch of methods that don't have a corresponding method (instead delegating to small/big integer implementations).During the translation new atom constructors get initialized but we don't want to do it for builtins which have gone through the process earlier, hence the exception
lib/scala/interpreter-dsl/src/main/java/org/enso/interpreter/dsl/MethodProcessor.java
- @Builtin_Method processor not only generates the actual code fpr nodes but also collects and writes the info about them (name, class, params) to a metadata file that is read during builtins initializationlib/scala/interpreter-dsl/src/main/java/org/enso/interpreter/dsl/MethodProcessor.java
- @Builtin_Method processor no longer generates only (root) nodes but also collects and writes the info about them (name, class, params) to a metadata file that is read during builtins initializationlib/scala/interpreter-dsl/src/main/java/org/enso/interpreter/dsl/TypeProcessor.java
- Similar to MethodProcessor but handles @Builtin_Type annotations. It doesn't, yet, generate any builtin objects. It also collects the names, as present in stdlib, if any, so that we can generate the names automatically (see generatedtypes/ConstantsGen.java
)engine/runtime/src/main/java/org/enso/interpreter/node/expression/builtin
- various classes annotated with @BuiltinType to ensure that the atom constructor is always properly registered for the builitn. Note that in order to support types fields in those, annotation takes optionalparams
parameter (comma separated).engine/runtime/src/bench/scala/org/enso/interpreter/bench/fixtures/semantic/AtomFixtures.scala
- drop manual creation of test list which seemed to be a relict of the old designChecklist
Please include the following checklist in your PR:
./run dist
and./run watch
.