Specialise generic functions and types over kinds/shapes #525

yorickpeterse · 2023-05-15T15:39:32Z

Description

Generic functions are currently compiled in a subpar manner. #508 keeps this approach in an attempt to keep the scope of changes from expanding even more, but it's not something I want to keep in the long term.

The usual approach is to specialise generic types/functions across types, essentially copy-pasting these types and substituting the generic type parameters with the real types. This can result in great runtime performance, but it also slows down compile times drastically, especially when paired with the already slow LLVM. In Inko's case this would also be wasteful, as a lot of data is addressed through pointers and thus has the same memory layout (e.g. Array[User] and Array[String] would have the same layout).
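To illustrate the layout argument, here is a small Rust sketch (Rust rather than Inko, and not taken from the compiler). It assumes elements are stored as owning pointers, modelled here with `Box<T>`; the `User` type and both helper functions are hypothetical names.

```rust
use std::mem::{align_of, size_of};

// Hypothetical user-defined type; not part of Inko or this issue.
#[allow(dead_code)]
pub struct User {
    name: String,
    age: u32,
}

/// Size in bytes of one array slot when elements are stored as owning pointers.
pub fn slot_size<T>() -> usize {
    size_of::<Box<T>>()
}

/// True if arrays of `A` and arrays of `B` have element slots with the same
/// size and alignment, meaning one compiled copy could serve both.
pub fn same_slot_layout<A, B>() -> bool {
    size_of::<Box<A>>() == size_of::<Box<B>>()
        && align_of::<Box<A>>() == align_of::<Box<B>>()
}
```

Since every slot is a single pointer regardless of what it points to, specialising per type would produce many copies of code that differ only in name.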

Instead, I propose we specialise across kinds/shapes. These shapes specify both the memory layout, and how to handle aliasing. We'd start out with the following shapes:

- T: pointer sized, owned
- Int: 64 bits unboxed, passed around on the stack
- Float: 64 bits floating point, unboxed, passed around on the stack using the appropriate floating point registers
- ref T: just a pointer memory wise, but creating/dropping aliases involves changing the reference count
- T where T is an atomic value (e.g. String): basically the same as ref T, but using atomic reference counting, and decrements follow an extra check to drop the value when the ref count is zero
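A rough Rust sketch of how these five shapes could be modelled. The `Shape`, `Ty`, `shape_of` and `is_unboxed` names are made up for illustration, and treating a `ref` to an atomic type as the atomic shape is an assumption, not something this issue specifies.

```rust
/// The five shapes listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Shape {
    Owned,  // T: pointer sized, owned
    Int,    // 64 bits, unboxed
    Float,  // 64 bits floating point, unboxed
    Ref,    // ref T: pointer with regular reference counting
    Atomic, // atomic values such as String
}

/// Hypothetical, heavily simplified view of a concrete type.
#[derive(Debug, Clone, Copy)]
pub enum Ty {
    Int,
    Float,
    Owned { atomic: bool },
    Ref { atomic: bool },
}

/// Maps a concrete type to the shape its generic code is specialised over.
pub fn shape_of(ty: Ty) -> Shape {
    match ty {
        Ty::Int => Shape::Int,
        Ty::Float => Shape::Float,
        // Assumption: atomic types use the atomic shape whether owned or borrowed.
        Ty::Owned { atomic: true } | Ty::Ref { atomic: true } => Shape::Atomic,
        Ty::Ref { atomic: false } => Shape::Ref,
        Ty::Owned { atomic: false } => Shape::Owned,
    }
}

/// Only Int and Float are passed around as unboxed values.
pub fn is_unboxed(shape: Shape) -> bool {
    matches!(shape, Shape::Int | Shape::Float)
}
```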

Splitting generic code into these different shapes means we can get rid of pointer tagging, allow unboxed Int and Float values, and don't need to perform runtime checks to figure out how to create/destroy aliases. In addition, the compiler is free to turn types into shapes if deemed beneficial, allowing for more optimisations but without compile times going down the drain as a default.
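One way to picture the compile-time benefit is a specialisation table keyed by shapes rather than by types. The sketch below is in Rust; the `Specialisations` type, its methods, and the `"Array.push"` method name are all hypothetical.

```rust
use std::collections::HashMap;

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Shape {
    Owned,
    Int,
    Float,
    Ref,
    Atomic,
}

/// Hypothetical table of specialised copies, keyed by method name and shapes.
#[derive(Default)]
pub struct Specialisations {
    ids: HashMap<(String, Vec<Shape>), usize>,
}

impl Specialisations {
    /// Returns the ID of the copy for this method and shape list, creating a
    /// new ID only if this combination hasn't been requested before.
    pub fn get_or_create(&mut self, method: &str, shapes: &[Shape]) -> usize {
        let next = self.ids.len();
        *self
            .ids
            .entry((method.to_string(), shapes.to_vec()))
            .or_insert(next)
    }

    /// Number of distinct copies that would need to be compiled.
    pub fn count(&self) -> usize {
        self.ids.len()
    }
}
```

Two different owned heap types (say a `User` and a `Point`) both map to `Shape::Owned` and thus share one copy, while `Int` gets its own.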

A challenge here is casting Int and Float to trait objects. Such values are typed as T and thus treated as heap objects. This requires that we box Int and Float upon a cast.

In addition, generic types containing Int and Float can't be compatible with generic types containing traits implemented by Int and Float (e.g. Array[Int] isn't compatible with Array[ToString]), because we'd be passing unboxed values where boxed values are expected. This is fine when passing e.g. Int directly to a ToString, as in such a case the conversion is trivial.
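Rust has the same distinction, which makes for a convenient analogy. This is only an analogy, as Rust's trait objects are laid out differently from Inko's; `Display` stands in for ToString here and both function names are hypothetical.

```rust
use std::fmt::Display;

/// Casting a single unboxed integer to a trait object is trivial: the value
/// is moved to the heap and a pointer to it is used from then on.
pub fn cast_to_trait(value: i64) -> Box<dyn Display> {
    Box::new(value)
}

/// An array of unboxed integers can't be reinterpreted as an array of trait
/// objects. Every element has to be boxed, producing a new array.
pub fn cast_all(values: Vec<i64>) -> Vec<Box<dyn Display>> {
    values.into_iter().map(cast_to_trait).collect()
}
```

The single-value cast is cheap and local, while the array conversion costs an allocation per element, which is why the two array types aren't treated as compatible.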

Similarly, if we have a value typed as a trait it might need atomic reference counting, such as when a function takes a ToString and is given a String. What we probably need to do is insert a runtime check that checks the object header, and chooses the right strategy based on whether the value needs atomic reference counting or not. This can be avoided if we statically know the trait isn't implemented by any atomic types.
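A minimal sketch of such a runtime check, written in Rust. The `Header` layout, the `atomic` flag and the method names are assumptions made for illustration and don't reflect Inko's actual object header.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical object header: a flag plus a reference count.
pub struct Header {
    atomic: bool,
    refs: AtomicU32,
}

impl Header {
    pub fn new(atomic: bool, refs: u32) -> Self {
        Header { atomic, refs: AtomicU32::new(refs) }
    }

    pub fn count(&self) -> u32 {
        self.refs.load(Ordering::Acquire)
    }

    /// Creates an alias, picking the strategy based on the header flag.
    pub fn increment(&self) {
        if self.atomic {
            self.refs.fetch_add(1, Ordering::Relaxed);
        } else {
            // Separate load and store, standing in for a regular
            // (non-atomic) increment.
            let n = self.refs.load(Ordering::Relaxed);
            self.refs.store(n.wrapping_add(1), Ordering::Relaxed);
        }
    }

    /// Drops an alias. Returns true if the value itself must now be dropped,
    /// which only applies to atomic values.
    pub fn decrement(&self) -> bool {
        if self.atomic {
            // fetch_sub returns the previous count, so 1 means we hit zero.
            self.refs.fetch_sub(1, Ordering::AcqRel) == 1
        } else {
            let n = self.refs.load(Ordering::Relaxed);
            self.refs.store(n.wrapping_sub(1), Ordering::Relaxed);
            false
        }
    }
}
```

The branch on `atomic` is the cost this paragraph refers to: it runs on every alias creation and drop unless the compiler can prove which strategy applies.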

Finally, for this to work we'll need to implement Array in Inko, such that the compiler/generated code has control over the memory layout, instead of this being dictated by the runtime library. This is outlined in #349.

The end result is that we can get rid of all runtime bookkeeping for generics and improve performance, without dramatically increasing compile times.

Related work

https://github.com/golang/proposal/blob/master/design/generics-implementation-gcshape.md

Depends on

- Consider implementing ByteArray and Array in Inko with a set of primitive instructions #349
- Type-safe C FFI #290


This adds support for linking against C libraries and using their types and functions. The FFI is fairly strict and a bit limited, such as not performing automatic type conversions, but this is a deliberate choice: it keeps the compiler's complexity at a reasonable level, and it should (hopefully) further drive home the idea that one should avoid interfacing with C as much as they can, as all of Inko's safety guarantees are thrown out of the window when doing so.

In preparation for #525, some runtime functions return an i64 instead of an Int, while some still return an Int. This is a little inconsistent, but the goal is to try and reduce the amount of explicit type casts that we may need to change when said specialisation is implemented. Once implemented, Int64 will just be an alias for Int (or maybe I'll remove it entirely, not sure yet).

As part of this work, the precedence of type casts (`x as Type`) is changed to be the same as binary operators. This means expressions such as `x as Foo + y` are now valid, instead of resulting in a syntax error. This makes working with type casts (something you'll need to use more often when working with C code) less painful.

This commit also introduces support for conditional compilation at the import level. Based on the build target, a set of build tags is produced. `import` statements support a condition, and when given the `import` is only included if all the tags match. For example:

    import foo if mac and amd64

This would only import `foo` if the tags "mac" and "amd64" are present. OR and NOT expressions aren't supported, as one can simply use multiple imports.

This fixes #290.

Changelog: added
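The tag matching rule for conditional imports can be sketched as follows. This is a simplified Rust illustration and not the compiler's parser; `parse_condition` and `import_enabled` are hypothetical names, and the string splitting is a stand-in for real parsing.

```rust
use std::collections::HashSet;

/// Extracts the tags from the condition of an import statement, if any.
/// Only `and` is handled, as OR and NOT expressions aren't supported.
pub fn parse_condition(statement: &str) -> Vec<&str> {
    match statement.split_once(" if ") {
        Some((_, condition)) => condition.split(" and ").map(str::trim).collect(),
        None => Vec::new(),
    }
}

/// An import is included only if every tag in its condition is present in
/// the build tags. An import without a condition is always included.
pub fn import_enabled(condition: &[&str], build_tags: &HashSet<&str>) -> bool {
    condition.iter().all(|tag| build_tags.contains(tag))
}
```

With build tags containing "mac" and "amd64", the statement `import foo if mac and amd64` is included, while `import foo if linux and amd64` is not.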

The Array type is no longer defined in the runtime and exposed to the standard library, instead the standard library defines the type entirely, using the C FFI for raw memory allocations. This is a step towards being able to implement a better way of compiling generics (see #525), as doing so requires creating specialised copies of the Array type.

As part of this, array literals are desugared into regular method calls, meaning this:

    [10, 20]

Is now turned into this:

    {
      let temp = Array.with_capacity(3)
      temp.push(10)
      temp.push(20)
      temp
    }

This removes a bunch of duplication/redundant compiler logic.

Array constants in turn are changed to use statically allocated memory for their buffers (the actual Inko object is still heap allocated). The permanent flag is no longer set, instead we return a pointer with the reference bit set. While the reference count may overflow/underflow, that's OK because we never actually check it and drop the constant. The idea is to try and get rid of this permanent flag entirely, provided I can find a way to not need it for string constants, as we need to prevent dropping them when their reference count reaches zero.

Integer and float literals no longer use constants. For floats that means heap allocations for float literals, but this is only temporary until we implement specialisation, at which point both Int and Float become stack/immediate values. This change makes some parts of LLVM code generation a bit easier to work with.

The ByteArray type is still implemented in the runtime, as various Rust functions we use specifically expect a `Vec<u8>`. Until we replace all of that (or find another workaround), ByteArray remains defined in the runtime library.

This fixes #349.

Changelog: changed

yorickpeterse · 2023-07-26T22:14:05Z

> Similarly, if we have a value typed as a trait it might need atomic reference counting, such as when a function takes a ToString and is given a String. What we probably need to do is insert a runtime check that checks the object header, and chooses the right strategy based on whether the value needs atomic reference counting or not. This can be avoided if we statically know the trait isn't implemented by any atomic types.

Instead I think we need to take a different approach:

If an argument is typed as SomeTrait, and this trait is implemented by a type that uses atomic reference counting, we also specialise over the "atomic" shape. Thus, arg: SomeTrait or arg: ref SomeTrait is effectively the same as arg: T / arg: ref T where T: SomeTrait.
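As a rough Rust sketch of that rule: given the types implementing a trait, the set of shapes to specialise a trait-typed argument over could be computed like this. The `Implementor` struct and `shapes_for_trait_arg` function are hypothetical, and Int/Float are left out on the assumption that they're boxed when cast to a trait.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Shape {
    Owned,
    Ref,
    Atomic,
}

/// Hypothetical record of a type implementing the trait in question.
pub struct Implementor {
    pub name: &'static str,
    pub atomic: bool,
}

/// Shapes to specialise over for `arg: SomeTrait` (is_ref = false) or
/// `arg: ref SomeTrait` (is_ref = true).
pub fn shapes_for_trait_arg(is_ref: bool, implementors: &[Implementor]) -> Vec<Shape> {
    let mut shapes = vec![if is_ref { Shape::Ref } else { Shape::Owned }];

    // Only add the atomic shape if some implementor actually needs it.
    if implementors.iter().any(|typ| typ.atomic) {
        shapes.push(Shape::Atomic);
    }

    shapes
}
```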

Another option is that we simply disallow traits in arguments, and instead force users to go through generics. This would lead to slightly more verbose code, but also keep things consistent. It probably also solves some covariance problems down the line.

yorickpeterse added the feature, performance, and compiler labels May 15, 2023

yorickpeterse added this to the 0.12.0 milestone May 19, 2023

yorickpeterse self-assigned this May 24, 2023

yorickpeterse modified the milestones: 0.12.0, 0.13.0 May 30, 2023

yorickpeterse mentioned this issue Jun 13, 2023

Type-safe C FFI #290

Closed

10 tasks

yorickpeterse mentioned this issue Oct 1, 2023

Specialize generic types and methods #604

Merged

yorickpeterse closed this as completed in #604 Oct 13, 2023
