Specialise generic functions and types over kinds/shapes #525

Closed
yorickpeterse opened this issue May 15, 2023 · 1 comment · Fixed by #604

yorickpeterse commented May 15, 2023

Description

Generic functions are currently compiled in a subpar manner. #508 keeps this approach in an attempt to keep the scope of changes from expanding even more, but it's not something I want to keep in the long term.

The usual approach is to specialise generic types/functions across types, essentially copy-pasting these types and substituting the generic type parameters with the real types. This can result in great runtime performance, but it also increases compile times drastically, especially when paired with the already slow LLVM. In the case of Inko this would also be wasteful, as a lot of data will be addressed through pointers and thus have the same memory layout (e.g. Array[User] and Array[String] would have the same layout).

Instead, I propose we specialise across kinds/shapes. These shapes specify both the memory layout, and how to handle aliasing. We'd start out with the following shapes:

  • T: pointer sized, owned
  • Int: 64 bits unboxed, passed around on the stack
  • Float: 64 bits floating point, unboxed, passed around on the stack using the appropriate floating point registers
  • ref T: just a pointer memory wise, but creating/dropping aliases involves changing the reference count
  • T where T is an atomic value (e.g. String): basically the same as ref T, but using atomic reference counting, and decrements are followed by an extra check to drop the value when the ref count is zero

Splitting generic code into these different shapes means we can get rid of pointer tagging, allow unboxed Int and Float values, and don't need to perform runtime checks to figure out how to create/destroy aliases. In addition, the compiler is free to turn types into shapes if deemed beneficial, allowing for more optimisations but without compile times going down the drain as a default.
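As a rough illustration of the idea above, here is a minimal Rust sketch that groups type arguments by shape, so that one copy of a generic is compiled per shape rather than per type. The `Shape` and `Type` enums and the `group_by_shape` function are hypothetical names made up for this example and are not taken from the Inko compiler.

```rust
use std::collections::BTreeMap;

/// The shapes listed above (illustrative names, not the compiler's).
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Shape {
    Owned,  // `T`: pointer sized, owned
    Int,    // 64 bits, unboxed
    Float,  // 64 bits floating point, unboxed
    Ref,    // `ref T`: a pointer, aliasing changes the reference count
    Atomic, // atomic values such as `String`
}

/// A tiny stand-in for the compiler's type representation (assumption).
#[derive(Clone, Copy, Debug)]
enum Type<'a> {
    Int,
    Float,
    Owned(&'a str),
    Atomic(&'a str),
    Ref(&'a str),
}

impl<'a> Type<'a> {
    /// Maps a concrete type to the shape it is specialised over.
    fn shape(self) -> Shape {
        match self {
            Type::Int => Shape::Int,
            Type::Float => Shape::Float,
            Type::Owned(_) => Shape::Owned,
            Type::Atomic(_) => Shape::Atomic,
            Type::Ref(_) => Shape::Ref,
        }
    }

    fn name(self) -> String {
        match self {
            Type::Int => "Int".to_string(),
            Type::Float => "Float".to_string(),
            Type::Owned(name) | Type::Atomic(name) => name.to_string(),
            Type::Ref(name) => format!("ref {}", name),
        }
    }
}

/// Groups the type arguments of a generic by shape. Each key corresponds to
/// one specialised copy; every type listed under it shares that copy.
fn group_by_shape(type_args: &[Type]) -> BTreeMap<Shape, Vec<String>> {
    let mut groups = BTreeMap::new();

    for typ in type_args {
        groups
            .entry(typ.shape())
            .or_insert_with(Vec::new)
            .push(typ.name());
    }

    groups
}
```

In this sketch, passing `Type::Owned("User")` and `Type::Owned("Thing")` to `group_by_shape` yields a single `Shape::Owned` entry containing both names, whereas specialising per type would produce two copies.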

A challenge here is casting Int and Float to trait objects. Such values are typed as T and thus treated as a heap object. This requires that we box Int and Float upon a cast.
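The boxing requirement can be illustrated with a loose Rust analogy. Here `i64` stands in for an unboxed Int, `Box<dyn ToText>` stands in for a heap-allocated trait object, and `ToText`, `double`, `cast_to_trait` and `describe` are hypothetical names invented for this sketch. It is not how Inko represents these values.

```rust
/// A hypothetical trait standing in for something like ToString.
trait ToText {
    fn to_text(&self) -> String;
}

impl ToText for i64 {
    fn to_text(&self) -> String {
        self.to_string()
    }
}

/// Code specialised for the Int shape works on the unboxed value directly.
fn double(value: i64) -> i64 {
    value * 2
}

/// Casting to a trait object needs a heap value, so the integer is boxed
/// at the cast site.
fn cast_to_trait(value: i64) -> Box<dyn ToText> {
    Box::new(value)
}

/// Code typed against the trait only ever sees the boxed representation.
fn describe(value: &dyn ToText) -> String {
    value.to_text()
}
```

For example, `describe(&*cast_to_trait(double(21)))` keeps the arithmetic unboxed and only allocates when the value crosses into trait-typed code.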

In addition, generic types containing Int and Float can't be compatible with generic types containing traits implemented by Int and Float (e.g. Array[Int] isn't compatible with Array[ToString]), because we'd be passing unboxed values where boxed values are expected. This is fine when passing e.g. Int directly to a ToString, as in such a case the conversion is trivial.

Similarly, if we have a value typed as a trait it might need atomic reference counting, such as when a function takes a ToString and is given a String. What we probably need to do is insert a runtime check that checks the object header, and chooses the right strategy based on whether the value needs atomic reference counting or not. This can be avoided if we statically know the trait isn't implemented by any atomic types.
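A minimal sketch of such a runtime check, written in Rust under several assumptions: the `Header` layout, the `atomic` flag and the method names are all hypothetical, and the non-atomic path is modelled with relaxed loads and stores purely to keep the example safe Rust.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// A hypothetical object header: a flag saying whether the value uses
/// atomic reference counting, plus the count itself.
struct Header {
    atomic: bool,
    refs: AtomicU32,
}

impl Header {
    fn new(atomic: bool, refs: u32) -> Self {
        Header { atomic, refs: AtomicU32::new(refs) }
    }

    fn count(&self) -> u32 {
        self.refs.load(Ordering::Acquire)
    }

    /// Creates an alias, picking the strategy by inspecting the header.
    fn increment(&self) {
        if self.atomic {
            self.refs.fetch_add(1, Ordering::AcqRel);
        } else {
            // Plain read-modify-write: only valid when the value is never
            // shared between threads.
            let current = self.refs.load(Ordering::Relaxed);
            self.refs.store(current + 1, Ordering::Relaxed);
        }
    }

    /// Drops an alias. Returns true if the caller should drop the value,
    /// which in this sketch only happens for atomic values whose count
    /// reaches zero.
    fn decrement(&self) -> bool {
        if self.atomic {
            // fetch_sub returns the previous value.
            self.refs.fetch_sub(1, Ordering::AcqRel) == 1
        } else {
            let current = self.refs.load(Ordering::Relaxed);
            self.refs.store(current - 1, Ordering::Relaxed);
            false
        }
    }
}
```

The branch on `self.atomic` is the per-operation cost that the paragraph above hopes to avoid when the compiler can prove statically that no atomic type implements the trait.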

Finally, for this to work we'll need to implement Array in Inko, such that the compiler/generated code has control over the memory layout, instead of this being dictated by the runtime library. This is outlined in #349.

The end result is that we can get rid of all runtime bookkeeping for generics and improve performance, without dramatically increasing compile times.

Related work

https://github.com/golang/proposal/blob/master/design/generics-implementation-gcshape.md

Depends on

yorickpeterse added the feature, performance, and compiler labels May 15, 2023
yorickpeterse added this to the 0.12.0 milestone May 19, 2023
yorickpeterse self-assigned this May 24, 2023
yorickpeterse modified the milestones: 0.12.0, 0.13.0 May 30, 2023
yorickpeterse mentioned this issue Jun 13, 2023
10 tasks
yorickpeterse added a commit that referenced this issue Jul 5, 2023
This adds support for linking against C libraries and using their types
and functions. The FFI is fairly strict and a bit limited, such as not
performing automatic type conversions, but this is a deliberate choice:
it keeps the compiler's complexity at a reasonable level, and it should
(hopefully) further drive home the idea that one should avoid
interfacing with C as much as they can, as all of Inko's safety
guarantees are thrown out of the window when doing so.

In preparation for #525, some
runtime functions return an i64 instead of an Int, while some still
return an Int. This is a little inconsistent, but the goal is to try and
reduce the amount of explicit type casts that we may need to change when
said specialisation is implemented. Once implemented, Int64 will just be
an alias for Int (or maybe I'll remove it entirely, not sure yet).

As part of this work, the precedence of type casts (`x as Type`) is
changed to be the same as binary operators. This means expressions such
as `x as Foo + y` are now valid, instead of resulting in a syntax error.
This makes working with type casts (something you'll need to use more
often when working with C code) less painful.

This commit also introduces support for conditional compilation at the
import level. Based on the build target, a set of build tags is
produced. `import` statements support a condition, and when given the
`import` is only included if all the tags match. For example:

    import foo if mac and amd64

This would only import `foo` if the tags "mac" and "amd64" are present.
OR and NOT expressions aren't supported, as one can simply use multiple
imports.
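
The matching rule described above (an import is included only if all of its tags are present) can be sketched in Rust as follows. The function name `import_enabled` and the way tags are passed around are assumptions made for this example, not the compiler's actual API.

```rust
use std::collections::HashSet;

/// Returns true if every tag required by an import's condition is part of
/// the build tags derived from the target. An import without a condition
/// (no required tags) is always included.
fn import_enabled(required: &[&str], build_tags: &HashSet<&str>) -> bool {
    required.iter().all(|tag| build_tags.contains(tag))
}
```

With build tags "mac" and "amd64", `import_enabled(&["mac", "amd64"], &tags)` holds, while a condition naming a tag that is absent does not.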

This fixes #290.

Changelog: added
yorickpeterse added a commit that referenced this issue Jul 7, 2023
yorickpeterse added a commit that referenced this issue Jul 8, 2023
yorickpeterse added a commit that referenced this issue Jul 9, 2023
yorickpeterse added a commit that referenced this issue Jul 9, 2023
yorickpeterse added a commit that referenced this issue Jul 14, 2023
The Array type is no longer defined in the runtime and exposed to the
standard library. Instead, the standard library defines the type
entirely, using the C FFI for raw memory allocations. This is a step
towards being able to implement a better way of compiling generics (see
#525), as doing so requires
creating specialised copies of the Array type.

As part of this, array literals are desugared into regular method calls,
meaning this:

    [10, 20]

Is now turned into this:

    {
      let temp = Array.with_capacity(2)

      temp.push(10)
      temp.push(20)
      temp
    }

This removes a bunch of duplication/redundant compiler logic.
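
A minimal Rust sketch of this kind of desugaring, operating on source text for simplicity. The function `desugar_array_literal` is hypothetical, and it assumes the capacity is set to the number of elements in the literal; the real compiler works on its own intermediate representation rather than strings.

```rust
/// Lowers the elements of an array literal into the statements of the
/// equivalent block: allocate with the right capacity, push each element,
/// then return the temporary.
fn desugar_array_literal(elements: &[&str]) -> Vec<String> {
    let mut lines = Vec::with_capacity(elements.len() + 2);

    lines.push(format!("let temp = Array.with_capacity({})", elements.len()));

    for element in elements {
        lines.push(format!("temp.push({})", element));
    }

    lines.push("temp".to_string());
    lines
}
```

Because the literal becomes ordinary method calls, the later compilation stages need no array-specific handling.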

Array constants in turn are changed to use statically allocated memory
for their buffers (the actual Inko object is still heap allocated). The
permanent flag is no longer set, instead we return a pointer with the
reference bit set. While the reference count may overflow/underflow,
that's OK because we never actually check it and drop the constant. The
idea is to try and get rid of this permanent flag entirely, provided I
can find a way to not need it for string constants, as we need to
prevent dropping them when their reference count reaches zero.

Integer and float literals no longer use constants. For floats that
means heap allocations for float literals, but this is only temporary
until we implement specialisation, at which point both Int and Float
become stack/immediate values. This change makes some parts of LLVM code
generation a bit easier to work with.

The ByteArray type is still implemented in the runtime, as various Rust
functions we use specifically expect a `Vec<u8>`. Until we replace all
of that (or find another workaround), ByteArray remains defined in the
runtime library.

This fixes #349.

Changelog: changed
yorickpeterse commented Jul 26, 2023

> Similarly, if we have a value typed as a trait it might need atomic reference counting, such as when a function takes a ToString and is given a String. What we probably need to do is insert a runtime check that checks the object header, and chooses the right strategy based on whether the value needs atomic reference counting or not. This can be avoided if we statically know the trait isn't implemented by any atomic types.

Instead I think we need to take a different approach:

If an argument is typed as SomeTrait, and this trait is implemented by a type that uses atomic reference counting, we also specialise over the "atomic" shape. Thus, arg: SomeTrait or arg: ref SomeTrait is effectively the same as arg: T / arg: ref T where T: SomeTrait.
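One way to picture this is a function that derives the set of shapes to specialise a trait-typed argument over from the types implementing the trait. The Rust sketch below uses hypothetical `Shape`, `Implementor` and `shapes_for_trait_argument` names and models only the owned and atomic cases.

```rust
use std::collections::BTreeSet;

#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Shape {
    Owned,
    Atomic,
}

/// A type implementing the trait (a simplified, hypothetical model).
struct Implementor {
    atomic: bool,
}

/// Computes the shapes an argument typed as the trait is specialised over:
/// the "atomic" shape is only included if at least one implementing type
/// uses atomic reference counting.
fn shapes_for_trait_argument(implementors: &[Implementor]) -> BTreeSet<Shape> {
    implementors
        .iter()
        .map(|imp| if imp.atomic { Shape::Atomic } else { Shape::Owned })
        .collect()
}
```

A trait implemented only by regular types would then produce a single copy, and the atomic copy appears only once a type such as String implements it.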

Another option is that we simply disallow traits in arguments, and instead force users to go through generics. This would lead to slightly more verbose code, but also keep things consistent. It probably also solves some covariance problems down the line.
