POC - Implement unsigned integers in the compiler space #3336

@WojciechMazur WojciechMazur commented Jun 14, 2023

This is a proof of concept of how we could improve the implementation of unsigned types.

Currently, the Scala compiler does not allow us to implement a first-class, efficient unsigned type in the user space:

  • An implementation based on value classes (AnyVal) would not allow defining other value classes that use unsigned types as their underlying type
  • An implementation based on opaque types would not let us write type-safe code in an erased context, and therefore would not allow safe use of unsigned types in pattern matching, apart from matching on literal values:

    opaque type UInt = Int
    val x: UInt = ???
    val y: Any = x
    1.getClass // int
    x.getClass() // class java.lang.Integer
    y.getClass() // class java.lang.Integer
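To make the pattern-matching limitation concrete, here is a minimal Scala 3 sketch. The names `UnsignedSketch` and `describe` are hypothetical and exist only for this illustration; they are not part of Scala Native. It assumes nothing beyond the opaque type shown above.

```scala
// Hypothetical names, used only to illustrate the erasure problem.
object UnsignedSketch {
  opaque type UInt = Int

  object UInt {
    // Wraps raw bits; inside this object UInt and Int are the same type.
    def apply(bits: Int): UInt = bits
  }

  // In an erased context the scrutinee is just a boxed java.lang.Integer,
  // so a type test cannot tell a UInt apart from an Int.
  def describe(value: Any): String = value match {
    case _: Int => "Int"
    case _      => "something else"
  }
}
```

Calling `UnsignedSketch.describe(UnsignedSketch.UInt(1))` takes the `Int` branch, which is exactly the ambiguity described above: once the value is seen as `Any`, the unsigned-ness is gone.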

The ultimate solution to this problem would be to define first-class unsigned types in the stdlib, but SIP-26 rejected this idea. At that time, however, unsigned types were implemented in user space, mostly with value classes, and were not integrated into the main compiler and its type system.

As a workaround, I present a POC of a compiler-plugin-space implementation of unsigned types.
UInt is now defined as final abstract class NewUInt private() and contains only abstract method declarations, similar to scala.Int in the stdlib. The actual implementation of these methods is handled by the compiler, based on a registered set of primitive symbols. In the typer, NewUInt is treated as an AnyRef class; in the Scala Native backend, however, it is treated as a primitive type (AnyVal). Moving the logic into the compiler plugin risks much higher complexity and maintenance cost, but it could be a stepping stone for testing this experimental feature, and maybe for eventually moving it into the compiler.

We do not introduce a NIR primitive type for unsigned integers. At the NIR level they are still represented as Int. As in LLVM IR, the signedness of an integer is determined only by the context in which it is used; there is no dedicated uint type in LLVM.
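A small sketch of what "signedness comes from the operation, not the type" means in practice. It uses only standard `java.lang.Integer` helpers on a plain `Int`; the object name `SignednessSketch` is hypothetical and the code is an illustration, not how the plugin lowers operations.

```scala
// Hypothetical object name; the same 32 bits are read as signed or unsigned
// depending only on which operation is applied to them.
object SignednessSketch {
  val bits: Int = -1 // bit pattern 0xFFFFFFFF

  // Addition yields the same bits under both interpretations.
  val sum: Int = bits + 1

  // Division differs: signed division sees -1, unsigned division sees 4294967295.
  val signedQuotient: Int = bits / 2
  val unsignedQuotient: Int = java.lang.Integer.divideUnsigned(bits, 2)

  // Widening and comparison also need to know the intended interpretation.
  val asUnsignedLong: Long = java.lang.Integer.toUnsignedLong(bits)
  val unsignedGreater: Boolean = java.lang.Integer.compareUnsigned(bits, 1) > 0
}
```

This is why addition and multiplication in the NIR below can stay plain `iadd` / `imul` on `int`: only operations such as division, comparison and widening have to pick a signed or unsigned variant.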

The following code:

    val a = 1.u
    val b = a.toInt.U * 2.u
    val c = a + b
    val d: NewUInt = a + b + c
    val e: Any = d

would translate to the following NIR:

  %4 = imul[int] int 1, int 2     ; b = a * 2
  %5 = iadd[int] int 1, %4 : int  ; c = a + b
  %6 = iadd[int] int 1, %4 : int  ; a + b 
  %7 = iadd[int] %6 : int, %5 : int  
  %8 = module @"T32scala.scalanative.runtime.Boxes$"
  %9 = call[(@"T32scala.scalanative.runtime.Boxes$", int) => @"T38scala.scalanative.unsigned.UnsignedInt"] @"M32scala.scalanative.runtime.Boxes$D16boxToUnsignedIntiL38scala.scalanative.unsigned.UnsignedIntEO" : ptr(%8 : !?@"T32scala.scalanative.runtime.Boxes$", %7 : int)

Importantly, the literals 1.u and a.toInt.U are still literal constants at the NIR level. This was not possible in the previous implementation, in which all unsigned values were boxed by default and the optimizer might only later get rid of the boxing.

Problems with the new implementation:

  • Even though NewUInt is a primitive type at runtime, we cannot define it as AnyVal; the compiler does not allow this
  • All usages of the new unsigned integers need to be rewritten in the compiler plugin. Because the compiler does not recognize them as primitive types, we need to handle their boxing/unboxing manually after the Erasure phase. Currently all operations used in the sandbox project (including the commented-out ones) work correctly, but there is always a risk of bugs for each unhandled use case of the new unsigned types

@WojciechMazur

For comparison, here is the NIR generated for similar code using the current unsigned ints:

    val a = 1.toUInt
    val b = a.toInt.toUInt * 2.toUInt
    val c = a + b
    val d: UInt = a + b + c
    val e: Any = d

  %4 = module @"T51scala.scalanative.unsigned.package$UnsignedRichInt$"
  %5 = module @"T35scala.scalanative.unsigned.package$"
  %6 = call[(@"T35scala.scalanative.unsigned.package$", int) => int] @"M35scala.scalanative.unsigned.package$D15UnsignedRichIntiiEO" : ptr(%5 : !?@"T35scala.scalanative.unsigned.package$", int 1)
  %7 = call[(@"T51scala.scalanative.unsigned.package$UnsignedRichInt$", int) => @"T31scala.scalanative.unsigned.UInt"] @"M51scala.scalanative.unsigned.package$UnsignedRichInt$D16toUInt$extensioniL31scala.scalanative.unsigned.UIntEO" : ptr(%4 : !?@"T51scala.scalanative.unsigned.package$UnsignedRichInt$", %6 : int)
  %8 = module @"T51scala.scalanative.unsigned.package$UnsignedRichInt$"
  %9 = module @"T35scala.scalanative.unsigned.package$"
  %10 = method %7 : @"T31scala.scalanative.unsigned.UInt", "D5toIntiEO"
  %11 = call[(@"T31scala.scalanative.unsigned.UInt") => int] %10 : ptr(%7 : @"T31scala.scalanative.unsigned.UInt")
  %12 = call[(@"T35scala.scalanative.unsigned.package$", int) => int] @"M35scala.scalanative.unsigned.package$D15UnsignedRichIntiiEO" : ptr(%9 : !?@"T35scala.scalanative.unsigned.package$", %11 : int)
  %13 = call[(@"T51scala.scalanative.unsigned.package$UnsignedRichInt$", int) => @"T31scala.scalanative.unsigned.UInt"] @"M51scala.scalanative.unsigned.package$UnsignedRichInt$D16toUInt$extensioniL31scala.scalanative.unsigned.UIntEO" : ptr(%8 : !?@"T51scala.scalanative.unsigned.package$UnsignedRichInt$", %12 : int)
  %14 = module @"T51scala.scalanative.unsigned.package$UnsignedRichInt$"
  %15 = module @"T35scala.scalanative.unsigned.package$"
  %16 = call[(@"T35scala.scalanative.unsigned.package$", int) => int] @"M35scala.scalanative.unsigned.package$D15UnsignedRichIntiiEO" : ptr(%15 : !?@"T35scala.scalanative.unsigned.package$", int 2)
  %17 = call[(@"T51scala.scalanative.unsigned.package$UnsignedRichInt$", int) => @"T31scala.scalanative.unsigned.UInt"] @"M51scala.scalanative.unsigned.package$UnsignedRichInt$D16toUInt$extensioniL31scala.scalanative.unsigned.UIntEO" : ptr(%14 : !?@"T51scala.scalanative.unsigned.package$UnsignedRichInt$", %16 : int)
  %18 = method %13 : @"T31scala.scalanative.unsigned.UInt", "D6$timesL31scala.scalanative.unsigned.UIntL31scala.scalanative.unsigned.UIntEO"
  %19 = call[(@"T31scala.scalanative.unsigned.UInt", @"T31scala.scalanative.unsigned.UInt") => @"T31scala.scalanative.unsigned.UInt"] %18 : ptr(%13 : @"T31scala.scalanative.unsigned.UInt", %17 : @"T31scala.scalanative.unsigned.UInt")
  %20 = method %7 : @"T31scala.scalanative.unsigned.UInt", "D5$plusL31scala.scalanative.unsigned.UIntL31scala.scalanative.unsigned.UIntEO"
  %21 = call[(@"T31scala.scalanative.unsigned.UInt", @"T31scala.scalanative.unsigned.UInt") => @"T31scala.scalanative.unsigned.UInt"] %20 : ptr(%7 : @"T31scala.scalanative.unsigned.UInt", %19 : @"T31scala.scalanative.unsigned.UInt")
  %22 = method %7 : @"T31scala.scalanative.unsigned.UInt", "D5$plusL31scala.scalanative.unsigned.UIntL31scala.scalanative.unsigned.UIntEO"
  %23 = call[(@"T31scala.scalanative.unsigned.UInt", @"T31scala.scalanative.unsigned.UInt") => @"T31scala.scalanative.unsigned.UInt"] %22 : ptr(%7 : @"T31scala.scalanative.unsigned.UInt", %19 : @"T31scala.scalanative.unsigned.UInt")
  %24 = method %23 : @"T31scala.scalanative.unsigned.UInt", "D5$plusL31scala.scalanative.unsigned.UIntL31scala.scalanative.unsigned.UIntEO"
  %25 = call[(@"T31scala.scalanative.unsigned.UInt", @"T31scala.scalanative.unsigned.UInt") => @"T31scala.scalanative.unsigned.UInt"] %24 : ptr(%23 : @"T31scala.scalanative.unsigned.UInt", %21 : @"T31scala.scalanative.unsigned.UInt")

@WojciechMazur

We've discussed that this implementation would be difficult to maintain in the long run. It might be easier with special handling in the compiler itself, but that is not possible until the project is stabilized (release 1.0) and scala-native is integrated into the Scala compiler at some point in the future.
As an alternative, we currently opt for:

  • implementing unsigned types using inline classes, as discussed in the pre-SNIP
  • optionally adding special handling that transforms integer literals (1.toUInt) directly into NIR primitive values, to skip unnecessary boxing without
  • reducing the number of places where unsigned types are required, especially the stackalloc / alloc instructions. Unsigned types should mostly be used where the extra sign bit is actually required, and for C interop. stackalloc / alloc should re-introduce the signed variants, optionally with compile-time checks via refinements.
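As a rough illustration of the last point, a signed size parameter can still be guarded at compile time. The sketch below does not use refinement types; it swaps in a plain Scala 3 `inline if` with `scala.compiletime.error`, and `AllocSketch.checkedSize` is a hypothetical helper, not an existing Scala Native API.

```scala
import scala.compiletime.error

// Hypothetical helper, not part of Scala Native: accepts a signed Int size
// and rejects negative constants during compilation.
object AllocSketch {
  inline def checkedSize(inline n: Int): Int =
    // `inline if` must be reducible at compile time, so this sketch only
    // works when the argument is a compile-time constant.
    inline if n < 0 then error("allocation size must be non-negative")
    else n
}
```

With this shape, `AllocSketch.checkedSize(16)` compiles down to the constant, while `AllocSketch.checkedSize(-1)` fails to compile with the message above. Sizes that are only known at runtime would need a separate runtime check, which this sketch does not cover.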

@WojciechMazur

A bunch of possible improvements were merged in #3375. Since a full implementation of this PoC would not be maintainable, I'm closing this issue. It might be reopened some day if Dotty allows implementing custom primitive types, or when the compiler plugin is integrated with the Scala compiler.
