Unsigned integer types #135

Open
ilya-g opened this issue Jul 13, 2018 · 46 comments

Comments

@ilya-g
Member

ilya-g commented Jul 13, 2018

This issue is for discussion of the proposal to introduce unsigned integer types in Kotlin.

@elect86

elect86 commented Jul 16, 2018

I have many use cases where I need Number as a generic bound, and some of them also involve unsigned types.

Since this proposal doesn't contemplate extending the Number abstract class, and supposing that inline classes may implement interfaces, I'd like to ask whether it would be possible to convert Number into an interface so that unsigned numbers can implement it as well.

@zarechenskiy
Contributor

zarechenskiy commented Jul 16, 2018

@elect86 Could you please elaborate on your use cases? Maybe it would be more convenient if we introduced a new common supertype for unsigned types, for example, UnsignedNumber?

Also, note that if we decide to add Number as a supertype for unsigned types, we'll have to add several conversion methods for unsigned types (similar to toByte) to it as member or extension functions

@elect86

elect86 commented Jul 16, 2018

Sure, for example in imgui I have a generic inputScalar accepting a reference to a Number (it's actually * to reduce casts around the lib), which can be Float, Double or Int.

I'd like to be able to include unsigned int as well.

UnsignedNumber would help only in those cases where I need a generic bound on unsigned numbers. I have had a couple of those use cases too, but the most common one for me is unequivocally the Number case.

Those conversion methods actually make sense, I'd say.

An unsigned Int is actually a Number, and it should have its corresponding Byte representation. If one then reads it as signed, that is the user's fault.

@voddan
Contributor

voddan commented Jul 21, 2018

Arithmetic and comparison operations that mix signed and unsigned operands are not provided

What would the analog of x = x + step be if x is unsigned and step can be negative? I hope it is not something like x = if(step > 0) x + step.toUInt() else x - step.abs().toUInt()

@ilya-g
Member Author

ilya-g commented Jul 23, 2018

@voddan The analog is just x = x + step.toUInt(): adding a negative number gives the same result as adding its unsigned two's complement. Take a look, for example, at how the iterator of UIntProgression is implemented: https://github.com/JetBrains/kotlin/blob/1.3-M1/libraries/stdlib/unsigned/src/kotlin/UIntRange.kt#L107.
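A minimal sketch of the point above, assuming a hypothetical helper named advance (it is not part of the standard library). It shows that adding step.toUInt() moves an unsigned value forward or backward, because the wraparound of unsigned addition cancels the two's complement reinterpretation of a negative step.

```kotlin
// Hypothetical helper: move an unsigned position by a signed step.
// A negative step is reinterpreted as a large UInt; the unsigned
// addition then wraps around, which amounts to a subtraction.
fun advance(x: UInt, step: Int): UInt = x + step.toUInt()

fun main() {
    println(advance(10u, 5))   // 15
    println(advance(10u, -3))  // 7
}
```

Note that this relies on wraparound: advance(1u, -2) yields UInt.MAX_VALUE rather than an error.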

@voddan
Contributor

voddan commented Jul 23, 2018

@ilya-g Oh, of course! Silly of me to forget how toUInt works. Thanks!

@elect86

elect86 commented Jul 26, 2018

Playing with them I saw that data is actually private. Please, don't do that.

If you leave it accessible, you give people (me) the possibility to access it and write convenient operator overloads for arithmetic operations involving signed and unsigned operands

e.g.: operator fun Byte.plus(b: UByte) = (this + b.data).toByte()

@ilya-g
Member Author

ilya-g commented Jul 26, 2018

@elect86 You can use UByte.toByte() function to reinterpret unsigned byte as signed.
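As a sketch of that suggestion, the operator from the previous comment can be written without access to the private data field. The operator below is a user-defined extension for illustration only, not a standard library function.

```kotlin
// Hypothetical user-defined operator; not part of the standard library.
// UByte.toByte() reinterprets the same 8 bits as a signed Byte.
operator fun Byte.plus(b: UByte): Byte = (this + b.toByte()).toByte()

fun main() {
    val signed: Byte = 1
    val unsigned: UByte = 255u   // same bits as (-1).toByte()
    println(signed + unsigned)   // 0
}
```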

@elect86

elect86 commented Jul 26, 2018

Yeah, you are right. Anyway, what are the reasons behind making it private?

@gildor
Contributor

gildor commented Jul 27, 2018

@elect86 consistency with other numerical types? Byte is also just an int under the hood on JVM

@rossdanderson

rossdanderson commented Jul 28, 2018

I've come across two potentially related issues when using the unsigned values in the latest 1.3-M1 release along with Delegation (it may affect other scenarios, such as other inline classes, but I've not seen any yet)

I've raised https://youtrack.jetbrains.net/issue/KT-25784 and https://youtrack.jetbrains.net/issue/KT-25785 with examples

@zarechenskiy
Contributor

@rossdanderson Thank you! We'll try to address these issues soon

@mquigley

mquigley commented Jul 29, 2018

There is no toString(radix: Int) on any of ULong, UInt, etc., even though internally there are functions to do so, such as internal fun ulongToString(v: Long, base: Int): String. Can I file an issue, or is this so simple it'll just get handled?

@ilya-g
Member Author

ilya-g commented Jul 29, 2018

@mquigley The string-to-number and number-to-string conversions are planned for the next milestones.

@stangls

stangls commented Aug 2, 2018

For int arithmetic with floats and double we have the logic that
val x = 1 + 1f results in x being a float with a value of 2f
val x = 1 + 1.0 results in x being a double with a value of 2.0

Any reason why this widening operation should be different for unsigned ints? I.e. when 1u represents an unsigned int of value 1:
val x = 1u + 1 ⇒ x is 2
val x = 1u + 1f ⇒ x is 2f
val x = 1u + 1.0 ⇒ x is 2.0

It just seems logical, because
val x = 1u + (-2) ⇒ x is simply -1 instead of 2^32-1
If you really want to do narrow arithmetic with negative numbers and unsigned ints, you can still cast:
val x = ( 1u + (-2) ).toUInt() ⇒ x is then a UInt of 2^32-1

To me it seems more consistent.

@cretz

cretz commented Aug 3, 2018

I saw extensions like toByteArray mentioned in the proposal but I cannot find the implementation. Also, surely it would return the underlying array and not copy it? I mentioned this on the Kotlin forum just now: to do fast copies between already-existing arrays of unsigned types, people need access to the raw array for System.arraycopy. That makes me think private val storage: IntArray should become public val signedArray: IntArray or, even better, since Collection is an interface, UIntArray and friends should be inline classes that implement Collection. However it's implemented, raw array access is needed (or a multiplatform way to do fast, potentially overlapping array copies between two already-existing arrays).

@ilya-g
Member Author

ilya-g commented Aug 3, 2018

@cretz toByteArray and similar ones will be provided in some of the next milestone releases of 1.3.

toByteArray is copying the contents to a new array, to get a view of the underlying storage array it's proposed to use asByteArray.

We should also provide some copying functions like copyOf, so that array reinterpretation wouldn't be required in common scenarios.
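A short sketch of the view versus copy distinction described above, assuming the asByteArray and toByteArray extensions as they later shipped for UByteArray. The function name viewVersusCopy is made up for this example, and the opt-in annotation is needed because unsigned arrays are marked experimental.

```kotlin
// Hypothetical demo function; returns the first two elements of the
// source array after writing through a view and through a copy.
@OptIn(ExperimentalUnsignedTypes::class)
fun viewVersusCopy(): Pair<UByte, UByte> {
    val source = ubyteArrayOf(1u, 2u, 255u)
    val view: ByteArray = source.asByteArray()  // shares storage with source
    val copy: ByteArray = source.toByteArray()  // independent copy

    view[0] = 9  // visible through source
    copy[1] = 7  // not visible through source
    return source[0] to source[1]
}

fun main() {
    println(viewVersusCopy())  // (9, 2)
}
```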

@ilya-g
Member Author

ilya-g commented Aug 3, 2018

@stangls Kotlin is a statically typed language. The return type of an arithmetic operation depends on the types of operands but not on their values. So to introduce an arithmetic operation between signed and unsigned operands, you need to specify what is the return type for a given combination of operand types.
Should UInt + Int result in Int? Then it will result in overflow when the first one is 4_000_000_000u and the second is 1.
Should UInt + Int result in UInt? Then it will overflow for 1u + (-2) .

As you can see, both ways to declare the operation may produce an overflow, and that overflow may not be what the author intended, so it can result in subtle bugs. We require the author to declare explicitly which type of operation is used, either signed or unsigned, by converting the operands to compatible types.
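The two overflow cases above can be reproduced with explicit conversions. This is only an illustrative sketch; the variable names are arbitrary.

```kotlin
fun main() {
    val big: UInt = 4_000_000_000u
    val one: Int = 1

    println(big + one.toUInt())  // 4000000001 (unsigned addition, fits in UInt)
    println(big.toInt() + one)   // -294967295 (signed addition, big does not fit in Int)
    println(big.toLong() + one)  // 4000000001 (widening to Long avoids the overflow)

    val small: UInt = 1u
    val minusTwo: Int = -2

    println(small + minusTwo.toUInt())  // 4294967295 (unsigned addition wraps around)
    println(small.toInt() + minusTwo)   // -1 (signed addition)
}
```

In each line the author picks the domain of the operation by converting one operand, which is exactly the explicitness the design asks for.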

@stangls

stangls commented Aug 4, 2018

@ilya-g : You are right, Kotlin is a statically typed language. And yes, that is exactly why you read my example right, because you read values that have an unambiguous type.
Thank you for explaining that unsigned ints are not a subset of signed ints (and vice versa). So the (future) documentation about unsigned ints should explain that we can only treat widening within unsigned types or within signed types, but never mix them because they are not a subset of each other.

@0legg

0legg commented Aug 20, 2018

Are there any plans to support (or wrap) different descendants of java.nio.Buffer?

@ilya-g
Member Author

ilya-g commented Aug 30, 2018

@0legg We have no such plans for 1.3, though nothing seems to prevent providing such support in a separate library.

@zayass

zayass commented Sep 4, 2018

Is it planned to make some kind of safe interop with java?

For example, if you export UInt to Java, you can simply divide or compare the values as int without any warning.
Or when you export UInt?, Java sees the boxed type but still has no way to invoke the proper functions to divide.

Maybe it is possible to do it with some external tool. Like android studio does with @ColorInt, @ColorRes. But it requires some additional metadata to be emitted by compiler.

@mquigley

mquigley commented Sep 4, 2018

@zayass The unsigned classes are implemented as Kotlin inline classes.

public inline class UShort internal constructor(private val data: Short)

So a UShort is seen by Java and the JVM as a short in most cases. When you use UShort in your Kotlin classes, the final Java bytecode will use either the Kotlin Short class or, more usually, the primitive short. The methods of these classes, such as UShort.compareTo(), are actually implemented as static methods.

I recommend reading all about Kotlin inline classes to get a better picture of what's going on under the hood. But you can always write a few test classes yourself.
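For readers who want to write such a test class, here is a rough sketch of an unsigned wrapper in the same spirit. MyUShort is a hypothetical stand-in, not the standard library's UShort, and the sketch assumes a JVM target and the newer @JvmInline value class syntax rather than the inline class keyword quoted above.

```kotlin
// MyUShort is a hypothetical stand-in, not the standard library's UShort.
@JvmInline
value class MyUShort(private val data: Short) : Comparable<MyUShort> {
    // Reinterpret the stored 16 bits as a value in 0..65535.
    fun toInt(): Int = data.toInt() and 0xFFFF

    // Compare by unsigned value rather than by the signed Short.
    override fun compareTo(other: MyUShort): Int = toInt().compareTo(other.toInt())
}

fun main() {
    val max = MyUShort((-1).toShort())  // bit pattern 0xFFFF
    val one = MyUShort(1)
    println(max.toInt())  // 65535
    println(max > one)    // true
}
```

Decompiling the resulting class file is one way to check when the wrapper is passed as a primitive short and when it is boxed.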

@cretz

cretz commented Sep 4, 2018

There's no actual class in the bytecode called UShort

That's not true. Commenter even referenced boxed forms. I too recommend reading all about Kotlin inline classes. Granted I'd say the Kotlin forums are better for conversations like these.

@zayass

zayass commented Sep 4, 2018

@mquigley @cretz I understand how it works and how it looks in bytecode, but I have issues with the current implementation.

Where is the proper place to discuss such issues? For example:

"Expose $Erased methods to be able to invoke it with boxed value in java"
Currently we can invoke UInt.compareTo from Java because it is an interface method, but we cannot invoke UInt.div

or

"Mark inlined values with annotation for external tooling"
For example, when UInt is translated to int in Java, mark it with @InlinedFrom(UInt.class) so that tools can restore the information from the original Kotlin code.

@zarechenskiy
Contributor

zarechenskiy commented Sep 20, 2018

@zayass Declarations inside unsigned classes and top-level declarations that have unsigned types in their signatures will be mangled. In other words, it will not be possible to use the functionality of unsigned types from Java. Please check out these sections of the inline classes KEEP for more details: overloads, non-public constructors, mangling

@zayass

zayass commented Sep 20, 2018

@zarechenskiy I understand the overloading issue when the type is inlined. But mangling leads to another issue in Kotlin-Java interop.

For example, we have a mixed Java/Kotlin/C project and want to use unsigned types for better interop with C. It works well for Kotlin<->C, but not for Java<->Kotlin<->C.

What is stranger, interface methods like compareTo(UInt) are accessible, but div and rem are not.

Why not export methods with boxed arguments to Java without mangling?

Like this

class UInt {
    public UInt div(UByte other);
    public UInt div(UShort other);
    public UInt div(UInt other);
    public ULong div(ULong other);
}

@zarechenskiy
Contributor

@zayass Because this way we'd have to double the method count for inline classes and for declarations that use inline class types in their signatures. Also, we'd love to make inline classes future-proof with regard to Project Valhalla: we'd love to make these classes value classes, and maybe at some point we'll generate similar declarations if we're sure about gradual migration.
One thing that we can do is to introduce some opt-in (compiler argument, for example) for exporting such methods with boxed inline classes.

@zayass

zayass commented Sep 20, 2018

Sounds reasonable thanks

@Danielku15

I'm not sure if this is the correct place to report this finding, but opening an issue in YouTrack seemed a bit premature to me.

I gave the feature a small try today and noticed that the unsigned type conversion functions are still missing for float and double. The types provide conversion functions to signed integral types (e.g. toByte() ) but the extensions provided by e.g. UByte are covering them:

https://github.com/JetBrains/kotlin/blob/0a2d1b409b5f52acc115e06062e99b194199a7ac/libraries/stdlib/unsigned/src/kotlin/UByte.kt#L183

I'm using kotlinc-jvm 1.3-M2. Is it planned for 1.3 to add those conversions?

@LouisCAD
Contributor

LouisCAD commented Oct 28, 2018

@Danielku15 Please, use latest, 1.3 RC-4 (1.3.0-rc-190), then tell us if it's still missing.

@Danielku15

@LouisCAD Thanks for the fast response. I pulled the latest update available on the "Early Access Preview 1.3" channel via IntelliJ just today. Even though it was announced here that 1.3-RC4 is available via the EAP channel I cannot install it from there. Am I missing something?

My Environment:

  • IntelliJ IDEA 2018.2 (Community Edition)
  • Windows 10 Pro N 1803 64bit

@LouisCAD
Contributor

LouisCAD commented Oct 28, 2018 via email

@ilya-g
Member Author

ilya-g commented Oct 28, 2018

@Danielku15 @LouisCAD Let's keep the discussion not far from unsigned types.
@Danielku15 if you have a problem updating to the latest RC version of Kotlin plugin it's better to ask about it on the forum where it was announced.

Regarding toFloat and toDouble conversions — they are still missing in 1.3.0, but we may provide them in one of 1.3.x updates. You could follow https://youtrack.jetbrains.com/issue/KT-27108 in order not to miss it.

@Danielku15

@LouisCAD @ilya-g Thanks a lot for the info. I updated everything to the EAP versions and could see that the conversions are still missing. I added a comment to the issue.

@octylFractal

Is the difference between Java's toUnsignedXXX functions and Kotlin's toUXXX intended?

println((-1).toULong().toString(16))
println(Integer.toUnsignedLong(-1).toString(16))
println((Integer.MIN_VALUE).toULong().toString(16))
println(Integer.toUnsignedLong(Integer.MIN_VALUE).toString(16))

Output:

ffffffffffffffff
ffffffff
ffffffff80000000
80000000

It looks like Kotlin assumes you want to convert your signed int to an unsigned long, whereas Java assumes that the integer you provide is an unsigned int, and you're converting it to an unsigned long.

This was surprising to me, because I typically want to take some Java API's output and convert it to UXXX for Kotlin type-safety, but it seems that would be more complicated, something like toUXXX() and UXXX.MAX_VALUE. If there are no plans to change it, could this difference between the two similar APIs be made more visible?

@ilya-g
Member Author

ilya-g commented Dec 26, 2018

@kenzierocks Yes, the difference is intentional.

The initial implementation of unsigned types had Int.toULong operation implemented same as Integer.toUnsignedLong in Java, i.e. zero extending. But in the process of design refinement we decided to stick with the sign extending behavior, that is common in other languages with unsigned types, such as C, Rust and Go.

The reason why Integer.toUnsignedLong is implemented as zero extending is because there are no distinct unsigned types in Java, so to operate with unsigned values JDK provides methods that have to assume that a signed integer parameter actually contains an unsigned integer value (see the Algorithms involving unsigned integers use case and the linked article there).

This approach is not required when the language clearly distinguishes unsigned types from signed ones. In Kotlin you do not need to assume that a signed Int contains an unsigned value and reinterpret it as Long to get something meaningful from it. You just need to convert it to an unsigned int with Int.toUInt(), and then you can work with that value as unsigned. Only then, if you need to operate in a wider domain, do you extend the unsigned int to ULong or Long.

So to summarize:

  • if you need to reinterpret signed value as unsigned, use Int.toUInt()
  • if you need to reinterpret value as unsigned and then convert to a bigger unsigned type use Int.toUInt().toULong()
  • if you need a complete replacement of toUnsignedLong, use Int.toUInt().toLong()
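The three cases in this summary can be checked with a small sketch. It assumes the toString(radix) extensions for unsigned types that were added to the standard library after this discussion.

```kotlin
fun main() {
    val x = -1

    // Reinterpret the 32 bits as unsigned.
    println(x.toUInt())                         // 4294967295

    // Direct Int.toULong() sign-extends first, so all 64 bits are set.
    println(x.toULong().toString(16))           // ffffffffffffffff

    // Reinterpret first, then widen: zero-extension.
    println(x.toUInt().toULong().toString(16))  // ffffffff

    // Same value that Java's Integer.toUnsignedLong(-1) returns.
    println(x.toUInt().toLong())                // 4294967295
}
```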

@voddan
Contributor

voddan commented Dec 27, 2018

So to summarize:

  • if you need to reinterpret signed value as unsigned, use Int.toUInt()
  • if you need to reinterpret value as unsigned and then convert to a bigger unsigned type use Int.toUInt().toULong()
  • if you need a complete replacement of toUnsignedLong, use Int.toUInt().toLong()

Nicely put! Please make sure to include this summary into the final documentation for unsigned arithmetic ;)

@davidbarkhuizen

Hi All.

Thanks for all the work on this, and I would just like to highlight a use-case.

I'm currently busy with EMV development, and having an unsigned byte primitive (and associated array types) is essential to directly handling byte and bit-level manipulations.

I would be very sad to see support for unsigned types dropped in future, and I imagine so would everyone else who is looking to do bit-level work in Kotlin.

Again, thanks for this important contribution to an excellent language.

@matt-quigley-ck

matt-quigley-ck commented Apr 12, 2019

I'd like to leave a few comments about quality of life issues using unsigned values in Kotlin.

  • Numeric constant literals are not automatically converted by the compiler to unsigned equivalents, e.g. val num: UInt = 25 is a compiler error. This is frustrating in declarations, where the value is known at compile time. Another example is ubyteArrayOf(0xEFu, 0xBBu, 0xBFu), where the values inside the array can only be unsigned bytes. The most frustrating of all is that a 0 does not simply work in val num: UInt = 0. This is boilerplate that Kotlin is usually good at avoiding. I would argue that forcing explicit declarations may be good for Java developers who aren't sure of the semantics of unsigned vs signed types, but bad for developers who are familiar with unsigned types and want Kotlin to work well with them.

  • String formatting doesn't work with unsigned types. String.format("%08x:%08x = %d", a, b, c) results in a runtime error. I understand why this is the case; I am just bringing it up as a quality of life issue that could be improved in the API.

  • Conversions are pedantic, in a language which does a great job at terseness. You must explicitly call two conversions in some cases, such as Int.toUInt().toLong() when needing to sign-extend or zero-extend. This might be solvable by a few helper methods.

  • UTypes do not extend Number which rears its ugly head in places you wouldn't expect. I've added a helper method to help in these cases, but there might be a better way at the API/compiler level. If not Number, it would be nice if the classes all implemented some common interface or abstract class, otherwise when the type is not known you have to check for all 4 types of unsigned classes to convert. See (1) below.

  • Inexplicably, the UByte and UShort APIs do not include shl and shr.

  • Given the code

val a1: UShort = 1u; 
val a2: UShort = 2u; 
val a3 = a1 + a2

Reasonably one would expect the type of a3 to be UShort, but nope! It's a UInt. That's a hidden gotcha if I've ever seen one.

  • Related to the above: I want unsigned types for low-level fast work. I thought inline classes would act seamlessly as primitives at the bytecode level, but they're actually quite expensive. The above code turns into the following Java code.
      int var4 = UInt.constructor-impl(a1 & '\uffff');
      int var5 = UInt.constructor-impl(a2 & '\uffff');
      int a3 = UInt.constructor-impl(var4 + var5);

I'm just passing this feedback along to see if it can help in any way. The current state of affairs is much better than not having unsigned types at all!

(1)

private fun Any.toInt(): Int {
    if (this is UByte) return this.toInt() // I can't group all the unsigned types
    if (this is UShort) return this.toInt()
    if (this is UInt) return this.toInt()
    if (this is ULong) return this.toInt()
    if (this is Number) return this.toInt() // I can group all signed types as a Number
    throw IllegalArgumentException("Not number $this")
}
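A few of the points in this comment can be worked around with explicit conversions. The sketch below is one possible approach, not an official recommendation; shiftLeft and hex8 are hypothetical helper names, and hex8 assumes the toString(radix) extension that was added to unsigned types after this comment was written.

```kotlin
// Hypothetical helper: shift a UByte by widening to UInt and narrowing back.
fun shiftLeft(b: UByte, bits: Int): UByte = (b.toUInt() shl bits).toUByte()

// Hypothetical helper: zero-padded hex output without String.format.
fun hex8(v: UInt): String = v.toString(16).padStart(8, '0')

fun main() {
    val a1: UShort = 1u
    val a2: UShort = 2u
    val sum: UInt = a1 + a2                    // UShort + UShort yields UInt
    val narrowed: UShort = (a1 + a2).toUShort() // narrow back explicitly
    println(sum)       // 3
    println(narrowed)  // 3

    val b: UByte = 0x81u
    println(shiftLeft(b, 1))  // 2 (0x102 truncated to 8 bits)
    println(hex8(0xBEEFu))    // 0000beef
}
```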

@elect86

elect86 commented May 2, 2019

@matt-quigley-ck I'd like to point you to one of our libraries, kotlin-unsigned, in case it suits your needs better:

  • val num = Uint(25) or val num = Uint() // 0. No arrays yet though
  • a.format("%08x")
  • no sign-extend or zero-extend yet (you may want to contribute?)
  • Utypes do extend Number (it might also be possible to implement some additional layer common to all Utypes?)
  • All the Utypes implement all the function, including shl and shr for Ubyte and Ushort
  • val a1: Ushort= 1u; val a2: Ushort= 2u; val a3 = a1 + a2, a3 is actually an Ushort type

The Kotlin stdlib unsigned types are lighter because they are Int under the hood.
kotlin-unsigned uses classes instead. However, to address this in critical scenarios, the primitive value inside the Utypes is a var

@hfhbd

hfhbd commented Apr 26, 2021

Will there be any compile time checks to prevent negative unsigned numbers?

fun bar() {
    foo(4u - 5u) // no error, but could be optimized to foo(-1u) and should fail at compile time
    foo(-1u) // compiler error, unaryMinus not found
}

fun foo(i: UInt) { }

@matt-quigley-ck

 foo(4u - 5u) // no error, but could be optimized to foo(-1u) and should fail at compile time

This should not fail, it should wrap around. Wrap arounds are completely legal, valid, and common in unsigned work. 4u - 5u == UInt.MAX_VALUE.

Note that some languages do handle integer overflow handling in a first class manner. (Rust and Swift come to mind) But it's never been a Java thing. If this is wanted, I'd say helper methods are the way to go.
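A small sketch of the wraparound behaviour described here, using variables rather than constant expressions:

```kotlin
fun main() {
    val four: UInt = 4u
    val five: UInt = 5u
    val max: UInt = UInt.MAX_VALUE

    println(four - five)                    // 4294967295
    println(four - five == UInt.MAX_VALUE)  // true
    println(max + 1u)                       // 0
}
```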

@ilya-g
Member Author

ilya-g commented May 4, 2021

Will there be any compile time checks to prevent negative unsigned numbers?

There's no such thing as negative unsigned numbers, because unsigned numbers do not have a sign.

A situation like 4u - 5u or UInt.MAX_VALUE + 1u is called an (unsigned) overflow. A similar overflow situation is currently detected for operations with signed numbers, and the compiler issues a corresponding warning about it.

An overflow could be detected for unsigned numbers too. I've opened KT-46470 to track this feature.

@ilya-g
Member Author

ilya-g commented Aug 28, 2023

An observation was reported about the fact that comparison operators mixing signed and unsigned integer types have clear meaning and potentially could be provided: https://youtrack.jetbrains.com/issue/KT-59634/Provide-comparison-operators-that-mix-signed-and-unsigned-operands

@gabrieljones

gabrieljones commented Jul 25, 2024

Java standard library has Math.<operation>Exact methods
https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/lang/Math.html#subtractExact(int,int)

Kotlin's unsigned types could benefit from a similar set of functions.

For example:

inline fun ULong.subtractExact(subtrahend: ULong): ULong = when {
  subtrahend > this -> throw ArithmeticException("underflow")
  else -> this - subtrahend
}

Bonus functions:

inline fun ULong.subtractOrZero(subtrahend: ULong): ULong = when {
  subtrahend > this -> 0u
  else -> this - subtrahend
}

val maxLongAsULong = Long.MAX_VALUE.toULong()

inline fun ULong.toSignedExact(): Long = when {
  this > maxLongAsULong -> throw ArithmeticException("overflow")
  else -> this.toLong()
}

inline fun ULong.toSignedOrMax(): Long = this.coerceAtMost(maxLongAsULong).toLong()

inline fun ULong.toSignedUnsafe(): Long = this.toLong()
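For illustration, here is how two of the proposed helpers might behave at a call site. The definitions are restated (without inline) so the sketch is self-contained; these are the hypothetical functions from this comment, not existing standard library APIs.

```kotlin
// Restated from the proposal above; hypothetical, not in the standard library.
fun ULong.subtractExact(subtrahend: ULong): ULong =
    if (subtrahend > this) throw ArithmeticException("underflow") else this - subtrahend

fun ULong.subtractOrZero(subtrahend: ULong): ULong =
    if (subtrahend > this) 0uL else this - subtrahend

fun main() {
    val a: ULong = 3u
    val b: ULong = 5u

    println(b.subtractExact(a))   // 2
    println(a.subtractOrZero(b))  // 0 (clamped instead of wrapping)
    println(runCatching { a.subtractExact(b) }.isFailure)  // true (underflow is reported)
}
```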
