[BUG] Unsigned integer casting overflowing as if signed when using `int()` or `UInt32()` #3065

fnands · 2024-06-17T08:52:58Z

Bug description

Migrating this here after a bit of discussion in Discord.

It seems like casting to unsigned integers actually just casts to signed integers, but has different behaviour in different cases.

I.e. I would expect that if I took a UInt32 value, set it to 129, cast it to UInt8 and then call int(), I would still get 129.

However, this seems to end up as 4294967169, i.e. 2**32 - 127.

If I instead call UInt32() I get 1, so it overflows as 2**32 + 1? Not sure about the last one.

So despite the fact that no overflow is expected (129 < 255), it seems to overflow, i.e. act like a signed integer.

The same goes if I cast to UInt16, i.e. casting 32768 (2**16/2, one more than the Int16 maximum) to UInt16 and then calling int() I get 4294934528, i.e. 2**32 - 2**16/2.

So it looks like the cast to UInt8 and UInt16 actually perform casts to Int8 and Int16, and then overflow.
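The numbers reported above are consistent with the narrow value being reinterpreted as a two's-complement signed integer before it is widened. A minimal Python sketch of that arithmetic follows. It is only a model of the observed values, not Mojo code, and the helper names `as_signed` and `wrap_unsigned` are made up for illustration.

```python
def as_signed(value: int, bits: int) -> int:
    """Reinterpret the low `bits` bits of `value` as a two's-complement signed integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >= (1 << (bits - 1)) else value


def wrap_unsigned(value: int, bits: int) -> int:
    """Wrap `value` into the range of an unsigned integer that is `bits` wide."""
    return value & ((1 << bits) - 1)


# 129 fits in a UInt8, but the same bit pattern read as an Int8 is -127.
signed8 = as_signed(129, 8)
# Storing -127 in a UInt32 wraps around to 2**32 - 127.
wrapped8 = wrap_unsigned(signed8, 32)

# The same pattern for 32768 in 16 bits.
signed16 = as_signed(32768, 16)
wrapped16 = wrap_unsigned(signed16, 32)

print(signed8, wrapped8)    # -127 4294967169
print(signed16, wrapped16)  # -32768 4294934528
```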

Even more confusingly, the behaviour depends on where the original UInt32 comes from, and it also differs between initialising with var and with alias (see below).

Steps to reproduce

Most minimal example

def main():
 
    var b: UInt32 = 129
    print(b.cast[DType.uint8]())
    print(int(b.cast[DType.uint8]()))
    print(UInt32(b.cast[DType.uint8]()))

Produces:

129
-127
1

Doesn't have the same effect in the REPL, but the REPL does seem to have a clue:

  1> var a: UInt32 = 129 
  2.  
(SIMD[uint32, 1]) a = {
  (ui32) [0] = 129
}
  2> var b = a.cast[DType.uint8]() 
  3.  
(SIMD[uint8, 1]) b = {
  (ui8) [0] = -127
}

A little bit less of a minimal example:

fn fill_super_minimal() -> List[UInt32]:
    var table = List[UInt32](capacity=4)
    table.append(129)
 
    # This overflows with alias, but not var
    table.append(int(table[0].cast[DType.uint8]()))
 
    # This overflows in both cases, as if uint8 overflowed.
    var a: UInt32 = 129
    table.append(int(a.cast[DType.uint8]()))
 
    # This overflows as if uint32 overflows
    var b: UInt32 = 129
    table.append(UInt32(b.cast[DType.uint8]()))
 
    return table
 
 
def main():
 
    var var_table = fill_super_minimal()
    alias alias_table = fill_super_minimal()
 
    for i in range(4):
        print(var_table[i], alias_table[i])

The expected output would be 129 in every position, since none of the casts should overflow.

But I get:

129 129
129 4294967169
4294967169 4294967169
1 1

System information

- What OS did you install Mojo on?
Ubuntu 22.04.4 LTS
- Provide version information for Mojo by pasting the output of `mojo -v`
mojo 24.4.0 (2cb57382)
- Provide Modular CLI version by pasting the output of `modular -v`
modular 0.8.0 (39a426b5)

fnands · 2024-06-17T09:12:16Z

Just tested on nightly and behaviour is identical.
Nightly version: mojo 2024.6.1705 (79838f00)

soraros · 2024-06-17T12:54:35Z

Related to #2860 and #3045.

The 1 is caused by implicit conversion. UInt32(...) doesn't do what you think it does.

fnands · 2024-06-17T14:25:07Z

So just to be clear:

The var vs alias differences seem to only appear when reading a value from a table.

So there might be two issues: the table one and the int() one.

soraros · 2024-06-17T14:37:59Z

Smaller repro:

fn main():
    print(int(UInt8(128)))  # prints -128

It's very curious that the following runs fine:

fn main():
  var n = UInt8(128)
  print(n)       # prints 128
  print(int(n))  # prints 128

fnands · 2024-06-17T14:43:01Z

> Smaller repro:
>
> fn main():
>     print(int(UInt8(128)))  # prints -128

Similarly, running:

  var a: UInt32 = 129

  print(int(a.cast[DType.uint8]()))

in the REPL returns 129.

Running:

def main():
    var a: UInt32 = 129

    print(int(a.cast[DType.uint8]()))

main()

In the REPL returns -127

martinvuyk · 2024-07-03T21:11:10Z

I have a quick and dirty solution proposal for this in #3172. I would love some help, since MLIR won't let me rebind to another type and I'm out of my depth when it comes to IR and compiler stuff. As far as I understand, we only have access to the pop and index dialects, and we'd need a way to bitcast the value itself. I've only found `__mlir_op.pop.pointer.bitcast`, which I imagine only works on the heap.
@soraros any idea who in the Mojo team could jump in here?

soraros · 2024-07-12T02:59:26Z

> @soraros any idea who in the Mojo team could jump in here?

IDK. Let's just summon the all knowing @JoeLoser.

laszlokindrat · 2024-07-12T14:28:59Z

I'm taking a look at this now:

def main():
 
    var b: UInt32 = 129
    print(b.cast[DType.uint8]())
    print(int(b.cast[DType.uint8]()))
    print(UInt32(b.cast[DType.uint8]()))

There is something flat out wrong with the conversion in print(int(b.cast[DType.uint8]())), but we can probably fix it easily. The other one, however, is actually due to an implicit bool conversion: print(UInt32(b.cast[DType.uint8]())). I'm looking at options right now for this.
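To see why the last line prints 1: if the UInt8 value is first implicitly converted to a Bool, and that Bool is then converted to UInt32, every nonzero input collapses to 1. A rough Python model of that conversion chain follows. It is not Mojo code, and `via_implicit_bool` is a hypothetical helper used only for illustration.

```python
def via_implicit_bool(value: int) -> int:
    """Model a conversion that passes through a boolean: nonzero -> True -> 1, zero -> False -> 0."""
    return int(bool(value))


# Any nonzero input ends up as 1, regardless of its magnitude.
results = {v: via_implicit_bool(v) for v in (0, 1, 129, 255)}
for original, converted in results.items():
    print(original, "->", converted)
```

Under this model 129 maps to 1, which matches the output reported for `UInt32(b.cast[DType.uint8]())`.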

martinvuyk · 2024-07-12T14:44:18Z

> print(UInt32(b.cast[DType.uint8]())). I'm looking at options right now for this.

In PR #3172 I thought about adding a constructor that casts. WDYT?

    fn __init__[A: DType, //](inout self, value: SIMD[A, size]):
        """Cast the other SIMD vector into self.

        Parameters:
            A: The DType of the other SIMD.

        Args:
            value: The value to cast into self.
        """
        self = value.cast[type]()

laszlokindrat · 2024-07-12T14:46:18Z

> In PR #3172 I thought about adding a constructor that casts. WDYT?
>
>     fn __init__[A: DType, //](inout self, value: SIMD[A, size]):
>         """Cast the other SIMD vector into self.
>
>         Parameters:
>             A: The DType of the other SIMD.
>
>         Args:
>             value: The value to cast into self.
>         """
>         self = value.cast[type]()

I think we should make casting like this explicit. One way to do that is to make an always-failing constructor. I'm working on that now.
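As a loose illustration of the "explicit cast only" design, here is a Python sketch in which constructing a scalar from a scalar of a different dtype always fails and `.cast()` is the only way across. This is a runtime analogue only. In Mojo the rejection would presumably happen at compile time, and the `Scalar` class and `BITS` table below are hypothetical, not part of any Mojo API.

```python
BITS = {"uint8": 8, "uint16": 16, "uint32": 32}


class Scalar:
    """Toy unsigned scalar that refuses implicit conversion between dtypes."""

    def __init__(self, dtype: str, value) -> None:
        if isinstance(value, Scalar):
            if value.dtype != dtype:
                # The "always-failing constructor": mismatched dtypes are rejected.
                raise TypeError(
                    f"cannot construct {dtype} from {value.dtype}; use .cast() instead"
                )
            value = value.value
        self.dtype = dtype
        # Wrap the integer into the unsigned range of the dtype.
        self.value = value & ((1 << BITS[dtype]) - 1)

    def cast(self, dtype: str) -> "Scalar":
        """Explicit conversion: go through the raw integer value."""
        return Scalar(dtype, self.value)


b = Scalar("uint32", 129)
narrow = b.cast("uint8")

try:
    Scalar("uint32", narrow)  # implicit-style conversion is rejected
    rejected = False
except TypeError:
    rejected = True

widened = narrow.cast("uint32")  # the explicit cast preserves the value
print(rejected, widened.dtype, widened.value)
```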

soraros · 2024-07-12T19:14:04Z

@laszlokindrat I had the idea of adding an all-failing constructor to catch this case. However, with implicit conversion to Bool, it's still less ideal than what we had before. The LSP can't warn us about mismatched types:

fn f(x: SIMD) -> SIMD:
  return x  # We can only find out about the dtype/shape mismatch after the first compiling attempt

soraros · 2024-07-14T15:37:34Z

The implicit casting part is tracked in #3045.

laszlokindrat · 2024-07-15T15:09:43Z

> @laszlokindrat I had the idea of adding an all-failing constructor to catch this case. However, with implicit conversion to Bool, it's still less ideal than what we had before. The LSP can't warn us about mismatched types:
>
> fn f(x: SIMD) -> SIMD:
>   return x  # We can only find out about the dtype/shape mismatch after the first compiling attempt

I agree this is not optimal, but the LSP currently has a general limitation that it doesn't work with `constrained`. IMO that's what really should be fixed, in conjunction with a tighter story around constraints.

The `pop.cast` op would sign extend when casting a smaller unsigned type to the `index` type, which would cause incorrect behavior for `int(UInt8(128))` and similar code. This patch fixes that by first upcasting to a larger unsigned scalar value and then converting to `Int`.

Fixes #3065

MODULAR_ORIG_COMMIT_REV_ID: 111bb5545c02663ac0e7bf7cdb3678ea23c261fd
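The difference between the buggy and the patched conversion can be modelled in a few lines of Python. This is a sketch of the idea described in the patch note, not the actual implementation. The function names are hypothetical, and a 64-bit `Int` is assumed.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits as signed and widen (top bit is replicated)."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >= (1 << (bits - 1)) else value


def zero_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits as unsigned and widen (upper bits are zero)."""
    return value & ((1 << bits) - 1)


def to_int_buggy(u8: int) -> int:
    # Model of the bug: the 8-bit value is sign extended straight to the index type.
    return sign_extend(u8, 8)


def to_int_fixed(u8: int) -> int:
    # Model of the fix: widen as unsigned first, then convert to a signed 64-bit Int
    # (assumed width). The top bit of the widened value is clear, so the sign is kept.
    widened = zero_extend(u8, 8)
    return sign_extend(widened, 64)


print(to_int_buggy(128), to_int_fixed(128))  # -128 128
print(to_int_buggy(129), to_int_fixed(129))  # -127 129
```

Values below 128 are unaffected in either version, which fits the observation that only inputs with the top bit set misbehave.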

fnands added the bug and mojo-repo labels Jun 17, 2024

DWSimmons mentioned this issue Jul 3, 2024

[BUG] [stdlib] [SIMD] Inconsistent int casting #3167

Closed

martinvuyk added a commit to martinvuyk/mojo that referenced this issue Jul 3, 2024

idea for issue modularml#3065

c22950f

Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>

martinvuyk mentioned this issue Jul 3, 2024

[stdlib] Fix SIMD uint cast #3172

Closed

martinvuyk added a commit to martinvuyk/mojo that referenced this issue Jul 10, 2024

idea for issue modularml#3065

0eab8d7

Signed-off-by: martinvuyk <martin.vuyklop@gmail.com>

martinvuyk mentioned this issue Jul 12, 2024

[BUG] SIMD[type,size].MAX_FINITE #3220

Closed

laszlokindrat self-assigned this Jul 12, 2024

This was referenced Jul 14, 2024

[BUG] UInt32 or UInt64 conversion to Float32 or Float64 does not work correctly #3244

Open

[mojo-stdlib] Implicit conversions between SIMD values can lose data #3149

Closed

laszlokindrat closed this as completed Jul 16, 2024

fnands mentioned this issue Jul 27, 2024

Pre-compute CRC32 table fnands/mimage#3

Open
