Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect f64 -> u64 conversions #20944

Open
Varpie opened this issue Mar 3, 2024 · 5 comments
Open

Incorrect f64 -> u64 conversions #20944

Varpie opened this issue Mar 3, 2024 · 5 comments
Labels
Bug This tag is applied to issues which reports bugs. Unit: Compiler Bugs/feature requests, that are related to the V compiler in general. Unit: Type System Bugs/feature requests, that are related to the V types system.

Comments

@Varpie
Copy link
Contributor

Varpie commented Mar 3, 2024

Describe the bug

For explicit conversions of an f64 to a u64, it seems like the u64 operator behaves incorrectly for values between 2^63 and 2^64-1 (the max value for a u64).
Here are some examples of f64 values that should return valid u64, but fail:

Reproduction Steps

fn main() {
    assert u64(-1) == 18446744073709551615 // this one passes
    assert f64(u64(-1)) == u64(-1) // this one passes
    assert u64(f64(u64(-1))) == f64(u64(-1)) // this one fails. u64(f64(u64(-1))) gives 0
    
    assert u64(math.pow(2, 63)) == math.pow(2, 63) // this one works
    assert u64(math.pow(2, 63) + 1) != math.pow(2, 63) // this one fails, which can make sense due to low precision on the f64
    assert u64(math.pow(2, 64)) != math.pow(2, 63) // this one fails too, which doesn't make sense since 2^64 should overflow and give 0 instead of 2^63
}

Expected Behavior

The assertions above should all pass.
Alternatively, when u64 is used with a type that is too large, it should give an error, so we can handle it properly.

Current Behavior

The assertions commented as failing give a FAIL, with *unknown value* for the u64 casts. For instance:

fn main.main: assert u64(math.pow(2, 63) + 1) != math.pow(2, 63)
   left value: u64(math.pow(2, 63) + 1) = *unknown value*
  right value: math.pow(2, 63) = 9.223372036854776e+18

But when evaluating just the side with u64, it gives a concrete value: u64(math.pow(2, 63) + 1) is 9223372036854775808 (and so is u64(math.pow(2, 63)))

Possible Solution

Potentially related to #14783

Maybe also related: assertions of large f64 with small differences fail (although this is likely just because the IEEE754 binary64 format is only precise up to 2^53 for integers):

assert math.pow(2, 64) != math.pow(2, 64) - 1000 // fails, gives both sides as 1.8446744073709552e+19

Additional Information/Context

No response

V version

V 0.4.4 065e5e2

Environment details (OS name and version, etc.)

V full version: V 0.4.4 a374d25.065e5e2
OS: linux, "Arch Linux"
Processor: 8 cpus, 64bit, little endian, Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz

vroot: OK, value: ~/[...]/v
VMODULES: OK, value: ~/.vmodules
VTMP: OK, value: /tmp/v_1000

Git version: git version 2.44.0
Git vroot status: weekly.2024.08-35-g065e5e22-dirty
.git/config present: true

CC version: cc (GCC) 13.2.1 20230801
thirdparty/tcc status: thirdparty-linux-amd64 99683af0

Note

You can use the 👍 reaction to increase the issue's priority for developers.

Please note that only the 👍 reaction to the issue itself counts as a vote.
Other reactions and those to comments will not be taken into account.

@Varpie Varpie added the Bug This tag is applied to issues which reports bugs. label Mar 3, 2024
@kbkpbot
Copy link
Contributor

kbkpbot commented Mar 3, 2024

The generated C code as follow:

	// assert1: assert u64(-1) == 18446744073709551615 // this one passes
	if (!(((u64)(-1)) == 18446744073709551615U)) {}
    
	// assert2: assert f64(u64(-1)) == u64(-1) // this one passes
	if (!(((f64)(((u64)(-1)))) == ((u64)(-1)))) {}
    
	// assert3: assert u64(f64(u64(-1))) == f64(u64(-1)) // this one fails. u64(f64(u64(-1))) gives 0
    if (!(((u64)(((f64)(((u64)(-1)))))) == ((f64)(((u64)(-1)))))) {}
	
    // assert4: assert u64(math.pow(2, 63)) == math.pow(2, 63) // this one works
	f64 _t4 = math__pow(2, 63);
	if (!(((u64)(math__pow(2, 63))) == _t4)) {}
    
	// assert5: assert u64(math.pow(2, 63) + 1) != math.pow(2, 63) // this one fails, which can make sense due to low precision on the f64
    f64 _t6 = math__pow(2, 63);
	if (!(((u64)((f64)(math__pow(2, 63) + 1))) != _t6)) {}
	
	// assert6: assert u64(math.pow(2, 64)) != math.pow(2, 63)
	f64 _t8 = math__pow(2, 63);
	if (!(((u64)(math__pow(2, 64))) != _t8)) {}

compile with different compilers:

v . 
v . -cc gcc
v . -cc clang

As for different C compilers, some assertions does not pass,

  1. tcc fail at assert3, assert5, assert6
  2. gcc fail at assert5,
  3. clang fail at assert3, assert5, assert6

According to this result, I think there are some undefined actions of C lang, so different compiler handle these different.

@hholst80
Copy link
Contributor

hholst80 commented Mar 5, 2024

assert f64(u64(-1)) == u64(-1) // this one passes

That is a problem. Implicit cast between int and float should not be allowed because u64 max overflows the 15 decimals that float 64-bit can represent accurately.

Edit: My point is this: This statement should not compile.

I am fine with:

assert u64(f64(u64(-1))) == u64(-1) // expected to fail

@felipensp felipensp added Unit: Compiler Bugs/feature requests, that are related to the V compiler in general. Unit: Type System Bugs/feature requests, that are related to the V types system. labels Mar 5, 2024
@Varpie
Copy link
Contributor Author

Varpie commented Mar 5, 2024

@hholst80 the cast overflows the 15 decimals that f64 can represent accurately, but the max value of f64 is much higher than the max value of u64, so it doesn't overflow the range of acceptable values. So I don't think it should fail by default, but maybe a compiler warning would be a better way to handle it, in addition to a safe cast method.
The problem is that casting u32 to f64 is safe, and u32 to u64 is also safe, so it can be difficult to determine whether the cast is safe at compile time.

@hholst80
Copy link
Contributor

hholst80 commented Mar 5, 2024

I did a poll a few years back "what is the worst feature of C++" and the implicit cast came up as the absolute worst.

The only case I can come to terms with is a numeric literal. As long as it can be promoted without loss of precision as-is, I can accept it as useful shorthand notation. Eg. assert f64(1) == 1.0 should be ok. Rationale: 1 can be cast to a 1 as f64 and 1.0 literal can be interpreted as a f64 without loss of precision. The only difference between cast and the "interpretation" should be the flexibility of scope and target container for the source number.

Relying on "whatever C does" is closer to the Absolute worst than it's close to Optimal. C does not do much if anything worth copying for numerically safe computations. Given that C is the intermittent state there has to be compile time guards against insecure code that can break the user code in unexpected ways.

@hholst80
Copy link
Contributor

hholst80 commented Mar 5, 2024

The second part of this discussion would be on the expected behavior of an explicit cast from f64 to u64. What should the value be if say the value of the f64 cannot be represented as an u64? C probably pulls it's usual "oh it's undefined and vendor specific". That is not a good platform for safe computations. Safe should be the default with an optional escape hatch like unsafe { ... } or similar.

Note that no floats can generally be represented as integers no matter the size due to signed zero, nan's, and inf's. A stdlib conversion function that returns a Result or unsafe { .. } feels like a step up from C.

Side note: Fortran gave some more careful though and offers four ways to cast a float to an int. They talk about just the regular normal floats.This is hairy stuff when left to intuition- users are going to be surprised in a bad way. https://stackoverflow.com/questions/40310299/convert-a-real-output-into-integer#40310677

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug This tag is applied to issues which reports bugs. Unit: Compiler Bugs/feature requests, that are related to the V compiler in general. Unit: Type System Bugs/feature requests, that are related to the V types system.
Projects
None yet
Development

No branches or pull requests

4 participants