Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some UInt calculations aren't unsigned #2736

Closed
shohei909 opened this issue Mar 8, 2014 · 40 comments · Fixed by #3899
Closed

Some UInt calculations aren't unsigned #2736

shohei909 opened this issue Mar 8, 2014 · 40 comments · Fixed by #3899
Assignees
Milestone

Comments

@shohei909
Copy link
Contributor

class MainTest {
    static public function main() {
        var a:UInt = 50000 * 50000;
        var b:UInt = 50000;

        trace(a); //2500000000
        trace(a / b); //50000 in flash and js, but -35899.34592 in neko
        trace(a > b); //true in flash and js, but false in neko
        trace(a == -1794967296); //compile error in flash, false in js, and true in neko
        trace(a % b); //0 in flash and js, but 4294950000 in neko, 
        trace(a >> 1); //-897483648 in flash, but 3397483648 in neko and js
    }
}
@Simn
Copy link
Member

Simn commented Mar 8, 2014

@ncannasse: Is this specified?

@ncannasse
Copy link
Member

Not-static platforms such as JS or Neko will turn Int/UInt overflowed operations into Float. The value is still an integer since it has % 1 == 0, but it might lose precision.

Neko is particular in the sense that all operations including * on two Int are overflowing on 32 bits, whereas for JS it's only bitwise operators such as ^ | & >> >>> <<

However that should not change the result of UInt operations since the abstract is meant to deal with these platform specific things.

@ncannasse
Copy link
Member

I think the problem here is that the first UInt is assigned by an Int that has already overflowed. If this can't be reproduced by (50000:UInt) * (50000:UInt) then we're all green, since Int overflow is unspecified

@Herschel
Copy link
Contributor

Herschel commented Mar 8, 2014

50000 * 50000. This is already dangerous because some platforms don't use 32-bit semantics on mutliply. So, for example, on JS, you get a big Float. I'm not sure if we want to ignore this, or implement UInt using Int32, or (god help us) define UInt32...

a / b, a % b: UInt doesn't adjust for the signs here.

a == -1: Flash gives "Comparison of Int and UInt might lead to unexpected results". We should either remove this error, or remove the Int==UInt ops in the abstract so that it errors on other platforms.

a >> 1: Flash actually defines uint >> int as returning int?! (Try it in AS3).

  1. How much do we care about cross-platform consistency? (Should UInt wrap Int32 instead of Int?)
  2. What do we define for mixed-type operations? e.g. Is Int==UInt allowed, or Int+UInt?

@shohei909
Copy link
Contributor Author

50000 * 50000 is not problem here. (50000:UInt) * (50000:UInt) reproduce this.

@ncannasse
Copy link
Member

Then it should be fixed for UInt at least

@ncannasse
Copy link
Member

BTW we should maybe translate >> into >>> for UInt, but that will introduce a regression

@Simn
Copy link
Member

Simn commented Mar 9, 2014

I'm quite sure that fixing this would require basing UInt on Int32.

@Herschel
Copy link
Contributor

Herschel commented Mar 9, 2014

I've got a branch that has UInt implemented with Int32 and fixes the other issues with div and mod. I have two remaining questions:

  1. Do we support comparison of Int vs UInt? genswf9 says no, but the abstract currently says yes. If not, we can just remove Int==UInt from the abstract.

  2. How to handle >>? I don't think it should be translated to >>>, however the result should be remain a UInt. So (a>>1) above would be 3397483648. I guess this would need genswf9 to cast it to the right type?

@shohei909
Copy link
Contributor Author

Additionally, abstract UInt lacks UInt == Float and UInt != Float overloading.

class MainTest {
    static public function main() {
        var a:UInt = (50000:UInt) * (50000:UInt);
        trace(a > 2500000000.0); //false
        trace(a < 2500000000.0); //false
        trace(a == 2500000000.0); //false
    }
}

So, you should add below functions to UInt.

    @:commutative @:op(A == B) private static function equalsFloat(a:UInt, b:Float):Bool {
        return a.toFloat() == b;
    }

    @:commutative @:op(A != B) private static function notEqualsFloat(a:UInt, b:Float):Bool {
        return a.toFloat() != b;
    }

@Simn
Copy link
Member

Simn commented Mar 9, 2014

@ncannasse: so is our UInt type actually more of a UInt32 type?

@ncannasse
Copy link
Member

@Simn, that's an interesting question. I didn't get much thoughts to UInt since the start, since it was to me more like a Flash specific thing. Now that you mention it, I don't think that UInt overflow rules should actually differ from Int overflow ones, but it's quite complex since their domain definition is not the same, and per-platform auto promotion from Int to Number (JS,Flash) in case of overflow is also an issue

@Herschel what would you suggest if we were not to use Int32 ?

@Simn Simn added this to the 3.1.1 milestone Mar 11, 2014
@Simn
Copy link
Member

Simn commented Mar 11, 2014

We should address this for 3.1.1 one way or another.

@Herschel
Copy link
Contributor

If we don't use Int32 and ignore overflow behavior, I can still fix the issues with div and mod, and add the Float comparisons that @shohei909 suggested.

There's still the question of allowing Int == UInt comparisons, as well as shifts in flash9. (It affects left shifts too, btw, because Flash VM defines shifts to always return int.)

@ncannasse
Copy link
Member

@Herschel I think we should disalow Int=UInt comparisons, since they might cause some issues with negative numbers which seems somehow unspecified.

I would not change flash shift behavior for 3.1.1 since that would be a potential regression, will have to look at it for 3.2

@ncannasse
Copy link
Member

Seems to be a lot better now, but we still have this issue with shift behavior.
I think that we should have UInt >> 1 use unsigned bit shift (and we need to document that since it's a change wrt 3.2).

@Simn could you fix it and write unit tests for this so we make sure we have same behavior everywhere ?

@ncannasse ncannasse assigned Simn and unassigned ncannasse Feb 19, 2015
@Simn
Copy link
Member

Simn commented Feb 21, 2015

But >> leads to a negative result on Flash in the original example. If we use >>> and stick to UInt we will not have the same behavior on other targets. So I don't quite understand what you want me to fix here.

@ncannasse
Copy link
Member

@Simn I was proposing to use >>> on all targets of course, so it's always unsigned.
In C there is only >>, the operation depends if the type is signed or not.

@Herschel
Copy link
Contributor

Since Haxe has both >> and >>> operators, I think that the signedness of the shift should depend on the operator used, not on the operand type. This puts it in line with other languages with both >> and >>>.

@ncannasse
Copy link
Member

@Herschel yes I actually started thinking about it just after posting my comment. Maybe we should stick to >> returning an Int and >>> keeping the type intact. @Simn would that work for you ?

@Herschel
Copy link
Contributor

Looks like cs target is different here as well, UInt >> n results in an unsigned shift (above example returns 1250000000).

@ncannasse
Copy link
Member

@Herschel ah now that's annoying, because that means we have on some platforms a different behavior on what people are used to...

@ncannasse
Copy link
Member

Maybe taking the C side is better then. @waneck what about Java, and what's your take on this ?

@Simn
Copy link
Member

Simn commented Feb 22, 2015

The question is always if that's a problem with C# or with gencs.

@waneck
Copy link
Member

waneck commented Feb 22, 2015 via email

@ncannasse
Copy link
Member

@waneck given how it's hard to decice on a given behavior, forbidding is maybe the best solution indeed.

@ousado
Copy link
Contributor

ousado commented Feb 22, 2015

Following the principle of least surprise I'd (also?) vote for making >> an unsigned shift. Forbidding it would only force people to change >> into >>> in some places to make things compile, for no good reason. If there was a good use-case for >> being signed, that'd be different, but I highly doubt that anyone using UInt in the first place would want to perform signed shifts on it, and that an operators behavior depends on the types it's used with makes sense, IMO.

@ncannasse ncannasse reopened this Feb 22, 2015
@ncannasse
Copy link
Member

It was not supposed to be closed, and each new message on this thread makes me change my opinion. I'll sleep on it and look at it again this week with fresh mind

@Simn
Copy link
Member

Simn commented Feb 22, 2015

I'm leaning towards making >> "return" Int. That way it's completely user's choice.

@Simn
Copy link
Member

Simn commented Feb 22, 2015

... but I change my opinion about this on a regular basis as well.

@ncannasse
Copy link
Member

As long as we don't agree then everything is normal :-)

@Herschel
Copy link
Contributor

We had a discussion on IRC, here's the consensus:

The semantics of the shift operators are based on the operand type. >> maintains the sign of the value, >>> zero-shifts. Because UInt values have no sign, >> and >>> are the same for UInt, both zero-shift.

>>:
Int -> Int -> Int
UInt -> Int -> UInt

>>>:
Int -> Int -> Int
UInt -> Int -> UInt

We briefly discussed disallowing >>> for UInt, but decided to have >> and >>> mean the same to avoid people having to change >>> to >> for not much of a reason.

What do you think of this, @ncannasse ?

@ncannasse
Copy link
Member

I seriously thought about it as well, but came to a different conclusion :)
First, regarding the types: I agree that the type should be conserved with bitshifting operations, so as you said both >> and >>> should return UInt when applied to an UInt.

But I think that we should also get the same result. For instance

(0xFFFFFFFF >>> 1 = 0x7FFFFFFF) = 2147483647 for both Int and UInt
and
(0xFFFFFFFF >> 1 = 0xBFFFFFFF) = 3221225471 (UInt) or -1073741825 (Int). The value is the same, only the way it's displayed changes.

That allows to switch between Int and UInt without changing the actual calculus being done, which is normal for bit operators. This way we specify that >> is an signed bitshift operator and that >>> is an unsigned one.

I think it actually makes more sense this way if you represent the value as bits, which is what the operator does (operates on bits).

@Simn
Copy link
Member

Simn commented Feb 23, 2015

We have updated our consensus in the sense that nobody cares enough to disagree with your opinion. So let's go ahead with that!

@waneck
Copy link
Member

waneck commented Feb 23, 2015

Nicolas, I think that this is the worst behaviour we can have. On many languages - that don't have the >>> operator -, what determines if it's a signed or unsigned shift is their type (e.g. C, C++, C#...). Porting code from these languages into Haxe would be very cumbersome if we make the behaviour you're suggesting.

My own personal view - in light of porting code and the confusion it brings - is that whatever we decide, we should @:deprecate the >> operator on UInt. I know it's inconvenient, but it can save a lot of hours of debugging.

@ousado
Copy link
Contributor

ousado commented Feb 23, 2015

We have to be careful:

  • On an unsigned integer, there simply is no sign bit we can carry.
  • On an unsigned integer, to perform an arithmetic right-shift, you do a logical one.
  • What are "signed" / arithmetic right-shifts useful for? For quickly dividing by powers of 2, for instance:

signed arithmetic right-shift:
10000000000000000000000000000000 >> 1 = 11000000000000000000000000000000
(-2147483648) / 2 = (-1073741824) // USEFUL!!!

unsigned both logical and arithmetic right-shift:
10000000000000000000000000000000 >> 1 = 01000000000000000000000000000000
2147483648 / 2 = 1073741824 // ALSO USEFUL!!!

proposed haxe so called "signed"/arithmetic right-shift on an unsigned integer:
10000000000000000000000000000000 >> 1 = 11000000000000000000000000000000
2147483648 / 2 = 3221225472 // ???????????? !!!

Seriously, frankly, please, let's not try to be "clever but different" here.
C, C#, C++, D, Rust, SML, Go, Perl, Scala - not a single one of these languages which

  1. do have unsigned types and
  2. have both arithmetic and logical right-shift
    use "signed"/arithmetic shift on unsigned integers. Not one. And that's because it would be wrong, plain and simple.

See especially D which, syntax-wise, also uses >>> for a 0-filling right-shift on signed integers, does allow >> on unsigned types and it does exactly the same - a plain logical, 0-filling right-shift, not a 1-filling pseudo-arithmetic one.

Haxe is an awesome language for systems programming. An outstanding one, even. And it will become much better at it. Let's get integers, of all things, right. Please.

@waneck
Copy link
Member

waneck commented Feb 23, 2015

Well, I have to say, after what @ousado said, I'm convinced.

@Simn
Copy link
Member

Simn commented Feb 24, 2015

Indeed, can't disagree with that.

@ncannasse
Copy link
Member

Another day, another opinion... Ok, let's go with UInt >> being unsigned shift, and burn people that complains about it afterwards :)

@waneck
Copy link
Member

waneck commented Feb 24, 2015 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants