Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Semantics of JavaScript compiled >> as intended? #1434

Closed
eernstg opened this issue Feb 5, 2021 · 5 comments
Closed

Semantics of JavaScript compiled >> as intended? #1434

eernstg opened this issue Feb 5, 2021 · 5 comments
Labels
question Further information is requested specification

Comments

@eernstg
Copy link
Member

eernstg commented Feb 5, 2021

We are probably not going to change the semantics of >> when compiling to JavaScript, but the implemented behavior was actually surprising to me. Consider the following program:

// Assuming compilation to JS.

void main() async {
  // Signed shift right on positive number that fits in 32 bits: No surprise.
  print(4294967295 >> 10); // 4194303

  // 64 bit literals: Truncated in JS, zeros at the end to avoid error.
  // This shows that we're masking off all bits >31.
  print(0x0000000080000000 >> 0); // bit 31 is one: 2147483648.
  print(0x0000000100000000 >> 0); // bit 32 is one: 0.
  print(0xffffffff00000000 >> 0); // all bits >31 are one: 0.

  // Signed shift right _does_ mask off all bits >31, and it's _not_ bit 31
  // which plays the role as a sign bit, it must be the sign of the IEEE 754
  // representation, that is, a *33rd* bit.
  print(0xffffffff00000000 >> 31); // 0.
  print(0xffffffff80000000 >> 31); // 1.

  // Signed shift right _is_ signed, in the sense that we preserve the sign
  // bit of a representation whose IEEE 754 interpretation is negative. But
  // then we mask off the signed result at 32 bits, and consider it unsigned.
  print(-1 >> 1000); // 4294967295.
}

This shows that the operator >> on int in JavaScript-compiled programs behaves as a signed shift on a bit string of length 33. The 32 bits are taken from the mantissa of the IEEE 754 representation, and the 33rd bit is the sign bit.

If this is working as intended then we should specify it.

@rakudrama, @lrhn, @munificent, @leafpetersen, @stereotype441, @jakemac53, @natebosch, WDYT?

@lrhn
Copy link
Member

lrhn commented Feb 5, 2021

The >> surprises me too.

I thought the specified behavior of bitwise operations on the web are to use the underlying JS operations, possibly after some argument validation - we don't do (mod 32) on the shift counts. The JS operation coerces to Int32 first, then does 32-bit operations. Afterwards we convert the result to Uint32 (by doing >>> 0).

Seems >> actually does unsigned shift for positive values and signed shift for negative values, switching mode before converting to Int32. Clever. That makes it possible to achieve both kinds of shifts, you can just do (x | 0) >> n to get unsigned shift. (And since unsigned shift can't be implemented easily using ~ and signed shift, unlike the other direction, it's been good to have it available. That might be less important when we get >>> as an the unconditional unsigned shift).

About specifying it, we already have something in the spec (an appendix) stating:

item[$\bullet$]
  Bitwise operations on integers (and, or, xor, negation, and shifts)
  all truncate the operands to 32-bit two's complement integers,
  perform 32-bit operations on those,
  and the resulting 32 bits are interpreted as a 32-bit unsigned integer
  (for example, \code{-1 <{}< 1} evaluates to 4294967294).

That's apparently not precise when it comes to >>. It doesn't say which shift it performs on the 32-bit values.

@rakudrama
Copy link
Member

The non-negative input behaviour of >> was required to get the 32-bit-unsigned rotate patterns in the crypto libraries to work.

I describe the behaviour as follows:
If the inputs are unsigned 32-bit ints and the result is an unsigned 32-bit int, the VM and web produce the same value.
This applies to all the bitwise and shift operators.
~ fails to be compatible since one of the input or output are negative on the 64-bit int VM.
It would be easier to reason about a more constructive statement. We might be able to say result.toUnsigned(32) is the same on both platforms.

@lrhn
Copy link
Member

lrhn commented Feb 9, 2021

It's not always the same result.toUnsigned(32) on both platforms. Because JS operations truncate the inputs to 32 bits too, a right-shift can lose bits that would be shifted into the first 32 bits of result on the VM.
Example: 0x100000000 >> 3; is 0 on web, 0x20000000 natively.
It's only right shift, all other operations are indifferent to higher positioned bits.

So, to describe the operations, we do need to say that we truncate to 32 bits. The interesting thing about >> is that it treats the bits as signed or unsigned depending on the sign of the original number, a property which is not part of those 32 bits.

@eernstg
Copy link
Member Author

eernstg commented Feb 10, 2021

Here's a proposal for adding a few words to specify it: #1448.

@eernstg
Copy link
Member Author

eernstg commented Mar 10, 2021

#1448 landed as e909d79.

@eernstg eernstg closed this as completed Mar 10, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested specification
Projects
None yet
Development

No branches or pull requests

3 participants