Skip to content
This repository was archived by the owner on Dec 22, 2021. It is now read-only.

Conversation

@tlively
Copy link
Member

@tlively tlively commented Sep 13, 2019

Clarifies that low lanes are lanes [0, n/2) and high lanes are
[n/2, n) and that the order of arguments to the narrowing ops
are (low, high). This matches the current implementation in V8.

Resolves #103.

Clarifies that high lanes are lanes `[0, n/2)` and low lanes are
`[n/2, n)` and that the order of arguments to the narrowing ops
are (low, high). This matches the current implementation in V8.
@tlively tlively requested review from dtig and ngzhian September 13, 2019 01:28
@tlively
Copy link
Member Author

tlively commented Sep 13, 2019

cc @penzn as well.

Copy link
Member

@ngzhian ngzhian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some clarifications, please let me know if i'm completely wrong and confused.


def S.widen_low_T_s(a):
return S.widen_low_T(Sext, a)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we missing S.widen_low_T_u and S.widen_high_T_s?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah good point. Will update once we resolve the high/low terminology above.

result[i] = S.SignedSaturate(b[i])
for i in range(T.Lanes):
result[T.Lanes + i] = S.SignedSaturate(a[i])
return result
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In #103 I mentioned that narrow [a1,a0] [b1,b0] = [b1,b0,a1,a0] this is "register view".
In pseudocode, I think we deal with arrays/lists, hence "memory view" (little endian). So I think it looks a bit different.

a = [a0, a1, a2, a3]
b = [b0, b1, b2, b3]
narrow(a,b) will try to put a on the low lanes, i.e. result[0], result[1], result[2], result[3]. Note, I think low lanes should be the smaller lane numbers, so lane 0 is the lowest lane.

Thus in pseudo code, the arguments to SignedSaturate are swapped:

    for i in range(T.Lanes):
        result[i] = S.SignedSaturate(a[i]) # changed from b to a
    for i in range(T.Lanes):
        result[T.Lanes + i] = S.SignedSaturate(b[i]) # changed from a to b

Giving result = [a0, a1, a2, a3, b0, b1, b2, b3] (little endian view).

Does this make sense? Or am I still confused?

Copy link
Member Author

@tlively tlively Sep 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, makes sense. I didn't realize your comment in #103 was the register view, so I will have to update the pseudocode.

Apart from that, it looks like we disagree about which lanes are "low" and "high." This is reasonable because we did not use those terms to refer to lanes before adding these instructions. Your interpretation is that the "high" lanes have the greater indices, mine is that the "high" lanes should be on the left (by analogy to high-order bits).

Do people think "high" lanes should be on the right with greater indices (:tada:) or on the left with lesser indices (:heart:)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think high lanes should be on the left, analogy to high-order bits, when we look at it in terms of a 128-bit register. But then in memory, those end up being on the right part of an array.
So given a 128-bit register representing a i32x4:
[ 0x1 | 0x2 | 0x3 | 0x4 ]
The highest lane contain 0x1.

When represented as an array (little-endian):
arr = [0x4, 0x3, 0x2, 0x1], to keep the meaning of "high" consistent, it would be arr[3].

Another way I am thinking of this is: if we have lanes 0, 1, 2, and 3, which is the high lane? I would say that it is lane 3. And lane 3 happens to be 0x1 in the above example, which in pseudocode is arr[3].

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But consider 16-bit lanes 0, 1, 2, 3, 4, 5, 6, 7. Inside the x86 register lane 2 will be higher than lane 3 but lane 4 will be higher than both of them. That's not easy to remember or reason about. WebAssembly itself only ever uses the in-memory ordering of the vector bytes, so I don't think we should be considering the implementation architecture when deciding whether to describe lanes as high or low.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ngzhian reminded me that not only are the lanes "reversed" when going from little-endian memory to big-endian registers, so are their contents. Considering that everything is little-endian in memory, it's actually more consistent to say the the higher order lanes are on the right, just like the higher order bytes. I will update the PR accordingly.

Copy link
Contributor

@penzn penzn Sep 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For v128.load or v128.store lane indices are "big endian", zero lane is read from the location pointed to by the index argument, and highest lane is in the highest memory position (if index is x lane zero would be at x, lane one at x + lane_size, three - x + 2*lane_size, and so on). For multi-byte lanes, their contents are handled in little-endian way, storing 0x01020304 to a 32-bit lane results in byte sequence of 0x04, 0x03, 0x02, 0x1.

Shuffle also assigns lower lane indices (0-15) to the first argument and higher (16-32) - to the second. I think this semantics are correct.

Maybe we should add Python pseudocode to loads and stores :)

@tlively tlively merged commit fb7b68b into master Sep 13, 2019
@dtig dtig deleted the clarify-low-high branch September 13, 2019 21:58
Honry pushed a commit to Honry/simd that referenced this pull request Oct 19, 2019
Clarifies that low lanes are lanes [0, n/2) and high lanes are
[n/2, n) and that the order of arguments to the narrowing ops
are (low, high). This matches the current implementation in V8.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Narrowing operations are underspecified

4 participants