-
-
Notifications
You must be signed in to change notification settings - Fork 673
[Feature] Implement TypedArray#set with tests #1002
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Tests are probably naive. Could be made better, but it appears to be functional. Any help/guidance would be appreciated. |
|
||
let setSource1: i32[] = [1, 2, 3]; | ||
let setSource2: f32[] = [4, 5, 6]; | ||
let setSource3: f64[] = [Infinity, NaN, -Infinity]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess need add i64[]
and typed array sources as well (with and without byte offsets)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. I'll add a few more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Came here looking for an i16->i8 test. Might make sense to have a signed and unsigned small int here, a signed and unsigned normal i32 and a signed and unsigned i64.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please forgive me for finally getting around to this. I added a few more tests and actually found an edge case involving Uint8ClampedArray
.
I would be happy to add some more tests if you'd like! I just need to get around to them tomorrow.
Can I get some feedback on the ones I put in?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good now with the additional Uint8Array and Int16Array cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now ready for review @MaxGraey
Thank you for all your help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MaxGraey Changes made. Thanks for your help.
Anything else I can do to complete this pull request? |
Looks good to me |
isFinite<U>(value) ? <T>max<U>(0, min<U>(255, value)) : 0 | ||
); | ||
} else { | ||
let value = load<U>(sourceDataStart + (<usize>i << alignof<U>())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wondering if we'll get more sign-extension/masking here than necessary, and if so whether this can be avoided by typing the intermediate local i/u32
. Like, if value
is typed u16
and we cast that to i8
, the explicit conversion will wrap, even though the store8
would already wrap.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you mean store<u8>(<u8>~( .... ))
is unnecessary and just need load<u8>(~(... ))
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly relevant if T
and U
have different sizes, which Wasm has distinct instructions for when loading and storing. The store<T>
builtin is more of a store<T>(<U>...)
. It evaluates the argument type in order to pick the right instruction, which implicitly wraps the value. So, if we explicitly convert <T>someU
that might already be implicit in the store, and any explicit wrapping being unnecessary. Didn't look at the fixtures yet.
Like, one can think about the stores below, as they are, as store<T>(<T>someU...)
, where store<T>(someU)
would already be sufficient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy to make changes for that. Just want to make sure I understand it before I pull changes from master and make a qualitative change to the code.
Can you help me understand this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe what happens in the T=i8, U=i16
case is that on value = load<i16>(...)
, the compiler remembers that value
is i16
and wrapped (because the load returns a properly wrapped value). Then, min<i16>(some, thing)
requires properly wrapped values to determine the minimum, and fortunately it's already wrapped so nothing to do there. After that, the value is cast to <T>
(i8
here), which wraps the i16 to an i8 discarding garbage bits (explicit cast does this), but that's unnecessary since what follows is a store<i8>(...)
translating into an i32.store8
that already discards garbage bits.
One first thing to test can be to simply remove the <T>
at the start of the store argument (in the int->int case), and see what happens.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow. @dcodeIO that was an incredible find. A typecast causing an optimization problem like that seems so obscure, I can't believe it.
Anyway, this should be fixed and brought current with master!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the offending <T>
, and this should at least be resolved from an implementation standpoint, if not from the testing standpoint. See above!
Can we mark this conversation as resolved?
store<T>(targetDataStart + (<usize>i << alignof<T>()), isFinite<U>(value) ? <T>value : 0); | ||
} else { | ||
// @ts-ignore: cast to T is valid for numeric types here | ||
store<T>(targetDataStart + (<usize>i << alignof<T>()), <T>load<U>(sourceDataStart + (<usize>i << alignof<U>()))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this is the int->int case, in which case the <T>
is unnecessary for the same reason. I believe the tests currently do not catch this due to missing small and long int cases. So it might make sense to add the tests first, and then compare what changes with <T>
removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
trace("hit!");
store<T>(targetDataStart + (<usize>i << alignof<T>()), <T>load<U>(sourceDataStart + (<usize>i << alignof<U>())));
This is the line in question yes?
- instantiate untouched OK (33.971 ms)
716 allocs, 716 frees, 2077 increments, 2077 decrements
[fun] __start = function 674() { [native code] }
[obj] memory = [object WebAssembly.Memory]
[start]
trace: Hit!
trace: Hit!
trace: Hit!
...
This line is hit quite regularly. I will simply remove the offending <T>
and see what happens.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it being hit with small int types for both T and U?
if (sizeof<U>() < 4) trace("hit U!");
if (sizeof<Z>() < 4) trace("hit Z!");
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. I have evidence that this optimization isn't too necessary, but it does seem to help.
This is only in the untouched
version, and the optimized version remains unchanged.
I had to change it so that the cast happens if the conversion is from integer to float, however, by adding another else if
branch to take advantage of this optimization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The i64
to native i32
(or smaller) and vice-versa cases are interesting for the optimizer because it can potentially remain on i32 or i64 exclusively, without a .wrap_i64
instruction for example.
Other interesting cases are i8
-> i32
, u8
-> i16
, u16
-> i8
because these involve sign extension/wrapping. That's what I'm trying to get to, as these aren't currently present in the tests.
Ideally the tests would try all possible combinations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. I'll talk with @MaxGraey to get the test cases all covered.
No description provided.