Skip to content

Conversation

jtenner
Copy link
Contributor

@jtenner jtenner commented Dec 11, 2019

No description provided.

@jtenner
Copy link
Contributor Author

jtenner commented Dec 11, 2019

Tests are probably naive. Could be made better, but it appears to be functional. Any help/guidance would be appreciated.


let setSource1: i32[] = [1, 2, 3];
let setSource2: f32[] = [4, 5, 6];
let setSource3: f64[] = [Infinity, NaN, -Infinity];
Copy link
Member

@MaxGraey MaxGraey Dec 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess need add i64[] and typed array sources as well (with and without byte offsets)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. I'll add a few more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Came here looking for an i16->i8 test. Might make sense to have a signed and unsigned small int here, a signed and unsigned normal i32 and a signed and unsigned i64.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please forgive me for finally getting around to this. I added a few more tests and actually found an edge case involving Uint8ClampedArray.

I would be happy to add some more tests if you'd like! I just need to get around to them tomorrow.

Can I get some feedback on the ones I put in?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good now with the additional Uint8Array and Int16Array cases.

Copy link
Contributor Author

@jtenner jtenner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now ready for review @MaxGraey

Thank you for all your help.

Copy link
Contributor Author

@jtenner jtenner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MaxGraey Changes made. Thanks for your help.

@jtenner
Copy link
Contributor Author

jtenner commented Dec 16, 2019

Anything else I can do to complete this pull request?

@MaxGraey
Copy link
Member

Looks good to me

isFinite<U>(value) ? <T>max<U>(0, min<U>(255, value)) : 0
);
} else {
let value = load<U>(sourceDataStart + (<usize>i << alignof<U>()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if we'll get more sign-extension/masking here than necessary, and if so whether this can be avoided by typing the intermediate local i/u32. Like, if value is typed u16 and we cast that to i8, the explicit conversion will wrap, even though the store8 would already wrap.

Copy link
Member

@MaxGraey MaxGraey Dec 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you mean store<u8>(<u8>~( .... )) is unnecessary and just need load<u8>(~(... ))?

Copy link
Member

@dcodeIO dcodeIO Dec 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly relevant if T and U have different sizes, which Wasm has distinct instructions for when loading and storing. The store<T> builtin is more of a store<T>(<U>...). It evaluates the argument type in order to pick the right instruction, which implicitly wraps the value. So, if we explicitly convert <T>someU that might already be implicit in the store, and any explicit wrapping being unnecessary. Didn't look at the fixtures yet.

Like, one can think about the stores below, as they are, as store<T>(<T>someU...), where store<T>(someU) would already be sufficient.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to make changes for that. Just want to make sure I understand it before I pull changes from master and make a qualitative change to the code.

Can you help me understand this?

Copy link
Member

@dcodeIO dcodeIO Dec 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe what happens in the T=i8, U=i16 case is that on value = load<i16>(...), the compiler remembers that value is i16 and wrapped (because the load returns a properly wrapped value). Then, min<i16>(some, thing) requires properly wrapped values to determine the minimum, and fortunately it's already wrapped so nothing to do there. After that, the value is cast to <T> (i8 here), which wraps the i16 to an i8 discarding garbage bits (explicit cast does this), but that's unnecessary since what follows is a store<i8>(...) translating into an i32.store8 that already discards garbage bits.

One first thing to test can be to simply remove the <T> at the start of the store argument (in the int->int case), and see what happens.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

Wow. @dcodeIO that was an incredible find. A typecast causing an optimization problem like that seems so obscure, I can't believe it.

Anyway, this should be fixed and brought current with master!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the offending <T>, and this should at least be resolved from an implementation standpoint, if not from the testing standpoint. See above!

Can we mark this conversation as resolved?

store<T>(targetDataStart + (<usize>i << alignof<T>()), isFinite<U>(value) ? <T>value : 0);
} else {
// @ts-ignore: cast to T is valid for numeric types here
store<T>(targetDataStart + (<usize>i << alignof<T>()), <T>load<U>(sourceDataStart + (<usize>i << alignof<U>())));
Copy link
Member

@dcodeIO dcodeIO Dec 18, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this is the int->int case, in which case the <T> is unnecessary for the same reason. I believe the tests currently do not catch this due to missing small and long int cases. So it might make sense to add the tests first, and then compare what changes with <T> removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trace("hit!");
store<T>(targetDataStart + (<usize>i << alignof<T>()), <T>load<U>(sourceDataStart + (<usize>i << alignof<U>())));

This is the line in question yes?

- instantiate untouched OK (33.971 ms)
  716 allocs, 716 frees, 2077 increments, 2077 decrements

  [fun] __start = function 674() { [native code] }
  [obj] memory = [object WebAssembly.Memory]

  [start]
  trace: Hit!
  trace: Hit!
  trace: Hit!
...

This line is hit quite regularly. I will simply remove the offending <T> and see what happens.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it being hit with small int types for both T and U?

if (sizeof<U>() < 4) trace("hit U!");
if (sizeof<Z>() < 4) trace("hit Z!");

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. I have evidence that this optimization isn't too necessary, but it does seem to help.

image

This is only in the untouched version, and the optimized version remains unchanged.

I had to change it so that the cast happens if the conversion is from integer to float, however, by adding another else if branch to take advantage of this optimization.

Copy link
Member

@dcodeIO dcodeIO Dec 18, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The i64 to native i32 (or smaller) and vice-versa cases are interesting for the optimizer because it can potentially remain on i32 or i64 exclusively, without a .wrap_i64 instruction for example.

Other interesting cases are i8 -> i32, u8 -> i16, u16 -> i8 because these involve sign extension/wrapping. That's what I'm trying to get to, as these aren't currently present in the tests.

Ideally the tests would try all possible combinations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. I'll talk with @MaxGraey to get the test cases all covered.

@dcodeIO dcodeIO merged commit 01fe0b4 into AssemblyScript:master Dec 27, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants