Skip to content

Conversation

dcodeIO
Copy link
Member

@dcodeIO dcodeIO commented Feb 22, 2019

Edit: This escalated and is complete now.

@dcodeIO
Copy link
Member Author

dcodeIO commented Feb 22, 2019

Hmm, it might actually be easier to defer i64x2.splat -> v128.splat<i64> in order to share most of the codegen here, quite similar to what we do with i32.clz -> clz<i32> etc. Suggesting v128.* over simd.* because there might be other types at some point potentially.

@MaxGraey
Copy link
Member

I'm not sure this convention apply to rest operations

@dcodeIO
Copy link
Member Author

dcodeIO commented Feb 22, 2019

From the proposal it seems that the name of the operation, i.e. add, plus a single type parameter, i.e. <i64> (... i32, i16, i8, f32, f64) is sufficient to represent which instruction to pick when bound to the underlying vector type (here v128). One v128 split into i64s gives us 2 lanes etc. Correct me if I'm wrong :)

@MaxGraey
Copy link
Member

yeah, it seems for fixed length SIMD operations this correct)

@MaxGraey
Copy link
Member

@dcodeIO shuffling lanes operation could be also not trivial for "first good issue". wdyt?

@dcodeIO
Copy link
Member Author

dcodeIO commented Feb 23, 2019

Looks like that's essentially just a v128,v128->v128 binary plus an additional lane index like in extract_lane/replace_lane. Should be straight forward :)

@MaxGraey
Copy link
Member

Nope) It's v128,v128,imm[16]->v128. This instruction is encoded with 16 bytes providing the indices of the elements to return. The indices i in range [0, 15] select the i-th element of a. The indices in range [16, 31] select the i - 16-th element of b. imm is array of constants and compiler should check this

@dcodeIO
Copy link
Member Author

dcodeIO commented Feb 23, 2019

I understand. Well, actually I don't, so I guess you're right :)

@dcodeIO
Copy link
Member Author

dcodeIO commented Feb 23, 2019

Regarding shuffle it might also be convenient to provide typed versions like i64x2.shuffle as well (even though these are not in the spec), so v8x16.shuffle would essentially become an alias of i8x16.shuffle which itself would become a deferred call to v128.shuffle<T>. Wdyt?

@dcodeIO dcodeIO changed the title Implement v128 initializers / exemplary add Implement initial v128 instructions Feb 23, 2019
@MaxGraey
Copy link
Member

MaxGraey commented Feb 25, 2019

What do you think about add optional align attribute for all load and store builtins? I guess it may some meaning for SIMD's load/store like translate to movaps sse instruction when align==16 instead movups which could lead to significant performance difference. Probably just add boolean param will be enough load<T>(ptr: usize, offset: i32 = 0, aligned: bool = false).

@dcodeIO
Copy link
Member Author

dcodeIO commented Feb 26, 2019

For context: Currently all load/store ops use natural alignment, in the case of v128 that's 16 bytes. Natural alignment isn't displayed in text format iiuc. I must admit though that I don't fully understand the full implications of the alignment hint, and it appears to me to become relevant only when actually diverging from natural alignment?

@dcodeIO
Copy link
Member Author

dcodeIO commented Feb 26, 2019

One strange thing about the alignment hint is that the spec speaks of "an alignment hint (in base 2 logarithmic representation)", while Binaryen's CreateLoad takes the size in bytes (which is fine) but the text format also outputs the size in bytes. Is this expected? Asking because I'm not sure anymore whether the new constantAlign parameter on i32.load etc., which is modeled after the text format, should take log2 or size.

@dcodeIO
Copy link
Member Author

dcodeIO commented Feb 26, 2019

Oh, well .. Downloading and installing node v12.0.0-v8-canary201902266c298fffb9...

@MaxGraey
Copy link
Member

Yeah, it's alive)

@MaxGraey
Copy link
Member

It seems you need rebuild tests

@dcodeIO
Copy link
Member Author

dcodeIO commented Feb 26, 2019

10/10 would do if node/v8-canary would work on Windows :)

@kripken
Copy link

kripken commented Feb 26, 2019

(yeah, I think the binary has exponent of 2, while the text format is in bytes)

@dcodeIO
Copy link
Member Author

dcodeIO commented Feb 27, 2019

Alright, the set of instructions should be complete now, except no real docs on definitions and incomplete tests. Also fixed a few definition hiccups I came across.

Whoever feels like giving this a review, please do. I know, I know, it's boring to the max.

@dcodeIO dcodeIO changed the title Implement initial v128 instructions Implement v128 instructions Feb 27, 2019
@dcodeIO dcodeIO merged commit e1f1a3b into master Feb 28, 2019
@dcodeIO dcodeIO deleted the simd-init branch March 7, 2019 23:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants