Instructions that can be emulated by other instructions #68

bnjbvr · 2014-09-23T09:28:34Z

I don't follow the rationale behind these 4 functions (SIMD.int32x4.withFlag{X,Y,Z,W}):

- They don't map to a single instruction on x86, x64, or ARM.
- Their behavior can be emulated with:

    // !!val | 0 is 0 or 1, so the lane becomes 0 or -1 (all bits set)
    SIMD.int32x4.withFlagX = function(vec, val) {
        return SIMD.int32x4.withX(vec, 0 - (!!val | 0));
    }
    // or (less optimized)
    SIMD.int32x4.withFlagX = function(vec, val) {
        return SIMD.int32x4.withX(vec, val ? 0xFFFFFFFF : 0x0);
    }

Unless there is an obvious use case I am missing, could we delete this set of functions from the spec?
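
To check that the two emulations agree, here is a minimal sketch that runs without any SIMD support. It assumes an `Int32Array` of length 4 as a stand-in for `SIMD.int32x4`; `int32x4`, `withX`, `withFlagXArith`, and `withFlagXSelect` are hypothetical helpers written for this illustration, not the spec's functions.

```javascript
// Hypothetical stand-in for SIMD.int32x4: four int32 lanes in an Int32Array.
function int32x4(x, y, z, w) {
  return Int32Array.of(x, y, z, w); // values are wrapped to int32 on store
}

// Stand-in for SIMD.int32x4.withX: copy the vector and replace lane 0.
function withX(vec, val) {
  const out = Int32Array.from(vec);
  out[0] = val;
  return out;
}

// First emulation: 0 - (!!val | 0) evaluates to 0 or -1.
function withFlagXArith(vec, val) {
  return withX(vec, 0 - (!!val | 0));
}

// Second emulation: 0xFFFFFFFF wraps to -1 when stored in an int32 lane.
function withFlagXSelect(vec, val) {
  return withX(vec, val ? 0xFFFFFFFF : 0x0);
}

const v = int32x4(1, 2, 3, 4);
const a = withFlagXArith(v, true);
const b = withFlagXSelect(v, true);
console.log(Array.from(a), Array.from(b));
```

Both variants leave lane 0 with all bits set for a truthy `val` and leave the other lanes untouched, which is the behavior being emulated.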


bnjbvr · 2014-09-25T14:09:23Z

Actually, the issue is broader:

- SIMD.type.zero() is equivalent to SIMD.type(0,..0). For float32x4 and int32x4 it spares 2 chars, but for float64x2 it wastes 5 chars. If the reason to have it in the spec is that it maps directly to x86 instructions (xorps, xorpd, pxor), shouldn't any good JIT compiler be able to recognize SIMD literals and take an optimized path to generate the equivalent code?
- SIMD.float32x4.clamp(x, low, high) is equivalent to SIMD.float32x4.min(SIMD.float32x4.max(x, low), high). SIMD implementations tend to use minps and maxps for this one, so what's the rationale behind this operator?
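
The two equivalences can be sketched lane by lane. This is only an illustration: a `Float32Array` of length 4 stands in for `SIMD.float32x4`, and `float32x4`, `splat`, `lanewise`, `min`, `max`, `zero`, and `clamp` are hypothetical helpers defined here, with `Math.min` and `Math.max` assumed as the per-lane operations.

```javascript
// Hypothetical lane-wise helpers; a Float32Array of length 4 stands in for
// SIMD.float32x4. None of these definitions come from the spec.
const float32x4 = (x, y, z, w) => Float32Array.of(x, y, z, w);
const splat = (s) => float32x4(s, s, s, s);
const lanewise = (f) => (a, b) => Float32Array.from(a, (ai, i) => f(ai, b[i]));
const min = lanewise(Math.min);
const max = lanewise(Math.max);

// zero() is just the all-zero literal.
const zero = () => float32x4(0, 0, 0, 0);

// clamp(x, low, high) written as the min(max(...)) composition.
const clamp = (x, low, high) => min(max(x, low), high);

const clamped = clamp(float32x4(-2, 0.5, 3, 1), zero(), splat(1));
console.log(Array.from(clamped));
```

Here each lane is clamped to the range [0, 1]: lanes below the range become `low`, lanes above it become `high`, and the rest pass through.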

ghost · 2014-09-25T14:13:58Z

If I understand correctly, we have three different cases:

1. withFlagX - does not have a single target instruction to be optimized to
2. zero - should be just as optimizable as typex4(0,0,0,0)
3. clamp - guaranteed optimization, whereas the min(max) pattern could potentially be missed (maybe the user writes it obtusely, maybe GVN breaks it apart accidentally)

My gut feeling is that 3 deserves a builtin function but 1 and 2 don't.

bnjbvr · 2014-09-25T16:11:41Z

To these 3 instructions, we can add a fourth one: float32x4.scale. It doesn't have any SSE2 counterpart and it can be implemented as SIMD.float32x4.mul(SIMD.float32x4.splat(scalar), vec).
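
A minimal sketch of that emulation, again using a `Float32Array` of length 4 as a stand-in for `SIMD.float32x4`. The helpers `float32x4`, `splat`, `mul`, and `scale` are hypothetical and defined only for this example.

```javascript
// Hypothetical stand-ins for the SIMD.float32x4 operations named above.
const float32x4 = (x, y, z, w) => Float32Array.of(x, y, z, w);
const splat = (s) => float32x4(s, s, s, s);

// Lane-wise multiply; the store into a Float32Array rounds to float32.
const mul = (a, b) => Float32Array.from(a, (ai, i) => ai * b[i]);

// scale(vec, scalar) expressed as mul(splat(scalar), vec).
const scale = (vec, scalar) => mul(splat(scalar), vec);

const scaled = scale(float32x4(1, 2, 3, 4), 2);
console.log(Array.from(scaled));
```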

bnjbvr · 2014-10-17T13:29:40Z

The same way, do we need:

- int32x4.bool? It can be emulated with int32x4(0xFFFFFFFF, 0x0, etc); afaik there is no matching SSE2 instruction.
- int32x4.flag{X,Y,Z,W}? They can be emulated with (int32x4.X != 0x0); afaict there is no matching SSE2 instruction.
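
Both emulations are short enough to sketch directly. The code below assumes an `Int32Array` of length 4 in place of `SIMD.int32x4`; `int32x4`, `bool`, and `flagX` are hypothetical helpers for illustration, not the spec's definitions.

```javascript
// Hypothetical stand-in for SIMD.int32x4: four int32 lanes.
const int32x4 = (x, y, z, w) => Int32Array.of(x, y, z, w);

// bool(x, y, z, w): each truthy argument becomes an all-ones lane
// (0xFFFFFFFF, which reads back as -1), each falsy one becomes 0.
const bool = (x, y, z, w) =>
  int32x4(x ? 0xFFFFFFFF : 0x0, y ? 0xFFFFFFFF : 0x0,
          z ? 0xFFFFFFFF : 0x0, w ? 0xFFFFFFFF : 0x0);

// flagX(vec): lane 0 compared against zero.
const flagX = (vec) => vec[0] !== 0x0;

const mask = bool(true, false, true, false);
console.log(Array.from(mask), flagX(mask));
```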

bnjbvr · 2014-10-17T13:30:28Z

Also, if we keep clamp, how should it behave with respect to NaN? And with -0 / +0?
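
To make the question concrete, here is a scalar sketch of one lane under two candidate semantics. The first uses `Math.min` and `Math.max`. The second models `minps`/`maxps` as a plain comparison that returns the second operand when either input is NaN or when both are zeros; that model is an assumption of this sketch, and the names `sseMin`, `sseMax`, `clampMath`, and `clampSse` are hypothetical.

```javascript
// Candidate 1: JavaScript's Math.min/Math.max on a single lane.
const clampMath = (x, low, high) => Math.min(Math.max(x, low), high);

// Candidate 2: a model of minps/maxps on a single lane (assumed behavior:
// the second operand wins when the comparison is false, which covers
// NaN inputs and equal zeros of either sign).
const sseMin = (a, b) => (a < b ? a : b);
const sseMax = (a, b) => (a > b ? a : b);
const clampSse = (x, low, high) => sseMin(sseMax(x, low), high);

// NaN input: Math-based clamp propagates NaN, the modeled variant returns low.
console.log(clampMath(NaN, 0, 1), clampSse(NaN, 0, 1));

// Signed zeros: clamping +0 with low = -0 gives different signs.
console.log(Object.is(clampMath(0, -0, 1), 0), Object.is(clampSse(0, -0, 1), -0));
```

Under the second model the result also depends on operand order, since swapping the arguments of `sseMax` changes which operand is returned for NaN, so a spec for clamp would need to pin both points down.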

bnjbvr changed the title ~~SIMD.int32x4.withFlagX and such~~ Instructions that can be emulated by other instructions Sep 25, 2014

johnmccutchan closed this as completed in b85f7aa Oct 31, 2014
