Should binary operations be defined on float types? #177
Comments
I'd be ok restricting them to integer types. We have neg and abs already, so I've lately been pondering proposing adding copySign as well, which would give us the complete IEEE-754 set of quiet computational operations. That would satisfy most of the use cases for floating-point and/or/xor/not that don't need to be done on integers for other reasons anyway.
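For reference, copySign is just a sign-bit splice, which is why it rounds out the set of quiet bit-level operations. Here's a minimal scalar sketch for float32 using a shared buffer; the function name and approach are illustrative, not part of the SIMD.js spec:

```javascript
// Hypothetical scalar copySign for float32: keep the magnitude bits of
// one value and take only the sign bit from the other.
const buf = new ArrayBuffer(8);
const f32 = new Float32Array(buf);
const u32 = new Uint32Array(buf);

function copySignFloat32(magnitude, sign) {
  f32[0] = magnitude;
  f32[1] = sign;
  // Low 31 bits (exponent + mantissa) from |magnitude|, bit 31 from |sign|.
  u32[0] = (u32[0] & 0x7fffffff) | (u32[1] & 0x80000000);
  return f32[0];
}

console.log(copySignFloat32(3.5, -1.0)); // -3.5
```

A SIMD version would do the same splice lane-wise, which is exactly the kind of operation that is awkward to express without either bitops on floats or an explicit bit-cast.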
I was the one requesting the logical operations on float32x4 (see #40). I found them useful when coding up the sinx4 benchmark. Looking at the code now, though, it seems that I ended up not using them after all.
Will any architecture permit an implementation faster than the polyfill? If not, maybe we should just leave it out, and let users compose the functions themselves.
Requiring the developer to explicitly do a bitCast from a float to an int before doing any bit operations seems reasonable. Existing users would just have to do: SIMD.int32x4.fromFloat32x4Bits(input). The JIT will strip away the bit-casting operations (no performance hit) and the code will be more readable.
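The cast-op-cast pattern described here can be sketched without the SIMD types at all, using plain typed arrays over a shared buffer to stand in for the bit casts. A classic use is abs via an AND that clears the sign bit (the function name and array-based lane representation are my own for illustration):

```javascript
// Illustrative stand-in for "bit-cast float32x4 to int32x4, AND, cast back".
// Clearing bit 31 of each lane with 0x7fffffff computes lane-wise abs.
function absFloat32x4(a) { // a: array of 4 numbers
  const ia = new Int32Array(new Float32Array(a).buffer);
  const out = new Int32Array(4);
  for (let i = 0; i < 4; i++) out[i] = ia[i] & 0x7fffffff; // clear sign bit
  return Array.from(new Float32Array(out.buffer));
}

console.log(absFloat32x4([-1.5, 2, -0, -8])); // [ 1.5, 2, 0, 8 ]
```

With explicit casts in the source, a JIT that recognizes the reinterpretation as free can still emit a single masked AND on the machine level.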
Well... I hope the JIT can strip it out, though with the current spec language, it'll have to do the sNaN->qNaN conversion.
Maybe this can be fixed just for casts while leaving it in for stores.
The sNaN->qNaN conversion is dead code, so the compiler can easily strip away the casts.
Good, so I'll just remove the float32 binary operations from the polyfill then?
@littledan We'll discuss it in the next call, but, likely, yes.
Note that performing an and on integer data is not the same, performance-wise, as performing an and on floating-point data. There is generally a one-cycle stall when moving data between execution ports. See https://software.intel.com/en-us/forums/topic/279382 and section 3.5.2.3 in http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-optimization-manual.html .
@juj Thanks for pointing this out; that sounds like a real issue. Do you know anything about the NEON behavior? Is the same instruction used on float32x4 and float64x2, removing the same cycle stall in both cases? I see a few ways to go from here:
Out of minimalism, I'd personally lean more towards the second or third option, but with more data I could be persuaded that it's best to add them back in. We are already planning on continuing to develop the SIMD spec (with the Universal API and the Long SIMD API), so it would not be so bad to add them into a later spec (and nothing stops implementations from having the functions before they are standardized). |
I think option 3 seems like the best path forward. I doubt this is going to be a performance issue in real world programs, and if I'm wrong, we can always add it back later. |
Don't want to be harsh or anything, but I don't want to play the "please prove this is a problem by finding a real world program which is hurt by this" game anymore. I find that the test cases I provide are rejected as synthetic and theoretical. I don't currently have a codebase at hand that would block on this performance limitation, and I doubt that I should spend many cycles harvesting one. When developing native SSE code, I can routinely observe the latency difference of routing float ops via the int pipe. If the bitops on floats were added, then the spec would be simpler and would allow removing the special-case notes "This operation exists only on integer and boolean SIMD types." here: http://littledan.github.io/simd.html#simd-and . The lack of bitops on floats is already worked around in Emscripten, at https://github.com/juj/emscripten/blob/sse2_emmintrin_h/system/include/emscripten/vector.h#L58 . In general, this seems to boil down to the same as #59.
@juj Optimizing for synthetic micro-benchmarks isn't a good approach. I'm impressed you can observe a fraction of a nanosecond (aka one cycle). I would love to see a real world program that demonstrates that. Anyways, I'm not sure how having an explicit cast at the JavaScript level would stop a motivated JIT from avoiding the cast at the machine level. It's an incredibly simple pattern to match. |
Sounds like we're resolved that we aren't including this. |
It seems that operations like and, or, not, etc are defined in the polyfill on float32x4 but not float64x2. Should they be defined on float types? Seems to me like they should be on both of those or neither. For float32x4 they are just defined as casting bits to int32x4, doing the binary operation, and casting back; seems to me like this would extend pretty cleanly to float64x2.
Naively, I'd suggest restricting these binary operations to integer types, but I don't have much background about use cases.
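To make the "extends pretty cleanly to float64x2" point concrete, here is a sketch of a float64x2 and defined the same way the polyfill defines it for float32x4: bit-cast to integer lanes, AND, bit-cast back. BigInt64Array stands in for a 64-bit integer lane type; the function name and array-based lane representation are mine, not from the polyfill:

```javascript
// Hypothetical float64x2 "and", mirroring the polyfill's float32x4
// definition: reinterpret the two 64-bit lanes as integers, AND them,
// and reinterpret back.
function andFloat64x2(a, b) { // a, b: arrays of 2 numbers
  const ia = new BigInt64Array(new Float64Array(a).buffer);
  const ib = new BigInt64Array(new Float64Array(b).buffer);
  const out = new BigInt64Array(2);
  for (let i = 0; i < 2; i++) out[i] = ia[i] & ib[i];
  return Array.from(new Float64Array(out.buffer));
}

// Example: ANDing a value's bits with themselves is the identity.
console.log(andFloat64x2([3.25, -7.5], [3.25, -7.5])); // [ 3.25, -7.5 ]
```

The only difference from the float32x4 version is the lane width, which supports the observation that the operations should exist on both float types or neither.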