Perform runtime check for SSE 4.2 instructions #795
Detecting earlier is better. Running the check on each parse is expensive. I'd rather the test was done when the Makefile is generated. There is a check in the current extconf.rb. Is that not working? If not I could use some help getting that to work on some of the older machines you might have access to. Is that possible?
If SSE 4.2 is not available on the system, don't attempt to use SIMD instructions. Relates to ohler55#789
Doesn't this change only do the detection when the module is initialized?
We package a build made on a platform with SSE 4.2, but some users run the code on older hardware. We either need a way to disable the SSE instructions outright, or make the check dynamic. TensorFlow behaves the same way: they ship a Python wheel that can use SSE instructions when available.
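For reference, a minimal sketch of what a check performed once at module initialization can look like. It assumes GCC or Clang on x86, whose `__builtin_cpu_init` and `__builtin_cpu_supports` builtins query the CPU at runtime. The names `simd_sse42_ok`, `detect_sse42`, `scan`, and `scan_scalar` are hypothetical and are not taken from Oj's source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag, set once at load time; not Oj's actual identifier. */
static bool simd_sse42_ok = false;

/* Assumed to be called once from the extension's init function. */
static void detect_sse42(void) {
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init(); /* populate the compiler's CPU feature table */
    simd_sse42_ok = __builtin_cpu_supports("sse4.2") != 0;
#else
    simd_sse42_ok = false; /* other compilers or non-x86: stay on the scalar path */
#endif
}

/* Portable fallback: index of the first '"' or '\\', or len if none is found. */
static size_t scan_scalar(const char *s, size_t len) {
    assert(s != NULL);
    for (size_t i = 0; i < len; i++) {
        if (s[i] == '"' || s[i] == '\\') return i;
    }
    return len;
}

/* Per-parse cost is a single branch on the cached flag. */
static size_t scan(const char *s, size_t len) {
    if (simd_sse42_ok) {
        /* An SSE 4.2 routine would be called here; omitted in this sketch. */
    }
    return scan_scalar(s, len);
}
```

Because the flag is cached, the CPU query runs once per process rather than on each parse.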
I see, looks like what you have should be good then. I'll merge once tests are done and cut a release.
Released
Compiling with `-msse4.2` is problematic because it may cause a number of other SSE instructions to be used. For example:

```
$ ./elfx86exts oj.so
MODE64 (call)
CMOV (cmovne)
SSE2 (movq)
SSE41 (pinsrq)
SSE1 (movups)
SSSE3 (pshufb)
SSE3 (movddup)
SSE42 (pcmpestri)
CPU Generation: Unknown
```

The runtime check in ohler55#795 is therefore not sufficient to prevent SSE instructions from leaking elsewhere in the shared library. As a result, loading Oj on older machines with these libraries will result in an "illegal instruction" crash.

Introduce a `--with-sse42` flag that can be used to configure this gem.
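A rough sketch of how C code can combine the build-time choice with the runtime check, assuming the `--with-sse42` option ends up passing `-msse4.2` to the compiler (GCC and Clang define `__SSE4_2__` in that case). The names `scan_mode` and `choose_scan_mode` are hypothetical. Note that this guard only decides which path the code selects; as described above, a build made with `-msse4.2` may still contain compiler-generated SSE instructions elsewhere in the library.

```c
#include <assert.h>

/* Hypothetical names for illustration; not taken from Oj. */
typedef enum { SCAN_SCALAR = 0, SCAN_SSE42 = 1 } scan_mode;

/* cpu_has_sse42 is the result of a runtime CPU check (0 or 1). */
static scan_mode choose_scan_mode(int cpu_has_sse42) {
    assert(cpu_has_sse42 == 0 || cpu_has_sse42 == 1);
#if defined(__SSE4_2__)
    /* Built with -msse4.2: the SIMD path exists, use it only if the CPU agrees. */
    return cpu_has_sse42 ? SCAN_SSE42 : SCAN_SCALAR;
#else
    /* Built without -msse4.2: no SSE 4.2 path is compiled in at all. */
    (void)cpu_has_sse42;
    return SCAN_SCALAR;
#endif
}
```

With this arrangement, a default build never selects the SIMD path, and an opt-in build still falls back to scalar code when the CPU lacks SSE 4.2.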