dft/simd/sse2 is compiled even when --disable-sse2 was given to configure #35
Comments
The following patch seems to help solve the problem: https://bugs.freebsd.org/bugzilla/attachment.cgi?id=153812&action=diff
The distinction between SSE and SSE2 by have_sse* is made at the configure level only. Everywhere else in the code, HAVE_SSE2 is enabled and the data size (single vs. double) is used to distinguish the two. Your patch makes --enable-sse a no-op, fully disabling SSE in single precision, which is probably not what you want. I can reproduce the error. This is a bug in clang 3.4, perhaps https://llvm.org/bugs/show_bug.cgi?id=16748 or similar. Everything works fine with gcc.
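As an illustration of the "data size distinguishes the two" idea, here is a minimal sketch. It is not FFTW's actual header: DEMO_SINGLE, demo_real, demo_isa and demo_lanes are hypothetical names invented for this example. It only shows how one code path can serve both instruction sets when a precision macro picks the element type.

```c
#include <assert.h>   /* for the checks in the usage note */
#include <stddef.h>
#include <string.h>   /* for the checks in the usage note */

/* Hypothetical precision switch (an assumption, not FFTW's macro):
 * compile with -DDEMO_SINGLE to model a single-precision build. */
#ifdef DEMO_SINGLE
typedef float demo_real;
#  define DEMO_ISA "SSE"    /* packed single-precision ops need only SSE */
#else
typedef double demo_real;
#  define DEMO_ISA "SSE2"   /* packed double-precision ops need SSE2 */
#endif

/* Name of the instruction set this build would rely on. */
const char *demo_isa(void) { return DEMO_ISA; }

/* Number of elements that fit in one 128-bit (16-byte) register. */
size_t demo_lanes(void) { return 16 / sizeof(demo_real); }
```

Built without -DDEMO_SINGLE, `strcmp(demo_isa(), "SSE2") == 0` and `demo_lanes() == 2` both hold; with -DDEMO_SINGLE the same source reports "SSE" and 4 lanes.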
My patch may well have been incorrect, but the problem is real -- fftw's code considers SSE and SSE2 to be the same, which is incorrect. Now, it may very well be that there are no SSE-only optimizations in the code and that hardware devoid of SSE2 instructions will have to use generic code. But if there is code that can be used on an SSE-only CPU (without SSE2), it should be possible to turn it on without also turning on SSE2.
I would not blame clang for barfing -- it is asked to generate SSE2 instructions while a command-line option ( Moreover, the fact that gcc "works" despite being presented with such a conflict may be considered a bug...
There is SSE code, and it doesn't enable SSE2. It just compiles file named
It doesn't 'barf', it throws an ICE (internal compiler error) and If you're still convinced there is an issue, compile the code with gcc (or Cordially,
I am very well aware that SSE and SSE2 are different things. Your code seems to consider them to be the same. Below is the line from your own
I see. So the distinction is made at preprocessing time... Yes, I see
It appears that the confusion is caused by our attempts to reduce confusion :-( Before we added AVX support, it used to be the case that --enable-sse meant "use sse instructions, single-precision only", and --enable-sse2 meant "use sse2 instructions, double-precision only". When we added --enable-avx, this flag necessarily meant "use avx instructions in either precision". For uniformity, we decided to unify treatment of sse and sse2, with the following rules:
We never thought about the contradictory specification --enable-sse --disable-sse2; it looks like the --enable side wins, causing the OP's issue. The OP's issue can be solved simply by not passing --enable-sse to the double-precision build. If the OP disagrees with this treatment, feel free to suggest an alternative, but please keep in mind that the current behavior is backward-compatible with the historical behavior (i.e., a double-precision build would have failed with --enable-sse in the past), and that avx is precision-agnostic, so a unified treatment is desirable.
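The "--enable side wins" behaviour can be pictured with a small model. This is only a sketch of the behaviour described in this thread, not FFTW's configure logic: the demo_flag type and demo_simd_enabled function are hypothetical, and the rule that either flag alone switches on the shared code path is an assumption drawn from the comment above.

```c
#include <assert.h>   /* for the checks in the usage note */

/* Hypothetical three-state model of a configure switch. */
typedef enum { DEMO_UNSET, DEMO_ENABLE, DEMO_DISABLE } demo_flag;

/* Assumed rule: the shared SSE/SSE2 code path is built as soon as
 * either flag is explicitly enabled; an explicit disable on the other
 * flag does not override that. */
int demo_simd_enabled(demo_flag sse, demo_flag sse2)
{
    return sse == DEMO_ENABLE || sse2 == DEMO_ENABLE;
}
```

Under this model, `demo_simd_enabled(DEMO_ENABLE, DEMO_DISABLE)` is 1, which mirrors the reported case, while `demo_simd_enabled(DEMO_UNSET, DEMO_DISABLE)` is 0, which mirrors the suggested workaround of not passing --enable-sse to the double-precision build.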
In all cases the suffix for the codelets is "_sse2", adding to the
Cordially, Romain Dolbeau
The build -- both in 3.3.3 and 3.3.4 -- attempts to compile the content of the sse2-directory even when configure was explicitly asked to disable sse2:
When the -march argument is set to a CPU that has no SSE2 instructions (such as "athlon-xp"), some compilers -- such as clang -- fail:
Looking into the configure script I see the following line:
Huh?
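For readers unfamiliar with this failure mode: code that uses SSE2 intrinsics only makes sense when the compiler is targeting a CPU with SSE2. One common way to keep a source file buildable either way is to guard the intrinsics on `__SSE2__`, a macro that gcc and clang predefine when SSE2 code generation is enabled. The sketch below is illustrative only and is not taken from FFTW; demo_add is a hypothetical function.

```c
#include <assert.h>   /* for the checks in the usage note */
#include <stddef.h>

#if defined(__SSE2__)
#  include <emmintrin.h>   /* SSE2 intrinsics (packed doubles) */
#endif

/* Element-wise sum: out[i] = a[i] + b[i] for i in [0, n). */
void demo_add(const double *a, const double *b, double *out, size_t n)
{
    size_t i = 0;
#if defined(__SSE2__)
    /* Two doubles per 128-bit register; unaligned loads and stores
     * so the caller's arrays need no special alignment. */
    for (; i + 2 <= n; i += 2) {
        __m128d va = _mm_loadu_pd(a + i);
        __m128d vb = _mm_loadu_pd(b + i);
        _mm_storeu_pd(out + i, _mm_add_pd(va, vb));
    }
#endif
    /* Scalar tail, and the whole loop when SSE2 is unavailable. */
    for (; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

With a target such as -march=athlon-xp the guarded block is simply skipped and the scalar loop does all the work, so the same file compiles with or without SSE2.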