Build failures on OS X #28

zmwangx · 2016-07-18T00:13:55Z

This is a continuation of #11. I'm opening a new issue because:

1. The old failures due to SSE 4.1 appear to have been fixed in 1.1, while a new one surfaced;
2. The old thread was slightly polluted by pointless arguments.

Again, the failures occur only on Homebrew's CI server, not locally. Builds on OS X 10.9 and 10.10 now pass, but there is still a problem on 10.11, log here:

/usr/local/Library/Homebrew/shims/super/clang++    -I/tmp/lepton-20160718-65941-texea9/lepton-1.2 -I/tmp/lepton-20160718-65941-texea9/lepton-1.2/src/vp8/util -I/tmp/lepton-20160718-65941-texea9/lepton-1.2/src/vp8/model -I/tmp/lepton-20160718-65941-texea9/lepton-1.2/src/vp8/encoder -I/tmp/lepton-20160718-65941-texea9/lepton-1.2/src/vp8/decoder  -std=c++11 -fno-exceptions -fno-rtti -DNDEBUG -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk -mmacosx-version-min=10.11   -msse4.2   -DDEFAULT_ALLOW_PROGRESSIVE -DHIGH_MEMORY -o CMakeFiles/lepton.dir/src/lepton/jpgcoder.cc.o -c /tmp/lepton-20160718-65941-texea9/lepton-1.2/src/lepton/jpgcoder.cc
In file included from /tmp/lepton-20160718-65941-texea9/lepton-1.2/src/lepton/jpgcoder.cc:64:
In file included from /tmp/lepton-20160718-65941-texea9/lepton-1.2/src/lepton/vp8_decoder.hh:4:
In file included from /tmp/lepton-20160718-65941-texea9/lepton-1.2/src/lepton/lepton_codec.hh:4:
In file included from /tmp/lepton-20160718-65941-texea9/lepton-1.2/src/vp8/model/model.hh:10:
/tmp/lepton-20160718-65941-texea9/lepton-1.2/src/vp8/model/numeric.hh:295:32: error: call to '_mm_mullo_epi32' is ambiguous
    __m128i t = _mm_srli_epi32(_mm_mullo_epi32(m, abs_num), log_max_numerator);
                               ^~~~~~~~~~~~~~~
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/7.3.0/include/smmintrin.h:130:1: note: candidate function
_mm_mullo_epi32 (__m128i __V1, __m128i __V2)
^
/tmp/lepton-20160718-65941-texea9/lepton-1.2/src/vp8/model/../util/mm_mullo_epi32.hh:38:1: note: candidate function
_mm_mullo_epi32(const __m128i &a, const __m128i &b)
^
In file included from /tmp/lepton-20160718-65941-texea9/lepton-1.2/src/lepton/jpgcoder.cc:64:
In file included from /tmp/lepton-20160718-65941-texea9/lepton-1.2/src/lepton/vp8_decoder.hh:4:
In file included from /tmp/lepton-20160718-65941-texea9/lepton-1.2/src/lepton/lepton_codec.hh:4:
In file included from /tmp/lepton-20160718-65941-texea9/lepton-1.2/src/vp8/model/model.hh:10:
/tmp/lepton-20160718-65941-texea9/lepton-1.2/src/vp8/model/numeric.hh:304:32: error: call to '_mm_mullo_epi32' is ambiguous
    __m128i t = _mm_srli_epi32(_mm_mullo_epi32(m, num), log_max_numerator);
                               ^~~~~~~~~~~~~~~
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/7.3.0/include/smmintrin.h:130:1: note: candidate function
_mm_mullo_epi32 (__m128i __V1, __m128i __V2)
^
/tmp/lepton-20160718-65941-texea9/lepton-1.2/src/vp8/model/../util/mm_mullo_epi32.hh:38:1: note: candidate function
_mm_mullo_epi32(const __m128i &a, const __m128i &b)
^
In file included from /tmp/lepton-20160718-65941-texea9/lepton-1.2/src/lepton/jpgcoder.cc:64:
In file included from /tmp/lepton-20160718-65941-texea9/lepton-1.2/src/lepton/vp8_decoder.hh:4:
In file included from /tmp/lepton-20160718-65941-texea9/lepton-1.2/src/lepton/lepton_codec.hh:4:
/tmp/lepton-20160718-65941-texea9/lepton-1.2/src/vp8/model/model.hh:903:27: error: call to '_mm_mullo_epi32' is ambiguous
        __m128i deq_low = _mm_mullo_epi32(coeffs_x_low, icos_low);
                          ^~~~~~~~~~~~~~~
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/7.3.0/include/smmintrin.h:130:1: note: candidate function
_mm_mullo_epi32 (__m128i __V1, __m128i __V2)
^
/tmp/lepton-20160718-65941-texea9/lepton-1.2/src/vp8/model/../util/mm_mullo_epi32.hh:38:1: note: candidate function
_mm_mullo_epi32(const __m128i &a, const __m128i &b)
^
In file included from /tmp/lepton-20160718-65941-texea9/lepton-1.2/src/lepton/jpgcoder.cc:64:
In file included from /tmp/lepton-20160718-65941-texea9/lepton-1.2/src/lepton/vp8_decoder.hh:4:
In file included from /tmp/lepton-20160718-65941-texea9/lepton-1.2/src/lepton/lepton_codec.hh:4:
/tmp/lepton-20160718-65941-texea9/lepton-1.2/src/vp8/model/model.hh:904:28: error: call to '_mm_mullo_epi32' is ambiguous
        __m128i deq_high = _mm_mullo_epi32(coeffs_x_high, icos_high);
                           ^~~~~~~~~~~~~~~
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/7.3.0/include/smmintrin.h:130:1: note: candidate function
_mm_mullo_epi32 (__m128i __V1, __m128i __V2)
^
/tmp/lepton-20160718-65941-texea9/lepton-1.2/src/vp8/model/../util/mm_mullo_epi32.hh:38:1: note: candidate function
_mm_mullo_epi32(const __m128i &a, const __m128i &b)
^

Build environment:

CPU: quad-core 64-bit ivybridge
OS X: 10.11.5-x86_64
Xcode: 7.3.1
CLT: 7.3.1.0.1.1461711523
Clang: 7.3 build 703

EDIT: The log above is for v1.2 (08c52d9).

danielrh · 2016-07-18T00:17:44Z

fascinating!
Could you try a quick test: find/replace all instances of _mm_mullo_epi32 in the repo with something else, like vec_multiply,

then get rid of all the ifdefs around
https://github.com/dropbox/lepton/blob/master/src/vp8/util/mm_mullo_epi32.hh

If that fixes it, then we at least have an idea of a path forward, even if it causes optimized architectures to be slower.

danielrh · 2016-07-18T00:20:09Z

My guess is actually that the #ifdef guards around
https://github.com/dropbox/lepton/blob/master/src/vp8/util/mm_mullo_epi32.hh
are too generous...and that function is being allowed into a build that doesn't need the (new) fallback code

zmwangx · 2016-07-18T00:27:38Z

Sorry, since I'm not familiar with this sort of low level code, let me ask one stupid question: what header(s) do I need to include for vec_multiply?

danielrh · 2016-07-18T00:40:43Z

The problem now seems to be that the fallback function is interfering with the system _mm_mullo_epi32.
So if you just rename all uses, the build will use the fallback function no matter what, and the naming conflict disappears.
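
For readers following along, the rename idea can be sketched roughly as below. This is not lepton's actual code: only the name vec_multiply comes from this thread, while the Lanes4 wrapper type, the VEC_MULTIPLY_PATH macro, and the self-check function are hypothetical names invented so the example compiles on its own (lepton's real code operates on __m128i directly). Because the wrapper has its own name, it can never collide with the compiler's _mm_mullo_epi32 overload, whichever path is compiled in.

```cpp
#include <cassert>
#include <cstdint>

// Pick an implementation from the compiler's predefined feature macros.
#if defined(__SSE4_1__)
#include <smmintrin.h>          // provides the native _mm_mullo_epi32
#define VEC_MULTIPLY_PATH 2
#elif defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>          // SSE2 only: emulate the 32-bit multiply
#define VEC_MULTIPLY_PATH 1
#else
#define VEC_MULTIPLY_PATH 0     // no x86 SIMD: plain scalar loop
#endif

// Hypothetical 4-lane value type, used only to keep the sketch portable.
struct Lanes4 { std::int32_t v[4]; };

// Lane-wise 32-bit multiply that keeps the low 32 bits of each product.
inline Lanes4 vec_multiply(const Lanes4& a, const Lanes4& b) {
    Lanes4 r;
#if VEC_MULTIPLY_PATH >= 1
    const __m128i x = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a.v));
    const __m128i y = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b.v));
#if VEC_MULTIPLY_PATH == 2
    const __m128i p = _mm_mullo_epi32(x, y);
#else
    // _mm_mul_epu32 multiplies lanes 0 and 2 into two 64-bit products.
    const __m128i even = _mm_mul_epu32(x, y);
    // Shift lanes 1 and 3 down into positions 0 and 2, then do the same.
    const __m128i odd =
        _mm_mul_epu32(_mm_srli_si128(x, 4), _mm_srli_si128(y, 4));
    // Gather the low 32 bits of each product and interleave them back
    // into lane order 0, 1, 2, 3.
    const __m128i p = _mm_unpacklo_epi32(
        _mm_shuffle_epi32(even, _MM_SHUFFLE(0, 0, 2, 0)),
        _mm_shuffle_epi32(odd, _MM_SHUFFLE(0, 0, 2, 0)));
#endif
    _mm_storeu_si128(reinterpret_cast<__m128i*>(r.v), p);
#else
    for (int i = 0; i < 4; ++i) {
        // Multiply as unsigned to avoid signed-overflow undefined behavior.
        r.v[i] = static_cast<std::int32_t>(
            static_cast<std::uint32_t>(a.v[i]) *
            static_cast<std::uint32_t>(b.v[i]));
    }
#endif
    return r;
}

// Small self-check covering negative values and a product that wraps.
inline void vec_multiply_self_check() {
    const Lanes4 a = {{1, -2, 3, 65536}};
    const Lanes4 b = {{5, 6, -7, 65536}};
    const Lanes4 r = vec_multiply(a, b);
    assert(r.v[0] == 5 && r.v[1] == -12 && r.v[2] == -21 && r.v[3] == 0);
}
```

Using the fallback unconditionally, as in the quick test proposed above, corresponds to always taking the SSE2 branch. Forwarding to the native intrinsic when __SSE4_1__ is defined is one possible way to avoid giving up speed, assuming the compiler sets that macro reliably.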

zmwangx · 2016-07-18T01:00:08Z

Ah I see, sorry for being dumb... I applied this patch and let's wait for the build server to catch up.

zmwangx · 2016-07-18T01:34:45Z

@danielrh Results in. Indeed, if _mm_mullo_epi32 is renamed to vec_multiply and the latter is used unconditionally then everything passes.

danielrh · 2016-07-18T01:40:00Z

Fascinating. Now it would be too bad to trade off speed for this but it might be good enough for a starting version

zmwangx · 2016-07-18T01:55:36Z

👍

> My guess is actually that the #ifdef guards around
> https://github.com/dropbox/lepton/blob/master/src/vp8/util/mm_mullo_epi32.hh
> are too generous...and that function is being allowed into a build that doesn't need the (new) fallback code

That's true. Clang 7.3.0's smmintrin.h is here in case you're interested.

With a cursory glance I don't see how the #ifdef can be fixed, but what about manually checking for the system _mm_mullo_epi32 in configure.ac and CMakeLists.txt?

danielrh · 2016-07-18T02:03:34Z

That could work... It's a lot of annoyance to maintain that kind of check in cmake but I think it may not happen there

zmwangx · 2016-07-18T02:08:13Z

> It's a lot of annoyance to maintain that kind of check in cmake

Yeah I know...

> but I think it may not happen there

By "may not happen there" you mean?

danielrh · 2016-07-18T02:22:35Z

cmake doesn't use -march=native so maybe it does a better job running on the build servers

zmwangx · 2016-07-18T02:24:34Z

I was using CMake on the build servers all along

$ cmake . -DCMAKE_C_FLAGS_RELEASE=-DNDEBUG -DCMAKE_CXX_FLAGS_RELEASE=-DNDEBUG -DCMAKE_INSTALL_PREFIX=/usr/local/Cellar/lepton/1.2 -DCMAKE_BUILD_TYPE=Release -DCMAKE_FIND_FRAMEWORK=LAST -DCMAKE_VERBOSE_MAKEFILE=ON -Wno-dev

so the errors do happen with CMake.

danielrh · 2016-07-18T02:59:03Z

Hmm I tried making you a branch that does things "right" on OSX...
That branch is called osx_hack; if it works we can merge to master. It should use the good math functions.
I found on this list that OSX is not guaranteed to provide the needed macros (though 10.10 does)
https://software.intel.com/en-us/node/514528
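
One way to check which feature macros a given build server's compiler actually predefines is to compile a tiny helper with the same flags as the real build. The sketch below is illustrative only and is not part of lepton; the function names defined_simd_macros and print_simd_macros are made up for this example.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Returns a space-separated list of the SIMD feature macros that the
// compiler predefined for this translation unit, or "(none)".
inline std::string defined_simd_macros() {
    std::string out;
#if defined(__SSE2__)
    out += "__SSE2__ ";
#endif
#if defined(__SSE4_1__)
    out += "__SSE4_1__ ";
#endif
#if defined(__SSE4_2__)
    out += "__SSE4_2__ ";
#endif
#if defined(__AVX__)
    out += "__AVX__ ";
#endif
#if defined(__AVX2__)
    out += "__AVX2__ ";
#endif
    return out.empty() ? std::string("(none)") : out;
}

// Prints the list; intended to be called once when debugging a CI build.
inline void print_simd_macros() {
    const std::string macros = defined_simd_macros();
    assert(!macros.empty());  // the helper never returns an empty string
    std::printf("SIMD macros: %s\n", macros.c_str());
}
```

Building this with and without a flag such as -msse4.2 should show whether the flag changes the set of defined macros on that machine, which is the information the #ifdef guards depend on.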

danielrh · 2016-07-18T03:13:16Z

https://github.com/dropbox/lepton/tree/osx_hack

zmwangx · 2016-07-18T03:21:01Z

I haven't applied the latest commit, but 7e1a155 alone results in something weird on 10.11 and 10.10:

/usr/local/Library/Homebrew/shims/super/clang++    -I/tmp/lepton-20160718-51992-ij3ypr/lepton-1.2 -I/tmp/lepton-20160718-51992-ij3ypr/lepton-1.2/src/vp8/util -I/tmp/lepton-20160718-51992-ij3ypr/lepton-1.2/src/vp8/model -I/tmp/lepton-20160718-51992-ij3ypr/lepton-1.2/src/vp8/encoder -I/tmp/lepton-20160718-51992-ij3ypr/lepton-1.2/src/vp8/decoder  -std=c++11 -fno-exceptions -fno-rtti -DNDEBUG   -march=core-avx2 -D__SSE4_1__=1 -D__SSE4_2__=1 -D__AVX2__=1 -D__AVX__=1   -DDEFAULT_ALLOW_PROGRESSIVE -DHIGH_MEMORY -o CMakeFiles/lepton-avx.dir/src/lepton/recoder.cc.o -c /tmp/lepton-20160718-51992-ij3ypr/lepton-1.2/src/lepton/recoder.cc
fatal error: error in backend: Do not know how to split this operator's operand!

http://bot.brew.sh/job/Homebrew%20Core%20Pull%20Requests/5020/version=mavericks/testReport/junit/brew-test-bot/mavericks/install_lepton/
http://bot.brew.sh/job/Homebrew%20Core%20Pull%20Requests/5020/version=yosemite/testReport/junit/brew-test-bot/yosemite/install_lepton/

And something else strikes back in 10.9:
http://bot.brew.sh/job/Homebrew%20Core%20Pull%20Requests/5020/version=mavericks/testReport/junit/brew-test-bot/mavericks/install_lepton/

danielrh · 2016-07-18T03:40:46Z

The plot thickens

try osx_hack2 if the other commit fails?

danielrh · 2016-07-18T03:41:39Z

also: I can't seem to access the logs--it appears to want some sort of access that I can't grant

zmwangx · 2016-07-18T03:44:43Z

> it appears to want some sort of access that I can't grant

Ah yes, IIRC bot.brew.sh needs to read your organization membership to determine if you're a Homebrew maintainer, and enable more functionality if you are.

If that's not okay for you, I'll upload the logs to gists.

danielrh · 2016-07-18T03:45:28Z

that would be ideal: thank you!

zmwangx · 2016-07-18T04:08:47Z

With both commits, on 10.9 and 10.10 I get the command line parsing error, and on 10.11 I get

/usr/local/Library/Homebrew/shims/super/clang++    -I/tmp/lepton-20160718-13240-ozse7s/lepton-1.2 -I/tmp/lepton-20160718-13240-ozse7s/lepton-1.2/src/vp8/util -I/tmp/lepton-20160718-13240-ozse7s/lepton-1.2/src/vp8/model -I/tmp/lepton-20160718-13240-ozse7s/lepton-1.2/src/vp8/encoder -I/tmp/lepton-20160718-13240-ozse7s/lepton-1.2/src/vp8/decoder  -std=c++11 -fno-exceptions -fno-rtti -DNDEBUG -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk -mmacosx-version-min=10.11   -march=core-avx2 -D__SSE4_1__=1 -D__SSE4_2__=1 -D__AVX2__=1 -D__AVX__=1   -DDEFAULT_ALLOW_PROGRESSIVE -DHIGH_MEMORY -o CMakeFiles/lepton-avx.dir/src/lepton/recoder.cc.o -c /tmp/lepton-20160718-13240-ozse7s/lepton-1.2/src/lepton/recoder.cc
/tmp/lepton-20160718-13240-ozse7s/lepton-1.2/src/lepton/recoder.cc:81:23: error: always_inline function '_mm256_load_si256' requires target feature 'sse4.2', but would be inlined into function 'find_aligned_end_64' that is compiled without support for 'sse4.2'
        __m256i row = _mm256_load_si256((const __m256i*)(const char*)(block + iter));
                      ^
/tmp/lepton-20160718-13240-ozse7s/lepton-1.2/src/lepton/recoder.cc:82:27: error: always_inline function '_mm256_cmpeq_epi16' requires target feature 'avx2', but would be inlined into function 'find_aligned_end_64' that is compiled without support for 'avx2'
        __m256i row_cmp = _mm256_cmpeq_epi16(row, _mm256_setzero_si256());
                          ^
/tmp/lepton-20160718-13240-ozse7s/lepton-1.2/src/lepton/recoder.cc:82:51: error: always_inline function '_mm256_setzero_si256' requires target feature 'sse4.2', but would be inlined into function 'find_aligned_end_64' that is compiled without support for 'sse4.2'
        __m256i row_cmp = _mm256_cmpeq_epi16(row, _mm256_setzero_si256());
                                                  ^
/tmp/lepton-20160718-13240-ozse7s/lepton-1.2/src/lepton/recoder.cc:83:16: error: always_inline function '_mm256_movemask_epi8' requires target feature 'avx2', but would be inlined into function 'find_aligned_end_64' that is compiled without support for 'avx2'
        mask = _mm256_movemask_epi8(row_cmp);
               ^

Logs:

10.11: https://gist.github.com/anonymous/c8584f6b220f6f93324bc5ae8180ebb7
10.10: https://gist.github.com/anonymous/a4a3f8f2a53cd946bbf918b21ed908bd
10.9: https://gist.github.com/anonymous/f78f0648b6c981d49c7136c7154f7548

danielrh · 2016-07-18T04:12:15Z

can you try osx_hack2 ?

zmwangx · 2016-07-18T04:14:40Z

Just noticed the branch, pushed, waiting for server.

zmwangx · 2016-07-18T04:22:14Z

With osx_hack2, build passes on 10.11, but the "SSE 4.1 instruction set not enabled" issue is back on 10.10 and 10.9.

Logs:

10.10: https://gist.github.com/f21889ed9bfaf607ef4863c85c72068e
10.9: https://gist.github.com/72b8453697bbd77f8ec3b0670f5f0a22

danielrh · 2016-07-18T04:28:07Z

ok the differences between the working thing and the failing thing are absolutely minimal.

I guess there's only one line it could be...pushed a new osx_hack2...can you try one last time--maybe this is the magic bullet. Not sure why these centralized build systems are always so buggy

zmwangx · 2016-07-18T04:36:54Z

I'm delighted to report that all three builds passed (with 927635b)!

danielrh · 2016-07-18T04:39:43Z

what a marathon!

zmwangx · 2016-07-18T04:40:27Z

Thanks for all the work over here!

zmwangx · 2016-07-18T04:46:11Z

Is it possible to make a release for this so that lepton could be more readily packaged on OS X? Or maybe you'll wait for some more substantial changes?

danielrh · 2016-07-18T08:36:44Z

https://github.com/dropbox/lepton/releases Does this work well enough for you? It's sort of a partial release, since there are no changes from a Windows perspective.

danielrh · 2016-07-18T08:36:49Z

1.2.1 that is

zmwangx · 2016-07-18T09:13:51Z

That's good enough, thank you.

zmwangx mentioned this issue Jul 18, 2016

lepton 1.2.1 (new formula) Homebrew/homebrew-core#3012

danielrh closed this as completed Jul 18, 2016
