Build ffmpeg with -ffat-lto-objects #354

Open
OpenSourceAnarchist opened this issue Jun 18, 2019 · 23 comments

Comments

@OpenSourceAnarchist

According to a post on Clear Linux's community forum (just 4 days ago), FFmpeg can be built safely with LTO so long as -ffat-lto-objects is enabled.

Quote: "FFMPEG does build nicely with the link-time optimizer, but putting -flto in the flags or configuring with --enable-lto tends to cause the build to fail with lots of undefined symbols. Instead, put -ffat-lto-objects in the flags (already there if you use the default CFLAGS that comes with Clear) so that the linker has a fallback. Do be sure to include --extra-ldflags='-flto -fuse-linker-plugin' --ar=gcc-ar."

Source: https://community.clearlinux.org/t/tips-and-techniques-for-building-ffmpeg/795
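For reference, the invocation described in the quote might look roughly like the sketch below. Only `-ffat-lto-objects`, `-flto -fuse-linker-plugin` and `--ar=gcc-ar` come from the quoted post; passing the compiler flag through `--extra-cflags` (rather than a global CFLAGS) and the function name `ffmpeg_fat_lto_configure_cmd` are assumptions made for illustration.

```shell
# Print the configure command suggested by the quoted forum post.
# Assumption: -ffat-lto-objects is passed via --extra-cflags; the post only
# says to put it "in the flags". The function name is hypothetical.
ffmpeg_fat_lto_configure_cmd() {
    printf '%s\n' "./configure --extra-cflags='-ffat-lto-objects' --extra-ldflags='-flto -fuse-linker-plugin' --ar=gcc-ar"
}
```

Running `sh -c "$(ffmpeg_fat_lto_configure_cmd)"` from the top of an ffmpeg source tree would then execute it; printing the command first makes it easy to review before a long build.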

@ionenwks

ionenwks commented Jun 18, 2019

I've personally never had trouble using LTO on ffmpeg as long as I use --enable-lto and don't set -flto myself in CFLAGS (on Gentoo I use EXTRA_FFMPEG_CONF="--enable-lto"). Maybe it has to do with my USE flags, though.
Edit: I believe --enable-lto omits LTO on some problematic parts, which is why it's important not to set -flto yourself, or else it just gets used on everything anyway. While at it, my USE flags are: media-video/ffmpeg fdk fontconfig libaom libass mp3 openssl opus pic theora truetype vorbis vpx x264 x265 xcb

@barolo

barolo commented Jun 18, 2019

I can confirm that it currently works [it didn't just a few weeks ago, same setup] via EXTRA_FFMPEG_CONF="--enable-lto", with ffmpeg-4.1.3
USE="X alsa bzip2 encode gpl hardcoded-tables iconv libdrm network opengl openssl postproc pulseaudio threads vaapi vorbis vpx webp zlib"
CPU_FLAGS_X86="aes avx avx2 fma3 mmx mmxext sse sse2 sse3 sse4_1 sse4_2 ssse3"

@pchome
Contributor

pchome commented Jun 19, 2019

https://bugs.gentoo.org/566282

In addition, I use the hack [[ ${ABI} == x86 ]] && filter-flags "-flto*" || append-flags "-flto" (in a modified ebuild). It "fixes" the x86 part.
The amd64 build was always fine for me with the -flto flag.
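The effect of that hack can be sketched outside Portage as follows. This is a standalone emulation, not the ebuild code: in a real ebuild `filter-flags` and `append-flags` from flag-o-matic.eclass do the work, and `adjust_lto_flags` is a hypothetical helper name. An explicit if/else is used because in `A && B || C` the last command also runs whenever `B` returns non-zero.

```shell
# adjust_lto_flags <abi> <flags...>: print the flag list for the given ABI.
# Hypothetical helper that mimics filter-flags "-flto*" / append-flags "-flto".
adjust_lto_flags() (
    abi=$1
    shift
    out=
    if [ "$abi" = x86 ]; then
        # 32-bit build: drop every flag that starts with -flto
        for flag in "$@"; do
            case $flag in
                -flto*) ;;
                *) out="${out:+$out }$flag" ;;
            esac
        done
    else
        # any other ABI: keep the flags and append -flto
        out="$*"
        out="${out:+$out }-flto"
    fi
    printf '%s\n' "$out"
)
```

For example, `adjust_lto_flags x86 -O2 -flto=3 -pipe` prints `-O2 -pipe`, while `adjust_lto_flags amd64 -O2 -pipe` prints `-O2 -pipe -flto`.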

@ionenwks

^ Oh, was it only the x86 version that's acting up? It's been a while since I've seen anything go wrong, so I don't really remember. Reminder: you can append to CFLAGS_amd64 and/or CFLAGS_x86 if you don't want an ebuild hack. That aside, my x86 version is built with LTO as well (with --enable-lto and no -flto in CFLAGS) just fine.

@pchome
Contributor

pchome commented Jun 19, 2019

I don't remember what exactly was wrong with CFLAGS_amd64 and CFLAGS_x86, but I tried to use them with ffmpeg with no success.

I just tried media-video/ffmpeg *FLAGS-="-flto*" "export EXTRA_FFMPEG_CONF=--enable-lto" with the Gentoo ebuild; this failed too.

Error (x86)
src/libswscale/x86/yuv2rgb_template.c: In function ‘yuv420_bgr24_mmxext’:
src/libswscale/x86/yuv2rgb_template.c:346:9: error: ‘asm’ operand has impossible constraints
  346 |         YUV2RGB_INITIAL_LOAD
      |         ^
lto-wrapper: fatal error: /usr/bin/x86_64-pc-linux-gnu-gcc returned 1 exit status
compilation terminated.
/usr/lib/gcc/x86_64-pc-linux-gnu/9.1.0/../../../../x86_64-pc-linux-gnu/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status
make: *** [/var/tmp/portage/media-video/ffmpeg-4.1.3/work/ffmpeg-4.1.3/ffbuild/library.mak:103: libswscale/libswscale.so.5] Error 1
make: *** Waiting for unfinished jobs....
src/libavcodec/x86/vc1dsp_mmx.c: In function ‘avg_vc1_mspel_mc30_mmxext’:
src/libavcodec/x86/vc1dsp_mmx.c:318:1: error: ‘asm’ operand has impossible constraints
  318 | MSPEL_FILTER13_8B     (shift3, "0(%1     )", "0(%1,%3  )", "0(%1,%3,2)", "0(%1,%4  )", OP_AVG, avg_)
      | ^
lto-wrapper: fatal error: /usr/bin/x86_64-pc-linux-gnu-gcc returned 1 exit status
compilation terminated.
/usr/lib/gcc/x86_64-pc-linux-gnu/9.1.0/../../../../x86_64-pc-linux-gnu/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status
make: *** [/var/tmp/portage/media-video/ffmpeg-4.1.3/work/ffmpeg-4.1.3/ffbuild/library.mak:103: libavcodec/libavcodec.so.58] Error 1
 * ERROR: media-video/ffmpeg-4.1.3::gentoo failed (compile phase):
 *   emake failed
 * 
 * If you need support, post the output of `emerge --info '=media-video/ffmpeg-4.1.3::gentoo'`,
 * the complete build log and the output of `emerge -pqv '=media-video/ffmpeg-4.1.3::gentoo'`.
 * The complete build log is located at '/var/tmp/portage/media-video/ffmpeg-4.1.3/temp/build.log'.
 * The ebuild environment file is located at '/var/tmp/portage/media-video/ffmpeg-4.1.3/temp/environment'.
 * Working directory: '/var/tmp/portage/media-video/ffmpeg-4.1.3/work/ffmpeg-4.1.3-abi_x86_32.x86'
 * S: '/var/tmp/portage/media-video/ffmpeg-4.1.3/work/ffmpeg-4.1.3'

@ionenwks

ionenwks commented Jun 19, 2019

Figured it could be my USE flags, so I tried around a bit, and it seems I get the same error only if I remove my pic USE flag (not using that flag always seemed kind of strange, considering it mixes non-PIC asm with GCC's default PIE -- the entire system is using position-independent code).

@ionenwks

And yeah, pic implies --disable-asm (for x86 only), so any x86 asm-related errors won't happen. As to whether this is really slower or not, I couldn't say. The compiler does perform plenty of optimizations that may render the asm code not so relevant anymore (I imagine it's quite dated).

@javashin

If -flto is in the CFLAGS, the ebuild automatically passes --enable-lto to ./configure.

@javashin

I successfully built with:

media-video/ffmpeg-4.1.3::gentoo was built with the following:
USE="X alsa bs2b bzip2 chromium encode fdk fontconfig gpl hardcoded-tables iconv jpeg2k ladspa libaom libass libcaca libdrm lzma modplug mp3 network openal opengl openh264 openssl opus postproc pulseaudio rubberband sdl speex svg theora threads truetype vaapi vorbis vpx wavpack webp x264 x265 xcb xvid zlib (-altivec) -amr -amrenc (-appkit) -bluray -cdio -chromaprint -codec2 -cpudetection -debug -doc -flite -frei0r -fribidi -gcrypt -gme -gmp -gnutls -gsm -iec61883 -ieee1394 -jack -kvazaar -libilbc -libressl -librtmp -libsoxr -libv4l -libxml2 -lv2 (-mipsdspr1) (-mipsdspr2) (-mipsfpu) (-mmal) -opencl -oss -pic -samba -snappy -srt -ssh -static-libs -test -twolame -v4l -vdpau -zeromq -zimg -zvbi" ABI_X86="32 (64) (-x32)" CPU_FLAGS_X86="aes avx avx2 fma3 mmx mmxext sse sse2 sse3 sse4_1 sse4_2 ssse3 -3dnow -3dnowext -fma4 -xop" FFTOOLS="aviocat cws2fws ffescape ffeval ffhash fourcc2pixfmt graph2dot ismindex pktdumper qt-faststart sidxindex trasher" VIDEO_CARDS="-nvidia"
CFLAGS="-O3 -march=native -mfpmath=both -funroll-loops -falign-functions=32 -fgraphite-identity -floop-nest-optimize -fno-semantic-interposition -fuse-linker-plugin -flto=3 -ffat-lto-objects -fipa-pta -fno-math-errno -fno-trapping-math -fdevirtualize-at-ltrans -fno-stack-protector -pipe -Wl,-O2 -Wl,--as-needed,-z,now -fuse-ld=gold -Wl,--hash-style=gnu"
CXXFLAGS="-O3 -march=native -mfpmath=both -funroll-loops -falign-functions=32 -fgraphite-identity -floop-nest-optimize -fno-semantic-interposition -fuse-linker-plugin -flto=3 -ffat-lto-objects -fipa-pta -fno-math-errno -fno-trapping-math -fdevirtualize-at-ltrans -fno-stack-protector -pipe -Wl,-O2 -Wl,--as-needed,-z,now -fuse-ld=gold -Wl,--hash-style=gnu"
LDFLAGS="-Wl,-O2 -Wl,--as-needed,-z,now -fuse-ld=gold -Wl,--hash-style=gnu -O3 -march=native -mfpmath=both -funroll-loops -falign-functions=32 -fgraphite-identity -floop-nest-optimize -fno-semantic-interposition -fuse-linker-plugin -flto=3 -ffat-lto-objects -fipa-pta -fno-math-errno -fno-trapping-math -fdevirtualize-at-ltrans -fno-stack-protector -pipe"

@ionenwks

ionenwks commented Jun 21, 2019

After retrying a bit, it seems the whole thing about not having -flto in CFLAGS isn't necessary after all (I already knew the ebuild added --enable-lto, but I thought from previous builds that there was a problem with doing it like that; maybe there WAS at one point, but it's been a while).

I'd personally argue the best way to build this with LTO isn't to use -ffat-lto-objects but to just add the pic USE flag; nothing else needs changing and normal -flto can be used (everything works out regardless of x86 or amd64). Without pic it attempts to use non-PIC asm (only on x86 -- the flag should have close to no effect on the amd64 version) despite being in a default PIE environment, which no matter how I look at it shouldn't be a thing. I feel like this flag should in fact be a Gentoo default at this point.

@ionenwks

ionenwks commented Jun 21, 2019

^ Although, if you don't want to force USE flags, adding fat LTO objects would be simpler for GentooLTO workarounds.
Edit: I guess the workaround could be omitted if the pic USE flag happens to be set, though. If it is set, there should be no need to change anything at all; it just works with the default LTO flags (seems to for me, anyway).

@InBetweenNames
Owner

pic has a serious performance penalty on x86, doesn't it?

Reference #15 #47

I just tested media-video/ffmpeg with the newer ebuild and it seems to be working fine on my amd64 system now, whereas previously I got ODR violations as well as some asm compilation errors.

@ionenwks , previously we used -fno-lto to disable LTO selectively, but ffmpeg's ebuild in particular didn't play nice with that. The upstream Gentoo maintainers were also not interested in fixing the ebuild or accepting patches to fix the ebuild to amend this. Since I realized I was on my own to fix that, I went with the approach currently chosen.

@pchome It seems that we should make this workaround apply only for x86, and for amd64 leave it as default -- does this sound reasonable?

@InBetweenNames
Owner

Also, @barolo , does this work for you even with -flto in your CFLAGS? Or does it only work for you using EXTRA_FFMPEG_CONF="--enable-lto" and without -flto in your CFLAGS?

@InBetweenNames
Owner

Proposed modification to ltoworkarounds.conf:

media-video/ffmpeg !*FLAGS-=-flto*

The ! will cause package.cflags to apply this workaround only to x86 and not amd64.

@pchome
Contributor

pchome commented Jun 26, 2019

@InBetweenNames

It seems that we should make this workaround apply only for x86, and for amd64 leave it as default -- does this sound reasonable?

I simplified the workaround to just [[ ${ABI} == x86 ]] && myconf+=( --disable-asm ), which is OK with USE=-pic and -flto. So the possible solutions are:

  • conditionally disable LTO in the workarounds (if the abi_x86_32 USE flag is set)
  • override the ebuild to
    • disable LTO for abi_x86_32 only
    • disable ASM for abi_x86_32 only

...
oh, I see your comment while writing this ...
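As a rough, standalone illustration of the "disable ASM for abi_x86_32 only" option: the sketch below prints the extra configure option per ABI. The function name `ffmpeg_extra_conf` is hypothetical; inside the ebuild the same decision would be written as the `myconf+=( --disable-asm )` line above.

```shell
# ffmpeg_extra_conf <abi>: print extra configure options for that ABI,
# one per line. Hypothetical helper; only --disable-asm for x86 is taken
# from the comment above.
ffmpeg_extra_conf() (
    abi=$1
    case $abi in
        x86) printf '%s\n' "--disable-asm" ;;  # 32-bit: skip the inline asm
        *) ;;                                  # amd64 and others: no change
    esac
)
```

`ffmpeg_extra_conf x86` prints `--disable-asm`, and `ffmpeg_extra_conf amd64` prints nothing, so the 64-bit build keeps its hand-written assembly.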

@pchome
Contributor

pchome commented Jun 26, 2019

The ! will cause package.cflags to apply this workaround only to x86 and not amd64.

I'm not sure; maybe $HOSTTYPE=x86_64 for both abi_x86_32 and abi_x86_64, since you are compiling on x86_64. Need to check.

@InBetweenNames
Owner

Ah yes, you're right -- I'll check.

Currently I have it built with ABI_X86="32 64" with -flto enabled on both and it seems to build. I have USE=-pic set as well.

@InBetweenNames
Owner

Indeed, the !*FLAGS-=-flto* won't work for exactly that reason. In that case, perhaps we should fork the ebuild? I could try to get it upstreamed, but last time I didn't have much luck.

@InBetweenNames
Owner

Reference #103 too

@nivedita76

@InBetweenNames do you have clarity on why fat lto objects fixes things? From the docs it seems like the only thing it should do is include regular object code in addition to the IR, so it would only matter if the link step is not using LTO.

@InBetweenNames
Owner

Indeed -- it's usually due to incorrect linker setup. It rarely happens, but when the linker is invoked in such a way that LTO is inhibited, it can't "see" the LTO symbols and will claim there are undefined symbols instead. -ffat-lto-objects is a hack to work around that, allowing the program to still link, but obviously not all of the code is LTOed.

@nivedita76

Wouldn’t it be completely non-LTO in that case, though? I.e., you might as well just turn off LTO?

@InBetweenNames
Owner

In the worst case -- absolutely. But there are cases where some object files can be linked with LTO and others can't (for example, shared object dependencies built as part of the same package). That's where -ffat-lto-objects can (theoretically) help. Obviously, a proper fix would be much preferable.
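The difference discussed here can be inspected with a small experiment. The sketch below assumes gcc and binutils' readelf are installed and uses made-up file and function names; it compiles the same source once as a slim and once as a fat LTO object and lists the relevant sections. On a typical GCC setup both objects should show `.gnu.lto_*` sections, while only the fat one should carry real machine code in `.text`, which is what a link without LTO falls back on.

```shell
# compare_lto_objects: build slim.o and fat.o from one tiny source file and
# list their LTO and .text sections. Hypothetical helper; it is skipped
# quietly when gcc or readelf is unavailable.
compare_lto_objects() (
    command -v gcc >/dev/null 2>&1 || { echo "gcc not found, skipping"; exit 0; }
    command -v readelf >/dev/null 2>&1 || { echo "readelf not found, skipping"; exit 0; }
    dir=$(mktemp -d) || exit 1
    printf 'int answer(void) { return 42; }\n' > "$dir/answer.c"
    # slim object: compiler IR only
    gcc -c -O2 -flto -fno-fat-lto-objects "$dir/answer.c" -o "$dir/slim.o" || { rm -rf "$dir"; exit 1; }
    # fat object: compiler IR plus regular object code
    gcc -c -O2 -flto -ffat-lto-objects "$dir/answer.c" -o "$dir/fat.o" || { rm -rf "$dir"; exit 1; }
    for obj in slim fat; do
        echo "== $obj.o =="
        readelf -S -W "$dir/$obj.o" | grep -E '\.gnu\.lto_|\.text'
    done
    rm -rf "$dir"
)
```

Calling `compare_lto_objects` and comparing the size column of `.text` between the two listings shows what the fat variant adds; the exact section names and sizes depend on the GCC version.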


7 participants