LTO build fails on Fedora 32 with GCC 10 / binutils 2.34 #36984

Closed
akien-mga opened this issue Mar 11, 2020 · 37 comments

Comments

@akien-mga
Member

Godot version:
3.2.1-stable, master branch possibly affected too.

OS/device including version:
Linux, Fedora 32 and 33 with GCC 10

Issue description:
LTO build for the official godot package on Fedora 32 and 33 fails when linking:

scons-3 -j6 'CCFLAGS=-O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection' 'LINKFLAGS=-Wl,-z,relro -Wl,--as-needed  -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld' builtin_freetype=no builtin_libogg=no builtin_libpng=no builtin_libtheora=no builtin_libvorbis=no builtin_libvpx=no builtin_libwebp=no builtin_mbedtls=no builtin_opus=no builtin_pcre2=no builtin_zlib=no builtin_zstd=no builtin_miniupnpc=no use_lto=yes udev=yes progress=no p=x11 tools=yes target=release_debug
...
Linking Static Library ==> core/libcore.x11.opt.tools.64.a
Ranlib Library         ==> modules/libmodules.x11.opt.tools.64.a
core/variant_call.cpp: In function 'register_variant_methods':
core/variant_call.cpp:1485:6: note: variable tracking size limit exceeded with '-fvar-tracking-assignments', retrying without
 1485 | void register_variant_methods() {
      |      ^
Ranlib Library         ==> servers/libservers.x11.opt.tools.64.a
Ranlib Library         ==> core/libcore.x11.opt.tools.64.a
core/variant_call.cpp: In function 'register_variant_methods':
core/variant_call.cpp:1485:6: note: variable tracking size limit exceeded with '-fvar-tracking-assignments', retrying without
 1485 | void register_variant_methods() {
      |      ^
editor/plugins/visual_shader_editor_plugin.cpp: In member function '__ct_base ':
editor/plugins/visual_shader_editor_plugin.cpp:2280:1: note: variable tracking size limit exceeded with '-fvar-tracking-assignments', retrying without
 2280 | VisualShaderEditor::VisualShaderEditor() {
      | ^
Ranlib Library         ==> editor/libeditor.x11.opt.tools.64.a
Ranlib Library         ==> scene/libscene.x11.opt.tools.64.a
editor/plugins/visual_shader_editor_plugin.cpp: In member function '__ct_base ':
editor/plugins/visual_shader_editor_plugin.cpp:2280:1: note: variable tracking size limit exceeded with '-fvar-tracking-assignments', retrying without
 2280 | VisualShaderEditor::VisualShaderEditor() {
      | ^
Linking Program        ==> bin/godot.x11.opt.tools.64
servers/visual_server.cpp: In function '_bind_methods':
servers/visual_server.cpp:1639: note: variable tracking size limit exceeded with '-fvar-tracking-assignments', retrying without
 1639 | void VisualServer::_bind_methods() {
      | 
editor/editor_node.cpp: In member function '__ct_base ':
editor/editor_node.cpp:5579: note: variable tracking size limit exceeded with '-fvar-tracking-assignments', retrying without
 5579 | EditorNode::EditorNode() {
      | 
/bin/ld: /tmp/godot.x11.opt.tools.64.rEQkSf.ltrans37.ltrans.o: in function `EditorNode::EditorNode()':
/builddir/build/BUILD/godot-3.2.1-stable/./editor/plugins/root_motion_editor_plugin.h:62: undefined reference to `vtable for EditorInspectorRootMotionPlugin'
/bin/ld: /tmp/godot.x11.opt.tools.64.rEQkSf.ltrans51.ltrans.o: in function `AnimationPlayerEditor::AnimationPlayerEditor(EditorNode*, AnimationPlayerEditorPlugin*)':
/builddir/build/BUILD/godot-3.2.1-stable/editor/animation_track_editor_plugins.h:165: undefined reference to `vtable for AnimationTrackEditDefaultPlugin'
collect2: error: ld returned 1 exit status
scons: *** [bin/godot.x11.opt.tools.64] Error 1
scons: building terminated because of errors.

Full build log: https://kojipkgs.fedoraproject.org//work/tasks/2482/42402482/build.log
Build environment: https://kojipkgs.fedoraproject.org//work/tasks/2482/42402482/root.log

gcc-10.0.1-0.8.fc32.x86_64
binutils-2.34-2.fc32.x86_64

@akien-mga
Member Author

Builds ran fine on Fedora 30, 31 and RHEL 8.

Here are the Fedora 31 build logs for comparison: https://kojipkgs.fedoraproject.org//packages/godot/3.2.1/1.fc31/data/logs/x86_64/build.log https://kojipkgs.fedoraproject.org//packages/godot/3.2.1/1.fc31/data/logs/x86_64/root.log

gcc-9.2.1-1.fc31.x86_64
binutils-2.32-31.fc31.x86_64

@akien-mga changed the title from "LTO build fails on Fedora 32+ with GCC 10" to "LTO build fails on Fedora 32+ with GCC 10 / binutils 2.34" on Mar 11, 2020
@akien-mga
Member Author

On Mageia 8 with binutils 2.34 and gcc 9.3.0.RC1, it works fine. So I guess it's related to GCC 10, or specific flags that Fedora 32+ would have enabled.

@YRTV

YRTV commented Mar 11, 2020

dnf install make gcc-c++

Welcome to the Fedora modular age!

@akien-mga
Member Author

dnf install make gcc-c++

Welcome to the Fedora modular age!

https://kojipkgs.fedoraproject.org//work/tasks/2482/42402482/root.log
gcc-c++-10.0.1-0.8.fc32.x86_64
make-1:4.2.1-16.fc32.x86_64

@YRTV

YRTV commented Mar 11, 2020

And libstdc++? Fedora is now extremely modular. I built Godot 2 days ago on a VM, but I do not remember all the packages needed.

@akien-mga
Member Author

And libstdc++? Fedora is now extremely modular. I built Godot 2 days ago on a VM, but I do not remember all the packages needed.

It's all in the log, ctrl+f can answer :) https://kojipkgs.fedoraproject.org//work/tasks/2482/42402482/root.log

Yes, libstdc++ is installed.

@YRTV

YRTV commented Mar 11, 2020

OK, it just finished linking. It builds fine on a regular Fedora 32 Workstation image in a GNOME Boxes VM, with lazy home-user build options: scons p=x11 tools=yes target=release_debug use_lto=yes -j6 verbose=yes.
I do not know how official Fedora packaging works, but is it possible that your LTO flags are being doubled?
https://bugzilla.redhat.com/show_bug.cgi?id=1789137 This states that you need to opt out, or else it's enabled by default. Maybe it comes from one of the Red Hat specs files?

@akien-mga
Member Author

akien-mga commented Mar 11, 2020

Thanks for checking, I'll look into it. I suspect that might be linked to the default packaging %build_cflags or %build_ldflags which might have changed in F32+:

'CCFLAGS=-O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection' 'LINKFLAGS=-Wl,-z,relro -Wl,--as-needed  -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld'

I just finished upgrading my VM to F32, so I'll be able to experiment too.

@akien-mga
Member Author

Using the same command as on the official package (see OP), I could reproduce the note: messages prior to linking such as:

note: variable tracking size limit exceeded with '-fvar-tracking-assignments', retrying without

But I didn't reproduce the final linking error yet because my VM doesn't have enough RAM to link, even with -j1. I'll try to bump it and restart.

Here are the contents of the redhat-* spec files used in the cflags and ldflags:

$ cat /usr/lib/rpm/redhat/redhat-annobin-cc1 
*cc1_options:
+ %{!-fno-use-annobin:%{!iplugindir*:%:find-plugindir()} -fplugin=annobin}

$ cat /usr/lib/rpm/redhat/redhat-hardened-cc1
*cc1_options:
+ %{!r:%{!fpie:%{!fPIE:%{!fpic:%{!fPIC:%{!fno-pic:-fPIE}}}}}}

$ cat /usr/lib/rpm/redhat/redhat-hardened-ld
*self_spec:
+ %{!static:%{!shared:%{!r:-pie}}}

@akien-mga
Member Author

After increasing my VM's RAM I could reproduce the same issue as in the OP, so I'm fairly confident it's due to one of the build or linking flags.

@YRTV

YRTV commented Mar 12, 2020

No, it's not the flags. LTO is unstable somehow. I did 3 builds on the VM with scons p=x11 tools=yes target=release_debug use_lto=yes -j6 verbose=yes, cleaning the repo with git clean -fxd between builds.
2 succeeded and one failed with:

[100%] /bin/ld: /tmp/godot.x11.opt.tools.64.djJMFw.ltrans38.ltrans.o: in function `EditorNode::EditorNode()':
/home/yurii/godot/./editor/plugins/root_motion_editor_plugin.h:62: undefined reference to `vtable for EditorInspectorRootMotionPlugin'
/bin/ld: /tmp/godot.x11.opt.tools.64.djJMFw.ltrans51.ltrans.o: in function `AnimationPlayerEditor::AnimationPlayerEditor(EditorNode*, AnimationPlayerEditorPlugin*)':
/home/yurii/godot/editor/animation_track_editor_plugins.h:165: undefined reference to `vtable for AnimationTrackEditDefaultPlugin'
collect2: error: ld returned 1 exit status
scons: *** [bin/godot.x11.opt.tools.64] Error 1
scons: building terminated because of errors.

Nothing was changed between builds.
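
A minimal sketch of how such a repeated-build experiment could be automated, so that the success/failure ratio is recorded rather than remembered. The helper names `run_repeated` and `summarize` are hypothetical and not part of Godot or SCons; the build and clean commands are whatever the tester passes in.

```python
import subprocess
import sys


def run_repeated(build_cmd, runs, clean_cmd=None, cwd=None):
    """Run build_cmd `runs` times and return the list of exit codes.

    If clean_cmd is given, it is run before every build so that each
    attempt starts from the same state.
    """
    exit_codes = []
    for attempt in range(1, runs + 1):
        if clean_cmd is not None:
            # Abort if cleaning itself fails; the result would be meaningless.
            subprocess.run(clean_cmd, cwd=cwd, check=True)
        result = subprocess.run(build_cmd, cwd=cwd)
        print(f"build {attempt}: exit code {result.returncode}", file=sys.stderr)
        exit_codes.append(result.returncode)
    return exit_codes


def summarize(exit_codes):
    """Return (successes, failures) for a list of exit codes."""
    failures = sum(1 for code in exit_codes if code != 0)
    return len(exit_codes) - failures, failures
```

With the commands from this comment, a call could look like `run_repeated(["scons", "p=x11", "tools=yes", "target=release_debug", "use_lto=yes", "-j6", "verbose=yes"], runs=3, clean_cmd=["git", "clean", "-fxd"])`.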

@akien-mga
Member Author

That's puzzling... What's also surprising is that it always fails on the same two references, so if it's only those, it might well be something that we do in those constructors that GCC 10 doesn't like, and which we could work around.

I opened a bug report upstream: https://bugzilla.redhat.com/show_bug.cgi?id=1812783
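
To check whether different failing builds really stop on the same references, the linker output can be scanned mechanically. This is only a sketch: the function name `missing_vtables` is hypothetical, and the pattern assumes the GNU ld message format seen in the logs above.

```python
import re
from collections import Counter

# Matches GNU ld messages of the form:
#   undefined reference to `vtable for SomeClass'
VTABLE_RE = re.compile(r"undefined reference to `vtable for ([^']+)'")


def missing_vtables(log_text):
    """Count, per class name, the undefined vtable references in a log."""
    return Counter(VTABLE_RE.findall(log_text))
```

Comparing `set(missing_vtables(log_a))` with `set(missing_vtables(log_b))` for two build logs would show whether the set of affected classes is stable across runs.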

@YRTV

YRTV commented Mar 12, 2020

This may be 2 different problems.

  1. This document, https://gcc.gnu.org/onlinedocs/gccint/LTO-Overview.html#LTO-Overview, states that if LTO fails early, it does so silently without notifying the user. "Successful" builds may be non-LTO.

  2. The LTO process alters the scope of definitions, so when LTO does execute properly, functions end up undefined.

@YRTV

YRTV commented Mar 12, 2020

It doesn't even need use_lto=yes to fail. Out of 5 builds with scons p=x11 tools=yes target=release_debug -j6 verbose=yes, with git clean -fxd between them, 4 were good and 1 failed with the same error. I don't know the Fedora package submission rules, but maybe just delay packaging until GCC 10 is stable (aka GCC 10.1) at the end of March or beginning of April.

@akien-mga
Member Author

Thanks a lot for testing! Those are very useful findings. I'll see if I can reproduce it too, without slow LTO it should be easier :)

@akien-mga
Member Author

@YRTV How much RAM do you have on your VM? Mine has 10 GiB and I just did 3 successful builds with scons p=x11 tools=yes target=release_debug -j6 verbose=yes. I wonder if being limited on RAM could be a way to trigger the bug (which would explain why LTO makes it more frequent, as it's very RAM hungry).

@YRTV

YRTV commented Mar 12, 2020

12 GB of RAM for the VM. The host has 16 GB total. I hope this will be fixed in GCC 10.1, or I will just distro-hop to Ubuntu 20.04.

@YRTV

YRTV commented Mar 12, 2020

My final bet: Node constructors are never called until the user adds a node to the scene (with the exception of some UI nodes used by the editor), so they are considered dead code during compilation and get eliminated by the optimizer. LTO indeed increases the chance of that (the optimizer has the full call graph and function bodies). Try marking the node classes so that they are not optimized.

@YRTV

YRTV commented Mar 13, 2020

I was unable to reproduce the failure with a non-LTO build by running builds all night. As a result, I do not trust git clean -fxd any more. ltrans files are LTO build artifacts and shouldn't be present in a non-LTO build (and definitely not after git clean -fxd), unless a build system cache was involved. From now on I will cp a clean cloned repo and rm -rf it after the build.
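
The copy-then-delete workflow described here could be sketched as follows. The function name `build_in_fresh_copy` is hypothetical, and the sketch only isolates files inside the source tree: any build cache that lives outside the tree would still be shared between runs.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def build_in_fresh_copy(pristine_repo, build_cmd):
    """Copy pristine_repo to a temporary directory, build there, then delete it.

    Returns the exit code of build_cmd. The pristine clone is never modified.
    """
    with tempfile.TemporaryDirectory(prefix="fresh-build-") as tmp:
        work_tree = Path(tmp) / "src"
        # symlinks=True keeps symlinks as links instead of copying their targets.
        shutil.copytree(pristine_repo, work_tree, symlinks=True)
        result = subprocess.run(build_cmd, cwd=work_tree)
        # Leaving the with-block removes the whole temporary copy.
        return result.returncode
```

Because the temporary directory is removed when the function returns, binaries that need to be kept would have to be copied out before that point.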

I wanted to try using gold as the linker with scons p=x11 tools=yes target=release_debug use_lto=yes -j6 verbose=yes 'CCFLAGS=-fuse-linker-plugin -fuse-ld=gold', but scons does not pass CCFLAGS to the final g++ invocation. Why?

[Initial build] g++ -o bin/godot.x11.opt.tools.64 -flto=6 -pipe -no-pie platform/x11/godot_x11.x11.opt.tools.64.o platform/x11/context_gl_x11.x11.opt.tools.64.o platform/x11/crash_handler_x11.x11.opt.tools.64.o platform/x11/os_x11.x11.opt.tools.64.o platform/x11/key_mapping_x11.x11.opt.tools.64.o platform/x11/joypad_linux.x11.opt.tools.64.o platform/x11/power_x11.x11.opt.tools.64.o platform/x11/detect_prime.x11.opt.tools.64.o main/libmain.x11.opt.tools.64.a main/tests/libtests.x11.opt.tools.64.a modules/libmodules.x11.opt.tools.64.a platform/libplatform.x11.opt.tools.64.a drivers/libdrivers.x11.opt.tools.64.a editor/libeditor.x11.opt.tools.64.a scene/libscene.x11.opt.tools.64.a servers/libservers.x11.opt.tools.64.a core/libcore.x11.opt.tools.64.a modules/freetype/libfreetype_builtin.x11.opt.tools.64.a -lXcursor -lXinerama -lXrandr -lXrender -lX11 -lXi -lasound -lpulse -lGL -lpthread -ldl
/bin/ld: /tmp/godot.x11.opt.tools.64.Qlq39g.ltrans38.ltrans.o: in function `EditorNode::EditorNode()':
/home/yurii/godot/./editor/plugins/root_motion_editor_plugin.h:62: undefined reference to `vtable for EditorInspectorRootMotionPlugin'
/bin/ld: /tmp/godot.x11.opt.tools.64.Qlq39g.ltrans51.ltrans.o: in function `AnimationPlayerEditor::AnimationPlayerEditor(EditorNode*, AnimationPlayerEditorPlugin*)':
/home/yurii/godot/editor/animation_track_editor_plugins.h:165: undefined reference to `vtable for AnimationTrackEditDefaultPlugin'
collect2: error: ld returned 1 exit status
scons: *** [bin/godot.x11.opt.tools.64] Error 1
scons: building terminated because of errors.


@akien-mga
Member Author

CCFLAGS are passed to the C and C++ compiler, but in this case you want to set a linker flag, so you should use LINKFLAGS.

That's actually my usual build command locally:

alias gobuild_x11="scons LINKFLAGS='-fuse-ld=gold' -j7 p=x11 warnings=extra werror=yes"
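
As an illustration of keeping the two kinds of flags apart, here is a small sketch that assembles a scons command line with compiler flags in CCFLAGS and linker flags in LINKFLAGS. The helper `scons_argv` is hypothetical; the build options are the ones used earlier in this thread.

```python
import shlex


def scons_argv(options, ccflags=(), linkflags=()):
    """Build a scons argument list, keeping compiler and linker flags apart."""
    argv = ["scons"]
    if ccflags:
        # Passed to the C and C++ compiler invocations.
        argv.append("CCFLAGS=" + " ".join(ccflags))
    if linkflags:
        # Passed to the final link step.
        argv.append("LINKFLAGS=" + " ".join(linkflags))
    argv.extend(options)
    return argv


command = scons_argv(
    ["p=x11", "tools=yes", "target=release_debug", "use_lto=yes", "-j6", "verbose=yes"],
    linkflags=["-fuse-ld=gold"],
)
# Print the command in a form that can be pasted into a shell.
print(shlex.join(command))
```

Selecting the linker is a link-time choice, which is why `-fuse-ld=gold` goes into LINKFLAGS here rather than CCFLAGS.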

@akien-mga
Member Author

@marxin Did you see any similar LTO issues on the GCC bug tracker for 10+?

@marxin
Contributor

marxin commented Mar 13, 2020

I've just built the 3.2.1 version with:

$ scons progress=yes verbose=yes -j16 platform=x11 tools=yes   target=release_debug warnings=extra use_lto=1 werror=0 CXX=/home/marxin/bin/gcc2/bin/g++ CC=/home/marxin/bin/gcc2/bin/gcc PATH=/home/marxin/bin/gcc2/bin/ -j16

and it works fine. I used today's master.

@akien-mga
Member Author

@marxin What OS and binutils versions are you on?

@marxin
Contributor

marxin commented Mar 13, 2020

openSUSE Tumbleweed and

$ ld -v
GNU ld (GNU Binutils; openSUSE Tumbleweed) 2.33.1.20191023-3

@YRTV

YRTV commented Mar 14, 2020

@marxin Were the binaries actually optimized? LTO can fail silently.

https://gcc.gnu.org/onlinedocs/gccint/LTO-Overview.html#LTO-Overview
A, perhaps surprising, side effect of this feature is that any mistake in the toolchain leads to LTO information not being used (e.g. an older libtool calling ld directly). This is both an advantage, as the system is more robust, and a disadvantage, as the user is not informed that the optimization has been disabled.

https://wiki.debian.org/LTO
it may not always work (e.g. http://www.phoronix.com/scan.php?page=article&item=gcc_471_lto&num=2) but this is 5 years ago and the sizes of binaries are not mentioned as a check that LTO had kicked in. Find problems with how parameters are passed to the linker if binaries do not get smaller. One rarely gets it right at the first attempt. There is little to no reason why more freedom to optimize should get something worse.

You can build on Fedora 32 by killing the optimizer with CCFLAGS=-O0. It also apparently can fail by itself, producing falsely successful builds.
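
The size check suggested in the Debian wiki quote can be sketched as a comparison between a non-LTO and an LTO build of the same target. The function name `size_reduction` is hypothetical, and a smaller binary is only a heuristic signal that LTO took effect; debug information can account for much of a binary's size, so comparing stripped binaries is probably more meaningful.

```python
import os


def size_reduction(baseline_path, lto_path):
    """Return the fractional size reduction of lto_path relative to baseline_path.

    A positive value means the LTO binary is smaller. Zero or a negative
    value suggests checking how the flags reach the linker.
    """
    baseline = os.path.getsize(baseline_path)
    lto = os.path.getsize(lto_path)
    if baseline == 0:
        raise ValueError("baseline binary is empty")
    return (baseline - lto) / baseline
```

The two paths would be the outputs of two otherwise identical builds, one with use_lto=yes and one without.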

@YRTV

YRTV commented Mar 14, 2020

It builds with GCC 10 on openSUSE and the binaries are smaller with LTO. But on SUSE there are 2 versions of the binutils LTO plugin.

yurii@localhost:~> md5sum /usr/lib64/gcc/x86_64-suse-linux/10/liblto_plugin.so
979c7ad48b01cc3fe4cdbeb59d62443f  /usr/lib64/gcc/x86_64-suse-linux/10/liblto_plugin.so
yurii@localhost:~> md5sum /usr/lib64/gcc/x86_64-suse-linux/9/liblto_plugin.so
101b7460ec35c7afbcc3acc6497c499c  /usr/lib64/gcc/x86_64-suse-linux/9/liblto_plugin.so

I don't know how to check which one is loaded during the build. Can scons still load the gcc 9 version of the plugin in binutils when CC and CXX are set to gcc-10 and g++-10?

@marxin
Contributor

marxin commented Mar 16, 2020

@marxin Were the binaries actually optimized? LTO can fail silently.

Yes. I'm a GCC developer working on LTO a lot, so I'm sure.
Note that historically we generated so-called fat LTO objects, but that's now disabled by default:

       -ffat-lto-objects
           Fat LTO objects are object files that contain both the intermediate language and the object code. This makes them usable for both LTO linking and normal linking. This option is effective only when compiling with -flto and is ignored at link time.

@marxin
Contributor

marxin commented Mar 16, 2020

I don't know how to check which one is loaded during the build. Can scons still load the gcc 9 version of the plugin in binutils when CC and CXX are set to gcc-10 and g++-10?

Note that the plugins are compatible and do not play a role here. You can load the gcc-9 LTO plugin for GCC 10 LTO bytecode and it will work fine!

@akien-mga
Member Author

I'll check if Fedora 32's pending update to gcc-10.0.1-0.9.fc32 (upstream commit 61bcda69ca5dc9e9d5e25de7b914dd3a86089244 from 20200311) helps: https://src.fedoraproject.org/rpms/gcc/c/fec5ba4393659be855d63cc1b5958baa34027011?branch=master

@akien-mga
Member Author

I'll check if Fedora 32's pending update to gcc-10.0.1-0.9.fc32 (upstream commit 61bcda69ca5dc9e9d5e25de7b914dd3a86089244 from 20200311) helps: https://src.fedoraproject.org/rpms/gcc/c/fec5ba4393659be855d63cc1b5958baa34027011?branch=master

Still failing on Fedora 32 with that update.

@marxin
Contributor

marxin commented Mar 16, 2020

Do you have a Fedora package somewhere? If so, I would create an issue on the Red Hat Bugzilla. I know a Fedora guy who can debug that.

@akien-mga
Member Author

akien-mga commented Mar 16, 2020

Do you have a Fedora package somewhere? If so, I would create an issue on the Red Hat Bugzilla. I know a Fedora guy who can debug that.

This is from the official godot package, which I maintain: https://src.fedoraproject.org/rpms/godot

I opened a rhbz bug report already: https://bugzilla.redhat.com/show_bug.cgi?id=1812783

@akien-mga
Member Author

For reference, I'm running Mageia 8 now with binutils 2.34 and GCC 10.1 and I don't have this LTO issue. Last time I tried to reproduce the bug on Fedora I ran into another breakage with their LTO wrapper, so it's a bit confusing. I'll throw a new build on Koji eventually to see how it behaves on current Rawhide.

@marxin
Contributor

marxin commented May 30, 2020

@akien-mga Have you managed to build it on a Fedora system?
Do you need it for an official build? I can provide an openSUSE build which works just fine.

@akien-mga
Member Author

On Fedora 32 in a podman container it works fine, which is what we use for official builds.

The problem that led to this issue was only seen when updating the Fedora package on Koji - which is why Fedora still doesn't have Godot 3.2.1-stable packaged. I'm waiting to release 3.2.2-stable to try again.

@marxin
Contributor

marxin commented May 30, 2020

Ah, OK. Anyway, we at openSUSE would be pleased if you used it for official builds ;)

@akien-mga changed the title from "LTO build fails on Fedora 32+ with GCC 10 / binutils 2.34" to "LTO build fails on Fedora 32 with GCC 10 / binutils 2.34" on Feb 14, 2021
@akien-mga
Member Author

For reference, it builds fine on F33 and later, while there's still a problem on F32. As F32 will soon be EOL, I consider this fixed.
