rehex build results vary from parallelism #129

Closed
bmwiedemann opened this issue May 19, 2021 · 6 comments

@bmwiedemann

While working on reproducible builds for openSUSE, I found that
our rehex package varies between builds using -j1 and -j4.

When building without LTO, only 4 .o files differ:
res/version.o
res/license.o
src/lua-bindings/rehex_bind.o
src/lua-plugin-preload.o

For version.o, I found the following with filterdiff:

filterdiff objdump\ -x rehex-0.3.91/res/version.o
-0000000000000000 g     O .data.rel.local       0000000000000008 REHEX_VERSION
+0000000000000000 g     O .data.rel.local       0000000000000008 REHEX_LIBDIR
 0000000000000008 g     O .data.rel.local       0000000000000008 REHEX_BUILD_DATE
-0000000000000010 g     O .data.rel.local       0000000000000008 REHEX_LIBDIR
+0000000000000010 g     O .data.rel.local       0000000000000008 REHEX_VERSION

That probably comes from the fact that in the -j1 case it is compiled twice with different options (no CFLAGS in the second call):

g++ -Wall -std=c++11 -ggdb -I. -Iinclude/ -IwxLua/modules/ -I/usr/include/capstone  -I/usr/include/lua5.4 -I/usr/lib64/wx/include/gtk2-unicode-3.1 -I/usr/include/wx-3.1 -D_FILE_OFFSET_BITS=64 -DWXUSINGDLL -D__WXGTK__ -pthread -O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables -fstack-clash-protection -Werror=return-type  -DNDEBUG -DLONG_VERSION='"Version 0.3.91"' -DLIBDIR='"/usr/lib64"' -c -o res/version.o res/version.cpp
g++ -Wall -std=c++11 -ggdb -I. -Iinclude/ -IwxLua/modules/ -I/usr/include/capstone  -I/usr/include/lua5.4 -I/usr/lib64/wx/include/gtk2-unicode-3.1 -I/usr/include/wx-3.1 -D_FILE_OFFSET_BITS=64 -DWXUSINGDLL -D__WXGTK__ -pthread  -DNDEBUG -DLONG_VERSION='"Version 0.3.91"' -DLIBDIR='"/usr/lib64"' -c -o res/version.o res/version.cpp

In the -j4 build, only the first call happens.

I think this indicates that something is wrong with the dependencies or rules defined in the build system.

@solemnwarning
Owner

Can you provide the exact make commands and environment variables being used?

The version.o object is built as part of making the main rehex binary and also the test suite. From the looks of the commands you're seeing, I think some additional flags (optimisations, warnings) are being set for the 'all' target but not the 'check' target.

The same thing could possibly cause differences in the other .o files.

@bmwiedemann
Author

https://code.opensuse.org/package/rehex/blob/master/f/rehex.spec#_53 has our build instructions. We don't run 'check', but I see that CFLAGS is not set for install.

@bmwiedemann
Author

bmwiedemann commented May 19, 2021

OTOH, switching the build section to non-parallel makes results deterministic. I'll build non-parallel with strace to maybe see where the 2nd call comes from.
Edit: strace showed the 2nd g++ call came from make install

@bmwiedemann
Author

So setting the same C(XX)FLAGS on make install makes the results reproducible again.
I still find it strange that it would try another compilation of these .cpp files, as if one of their dependencies had been updated after compilation.
Should I just submit the added install CFLAGS on our side, or do you want to make it reproducible without that?

solemnwarning added a commit that referenced this issue May 19, 2021
Normal Makefile rules that specify multiple targets are interpreted
as a rule that produces each of those files individually, so Make
would execute these rules in parallel to produce the same file
multiple times, invalidating already-built targets that depended on
the files and potentially corrupting the output files too.

The correct fix would be grouped targets, but they only exist in
bleeding-edge versions of GNU Make, so we have this horrible hacky
thing instead...
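
The commit message above contrasts independent and grouped targets. A minimal sketch of the difference, assuming a hypothetical generator script `./generate` that writes both `gen.c` and `gen.h` from `spec.txt`. These names are illustrative and not taken from the rehex Makefile, and the stamp-file workaround shown is a common idiom, not necessarily what the commit does:

```makefile
# Hypothetical example: ./generate writes BOTH gen.c and gen.h in one run.

# Problematic form (left commented out): with explicit targets, make reads
# this as two independent rules sharing one recipe, so "make -j" may start
# ./generate twice at the same time, once per target.
#
# gen.c gen.h: spec.txt
# 	./generate spec.txt

# GNU Make 4.3 and later: "&:" declares grouped targets, meaning one run of
# the recipe produces all of the listed files.
#
# gen.c gen.h &: spec.txt
# 	./generate spec.txt

# Workaround for older make versions: funnel the work through a single
# stamp file so the generator runs at most once per update of spec.txt.
gen.stamp: spec.txt
	./generate spec.txt
	touch $@

# Both outputs depend only on the stamp and have no recipe of their own.
gen.c gen.h: gen.stamp
```

One known limitation of the stamp approach: if `gen.c` is deleted by hand while `gen.stamp` still exists, make will not regenerate it until the stamp is removed as well.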
@solemnwarning
Owner

I think I've fixed it on master. Does it work for you?

@bmwiedemann
Author

Yes, d321dff also fixed it in our setup.
Many thanks.

@solemnwarning solemnwarning added this to the 0.3.92 milestone Aug 23, 2021