Multi-lang builds of Pinecilv2 fail with ld/lto1 errors #1764

Closed
ia opened this issue Jul 28, 2023 · 5 comments · Fixed by #1765

Collaborator

ia commented Jul 28, 2023

Describe the bug
Multi-lang builds for Pinecilv2 fail with ld/lto1 errors.

To Reproduce

$ cat test.sh
#!/usr/bin/env bash

set -x
set -e
while [ 1 -eq 1 ]; do
	make clean-build
	make  -j2  model=Pinecilv2  firmware-multi_compressed_European  firmware-multi_compressed_Bulgarian+Russian+Serbian+Ukrainian  firmware-multi_Chinese+Japanese
done;
$ make docker-shell
# ./test.sh

Expected behavior
Successful multi-lang builds for PinecilV2.

Details of your device:
Build problem, not a device one.

Additional context

I'm opening this issue to:

  • let others know that we're aware of the problem;
  • share my thoughts & the observations collected so far;
  • figure out, with help from others, the most appropriate way to fix it.

If you are working on a branch in a forked repo and see a similar problem, please add a comment with:

  • a direct link to the exact line with the error output in the GitHub CI log
  • a copy-pasted snippet of the log with the error output, right in the comment

Here are some examples of this issue:

lto1 error / upstream:

lto1: internal compiler error: cannot read 'LTO_section_decls' from Objects/Pinecilv2/./Core/BSP/Pinecilv2/bl_mcu_sdk/drivers/bl702_driver/hal_drv/src/hal_sec_hash.o
0xde08d0 internal_error(char const*, ...)
	???:0
0x5d8040 read_cgraph_and_symbols(unsigned int, char const**)
	???:0
0x5caf11 lto_main()
	???:0
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
lto-wrapper: fatal error: riscv-none-elf-g++ returned 1 exit status
compilation terminated.
/usr/lib/gcc/riscv-none-elf/11.2.0/../../../../riscv-none-elf/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:844: Hexfile/Pinecilv2_multi_compressed_Bulgarian+Russian+Serbian+Ukrainian.elf] Error 1
make[1]: Leaving directory '/__w/IronOS/IronOS/source'
make: *** [Makefile:162: firmware-multi_compressed_Bulgarian+Russian+Serbian+Ukrainian] Error 2
make: *** Waiting for unfinished jobs....
Linking Hexfile/Pinecilv2_multi_compressed_European.elf
lto1: internal compiler error: cannot read 'LTO_section_decls' from Objects/Pinecilv2/./Core/BSP/Pinecilv2/bl_mcu_sdk/drivers/bl702_driver/hal_drv/src/hal_sec_hash.o
0xde08d0 internal_error(char const*, ...)
	???:0
0x5d8040 read_cgraph_and_symbols(unsigned int, char const**)
	???:0
0x5caf11 lto_main()
	???:0
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
lto-wrapper: fatal error: riscv-none-elf-g++ returned 1 exit status
compilation terminated.
/usr/lib/gcc/riscv-none-elf/11.2.0/../../../../riscv-none-elf/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:844: Hexfile/Pinecilv2_multi_compressed_European.elf] Error 1
make[1]: Leaving directory '/__w/IronOS/IronOS/source'
make: *** [Makefile:162: firmware-multi_compressed_European] Error 2
Error: Process completed with exit code 2.

lto1 error / branch:

Linking Hexfile/Pinecilv2_multi_compressed_European.elf
lto1: internal compiler error: in read_cgraph_and_symbols, at lto/lto-common.c:2739
0xde08d0 internal_error(char const*, ...)
	???:0
0x5a5ae1 fancy_abort(char const*, int, char const*)
	???:0
0x5d77f4 read_cgraph_and_symbols(unsigned int, char const**)
	???:0
0x5caf11 lto_main()
	???:0
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
lto-wrapper: fatal error: riscv-none-elf-g++ returned 1 exit status
compilation terminated.
/usr/lib/gcc/riscv-none-elf/11.2.0/../../../../riscv-none-elf/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:844: Hexfile/Pinecilv2_multi_compressed_European.elf] Error 1
make[1]: Leaving directory '/__w/IronOS/IronOS/source'
make: *** [Makefile:162: firmware-multi_compressed_European] Error 2
Error: Process completed with exit code 2.

ld error / branch:

Linking Hexfile/Pinecilv2_multi_compressed_European.elf
/usr/lib/gcc/riscv-none-elf/11.2.0/../../../../riscv-none-elf/bin/ld: warning: Objects/Pinecilv2/./Core/Threads/OperatingModes/USBPDDebug_HUSB238.o has a section extending past end of file
/usr/lib/gcc/riscv-none-elf/11.2.0/../../../../riscv-none-elf/bin/ld: error: Objects/Pinecilv2/./Core/Threads/OperatingModes/USBPDDebug_HUSB238.o: ELF section name out of range
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:844: Hexfile/Pinecilv2_multi_compressed_European.elf] Error 1
make[1]: Leaving directory '/__w/IronOS/IronOS/source'
make: *** [Makefile:162: firmware-multi_compressed_European] Error 2
Error: Process completed with exit code 2.

lto1 error / branch:

lto1: internal compiler error: in read_cgraph_and_symbols, at lto/lto-common.c:2739
0xde08d0 internal_error(char const*, ...)
	???:0
0x5a5ae1 fancy_abort(char const*, int, char const*)
	???:0
0x5d77f4 read_cgraph_and_symbols(unsigned int, char const**)
	???:0
0x5caf11 lto_main()
	???:0
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
lto-wrapper: fatal error: riscv-none-elf-g++ returned 1 exit status
compilation terminated.
/usr/lib/gcc/riscv-none-elf/11.2.0/../../../../riscv-none-elf/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:846: Hexfile/Pinecil_multi_compressed_European.elf] Error 1
make[1]: Leaving directory '/__w/IronOS-plus/IronOS-plus/source'
make: *** [Makefile:162: firmware-multi_compressed_European] Error 2
Error: Process completed with exit code 2.

At first, I couldn't reproduce it locally, neither without any -j option nor with -j$(nproc), which is -j4 on my system. But when I used -j2, which seems to be what GitHub CI uses, I got an interesting result almost right away:

Linking Hexfile/Pinecilv2_multi_compressed_Bulgarian+Russian+Serbian+Ukrainian.elf
Generating Objects/Pinecilv2/Core/Gen/translation.files/multi.EUR.o
/usr/lib/gcc/riscv-none-elf/11.2.0/../../../../riscv-none-elf/bin/ld: warning: Objects/Pinecilv2/./Core/BSP/Pinecilv2/bl_mcu_sdk/components/ble/ble_stack/sbc/enc/sbc_analysis.o has a section extending past end of file
/usr/lib/gcc/riscv-none-elf/11.2.0/../../../../riscv-none-elf/bin/ld: error: Objects/Pinecilv2/./Core/BSP/Pinecilv2/bl_mcu_sdk/components/ble/ble_stack/sbc/enc/sbc_analysis.o: ELF section name out of range
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:846: Hexfile/Pinecilv2_multi_compressed_Bulgarian+Russian+Serbian+Ukrainian.elf] Error 1
make[1]: Leaving directory '/build/ironos/source'
make: *** [Makefile:162: firmware-multi_compressed_Bulgarian+Russian+Serbian+Ukrainian] Error 2
make: *** Waiting for unfinished jobs....
Generating BriefLZ compressed translation for multi-language European
INFO:root:Reading pickled language data from Objects/Pinecilv2/Core/Gen/translation.files/multi.EUR.pickle...
INFO:root:Read language data for ['EN', 'CS', 'DA', 'DE', 'ES', 'FI', 'FR', 'HR', 'HU', 'IT', 'LT', 'NL', 'NL_BE', 'NB', 'PL', 'PT', 'SK', 'SL', 'SV', 'TR', 'VI']
INFO:root:Build version: v2.22B.55D36C98
INFO:root:Generating block for ['EN', 'CS', 'DA', 'DE', 'ES', 'FI', 'FR', 'HR', 'HU', 'IT', 'LT', 'NL', 'NL_BE', 'NB', 'PL', 'PT', 'SK', 'SL', 'SV', 'TR', 'VI']
INFO:root:Font table 12x16 compressed from 3672 to 1528 bytes (ratio 0.416)
INFO:root:Font table 06x08 compressed from 798 to 739 bytes (ratio 0.926)
INFO:root:Strings for EN compressed from 3232 to 2339 bytes (ratio 0.724)
INFO:root:Strings for CS compressed from 3418 to 2525 bytes (ratio 0.739)
INFO:root:Strings for DA compressed from 3396 to 2530 bytes (ratio 0.745)
INFO:root:Strings for DE compressed from 3526 to 2441 bytes (ratio 0.692)
INFO:root:Strings for ES compressed from 3806 to 2559 bytes (ratio 0.672)
INFO:root:Strings for FI compressed from 3310 to 2594 bytes (ratio 0.784)
INFO:root:Strings for FR compressed from 3758 to 2510 bytes (ratio 0.668)
INFO:root:Strings for HR compressed from 3972 to 2772 bytes (ratio 0.698)
INFO:root:Strings for HU compressed from 3510 to 2513 bytes (ratio 0.716)
INFO:root:Strings for IT compressed from 4304 to 2587 bytes (ratio 0.601)
INFO:root:Strings for LT compressed from 3626 to 2710 bytes (ratio 0.747)
INFO:root:Strings for NL compressed from 3626 to 2574 bytes (ratio 0.71)
INFO:root:Strings for NL_BE compressed from 3340 to 2553 bytes (ratio 0.764)
INFO:root:Strings for NB compressed from 3194 to 2407 bytes (ratio 0.754)
INFO:root:Strings for PL compressed from 3844 to 2762 bytes (ratio 0.719)
INFO:root:Strings for PT compressed from 3394 to 2441 bytes (ratio 0.719)
INFO:root:Strings for SK compressed from 3454 to 2648 bytes (ratio 0.767)
INFO:root:Strings for SL compressed from 3172 to 2541 bytes (ratio 0.801)
INFO:root:Strings for SV compressed from 3206 to 2516 bytes (ratio 0.785)
INFO:root:Strings for TR compressed from 3160 to 2608 bytes (ratio 0.825)
INFO:root:Strings for VI compressed from 3300 to 2344 bytes (ratio 0.71)
INFO:root:Done
Linking Hexfile/Pinecilv2_multi_compressed_European.elf
/usr/lib/gcc/riscv-none-elf/11.2.0/../../../../riscv-none-elf/bin/ld: warning: Objects/Pinecilv2/./Core/BSP/Pinecilv2/bl_mcu_sdk/components/ble/ble_stack/sbc/enc/sbc_analysis.o has a section extending past end of file
/usr/lib/gcc/riscv-none-elf/11.2.0/../../../../riscv-none-elf/bin/ld: error: Objects/Pinecilv2/./Core/BSP/Pinecilv2/bl_mcu_sdk/components/ble/ble_stack/sbc/enc/sbc_analysis.o: ELF section name out of range
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:846: Hexfile/Pinecilv2_multi_compressed_European.elf] Error 1
make[1]: Leaving directory '/build/ironos/source'
make: *** [Makefile:162: firmware-multi_compressed_European] Error 2

Binary files mentioned in the log above can be found here.

My plan from here is to:

  • run builds without -j, with -j2, and with -j4 to compare the behavior;
  • run builds using different ways of calling make: make -C source multi_... vs make multi_... vs cd source && make multi_... (I doubt this is the reason, but I want to rule out that it is somehow related to the change of the build job command for the respective build on GitHub CI).
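The plan above amounts to a small build matrix. Here is a minimal dry-run sketch that only prints the command lines instead of running them. TARGETS is a placeholder for the multi-lang targets, and print_matrix and job_opts are names made up for this illustration:

```shell
#!/usr/bin/env bash
# Dry run only: prints the planned build variants, executes nothing.
# TARGETS is a placeholder, not a real target name.
set -eu

job_opts=("" "-j2" "-j4")   # no -j, -j2 (what CI seems to use), -j4 (local nproc)

print_matrix() {
    local targets=$1 j
    for j in "${job_opts[@]}"; do
        # ${j:+$j } expands to "-jN " or to nothing when j is empty
        echo "make ${j:+$j }$targets"
        echo "make -C source ${j:+$j }$targets"
        echo "(cd source && make ${j:+$j }$targets)"
    done
}

matrix=$(print_matrix "TARGETS")
echo "$matrix"
```

Each printed line could then be run in a loop like the one in test.sh to see which combination fails.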

My current suspicion is that it is related to the parallel build creating a race condition-like situation (i.e. some binary file is not fully generated yet when a related dependency of a target inside the Makefile "thinks" that it's ready).

Less likely (but not impossible, BTW), it could be a bug in the toolchain.

ia assigned Ralim Jul 28, 2023
Collaborator Author

ia commented Jul 28, 2023

The issue reproduces locally about 99% of the time, but the error is sometimes slightly different from run to run:

Linking Hexfile/Pinecilv2_multi_compressed_European.elf
lto1: fatal error: bytecode stream in file 'Objects/Pinecilv2/./Core/BSP/Pinecilv2/bl_mcu_sdk/common/partition/partition.o' generated with GCC compiler older than 10.0
compilation terminated.
lto-wrapper: fatal error: riscv-none-elf-g++ returned 1 exit status
compilation terminated.
/usr/lib/gcc/riscv-none-elf/11.2.0/../../../../riscv-none-elf/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:846: Hexfile/Pinecilv2_multi_compressed_European.elf] Error 1
make[1]: Leaving directory '/build/ironos/source'
make: *** [Makefile:162: firmware-multi_compressed_European] Error 2

And it seems it's my fault after all, sorry! 😶


TL;DR: probable root cause (a mini write-up, or "today I learned"):

  • if you call make -jX target1 target2 ... targetY and only one Makefile is involved, a single make process uses the X job slots to run the needed internal dependencies/internal targets in parallel, while the external targets given on the command line are processed in order by that same process rather than as separate parallel builds;
  • BUT if you call make -jX target1 target2 ... targetY and the main Makefile passes each input target through to another make call on another Makefile, then the external targets are run in parallel as well, i.e. make simply runs make target1, make target2, and so on in parallel (up to X of them at once for -jX).
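A toy reproduction of the difference, assuming GNU make is installed. The Makefiles, the targets fw_a/fw_b, and shared.o are invented for illustration and are not the IronOS build files. Both goals depend on the same slow intermediate file. The script counts how many times that file gets compiled when one make process builds both goals, versus when an outer Makefile forwards each goal to its own sub-make:

```shell
#!/usr/bin/env bash
# Toy reproduction with hypothetical targets; not the IronOS Makefiles.
set -eu

command -v make >/dev/null 2>&1 || { echo "GNU make not found, skipping"; exit 0; }

demo=$(mktemp -d)
mkdir "$demo/source"

# Inner Makefile: two firmware goals share one slow "object file".
# Every compile appends a line to compile.log so it can be counted.
cat > "$demo/source/Makefile" <<'EOF'
fw_a fw_b: shared.o ; @echo "link $@"
shared.o: ; @echo compiled >> compile.log; sleep 1; touch $@
EOF

# Outer Makefile: forwards each requested goal to its own sub-make.
cat > "$demo/Makefile" <<'EOF'
fw_a fw_b: ; @$(MAKE) -C source $@
EOF

count_compiles() { wc -l < "$demo/source/compile.log" | tr -d ' '; }
clean_demo() { rm -f "$demo/source/shared.o" "$demo/source/compile.log"; }

# Case 1: one make process, one dependency graph.
clean_demo
make -s -j2 -C "$demo/source" fw_a fw_b >/dev/null
single=$(count_compiles)

# Case 2: the outer make starts two sub-makes side by side.
clean_demo
make -s -j2 -C "$demo" fw_a fw_b >/dev/null
forwarded=$(count_compiles)

echo "single make: shared.o compiled $single time(s)"
echo "forwarded:   shared.o compiled $forwarded time(s)"
rm -rf "$demo"
```

On a typical run the first count should be 1 and the second 2: the two sub-makes do not share a dependency graph, so each one rebuilds shared.o at the same path at the same time.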

I looked very carefully through the logs from before and after the changes in push.yml. Judging by how the logs report the build, in the scenario with cd source && make -j2 multi_..., or with make -C source/ (tested locally), the parallel build option -jN is not ignored but is applied inside the build of each input target, with the targets themselves processed in sequential order (there is only one Building for Pine64 Pinecilv2 line in the output because there is only one make process).

After the changes in push.yml, however, the targets themselves are run in parallel: the Building for Pine64 Pinecilv2 line is printed twice, because the make process forked into nproc processes of itself, and they started to compile target firmware-multi_compressed_European & target firmware-multi_compressed_Bulgarian+Russian+Serbian+Ukrainian at the same time. Hence the conflict of binary data in the files being generated/overwritten, which leads to the compilation errors.
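Why overlapping builds can break the link step can be modeled without a compiler. In this toy sketch (plain text stands in for ELF data; recompile, the marker files, and shared.o are all hypothetical), a second build rewrites an object file in place while the first build's link step reads it:

```shell
#!/usr/bin/env bash
# Toy model of a link step reading an object file that is being rewritten.
set -eu

work=$(mktemp -d)
obj="$work/shared.o"

# The first build already produced a complete object file (2 lines).
printf 'ELF-HEADER\nSECTIONS\n' > "$obj"

# The second build recompiles the same path: it truncates the file,
# writes the header, then pauses (compiler still busy) until told to go on.
recompile() {
    printf 'ELF-HEADER\n' > "$obj"
    touch "$work/half-written"
    while [ ! -e "$work/continue" ]; do sleep 0.1; done
    printf 'SECTIONS\n' >> "$obj"
}
recompile &

# The first build's "linker" reads the file in the middle of the rewrite.
while [ ! -e "$work/half-written" ]; do sleep 0.1; done
seen_by_linker=$(wc -l < "$obj" | tr -d ' ')

touch "$work/continue"
wait
final=$(wc -l < "$obj" | tr -d ' ')

echo "linker saw $seen_by_linker line(s); finished file has $final line(s)"
rm -rf "$work"
```

The reader sees only part of the file, which is plausibly the same kind of half-written input behind messages such as "has a section extending past end of file".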

Working on a fix now...

ia added a commit to ia/IronOSf that referenced this issue Jul 28, 2023
ia mentioned this issue Jul 28, 2023
Collaborator Author

ia commented Jul 28, 2023

I just added -C source/ to test.sh from the original report, and after that I got more than a dozen successful build cycles in a row. The PR is ready here. I had zero idea about these nuances of make's behavior, BTW.

Ralim pushed a commit that referenced this issue Jul 29, 2023
Owner

Ralim commented Jul 29, 2023

Neither did I so I never caught it 😓
Thanks for getting a fix figured out before I woke up; very nice to wake up to a fixed issue 😁

Collaborator Author

ia commented Jul 29, 2023

> Thanks for getting a fix figured out before I woke up; very nice to wake up to a fixed issue

Sure, no problem! Sorry for bringing this bug into the repo in the first place. :|

I could be wrong about the terminology in the root cause part, but the bottom line, as far as I understand it, is:

  • parallel execution of multiple sub-targets of a main target, started by one external make call using one Makefile, is good & ok;
  • parallel execution of multiple targets requested through one make call that is then split into more than one make call (one for each independent target) using the same Makefile may lead to nondeterministic behavior (as they say).

Something like that, at least; I only dug into this briefly, just enough to fix the issue in the quickest suitable way.

Owner

Ralim commented Jul 29, 2023

This does make sense, but also makes it hairy to debug 😓
