Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmake: fix races without ninja #11821

Merged
merged 4 commits into from Apr 11, 2019
Merged

cmake: fix races without ninja #11821

merged 4 commits into from Apr 11, 2019

Conversation

julianoes
Copy link
Contributor

This fixes build races which happened if "Unix Makefiles" instead of ninja-build was used as the cmake backend.

For any dependencies of commands on files we need to create a target.
Otherwise, if "Unix Makefiles" are used as the generator the commands are run in parallel on the different files which often can lead to races or redundancies in our build.

A nice write-up can be found here:
https://samthursfield.wordpress.com/2015/11/21/cmake-dependencies-between-targets-and-files-and-custom-commands/#custom-commands-and-parallel-make

Fixes #11003.

@julianoes julianoes requested review from dagar and davids5 April 9, 2019 14:59
@dagar
Copy link
Member

dagar commented Apr 9, 2019

Thanks for digging into this, I'll take a closer look.

In the past there were a number of other issues with dependencies when chains of custom command generated files are involved. I'll review again.

@dagar dagar added this to the Release v1.9.0 milestone Apr 9, 2019
@dagar dagar added this to To Do in Release 1.11 Blockers via automation Apr 9, 2019
This fixes build races which happened if "Unix Makefiles" instead of
ninja-build was used as the cmake backend.

For any dependencies of commands on files we need to create a target.
Otherwise, if "Unix Makefiles" are used as the generator the commands
are run in parallel on the different files which often can lead to
races or redundancies in our build.

A nice write-up can be found here:
https://samthursfield.wordpress.com/2015/11/21/
cmake-dependencies-between-targets-and-files-and-custom-commands/#
custom-commands-and-parallel-make
This should prevent future regressions when compiling with "Unix
Makefiles" or others instead of ninja as the generator behind cmake.
@julianoes
Copy link
Contributor Author

@Tony3dr could you build and flash this without ninja as follows and test please? Thanks!

make clean
NO_NINJA_BUILD=1 make px4_fmu-v3_default

I suggest to check fmu-v3 and fmu-v5, just a quick regression test / sanity check.

@julianoes julianoes moved this from To Do to Testing in Release 1.11 Blockers Apr 10, 2019
@julianoes
Copy link
Contributor Author

@dagar the builds are still different but I think that's probably expected:

-rwxr-xr-x. 1 julianoes julianoes 1018432 Apr 10 11:29 px4_fmu-v2-ninja.bin
-rwxr-xr-x. 1 julianoes julianoes 1018488 Apr 10 11:31 px4_fmu-v2-noninja.bin

@dagar
Copy link
Member

dagar commented Apr 10, 2019

@dagar the builds are still different but I think that's probably expected:

It's definitely not expected. As a start we can use bloaty to compare the binaries.

@dagar
Copy link
Member

dagar commented Apr 10, 2019

@julianoes in addition to the Jenkins compile pipeline fmu-v2 and fmu-v5 are built in the default top level Jenkinsfile (https://github.com/PX4/Firmware/blob/master/Jenkinsfile) for bloaty comparisons. How about we just change those to use the Makefile generator?

If possible I'd prefer to keep Jenkinsfile-compile to a single build per board because it serves as an archive for binaries.

@dagar
Copy link
Member

dagar commented Apr 10, 2019

@julianoes
Copy link
Contributor Author

@dagar ok, I'll move it to Jenkinsfile.

@Tony3dr Tony3dr added this to Ready for testing in Test Flights Apr 10, 2019
@dannyfpv
Copy link

dannyfpv commented Apr 10, 2019

Tested on pixhawk 4 v5

Modes Tested:

  • Takeoff: Good
  • Position Mode: Good.
  • Altitude Mode: Good.
  • Stabilized Mode: Good.
  • Mission Plan Mode (Automated): Good.
  • RTL (Return To Land): Good.

Pre-flight Procedure:

  • Flash firmware through QGC ✓
  • Arm vehicle ✓
  • Calibrations ✓

Flight Test:

  • Arm and Take off in position mode ✓
  • Throttle Test ✓
  • Pitch, Roll, Yaw ✓
  • Mode Switch ✓
  • RTL ✓

Notes:
Took off in Position mode, after flying for approximately one minute, switched to Altitude Mode, then stabilized mode and mission mode No issues noted good flight in general. The vehicle behaved as expected.

https://review.px4.io/plot_app?log=bbcc1b2e-ce65-43f4-89d3-6372bf7d4144

@Junkim3DR
Copy link

Junkim3DR commented Apr 10, 2019

Tested on Pixhawk Pro v4

Modes Tested

  • Takeoff: Good
  • Position Mode: Good.
  • Altitude Mode: Good.
  • Stabilized Mode: Good.
  • Mission Plan Mode (Automated): Good.
  • RTL (Return To Land): Good.

Pre-flight Procedure:

  • Flash firmware through QGC ✓
  • Arm vehicle ✓
  • Calibrations ✓
  • Parameter check ✓

Flight Test:

  • Arm and Take off in position mode ✓
  • Throttle Test ✓
  • Pitch, Roll, Yaw ✓
  • Mode Switch ✓
  • RTL ✓

Notes:
Arm and Take off in position mode, after flying for approximately one minute, switched to altitude then stabilized mode proceed to switch to mission plan mode then make sure that vehicle follows all waypoints as shown in QGC, once completed all waypoint turn off RC to trigger Failsafe RTL and see landing behavior.

https://review.px4.io/plot_app?log=b6ae5bba-0e22-4177-a3c8-18533dc92534

@jorge789
Copy link

jorge789 commented Apr 10, 2019

Tested on Pixhawk Cube v3

Modes Tested

  • Takeoff: Good
  • Position Mode: Good.
  • Altitude Mode: Good.
  • Stabilized Mode: Good.
  • Mission Plan Mode (Automated): Good.
  • RTL (Return To Land): Good.

Pre-flight Procedure:

  • Flash firmware through QGC ✓
  • Arm vehicle ✓
  • Calibrations ✓
  • Parameter check ✓

Flight Test:

  • Arm and Take off in position mode ✓
  • Throttle Test ✓
  • Pitch, Roll, Yaw ✓
  • Mode Switch ✓
  • RTL ✓

Notes:
Arm and Take off in position mode, after flying for approximately one minute, switched to altitude then stabilized mode proceed to switch to mission plan mode then make sure that vehicle follows all waypoints as shown in QGC, once completed all waypoint turn off RC to trigger Failsafe RTL and see landing behavior.

https://review.px4.io/plot_app?log=17cb4f32-642d-4cba-bbc1-eed3a7372305

Tested on PixRacer v4:

Modes Tested

  • Takeoff: Good
  • Position Mode: Good.
  • Altitude Mode: Good.
  • Stabilized Mode: Good.
  • Mission Plan Mode (Automated): Good.
  • RTL (Return To Land): Good.

Pre-flight Procedure:

  • Flash firmware through QGC ✓
  • Arm vehicle ✓
  • Calibrations✓
  • Parameter check ✓

Flight Test:

  • Arm and Take off in position mode ✓
  • Throttle Test ✓
  • Pitch, Roll, Yaw ✓
  • Mode Switch ✓
  • RTL ✓

Notes:
Arm and Take off in position mode, after flying for approximately one minute, switched to altitude then stabilized mode proceed to switch to mission plan mode then make sure that vehicle follows all waypoints as shown in QGC, once completed all waypoint turn off RC to trigger Failsafe RTL and see landing behavior.

https://review.px4.io/plot_app?log=17cb4f32-642d-4cba-bbc1-eed3a7372305

@Tony3dr Tony3dr moved this from Ready for testing to Done in Test Flights Apr 10, 2019
@julianoes
Copy link
Contributor Author

Great, thanks a lot for testing and the write-up!

@julianoes julianoes merged commit cd9b3d6 into master Apr 11, 2019
Release 1.11 Blockers automation moved this from Testing to Done Apr 11, 2019
@julianoes julianoes deleted the fix-makefile-build branch April 11, 2019 08:07
@julianoes
Copy link
Contributor Author

@davids5 and I spent some time looking into differences between the Makefile and the ninja build and can confidently say that the binaries produced are the same except from the build time.

One caveat is that the paths included in the binary using __FILE__ which is used in assert() and ut_assert() are different for Makefile and ninja builds:

  • For Makefile the whole path is used, e.g. /home/julianoes/src/Firmware/src/systemcmds/....
  • For ninja the relative path is used, e.g. ../../src/systemcmds/....

@dagar if you have an idea how to prevent this that would be good.

We replaced __FILE__ with the empty string to remove that difference using:

diff --git a/CMakeLists.txt b/CMakeLists.txt
index 0bfcee70d8..022a94bb98 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -106,6 +106,8 @@ set(PX4_BINARY_DIR "${CMAKE_CURRENT_BINARY_DIR}")
 
 list(APPEND CMAKE_MODULE_PATH ${PX4_SOURCE_DIR}/cmake)
 
+add_definitions("-D__FILE__=\"\"" -Wno-builtin-macro-redefined)
+
 #=============================================================================
 # git
 #

We used the following to do comparisons:

Build with ninja:

make px4_fmu-v4_default
mv build/px4_fmu-v4_default build/px4_fmu-v4_default-ninja
NO_NINJA_BUILD=1 make px4_fmu-v4_default
mv build/px4_fmu-v4_default build/px4_fmu-v4_default-make

Extract strings:

strings build/px4_fmu-v4_default-make/px4_fmu-v4.bin > build/px4_fmu-v4_default-make/px4_fmu-v4.bin.strings
strings build/px4_fmu-v4_default-ninja/px4_fmu-v4.bin > build/px4_fmu-v4_default-ninja/px4_fmu-v4.bin.strings

And compare them using meld, colordiff, or just diff"

colordiff build/px4_fmu-v4_default-make/px4_fmu-v4.bin.strings build/px4_fmu-v4_default-ninja/px4_fmu-v4.bin.strings

Once it was all matching we could also compare the binary in hex:

xxd build/px4_fmu-v4_default-make/px4_fmu-v4.bin > build/px4_fmu-v4_default-make/px4_fmu-v4.bin.hex
xxd build/px4_fmu-v4_default-ninja/px4_fmu-v4.bin > build/px4_fmu-v4_default-ninja/px4_fmu-v4.bin.hex

And then compare that:

colordiff build/px4_fmu-v4_default-make/px4_fmu-v4.bin.hex build/px4_fmu-v4_default-ninja/px4_fmu-v4.bin.hex

We also tried to compare the output of objdump -d -S file.elf but unfortunately the elf file produced with ninja does not contain the source code like the one from make does. We're not sure why that is.

arm-none-eabi-objdump -d -S build/px4_fmu-v4_default-make/px4_fmu-v4_default.elf > build/px4_fmu-v4_default-make/px4_fmu-v4_default.elf.objdump
arm-none-eabi-objdump -d -S build/px4_fmu-v4_default-ninja/px4_fmu-v4_default.elf > build/px4_fmu-v4_default-ninja/px4_fmu-v4_default.elf.objdump

Here is a comparison:
make:

08073768 <commander_main>:
	if (argc < 2) {
 8073768:	2801      	cmp	r0, #1
{
 807376a:	e92d 41f0 	stmdb	sp!, {r4, r5, r6, r7, r8, lr}
 807376e:	4607      	mov	r7, r0
 8073770:	460d      	mov	r5, r1
	if (argc < 2) {
 8073772:	dc03      	bgt.n	807377c <commander_main+0x14>
		usage("missing command");
 8073774:	4895      	ldr	r0, [pc, #596]	; (80739cc <commander_main+0x264>)
	usage("unrecognized command");
 8073776:	f7fb fca1 	bl	806f0bc <_Z5usagePKc>
 807377a:	e048      	b.n	807380e <commander_main+0xa6>
	if (!strcmp(argv[1], "start")) {

And ninja:

08073768 <commander_main>:
 8073768:	2801      	cmp	r0, #1
 807376a:	e92d 41f0 	stmdb	sp!, {r4, r5, r6, r7, r8, lr}
 807376e:	4607      	mov	r7, r0
 8073770:	460d      	mov	r5, r1
 8073772:	dc03      	bgt.n	807377c <commander_main+0x14>
 8073774:	4895      	ldr	r0, [pc, #596]	; (80739cc <commander_main+0x264>)
 8073776:	f7fb fca1 	bl	806f0bc <_Z5usagePKc>
 807377a:	e048      	b.n	807380e <commander_main+0xa6>
 807377c:	684e      	ldr	r6, [r1, #4]
 807377e:	4994      	ldr	r1, [pc, #592]	; (80739d0 <commander_main+0x268>)
 8073780:	f8df 82ec 	ldr.w	r8, [pc, #748]	; 8073a70 <commander_main+0x308>
 8073784:	4630      	mov	r0, r6
 8073786:	f79e f8e2 	bl	801194e <strcmp>

@dagar
Copy link
Member

dagar commented Apr 12, 2019

We can gradually eliminate our use of __FILE__, but at this point I'm not too concerned (it's almost entirely test code).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

FMUv2 Nuttx build regression
6 participants