This repository has been archived by the owner on Mar 24, 2018. It is now read-only.

QEMU issue (when building for ARM) #87

Closed
po1 opened this issue Jan 30, 2013 · 7 comments

Comments

@po1

po1 commented Jan 30, 2013

As stated here:
https://bugs.launchpad.net/qemu/+bug/955379

there is a big problem with all current versions of QEMU, which can be triggered by cmake (among other things).
What happens is that sometimes, during a check (for CXX ABI info, among others), the whole build process hangs forever. It hangs because of a bug in the select() call handler in QEMU: the SIGCHLD is lost in space when the child finishes, so the parent (cmake) keeps waiting for it forever and ever.

Quick workaround: when that happens, manually doing a 'kill -SIGCHLD $pid' will unlock the cmake process and resume its execution.
As I see it, there are 2 things we can do:

  1. write a script that automates the SIGCHLD sending when the build process hangs
  2. write a patch for qemu

No. 2 does not sound very realistic...
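
For reference, the manual workaround can also be issued from a script. The following is only a minimal sketch in Python, not something taken from this thread: the helper name `nudge_parent` is hypothetical, and it assumes you already know the PID of the hung cmake process.

```python
import os
import signal


def nudge_parent(pid: int) -> bool:
    """Send SIGCHLD to `pid`, like `kill -SIGCHLD $pid` (hypothetical helper).

    Returns True if the signal was sent, False if no such process exists.
    """
    try:
        os.kill(pid, signal.SIGCHLD)
    except ProcessLookupError:
        # The process exited between detection and signalling.
        return False
    return True
```

Since the default action for SIGCHLD is to ignore it, sending it to a process that is not actually stuck should be harmless.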

@trainman419
Contributor

On my build farm, I'm only seeing this now that I've turned on armhf builds; I wasn't seeing it when building for armel.

@po1
Author

po1 commented Feb 1, 2013

This is a race condition. Actually it (almost?) never happened on one of my two machines, but would happen regularly on the other. It depends on a lot of things, including the processor, kernel, direction of the wind...

@trainman419
Contributor

Confirmed; I'm seeing it on armel builds now as well.

@po1
Author

po1 commented Feb 8, 2013

There is a pattern that one can see in the output of 'ps auxf': a dead child takes up 0 bytes of memory, and can thus be easily identified by a script running periodically.
One just has to send a SIGCHLD to the parent process to resume the build.
It is a hacky workaround, but it should work.
If nobody has got a better idea I will put a pull request together with a script that checks for that in the background.
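
A rough sketch of what such a periodic check could look like, in Python. This is an illustration under stated assumptions, not the script that was eventually used: the function names are hypothetical, it assumes a `ps` that accepts `-eo pid,ppid,stat,comm` (as procps on Linux does), and it detects dead children by the `Z` (zombie) state flag rather than by the zero-memory pattern in 'ps auxf'.

```python
import os
import signal
import subprocess
from typing import Set


def find_zombie_parents(ps_output: str, name: str = "") -> Set[int]:
    """Parse `ps -eo pid,ppid,stat,comm` output and return the parent PIDs
    of zombie processes whose command contains `name` (empty matches all)."""
    parents = set()
    for line in ps_output.splitlines():
        fields = line.split(None, 3)
        # Skip the header line and anything malformed.
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        _pid, ppid, stat, comm = fields
        if stat.startswith("Z") and name in comm:
            parents.add(int(ppid))
    return parents


def nudge_zombie_parents(name: str = "") -> Set[int]:
    """Send SIGCHLD to every parent of a matching zombie process."""
    out = subprocess.run(
        ["ps", "-eo", "pid,ppid,stat,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    parents = find_zombie_parents(out, name)
    for ppid in parents:
        try:
            os.kill(ppid, signal.SIGCHLD)
        except (ProcessLookupError, PermissionError):
            # Parent already gone, or owned by another user: skip it.
            pass
    return parents
```

Calling `nudge_zombie_parents("cmake")` from a loop with a sleep, or from cron, would restrict the check to cmake children; how often to run it is a judgment call.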

@tfoote
Member

tfoote commented May 1, 2013

@po1 Is this in your aggregated patches?

@po1
Author

po1 commented May 2, 2013

There is a workaround: a script written by Austin that checks for zombie cmake processes and sends a SIGCHLD to the parent process. It should be included in the files that I put together, and it is documented in the howto.

@dirk-thomas
Member

The existing buildfarm will not be modified anymore. If this is a problem on the new farm, please consider filing a new ticket there (related to ros-infrastructure/ros_buildfarm#21).

4 participants