Join GitHub today
GitHub is home to over 40 million developers working together to host and review code, manage projects, and build software together.Sign up
Unable to return a buffer to the encoder port on long-running processes #40
Okay, finally gotten to the bottom of this and a fix will be forthcoming shortly. But it's probably worth documenting as it's an interesting one...
The issue is indeed caused by a race condition, and it's been there since day one. However, being a race condition it requires specific timing for the bug to manifest. In the case of earlier versions of picamera (and I think even in raspistill too) the main thread is sufficiently fast in disabling the background encoder thread that the bug had little or no chance of appearing.
One extra "if" statement was introduced in release 1.0 and it seems this was just enough to push the main thread into being slow enough for the bug to appear. Specifically, once the camera is set to capture, the encoder callback is repeatedly until a full frame has been captured. Once a full frame has been captured, it appears the encoder callback is called once more (for what purpose I'm not sure, but it's not for writing anything to the file). Only the main thread can disable the port (shutting down the background encoder thread), so once the encoder callback notices the end of the frame it wakes up the main thread (via a semaphore) to disable the port.
In the vast majority of cases, the main thread wakes up quickly, disables the port, and everything is fine. But one in every few thousand calls (in picamera 1.0), the main thread is a little slow and the extraneous post-frame-end call to the encoder callback (mentioned above), is already running when the port gets disabled. Crucially, the port has to be disabled immediately before the call to mmal_port_send_buffer because before this is a check that the port is enabled (which must pass in order for the call to mmal_port_send_buffer to occur). This is the location of the race condition.
To fix the race condition I've done a couple of things. Firstly, we no longer check whether the port is enabled when recycling the buffers - we just assume it. Secondly, we use the internal flag in the encoder ("stopped") to ensure that, after frame-end is encountered, any subsequent calls to the encoder's callback are ignored. As the flag is only set within the encoder thread, or when the encoder thread is guaranteed not to be running, no race condition is introduced.
The test suite is currently running (mostly to make sure I haven't broken any edge cases as this is a change to the base encoder class) and if it passes, I'll push the fix for anyone desperate to get it early. Otherwise, it'll be in a release just before the weekend (hopefully with a couple of other minor bits if I can get them done in time).