New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Raspivid: is abort functionality working correctly? #268

Closed
fullergalway opened this Issue Nov 5, 2015 · 25 comments

Comments

Projects
None yet
10 participants
@fullergalway
Copy link
Contributor

fullergalway commented Nov 5, 2015

When raspivid output channel goes away, message repeats about aborting, but never aborts. Try something like:

raspivid -t 60000 -o - | cat > /dev/null &
killall cat

Messages like this appear:

mmal: Failed to write buffer data ...

I tried, but failed to understand why pause_and_test_abort method isn't catching this, and eventually managed to get expected behavior for my app with a hack of RaspiVid.c, changing this:

if (bytes_written != buffer->length)
{
   vcos_log_error("Failed to write buffer data (%d from %d)- aborting", bytes_written, buffer->length);
   pData->abort = 1;
}

to this:

if (bytes_written != buffer->length)
{
   vcos_log_error("Failed to write buffer data (%d from %d)- aborting", bytes_written, buffer->length);
   sys.exit(1);
}

Thoughts?

@JamesH65

This comment has been minimized.

Copy link
Collaborator

JamesH65 commented Nov 5, 2015

Hmm, I wonder if the loop that tests pData->abort has stopped by the time
this happens. Dunno. I'll try and find time to take a look. Note that this
code was originally intended simply as demo code, certainly not
'production' code, so there are bound to be gotcha's! Hopefully they will
be found and fixed as time goes by. This looks like one of them!

On 5 November 2015 at 11:13, fullergalway notifications@github.com wrote:

When raspivid output channel goes away, message repeats about aborting,
but never aborts. Try something like:

raspivid -t 60000 -o - | cat > /dev/null &
killall cat

Messages like this appear:

mmal: Failed to write buffer data ...

I tried, but failed to understand why pause_and_test_abort method isn't
catching this, and eventually managed to get expected behavior for my app
with a hack of RaspiVid.c, changing this:

if (bytes_written != buffer->length)
{
vcos_log_error("Failed to write buffer data (%d from %d)- aborting", bytes_written, buffer->length);
pData->abort = 1;
}

to this:

if (bytes_written != buffer->length)
{
vcos_log_error("Failed to write buffer data (%d from %d)- aborting", bytes_written, buffer->length);
sys.exit(1);
}

Thoughts?


Reply to this email directly or view it on GitHub
#268.

@cjsoftuk

This comment has been minimized.

Copy link

cjsoftuk commented Feb 18, 2016

This appears to exist (in some form) in the latest build of raspivid. It's exceptionally unhelpful in my case as I'm piping to netcat and it appears that a dead pipe doesn't always cause raspivid to terminate.

I managed to create a situation where it did exit correctly if run from bash/dash/whatever shell is default, but not if run from systemd as part of rc.local.

@JamesH65

This comment has been minimized.

Copy link
Collaborator

JamesH65 commented Feb 18, 2016

I've not had time to look at this - if you find a fix, please send us a
patch.

On 18 February 2016 at 09:02, cjsoftuk notifications@github.com wrote:

This appears to exist (in some form) in the latest build of raspivid. It's
exceptionally unhelpful in my case as I'm piping to netcat and it appears
that a dead pipe doesn't always cause raspivid to terminate.

I managed to create a situation where it did exit correctly if run from
bash/dash/whatever shell is default, but not if run from systemd as part of
rc.local.


Reply to this email directly or view it on GitHub
#268 (comment)
.

@cjsoftuk

This comment has been minimized.

Copy link

cjsoftuk commented Feb 18, 2016

@JamesH65 - Will try to have a look at some point - however I've worked around it in my application using auto-login and .bashrc for now. It seems that launching from there doesn't cause this issue, only from systemd/init.

@fedeman

This comment has been minimized.

Copy link

fedeman commented Sep 4, 2016

Thanks for sharing.
I setup a systemctl service after multi-user.target as described in here and the script tries to run without succes (withsystemctl status, I can see _mmal: Failed to write buffer data_). After see that you were able to run it through auto login + bashrc setup, the script runs perfect. I didn't understand the source of the problem though...

@Ruffio

This comment has been minimized.

Copy link

Ruffio commented Dec 30, 2016

@fullergalway has this issue been resolved and therefore it should be closed?

@fullergalway

This comment has been minimized.

Copy link
Contributor Author

fullergalway commented Dec 31, 2016

@dvzrv

This comment has been minimized.

Copy link

dvzrv commented Feb 25, 2017

This problem still exists and should be fixed!
When raspivid is run, does it rely on any environment variables?
Those could be applied in a systemd service. Run on its own, it doesn't work though.

@JamesH65

This comment has been minimized.

Copy link
Collaborator

JamesH65 commented Feb 27, 2017

It does not rely on any environment variables.

Still not had a chance to look at the issue though. To busy right now, will try and get a look at some point, meanwhile if anyone else fancies a look please feel free to send us a patch.

@6by9

This comment has been minimized.

Copy link
Contributor

6by9 commented Feb 27, 2017

The original reported issue of running

raspivid -t 60000 -o - | cat > /dev/null &
killall cat

not killing raspivid doesn't seem to be an issue with the current raspivid. It exits on cue, and the GPU cleanup code does the right thing (ideally the app should have a signal handler to clean up properly, but there you go).

As nobody else has posted details of their command that fails, I'll give this 24 hours for someone to post a command-line that fails, otherwise this will be closed.

@6by9

This comment has been minimized.

Copy link
Contributor

6by9 commented Feb 28, 2017

No response. Closing as the original reported problem is not an issue on the current tree.

If there is an alternate mechanism that does have an issue then please reopen/create a new issue with those details.

@6by9 6by9 closed this Feb 28, 2017

@ghost

This comment has been minimized.

Copy link

ghost commented Mar 12, 2017

Still have the same issue, the service will never abort, although it should. Have to run it as simple bash script as well. Like

while true; do
  raspivid ...
  sleep 3
done
@6by9

This comment has been minimized.

Copy link
Contributor

6by9 commented Mar 12, 2017

@mankoell And what condition are you expecting that raspivid call to quit on?
You haven't given full steps for reproducing anything, therefore this remains closed. There are far too many raspivid parameters to guess which ones you're using.

@ghost

This comment has been minimized.

Copy link

ghost commented Mar 12, 2017

raspivid -o - -t 0 -vs -h $HEIGHT -w $WIDTH -fps $FPS -b $BITRATE -op $OPACITY $OPTIONS | ffmpeg -re -ar 44100 -ac 2 -acodec pcm_s16le -f s16le -ac 2 -i /dev/zero -f h264 -i - -vcodec copy -acodec aac -ab 128k -g 50 -strict experimental -f flv $1
@Malvineous

This comment has been minimized.

Copy link

Malvineous commented Mar 13, 2017

I'd just like to add a note to this as I came across the same issue with the latest firmware (well 20170301.)

It turns out that while this bug is fixed, and raspivid does terminate when the pipe breaks, this only seems to apply when running it via the command line. I was launching raspivid through systemd (then piping it to ffmpeg) and when ffmpeg died, raspivid kept running spewing out a bunch of mmal: Failed to write buffer data errors.

I'm not sure whether this is a bug in raspivid or not, however I found a workaround.

Instead of using pipes to direct from raspivid to the other program, I used Bash process substitution to start the other program and tell it to read data from raspivid instead.

To use a simple example, instead of this:

raspivid -o - | ffmpeg -i -

I replaced it with this:

ffmpeg -i <(raspivid -o -)

I am not entirely sure why this works, but perhaps it means systemd monitors the ffmpeg process rather than the raspivid one. I'm not sure why it's even a problem running raspivid from within systemd, but at least this way it works!

@JamesH65

This comment has been minimized.

Copy link
Collaborator

JamesH65 commented Mar 13, 2017

@6by9

This comment has been minimized.

Copy link
Contributor

6by9 commented Mar 13, 2017

@mankoell And what is in $OPTIONS?
And what conditions is that aborting on? Neither ffmpeg nor raspivid have any apparent triggers or time limit, therefore they should run indefinitely.
Still insufficient data to investigate.

@Malvineous That does sound like a subtlety of pipes, or possibly the order things get killed.
Why is ffmpeg dying? Is it actually fully terminating?

There is nothing fundamentally complicated about

static int pause_and_test_abort(RASPIVID_STATE *state, int pause)
{
   int wait;

   if (!pause)
      return 0;

   // Going to check every ABORT_INTERVAL milliseconds
   for (wait = 0; wait < pause; wait+= ABORT_INTERVAL)
   {
      vcos_sleep(ABORT_INTERVAL);
      if (state->callback_data.abort)
         return 1;
   }
   return 0;
}

ABORT_INTERVAL is 100ms.
As I still haven't got sufficient information from anyone on how to fully reproduce this, I'll kick it back to those having issues to add a logging line into that loop to confirm that is where the main raspivid thread is waiting. If it isn't then there's some other fact that you haven't imparted.

@6by9

This comment has been minimized.

Copy link
Contributor

6by9 commented Mar 13, 2017

And if this is only under systemd, then there appear to be some quirks as it is not doing shell expansion on the exec line.
A Google for "start app systemd pipe" threw up http://stackoverflow.com/questions/32968506/how-to-pipe-output-to-a-file-when-running-as-a-systemd-service, and linking on to https://www.freedesktop.org/software/systemd/man/systemd.kill.html#KillMode=
I'm not a systemd expert, but those may be linked to your issue. Is this issue when you are trying to stop the service, or due to an error?

@Malvineous

This comment has been minimized.

Copy link

Malvineous commented Mar 13, 2017

@JamesH65 I take your point, but for command line use there isn't much alternative. I originally started out with ffmpeg and the V4L device but it was so buggy at the time that raspivid was the only workable option, and it worked well enough that I've stuck with it. But the limitations are apparent, it's just that I'm not sure there's much alternative. I don't really want to have to code my own app when I'm not doing anything special with the camera. Does the V4L device now allow you to control all the same features of the camera that raspivid does? If so then maybe it's worth another look.

@6by9 I'm calling a bash script from within systemd so shell expansion shouldn't be a problem - which is also why I am so confused. systemd must be altering the way the bash script executes because running the same script file from the command line does not reproduce the problem. I'm not an expert either, but I can only guess that SIGPIPE normally reaches raspivid and it aborts (per the default signal handler) but when running under systemd that signal never reaches raspivid so it never knows to abort. Although from your code snippet it looks like you're handling a pipe failure differently, so I'm not sure why it affects your method as well. I am completely guessing here of course.

ffmpeg is dying because I am streaming over a USB wifi device, and due to occasional interference the wifi drops out and reconnects. When the interface is momentarily down, ffmpeg terminates. It is a clean termination, and killall ffmpeg has the same effect. I am relying on systemd to restart the service when this happens, but of course that only works when raspivid also terminates, otherwise systemd doesn't notice that anything is wrong!

I wasn't going to post the full steps to reproduce the issue because it's pretty involved, but for the record, this is how you can do it. I don't expect anyone to actually do this now there is a workaround, but if you are interested this should at least let you experiment.

  1. Use a distro that has systemd. I'm using Arch Linux ARM.
  2. Create /etc/systemd/system/stream-picam.sh with this content: (the IP address does not have to be changed, it's multicast and largely irrelevant for this test anyway)
#!/bin/bash

/opt/vc/bin/raspivid -w 1296 -h 972 -fps 0 -pf high -b 5000000 --exposure night --drc high --nopreview --timeout 0 -if both --intra 30 --inline -o - | ffmpeg -r 30 -f h264 -i - -vcodec copy -an -f mpegts udp://224.0.1.2:5004 -loglevel fatal
  1. Create /etc/systemd/system/stream-picam.service with this content:
[Unit]
Description=Raspberry Pi camera streaming
Wants=network-online.target
After=network-online.target

[Service]
ExecStart=/etc/systemd/system/stream-picam.sh
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
  1. Make systemd see the new service: systemctl daemon-reload
  2. Start the new service systemctl start stream-picam
  3. Confirm both raspivid and ffmpeg are running: ps aux | grep -E '(ff|vid)'
  4. Make ffmpeg terminate: killall ffmpeg
  5. Run command in step 6 again, this time ffmpeg is gone but raspivid is still there
  6. View output: systemctl status stream-picam - see lots of mmal errors.
@6by9

This comment has been minimized.

Copy link
Contributor

6by9 commented Mar 13, 2017

I originally started out with ffmpeg and the V4L device but it was so buggy at the time that raspivid was the only workable option,

And your bug report is logged where? I have generally tried to respond to issues raised.

Does the V4L device now allow you to control all the same features of the camera that raspivid does? If so then maybe it's worth another look.

It is a little more limited due to the defined V4L2 API missing some controls, but almost the whole list you've given there are supported or you're just taking the default value anyway:
-w 1296 -h 972 = v4l2-ctl -v width=1296,height=972,pixelformat=H264
-fps 0 = v4l2-ctl --set-ctrl=exposure_dynamic_framerate=1
-pf high = v4l2-ctl --set-ctrl=h264_profile=4 (default anyway)
-b 5000000 = v4l2-ctl --set-ctrl=video_bitrate=5000000
--exposure night = v4l2-ctl --set-ctrl=scene_mode=8
--drc high - not supported.
--nopreview - default
--timeout 0 - up to your app.
-if both - default
--intra 30 = v4l2-ctl --set-ctrl=h264_i_frame_period=30
--inline = v4l2-ctl --set-ctrl=repeat_sequence_header=1

I'm not an expert either, but I can only guess that SIGPIPE normally reaches raspivid and it aborts (per the default signal handler) but when running under systemd that signal never reaches raspivid so it never knows to abort. Although from your code snippet it looks like you're handling a pipe failure differently, so I'm not sure why it affects your method as well. I am completely guessing here of course.

If it is due to signals, then that is not this issue - reread the original report.
When mmal: Failed to write buffer data ... is logged it is due to the fwrite failing and should terminate the app. On fwrite not writing the correct amount of data, then the abort flag is set (https://github.com/raspberrypi/userland/blob/master/host_applications/linux/apps/raspicam/RaspiVid.c#L1439)

There is a signal handler installed, but it does nothing different except on SIGUSR1
https://github.com/raspberrypi/userland/blob/master/host_applications/linux/apps/raspicam/RaspiVid.c#L2290

All the code for Raspivid is open source so there is absolutely nothing stopping you adding some logging of the signals that systemd is or isn't sending - you don't have to wait for anyone else to investigate.

@Malvineous

This comment has been minimized.

Copy link

Malvineous commented Mar 13, 2017

Sorry I didn't mean to sound negative about the V4L bugs. It was only in the weeks after the camera's initial release and the general consensus at the time was that raspivid was being more actively developed and it did work much better at the time. It seemed pointless to submit V4L bug reports back then as it was so early in the development cycle and we were being told that things were still in a state of flux and improving all the time. Once I got raspivid working I just kind of forgot about it and it has been working happily for years, until my recent change from wired to wifi causing ffmpeg to terminate randomly.

All that was some years ago and I haven't tried the V4L2 API since, so I didn't realise it was now the preferred option (at least compared to raspivid.) Thanks for explaining the V4L2 flags, that helps a lot. I will see if I can switch to this.

I also didn't mean to post my original comment as a request for help, but rather to provide info for anyone else having the same problem since this page came up on Google when I was investigating the error message. Sorry if that was unclear. It was just that after reading your exasperated comment that still nobody had provided steps to reproduce the problem, I didn't want that to include me :-)

Again I apologise if anything I said came across negatively. Along with many others I sincerely appreciate all the effort of everyone working on these projects, and I'm very grateful for the time that you and others spend on fixing issues like the original one here!

@6by9

This comment has been minimized.

Copy link
Contributor

6by9 commented Mar 13, 2017

Yes, originally V4L2 was largely ignored. Implementing it got subcontracted out which didn't go brilliantly. Camera had launched in about June 2013, but V4L2 wasn't in a state to ship until Jan 2014. Broadcom then laying us all off in August 2014 didn't help, but the V4L2 driver did get improved over time.

Thank you for steps to reproduce something. It won't be high up on the list to investigate, and if you don't mind I'm going to shift it to a new issue because it doesn't appear to be the same as the original. The golden rule of bug tracking is keep issues separate until they are confirmed as duplicates otherwise you end up with crossing discussions.

@Malvineous

This comment has been minimized.

Copy link

Malvineous commented Mar 15, 2017

No worries at all, I wasn't sure if the problem was with raspivid or systemd or what.

@robert-blankenship

This comment has been minimized.

Copy link

robert-blankenship commented May 23, 2017

host=192.168.0.101
while true; do
        fifo=vid-pipe.fifo.$$
        mkfifo $fifo
        raspivid -o - -t 0 -vf -hf -fps 30 -b 6000000 > $fifo &
        nc $host 8000 < $fifo
        killall raspivid
        killall nc
        sleep 5
        rm $fifo
done

Using a named pipe seems to work :). Replace nc with your ffmpeg command or whatever other command you want.

@cristiklein

This comment has been minimized.

Copy link

cristiklein commented Jan 19, 2018

I found a solution to this problem. Seems like raspivid expects to get SIGPIPE when the process connected to its output terminates. For some reason, systemd defaults to ignoring SIGPIPE. Adding the following in the unit's service file fixes the problem for me:

[Service]
IgnoreSIGPIPE=false
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment