Add user-specified callback to video+image encoders (H.264 and PTS) #97

d2rk · 2014-05-15T09:10:08Z

I would like to sync video and audio frames. Is it any way to fix/edit video frame [timestamp (PTS)] https://github.com/waveform80/picamera/blob/master/picamera/encoders.py#L78)? Because I don't think here is any timing information in an h.264 stream.

waveform80 · 2014-05-15T12:51:53Z

There seems to be timing when I query it...

>>> import picamera
>>> camera = picamera.PiCamera()
>>> camera.resolution = (1280, 720)
>>> camera.framerate = 30
>>> camera.start_recording('/dev/null', format='h264')
>>> camera.frame
PiVideoFrame(index=57, keyframe=False, frame_size=17913L, video_size=1010411L, split_size=1010411L, timestamp=1838431L, header=False)
>>> camera.frame
PiVideoFrame(index=139, keyframe=False, frame_size=21132L, video_size=2663364L, split_size=2663364L, timestamp=4464760L, header=False)
>>> camera.frame
PiVideoFrame(index=183, keyframe=False, frame_size=27L, video_size=3468427L, split_size=3468427L, timestamp=None, header=True)
>>> camera.frame
PiVideoFrame(index=229, keyframe=False, frame_size=19927L, video_size=4394489L, split_size=4394489L, timestamp=7386552L, header=False)
>>> camera.frame
PiVideoFrame(index=265, keyframe=False, frame_size=18767L, video_size=5125989L, split_size=5125989L, timestamp=8535571L, header=False)
>>> camera.frame
PiVideoFrame(index=304, keyframe=False, frame_size=18172L, video_size=5846855L, split_size=5846855L, timestamp=9815908L, header=False)
>>> camera.frame
PiVideoFrame(index=335, keyframe=False, frame_size=18183L, video_size=6496454L, split_size=6496454L, timestamp=10800780L, header=False)
>>> camera.frame
PiVideoFrame(index=364, keyframe=False, frame_size=16082L, video_size=7029499L, split_size=7029499L, timestamp=11752823L, header=False)
>>> camera.frame
PiVideoFrame(index=393, keyframe=False, frame_size=17101L, video_size=7626398L, split_size=7626398L, timestamp=12672039L, header=False)
>>> camera.frame
PiVideoFrame(index=431, keyframe=False, frame_size=19795L, video_size=8385680L, split_size=8385680L, timestamp=13886716L, header=False)
>>> camera.frame
PiVideoFrame(index=467, keyframe=False, frame_size=17775L, video_size=9053536L, split_size=9053536L, timestamp=15068565L, header=False)
>>> camera.frame
PiVideoFrame(index=505, keyframe=False, frame_size=18411L, video_size=9824806L, split_size=9824806L, timestamp=16283241L, header=False)
>>> camera.frame
PiVideoFrame(index=526, keyframe=False, frame_size=18467L, video_size=10206506L, split_size=10206506L, timestamp=16972652L, header=False)
>>> camera.frame
PiVideoFrame(index=554, keyframe=False, frame_size=16888L, video_size=10781451L, split_size=10781451L, timestamp=17859040L, header=False)
>>> camera.frame
PiVideoFrame(index=587, keyframe=False, frame_size=16899L, video_size=11395820L, split_size=11395820L, timestamp=18942399L, header=False)
>>> camera.frame
PiVideoFrame(index=616, keyframe=False, frame_size=19440L, video_size=12011487L, split_size=12011487L, timestamp=19861615L, header=False)
>>> camera.frame
PiVideoFrame(index=643, keyframe=False, frame_size=19372L, video_size=12539546L, split_size=12539546L, timestamp=20748001L, header=False)
>>> camera.frame
PiVideoFrame(index=670, keyframe=False, frame_size=17088L, video_size=13052561L, split_size=13052561L, timestamp=21634391L, header=False)
>>> camera.stop_recording()

There's one frame up there where timestamp is None but it's a header frame so that's understandable. Are you seeing no timing information at all?

d2rk · 2014-05-15T13:20:05Z

No, I can see timing information, but it is not correct when I have two streams. I need to sync PTS for video stream and audio stream at some point. E.g. when audio goes faster then video has to "speed up".

waveform80 · 2014-05-15T15:19:21Z

How are you attempting to sync the two streams? I think you're correct that there's no timing information embedded in the H.264 stream itself; the meta-data provided by the frame property is separate and if you're sync'ing two streams then presumably you need to write some frame indexes and their corresponding timestamps to some external source to use as a reference.

Unfortunately I'm not particularly familiar with AV sync (beyond playing with it in cinelerra and other systems) so I'm not sure software that performs this function usually requires for such purposes.

d2rk · 2014-05-15T18:34:18Z

I would like to do something like this by manually manage two streams and correct their PTS.

waveform80 · 2014-05-16T11:03:59Z

I've trawled through that link briefly, and as far as I can tell they're using audio as the base PTS. I can't see anywhere that actually reads the PTS from the video encoder; the video PTS is simply counted as frames come in and then tweaked to bring it in sync with the audio PTS (this is the only place video_current_pts is set and as you can see in that function, it's either incremented (VIDEO_PTS_STEP) with adjustments (+/- 150), or simply reset to audio_current_pts).

In other words, this approach (if implemented in Python) wouldn't use the PiCamera.frame property at all; the video timestamp provided by the encoder is unused. That said, I don't think this approach could (currently) be implemented with picamera as it requires some code to run for every frame and picamera doesn't (yet) provide a means to hook into the encoder's callback.

Incidentally, I also had a look at the Pi's H.264 output with the handy h264bitstream utility and you are correct that no timing information is present in the output stream (the timing_info_present_flag in the VUI portion of the SPS is 0).

d2rk · 2014-05-25T19:54:07Z

OK, then callback for encoder would be nice for the first time.

waveform80 · 2014-05-25T19:57:34Z

Yup, I'll change this ticket to an enhancement one to remind me to add a user-specified callback parameter to the various recording and capture functions. Shouldn't be difficult so I'm confident it'll go in 1.5

d2rk · 2014-05-25T20:05:39Z

OK, thanks. Maybe it would be even better to pass own encoder derived from PiVideoEncoder.

waveform80 · 2014-05-25T20:50:27Z

Tempting, but the callback idea keeps things simpler to test & debug for the time being; about the only use-case I can see for custom encoder classes vs callbacks is that a custom encoder class could remove existing bits of the encoder classes for the purpose of reducing overhead (and, assuming one stays within Python, I'd be impressed if someone could significantly speed up the encoder classes within the callback phase - they're already pretty minimal)

waveform80 · 2014-05-28T19:55:49Z

Hmm ... I've had a play with callbacks now, and I've come to the conclusion that you're right: it'd be better to permit a custom encoder class. The issue with callbacks comes with where to place them: before the write to output? After? There's decent arguments for either. If before, do we allow the callback to mutate the buffer to be written? Again, there's a use-case for it. So instead of lots of different callback hooks with different calling conventions, let's go with your idea of specifying a replacement encoder class.

The only objection I can come up with at the moment is that while it's easy for start_recording (before it only ever uses PiVideoEncoder) it's a tad more tricky when it comes to the capture methods (as they use one of PiRawOneImageEncoder, PiCookedOneImageEncoder, PiRawMultiImageEncoder or PiCookedMultiImageEncoder). For the time being, I'm going to completely avoid that question and only permit specification of the encoder class with start_recording, not capture et al - but I'm sure I'll have to tackle it at some point in the future!

d2rk · 2014-05-28T20:27:26Z

Good point. I've already changed the code for my needs. But I did it with builder picamera.Picamera(encoder_builder=MyEncoderBuilder()) so you may modify builder instance and do not touch start_recording.

waveform80 · 2014-05-29T00:30:18Z

Interesting idea - so you've got a factory instance which has some method(s) on it which construct the encoder given a set of parameters. That's certainly more flexible than merely adding a class parameter to start_recording. How does your builder handle construction of encoders for image capture (or is it purely for video at the moment?).

I was looking at PiCamera._get_ports and thinking that something like PiCamera._get_encoder would be a nice parallel (which could be overridden by anyone wanting to output a different encoder class), but it'd need a few parameters to decide which of the current classes to return. Something along the lines of for_video, for_one_shot, for_raw.

However, if we go with an encoder factory class it could have separate methods for video or image encoder construction (I'm not sure that adds any particular flexibility but it might be easier for people to override as they could elect to only override the video related method, for example).

d2rk · 2014-05-29T16:31:54Z

Yes, on this moment I'm working with video. But you can do something like this: PiCamera(video_encoder_builder=MyVideoEncoderBuilder(), image_encoder_builder=MyImageEncoderBuilder()). The key moment here is that you're passing options to build() method such as format, so the builder can generate right encoder instance, e.g. PiRawOneImageEncoder, PiCookedOneImageEncoder.

waveform80 · 2014-06-01T21:25:39Z

Good point on the format parameter. I've been playing around with this today, and I think I've settled on a design. In the end I didn't go with a separate encoder builder class, but with a bunch of extra methods on PiCamera: _get_video_encoder, etc. which take more or less the same parameters as the encoder classes' initializers - it just seemed simpler than introducing yet another class (and I don't think it limits what can be achieved by overriding the classes).

The bulk of the work turned out to be all the extra documentation that needed writing; all the encoder classes now have some (admittedly rudimentary) docs, along with the new _get_*_encoder methods, and a recipe demonstrating a simple custom video encoder class.

Once I've run through the test suite (which'll take an hour or so) I'll push it to GitHub.

d2rk · 2014-06-02T04:30:57Z

OK, sounds good to me. Thank you.

Also corrected a mistake in the inheritance diagram for the recipe

waveform80 changed the title ~~h.264 and PTS~~ Add user-specified callback to video+image encoders (H.264 and PTS) May 25, 2014

waveform80 added the enhancement label May 25, 2014

waveform80 added this to the 1.5 milestone May 25, 2014

waveform80 self-assigned this May 25, 2014

waveform80 closed this as completed in 323b06b Jun 2, 2014

waveform80 added a commit that referenced this issue Jun 2, 2014

Tidy up the docs for #97 a bit more

7a01d54

Also corrected a mistake in the inheritance diagram for the recipe

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add user-specified callback to video+image encoders (H.264 and PTS) #97

Add user-specified callback to video+image encoders (H.264 and PTS) #97

d2rk commented May 15, 2014

waveform80 commented May 15, 2014

d2rk commented May 15, 2014

waveform80 commented May 15, 2014

d2rk commented May 15, 2014

waveform80 commented May 16, 2014

d2rk commented May 25, 2014

waveform80 commented May 25, 2014

d2rk commented May 25, 2014

waveform80 commented May 25, 2014

waveform80 commented May 28, 2014

d2rk commented May 28, 2014

waveform80 commented May 29, 2014

d2rk commented May 29, 2014

waveform80 commented Jun 1, 2014

d2rk commented Jun 2, 2014

Add user-specified callback to video+image encoders (H.264 and PTS) #97

Add user-specified callback to video+image encoders (H.264 and PTS) #97

Comments

d2rk commented May 15, 2014

waveform80 commented May 15, 2014

d2rk commented May 15, 2014

waveform80 commented May 15, 2014

d2rk commented May 15, 2014

waveform80 commented May 16, 2014

d2rk commented May 25, 2014

waveform80 commented May 25, 2014

d2rk commented May 25, 2014

waveform80 commented May 25, 2014

waveform80 commented May 28, 2014

d2rk commented May 28, 2014

waveform80 commented May 29, 2014

d2rk commented May 29, 2014

waveform80 commented Jun 1, 2014

d2rk commented Jun 2, 2014