-
Notifications
You must be signed in to change notification settings - Fork 47
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
errors when feeding VStarcam data to ffmpeg #15
Comments
H.264 decoders need to have the parameter sets (SPS and PPS) to work. RFC 6184 says that parameter sets can be passed "in-band" (meaning as part of the RTP data), "out-of-band" (in the SDP of the Currently, retina doesn't copy out-of-band parameter sets into the I don't know if other RTSP clients copy parameters into the frame. I don't have any cameras that only do out-of-band data so I haven't tried this scenario.
I know ffmpeg's built-in RTSP client uses Annex B. Not sure about others.
The format you need depends on what you're doing with it. I'm using retina to ultimately produce I'm pretty sure you can convince ffmpeg to accept either format. But again we also could add a knob for Retina to output either. Maybe via an extra parameter to |
That example hardcodes
No, "parameters" refers to the SPS and PPS. Decoding won't work without them, so either they must be before the first slice NAL (actual encoded part of a picture) of the first frame or in extra data. I'm reasonably sure the timestamps (DTS and PTS) don't matter to decoding. ffmpeg just copies them from its input to its output.
In a
so no SPS or PPS. But the /* If this is a new IDR picture following an IDR picture, reset the idr flag.
* Just check first_mb_in_slice to be 0 as this is the simplest solution.
* This could be checking idr_pic_id instead, but would complexify the parsing. */
if (!new_idr && unit_type == H264_NAL_IDR_SLICE && (buf[1] & 0x80))
new_idr = 1;
/* prepend only to the first type 5 NAL unit of an IDR picture, if no sps/pps are already present */
if (new_idr && unit_type == H264_NAL_IDR_SLICE && !sps_seen && !pps_seen) {
if (ctx->par_out->extradata)
count_or_copy(&out, &out_size, ctx->par_out->extradata,
ctx->par_out->extradata_size, -1, j);
new_idr = 0;
/* if only SPS has been seen, also insert PPS */
} else if (new_idr && unit_type == H264_NAL_IDR_SLICE && sps_seen && !pps_seen) {
if (!s->pps_size) {
LOG_ONCE(ctx, AV_LOG_WARNING, "PPS not present in the stream, nor in AVCC, stream may be unreadable\n");
} else {
count_or_copy(&out, &out_size, s->pps, s->pps_size, -1, j);
}
} so your
so now there's the SEI ( |
Yeah, that's the spot, and I think we could AVC vs Annex B via a per-stream option passed to
With AVC, the Line 726 in 59e513c
Annex B's actually easier. The
I've never used nvdec before, but I see their docs say "After de-muxing and parsing, the client can submit the bitstream which contains a frame or field of data to hardware for decoding." That sounds like the definition of access unit, what Retina is already doing. We could output single NALs, yes, but I'm a little afraid of confusing folks by making it unclear what forms you see when. It's not hard for the caller to break an access unit down into NALs in terms of coding or (particularly for the length-prefixed AVC form) efficiency. |
Only sometimes. There can be more than one slice NAL per frame (encoder's choice) and also SPS+PPS NALs, SEI NALs, etc. From my quick look at NVDEC's docs, I think what Retina is doing now matches what NVDEC is expecting, with the likely exception of AVC vs Annex B encoding. If it doesn't work for some reason, then we can make changes, but speculation isn't a good way to design an easy-to-use API.
It's not the same thing. Retina's RTP H.264 depacketization logic is messy because it has to understand fragmentation+aggregation packet types, deal with packet loss, and handle bugs in different IP camera models. Callers shouldn't and don't have to deal with this. Reading a length and then that number of bytes is more straightforward. |
yes, ffmpeg works now, we just need to make the NAL splitter (which I have made, still didn't make a proper PR for that, but the issue is somewhere here) |
As discussed in #13, connecting to a VStarcam camera and feeding its frames to ffmpeg produced ffmpeg errors. From discussion there:
@LucasZanella wrote:
However, on my app, while the frame producing works, passing
retina::codec::VideoFrame::data().borrow()
to the ffmpeg nal units parser sometimes reject, and sometimes parse and send to the decoder, which procuesHere's one VideoFrame:
I wrote:
I haven't tried feeding video directly from retina to ffmpeg yet, but in principle it should work. The frames should be fine to pass to ffmpeg. How are you setting up the stream with ffmpeg? You'll likely need to pass it the extra_data from VideoParameters.
The log messages from ffmpeg suggest it's not seeing a valid stream—NAL unit types should never be 0, and I think it's rare for the PPS id to be 16 rather than 0. But maybe the problem is just that without the extra data, ffmpeg is expecting a stream in Annex B format, and my code is passing it instead in AVC format. (The former means that NAL units are separated by the bytes 00 00 01, and the latter means that each NAL unit is preceded by its length in bytes as a big-endian number which can be 2, 3, or 4 bytes long. My code uses 4 bytes.) If you prefer to get Annex B data, it'd be possible to add a knob to retina to tell it that. Or conversion isn't terribly difficult: you can scan through NAL units and change the prefix to be 00 00 01.
I suppose I could add a retina example that decodes with ffmpeg into raw images or something. What ffmpeg crate are you using?
When you don't get the packet follows marked packet with same timestamp error, have you tried saving a .mp4 and playing it back in your favorite video player? Does it work?
@LucasZanella wrote:
For ffmpeg I'm using https://github.com/lucaszanella/rust-ffmpeg-1 which uses https://github.com/lucaszanella/rust-ffmpeg-sys-1 (this one is not needed, I just added some vdpau linking stuff, the original could be used). I had to modify the
rust-ffmpeg-1
to add support forffmpeg's av_parser_parse2
which parses the individual nal units. The original project doe snot have this and he doesn't want to maintain. My patch is very experimental.I haven't tried feeding video directly from retina to ffmpeg yet, but in principle it should work. The frames should be fine to pass to ffmpeg. How are you setting up the stream with ffmpeg? You'll likely need to pass it the extra_data from VideoParameters.
I've never needed to pass additional parameters to ffmpeg, just the nal units. I extracted the h264 bitstream from a big buck bunny .mp4 file and passed to ffmpeg calling
av_parser_parse2
to break into individual nal units and then passed those units usingavcodec_send_packet
and it works. The same process is not working for retina. When my code used to be all C++, I used to pass the output of ZLMediaKit to ffmpeg in this way also and it worked.Even though
av_parser_parse2
has the option to pass pts, dts, etc, I never used but I'll read more about these parameters.VideoParameters
debug:I've sent you a dump of the camera via email.
do you have experience in which types the rtsp clients out there do these things? I've never took a deep look on how ZLMediaKit does, I simply used it and now I'm getting deeper into RTSP/RTP/h264/etc because rust had no rtsp clients so I had to make one.
This is how I extracted the big buck bunny to make it work:
as you see by
h264_mp4toannexb
, it's as you supposed.May I know why you use the AVC format in your code? Isn't the Annex B proper for streaming?
The text was updated successfully, but these errors were encountered: