Expose encoding/decoding raw packets without streams #155

Closed
mikeboers opened this issue Mar 21, 2016 · 22 comments

Comments

@mikeboers
Member

Streams are not actually necessary for encoding/decoding, only for muxing/demuxing. Refactor encoding/decoding onto a CodecContext, and then have a {Video,Audio,Subtitle}CodecContext which implements the specific functions.

Do this for #154.
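
As a rough illustration of the proposed layout, here is a minimal pure-Python sketch. It is not PyAV's actual implementation: the class bodies and the tuples standing in for frames are hypothetical, and only the names CodecContext, VideoCodecContext, AudioCodecContext and _decode_one echo names that appear in this thread.

```python
from abc import ABC, abstractmethod


class CodecContext(ABC):
    """Hypothetical base class: holds codec state and knows nothing about streams."""

    def __init__(self, codec_name):
        self.codec_name = codec_name

    def decode(self, packet):
        # Shared bookkeeping would live here; media-specific work is delegated.
        return self._decode_one(packet)

    @abstractmethod
    def _decode_one(self, packet):
        """Return a list of decoded frames for one packet."""


class VideoCodecContext(CodecContext):
    def _decode_one(self, packet):
        # Placeholder for real video decoding (assumption for illustration).
        return [("video-frame", len(packet))]


class AudioCodecContext(CodecContext):
    def _decode_one(self, packet):
        # Placeholder for real audio decoding (assumption for illustration).
        return [("audio-frame", len(packet))]
```

With this split, a caller can decode raw bytes through VideoCodecContext("flv").decode(data) without ever opening a container or a stream.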

@TechWhizZ199

TechWhizZ199 commented May 28, 2016

I tried making some edits to the FFVideo library, but since #174 isn't solved yet, I can't proceed any further. I thought maybe I could load a sample flv file into the library, which would make the library set the codec context to flv. Then I could feed in custom packets through a new function, which should, in theory, decode them. If you take a look at the ffvideo.pyx file in ffvideo_custom/ffvideo/, there is a class VideoSingleFrameDecode (possibly the worst class name ever given), and that is where I intend to try it out.

I have yet to try this out but the library I am modifying resides here: https://github.com/oddballz/labs/tree/master/AV

@TechWhizZ199

Just a quick follow-up: for #174, we need to have access to:

  1. avcodec_find_decoder() with the CODEC IDS e.g. AV_CODEC_ID_MPEG1VIDEO
  2. avcodec_alloc_context from which we can just use avcodec_open to open and start using the codec.

@mikeboers
Member Author

I've started making progress in the codec-ctx branch. Encoding works (with a ton of hardcoded options), but decoding roughly does this (lifting the error from #181):

<av.VideoCodecContext video/flv at 0x7f470d5f47e0>
<av.Packet of #0, dts=None, pts=None; 128 bytes at 0x7f4711bd3890>
Traceback (most recent call last):
  File "/home/ubuntu/workspace/PyAV/scratchpad/cctx_decode.py", line 30, in <module>
    frames = cc.decode(packet) or ()
  File "av/codeccontext.pyx", line 83, in av.codeccontext.CodecContext.decode (src/av/codeccontext.c:3598)
  File "av/codeccontext.pyx", line 115, in av.codeccontext.CodecContext.decode (src/av/codeccontext.c:3331)
  File "av/video/codeccontext.pyx", line 134, in av.video.codeccontext.VideoCodecContext._decode_one (src/av/video/codeccontext.c:2109)
  File "av/utils.pyx", line 78, in av.utils.err_check (src/av/utils.c:1517)
av.AVError: [Errno 1094995529] Invalid data found when processing input
ERROR:libav.flv:Bad picture start code
ERROR:libav.flv:header damaged

I got the same in my "mpeg4" tests. More testing/investigation is needed.

@GoelBiju

GoelBiju commented Jul 26, 2016

I'm not sure if what I am saying is correct, but when allocating the packet with the data, maybe we should provide it with the compressed data and not just a chunk of the file. E.g. when we are streaming, we get the frames of the data. If this is the case, is it only looking to decode the actual frame from the data?

Maybe treating the file without streams causes this. Is it only useful for raw packet data (where the relevant data has been compressed and needs decoding)?

@GoelBiju

GoelBiju commented Jul 26, 2016

I just saw the reference to exactly what you meant here: https://ffmpeg.org/doxygen/trunk/doc_2examples_2decoding_encoding_8c-example.html

The comments there make the same points as the notes made in the commit:

Some codecs need width/height/pix_fmt(??) before decoding, because that
isn't in the bitstream.
Some codecs need packets that contain exactly a frame, but some can deal with
streams.

It says:

    if (codec->capabilities & CODEC_CAP_TRUNCATED)
        c->flags |= CODEC_FLAG_TRUNCATED; /* we do not send complete frames */
    /* For some codecs, such as msmpeg4 and mpeg4, width and height
       MUST be initialized there because this information is not
       available in the bitstream. */
    /* open it */
    if (avcodec_open2(c, codec, NULL) < 0) {
        fprintf(stderr, "Could not open codec\n");
        exit(1);
    }
    f = fopen(filename, "rb");
    if (!f) {
        fprintf(stderr, "Could not open %s\n", filename);
        exit(1);
    }
    frame = av_frame_alloc();
    if (!frame) {
        fprintf(stderr, "Could not allocate video frame\n");
        exit(1);
    }
    frame_count = 0;
    for (;;) {
        avpkt.size = fread(inbuf, 1, INBUF_SIZE, f);
        if (avpkt.size == 0)
            break;
        /* NOTE1: some codecs are stream based (mpegvideo, mpegaudio)
           and this is the only method to use them because you cannot
           know the compressed data size before analysing it.
           BUT some other codecs (msmpeg4, mpeg4) are inherently frame
           based, so you must call them with all the data for one
           frame exactly. You must also initialize 'width' and
           'height' before initializing them. */
        /* NOTE2: some codecs allow the raw parameters (frame size,
           sample rate) to be changed at any frame. We handle this, so
           you should also take care of it */
        /* here, we use a stream based decoder (mpeg1video), so we
           feed decoder and see if it could decode a frame */
        avpkt.data = inbuf;
        while (avpkt.size > 0)
            if (decode_write_frame(outfilename, c, frame, &frame_count, &avpkt, 0) < 0)
                exit(1);
    }
    /* some codecs, such as MPEG, transmit the I and P frame with a
       latency of one frame. You must do the following to have a
       chance to get the last frame of the video */
    avpkt.data = NULL;
    avpkt.size = 0;
    decode_write_frame(outfilename, c, frame, &frame_count, &avpkt, 1);
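
The final call with an empty packet in the excerpt above is a drain step. A minimal Python sketch of why it matters, using a toy decoder rather than a real codec (OneFrameLatencyDecoder and decode_all are hypothetical names introduced only for this illustration):

```python
class OneFrameLatencyDecoder:
    """Toy stand-in (not a real codec): each frame comes out one call late."""

    def __init__(self):
        self._held = None  # frame decoded on the previous call, not yet emitted

    def decode(self, packet):
        ready = [] if self._held is None else [self._held]
        # A packet of None means "no more input": nothing new is held back.
        self._held = None if packet is None else packet.upper()
        return ready


def decode_all(decoder, packets):
    frames = []
    for packet in packets:
        frames.extend(decoder.decode(packet))
    frames.extend(decoder.decode(None))  # drain the frame still held inside
    return frames


frames = decode_all(OneFrameLatencyDecoder(), [b"a", b"b", b"c"])
```

Without the last decode(None) call, the final frame would stay inside the decoder. To my understanding, passing None to CodecContext.decode in current PyAV releases plays the same role.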

@mikeboers
Member Author

Just pushed another commit. With big enough chunks it is able to decode from mp4 and flv. I think a problem we are running into is that Packets don't mutate as they are consumed to reflect how much data is left, so for formats like flv that do not need a parser, we are dropping parts of frames across packet boundaries.

Getting somewhere though!
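
To make the packet-boundary problem concrete, here is a small sketch with a toy framing format (a 2-byte big-endian length followed by the payload). The format and the helper names extract_frames and decode_chunks are assumptions for illustration only; real codecs frame their data differently. The point is just that unconsumed bytes have to be carried over to the next chunk.

```python
import struct


def extract_frames(buffer):
    """Return (complete_frames, leftover_bytes) from a toy length-prefixed stream."""
    frames = []
    pos = 0
    while len(buffer) - pos >= 2:
        (size,) = struct.unpack_from(">H", buffer, pos)
        if len(buffer) - pos - 2 < size:
            break  # this frame continues in the next chunk
        frames.append(bytes(buffer[pos + 2 : pos + 2 + size]))
        pos += 2 + size
    return frames, bytes(buffer[pos:])


def decode_chunks(chunks):
    """Feed arbitrary chunks, carrying unconsumed bytes over to the next one."""
    pending = b""
    out = []
    for chunk in chunks:
        frames, pending = extract_frames(pending + chunk)
        out.extend(frames)
    return out


payloads = (b"first", b"second", b"third")
stream = b"".join(struct.pack(">H", len(p)) + p for p in payloads)
chunks = [stream[i : i + 4] for i in range(0, len(stream), 4)]

with_carry = decode_chunks(chunks)
# Treating every chunk as self-contained drops whatever straddles a boundary.
without_carry = [f for c in chunks for f in extract_frames(c)[0]]
```

With 4-byte chunks every frame straddles a boundary, so the version that discards leftovers recovers nothing, while the version that keeps the pending bytes recovers all three payloads.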

@GoelBiju

GoelBiju commented Jul 26, 2016

Great work, Mr. @mikeboers. I really look forward to seeing this library have an option to decode a single frame from a packet (and encode a frame from raw data)!

@tuxuser

tuxuser commented Jul 6, 2017

Hello people,
Is the implementation finished? If so, is there a quick example available? Thank you.

Update: Updated to the 0.4.0dev0 build... now it's working with CodecContext!

@GoelBiju

GoelBiju commented Jul 6, 2017

Hi @tuxuser, I was wondering how you managed to use CodecContext. Is there an example I could see, please?

Thank you.

@tuxuser

tuxuser commented Jul 6, 2017

Sure, here is a snippet that works for me:

import av

data = b'\xDE\xAD\xBE\xEF'  # Just for illustration 
codec_ctx = av.codec.CodecContext.create('aac', 'r')
packet = av.packet.Packet(data)
for frame in codec_ctx.decode(packet):
    decoded_data = frame.planes[0].to_bytes()
    # Do stuff

The original 'scratch' code (https://github.com/mikeboers/PyAV/blob/master/scratchpad/cctx_decode.py) did not work for me, as CodecContext.parse expects a string, but I wanted to supply binary data, of course. That's why I am assembling the av.packet.Packet myself.

@GoelBiju

GoelBiju commented Jul 6, 2017

Thanks for the help, @tuxuser. My issue was similar: I had some h263 RTMP data which I needed to create an flv file from, so I needed a way to decode that data and place it into the file.

@mikeboers
Member Author

@tuxuser Thanks for the example, and for noting that parse doesn't work well for you. That was in the first iteration of the CodecContext refactor, and I haven't revisited it since.

@koenvo

koenvo commented Jul 25, 2017

I'm trying to get this working with h264 packets, but no luck so far. The installed version is 0.4.0.dev0.

My code:

import av

fh = av.open("some_video.mp4")
original_codec_ctx = fh.streams.video[0].codec_context
codec_ctx = av.codec.CodecContext.create(original_codec_ctx.name, 'r')

for packet in fh.demux(fh.streams.video):
    cloned_packet = av.packet.Packet(packet.to_bytes())
    for frame in codec_ctx.decode(cloned_packet):
        pass  # do stuff

I get this error:

  File "av/codec/context.pyx", line 315, in av.codec.context.CodecContext.decode
  File "av/codec/context.pyx", line 338, in av.codec.context.CodecContext.decode
  File "av/codec/context.pyx", line 214, in av.codec.context.CodecContext._send_packet_and_recv
  File "av/utils.pyx", line 105, in av.utils.err_check
av.AVError: [Errno 1094995529] Invalid data found when processing input (libav.h264: Error splitting the input into NAL units.)

When I use the original CodecContext, it is able to decode the packets:

import av

fh = av.open("some_video.mp4")
original_codec_ctx = fh.streams.video[0].codec_context
codec_ctx = av.codec.CodecContext.create(original_codec_ctx.name, 'r')

for packet in fh.demux(fh.streams.video):
    cloned_packet = av.packet.Packet(packet.to_bytes())
    for frame in original_codec_ctx.decode(cloned_packet):
        pass  # do stuff

I can imagine the context keeps some kind of state that is missing in the first piece of code. But I don't really have a clue how to solve this, if it's possible at all.

edit:
Some links that might be related:
https://stackoverflow.com/questions/39105571/decoding-mp4-mkv-using-ffmpeg-fails => #222
http://git.videolan.org/?p=ffmpeg.git;a=commitdiff;h=a9bb4cf87d1eb68f9ed2dc971e3400b95c1a6a78
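
One plausible explanation for the "Error splitting the input into NAL units" failure, offered here as an assumption rather than a confirmed diagnosis: MP4 files usually store H.264 packets with a length prefix in front of each NAL unit and keep the SPS/PPS parameter sets in the stream's extradata. A freshly created context has no extradata, so the decoder would expect start-code-delimited (Annex B) data instead. The sketch below shows the packet rewrite involved; avcc_to_annexb is a hypothetical helper, not part of PyAV, and the 4-byte length size is an assumed default.

```python
START_CODE = b"\x00\x00\x00\x01"


def avcc_to_annexb(packet, length_size=4):
    """Rewrite length-prefixed NAL units as start-code-prefixed ones."""
    out = bytearray()
    pos = 0
    while pos + length_size <= len(packet):
        nal_len = int.from_bytes(packet[pos : pos + length_size], "big")
        pos += length_size
        nal = packet[pos : pos + nal_len]
        if len(nal) != nal_len:
            raise ValueError("truncated NAL unit")
        out += START_CODE + nal
        pos += nal_len
    if pos != len(packet):
        raise ValueError("trailing bytes after the last NAL unit")
    return bytes(out)


# Two made-up NAL units, each preceded by a 4-byte big-endian length.
sample = (3).to_bytes(4, "big") + b"\x65\x88\x84" + (2).to_bytes(4, "big") + b"\x41\x9a"
converted = avcc_to_annexb(sample)
```

Even after this rewrite, the decoder still needs the SPS and PPS before the first keyframe. FFmpeg's h264_mp4toannexb bitstream filter performs both steps, so in practice it is usually preferable to a hand-written conversion.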

@adavoudi
Contributor

@koenvo Did you manage to solve the problem?
In my case the error is as below:

  File "av/codec/context.pyx", line 326, in av.codec.context.CodecContext.decode
  File "av/codec/context.pyx", line 372, in av.codec.context.CodecContext.decode
  File "av/video/codeccontext.pyx", line 108, in av.video.codeccontext.VideoCodecContext._decode
  File "av/utils.pyx", line 105, in av.utils.err_check
av.AVError: [Errno 1094995529] Invalid data found when processing input (libav.h264: no frame!)

@koenvo

koenvo commented Feb 21, 2018

Nope, but I haven't spent time on it lately. Looks like your error is the same as mine.

@adavoudi
Contributor

Some codecs need some extra data for decoding, and I think h264 is one of them. So I made a few changes to the code and provided a function to copy this data from a source CodecContext to a destination CodecContext. In my case, the data was about 40 bytes (so we can easily save it somewhere).

I sent pull request #287.

@mikeboers Would you please check it?
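
For context on what those few dozen bytes typically hold: for H.264 taken from an MP4 file, the extradata is usually an avcC record (AVCDecoderConfigurationRecord) carrying the NAL length size plus the SPS and PPS. The parser below is a sketch under that assumption. parse_avcc_extradata is a hypothetical helper, not a PyAV function, and it rejects extradata that is stored in Annex B form instead.

```python
def parse_avcc_extradata(extradata):
    """Return (nal_length_size, sps_list, pps_list) from an avcC record."""
    if len(extradata) < 7 or extradata[0] != 1:
        raise ValueError("not an avcC record (configurationVersion must be 1)")

    # Low 2 bits of byte 4 hold lengthSizeMinusOne.
    length_size = (extradata[4] & 0x03) + 1

    def read_sets(count, pos):
        sets = []
        for _ in range(count):
            size = int.from_bytes(extradata[pos : pos + 2], "big")
            pos += 2
            sets.append(bytes(extradata[pos : pos + size]))
            pos += size
        return sets, pos

    # Low 5 bits of byte 5 hold the SPS count; the PPS count is a full byte.
    sps, pos = read_sets(extradata[5] & 0x1F, 6)
    pps, pos = read_sets(extradata[pos], pos + 1)
    return length_size, sps, pps


# Hand-built record with one made-up SPS and one made-up PPS.
record = (
    bytes([1, 0x64, 0x00, 0x1F, 0xFF, 0xE1])
    + (4).to_bytes(2, "big") + b"\x67\x64\x00\x1f"
    + bytes([1])
    + (2).to_bytes(2, "big") + b"\x68\xee"
)
length_size, sps, pps = parse_avcc_extradata(record)
```

The returned length size is the value a length-prefix-to-start-code conversion would need, and the SPS/PPS are what a decoder without extradata would have to receive in-band.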

@jd20

jd20 commented Mar 18, 2018

I've been hitting the same issue when trying to re-decode from raw packets (errors about a missing start code and being unable to split NAL units).

@adavoudi I tested your PR locally, and it fixed the problems I was having. Would be great if this could get merged.

@zuypt

zuypt commented Jul 8, 2019

I am working with the Tello drone, which transmits raw h264 through a UDP socket. How do I get the extradata from the stream? The TelloPy library creates a file-like object from the received UDP data and then uses av.open on it, which results in a very laggy stream.
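
A raw H.264 byte stream normally carries its SPS and PPS in-band as ordinary NAL units, so there may be no separate extradata to fetch. The sketch below assumes the drone sends Annex B data (start-code-delimited), which is not confirmed in this thread. split_annexb, nal_type and find_parameter_sets are hypothetical helpers written for this illustration.

```python
def split_annexb(stream):
    """Split an Annex B byte stream into NAL units (start codes removed)."""
    nals = []
    pos = stream.find(b"\x00\x00\x01")
    while pos != -1:
        begin = pos + 3
        nxt = stream.find(b"\x00\x00\x01", begin)
        end = len(stream) if nxt == -1 else nxt
        # Trailing zeros belong to the next (4-byte) start code, not to this NAL.
        nal = stream[begin:end].rstrip(b"\x00")
        if nal:
            nals.append(nal)
        pos = nxt
    return nals


def nal_type(nal):
    """The low 5 bits of the first byte give the H.264 NAL unit type."""
    return nal[0] & 0x1F


def find_parameter_sets(stream):
    nals = split_annexb(stream)
    sps = [n for n in nals if nal_type(n) == 7]  # 7 = sequence parameter set
    pps = [n for n in nals if nal_type(n) == 8]  # 8 = picture parameter set
    return sps, pps


# Made-up stream: SPS, PPS, then one IDR slice (type 5).
stream = (
    b"\x00\x00\x00\x01\x67\x42\x00\x1e"
    b"\x00\x00\x00\x01\x68\xce\x38\x80"
    b"\x00\x00\x01\x65\x88\x84"
)
sps, pps = find_parameter_sets(stream)
```

Since a NAL unit can span several UDP datagrams, the received bytes would need to be buffered and split only at start codes before being handed to a decoder.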

@ramoncaldeira
Contributor

@l0stb1t can you get this from the container?

Example:

codec_ctx = container.streams.video[0].codec_context 
codec_ctx.extradata

@arvindchandel

@koenvo @jd20 Hi, I want to encode my frame to H264 and then decode it back to the original frame. I tried the above code and hit the same issue: 'Error splitting the input into NAL units'. Can you provide an example or pointer for my problem?

@arvindchandel

@koenvo Did you manage to solve the problem?
In my case the error is as below:

  File "av/codec/context.pyx", line 326, in av.codec.context.CodecContext.decode
  File "av/codec/context.pyx", line 372, in av.codec.context.CodecContext.decode
  File "av/video/codeccontext.pyx", line 108, in av.video.codeccontext.VideoCodecContext._decode
  File "av/utils.pyx", line 105, in av.utils.err_check
av.AVError: [Errno 1094995529] Invalid data found when processing input (libav.h264: no frame!)

Did you find a solution for this?

@jlaine
Collaborator

jlaine commented Feb 23, 2021

Please refrain from hijacking a closed issue. If you have found a bug in PyAV, please open a new issue. If you want to chat with other users, consider using the project's gitter.
