variable frame rate video #32
Comments
In all honesty, the time representation needs a massive overhaul in general, and the encoding side in particular hasn't gotten much attention. That said, give me a bit to take a look and try to answer the actual question. =P On Jul 8, 2014, at 1:30 AM, mkassner notifications@github.com wrote:
|
I do believe that the absolute best thing to do is overhaul all time representations, as "discussed" in #25. In the meantime, the time_base on a Frame could be exposed for modification (but this mutability would go away with the time overhaul, since it would be meaningless). Take a look at the definition of av_rescale_q and the docs for av_rescale_rnd (from the FFmpeg docs); it essentially converts a number of ticks from one time_base to another. |
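To make the "ticks from one time_base to another" idea concrete, here is a minimal pure-Python sketch of the same arithmetic. The name rescale_q is a hypothetical stand-in, not a PyAV or FFmpeg binding, and the rounding mode (nearest, halves away from zero) is my reading of what av_rescale_q does by default, so treat that part as an assumption.

```python
from fractions import Fraction
import math


def rescale_q(ticks, src_tb, dst_tb):
    """Convert a tick count expressed in src_tb into ticks of dst_tb.

    Hypothetical helper: ticks * src_tb is the time in seconds, and
    dividing by dst_tb re-expresses that time in the new unit.
    Rounding is to nearest, with halves away from zero (assumed to
    match the default rounding of av_rescale_q).
    """
    exact = Fraction(ticks) * src_tb / dst_tb
    sign = -1 if exact < 0 else 1
    return sign * math.floor(abs(exact) + Fraction(1, 2))


# One tick at 1/25 s is 3600 ticks at 1/90000 s, and back again.
fine = rescale_q(1, Fraction(1, 25), Fraction(1, 90000))
coarse = rescale_q(fine, Fraction(1, 90000), Fraction(1, 25))
```

So the three arguments are: a timestamp, the time_base it is currently expressed in, and the time_base you want it expressed in.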
....and I just noticed that your fork (from Mark's fork) does expose the time_base. =P |
Yeah, the easiest fix is to add a time_base property to the Frame so you can set it. The presentation time stamp (pts) and time base are used to determine the time at which a frame is to be displayed: pts * float(time_base) = seconds (I think). When encoding, the pts of the frame needs to be in the same "time base" as the codec. The scale is there so the frame's pts can be in any time base and it converts it for you. All these fractions can be really confusing; Mike and I were thinking of creating a Timestamp object or something to make them easier to deal with. |
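As a small illustration of pts * time_base = seconds, here is a sketch with made-up numbers. The millisecond time_base and the pts values are assumptions picked for the example, not values taken from PyAV.

```python
from fractions import Fraction

time_base = Fraction(1, 1000)     # assumed: one tick per millisecond
pts_values = [0, 33, 100, 150]    # hypothetical, unevenly spaced frames

# Multiplying each pts by the time base gives its display time in seconds.
seconds = [pts * time_base for pts in pts_values]

# The gaps between consecutive frames differ, i.e. a variable frame rate.
intervals = [b - a for a, b in zip(seconds, seconds[1:])]
```

Keeping the values as Fraction avoids float rounding until the very end; float(seconds[1]) gives the familiar 0.033.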
Hi, here is my summary of time representation:
time_base (a fraction) is the conversion factor that the int64 pts value is multiplied by to get seconds.
Two time bases that we care about exist:
going from packet pts to frame pts when decoding: ..
when encoding:
Setting the time_base: The time_base of the stream is not user-settable. It is determined by ffmpeg.
--- end notes
Can you confirm my findings? I had a hard time finding good docs on this.
frame.time_base should be set to the codec time_base on encode. When the user creates a frame, they should probably set it themselves. Writing it out like this, I guess it can be problematic because a frame can come from one codec and be destined for another... But you already have this solved by using av_rescale_q twice in encode().
Then one could introduce a class property .pts_seconds:

    property pts_seconds:
        """Presentation time stamp of this frame, in seconds."""
        def __get__(self):
            if self.ptr.pts == lib.AV_NOPTS_VALUE:
                return None
            return self.pts * self.time_base  # time_base should be taken from the codec on decode
        def __set__(self, value):
            if value is None:
                self.ptr.pts = lib.AV_NOPTS_VALUE
            else:
                # pts is an integer tick count, so round rather than store a fraction
                self.ptr.pts = int(round(value / self.time_base))
|
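For anyone who wants to try the pts_seconds idea without building the Cython extension, here is a pure-Python sketch of the same getter/setter logic. FrameSketch and NOPTS are hypothetical stand-ins for the real Frame and lib.AV_NOPTS_VALUE. The sentinel value is the int64 minimum, which is how I understand FFmpeg defines AV_NOPTS_VALUE.

```python
from fractions import Fraction

NOPTS = -(2 ** 63)  # stand-in for lib.AV_NOPTS_VALUE (assumed int64 minimum)


class FrameSketch:
    """Hypothetical stand-in for a Frame that carries a pts and a time_base."""

    def __init__(self, time_base, pts=NOPTS):
        self.time_base = Fraction(time_base)
        self.pts = pts

    @property
    def pts_seconds(self):
        # No timestamp set: report None instead of a meaningless number.
        if self.pts == NOPTS:
            return None
        return self.pts * self.time_base

    @pts_seconds.setter
    def pts_seconds(self, value):
        if value is None:
            self.pts = NOPTS
        else:
            # pts must be a whole number of ticks, so round to the nearest one.
            self.pts = round(Fraction(value) / self.time_base)


frame = FrameSketch(Fraction(1, 90000))
before = frame.pts_seconds      # nothing set yet
frame.pts_seconds = 0.5         # half a second in 1/90000 ticks
```

The setter is lossy whenever the requested time is not a whole number of ticks, which is one reason a fine time_base matters for variable frame rate material.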
I also did not find any good docs on this; a very thorough description of how time works in the various libraries would be really nice to have. AVCodecContext and AVStream both say "This is the fundamental unit of time (in seconds) in terms of which frame timestamps are represented.". The first example video I threw at it has a different time_base on each, and they appear to be related by AVCodecContext.ticks_per_frame. From what I can see with a few examples, while decoding, both the packet and frame pts/dts are expressed in AVStream.time_base. The only time we use AVCodecContext.time_base is when actually encoding, to set the frame's pts/dts. This does not make a ton of sense to me. A sticky part in what you have written (that you may be aware of, but I can't tell from this) is that due to frame re-ordering (and other effects), the i-th frame does not necessarily link to the i-th packet, nor is there a guaranteed 1:1 relationship in the quantities of packets or frames. This setting of a time_base on a frame is only tricky because FFmpeg/Libav's structure does not match what we would ideally want to do in Python (e.g. encoding an arbitrary frame into a stream, or attaching a manually constructed stream to a container). ... the further I dig into this the less sense it makes. facepalm Mike On Jul 8, 2014, at 11:51 PM, mkassner notifications@github.com wrote:
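The re-ordering point can be illustrated with a toy example. The packet list below is entirely made up (a typical I P B B pattern with invented dts/pts values) and only shows why the i-th packet need not correspond to the i-th displayed frame.

```python
# Hypothetical packets, listed in the order a decoder would receive them.
packets = [
    {"type": "I", "dts": 0, "pts": 1},
    {"type": "P", "dts": 1, "pts": 4},  # decoded early, displayed last
    {"type": "B", "dts": 2, "pts": 2},
    {"type": "B", "dts": 3, "pts": 3},
]

# Decode order follows dts; display order follows pts.
decode_order = [p["type"] for p in sorted(packets, key=lambda p: p["dts"])]
display_order = [p["type"] for p in sorted(packets, key=lambda p: p["pts"])]
```

Here the P packet is second in the stream but fourth on screen, so copying timestamps by index from packets to frames would be wrong.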
|
My confusion could be coming from using av_best_effort_timestamp. Now my frames simply do not have a pts. The best description I have found is in an old tutorial. |
I understand that this part is tricky, going through ffmpeg source the trail goes from From my tests I found the frame pts can be set during encoding and read out properly during decoding for mpeg4 in an mp4 container. This allows me to read and write variable frame rate video. Here is my understanding so far: Encoding:
decoding:
I find it a bit confusing that before encoding the timebase is codec based and after decoding it is stream based. But I can certainly live with that. |
I think one reason the encoded packet's timestamps are in AVCodecContext.time_base is that avcodec_encode_video2 doesn't know which stream the packet is going to be added to. Here is the example where I believe the av_rescale_q code came from. It might be more flexible to do this scaling in OutputContainer.mux, but AVPacket doesn't seem to have a time_base attribute. |
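One practical consequence of rescaling twice (frame to codec time_base before encoding, then codec to stream time_base before muxing) is that a coarse codec time_base throws away precision. The sketch below uses invented time bases and plain Fraction arithmetic rather than any FFmpeg call, so the exact numbers are only illustrative.

```python
from fractions import Fraction

frame_tb = Fraction(1, 1000)     # assumed: frame pts in milliseconds
codec_tb = Fraction(1, 25)       # assumed: codec ticks at 25 per second
stream_tb = Fraction(1, 12800)   # assumed: stream time base chosen by the muxer

frame_pts = 1234                 # hypothetical frame at 1.234 s

# Step 1: frame time base -> codec time base (30.85 rounds to 31).
codec_pts = round(frame_pts * frame_tb / codec_tb)
# Step 2: codec time base -> stream time base (exact, 31 * 512).
packet_pts = round(codec_pts * codec_tb / stream_tb)

# For comparison: going straight from frame to stream time base.
direct_pts = round(frame_pts * frame_tb / stream_tb)

# Timing error introduced by the coarse intermediate step, in seconds.
error_seconds = (packet_pts - direct_pts) * stream_tb
```

With these numbers the two-step path lands on 1.24 s instead of 1.234 s, so for variable frame rate recording it seems worth choosing a codec time_base at least as fine as the capture timestamps.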
Does this paragraph on time accurately represent what we believe to be the case of time_base(s)? |
Hi,
First off, this project is fantastic; we had been searching for usable ffmpeg Python bindings for a while now. Really enjoying PyAV.
We want to record video from a live source and use the recording timestamps as the pts for each frame, since the source may change frame rate.
Looking at video/stream.pyx L116-L120, I can see that frame.pts gets scaled. But frame.time_base is always 0/0, thus creating a nonsensical pts. I cannot figure out how to set frame.time_base; before I add this as a variable, I wanted to better understand your motivation for the above-mentioned lines.
I guess the question comes down to understanding the input args for lib.av_rescale_q.
Could you give me a pointer?
thanks!
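For context, the use case above boils down to something like the following sketch: turning capture times from a live source into integer pts values. The function name, the millisecond time_base, and the sample timestamps are all hypothetical and nothing here calls PyAV.

```python
from fractions import Fraction


def capture_times_to_pts(capture_times, time_base):
    """Map capture times (seconds) to integer pts values in time_base ticks.

    Hypothetical helper: the first frame is taken as time zero and every
    later frame is placed relative to it, rounded to the nearest tick.
    """
    start = capture_times[0]
    return [round(Fraction(t - start) / time_base) for t in capture_times]


# Made-up timestamps from a source whose frame rate changes mid-recording.
times = [100.0, 100.04, 100.1, 100.125]
pts = capture_times_to_pts(times, Fraction(1, 1000))
```

Each resulting pts would then be assigned to its frame, with frame.time_base set to the same time_base used here.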