feat: Fully support hardware filters on Apple VideoToolbox #11014
Conversation
Signed-off-by: gnattu <gnattuoc@me.com>
It now works, but with an incorrect colorspace (the main input is in YUV while the overlay is in BGRA):

To handle the colorspace, we have multiple approaches here, and this could be very specific to […]. The most native way would be to use CoreImage and generate an intermediate image with […]. Another approach we could try is to bypass the […]. I'm going to try CoreImage first to see if the performance is acceptable.
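Once both inputs are in the same format and colorspace, the per-pixel compositing an overlay filter performs is plain source-over blending. A minimal sketch in C (assuming 8-bit channels and straight, non-premultiplied alpha; the function name is illustrative, not from the actual filter):

```c
#include <stdint.h>

/* Source-over alpha blend of one 8-bit channel: overlay value `src` with
 * alpha `a` composited over background value `dst`. Both inputs must
 * already be in the same colorspace; blending BGRA directly against YUV
 * planes produces the wrong colors, which is the mismatch discussed above. */
static uint8_t blend_over(uint8_t src, uint8_t dst, uint8_t a)
{
    /* (src*a + dst*(255-a)) / 255, with rounding */
    uint32_t v = (uint32_t)src * a + (uint32_t)dst * (255 - a);
    return (uint8_t)((v + 127) / 255);
}
```

With full alpha the overlay pixel wins, with zero alpha the background shows through unchanged; anything in between is a weighted mix.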
Attaching the example file for reference.
By adding […]
ffmpeg's […]
Perhaps you need some calls from CoreImage to ensure the video data has been synchronized correctly, similar to glFlush()/glFinish() in OpenGL.
I got it. CoreImage defers rendering until users request access to it; that's why the frame is out of sync. I have to use […]
Good catch. But I can still find some vibrating frames in this clip, especially some frames at the beginning and end.
At this point it seems unlikely to be a frame-syncing issue in our filter. If I switch to […]. If I force it to run at a very slow speed (like 30fps), then even […]. Probably we should not output BGRA frames directly, but convert them back to planar YUV to make the hardware happier? Or we need to find some way to throttle the frame output so that we never overload the downstream.
There is no overload in libavfilter. HW encoders can be overloaded, but that is usually mitigated by dropping frames, which some encoders allow; it doesn't happen at such low resolutions and bitrates. Such uncertainty means there is still an out-of-sync issue somewhere in the overlay filter or in the videotoolbox hwcontext. It's best to start by reviewing your code.
Converting the output frame back to YUV fixed this, so this is indeed the behavior of […]
So the input and output of your filter were (yuv+bgra => bgra)?
Yes. That is what I did. |
Then it's indeed an issue of the videotoolbox encoder when it comes to handling non-YUV input. A common practice for an overlay filter is to keep the format of the output and the main input the same, so the latter approach is better.
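The colorspace math behind that choice is straightforward. A rough per-pixel sketch of the RGB-to-BT.709 limited-range YUV conversion the output path needs (illustrative only; a real filter would use vImage or a VTPixelTransferSession rather than a scalar loop like this):

```c
#include <stdint.h>

/* Convert one BGRA pixel (alpha ignored) to 8-bit Y'CbCr using the BT.709
 * limited-range matrix: Y in [16..235], Cb/Cr in [16..240] centered at 128.
 * This only illustrates the math; production code would use SIMD or a
 * hardware pixel-transfer path. */
static void bgra_to_yuv709(uint8_t b, uint8_t g, uint8_t r,
                           uint8_t *y, uint8_t *cb, uint8_t *cr)
{
    /* BT.709 luma coefficients: Kr = 0.2126, Kg = 0.7152, Kb = 0.0722 */
    double yf  =  0.2126 * r + 0.7152 * g + 0.0722 * b;
    double cbf = (b - yf) / 1.8556;   /* 2 * (1 - Kb) */
    double crf = (r - yf) / 1.5748;   /* 2 * (1 - Kr) */

    *y  = (uint8_t)(16.0  + 219.0 * yf  / 255.0 + 0.5);
    *cb = (uint8_t)(128.0 + 224.0 * cbf / 255.0 + 0.5);
    *cr = (uint8_t)(128.0 + 224.0 * crf / 255.0 + 0.5);
}
```

For example, pure white maps to (235, 128, 128) and pure black to (16, 128, 128), i.e. the chroma planes are flat for grayscale input.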
The initial working tree is pushed here: […]. Before submitting this upstream, I'd like to receive some suggestions first.
Looks good at first glance. I left some suggestions. |
Pushed v2 patch: […]. Now I'm thinking about using […]
Now we have a new problem: for certain input combinations, we end up with a lot of duplicated frames in the generated output when […]
It seems like we have to manually set the timebase in […]. The problem seems to be gone with this.
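For context on why a wrong timebase shows up as duplicated frames: each output pts is rescaled between timebases, and if the target timebase is too coarse (or simply not what downstream assumes), distinct input pts collapse onto the same output tick. A simplified stand-in for ffmpeg's av_rescale_q (illustrative; the real function also handles overflow and selectable rounding modes):

```c
#include <stdint.h>

/* A rational timebase: each pts tick is num/den seconds. */
typedef struct { int num, den; } rational;

/* Rescale `pts` expressed in timebase `from` into timebase `to`,
 * rounding to nearest. Mimics av_rescale_q for small, overflow-free
 * values only. */
static int64_t rescale_pts(int64_t pts, rational from, rational to)
{
    /* pts * (from.num/from.den) * (to.den/to.num) */
    int64_t n = pts * from.num * to.den;
    int64_t d = (int64_t)from.den * to.num;
    return (n + d / 2) / d;
}
```

For 29.97 fps input (pts step 3003 in a 1/90000 timebase), rescaling into 1/1000 gives distinct values 0, 33, 67, ...; rescaling into a too-coarse 1/10 timebase maps the first two frames to the same pts, which downstream interprets as a duplicated frame.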
I simplified some redundant parts, and the order of the deinterlacing filter has been adjusted: it must be chained right after the decoder, not after the scale filter, otherwise the image quality will be degraded. Finally, I disabled hwSurface only for cases using VideoToolbox tone mapping, because for common transcoding use cases it is not cost-effective to trade higher system power consumption (a 2x memory copy: decoder buffer -> sw buffer -> hw buffer) and more heat for a small amount of extra FPS, especially on battery-powered notebooks.
No, you should not use hw surface, as it not only breaks tone mapping but also breaks HDR pass-through; the wrong colors will appear everywhere. It also cannot guarantee zero-copy on VideoToolbox, as the hardware decoder, GPU, and hardware encoder can sit at different places on the bus (not always in the same chip as on other platforms). On Intel Macs, the hardware encoders/decoders could be the Intel built-in ones, or on some models the T2 chip's built-in ones, while the filter context would be created on the amdgpu if available. In that case hw surface not only makes it slower, it does not eliminate the copy either. If we have to use it, we should only use it for 8-bit videos and take a closer look at the performance and power consumption to see if it produces any measurable benefit.
Actually, we haven't implemented HDR passthrough yet because it requires HEVC Main10 instead of Main, unless you want 8-bit HDR.
Well, I'm convinced. hwSurface is now only enabled for OpenCL tone-mapping. nyanmisaka@bbbdf7c |
Please re-test it when you have time; this includes disabling hardware decoders. Then we can approve and 🚀 it.
Co-authored-by: nyanmisaka <nst799610810@gmail.com> Signed-off-by: gnattu <gnattuoc@me.com>
Tested a few files and it appears to be OK. I merged most of your suggested code into my branch and only made some modifications: […]
What we need to do is modify the upstream […]
Changes
The upstream FFmpeg 6.1 was released with a new scale_vt filter, which provides scaling and tone mapping using VideoToolbox's native method. This PR: […]

One thing to note is that VideoToolbox's tone-mapping method has to use Apple's hard-coded parameters and does not expose any tunables like the other filters do, but Apple's settings look good enough to me.
Currently the WebUI does not have options for enabling tone mapping for VideoToolbox; these need to be implemented separately, as this implementation does not expose the tunable options available in the other implementations.
Issues