AMF on Linux/Windows (and ffmpeg) - the missing features list - will they get fixed/implemented? #348

bavdevc · 2022-09-02T00:05:31Z

Hello AMD developers,

I tested your current software stack on Windows 10 and Linux and I found the following issues or missing features that should have been working:

1. Linux-only: AMFVideoDecoderHW_H265_MAIN10 is not implemented in AMF (Vulkan)

Details/Example:

Unsupported codec MPEG2:
[AMFDecodeEngineImplVulkan] Warning: InitDecoder() - WARRNING: Vulkan supports H264 and HEVC only for now

Not implemented codec HEVC/MAIN10:
[AMFDecodeEngineImplVulkan] Error: ../../../../../runtime/src/components/DecoderUVD/DecodeEngines/Vulkan/DecodeEngineVulkan.cpp(121):InitDecoder() Codec is not supported by HW

--> This affects NAVI2x (see #341) and also the VCN2.x block, as found in NAVI1x and many APUs

2. Windows/Linux: ffmpeg is missing the AMF decoder

... for about 2 years now - see #199 (the feature was not accepted/merged the first time)
btw. the VA-API decoder works fine with mesa/gallium/radeonsi (ALL input formats, like the Windows version)

3. Windows/Linux: ffmpeg is missing the AMF HEVC Main10 encoder

... that feature is not implemented yet - see #259
btw. the VA-API encoder works fine with mesa/gallium/radeonsi (H264 and HEVC - Main and Main10)

4. Linux-only: performance problems when combining AMF encoder with the AMD OpenCL runtime

(for example resizing or tonemapping with vceencc/ffmpeg incl. encoding)

example processing speeds with identical API options, same input, same AMF version and latest AMD driver stack:

resizing: windows version ~ 59.14 fps | linux version ~ 4.35 fps
tonemapping: windows version ~ 27.93 fps | linux version ~ 3.91 fps
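
To put those numbers in proportion, the arithmetic can be sketched in a few lines of Python. The frame rates are the ones quoted above; the helper name `slowdown` is only for illustration.

```python
# Frame rates quoted above: (Windows fps, Linux fps) per workload.
measured = {
    "resizing": (59.14, 4.35),
    "tonemapping": (27.93, 3.91),
}


def slowdown(windows_fps: float, linux_fps: float) -> float:
    """Return how many times slower the Linux run is than the Windows run."""
    return windows_fps / linux_fps


for name, (win, lin) in measured.items():
    print(f"{name}: Linux is ~{slowdown(win, lin):.1f}x slower")
```

That works out to roughly 13.6x for resizing and 7.1x for tonemapping.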

5. Windows/Linux: ffmpeg context/hardware mapping doesn't work with AMD

(only Intel and Nvidia hw context mapping works):
usecase - Map hardware frames to system memory or to another device. (hwmap - FFmpeg):
example:

Decode/Upload with VA-API or "AMF/DecoderUVD"
copy/use HWcontext for processing using OpenCL functions (ffmpeg: hwmap=derive_device=opencl)
copy/use HWcontext for processing using Vulkan functions (ffmpeg: hwmap=derive_device=vulkan)
afterwards re-mapping the context for encoding (amf or vaapi) (ffmpeg: hwmap=derive_device=vaapi or amf?)
all those operations should be possible without transferring GPU memory to system memory and back again.
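
The pipeline described above can be sketched as an ffmpeg command line, built here in Python so it is easy to adapt. This is a minimal sketch, not a verified recipe: it assumes a Linux VA-API setup, the render node path and the `tonemap_opencl` options are placeholders, and the function name is hypothetical. According to this report, the mapping steps are exactly the part that does not work on AMD at the moment.

```python
import shlex


def build_hwmap_command(src: str, dst: str,
                        render_node: str = "/dev/dri/renderD128") -> list[str]:
    """Build an ffmpeg argv: VA-API decode -> OpenCL filter -> VA-API encode."""
    filters = ",".join([
        # Map the decoded VA-API frames into an OpenCL device derived from VA-API.
        "hwmap=derive_device=opencl",
        # Example OpenCL filter; the option values are assumptions for illustration.
        "tonemap_opencl=tonemap=hable:format=nv12",
        # Map the OpenCL frames back to VA-API so the encoder can consume them.
        "hwmap=derive_device=vaapi:reverse=1",
    ])
    return [
        "ffmpeg",
        "-hwaccel", "vaapi",
        "-hwaccel_device", render_node,
        "-hwaccel_output_format", "vaapi",  # keep decoded frames on the GPU
        "-i", src,
        "-vf", filters,
        "-c:v", "hevc_vaapi",
        dst,
    ]


if __name__ == "__main__":
    # Print a copy-pasteable shell command instead of running ffmpeg directly.
    print(shlex.join(build_hwmap_command("input.mkv", "output.mkv")))
```

No `hwdownload`/`hwupload` appears in the filter chain, which is the point: frames are meant to stay in GPU memory from decode to encode.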

If you need more information, just ask. I'm still testing the other features to see if everything works as expected or not - I will let you know of any more issues I find while testing the AMD Windows/Linux stack regarding Video/AMF

Kind regards and thanks for all your hard work in amdgpu/mesa/ffmpeg and AMF! Your open source efforts are much appreciated!

MikhailAMD · 2022-09-02T02:10:04Z

Few answers:

1. Yes, the HEVC Main10 decoder is not implemented on Linux. It will come eventually.
2. AMD tried to submit to FFmpeg the AMF HW context needed for integration with the AMF decoder and converter. It didn't get enough support from the maintainers: https://patchwork.ffmpeg.org/project/ffmpeg/list/?series=2553
3. We submitted HEVC 10-bit encoder support to FFmpeg 2 months ago; no reaction so far: https://patchwork.ffmpeg.org/project/ffmpeg/list/?series=6989
4. In general, AMF on Linux is built on top of Vulkan. Vulkan doesn't have interop with OpenCL. On Windows, the AMF encoder works primarily through D3D11, and there is interop from D3D11 to OpenCL. Something similar is true for the D3D11VA decoder vs. VAAPI.
5. HW mapping would be possible if the AMF HW context is implemented. See Video Encode API: Unable to set Two Pass Encoding #2.

nyanmisaka · 2022-09-02T17:39:01Z

Any chance AMF team can setup a staging area for ffmpeg and keep a custom fork with full AMF hwcontext and features?

Here are two nice repos from Intel devs to demonstrate this process.
https://github.com/intel-media-ci/cartwheel-ffmpeg
https://github.com/intel-media-ci/ffmpeg

Also if AMD can get in touch with the core developers of ffmpeg and express your intentions for AMF code maintenance, I believe this will push the integration of AMF into ffmpeg.

Patches on the ffmpeg mailing list seem to be easily ignored unless you re-send patches regularly, or you are an accredited maintainer.

nyanmisaka · 2022-09-02T17:39:44Z

> copy/use HWcontext for processing using OpenCL functions (ffmpeg: hwmap=derive_device=opencl)

This is feasible for d3d11va<->opencl on Windows with two of my patches:
https://github.com/jellyfin/jellyfin-ffmpeg/blob/jellyfin/debian/patches/0010-add-d3d11-opencl-interop-for-amd.patch
https://github.com/jellyfin/jellyfin-ffmpeg/blob/jellyfin/debian/patches/0011-add-a-hack-for-opencl-reverse-mapping.patch

> afterwards re-mapping the context for encoding (amf or vaapi) (ffmpeg: hwmap=derive_device=vaapi or amf?)

Mapping from the AMD proprietary Vulkan driver to VAAPI is not available, since that driver doesn't support the necessary extension VK_EXT_image_drm_format_modifier, whereas RADV does.
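
One way to check which driver you are dealing with is to look for the extension in the output of the `vulkaninfo` tool. The following is a rough sketch that assumes `vulkaninfo` is installed and lists extension names in its plain-text output; the helper name is hypothetical.

```python
import shutil
import subprocess

EXTENSION = "VK_EXT_image_drm_format_modifier"


def has_extension(vulkaninfo_text: str, name: str = EXTENSION) -> bool:
    """Return True if the extension name appears anywhere in vulkaninfo's output."""
    return any(name in line for line in vulkaninfo_text.splitlines())


if __name__ == "__main__":
    if shutil.which("vulkaninfo") is None:
        print("vulkaninfo not found; install the Vulkan tools package first")
    else:
        out = subprocess.run(["vulkaninfo"], capture_output=True, text=True).stdout
        print(f"{EXTENSION}: {'present' if has_extension(out) else 'missing'}")
```

On a machine with several Vulkan devices this only tells you that at least one of them exposes the extension, so check the per-device section of the output if in doubt.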

bavdevc · 2022-09-02T18:16:48Z

Thank you for the quick answers, @MikhailAMD - you accurately describe the current situation.

@nyanmisaka very good ideas, the staging repo is a quick win but in the long run ffmpeg maintainers should merge it - I have never read a real reason from them why not to.

Regarding mapping: I noticed it works only in a very limited way at the moment, and I spent some time investigating it, as you obviously did with those patches for Windows.

MikhailAMD · 2022-09-02T18:36:56Z

With the first AMF integration we asked for maintainer rights; it didn't happen.
Yes, constant maintenance of a fork is resource consuming in the long run. The preference would be to just deliver features. I suggest we start with 10-bit encoder support. We intend to ping for merging.

softworkz · 2022-09-07T21:41:07Z

Intel has been contributing to FFmpeg for many years with 5-10 developers. Only within the last year did one of them become a maintainer, so this will require patience, endurance and continued presence and participation.
FFmpeg is a weird place. Hardly anybody will welcome you, or say something positive about patches you submit. The best you can get is when nobody is objecting.

> The preference would be to just deliver features

This won't work out as easily as that. You will always have a lag between patch submission and inclusion in ffmpeg, possibly even of many months. You won't be able to work in a way where you submit one patch and wait for that patch to be merged before working on another. It's unavoidable that a queue of unmerged patches will emerge as your own baseline and you'll need some build mechanism for dealing with it.

> constant maintenance of a fork is resource consuming

You'll surely want to perform automated tests of your patches on a hardware farm to make sure that those patches are working across the full range of supported hardware and to identify regressions early. For that purpose alone, you'll need automated builds.
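
As a rough illustration of such a build mechanism, here is a minimal Python sketch that orders a numbered patch queue (file names of the form `0010-...`, `0011-...`) and builds a single `git am` invocation for it. It assumes the patches are in the mbox format produced by `git format-patch`; the directory layout and helper names are hypothetical, not a description of any existing fork's tooling.

```python
import re
from pathlib import Path


def ordered_patches(names: list[str]) -> list[str]:
    """Sort patch file names by their numeric prefix (e.g. 0010-..., 0011-...)."""
    def key(name: str) -> tuple[int, str]:
        match = re.match(r"(\d+)-", name)
        # Patches without a numeric prefix sort last, alphabetically.
        return (int(match.group(1)) if match else 10**9, name)
    return sorted(names, key=key)


def git_am_command(patch_dir: str, names: list[str]) -> list[str]:
    """Build one `git am` invocation that applies the whole queue in order."""
    return ["git", "am", *[str(Path(patch_dir) / n) for n in ordered_patches(names)]]
```

A CI job could run the resulting command on a fresh checkout of upstream and then build and test the result, which also shows early when a queued patch no longer applies.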

I'm sorry to say this, but watching the (marginal) progress over the past years and reading the responses above only confirms my impression that AMD is not strongly invested in providing software (driver/middleware) support for the media acceleration capabilities of its range of dGPU products. AMD is way behind its competitors in that area and I can't see the kind of intensive effort that would be required to close that gap.

What's missing in the list above and IMO one of the TOP3 shortcomings is the lack of hardware video processing through filters in ffmpeg like supported by the competition. Mapping to other hw contexts (e.g. OpenCL) is sometimes useful, but it can only be a last resort in case when there's no other way. All those context mappings introduce an additional source of error - and that's one that is difficult to foresee and deal with.
Of course that doesn't apply when you are working on something for a very specific and defined hardware/software environment. But when you develop applications that are supposed to run on a wide range of hardware, operating systems, platform architectures and software configurations, you need a reliable API (in this case for hw accelerated video processing) that will work everywhere in the same way without preconditions (besides hw capabilities).

Scaling, deinterlacing, color conversion, overlay, transforms - at minimum those basics would need to be available as native hw filters, to make some significant step out of the shadow of the two big ones (= we're still talking about hw media acceleration - only ;-)

I don't just want to sound pessimistic. I really wish and hope to see some more progress and stronger movement from AMD in this area. But it should be clear to the AMD program managers that simply continuing like before won't suffice.

Best wishes.
softworkz

MikhailAMD · 2022-09-14T17:14:35Z

10-bit encoding patch was re-sent: http://ffmpeg.org/pipermail/ffmpeg-devel/2022-September/301109.html

Chipcraft · 2022-09-25T16:27:27Z

> 10-bit encoding patch was re-sent: http://ffmpeg.org/pipermail/ffmpeg-devel/2022-September/301109.html

There is still no p010le support though, after the patch?
Meaning the decode has to happen via d3d11va, for pix_fmt to be d3d11?

Specifying "-profile:v main10" in the FFmpeg command without specifying the pix_fmt results in yuv420p (8-bit) output, while specifying yuv420p10le or p010le (which will become nv12) will halt the program and crash the driver (19044.2006 Win10, 22.5.1 WHQL, RX 6800 XT RDNA2).

MikhailAMD · 2022-09-26T13:43:25Z

Not sure I understand. http://ffmpeg.org/pipermail/ffmpeg-devel/2022-September/301110.html claims that p010le is supported as encoder input. This patch doesn't change decoding. Input can be p010le in system memory or DXGI_FORMAT_P010.
Also, the HEVC Main 10 profile doesn't fix the bit depth; it can be 8 or 10-bit: https://en.wikipedia.org/wiki/High_Efficiency_Video_Coding
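
For reference, the kind of command being discussed (software decode, p010le frames in system memory, `hevc_amf` with the main10 profile) can be sketched as follows. The flags are the ones mentioned in this thread; whether `hevc_amf` accepts them depends on the ffmpeg build including the 10-bit patch, so treat this as an assumption rather than a confirmed invocation. The function name is hypothetical.

```python
import shlex


def build_hevc_amf_10bit_command(src: str, dst: str) -> list[str]:
    """Build an ffmpeg argv that feeds 10-bit p010le frames to the AMF HEVC encoder."""
    return [
        "ffmpeg",
        "-i", src,
        "-pix_fmt", "p010le",      # request 10-bit input explicitly instead of the 8-bit default
        "-c:v", "hevc_amf",
        "-profile:v", "main10",    # assumed to be available only with the 10-bit patch applied
        dst,
    ]


if __name__ == "__main__":
    print(shlex.join(build_hevc_amf_10bit_command("input.mkv", "output.mkv")))
```

Setting the pixel format explicitly matters here because, as noted above, the profile alone does not force 10-bit output.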

bavdevc · 2022-11-16T18:59:59Z

Just so you all know, AMD is actively working on improving this case, Dmitrii Ovchinnikov will help ffmpeg development now:
https://git.ffmpeg.org/gitweb/ffmpeg.git/commit/9b13078c6aa404bb8405651ee2d01d08f9b0925c
quote Dmitrii:

MAINTAINERS: add myself as amfenc* maintainer

Due to the lack of an active AMF maintainer at the moment, as well
as plans to add the av1 encoder and other improvements of AMF,
I added myself to the maintainers. Timely review and merging
patches targeting AMF integration should improve support
of AMD GPUs and APUs in FFmpeg.
For the last couple of years I have been working on AMF related
patches to ffmpeg and other open source projects.

Thank you Dmitrii

bavdevc · 2022-11-16T19:14:33Z

ping @OvchinnikovDmitrii - much appreciated

bavdevc added the question label Sep 2, 2022

ishkong mentioned this issue Oct 22, 2022

Could you please add some patches to update the AMF SDK to 1.4.23.0 (or support hevc_amf encode h265 10bit)? BtbN/FFmpeg-Builds#202

Closed
