AV1 Video Decode with Media Source does not use HW Decode on AMD Windows #10213

Open
theofficialgman opened this issue Feb 6, 2024 · 15 comments

@theofficialgman

theofficialgman commented Feb 6, 2024

Operating System Info

Windows 11

Other OS

No response

OBS Studio Version

30.0.2 and 30.1-beta1

OBS Studio Version (Other)

No response

OBS Studio Log URL

https://obsproject.com/logs/LPk6Mr1icydav9OL

OBS Studio Crash Log URL

No response

Expected Behavior

I expect hardware decoding to be used for AV1 on AMD when using the "Media Source".

Current Behavior

Software video decode on the CPU is used for AV1 when using the "Media Source"

Steps to Reproduce

  1. Use "Media Source" on AMD Windows with AV1 decode support and tick the "Use hardware decoding when available" box
  2. Observe no Video Codec usage and high CPU utilization

Anything else we should know?

I tested the same file in Windows Media Player: Video Codec is used and CPU usage is very low (HW decode is used). So this is not an issue with the tested video files (I have tried several AV1 video files, all with the same result).

VP9 AMD HW accelerated decode works in "Media Source"

@Fenrirthviti
Member

This is most likely expected. Media source uses ffmpeg for playback, so we'd need to ship a version that supports HW decode (if one even exists at the moment).

@theofficialgman
Author

theofficialgman commented Feb 6, 2024

d3d11va and dxva2 (H.264, MPEG-2, VC-1, WMV 3, VP9, AV1, HEVC) in ffmpeg both cover HW decode for AV1 and should already be available. One of them is what VP9 HW decode in the OBS ffmpeg already uses.

I will play around with standalone ffmpeg for Windows and see if it can be made to work or not.

@theofficialgman
Author

theofficialgman commented Feb 6, 2024

I tested both D3D11VA and DXVA2 with standalone ffmpeg for VP9 and AV1, using the BtbN build n6.1.1-1-g61b88b4dda-20240206 as suggested on the FFmpeg website: https://ffmpeg.org/download.html#build-windows

ffmpeg version n6.1.1-1-g61b88b4dda-20240206 Copyright (c) 2000-2023 the FFmpeg developers
  built with gcc 13.2.0 (crosstool-NG 1.25.0.232_c175b21)
  configuration: --prefix=/ffbuild/prefix --pkg-config-flags=--static --pkg-config=pkg-config --cross-prefix=x86_64-w64-mingw32- --arch=x86_64 --target-os=mingw32 --enable-gpl --enable-version3 --disable-debug --disable-w32threads --enable-pthreads --enable-iconv --enable-libxml2 --enable-zlib --enable-libfreetype --enable-libfribidi --enable-gmp --enable-lzma --enable-fontconfig --enable-libharfbuzz --enable-libvorbis --enable-opencl --disable-libpulse --enable-libvmaf --disable-libxcb --disable-xlib --enable-amf --enable-libaom --enable-libaribb24 --enable-avisynth --enable-chromaprint --enable-libdav1d --enable-libdavs2 --disable-libfdk-aac --enable-ffnvcodec --enable-cuda-llvm --enable-frei0r --enable-libgme --enable-libkvazaar --enable-libaribcaption --enable-libass --enable-libbluray --enable-libjxl --enable-libmp3lame --enable-libopus --enable-librist --enable-libssh --enable-libtheora --enable-libvpx --enable-libwebp --enable-lv2 --enable-libvpl --enable-openal --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenh264 --enable-libopenjpeg --enable-libopenmpt --enable-librav1e --enable-librubberband --enable-schannel --enable-sdl2 --enable-libsoxr --enable-libsrt --enable-libsvtav1 --enable-libtwolame --enable-libuavs3d --disable-libdrm --enable-vaapi --enable-libvidstab --enable-vulkan --enable-libshaderc --enable-libplacebo --enable-libx264 --enable-libx265 --enable-libxavs2 --enable-libxvid --enable-libzimg --enable-libzvbi --extra-cflags=-DLIBTWOLAME_STATIC --extra-cxxflags= --extra-ldflags=-pthread --extra-ldexeflags= --extra-libs=-lgomp --extra-version=20240206

On VP9:
dxva2 (-hwaccel dxva2) did not work and produced this in the logs:

[vp9 @ 0000017e8f921bc0] Failed setup for format dxva2_vld: hwaccel initialisation returned error.

and it fell back to using software decoding

d3d11va (-hwaccel d3d11va) worked and hardware-decoded the VP9 video file at even better performance than my SoC's specifications: I expected 2160p196 8/10bpc VP9 but got 2160p at 250fps.

On AV1:
dxva2 (-hwaccel dxva2) did not work and produced the following, along with log spam about how it could not decode:

Press [q] to stop, [?] for help
[av1 @ 000002165dd9a500] Failed setup for format dxva2_vld: hwaccel initialisation returned error.
[av1 @ 000002165dd9a500] Your platform doesn't support hardware accelerated AV1 decoding.
[av1 @ 000002165dd9a500] Failed to get pixel format.

d3d11va (-hwaccel d3d11va) worked and hardware-decoded the AV1 video file (very low CPU usage, but only 38% "Video Codec" usage in Task Manager). I expected 100% "Video Codec" usage since I am running a benchmark. Per my SoC's specifications (https://www.amd.com/en/product/13196) I expected 2160p240 8/10bpc AV1, but I am only getting 2160p at 48fps. For reference, I am benchmarking like so:

ffmpeg.exe -hwaccel d3d11va -i "C:\Users\USER\Downloads\yt-dlp_win\COSTA RICA IN 4K 60fps HDR (ULTRA HD) [LXb3EKWsInQ].mp4" -f null - -benchmark

Edit: specifying -hwaccel_output_format d3d11 after -hwaccel d3d11va avoids the copy to system memory, and decode performance is now better than expected (2160p at 275fps) with 100% "Video Codec" usage.

So it looks like AMD did not implement AV1 decode in their DX9 driver which makes sense because it is probably rarely used compared to the DX11 driver.

So this appears to be a purely OBS issue @Fenrirthviti since I have confirmed that d3d11va works on standalone ffmpeg on Windows on both VP9 and AV1 while on OBS only VP9 HW decode works. I assume d3d11va is used based on this

enum AVHWDeviceType hw_priority[] = {
    AV_HWDEVICE_TYPE_D3D11VA,      AV_HWDEVICE_TYPE_DXVA2,
    AV_HWDEVICE_TYPE_CUDA,         AV_HWDEVICE_TYPE_VAAPI,
    AV_HWDEVICE_TYPE_VDPAU,        AV_HWDEVICE_TYPE_QSV,
    AV_HWDEVICE_TYPE_VIDEOTOOLBOX, AV_HWDEVICE_TYPE_NONE,
};
I have tried putting hwaccel=d3d11va in the custom ffmpeg options without any effect for both VP9 and AV1, so it seems this option is filtered out or overridden by the "Use hardware decoding when available" option.

@Fenrirthviti
Member

I don't know offhand if we ship with d3d11va enabled on Windows or not. Could be as simple as enabling in the build.

Tagging @RytoEX as he's more familiar with our ffmpeg build opts.

@theofficialgman
Author

theofficialgman commented Feb 6, 2024

> I don't know offhand if we ship with d3d11va enabled on Windows or not. Could be as simple as enabling in the build.

I got the build info by checking the strings in the OBS libavcodec.dll:

--disable-w32threads --enable-pthreads --arch=x86_64 --target-os=mingw32 --cross-prefix=x86_64-w64-mingw32- --pkg-config=pkg-config --enable-cross-compile --disable-mediafoundation --enable-libaom --enable-libsvtav1 --prefix=/home/runner/work/obs-deps/obs-deps/windows-x64/obs-ffmpeg-x64 --host-cflags=-I/home/runner/work/obs-deps/obs-deps/windows-x64/obs-ffmpeg-x64/include --host-ldflags=-I/home/runner/work/obs-deps/obs-deps/windows-x64/obs-ffmpeg-x64/include --extra-cflags='-I/home/runner/work/obs-deps/obs-deps/windows-x64/obs-ffmpeg-x64/include -static-libgcc -w -pipe -fno-semantic-interposition -O3 -g -DNDEBUG' --extra-cxxflags='-I/home/runner/work/obs-deps/obs-deps/windows-x64/obs-ffmpeg-x64/include -static-libgcc -static-libstdc++ -w -pipe -fno-semantic-interposition -O3 -g -DNDEBUG' --extra-ldflags='-L/home/runner/work/obs-deps/obs-deps/windows-x64/obs-ffmpeg-x64/lib -static-libgcc -static-libstdc++ -static-libgcc -Wl,-Bstatic -pthread' --enable-version3 --enable-gpl --enable-libx264 --enable-libopus --enable-libvorbis --enable-libvpx --enable-librist --enable-libsrt --enable-shared --disable-static --disable-libjack --disable-indev=jack --disable-sdl2 --disable-doc --disable-postproc --disable-stripping --pkg-config-flags=--static

libavcodec license: GPL version 3 or later FFmpeg version n6.0-12-ga6dc92968a

This matches what the ffmpeg build script shows at https://github.com/obsproject/obs-deps/blob/e1d4ddd710dbcc08407ff4c5c6267372b6bb0c83/deps.ffmpeg/99-ffmpeg.zsh, which is the obs-deps commit used in OBS 30.0.2.

So it isn't explicitly enabled, though I don't think it needs to be. After all, VP9 is already using HW decode, and the only hwaccel that worked for it in my standalone ffmpeg test was d3d11va, so it must be available.

@RytoEX
Member

RytoEX commented Feb 6, 2024

> Using the BtbN builds n6.1.1-1-g61b88b4dda-20240206 as suggested on FFmpeg website

OBS Studio 30.0.2 uses FFmpeg n6.0 with some commits backported. Please test with that to see if -hwaccel d3d11va works there.

> I have tried putting hwaccel=d3d11va in custom ffmpeg options without any effect for both VP9 and AV1

You shouldn't have to specify anything. OBS will test each enum value in order until one works.
https://github.com/obsproject/obs-studio/blob/30.0.2/deps/media-playback/media-playback/decode.c#L54-L63

If none of them work, then you do not get hardware accelerated decode.
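
The probing order described here can be illustrated with a small self-contained C model. This is only a sketch and does not link against FFmpeg or OBS: `toy_hw_type`, `toy_decoder`, `toy_priority` and `first_supported_hw` are hypothetical names standing in for `enum AVHWDeviceType`, a decoder's advertised hardware configurations, and the `hw_priority` loop in decode.c. The sample capability lists are assumptions chosen purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for FFmpeg's enum AVHWDeviceType. */
enum toy_hw_type {
    TOY_HW_NONE = 0,
    TOY_HW_D3D11VA,
    TOY_HW_DXVA2,
    TOY_HW_CUDA,
};

/* Hypothetical stand-in for a decoder and the hwaccels it advertises. */
struct toy_decoder {
    const char *name;
    const enum toy_hw_type *hw_types; /* terminated by TOY_HW_NONE */
};

/* Same idea as the hw_priority table: earlier entries are tried first. */
static const enum toy_hw_type toy_priority[] = {
    TOY_HW_D3D11VA, TOY_HW_DXVA2, TOY_HW_CUDA, TOY_HW_NONE,
};

/* Returns 1 if the decoder advertises the given hardware type. */
static int decoder_supports(const struct toy_decoder *dec,
                            enum toy_hw_type type)
{
    assert(dec != NULL && dec->hw_types != NULL);
    for (const enum toy_hw_type *t = dec->hw_types; *t != TOY_HW_NONE; t++) {
        if (*t == type)
            return 1;
    }
    return 0;
}

/* Walk the priority list and return the first type the decoder supports.
 * TOY_HW_NONE means nothing matched, i.e. software decode is used. */
static enum toy_hw_type first_supported_hw(const struct toy_decoder *dec)
{
    for (const enum toy_hw_type *p = toy_priority; *p != TOY_HW_NONE; p++) {
        if (decoder_supports(dec, *p))
            return *p;
    }
    return TOY_HW_NONE;
}

/* Illustrative sample decoders (assumed capabilities, not real data). */
static const enum toy_hw_type both_types[] = {
    TOY_HW_DXVA2, TOY_HW_D3D11VA, TOY_HW_NONE,
};
static const enum toy_hw_type dxva2_only[] = { TOY_HW_DXVA2, TOY_HW_NONE };
static const enum toy_hw_type no_types[] = { TOY_HW_NONE };

static const struct toy_decoder toy_hw_capable = { "hw-capable", both_types };
static const struct toy_decoder toy_dxva2_only = { "dxva2-only", dxva2_only };
static const struct toy_decoder toy_sw_only = { "sw-only", no_types };
```

In this model, `first_supported_hw(&toy_hw_capable)` yields `TOY_HW_D3D11VA` because the priority table, not the decoder's own ordering, decides which type wins, while `first_supported_hw(&toy_sw_only)` yields `TOY_HW_NONE`. Note that the decoder is fixed before the loop runs, so a decoder that advertises nothing can never get hardware decode no matter how long the priority list is.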

@theofficialgman
Author

theofficialgman commented Feb 6, 2024

I also confirmed that OBS Studio 30.1-beta1 has the same issue (VP9 still works there, AV1 does not).

@theofficialgman
Author

theofficialgman commented Feb 6, 2024

> > Using the BtbN builds n6.1.1-1-g61b88b4dda-20240206 as suggested on FFmpeg website
>
> OBS Studio 30.0.2 uses FFmpeg n6.0 with some commits backported. Please test with that to see if -hwaccel d3d11va works there.
>
> > I have tried putting hwaccel=d3d11va in custom ffmpeg options without any effect for both VP9 and AV1
>
> You shouldn't have to specify anything. OBS will test each enum value in order until one works. https://github.com/obsproject/obs-studio/blob/30.0.2/deps/media-playback/media-playback/decode.c#L54-L63
>
> If none of them work, then you do not get hardware accelerated decode.

No need; see the comment above. The issue still exists on OBS 30.1-beta1, which uses ffmpeg 6.1.1: https://github.com/obsproject/obs-deps/blob/4202b500fdd849eaf50272e4856880b0b8218350/deps.ffmpeg/99-ffmpeg.zsh#L5

@theofficialgman
Author

theofficialgman commented Feb 6, 2024

Ignore the previous message, which I have already deleted (I forgot to pass -hwaccel d3d11va and so came to entirely the wrong conclusion).

@RytoEX I should have thought to test the OBS deps directly. The FFmpeg binary is built, just not shipped with OBS.
I downloaded OBS 30.1-beta1 deps https://github.com/obsproject/obs-studio/blob/30.1.0-beta1/buildspec.json#L4 https://github.com/obsproject/obs-deps/releases/download/2024-01-27/windows-deps-2024-01-27-x64.zip and confirmed that using OBS FFmpeg directly DOES work.

C:\Users\USER\Downloads\windows-deps-2024-01-27-x64\bin>ffmpeg.exe -hwaccel d3d11va -i "C:\Users\USER\Downloads\yt-dlp_win\COSTA RICA IN 4K 60fps HDR (ULTRA HD) [LXb3EKWsInQ].mp4"  -f null - -benchmark

So the problem does not seem to be FFmpeg itself but rather how OBS calls it.

@RytoEX
Member

RytoEX commented Feb 7, 2024

I dug into this a little bit. The short answer so far is that avcodec_get_hw_config returns null if the media codec is AV1.

Checking the debug log output of ffmpeg.exe, it looks like FFmpeg changes the decoder from the specific codec (e.g., libaom-av1) to a generic one:

Selecting decoder 'av1' because of requested hwaccel method d3d11va

As far as I can tell, ffmpeg.exe (the implementation in ffmpeg_demux.c) is different and more failsafe than their example implementation in hw_decode.c, which has no such fallbacks and is closer to our implementation. I suspect if you tried to compile and use their hw_decode example, it would also fail.
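
The difference between the two lookups can be sketched with a small self-contained C model. It does not call FFmpeg: `toy_registry`, `find_first_decoder` and `find_decoder_for_hw` are hypothetical names, and the registry contents and ordering are assumptions chosen to mirror what is reported in this thread (a libaom wrapper found first for AV1 with no hardware configs, and a generic `av1` decoder that does advertise d3d11va). In real FFmpeg code the corresponding building blocks would be `avcodec_find_decoder`, `av_codec_iterate` and `avcodec_get_hw_config`.

```c
#include <assert.h>
#include <stddef.h>

enum toy_codec_id { TOY_CODEC_VP9, TOY_CODEC_AV1 };
enum toy_hw_type { TOY_HW_NONE = 0, TOY_HW_D3D11VA, TOY_HW_DXVA2 };

#define TOY_MAX_HW 4

/* Hypothetical stand-in for a decoder entry and its hardware configs. */
struct toy_decoder {
    const char *name;
    enum toy_codec_id id;
    enum toy_hw_type hw_types[TOY_MAX_HW]; /* unused slots are TOY_HW_NONE */
};

/* Assumed registration order: the libaom wrapper precedes generic "av1". */
static const struct toy_decoder toy_registry[] = {
    { "vp9",        TOY_CODEC_VP9, { TOY_HW_D3D11VA, TOY_HW_DXVA2 } },
    { "libaom-av1", TOY_CODEC_AV1, { TOY_HW_NONE } },
    { "av1",        TOY_CODEC_AV1, { TOY_HW_D3D11VA, TOY_HW_DXVA2 } },
};
#define TOY_REGISTRY_LEN (sizeof(toy_registry) / sizeof(toy_registry[0]))

/* Returns 1 if the decoder advertises the given hardware type. */
static int supports_hw(const struct toy_decoder *dec, enum toy_hw_type type)
{
    assert(dec != NULL);
    for (size_t j = 0; j < TOY_MAX_HW && dec->hw_types[j] != TOY_HW_NONE; j++) {
        if (dec->hw_types[j] == type)
            return 1;
    }
    return 0;
}

/* Lookup by codec id only: the first registered match wins. */
static const struct toy_decoder *find_first_decoder(enum toy_codec_id id)
{
    for (size_t i = 0; i < TOY_REGISTRY_LEN; i++) {
        if (toy_registry[i].id == id)
            return &toy_registry[i];
    }
    return NULL;
}

/* Lookup that honours a requested hardware type: prefer any decoder for
 * this codec id that advertises the type, else fall back to the first. */
static const struct toy_decoder *find_decoder_for_hw(enum toy_codec_id id,
                                                     enum toy_hw_type type)
{
    if (type != TOY_HW_NONE) {
        for (size_t i = 0; i < TOY_REGISTRY_LEN; i++) {
            if (toy_registry[i].id == id && supports_hw(&toy_registry[i], type))
                return &toy_registry[i];
        }
    }
    return find_first_decoder(id);
}
```

With this registry, `find_first_decoder(TOY_CODEC_AV1)` returns the `libaom-av1` entry, for which `supports_hw` is false for every type, whereas `find_decoder_for_hw(TOY_CODEC_AV1, TOY_HW_D3D11VA)` skips it and returns the generic `av1` entry. VP9 is unaffected in the model because its first registered decoder already advertises a hardware type, which is consistent with VP9 working in OBS today.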

@theofficialgman
Author

theofficialgman commented Feb 7, 2024

I think your suspicions are probably right. The example implementation hasn't been touched seriously in over 3 years. It's probably safe to say nobody tested against it.

I could try to compile ffmpeg on windows but that's going to take a lot of setup on my end... Not really prepared for that (I'm more of a Linux person myself).

It might be worth pinging elenril, who authored the move of hwaccel selection into ffmpeg_demux.c and also the original implementation in ffmpeg_opt.c before the move. They might be able to say whether the example code should be updated to include those changes, or whether avcodec_get_hw_config shouldn't return null when the codec passed (libaom-av1) will be automatically translated by ffmpeg to a codec that does have hwaccel support (av1).

I think my question now is, why is libaom-av1 being selected in the first place? I assume that when you call

bool mp_decode_init(mp_media_t *m, enum AVMediaType type, bool hw)
it ends up with libaom-av1 as the codec as well when that function calls avcodec_find_decoder(id)

@RytoEX
Member

RytoEX commented Feb 7, 2024

> I think your suspicions are probably right. The example implementation hasn't been touched seriously in over 3 years. It's probably safe to say nobody tested against it.

The tricky thing about ffmpeg.exe is that it has failsafes/fallbacks like this to try to resolve to a known working condition.

> I could try to compile ffmpeg on windows but that's going to take a lot of setup on my end... Not really prepared for that (I'm more of a Linux person myself).

Our deps build scripts can now produce FFmpeg builds on Windows. I may also get around to trying to build the example eventually, but I have other tasks that are more likely to consume my time at the moment.

> It might be worth pinging elenril, who authored the move of hwaccel selection into ffmpeg_demux.c and also the original implementation in ffmpeg_opt.c before the move. They might be able to say whether the example code should be updated to include those changes, or whether avcodec_get_hw_config shouldn't return null when the codec passed (libaom-av1) will be automatically translated by ffmpeg to a codec that does have hwaccel support (av1).

Personally, I wouldn't know where to begin there.

> I think my question now is, why is libaom-av1 being selected in the first place? I assume that when you call
>
> bool mp_decode_init(mp_media_t *m, enum AVMediaType type, bool hw)
>
> it ends up with libaom-av1 as the codec as well when that function calls avcodec_find_decoder(id)

Yes, avcodec_find_decoder(id) returns an AVCodec with:

name: "libaom-av1"
long_name: "libaom AV1"
type: AVMEDIA_TYPE_VIDEO
id: AV_CODEC_ID_AV1
wrapper_name: "libaom"

Similarly, av_find_best_stream used in hw_decode.c will produce the same result.

@DimkaTsv

> On AV1:
> dxva2 did not work -hwaccel dxva2 and resulted in along with log spam about how it could not decode:
> d3d11va worked -hwaccel d3d11va and resulted in the HW Decode of the AV1 video file

A bit of context on why -hwaccel dxva2 didn't work here specifically: AMD GPUs only support AV1 HW decode through D3D11VA.
VP9, on the other hand, should have worked with both DXVA2 and D3D11VA.

@theofficialgman
Author

pinging relevant authors from https://github.com/FFmpeg/FFmpeg/blob/master/fftools/ffmpeg_demux.c and https://github.com/FFmpeg/FFmpeg/blob/master/doc/examples/hw_decode.c
@fhvwy @xhaihao
specifically for your contributions
FFmpeg/FFmpeg@b0cd14f
FFmpeg/FFmpeg@ad67ea9

the issue we are having is that decode of AV1 video ends up with libaom-av1 in OBS and HW decode doesn't get used.
the relevant OBS code for choosing the decoder is here -> https://github.com/obsproject/obs-studio/blob/master/deps/media-playback/media-playback/decode.c

I believe that the last commit from @xhaihao (FFmpeg/FFmpeg@ad67ea9) actually fixes this issue in ffmpeg since the commit message description seems to explain the issue identically

> Usually a HW decoder is expected when user specifies a HW acceleration
> method via -hwaccel option, however the current implementation doesn't
> take HW acceleration method into account, it is possible to select a SW
> decoder.

The example code (https://github.com/FFmpeg/FFmpeg/blob/master/doc/examples/hw_decode.c) does not seem to have been updated to include that change.
@RytoEX ^

@RytoEX
Member

RytoEX commented Mar 22, 2024

While I appreciate the enthusiasm and the attempt to get visibility on this, our repo is probably not the best place to discuss issues with FFmpeg examples. For issues with their examples, someone should probably file a bug on FFmpeg's Trac or bring it up in their IRC or on their mailing list.

For the problem here, we probably need to implement something similar to what was done in fftools/ffmpeg_opt.c to determine an appropriate decoder when using hardware acceleration.
