[Bug]: DG2 A380 VAAPI doesn't work #1487

Closed
Quackdoc opened this issue Sep 11, 2022 · 42 comments
Labels: AV1, Encode

Comments

@Quackdoc

Quackdoc commented Sep 11, 2022

Which component impacted?

Decode, Encode

Is it regression? Good in old configuration?

No

What happened?

VAAPI does not work on A380 on either ffmpeg or gstreamer

What's the usage scenario when you are seeing the problem?

Transcode for media delivery, Playback

What impacted?

Both FFmpeg and GStreamer are affected, which in turn affects mpv, VLC, etc.

Debug Information

Arch Linux
Kernel 6.0.0-rc4-1-git-00302-gb96fbd602d35
Intel Media Driver Git 2022.5.3.r76.g45c0eb1df-1
Libva-git-2.2.1.pre1.20180921.r6.gcf11abe-1
ffmpeg-git 5.2.r108102.g60d8c2019f-1 OR ffmpeg-git-5.2.r108117.g2cfc4ac2b3-1-x86_64.pkg.tar.zst from ffmpeg-cartwheel

AVC vaapi playback
avc.tar.gz

VP9 vaapi playback
vp9.tar.gz

Av1 vaapi playback
av1-libva.log

Do you want to contribute a patch to fix the issue?

No response

@Quackdoc
Author

As further information, I was not able to replicate this issue on Fedora; it seems to work there.

@Quackdoc
Author

More information. Installed ffmpeg-git intel-media-driver-git linux-git linux-firmware-git on arch virtual machine via chaotic AUR.

Using VAAPI decode on Netflix's Chimera results in a KVM crash on QEMU. I am experiencing similar crashes when using cartwheel.

error: kvm run failed Bad address
RAX=00000001105e001b RBX=ffff964d4f52a000 RCX=0000000000000200 RDX=0000000000000200
RSI=00000001105e001b RDI=ffffa3acc0859000 RBP=ffff964d4266a640 RSP=ffffa3acc0867ae8
R8 =ffffa3acc0859000 R9 =ffff964e7fff7af0 R10=0000000000000000 R11=0000000000000000
R12=0000000000000000 R13=ffff964d4326a440 R14=ffff964d43268000 R15=ffff964d43268000
RIP=ffffffffc0d532fb RFL=00000286 [--S--P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 0000000000000000 ffffffff 00800000
CS =0010 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
SS =0018 0000000000000000 ffffffff 00c09300 DPL=0 DS   [-WA]
DS =0000 0000000000000000 ffffffff 00800000
FS =0000 00007f946ff1a000 ffffffff 00800000
GS =0000 ffff964e77c40000 ffffffff 00800000
LDT=0000 0000000000000000 0000ffff 00000000
TR =0040 fffffe000003e000 00004087 00008b00 DPL=0 TSS64-busy
GDT=     fffffe000003c000 0000007f
IDT=     fffffe0000000000 00000fff
CR0=80050033 CR2=000055d7074f3b30 CR3=000000010ce2c000 CR4=003506e0
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000d01
Code=87 a0 02 00 00 89 d1 48 89 f0 49 81 e0 00 f0 ff ff 4c 89 c7 <f3> 48 ab be 00 10 00 00 4c 89 c7 e9 65 57 b3 ff 0f 1f 44 00 00 0f 1f 44 00 00 41 57 41 56

@Quackdoc
Author

A similar crash is now happening in fedora VM

error: kvm run failed Bad address
RAX=00000000014c001b RBX=ffff8ca490988400 RCX=0000000000000200 RDX=0000000000000200
RSI=00000000014c001b RDI=ffff9fa180a2d000 RBP=ffff8ca491083300 RSP=ffff9fa180b67a20
R8 =ffff9fa180a2d000 R9 =0000000000000000 R10=0000000000002000 R11=ffff9fa180a2d000
R12=0000000000000000 R13=ffff8ca48eafa360 R14=ffff8ca48eaf8000 R15=00000000808da328
RIP=ffffffffc076d0db RFL=00000286 [--S--P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0000 0000000000000000 ffffffff 00800000
CS =0010 0000000000000000 ffffffff 00a09b00 DPL=0 CS64 [-RA]
SS =0018 0000000000000000 ffffffff 00c09300 DPL=0 DS   [-WA]
DS =0000 0000000000000000 ffffffff 00800000
FS =0000 00007ff6a575e3c0 ffffffff 00800000
GS =0000 ffff8ca4bbc00000 ffffffff 00800000
LDT=0000 0000000000000000 0000ffff 00000000
TR =0040 fffffe0000003000 00004087 00008b00 DPL=0 TSS64-busy
GDT=     fffffe0000001000 0000007f
IDT=     fffffe0000000000 00000fff
CR0=80050033 CR2=000055d1a311fe40 CR3=0000000001262000 CR4=003506f0
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000d01
Code=87 a0 02 00 00 89 d1 48 89 f0 49 81 e0 00 f0 ff ff 4c 89 c7 <f3> 48 ab be 00 10 00 00 4c 89 c7 e9 f5 b9 14 ed 0f 1f 44 00 00 0f 1f 44 00 00 41 57 41 56

@eero-t

eero-t commented Sep 21, 2022

What does dmesg | grep i915 show?

@Quackdoc
Author

Sorry for the late reply. I cannot check the PC right now since it's running stable Linux and I won't be able to reset it until later; however, the KVM output of dmesg is this:

[    0.000000] Command line: root=/dev/vda2 console=ttyS0 i915.force_probe=56a5
[    0.065986] Kernel command line: root=/dev/vda2 console=ttyS0 i915.force_probe=56a5
[    2.807908] i915 0000:00:05.0: [drm] Incompatible option enable_guc=3 - HuC is not supported!
[    2.808455] i915 0000:00:05.0: [drm] VT-d active for gfx access
[    2.808510] i915 0000:00:05.0: vgaarb: deactivate vga console
[    2.808581] i915 0000:00:05.0: [drm] Local memory IO size: 0x000000017c800000
[    2.808586] i915 0000:00:05.0: [drm] Local memory available: 0x000000017c800000
[    2.834783] i915 0000:00:05.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem
[    2.846848] i915 0000:00:05.0: [drm] Finished loading DMC firmware i915/dg2_dmc_ver2_06.bin (v2.6)
[    2.876739] i915 0000:00:05.0: [drm] GuC firmware i915/dg2_guc_70.1.2.bin version 70.1
[    2.888818] i915 0000:00:05.0: [drm] GuC submission enabled
[    2.888824] i915 0000:00:05.0: [drm] GuC SLPC enabled
[    2.889500] i915 0000:00:05.0: [drm] GuC RC: enabled
[    2.952350] [drm] Initialized i915 1.6.0 20201103 for 0000:00:05.0 on minor 0
[    2.953113] i915 0000:00:05.0: [drm] Cannot find any crtc or sizes
[    2.953352] i915 0000:00:05.0: [drm] Cannot find any crtc or sizes

@eero-t

eero-t commented Sep 22, 2022

Is that all that you see when the error happens?

Btw. I'm assuming you're using KVM PCI pass-through.

Regarding this:

[ 2.807908] i915 0000:00:05.0: [drm] Incompatible option enable_guc=3 - HuC is not supported!

You should not be overriding the GuC option unless a driver maintainer asks you to do that (I'm not a driver maintainer, just another user).

HuC not being supported sounds suspicious though. Related kernel code is here:

But I do not see from that why it would not be supported.

@Quackdoc
Author

Correct, it is KVM PCI passthrough. I don't have the messages from the crash; that is a dmesg of a fresh boot.

enable_guc=3 is what the driver is choosing; the only overrides I am performing are root=/dev/vda2 console=ttyS0 i915.force_probe=56a5 and whatever Arch ships.

Although, after some more testing, I am getting a similar KVM crash when trying to copy from the card to... anything else.

For instance, running waypipe headless produces a similar crash, and DRI_PRIME sometimes crashes too, so I'm assuming the issues I'm having are not related to the VAAPI driver. I can't say for certain though, so I'm leaving the issue open for now. I plan on filing a bug report later, but have not yet had the time.

@eero-t

eero-t commented Sep 22, 2022

enable_guc=3 is what the driver is choosing

What do you mean by driver here:

  • i915 module defaults set by something (what?)?
  • i915 selecting that mode by itself?

If the latter, enabling HuC although it's not supported would make what the kernel is doing even more suspicious => file bug: https://gitlab.freedesktop.org/drm/intel/-/issues

@Quackdoc
Author

Quackdoc commented Sep 22, 2022

If latter, enabling HuC although it's not supported would make what the kernel is doing even more suspicious => file bug: https://gitlab.freedesktop.org/drm/intel/-/issues

Someone already brought it up... somewhere. I came across the report and it was acknowledged at least; I just shoved it off to the side of my mind since it seems to fail gracefully and (at least I hope) isn't causing any issue.

If I come across the issue again I'll chime in for sure though.

@XinfengZhang
Contributor

You could check /sys/kernel/debug/dri/0/i915_params/enable_guc; it will show the current enable_guc parameter.
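
For anyone unsure how to read the number in that file, here is a minimal sketch that decodes it. It assumes the bit meanings given in the i915 module parameter description (-1 = auto, bit 0 = GuC submission, bit 1 = HuC firmware load); the function name decode_enable_guc is made up for illustration and is not part of any tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: decode an i915 enable_guc value into plain words.
# Assumed bit meanings (from the i915 module parameter description):
#   -1 = auto, bit 0 (1) = GuC submission, bit 1 (2) = HuC firmware load.
decode_enable_guc() {
    local value=$1
    if [ "$value" -lt 0 ]; then
        echo "auto"
        return
    fi
    local guc="off" huc="off"
    if (( value & 1 )); then guc="on"; fi
    if (( value & 2 )); then huc="on"; fi
    echo "GuC submission: $guc, HuC load: $huc"
}

# Usage on a live system (debugfs normally needs root):
#   decode_enable_guc "$(sudo cat /sys/kernel/debug/dri/0/i915_params/enable_guc)"
```

With the enable_guc=3 value from the dmesg output earlier in the thread, this reports both GuC submission and HuC load as requested, which is why the "HuC is not supported" message looked odd.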

@Quackdoc
Author

It seems I'm still getting the VAAPI issue, but my Vulkan and other kernel issues seem to have cleared up now that I'm running stable Linux 6.0.

@eero-t

eero-t commented Oct 14, 2022

For better dGPU support than what's in upstream 6.x kernels, install kernel DKMS driver(s) as documented here: https://dgpu-docs.intel.com/installation-guides/index.html

@Quackdoc
Author

I guess I can try that if I can find a way to download it. So far I have tested Arch's zen kernel, Arch's regular kernel, and a clean build of https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux, and all present the same problem, both with Arch's libva and intel-media-driver packages and with builds from git.

@kkartaltepe

kkartaltepe commented Oct 17, 2022

I guess I can try that if I can find a way to download it. So far I have tested Arch's zen kernel, Arch's regular kernel, and a clean build of https://kernel.googlesource.com/pub/scm/linux/kernel/git/torvalds/linux, and all present the same problem, both with Arch's libva and intel-media-driver packages and with builds from git.

You won't be able to test this on Arch using any typical packages, since the only working kernel is, AFAIK, unreleased and the required patches have not yet all been submitted to or accepted by upstream (drm-next / torvalds). Intel instead provides DKMS-based i915 drivers built against various vendors' stable kernels. I have tried to package the Ubuntu OEM kernel mentioned above and the associated backported i915 driver here: https://github.com/kkartaltepe/intel-dg2-arch

However it seems that this kernel is not compatible with the latest stable mesa or my inexperienced packaging of linux has resulted in something terrible. When booting I see

$ journalctl -b-1 | grep i915
Oct 16 19:40:44 eldorado kernel: i915 0000:09:00.0: [drm] Bumping pre-emption timeout from 640 to 7500 on rcs'0.0 to allow slow compute pre-emption
Oct 16 19:40:44 eldorado kernel: i915 0000:09:00.0: [drm] Disabling pre-emption timeout to work around forced preemption for rcs'0.0
Oct 16 19:40:44 eldorado kernel: i915 0000:09:00.0: [drm] Bumping pre-emption timeout from 640 to 7500 on ccs'0.0 to allow slow compute pre-emption
Oct 16 19:40:44 eldorado kernel: i915 0000:09:00.0: [drm] Disabling pre-emption timeout to work around forced preemption for ccs'0.0
Oct 16 19:40:44 eldorado kernel: i915 0000:09:00.0: [drm] Bumping pre-emption timeout from 640 to 7500 on ccs'1.0 to allow slow compute pre-emption
Oct 16 19:40:44 eldorado kernel: i915 0000:09:00.0: [drm] Disabling pre-emption timeout to work around forced preemption for ccs'1.0
Oct 16 19:40:44 eldorado kernel: i915 0000:09:00.0: [drm] Bumping pre-emption timeout from 640 to 7500 on ccs'2.0 to allow slow compute pre-emption
Oct 16 19:40:44 eldorado kernel: i915 0000:09:00.0: [drm] Disabling pre-emption timeout to work around forced preemption for ccs'2.0
Oct 16 19:40:44 eldorado kernel: i915 0000:09:00.0: [drm] Bumping pre-emption timeout from 640 to 7500 on ccs'3.0 to allow slow compute pre-emption
Oct 16 19:40:44 eldorado kernel: i915 0000:09:00.0: [drm] Disabling pre-emption timeout to work around forced preemption for ccs'3.0
Oct 16 19:40:44 eldorado kernel: i915 0000:09:00.0: [drm] VT-d active for gfx access
Oct 16 19:40:44 eldorado kernel: fb0: switching to i915 from EFI VGA
Oct 16 19:40:44 eldorado kernel: i915 0000:09:00.0: vgaarb: deactivate vga console
Oct 16 19:40:44 eldorado kernel: i915 0000:09:00.0: [drm] Using Transparent Hugepages
Oct 16 19:40:44 eldorado kernel: i915 0000:09:00.0: [drm] Local memory available: 0x00000001fc000000
Oct 16 19:40:45 eldorado kernel: i915 0000:09:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=none
Oct 16 19:40:45 eldorado kernel: i915 0000:09:00.0: [drm] Finished loading DMC firmware i915/dg2_dmc_ver2_07.bin (v2.7)
Oct 16 19:40:45 eldorado kernel: i915 0000:09:00.0: [drm] GuC error state capture buffer maybe too small: 2097152 < 3737592 (min = 1245864)
Oct 16 19:40:45 eldorado kernel: i915 0000:09:00.0: [drm] GuC firmware i915/dg2_guc_70.2.0.bin version 70.2
Oct 16 19:40:45 eldorado kernel: i915 0000:09:00.0: [drm] HuC firmware i915/dg2_huc_7.10.3_gsc.bin version 7.10
Oct 16 19:40:45 eldorado kernel: i915 0000:09:00.0: [drm] GuC submission enabled
Oct 16 19:40:45 eldorado kernel: i915 0000:09:00.0: [drm] GuC SLPC enabled
Oct 16 19:40:45 eldorado kernel: i915 0000:09:00.0: [drm] GuC RC: enabled
Oct 16 19:40:45 eldorado kernel: [drm] Initialized i915 1.6.0 20201103 for 0000:09:00.0 on minor 0
Oct 16 19:40:45 eldorado kernel: snd_hda_intel 0000:0a:00.0: bound 0000:09:00.0 (ops i915_audio_component_bind_ops [i915])
Oct 16 19:40:45 eldorado kernel: fbcon: i915drmfb (fb0) is primary device
Oct 16 19:40:45 eldorado kernel: i915 0000:09:00.0: [drm] fb0: i915drmfb frame buffer device
Oct 16 19:40:45 eldorado kernel: Creating 4 MTD partitions on "i915.spi.2304":
Oct 16 19:40:45 eldorado kernel: 0x000000000000-0x000000001000 : "i915.spi.2304.DESCRIPTOR"
Oct 16 19:40:45 eldorado kernel: 0x000000001000-0x0000005f0000 : "i915.spi.2304.GSC"
Oct 16 19:40:45 eldorado kernel: 0x0000005f0000-0x0000007f0000 : "i915.spi.2304.OptionROM"
Oct 16 19:40:45 eldorado kernel: 0x0000007f0000-0x000000800000 : "i915.spi.2304.DAM"

Which seems correct except for Oct 16 19:40:45 eldorado kernel: i915 0000:09:00.0: [drm] GuC error state capture buffer maybe too small: 2097152 < 3737592 (min = 1245864).

However, any attempt to use Mesa from here results in a crash during screen creation in the iris driver (tested with Mesa git and with Mesa 22.2.1, the current Arch package, both as of a few days ago). This probably isn't the place for support with Mesa or the backports driver, but if anyone has any ideas I'd love to hear them.

--- edit
An example segfault in x11 when loading from modesetting: Driver for Modesetting Kernel Drivers: kms

[   486.454] (EE) Backtrace:    
[   486.454] (EE) 0: /usr/lib/Xorg (dri3_send_open_reply+0xdd) [0x562005e6bbad]    
[   486.455] (EE) 1: /usr/lib/libc.so.6 (__sigaction+0x50) [0x7fe55f61fa00]    
[   486.455] (EE) 2: /usr/lib/dri/iris_dri.so (nouveau_drm_screen_create+0x45fea0) [0x7fe55db2e5e0]    
[   486.455] (EE) 3: /usr/lib/dri/iris_dri.so (nouveau_drm_screen_create+0x4786e0) [0x7fe55db46e20]    
[   486.455] (EE) 4: /usr/lib/dri/iris_dri.so (nouveau_drm_screen_create+0x46356f) [0x7fe55db31caf]    
[   486.455] (EE) 5: /usr/lib/dri/iris_dri.so (nouveau_drm_screen_create+0x45d335) [0x7fe55db2ba75]    
[   486.455] (EE) unw_get_proc_name failed: no unwind info found [-10]    
[   486.455] (EE) 6: /usr/lib/dri/iris_dri.so (?+0x0) [0x7fe55ce4892b]    
[   486.455] (EE) 7: /usr/lib/dri/iris_dri.so (__driDriverGetExtensions_d3d12+0x61c37a) [0x7fe55d46526a]    
[   486.455] (EE) 8: /usr/lib/dri/iris_dri.so (__driDriverGetExtensions_d3d12+0x1af8) [0x7fe55ce4a9e8]    
[   486.455] (EE) 9: /usr/lib/dri/iris_dri.so (__driDriverGetExtensions_d3d12+0xa97f) [0x7fe55ce5386f]    
[   486.455] (EE) 10: /usr/lib/libgbm.so.1 (gbm_format_get_name+0x1006) [0x7fe55e97bb46]    
[   486.455] (EE) 11: /usr/lib/libgbm.so.1 (gbm_format_get_name+0x17d9) [0x7fe55e97c319]    
[   486.455] (EE) unw_get_proc_name failed: no unwind info found [-10]    
[   486.455] (EE) 12: /usr/lib/libgbm.so.1 (?+0x0) [0x7fe55e97a2dc]    
[   486.455] (EE) 13: /usr/lib/libgbm.so.1 (gbm_create_device+0x4a) [0x7fe55e97a42a]    
[   486.455] (EE) 14: /usr/lib/xorg/modules/libglamoregl.so (glamor_egl_init+0x65) [0x7fe55e8fb6b5]    
[   486.456] (EE) unw_get_proc_name failed: no unwind info found [-10]    
[   486.456] (EE) 15: /usr/lib/xorg/modules/drivers/modesetting_drv.so (?+0x0) [0x7fe55fbe9ee7]    
[   486.456] (EE) 16: /usr/lib/Xorg (InitOutput+0x18dd) [0x562005e88cad]    
[   486.456] (EE) 17: /usr/lib/Xorg (SProcXkbDispatch+0x17ec) [0x562005d4b037]    
[   486.456] (EE) 18: /usr/lib/libc.so.6 (__libc_init_first+0x90) [0x7fe55f60a290]    
[   486.456] (EE) 19: /usr/lib/libc.so.6 (__libc_start_main+0x8a) [0x7fe55f60a34a]    
[   486.456] (EE) 20: /usr/lib/Xorg (_start+0x25) [0x562005d4c475]    
[   486.456] (EE)     
[   486.456] (EE) Segmentation fault at address 0x14    
[   486.456] (EE)     
Fatal server error:    
[   486.456] (EE) Caught signal 11 (Segmentation fault). Server aborting    

@kkartaltepe

kkartaltepe commented Oct 18, 2022

Ah, I have discovered the answer to my question at least: the Ubuntu 22.04 backports have intentionally disabled the gem_create_ext ioctls that Mesa releases from the past year use, and this functionality isn't complete in the backports kernel even if you remove the disabling statements. Likely the Mesa releases on Ubuntu 22.04 don't have this relatively new feature, so backports simply disabled it here: https://github.com/intel-gpu/intel-gpu-i915-backports/blob/ubuntu/main/drivers/gpu/drm/i915/gem/i915_gem_create.c#L715

I would hope that the media-driver team could provide a reference to a compatible kernel that also works with modern Mesa, but perhaps there simply isn't one.

--- edit
Investigating the Ubuntu 22 repository Intel ships (oddly, they don't expose their sources in the repository), it appears Intel does have a specially patched Mesa, which may be what they ship on Ubuntu 22. Reference: https://github.com/intel-gpu/Mesa/blob/dg2-20221012

@Quackdoc
Author

Still haven't been able to get this to work. It's a real head-scratcher.

@kkartaltepe

h264 encode and decode and av1 decode worked for me on the appropriate backports driver/kernel/mesa/libva. However it seems intel has not upstreamed an ffmpeg implementation of av1 encode (and i could not find any ffmpeg forks from them) so I was unable to test that.

@Quackdoc
Author

Quackdoc commented Oct 20, 2022

Sadly I wasn't able to get backports to install on zen, vanilla, or Linux LTS. Here is the log if you happen to be interested; I'll look at it later, and I'll also try the kernel later too.
make.log

Thanks for the PKGBUILD, however. You probably want ffmpeg-cartwheel; here is a PKGBUILD for it. It should work if you just rename it, though you will probably need to tweak it a bit.

PKGBUILD.txt

@eero-t

eero-t commented Oct 20, 2022

h264 encode and decode and av1 decode worked for me on the appropriate backports driver/kernel/mesa/libva. However it seems intel has not upstreamed an ffmpeg implementation of av1 encode (and i could not find any ffmpeg forks from them) so I was unable to test that.

You could try "libvpl-tools" + "libmfxgen1" from Intel package repo. Its "sample_multi_transcode" tool seems to support AV1 encoding: https://github.com/oneapi-src/oneVPL/blob/master/tools/legacy/sample_multi_transcode/src/transcode_utils.cpp#L132

(oneVPL is the frontend/loader, "libmfxgen1" is the backend from: https://github.com/oneapi-src/oneVPL-intel-gpu)

@Quackdoc
Author

Cheers, managed to get it working using the Ubuntu kernel and DKMS. AV1 rarely seems to decode fast enough for some reason, though it is spotty. However, ffmpeg VAAPI encoding works great with AV1 when building ffmpeg cartwheel. I haven't tested oneVPL yet.

@kkartaltepe

kkartaltepe commented Oct 21, 2022

h264 encode and decode and av1 decode worked for me on the appropriate backports driver/kernel/mesa/libva. However it seems intel has not upstreamed an ffmpeg implementation of av1 encode (and i could not find any ffmpeg forks from them) so I was unable to test that.

You could try "libvpl-tools" + "libmfxgen1" from Intel package repo. Its "sample_multi_transcode" tool seems to support AV1 encoding: https://github.com/oneapi-src/oneVPL/blob/master/tools/legacy/sample_multi_transcode/src/transcode_utils.cpp#L132

(oneVPL is the frontend/loader, "libmfxgen1" is the backend from: https://github.com/oneapi-src/oneVPL-intel-gpu)

Thanks, I might try that if I cannot get the cartwheel patches running.

Cheers, managed to get it working using the Ubuntu kernel and DKMS. AV1 rarely seems to decode fast enough for some reason, though it is spotty. However, ffmpeg VAAPI encoding works great with AV1 when building ffmpeg cartwheel. I haven't tested oneVPL yet.

Awesome, I did not know about this repo. They do have an av1_vaapi patch staged: https://github.com/intel-media-ci/cartwheel-ffmpeg/blob/master/patches/0071-lavc-vaapi-support-av1-encode.patch

---edit

And building it looks good: I can get 400 fps AV1 encoding with CQP, though the CBR example appears broken on my machine, with the encoder failing after a few frames have been sent to it with Failed to map output buffers: 24 (internal encoding error).

@Quackdoc
Author

I'm getting the same encoding error with CBR and VBR.

@Sherry-Lin
Contributor

@Tianhaol could you take a look?

Sherry-Lin added the Encode and AV1 labels Nov 15, 2022
Jexu removed their assignment Nov 23, 2022
@nyanmisaka
Contributor

nyanmisaka commented Dec 8, 2022

I’ve got everything working on A380 (dec, enc, vpp) with upstream kernel 🎉

Prod/backport KMD is not required anymore if you run the latest drm-tip kernel, firmware, media-driver and oneVPL. Check with dmesg | grep i915 to make sure there’s no firmware issue and that GuC and HuC are enabled by default. Force probe is not required either.

GuC and HuC must be correctly loaded so that you can use the CBR and VBR rate controls. The current mainline kernel has an issue with DG2 firmware loading, so you need drm-tip.

# cat /sys/kernel/debug/dri/0/gt/uc/guc_info
# cat /sys/kernel/debug/dri/0/gt/uc/huc_info

I’m on Arch, so it’s trivial to build all these. Good luck ;)
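
As a rough sketch of the dmesg check described above, the helper below reads kernel log text on stdin and reports whether GuC and HuC firmware lines are present. The function name summarize_uc_firmware is hypothetical, and the patterns only match the message wording quoted in this thread; other kernel versions may word these lines differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper: summarise GuC/HuC firmware status from i915 log text.
# Patterns are taken from the dmesg lines quoted in this thread and are an
# assumption about the wording, not a stable kernel interface.
summarize_uc_firmware() {
    local log guc="missing" huc="missing"
    log=$(cat)
    if grep -q 'GuC firmware .* version' <<<"$log"; then guc="loaded"; fi
    if grep -q 'HuC firmware .* version' <<<"$log"; then huc="loaded"; fi
    if grep -q 'HuC is not supported' <<<"$log"; then huc="unsupported"; fi
    echo "GuC: $guc, HuC: $huc"
}

# Usage on a live system:
#   dmesg | grep i915 | summarize_uc_firmware
```

A result other than "GuC: loaded, HuC: loaded" would be a hint that CBR and VBR rate control may fail, per the comment above.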

@kkartaltepe

kkartaltepe commented Dec 8, 2022

I’ve got everything working on A380 (dec, enc, vpp) with upstream kernel 🎉

Confirming things look like they are working (including CBR now) on an A750 with drm-tip kernel and current upstream git heads for the related userspaces. Though it seems no av1 encode patches have been merged into ffmpeg yet so cartwheel is still needed there.

@nyanmisaka
Contributor

I’ve got everything working on A380 (dec, enc, vpp) with upstream kernel 🎉

Confirming things look like they are working (including CBR now) on an A750 with drm-tip kernel and current upstream git heads for the related userspaces. Though it seems no av1 encode patches have been merged into ffmpeg yet so cartwheel is still needed there.

https://github.com/FFmpeg/FFmpeg/blob/master/libavcodec/qsvenc_av1.c

QSV AV1 encoder was merged into the mainline in October.

@kkartaltepe

kkartaltepe commented Dec 8, 2022

QSV AV1 encoder was merged into the mainline in October.

Ah so it was, sorry we were just using vaapi instead which has not been merged. I do see the qsv enc/dec though.

@nyanmisaka
Contributor

That makes sense. But you can easily derive a QSV surface from a VAAPI surface: -vf hwmap=derive_device=qsv,format=qsv
since the underlying surface used by QSV is still VAAPI on Linux.
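
To show where that filter sits in a full command line, here is a sketch that only prints an ffmpeg invocation rather than running it. The -vf value is the one given above; the surrounding options (-hwaccel vaapi, -hwaccel_output_format vaapi, -c:v av1_qsv) are standard ffmpeg options chosen as an assumption about a typical VAAPI-decode to QSV-encode pipeline, and some builds may also need explicit -init_hw_device options. The function name and file names are placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (do not run) an ffmpeg command that decodes with
# VAAPI, maps the frames to QSV with the hwmap filter from this thread, and
# encodes with the QSV AV1 encoder. Paths containing spaces are not handled.
qsv_from_vaapi_cmd() {
    local input=$1 output=$2
    printf '%s\n' "ffmpeg -hwaccel vaapi -hwaccel_output_format vaapi -i $input -vf hwmap=derive_device=qsv,format=qsv -c:v av1_qsv $output"
}

# Usage: inspect the line first, then paste it into a shell if it looks right.
#   qsv_from_vaapi_cmd input.mkv output.mkv
```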

@zlice

zlice commented Dec 16, 2022

To those referring to "latest" media-driver, do you mean release or git versions? I am unable to get my A770 to do vaapi/qsv and git media-driver builds bomb out on a patch from a month ago that added METEORLAKE in.

@nyanmisaka
Contributor

nyanmisaka commented Dec 16, 2022

22.6.4 or newer works for me. 22.5.x doesn't work.

Edit: 22.6.5 and 22.6.6 have a regression on DG2, use 22.6.4 or 23.1.0+.
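
The version advice above can be written as a small check. This is only a sketch of the rule as stated in this comment (22.6.4 exactly, or 23.1.0 and newer); the function names are made up, and the comparison relies on GNU sort -V being available.

```shell
#!/usr/bin/env bash
# Return 0 if version $1 is greater than or equal to version $2.
# Assumes GNU coreutils sort with -V (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# Hypothetical helper: apply the rule from this thread for DG2:
# intel-media-driver 22.6.4 works, 23.1.0 and newer work, anything else is
# treated as not known to work (including the faulty 22.6.5 and 22.6.6).
media_driver_ok_for_dg2() {
    [ "$1" = "22.6.4" ] || version_ge "$1" 23.1.0
}

# Usage:
#   media_driver_ok_for_dg2 22.6.6 && echo ok || echo "not known to work"
```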

@eero-t

eero-t commented Dec 19, 2022

Can this be closed now?

If not, at least the bug title should be changed to reflect what exactly does not work, and on which HW with which kernel version...

@kode54

kode54 commented Dec 26, 2022

Not working on my A750. linux-tkg-pds 6.1.1-273. CONFIG_DRM_I915_CAPTURE_ERROR=n per Arch wiki, which had no explanation about why I need that setting. intel-media-driver 22.6.4-1 from Arch repos.

❯ mpv --hwdec=auto Japan\ in\ 8K\ \[m1jY2VLCRmY\].webm 
 (+) Video --vid=1 (*) (av1 7680x4320 23.976fps)
 (+) Audio --aid=1 --alang=eng (*) (opus 2ch 48000Hz)
[vo/gpu/wayland] GNOME's wayland compositor lacks support for the idle inhibit protocol. This means the screen can blank during playback.
Cannot load libcuda.so.1
[ffmpeg/video] av1: Failed to end picture decode issue: 23 (internal decoding error).
[ffmpeg/video] av1: HW accel end frame fail.
Error while decoding frame (hardware decoding)!
[ffmpeg/video] av1: Failed to end picture decode issue: 23 (internal decoding error).
[ffmpeg/video] av1: HW accel end frame fail.
Error while decoding frame (hardware decoding)!
[ffmpeg/video] av1: Failed to end picture decode issue: 23 (internal decoding error).
[ffmpeg/video] av1: HW accel end frame fail.
Error while decoding frame (hardware decoding)!
AO: [pipewire] 48000Hz stereo 2ch floatp
VO: [gpu] 7680x4320 yuv420p
AV: 00:00:17 / 00:02:21 (12%) A-V:  0.000

Exiting... (Quit)

It does appear to be using CPU decoding.

@nyanmisaka
Contributor

Linux 6.1 doesn’t support Arc.

@kode54

kode54 commented Dec 27, 2022

What do you mean? It detects and loads the i915 device for it, and loads most of the firmware. What, do I need some sort of AUR packaged drm-next kernel?

Disregard this, I’ve switched to Windows 11 anyway.

E: oh, proprietary OEM kernels and gpu drivers. May as well just run windows instead.

@kkartaltepe

What, do I need some sort of AUR packaged drm-next kernel?

E: oh, proprietary OEM kernels and gpu drivers. May as well just run windows instead.

Yes, we reported our results with the so-called drm-tip kernel repository and you were using a different one. Since changes in drm-tip typically make it into the next upstream linux release, one might say your kernel was too old.

It has nothing to do with proprietary kernels and GPU drivers, just that you were using the wrong kernel. Intel also distributed sources that allowed you to enable this functionality before upstream Linux accepted Intel's driver updates, if you wanted to build the kernel and userspace yourself.

@kode54

kode54 commented Dec 27, 2022

Sorry for being rude. I was just disappointed with what I experienced, and didn't really understand that even the current stable release was missing vital changes that were still in drm-next. If there's ever a next time, I'll be sure to run that until the API stabilizes.

@eero-t

eero-t commented Dec 27, 2022

@kode54 Phoronix news site is pretty good at reporting Linux driver news from different vendors.

For example, it reported that full Intel dGPU support was accepted upstream only in Linux v6.2-rc (i.e. a few days ago): https://www.phoronix.com/news/Linux-6.2-rc1-Released

And it has tested and tracked that driver support before it was in upstream (using drivers from Intel's public source / package repositories).
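
A quick way to tell whether a running kernel is at least the release mentioned above is to compare its version string. This is a sketch only: the function name is invented, it looks at nothing but the major.minor numbers, and it cannot tell whether an older-numbered kernel (a drm-tip build, or one with the DKMS backports) already carries the needed patches.

```shell
#!/usr/bin/env bash
# Hypothetical helper: return 0 if a kernel release string is 6.2 or newer.
# Only the leading major.minor numbers are inspected.
kernel_at_least_6_2() {
    local release=$1 major minor
    major=${release%%.*}          # text before the first dot
    minor=${release#*.}           # text after the first dot
    minor=${minor%%[!0-9]*}       # keep only the leading digits
    (( major > 6 || (major == 6 && minor >= 2) ))
}

# Usage:
#   kernel_at_least_6_2 "$(uname -r)" && echo "6.2+" || echo "older than 6.2"
```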

@kode54

kode54 commented Dec 27, 2022

Thank you very much for informing me. I will cease bumping this issue after this, as my initial post was so much noise from using an incomplete DRM implementation. If I install Linux for desktop use again on this machine, I will be sure to wait for 6.2, or use a 6.2 RC kernel, as my preferred custom kernel already supports building the 6.2 RC.

@Quackdoc
Author

Quackdoc commented Jan 5, 2023

Sorry for taking so long; I had this on the back burner. It's working for me on kernel 6.2-rc2, so I'll go ahead and close this for now. Thanks!

Quackdoc closed this as completed Jan 5, 2023
@kode54

kode54 commented Jan 21, 2023

Please reopen, 22.6.6 is also broken on DG2. Currently running linux-drm-next-git 6.2.0-rc2-2-drm-next-git-00684-g0b45ac1170ea. 22.6.4 works, though.

@nyanmisaka
Contributor

nyanmisaka commented Jan 21, 2023

Please reopen, 22.6.6 is also broken on DG2. Currently running linux-drm-next-git 6.2.0-rc2-2-drm-next-git-00684-g0b45ac1170ea. 22.6.4 works, though.

No need to reopen. 22.6.5 and 22.6.6 are two faulty versions for DG2.

The latest release is 23.1.0:
https://github.com/intel/media-driver/releases/tag/intel-media-23.1.0

@kode54

kode54 commented Jan 21, 2023

I had no idea v23 was even a thing yet.

Edit: Oh, right, the version numbers are tied to the year.
