AMD Ryzen 7 5700G with Radeon Graphics does not work . #118

Closed
takawata opened this issue Oct 7, 2021 · 50 comments
Labels
amdgpu amdgpu related problems enhancement New feature or request

Comments

@takawata

takawata commented Oct 7, 2021

Describe the bug
The AMD Ryzen 7 5700G with Radeon Graphics is not recognized by amdgpu.ko.

FreeBSD version
FreeBSD akatsuki 14.0-CURRENT FreeBSD 14.0-CURRENT #0 main-n249842-42dfad2ef12: Tue Oct 5 01:06:52 JST 2021 takawata@akatsuki:/usr/obj/usr/home/takawata/projects/freebsd/src/amd64.amd64/sys/CORONEL amd64

PCI Info

pciconf -lv

vgapci0@pci0:11:0:0: class=0x030000 rev=0xc8 hdr=0x00 vendor=0x1002 device=0x1638 subvendor=0x1043 subdevice=0x8809
vendor = 'Advanced Micro Devices, Inc. [AMD/ATI]'
device = 'Cezanne'
class = display
subclass = VGA

DRM KMOD version
5.6-wip

To Reproduce
Loading amdgpu.ko

Additional context
Adding the following entry to the PCI ID table

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index 008ce265c..e016245ad 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1019,6 +1019,7 @@ static const struct pci_device_id pciidlist[] = {
 
        /* Renoir */
        {0x1002, 0x1636, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_RENOIR|AMD_IS_APU},
+       {0x1002, 0x1638, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_RENOIR|AMD_IS_APU}, 
 
        /* Navi12 */
        {0x1002, 0x7360, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVI12|AMD_EXP_HW_SUPPORT},

and putting the following in /usr/local/etc/X11/xorg.conf.d/amdgpu.conf

Section "Device"
        Identifier "Card0"
        Driver "amdgpu"
EndSection

solves the problem.


[ 52319.668] (II) AMDGPU: Driver for AMD Radeon:
        All GPUs supported by the amdgpu kernel driver
[ 52319.668] (--) Using syscons driver with X support (version 2.0)
[ 52319.668] (--) using VT number 9

[ 52319.673] (II) AMDGPU(0): [KMS] Kernel modesetting enabled.
[ 52319.678] (WW) VGA arbiter: cannot open kernel arbiter, no multi-card support
[ 52319.678] (II) AMDGPU(0): Creating default Display subsection in Screen section
        "Default Screen Section" for depth/fbbpp 24/32
[ 52319.678] (==) AMDGPU(0): Depth 24, (--) framebuffer bpp 32
[ 52319.678] (II) AMDGPU(0): Pixel depth = 24 bits stored in 4 bytes (32 bpp pixmaps)
[ 52319.678] (==) AMDGPU(0): Default visual is TrueColor
[ 52319.678] (==) AMDGPU(0): RGB weight 888
[ 52319.678] (II) AMDGPU(0): Using 8 bits per RGB (8 bit DAC)
[ 52319.678] (--) AMDGPU(0): Chipset: "Unknown AMD Radeon GPU" (ChipID = 0x1638)
[ 52319.678] (II) Loading sub module "fb"
[ 52319.678] (II) LoadModule: "fb"


This is effectively a backport from Linux.
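
For readers wondering why a one-line table entry is enough to make the driver attach: a PCI driver typically walks a table of IDs and claims the first entry that matches the device, with PCI_ANY_ID acting as a wildcard for the subvendor and subdevice fields. Below is a minimal userland sketch of that idea, not the actual kernel matching code. All `sketch_` names, the struct layout, and the flag values are hypothetical and chosen only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Wildcard for subvendor/subdevice (plays the role of PCI_ANY_ID). */
#define SKETCH_PCI_ANY_ID   0xffffffffu

/* Illustrative flag values only; the real driver defines its own. */
#define SKETCH_CHIP_RENOIR  0x1UL
#define SKETCH_IS_APU       0x100UL

/* Hypothetical, simplified stand-in for struct pci_device_id. */
struct sketch_pci_id {
	uint32_t vendor, device, subvendor, subdevice;
	unsigned long driver_data;
};

static const struct sketch_pci_id sketch_idlist[] = {
	/* Renoir */
	{0x1002, 0x1636, SKETCH_PCI_ANY_ID, SKETCH_PCI_ANY_ID,
	    SKETCH_CHIP_RENOIR | SKETCH_IS_APU},
	/* The entry added by the patch in this issue */
	{0x1002, 0x1638, SKETCH_PCI_ANY_ID, SKETCH_PCI_ANY_ID,
	    SKETCH_CHIP_RENOIR | SKETCH_IS_APU},
	{0, 0, 0, 0, 0}	/* terminator */
};

/* Return the first table entry matching the device, or NULL if none does. */
static const struct sketch_pci_id *
sketch_match(const struct sketch_pci_id *tbl, uint32_t vendor,
    uint32_t device, uint32_t subvendor, uint32_t subdevice)
{
	for (; tbl->vendor != 0; tbl++) {
		if (tbl->vendor != vendor || tbl->device != device)
			continue;
		if (tbl->subvendor != SKETCH_PCI_ANY_ID &&
		    tbl->subvendor != subvendor)
			continue;
		if (tbl->subdevice != SKETCH_PCI_ANY_ID &&
		    tbl->subdevice != subdevice)
			continue;
		return tbl;
	}
	return NULL;
}
```

With the values from the pciconf output above, `sketch_match(sketch_idlist, 0x1002, 0x1638, 0x1043, 0x8809)` finds the new entry, whereas a device ID that is absent from the table yields NULL, which corresponds to the driver never attaching.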

@evadot
Contributor

evadot commented Oct 8, 2021

The upstream fix (commit 8bf0835132c19) was released in 5.12, but I guess we can backport it if this is the only change needed.
Can you test applying this to the current master branch to be sure that everything still works correctly?
Thanks.

@evadot evadot added amdgpu amdgpu related problems enhancement New feature or request labels Oct 8, 2021
@takawata
Author

takawata commented Oct 8, 2021

Looking into it more deeply, this patch is insufficient for this chip.
It needs to load different firmware for VCN (amdgpu/green_sardine_vcn.bin), which is not included in our firmware package.

@khbsd

khbsd commented Oct 10, 2021

I've got the 5700U. I'll try this tomorrow with the IDs changed for mine.

@khbsd

khbsd commented Oct 12, 2021

Cloned from git, modified the file for the patch, recompiled, and installed. I also added the section to amdgpu.conf.

Now, it seems like it recognizes my card, but if I try startx I get the attached kernel panic.

IMG_20211011_204324.jpg

@valpackett
Contributor

Interesting, are you on the current master branch here? Also appropriate main base?

There was a recent FPU context issue for dcn1 but I thought dcn21 was confirmed working on 5.6. The dcn21_validate_bandwidth[_fp] codepath looks fine…

@khbsd

khbsd commented Oct 12, 2021

Yeah, I got this after updating my FreeBSD CURRENT a few minutes prior. I had also cloned this git repo about an hour and a half before that.

Edit: I can attach my modified file as well, to make sure I did it right.

@valpackett
Contributor

Hmm either we didn't actually make sure dcn21 doesn't have fpu context problems, or this FPU context error is somehow obscuring a real error (by hitting the cpu exception before the error reporting code runs) that would mean something from this hardware is not yet actually supported…

Please try something like:

diff --git i/drivers/gpu/drm/amd/display/dc/os_types.h w/drivers/gpu/drm/amd/display/dc/os_types.h
index c34eba1986..1eeff0bf3e 100644
--- i/drivers/gpu/drm/amd/display/dc/os_types.h
+++ w/drivers/gpu/drm/amd/display/dc/os_types.h
@@ -53,8 +53,8 @@
 #if defined(CONFIG_DRM_AMD_DC_DCN)
 #if defined(CONFIG_X86)
 #include <asm/fpu/api.h>
-#define DC_FP_START() kernel_fpu_begin()
-#define DC_FP_END() kernel_fpu_end()
+#define DC_FP_START() { printf("+fpu %s (%s:%d)\n", __func__, __FILE__, __LINE__); kernel_fpu_begin(); }
+#define DC_FP_END() { printf("-fpu %s (%s:%d)\n", __func__, __FILE__, __LINE__); kernel_fpu_end(); }
 #elif defined(CONFIG_PPC64)
 #include <asm/switch_to.h>
 #include <asm/cputable.h>
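
A side note on the macro shape used in the debugging patch above: a bare `{ ... }` body works when the macro is used as a plain statement, but it fails to compile when it appears as the unbraced branch of an `if` that has an `else`. The usual C idiom is to wrap the body in `do { ... } while (0)`. The following is a userland sketch of the same tracing idea under that idiom. The `sketch_` names and the depth counter are hypothetical stand-ins, not the kernel's `kernel_fpu_begin()`/`kernel_fpu_end()`.

```c
#include <stdio.h>

static int sketch_fpu_depth;	/* begin() calls not yet matched by end() */
static int sketch_fpu_nested;	/* set if begin() ran while already inside */

/* Hypothetical stand-ins for kernel_fpu_begin() and kernel_fpu_end(). */
static void
sketch_fpu_begin(void)
{
	if (sketch_fpu_depth++ > 0)
		sketch_fpu_nested = 1;
}

static void
sketch_fpu_end(void)
{
	sketch_fpu_depth--;
}

/* do { } while (0) makes each macro behave like a single statement. */
#define SKETCH_FP_START() do {						\
	printf("+fpu %s (%s:%d)\n", __func__, __FILE__, __LINE__);	\
	sketch_fpu_begin();						\
} while (0)

#define SKETCH_FP_END() do {						\
	printf("-fpu %s (%s:%d)\n", __func__, __FILE__, __LINE__);	\
	sketch_fpu_end();						\
} while (0)

static int
sketch_validate(int use_fp)
{
	if (use_fp)
		SKETCH_FP_START();
	else
		return -1;	/* this else would be orphaned by a bare { } body */

	int result = 42;	/* placeholder for the floating point work */

	SKETCH_FP_END();
	return result;
}
```

The nesting flag is an optional extra: it records whether a second section was opened before the first one closed, which is the pattern the trace output is meant to expose.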

@valpackett
Contributor

@bluedudexoo Here's the upstream patch adding 0x164c — note that it also adds it to a condition that sets AMD_APU_IS_RENOIR!

There are no similar changes for 0x1638.
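
To tie this to the firmware remark earlier in the thread, here is a rough sketch of how a device ID could select between the plain Renoir and Green Sardine variants, and how that choice could then pick the VCN firmware file. This is an illustration under assumptions, not a copy of the upstream code. The `sketch_` names are hypothetical, the rule "0x1636 and 0x164c are plain Renoir, anything else on this path is Green Sardine" is inferred from the discussion here, and `amdgpu/renoir_vcn.bin` is assumed to be the plain Renoir counterpart of the `amdgpu/green_sardine_vcn.bin` file mentioned above.

```c
#include <stdint.h>

/* Hypothetical classification of Renoir-family APUs. */
enum sketch_apu_flavor {
	SKETCH_APU_RENOIR,
	SKETCH_APU_GREEN_SARDINE
};

/*
 * Assumption: 0x1636 and 0x164c are treated as plain Renoir; every other
 * device ID reaching this code path is treated as Green Sardine.
 */
static enum sketch_apu_flavor
sketch_classify(uint16_t device_id)
{
	if (device_id == 0x1636 || device_id == 0x164c)
		return SKETCH_APU_RENOIR;
	return SKETCH_APU_GREEN_SARDINE;
}

/* Pick the VCN firmware file name for a given flavor. */
static const char *
sketch_vcn_firmware(enum sketch_apu_flavor flavor)
{
	if (flavor == SKETCH_APU_RENOIR)
		return "amdgpu/renoir_vcn.bin";	/* assumed file name */
	return "amdgpu/green_sardine_vcn.bin";
}
```

Under these assumptions, 0x1638 would be classified as Green Sardine and ask for the Green Sardine VCN firmware, which matches the earlier observation that the table entry alone is not the whole story for this chip.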

@khbsd

khbsd commented Oct 12, 2021

Fascinating! My C programming skills aren't the best, but I can dive in and learn a bit. Let me apply that debugging patch you suggested and check the results.

@khbsd

khbsd commented Oct 12, 2021

Okay! Recompiled with the debug lines in there, and this is the error. I'll eventually get a repo set up where I can just push the tar dumps to it lol.

IMG_20211012_131638.jpg

IMG_20211012_131141.jpg

@valpackett
Contributor

Is this from the beginning of the fpu lines? The panic itself is the same, that's not interesting, only the prints are.

It's very odd that there's a jump from 1141 back to 1197. It's as if multiple validate_bandwidth calls were somehow running interleaved (?!).

I'm not sure how this could happen but to be sure, try to manually kldload amdgpu rather than relying on startx auto-loading it.

@khbsd

khbsd commented Oct 12, 2021

I included the rest since it was technically an update (re-cloned from repo.)

I have it loading in with kld_list="amdgpu" in rc.conf. Do you want me to comment that out and kldload it after reboot?

Edit: another piece of the puzzle, possibly a solution? I used startx after commenting out my kld_list entry. I no longer get a kernel panic, so maybe Xorg was trying to load two instances of the driver?

@valpackett
Contributor

> I have it loading in with kld_list="amdgpu" in rc.conf

Oh that's fine (if it always loads, that is).

> I no longer get a kernel panic

But do you actually have it working?


I'd like to see a trace of where the second dcn21_validate_bandwidth call comes from that is supposedly somehow before the first dcn21_validate_bandwidth_fp returns…

diff --git i/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c w/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c
index 66f137397c..76fdfd9226 100644
--- i/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c
+++ w/drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c
@@ -1198,6 +1198,7 @@ bool dcn21_validate_bandwidth(struct dc *dc, struct dc_state *context,
 		bool fast_validate)
 {
 	bool voltage_supported;
+	printf("---%s\n", __func__); dump_stack();
 	DC_FP_START();
 	voltage_supported = dcn21_validate_bandwidth_fp(dc, context, fast_validate);
 	DC_FP_END();

@khbsd

khbsd commented Oct 12, 2021

No, unfortunately. If I specify the BusID in my 10-amdgpu.conf file, I get a no screens found error.

Okay, let me get that up and running 😁

@khbsd

khbsd commented Oct 14, 2021

Okay! Appreciate your patience- got the repo up and running. This is my current textdump for the panic.

@khbsd

khbsd commented Oct 14, 2021

I'm also not sure how much it helps, but it seems OpenBSD added support for this APU with their latest update. I know there's not much binary compatibility, but maybe I could help beta test some implementations?

@valpackett
Contributor

OHH. The interleaved call is because the system has already started a panic → vt postswitch → a new drm_atomic_commit for the terminal. Okay. I guess a real panic is obscured by the one coming from the fpu context reorder?
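
To make the interleaving concrete, here is a small userland model of that sequence: a fault inside the FP section starts a panic, the panic path switches to the console, and the console switch triggers a new commit that enters the validation function again before the first call has left its FP section. Everything in it is a hypothetical stand-in (the `sketch_` names, the depth counter, the `fault` parameter). It models only the call order, not the real drm or vt code.

```c
#include <stdio.h>

static int sketch_depth;	/* FP sections currently open */
static int sketch_max_depth;	/* deepest nesting observed */
static int sketch_panicking;	/* only the first panic is handled */

static void sketch_validate_bandwidth(int fault);

/* Stand-in for panic -> console switch -> new mode commit. */
static void
sketch_panic(void)
{
	if (sketch_panicking)
		return;
	sketch_panicking = 1;
	printf("panic: switching to console\n");
	sketch_validate_bandwidth(0);	/* re-enters validation */
}

/* Stand-in for the validation function wrapped in FP begin/end. */
static void
sketch_validate_bandwidth(int fault)
{
	printf("+fpu depth=%d\n", sketch_depth);
	sketch_depth++;
	if (sketch_depth > sketch_max_depth)
		sketch_max_depth = sketch_depth;

	if (fault)
		sketch_panic();	/* the fault happens inside the FP section */

	sketch_depth--;
	printf("-fpu depth=%d\n", sketch_depth);
}
```

Calling `sketch_validate_bandwidth(1)` prints two `+fpu` lines before the first `-fpu` line, which is the same shape as the trace in the photos: the second entry is not a concurrent caller but the panic path itself.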

Try reverting c937a405bdce2fd1 in base and rebuilding the kernel. (Funnily enough, it's a patch from me that ensures a panic is visible even when using a GUI, I didn't really expect that to happen… to be fair, we're testing unsupported hardware here.)


OpenBSD's drm port is completely independent. Sadly we're behind now, they're at 5.10…

@khbsd

khbsd commented Oct 14, 2021

Just so I understand, you mean the commit to the FreeBSD source? Or the drm-kmod source?

And ahh, that's about what I figured. Oh well. 😁

Edit: figured it out! Building now.

@khbsd

khbsd commented Oct 14, 2021

New dump here!

@valpackett
Contributor

Well that changed nothing. Let's just block that code path instead:

diff --git i/drivers/gpu/drm/linux_fb.c w/drivers/gpu/drm/linux_fb.c
index 2cf264bf86..a311102252 100644
--- i/drivers/gpu/drm/linux_fb.c
+++ w/drivers/gpu/drm/linux_fb.c
@@ -127,7 +127,7 @@ vt_kms_postswitch(void *arg)
 			DRM_DEBUG("fb helper is null!\n");
 			return -1;
 		}
-		drm_fb_helper_restore_fbdev_mode_unlocked(sc->fb_helper);
+		// drm_fb_helper_restore_fbdev_mode_unlocked(sc->fb_helper);
 	}
 	return (0);
 }

@khbsd

khbsd commented Oct 15, 2021

Looks like there are fewer fpu debugging messages?

@thesunexpress

Hey guys, got a bit of a funfusing conundrum going on here. I've got an AMD 5600G box running 14-CURRENT, cloned the 5.7-wip repo, and built it with the minor mod to amdgpu_drv.c as noted by OP. Everything built cleanly and kldload-ed without complaint, resulting in the Renoir firmware being autoloaded. With x11-drivers/xf86-video-amdgpu built and installed as well, I'm rewarded with a lovely X11 session that I haven't been able to crash just yet. Though I use a minimalist DE with picom compositing (because reasons), everything seems to work rather smoothly; it compares like-for-like to an Arch Linux box with all the bloated trimmings enabled. Keeping an eye on what both OSes report and on actual power consumption at the power outlet indicates both work as intended.
How & why is this possible, when the iGPU on this APU isn't yet supported? Crucially, without the 5.7-wip bits installed, I get no love. With and without 5.6-stable bits installed, I get no love. With and without graphics/drm-devel-kmod bits installed, I get no love. Thus, with 5.7-wip I get all the AMDGPU DRM happy times, whilst 5.6-stable & 5.5.19 in ports are a bag of meth-infused horny ferrets. Side-note: Vulkan caps reports are nearly identical between the BSD setup & Arch (with Green Sardine firmware) setup. Really confusing! Any pointers would be appreciated.

@valpackett
Contributor

> How & why is this possible, when the iGPU on this APU isn't yet supported?

Well, you've added the PCI ID ("minor mod to amdgpu_drv.c") so you've enough "support" :)

Green Sardine is a very minor revision of Renoir, it's not that surprising that treating it the same as original Renoir "just works". Upstream 5.11 where support is properly added does also add a bunch of conditionals deep inside various things but evidently they're not necessary for it to work.

@khbsd

khbsd commented Dec 2, 2021

Is main the same as 5.7?

@wulf7
Contributor

wulf7 commented Dec 2, 2021

No. Main is 5.6.0 now

@khbsd

khbsd commented Dec 2, 2021

Ah, I was wondering, because 5.7 doesn't build. When was it updated?

@wulf7
Contributor

wulf7 commented Dec 2, 2021

It was last updated 2 days ago. Building it requires a recent CURRENT (clang 13).

@khbsd

khbsd commented Dec 2, 2021

I'll update and try again😊

@thesunexpress

thesunexpress commented Dec 8, 2021

FYI, a recent update to base src broke 5.7-wip, at least in my case. Haven't been able to get it to work again. However, surprise surprise, 5.6-stable (with modification to amdgpu_drv.c !) suddenly started working instead. I dug through the base src commits back to about ~8 days before 5.7-wip stopped working, but nothing in those commits appeared to have anything to do at all with drm-kmod. A buildworld & 5.6-stable for whatever reason jived.
I could never get a trace on what caused the error. Console would just suddenly go blank during boot & subsequently the computer would just power off.... which is more than what a simple kernel panic would do normally. Snouting around in logs revealed nothing much useful; almost seemed like a normal 'shutdown -p now' would be issued. Perhaps attaching a serial link to console could have indicated something on console, but that's history now.
I'll keep tinkering with 5.6-stable & 5.7-wip (if anything new is submitted) and see where things go...

@KYTAZ0

KYTAZ0 commented Feb 1, 2022

Hello, is there any chance for news on the issue in the near term?

@101313

101313 commented Feb 7, 2022

> Hello, is there any chance for news on the issue in the near term?

I guess they're taking a break or w/e

@wulf7
Contributor

wulf7 commented Feb 11, 2022

> Hello, is there any chance for news on the issue in the near term?

0x1002, 0x164C card support was added to 5.7-stable branch in commit 5eb4816 as it appeared to be plain Renoir and was reported to work.

If anyone has success with other VID/PIDs, they will be added too. Otherwise, you will have to wait for 5.10.

@thesunexpress

Please add:

`{0x1002, 0x1638, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_RENOIR|AMD_IS_APU}`

It is fully functional & I haven't been able to get it to break or crash for over 2 months now.
Provides support for AMD 5600G (5700G possibly too?).

wulf7 added a commit to wulf7/drm-kmod that referenced this issue Feb 13, 2022
@wulf7
Contributor

wulf7 commented Feb 14, 2022

> Please add:
>
> `{0x1002, 0x1638, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_RENOIR|AMD_IS_APU}`
>
> It is fully functional & I haven't been able to get it to break or crash for over 2 months now. Provides support for AMD 5600G (5700G possibly too?).

Included in drm-devel-kmod-5.7.19.g20220213

@thesunexpress

Thank you kindly. I'll keep testing. There may be a 5700G coming across my bench in the near future & will report back if anything curious pops up. Excellent work guys!

wulf7 added a commit to wulf7/drm-kmod that referenced this issue Feb 23, 2022
@rozhuk-im

rozhuk-im commented Mar 1, 2022

Is there any chance that it will work with FreeBSD 13?

UPD: works fine with graphics/drm-devel-kmod

@pointcheck

It does not work for me. I upgraded to FreeBSD 13.1 and was able to successfully build graphics/drm-devel-kmod (it's drm-kmod 5.7.19). The amdgpu module loads successfully, but Xorg crashes as soon as it gets access to the driver:

root@butterfly:/usr/ports/graphics/drm-devel-kmod # X

X.Org X Server 1.20.13
X Protocol Version 11, Revision 0
Build Operating System: FreeBSD 13.0-RELEASE-p7 amd64 
Current Operating System: FreeBSD butterfly 13.1-BETA1 FreeBSD 13.1-BETA1 #0 releng/13.1-n249974-ad329796bdb: Thu Mar 10 02:30:25 UTC 2022     root@releng3.nyi.freebsd.org:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64
Build Date: 12 March 2022  04:39:35AM
 
Current version of pixman: 0.40.0
	Before reporting problems, check http://wiki.x.org
	to make sure that you have the latest version.
Markers: (--) probed, (**) from config file, (==) default setting,
	(++) from command line, (!!) notice, (II) informational,
	(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(==) Log file: "/var/log/Xorg.0.log", Time: Sun Mar 13 02:53:24 2022
(==) Using config file: "/etc/X11/xorg.conf"
(==) Using system config directory "/usr/local/share/X11/xorg.conf.d"
(II) AMDGPU(0): [KMS] Kernel modesetting enabled.
amdgpu: os_same_file_description couldn't determine if two DRM fds reference the same file description.
If they do, bad things may happen!
Assertion failed: (key->initialized), function dixGetPrivateAddr, file /usr/local/include/xorg/privates.h, line 121.
(EE) 
(EE) Backtrace:
(EE) 0: /usr/local/bin/Xorg (OsInit+0x38a) [0x41e27a]
(EE) unw_get_proc_name failed: no unwind info found [-10]
(EE) 1: /lib/libthr.so.3 (?+0x0) [0x80093358e]
(EE) unw_get_proc_name failed: no unwind info found [-10]
(EE) 2: /lib/libthr.so.3 (?+0x0) [0x800932b3f]
(EE) 3: ? (?+0x0) [0x7ffffffff8a3]
(EE) unw_get_proc_name failed: no unwind info found [-10]
(EE) 4: /lib/libc.so.7 (?+0x0) [0x800a7e2ca]
(EE) unw_get_proc_name failed: no unwind info found [-10]
(EE) 5: /lib/libthr.so.3 (?+0x0) [0x800932a00]
(EE) 6: ? (?+0x0) [0x0]
(EE) unw_get_proc_name failed: no unwind info found [-10]
(EE) 7: /lib/libc.so.7 (?+0x0) [0x8009f6c74]
(EE) unw_step failed: unspecified (general) error [-1]
(EE) 
(EE) 
Fatal server error:
(EE) Caught signal 6 (Abort trap). Server aborting

And the driver has this weird report in dmesg:

amdgpu: os_same_file_description couldn't determine if two DRM fds reference the same file description.
If they do, bad things may happen!

@rozhuk-im

@pointcheck you probably need to uninstall the nvidia driver.

@pointcheck

I have just tried with nvidia.ko and nvidia-modeset.ko unloaded, with pretty much the same effect! It looks more like an Xorg issue: it behaves (crashes) the same way with both AMD and nVidia cards/drivers.

The only way I can run Xorg now is through the modesetting or scfb xf86 drivers, which are, as expected, too slow.

@rozhuk-im

rozhuk-im commented Mar 12, 2022

I suspect that Xorg may be loading the nvidia drivers again (you can check with kldstat), or that there are some other side effects while the driver is installed.

@pointcheck

It does not. It's rather a bug in Xorg.

@fellmoon

> It does not. It's rather a bug in Xorg.

Have you been able to solve this issue?
I gave up in frustration 2 months ago and currently run OpenBSD on a Cezanne laptop, but I'm not really happy with OpenBSD and would love to go back to FreeBSD...

@ivan-volnov

@fellmoon, #155 is my story with Cezanne. It works pretty well, but with Wayland.

@pointcheck

> I gave up in frustration 2 months ago and currently run OpenBSD on a Cezanne laptop, but I'm not really happy with OpenBSD and would love to go back to FreeBSD...

What X server do you use on OpenBSD?

I was not able to solve this Xorg crash issue. I am currently using Xorg with the modesetting driver. It works, but there's no GL/EGL, and it is slow and frustrating. It seems that Xorg is now completely abandoned (in favour of Wayland) and no one wants to fix bugs in it any more. :-( I am nearly in a position to dive into the Xorg code myself.

@fellmoon

I should mention that I don't have a Ryzen 7 5700 CPU but a Ryzen 5 Pro 5650U (which is also Cezanne and also has device ID 0x1638).

> What X server do you use on OpenBSD?

Just the default install of OpenBSD 7.0; it worked out of the box.

@pointcheck

pointcheck commented Apr 15, 2022

What is the default one on OpenBSD 7.0? Sorry, I have never touched OpenBSD and have no clue how it's put together. If that's Xorg, can you please show /var/log/Xorg.0.log?

@fellmoon

I just installed OpenBSD and it worked out of the box with the version from the installer. If you want to dig into their code, see https://cvsweb.openbsd.org/xenocara/ and have fun.

@pointcheck

Hmm.. it seems they are using Xorg 21.1.3 (Xorg adopted new versioning a year ago). Have to try it on FreeBSD.

@evadot
Contributor

evadot commented Jun 9, 2022

Should be ok with updated xf86-video-amdgpu

@takawata
Author

drm-510 solves this.
