Kernel panic in AMD GPU drivers on bootup after 10.14.5 Beta 2 #1

Closed
osy opened this issue Jun 27, 2019 · 5 comments · Fixed by #279
Labels
graphics Graphics/GPU issues kernel KEXT and kernel issues

Comments

@osy
Owner

osy commented Jun 27, 2019

After updating to 10.14.5 Beta 2, the kernel panics on boot if GPU acceleration is enabled.

The crash is in AMDRadeonX4000, but replacing AMDRadeonX4000HWLibs with the one from Beta 1 resolves the issue.

Running bindiff on HWLibs and looking at the changes, we identify the following function changes (data-only changes and constant operand changes are not identified by bindiff).

A number of functions have bzero added to clear the stack data before use:

_SW_SMUM_Dpm_SetMinDeepSleepDceFClk
_SW_SMUM_Dpm_SetWorkloadPolicy
_SW_SMUM_Dpm_DisableUclkFastSwitching
_SW_SMUM_Dpm_EnableUclkFastSwitching
_SW_SMUM_Dpm_GetCurrentDpm
_SW_SMUM_Dpm_ForceDpmLevel
_SW_SMUM_Dpm_SetClockLimit
_SW_SMUM_Dpm_GetClockLimit
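
The effect of the added bzero calls can be illustrated with a minimal sketch. The struct and function names below are hypothetical and are not taken from the driver; the sketch only shows the general pattern of clearing a stack buffer before partially filling it, so that fields the caller never sets do not carry leftover stack contents.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical message layout; the real SMU message structs are not shown in this issue. */
struct smu_msg {
    uint32_t id;
    uint32_t arg;
    uint32_t reserved[2]; /* fields the caller never sets explicitly */
};

/* Zero the whole buffer first, then fill in only the fields that matter. */
static void build_msg(struct smu_msg *out, uint32_t id, uint32_t arg)
{
    assert(out != NULL);
    memset(out, 0, sizeof *out); /* portable equivalent of bzero(out, sizeof *out) */
    out->id = id;
    out->arg = arg;
    /* reserved[] is now 0 regardless of what was on the stack before */
}
```

Without the memset, reserved[] would hold whatever the previous stack frame left behind. That kind of change is usually defensive and is unlikely on its own to introduce a panic.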

A few hardware-specific changes that may or may not be used by the Polaris22 path:

_gc_9_1_init_gfx_power_gating
_gc_10_1_get_gb_addr_config_default
_greenland_update_hw_virtualization_settings
_Cail_Bonaire_UpdateMultimediaClockGating
_PhwCIslands_BugCheckRegisterDump (added register write)
_PhwCIslands_Initialize
_PhwPolaris10_Initialize

Some "suspicious" changes:

_CailIdentifyCrossDisplayAndXGP: check (ulong)(*(int *)(lParm1 + 0x198) - 0x41U < 0x40) for "EnableXDSupport"
_PEM_CWDDEPM_AdjustPowerOptimizationSettings: added blob of code
_CAILFullResetSupport: new branch for (*(int *)(lParm1 + 0x19c) - 0x41U < 0x40)
_Cail_MCILQuerySystemInfo: old:0,1 new:0,2
AtiPowerPlayServices::ppInitialize(PPDisplayConfiguration *): call _PP_Initialize => now calls _PP_InitializeEX
_CailReadinRegistryFlags: check (uint)(0x3f < *(int *)(lParm1 + 0x198) - 0x41U) for "DisableFBCSupport"
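
The `x - 0x41U < 0x40` pattern in these decompiled checks is the usual compiler idiom for a range test: because the subtraction is unsigned, values below 0x41 wrap around to very large numbers, so the single comparison is true exactly when 0x41 <= x <= 0x80. The `0x3f < x - 0x41U` form is the negation of the same test. The sketch below demonstrates the equivalence; the function names are hypothetical, and what the fields at offsets 0x198 and 0x19c actually represent is not established in this issue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The comparison as the decompiler prints it: one unsigned compare. */
static bool in_range_decompiled(int32_t x)
{
    return (uint32_t)x - 0x41U < 0x40U;
}

/* The same test written as an explicit closed interval. */
static bool in_range_readable(int32_t x)
{
    return x >= 0x41 && x <= 0x80;
}

/* The "0x3f < x - 0x41U" variant: true when x is outside the interval. */
static bool out_of_range_decompiled(int32_t x)
{
    return 0x3fU < (uint32_t)x - 0x41U;
}
```

Read this way, the new checks gate "EnableXDSupport", "DisableFBCSupport" and the new reset branch on a value falling inside (or outside) the window 0x41..0x80.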

Other changes that seem benign:

_PEM_CWDDEPM_PMLogControl: removed _PECI_LockPowerPlayOnly/_PECI_UnlockPowerPlayOnly
_PHM_CollectDbgInfo: added indirect call at end
_CailSaveCailInitInfo: added a register copy
_PEM_CWDDEPM_GetODDefaultPerformanceLevels: added assertion
__ZN20AtiPowerPlayServicesC2EP18PowerPlayCallbacks: assertion changes, maybe more
__ZN21AtiPowerPlayInterface25createPowerPlayServiceForEP18PowerPlayCallbacks: assertion changes
__ZN25AtiApplePowerTuneServices23createPowerTuneServicesEP11PP_InstanceP18PowerPlayCallbacks: added navi support
__ZN20AtiAppleMcilServices9obtainIriEPvP22_MCIL_IRI_OBTAIN_INPUTP23_MCIL_IRI_OBTAIN_OUTPUT: removed "ATY,CAIL_IRI"

Individually patching out each change identified here to restore the old behaviour does not appear to fix the issue.

As a workaround, we load the Beta 1 HWLibs, which works normally but may break in a future macOS update.

@osy osy added the kernel KEXT and kernel issues label Jun 27, 2019
@osy osy changed the title from "Kernel panic in AMD GPU drivers on bootup after 10.14.5 Beta 2" to "Kernel panic in AMD GPU drivers on bootup after 10.14.5 Beta 2 and 15.1 Beta" Oct 26, 2019
@osy osy changed the title from "Kernel panic in AMD GPU drivers on bootup after 10.14.5 Beta 2 and 15.1 Beta" to "Kernel panic in AMD GPU drivers on bootup after 10.14.5 Beta 2 and 10.15.1 Beta" Oct 26, 2019
@osy
Owner Author

osy commented Oct 26, 2019

This issue is showing up in 10.15.1 and the workaround no longer works.

@osy osy changed the title from "Kernel panic in AMD GPU drivers on bootup after 10.14.5 Beta 2 and 10.15.1 Beta" to "Kernel panic in AMD GPU drivers on bootup after 10.14.5 Beta 2 and 10.15.1 Beta 2" Oct 26, 2019
@osy osy pinned this issue Oct 28, 2019
@osy osy mentioned this issue Oct 29, 2019
6 tasks
@osy osy changed the title from "Kernel panic in AMD GPU drivers on bootup after 10.14.5 Beta 2 and 10.15.1 Beta 2" to "Kernel panic in AMD GPU drivers on bootup after 10.14.5 Beta 2" Oct 29, 2019
@osy osy unpinned this issue Oct 30, 2019
@osy osy added the graphics Graphics/GPU issues label Dec 21, 2019
@desert0616

I am facing a very similar issue on my MacBook Pro 2015 (rx450). Although the GPU on the NUC8 is named Vega, it should be of the older architecture, so I think it is a driver bug for Polaris.

@jasanders

Will this change in 10.15.4 Beta 1 impact this bug?
https://www.reddit.com/r/hackintosh/comments/ezr3e5/macos_10154_beta_1_gives_back_drm_to_polaris/

@osy
Owner Author

osy commented Feb 6, 2020

Interesting, we’ll have to see.

@osy osy added the wontfix This will not be worked on label Apr 29, 2020
@osy
Owner Author

osy commented Jun 14, 2020

Figured out the issue. Polaris22_UploadSMUFirmwareImageDefault calls PECI_IsEarlySAMUInitEnabled to check whether the SMU firmware can be loaded directly. PECI_IsEarlySAMUInitEnabled looks at bit 0x160 of CAIL_DDI_CAPS_POLARIS22_A0, which should be 0. It is 1 instead, so the firmware is not loaded. Patching the function to return 0 fixes it.

It worked before by chance. AtiAppleCailServices::isAsicCapEnabled was updated to include Polaris22 settings. Previously they weren't there, so the bit defaulted to 0.
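
A rough sketch of the mechanism described above, under stated assumptions: the capability table is modelled here as an array of 32-bit words with bits numbered from the least significant bit, which is a guess at the layout and not confirmed by the driver. All function names in the sketch are hypothetical; only the bit index 0x160 and the "return 0" patch come from the comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CAP_EARLY_SAMU_INIT 0x160u /* bit index quoted in the comment above */

/* Assumed layout: caps packed into 32-bit words, least significant bit first.
   A missing table is represented by nwords == 0 (caps may then be NULL). */
static bool cap_enabled(const uint32_t *caps, size_t nwords, unsigned bit)
{
    assert(caps != NULL || nwords == 0);
    /* No entry for this ASIC, or bit beyond the table: the cap reads as 0.
       This models the old behaviour, where Polaris22 had no settings. */
    if (bit / 32 >= nwords)
        return false;
    return ((caps[bit / 32] >> (bit % 32)) & 1u) != 0;
}

/* Effect of the patch: the query always reports 0, whatever the table says. */
static bool is_early_samu_init_enabled_patched(const uint32_t *caps, size_t nwords)
{
    (void)caps;
    (void)nwords;
    return false;
}
```

With this layout, bit 0x160 is bit 0 of word 11. Once a Polaris22 table exists with that bit set, the unpatched query returns 1 and the direct firmware upload is skipped; forcing the result to 0 restores the earlier behaviour.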

@osy osy removed the wontfix This will not be worked on label Jun 14, 2020
osy pushed a commit that referenced this issue Jun 16, 2020
* Use Lilu again
* Patch works on resume (fixes #206)
* Fix kernel panic on bootup without workaround (fixes #1)
* Above means booting with latest AMD GPU drivers works, this may fix other
  issues as well.
@osy osy closed this as completed in #279 Jun 24, 2020
osy pushed a commit that referenced this issue Sep 13, 2020
3 participants