New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DriverDetails: Add broken discard with early-Z bug on Apple Silicon GPUs #10290
Conversation
550c737
to
9d8a0b5
Compare
9d8a0b5
to
465b901
Compare
|
Hi @OatmealDome , Is there a certain build that this is already fixed in? Or is it not implemented into a build yet? I'm running build dev version 5.0-15801 on an M1 Mac mini and having the black textures issue with Vulcan backend. Thanks! |
|
@andevode68 There is no beta or development build with this fix in it at this time. Before that happens, core Dolphin developers must review this pull request to ensure the code quality is acceptable, then choose to "merge" it. Once that happens, a new development build will be automatically made with this fix. That being said, you can download a testing version of this pull request here. However, I don't recommend staying on a pull request build for very long, as you may miss other fixes and improvements that get added over time to the development / beta builds. |
|
@OatmealDome FANTASTIC! Thank you for the quick response! That definitely fixed it, and after applying 16:9 codes, it looks just as good as the Steam version! Now I just wish the frame times were consistent, but I guess this game is just hard to emulate. But I'm not going to complain because now the game is playable! This is the only game I couldn't play, and now I can! And I'll be on the lookout for a proper development build. Thank you again!! |
9cf1517
to
3990448
Compare
00a32d7
to
eaf6620
Compare
|
Please squash the lint fix. |
1ed4c89
to
92f637f
Compare
|
Squashed. |
|
This also fixes Ikaruga. The only way to make it render correctly was to enable exclusive ubershaders before this fix. edit: broken example: https://imgur.com/eVlsKg3 |
|
@Neirene This might just be a dual core bug. Try turning off dual core (uncheck Config -> General -> Enable Dual Core). |
|
@OatmealDome It was indeed a dual core bug, by disabling it I was able to proceed to the game as usual. Now i'm posting here 2 short clips with the results and can confirm it fixes the graphics rendering issues on M1 for Phantasy Star Online Episode 1 & 2 Plus (GPOJ8P) Macbook Air M1 16/8GPU |
92f637f
to
0150f41
Compare
|
Something changed after the rebase. I get flickering textures in F-zero, a game that didn’t have problems before these changes. Ikaruga which was affected now has it worse. What bypassed the problem before (enabling exclusive ubershaders) prevented the problem for the two I just tested. I also got a ton of these dumps before I trashed the shader cache. |
cbcf6bd
to
a17c76e
Compare
|
@dvessel Please try again - I made a mistake while rebasing. |
|
It’s not as severe but I still get broken textures. I got a bunch of render time logs from dev but I stopped after running f-zero from this pull. Disabled audio, native resolution and emulation speed set to 5x to put the pressure on the CPU. f-zero single attract loop, average frame time, over 10k samples:
|
|
@dvessel Would it be possible to send a FIFO log of the flickering textures? I wrote instructions here: https://bugs.dolphin-emu.org/issues/12705#note-1 Also, this change only really affects the GPU side of things, so increasing IR slowly and seeing when performance starts to dip is what I think would be the best way to benchmark this. |
|
Okay, I’ll get to it later today. I thought I was looking for performance regressions due to instructions normally done on the GPU being done in the CPU instead due to the bug. |
|
Running a similar test, whatever was changed, it really got broken.... the menus included Fifo Menu Fifo Pioneer 2 Central Control Area (The area shown in the previous reply) Fifo CCA Fifo CCA Seaside I've increased the IR to up to 4x and the performance and FPSs are not affected only the graphics |
a17c76e
to
259a5fc
Compare
|
@dvessel I fixed another issue with the rebase, please try again. |
My usual settings (includes the other settings):
What’s odd here is that my usual settings perform faster than master. ¯_(ツ)_/¯ @OatmealDome mentioned to me that this was held up because of the potential performance regression for games that weren’t affected. I could test Ikaruga but there’s no point. Using dedicated ubershaders to work around the bug was a massive performance drain and this fixes it. |
|
Quick Rogue Squadron test.. Removed 3 frame spikes (in excess of 1k ms) for each run. They are pretty much the same. |
|
I’ve been keeping this up to date locally and it works great. What else is needed to get this merged? Please don’t let it go stale, thanks! |
commit c42392c Merge: 61edcf7 259a5fc Author: JMC47 <JMC4789@gmail.com> Date: Sun Apr 24 13:30:04 2022 -0400 Merge pull request dolphin-emu#10290 from OatmealDome/m1-earlyz-bug DriverDetails: Add broken discard with early-Z bug on Apple Silicon GPUs commit 61edcf7 Merge: 6abf367 787e3ef Author: Admiral H. Curtiss <pikachu025@gmail.com> Date: Sun Apr 24 18:02:19 2022 +0200 Merge pull request dolphin-emu#10606 from AdmiralCurtiss/memory-widget-refactoring-1 Qt/MemoryWidget: Light refactoring and quality of life features. commit 787e3ef Author: Admiral H. Curtiss <pikachu025@gmail.com> Date: Sun Apr 24 06:49:10 2022 +0200 Qt/MemoryViewWidget: Detect row breakpoint cell by cell data instead of cell position. commit 6920a24 Author: Admiral H. Curtiss <pikachu025@gmail.com> Date: Sun Apr 24 06:30:40 2022 +0200 Qt/MemoryViewWidget: Add option to copy the actually displayed cell value to clipboard. commit 54ec0bd Author: Admiral H. Curtiss <pikachu025@gmail.com> Date: Sun Apr 24 06:25:42 2022 +0200 Qt/MemoryViewWidget: Don't use a member variable to hold information about the current mouse click. commit 6abf367 Merge: 13e2dda 14f9ffe Author: JosJuice <josjuice@gmail.com> Date: Sun Apr 24 10:37:45 2022 +0200 Merge pull request dolphin-emu#10588 from JosJuice/jitarm64-psq-stxx-q0 JitArm64: Always lock Q0 in psq_stXX commit 13e2dda Author: JosJuice <josjuice@gmail.com> Date: Sun Apr 24 10:06:44 2022 +0200 Translation resources sync with Transifex commit 26f9c8b Author: Admiral H. Curtiss <pikachu025@gmail.com> Date: Sun Apr 24 05:24:20 2022 +0200 Qt/MemoryWidget: Don't force a fixed size for the sidebar. commit 4c080b8 Merge: e0afcb3 a7111e3 Author: Admiral H. Curtiss <pikachu025@gmail.com> Date: Sat Apr 23 22:23:31 2022 +0200 Merge pull request dolphin-emu#10578 from TryTwo/PR_MemoryWidget_Dual_Views Debugger MemoryWidget: Add dual views commit e0afcb3 Merge: cb5e967 b5a7ae5 Author: JosJuice <josjuice@gmail.com> Date: Sat Apr 23 22:04:10 2022 +0200 Merge pull request dolphin-emu#10540 from nyanpasu64/fix-gcadapter-atomics Remove atomic usage and fix mutex locking in GCAdapter code commit cb5e967 Merge: 8b5a61b 235f729 Author: Admiral H. Curtiss <pikachu025@gmail.com> Date: Sat Apr 23 21:07:35 2022 +0200 Merge pull request dolphin-emu#10596 from richarm4/patch-3 Added space in comment commit 235f729 Author: Matthew Richards-Wells <91291346+richarm4@users.noreply.github.com> Date: Wed Apr 20 12:49:17 2022 -0700 GameSettings: Add missing space in comment. commit 8b5a61b Merge: 19c71db 12cd81b Author: Admiral H. Curtiss <pikachu025@gmail.com> Date: Sat Apr 23 20:32:47 2022 +0200 Merge pull request dolphin-emu#10599 from shuffle2/libusb Libusb fixups commit 19c71db Merge: 69ca38d f5f5262 Author: Mai M <mathew1800@gmail.com> Date: Sat Apr 23 06:10:20 2022 -0400 Merge pull request dolphin-emu#10597 from Simonx22/fix-ingame-menu-design Android: Fix in game menu rippleColor and colorEdgeEffect commit a7111e3 Author: TryTwo <taolas@gmail.com> Date: Sun Apr 17 00:47:05 2022 -0700 Dual View any size. commit 14f9ffe Author: JosJuice <josjuice@gmail.com> Date: Sat Apr 23 11:37:52 2022 +0200 JitArm64: Add documentation comment for EmitBackpatchRoutine commit 69ca38d Merge: 56bb965 6eb9111 Author: JosJuice <josjuice@gmail.com> Date: Sat Apr 23 10:25:48 2022 +0200 Merge pull request dolphin-emu#10600 from t895/modern-card Android: Modernize game card commit 56bb965 Merge: 2e01dc0 7840798 Author: JMC47 <JMC4789@gmail.com> Date: Fri Apr 22 23:24:22 2022 -0400 Merge pull request dolphin-emu#10584 from Pokechu22/emboss-single-normal-v2 VideoCommon: Handle emboss texgen with only a single normal commit 6eb9111 Author: Charles Lombardo <clombardo169@gmail.com> Date: Fri Apr 22 12:56:58 2022 -0400 Modernize game card +Remove background on card +Increase max # of lines for game title +Root layout is now a linear layout with the card view rounding the corners on the box art commit 7840798 Author: Pokechu22 <Pokechu022@gmail.com> Date: Tue Apr 19 17:46:20 2022 -0700 VideoCommon: Add comment explaining why only the first normal gets normalized Co-authored-by: Scott Mansell <phiren@gmail.com> commit 2a5c77f Author: Pokechu22 <Pokechu022@gmail.com> Date: Wed Apr 13 22:03:34 2022 -0700 VideoCommon: Handle emboss texgen with only a single normal Fixes a large number of effects in Rogue Squadron 2 and 3. commit 39b2854 Author: Pokechu22 <Pokechu022@gmail.com> Date: Thu Apr 14 12:01:57 2022 -0700 VertexLoader: Convert count register to remaining register This more accurately represents what's going on, and also ends at 0 instead of 1, making some indexing operations easier. This also changes it so that position_matrix_index_cache actually starts from index 0 instead of index 1. commit 97d0ff5 Author: Pokechu22 <Pokechu022@gmail.com> Date: Wed Apr 13 16:12:53 2022 -0700 Convert vertex loader position cache to std::array commit f722bdf Author: Pokechu22 <Pokechu022@gmail.com> Date: Wed Apr 13 20:57:38 2022 -0700 VertexLoaderX64: Refactor so that zfreeze is only in one place (Specifically, the copy for VertexLoaderManager::position_cache. The position matrix index happens elsewhere, and the float path still has special logic to copy to scratch3.) commit 6f1350a Author: Pokechu22 <Pokechu022@gmail.com> Date: Wed Apr 13 17:03:53 2022 -0700 VertexLoaderARM64: Fix z-freeze position matrix index Before, it would always write to index 0 (which is unused). Now it writes to the correct index. commit 04fdadd Author: Pokechu22 <Pokechu022@gmail.com> Date: Fri Apr 22 12:50:44 2022 -0700 VideoCommon: Rename norm0/norm1/norm2 to normal/tangent/binormal commit 88134a6 Author: Pokechu22 <Pokechu022@gmail.com> Date: Tue Dec 28 13:01:57 2021 -0800 VertexShaderGen: Simplify normal calculation This is a readability change; there should be no functional or performance differences. commit 12cd81b Author: Shawn Hoffman <godisgovernment@gmail.com> Date: Fri Apr 22 08:58:38 2022 -0700 GCAdapter: don't call libusb_detach_kernel_driver on apple commit 5cd3cf9 Author: Shawn Hoffman <godisgovernment@gmail.com> Date: Fri Apr 22 08:48:28 2022 -0700 GCAdapter: fix retval check of libusb_detach_kernel_driver commit 978c908 Author: Shawn Hoffman <godisgovernment@gmail.com> Date: Fri Apr 22 07:37:56 2022 -0700 GCAdapter: move libusb context teardown last commit 1c9dfb7 Author: Shawn Hoffman <godisgovernment@gmail.com> Date: Fri Apr 22 07:36:56 2022 -0700 GCAdapter: some macro cleanup commit f52d948 Author: Shawn Hoffman <godisgovernment@gmail.com> Date: Fri Apr 22 07:12:09 2022 -0700 GCAdapter: set read/write thread names commit 0a07c76 Author: Shawn Hoffman <godisgovernment@gmail.com> Date: Fri Apr 22 07:07:20 2022 -0700 update libusb submodule to latest commit af930bc Author: Shawn Hoffman <godisgovernment@gmail.com> Date: Fri Apr 22 07:05:41 2022 -0700 make libusb submodule shallow commit f5f5262 Author: Simonx22 <simon@oatmealdome.me> Date: Wed Apr 20 16:22:06 2022 -0400 Android: Fix in game menu rippleColor and colorEdgeEffect commit 259a5fc Author: OatmealDome <julian@oatmealdome.me> Date: Sat Dec 25 00:27:43 2021 -0500 DriverDetails: Add broken discard with early-Z bug on Apple Silicon GPUs commit e7f5e51 Author: OatmealDome <julian@oatmealdome.me> Date: Fri Jul 30 04:02:51 2021 -0400 DriverDetails: Introduce new VENDOR_APPLE for Apple GPUs commit 80dfefb Author: OatmealDome <julian@oatmealdome.me> Date: Thu Jan 6 04:15:21 2022 -0500 UberShaderPixel: Add support for non-dual source shader blending commit c1d87db Author: OatmealDome <julian@oatmealdome.me> Date: Thu Jan 6 04:15:07 2022 -0500 PixelShaderGen: Add support for non-dual source shader blending commit bad0283 Author: OatmealDome <julian@oatmealdome.me> Date: Sat Dec 25 00:27:53 2021 -0500 VKPipeline: Add shader blending support commit cc22f1a Author: TryTwo <taolas@gmail.com> Date: Wed Apr 6 22:50:05 2022 -0700 MemoryWidget add dual views for two separate column types. Force first column to be Hex32. commit 2ef2d47 Author: JosJuice <josjuice@gmail.com> Date: Sat Apr 16 13:22:36 2022 +0200 JitArm64: Always lock Q0 in psq_stXX Q0 is used as a scratch register by EmitBackpatchRoutine. Fixes a vertex explosion in Spider-Man 2 that was uncovered by 20b2300. commit b5a7ae5 Author: nyanpasu64 <nyanpasu64@tuta.io> Date: Sun Mar 27 22:37:43 2022 -0700 Fix locking the wrong mutex in GCAdapter_Android.cpp ResetRumble() I am not confident there are no race conditions between s_write_mutex, s_controller_write_payload_size, and s_controller_write_payload. But this code should be safer than before. commit 7616027 Author: nyanpasu64 <nyanpasu64@tuta.io> Date: Sun Mar 27 22:27:44 2022 -0700 Remove unnecessary atomic usage in GCAdapter_Android.cpp s_controller_write_payload_size needs to remain an atomic because Read() loads and stores without holding a mutex, Output() stores while holding s_write_mutex, and ResetRumble() stores while holding s_read_mutex! I'm pretty sure this code is wrong, specifically ResetRumble(). commit 871b01a Author: nyanpasu64 <nyanpasu64@tuta.io> Date: Sun Mar 27 22:25:40 2022 -0700 Remove unnecessary atomic usage in GCAdapter.cpp You can safely read or write non-atomic integers on multiple threads, as long as every thread reading or writing it holds the same mutex while doing so (here, s_mutex). Removing the atomic accesses makes the code faster, but the actual performance difference is probably negligible.















The drivers for newer Apple Silicon GPUs appear to have some sort of issue with
discardwhen early-Z testing is enabled. Fragments that are discarded can appear corrupted (glitchy textures, being rendered as all black, completely invisible, etc).I workaround this issue by emulating
discardusing framebuffer fetch. Instead of using the built-indiscard, I read in the value of the framebuffer and use that as the fragment output. Unfortunately, I couldn't figure out a good solution that works with hardware blending (what do I write out toocol1?), so shader blending is enforced. This may decrease performance a bit.I also solve a pre-existing bug with OpenGL where using shader blending along with ubershaders may result in double blending due to the fact that the OpenGL backend uses a different check for shader blending versus the Ubershader generator.
Fixes Super Mario Sunshine, Sonic Adventure 2: Battle, Phantasy Star Online Episodes 1 & 2, Chibi Robo, and probably a bunch more.