Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Texture Sync, incompatible overlap handling, data flush improvements. #2971

Merged
merged 15 commits into from Jan 9, 2022

Conversation

riperiperi
Copy link
Member

@riperiperi riperiperi commented Jan 3, 2022

This PR aims to solve a bunch of issues caused by texture modification and data flushing, by primarily handling data flush on a per-handle instead of per-view basis, and synchronizing flushes with syncpoint increments.

This introduces a new backend method, so it's not currently compatible with Vulkan (though the benefits are more important there)

Part 1: Syncing Texture Flush

This one has been a while coming. When texture flush via memory tracking was first added, our CPU and various other components were considerably slower. The GPU was so tight with the CPU as a result of both this and the lack of backend multithreading, so when the game tried to access the data it was almost certain that the draw would have completed and the data would be there, just by chance.

This did not last. Over time people started encountering various issues where textures would flush before their data was fully ready, causing white water in BOTW and rainbow lighting in Splatoon. This is compounded a lot by the addition of Background Multithreading and Vulkan, so much that people thought there was a regression. There wasn't, the texture flush was just behaving as it always did.

The new flushing mechanism is similar to the method we use for buffers, which has proven speedy and always gets the data in place by the time the CPU wants it. There is one key difference - texture data is only synced at syncpoint increment, rather than every WaitForIdle. This is mostly a limitation of buffer flush that can be fixed later, but it means that adding these syncpoints for texture flush should not cost anything at all comparatively.

But things are not as simple as shoving the buffer syncaction logic onto textures, as there is another hurdle that could affect games that write into complex texture view layouts.

Modified Flags move to the Texture Group

Texture groups are a useful abstraction for keeping track of each subimage's memory tracking within a storage texture, and is shared across all views of a texture. Before now, this was not used for texture flush, and each texture view could have its own modified flag and flush action... these views could also overlap. While this has the benefit of fetching the data for modified views in one chunk, it doesn't bode well for tracking rules that use the gpu modified state of each subimage, as they are decoupled from the texture group and might flush at the wrong time.

Now, the modified flag and flush action have been moved to the texture group, and are tracked for each handle individually. Each handle can also have its own sync numbers for determining which sync object to wait on, or if it should even wait at all (more on this later)

Flushing by slice, rather than by view

Since modified flags are now owned by texture group handles, rather than texture views, they must be able to flush at a granularity smaller than a view. I've added methods on the backend to allow this, though cases which would flush the whole storage texture will still try to do that. These methods will also use the OGL 4.5 methods for flushing texture sub data, rather than flushing an entire level (all layers) and picking out of that. This has potential to be a little faster or slower depending on the game. Hopefully no drivers complain about this.

Flushing data that isn't ready

When a game flushes texture data, it's a reasonable assumption that the guest must have waited on some synchronization so that it could be sure that the data was actually there. If not, then they're leaving things up to chance, and they've essentially caused a race. This is undefined behaviour, so if texture data is accessed before its syncpoint is incremented, we flush the data regardless of whether it is ready or not (but wait on the last syncpoint that the guest did acknowledge, if possible).

This is the best of both worlds - we ensure data that's appropriately waited on becomes available, don't waste time waiting on accidental flushes that happen after an unrelated gpu write or other textures that share pages.

When one of these undefined flushes happens, the Modified flag is not cleared. The tracking action is re-added when the sync for the texture is actually reached, in case the game wants to access the up to date data later on. CPU writes during this state are considered to be racing with the GPU, so this is also treated specially.

When the modified flag remains set on a texture handle, data is ignored from the CPU. One possible case is that data is written to one mip level of a texture, but another mip level of the same texture remains untouched and contains data modified by the GPU (and is not flushed due to sync rules). Both are on the same page, so both recieve the same dirty flag from write tracking... but one contains data that is unflushed or flushed out of date. It would be invalid to overwrite that unflushed data from CPU if it truly was not touched, so tracking dirty flags are consumed and ignored in this state. Given a race, we therefore say that the GPU always wins.

Finally, it's important to note that accesses from the GPU thread do not follow these rules at all. The GPU's own engines and methods must have the texture data at the latest point in time. Thankfully, flushing from the main GPU thread in our backends guarantees this, so sync is skipped and the data is accessed "immediately". Buffers, conversely, must wait right now.

Part 2: Flush ordering for incompatible data

As of right now, Ryujinx follows a core assumption with the texture cache, where a use of any given texture should be valid in its own layout+format. If this is the case, then any previously cached textures overlapping the same memory that do not share the layout and format are tharefore invalid - they will contain garbage data. This core assumption keeps important texture data alive while saving time by not flushing or loading data that would be "garbage" in the new format/layout.

This rule wasn't fully established before. Incompatible textures only have the potential to be deleted when a new overlap appears over them, and any checks only happened when the texture was created. If both textures existed in the cache at the same point, they would be disjoint, and could flush separately... in any order. This meant that old texture data could flush over new texture data - not ideal for games with complex streaming behaviour.

This is where the new incompatible overlap rules come into play. To put it simply, if a texture is written to and one or more other incompatible textures overlap its memory, their data is considered invalid and their modified flags are immediately cleared. This means that only one can live at a time, and therefore when data flushes, it will always be the latest. Problem solved! Note that these rules were already in place before, they were just enforced on creation, rather than on each use.

This is a large sweeping change that affects every game, so testing would be appreciated to see if everything is still holding up.

Part 3: Flushing host incompatible formats

Occasionally, a texture format or relationship is used which isn't fully supported on the host. Two such examples are:

  • ASTC compressed textures, which are not supported on desktop GPUs.
  • BCn compressed 3D textures, which are not supported by OpenGL (but can be supported by Vulkan)

Ryujinx supports these formats by decompressing them to a supported, uncompressed format on the CPU. But, this means that data cannot be copied to/from these textures on the gpu due to using the replacement format, so they are ineligible for texture copy dependencies and views. This causes problems when certain games draw or copy data to these formats by aliasing them as some other data format:

  • Life is Strange: True Colors, and potentially other UE4 games (dungeon defenders?) use ASTC textures for characters and environments. UE4 moves texture data around by aliasing levels of textures as a data format such as R32G32Float, and copying them with a blit. Before, this moved data would be lost, as there was no relationship between the copy target and the ASTC texture.
  • BOTW draws into a compressed 3D BCn texture to use for the blue dissolve teleportation animation. The texture contains some perlin noise generated and compressed by the GPU. Before, link would just disappear immediately as ryu could not move the data for this texture.

The changes to add incompatible overlaps allowed me to add an additional case for these textures - the ability to force overlaps of these host incompatible format textures to flush their data back to the CPU before it can be used. This is not the fastest, but it is fully compatible and allows us to cover cases which were completely broken before.

This path is essentially the "slow path" for this kind of relationship, and will allow us to support rendered compressed textures on platforms that don't support BCn like mobile hardware. In time, we can introduce compute shader paths that will allow copy dependencies between data textures and "emulated" compressed textures to be done without any data flushing at all, which will further improve performance.

Potential issues with buffer/texture flush as it stands

One tracking structure is shared between all users of the same address space - they all see and signal the same tracked handles. However, there might be times where we want to sync on GPU, but the sync should not be visible on CPU yet. On top of this, a sync from CPU could remove tracking without sync flushing, but the GPU might try to use that handle immediately after for something like a UBO/I2M write and expect to get the latest data.

This last case can happen with buffers and textures. Not that it couldn't happen before, but it's important to bring up as potential future work. I have not seen it happen in practice.

Fixes

1:

2:

  • Texture streaming issues in UE3/UE4 games (mostly)
    • There are still some small issues in A Hat in Time, but they are greatly reduced. These are being investigated.

3:

I also fixed an issue with partial 3D texture DMA/load/flush that caused the gob size in z to be incorrect. Not sure what this fixes, but it might improve some issues with lighting in UE4 games a little.

Some things have been fixed regarding the DmaClass fast path used for videos. It won't try to copy to compressed textures directly now, and is aware of its 1 layer limitation.

Testing would be greately appreciated, to ensure that nothing is broken and that performance is similar.

Specifically, AMD/Intel Windows and Linux should be tested to make sure they don't have any weird issues with glGetTextureSubImage, as it's technially a DSA method and these drivers are pretty terrible at that.

Fixes a lot of issues with Unreal Engine games. Still a few minor issues (some caused by dma fast path?) Needs docs and cleanup.
Improve rules for fast DMA
…extures.

Fixes the new Life is Strange game.
Note: nosy people can no longer merge this with Vulkan. (unless they are nosy enough to implement the new backend methods)
@riperiperi riperiperi added gpu Related to Ryujinx.Graphics fix Fix something labels Jan 3, 2022
@GamerzHell9137
Copy link

Life is Strange True Colors

PR
PR1

Master
image

Dungeon Defenders Awakened

PR
PR2

Master
image

@gdkchan
Copy link
Member

gdkchan commented Jan 4, 2022

Fixes texture corruption in cutscenes on The Witcher 3 (gameplay after the cutscenes would also remain corrupted).

Before:
image
image
After:
image
image

Ryujinx.Graphics.Gpu/Image/Texture.cs Show resolved Hide resolved
Ryujinx.Graphics.Gpu/Image/TextureCache.cs Outdated Show resolved Hide resolved
Ryujinx.Graphics.Gpu/Image/TextureCompatibility.cs Outdated Show resolved Hide resolved
Ryujinx.Graphics.Gpu/Image/TextureGroup.cs Outdated Show resolved Hide resolved
Ryujinx.Graphics.Gpu/Image/TextureGroupHandle.cs Outdated Show resolved Hide resolved
Ryujinx.Graphics.Gpu/Memory/MultiRangeWritableBlock.cs Outdated Show resolved Hide resolved
Ryujinx.Graphics.OpenGL/Image/TextureView.cs Outdated Show resolved Hide resolved
Ryujinx.Graphics.OpenGL/Image/TextureView.cs Outdated Show resolved Hide resolved
Ryujinx.Graphics.OpenGL/PersistentBuffers.cs Outdated Show resolved Hide resolved
@Djipi71
Copy link

Djipi71 commented Jan 4, 2022

Test on BLue Fire : a lot less graphical issues :)
ryujinx_capture_2022-01-04_08-08-55

@Djipi71
Copy link

Djipi71 commented Jan 4, 2022

Brothers now have perct Graphics with this fix
ryujinx_capture_2022-01-04_08-11-13

@Djipi71
Copy link

Djipi71 commented Jan 4, 2022

Darksiders 3 Look a lot better
ryujinx_capture_2022-01-04_08-14-46

@Djipi71
Copy link

Djipi71 commented Jan 4, 2022

Always shadows issues but a lot better too
ryujinx_capture_2022-01-04_08-23-45

Copy link
Member

@gdkchan gdkchan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, great work, nice to see some of those texture issues fixed!

@Djipi71
Copy link

Djipi71 commented Jan 8, 2022

with the new build Crash4 not starting :(
Ryujinx_1.0.0+2b8da3f_2022-01-08_05-27-41.log

@LukeWarnut
Copy link
Contributor

with the new build Crash4 not starting :( Ryujinx_1.0.0+2b8da3f_2022-01-08_05-27-41.log

Disable guest internet access.

@GamerzHell9137
Copy link

with the new build Crash4 not starting :( Ryujinx_1.0.0+2b8da3f_2022-01-08_05-27-41.log

Disable guest internet access.

Won't help, there might come a PR that allows the game to boot again tho.

@gdkchan gdkchan merged commit cda6599 into master Jan 9, 2022
@gdkchan gdkchan deleted the feature/tex-sync branch January 9, 2022 16:29
@Djipi71
Copy link

Djipi71 commented Jan 9, 2022

Seems that the last build before merging have some issues (i hve post some of them on general channel)

riperiperi added a commit to riperiperi/Ryujinx that referenced this pull request Jan 10, 2022
This fixes some regressions caused by Ryujinx#2971 which caused rendered 3D texture data to be lost for most slices. Fixes issues with Xenoblade 2's colour grading, probably a ton of other games.

This also removes the check from TextureCache, making it the tiniest bit smaller (any win is a win here).
marysaka pushed a commit that referenced this pull request Jan 11, 2022
#2993)

This fixes some regressions caused by #2971 which caused rendered 3D texture data to be lost for most slices. Fixes issues with Xenoblade 2's colour grading, probably a ton of other games.

This also removes the check from TextureCache, making it the tiniest bit smaller (any win is a win here).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fix Fix something gpu Related to Ryujinx.Graphics
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants