rsx: Improve shader decompiler output #8984

Merged
merged 5 commits into from Sep 27, 2020

@kd-11 kd-11 commented Sep 26, 2020

  • Rewrites the vertex decoder to emit much simpler and smaller code which in turn drastically speeds up link time. On GCN GPUs this can result in almost 2x speedup of shader link time which means less time wasted waiting for shaders to compile. Resulting bytecode is almost 30% smaller as well, which may help performance on low end GPUs and/or IGPs.
  • Reimplements fp32->fp16 casting to use GPU native instructions. This seems to work on my AMD card correctly and still preserves INF/NaN special encoding after the cast. 10-15% instruction count savings.
  • Use intrinsics wherever possible to take advantage of native hardware instructions. Replaces many shift-and operations with a single bfe and shift-or with bfi. Generates slightly smaller code but more importantly - it is much cleaner and easier to read.
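A CPU-side illustration of the fp32->fp16 point, for anyone who wants to see what "preserves INF/NaN special encoding" means at the bit level. This is only a sketch: it uses Python's standard `struct` half-float format as a stand-in for a native GPU conversion, the helper names are made up for this example, and Python floats are 64-bit rather than fp32. The PR does not say which GPU instruction is actually used.

```python
import math
import struct

def to_fp16_bits(x: float) -> int:
    """Round x to IEEE 754 binary16 and return the 16-bit pattern (hypothetical helper)."""
    return struct.unpack("<H", struct.pack("<e", x))[0]

def from_fp16_bits(bits: int) -> float:
    """Expand a 16-bit binary16 pattern back to a Python float (hypothetical helper)."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]

EXP_MASK = 0x7C00   # all five exponent bits set means INF or NaN
MANT_MASK = 0x03FF  # non-zero mantissa with a full exponent means NaN

inf_bits = to_fp16_bits(float("inf"))
nan_bits = to_fp16_bits(float("nan"))

# INF keeps a full exponent and an empty mantissa after the cast.
assert inf_bits & EXP_MASK == EXP_MASK and inf_bits & MANT_MASK == 0
# NaN keeps a full exponent and a non-zero mantissa after the cast.
assert nan_bits & EXP_MASK == EXP_MASK and nan_bits & MANT_MASK != 0
# Ordinary values round-trip as expected.
assert from_fp16_bits(to_fp16_bits(1.0)) == 1.0
```

A hand-written cast has to special-case these encodings itself, which is presumably where the instruction count savings come from.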

To Testers: Just check if games are working the same as before. While we already found some bugs and fixed them with internal testing, bugs may still lurk.

- I've not found it to be very useful and it just breaks good code right now.
  TODO: Re-enable when things improve.
- Significantly improves compilation speed by simplifying most of the code and doing something similar to LICM.
  * Actual decoding is now vectorized and performed in one step rather than in a loop.
  * Switches inside loops are removed and replaced with simple comparison. Generates much nicer (and smaller) GCN bytecode.
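The "similar to LICM" idea in the notes above can be sketched on the CPU. This is not the decompiler's generated code; the two attribute formats (normalized and plain unsigned bytes) and the function names are assumptions made up for the example. It only shows the shape of the change: a format check that used to run once per component is decided once with a simple comparison, and all components are then decoded in one step.

```python
def decode_per_component(raw, normalized):
    """Loop version: the format check is re-evaluated for every component."""
    out = []
    for c in raw:
        if normalized:
            out.append(c / 255.0)   # unsigned byte mapped to [0, 1]
        else:
            out.append(float(c))    # unsigned byte passed through
    return out

def decode_hoisted(raw, normalized):
    """Hoisted version: the loop-invariant choice is made once, up front."""
    divisor = 255.0 if normalized else 1.0  # single comparison, no per-lane branch
    return [c / divisor for c in raw]       # stands in for one vector operation

components = [0, 51, 102, 255]
assert decode_hoisted(components, True) == decode_per_component(components, True)
assert decode_hoisted(components, False) == decode_per_component(components, False)
```

Both versions return the same values; the hoisted one simply has less control flow for a shader compiler to chew through.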
@kd-11 kd-11 changed the title [TESTERS NEEDED] rsx: Improve shader decompiler output [TESTERS NEEDED][WIP] rsx: Improve shader decompiler output Sep 26, 2020
@kd-11
Contributor Author

kd-11 commented Sep 26, 2020

Relegated to WIP, looks like I broke something when doing squash/rebase.

@kd-11 kd-11 marked this pull request as draft September 26, 2020 14:53
@kd-11 kd-11 force-pushed the rsx_volatile branch 2 times, most recently from b929ad0 to e7c199b Compare September 26, 2020 17:27
- Optimize clamp16
- Use bfe instead of shift-and
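For readers unfamiliar with the terms, bfe (bitfield extract) and bfi (bitfield insert) can be modelled on the CPU as below. This is a sketch, not the generated shader code: the Python function names are made up here, and the argument order follows GLSL's `bitfieldExtract(value, offset, bits)` and `bitfieldInsert(base, insert, offset, bits)`, which is an assumption about what the intrinsics map to.

```python
MASK32 = 0xFFFFFFFF

def bfe(value: int, offset: int, bits: int) -> int:
    """Unsigned bitfield extract: `bits` bits of `value`, starting at bit `offset`."""
    return (value >> offset) & ((1 << bits) - 1)

def bfi(base: int, insert: int, offset: int, bits: int) -> int:
    """Bitfield insert: overwrite `bits` bits of `base` at `offset` with the low bits of `insert`."""
    mask = ((1 << bits) - 1) << offset
    return ((base & ~mask) | ((insert << offset) & mask)) & MASK32

word = 0xAABBCCDD

# shift-and and bfe pull out the same byte.
assert (word >> 8) & 0xFF == bfe(word, 8, 8) == 0xCC

# shift-or and bfi rebuild the same word when the target field is zero.
hole = 0xAABB00DD
assert hole | (0xCC << 8) == bfi(hole, 0xCC, 8, 8) == word
```

On hardware with native support each helper is a single instruction, where the shift-and or shift-or form needs two or more.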
@kd-11 kd-11 changed the title [TESTERS NEEDED][WIP] rsx: Improve shader decompiler output [TESTERS NEEDED] rsx: Improve shader decompiler output Sep 26, 2020
@kd-11 kd-11 marked this pull request as ready for review September 26, 2020 18:30
@kd-11
Contributor Author

kd-11 commented Sep 26, 2020

Regressions fixed.

@AniLeo
Member

AniLeo commented Sep 26, 2020

i7-6700HQ, HD 530

Render: Vulkan
Driver: Mesa 20.1.7 (anv)
SPU Block Size: Mega
Shader Mode: Asynchronous

Test performed with a clean emulator shaders_cache and a clean .cache/mesa_shaders_cache, with SPU code cached beforehand. Times are measured from the first shader compilation to the last, without moving the character.

| Persona 5 | Before | After |
| --- | --- | --- |
| School Corridor (190 shaders) | 62s | 51s |
| Backstreets (164 shaders) | 62s | 47s |
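As a quick reading aid, the relative savings implied by these timings can be worked out as follows. The numbers are copied from the measurements above; the variable and function names are only for this example.

```python
# (scene, shader count, seconds before, seconds after), from the measurements above
RESULTS = [
    ("School Corridor", 190, 62, 51),
    ("Backstreets", 164, 62, 47),
]

def reduction_percent(before: float, after: float) -> float:
    """Percentage of total compile time saved, rounded to one decimal place."""
    return round(100.0 * (before - after) / before, 1)

for scene, shaders, before, after in RESULTS:
    print(f"{scene}: {reduction_percent(before, after)}% less time over {shaders} shaders")
```

That works out to roughly 17.7% for School Corridor and 24.2% for Backstreets on this iGPU.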

@Kravickas
Contributor

Kravickas commented Sep 27, 2020

NHL Legacy

Black screen in cutscenes, with only the UI rendering.
Tested with both the old cache and a new cache built with this PR.
Master is OK.

RPCS3.log.gz
RSX Capture https://drive.google.com/file/d/13kuKqLya-NNh6fgxYQHMS29kYb9qeEFj/view?usp=sharing (when the RSX capture is opened, it doesn't show the black screen with UI; it shows what looks like the first in-game frame with the correctly captured UI)

@kd-11
Contributor Author

kd-11 commented Sep 27, 2020

@Kravickas Capture an OpenGL trace in RenderDoc with the shader compiler set to legacy (single threaded). Also remember to set the 'use opengl legacy buffers' debug option.

@Kravickas
Contributor

Weeell OGL works :/

@kd-11
Contributor Author

kd-11 commented Sep 27, 2020

You can capture Vulkan if you have AMD Polaris or NVIDIA Pascal, but if OpenGL works then it is likely not a problem with the shaders, as the code generator is shared.

@Kravickas
Contributor

I don't have either, so if no one finds any bugs, don't hold the merge because of that.
