Need for Speed: Most Wanted - LLVM runs slower than ASMJIT #14752

Open
Ordinary205 opened this issue Oct 21, 2023 · 1 comment
Labels: CPU, Optimization (optimizes existing code)

Comments

@Ordinary205
Contributor

Quick summary

The game runs significantly slower with the LLVM recompiler than with ASMJIT.

Details

Here's the performance comparison between LLVM and ASMJIT.

LLVM recompiler: 9.2/10.8/11.8 FPS

[Screenshot: LLVM]

ASMJIT recompiler: 14.8/16.8/18.4 FPS

[Screenshot: ASMJIT]

I disabled the RSX tiled memory to show better performance results.
The strange thing is that LLVM is supposed to be faster than, or at least tied with, ASMJIT, which is why I thought it was necessary to report this.

Attach a log file

LLVM.log.gz

ASMJIT.log.gz

Attach capture files for visual issues

No response

System configuration

AMD Ryzen 5900X 12-Core Processor | 24 Threads | 15.89 GiB RAM | RTX 3080 driver 545.84.0.0 | Windows 10 Pro 22H2

Other details

No response

elad335 added the CPU and Optimization (optimizes existing code) labels on Oct 22, 2023
@Ordinary205
Contributor Author

Ordinary205 commented Dec 13, 2023

I've done some research by testing the other SPU options. I've noticed that Interpreter dynamic performs slightly better than ASMJIT, although enabling it causes a huge audio slowdown.

ASMJIT: 15.3/16.8/18.1 FPS

[Screenshot: ASMJIT]

Interpreter dynamic: 16.4/17.4/18.5 FPS

[Screenshot: SPU dynamic]

Interpreter static: 10.5/12.2/13.2 FPS

[Screenshot: SPU static]

I've ranked the SPU options from fastest to slowest:

  1. Interpreter dynamic (fastest)
  2. ASMJIT recompiler
  3. Interpreter static
  4. LLVM recompiler (slowest)
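
For anyone who wants to reproduce the ranking above from the numbers, here is a minimal sketch. It assumes each FPS triple reads as min/avg/max (the issue does not say so explicitly) and ranks by the middle value. The helper names `rank_by_middle` and `relative_to` are hypothetical, and the LLVM figures come from the original October report rather than the December test run, so the comparison across sessions is only approximate.

```python
# FPS triples copied from this issue.
# Reading them as (min, avg, max) is an assumption, not stated in the report.
results = {
    "Interpreter dynamic": (16.4, 17.4, 18.5),
    "ASMJIT recompiler": (15.3, 16.8, 18.1),
    "Interpreter static": (10.5, 12.2, 13.2),
    "LLVM recompiler": (9.2, 10.8, 11.8),  # from the October report
}


def rank_by_middle(results):
    """Return option names sorted from fastest to slowest by the middle value."""
    return sorted(results, key=lambda name: results[name][1], reverse=True)


def relative_to(results, baseline):
    """Return each option's middle value as a percent difference from the baseline."""
    base = results[baseline][1]
    return {
        name: (values[1] - base) / base * 100
        for name, values in results.items()
    }


ranking = rank_by_middle(results)
deltas = relative_to(results, "ASMJIT recompiler")

for name in ranking:
    print(f"{name}: {results[name][1]:.1f} FPS ({deltas[name]:+.1f}% vs ASMJIT)")
```

Under that reading, the LLVM recompiler lands roughly 36% below ASMJIT, while Interpreter dynamic is only a few percent above it.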

ASMJIT performance seems to regress when RSX tiled memory is enabled, while LLVM performance is unaffected.
