[WIP] SPU performance optimizations #4108
Nice :)
It's promising, but very unstable for me. Most games randomly freeze, either without an error message or with "F {MFC Thread} MEM: Access violation reading location 0x0". That said, I saw big gains in several games, especially Ni no Kuni, which reaches full speed with OpenGL (before freezing completely like the others).
Same situation: most games have stopped working, but some still work.
Ni no Kuni is stuck at loading with Vulkan.
The gcc build failed on Ubuntu 16.04. (Line 503)
P5 now holds a locked 30 FPS at times!
@@ -121,13 +121,13 @@ void spu_recompiler::compile(spu_function_t& f)
compiler.alloc(vec_vars[5], asmjit::x86::xmm5);

// Initialize labels
std::vector<Label> pos_labels{ 0x10000 };
this->labels = pos_labels.data();
this->labels = std::unique_ptr<Label[]>(reinterpret_cast<Label*>(new char[0x10000 * sizeof(Label)]()));
use u8 instead of char
rpcs3/Emu/Cell/SPUAnalyser.cpp
Outdated
@@ -5,16 +5,18 @@

const spu_decoder<spu_itype> s_spu_itype;

spu_function_t* SPUDatabase::find(const be_t<u32>* data, u64 key, u32 max_size)
std::shared_ptr<spu_function_contents_t> SPUDatabase::find(const be_t<u32>* data, u64 key, u32 max_size, void * ignore)
void* ignore contains extra spacing
rpcs3/Emu/Cell/SPUAnalyser.cpp
Outdated
@@ -33,16 +35,22 @@ SPUDatabase::~SPUDatabase()
// TODO: serialize database
}

spu_function_t* SPUDatabase::analyse(const be_t<u32>* ls, u32 entry, u32 max_limit)
std::shared_ptr<spu_function_contents_t> SPUDatabase::analyse(const be_t<u32>* ls, u32 entry, void * ignore /*=nullptr*/)
same as above
On my end all games just crash. CPU: X5675 (Westmere).
rpcs3/Emu/Cell/SPUAnalyser.cpp
Outdated
@@ -352,7 +366,7 @@ spu_function_t* SPUDatabase::analyse(const be_t<u32>* ls, u32 entry, u32 max_lim
m_db.emplace(key, func);
}

LOG_NOTICE(SPU, "Function detected [0x%05x-0x%05x] (size=0x%x)", func->addr, func->addr + func->size, func->size);
LOG_FATAL(SPU, "Function detected [0x%05x-0x%05x] (size=0x%x)", func->addr, func->addr + func->size, func->size);
Are you sure this should be fatal? Also, LOG_* macros are deprecated, see Log.h
rpcs3/Emu/Cell/SPUAnalyser.h
Outdated

// For internal use
spu_function_t* find(const be_t<u32>* data, u64 key, u32 max_size);
std::shared_ptr<spu_function_contents_t> find(const be_t<u32>* data, u64 key, u32 max_size, void * ignore=nullptr);
spacing, void* ignore = nullptr
rpcs3/Emu/Cell/SPUThread.cpp
Outdated
waiter.stamp = rtime;
waiter.data = rdata.data();
waiter.init();
vm::waiter * waiter = new vm::waiter();
spacing
rpcs3/Emu/Cell/SPUThread.cpp
Outdated
@@ -1189,28 +1141,33 @@ bool SPUThread::get_ch_value(u32 ch, u32& out)
return true;
}

vm::waiter waiter;

vm::waiter * waiter = nullptr;
spacing
Utilities/mutex.cpp
Outdated
@@ -173,7 +173,7 @@ void shared_mutex::imp_lock(s64 _old)

for (int i = 0; i < 10; i++)
{
busy_wait();
if (i != 0) { busy_wait(); }
braces should be on newlines
Utilities/mutex.cpp
Outdated
@@ -14,7 +14,7 @@ void shared_mutex::imp_lock_shared(s64 _old)

for (int i = 0; i < 10; i++)
{
busy_wait();
if (i != 0) busy_wait();
add braces
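For reference, the two spin-loop changes above (skip the pause on the first iteration, with braces on their own lines as requested) could look roughly like the sketch below. Everything in it is an assumption made for illustration: `busy_wait_sketch` is a hypothetical stand-in for the project's `busy_wait()`, and `spin_try_lock` with its flag-based lock is not the actual `shared_mutex` code.

```cpp
#include <atomic>
#include <thread>

// Hypothetical stand-in for the project's busy_wait(); here it only yields.
inline void busy_wait_sketch()
{
	std::this_thread::yield();
}

// Tries to take a simple flag lock, making at most `spins` attempts.
// The pause is skipped on the first iteration, so the first attempt
// happens immediately and only later attempts are preceded by a wait.
inline bool spin_try_lock(std::atomic<bool>& locked, int spins = 10)
{
	for (int i = 0; i < spins; i++)
	{
		if (i != 0)
		{
			busy_wait_sketch();
		}

		bool expected = false;

		if (locked.compare_exchange_strong(expected, true, std::memory_order_acquire))
		{
			return true;
		}
	}

	// Caller would fall back to a slower blocking path here.
	return false;
}
```

The only behavioral point is the `i != 0` guard: an uncontended lock is acquired without paying for a pause first.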
Utilities/mutex.cpp
Outdated
@@ -237,6 +237,7 @@ void shared_mutex::imp_lock_degrade()
bool shared_mutex::try_lock_shared()
{
// Conditional decrement
if (m_value < c_min) return false; // Fast path
braces
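As an illustration of the fast path discussed above (a plain read that rejects the request before any atomic read-modify-write is attempted), here is a minimal sketch. The `shared_slots` type, its capacity of 2, and the meaning given to `c_min` are all assumptions; the real `shared_mutex` encodes its state differently.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical reader-slot counter; not the real shared_mutex layout.
struct shared_slots
{
	static constexpr std::int64_t c_min = 1; // assumed cost of one shared lock

	std::atomic<std::int64_t> value{2}; // assumed number of free reader slots

	bool try_lock_shared()
	{
		std::int64_t old = value.load(std::memory_order_relaxed);

		// Fast path: a plain load is enough to fail early
		if (old < c_min)
		{
			return false;
		}

		// Conditional decrement: only subtract while enough capacity remains
		while (old >= c_min)
		{
			if (value.compare_exchange_weak(old, old - c_min, std::memory_order_acquire, std::memory_order_relaxed))
			{
				return true;
			}
		}

		return false;
	}

	void unlock_shared()
	{
		value.fetch_add(c_min, std::memory_order_release);
	}
};
```

The early check is only a hint. The compare-exchange loop re-validates the value, so a stale read cannot push the counter below zero.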
Added CR fixes.
Found the cause of the crashes, will push a fix later today.
The gcc build did complete for me on Ubuntu as of 6b677b9.
@Kravickas This seems related to the known bug, I'd say wait for the fix before posting logs.
The P5 boost is quite obvious, nice PR. 1942: Joint Strike also no longer lags in-game.
Can confirm, 5-10+ FPS boost in P5 depending on the situation. Haven't been able to test other games yet due to it being very unstable.
Eternal Sonata: Transitions are slightly faster, new SPU Halt errors here and there.
Tales of Vesperia: Doesn't start on SPU Recompiler. Has MEM Access Violations on the PPU thread with Interpreter, before reaching the main menu.
Tales of Graces f: Similar issues to Vesperia. It doesn't reach the menu on Recompiler; it goes in-game on Interpreter but tends to freeze.
Tales of Xillia: Tends to throw a Fatal Error from the start on Recompiler, with no actual message (just a blank window), or just stops before the main menu. Goes in-game with Interpreter, but audio is very choppy.
Tales of Xillia 2: Hangs upon reaching the main menu on Recompiler. Gets stuck before the intro on Interpreter (though it seems like it's still running, it just doesn't do anything).
One thing I've noticed: things go wrong when "Branch-to-next with $LR" happens.
Drakengard 3 ==> Freeze on loading
This has worked really well for P5: I get almost double my FPS in some situations and a solid 30 in most areas now. I have one problem that has happened twice: the audio crashes during a load screen, and a few minutes later the game locks up. I attached the log of what happens.
You should probably wait before you start testing games, there's a bad breaking bug that's causing hangs. Some games don't even boot with this.
Deception IV: The Nightmare Princess: on Recompiler, it freezes after starting a new game or continuing, and music is missing.
Closing until I upload the fix, many of these reports are the same problem over and over. Thanks, everyone.
So, I didn't get to fix it today, so I implemented a quick solution that will degrade performance, just to find out whether there are other crashes in there.
Quick test with the temporary crash fix: Drakengard 3: Freeze on loading. Log shows:
"Branch-to-next with $LR" errors are completely irrelevant.
Well, I don't know, but these messages appear for me on this PR and not on master, so I'm mentioning them just in case. If they are irrelevant, then the games freeze without displaying any noticeable error message.
Workaround dirty AVX high state
Use new patterns for saturation instructions Avoid ZExt/SExt completely
Who knows without a linux VM. I sure don't. Only 1 way to find out quickly. Nobody reads these notes anyway ¯\_(ツ)_/¯
Until I find out what else triggers SPU invalidation
PR will be closed soon for some maintenance.
@Farseer2 reopen please!
How can I actually use this build? Sorry, I'm stupid and new to RPCS3.
Don't use this build. Master already surpasses it in terms of performance. Just enable debug mode and disable zcull.
WIP for stability reasons
This PR includes some optimizations, mostly to the SPU, which is the current bottleneck of the emulator:
So, now for the bad parts:
Thanks to whymsical, haico1992, digitaldude555 and Ani for helping me test this.