PPU LLVM/Interpreter: Accurate vector instruction NaNs #8148
rpcs3/Emu/Cell/PPUInterpreter.cpp
Outdated
@@ -1906,7 +1948,10 @@ bool ppu_interpreter::VRLW(ppu_thread& ppu, ppu_opcode_t op)
 
 bool ppu_interpreter::VRSQRTEFP(ppu_thread& ppu, ppu_opcode_t op)
 {
-	ppu.vr[op.vd].vf = _mm_div_ps(_mm_set_ps(1.0f, 1.0f, 1.0f, 1.0f), _mm_sqrt_ps(ppu.vr[op.vb].vf));
+	const auto a = _mm_set_ps(1.0f, 1.0f, 1.0f, 1.0f);
+	const auto b = _mm_sqrt_ps(ppu.vr[op.vb].vf);
The argument passed to vec_handle_nan needs to be the unmodified vb.
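To illustrate why the NaN fix-up wants the original vb rather than sqrt(vb), here is a minimal scalar sketch. It is not the PR's code: `handle_nan_sketch`, `vrsqrtefp_sketch` and `lanes_t` are hypothetical names, and the rule it applies (an input NaN is propagated with its quiet bit set, a NaN produced by the operation itself becomes the positive default QNaN 0x7FC00000) is an assumption about the target behaviour, not something stated in this thread.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Raw IEEE-754 single-precision bits, one entry per vector lane.
using lanes_t = std::array<std::uint32_t, 4>;

static float to_float(std::uint32_t bits) { float f; std::memcpy(&f, &bits, sizeof f); return f; }
static std::uint32_t to_bits(float f) { std::uint32_t u; std::memcpy(&u, &f, sizeof u); return u; }
static bool is_nan(std::uint32_t bits) { return (bits & 0x7FFFFFFFu) > 0x7F800000u; }

// Hypothetical helper: patches NaN lanes of `result` based on the ORIGINAL input `vb`.
lanes_t handle_nan_sketch(const lanes_t& result, const lanes_t& vb)
{
    lanes_t out{};
    for (std::size_t i = 0; i < out.size(); i++)
    {
        if (is_nan(vb[i]))
            out[i] = vb[i] | 0x00400000u; // assumed rule: keep the input NaN, set the quiet bit
        else if (is_nan(result[i]))
            out[i] = 0x7FC00000u;         // assumed rule: operation-generated NaN is the positive default QNaN
        else
            out[i] = result[i];
    }
    return out;
}

// Hypothetical scalar stand-in for the reciprocal square root estimate.
lanes_t vrsqrtefp_sketch(const lanes_t& vb)
{
    lanes_t result{};
    for (std::size_t i = 0; i < result.size(); i++)
        result[i] = to_bits(1.0f / std::sqrt(to_float(vb[i])));

    // Pass vb itself. Passing sqrt(vb) would make a negative input look like
    // an input NaN, and the host's NaN encoding would then be propagated.
    return handle_nan_sketch(result, vb);
}
```

With sqrt(vb) as the second argument, a lane holding -1.0f would be treated as if the guest had supplied a NaN, so the sign and payload of whatever NaN the host produced would leak into the result.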
rpcs3/Emu/Cell/PPUTranslator.h
Outdated
template <typename T>
auto vec_handle_nan(T&& expr)
{
	return VecHandleNan(expr.eval(m_ir));
maybe return a value_t<> instead of llvm::Value* for consistency.
i considered it but set_vr above this is the same style and does not return value_t, so the current way is consistent with that
I mean it's consistent with other functions in CPUTranslator.h. It's also odd to mix "old style" llvm::Value* functions with the "new style" value_t<> in the functions' interface.
set_vr also doesn't return anything so it's not a fair comparison.
ah yeah, does call eval but doesn't return, i'll change it
wait i'm not crazy it does return lol but ret type is void 🤔 just like SetVr. pretty weird & misleading
template <typename T>
void set_vr(u32 vr, T&& expr)
{
	return SetVr(vr, expr.eval(m_ir));
}
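For readers puzzled by the same thing: C++ allows `return f();` inside a function returning void as long as `f()` itself has type void, which is why the snippet above compiles even though nothing is returned. The mock below contrasts that with a wrapper that hands back a typed value, as suggested in the review. Every name in it (`fake_ir`, `fake_expr`, `typed_value`, `translator_sketch`, `SetVrRaw`, `HandleNanRaw`) is a hypothetical stand-in and does not come from RPCS3 or LLVM.

```cpp
#include <cassert>

struct fake_ir {};                          // stand-in for the IR builder
struct fake_expr                            // stand-in for an expression object with eval()
{
    int v;
    int eval(fake_ir&) const { return v; }
};
struct typed_value { int value; };          // stand-in for a typed wrapper such as value_t<>

struct translator_sketch
{
    fake_ir m_ir;
    int last_written = 0;

    void SetVrRaw(int v) { last_written = v; }  // "old style": returns nothing
    int HandleNanRaw(int v) { return v; }       // "old style": returns a raw value

    // Legal C++: the returned expression has type void, so the function yields nothing.
    template <typename T>
    void set_vr(T&& expr)
    {
        return SetVrRaw(expr.eval(m_ir));
    }

    // Typed alternative: wrap the raw result so callers stay in the typed interface.
    template <typename T>
    typed_value vec_handle_nan(T&& expr)
    {
        return typed_value{HandleNanRaw(expr.eval(m_ir))};
    }
};
```

The `return` in `set_vr` is harmless but reads as if a value were produced, so dropping the keyword would arguably be clearer.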
i'll change it
pushed
note for testers: clear ppu game cache and firmware cache.
I think it should be under an option.
Tested with https://github.com/RPCS3/ps3autotests/tree/master/tests/cpu/ppu_vpu, results in that test improved by about half.
Tested with https://github.com/RPCS3/ps3autotests/tree/master/tests/cpu/ppu_vpu. This commit gets us from 2746 to 353 different lines compared to realhw.
Pushed (not exposed to GUI).
Hmm, not pushed?
The option needs to affect ppu cache as well. (see accurate DFMA setting handling)
Turned off by default.
pushed
This doesn't seem to fix issue #5289, is the option enabled by default?
No
How do we set it? Or do we need to wait for a PR to expose it to the GUI?
You have to set
or use ppu interpreter.
Tested with https://github.com/RPCS3/ps3autotests/tree/master/tests/cpu/ppu_vpu.
Results in that test improved:
In the LLVM recompiler: by about half, ~11k different lines from realhw to ~6k.
In the precise interpreter: From 2746 to 353 different lines.
Needs testing to determine performance regressions and improved game compatibility.
Note for testers: Clear PPU game cache and firmware cache