Skip to content

PPU LLVM/Interpreter: Accurate vector instruction NaNs #8148

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
May 14, 2020

Conversation

VelocityRa
Copy link
Member

@VelocityRa VelocityRa commented May 3, 2020

Tested with https://github.com/RPCS3/ps3autotests/tree/master/tests/cpu/ppu_vpu.

Results in that test improved:
In the LLVM recompiler: by about half, ~11k different lines from realhw to ~6k.
In the precise interpreter: From 2746 to 353 different lines.

Needs testing to determine performance regressions and improved game compatibility.

Note for testers: Clear PPU game cache and firmware cache

@VelocityRa VelocityRa marked this pull request as draft May 3, 2020 03:15
@VelocityRa VelocityRa changed the title PPU LLVM: Fix vrsqrtefp NaNs PPU LLVM: Accurate vector instruction NaNs May 3, 2020
@VelocityRa VelocityRa changed the title PPU LLVM: Accurate vector instruction NaNs [NEEDS TESTING] PPU LLVM: Accurate vector instruction NaNs May 3, 2020
@VelocityRa VelocityRa marked this pull request as ready for review May 3, 2020 04:34
@VelocityRa VelocityRa changed the title [NEEDS TESTING] PPU LLVM: Accurate vector instruction NaNs [TESTERS NEEDED] PPU LLVM: Accurate vector instruction NaNs May 3, 2020
@VelocityRa VelocityRa force-pushed the vrsqrtefp branch 2 times, most recently from dd03d84 to dcf0584 Compare May 3, 2020 06:32
@VelocityRa VelocityRa marked this pull request as draft May 3, 2020 06:34
@VelocityRa VelocityRa force-pushed the vrsqrtefp branch 2 times, most recently from 0ad58f8 to 701f8d3 Compare May 3, 2020 07:27
@VelocityRa VelocityRa marked this pull request as ready for review May 3, 2020 07:34
@@ -1906,7 +1948,10 @@ bool ppu_interpreter::VRLW(ppu_thread& ppu, ppu_opcode_t op)

bool ppu_interpreter::VRSQRTEFP(ppu_thread& ppu, ppu_opcode_t op)
{
ppu.vr[op.vd].vf = _mm_div_ps(_mm_set_ps(1.0f, 1.0f, 1.0f, 1.0f), _mm_sqrt_ps(ppu.vr[op.vb].vf));
const auto a = _mm_set_ps(1.0f, 1.0f, 1.0f, 1.0f);
const auto b = _mm_sqrt_ps(ppu.vr[op.vb].vf);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the argument passed to vec_handle_nan needs to be unmodified vb.

template <typename T>
auto vec_handle_nan(T&& expr)
{
return VecHandleNan(expr.eval(m_ir));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe return a value_t<> instead of llvm::Value* for consistency.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i considered it but set_vr above this is the same style and does not return value_t, so the current way is consistent with that

Copy link
Contributor

@knight4u32 knight4u32 May 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean its consistent with other functions in CPUTranslator.h
It's also odd to mix "old style" llvm::Value* functions with the "new style" value_t<> in the functions' interface.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

set_vr also doesn't return anything so it's not a fair comparison.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah yeah, does call eval but doesn't return, i'll change it

Copy link
Member Author

@VelocityRa VelocityRa May 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait i'm not crazy it does return lol
but ret type is void 🤔 just like SetVr. pretty weird & misleading

	template <typename T>
	void set_vr(u32 vr, T&& expr)
	{
		return SetVr(vr, expr.eval(m_ir));
	}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'll change it

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pushed

@Fneufneu

This comment has been minimized.

@knight4u32
Copy link
Contributor

note for testers: clear ppu game cache and firmware cache.

@Yahfz

This comment has been minimized.

@Fneufneu

This comment has been minimized.

@VelocityRa VelocityRa changed the title [TESTERS NEEDED] PPU LLVM: Accurate vector instruction NaNs PPU LLVM: Accurate vector instruction NaNs May 4, 2020
@VelocityRa VelocityRa changed the title PPU LLVM: Accurate vector instruction NaNs PPU LLVM/Interpreter: Accurate vector instruction NaNs May 4, 2020
@AniLeo AniLeo requested a review from a user May 5, 2020 22:38
@ghost
Copy link

ghost commented May 13, 2020

I think it should be under an option.

@VelocityRa
Copy link
Member Author

Pushed (not exposed to GUI).

@ghost
Copy link

ghost commented May 13, 2020

Hmm, not pushed?

@knight4u32
Copy link
Contributor

knight4u32 commented May 13, 2020

The option needs to affect ppu cache as well. (see accurate DFMA setting handling)

@VelocityRa
Copy link
Member Author

pushed

@AniLeo AniLeo merged commit b1fb5b6 into RPCS3:master May 14, 2020
@rafael-57
Copy link

This doesn't seem to fix issue #5289, is the option enabled by default?

@AniLeo
Copy link
Member

AniLeo commented May 14, 2020

No

@rafael-57
Copy link

How do we set it? Or do we need to wait for a PR to expose it to the GUI?

No

@RainbowCookie32
Copy link
Contributor

You have to set PPU LLVM Accurate Vector NaN values to true in config.yml until it's exposed to the GUI. You might also have to clear PPU Cache as well, not sure about that

@knight4u32
Copy link
Contributor

or use ppu interpreter.

@rafael-57
Copy link

rafael-57 commented May 14, 2020

I haven't been able to set the option in the config .yml for some reason, but this does fix #5289 when using ppu interpreter

Edit: Even letting the emulator create a new config.yml file and setting the option to true doesn't fix #5289

illusion0001 added a commit to illusion0001/rpcs3 that referenced this pull request May 14, 2020
illusion0001 added a commit to illusion0001/rpcs3 that referenced this pull request May 14, 2020
illusion0001 added a commit to illusion0001/rpcs3 that referenced this pull request May 14, 2020
ghost pushed a commit that referenced this pull request May 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants