Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NetPlay/Jit64: Avoid using software FMA #9804

Merged
merged 1 commit into from Jun 13, 2021

Conversation

JosJuice
Copy link
Member

@JosJuice JosJuice commented Jun 9, 2021

When I added the software FMA path in 2c38d64 and made us use it when determinism is enabled, I was assuming that either the performance impact of software FMA wouldn't be too large or CPUs that were too old to have FMA instructions were too slow to run Dolphin well anyway. This was wrong. To give an example, the netplay performance went from 60 FPS to 30 FPS in one case.

This change makes netplay clients negotiate whether FMA should be used. If all clients use an x64 CPU that supports FMA, or AArch64, then FMA is enabled, and otherwise FMA is disabled. In other words, we sacrifice accuracy if needed to avoid massive slowdown, but not otherwise. When not using netplay, whether to enable FMA is simply based on whether the host CPU supports it.

The only remaining case where the software FMA path gets used under normal circumstances is when an input recording is created on a CPU with FMA support and then played back on a CPU without. This is not an especially common scenario (though it can happen), and TASers are generally less picky about performance and more picky about accuracy than other users anyway.

With this change, FMA desyncs are avoided between AArch64 and modern x64 CPUs (unlike before 2c38d64), but we do get FMA desyncs between AArch64 and old x64 CPUs (like before 2c38d64). This desync can be avoided by adding a non-FMA path to JitArm64 as an option, which I will wait with for another pull request so that we can get the performance regression fixed as quickly as possible.

https://bugs.dolphin-emu.org/issues/12542

@JosJuice JosJuice force-pushed the revert-fma branch 2 times, most recently from 09883ec to 23786a3 Compare June 9, 2021 19:29
Source/Core/Core/Config/SessionSettings.cpp Outdated Show resolved Hide resolved
Source/Core/Core/NetPlayClient.cpp Outdated Show resolved Hide resolved
When I added the software FMA path in 2c38d64 and made us use
it when determinism is enabled, I was assuming that either the
performance impact of software FMA wouldn't be too large or CPUs
that were too old to have FMA instructions were too slow to run
Dolphin well anyway. This was wrong. To give an example, the
netplay performance went from 60 FPS to 30 FPS in one case.

This change makes netplay clients negotiate whether FMA should
be used. If all clients use an x64 CPU that supports FMA, or
AArch64, then FMA is enabled, and otherwise FMA is disabled.
In other words, we sacrifice accuracy if needed to avoid massive
slowdown, but not otherwise. When not using netplay, whether to
enable FMA is simply based on whether the host CPU supports it.

The only remaining case where the software FMA path gets used
under normal circumstances is when an input recording is created
on a CPU with FMA support and then played back on a CPU without.
This is not an especially common scenario (though it can happen),
and TASers are generally less picky about performance and more
picky about accuracy than other users anyway.

With this change, FMA desyncs are avoided between AArch64 and
modern x64 CPUs (unlike before 2c38d64), but we do get FMA
desyncs between AArch64 and old x64 CPUs (like before 2c38d64).
This desync can be avoided by adding a non-FMA path to JitArm64 as
an option, which I will wait with for another pull request so that
we can get the performance regression fixed as quickly as possible.

https://bugs.dolphin-emu.org/issues/12542
@@ -272,7 +275,7 @@ void INIGameConfigLayerLoader::Save(Config::Layer* layer)
const Config::Location& location = config.first;
const std::optional<std::string>& value = config.second;

if (!IsSettingSaveable(location))
if (!IsSettingSaveable(location) || location.system == Config::System::Session)
Copy link
Contributor

@Techjar Techjar Jun 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we just modify IsSettingSaveable to return false for location.system == Config::System::Session instead? Maybe it already does that, which would make this check redundant I think.

Copy link
Member Author

@JosJuice JosJuice Jun 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IsSettingSaveable marks settings that are still handled by the old config system, and based on code comments, it seems like we intend to drop IsSettingSaveable when the old config system is dead. The Session system is already handled by the new config system, and we want to keep not saving it in the future when the old config system is gone.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yeah okay. I guess Session is where anything that we don't want to be persistent should go. I'm gonna make a follow-up PR that moves a few of such settings there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It still feels a bit weird that Config::System::Session shows up scattered around everywhere. Isn't this what the layered configuration system was designed to solve in first place? Have a layer that is session-only, doesn't save, and has that single setting?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The layered config system supports session-only values, but not session-only settings. I want to make it impossible for this setting to get loaded from Dolphin.ini so that nobody overrides the default by accident.

Copy link
Contributor

@Techjar Techjar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't tested this yet, but code looks good. I don't have pre-Haswell hardware in service to properly test anyways, just decommissioned my Sandy Bridge hardware last month.

@JosJuice
Copy link
Member Author

I asked people in the netplay community that I got the performance regression report from to test it, and they said things work with good performance and without desyncs regardless of whether the players' CPUs support FMA. I've also verified on my own that when you run netplay with only CPUs that support FMA, you do take the hardware FMA code path.

@leoetlino leoetlino merged commit 4e3e3bf into dolphin-emu:master Jun 13, 2021
11 checks passed
@JosJuice JosJuice deleted the revert-fma branch June 13, 2021 11:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
4 participants