Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance Overlay #4620

Merged
merged 6 commits into from May 30, 2018
Merged

Performance Overlay #4620

merged 6 commits into from May 30, 2018

Conversation

@VelocityRa
Copy link
Contributor

VelocityRa commented May 20, 2018

Summary

This is meant to be a user-oriented overlay for getting real-time performance metrics.

See discussion in #4500 if you want to see how the design progressed.

Features

  • Other than FPS, frametime and RSX Load, it includes new stats:
    • How much the PPU, SPU or RSX emulation takes a toll on your CPU.
    • How many thread there are, for each of the above.
    • Total (process) CPU usage.
  • It makes use of kd's native overlay system so it can look much better than the debug overlay.
  • I've made 4 "presets", each one with with different levels of detail.
  • The following are configurable via the config file as of now:
    • Detail level (preset)
    • Update interval of the metrics
    • Size (dictated by font size)

These will soon(tm) be configurable via GUI (in another PR), for now they're config options.
Edit: Done by Megamouse here: #4673

Currently it uses a font from the firmware so it'll be present everywhere.
Initially I used this downloaded one, but I'm not sure if it's worth to include it with RPCS3.

Screenshots

Note: They were taken when opening the PR, so they're bit outdated
In the most detail possible (number in parentheses in Host Utilization is thread count):

image

Fullscreen

image

image

Fullscreen

image


Here's all the detail levels:

1. Minimal

image

2. Low

image

3. Medium

image

4. High

image

Future work

  • More stats:
    • PPU/SPU guest load (CPU usage currently shown is probably more useful for users)
    • CPU/GPU model
    • Graphs(?)
  • Other possible configurable settings:
    • Text/title/background colors and opacity
    • Overlay position on the screen
    • Font

Closing words

As the title suggest, this is a WIP, getting some stats in Linux doesn't work yet and I haven't made proper commits (also it's enabled by default, for easier testing).

That's all, please post feedback!

@theoldsport

This comment has been minimized.

Copy link

theoldsport commented May 20, 2018

I figured I'd try to push things to the max, and killed your overlay. Removing the framerate cap on Atelier Totori seemed to do it
RPCS3.log.gz

Upon further inspection, it may be vsync that's killing it

@VelocityRa VelocityRa force-pushed the VelocityRa:perf-overlay branch from 67f662e to cbe945b May 20, 2018
@kd-11 kd-11 mentioned this pull request May 20, 2018
@kd-11

This comment has been minimized.

Copy link
Contributor

kd-11 commented May 20, 2018

Maybe add a half-space between Host and Guest metrics? kinda triggers me as is lol.

rawspus = idm::select<RawSPUThread>([&spu_cycles](u32, RawSPUThread& rawspu) { spu_cycles += rawspu.get()->get_cycles(); });

if (!rsx_thread)
rsx_thread = fxm::get<GSRender>();

This comment has been minimized.

Copy link
@TGEnigma

TGEnigma May 20, 2018

Contributor

use fxm::get_always instead of checking

This comment has been minimized.

Copy link
@kd-11

kd-11 May 20, 2018

Contributor

Actually never forcefully create rsx::thread which is the base for GSRender. Doesn't matter anyway since a better way will exist soon (via rsx::get_current_renderer()). If an instance of rsx does not exist (which is unlikely since update is called from rsx->flip), just skip getting its cycles count since obviously its 0.

This comment has been minimized.

Copy link
@VelocityRa

VelocityRa May 20, 2018

Author Contributor

The reason I check that here is because rsx_thread might be already present from the switch fallthrough above.

The point of the fallthrough is to avoid code duplication, but it does cause a few things like this. Hopefully not too bad to the point of it being unmaintainable though.

@@ -147,6 +147,15 @@ enum class frame_limit_type
_auto,
};

enum class perf_overlay_detail_level
{
minimal_1, // fps only

This comment has been minimized.

Copy link
@TGEnigma

TGEnigma May 20, 2018

Contributor

why add number suffix to the enum members

This comment has been minimized.

Copy link
@VelocityRa

VelocityRa May 20, 2018

Author Contributor

Because the names could be considered subjective, it's clearer this way.
I don't feel strongly about this, should I remove it? In the gui it will be obvious by the order they're presented anyway.


#include <time.h>

template <>
inline void fmt_class_string<perf_overlay_detail_level>::format(std::string& out, u64 arg)

This comment has been minimized.

Copy link
@kd-11

kd-11 May 20, 2018

Contributor

This function does not belong here imo. If perf_overlay_detail_level (which I feel should be a generic level-of-detail enum for any future options with incrementing levels named something like option_detail_level or something) is part of the core System, then it belongs in System.cpp. Keep in mind that RSX/Overlays is a completely optional part of rpcs3 afterall, like RSX/Capture and the three renderer backends with visual output. Non-core code is as much as possible kept out of the core emulator code, although sometimes it is difficult to do so cleanly.

This comment has been minimized.

Copy link
@VelocityRa

VelocityRa May 20, 2018

Author Contributor

I know, originally I had the enum in the perf_metrics_overlay struct itself.

I had to remove and rename it because of a cyclic dependency with System which I couldn't resolve any other way (System.h needed it for configuration, but overlays.h itself included overlay_controls.h which included System.h).

Maybe fixed with refactoring PR?

This comment has been minimized.

Copy link
@kd-11

kd-11 May 20, 2018

Contributor

Idk how it can be resolved easily either without using a static_cast and using a raw cfg::_int as the node base type. But I also mean this fmt method does not belong here if the enum is defined in the core. It would belong in System.cpp.
As for the enum, my idea was to promote it to core, but rename it from perf_xxx to a generic level of detail enum. That way it would make more sense if more options could be added which also use the same enum.

@kd-11

This comment has been minimized.

Copy link
Contributor

kd-11 commented May 20, 2018

I feel there is no benefit in separating high and ultra here - the RSX load metrics are 'free' to grab and the addition is actually very important if evaluating performance. Combine High and Ultra into one option with the full suite of metrics.

bool font_found = false;
for (auto& font_dir : font_dirs)
{
std::string requested_file = font_dir + ttf_name;

This comment has been minimized.

Copy link
@kd-11

kd-11 May 20, 2018

Contributor

If its necessary to append .ttf anyway there is no need for the option to supply the font name with the ttf. The caller should make sure of that otherwise this can hide bad code.

This comment has been minimized.

Copy link
@VelocityRa

VelocityRa May 20, 2018

Author Contributor

The reason for supporting both with and without an extension was to support cases where the extensions is capitalized (.TTF) as is the case with fonts in the firmware (since I wanted to make these available too, instead of just having one of them hardcoded as a fallback).

I didn't want to touch existing code to force the caller to include the extension always. Is this what you're suggesting I do?

This comment has been minimized.

Copy link
@kd-11

kd-11 May 20, 2018

Contributor

That will work only if you do element.set_font("FULL_NAME.TTF", size). Thats strange to include the extension when loading the font by name but I'll allow it, mostly because I do not want to actually rewrite the font search code to properly handle corner cases lol. Really should use directory traversal on linux in case the file name does not match exactly. I already had this problem on my system vs an arch user getting different results because of it.

This comment has been minimized.

Copy link
@VelocityRa

VelocityRa May 20, 2018

Author Contributor

Oh... I didn't know it was already possible to set the name exactly.

but I'll allow it

👍

This comment has been minimized.

Copy link
@kd-11

kd-11 May 20, 2018

Contributor

I mean with your solution it will become possible to pass full name. My method always appends .ttf in lower case which will break exact font names. So you can currently set full name but it will fail as you get double extension.

This comment has been minimized.

Copy link
@VelocityRa

VelocityRa May 20, 2018

Author Contributor

Yup, that's why I did it.

namespace rsx
{
namespace overlays
{
//Singleton instance declaration
// Singleton instance declaration

This comment has been minimized.

Copy link
@kd-11

kd-11 May 20, 2018

Contributor

Why? Also happens below

This comment has been minimized.

Copy link
@Megamouse

Megamouse May 20, 2018

Contributor

it happens when you use clang format in VS.

//Comment
becomes
//->Comment

// Comment
is fine though

This comment has been minimized.

Copy link
@VelocityRa

VelocityRa May 20, 2018

Author Contributor

yea I've run clang-format on everything
if formatting is questionable in places, it's not my fault :^

rsx_thread = fxm::get<GSRender>();
rsx_load = rsx_thread->get_load();

/* fallthrough */

This comment has been minimized.

Copy link
@kd-11

kd-11 May 20, 2018

Contributor

Use c++ comments //

@theoldsport

This comment has been minimized.

Copy link

theoldsport commented May 20, 2018

One other nitpick is when the "Compiling Shaders" overlay shows up, it removes the background, fps indicator, and some color from the performance overlay
aamq7bs

@kd-11

This comment has been minimized.

Copy link
Contributor

kd-11 commented May 20, 2018

That issue is fixed with #4623 - it was a design flaw in the core. After #4623 is merged some parts of this PR will be simplified and/or removed which is why I'm not really reviewing object management or apparent visual glitching right now.


if (g_cfg.video.perf_overlay.perf_overlay_enabled)
{
m_perf_overlay = std::make_unique<rsx::overlays::perf_metrics_overlay>(false);

This comment has been minimized.

Copy link
@kd-11

kd-11 May 20, 2018

Contributor

With the refactor now merged, you no longer need to worry about managing objects so m_perf_overlay variable should not be needed anymore. Also this code now belongs in rsx::thread::on_task in the block with the make_always<rsx::overlays::display_manager>. Its drawing is also handled automatically so no need for any code in the renderers at all, just set and forget.

This comment has been minimized.

Copy link
@kd-11

kd-11 May 20, 2018

Contributor

Alternatively, move this to rsx_utils to avoid duplication, where I eventually plan on moving all remaining traces of overlays to remove them from the core files completely.

@VelocityRa VelocityRa force-pushed the VelocityRa:perf-overlay branch from cbe945b to 6319f4f May 20, 2018
Copy link
Contributor

kd-11 left a comment

Only a few nits left, looks good otherwise.

@@ -458,3 +479,19 @@ struct cfg_root : cfg::node
extern cfg_root g_cfg;

extern bool g_use_rtm;

template <>
inline void fmt_class_string<detail_level>::format(std::string& out, u64 arg)

This comment has been minimized.

Copy link
@kd-11

kd-11 May 21, 2018

Contributor

Why not have this in system.cpp with all the other enum formatting methods? Its easier to find stuff like this if its all kept in one file

if (g_cfg.video.perf_overlay.perf_overlay_enabled)
{
auto perf_overlay = std::make_unique<rsx::overlays::perf_metrics_overlay>(false);
perf_overlay->set_detail_level(g_cfg.video.perf_overlay.level);

This comment has been minimized.

Copy link
@kd-11

kd-11 May 21, 2018

Contributor

You can also just have auto perf_overlay = m_overlay_manager->create<rsx::overlays::perf_metric_overlay>(false) to avoid the add later on. Its just a nit though as this is functionally equivalent.

This comment has been minimized.

Copy link
@VelocityRa

VelocityRa May 21, 2018

Author Contributor

Yeah but I needed to call all these setters afterwards, seems better this way than calling .get() or whatever on the display manager.

@@ -6,12 +6,12 @@ namespace rsx
{
namespace overlays
{
//Singleton instance declaration
// Singleton instance declaration

This comment has been minimized.

Copy link
@kd-11

kd-11 May 21, 2018

Contributor

This is just crazy looking. You can disable clang-format unless you use manual formatting to fix this kind of thing.


// 1. Fetch/calculate metrics we'll need
switch (m_detail)
{

This comment has been minimized.

Copy link
@kd-11

kd-11 May 21, 2018

Contributor

With this switch block and the one after it, it might be easier on the eyes if the cases were braced, because they have a lot of content spanning many lines in some of the blocks.

@VelocityRa VelocityRa force-pushed the VelocityRa:perf-overlay branch 2 times, most recently from 90f053b to e16078f May 21, 2018
@@ -2196,6 +2206,7 @@ namespace rsx

void thread::flip(int buffer)
{
this->get()->get_cycles();

This comment has been minimized.

Copy link
@kd-11

kd-11 May 24, 2018

Contributor

Why is this being called every frame? That's sure to incur an overhead

This comment has been minimized.

Copy link
@VelocityRa

VelocityRa May 24, 2018

Author Contributor

Accidentally pushed debug code, sorry!

Copy link
Contributor

jbeich left a comment

Can you apply VelocityRa#1?

while (fgets(line, 128, file) != NULL)
{
if (strncmp(line, "processor", 9) == 0)
m_num_processors++;

This comment has been minimized.

Copy link
@jbeich

jbeich May 24, 2018

Contributor

Why not m_num_processors = sysconf(_SC_NPROCESSORS_ONLN) ? /proc/cpuinfo may not be available/mounted on other Unix systems.

#include "string.h"
#include "sys/times.h"
#include "sys/types.h"
#include "sys/vtimes.h"

This comment has been minimized.

Copy link
@jbeich

jbeich May 24, 2018

Contributor

Is this GNU libc specific include actually used?

In file included from rpcs3/Emu/Cell/Modules/cellGcmSys.cpp:8:
In file included from rpcs3/Emu/RSX/GSRender.h:3:
In file included from rpcs3/Emu/RSX/RSXThread.h:15:
In file included from rpcs3/Emu/RSX/Overlays/overlays.h:14:
Utilities/CPUStats.h:15:10: fatal error: 'sys/vtimes.h' file not found
#include "sys/vtimes.h"
         ^~~~~~~~~~~~~~

This comment has been minimized.

Copy link
@VelocityRa

VelocityRa May 24, 2018

Author Contributor

It's not, apparently.

}
CloseHandle(snapshot);
return ret ? entry.cntThreads : -1;
#else

This comment has been minimized.

Copy link
@jbeich

jbeich May 24, 2018

Contributor

/proc/self/task is not supported on FreeBSD and probably macOS. Maybe use #elif defined(__linux__) here and leave #else for a fallback.

@VelocityRa

This comment has been minimized.

Copy link
Contributor Author

VelocityRa commented May 24, 2018

@jbeich Sure, thanks for this, I'll merge after I've made proper commits.
Linux still doesn't display correct stats unfortunately (does FreeBSD, with your PR?), I tried to look into it in my dualboot but I can't even run RPCS3 because of unrelated issues.

So unless someone can look into it, or I explicitly remove linux support, I don't want to make proper commits and ask for a merge.

@jbeich

This comment has been minimized.

Copy link
Contributor

jbeich commented May 24, 2018

Linux still doesn't display correct stats unfortunately (does FreeBSD, with your PR?),

No. Stats are broken on FreeBSD as well. I've only tested OpenGL.
rpcs3 perf-overlay
(Scogger HD title screen)

@VelocityRa

This comment has been minimized.

Copy link
Contributor Author

VelocityRa commented May 24, 2018

Ok, then there may be a logic error because on linux is the same thing, I'll try to debug in a linux VM with software rendering I set up.

@VelocityRa VelocityRa force-pushed the VelocityRa:perf-overlay branch from e16078f to aa05c30 May 24, 2018
@VelocityRa

This comment has been minimized.

Copy link
Contributor Author

VelocityRa commented May 24, 2018

Fixed and made commits. Tested and working on Windows and Linux. FreeBSD should be fine too if @jbeich's changes work as expected.

Ready for another review.

@VelocityRa VelocityRa changed the title Performance Overlay [WIP] Performance Overlay May 24, 2018
@VelocityRa VelocityRa force-pushed the VelocityRa:perf-overlay branch from aa05c30 to 964b393 May 24, 2018
@jbeich

This comment has been minimized.

Copy link
Contributor

jbeich commented May 24, 2018

@VelocityRa, I confirm, perf stats look fine even on FreeBSD now.

ret = Process32Next(snapshot, &entry);
}
CloseHandle(snapshot);
return ret ? entry.cntThreads : -1;

This comment has been minimized.

Copy link
@jbeich

jbeich May 24, 2018

Contributor

Any reason Windows uses -1 (aka 4294967295) as fallback compared to 0 on Linux?

This comment has been minimized.

Copy link
@VelocityRa

VelocityRa May 25, 2018

Author Contributor

True, fixed.

@VelocityRa VelocityRa force-pushed the VelocityRa:perf-overlay branch from 964b393 to 3079016 May 25, 2018
@greentop

This comment has been minimized.

Copy link

greentop commented May 25, 2018

Ubuntu 16.04 / Nvidia 396.24 / i7 3770

Game freezes (1) while loading the shaders. Deleting the shaders gets you further, but the game will freeze (2) once you reach the Load Save Native Overlay screen - and also the Performance Overlay layer is not seen on top of the Load Save screen. The Performance Overlay layer background flickers on/off when the Compiling Shaders message appears. Performance Overlay layer is displaying resource data within all fields.

  1. F {rsx::thread} St13runtime_error thrown: Verification failed (e=17): (in file /home/green/rpcs3/rpcs3/Emu/RSX/VK/VKOverlays.h:213)

  2. F {rsx::thread} St13runtime_error thrown: Verification failed (e=2): (in file /home/green/rpcs3/rpcs3/Emu/RSX/VK/VKOverlays.h:213)

screenshot from 2018-05-25 05-28-37

RPCS3.log.gz

@kd-11

This comment has been minimized.

Copy link
Contributor

kd-11 commented May 25, 2018

There are a few overlay bugs but they are unrelated to this PR. I'll submit a fix for them very soon (they are already up on my fork)

@VelocityRa

This comment has been minimized.

Copy link
Contributor Author

VelocityRa commented May 25, 2018

I forgot to disable this by default.
Speaking of, should I add a temporary checkbox just for enabling/disabling it in the "System" tab (or another one) before @Megamouse re-organizes those? Else a user would have to force a re-save of the config and then edit it to enable the overlay.

@@ -392,6 +400,19 @@ struct cfg_root : cfg::node

} vk{this};

struct node_perf_overlay : cfg::node
{
using detai_level = detail_level;

This comment has been minimized.

Copy link
@kd-11

kd-11 May 29, 2018

Contributor

this is wierd

This comment has been minimized.

Copy link
@VelocityRa

VelocityRa May 29, 2018

Author Contributor

🤦‍♂️

template <>
inline void fmt_class_string<detail_level>::format(std::string& out, u64 arg)
{
format_enum(out, arg, [](detail_level value) {

This comment has been minimized.

Copy link
@kd-11

kd-11 May 29, 2018

Contributor

brace on newline

@VelocityRa VelocityRa force-pushed the VelocityRa:perf-overlay branch from 3079016 to d49f175 May 29, 2018
@Megamouse

This comment has been minimized.

Copy link
Contributor

Megamouse commented May 29, 2018

can be updated after #4673

VelocityRa and others added 6 commits May 24, 2018
Also add a getter for the native thread handle.
* Also fix a few warnings in overlay_controls
* Support for using arbitrary firmware fonts
* Support for specifying font extension in `font` constructor (useful for most firmware fonts that use .TTF)
@VelocityRa VelocityRa force-pushed the VelocityRa:perf-overlay branch from d49f175 to c24feee May 29, 2018
@VelocityRa

This comment has been minimized.

Copy link
Contributor Author

VelocityRa commented May 29, 2018

Thanks Mega, as said on Discord I think I'm gonna do GUI stuff at a later separate PR, unless there's any objections.
I'd rather get this one out of the way as I won't be able to work on it for a few days.


For everyone that wants to try this before the GUI:

  1. Forcefully re-save config.yml by changing any option and saving, or making a new one entirely.
  2. Open config.yml, scroll to the Video settings at the bottom and enable it like so
    image

Edit: There's a GUI now, so no need for this ^

@Megamouse

This comment has been minimized.

Copy link
Contributor

Megamouse commented May 30, 2018

I can just add it to my PR if you want.
If KD-11 merges this today I can just push a commit today

@kd-11 kd-11 merged commit 3d632a3 into RPCS3:master May 30, 2018
2 checks passed
2 checks passed
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@VelocityRa VelocityRa deleted the VelocityRa:perf-overlay branch Jun 27, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
8 participants
You can’t perform that action at this time.