New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: Framerate display window #6822

Merged
merged 49 commits into from Jul 19, 2018

Conversation

Projects
None yet
7 participants
@nielsmh
Contributor

nielsmh commented Jun 9, 2018

Frame rate and various game loop/graphics timing measurements and graphs. Accessible via the Help menu, and can print some stats in the console via the fps command.

image

There should be hooks into all video output drivers, but platforms should be tested that the video output measurement is actually meaningful, and that the timer used for measurements is sufficiently accurate.

Tested working on:

  • Windows (Win32)
  • Windows (SDL)
  • Mac
  • Linux (SDL)
  • DOS (Allegro)
  • OS/2

Original first image:
image

@nielsmh

This comment has been minimized.

Contributor

nielsmh commented Jun 9, 2018

Inspiration for this was reading #6677, having a framerate display that also measures individual components of game speed should make it easier to figure out "why is it not fast".

@stavrossk

This comment has been minimized.

stavrossk commented Jun 9, 2018

@nielsmh This is awesome, thank you!!!

@nielsmh

This comment has been minimized.

Contributor

nielsmh commented Jun 10, 2018

Time for bikeshedding: Is this preferable to the style shown in the first screenshot?

image

@andythenorth

This comment has been minimized.

Contributor

andythenorth commented Jun 12, 2018

As far as I can tell, compiles and works as expected on Mac. I can't verify timing accuracy :)

@PeterN

This comment has been minimized.

Member

PeterN commented Jun 17, 2018

  • Game loop shows frames/s even when paused.
  • Video output shows red when running at game loop rate, and is affected hugely by mouse cursor movement.
  • In fast-forward mode, frames/s switches between 500, 1000 and 9999.99 -- granularity issue?
@nielsmh

This comment has been minimized.

Contributor

nielsmh commented Jun 17, 2018

  1. I'll have to test more with pause mode, I haven't actually checked what happens then.
  2. My assumption has been that you want video output to run at 60 fps, simply because of mouse cursor/UI interaction being better then. Mouse movement affects window update rate on Windows since it causes messages to enter the message loop more frequently than it would otherwise process screen updates.
  3. On Windows, the timing is obtained with GetTickCount which doesn't have good granularity, so multiple calls within a 20 ms window (or there about) will often return the same value. Possibly a multimedia timer timestep increase and measurement via timeGetTime instead could give better accuracy.
@PeterN

This comment has been minimized.

Member

PeterN commented Jun 17, 2018

You appear to be timing how long a frame takes and storing lots of these values, to then calculate an average based on a number of date points, instead of time. This means the average takes quite a while to settle on a value.

You should probably just count how many frames happen within 1 second. This will reduce timing errors, require less CPU effort to calculate, and means the average value is always just from the last second.

@PeterN

This comment has been minimized.

Member

PeterN commented Jun 17, 2018

Hmm, of course, it is probably necessary to do it your way to be able to break it down into sections.

If (on display) you pick the number of data points that correspond to about 1 second that will fix the settle-time issue. Limiting the refresh rate of the fps window will reduce fluctuation too.
I guess you need to use high-resolution timers to get further accuracy.

@nielsmh

This comment has been minimized.

Contributor

nielsmh commented Jun 17, 2018

An alternative approach could be to bucketize time, e.g. have 11 buckets of 100 ms intervals, and count number of frames in each interval in a bucket. So you have bucket = (ms_timestamp / 100) % 11, have one bucket that's "active", and calculate FPS from the sum of the remaining 10. Whenever the timestamp rolls over to the next 100 ms interval, the new bucket for that interval is then reset to 0 frames counted.

@TrueBrain

This comment has been minimized.

Member

TrueBrain commented Jun 17, 2018

I would really like such a feature. And I like the first window more than the second; but that is more because the second confuses me.

Lately in a lot of games you see two values: FPS and UPS. FPS being how often your screen is redrawn; so how many times changes appear on your screen in a single second. UPS being how often the game updates it states (updates per second). So our gameloop basically.

In modern games they almost always run in two different threads, making measuring a lot easier ofc :)

What I do not really get in your second screenshot, is video output. It feels off that you have both a screen redrawing and video output. Why are they different? Seems like a waste of CPU/GPU power if one is on a higher frequency than the other.

What would be really cool to add (possibly in another PR), is a history. Only knowing the current FPS means you constantly have to look at the number. Having a history gives you more time to react on jumps etc.

Lastly, what would be really really cool, is if in for example the UPS (GameLoop) you can see how much ms each step takes (profiling on a high-level). So you see how many ms vehicle-loop took, etc. Same for FPS. But this might not be for this PR :D

Measuring you see in a wide variety. You have those that check every second how many frames, and that number bounces up and down like a mofo. I never liked those values, as they don't really mean anything. That only works if you show them over time. This is the addition you often see on this. So instead of showing the FPS of that second, you also see the last 30 seconds or so. This allows you to see spikes, etc. Still, per second is a very unreliable way of looking at those numbers. (very spiky)

The other way around is also possible, where you check every N frames how many seconds it took. And than again with the history, etc. This is what you see most in simple games, as the implementation is cheapest. The downside is if your framerate drops to 0, it takes for ever for the FPS to update. But the update to see the FPS needs a new frame, so you might wonder if it matters :)

The bucket system you suggest is also a very solid one. Calculating which bucket is cheap, and it gives a sliding window (at least, I assumed you meant that). The upside is that the FPS is a lot more stable (fluctuations take a while to show up). Downside is that putting the resulting FPS in a history is non-sensible. But this can be Incorporated with the first suggestion for history, if you make the buckets for, say, 30 seconds (with a resolution of 100ms sounds fine). For the active FPS you can take a sliding window of 10 buckets or so, and show with 100ms resolution the measured FPS in a history. This only is vastly more difficult to implement correctly, so I often see projects just pick the second approach (per N frames) and be done with it.

Sorry for the long reply of which most you most likely already knew. But sometimes it is good to write it down :D

tldr; keep it simple, UPS/FPS, history if possible would be awesome :D

@nielsmh

This comment has been minimized.

Contributor

nielsmh commented Jun 17, 2018

@TrueBrain The way I've implemented things it could easily set up to also measure more specific parts of the game loop. Just add another element to enum FramerateElement and instantiate a FramerateMeasurer object in the scope you want to measure timing of.

The reason I have video output as a separate element is that, on some platforms, the action of copying the composited frame into video memory might actually become the bottleneck, depending on resolution, bit depth, and maybe color-space conversion, and I want to be able to display that. The point of the video output element isn't so much the FPS as it is the average time taken in ms.

The framerate window could get an expand/collapse button (similar to the Finances window) so it shows just the game loop and screen drawing FPS by default, but can expand to a list of detailed timings of several elements of the game loop.

@nielsmh

This comment has been minimized.

Contributor

nielsmh commented Jun 17, 2018

I just tried out making a timestamp-bucketized framerate measurement, and don't like it very much. It gives less precision unless I increase the number of buckets to something like 100, and then it won't have a good response time anyway. I think a better approach is to simply keep track of how many data points are valid and never use more than the valid count for calculating current rate.

This gist has the failed bucketizer.

nielsmh added some commits Jun 17, 2018

Change: Framerate window now supports element time graphs
Click on the captions in the detailed frame rate view to open the graph windows.
@nielsmh

This comment has been minimized.

Contributor

nielsmh commented Jun 17, 2018

I added some basic time graphs now.

image

Right now they're just using a fixed scaling (1 pixel in height = 1 millisecond measured) and lack any labels. The height is fixed to 100 pixels, with the assumption that any values above that are meaninglessly large. The X-axis timescale is simply one pixel per data point, i.e. per tick or frame.

@nielsmh

This comment has been minimized.

Contributor

nielsmh commented Jun 18, 2018

Much improved graphs now, including support for scaling based on font size.

2018-06-18_22-57-52

@TrueBrain

This comment has been minimized.

Member

TrueBrain commented Jun 18, 2018

I really really like what you did here @nielsmh . Well done! (did not look at the code, sorry; just the end-result :P)

@TrueBrain

This comment has been minimized.

Member

TrueBrain commented Jun 18, 2018

PS: we are going to get so much questions why it "only" runs at 32.25 frames/s :P

@nielsmh

This comment has been minimized.

Contributor

nielsmh commented Jun 19, 2018

I really have no idea why it gets the framerate wrong, but I'm going to investigate it in detail today. Or perhaps the game really is running slightly slow due to system timers or whatever?

@PeterN

This comment has been minimized.

Member

PeterN commented Jun 19, 2018

Not impossible as the system does not try to catch up, but unlikely as it is showing exactly 32.25 for everyone who has tested.

@nielsmh

This comment has been minimized.

Contributor

nielsmh commented Jun 19, 2018

At least reading win32_v.cpp MainLoop, it looks like timing is based on GetTickCount, and whether to advance to next frame is determined by current timestamp >= previous frame timestamp + 30 ms. Since GetTickCount results are normally quantized to 16 ms or there about, I think there will be a bit of drift there, causing the actual frame rate to be somewhat lower than the intended 33 1/3.

@PeterN

This comment has been minimized.

Member

PeterN commented Jun 22, 2018

Also, any chance of toning down the use of auto a bit?

@@ -1182,7 +1190,7 @@ static void CheckPaletteAnim()
void VideoDriver_Win32::MainLoop()
{
MSG mesg;
uint32 cur_ticks = GetTickCount();
uint32 cur_ticks = timeGetTime();

This comment has been minimized.

@PeterN

PeterN Jun 22, 2018

Member

These changes don't belong in this PR

This comment has been minimized.

@nielsmh

nielsmh Jun 22, 2018

Contributor

Agree, it's not a complete solution either.

@nielsmh

This comment has been minimized.

Contributor

nielsmh commented Jun 22, 2018

Slightly odd effect with the viewport rendering times graph "spewing the values up", this is due to the graph rendering reading incomplete/zero values for the first item in graph, since viewport rendering happens in the same cycle as the graph drawing. Fixing this would be annoying.

2018-06-22_22-59-42

@LordAro

This comment has been minimized.

Member

LordAro commented Jun 22, 2018

...how annoying? :)

@nielsmh

This comment has been minimized.

Contributor

nielsmh commented Jun 22, 2018

Two parts to fixing it: First is that you'd need to record the start and end of the cycle that measurements are accumulated over. Second is that the accumulation would have to either happen in another scratch area, or consume a slot from the back end of the durations buffer which is uncounted during accumulation.

nielsmh added some commits Jun 23, 2018

Change: Measure performance of game loop landscape processing, and ti…
…me spent waiting for link graph background threads to finish
Fix: Don't let partial measurements hang around in accumulation mode,…
… collect them in a scratch area instead
PFE_ACC_GL_ROADVEHS, ///< Time spend processing road vehicles
PFE_ACC_GL_SHIPS, ///< Time spent processing ships
PFE_ACC_GL_AIRCRAFT, ///< Time spent processing aircraft
PFE_GL_LANDSCAPE, ///< Time spent processing other world features

This comment has been minimized.

@nielsmh

nielsmh Jun 25, 2018

Contributor

The Landscape element actually accumulates across three different functions, but not over a number of similar objects being processed. Should it have the ACC infix? Should the ACC infix even exist?

OnTick_Trees();
OnTick_Station();
OnTick_Industry();
}
OnTick_Companies();

This comment has been minimized.

@nielsmh

nielsmh Jun 25, 2018

Contributor

I'm not sure if OnTick_Companies does anything interesting worth measuring.

STR_FRAMERATE_CURRENT :{WHITE}Current
STR_FRAMERATE_AVERAGE :{WHITE}Average
STR_FRAMERATE_DATA_POINTS :{WHITE}Data based on {COMMA} measurements
STR_FRAMERATE_VALUE :{STRING2}

This comment has been minimized.

@nielsmh

nielsmh Jun 25, 2018

Contributor

This feels bad, it's just an artifact of how the code is implemented and I think it would be better to clean up the code and take out this indirection?

struct FramerateWindow : Window {
bool small;
static const int VSPACING = 3; ///< space between column heading and values

This comment has been minimized.

@nielsmh

nielsmh Jun 25, 2018

Contributor

This is a really bad name isn't it? Is there any standard constant usable instead?

{
this->SetDirty();
if (this->scale_update_timer > 0) {

This comment has been minimized.

@nielsmh

nielsmh Jun 25, 2018

Contributor

Considering basing this off real time instead of ticks, I think that might give better response times when performance is bad, and worse when performance is good.

/** Scale and interpolate a value from a source range into a destination range */
template<typename T>
static inline T Scinterlate(T dst_min, T dst_max, T src_min, T src_max, T value)

This comment has been minimized.

@nielsmh

nielsmh Jun 25, 2018

Contributor

Does this operation have a more standard name?

TimingMeasurement timestamps[NUM_FRAMERATE_POINTS];
/** Expected number of cycles per second when the system is running without slowdowns */
double expected_rate;
/** Next index to write to in \c durations and \c timestamps */

This comment has been minimized.

@nielsmh

nielsmh Jun 25, 2018

Contributor

Is this usage of formatting codes correct?

TimingMeasurement lastts = timestamps[point];
TimingMeasurement peak_value = 0;
Point peak_point{ 0, 0 };

This comment has been minimized.

@nielsmh

nielsmh Jun 25, 2018

Contributor

C++11 initialization, should be fixed.

{
extern void ConPrintFramerate(); // framerate_gui.cpp
if (argc == 0) {

This comment has been minimized.

@nielsmh

nielsmh Jun 25, 2018

Contributor

Considering adding a way to summon the framerate window from console, so you can also view it on the title screen.

@ncdulo

This comment has been minimized.

ncdulo commented Jul 1, 2018

I really like this! I'm always interested in seeing the behind the scenes stats, especially when "breaking" the game with lots of grfs. What would the best way to test this feature out if I clone the repository and build openttd myself? I would be interested in testing this out for Linux.

@nielsmh

This comment has been minimized.

Contributor

nielsmh commented Jul 2, 2018

@ncdulo If you can build with both the SDL and the Allegro backends, and test with both, that would be ideal. The two main things that need testing is:

  1. The time graphs record with sub-millisecond precision. (Find one that displays the vertical scale at 1-2-3-4 ms or 4-8-12-16 ms, and confirm that recorded values occasionally fall between the subdivision lines.)
  2. Values get recorded for the Video output element. Each frame should take longer if running at larger resolution/window size.

Additionally, confirm whether the game can run at the intended 33.33 fps with both backends. If it can't, that's a separate issue.

@nielsmh

This comment has been minimized.

Contributor

nielsmh commented Jul 19, 2018

Tested working on Linux SDL (in a virtual machine). Interestingly, the game loop runs at about 34.1 fps here, so faster than intended.

@nielsmh

This comment has been minimized.

Contributor

nielsmh commented Jul 19, 2018

Also seems to work with Linux Allegro. Presumably the only blocker for DOS Allegro then is whether any working DOS build environment has std::high_resolution_clock available.

@TrueBrain

My compliments how you kept on this, changing it four times, and still going, including doing most of the testing yourself. Well done!

@TrueBrain TrueBrain merged commit 2a868b9 into OpenTTD:master Jul 19, 2018

6 checks passed

continuous-integration/jenkins/pr-head This commit looks good
Details
openttd/commit-checker The commit looks good
Details
openttd/linux-amd64-clang-3.8 The commit looks good
Details
openttd/linux-amd64-gcc-6 The commit looks good
Details
openttd/linux-i386-gcc-6 The commit looks good
Details
openttd/osx-10.9 The commit looks good
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment