
PerfTickLogger, reduce overhead of logging long ticks. #20159

Merged
1 commit merged into OpenRA:bleed on Jul 1, 2023

Conversation

anvilvapre
Contributor

Make PerfTickLogger static. Do not allocate a PerfTickLogger object on hot paths.

It would be nice if settings could send out notifications on changes of any, sections of, or specific setting values.

OpenRA.Game/Support/PerfTimer.cs (review thread resolved)
OpenRA.Game/Support/PerfTickLogger.cs (review thread resolved, outdated)
@PunkPun
Member

PunkPun commented Nov 17, 2022

This should get a rebase as it's 237 commits behind.

PunkPun previously approved these changes Nov 25, 2022
@pchote
Member

pchote commented Dec 20, 2022

Are there any profiling results showing the benefit here?

@PunkPun
Member

PunkPun commented Dec 20, 2022

There wasn't any profiling done, I think, but this should be quite a significant yet simple improvement.

So much less memory would be used on monitoring activities.

@anvilvapre
Contributor Author

ping.

@Mailaender
Member

Mailaender commented May 27, 2023

RAM usage

Red Alert shellmap

bleed: 404 MB
this: 387 MB (note: from Nov 17, 2022)

@Mailaender
Member

Needs a rebase.

@Mailaender
Member

Mailaender commented May 27, 2023

RAM usage

bleed: 404 MB
this: 396 MB

CPU

bleed: ⌀ 2.60 ± 1.72 ms
this: ⌀ 3.76 ± 1.85 ms

[plot: per-tick time, bleed vs this]

@PunkPun
Member

PunkPun commented May 27, 2023

Weird

This should definitely be faster and take less memory. Instead of constantly creating new class instances and a bunch of variables, you only save the current time.
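Roughly, the intent is something like the following sketch (illustrative only; the class name, threshold, and log output here are made up, not the actual OpenRA API):

using System;
using System.Diagnostics;

static class LongTickSketch
{
	// Hypothetical threshold: anything above 1 ms counts as a long tick.
	static readonly long ThresholdTicks = Stopwatch.Frequency / 1000;

	// No per-call allocation: the caller holds a single long timestamp and
	// replaces it with the returned value after each measured item.
	public static long LogLongTick(long start, string description)
	{
		var current = Stopwatch.GetTimestamp();
		if (current - start > ThresholdTicks)
			Console.WriteLine($"{(current - start) * 1000 / Stopwatch.Frequency} ms {description}");

		return current;
	}
}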

@Mailaender
Member

I was also surprised and repeated the measurements. My method is probably not perfect.

@PunkPun
Member

PunkPun commented May 27, 2023

If you change the order of the tests, do the results change?

@Mailaender
Member

I tested bleed, this, bleed and the average only changes slightly.

./launch-game.sh Game.Mod=ra Launch.Benchmark=true using this R script:

truetick <- read.csv("bleed/truetick_time.csv", sep = ",", header = TRUE)
bleed <- truetick[-c(1:25), ]
bleed <- bleed[1:500, ]
mean_time <- mean(bleed$time..ms)
median_time <- median(bleed$time..ms)
sd_time <- sd(bleed$time..ms)
plot(bleed$tick, bleed$time..ms, type = "l", xlab = "Tick", ylab = "Time (ms)", 
     main = "Red Alert Shellmap (bleed)")
grid()
abline(h = mean_time, col = "red", lty = 2)
abline(h = mean_time + sd_time, col = "darkred", lty = 3)

truetick <- read.csv("perftickerlogger/truetick_time.csv", sep = ",", header = TRUE)
perftickerlogger <- truetick[-c(1:25), ]
perftickerlogger <- perftickerlogger[1:500, ]
mean_time <- mean(perftickerlogger$time..ms)
median_time <- median(perftickerlogger$time..ms)
sd_time <- sd(perftickerlogger$time..ms)
plot(perftickerlogger$tick, perftickerlogger$time..ms, type = "l", xlab = "Tick", ylab = "Time (ms)", 
     main = "Red Alert Shellmap (perftickerlogger)")
grid()
abline(h = mean_time, col = "red", lty = 2)
abline(h = mean_time + sd_time, col = "darkred", lty = 3)

# Overlay both runs for a direct comparison.
plot(bleed$tick, bleed$time..ms, type = "l", xlab = "Tick", ylab = "Time (ms)", col = "blue",
     main = "Red Alert Shellmap")
lines(perftickerlogger$tick, perftickerlogger$time..ms, col = "red")
legend("topright", legend = c("bleed", "perf_ticker_logger"),
       col = c("blue", "red"), lty = 1, cex = 0.8)
grid()


@anvilvapre
Contributor Author

Can you please check whether a lot of long ticks were logged?

It seems start in WorldUtil.DoTimed() was not reset, possibly resulting in long ticks being logged. Perhaps that explains the delay.

I.e., before the last push/fix, I see these Effects being logged as long ticks:

    12 ms [441] Effect: Bullet
    12 ms [441] Effect: SpriteEffect
    12 ms [441] Effect: GpsDotEffect
    12 ms [441] Effect: SpriteEffect

They are not there in the last pushed version.
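For illustration, the chaining is supposed to look roughly like this (hypothetical loop over plain delegates, not the actual WorldUtil.DoTimed code):

using System;
using System.Collections.Generic;
using OpenRA.Support;

static class ChainedTimingSketch
{
	public static void TimeAll(IEnumerable<Action> items)
	{
		var start = PerfTickLogger.GetTimestamp();
		foreach (var item in items)
		{
			item();

			// Reset start to the returned timestamp. Without this, every item
			// after the first slow one is measured from the stale start value
			// and gets reported as a long tick as well.
			start = PerfTickLogger.LogLongTick(start, "item", null);
		}
	}
}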

@Mailaender
Member

CPU

bleed: ⌀ 2.56 ± 1.73 ms
this: ⌀ 2.78 ± 1.91 ms

[plot: per-tick time, bleed vs this]

RAM

bleed: ~396 MB
this: ~392 MB

@anvilvapre
Contributor Author

That at least explained it then.

@Mailaender
Member

bleed

First measurement takes long, then it is fast:

1000 iterations of perfLogger.LogTickAndRestartTimer
First tick: 41790
Average Elapsed Time: 16.463
Standard Deviation: 0.9953044760273166

this

First instantiation is significantly reduced, but it stays jumpy:

1000 iterations of PerfTimer.LogLongTick
First tick: 763
Average Elapsed Time: 18.565
Standard Deviation: 84.7221327340145

Logging some more as the standard deviation is far too high:

1000 iterations of PerfTimer.LogLongTick
First tick: 834
Second tick: 197
Third tick: 2227
Fourth tick: 161
Fifth tick: 128
Sixth tick: 85547
Seventh tick: 222
Average Elapsed Time: 21.404
Standard Deviation: 82.21780089494047

My code

#region Copyright & License Information
/*
 * Copyright (c) The OpenRA Developers and Contributors
 * This file is part of OpenRA, which is free software. It is made
 * available to you under the terms of the GNU General Public License
 * as published by the Free Software Foundation, either version 3 of
 * the License, or (at your option) any later version. For more
 * information, see COPYING.
 */
#endregion

using System;
using System.Diagnostics;
using System.Globalization;
using OpenRA.Support;

namespace OpenRA.Mods.Common.UtilityCommands
{
	class CheckPerformance : IUtilityCommand
	{
		string IUtilityCommand.Name => "--check-performance";

		bool IUtilityCommand.ValidateArguments(string[] args)
		{
			return true;
		}

		[Desc("Clock functions.")]
		void IUtilityCommand.Run(Utility utility, string[] args)
		{
			const int Iterations = 1000;
			Console.WriteLine($"{Iterations} iterations of PerfTimer.LogLongTick");
			var stopwatch = new Stopwatch();

			stopwatch.Restart();
			PerfTickLogger.LogLongTick(0, "", null);
			Console.WriteLine($"First tick: {stopwatch.ElapsedTicks}");

			var elapsedTimes = new long[Iterations];

			for (var i = 0; i < Iterations; i++)
			{
				stopwatch.Restart();

				PerfTickLogger.LogLongTick(0, "", null);

				stopwatch.Stop();
				elapsedTimes[i] = stopwatch.ElapsedTicks;
			}

			var average = CalculateAverage(elapsedTimes);
			Console.WriteLine($"Average Elapsed Time: {average.ToString(CultureInfo.InvariantCulture)}");
			var standardDeviation = CalculateStandardDeviation(elapsedTimes);
			Console.WriteLine($"Standard Deviation: {standardDeviation.ToString(CultureInfo.InvariantCulture)}");
		}

		static double CalculateAverage(long[] values)
		{
			long sum = 0;
			foreach (var value in values)
				sum += value;

			return (double)sum / values.Length;
		}

		static double CalculateStandardDeviation(long[] values)
		{
			var average = CalculateAverage(values);
			double sumOfSquares = 0;

			foreach (var value in values)
			{
				var difference = value - average;
				sumOfSquares += difference * difference;
			}

			var variance = sumOfSquares / values.Length;
			return Math.Sqrt(variance);
		}
	}
}


@anvilvapre
Contributor Author

Can you please try passing the current timestamp? Right now you pass zero as the start time, which is compared against the current timestamp; the difference will exceed the threshold, so every call will actually be seen as a long tick.

So I suspect that you are now measuring how efficient the logging is.
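To illustrate the effect (hypothetical threshold; the real value may differ): Stopwatch.GetTimestamp() counts ticks from a fixed origin, typically system boot, so with a start of zero the measured elapsed time is the machine's whole uptime and always exceeds the threshold:

using System;
using System.Diagnostics;

class ZeroStartSketch
{
	static void Main()
	{
		var thresholdTicks = Stopwatch.Frequency / 1000; // hypothetical 1 ms threshold

		// start = 0: the "elapsed" time is the entire timestamp value, so the
		// expensive logging path is taken on every call.
		long zeroStart = 0;
		Console.WriteLine(Stopwatch.GetTimestamp() - zeroStart > thresholdTicks); // True

		// start = a fresh timestamp: the elapsed time is tiny and only the
		// cheap comparison runs.
		var start = Stopwatch.GetTimestamp();
		Console.WriteLine(Stopwatch.GetTimestamp() - start > thresholdTicks); // normally False
	}
}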

@Mailaender
Member

I replaced it with

PerfTickLogger.LogLongTick(Stopwatch.GetTimestamp(), "", null);

./utility.sh ra --check-performance
1000 iterations of PerfTimer.LogLongTick
First tick: 2209
Average Elapsed Time: 178.191
Standard Deviation: 3679.7544258439616

which seems to make it worse.

@Mailaender
Member

When I cache

var timestamp = Stopwatch.GetTimestamp();
PerfTickLogger.LogLongTick(timestamp, "", null);

it is still worse

./utility.sh ra --check-performance
1000 iterations of PerfTimer.LogLongTick
First tick: 96045
Average Elapsed Time: 35.791
Standard Deviation: 97.71079428087506

@anvilvapre
Contributor Author

I'm fine with it being worse; I would just like to try to understand why.

@anvilvapre
Contributor Author

anvilvapre commented Jun 3, 2023

This is how it should be called: initialize the timestamp once, then set the timestamp to the return value of LogLongTick.

This will cause only one var current = Stopwatch.GetTimestamp() call. Apart from that, the function only checks whether current minus start is bigger than the threshold.

			var start = PerfTickLogger.GetTimestamp();
			for (var i = 0; i < Iterations; i++)
			{
				stopwatch.Restart();

				start = PerfTickLogger.LogLongTick(start, "", null);

				stopwatch.Stop();
				elapsedTimes[i] = stopwatch.ElapsedTicks;
			}

@Mailaender
Member

			const int Iterations = 1000;
			Console.WriteLine($"{Iterations} iterations of PerfTimer.LogLongTick");
			var stopwatch = new Stopwatch();

			stopwatch.Restart();
			var start = PerfTickLogger.GetTimestamp();
			stopwatch.Stop();
			Console.WriteLine($"First tick: {stopwatch.ElapsedTicks}");

			var elapsedTimes = new long[Iterations];

			for (var i = 0; i < Iterations; i++)
			{
				stopwatch.Restart();

				start = PerfTickLogger.LogLongTick(start, "", null);

				stopwatch.Stop();
				elapsedTimes[i] = stopwatch.ElapsedTicks;
			}

gives me

1000 iterations of PerfTimer.LogLongTick
First tick: 83701
Average Elapsed Time: 15.818
Standard Deviation: 0.827572353332331

but not at all in a reproducible way. I think I am measuring something else.

@anvilvapre
Contributor Author

But do you agree that the body of LogLongTick doesn't hold much logic at all anymore? It just gets the current time and compares it.

@anvilvapre
Contributor Author

What might be better is to not restart your stopwatch in the loop; just measure the total loop time. You may now be testing your stopwatch. You could uncomment LogLongTick; see the diff.

@Mailaender
Member

Yes, I am testing the stopwatch when I try to measure in the loop. I am doing 10,000 iterations now and stop at the end. The overhead of the stopwatch or the empty loop is still around 2000 ms.

this
First tick: 43964 ± 757 ms
10,000 iterations: 9536 ± 2573 ms

bleed
First tick: 84571 ± 861 ms
10,000 iterations: 11,591 ± 1938 ms

So we clearly have a winner, although I am not sure how impactful it is, given how hard it was to measure.

@Mailaender
Member

Mailaender commented Jul 1, 2023

using System;
using System.Diagnostics;
using OpenRA.Support;

namespace OpenRA.Mods.Common.UtilityCommands
{
	class BenchmarkPerfTickLogger : IUtilityCommand
	{
		string IUtilityCommand.Name => "--perf";

		bool IUtilityCommand.ValidateArguments(string[] args)
		{
			return true;
		}

		void IUtilityCommand.Run(Utility utility, string[] args)
		{
			const int Iterations = 10000;
			Console.WriteLine($"{Iterations} iterations of perfTickLogger.LogTickAndRestartTimer");
			var stopwatch = new Stopwatch();

			stopwatch.Restart();
			var perfTickLogger = new PerfTickLogger();
			perfTickLogger.Start();
			stopwatch.Stop();
			Console.WriteLine($"First tick: {stopwatch.ElapsedTicks}");

			stopwatch.Restart();
			for (var i = 0; i < Iterations; i++)
				perfTickLogger.LogTickAndRestartTimer("", null);

			stopwatch.Stop();
			Console.WriteLine($"Elapsed Time  {stopwatch.ElapsedTicks}");
		}
	}
}

@Mailaender
Member

using System;
using System.Diagnostics;
using OpenRA.Support;

namespace OpenRA.Mods.Common.UtilityCommands
{
	class BenchmarkPerfTickLogger : IUtilityCommand
	{
		string IUtilityCommand.Name => "--perf";

		bool IUtilityCommand.ValidateArguments(string[] args)
		{
			return true;
		}

		void IUtilityCommand.Run(Utility utility, string[] args)
		{
			const int Iterations = 10000;
			Console.WriteLine($"{Iterations} iterations of PerfTimer.LogLongTick");
			var stopwatch = new Stopwatch();

			stopwatch.Restart();
			var start = PerfTickLogger.GetTimestamp();
			stopwatch.Stop();
			Console.WriteLine($"First tick: {stopwatch.ElapsedTicks}");

			stopwatch.Restart();
			for (var i = 0; i < Iterations; i++)
				start = PerfTickLogger.LogLongTick(start, "", null);

			stopwatch.Stop();
			Console.WriteLine($"Elapsed Time  {stopwatch.ElapsedTicks}");
		}
	}
}

@Mailaender merged commit edbded8 into OpenRA:bleed on Jul 1, 2023
@Mailaender
Member

Changelog

@anvilvapre
Contributor Author

A future improvement might be:

https://stackoverflow.com/questions/55686928/using-stopwatch-in-c-sharp

In [.NET 7.0](https://dotnet.microsoft.com/en-us/download/dotnet/7.0) there were some improvements in the Stopwatch class. There is a better way to use Stopwatch that will be more accurate, slightly faster, and won't use memory allocation at all!

You should use the methods Stopwatch.GetTimestamp() & Stopwatch.GetElapsedTime(long) to make the improvements.

An example would be:

long startTime = Stopwatch.GetTimestamp();
// Do Something...
TimeSpan elapsedTime = Stopwatch.GetElapsedTime(startTime);

For more info, read the [Stopwatch in .NET 7.0 documentation](https://learn.microsoft.com/en-us/dotnet/api/system.diagnostics.stopwatch.getelapsedtime?view=net-7.0).
