Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Serious performance regression in RevealsShroud #17377

Open
pchote opened this issue Nov 23, 2019 · 11 comments · May be fixed by #17383
Open

Serious performance regression in RevealsShroud #17377

pchote opened this issue Nov 23, 2019 · 11 comments · May be fixed by #17383

Comments

@pchote
Copy link
Member

@pchote pchote commented Nov 23, 2019

We are starting to get reports that release-20191117 is near-unplayable in team games. WhoCares submitted a replay (attached below) on Discord, which points at RevealsShroud on aircraft being a significant part of the problem.

Possibly caused by #16953. Aircraft move fast enough to trigger a vision update every ~second tick, which combined with the ridiculous vision range makes for a bad time.

Defining

RevealsShroud:
	MoveRecalculationThreshold: 0

in a custom map may restore previous-release behaviour and performance.

team-game-low-perf.orarep.zip

@matjaeck

This comment has been minimized.

Copy link
Contributor

@matjaeck matjaeck commented Nov 23, 2019

Visibility checks for aircraft have been a performance bottleneck before (#16393).

@tovl

This comment has been minimized.

Copy link
Contributor

@tovl tovl commented Nov 23, 2019

The performance hit can be easily reproduced:

  1. Turn visibility checks off
  2. Build ~60 yaks. In my case this has no effect on performance and fps keeps at around 60.
  3. Turn Visibility checks on. In my case fps drops to 20.

@matjaeck is right: This particular performance bottleneck is not new. It just got exacerbated because the visibility updates are done more frequently now, but even with MoveRecalculationThreshold: 0 fps still drops to ~45 in the above testcase. This is still completely disproportional considering nothing else about having that number of airplanes seems to have a significant effect on performance.

Some more testing reveals that the actual visibility checks themselves have no effect on performance either. The slowdown is exclusively caused by adding and removing cells from the shroud. There is something seriously wrong with the way we store shroud information.

@pchote

This comment has been minimized.

Copy link
Member Author

@pchote pchote commented Nov 23, 2019

We used to have serious problems with shroud state getting out of sync, so changed to a system where we first remove all the visibility cells for an actor and then add them all back fresh. This was made worse by the newer gap-gen game mechanic which means that we typically do this twice for every unit.

Changing Shroud.AddSource to Shroud.AddOrUpdateSource and only recalculating/invalidating the cells that have changed vs the last state (which Shroud already tracks) should help, but my quick attempt at implementing this seemed to make it even worse.

bfb44af shows what I had in mind, but I expect calculating addCells and removeCells to be the major bottleneck here (calculating differences of two arrays with a couple of hundred entries inside a very hot part of the code - bad!). We could speed this up by calculating the delta in AffectsShroud instead (where we can use the range checks instead of element-by-element comparisons), but this means a major rewrite of the code and giving up the safety checks that we know we need.

@tovl

This comment has been minimized.

Copy link
Contributor

@tovl tovl commented Nov 23, 2019

Hmm, maybe the real problem here is that we are calling Shroud.Invalidate 60 times per tick on mostly overlapping cells when we should only need to do that at most 1 time per tick per cell.

@pchote

This comment has been minimized.

Copy link
Member Author

@pchote pchote commented Nov 23, 2019

That was the first thing I tried - commenting Shroud.Invalidate out completely didn't really help.

@pchote pchote changed the title Potential serious performance regression in RevealsShroud Serious performance regression in RevealsShroud Nov 23, 2019
@pchote pchote added this to the Next Release milestone Nov 23, 2019
@tovl

This comment has been minimized.

Copy link
Contributor

@tovl tovl commented Nov 24, 2019

That is not what I found. Commenting out the call to Invalidate inside AddSource and RemoveSource yields a consistent 60 fps in the testcase.

@RoosterDragon

This comment has been minimized.

Copy link
Member

@RoosterDragon RoosterDragon commented Nov 24, 2019

Had an initial poke and my findings are much the same as @pchote :

  • You need to solve the multiple invalidation problem. Calculating a diff in AffectsShroud solves this nicely (even better - avoids repeated the work for each player) which means you only call Shroud.Invalidate on cells that actually changed - i.e. were added or removed to the shroud, rather than everything in sight range.
  • However, naive diff calculation is slow and you'd need to fix that too. In a profile I took it cost 38% of the time to Tick AffectsShroud with the invalidation fix. 20% of that was determining the projected cells, 13% in calculating the added/removed cells (via HashSet) and 4% for then updating the Shroud.

The solution is similar to the shroud invalidation - instead of paying the cost of cells equal to the whole sight range, we only want to pay the cost of the changed cells.

To do this, you need a smart DiffTileInAnnulus function which just works out the difference at the edges, rather than having to enumerate all the cells in range.

i.e. the dumb implementation would just do:

public void DiffTilesInAnnulus(
	CPos centerA, int minRangeA, int maxRangeA,
	CPos centerB, int minRangeB, int maxRangeB,
	out IEnumerable<CPos> onlyInA, out IEnumerable<CPos> onlyInB)
{
	var a = FindTilesInAnnulus(centerA, minRangeA, maxRangeA);
	var b = FindTilesInAnnulus(centerB, minRangeB, maxRangeB);
	onlyInA = a.Except(b);
	onlyInB = b.Except(a);
}

But this requires it to work out all the cells in range for a and b - for large areas this is a lot of work - and wasted effort if they mostly overlap. The smart version of this function would try and reduce work to only calculating the changed cells at the edges of the two areas

For our use case - planes with large sight areas but the position only changes a small amount - this would be ideal for reducing the work required.

@pchote

This comment has been minimized.

Copy link
Member Author

@pchote pchote commented Nov 24, 2019

I had a go at implementing differential updating at release-20191117...pchote:optimize-shroud-update-two but its clearly missing something because I had issues with removing cells, and performance is significantly worse than the current baseline.

Could still be a useful starting point for someone else to run with. If someone can fix whatever the main fail(s) are then there are a couple of further optimizations we can to do improve ProjectedCellDelta further, such as only calculating an outer annulus of width equal to the separation of the two points.

@tovl tovl linked a pull request that will close this issue Nov 25, 2019
@GraionDilach

This comment has been minimized.

Copy link
Contributor

@GraionDilach GraionDilach commented Nov 25, 2019

Reading through #16953 - wouldn't it make sense to set MoveRecalculationThreshold: 0 on immobile stuff in all mods regardless of #17383?

@pchote

This comment has been minimized.

Copy link
Member Author

@pchote pchote commented Nov 25, 2019

MoveRecalculationThreshold isn't relevant for actors that never move - it is a distance, not a time.

@GraionDilach

This comment has been minimized.

Copy link
Contributor

@GraionDilach GraionDilach commented Nov 25, 2019

I see that - however it still evaluates the positions every tick even on immobile otherwise and you can still save two LengthSquared operations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.