Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of shroud updating. #17383

Merged
merged 2 commits into from Jan 12, 2020
Merged

Conversation

@tovl
Copy link
Contributor

tovl commented Nov 25, 2019

Fixes #17377
Fixes #16393

This improves the performance of shroud updating by only updating each cell at most once per tick and getting rid of as much redundant memory allocations as possible. Especially operations like HashSet.UnionWith appear to be a major bottleneck. I tried to replace dynamic data structures with fixed-size data structures where reasonable (CellLayer in particular). I tried to reduce the number of operations in general by eliminating redundancies between FrozenActorLayer, ShroudRenderer, RadarWidget and PlayerRadarTerrain.

As a testcase I tried building 60 yaks on Pool Party:

  • On bleed framerate on my machine drops to 20 fps (down from 60 fps frame limited).
  • If I set MoveRecalculationThreshold to 0 (effectively reverting #16953) I get ~45 fps.
  • On this PR I get 57–60 fps.
@pchote pchote requested a review from RoosterDragon Nov 25, 2019
@pchote

This comment has been minimized.

Copy link
Member

pchote commented Nov 25, 2019

A lot is going on here, can you please give a rough overview for each of the key changes here? (e.g. removing the bins from FrozenActorLayer, the touched/invalidation changes in Shroud, the changes in RenderShroud/PlayerRadarTerrain, etc).

I guess we will need a second PR targeting prep-1908 specifically, thanks to #17076 rewriting some of the visibility handling? Or are we happy to go with the much simpler MoveRecalculationThreshold: 0 bandaid for the hotfix?

@teinarss

This comment has been minimized.

Copy link
Contributor

teinarss commented Nov 25, 2019

One regression from #17076 is that the Radar dont update the same way when you Spectate as the player.
image

@reaperrr

This comment has been minimized.

Copy link
Contributor

reaperrr commented Nov 25, 2019

Or are we happy to go with the much simpler MoveRecalculationThreshold: 0 bandaid for the hotfix?

Just my 2 cents, but I'd prefer that we should go with the bandaid for the hotfix. Our track-record of finding all bugs/regressions in prs of this size hasn't been that great, and having to adapt and review another PR specifically for prep likely just increases the chance of something slipping through into that release, regardless of playtests.

@tovl

This comment has been minimized.

Copy link
Contributor Author

tovl commented Nov 25, 2019

Key changes:

  • AddSource/RemoveSource/etc. no longer trigger an Invalidation.
  • Invalidation is moved into Shroud.Tick which loops over all map cells once per tick.
  • Touched is used to track which cells have been affected in any way since the last tick. If a cell is not touched the invalidation for that cell is skipped immediately.
  • oldResolvedType is used to keep track of the shroud state of the previous tick.
  • CellsChanged has been removed. The CellsChanged actions would previously make ShroudRenderer, FrozenActorLayer, RadarWidget and PlayerRadarTerrain update their own internal dirty cells cache. They would then act on those cells in their own tick. The delayed action was needed when invalidation happened multiple times per tick but all the updating of redundant sets has major performance costs of its own.
  • OnShroudChanged has been added instead. It directly triggers the actual updates done by those classes on a specific cell.
  • When the evaluated shroud state is the same as the previous tick for a cell, the OnShroudChanged actions are skipped for that cell, avoiding redundant work.
  • Code in those classes that cycles through cells/bins independently of Shroud and evaluates whether a given cell needs and update and then updates that cell have mostly been removed. All updates are directly triggered by Shroud.Tick instead.
  • ShroudRenderer still needs to cycle through cells an additional time before rendering. This is because all surrounding cells of a changed cell need to be updated as well to get the right sprites. The OnShroudChanged action here is used to mark all surrounding cells as dirty in a separate CellLayer.
  • The actions in OnShroudChanged that need to happen when the renderplayer is changed where awkwardly mixed in with the actions that need to happen every tick. That has been disentangled and moved to a world.RenderPlayerChanged action following the example set by RadarWidget.
@tovl

This comment has been minimized.

Copy link
Contributor Author

tovl commented Nov 25, 2019

This is a complicated restructuring with a high chance of regressions missed during review so I wouldn't recommend it for a quick fix. The bandaid fix will have to do if we go the route of a bugfix release.

@pchote

This comment has been minimized.

Copy link
Member

pchote commented Nov 26, 2019

Significant chunks (possibly most of) touched is going to remain false each tick. Wouldn't it make sense to keep a coarser bins array to track these regions at a coarser level so we don't need to enumerate the whole layer each tick?

@tovl

This comment has been minimized.

Copy link
Contributor Author

tovl commented Nov 26, 2019

It is difficult to say without trying it out. It might squeeze a few more frames out put it might also regress the performance with a larger amount of units because of the extra operations we incur in AddSource/RemoveSource. I should note that touched itself is only responsible for a relatively small (but still significant) part of the performance gain here. Even when we do the invalidation for every cell on every tick, we still get most of the performance gains as long as we still skip the OnShroudChanged actions.

I also tried doubling the amount of yaks in the testcase to 120 and moving them as a single flock. That dropped the framerate to about 40–45 (10 on bleed). Moving them as a single flock means that the number of cells touched is mostly the same as with 60 yaks which means it is the per-unit operations that become the dominant bottleneck with these numbers. So I think the biggest performance gains at this point can be had in the projection part or the AddSource/RemoveSource part.

@pchote

This comment has been minimized.

Copy link
Member

pchote commented Nov 26, 2019

Ok. We can potentially halve the per-unit operations in RA by reworking the gap generator implementation to not need two RevealsShroud traits - something I plan to look into for a future PR.

@tovl tovl force-pushed the tovl:visibility-perf branch from be51c17 to 6749ab4 Nov 26, 2019
@tovl

This comment has been minimized.

Copy link
Contributor Author

tovl commented Nov 26, 2019

One regression from #17076 is that the Radar dont update the same way when you Spectate as the player.

Fixed.

@RoosterDragon

This comment has been minimized.

Copy link
Member

RoosterDragon commented Nov 27, 2019

This makes sense as a high-level change for centralizing shroud changes into one pass and all the changes you listed flow naturally from that - the rest would just be nitpicking over low-level specifics.

My main question at a high-level would be if delaying the updates affects any Tick logic in an adverse way. Currently if a unit moves into a new cell, it reveals the shroud immediately, whereas now this would be delayed by a tick so other decision making would see different outcomes when querying the visibility states of the cell within that tick.

That's the major blocker as to whether this is a valid optimization to apply - if we can satisfy ourselves that the delays doesn't have any unfortunate knock over behaviour we'd be good.

A theoretical example of what I mean (I haven't checked if something like this would actually apply)

  • A unit with identical sight and attack range moves into range of an enemy.
  • Under previous behaviour, the enemy is revealed so would be able to be attacked.
  • Under new behaviour, the shroud is still hidden so the enemy can't be seen.
  • Unit continues moving.
  • Next tick, the enemy becomes visible, but our unit continues moving into the next cell over several ticks.
  • End result: Units end up moving one cell closer than previously, which might seem odd or affect balance.
@tovl

This comment has been minimized.

Copy link
Contributor Author

tovl commented Nov 27, 2019

I turns out this is indeed a problem. It also turns out this is a problem with the current code as well.

Trait ticks are always carried out in the order in which the actors where created. Because the world actor is always created first, immediately followed by the player actors, we are guaranteed that player level traits always tick before unit level traits. However, the order in which different traits on the same actor are ticked is completely arbitrary. We can't even guarantee that the order is consistent because they are stored in an unsorted dictionary.

In general the tick order is as follows:

  1. Activities
  2. world traits
  3. player traits
  4. unit traits
  5. frameEndActions
  6. render ticks (in the same actor order, but that is not relevant here)

Targetting for attacks happen either inside the Attack* traits or the Attack activities. Let's start with the former. In the current code the order can be either this:

  1. Unit moves to position (Activity)
  2. Shroud is revealed (unit trait)
  3. Unit attacks target (unit trait)

or this:

  1. Unit moves to position
  2. Unit tries to attack target. Fails because it is still invisible.
  3. Shroud is revealed.
  4. Unit moves again. (new world tick)
  5. Unit attacks target.

There is no way to know or control which of the two happens, so that is pretty bad. With the new code the shroud reveal is guaranteed to happen after the activity tick and before the attack trait tick, so that sounds pretty good. However, the actual visibility calculation still happens inside the AffectsShroud unit trait tick, which is guaranteed to run after the Shroud tick! The order in which the unit traits tick doesn't matter anymore, so we are at least consistent, but we are consistently wrong.

Then there is AttackFrontal which does the targetting inside the activity. This is always guaranteed to have a one tick delay in the current situation (which might explain the suicidal tendencies that attacking infantry seem to have) and even a two tick delay with the new code! Fortunately, this will soon be resolved in #16696, so we don't have to worry about this edge case.

So we need to guarantee that the visibility calculation happens after the movement and before the shroud is evaluated. I see two ways to do this: Either we let the activity trigger the visibility calculation using INotifyVisualPositionChanged or we let the Shroud tick trigger the visibility calculations using a new interface.

@GraionDilach

This comment has been minimized.

Copy link
Contributor

GraionDilach commented Nov 27, 2019

However, the order in which different traits on the same actor are ticked is completely arbitrary. We can't even guarantee that the order is consistent because they are stored in an unsorted dictionary.

That's what Requires is for. See discussion in #10907/ #10965 et al.

@tovl

This comment has been minimized.

Copy link
Contributor Author

tovl commented Nov 27, 2019

Thanks for the info! Apparently we can sort the traits on the same actor then. However, this doesn't seem to apply for traits on different actors. (Even in the current situation we could be dealing with one unit providing vision for another.)

@tovl tovl force-pushed the tovl:visibility-perf branch from 6749ab4 to 653bff0 Nov 28, 2019
@teinarss

This comment has been minimized.

Copy link
Contributor

teinarss commented Dec 5, 2019

This will probably fix #17422 too (havent tested it yet)

@tovl

This comment has been minimized.

Copy link
Contributor Author

tovl commented Dec 5, 2019

Unfortunately, it does not. The behaviour is exactly the same.

@tovl tovl added this to the Next+1 milestone Dec 6, 2019
@reaperrr

This comment has been minimized.

Copy link
Contributor

reaperrr commented Dec 8, 2019

Can't comment much on the code (out of my league, though at least I didn't see any style issues), but the performance gain is impressive 👍
FWIW, I didn't notice any obvious issues, either.
Feel free to count this as +1 if @RoosterDragon or @pchote approves the code.

OpenRA.Game/Traits/Player/FrozenActorLayer.cs Outdated Show resolved Hide resolved
OpenRA.Game/Traits/Player/Shroud.cs Outdated Show resolved Hide resolved
OpenRA.Game/Traits/Player/Shroud.cs Show resolved Hide resolved
OpenRA.Game/Traits/Player/Shroud.cs Outdated Show resolved Hide resolved
OpenRA.Mods.Common/Traits/World/ShroudRenderer.cs Outdated Show resolved Hide resolved
OpenRA.Mods.Common/Traits/World/ShroudRenderer.cs Outdated Show resolved Hide resolved
OpenRA.Mods.Common/Traits/World/ShroudRenderer.cs Outdated Show resolved Hide resolved
OpenRA.Mods.Common/Widgets/RadarWidget.cs Outdated Show resolved Hide resolved
OpenRA.Game/Traits/Player/Shroud.cs Show resolved Hide resolved
@tovl tovl force-pushed the tovl:visibility-perf branch 2 times, most recently from 4848301 to 6f5ed40 Dec 13, 2019
@RoosterDragon

This comment has been minimized.

Copy link
Member

RoosterDragon commented Dec 18, 2019

Happy to take on the bleed branch - the soak time should hopefully reveal if the one-tick delay causes any undesirable consequences.

👍

@tovl tovl force-pushed the tovl:visibility-perf branch from d5deef0 to 667b9b0 Dec 31, 2019
Copy link
Member

pchote left a comment

Unrelated minor point:

dirtyBins[y * cols + x] = true;
}
};
self.Trait<Shroud>().OnShroudChanged += uv => dirtyFrozenActorIds.UnionWith(partitionedFrozenActorIds.At(new int2(uv.U, uv.V)));

This comment has been minimized.

Copy link
@pchote

pchote Jan 4, 2020

Member

.At requires an enumeration of all the bounds in the bin, so this might potentially hurt.

We're probably okay when considering at most bins will have less than 10 entries, and that this will be strongly outweighed by the other improvements in the PR. Just raising this here in case we are looking back on this PR in the future wrt regressions or followup optimizations.

This comment has been minimized.

Copy link
@RoosterDragon

RoosterDragon Jan 6, 2020

Member

Previously we were checking a region via InBox due to the coarse binning scheme in FrozenActorLayer that went away, so this would be an improvement anyway - certainly no worse at least.

@pchote

This comment has been minimized.

Copy link
Member

pchote commented Jan 4, 2020

While giving what I was expecting to be a quick final overlook, I noticed that I'd missed some other cases where casual uses of MPos vs PPos may allow bugs to sneak in.

In my previous review I noted

We had a lot of trouble in the past with bugs caused by calculations done in the wrong coordinate systems, and it took a lot of effort plus these types to solve them.

Working through the fixes (which I have pushed as 7f2d8c4 - Note: Updated commit x 2, so make sure you have this version) revealed exactly one of these issues! PlayerRadarTerrain has removed the shroud -> map (un)projecting, which will cause the wrong cells to updated in the preview. Can you please squash the fixes from my commit into here, and fix the unprojection between UpdateShroudCellUpdateTerrainCell?

I have filed #17553 about what is potentially a better fix, but that is out of scope of this PR.

@tovl tovl force-pushed the tovl:visibility-perf branch from 667b9b0 to 544d57d Jan 5, 2020
@tovl

This comment has been minimized.

Copy link
Contributor Author

tovl commented Jan 5, 2020

Applied the fixup commit and unprojection.

Copy link
Member

pchote left a comment

Aaah! Sorry, but I noticed another regression.

Also needs a rebase due to #17136 being merged.

@tovl tovl force-pushed the tovl:visibility-perf branch from 544d57d to 3dcaf48 Jan 6, 2020
@pchote

This comment has been minimized.

Copy link
Member

pchote commented Jan 8, 2020

Unfortunately this now crashes when unloading infantry from an APC:

Exception of type `System.InvalidOperationException`: Attempting to add duplicate shroud source
  at OpenRA.Traits.Shroud.AddSource (System.Object key, OpenRA.Traits.Shroud+SourceType type, OpenRA.PPos[] projectedCells) [0x0010f] in /Users/paul/src/OpenRA/OpenRA.Game/Traits/Player/Shroud.cs:255 
  at OpenRA.Mods.Common.Traits.RevealsShroud.AddCellsToPlayerShroud (OpenRA.Actor self, OpenRA.Player p, OpenRA.PPos[] uv) [0x00024] in /Users/paul/src/OpenRA/OpenRA.Mods.Common/Traits/RevealsShroud.cs:54 
  at OpenRA.Mods.Common.Traits.AffectsShroud.OpenRA.Traits.INotifyAddedToWorld.AddedToWorld (OpenRA.Actor self) [0x00069] in /Users/paul/src/OpenRA/OpenRA.Mods.Common/Traits/AffectsShroud.cs:147 
  at OpenRA.World.Add (OpenRA.Actor a) [0x0003a] in /Users/paul/src/OpenRA/OpenRA.Game/World.cs:354 
  at OpenRA.Mods.Common.Activities.UnloadCargo+<>c__DisplayClass15_0.<Tick>b__0 (OpenRA.World w) [0x0006f] in /Users/paul/src/OpenRA/OpenRA.Mods.Common/Activities/UnloadCargo.cs:128 
  at OpenRA.World.Tick () [0x0019e] in /Users/paul/src/OpenRA/OpenRA.Game/World.cs:470 
  at OpenRA.Game.InnerLogicTick (OpenRA.Network.OrderManager orderManager) [0x001bc] in /Users/paul/src/OpenRA/OpenRA.Game/Game.cs:622 
  at OpenRA.Game.LogicTick () [0x0003e] in /Users/paul/src/OpenRA/OpenRA.Game/Game.cs:649 
  at OpenRA.Game.Loop () [0x000f1] in /Users/paul/src/OpenRA/OpenRA.Game/Game.cs:813 
  at OpenRA.Game.Run () [0x0003c] in /Users/paul/src/OpenRA/OpenRA.Game/Game.cs:854 
  at OpenRA.Game.InitializeAndRun (System.String[] args) [0x00010] in /Users/paul/src/OpenRA/OpenRA.Game/Game.cs:263 
  at OpenRA.Program.Main (System.String[] args) [0x00044] in /Users/paul/src/OpenRA/OpenRA.Game/Support/Program.cs:37
@tovl tovl force-pushed the tovl:visibility-perf branch from 3dcaf48 to 520b8a0 Jan 8, 2020
@tovl

This comment has been minimized.

Copy link
Contributor Author

tovl commented Jan 8, 2020

A simple oversight. Fixed.

@tovl tovl force-pushed the tovl:visibility-perf branch from 520b8a0 to efa5616 Jan 9, 2020
@tovl tovl force-pushed the tovl:visibility-perf branch from efa5616 to d3d91ef Jan 9, 2020
@pchote
pchote approved these changes Jan 11, 2020
@pchote

This comment has been minimized.

Copy link
Member

pchote commented Jan 11, 2020

@abcdefg30 did you have any final comments wrt your last review or anything else before we merge this?

@abcdefg30

This comment has been minimized.

Copy link
Member

abcdefg30 commented Jan 12, 2020

No.

@abcdefg30 abcdefg30 merged commit fbfef90 into OpenRA:bleed Jan 12, 2020
2 checks passed
2 checks passed
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@abcdefg30

This comment has been minimized.

Copy link
Member

abcdefg30 commented Jan 12, 2020

@tovl tovl deleted the tovl:visibility-perf branch Jan 16, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

7 participants
You can’t perform that action at this time.