Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Traits using TerrainSpriteLayer cause OpenGL overdraw #17126

Open
alexeik opened this issue Sep 18, 2019 · 11 comments

Comments

@alexeik
Copy link

commented Sep 18, 2019

Assemblies:
	common|OpenRA.Mods.Common.dll
	common|OpenRA.Mods.Cnc.dll
	d2k|OpenRA.Mods.D2k.dll
	d2|OpenRA.Mods.D2.dll

if one Mod attaches another Mod dlls
then openRA looks up for Traits
with
public class BuildableTerrainLayer : IRenderOverlay, IWorldLoaded, ITickRender, INotifyActorDisposing
in OpenRA.Mods.D2k.dll and adds it to RenderTick.

it creates overdraws for glDrawArrays.

@pchote

This comment has been minimized.

Copy link
Member

commented Sep 19, 2019

Can you please explain the problem in more detail?

@alexeik

This comment has been minimized.

Copy link
Author

commented Sep 19, 2019

D2k.dll has
public class BuildableTerrainLayer : IRenderOverlay, IWorldLoaded, ITickRender, INotifyActorDisposing
as u can see i use d2k.dll in Dune2 mod.

so...
go to RenderTick and see that openRA pickups this class from D2k and calls
void IRenderOverlay.Render(WorldRenderer wr)
from this class.

It makes calls for gl_DrawArrays just for nothing. Empty call for all pipeline. it sends Vertexbuffer with 9-14 thousands of bytes but XYZ are zero in that vertex buffer

nobody wants this behaviour. D2k.dll linked just for some classes, not for classes that implements
IRenderOverlay, IWorldLoaded, ITickRender, INotifyActorDisposing interfaces.

@reaperrr

This comment has been minimized.

Copy link
Contributor

commented Sep 19, 2019

I might be missing something, but shouldn't that only load if BuildableTerrainLayer is attached to the World actor in one of the yaml files?

@alexeik

This comment has been minimized.

Copy link
Author

commented Sep 19, 2019

yes.
this class BuildableTerrainLayer exists in World.yaml

	Inherits: ^BaseWorld
	ChatCommands:
	DevCommands:
	DebugVisualizationCommands:
	PlayerCommands:
	HelpCommand:
	ScreenShaker:
	BuildingInfluence:
	D2ConcreteOwners:
	ProductionQueueFromSelection:
		ProductionPaletteWidget: PRODUCTION_PALETTE
	ActorSpawnManager:
		Actors: sandworm
	DomainIndex:
	WarheadDebugOverlay:
	D2TerrainLayer:
	BuildableTerrainLayer:``` 

so. my bad. everything is fine here.
@alexeik

This comment has been minimized.

Copy link
Author

commented Sep 19, 2019

but)
the same as SmudgeLayer.cs

image

and
image

@pchote

This comment has been minimized.

Copy link
Member

commented Sep 19, 2019

These traits make draw calls because their whole purpose is to draw sprites to the screen. The dirty and remove flags relate to updating state within the buffer, which is a separate task to drawing the buffer to the screen.

@alexeik

This comment has been minimized.

Copy link
Author

commented Sep 19, 2019

and why this call sends for Vertex Shader , vertex buffer filled with zeroes and do nothing as a result?

@GraionDilach

This comment has been minimized.

Copy link
Contributor

commented Sep 20, 2019

@alexeik You have two SmudgeLayers as well.
https://github.com/OpenRA/d2/blob/master/mods/d2/rules/world.yaml#L161

@pchote He wants to skip the rendering call of TerrainSpriteRenderers if they don't have any tile within them (visible currently). That sounds like a worthwhile gain, especially if someone starts using theaters with multiple palettes.

@RoosterDragon

This comment has been minimized.

Copy link
Member

commented Sep 20, 2019

tl;dr: Something along these lines would be possible, but there's a fair amount of effort required to get something smart enough to be useful in practise (hence why nobody has done it yet!)


skip the rendering call of TerrainSpriteRenderers if they don't have any tile within them (visible currently)

To do that, we'd need to track CPU side if the buffer we were about to draw was 'empty'. This isn't straightforward performance-wise.

  • If the buffer has no items, then great, we can skip drawing it and save a bunch of GPU.
  • If the buffer has even just one item, we still have to draw it.

How often will a buffer be completely empty in practise though? Probably not ever.

This optimization getting defeated as soon as there are any items at all is pretty sad. We need something that can't be defeated so easily.

The current implementation of the layer deals with drawing items on a per-row basis. We could track the number of items in each row to skip any "empty" rows.

But how often, across an entire map, is an entire row devoid of an item to draw? Worse yet, even if the item is offscreen, we still need to draw the row it's in.

What if we checked whether there were any items in rows onscreen instead then? We could skip those and that would probably be pretty nice!

The current implementation has no quick way of checking if there are items in the section of the each row that is visible - it'd have to check each tile within the onscreen area to see what was in it.

By the time we've checked each tile individually to see what we can skip drawing, we've burned more CPU time then we could ever hope to gain back from saved GPU time.

To have an effective solution, you need something that can efficiently determine if you can skip drawing some parts of the buffer, so you don't burn too much CPU time, and therefore can actually gain something back by reducing the effort on the GPU.

At this stage you need to bring SpatiallyPartitioned into play. Instead of considering things on a per-tile basis, chunk up the map into grids of say 10x10.

  • We track the number of items within each grid, and draw that grid if there's anything.
  • If the grid is empty, we can skip drawing it.
  • The is better on CPU, because instead of checking 100 individual tiles, we can check whole grids at a time, so there's 100x less 'things' to check on.
  • By dividing things up into grids, there's more likely to be empty grids which we can skip drawing on and gain the GPU savings from. A item onscreen only causes that grid to have to be drawn, rather than the whole screen.
  • The current implementation uses a contiguous vertex buffer for the whole area, and only issues one draw call. If it needs to draw a section of this buffer to the screen, we must draw all visible rows (because the buffer is contiguous). This means we can avoid drawing regions above and below the screen, but we can't avoid overdraw to the left and right of the screen (unless we make a draw call per visible row, so we can trim the width shown, but that means more draw calls even if it's less stuff drawn)
  • By dividing things up into grids, we only draw the grids on screen, we overdraw the areas where a grid is only partially onscreen, but unlike the current implementation we don't have to draw the full width of the map.
  • This is necessarily going to require more than one draw call, and do you have separate vertex buffers per grid?
  • You can get clever with just one vertex buffer shared by all the grids, and then do some smarts to do as few draw calls as possible to show all the visible grids. e.g. if 4 grids in a line are visible, and they're all next to each other in the vertex buffer, you can coalesce that into one draw call. But if they're not next to each other in the vertex buffer, you're going to need individual draw calls for each one.

Assuming you get this far, as long as less things to draw outweighs the cost of more draw calls, and some minor CPU overhead for the bookkeeping, then you're onto a winner.

If it turns out that the 'dumb' implementation of one draw call to just 'draw everything' is faster, then you'd just added a lot of complexity for not much gain.

  • The dumb implementation wins if you have to draw most of the area on screen regardless (e.g. the terrain and shroud layers) - if there's always a large amount of things to draw, it's better to not waste time trying to determine what to draw, and just get on with drawing it.
  • The smart implementation wins if you only have to draw a few things (e.g. the smudge layer) - in this case spending time on culling is great, because you can skip drawing a bunch of 'nothing' and that saves time.

So for some of our existing layers, the dumb implementation is probably already doing a decent job - but for some of the others a smarter system could make some gains.

It's worth looking into, but getting a useful speedup requires an amount of implementation effort.

@RoosterDragon RoosterDragon changed the title overdraws from Traits from attached Mod dlls Traits using TerrainSpriteLayer cause OpenGL overdraw Sep 20, 2019
@alexeik

This comment has been minimized.

Copy link
Author

commented Sep 22, 2019

@RoosterDragon it can be smaller as just

It's worth looking into, but getting a useful speedup requires an amount of implementation effort.

every1 knows that openRA has no statistic data to answer your philosophical assumptions about gain :)

@GraionDilach

This comment has been minimized.

Copy link
Contributor

commented Sep 24, 2019

@RoosterDragon Practical use of #10246 can lead to usecases where only a few tiles use a secondary palette. However this current implementation still result with a consistent perf with the terrain on all maps, so that's probably the better player experience.

Smudges and resources should not use TerrainSpriteLayer though IMO - the current implementation of smudges breaks TS anyway (where smudges can't overlap or stack, if a cell had a smudge on it, then it is immune for further ones IIRC and some had a foundation requirement with being 2x2 sized).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.