Grass instancing #3023

Merged
merged 13 commits into from
Jan 27, 2021

Collaborator

@akortunov akortunov commented Oct 25, 2020

An alternative to #3010. Opened to have build artefacts to test the feature and compare performance with paging PR.
Existing settings:

[Groundcover]
enabled = true
density = 0.5
min chunk size = 0.5

The main idea: instead of merging grass instances into large shapes, draw the same shape many times without uploading the same data every time. Since all instances of the same shape share the same transformation, we need to move, scale and rotate every instance in the vertex shader. This reduces CPU and RAM usage compared to paging, but causes higher GPU usage due to the additional calculations in shaders.
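
To make the idea concrete, here is a minimal CPU-side sketch in Python of what the vertex shader would do per instance. It is not the PR's shader code: the mesh, the instance tuples and the function names are all hypothetical, and rotation is limited to the Z axis for brevity.

```python
import math

# Shared mesh: one grass blade as a single triangle, stored only once.
BLADE_VERTICES = [(-0.5, 0.0, 0.0), (0.5, 0.0, 0.0), (0.0, 0.0, 1.0)]

def transform_vertex(vertex, position, scale, yaw):
    """Emulate the per-instance work: scale, rotate around Z, then translate."""
    x, y, z = (c * scale for c in vertex)
    cos_a, sin_a = math.cos(yaw), math.sin(yaw)
    rx = x * cos_a - y * sin_a
    ry = x * sin_a + y * cos_a
    px, py, pz = position
    return (rx + px, ry + py, z + pz)

def draw_instanced(vertices, instances):
    """Emulate an instanced draw: one shared mesh, one small record per instance."""
    return [[transform_vertex(v, *inst) for v in vertices] for inst in instances]

# Hypothetical per-instance data: (position, scale, yaw in radians).
instances = [((0.0, 0.0, 0.0), 1.0, 0.0), ((10.0, 5.0, 0.0), 2.0, math.pi / 2)]
world = draw_instanced(BLADE_VERTICES, instances)
```

Only the small `instances` list grows with the amount of grass; the mesh itself is never duplicated, which is where the RAM and CPU savings come from.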

Note that groundcover objects cannot cast shadows, since the shadow system is not aware of any transformations done in the vertex shader. This can be fixed in a follow-up PR, but such shadows would cause a really large performance drop. It would probably be better to use post-processing shaders (e.g. SSAO) instead of real-time shadows for groundcover.

In my testing instancing is a bit slower than paging: it has higher GPU usage but lower Draw and Cull, and my setup is GPU-limited. An advantage of instancing is that it uses less RAM (roughly 100MB in my testing); cell transitions are also faster, since OpenMW does not need to load a lot of instances and then merge them into batches.
This behaviour is fully expected since the main purpose of instancing is to reduce CPU usage at the cost of additional GPU overhead (more complex shaders and an overhead to bind per-vertex data).

@akortunov akortunov changed the title Grass intstacing Grass instancing Oct 25, 2020
@AbduSharif
Contributor

On my device grass paging wins by a big margin:
Grass paging:
While walking:
80-54
When cell transition happens FPS might drop to 35.
While looking at a big patch of grass:
75-65

Grass instancing:
While walking:
65-30
When cell transition happens FPS might drop to 25.
While looking at a big patch of grass:
60-43

My groundcover settings:
[Groundcover]
enabled = true
fade start = 1.0
density = 0.5
distance = 1
animation = true

OP and AG are on, cell view distance is 1.18, and OP merge factor is 450

@akortunov
Collaborator Author

akortunov commented Oct 25, 2020

I played around with the groundcover chunk size.
Default size (1/8 cell = 1024 units):
Screenshot_20201025_223903

Increased size (1/2 cell = 4096 units):
Screenshot_20201025_223739

In theory, larger chunks should give lower Draw usage but higher GPU usage.
Note that with paging, an increased chunk size causes depth artifacts when turning. I will try to investigate the issue and make the chunk size configurable.
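
As a quick check of the numbers above: "1/8 cell = 1024 units" implies a cell side of 8192 units. The sketch below works out the chunk side and the number of chunks per cell for each setting. The function name is hypothetical, and treating each chunk as one unit of CPU-side work is a simplifying assumption.

```python
CELL_SIZE_UNITS = 8192  # implied by "1/8 cell = 1024 units"

def chunk_stats(chunk_fraction):
    """Return (chunk side in units, number of chunks covering one cell)."""
    side = CELL_SIZE_UNITS * chunk_fraction
    per_cell = round(1 / chunk_fraction) ** 2
    return side, per_cell

default_side, default_chunks = chunk_stats(1 / 8)
large_side, large_chunks = chunk_stats(1 / 2)
```

With the default size a cell is split into 64 chunks, and with the increased size into only 4, so the CPU handles far fewer (but larger) drawables.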

@psi29a
Member

psi29a commented Oct 25, 2020

Interesting that while Draw was significantly reduced, GPU went down a small bit when it was expected to go up. Culling was also slightly reduced.

@AbduSharif
Contributor

AbduSharif commented Oct 26, 2020

Did something change?
"groundcover=" files aren't loaded.
Not even "data=" lines.
Built with latest commit.

Something changed with the config files detection.

@akortunov
Collaborator Author

akortunov commented Oct 26, 2020

Interesting that while Draw was significantly reduced, GPU went down a small bit when it was expected to go up

Actually, the behaviour is expected: if you have a small number of large drawables, the CPU spends much less time handling them than a large number of small drawables. On the other hand, GPU load increases a bit with large chunks, since culling becomes less efficient and the GPU renders some object instances outside of the view frustum.
As for the GPU bar, it may be just a measurement fluctuation between different sessions.

@akortunov
Collaborator Author

akortunov commented Oct 26, 2020

Did something change?

My latest commit just changed the groundcover chunk size, it did not touch configs. Probably there are some upstream changes that may cause the issue. You can also try to build a previous commit from this branch and check if the issue is resolved.

@akortunov
Collaborator Author

akortunov commented Oct 26, 2020

Added a configurable chunk size (0.5 cell = 4096x4096 by default). Such a setting causes graphics artifacts (wrong depth order when rotating the camera) with object paging, and I did not manage to fix that, so this setting is instancing-only.

The main issue left is 3 - I'd appreciate any help to figure out why refractions cause a performance drop with instancing, but not with paging. Can it be that the refraction camera works with "real" coordinates, not with the ones I set in the shader, so refraction tries to handle all groundcover objects?

@akortunov
Collaborator Author

akortunov commented Oct 27, 2020

I managed to drop invisible instances (those beyond the fading distance) in the vertex shader. On my setup this saves about 35% of the time the GPU spends handling such instances. Since handling invisible instances takes about 20% of the total groundcover time, that is about 6% of the total GPU time for groundcover on my setup, and a couple of additional FPS.
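
The comment does not say how the instances are dropped. One common way to do it in a vertex shader, shown here as a CPU-side Python sketch with hypothetical names, is to collapse every vertex of a too-distant instance to the same point, so that its triangles become degenerate and produce no fragments.

```python
import math

def cull_or_keep(vertex, instance_origin, camera_pos, fade_end):
    """Return the vertex unchanged, or a collapsed point if the instance is too far away."""
    if math.dist(instance_origin, camera_pos) > fade_end:
        # All vertices of this instance land on one point: zero-area triangles.
        return (0.0, 0.0, 0.0)
    return vertex

camera = (0.0, 0.0, 0.0)
near = cull_or_keep((1.0, 2.0, 3.0), (50.0, 0.0, 0.0), camera, fade_end=100.0)
far = cull_or_keep((1.0, 2.0, 3.0), (500.0, 0.0, 0.0), camera, fade_end=100.0)
```

The test only needs the per-instance position, which the instancing path already has available in the shader.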

Note that it is harder to use the same trick with paging, since I would need to collect the instances' coordinates and send them to the GPU first, which is not really good.

@AbduSharif
Contributor

Could you do a rebase? AO3 fixed the issue with config files.

@akortunov
Collaborator Author

Could you do a rebase? AO3 fixed the issue with config files.

Done.

@akortunov
Collaborator Author

I hope that we are done with crashes now.

@AbduSharif
Contributor

AbduSharif commented Oct 29, 2020

With the latest changes, cell transitions with grass instancing are even snappier than with grass paging (and than before). Still shots with lots of grass seem to have a more stable FPS than before, increased by 8-15. For me, FPS is still more consistent and better with grass paging.

Now what to do about the "Cell reference 'unknown_grass' not found!" message spam in the log?
Having 687 lines of the same message isn't great: the log file is 4.x KB with just one such message vs 33.x KB with all of them.

I'm using Remiro's grass for OpenMW

@akortunov
Collaborator Author

Cell reference 'unknown_grass' not found!

If I remember correctly, it is a bug in the mod itself (it contains a lot of garbage records). I doubt that we should "fix" it on the engine side.

@psi29a psi29a mentioned this pull request Oct 30, 2020
@psi29a
Member

psi29a commented Oct 30, 2020

To your points:

  1. That isn't an issue just yet, I think it should be a setting to enable shadows for ground cover.
  2. Not a blocker as tuning and performance improvement comes over time and isn't expected to be perfect at once.
  3. Not a blocker, just more research is needed. Don't we already allow to turn off these kinds of refractions?
  4. Not a blocker, content related but should be noted in documentation for now.

@akortunov
Collaborator Author

akortunov commented Oct 30, 2020

I did some additional investigation into instancing performance. Different sources state that paging is generally faster than instancing with low-poly meshes (such as Morrowind's), while instancing should be faster with complex meshes (several thousand triangles per shape) and/or with tens of thousands of instances of the same mesh: with paging you need to load a lot of redundant data in such a case.
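
The "redundant data" point can be illustrated with a rough count. The sketch below only compares how much data each approach has to hold, not how fast it renders. The mesh sizes and instance count are hypothetical numbers picked for illustration.

```python
def vertices_with_paging(vertices_per_mesh, instance_count):
    """Merged batches repeat the full mesh once per instance."""
    return vertices_per_mesh * instance_count

def records_with_instancing(vertices_per_mesh, instance_count):
    """The mesh is stored once, plus one small per-instance record."""
    return vertices_per_mesh + instance_count

# Hypothetical cases: a low-poly grass mesh and a complex mesh, 20,000 instances each.
low_poly_paging = vertices_with_paging(12, 20_000)
low_poly_instancing = records_with_instancing(12, 20_000)
complex_paging = vertices_with_paging(3_000, 20_000)
complex_instancing = records_with_instancing(3_000, 20_000)
```

For the low-poly mesh the gap is 240,000 vertices against about 20,000 records; for the complex mesh it grows to 60,000,000 against 23,000, which is where paging starts to hurt.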

@akortunov
Collaborator Author

Don't we already allow to turn off these kinds of refractions?

I'd not disable groundcover in refraction because there are mods which add an underground flora.
BTW, this issue is partially mitigated now.

@heilkitty
Contributor

While playtesting Lyithdonea: Azurial Isles with grass enabled, this happened: screenshot005

@akortunov
Collaborator Author

akortunov commented Nov 1, 2020

this happened

Looks similar to this issue (a comberry mesh from modern Graphics Herbalism was affected). I suspect that the mod uses an esoteric texture format, which OSG (at least, your binary) does not support properly. Can you try to enable your groundcover mod as a common one ("content" instead of "groundcover") and check if the issue persists?

@heilkitty
Contributor

I use OpenMW's OSG fork, and all the grass textures seem to be DDS. With content instead of groundcover it works fine (if you don't count FPS and collisions). Should I test with grass paging?

Different place, but same grass.

@psi29a
Member

psi29a commented Nov 1, 2020

Does instanced groundcover now cast shadows?

@akortunov
Collaborator Author

Can you attach an affected mesh and texture?

@akortunov
Collaborator Author

akortunov commented Jan 12, 2021

According to feedback, the approach with an IsGroundcover flag for statics does not work when a groundcover mod contains only instances of groundcover statics, while the static definitions come from another ESM. SHOTN uses this approach (the grass itself is defined in Tamriel_Data.esm, while its instances are in Sky_Main_Grass.esp). I see two ways to avoid this issue:

  1. Use MGE-like approach to detect grass statics (check if the mesh is in the Grass folder).
  2. Mark instances from groundcover mods as groundcover objects, not statics themselves.

I am trying the second approach for now. The main idea is that I load all groundcover plugins after the regular ones, so I know the index of the first loaded groundcover mod at game start. I can save this index and then compare it with the content file index of every ESM::CellRef to determine whether the object instance comes from a groundcover file.
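
The two options can be sketched as follows. This is an illustrative Python model, not the engine code: the `CellRef` fields, the function names, the mesh paths and the load order (including `Morrowind.esm`) are assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class CellRef:
    ref_id: str
    mesh: str                # hypothetical: model path of the referenced static
    content_file_index: int  # hypothetical: position of the placing plugin in the load order

def is_grass_by_folder(ref):
    """Option 1 (MGE-like): the mesh lives in a top-level 'grass' folder."""
    parts = ref.mesh.replace("\\", "/").lower().split("/")
    return len(parts) > 1 and parts[0] == "grass"

def is_grass_by_plugin(ref, first_groundcover_index):
    """Option 2: groundcover plugins load last, so a later index means groundcover."""
    return ref.content_file_index >= first_groundcover_index

content_files = ["Morrowind.esm", "Tamriel_Data.esm"]
groundcover_files = ["Sky_Main_Grass.esp"]
load_order = content_files + groundcover_files
first_groundcover_index = len(content_files)

grass = CellRef("example_grass", "Grass\\example_grass.nif",
                load_order.index("Sky_Main_Grass.esp"))
rock = CellRef("example_rock", "x\\example_rock.nif",
               load_order.index("Tamriel_Data.esm"))
```

Option 2 needs only one stored integer and an integer comparison per reference, and it does not depend on where the static itself is defined, which is what the SHOTN case requires.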

I also tried to use a flag in the ESM reader, but without success: some parts of the engine work with ESM files directly.

@akortunov
Collaborator Author

akortunov commented Jan 22, 2021

Note that I do not like additional optional arguments in recreateShaders to handle special cases.

@psi29a psi29a merged commit b164f1a into OpenMW:master Jan 27, 2021
@pulion

pulion commented Feb 9, 2021

On the Android version of OpenMW, when grass instancing is enabled, the GPU is loaded at 100% and the FPS drops to 12-15. At the same time, the CPU is almost idle and its load does not exceed 20%. When grass instancing is turned off, the FPS stays stable at around 30-40 in the busiest places, the GPU load does not exceed 80% and the game runs smoothly. Is it possible to redistribute threads for a uniform load between the CPU and GPU when this feature is enabled?

@AnyOldName3
Member

GL4ES doesn't actually use the GPU's support for instancing even when it's available and instead fakes it CPU-side. It's not an OpenMW problem. If the graphics driver tells us it supports instancing, it's supposed to mean that it can do it quickly.

@pulion

pulion commented Feb 9, 2021

GL4ES doesn't actually use the GPU's support for instancing even when it's available and instead fakes it CPU-side. It's not an OpenMW problem. If the graphics driver tells us it supports instancing, it's supposed to mean that it can do it quickly.

I cannot say what exactly is to blame for the FPS drop when grass instancing is enabled, since I do not understand this. But I see a significant difference, both in benchmarks and in how the game feels, when disabling grass instancing in the OpenMW settings. In the first screenshot grass instancing is enabled, in the second it is disabled. The monitoring data can be seen in the center of the screenshot.

Screenshot_20210209-182400576
Screenshot_20210209-182552647

@pulion

pulion commented Feb 10, 2021

Found the reason for the slowdowns and the 100% GPU load on the latest nightly build when using grass instancing.
The problem arises when grass instancing is used together with the new version of the water shaders from vtastek plus refraction. When you remove the water shader from the APK, the GPU load returns to normal (35-80%) and the FPS is stable.
In the first screenshot, grass instancing is using shaders from vtastek and the standard water shaders. In the second screenshot, grass instancing is using shaders from vtastek plus his water shaders. As the load monitoring program shows, in the second case the GPU is loaded at 100% and the FPS is too low to play comfortably. I hope this can be fixed.
Screenshot_20210209-182552647
Screenshot_20210209-182400576

@akortunov
Collaborator Author

in the first screenshot of grass instancing using shaders from vtastek

Judging by the first screenshot from the last post, you just enabled groundcover mod as a common one. Are you sure that you uploaded correct screenshots?

@pulion

pulion commented Feb 10, 2021

in the first screenshot of grass instancing using shaders from vtastek

Judging by the first screenshot from the last post, you just enabled groundcover mod as a common one. Are you sure that you uploaded correct screenshots?

Sorry, wrong ones. All the screenshots are taken from the same angle, so the difference cannot be seen on a smartphone. Fixed. Here is the first screenshot:
Screenshot_20210210-091837694
