Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Atlas texture and batched rendering #429

Merged
merged 16 commits into from
Mar 13, 2020

Conversation

grantramsay
Copy link
Contributor

Spent ages trying to get this working with a GL_TEXTURE_2D_ARRAY but couldn't quite get it to work, parts of the Nuklear GUI would be missing if the array texture was allocated with > 1 depth (layers).
Currently it uses an array of textures (up to 8) to store all the images, runs at ~130 FPS on my machine (GL_TEXTURE_2D_ARRAY runs at ~180).
Needs a test on a few different PCs

Reduces video memory usage
Remove hack that excluded large textures from texture atlas
Uses texture atlas for GUI/Nuklear stuff as well and doesn't duplicate sprites anymore
Some GUI stuff is still a bit broken when texture atlas has more than one layer and/or compression is enabled...
uniform float h_color_a;
uniform float imgW;
uniform float imgH;
uniform sampler2D tex[8]; // NOTE: The size of this array has a large impact on performance...
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This array size has the main impact on performance.
With a GL_MAX_TEXTURE_SIZE of 16384 this only needs to be 2, 8192 is still common though so went with 8.
Using a GL_TEXTURE_2D_ARRAY avoids the need for this array (uniform sampler2DArray tex;)

Would be simpler with much better performance if a proper 2D texture array (GL_TEXTURE_2D_ARRAY) was used. I couldn't quite get it to work, parts of the Nuklear GUI would be missing if the array texture was allocated with > 1 depth (layers)...
Tidy up added rendering code and gl shaders
Remove erroneous assertion
Copy link
Owner

@wheybags wheybags left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks really great, I now get an FPS of ~130 in the dungeon at 4k!

I would suggest that we drop the SpriteCache altogether, and just preload all textures at startup.
For the gui, I wouldn't worry about batching draws, and just do whatever, and the textures that it generates dynamically (eg, fonts) can just be their own textures. I'm not sure how these are working in this PR at the moment.

Left a few other misc comments scattered about as well. Over all though, amazing work! Thank you very much.

components/render/atlastexture.cpp Outdated Show resolved Hide resolved
components/render/atlastexture.cpp Outdated Show resolved Hide resolved
components/render/atlastexture.cpp Show resolved Hide resolved
components/render/sdl2backend.cpp Outdated Show resolved Hide resolved
components/render/sdl2backend.cpp Outdated Show resolved Hide resolved
components/render/sdl_gl_funcs.h Outdated Show resolved Hide resolved
@@ -257,6 +257,9 @@ namespace FARender

void SpriteCache::evict()
{
if (!Render::SpriteGroup::canDelete())
return;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, we can get rid of the whole concept of a Sprite Cache.
I woule be happy with just preloading all the textures we're going to use into the atlas at startup, and if we have problems with this later, then we can figure it out later.

Copy link
Contributor Author

@grantramsay grantramsay Mar 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh yep sweet, will give this a go.
That method name is also terrible, was meant to be canDeleteSprites

Edit: how do we know which sprites we're going to use at startup?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could dump a list, by running the game and tracing each load. Or, honestly, we could just preload every CEL/CL2 file in DIABDAT.MPQ.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wheybags I had a go at preloading all the sprites, it seems there are simply too many.
My debug outputs: 1402 Cel files, 110247 individual sprites.
The 2D bin packer mustn't have a very good Big O and gets slower and slower, but even disabling that just the Cel decoding takes > 60s.

I don't see dumping a list of used Cel files as being very maintainable.
Might have to stick with loading files runtime?

Also I had to disable decoding these 3 Cel files as they cause a crash when decoding (they also crash if clicked in the celview executable):

"monsters/fireman/firema.cl2"
"monsters/unrav/unravw.cel"
"monsters/darkmage/dmagew.cl2"

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I would do here is keep some list of sprites, and on first run, decode them all into atlases, and save the atlases as pngs in soem temp folder. Then we can just laod the single pngs on subsequent runs. OFC we would need to invalidate that if the sprite list changes.

@AJenbo
Copy link

AJenbo commented Mar 15, 2019

The 3 mentioned cel files are corrupt so won't decode under normal conditions

@grantramsay
Copy link
Contributor Author

Ahh thanks @AJenbo that makes sense

@grantramsay grantramsay force-pushed the test/textureAtlasBatchRender branch 2 times, most recently from efadd7e to c70fa72 Compare March 18, 2019 01:17
MaxRects time increases exponentially, after adding 18000 items MaxRects takes 1000 times longer than Skyline. This is unacceptable with ~110,000 sprites. Skyline packing efficiency is pretty good too
Comment on lines 18 to 28
GLint maxTextureSize;
GLint maxTextures = 8; // Hardcoded in fragment shader, GL 3.x minimum is 16.
glGetIntegerv(GL_MAX_TEXTURE_SIZE, &maxTextureSize);

mTextureWidth = maxTextureSize;
mTextureHeight = maxTextureSize;

// Limit number of textures to a reasonable level (measured from testing).
// Note: Increasing this has a severe impact on performance.
GLint estimatedRequiredTextures = (1uLL << 29) / (mTextureWidth * mTextureHeight);
mTextureLayers = std::min(estimatedRequiredTextures, maxTextures);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm getting maxTextureSize = 32768 on my GTX 1050. This makes mTextureWidth * mTextureHeight be equal to 1073741824, which makes the division in estimatedRequiredTextures be equal to 0, making mTextureLayers be equal to 0. This leads to a assertion failure (release_assert(layer < mTextureLayers)).

// Limit number of textures to a reasonable level (measured from testing).
// Note: Increasing this has a severe impact on performance.
GLint estimatedRequiredTextures = (1uLL << 29) / (mTextureWidth * mTextureHeight);
mTextureLayers = std::min(estimatedRequiredTextures, maxTextures);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
mTextureLayers = std::min(estimatedRequiredTextures, maxTextures);
mTextureLayers = estimatedRequiredTextures > 0 ? std::min(estimatedRequiredTextures, maxTextures) : maxTextures;

Copy link
Contributor

@vieiraa vieiraa Jan 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixing this problem leads to a glError (freeablo/components/render/sdl2backend.cpp:694, 0x0506).

Copy link
Contributor Author

@grantramsay grantramsay Jan 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try 1 instead of maxTextures, e.g:

mTextureLayers = estimatedRequiredTextures > 0 ? std::min(estimatedRequiredTextures, maxTextures) : 1;

Or just:

GLint estimatedRequiredTextures = mtd::max(1, (1uLL << 29) / (mTextureWidth * mTextureHeight));

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now it is allocating so much memory it freezes my system (I have 8gb of RAM and a 19gb swap partition).

Copy link
Contributor Author

@grantramsay grantramsay Jan 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be allocating too much vram from your graphics card..
Can you try this instead:

glGetIntegerv(GL_MAX_TEXTURE_SIZE, &maxTextureSize);
maxTextureSize = std::min(maxTextureSize, 16384);
mTextureWidth = maxTextureSize;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now it worked. Getting 370-400 fps. Wow.

@wheybags
Copy link
Owner

I get the error

0:14(19): error: sampler arrays indexed with non-constant expressions are forbidden in GLSL 1.30 and later

If you have time, I think the proper answer is either to move to texture arrays, or break the batch when the atlas texture changes. The second option is more scalable IMO, but it would require some intelligent allocation of atlases (at the very least, the ground textures should all be in one atlas).

@wheybags
Copy link
Owner

I will get around to renderer work eventually myself, if I do before this is merged, I'll just start working on this branch myself. Either way, what's here already is a good chunk of what needs to be done, so thanks a lot!

@grantramsay
Copy link
Contributor Author

Yeah there's some good stuff in here but it's not stable enough to merge. Storing all the textures is a bit heavy on gpu ram. Splitting textures into drawing layer categories (if possible) and only caching the current level tiles would probably be better. All good, hopefully what's here helps you out

Use a 2D array texture 8192*8192 with 2 layers
@grantramsay
Copy link
Contributor Author

I had a thought that clearing the atlas texture when changing levels would be a simple way to reduce the large RAM requirements of storing all the sprites, decided to give it a go..
Uses a 2D array texture 8192*8192 with 2 layers.
It's just a thought, let me know what you think. And feel free to discard if you carry this work on in a different direction

@wheybags
Copy link
Owner

wheybags commented Feb 3, 2020

That's a direction we could go, but tbh I think we should be ok to just store the whole game's assets in VRAM. factorio does this (normally it keeps is all on the GPU, it can swap out too if you're low on VRAM, but that's a relatively recent addition for lower end cards) and we should have waaaaaaaay less image data to store.

@grantramsay grantramsay mentioned this pull request Feb 14, 2020
…tchRender

# Conflicts:
#	apps/freeablo/engine/threadmanager.cpp
#	apps/freeablo/faworld/playerbehaviour.cpp
#	components/render/nuklear_sdl_gl3.h
#	components/render/sdl2backend.cpp
#	components/render/sdl_gl_funcs.cpp
# Conflicts:
#	apps/freeablo/farender/renderer.cpp
#	apps/freeablo/faworld/actoranimationmanager.cpp
@wheybags
Copy link
Owner

Im going to start wokring on a bit of an overhaul to the rendering code, so I think I will just merge this as-is for now, and use it as a base (also avoids my pr getting absolutely massive if I just branch off this and merge it all together :v)

@wheybags wheybags merged commit 8cffeb1 into wheybags:master Mar 13, 2020
@vieiraa vieiraa mentioned this pull request Mar 13, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants