
GPU local texture data copying #1119

Merged
merged 1 commit into from Mar 7, 2017
Conversation

binary1248
Member

In many applications, it is common for the user to stitch smaller textures into bigger ones in order to minimize the amount of OpenGL state changes necessary when drawing. This is commonly known as atlasing. Even sf::Font makes use of this when loading its glyphs.

The problem with atlasing in SFML is that each time a new texture has to be inserted into the atlas, users have to copy the texture data back to RAM, resize/update it, and transfer it back onto the GPU. This costs a lot of time. With this PR, the user (given the proper hardware support) will be able to copy the existing texture to a temporary texture on the GPU, resize the original texture to its new size, and copy the old texture data over. This does not involve any copying of data between the GPU and CPU.

#include <SFML/Graphics.hpp>

int main()
{
    sf::Font font;
    if (!font.loadFromFile("resources/sansation.ttf"))
        return -1;

    // Request every printable ASCII glyph at a huge character size,
    // forcing the glyph atlas texture to grow repeatedly
    for (unsigned int i = 32; i < 127; i++)
        font.getGlyph(i, 1000, false, 0);
}

Testing with this code yields a performance increase of up to two orders of magnitude. On my system, the time it takes for the font to load the codepoints decreases from 20 seconds to under 1 second.

This is a contrived example that only tests the texture copying indirectly, since it is meant to be a minimal demonstration. If anybody else has texture copy benchmarks, they can provide the data here as well.
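
For illustration, a minimal sketch of the GPU-local pattern described above (the texture names are illustrative; the copy constructor and the update(const Texture&) overload are the paths this PR accelerates):

#include <SFML/Graphics.hpp>

int main()
{
    // An atlas texture that has run out of space
    sf::Texture atlas;
    atlas.create(256, 256);

    // Keep the old contents in a temporary GPU-side copy, grow the
    // atlas, then blit the old data back; given hardware support,
    // no pixel data travels through system RAM
    sf::Vector2u oldSize = atlas.getSize();
    sf::Texture oldAtlas(atlas);
    atlas.create(oldSize.x * 2, oldSize.y * 2);
    atlas.update(oldAtlas);
}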

In addition to this feature, I went ahead and optimized sf::Font::cleanup() to deallocate its pixel buffer storage. This way, if the user loads glyphs at a huge character size and later reuses the same sf::Font object for another font at smaller character sizes, they won't be stuck with a std::vector using up more memory than necessary.
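
(For context, clear() alone does not release a vector's capacity; a sketch of the usual deallocation idiom, assuming this is roughly what cleanup() now does:)

#include <SFML/Config.hpp>
#include <vector>

// Release the pixel buffer's allocation entirely; clear() would only
// reset the size while keeping the capacity allocated
void shrinkPixelBuffer(std::vector<sf::Uint8>& pixelBuffer)
{
    std::vector<sf::Uint8>().swap(pixelBuffer);
}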

return;
}

#endif // SFML_OPENGL_ES
Member

Isn't this supposed to be an #else to handle the fallback just below?

Member

Oh... I just noticed the return above and its implication for the if (GLEXT_framebuffer_object && GLEXT_framebuffer_blit)... a bit spaghetti. 😜

Member Author

I couldn't think of any other way of writing it so that even on non-ES systems it would fall through to the CPU copy if the extension isn't available...

Member

I'm not familiar enough with OpenGL ES to know if the following would compile... but it could be more readable.

#ifdef SFML_OPENGL_ES
    bool fallback = true;
#else
    ensureGlContext();
    priv::ensureExtensionsInit();
    bool fallback = !(GLEXT_framebuffer_object && GLEXT_framebuffer_blit);
#endif

if (fallback)
    update(texture.copyToImage(), x, y);
else
{
    // the big part here
}

Member Author

If you ask me, this makes the code harder to reason about than the current version.

@mantognini
Member

Interesting, I'll test this code in a bit to see how it affects performance on my machine! :-)

GLenum status;
glCheck(status = GLEXT_glCheckFramebufferStatus(GLEXT_GL_FRAMEBUFFER));

if (status != GLEXT_GL_FRAMEBUFFER_COMPLETE)
Member

It might be more concise to refactor as follows:

if (status == GLEXT_GL_FRAMEBUFFER_COMPLETE)
    GLEXT_glBlitFramebuffer(...)
else
    err() << ...

cleanup / return

Member Author

Fixed.

@mantognini
Member

Best performance of several runs, without this patch: 1.67 real 0.60 user 0.33 sys
Worst performance of several runs, with this patch: 1.12 real 0.14 user 0.08 sys

👍

@@ -583,11 +583,14 @@ Glyph Font::loadGlyph(Uint32 codePoint, unsigned int characterSize, bool bold, f
// pollute them with pixels from neighbors
const unsigned int padding = 1;

width += 2 * padding;
height += 2 * padding;
Member

(Sorry, I was away the whole week; I'm trying to catch up.)

What's the reason for adding 2 * padding to width and height here, and then constantly subtracting it everywhere width and height are used?

Member Author

The block of memory we work with now includes the border padding (which it didn't before). Instead of initializing the texture to the padding colour when we create it (which might waste time when the texture gets really big), we do it every time a glyph would actually need the padding around it and upload it then.
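
(A sketch of the resulting arithmetic, with names mirroring the diff below; the x line is the symmetric counterpart of the y line shown there:)

// The glyph's texture rect excludes the border, so the uploaded
// rectangle starts one padding earlier and is 2 * padding larger
unsigned int x = glyph.textureRect.left - padding;
unsigned int y = glyph.textureRect.top  - padding;
unsigned int w = glyph.textureRect.width  + 2 * padding;
unsigned int h = glyph.textureRect.height + 2 * padding;
page.texture.update(&m_pixelBuffer[0], w, h, x, y);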

@eXpl0it3r
Member

This PR needs a rebase.

unsigned int y = glyph.textureRect.top - padding;
unsigned int w = glyph.textureRect.width + 2 * padding;
unsigned int h = glyph.textureRect.height + 2 * padding;
page.texture.update(reinterpret_cast<const Uint8*>(&m_pixelBuffer[0]), w, h, x, y);
Member

This looks unsafe, since it depends on the processor endianness. On little endian machines you'll end up with RGBA byte ordering and on big endian you'll get ABGR.

Why did you change the pixel buffer from uint8 to uint32?
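
(To illustrate the concern, a standalone sketch showing how the in-memory byte order of a packed 32-bit pixel differs between architectures:)

#include <SFML/Config.hpp>
#include <cstdio>
#include <cstring>

int main()
{
    // One pixel packed into a single 32-bit value
    sf::Uint32 pixel = 0xAABBCCDD;

    // Inspect the actual byte layout in memory
    sf::Uint8 bytes[4];
    std::memcpy(bytes, &pixel, sizeof(pixel));

    // Little endian prints "DD CC BB AA", big endian prints "AA BB CC DD",
    // so OpenGL would see a different channel order depending on the CPU
    std::printf("%02X %02X %02X %02X\n", bytes[0], bytes[1], bytes[2], bytes[3]);
}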

Member Author
@binary1248 binary1248 Nov 4, 2016

The idea was that, because the unused area of m_pixelBuffer should contain transparent white pixels, the initialization could be done once up front instead of every time a glyph is added. This scales better with bigger glyphs and as more glyphs are added.

The only reason I changed it to Uint32 is that initialization would then consist of a single line (the assign). Now that I think of it, since a single memset can't be used internally either (it can only assign a single byte value to a range), the assign would loop over each element anyway. I'll probably revert the storage back to Uint8 and loop-initialize using a Uint8 array. If the optimizer isn't too dumb, it should end up producing pretty similar results.

Some tests: https://godbolt.org/g/o4OGt5
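
(A sketch of the Uint8-based loop initialization described above; the function name is illustrative:)

#include <SFML/Config.hpp>
#include <vector>

void initToTransparentWhite(std::vector<sf::Uint8>& pixelBuffer)
{
    // Set every RGBA pixel to (255, 255, 255, 0): transparent white,
    // so the padding around glyphs stays invisible when sampled
    for (std::size_t i = 0; i < pixelBuffer.size(); i += 4)
    {
        pixelBuffer[i + 0] = 255; // R
        pixelBuffer[i + 1] = 255; // G
        pixelBuffer[i + 2] = 255; // B
        pixelBuffer[i + 3] = 0;   // A
    }
}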

page.texture.loadFromImage(newImage);
Texture oldTexture(page.texture);
page.texture.create(textureWidth * 2, textureHeight * 2);
page.texture.update(oldTexture);
Member

Wouldn't it be even more efficient to create a fresh new texture with the bigger size, copy the old content to it, and swap them (swap would have to be implemented)? We already have the implementation for a swap function in operator=, so that would actually just be some refactoring.
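
(A sketch of the suggested variant, using the sf::Texture::swap member this PR ends up adding; it needs one GPU-side copy instead of two:)

// Create the bigger texture, blit the old contents into it, then
// swap so page.texture owns the new storage; the old texture is
// released when newTexture goes out of scope
Texture newTexture;
newTexture.create(textureWidth * 2, textureHeight * 2);
newTexture.update(page.texture);
page.texture.swap(newTexture);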

Member Author

Yes. 😁

@eXpl0it3r
Member

Does this need additional changes or have all the mentioned points been cleared up?

@binary1248
Member Author

At the moment this is not compatible with the context changes, i.e. it is broken. It will need a rebase/re-implementation/re-test once the context stuff is merged.

@binary1248
Member Author

Rebased onto master, addressed the raised issues and fixed a few bugs encountered during testing.

unsigned int y = glyph.textureRect.top - padding;
unsigned int w = glyph.textureRect.width + 2 * padding;
unsigned int h = glyph.textureRect.height + 2 * padding;
page.texture.update(reinterpret_cast<const Uint8*>(&m_pixelBuffer[0]), w, h, x, y);
Member

Seems like you forgot to remove the reinterpret_cast.

Member Author

Fixed.

… roundtrip to the CPU and back, add sf::Texture::swap to enable swapping texture contents, fixed sf::Font::cleanup not shrinking its allocated pixel buffer storage when the user loads a new font using the same sf::Font object.
@eXpl0it3r
Member

This seems to work nicely. Even though I don't fully understand all the changes, I approved it, since nobody else has bothered to so far. 😉

Here's some code I used to test things a bit:

#include <SFML/Graphics.hpp>

int main()
{
    const int gameWidth = 800;
    const int gameHeight = 600;

    sf::RenderWindow window(sf::VideoMode(gameWidth, gameHeight), "SFML Test");
    window.setVerticalSyncEnabled(true);

    sf::Texture tex1;
    tex1.loadFromFile("200.png");
    sf::Sprite spr1(tex1);

    sf::Texture tex2;
    tex2.create(200, 200);
    sf::Sprite spr2(tex2);
    spr2.setPosition({300, 0});

    sf::RenderTexture rt;
    rt.create(200, 200);
    rt.clear(sf::Color::Green);
    rt.display();

    while (window.isOpen())
    {
        sf::Event event;
        while (window.pollEvent(event))
        {
            if (event.type == sf::Event::Closed)
            {
                window.close();
                break;
            }

            if (event.type == sf::Event::KeyPressed)
            {
                // Space swaps the two textures in place; Return copies
                // the render texture's contents into tex1 on the GPU
                if (event.key.code == sf::Keyboard::Space)
                    tex1.swap(tex2);
                else if (event.key.code == sf::Keyboard::Return)
                    tex1.update(rt.getTexture());
            }
            }

        }

        window.clear();
        window.draw(spr1);
        window.draw(spr2);
        window.display();
    }
}

@eXpl0it3r eXpl0it3r merged commit 6b71456 into master Mar 7, 2017
@eXpl0it3r eXpl0it3r deleted the feature/texture_copy branch March 7, 2017 13:58
@texus
Contributor

texus commented Mar 8, 2017

Since this commit was merged, the following code crashes with an assertion failure when copying a texture.

#include <SFML/Graphics.hpp>

int main()
{
    sf::Texture texture;
    texture.loadFromFile("image.jpg");

    sf::Texture texture2 = texture; // copy construction triggers the assertion
}

Output:

SFML/src/SFML/Graphics/Texture.cpp:450: void sf::Texture::update(const sf::Texture&, unsigned int, unsigned int): Assertion `y + texture.m_size.x <= m_size.y' failed.

@eXpl0it3r
Member

Thanks for letting us know, I'll see to getting this fixed ASAP.

@LaurentGomila
Member

The fix is trivial, it's just a typo (should be texture.m_size.y in the assert condition).
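
(In other words, the condition becomes:)

assert(y + texture.m_size.y <= m_size.y); // was texture.m_size.x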
