Skip to content

Added assert that a texture is valid when attempting to bind it.#3122

Merged
ChrisThrasher merged 1 commit intomasterfrom
feature/texture_bind_assert
Jun 27, 2024
Merged

Added assert that a texture is valid when attempting to bind it.#3122
ChrisThrasher merged 1 commit intomasterfrom
feature/texture_bind_assert

Conversation

@binary1248
Copy link
Member

Title.

Perhaps the simplest solution to make the "white square problem" at least detectable when running a debug build.

Normally, debug memory allocators fill freed memory with patterns such as 0xDDDDDDDD in order to make detecting use-after-frees easier. In our case however, because textures can be (and more recently are probably) stored in std::optionals, we can't assume that the memory will be freed by the allocator even after the texture is destroyed. The OpenGL texture name will be deleted, but the actual memory contents of the still allocated sf::Texture memory block will remain identical after its destructor has completed.

In order to detect bind-after-frees even in such cases, (in debug builds) we assign m_texture with a value that is very unlikely to be generated as an OpenGL texture name. After the destructor runs but before the memory block is deallocated, this value will sit in m_texture and any attempt to bind the texture will read this value and fail the assert because glIsTexture will return GL_FALSE.

In the case the memory was actually deallocated, either the operating system throws a SEGFAULT/Access Violation or 0xDDDDDDDD (or whatever pattern your allocator uses) is read from m_texture when attempting to bind it. 0xDDDDDDDD is also very unlikely to be a texture name OpenGL generates, so the assertion should fail in this case as well.

Test with any code that causes the "white square problem" to manifest itself, e.g.:

#include <SFML/Graphics.hpp>

int main()
{
    sf::RenderWindow window(sf::VideoMode({200, 200}), "Test");

    auto texture = sf::Texture::loadFromImage(sf::Image({100, 100}, sf::Color::Red));
    sf::Sprite sprite(*texture);
    texture.reset();

    while (window.isOpen())
    {
        while (const auto event = window.pollEvent())
        {
            if (event.is<sf::Event::Closed>())
                return 0;
        }

        window.clear();
        window.draw(sprite);
        window.display();
    }
}

@coveralls
Copy link
Collaborator

coveralls commented Jun 23, 2024

Pull Request Test Coverage Report for Build 9633512138

Details

  • 1 of 2 (50.0%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.02%) to 56.201%

Changes Missing Coverage Covered Lines Changed/Added Lines %
src/SFML/Graphics/Texture.cpp 1 2 50.0%
Totals Coverage Status
Change from base Build 9631475793: -0.02%
Covered Lines: 11671
Relevant Lines: 19696

💛 - Coveralls

@vittorioromeo
Copy link
Member

I still appreciate the attempt to solve the lifetime issue, but I am not a fan of this solution either.

Problems with this solution:

  1. Relies on undefined behavior, as accessing an object after it has been destroyed is illegal in C++.

  2. Relies on applying an arbitrary bit pattern (0xFFFFFFFFu) to an internal member of sf::Texture.

  3. (!!!) Does not catch the issue with move constructors encountered in Rewrite Shader.cpp example #3062:

    struct S
    {
        sf::Texture texture{sf::Texture::loadFromImage(sf::Image({100, 100}, sf::Color::Red)).value()};
        sf::Sprite  sprite{texture};
    };
    
    std::optional<S> makeS()
    {
        return std::make_optional<S>();
    }
    
    int main()
    {
        sf::RenderWindow window(sf::VideoMode({200, 200}), "Test");
        auto s = makeS().value();
    
        window.clear();
        window.draw(s.sprite); // <- assertion is not triggered here, we still get white square!
        window.display();
    }
  4. Not general, only applies to sf::Sprite and sf::Texture.

  5. Not testable.

  6. Does not provide an actionable/educational diagnostic to the user.

If we are moving forward with the direction of "debug-only checks", then I don't see why we shouldn't accept #3097.

#3097 doesn't suffer from any of the drawbacks mentioned above -- it's a general, testable, minimal-overhead solution that does not rely on undefined behavior and doesn't require modifying the internals of any SFML type.

@binary1248: do you have anything in particular against #3097?

Here's an example of the output we would get with #3097 for both the original example used in this PR and the one I created above inspired by #3062:

FATAL ERROR: a texture object was destroyed while existing sprite objects depended on it.

Please ensure that every texture object outlives all of the sprite objects associated with
it, otherwise those sprites will try to access the memory of the destroyed texture, causing
undefined behavior (e.g., crashes, segfaults, or unexpected run-time behavior).

One of the ways this issue can occur is when a texture object is created as a local variable
in a function and passed to a sprite object. When the function has finished executing, the 
local texture object will be destroyed, and the sprite object associated with it will now be
referring to invalid memory. Example:

    sf::Sprite createSprite()
    {
        sf::Texture texture(/* ... */);
        sf::Sprite sprite(texture, /* ... */);

        return sprite;
        //     ^~~~~~
        // ERROR: `texture` will be destroyed right after
        //        `sprite` is returned from the function!
    }

Another possible cause of this error is storing both a texture and a sprite together in a data
structure (e.g., `class`, `struct`, container, pair, etc...), and then moving that data structure
(i.e., returning it from a function, or using `std::move`) -- the internal references between the
texture and sprite will not be updated, resulting in the same lifetime issue.

In general, make sure that all your texture objects are destroyed *after* all the sprite objects
depending on them to avoid these sort of issues.

@ChrisThrasher
Copy link
Member

Relies on undefined behavior, as accessing an object after it has been destroyed is illegal in C++.

If this is true then why do you think UBSan doesn't complain? I know sanitizers may not be perfect. Is this just a case of UBSan not being sophisticated enough to detect this?

@vittorioromeo
Copy link
Member

Relies on undefined behavior, as accessing an object after it has been destroyed is illegal in C++.

If this is true then why do you think UBSan doesn't complain? I know sanitizers may not be perfect. Is this just a case of UBSan not being sophisticated enough to detect this?

The Standard is quite clear on this matter (surprisingly...): https://stackoverflow.com/a/68736835/598696

I don't know why UBSan doesn't complain, but it seems like a check for this issue could be implemented in the sanitizer by storing a flag alongside every object, or in a heap allocated registry.

Perhaps it slowed down execution too much, or it was just never considered worthwhile implementing, or there's some other weird technical issue that I'm not seeing. It's a good question.

@vittorioromeo
Copy link
Member

@ChrisThrasher: I found something interesting: google/sanitizers#73

@binary1248 binary1248 force-pushed the feature/texture_bind_assert branch from 4627966 to c00abe8 Compare June 23, 2024 15:39
@ChrisThrasher
Copy link
Member

If these assertions rely on reading data from an object whose lifetime has already ended then I'm afraid that alone makes it a non-starter for me.

@binary1248
Copy link
Member Author

The very obvious difference between this solution and #3097 is that this solution leverages debug machinery that already comes built into most compilers/debuggers. I am not trying to invent new kinds of wheels here, just making it easier for the debugger and any other kind of existing tooling to do its job when running a debug build.

Does this catch every single possible case of undefined behaviour? Definitely not, and that was never the claim.

If you claim that something is not worth doing if it doesn't have a 100% success rate of achieving its goal, then you can also question the effort compiler vendors have put into flagging memory after it has been freed. They also never guaranteed that it would make detecting errors a 100% probability because you know... we are talking about undefined behaviour. And yet, they still go through that effort because the general consensus among the community is that it is worth it even though it doesn't always work.

Relies on undefined behavior, as accessing an object after it has been destroyed is illegal in C++.

How is this even a valid argument? By this logic, all asserts that read memory (of a possibly already destroyed object) to do their job are illegal and nobody should ever use them. ???? Going further, all code would be illegal because you can't know beforehand if the memory they are accessing has already been freed or not. Once undefined behaviour happens somewhere, all assumptions fly out the window. Dereferencing texture to access texture->m_texture in the if is already UB, so making the argument that it is problematic that the assert that takes place after that check relies on undefined behaviour is just nonsensical. asserts are meant to ensure conditions are met when everything is still running smoothly, when we reach the point of UB there is no guarantee what they will do but that was never the point. If an assert has a high probability of catching errors even in the case of UB then they are worth adding for me. Will there be false positives/negatives in the case of UB? Definitely, but why is this even up for discussion? UB for me isn't a point where I just throw my hands in the air and say "well... there's nothing left I could try to do", anything that has a chance of working, no matter how slim, will make the debugging experience less annoying/frustrating.

Also, don't forget that undefined behaviour is undefined in the context of the standard. Undefined behaviour can still be defined to a certain degree when running e.g. under a debugger with debug allocators and other instrumentation enabled. The standard makes no mention about what the differences between release and debug builds should be because it leaves it up to vendors to make their own informed decisions on how to go about with their concrete implementations.

This change targets vendor-implementation-defined behaviour, specifically when debugging. I already listed 2 likely scenarios in the description of this PR, and the probability of memory blocks being recycled by the allocator instead of being flagged with the freed bit pattern is minuscule unless the allocator has absolutely no other choice (e.g. high memory requirement applications). So in this case, when running under the debugger, "undefined behaviour" isn't as random as it would be when running in release and since this change literally only targets debug builds I don't see why we have to treat "undefined behaviour" in this context as the same scary ghost we should instantly run away from in release.

Relies on applying an arbitrary bit pattern (0xFFFFFFFFu) to an internal member of sf::Texture.

I don't see how you can assume this is arbitrary... Every OpenGL implementation I have seen in my life generates texture names in ascending order and recycles names that have been explicitly freed through glDeleteTextures. It would be nonsensical for an implementation to assign names in descending order or even worse in a completely random order. Name assignment is not a security threat. Every OpenGL implementation also has some internal limit to the number of texture objects it supports, so unless someone were to deliberately generate 2^32 texture names without assigning any memory to them, I don't see how 0xFFFFFFFFu can be a valid texture name.

As I mentioned in the description, even 0xDDDDDDDD wouldn't be problematic because that many texture objects would also never be created in any application that was not deliberately trying to be malicious.

(!!!) Does not catch the issue with move constructors encountered in #3062:

I adjusted the change to flag both m_texture and m_cacheId even when the texture is empty on destruction and now it also detects this error.

For anything who wants to verify this:

#include <SFML/Graphics.hpp>

struct S
{
    sf::Texture texture{sf::Texture::loadFromImage(sf::Image({100, 100}, sf::Color::Red)).value()};
    sf::Sprite  sprite{texture};
};

std::optional<S> makeS()
{
    return std::make_optional<S>();
}

int main()
{
    sf::RenderWindow window(sf::VideoMode({200, 200}), "Test");

    auto s = makeS().value();

    while (window.isOpen())
    {
        while (const auto event = window.pollEvent())
        {
            if (event.is<sf::Event::Closed>())
                return 0;
        }

        window.clear();
        window.draw(s.sprite);
        window.display();
    }
}

Not general, only applies to sf::Sprite and sf::Texture.

This change is only meant to target sf::Texture and never meant to be a generalized wheel. The same principle of operation can be applied to other places where it fits the problem.

Not testable.

Why is this all of a sudden an argument? We have plenty of other asserts that we also don't currently "test" due to Catch's lack of support for requiring that the program abnormally terminates akin to googletest's ASSERT_DEATH.

Does not provide an actionable/educational diagnostic to the user.

The same argument can be made about all our other asserts if you claim that the provided text string is not helpful enough.

As I said above... This change is not meant to be a shiny new wheel that is eager to be put to use all over the place. The problem definition is clear, the debugging machinery we expect users to be comfortable with already exists, this change just makes this one specific problem easier to debug than it would be without, with minimal code impact. I just don't see how anybody in good faith could be against the general idea of this kind of change regardless of the concrete implementation.

@coveralls
Copy link
Collaborator

coveralls commented Jun 23, 2024

Pull Request Test Coverage Report for Build 9634618962

Details

  • 2 of 3 (66.67%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.01%) to 56.21%

Changes Missing Coverage Covered Lines Changed/Added Lines %
src/SFML/Graphics/Texture.cpp 2 3 66.67%
Totals Coverage Status
Change from base Build 9631475793: -0.01%
Covered Lines: 11673
Relevant Lines: 19698

💛 - Coveralls

@vittorioromeo
Copy link
Member

The very obvious difference between this solution and #3097 is that this solution leverages debug machinery that already comes built into most compilers/debuggers.

#3097 is not doing anything particularly fancy. It's just an atomic counter with some sugar on top. Not using assert is also beneficial as users can turn on/off the lifetime tracking even in release builds if they so desire without having to turn on all assertions to catch possible lifetime issues that only arise with optimizations enabled.

If you claim that something is not worth doing if it doesn't have a 100% success rate of achieving its goal,

I am not claiming that. I am claiming that #3097 is a superior and more general solution to the problem you're trying to solve here. It also conforms to the C++ Standard.

How is this even a valid argument? By this logic, all asserts that read memory (of a possibly already destroyed object) to do their job are illegal and nobody should ever use them.

Yes, that is what I am claiming. I don't think it's common practice to rely on undefined behavior in assertions, but you seem to imply it is.

Once undefined behaviour happens somewhere, all assumptions fly out the window. Dereferencing texture to access texture->m_texture in the if is already UB,

Exactly, this is why #3097 avoid triggering any form of undefined behavior in the first place. The detection mechanism happens and reports a diagnostic prior to UB happening.

Also, don't forget that undefined behaviour is undefined in the context of the standard. [...]

Yes, you can carefully check that "UB" does what you expect on your platform, with your compiler, and so on. But you can never predict if a future compiler version, or some weird flag, or some niche architecture behaves as you expect.

The easiest solution is to just... not rely on UB in the first place.

This change is only meant to target sf::Texture and never meant to be a generalized wheel. The same principle of operation can be applied to other places where it fits the problem.

Or we could consider using #3097 which already solves the problem for all SFML types and does not rely on undefined behavior.

Why is this all of a sudden an argument? We have plenty of other asserts that we also don't currently "test" due to Catch's lack of support [...]

It's an argument because we already have a solution to the problem that is testable, which is a plus.

The same argument can be made about all our other asserts if you claim that the provided text string is not helpful enough.

Ditto, I'm not saying that your assertion is not valuable, I'm saying that we already have a solution with a more informative and useful message. If I have to weigh this PR against #3097, I have a really hard time seeing why I would pick this one over the #3097.

I just don't see how anybody in good faith could be against the general idea of this kind of change regardless of the concrete implementation.

And I don't see how anybody in good faith could be against an existing solution which does not invoke UB, is testable, has already been implemented for all SFML types, can be toggled at will by users, provides a more informative/actionable diagnostic, has minimal impact on the actual definition of SFML types, does not require us hardcoding magic pointer constants, etc...

I also don't see why you think I'm against the general idea, given that I implemented my own solution on that same idea.

@eXpl0it3r
Copy link
Member

eXpl0it3r commented Jun 23, 2024

@vittorioromeo It's not helpful to create red herring arguments that really just are advocating for your implementation and moving goal posts for the proposed change (i.e. all the "my PR does this, yours doesn't" when that was never the goal, nor a requirement). We're all smart people here and are capable of understanding the different proposals. Next time you can just state that you're in favor of your change, you can of course still list the reasons why.

Regarding UB, I don't exactly see the problem. We're already in UB-land before hitting the assert and without the assert, we'll end up in UB-land again.

image

The "white square" problem has been very consistent, one could even say "stable" 😄 , which if I understood correctly relied on OpenGL trying to use the already destroyed texture.
I don't see how the change will be more problematic than the status quo that we already have.

@vittorioromeo
Copy link
Member

vittorioromeo commented Jun 23, 2024

@eXpl0it3r: all the arguments I've made are factual and publicly verifiable. If there are two competing proposals trying to solve the same problem in different ways, I'm going to argue in favour of the one that I believe is superior, regardless of its author.

I value genericity, testability, and adherence to the C++ standard.

Frankly, accusing me of intentionally creating red herrings is a bit too much. Working on SFML is not giving me any joy nowadays, so I will step down.

Copy link
Member

@ChrisThrasher ChrisThrasher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding UB, I don't exactly see the problem. We're already in UB-land before hitting the assert and without the assert, we'll end up in UB-land again.

I'm still skeptical to add more UB to the pile but I suppose we're already reading texture->m_texture prior to this assertion so I'm not sure how the addition of the assertion could make the situation worse.

I'm a bit reluctant to approve this PR, but I suppose I'll give this a shot. 🤞🏻

@ChrisThrasher ChrisThrasher merged commit 742beea into master Jun 27, 2024
@ChrisThrasher ChrisThrasher deleted the feature/texture_bind_assert branch June 27, 2024 21:48
vittorioromeo added a commit to vittorioromeo/VRSFML that referenced this pull request Jun 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants