Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dangle pointer in map BarrierSolver::mResourceStatus leads to the false positive assert in BarrierSolver::debugCheckDivergingTransition #416

Closed
StoyanovVlad opened this issue Sep 11, 2023 · 4 comments

Comments

@StoyanovVlad
Copy link

System Information

  • Ogre Version: ❔ 2.2
  • Operating System / Platform: ❔ Android 13
  • RenderSystem: ❔ Vulkan
  • GPU: ❔

Detailled description

Before creating OgreRenderWindow I create and load texture and setup it in HlmsPbsDatablock, during loading my texture the func GenerateHwMipmaps::_executeSerial() is called where temporary texture with name "___tempMipmapTexture" is created and destroyed at the end of func GenerateHwMipmaps::_executeSerial(). During destroying temporary texture in dtor this texture is added to the BarrierSolver which become a dangle pointer right after exiting dtor. After that I create OgreRenderWindow, during this call texture with "RenderWindow" is created, and if memory for this texture was allocated at the place where dangle pointer in BarrierSolver points to the pointer automatically becomes valid, what is wrong, because no adding was done and later assert will be triggered.

1)Here is adding texture to the Barrier
Screenshot 2023-09-11 at 23 53 57

2)Here is the dangle pointer before creating texture "RenderWindow"
Screenshot 2023-09-11 at 23 55 47

3)Here is stop right after allocating and pointer become valid, but no adding was done to the Barrier
Screenshot 2023-09-11 at 23 56 13

4)Here is my code detects dangle pointer to stop
Screenshot 2023-09-11 at 23 56 32

5)Here is the triggered assert and texture which trigger the assert
Screenshot 2023-09-11 at 23 56 53
Screenshot 2023-09-11 at 23 57 55

Ogre.log

Callstack

@StoyanovVlad
Copy link
Author

StoyanovVlad commented Sep 11, 2023

Supposed fix - selected lines in screenshot
Screenshot 2023-09-12 at 00 22 44

@darksylinc
Copy link
Member

Hi!

First, I congratulate you for such high quality ticket. I wish there were more like yours.

After that I create OgreRenderWindow, during this call texture with "RenderWindow" is created, and if memory for this texture was allocated at the place where dangle pointer in BarrierSolver points to the pointer automatically becomes valid, what is wrong, because no adding was done and later assert will be triggered.

Ah, basically the memory address got reused for an unrelated allocation. Yikes!

Supposed fix - selected lines in screenshot

Funny enough TextureGpuManager::destroyTextureImmediate already calls barrierSolver.textureDeleted but unfortunately it gets added again when it shouldn't have.

Does this alternate solution fix the problem?

diff --git a/RenderSystems/Vulkan/src/OgreVulkanQueue.cpp b/RenderSystems/Vulkan/src/OgreVulkanQueue.cpp
index 97b805e3f7..f6d4cfa874 100644
--- a/RenderSystems/Vulkan/src/OgreVulkanQueue.cpp
+++ b/RenderSystems/Vulkan/src/OgreVulkanQueue.cpp
@@ -1064,10 +1064,14 @@ namespace Ogre
         if( mEncoderState == EncoderCopyOpen )
         {
             bool needsToFlush = false;
+            bool mustRemoveFromBarrier = false;
             TextureGpuDownloadMap::const_iterator itor = mCopyDownloadTextures.find( texture );
 
             if( itor != mCopyDownloadTextures.end() )
+            {
                 needsToFlush = true;
+                mustRemoveFromBarrier = true;
+            }
             else
             {
                 FastArray<TextureGpu *>::const_iterator it2 =
@@ -1085,6 +1089,14 @@ namespace Ogre
                 OGRE_ASSERT_LOW( texture->mCurrLayout == VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL ||
                                  texture->mCurrLayout == VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL );
                 endCopyEncoder();
+
+                if( mustRemoveFromBarrier )
+                {
+                    // endCopyEncoder() just called solver.assumeTransition() on this texture
+                    // but we're destroying the texture. Remove the dangling pointer.
+                    BarrierSolver &solver = mRenderSystem->getBarrierSolver();
+                    solver.textureDeleted( texture );
+                }
             }
         }
     }

After that I create OgreRenderWindow

Unrelated to this bug: OgreNext historically was written when D3D7 and (pre 2.0) OpenGL were around; and it has the unfortunate assumption that the at least one RenderWindow must be the first thing created before anything else (and the last thing to be destroyed).

Because Vulkan & Metal no longer have this restriction; it may just work (i.e. creating GPU resources before creating the Window).

Personally I recommend that "just in case", if possible, you create the RenderWindow first to ensure everything is properly initialized.

@StoyanovVlad
Copy link
Author

Hi! Thank you for your quick reply. Yes, your alternate solution fixes the problem as well.

@darksylinc
Copy link
Member

Done! Thanks for the report!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants