Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using a multicontext version of glad to prepare for multi GPU support. #10118

Merged

Conversation

jhmueller-huawei
Copy link
Contributor

Refactoring all Vulkan calls to use the device context.

Signed-off-by: Joerg H. Mueller joerg.mueller@huawei.com

@jhmueller-huawei jhmueller-huawei requested review from a team as code owners June 14, 2022 13:55
@gadams3 gadams3 requested a review from a team June 14, 2022 14:34
@jhmueller-huawei
Copy link
Contributor Author

Let me add some information to the changes in this pull request:

For multi-GPU support we need to load Vulkan functions for each device in use which doesn't allow us to use a global context that is usually created by GLAD anymore.

Instead, multiple GladVulkanContext structs can be created. This patch does so once for the Vulkan Instance and once for each Device.

Since calling functions requires using the GladVulkanContext directly now, it didn't make sense to keep an interface class (FunctionLoader) and the implementation (GladFunctionLoader) separate anymore since there is no alternative interface and no viable option to implement one, so I moved everything into the FunctionLoader class (creating a FunctionLoader.cpp for it) and removed GladFunctionLoader.

I also removed the dynamic module handling code in FunctionLoader since on closer inspection that is exactly what GLAD is doing.

The biggest part of the diff is the auto-generated vulkan.h file from https://gen.glad.sh/ where now the mx flag was enabled additionally to allow multiple contexts.

@lmbr-pip lmbr-pip added the sig/graphics-audio Categorizes an issue or PR as relevant to SIG graphics-audio. label Jun 23, 2022
@jhmueller-huawei jhmueller-huawei force-pushed the glad_multi_context_refactor branch 2 times, most recently from 90765a6 to d32c516 Compare June 28, 2022 08:22
Copy link
Contributor

@moudgils moudgils left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the most part these changes look good to me. Are you able to test AtomSampleViewer and run the full test suite to ensure none of the existing functionality is broken?

@martinwinter-huawei
Copy link
Contributor

Having updated both o3de and o3de-atom-sampleviewer, we get the same result both on the unmodified development branch as well as with this commit included, running scripts/_fulltestsuite_bv.luac, which runs through until we get a crash.
It crashes in MSAA_RPI_ExampleComponent trying to de-reference an iterator in Scene::RemoveRenderPipeline(AZ::Name const &).

We are testing on Linux and with Clang 13.0.1.

@martinwinter-huawei
Copy link
Contributor

I added a bug report at the o3de-atom-sampleviewer repository to highlight this issue.

@jhmueller-huawei
Copy link
Contributor Author

Any progress on this @moudgils? I just updated to the latest development branch. @kh-huawei just tried to run the test suite of AtomSampleViewer with it on Windows and everything worked.

@tjmgd tjmgd self-requested a review July 22, 2022 11:52
@moudgils
Copy link
Contributor

moudgils commented Jul 22, 2022

Any progress on this @moudgils? I just updated to the latest development branch. @kh-huawei just tried to run the test suite of AtomSampleViewer with it on Windows and everything worked.

Ahh. As long as the full test suite runs correctly for Vk we should be good to check this in. I will approve it and start AR for it.

@moudgils
Copy link
Contributor

There are a few issues with this PR as flagged by the AR run. Please go ahead and address them so that the AR run passes - https://jenkins.build.o3de.org/job/O3DE/view/change-requests/job/PR-10118/

Refactoring all Vulkan calls to use the device context.

Signed-off-by: Joerg H. Mueller <joerg.mueller@huawei.com>
@jhmueller-huawei
Copy link
Contributor Author

There are a few issues with this PR as flagged by the AR run. Please go ahead and address them so that the AR run passes - https://jenkins.build.o3de.org/job/O3DE/view/change-requests/job/PR-10118/

The update should fix those issues. I assume you need to trigger another run, or does it run automatically?

@thefranke thefranke merged commit 3cd9628 into o3de:development Jul 28, 2022
@moudgils
Copy link
Contributor

moudgils commented Aug 3, 2022

@jhmueller-huawei This PR is causing runtime crash for Android. Specifically the call to SetDebugUtilsObjectNameEXT is seg-faulting. Are you able to look into this? An engineer (internally) is looking into it but we may decide to revert this PR until the cause of the issue is identified/resolved.

@jhmueller-huawei
Copy link
Contributor Author

Unfortunately, not really. I also don't have any idea why this call would crash, since it's just one of the many function calls that now go through the context instead of a global that is set by glad. It is of course a function provided by an extension, so maybe the issue is with the extension management in glad on Android.

moraaar added a commit that referenced this pull request Aug 9, 2022
…glad context (#11151)

Signed-off-by: moraaar <moraaar@amazon.com>

## What does this PR do?

Fixes `VK_EXTENSION_SUPPORTED` macro to check at runtime if a vulkan extension is supported or not.

Glad vulkan checks each extension availability at runtime (when loaded) and saves it in global int variables `GLAD_VK_EXT_extensionname`. PR #10118 changed this so it checks if a macro `VK_EXT_...` is 1, which it always is, but it's not checking at runtime if the device/instance has support for the extension. So what's happening is that on android it now thinks extension `EXT_debug_utils` is available, which is not, and then it crashes when calling `CreateDebugUtilsMessengerEXT`.

With the new glad vulkan header introduced in PR #10118 instead of checking `GLAD_VK_EXT_extensionname` it should be checking `context.EXT_extensionname` int instead.

## How was this PR tested?

Run default level using vulkan rhi on pc and android.
moraaar added a commit to o3de/o3de-extras that referenced this pull request Aug 10, 2022
…#9)

Removed duplicated function glad loader code and using Atom_RHI_Vulkan.Glad.Static instead.

Fixed compilation issues after glad vulkan header was updated to use multi-context in Vulkan gem (o3de/o3de#10118)

Built with latest O3DE using AtomSampleViewer project (openxr branch). RHI VR Sample works with Quest 2 via link cable.

Fixes #7 

Signed-off-by: moraaar <moraaar@amazon.com>
michalpelka pushed a commit to RobotecAI/o3de-extras that referenced this pull request Jan 4, 2023
…#9)

Removed duplicated function glad loader code and using Atom_RHI_Vulkan.Glad.Static instead.

Fixed compilation issues after glad vulkan header was updated to use multi-context in Vulkan gem (o3de/o3de#10118)

Built with latest O3DE using AtomSampleViewer project (openxr branch). RHI VR Sample works with Quest 2 via link cable.

Fixes o3de#7 

Signed-off-by: moraaar <moraaar@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sig/graphics-audio Categorizes an issue or PR as relevant to SIG graphics-audio.
Development

Successfully merging this pull request may close these issues.

None yet

6 participants