Refactor Synchronicity, Provide Async API #77

MichealReed · 2025-02-21T16:24:46Z

Adds async versions for createContext, createContextByGpuIdx, toCPU, createKernel, and dispatchKernel.
Adds waitForContextFuture to pump the event loop until context is ready.
Converts wait to a template so that the object can be returned from the future, e.g. Kernel op = wait(ctx, kernelFuture);
Improves the control flow for the synchronous API to reduce reentrancy and race condition errors.
Defines callbacks inline to prevent function reentrancy.
Breaking change: functions that were previously used with a promise parameter can migrate to the async API and its returned future.
Adds helper function to convert a WGPUStringView to a standard string.
Adds helper functions under USE_DAWN_API: getAdapters(ctx), formatAdapters(adapters), listAdapters(ctx).
Sets CMake native builds to define USE_DAWN_API.
Adds test/test_gpu.cpp integration tests.
Adds an offset parameter to our toCPU functions for reading slices. Offset + size decide the size of the readback buffer and should match the output buffer.
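The "templated wait" and "sync wrapper over async" items above can be illustrated with a minimal sketch. It does not use the real gpu.hpp types: `MockContext`, `createValueAsync`, and `createValue` are hypothetical stand-ins, and the namespace `sketch` exists only to keep the names out of the way. The only idea taken from the description is that `wait` pumps the event loop until the future is ready and then returns the future's value.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <future>
#include <memory>
#include <queue>
#include <utility>

namespace sketch {

// Hypothetical stand-in for a GPU context: it only holds queued callbacks
// that a real backend would drive from its own event loop.
struct MockContext {
  std::queue<std::function<void()>> pending;

  // Runs one queued callback, if any (stand-in for processing device events).
  void processEvents() {
    if (pending.empty()) return;
    auto cb = std::move(pending.front());
    pending.pop();
    cb();
  }
};

// Pumps the event loop until the future is ready, then returns its value.
// As a template it also works for std::future<void>.
template <typename T>
T wait(MockContext &ctx, std::future<T> &f) {
  assert(f.valid());
  while (f.wait_for(std::chrono::seconds(0)) != std::future_status::ready) {
    ctx.processEvents();
  }
  return f.get();
}

// Hypothetical async operation: the value only appears once the event loop
// runs the queued callback. A shared_ptr is used here solely because
// std::function requires a copyable callable.
inline std::future<int> createValueAsync(MockContext &ctx, int value) {
  auto promise = std::make_shared<std::promise<int>>();
  std::future<int> future = promise->get_future();
  ctx.pending.push([promise, value]() { promise->set_value(value); });
  return future;
}

// Synchronous wrapper expressed in terms of the async version.
inline int createValue(MockContext &ctx, int value) {
  std::future<int> f = createValueAsync(ctx, value);
  return wait(ctx, f);
}

} // namespace sketch
```

With this shape, a caller can either hold the future and call `sketch::wait(ctx, future)` later, or call the synchronous wrapper directly; both go through the same event-pumping loop.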

MichealReed · 2025-02-22T04:23:36Z

Last bit on this PR. With this refactor I added an optional param to set an offset for the toCPU buffer read. This is tested in the new test/test_gpu.cpp. I think the toCPU with copydata is now redundant, but I will leave it alone if you prefer to keep it.
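As a rough illustration of what an offset-based slice read means, here is a host-side sketch. It is not the toCPU implementation: `readSlice` and `readElements` are hypothetical helpers that copy from a plain `std::vector<float>` standing in for GPU memory. The point is only that offset and size are given in bytes, that together they must stay inside the source buffer, and that the destination must hold at least `sizeBytes` bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <vector>

namespace sketch {

// Copies `sizeBytes` bytes starting at `offsetBytes` from `source` into `out`.
// `source` is a host-side stand-in for a GPU buffer.
inline void readSlice(const std::vector<float> &source, void *out,
                      std::size_t sizeBytes, std::size_t offsetBytes = 0) {
  const std::size_t totalBytes = source.size() * sizeof(float);
  // Written this way to avoid overflow in offsetBytes + sizeBytes.
  if (offsetBytes > totalBytes || sizeBytes > totalBytes - offsetBytes) {
    throw std::out_of_range("slice exceeds source buffer");
  }
  if (sizeBytes == 0) return;
  assert(out != nullptr);
  const char *base = reinterpret_cast<const char *>(source.data());
  std::memcpy(out, base + offsetBytes, sizeBytes);
}

// Convenience wrapper: reads `count` floats starting at element `first`.
inline std::vector<float> readElements(const std::vector<float> &source,
                                       std::size_t first, std::size_t count) {
  std::vector<float> out(count);
  readSlice(source, out.data(), count * sizeof(float), first * sizeof(float));
  return out;
}

} // namespace sketch
```

For example, `sketch::readElements(data, 2, 3)` returns elements 2, 3, and 4 of `data`, and a request that runs past the end throws instead of reading out of bounds.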

austinvhuang · 2025-03-01T19:03:01Z

gpu.hpp

-                // kernel invocations
-  std::promise<void> *promise;
+  // kernel invocations
+  std::shared_ptr<std::promise<void>> promise;


The intent with the use of raw pointers is to signal non-owning-ness. Would prefer that as a default unless there's a specific rationale for shared ownership.

I was running into an issue with premature deletion in the async callback chain and this ensured it remained valid through the multiple callbacks.

I see ... with that behavior, I think there's an underlying lifetime issue that sharing ownership with CallbackData is masking. Specifically, the application should fully control the lifetime of the promise.

This is by design so that we don't get in a situation where our library is opinionated in a way that gets in the way of the application. We want control over lifetimes to be explicit and controlled by the user, rather than implicit through the lifetime of CallbackData.

If the application has its own mechanism for automatic lifetime management through shared pointers, that's okay (e.g. there could be some application-level struct that holds shared_ptrs to the promise and the CallbackData struct), but lifetime management of the promise (or any resource) shouldn't be baked into this library through shared ownership unless there's absolutely no other way to make it work.

In a similar spirit, the reason there are *Pool types in Context is that, in the cases where the library is responsible for resource allocation and lifetimes, we try to make the management explicit through resource pools rather than implicit through object scope. For promises, it feels to me like the right place to manage those lifetimes is in the application rather than in the library.
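A minimal sketch of the application-owned arrangement described above, under stated assumptions: `EventLoop`, `dispatchAsync`, and `runWithApplicationOwnedPromise` are hypothetical names, and this `CallbackData` is a simplified stand-in rather than the struct in gpu.hpp. The callback payload holds a non-owning raw pointer, and the application keeps the promise alive until every callback that refers to it has run.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <future>
#include <vector>

namespace sketch {

// Callback payload with a NON-owning pointer: the application, not the
// library, decides how long the promise lives.
struct CallbackData {
  std::promise<void> *promise = nullptr;
};

// Hypothetical stand-in for an event loop that runs callbacks later.
struct EventLoop {
  std::vector<std::function<void()>> pending;

  void processEvents() {
    std::vector<std::function<void()>> ready;
    ready.swap(pending); // callbacks queued while running go to the next pass
    for (auto &cb : ready) cb();
  }
};

// Queues work that fulfills the borrowed promise when the loop runs.
inline void dispatchAsync(EventLoop &loop, CallbackData &data) {
  assert(data.promise != nullptr);
  loop.pending.push_back([&data]() { data.promise->set_value(); });
}

// Application code: the promise and CallbackData live in one scope that
// outlives every callback referring to them.
inline bool runWithApplicationOwnedPromise() {
  EventLoop loop;
  std::promise<void> promise; // owned by the application
  std::future<void> done = promise.get_future();
  CallbackData data{&promise}; // the "library" side only borrows it

  dispatchAsync(loop, data);
  loop.processEvents(); // callback runs while the promise is still alive

  const bool ready =
      done.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
  done.get();
  return ready;
}

} // namespace sketch
```

The design relies on the caller not leaving the owning scope before the event loop has drained; if that cannot be guaranteed, the raw pointer dangles.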

This pointer is passed into Dawn. Given that, I do not see how the user can be expected to manage the entire lifecycle. I tried to change it back to a raw pointer, but the pointer becomes invalid sometime during the processEvents cycle from Dawn. With a shared pointer, all of the tests in test_gpu.cpp pass; with a raw pointer, a memory exception is hit in wait on line 846 when trying to get the future.

test_gpu.exe!std::future<void>::get() Line 911 C++
> test_gpu.exe!gpu::wait<void>(gpu::Context & ctx, std::future<void> & f) Line 846 C++
test_gpu.exe!stressTestToCPU() Line 240 C++
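For comparison, the shared-ownership argument can be sketched as follows. This is an illustration of the general pattern, not the PR's code: `DeferredLoop`, `startChain`, and `runChain` are hypothetical, and the two stages ("work done", then "buffer mapped") are illustrative labels. The scope that creates the promise exits before either callback runs, so each callback keeps its own `shared_ptr` copy to hold the promise alive across the chain.

```cpp
#include <cassert>
#include <chrono>
#include <deque>
#include <functional>
#include <future>
#include <memory>
#include <utility>

namespace sketch {

// Hypothetical event loop that runs one deferred callback per call.
struct DeferredLoop {
  std::deque<std::function<void()>> pending;

  void processEvents() {
    if (pending.empty()) return;
    // Move the callback out first so it may safely queue more work.
    auto cb = std::move(pending.front());
    pending.pop_front();
    cb();
  }
};

// Starts a two-step callback chain. This function returns before either
// callback runs, so a promise on its stack would already be destroyed.
inline std::future<int> startChain(DeferredLoop &loop) {
  auto promise = std::make_shared<std::promise<int>>();
  std::future<int> result = promise->get_future();
  loop.pending.push_back([&loop, promise]() {
    // First callback (illustrative "work done") schedules the second.
    loop.pending.push_back([promise]() {
      // Second callback (illustrative "buffer mapped") fulfills the promise.
      promise->set_value(2);
    });
  });
  return result;
}

// Pumps the loop until the chain completes and returns the final value.
inline int runChain() {
  DeferredLoop loop;
  std::future<int> result = startChain(loop);
  assert(result.valid());
  while (result.wait_for(std::chrono::seconds(0)) !=
         std::future_status::ready) {
    loop.processEvents();
  }
  return result.get();
}

} // namespace sketch
```

The trade-off is the one raised in the review: the promise's lifetime is now implicit in whichever callbacks still hold a copy, rather than explicit in application code.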

austinvhuang · 2025-03-01T19:11:34Z

gpu.hpp

+ * std::vector<dawn::native::Adapter> adapters = getAdapters(ctx);
+ * @endcode
+ */
+inline std::vector<dawn::native::Adapter> getAdapters(Context &ctx) {


Would rather not have too many 1-line wrapper functions as it adds indirection to what is actually happening. Any reason not to unpack the underlying dawn API calls from the callsites for getAdapters and listAdapters? Do these need to be exposed externally?

At some point these should be exposed to non-Dawn builds as well, once a way to enumerate adapters exists in the WebGPU standard. Before this, there was no way to see which adapters were actually available. If a user is going to change the adapter used, they need to be able to get the full list and, from a frontend perspective, output what is available.
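To make the "list what is available so the user can pick one" use case concrete, here is a self-contained sketch that does not depend on Dawn. Everything in it is a stand-in: `StringView` mirrors the pointer-plus-length shape of WGPUStringView (with `kStrlen` playing the role of the null-terminated sentinel), `AdapterInfo` is a hypothetical record, and this `formatAdapters` is not the function added in the PR, only one plausible output format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

namespace sketch {

// Local stand-in with the same shape as WGPUStringView (pointer + length),
// defined here so the sketch compiles without WebGPU headers.
struct StringView {
  const char *data;
  std::size_t length;
};

// Sentinel meaning "data is null-terminated" (assumed to mirror WGPU_STRLEN).
constexpr std::size_t kStrlen = SIZE_MAX;

// Converts a string view to std::string, handling the null and
// null-terminated cases.
inline std::string toString(const StringView &sv) {
  if (sv.data == nullptr) return std::string();
  if (sv.length == kStrlen) return std::string(sv.data);
  return std::string(sv.data, sv.length);
}

// Hypothetical adapter record holding just the fields this sketch prints.
struct AdapterInfo {
  StringView device;
  StringView description;
};

// Formats one indexed line per adapter so a frontend can show the index
// a user would pass when selecting a GPU.
inline std::string formatAdapters(const std::vector<AdapterInfo> &adapters) {
  std::string out;
  for (std::size_t i = 0; i < adapters.size(); ++i) {
    out += "[" + std::to_string(i) + "] " + toString(adapters[i].device) +
           " (" + toString(adapters[i].description) + ")\n";
  }
  return out;
}

} // namespace sketch
```

The index printed on each line is the kind of value a by-index context creation function could accept, which is why enumeration and formatting are useful to expose together.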

MichealReed added 4 commits February 19, 2025 16:30

refactors async

9ac780b

use async context waitForContext()

14e7ab5

adds sync wrappers

9a08f8a

refactors the byIdx context function and sets USE_DAWN_API compile de…

95e587d

…f on native

austinvhuang self-assigned this Feb 21, 2025

MichealReed added 4 commits February 21, 2025 22:14

tests toCPU, adds offset, adds gpuflow doc, default cmakelists builds…

70d9802

… test/test_gpu.cpp

remove path

16feb9e

format

e61e809

doc formatting

ad8698d

MichealReed added 8 commits February 21, 2025 22:29

doc nits

025af2a

set project root on root cmakelists

3776dcd

fix linux issue with callback info

d58e191

should not release readback buffer

498ba74

clean up callback syntax

2db9be1

add stress test

752a53a

linux has a segfault if wait for events after.

5f82ff4

EOF newline

28dabf2

austinvhuang reviewed Mar 1, 2025

austinvhuang reviewed Mar 1, 2025

added sleeptime optional arg

39c816c

austinvhuang merged commit cd37551 into AnswerDotAI:dev Mar 11, 2025
