Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support async context creation and use sync postfix #274

Merged

Conversation

huningxin
Copy link
Contributor

@huningxin huningxin commented Jun 20, 2022

Fix #272

According to the feedback of WebML WG Teleconference – 16 June 2022, this PR supports async context creation and sync postfix propsoal that is summarized in the following table:

method async sync
create a context ML.createContext ML.createContextSync
build a graph MLGraphBuilder.build MLGraphBuilder.buildSync
compute a graph MLContext.compute MLContext.computeSync

@wchao1115 @anssiko @dontcallmedom @RafaelCintron @yuhonglin @wacky6 , PTAL. Thanks!


Preview | Diff

Copy link
Member

@anssiko anssiko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @huningxin, only a few minor non-blocking suggestions.

index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
@wacky6
Copy link

wacky6 commented Jun 21, 2022

LGTM. Thanks for the great work!

@yuhonglin
Copy link

LGTM too. Nice job!

@anssiko anssiko self-requested a review June 21, 2022 08:02
Copy link
Collaborator

@wchao1115 wchao1115 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only comment I have for this change is the choice of appending the suffix Sync on the name of a synchronous method while leaving it out for asynchronous ones. I find this choice a bit odd. Typically, an Async suffix for an asynchronous method is more common as the asynchrony nature of the call is not an obvious behavior. By using a Sync suffix, one can argue that every other method in the namespace whose behavior is a blocking call by default should also have this suffix for consistency. I recommend that we revert to the practice of using an Async suffix only on non-blocking calls.

@huningxin
Copy link
Contributor Author

@wchao1115 , thanks for your comments. You are right. There are operation build methods of MLGraphBuilder, e.g., conv2d, and those methods return MLOperand synchronously. They should not have the Sync suffix. I suppose this would align with WebGPU convention, e.g., GPUBuffer.mapAsync. I am fine to revert the changes of build and compute methods.

The remaining open is about createContext, since it is shared by both WebNN and Model Loader. The createContext of Model Loader is defined as an async version while createContext of WebNN is defined as a sync version. I suppose we are aligned that both sync and async versions should be supported. We should decide which (sync or async) should be the default one.

@wchao1115
Copy link
Collaborator

The createContext of Model Loader is defined as an async version while createContext of WebNN is defined as a sync version. I suppose we are aligned that both sync and async versions should be supported.

The Model Loader API and WebNN are not the API of the same layer. So in theory, one could still build an async Model Loader API over a sync WebNN API. I'm just curious about the use case coz when you have two versions of every key API (sync and async) it just makes implementing the standard more tedious and more complex to test.

@huningxin
Copy link
Contributor Author

I'm just curious about the use case coz when you have two versions of every key API (sync and async) it just makes implementing the standard more tedious and more complex to test.

The async APIs would allow JS frameworks easily call them in main thread. Otherwise the frameworks have to post the task to a worker thread, call sync APIs in the worker thread and post the results back to the main thread. I agree supporting both sync and async would increase the complexity of implementation and test.

@huningxin
Copy link
Contributor Author

Although WebGPU uses Async postfix, @dontcallmedom and @RafaelCintron shared a JS developer perspective in WebML WG Teleconference – 16 June 2022 that:

given sync version is a facility for "non-JS developers", used for bringing non-JS codebases via worker to the web
… this suggests making async default and use sync postfix

+1

instances of sync postfix: FileReaderSync(), CSSStyleSheet.replaceSync(text)
RafaelCintron: I agree we should make this method async

@wchao1115 , as you didn't attend that meeting, I'd like to re-bring this point for your reference.

And according to Web Platform Design Principles, an API should generally be synchronous if the following rules of thumb apply:

The API is not expected to ever be gated behind a permission prompt, or another dialog such as a device selector.
The API implementation will not be blocked by a lock, filesystem or network access, for example, inter-process communication.
The execution time is short and deterministic.

I feel the operation build methods of MLGraphBuilder, e.g., MLGraphBuilder.conv2d, fit into this category and should not have Sync postfix.

@wacky6
Copy link

wacky6 commented Aug 16, 2022

Model loader will follow WebNN's decision on of createContext (async default) and createContextSync (or vice versa)

I think Sync suffix is better (which is probably only be used by WASM in workers?) for createContext, build, compute. This also matches conventions used in Node.js.

I think MLGraphBuilder methods like conv2d, dense don't need the Sync suffix, becuase they are declarative and isn't actually computing.

@anssiko
Copy link
Member

anssiko commented Aug 16, 2022

Thanks @huningxin @wchao1115 and @wacky6 for your thoughful comments.

@wchao1115 I fully understand your perspective on naming and consistency. However, there's no established naming convention for sync vs async Web APIs due to historical reasons: convenient asynchronicity primitives such as promises were retrofitted to Web APIs and in part due to this only a few Sync suffixed Web APIs exists (FileReaderSync and WebGLSync), and no Async suffixed at all.

This is different from many other platforms (that don't have the same historical baggage) where APIs use these suffixes consistently.

In the light of this, my recommendation would be to special-case the few sync methods that may block with a Sync suffix (even if this suffix is not that popular) and keep the async versions without the suffix.

I believe this PR as is represents our best attempt at that compromise, and I hope we can get this landed soon with everyone's support.

@wacky6
Copy link

wacky6 commented Aug 16, 2022

+1 with @anssiko's suggestion: #274 (comment)

@wchao1115
Copy link
Collaborator

I audited all the references mentioned here and I'm still not entirely sure I understand the full context. The meeting note when this naming discussion took place is also partially occluded.

But it sounds like we're saying that there are 3 classes of methods, the clearly sync, clearly async, and others, and that only the first group will need a Sync suffix (with no suffix for the last two)? If that's so, how will developers (or implementers) know how to differentiate between the last two?

I always prefer clarity, and API naming is a tool for clarity in behavior. Naming consistency is important and if we can do that, it's always the best policy, but in reality you will find that a fully consistent naming convention across namespaces hardly exists. In this case, I would opt for developer experience i.e. what do we feel is the best naming that would not confuse our developers?

@anssiko
Copy link
Member

anssiko commented Aug 16, 2022

Thanks @wchao1115. Meeting notes is my err. Also, I hear you on clarity.

Web API naming is inconsistent for historical reasons as can be seen by auditing the existing Web APIs. It pains me to say sometimes aligning with that imperfection is the least bad approach.

My understanding is the PR as is represents that least bad approach.

It is said there are two hard problems in CS, one of them is naming things… I agree.

@anssiko
Copy link
Member

anssiko commented Aug 26, 2022

@RafaelCintron pointed out on our 2022-08-25 call that the latest WebGPU API does have three methods with an Async suffix:

  • createRenderPipelineAsync
  • createComputePipelineAsync
  • mapAsync

Conversely, there are 4 methods with a Sync suffix defined in the WebGL 2.0 spec:

  • deleteSync
  • fenceSync
  • isSync
  • clientWaitSync

Here's an index of all IDL member names: https://dontcallmedom.github.io/webidlpedia/members.html

The index I referenced earlier (https://dontcallmedom.github.io/webidlpedia/names/) listed only referenceable IDL names and omitted method names. That was my oversight, thanks @dontcallmedom for correcting me.

WIth this new information it looks like we still (sadly) cannot argue that there would be a naming convention for sync/async methods on the web platform we could follow.

@wchao1115, we didn't want to make any decision on this PR on our call while you were away, so please feel free to chime in here when you have a moment.

@wchao1115
Copy link
Collaborator

@anssiko Finding a consistent precedence is never easy. I'm still leaning more towards following WebGPU convention and use Async suffix for async calls. We have previously dropped support for WebGL.

@huningxin
Copy link
Contributor Author

Rebase the PR and solve the merge conflict.

@anssiko
Copy link
Member

anssiko commented Aug 30, 2022

(Design discussion to continue in the issue #272.)

@huningxin
Copy link
Contributor Author

Per the Resolution #01 of WebML WG Teleconference – 3 November 2022 that is

RESOLUTION: Per TAG recommendation adopt x() and xSync() naming pattern for createContext(), build() and compute() methods.

I am going to merge this PR.

Thanks all for your review and inputs.

@huningxin huningxin merged commit 29c0323 into webmachinelearning:main Nov 4, 2022
github-actions bot added a commit that referenced this pull request Nov 4, 2022
SHA: 29c0323
Reason: push, by huningxin

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@huningxin huningxin mentioned this pull request Dec 10, 2022
aarongable pushed a commit to chromium/chromium that referenced this pull request Feb 8, 2023
This CL implements the WebNN spec change [1] that renames the asynchronous MLGraphBuilder.buildAsync() method to MLGraphBuilder.build() and synchronous  MLGraphBuilder.build() method to MLGraphBuilder.buildSync(). The MLGraphBuilder.builderSync() method is only exposed to the dedicated worker.

This CL allows some WPT WebNN test cases to pass the MLGraph build step on XNNPACK backend. To ensure the 64 and 32 bits builds have the same WPT baseline, this CL also builds XNNPACK for Windows and Linux on x86. Other platforms are expected to fail at this stage. All the WPT baselines are updated by using blink_tool.py rebaseline-cl [2].

[1]: webmachinelearning/webnn#274
[2]: https://chromium.googlesource.com/chromium/src/+/main/docs/testing/web_test_expectations.md#rebaselining-using-try-jobs


Bug: 1273291
Change-Id: Iee6b27fee46305e7196b5015f324bba3abadb09d
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4005570
Commit-Queue: ningxin hu <ningxin.hu@intel.com>
Code-Coverage: Findit <findit-for-me@appspot.gserviceaccount.com>
Reviewed-by: Jiewei Qian <qjw@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1102523}
aarongable pushed a commit to chromium/chromium that referenced this pull request Feb 8, 2023
…ld methods"

This reverts commit 3a319e2.

Reason for revert: Causes failures in external/wpt/webnn/

Original change's description:
> WebNN: Implement the WebNN spec change for MLGraphBuilder build methods
>
> This CL implements the WebNN spec change [1] that renames the asynchronous MLGraphBuilder.buildAsync() method to MLGraphBuilder.build() and synchronous  MLGraphBuilder.build() method to MLGraphBuilder.buildSync(). The MLGraphBuilder.builderSync() method is only exposed to the dedicated worker.
>
> This CL allows some WPT WebNN test cases to pass the MLGraph build step on XNNPACK backend. To ensure the 64 and 32 bits builds have the same WPT baseline, this CL also builds XNNPACK for Windows and Linux on x86. Other platforms are expected to fail at this stage. All the WPT baselines are updated by using blink_tool.py rebaseline-cl [2].
>
> [1]: webmachinelearning/webnn#274
> [2]: https://chromium.googlesource.com/chromium/src/+/main/docs/testing/web_test_expectations.md#rebaselining-using-try-jobs
>
>
> Bug: 1273291
> Change-Id: Iee6b27fee46305e7196b5015f324bba3abadb09d
> Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4005570
> Commit-Queue: ningxin hu <ningxin.hu@intel.com>
> Code-Coverage: Findit <findit-for-me@appspot.gserviceaccount.com>
> Reviewed-by: Jiewei Qian <qjw@chromium.org>
> Cr-Commit-Position: refs/heads/main@{#1102523}

Bug: 1273291
Change-Id: I1065849f6351d5b43992deb285f09227c4f3846b
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4232577
Bot-Commit: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com>
Owners-Override: Anders Hartvoll Ruud <andruud@google.com>
Commit-Queue: Anders Hartvoll Ruud <andruud@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1102672}
aarongable pushed a commit to chromium/chromium that referenced this pull request Feb 13, 2023
…ld methods"

This is a reland of commit 3a319e2

This CL fixes the WebNN WPT failure on Linux debug bot by disabling the
WebNN XNNPACK build for Linux. The root cause is XNNPACK (cpuinfo)
initialization would fail on Linux component build.

Original change's description:
> WebNN: Implement the WebNN spec change for MLGraphBuilder build methods
>
> This CL implements the WebNN spec change [1] that renames the asynchronous MLGraphBuilder.buildAsync() method to MLGraphBuilder.build() and synchronous  MLGraphBuilder.build() method to MLGraphBuilder.buildSync(). The MLGraphBuilder.builderSync() method is only exposed to the dedicated worker.
>
> This CL allows some WPT WebNN test cases to pass the MLGraph build step on XNNPACK backend. To ensure the 64 and 32 bits builds have the same WPT baseline, this CL also builds XNNPACK for Windows and Linux on x86. Other platforms are expected to fail at this stage. All the WPT baselines are updated by using blink_tool.py rebaseline-cl [2].
>
> [1]: webmachinelearning/webnn#274
> [2]: https://chromium.googlesource.com/chromium/src/+/main/docs/testing/web_test_expectations.md#rebaselining-using-try-jobs
>
>
> Bug: 1273291
> Change-Id: Iee6b27fee46305e7196b5015f324bba3abadb09d
> Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4005570
> Commit-Queue: ningxin hu <ningxin.hu@intel.com>
> Code-Coverage: Findit <findit-for-me@appspot.gserviceaccount.com>
> Reviewed-by: Jiewei Qian <qjw@chromium.org>
> Cr-Commit-Position: refs/heads/main@{#1102523}

Bug: 1273291
Change-Id: Id67b503fb39e5d7bba86e2c1f5e2bc4c375c5b63
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4231593
Reviewed-by: Jiewei Qian <qjw@chromium.org>
Commit-Queue: ningxin hu <ningxin.hu@intel.com>
Cr-Commit-Position: refs/heads/main@{#1104363}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support asynchronous context creation
5 participants