
Define graph execution methods used in different threading models. #255

Closed

Conversation


@wchao1115 wchao1115 commented Feb 26, 2022

Based on the discussion in issue #230, we define three methods of execution for the compiled graph as follows:

Immediate Execution through the MLExecution interface:

The MLExecution interface represents a method of execution that synchronously carries out the computational workload of a compiled graph MLGraph on the calling thread, which must be a worker thread, to produce results as defined by the operations in the graph. This method of execution requires an MLContext created with MLContextOptions, with the MLDevicePreference option set to "cpu", or to "default" when it resolves to a CPU context.

Async Execution through the MLAwaitedExecution interface:

The MLAwaitedExecution interface represents a method of execution that asynchronously carries out the computational workload of a compiled graph MLGraph on a worker thread to avoid blocking the calling thread while producing results as defined by the operations in the graph. This method of execution requires an MLContext created with MLContextOptions or WebGLRenderingContext.

Queued Execution through the MLCommandEncoder interface:

The MLCommandEncoder interface represents a method of execution that synchronously records the computational workload of a compiled graph MLGraph to a GPU command buffer GPUCommandBuffer on the calling thread. Since the workload is not immediately executed, just recorded, this method allows more flexibility for the caller to determine how and when the recorded commands will be submitted for execution on the GPU relative to other GPU workload on the same queue. This method of execution requires an MLContext created with GPUDevice.
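
For illustration, here is a rough usage sketch of the three methods as defined above (the interface, method, and option names follow this PR's text where they appear; anything else, such as the exact shape of MLContextOptions, is illustrative):

// Immediate execution: synchronous compute on a CPU context, worker thread only
const cpuContext = navigator.ml.createContext({devicePreference: 'cpu'}); // option name illustrative
const execution = new MLExecution(cpuContext);
execution.compute(graph, inputs, outputs); // blocks the worker until results land in outputs

// Async execution: same call shape, but the calling thread is not blocked
const awaited = new MLAwaitedExecution(cpuContext);
await awaited.compute(graph, inputs, outputs);

// Queued execution: record the workload for later submission via WebGPU
const gpuContext = navigator.ml.createContext(gpuDevice);
const encoder = new MLCommandEncoder(gpuContext);
// ... record the graph workload, then submit it on gpuDevice.queue when appropriate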



@dontcallmedom
Contributor

I may be confused, but how would a developer know that their MLContext defaults have resulted in a CPU vs GPU, and in a WebGL vs WebGPU outcome? That's key to them picking the right executor.

Generally speaking, branching on the type of the processing unit and support for WebGPU significantly raises the complexity of coding against the API; maybe that's acceptable given our targets (framework developers), but it's probably worth confirming it.

Bikeshedding - to align with the existing pattern in the file API of FileReader (async) and FileReaderSync (sync, available in worker only), I think we should use MLExecutionSync and MLExecution (or more explicitly MLExecutionAsync?); and maybe Executor rather than Execution?

@wchao1115
Collaborator Author

@dontcallmedom The assumption here is that the caller choosing the graph execution method is also responsible for how the MLContext is created. The discussion in #230 exposes a new requirement: WebNN must support multiple threading models for its graph execution, depending on the circumstances around how the graph should be processed in relation to the rest of the activities in the framework deploying it.

For instance, if a context is created specifically for the CPU, the graph can be executed either synchronously (via MLExecution) or asynchronously (via MLAwaitedExecution), but the caller must choose which one to go with. If the context is created from a WebGPU device, the graph can't be executed directly, either synchronously or asynchronously; instead, its workload must be recorded into a command buffer and yielded to the app. Only this way can the ML workload be truly managed alongside the rest of the graphics payload going to the GPU.

index.bs (outdated)
The {{MLAwaitedExecution}} interface represents a method of execution that asynchronously carries out the computational workload of a compiled graph {{MLGraph}} on a worker thread to avoid blocking the calling thread while producing results as defined by the operations in the graph. This method of execution requires an {{MLContext}} created with {{MLContextOptions}} or {{WebGLRenderingContext}}.

<script type=idl>
typedef (MLBufferView or WebGLTexture or GPUTexture) MLResource;
Contributor

I suppose MLResource should not support WebGPU resources, e.g., GPUBuffer (of MLBufferView) and GPUTexture.

Contributor

@huningxin huningxin Mar 1, 2022

Should MLResource be limited to just ArrayBufferView? It could be just an async version of MLExecution.

Collaborator Author

Are you suggesting that async execution of a GPU context would always automatically upload/download the result to/from the CPU memory? I can see that. How will async execution be used by TF.js in the readback scenario?

Contributor

Are you suggesting that async execution of a GPU context would always automatically upload/download the result to/from the CPU memory?

Yes.

How will async execution be used by TF.js in the readback scenario?

TF.js has a Tensor::data() method which "asynchronously downloads the values".
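
For reference, the TF.js readback pattern looks roughly like this (the model and tensor names are made up):

// data() resolves once the values have been downloaded from the backend
const outputTensor = model.predict(inputTensor);
const values = await outputTensor.data(); // a typed array, e.g. Float32Array
outputTensor.dispose();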

The {{MLExecution}} interface represents a way the execution of the graph is carried out immediately on the
calling thread, which must also be a worker thread. The execution is carried out by the {{MLExecution/compute()}}
method, which produces the results of the computation from all the inputs bound to the graph on the bound outputs.
This type of execution is limited to only when the computational device bound to the ML context is a CPU.
Contributor

I suppose it should support GPU as well. It would allow callers who use the sync API to also access the GPU.

Collaborator Author

That was my original thought, but @RafaelCintron believes that we should not allow sync execution on a GPU context, because it means you need to block the calling thread until the GPU executes and produces the results in the output tensors, which could create a jarring user experience.

Contributor

If this is restricted to DedicatedWorker, I'm not sure I understand where the risk of blocking on a GPU execution would be. Having it applicable to GPU contexts would help with the concern I raised.

Contributor

which could create jarring user experience.

Because it is limited in the worker, I suppose the UI thread (main thread) won't be blocked in this way.

interface MLCommandEncoder {
constructor(MLContext context);

undefined initializeGraph(MLGraph graph, MLNamedGPUInputs inputs);
Contributor

Should this method be named initialize? MLGraph is the first argument.

Contributor

Should MLExecution and MLAwaitedExecution expose an initialize method as well? With that, we could remove MLGraphBuilder::constant and move all constant buffer binding to the specialized execution interfaces.

Collaborator Author

@wchao1115 wchao1115 Mar 3, 2022

Normally, an initialize method on an interface implies that the operation initializes the host interface object, which makes the host object stateful. I explicitly named this method initializeGraph because the operation would initialize the graph input param to the method, and not the host object itself, i.e. it implies that the graph is stateful, not the encoder.

Collaborator Author

Should MLExecution and MLAwaitedExecution expose initialize

MLExecution is CPU only, and so there isn't a need to initialize and pre-process the weights, i.e. graph constants and inputs are treated the same way.

MLAwaitedExecution, when used for a GPU context, will need to initialize and pre-process the weights before the first inference call. But that can be done without defining the initializeGraph method, because weights are simply constant operands bound to the graph at construction time. The implementation can already distinguish the weight constants from the rest of the named inputs, so initializing them could technically be done before the first inference call (i.e. compute).

The real reason that only MLCommandEncoder will need a separate initializeGraph method is that unlike MLExecution and MLAwaitedExecution, the command encoder does not actually carry out the execution of the compute workload. It merely records the workload onto the command buffer on behalf of the caller. This means that it is the caller's responsibility to track the weight constant tensors, and make sure that constant initialization workload is recorded separately by the initializeGraph call.

Another way to think about it is that the command encoder is a lower-level API than the execution API. And, while the execution API can "cache" the constants' initialization result before making the first inference call to compute the output result, the encoder API cannot; it can only record the act of initialization as a workload separate from the inference workload captured by the subsequent dispatch call.
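
A sketch of that separation as I read it (only initializeGraph appears in this PR's IDL; dispatch() and finish() are illustrative names):

const encoder = new MLCommandEncoder(gpuContext);

// Recorded as its own workload: one-time weight/constant pre-processing
encoder.initializeGraph(graph, constantInputs);

// Recorded separately: the inference workload itself
encoder.dispatch(graph, inputs, outputs); // illustrative name

// The caller decides when, and relative to what, these commands run on the GPU
gpuDevice.queue.submit([encoder.finish()]); // illustrative name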

Contributor

A CPU backend may also need to pre-process the weights, e.g., conversion to a SIMD-optimized blocked memory layout. And per the discussion above, I would suggest MLExecution support both CPU and GPU.

The graph initialization, e.g., constant pre-processing, may be time consuming. Doing it in the sync MLGraphBuilder::build may block the main thread. Hosting a sync version in MLExecution and an async version in MLAwaitedExecution may be a solution.

Regarding the resource type, the initialize of MLExecution and MLAwaitedExecution could accept only ArrayBufferView, while initializeGraph of MLCommandEncoder accepts only GPUBuffer. This would align with the respective execution methods (with my suggestion in the comment above) and simplify the usage.

In this way, the graph building code could be shared across the different execution methods. Otherwise, callers need to call MLGraphBuilder::constant to bind constants to a graph for MLExecution and MLAwaitedExecution, but call MLGraphBuilder::input and bind the constants in MLCommandEncoder::initializeGraph. That leads to different graph building code.
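
Roughly the shape of that suggestion, for illustration (the initialize signatures below are not in the spec; the operand names are made up):

// Shared graph-building code: weights declared as named inputs rather than baked-in constants
const weights = builder.input('conv1_weights', {type: 'float32', dimensions: [32, 3, 3, 3]});

// CPU path: sync initialize accepting ArrayBufferView-backed constants
execution.initialize(graph, {'conv1_weights': weightsArrayBufferView});

// Async path
await awaitedExecution.initialize(graph, {'conv1_weights': weightsArrayBufferView});

// GPU path: GPUBuffer-backed constants via this PR's initializeGraph
commandEncoder.initializeGraph(graph, {'conv1_weights': weightsGPUBuffer});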

index.bs
interface MLAwaitedExecution {
constructor(MLContext context);

Promise<MLNamedOutputs> compute(MLGraph graph, MLNamedInputs inputs, MLNamedOutputs outputs);
Contributor

Should it just return Promise<void>?

Collaborator Author

My understanding is that Promise<Return-Type> is a semantic used for a deferred async result on the return value. In this case, the result will be placed on the outputs param, which is therefore also what the promise should resolve to. Let me know if I got it wrong.

Contributor

Promise<Return-Type> means that a developer would write code à la

const outputs = await executor.compute(graph, inputs, namedoutputs);

I agree with @huningxin that this feels wrong, since the expectation is instead that namedoutputs will be modified in place by the compute method - having the same object also returned via the promise is at best confusing.
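
With a compute() that returns Promise<void> instead, the expected pattern would simply be:

await executor.compute(graph, inputs, namedoutputs); // namedoutputs is filled in place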

@dontcallmedom
Contributor

dontcallmedom commented Mar 2, 2022

@wchao1115 I understood the model you propose that way. My question is - imagine you create a default MLContext with:

// Default context created
const context = navigator.ml.createContext();

That means the developer leaves entire control of where the model will be computed to the user agent; the developer has no way to know whether the resulting context implies the sync approach (MLExecution), the async one (MLAwaitedExecution), or the WebGPU one (MLCommandEncoder), so there is no way to write code that triggers the right execution.

Presumably, this could be surfaced somehow on the MLContext object, but isn't at this stage.

Even if it gets surfaced, it still means that WebNN users would have to maintain 3 code pathways to deal with the general case of executing a graph (i.e. one where the developer wants to leave the choice of the context to the user agent); 3 pathways means that many opportunities for additional bugs, that much additional need for testing, etc - that may very well be OK, but I think we should check whether this is an acceptable trade-off.

@wchao1115
Collaborator Author

That means the developer leaves entire control of where the model will be computed to the user agent; the developer has no way to know whether the resulting context implies the sync approach (MLExecution), the async one (MLAwaitedExecution), or the WebGPU one (MLCommandEncoder)

In the default case, the caller can use either MLExecution or MLAwaitedExecution. This is already spelled out in the definition of each of these two execution methods. You can't use MLCommandEncoder because in its definition, it says:

This method of execution requires an MLContext created with GPUDevice.

@dontcallmedom
Contributor

In the default case, the caller can use either MLExecution or MLAwaitedExecution. This is already spelled out in the definition of each of these two execution methods. You can't use MLCommandEncoder because in its definition, it says:

This method of execution requires an MLContext created with GPUDevice.

OK, I'm starting to understand more clearly the intent here:

  • createContext(GPUDevice) ⇒ MLCommandEncoder
  • createContext(!GPUDevice) && window ⇒ MLAwaitedExecution
  • createContext(!GPUDevice) && worker && cpu ⇒ (MLAwaitedExecution | MLExecution)
  • createContext(!GPUDevice) && worker && !cpu ⇒ MLAwaitedExecution

If that's indeed the model, I think it needs to be surfaced in createContext, not just under the various execution interfaces; the execution interfaces also need to trigger errors when they're called with the wrong context.

And I still don't understand with the default context type how a developer would know whether they can or cannot use MLExecution in a worker, since I don't know how they can determine whether « "default" [was] resolved to a CPU context ».

Generally speaking, I am worried that the generic interface MLContext can be used as a parameter to this or that constructor depending on a state that isn't directly exposed to the developer. Imagine a scenario where the developer is delegating creation of the MLContext to a 3rd party library; the information about how the MLContext has been created will also need to be passed back, which feels awkward.
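
To illustrate the concern, here is roughly the branching a developer handling an arbitrary MLContext would need (the checks used here are hypothetical; nothing like them is exposed today):

// context might have come from createContext(), createContext(options), or createContext(gpuDevice)
if (contextWasCreatedFromGPUDevice(context)) { // hypothetical check
  const encoder = new MLCommandEncoder(context);
  // ... record, then submit on the GPUDevice queue
} else if (inDedicatedWorker && contextResolvedToCpu(context)) { // hypothetical checks
  new MLExecution(context).compute(graph, inputs, outputs); // sync path
} else {
  await new MLAwaitedExecution(context).compute(graph, inputs, outputs); // async path
}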

This type of execution is limited to only when the computational device bound to the ML context is a CPU.

The {{MLAwaitedExecution}} interface represents a way the execution of the graph is performed asynchronously
on a separate worker thread. The call to the {{MLAwaitedExecution/compute()}} method returns instantly without
Contributor

Suggested change
on a separate worker thread. The call to the {{MLAwaitedExecution/compute()}} method returns instantly without
in the background. The call to the {{MLAwaitedExecution/compute()}} method returns instantly without

or maybe just say "separate thread" (but "worker thread" makes it feel that it will be run in a developer-exposed worker)

on a separate worker thread. The call to the {{MLAwaitedExecution/compute()}} method returns instantly without
blocking the calling thread. This execution method is appropriate when the responsiveness of the calling thread
is critical to good user experience. The computation results will be placed at the bound outputs at the time
the operation is completed on a worker thread at which time the calling thread is signaled. This type of
Contributor

Suggested change
the operation is completed on a worker thread at which time the calling thread is signaled. This type of
the operation is completed in the background at which time the calling thread is signaled. This type of

@wchao1115
Collaborator Author

wchao1115 commented Mar 3, 2022

If that's indeed the model, I think it needs to be surfaced in createContext, not just under the various execution interfaces;

Thanks @dontcallmedom. There is an alternative design discussed in the issue #230, which I think is what you alluded to here. Let me try to implement that in a PR, and see which one is more palatable. The entire matrix of required use cases is complex and so there may not be a "perfect" design in this case. But maybe we can try to pick the least bad one.

I must point out though that this scenario:

Imagine a scenario where the developer is delegating creation of the MLContext to a 3rd party library

is not something we aim to support, i.e. callers that use WebNN to create and execute the compiled ML graph also have the responsibility for context creation. The act of executing or recording the execution of the models is highly dependent on the type of context and the underlying device used to create and execute them.

@wchao1115
Collaborator Author

There is an alternative design discussed in the issue #230,

@dontcallmedom Please take a look at an alternative design at PR #257

@bbernhar

Is the intent of MLCommandEncoder to also be interoperable with WebGPU? If so, how will MLCommandEncoder work on graphics APIs (like Vulkan) that do not natively support ML?

WebNN does not necessarily need to introduce WebGPU's device execution model (deferred vs immediate) in order to execute ML work on the GPU asynchronously. Before "command encoders", the context itself would encode and schedule GPU work by generating a single (internal) command buffer, each on a separate thread/worker, to effectively allow simultaneous encoding. The control/flexibility of explicit ML command encoders (deferred) does not seem useful if the WebNN developer can't broadly share it with WebGPU.

@wchao1115
Collaborator Author

@bbernhar Maybe I misunderstood your previous comment. If the WebNN context were to record its ML workloads onto an internal command buffer, who is responsible for executing that buffer? If that's also the responsibility of the context, then is the execution a blocking operation? The purpose of the MLCommandEncoder is to produce the command buffer with the ML workloads in it. Its caller is free to execute the command buffer whenever appropriate using the WebGPU API, in whatever way it wants.

@bbernhar

@bbernhar Maybe I misunderstood your previous comment. If the WebNN context were to record its ML workloads onto an internal command buffer, who is responsible for executing that buffer?

Either MLContext enqueues an (explicit or implicit) encoded ML command-buffer into WebGPU's shared queue and GPUQueue.Submit executes it indirectly, OR MLContext directly executes the encoded ML command-buffer using WebGPU's shared queue. Both submit work to the GPU asynchronously, as it rarely makes sense to do otherwise (i.e. the CPU stalls the GPU).

Notice the use of an implicit ML command-buffer allows for a simple interop model of "queue sharing" (vs sharing command-lists), avoiding the need for WebGPU to broadly support non-native concepts like MLCommandEncoder.

Hope this helps.

@wchao1115
Collaborator Author

@bbernhar @RafaelCintron Following our discussion, should we go ahead and close the discussion around the MLCommandEncoder topic in this PR (and the related #257)?

@huningxin
Contributor

huningxin commented Mar 24, 2022

@bbernhar

Either MLContext enqueues an (explicit or implicit) encoded ML command-buffer into WebGPU's shared queue and GPUQueue.Submit executes it indirectly, OR MLContext directly executes the encoded ML command-buffer using WebGPU's shared queue. Both submit work to the GPU asynchronously, as it rarely makes sense to do otherwise (i.e. the CPU stalls the GPU).

I'd like to understand how resource sharing and execution ordering happen in this design.

As in the WebGPU/WebNN background blur sample for the full-GPU video processing use case, it involves dispatching compute shaders and the ML graph, and sharing GPUBuffers between them. The processing steps include:

  1. The first compute shader converts the input texture into input tensor GPUBuffer.
  2. The ML graph consumes the input tensor GPUBuffer, executes the semantic segmentation and produces the results into the output tensor GPUBuffer (segmap).
  3. The second compute shader takes the input texture, the blurred texture (produced by another compute shader) and the segmap GPUBuffer as inputs, produces the background blurred image into the output texture.

The pseudo code looks like:

const commandEncoder = device.createCommandEncoder();

// Convert inputTexture to inputTensor
let computePass = commandEncoder.beginComputePass();
computePass.setPipeline(preprocessingPipeline);
computePass.setBindGroup();
computePass.dispatch();
computePass.end();

// ML graph segments background from inputTensor to outputTensor
// Should it record the command into commandEncoder? like
mlGraph.record(commandEncoder, inputBuffer, outputBuffer); 

// Blend inputTexture to outputTexture based on outputTensor
computePass = commandEncoder.beginComputePass();
computePass.setPipeline(postprocessingPipeline);
computePass.setBindGroup();
computePass.dispatch();
computePass.end();

// Render the outputTexture onto canvas
const passEncoder = commandEncoder.beginRenderPass();
passEncoder.setPipeline(fullscreenQuadPipeline);
passEncoder.setBindGroup();
passEncoder.draw(6, 1, 0, 0);
passEncoder.end();

device.queue.submit([commandEncoder.finish()]);

@bbernhar

bbernhar commented Mar 24, 2022

@huningxin

If we do the "execute WebGPU indirect" method:

const commandEncoder1 = device.createCommandEncoder();
const commandEncoder2 = device.createCommandEncoder();

let computePass1 = commandEncoder1.beginComputePass();
...
computePass1.dispatch();
computePass1.end();

mlcontext = navigator.ml.createContext(device)
let graph = /* Build graph */
mlcommandBuffer = mlcontext.createCommandBuffer(graph, inputBuffer, outputBuffer)

let computePass2 = commandEncoder2.beginComputePass();
...
computePass2.dispatch();
computePass2.end();

const renderPass1 = commandEncoder2.beginRenderPass();
...
renderPass1.draw(...);
renderPass1.end();

mlcontext.Submit(queue, [commandEncoder1.Finish(), mlcommandBuffer, commandEncoder2.Finish()])

MLContext would generate its own (internal) GPU command buffer, encoded with ML work, then effectively call queue.Submit() implicitly (or internally) with the (optional) WebGPU command buffer(s) AND the generated ML command buffer (this is the execution order). Resource sharing would work just like WebGPU interop, i.e. external buffers/textures would be imported/exported.

WDYT?

@huningxin
Contributor

@bbernhar

mlcommandBuffer= mlcontext.getCommandBuffer(graph, inputBuffer, outputBuffer)

Getting (creating) a GPUCommandBuffer from an MLGraph is a brilliant idea. I like it.

My question is whether we can leverage GPUQueue to submit it, like

device.queue.submit([commandEncoder1.Finish(), mlcommandBuffer, commandEncoder2.Finish()])

@bbernhar

bbernhar commented Mar 25, 2022

@huningxin

My question is whether we can leverage GPUQueue to submit it, like

Thanks! Yes, it's possible, but there are no plans to support WebGPU interop through shared command-buffers, and it may not be allowed unless it's explicitly defined in the WebGPU specification (e.g. Dawn forbids this). Of course, going the other way is permitted.

@huningxin
Contributor

GPUQueue::submit is defined to accept a sequence of GPUCommandBuffer:

interface GPUQueue {
    undefined submit(sequence<GPUCommandBuffer> commandBuffers);
};

So if we define MLContext::getCommandBuffer to return a GPUCommandBuffer, it seems to me the WebGPU spec would not need to change.

Did I miss anything?

@wchao1115
Collaborator Author

wchao1115 commented Mar 27, 2022

So if we define MLContext::getCommandBuffer returns a GPUCommandBuffer, it seems to me WebGPU spec would not be required to change.

@bbernhar, @RafaelCintron and I did discuss this exact design at great length and agreed not to pursue it, mainly because it creates a strong requirement that a WebNN implementation must produce a WebGPU primitive (i.e. GPUCommandBuffer) that any implementation of WebGPU must understand how to handle.

Although I think it is conceptually sound, and am also a proponent of it, I agreed that it could make implementing WebNN much harder, especially when it's implemented using an OS API that has no design provision for interoperating with the GPU stack used to implement WebGPU in the same environment, e.g. it isn't at all clear how it would be implemented with a Khronos API, Apple CoreML, or even hardware-vendor APIs like OpenVINO.

The new consensus now is to leave the ML workloads generation and the command buffer submission as an implementation detail behind a public interface that simply accepts the WebGPU device's default queue. Please take a look at my latest commit on #257.

@huningxin
Contributor

huningxin commented Mar 28, 2022

because it creates a strong requirement that a WebNN implementation must produce a WebGPU primitive (i.e. GPUCommandBuffer) that any implementation of WebGPU must understand how to handle.

I suppose this might not be an issue, because, as you mentioned, the creation of an MLContext from a GPUDevice would only be allowed when WebNN is implemented using an OS API that supports interop with the GPU stack used to implement WebGPU in the same environment. Otherwise, the creation of an MLContext from a GPUDevice would throw an error. So there is no case where a WebNN implementation produces a WebGPU primitive that is not supported by the WebGPU implementation in the same environment / browser.

@bbernhar

Did I miss anything?

GPUCommandBuffer is also expected to contain only "valid commands". This is a hard requirement for WebGPU implementations. Either WebGPU needs to know what these commands are (and where to find them) or be told how to validate/handle them; otherwise Queue.submit will always fail. =(

@wchao1115
Collaborator Author

wchao1115 commented Mar 29, 2022

only when WebNN is implemented by using an OS API that supports interop with the GPU stack used to implement WebGPU on the same environment, the creation of MLContext from a GPUDevice would be allowed.

The WebNN implementation also needs to be able to produce an implementation of GPUCommandBuffer (from your proposed MLContext.getCommandBuffer method) that the WebGPU implementation would accept. This is harder to do in practice than simply submitting the ML workload directly into the internal device's queue (from a GPUDevice), as it all happens behind the scenes without a public commitment to producing a WebGPU-compatible command buffer.

Additionally, if the MLContext is created with the GPU device type, but without a GPUDevice given, the WebNN implementation is free to use its own internal device and reuse the exact same code as it would on a given GPUDevice to submit to the internal device's queue. It's a less publicly demanding contract that will still allow efficient interop with the WebGPU device without tying our hands in the future.

@huningxin
Contributor

@bbernhar

GPUCommandBuffer is also expected to contain only "valid commands".

It makes sense. If the MLGPUCommandBuffer extends GPUCommandBuffer, its implementation should implement the validate / handle methods of the interface.

@wchao1115

This is harder to do in practice than simply submitting the ML workload directly into the internal device's queue (from a GPUDevice) as it all happens behind the scene without a public commitment to producing WebGPU-compatible command buffer.

It's true. However, with the proposal of "execute WebGPU indirect" as

mlcontext.Submit(queue, [commandEncoder1.Finish(), mlcommandBuffer, commandEncoder2.Finish()])

it requires the WebNN implementation to validate / handle other types of GPUCommandBuffer, say rendering and compute, and submit them into the device queue. This part of the work is out of scope for WebNN and duplicates the WebGPU implementation.

Another concern is developer experience: with MLContext::submit, developers now have two code paths for submitting GPU command buffers. Ideally, developers should be able to just drop the WebNN graph compute, as a command buffer, into the existing submit path.

@wchao1115
Collaborator Author

Another concern is developer experience: with MLContext::submit

I'm sorry, I'm not following. My latest commit on PR #257 doesn't call for a new submit method; instead it just requires that the implementation of computeAsync internally submits the workload onto the GPUDevice device queue. Please take a look at the latest commit and let me know if you have any concern.
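
If I read that direction correctly, the caller-visible shape would be roughly the following (computeAsync is the #257 method name mentioned above; whether it hangs off the context or a separate execution object isn't shown here, so treat this as illustrative):

// Context created from a WebGPU device; queue submission happens inside the implementation
const context = navigator.ml.createContext(gpuDevice);
await context.computeAsync(graph, inputs, outputs); // internally submits onto gpuDevice.queue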

@huningxin
Contributor

I was talking about @bbernhar 's proposal. Sorry for the confusion. I'll move to #257 and comment there.

@wchao1115
Collaborator Author

I was talking about @bbernhar 's proposal. Sorry for the confusion. I'll move to #257 and comment there.

Yeah, we already moved past that proposal.

@wchao1115
Collaborator Author

Hi all, I would like to pause the discussion in this PR for now to avoid confusion, since all the recent revisions of the proposal on this topic have been made on #257. Please put additional comments on #257 for simplicity.

@bbernhar

bbernhar commented Mar 29, 2022

it requires the WebNN implementation to validate / handle other types of GPUCommandBuffer, say rendering and compute, and submit them into the device queue. This part of the work is out of scope for WebNN and duplicates the WebGPU implementation.

Nothing is duplicated. WebNN would have its own (internal) queue implementation that re-uses the WebGPU native API so it may co-execute GPU and ML commands in a single call. Going the other way requires WebGPU implementers to do the same.

@wchao1115
Collaborator Author

Closing this one. See #257; the evolution of this PR continues there.
