op metadata that helps avoid implementation mistakes #243

quidity · 2021-12-22T10:07:26Z

Some way to help implementors validate that data is being passed correctly between nodes in the computation graph, has the expected sizes, etc.

Perhaps a little language which describes the required sizes/shapes and can be used to create checks that can be enforced as a graph is constructed, and as a graph is executed.

e.g. fields for ops which indicate:

Does it change input/output tensor contents?
Does it change input/output tensor sizes?
Can the input tensor be the output tensor?
Shape of the input, shape of the output?
Constraints on other input parameters that can be turned into automatic checks.

anssiko · 2022-01-31T16:15:13Z

The group had this issue on its agenda, but we did not record anything in the minutes.

All - please share your feedback for the proposal on this issue.

anssiko · 2022-02-17T11:36:06Z

We discussed this issue again on our call. It was noted it may not be possible to extract very useful metadata for all ops given the complexity of operations.

@quidity, we'd like to hear which of these options would work for you:

Would you prefer us to add a new "Metadata" section for each op to accompany "Arguments" and "Returns"? See the relu example below.
Or could we perhaps provide all this information combined in an appendix? It could allow us to present this information in a more concise form, for example as a table. We could then link to that from applicable op definitions. See the example table below.
Something else?

Option 1: add metadata into each op definition (using relu as an example):

7.7.21. relu

Compute the rectified linear function of the input tensor.

partial interface MLGraphBuilder {
  MLOperand relu(MLOperand x);
  MLOperator relu();
};

Arguments:
x: an MLOperand. The input tensor.

Returns:

an MLOperand. The output tensor of the same shape as x.
an MLOperator. The operator representing the relu operation.

Metadata:
Does it change input/output tensor contents?
Does it change input/output tensor sizes?
Can the input tensor be the output tensor?
Shape of the input, shape of the output?
Constraints on other input parameters that can be turned into automatic checks.

Option 2: add a single table combining metadata across all ops:

	Does it change input/output tensor contents?	Does it change input/output tensor sizes?	Can the input tensor be the output tensor?	Shape of the input, shape of the output?	Constraints on other input parameters that can be turned into automatic checks.
batchNormalization
clamp
concat

quidity · 2022-02-17T18:09:27Z

Having something machine-readable (an appendix, or directly as metadata on the ops) will be more useful when constructing fuzzers or running a hardened debug implementation designed to detect implementation errors. A table is useful to people, but less so to the machines.

wchao1115 · 2022-02-23T20:03:15Z

I would like to point out that the MLGraphBuilder API isn't a data execution API, but rather a data definition API. Every single defined operation that can act as a "node" in this computational graph has a clearly defined mapping rule between its input and output shape of the data it would consume and produce. This means that at any random node one can always infer the node's input shape by reversing the graph all the way back to the root node where the shapes of the input nodes are fully defined. This process is generally known as shape inferencing, and it makes any metadata about input shape to be added to the operation definition redundant. Note that shape inferencing only concerns the definition of the data (e.g. its shape) and not the actual content of the data.

Once the graph is fully constructed and compiled, the input shapes into each of the operations in the graph are inferred and finalized. It is part of what the implementation of the MLGraph interface would store. And finally, when the graph is actually executed against the actual data that are bound to it at execution time, bound checking would then occur. Keep in mind that no actual data has been bound to the compiled graph until at this stage, right before the graph is executed against it. This is happening at the compute method, which could fail. It is the implementation's responsibility to make sure proper bound checking occurs against the shapes of the data already inferred by that time.

huningxin · 2022-02-25T06:37:46Z

Can the input tensor be the output tensor?

As @wchao1115 mentioned, MLGraphBuilder is a graph definition API, the output MLOperand is returned by the function call of an operation, like y = builder.relu(x). The input can not be used as the output.

The MLGraph::compute method takes both input buffers and output buffers. Do you mean whether a buffer can be used for both input and output in the same call? I think this is not defined in the current spec.

anssiko · 2022-03-24T17:16:05Z

In #251 we landed an update to https://www.w3.org/TR/webnn/#security that describes how the chosen API design (graph definition API) minimizes the attack surface for the compiled computational graph. We believe this update makes the benefits of this design choice clearer and addresses this issue.

quidity mentioned this issue Dec 22, 2021

General Security Questions #241

Closed

anssiko added the security-tracker Group bringing to attention of security, or tracked by the security Group but not needing response. label Dec 22, 2021

w3cbot mentioned this issue Dec 22, 2021

op metadata that helps avoid implementation mistakes w3c/security-review#134

Open

huningxin mentioned this issue Mar 11, 2022

Update Security Considerations per review feedback #251

Merged

anssiko closed this as completed Mar 24, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

op metadata that helps avoid implementation mistakes #243

op metadata that helps avoid implementation mistakes #243

quidity commented Dec 22, 2021

anssiko commented Jan 31, 2022

anssiko commented Feb 17, 2022

quidity commented Feb 17, 2022

wchao1115 commented Feb 23, 2022 •

edited

huningxin commented Feb 25, 2022

anssiko commented Mar 24, 2022

op metadata that helps avoid implementation mistakes #243

op metadata that helps avoid implementation mistakes #243

Comments

quidity commented Dec 22, 2021

anssiko commented Jan 31, 2022

anssiko commented Feb 17, 2022

Option 1: add metadata into each op definition (using relu as an example):

7.7.21. relu

Option 2: add a single table combining metadata across all ops:

quidity commented Feb 17, 2022

wchao1115 commented Feb 23, 2022 • edited

huningxin commented Feb 25, 2022

anssiko commented Mar 24, 2022

wchao1115 commented Feb 23, 2022 •

edited