Merged
4 changes: 2 additions & 2 deletions .github/workflows/main.yml
@@ -11,6 +11,6 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- uses: WebAssembly/wit-abi-up-to-date@v3
- uses: WebAssembly/wit-abi-up-to-date@v4
with:
wit-abi-tag: wit-abi-0.1.0
wit-abi-tag: wit-abi-0.2.0
72 changes: 18 additions & 54 deletions wasi-nn.abi.md
@@ -2,20 +2,15 @@

## <a href="#tensor_dimensions" name="tensor_dimensions"></a> `tensor-dimensions`: list<`u32`>

The dimensions of a tensor.

The array length matches the tensor rank and each element in the array
describes the size of each dimension.

Size: 8, Alignment: 4

## <a href="#tensor_type" name="tensor_type"></a> `tensor-type`: variant
## <a href="#tensor_type" name="tensor_type"></a> `tensor-type`: enum

The type of the elements in a tensor.

Size: 1, Alignment: 1

### Variant Cases
### Enum Cases

- <a href="tensor_type.fp16" name="tensor_type.fp16"></a> [`fp16`](#tensor_type.fp16)

@@ -31,43 +26,27 @@ Size: 1, Alignment: 1

## <a href="#tensor_data" name="tensor_data"></a> `tensor-data`: list<`u8`>

The tensor data.

Initially conceived as a sparse representation, each empty cell would be filled with zeros and
the array length must match the product of all of the dimensions and the number of bytes in the
type (e.g., a 2x2 tensor with 4-byte f32 elements would have a data array of length 16).
Naturally, this representation requires some knowledge of how to lay out data in memory--e.g.,
using row-major ordering--and could perhaps be improved.
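The length requirement above is a simple calculation. The sketch below illustrates it; the helper name and the element-size mapping (assuming element types `fp16`, `fp32`, `u8`, and `i32`) are illustrative, not part of the API:

```python
from math import prod

# Illustrative bytes-per-element mapping for the tensor-type enum cases.
ELEMENT_SIZE = {"fp16": 2, "fp32": 4, "u8": 1, "i32": 4}

def tensor_data_len(dimensions, tensor_type):
    """Expected byte length of tensor-data: the product of all
    dimensions times the byte size of one element."""
    return prod(dimensions) * ELEMENT_SIZE[tensor_type]

# A 2x2 tensor with 4-byte f32 elements has a data array of length 16.
assert tensor_data_len([2, 2], "fp32") == 16
```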

Size: 8, Alignment: 4

## <a href="#tensor" name="tensor"></a> `tensor`: record

A tensor.

Size: 20, Alignment: 4

### Record Fields

- <a href="tensor.dimensions" name="tensor.dimensions"></a> [`dimensions`](#tensor.dimensions): [`tensor-dimensions`](#tensor_dimensions)

Describe the size of the tensor (e.g., 2x2x2x2 -> [2, 2, 2, 2]). To represent a tensor
containing a single value, use `[1]` for the tensor dimensions.

- <a href="tensor.tensor_type" name="tensor.tensor_type"></a> [`tensor-type`](#tensor.tensor_type): [`tensor-type`](#tensor_type)

Describe the type of element in the tensor (e.g., f32).

- <a href="tensor.data" name="tensor.data"></a> [`data`](#tensor.data): [`tensor-data`](#tensor_data)

Contains the tensor data.
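The three fields above can be sketched as a plain record; this is only an in-memory illustration of the shape of the `tensor` record, not a binding:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Tensor:
    dimensions: List[int]  # e.g., [2, 2, 2, 2] for a 2x2x2x2 tensor
    tensor_type: str       # e.g., "fp32"
    data: bytes            # raw element bytes, laid out row-major

# A tensor containing a single f32 value uses [1] for its dimensions
# and carries 4 bytes of data.
scalar = Tensor(dimensions=[1], tensor_type="fp32", data=bytes(4))
```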

## <a href="#graph_builder" name="graph_builder"></a> `graph-builder`: list<`u8`>

The graph initialization data.

This consists of an array of buffers because implementing backends may encode their graph IR in
parts (e.g., OpenVINO stores its IR and weights separately).
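As a concrete sketch of the multi-buffer idea: an OpenVINO-style backend would pass its topology and weights as two entries of the array. The byte contents below are stand-ins, not real model data:

```python
# graph-builder is a byte buffer; graph-builder-array collects several,
# because a backend may split its graph IR into parts. For OpenVINO,
# the model topology (XML) and the weights travel separately.
ir_xml = b"<net .../>"   # stand-in for model topology contents
weights = b"\x00\x01"    # stand-in for weights contents
builder_array = [ir_xml, weights]
```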

Size: 8, Alignment: 4

@@ -76,14 +55,12 @@ Size: 8, Alignment: 4

Size: 8, Alignment: 4

## <a href="#graph_encoding" name="graph_encoding"></a> `graph-encoding`: variant
## <a href="#graph_encoding" name="graph_encoding"></a> `graph-encoding`: enum

Describes the encoding of the graph. This allows the API to be implemented by various backends
that encode (i.e., serialize) their graph IR with different formats.

Size: 1, Alignment: 1

### Variant Cases
### Enum Cases

- <a href="graph_encoding.openvino" name="graph_encoding.openvino"></a> [`openvino`](#graph_encoding.openvino)

@@ -94,13 +71,12 @@ Size: 1, Alignment: 1
- <a href="graph_encoding.tensorflow" name="graph_encoding.tensorflow"></a> [`tensorflow`](#graph_encoding.tensorflow)


## <a href="#execution_target" name="execution_target"></a> `execution-target`: variant
## <a href="#execution_target" name="execution_target"></a> `execution-target`: enum

Define where the graph should be executed.

Size: 1, Alignment: 1

### Variant Cases
### Enum Cases

- <a href="execution_target.cpu" name="execution_target.cpu"></a> [`cpu`](#execution_target.cpu)

@@ -111,102 +87,90 @@ Size: 1, Alignment: 1
- <a href="execution_target.tpu" name="execution_target.tpu"></a> [`tpu`](#execution_target.tpu)


## <a href="#error" name="error"></a> `error`: variant
## <a href="#error" name="error"></a> `error`: enum

Error codes returned by functions in this API.

Size: 1, Alignment: 1

### Variant Cases
### Enum Cases

- <a href="error.success" name="error.success"></a> [`success`](#error.success)

No error occurred.

- <a href="error.invalid_argument" name="error.invalid_argument"></a> [`invalid-argument`](#error.invalid_argument)

Caller module passed an invalid argument.

- <a href="error.invalid_encoding" name="error.invalid_encoding"></a> [`invalid-encoding`](#error.invalid_encoding)

Invalid encoding.

- <a href="error.missing_memory" name="error.missing_memory"></a> [`missing-memory`](#error.missing_memory)

Caller module is missing a memory export.

- <a href="error.busy" name="error.busy"></a> [`busy`](#error.busy)

Device or resource busy.

- <a href="error.runtime_error" name="error.runtime_error"></a> [`runtime-error`](#error.runtime_error)

Runtime error.

# Functions

----

#### <a href="#load" name="load"></a> `load`

Load an opaque sequence of bytes to use for inference.
##### Params

- <a href="#load.builder" name="load.builder"></a> `builder`: [`graph-builder-array`](#graph_builder_array)
- <a href="#load.encoding" name="load.encoding"></a> `encoding`: [`graph-encoding`](#graph_encoding)
- <a href="#load.target" name="load.target"></a> `target`: [`execution-target`](#execution_target)
##### Results
##### Result

- <a href="#load." name="load."></a> ``: expected<handle<graph>, [`error`](#error)>
- expected<handle<graph>, [`error`](#error)>

----

#### <a href="#init_execution_context" name="init_execution_context"></a> `init-execution-context`

Create an execution instance of a loaded graph.
##### Params

- <a href="#init_execution_context.graph" name="init_execution_context.graph"></a> `graph`: handle<graph>
##### Results
##### Result

- <a href="#init_execution_context." name="init_execution_context."></a> ``: expected<handle<graph-execution-context>, [`error`](#error)>
- expected<handle<graph-execution-context>, [`error`](#error)>

----

#### <a href="#set_input" name="set_input"></a> `set-input`

Define the inputs to use for inference.
##### Params

- <a href="#set_input.ctx" name="set_input.ctx"></a> `ctx`: handle<graph-execution-context>
- <a href="#set_input.index" name="set_input.index"></a> `index`: `u32`
- <a href="#set_input.tensor" name="set_input.tensor"></a> `tensor`: [`tensor`](#tensor)
##### Results
##### Result

- <a href="#set_input." name="set_input."></a> ``: expected<_, [`error`](#error)>
- expected<`unit`, [`error`](#error)>

----

#### <a href="#compute" name="compute"></a> `compute`

Compute the inference on the given inputs.
##### Params

- <a href="#compute.ctx" name="compute.ctx"></a> `ctx`: handle<graph-execution-context>
##### Results
##### Result

- <a href="#compute." name="compute."></a> ``: expected<_, [`error`](#error)>
- expected<`unit`, [`error`](#error)>

----

#### <a href="#get_output" name="get_output"></a> `get-output`

Extract the outputs after inference.
##### Params

- <a href="#get_output.ctx" name="get_output.ctx"></a> `ctx`: handle<graph-execution-context>
- <a href="#get_output.index" name="get_output.index"></a> `index`: `u32`
##### Results
##### Result

- <a href="#get_output." name="get_output."></a> ``: expected<[`tensor`](#tensor), [`error`](#error)>
- expected<[`tensor`](#tensor), [`error`](#error)>
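Taken together, the five functions above form one call sequence: load a graph, create an execution context, bind inputs, compute, and read outputs. The sketch below mirrors that order with a hypothetical in-memory host standing in for the wasi-nn imports; none of these Python names exist in the API itself:

```python
# Hypothetical host object; each method mirrors one wasi-nn function.
class FakeHost:
    def load(self, builder_array, encoding, target):
        return "graph-handle"            # stands in for handle<graph>
    def init_execution_context(self, graph):
        return "ctx-handle"              # stands in for handle<graph-execution-context>
    def set_input(self, ctx, index, tensor):
        pass                             # expected<unit, error> on success
    def compute(self, ctx):
        pass                             # expected<unit, error> on success
    def get_output(self, ctx, index):
        return {"dimensions": [1], "tensor_type": "fp32", "data": bytes(4)}

host = FakeHost()
graph = host.load([b"model-bytes"], "openvino", "cpu")
ctx = host.init_execution_context(graph)
host.set_input(ctx, 0, {"dimensions": [1], "tensor_type": "fp32", "data": bytes(4)})
host.compute(ctx)
out = host.get_output(ctx, 0)
```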

4 changes: 2 additions & 2 deletions wasi-nn.wit.md
@@ -107,10 +107,10 @@ index:

```wit
// Define the inputs to use for inference.
set-input: function(ctx: graph-execution-context, index: u32, tensor: tensor) -> expected<_, error>
set-input: function(ctx: graph-execution-context, index: u32, tensor: tensor) -> expected<unit, error>

// Compute the inference on the given inputs.
compute: function(ctx: graph-execution-context) -> expected<_, error>
compute: function(ctx: graph-execution-context) -> expected<unit, error>

// Extract the outputs after inference.
get-output: function(ctx: graph-execution-context, index: u32) -> expected<tensor, error>