Merge 31e0092 into 48e5828

walrusmcd committed May 23, 2018
2 parents 48e5828 + 31e0092 commit d707710
Showing 9 changed files with 188 additions and 0 deletions.
4 changes: 4 additions & 0 deletions docs/DimensionDenotation.md
@@ -43,3 +43,7 @@ for i, j in enumerate(perm):
## Denotation Verification

Denotation Verification happens when an operation expects its input to arrive in a particular format. An example operation where denotation verification happens is AveragePool operation where the input, if annotated with dimension denotation, in the 2D case should have the denotation [`DATA_BATCH`, `DATA_CHANNEL`, `DATA_FEATURE`, `DATA_FEATURE`]. If there is a mismatch between the expected dimension denotation and the actual dimension denotation, an error should be reported.
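
The check described above can be sketched in a few lines. This is an illustrative sketch, not the ONNX API; the function name and error type are assumptions:

```python
# Minimal sketch of denotation verification. For a 2D AveragePool, the
# annotated input is expected to carry the NCHW dimension denotations
# listed above.
AVGPOOL_2D_EXPECTED = ["DATA_BATCH", "DATA_CHANNEL", "DATA_FEATURE", "DATA_FEATURE"]

def verify_denotation(actual, expected):
    """Report an error if the actual dimension denotation mismatches."""
    if actual != expected:
        raise ValueError(
            "dimension denotation mismatch: expected %s, got %s"
            % (expected, actual))

# A correctly annotated NCHW input passes silently:
verify_denotation(["DATA_BATCH", "DATA_CHANNEL", "DATA_FEATURE", "DATA_FEATURE"],
                  AVGPOOL_2D_EXPECTED)
```

A mismatched annotation (e.g. channels before batch) raises the error instead of passing silently.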

## Type Denotation

See the [type denotation documentation](TypeDenotation.md) for more details on how to describe images and other types.
2 changes: 2 additions & 0 deletions docs/IR.md
@@ -357,6 +357,8 @@ The type system used for attributes is a superset of that used for inputs and

The ONNX specification consists of this document, which defines the semantics of the IR and the standard data types, and the following documents, which define standard operator semantics and the IR syntax. The latter are specified as Protobuf v2 and v3 schema files.

See the [metadata category documentation](MetadataProps.md) for more details.

### Operators

[Neural Network Operators](Operators.md)
33 changes: 33 additions & 0 deletions docs/MetadataProps.md
@@ -0,0 +1,33 @@
# Metadata

In addition to the core metadata recommendations listed in the [extensibility documentation](IR.md#metadata), there is additional experimental metadata that helps provide information about model inputs and outputs.

This metadata applies to all input and output tensors of a given category. The first such category we define is `Image`.

## Motivation

The motivation for such a mechanism is to let model authors convey to model consumers enough information to consume the model correctly.

In the case of images there are many options for providing valid image data. However, a model that consumes images was trained with one particular set of these options, which must also be used during inferencing.

The goal of this proposal is to provide enough metadata that the model consumer can perform their own featurization prior to running the model, provide a compatible input, or retrieve an output and know its format.

## Image Category Definition

For every tensor in a model that uses [Type Denotation](TypeDenotation.md) to declare itself an `IMAGE`, you SHOULD provide metadata to assist the model consumer. Note that any metadata provided using this mechanism is global to ALL types with the accompanying denotation.

Keys and values are case insensitive.

Specifically, we define here the following set of image metadata:

|Key|Value|Description|
|-----|----|-----------|
|`Image.BitmapPixelFormat`|__string__|Specifies the format of pixel data. Each enumeration value defines a channel ordering and bit depth. Possible values: <ul><li>`Gray8`: 1 channel image, the pixel data is 8 bpp grayscale.</li><li>`Rgb8`: 3 channel image, channel order is RGB, pixel data is 8bpp (No alpha)</li><li>`Bgr8`: 3 channel image, channel order is BGR, pixel data is 8bpp (No alpha)</li><li>`Rgba8`: 4 channel image, channel order is RGBA, pixel data is 8bpp (Straight alpha)</li><li>`Bgra8`: 4 channel image, channel order is BGRA, pixel data is 8bpp (Straight alpha)</li></ul>|
|`Image.ColorSpaceGamma`|__string__|Specifies the gamma color space used. Possible values:<ul><li>`Linear`: Linear color space, gamma == 1.0</li><li>`SRGB`: sRGB color space, gamma == 2.2</li></ul>|
|`Image.NominalPixelRange`|__string__|Specifies the range in which pixel values are stored. Possible values: <ul><li>`NominalRange_0_255`: [0...255] for 8bpp samples</li><li>`Normalized_0_1`: [0...1] pixel data is stored normalized</li><li>`Normalized_1_1`: [-1...1] pixel data is stored normalized</li><li>`NominalRange_16_235`: [16...235] for 8bpp samples</li></ul>|
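
Since keys and values are case insensitive, a consumer would normalize both sides before comparing. A minimal sketch, assuming the metadata is available as key/value string pairs (the helper name is hypothetical, not part of ONNX):

```python
# Illustrative helper for case-insensitive lookup of image metadata.
# metadata_props is assumed to be an iterable of (key, value) pairs.
def get_image_metadata(metadata_props, key):
    """Return the value for `key`, comparing keys case-insensitively."""
    wanted = key.lower()
    for k, v in metadata_props:
        if k.lower() == wanted:
            return v
    return None

props = [("Image.BitmapPixelFormat", "Bgr8"),
         ("Image.ColorSpaceGamma", "SRGB"),
         ("Image.NominalPixelRange", "NominalRange_0_255")]

get_image_metadata(props, "image.bitmappixelformat")  # returns "Bgr8"
```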


54 changes: 54 additions & 0 deletions docs/TypeDenotation.md
@@ -0,0 +1,54 @@
# Type Denotation

Type Denotation is used to describe semantic information about what the inputs and outputs are. It is stored on the TypeProto message.

## Motivation

The motivation for such a mechanism can be illustrated with a simple example. The neural network SqueezeNet takes in an NCHW image input float[1,3,224,224] and produces an output float[1,1000,1,1]:

```
input_in_NCHW -> data_0 -> SqueezeNet() -> output_softmaxout_1
```

In order to run this model, the user needs a lot of information. In this case the user needs to know:
* the input is an image
* the image is in the format of NCHW
* the color channels are in the order of bgr
* the pixel data is 8 bit
* the pixel data is normalized as values 0-255
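
The featurization implied by the list above can be sketched as follows: convert an HWC, RGB image with values in [0, 1] into the NCHW, BGR, 0-255 layout the model was trained on. Pure Python for illustration only; a real pipeline would use an array library:

```python
# Hedged sketch of consumer-side featurization for this model's input
# requirements: HWC/RGB/[0,1] in, NCHW/BGR/[0,255] out.
def featurize(image_hwc_rgb):
    h = len(image_hwc_rgb)
    w = len(image_hwc_rgb[0])
    # One plane per channel, reordered RGB -> BGR, rescaled to 0-255.
    nchw = [[[image_hwc_rgb[y][x][c] * 255.0 for x in range(w)]
             for y in range(h)]
            for c in (2, 1, 0)]  # channel order B, G, R
    return [nchw]  # add the batch dimension (N = 1)

pixel = [0.0, 0.5, 1.0]       # one RGB pixel: R=0.0, G=0.5, B=1.0
batch = featurize([[pixel]])  # nested shape [1][3][1][1]
# batch[0][0][0][0] == 255.0  # blue plane comes first after RGB -> BGR
```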

This proposal consists of three key components to provide all of this information:
* Type Denotation,
* [Dimension Denotation](DimensionDenotation.md),
* [Model Metadata](MetadataProps.md).

## Type Denotation Definition

To begin with, we define a set of semantic types describing what models generally consume as inputs and produce as outputs.

Specifically, in our first proposal we define the following set of standard denotations:

0. `TENSOR` describes that a type holds a generic tensor using the standard TypeProto message.
1. `IMAGE` describes that a type holds an image. You can use dimension denotation, and also the optional model metadata_props, to learn more about the layout of the image.
2. `AUDIO` describes that a type holds an audio clip.
3. `TEXT` describes that a type holds a block of text.
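
The standard denotations above can be collected as simple constants with a small validation helper (an illustrative sketch, not part of the ONNX API):

```python
# The standard type denotations, as a set for quick validation
# (illustrative only; not part of the ONNX API).
STANDARD_TYPE_DENOTATIONS = {"TENSOR", "IMAGE", "AUDIO", "TEXT"}

def is_standard_denotation(value):
    """True if `value` is one of the standard type denotations."""
    return value in STANDARD_TYPE_DENOTATIONS

is_standard_denotation("IMAGE")  # returns True
is_standard_denotation("VIDEO")  # returns False
```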

Model authors SHOULD add type denotation to inputs and outputs for the model as appropriate.

## An Example with input IMAGE

Let's use the same SqueezeNet example from above and show everything needed to properly annotate the model:

* First, set TypeProto.denotation = `IMAGE` for the ValueInfoProto `data_0`
* Because it's an image, the model consumer now knows to go look for image metadata on the model
* Then include 3 metadata strings on ModelProto.metadata_props
* `Image.BitmapPixelFormat` = `Bgr8`
* `Image.ColorSpaceGamma` = `SRGB`
* `Image.NominalPixelRange` = `NominalRange_0_255`
* For that same ValueInfoProto, make sure to also use Dimension Denotations to denote NCHW
* TensorShapeProto.Dimension[0].denotation = `DATA_BATCH`
* TensorShapeProto.Dimension[1].denotation = `DATA_CHANNEL`
* TensorShapeProto.Dimension[2].denotation = `DATA_FEATURE`
* TensorShapeProto.Dimension[3].denotation = `DATA_FEATURE`

Now there is enough information in the model to know everything about how to pass a correct image into the model.
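
The steps above can be collected as plain data structures. This is a sketch of the information carried by TypeProto, TensorShapeProto, and ModelProto.metadata_props, not the actual protobuf API:

```python
# Plain-Python sketch of the SqueezeNet annotation assembled above
# (illustrative; real code would use the protoc-generated bindings).
annotation = {
    "data_0": {
        "denotation": "IMAGE",
        "dim_denotation": ["DATA_BATCH", "DATA_CHANNEL",
                           "DATA_FEATURE", "DATA_FEATURE"],
    },
}
metadata_props = {
    "Image.BitmapPixelFormat": "Bgr8",
    "Image.ColorSpaceGamma": "SRGB",
    "Image.NominalPixelRange": "NominalRange_0_255",
}

# A consumer can now discover that data_0 is an NCHW, BGR image with
# pixel values in [0, 255].
is_image = annotation["data_0"]["denotation"] == "IMAGE"  # True
```
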
19 changes: 19 additions & 0 deletions onnx/onnx-ml.proto
@@ -476,6 +476,25 @@ message TypeProto {
Map map_type = 5;

}

// An optional denotation can be used to denote the whole
// type with a standard semantic description as to what is
// stored inside
optional string denotation = 6;
}

// A set of pre-defined constants to be used as values for
// the standard denotation field in TypeProto for semantic
// description of the type.
message TypeDenotationConstProto {
// A generic tensor
optional string TENSOR = 1 [default = "TENSOR"];
// An image is stored inside this tensor
optional string IMAGE = 2 [default = "IMAGE"];
// Audio is stored inside this tensor
optional string AUDIO = 3 [default = "AUDIO"];
// Text is stored inside this tensor
optional string TEXT = 4 [default = "TEXT"];
}

// Operator Sets
19 changes: 19 additions & 0 deletions onnx/onnx-ml.proto3
@@ -476,6 +476,25 @@ message TypeProto {
Map map_type = 5;

}

// An optional denotation can be used to denote the whole
// type with a standard semantic description as to what is
// stored inside
string denotation = 6;
}

// A set of pre-defined constants to be used as values for
// the standard denotation field in TypeProto for semantic
// description of the type. Note that proto3 does not allow
// custom default values, so the intended constant is noted
// in each comment.
message TypeDenotationConstProto {
// A generic tensor ("TENSOR")
string TENSOR = 1;
// An image is stored inside this tensor ("IMAGE")
string IMAGE = 2;
// Audio is stored inside this tensor ("AUDIO")
string AUDIO = 3;
// Text is stored inside this tensor ("TEXT")
string TEXT = 4;
}

// Operator Sets
19 changes: 19 additions & 0 deletions onnx/onnx.in.proto
@@ -477,6 +477,25 @@ message TypeProto {

// #endif
}

// An optional denotation can be used to denote the whole
// type with a standard semantic description as to what is
// stored inside
optional string denotation = 6;
}

// A set of pre-defined constants to be used as values for
// the standard denotation field in TypeProto for semantic
// description of the type.
message TypeDenotationConstProto {
// A generic tensor
optional string TENSOR = 1 [default = "TENSOR"];
// An image is stored inside this type
optional string IMAGE = 2 [default = "IMAGE"];
// Audio is stored inside this type
optional string AUDIO = 3 [default = "AUDIO"];
// Text is stored inside this type
optional string TEXT = 4 [default = "TEXT"];
}

// Operator Sets
19 changes: 19 additions & 0 deletions onnx/onnx.proto
@@ -445,6 +445,25 @@ message TypeProto {
Tensor tensor_type = 1;

}

// An optional denotation can be used to denote the whole
// type with a standard semantic description as to what is
// stored inside
optional string denotation = 6;
}

// A set of pre-defined constants to be used as values for
// the standard denotation field in TypeProto for semantic
// description of the type.
message TypeDenotationConstProto {
// A generic tensor
optional string TENSOR = 1 [default = "TENSOR"];
// An image is stored inside this type
optional string IMAGE = 2 [default = "IMAGE"];
// Audio is stored inside this type
optional string AUDIO = 3 [default = "AUDIO"];
// Text is stored inside this type
optional string TEXT = 4 [default = "TEXT"];
}

// Operator Sets
19 changes: 19 additions & 0 deletions onnx/onnx.proto3
@@ -445,6 +445,25 @@ message TypeProto {
Tensor tensor_type = 1;

}

// An optional denotation can be used to denote the whole
// type with a standard semantic description as to what is
// stored inside
string denotation = 6;
}

// A set of pre-defined constants to be used as values for
// the standard denotation field in TypeProto for semantic
// description of the type. Note that proto3 does not allow
// custom default values, so the intended constant is noted
// in each comment.
message TypeDenotationConstProto {
// A generic tensor ("TENSOR")
string TENSOR = 1;
// An image is stored inside this type ("IMAGE")
string IMAGE = 2;
// Audio is stored inside this type ("AUDIO")
string AUDIO = 3;
// Text is stored inside this type ("TEXT")
string TEXT = 4;
}

// Operator Sets
