
Added TensorDenotation and metadata_props for images #879

Merged
merged 25 commits into master from walrusmcd:fork2onnx
May 23, 2018
Changes from 9 commits
Commits
8de10bd
Added TensorDenotation, markdown docs for metadata_props for images, …
walrusmcd May 2, 2018
b87d257
Merge branch 'master' into fork2onnx
walrusmcd May 3, 2018
9ee54fe
CR feedback in the docs.
walrusmcd May 9, 2018
33c69fd
Merge branch 'fork2onnx' of https://github.com/walrusmcd/onnx into fo…
walrusmcd May 9, 2018
39e43e4
moved to type
walrusmcd May 12, 2018
de1fcb3
Updated the docs, kept TypeDenotation and DimesnionDenotation sperate…
walrusmcd May 14, 2018
cca4cf5
Merge branch 'master' into fork2onnx
walrusmcd May 14, 2018
60baa38
fixed the CI by running gen_proto.py
walrusmcd May 14, 2018
3482ce8
Merge branch 'master' into fork2onnx
linkerzhang May 14, 2018
e97b359
PR feedback, trimmed the type denotation doc and added a sample
walrusmcd May 16, 2018
e396e64
Merge branch 'fork2onnx' of https://github.com/walrusmcd/onnx into fo…
walrusmcd May 16, 2018
9b610b8
markdown cleanup
walrusmcd May 16, 2018
59f2af8
markdown cleanup
walrusmcd May 16, 2018
a49680d
markdown cleanup
walrusmcd May 16, 2018
0ef4bdc
Merge branch 'master' into fork2onnx
linkerzhang May 19, 2018
960048a
Merge branch 'master' into fork2onnx
linkerzhang May 22, 2018
8798305
PR feedback.
walrusmcd May 23, 2018
31e0092
Merge branch 'fork2onnx' of https://github.com/walrusmcd/onnx into fo…
walrusmcd May 23, 2018
e5d2401
Merge branch 'master' into fork2onnx
linkerzhang May 23, 2018
0da5253
typo (a vs. an)
walrusmcd May 23, 2018
fb31948
typo - a vs an
walrusmcd May 23, 2018
6784565
Merge branch 'fork2onnx' of https://github.com/walrusmcd/onnx into fo…
walrusmcd May 23, 2018
3a966f8
Merge branch 'master' into fork2onnx
prasanthpul May 23, 2018
167c335
Merge branch 'master' into fork2onnx
prasanthpul May 23, 2018
8500466
Merge branch 'master' into fork2onnx
linkerzhang May 23, 2018
4 changes: 4 additions & 0 deletions docs/DimensionDenotation.md
@@ -43,3 +43,7 @@ for i, j in enumerate(perm):
## Denotation Verification

Denotation Verification happens when an operation expects its input to arrive in a particular format. An example operation where denotation verification happens is AveragePool operation where the input, if annotated with dimension denotation, in the 2D case should have the denotation [`DATA_BATCH`, `DATA_CHANNEL`, `DATA_FEATURE`, `DATA_FEATURE`]. If there is a mismatch between the expected dimension denotation and the actual dimension denotation, an error should be reported.
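
As a rough illustration of that check, here is a minimal sketch (not part of this PR): it assumes the onnx Python package, a placeholder model path and input name, and that the input's dimensions already carry denotation strings.

```python
import onnx

EXPECTED = ["DATA_BATCH", "DATA_CHANNEL", "DATA_FEATURE", "DATA_FEATURE"]

def verify_pool_input_denotation(model_path, input_name):
    """Raise if the named 2-D pooling input does not carry the expected denotations."""
    model = onnx.load(model_path)
    for value_info in model.graph.input:
        if value_info.name != input_name:
            continue
        # Dimension denotation lives on each TensorShapeProto.Dimension
        actual = [d.denotation for d in value_info.type.tensor_type.shape.dim]
        if actual != EXPECTED:
            raise ValueError("dimension denotation mismatch: expected %s, got %s"
                             % (EXPECTED, actual))
```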

## Type Denotation

See the [type denotation documentation](TypeDenotation.md) for more details on how to describe images and other types.
2 changes: 2 additions & 0 deletions docs/IR.md
@@ -357,6 +357,8 @@ The type system used for attributes is a superset of that used for of inputs and

The ONNX specification is comprised of this document, which defines the semantics of the IR and the standard data types, and the following documents defining standard operator semantics and the IR syntax. The latter is specified as Protobuf v2 and v3 schema files.

See the [metadata category documentation](MetadataProps.md) for more details.

### Operators

[Neural Network Operators](Operators.md)
34 changes: 34 additions & 0 deletions docs/MetadataProps.md
@@ -0,0 +1,34 @@
# Metadata

In addition to the core metadata recommendations listed in the [extensibility documentation](IR.md#metadata), there is additional experimental metadata to help provide information about model inputs and outputs.
@houseroad (Member), May 16, 2018:
This doc is called metadata prop, so we should not introduce the additional experimental metadata here.
Let's move the introduction of the experimental metadata to metadata section in IR doc.

We can extract the useful pieces in this doc, and merge it into metadata section in IR.doc.

At least, we should have a better name for this section...

@walrusmcd (PR author) replied:
The example should help to show that we have 3 separate pieces here, and how to tie them all together.


This metadata applies to all input and output tensors of a given category. The first such category we define is `Image`.

## Motivation

The motivation for such a mechanism is to allow model authors to convey enough information for model consumers to use the model correctly.

In the case of images there are many options for providing valid image data. However, a model that consumes images was trained with a particular set of these options, which must be used during inferencing.

The goal of this proposal is to provide enough metadata that the model consumer can perform their own featurization prior to running the model and provide a compatible input, or retrieve an output and know what its format is.
A reviewer (Contributor) commented:
nit: no need for a new line here

@walrusmcd (PR author) replied:
fixed !


## Image Category Definition

For every tensor in a model that uses [Type Denotation](TypeDenotation.md) to declare itself an `IMAGE`, you SHOULD provide metadata to assist the model consumer. Note that any metadata provided using this mechanism is global to ALL types with the accompanying denotation.

Keys and values are case insensitive.

Specifically, we define here the following set of image metadata:

|Key|Value|Description|
|-----|----|-----------|
|`Image.BitmapPixelFormat`|__string__|Specifies the format of pixel data. Each enumeration value defines a channel ordering and bit depth. Possible values: <ul><li>`Gray8`: 1 channel image, the pixel data is 8 bpp grayscale.</li><li>`Rgb8`: 3 channel image, channel order is RGB, pixel data is 8bpp (No alpha)</li><li>`Bgr8`: 3 channel image, channel order is BGR, pixel data is 8bpp (No alpha)</li><li>`Rgba8`: 4 channel image, channel order is RGBA, pixel data is 8bpp (Straight alpha)</li><li>`Bgra8`: 4 channel image, channel order is BGRA, pixel data is 8bpp (Straight alpha)</li></ul>|
A reviewer (Member) commented:
can you call out that keys and values are case insensitive?

@walrusmcd (PR author) replied:
nice ! added to the next iteration.

|`Image.ColorSpaceGamma`|__string__|Specifies the gamma color space used. Possible values:<ul><li>`Linear`: Linear color space, gamma == 1.0</li><li>`SRGB`: sRGB color space, gamma == 2.2</li></ul>|
|`Image.NominalPixelRange`|__string__|Specifies the range in which pixel values are stored. Possible values: <ul><li>`NominalRange_0_255`: [0...255] for 8bpp samples</li><li>`Normalized_0_1`: [0...1] pixel data is stored normalized</li><li>`Normalized_1_1`: [-1...1] pixel data is stored normalized</li><li>`NominalRange_16_235`: [16...235] for 8bpp samples</li></ul>|
@houseroad (Member), May 22, 2018:
In some cases, means for each channel is also needed to preprocess the input.

@walrusmcd (PR author) replied:
How do you think that would look ? Are you OK if we add this as a follow up PR later ? I think this follows this model of adding more and more metadata as we find it to be useful. love it !
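
To make the table above concrete, here is a minimal sketch (not part of this PR) of attaching these keys to a model's `metadata_props`; the model path and the chosen values are placeholders, and it assumes the standard onnx Python package.

```python
import onnx

model = onnx.load("squeezenet.onnx")  # placeholder model file

image_metadata = {
    "Image.BitmapPixelFormat": "Bgr8",
    "Image.ColorSpaceGamma": "SRGB",
    "Image.NominalPixelRange": "NominalRange_0_255",
}

for key, value in image_metadata.items():
    entry = model.metadata_props.add()  # repeated StringStringEntryProto on ModelProto
    entry.key = key
    entry.value = value

onnx.save(model, "squeezenet_with_metadata.onnx")
```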




55 changes: 55 additions & 0 deletions docs/TypeDenotation.md
@@ -0,0 +1,55 @@
Using Denotation For Semantic Description
------------------------------------------

Denotation is an experimental way to give models a semantic description. It enables model authors to describe the parts of their model that application developers need to know in order to consume it.

There are two types of denotation: [Type Denotation](TypeDenotation.md) and [Dimension Denotation](DimensionDenotation.md).
A reviewer (Member) commented:
Why do we mention Dimension Denotation here?

@walrusmcd (PR author) replied:
I am adding an example to this MD, that shows why you actually need to use all three features (type denotation, dimenstion denotation, and metadata) . all three are needed to make it work end to end.


### Type Denotation

Type Denotation is used to describe semantic information about what the inputs and outputs are. It is stored on the `TypeProto` message.

#### Motivation

The motivation for such a mechanism can be illustrated with a simple example. The neural network SqueezeNet takes an NCHW image input float[1,2,244,244] and produces an output float[1,1000,1,1]:
A reviewer (Member) commented:
A typical SqueezeNet takes [1, 3, 224, 224] as input.

@walrusmcd (PR author) replied:
Good catch, I meant 3. fixed.


```
input_in_NCHW -> data_0 -> SqueezeNet() -> output_softmaxout_1
```

In order to run this model, the user needs a lot of information. In this case, the user needs to know:
A reviewer (Member) commented:
I think instead of using type denotation, saving the following information in the metadata is more straightforward.

@walrusmcd (PR author) replied:
We experimented with that approach first, but it's more semantically correct to first denote that the type is an IMAGE. only then do you know to go look at the metdata to see how the model requires it's images. If you had multiple inputs into the model, you need a type denotation to know which of those types is the image.

@lupesko (Contributor), May 20, 2018:
+1 the need to support multiple inputs is a good call

* the input is an image
* the image is in the format of NCHW
* the color channels are in the order of bgr
* the pixel data is 8 bit
* the pixel data is normalized as values 0-255

This proposal consists of three key components that together provide all of this information: Type Denotation, [Dimension Denotation](DimensionDenotation.md), and [model metadata](MetadataProps.md), each of which is discussed in detail below.

#### Type Denotation Definition

To begin with, we define a set of semantic types that describe what models generally consume as inputs and produce as outputs.

Specifically, in our first proposal we define the following set of standard denotations:

1. `IMAGE` describes that a type holds an image. You can use dimension denotation to learn more about the layout of the image, and also the optional model metadata_props.
2. `AUDIO` describes that a type holds an audio clip.
3. `TEXT` describes that a type holds a block of text.

A reviewer (Contributor) commented:
What about good old numerical tensors, for other tasks such as recommendations, forecasting, anomaly detection, etc?
Should we add another type TENSOR for those?

@walrusmcd (PR author) replied:
Makes sense ! I added the 3 we had been using the most in our models (image/audio/text) and added explicit metadata for image. I assumed that there would be follow up proposals and PR's as we add more "types" here. What you thinking ? Also, we do have the normal case where there is no denotation . In that case the tensor is also a good old numerical tensor. since denotation is optional, it still has all the fun stuff, like shape and type.

Model authors SHOULD add type denotation to the inputs and outputs of the model.
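
A minimal sketch of doing so with the onnx Python package (not part of this PR; the model path is a placeholder, and `denotation` is the `TypeProto` field proposed here):

```python
import onnx

model = onnx.load("squeezenet.onnx")      # placeholder model file
image_input = model.graph.input[0]        # assumes the first graph input is the image
image_input.type.denotation = "IMAGE"     # TypeProto.denotation, as proposed in this PR
onnx.save(model, "squeezenet_denoted.onnx")
```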

#### Denotation Propagation

Type Denotation propagation does not occur automatically. It is used to describe the initial inputs or final outputs, but as data flows through the graph no inference is made to propagate whether the data still holds an image (for example). A model builder or conversion tool MAY apply propagation manually in the model if it knows that subsequent types share the same semantic denotation.
@houseroad (Member), May 16, 2018:

How does this work? Any example?

@walrusmcd (PR author) replied:
great catch !! I think an example would be perfect here. I'm adding one.


#### Denotation Verification

Denotation Verification is not enforced. It is simply a method for model authors to indicate to model consumers what they should be passing in and what they should be expecting out. No error is reported if you do not actually pass in an image (for example).

#### Combined With Dimension Denotation

Type denotation can be combined with the new experimental feature of [Dimension Denotation](DimensionDenotation.md). For example, if the Type Denotation is `IMAGE`, then that type SHOULD also have [Dimension Denotation](DimensionDenotation.md) stating the channel layout.

#### Model metadata_props

A model author then uses model metadata to describe information about ALL of the inputs and outputs for the model. For example, `Image.BitmapPixelFormat`. See the [model metadata documentation](MetadataProps.md) for details.
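
Putting the three pieces together, a consumer-side sketch might look like the following (illustrative only, not part of this PR; it assumes the field names proposed here and a model already annotated as described above):

```python
import onnx

def describe_image_inputs(model_path):
    """Print the preprocessing hints carried by each IMAGE-denoted input."""
    model = onnx.load(model_path)
    # metadata keys are case insensitive, so normalize them once
    props = {p.key.lower(): p.value for p in model.metadata_props}
    for value_info in model.graph.input:
        if value_info.type.denotation != "IMAGE":   # type denotation
            continue
        # dimension denotation describes the layout, e.g. NCHW
        layout = [d.denotation for d in value_info.type.tensor_type.shape.dim]
        print(value_info.name,
              "layout:", layout,
              "pixel format:", props.get("image.bitmappixelformat"),
              "gamma:", props.get("image.colorspacegamma"),
              "pixel range:", props.get("image.nominalpixelrange"))
```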
17 changes: 17 additions & 0 deletions onnx/onnx-ml.proto
@@ -477,6 +477,23 @@ message TypeProto {
Map map_type = 5;

}

// An optional denotation can be used to denote the whole
// type with a standard semantic description as to what is
// stored inside
optional string denotation = 6;
}

// A set of pre-defined constants to be used as values for
// the standard denotation field in Tensor for semantic
// description of the tensor.
message TypeDenotationConstProto {
A reviewer (Contributor) commented:
Please see my comment above about a generic tensor

@walrusmcd (PR author), May 23, 2018:

I was thinking we make denotation optional, and "generic" would be that there is no denotation at all. Are you saying we should add a TENSOR to the constproto as the default/0 case so that you could provide a tensor denotation and say it is just a "tensor" ? Assuming that , what do you think? I'll add a GENERIC to this PR for completeness, but leave the denotation as optional. Let me know what you think and if you are proposing we make the denotation required. thanks !

The reviewer (Contributor) replied:
Agreed.

// An image is stored inside this tensor
optional string IMAGE = 1 [default = "IMAGE"];
// Audio is stored inside this tensor
optional string AUDIO = 2 [default = "AUDIO"];
// Text is stored inside this tensor
optional string TEXT = 3 [default = "TEXT"];
}

// Operator Sets
17 changes: 17 additions & 0 deletions onnx/onnx-ml.proto3
@@ -477,6 +477,23 @@ message TypeProto {
Map map_type = 5;

}

// An optional denotation can be used to denote the whole
// type with a standard semantic description as to what is
// stored inside
string denotation = 6;
}

// A set of pre-defined constants to be used as values for
// the standard denotation field in Tensor for semantic
// description of the tensor.
message TypeDenotationConstProto {
// An image is stored inside this tensor
string IMAGE = 1 [default = "IMAGE"];
// Audio is stored inside this tensor
string AUDIO = 2 [default = "AUDIO"];
// Text is stored inside this tensor
string TEXT = 3 [default = "TEXT"];
}

// Operator Sets
17 changes: 17 additions & 0 deletions onnx/onnx.in.proto
@@ -478,6 +478,23 @@ message TypeProto {

// #endif
}

// An optional denotation can be used to denote the whole
// type with a standard semantic description as to what is
// stored inside
optional string denotation = 6;
}

// A set of pre-defined constants to be used as values for
// the standard denotation field in Tensor for semantic
// description of the tensor.
message TypeDenotationConstProto {
// An image is stored inside this type
optional string IMAGE = 1 [default = "IMAGE"];
// Audio is stored inside this type
optional string AUDIO = 2 [default = "AUDIO"];
// Text is stored inside this type
optional string TEXT = 3 [default = "TEXT"];
}

// Operator Sets
17 changes: 17 additions & 0 deletions onnx/onnx.proto
@@ -446,6 +446,23 @@ message TypeProto {
Tensor tensor_type = 1;

}

// An optional denotation can be used to denote the whole
// type with a standard semantic description as to what is
// stored inside
optional string denotation = 6;
}

// A set of pre-defined constants to be used as values for
// the standard denotation field in Tensor for semantic
// description of the tensor.
message TypeDenotationConstProto {
// An image is stored inside this type
optional string IMAGE = 1 [default = "IMAGE"];
// Audio is stored inside this type
optional string AUDIO = 2 [default = "AUDIO"];
// Text is stored inside this type
optional string TEXT = 3 [default = "TEXT"];
}

// Operator Sets
17 changes: 17 additions & 0 deletions onnx/onnx.proto3
@@ -446,6 +446,23 @@ message TypeProto {
Tensor tensor_type = 1;

}

// An optional denotation can be used to denote the whole
// type with a standard semantic description as to what is
// stored inside
string denotation = 6;
}

// A set of pre-defined constants to be used as values for
// the standard denotation field in Tensor for semantic
// description of the tensor.
message TypeDenotationConstProto {
// An image is stored inside this type
string IMAGE = 1 [default = "IMAGE"];
// Audio is stored inside this type
string AUDIO = 2 [default = "AUDIO"];
// Text is stored inside this type
string TEXT = 3 [default = "TEXT"];
}

// Operator Sets