Implement CELU node as a Function #2575

Merged
merged 16 commits into onnx:master on Feb 16, 2020

Conversation

Contributor

@jeremycochoy commented Jan 30, 2020

I had a look at your guidelines and tutorial for adding a missing op, following #1121 (comment).

Description:

The CELU operator was requested in issue #1121 and is now part of the new operator request list #1646.

First introduced in Continuously Differentiable Exponential Linear Units, CELU is similar to the ELU operation.

Given the attribute α, CELU is a pointwise application of the following formula:

CELU(x)=max(0,x)+min(0,α*(exp(x/α)−1))

and allows leakage of the gradient for negative values, while keeping the derivative continuous for any value of alpha (which is not the case for ELU).

It is implemented in Pytorch based on the Pytorch-ELU operation:

Tensor celu(const Tensor & self, Scalar alpha) {
  double inv_alpha = 1. / alpha.to<double>();
  return at::elu(self, alpha, Scalar(1.0), Scalar(inv_alpha));
}

A similar approach for ONNX-ELU is alpha * ELU(x / alpha, alpha=1).
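This identity can be checked numerically; the sketch below (an editorial sketch, not part of the PR) compares alpha * ELU(x / alpha, alpha=1) against the CELU formula directly:

import numpy as np

def onnx_elu(x, alpha=1.0):
    # ONNX Elu: max(0, x) + min(0, alpha * (exp(x) - 1))
    return np.maximum(0, x) + np.minimum(0, alpha * (np.exp(x) - 1))

def celu(x, alpha):
    # CELU: max(0, x) + min(0, alpha * (exp(x / alpha) - 1))
    return np.maximum(0, x) + np.minimum(0, alpha * (np.exp(x / alpha) - 1))

x = np.random.randn(1, 2, 3)
alpha = 2.0
assert np.allclose(alpha * onnx_elu(x / alpha, alpha=1.0), celu(x, alpha))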

An alternative implementation in numpy, not requiring the Pytorch ELU operator, is given in the tests:

import numpy as np

input_data = np.random.randn(1, 2, 3)
alpha = 2

positive_input = np.maximum(0, input_data)
negative_input = np.minimum(0, alpha * (np.exp(input_data / alpha) - 1))
output_data = positive_input + negative_input

A first implementation was attempted in #1676 but never merged.

Graph

The CELU function is implemented using the expression alpha * ELU(x / alpha, alpha=1). This makes the graph smaller (and easier to read) than building it from the individual operations (add, sub, div, exp, mul) present in the operator's expression. It also leverages the good support for Elu in most ONNX backend implementations.

Tests

A unit test and a shape inference test are available, following the tests of the MeanVarianceNormalization function.

{// nodes: {outputs, op, inputs, attributes}
{{"X_alpha"},
"Div",
{"X", "alpha"}
Contributor Author

I can't figure out how to reference the attribute of the Celu instruction as an argument of the node Div.

I had a look at the helpers FunctionBodyHelper::BuildNodes, FunctionBodyHelper::Const and MakeRefAttribute without success.

Could you show me some documentation / example of this usage?

Contributor

There is a comment about this on FunctionBodyHelper::BuildNodes here, and it's used in the MeanVarianceNormalization operator function here. Shout if you have trouble :)

Contributor Author

Thank you for your answer. :)

Unfortunately, I did read that line and the usage in MeanVarianceNormalization, but I am still confused. I tried different syntaxes that compile, but since the shape inference test fails, it is probably not the right graph.

As I understand it, I can create a graph equivalent to Div(X, alpha=alpha) using

           {{"X_alpha"},
             "Div",
             {"X"},
             {MakeRefAttribute("alpha", AttributeProto::FLOAT)}
            },

But it doesn't seem to be what I am looking for, probably because Div has two inputs and zero attributes.

How can I write the equivalent of Div(X, alpha) (i.e. use this reference as the second argument of Div)? I would like to write

           {{"X_alpha"},
             "Div",
             {"X", MakeRefAttribute("alpha", AttributeProto::FLOAT)}
            }

but obviously this is not possible since std::string != AttributeProtoWrapper 😅

In MeanVarianceNormalization, it seems all the usages of axis simply forward the attribute to the underlying ops, right?

Contributor

@TMVector Jan 30, 2020

Ah okay, I see, you are quite right. I think you should be able to move from an attribute to a value by adding a Constant node. I'm not sure if it will be okay with providing a scalar instead of a tensor, though 😬.

Also, the current helper doesn't allow you to use different names for the attr (value) and the ref_attr (alpha), so you'd need to add that.

FunctionBodyHelper::BuildNodes(
           {// nodes: {outputs, op, inputs, attributes}
            {{"alpha"}, "Constant", {}, {MakeRefAttribute("value", AttributeProto::FLOAT, "alpha")}},
            {{"X_alpha"}, "Div", {"X", "alpha"}},
            {{"Y"}, "Elu", {"X_alpha"}}})

@jeremycochoy marked this pull request as ready for review January 30, 2020 18:23
@jeremycochoy requested review from a team as code owners January 30, 2020 18:23
@@ -635,6 +635,7 @@ class ONNX_OPERATOR_SET_SCHEMA_CLASS_NAME(Onnx, 11, Pad);
class ONNX_OPERATOR_SET_SCHEMA_CLASS_NAME(Onnx, 11, Gemm);
class ONNX_OPERATOR_SET_SCHEMA_CLASS_NAME(Onnx, 11, If);
class ONNX_OPERATOR_SET_SCHEMA_CLASS_NAME(Onnx, 11, NonMaxSuppression);
class ONNX_OPERATOR_SET_SCHEMA_CLASS_NAME(Onnx, 11, Celu);
Contributor

This should be in version 12, as the latest released version is 11 🙂

onnx/defs/nn/defs.cc (outdated, resolved)
{MakeRefAttribute("alpha", AttributeProto::FLOAT)}
},
{{"Y"}, "Elu", {"X_alpha"}}})));

Member

This function body is NOT a correct "sub-graph" representing the formula you described. A function body is a graph that represents the math formula you mentioned using other ops; in this case, it should be "Constant", "Div", "Elu".

Contributor Author

Yes, I need some information on how to convert the Scalar Attribute into a Constant Tensor. Do you know how to accomplish this?

@jeremycochoy
Contributor Author

Thanks @TMVector and @linkerzhang for your feedback.

  • Regarding the tensor/scalar issue raised by TMVector: the following code passes the shape inference test, but I don't know whether that is enough to say everything is fine when the second argument is a scalar and not a tensor (I don't know whether this 1.f is implicitly converted to a tensor).
          {// nodes: {outputs, op, inputs, attributes}                                                           
            FunctionBodyHelper::NodeDef{{"alpha"}, "Constant", {}, {{"value", 1.f}}},                                        
            {{"X_alpha"},
             "Div",
             {"X", "alpha"}
            },
            {{"Y"}, "Elu", {"X_alpha"}}})));
  • Regarding the second problem (using the actual attribute):

I made some attempts to create a constant node that recovers the alpha attribute from the Celu operator using AttributeProto. Although the code compiles, the shape inference test fails badly. In order to understand what is happening, I simplified the body of the function.

If I run the shape inference test with the following body, I get a nice "Y" of empty shape.

FunctionBodyHelper::Const<float>("Y", 1.0f)
E             name: "Y"
E             type {
E               tensor_type {
E                 elem_type: 1
E                 shape {
E                 }
E               }
E             }

but if I try to run the shape inference test with the attribute (see code below), then no "Y" is inferred at all.

FunctionBodyHelper::NodeDef{{"Y"}, "Constant", {}, {MakeRefAttribute("value", AttributeProto::FLOAT, "alpha")}}
E       AssertionError: ({'X', 'Y'}, {'X'})
E       assert {'X', 'Y'} == {'X'}
E         Extra items in the left set:
E         'Y'
E         Use -v to get the full diff

On my side I am stuck. Looking at the Const and ToVector implementations didn't give me any new ideas to test. Do you have any idea of what is happening? Is it related to this tensor/scalar problem? 🙃

@prasanthpul added the "operator" label (Issues related to ONNX operators) Feb 1, 2020
onnx/defs/nn/defs.cc (outdated, resolved)
"Div",
{"X", "alpha"}
},
{{"U"}, "Elu", {"X_alpha"}}})));
Contributor

@wschin Feb 1, 2020

From Pytorch, CELU equation is

CELU(x)=max(0,x)+min(0,α∗(exp(x/α)−1))

while ELU uses

ELU(x)=max(0,x)+min(0,α∗(exp(x)−1))

In addition, here the function body is doing

ONNX_CELU(x)=max(0,x/α)+min(0,(exp(x/α)−1))

which doesn't exactly match Pytorch CELU. Is this expected? Or did I miss something?

Do we have a numpy reference implementation for generating tests? We should also check if that implementation matches Pytorch CELU.

[Update] I saw your numpy reference implementation. Nice! Can you please provide a short comparison to show it performs the same as Pytorch CELU?
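To make the mismatch concrete (an editorial sketch, not from the PR): the body as currently written computes Elu(X / alpha) with alpha=1 and no multiplication back by alpha, which is off by a factor of alpha on the negative side:

import numpy as np

x, alpha = np.float32(-2.0), np.float32(2.0)

# PyTorch CELU: max(0, x) + min(0, alpha * (exp(x / alpha) - 1))  ->  about -1.2642
pytorch_celu = max(0.0, x) + min(0.0, alpha * (np.exp(x / alpha) - 1))

# Body as written: max(0, x/alpha) + min(0, exp(x/alpha) - 1)     ->  about -0.6321
body_as_written = max(0.0, x / alpha) + min(0.0, np.exp(x / alpha) - 1)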

Contributor Author

😱

You are completely right: the ELU implementation of Pytorch is different from ONNX Elu, and it is not possible to express CELU from ONNX's ELU. Thank you for noticing it; I am working on a fix.

Contributor Author

Here is some code testing the difference between Pytorch CELU and the implementation (with corrected parentheses) I provided.

import numpy as np
import torch

def onnx_celu(input_data, alpha=1.0):
    positive_input = np.maximum(0, input_data)
    negative_input = np.minimum(0, alpha * (np.exp(input_data / alpha) - 1))
    output_data = positive_input + negative_input
    return output_data

def torch_celu(input_data, alpha=1.0):
    return torch.nn.CELU(alpha=alpha)(torch.Tensor(input_data)).numpy()

input_data = np.random.randn(1, 2, 3).astype('float32')
alpha = 2

assert (onnx_celu(input_data, alpha) == torch_celu(input_data, alpha)).all()

"Constrain input and output types to floating-point tensors.")
.FunctionBody(FunctionBodyHelper::BuildNodes(
{// nodes: {outputs, op, inputs, attributes}
//FunctionBodyHelper::NodeDef{{"alpha"}, "Constant", {}, {{"value", 1.f}}},
Contributor

Are the comments here left intentionally?

@wschin
Contributor

wschin commented Feb 1, 2020

> (quoting @jeremycochoy's comment above about the tensor/scalar issue and the Constant shape inference failure)

As described here, the value attribute should be a tensor, not a float.

@jeremycochoy
Contributor Author

jeremycochoy commented Feb 1, 2020

> (quoting the exchange above: the tensor/scalar issue and @wschin's note that the value attribute should be a tensor, not a float)

Unfortunately, after hours of digging through documentation and code, I can't figure out a way to convert a scalar (from MakeRefAttribute("alpha", AttributeProto::FLOAT)) to a tensor constant.
I left a comment pointing to the problematic line in the body of the function.

PS: If this is not possible, then maybe there is still a way to cheat with the Gemm instruction (it is the only instruction I found which takes a scalar attribute and multiplies it with a tensor). But I would need some help to create the 1x1 tensor input matrices.

@jeremycochoy force-pushed the feature/add-celu-function branch 3 times, most recently from cb02fd5 to 8068de6 on February 1, 2020 13:14
@wschin
Contributor

wschin commented Feb 1, 2020

> (quoting @jeremycochoy's reply above)

I will try something on my side. In the meantime, what do you think about making alpha an input?

@linkerzhang
Member

I think "I can't figure a way to convert a scalar (from MakeRefAttribute("alpha", Attribute\ Proto::FLOAT)) to a tensor constant" needs to be fixed. Logically, the body graph is referring an attribute outside (which should be an AttributeProto) and the "Constant" OP will use the attribute and output a Tensor.

@jeremycochoy
Contributor Author

@wschin Personally, I really don't mind making alpha an input. But it may be very confusing for both developers and users if CELU and ELU have completely different interfaces. If this approach gets merged, it also means supporting it for a long time. 😅

@linkerzhang That would be awesome. I think anyone who implements a new function op that does not directly forward its arguments will end up hitting the exact same problem, and a clean way to move the scalars into the graph would solve this. Do you have any idea how this could be achieved?

@linkerzhang
Member

@jeremycochoy The AttributeProto itself was designed to support this kind of reference already, though the utility function is missing, I guess.

PR #2583 should resolve it.

MakeRefAttribute("value", "alpha", AttributeProto::FLOAT) for the Constant node in the function body.

@TMVector
Contributor

TMVector commented Feb 3, 2020

@linkerzhang I think that will work if the alpha attribute is a tensor, but ideally it would be a naked float. Maybe Constant should be changed to promote non-tensor values to scalar tensors?

@jeremycochoy
Contributor Author

jeremycochoy commented Feb 3, 2020

@jeremycochoy The AttributeProto itself was designed to support this kind of reference already, though the utility function is missing, I guess.

PR #2583 should resolve it.

MakeRefAttribute("value", "alpha", AttributeProto::FLOAT) for the Constant node in the function body.

@linkerzhang
Isn't it essentially the same thing as https://github.com/onnx/onnx/pull/2575/files#diff-8073bde925403bcdfa7d23c68d914d97, which is already present in the current PR? (although your ordering of arguments feels more natural to me)

@wschin
Contributor

wschin commented Feb 3, 2020

@linkerzhang I think that will work if the alpha attribute is a tensor, but ideally it would be a naked float. Maybe Constant should be changed to promote non-tensor values to scalar tensors?

We might need to support floats in addition to float. The fundamental cause here is that Attribute and Graph use different numerical type systems. Attribute has float, floats, and tensor. Graph only has tensor. I think changing Constant will be a nice and small change to bridge these two systems -- because Attribute is always a constant in graphs.

@TMVector, @jeremycochoy, any comments?

@jeremycochoy
Contributor Author

@linkerzhang I think that will work if the alpha attribute is a tensor, but ideally it would be a naked float. Maybe Constant should be changed to promote non-tensor values to scalar tensors?

We might need to support floats in addition to float. The fundamental cause here is that Attribute and Graph use different numerical type systems. Attribute has float, floats, and tensor. Graph only has tensor. I think changing Constant will be a nice and small change to bridge these two systems -- because Attribute is always a constant in graphs.

@TMVector, @jeremycochoy, any comments?

To me it sounds like the best solution, and it definitely makes sense to convert both float and floats to their corresponding tensors.
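For illustration, the two forms look roughly like this with onnx.helper (an editorial sketch; the value_float spelling is what the proposed change to Constant would enable, not something generated by this PR):

import numpy as np
from onnx import helper, numpy_helper

# Current form: Constant's "value" attribute must be a TensorProto,
# so a scalar alpha has to be wrapped in a rank-0 tensor.
tensor_form = helper.make_node(
    "Constant", inputs=[], outputs=["alpha"],
    value=numpy_helper.from_array(np.array(2.0, dtype=np.float32), name="alpha"),
)

# Proposed form: Constant accepts a plain float attribute and promotes it
# to a scalar tensor inside the graph.
scalar_form = helper.make_node(
    "Constant", inputs=[], outputs=["alpha"], value_float=2.0,
)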

@linkerzhang
Member

linkerzhang commented Feb 4, 2020

@jeremycochoy Yep, it's the same as the one in the current PR (I missed that part).

One more solution is removing AttributeProto and using TensorProto to store attributes, to unify the two type systems (one tensor type system and one attribute type system).

AttributeProto and its type system were designed at the very beginning to introduce a simpler way of having "scalar" attribute data. However, it introduces many troubles when specifying operator specs. For example, it's really hard to specify that an attribute needs to share the same type as an input or output (most cases now use the "Tensor" attribute type).

This PR reminds me again that the benefit of AttributeProto and its type system is not that great, while it introduces many troubles.

I'd suggest removing them and having only one type system in ONNX.

@gramalingam @wschin @jeremycochoy @TMVector What do you think please?

@@ -58,4 +58,15 @@ AttributeProto MakeRefAttribute(
return a;
Member

Please change this function to call the overloaded one added below.

Contributor Author

Oh, I think you can just merge your PR and I can rebase this branch on top of it. I remember you introduced documentation, your ordering of arguments sounds more natural to me, and I have nothing against splitting PRs into smaller pieces.

Edit: I rebased on top of your commit. 🙂

Member

Aha, I abandoned my PR this morning (I realized it duplicates changes in your PR). Let me bring it back and merge it this way :)

Comment on lines 2221 to 2222
{{"Elu_Result"}, "Elu", {"X_alpha"}, {{"alpha", 1.f}}},
{{"Y"}, "Mul", {"alpha", "Elu_Result"}}})));
Contributor

Wouldn't it be equivalent to pass Celu.alpha to Elu.alpha instead of setting Elu.alpha=1 and then multiplying by Celu.alpha?

Contributor Author

It is not. 😅 (Because alpha is applied only to the second term of the + operator in the Elu equation.)

Explanations from wschin: #2575 (comment)
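Concretely (an editorial sketch, not from the PR): Elu applies alpha only to the exp(x) - 1 term and does not rescale x inside the exponential, so passing Celu.alpha straight to Elu.alpha gives a different curve:

import numpy as np

def onnx_elu(x, alpha=1.0):
    # ONNX Elu: max(0, x) + min(0, alpha * (exp(x) - 1))
    return np.maximum(0, x) + np.minimum(0, alpha * (np.exp(x) - 1))

def celu(x, alpha):
    # CELU: max(0, x) + min(0, alpha * (exp(x / alpha) - 1))
    return np.maximum(0, x) + np.minimum(0, alpha * (np.exp(x / alpha) - 1))

x = np.array([-3.0, -1.0, 0.5, 2.0])
alpha = 2.0

print(onnx_elu(x, alpha))                      # differs from celu on the negative entries
print(celu(x, alpha))
print(alpha * onnx_elu(x / alpha, alpha=1.0))  # matches celu(x, alpha)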

Contributor

Ah, quite right.

Btw should Celu be defined in onnx/defs/math/defs.cc? -- that's where Relu, Elu, etc. are.

Contributor Author

Right, it would make more sense to place it next to relu / elu / leaky relu. I will change it this evening.

Done :)

@linkerzhang merged commit c978d10 into onnx:master Feb 16, 2020
@chinhuang007 added this to the 1.7 milestone Feb 19, 2020
@codemzs mentioned this pull request Apr 26, 2020
@fdwr
Contributor

fdwr commented May 26, 2020

@jeremycochoy : This is an excellent description for a new operator (explaining why it's being added, where it came from, the actual equation used, and even an alternate Python implementation), and I'll point to it as a good example in the future. 👍

@PallHaraldsson

PallHaraldsson commented Jun 14, 2020

Thanks, I had no idea about CELU; it looks useful. Is there a reason to use min? Using max only (as in ReLU) is natural and implies one test, while using both implies two, and I wouldn't trust a compiler to realize that only one test and branch is needed. If you don't exclude the negative branch, you always have to calculate the slow exp (I'm sure both implementations could be optimized further), and it can be 583 times slower:

julia> x = 1.1; α = 1.0

julia> ONNX_CELU(x, α)=max(0,x/α)+min(0,(exp(x/α)-1))

julia> @btime ONNX_CELU($x, $α);  # only add $ for @btime (not @time) that requires: using BenchmarkTools
  13.999 ns (0 allocations: 0 bytes)

julia> ONNX_CELU(x, α)=if x >= zero(x) x/α else exp(x/α)-1 end

julia> @btime ONNX_CELU($x, $α);
  0.024 ns (0 allocations: 0 bytes)

julia> x = -1.1

julia> @btime ONNX_CELU($x, $α);
  11.627 ns (0 allocations: 0 bytes)

Yes, it's only 20% faster on the other (negative) side, and maybe with values all over the place (ca. a 50-50 split?) it's not as useful an optimization as I would think?
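For what it's worth, a numpy reference can also skip the exp on the positive entries with a mask (an editorial sketch, assuming the reference implementation is free to branch; backends can of course still substitute their own kernels):

import numpy as np

def celu_masked(x, alpha=1.0):
    # For x >= 0, CELU(x) = x, so the exp term only needs to be evaluated where x < 0.
    out = np.maximum(x, 0.0)
    neg = x < 0
    out[neg] = alpha * (np.exp(x[neg] / alpha) - 1.0)
    return out

x = np.random.randn(1, 2, 3)
alpha = 2.0
reference = np.maximum(0, x) + np.minimum(0, alpha * (np.exp(x / alpha) - 1))
assert np.allclose(celu_masked(x, alpha), reference)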

@jeremycochoy
Contributor Author

Hi @PallHaraldsson

To be honest, I am afraid that if you want something really optimized, you'd need the backend to provide a specific implementation for Celu that replaces this graph function (this is 100% possible, and any backend can choose the implementation that fits its specific targeted hardware).

@PallHaraldsson

PallHaraldsson commented Jun 16, 2020

Yes, and FYI, I found an even better activation function (in part because it's also continuously differentiable, which is why it's better than ReLU):

Mish: A Self Regularized Non-Monotonic Neural Activation Function
https://arxiv.org/pdf/1908.08681v1.pdf

In Tensorflow[11], the function definition of Mish can be written as x * tf.math.tanh(tf.softplus(x)) while in Torch[12] it is x * torch.tanh(F.softplus(x)). For improved results over ReLU, it is advised to use a slightly lower learning rate for Mish.

It's also compared to ELU and some variants (though not CELU).

Also interesting, with claimed advantages contrary to those claimed above (such as being bounded below):

PLU: The Piecewise Linear Unit Activation Function
https://arxiv.org/abs/1809.09534

I implemented it like this for fewer assembly instructions (and only one branch):

julia> function PLU(x)
         stripped=abs(x)
         s=sign(x)
         if stripped <= 1.0
           return x
         else
           return 0.1*(x-s)+s
         end
       end

julia> @code_native PLU(1.0)  # to see assembly. I always get 0.024 ns by timing with @btime

And if you're interested, a very simple idea here (using two "opposite" but similar activations; I wonder if you could do similar for two dissimilar ones, e.g. those above?):

https://arxiv.org/pdf/1709.04054.pdf

We propose a simple extension to the ReLU-family of activation functions that allows them to shift the mean activation across a layer towards zero. Combined with proper weight initialization, this alleviates the need for normalization layers. We explore the training of deep vanilla recurrent neural networks (RNNs) with up to 144 layers, and show that bipolar activation functions help learning in this setting. On the Penn Treebank and Text8 language modeling tasks we obtain competitive results, improving on the best reported results for non-gated networks.

jcwchen pushed a commit to jcwchen/onnx that referenced this pull request Sep 23, 2020
* Implement CELU node as a Function

* Add shape inference test

* Update onnx/defs/nn/defs.cc

Co-Authored-By: Jonny Shipton <tmvector@gmail.com>

* Update onnx/test/shape_inference_test.py

Co-Authored-By: Jonny Shipton <tmvector@gmail.com>

* Set operator version to 12

* ?

* WIP. But the constant node can't be shape infered.

* Rewrite correct implementation based on equation instead of Elu

* Fix parentesis in formula

* wschin suggestions from onnx#2583 PR

* Fix a bug in inferene code and simplify graph

* Fix typo in Celu test

* Udapte docs

* Move Celu operator next to Elu (math/defs.cc)

Co-authored-by: Jonny Shipton <tmvector@gmail.com>
Co-authored-by: Ke Zhang <kezhan@microsoft.com>
Labels: operator (Issues related to ONNX operators)
9 participants