
[MXNET-346] Hard Sigmoid Operator #10633

Merged
merged 1 commit into apache:master from the hard_sigmoid branch on Apr 25, 2018

Conversation

haojin2
Contributor

@haojin2 haojin2 commented Apr 20, 2018

Description

Implementation of Hard Sigmoid function
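
For reference, hard sigmoid is the piecewise-linear approximation of the logistic sigmoid: y = max(0, min(1, alpha * x + beta)). A minimal standalone sketch of the element-wise computation is below; the alpha = 0.2 and beta = 0.5 defaults are an assumption based on the common Keras/ONNX convention, so check the registered operator's doc string for the actual defaults.

#include <algorithm>

// Element-wise hard sigmoid: clip the line alpha * x + beta to [0, 1].
// The alpha/beta defaults here are assumed (Keras/ONNX convention), not taken from this PR.
inline float hard_sigmoid(float x, float alpha = 0.2f, float beta = 0.5f) {
  return std::max(0.0f, std::min(1.0f, alpha * x + beta));
}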

Checklist

Essentials

  • The PR title starts with [MXNET-346]
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
    • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Code is well-documented:
    • For user-facing API changes, the API doc string has been updated.
    • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the best of my knowledge, examples are either not affected by this change or have been fixed to be compatible with this change

Changes

  • Hard Sigmoid on CPU and GPU
  • Unit test for Hard Sigmoid

Comments

This PR is for bridging the operator gap between MXNet and ONNX.

DMLC_REGISTER_PARAMETER(HardSigmoidParam);
MXNET_OPERATOR_REGISTER_UNARY(hard_sigmoid)
.describe(R"code(Computes hard sigmoid of x element-wise.

Member

You could add an example here.

Contributor Author

It's unnecessary for this kind of function; please refer to the "sigmoid" function above.

@haojin2
Contributor Author

haojin2 commented Apr 21, 2018

@rahul003 @reminisce @anirudh2290 @eric-haibin-lin Could you please give a review for this?

@chinakook
Contributor

It's widely used in LSTMs in Keras. This op could be added into LSTM or GRU, etc.

@haojin2
Contributor Author

haojin2 commented Apr 21, 2018

@chinakook Thanks for your advice! I'll definitely talk to the relevant people about this.

@haojin2 haojin2 force-pushed the hard_sigmoid branch 2 times, most recently from 218fda1 to 5f875ec on April 22, 2018 at 20:17


template<typename xpu>
void HardSigmoidForward(const nnvm::NodeAttrs& attrs,
Contributor

Why do you need to write forward and backward? You can use the unary compute helpers, like https://github.com/apache/incubator-mxnet/blob/master/src/operator/tensor/elemwise_unary_op_basic.cc#L87

Contributor Author

Okay, I'll move the kernels to mshadow_op.h; does that sound good?

Contributor Author

Actually there's a problem with that: hard sigmoid takes 2 parameters, which is different from the other unary operators.
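
To illustrate the point, here is a framework-free sketch (hypothetical names; the real operator goes through MXNet's kernel-launch machinery): a plain unary kernel's Map only sees the output and input elements, whereas hard sigmoid's Map also needs the alpha and beta hyperparameters threaded through, which is why the generic unary compute path doesn't apply directly.

#include <algorithm>
#include <cstddef>

// Hypothetical sketch of a parameterized element-wise forward kernel.
struct hard_sigmoid_forward_sketch {
  template <typename DType>
  static void Map(std::size_t i, DType* out, const DType* in,
                  const DType alpha, const DType beta) {
    out[i] = std::max(DType(0), std::min(DType(1), alpha * in[i] + beta));
  }
};

// Stand-in for a Kernel<op, xpu>::Launch-style loop: apply Map to every element.
template <typename DType>
void launch_forward(DType* out, const DType* in, std::size_t n, DType alpha, DType beta) {
  for (std::size_t i = 0; i < n; ++i) {
    hard_sigmoid_forward_sketch::Map(i, out, in, alpha, beta);
  }
}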

Contributor

ok

Contributor Author

So is it okay to leave it this way?

};

template<int req>
struct hard_sigmoid_forward {
Contributor Author

@haojin2 haojin2 Apr 24, 2018

Hi Eric, there's a major difference between the given op and hard sigmoid here. set_to_int has a version of Map that takes 0 arguments, so it can make use of the existing IMPLEMENT_BLANK_WORKLOAD_FWD macro, but hard sigmoid is a tertiary function, so I may need to add IMPLEMENT_TERTIARY_WORKLOAD_FWD and IMPLEMENT_TERTIARY_WORKLOAD_BWD in operator_tune.cc together with a new TertiaryOpTune class in operator_tune-inl.h. I'm not sure whether there are any more functions that could make use of this; if so, then I guess it's necessary to implement the things mentioned above.
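
(For readers following along, the arity difference described above can be sketched with illustrative signatures; these are not MXNet's actual definitions. The tuning macros are keyed to how many arguments an op's Map takes, so a zero-argument Map fits the existing blank-workload macro, while hard sigmoid's three-argument Map would need a new "tertiary" variant.)

// Illustrative only: a set_to_int-style op whose Map takes no inputs ...
struct set_to_zero_sketch {
  template <typename DType>
  static DType Map() { return DType(0); }
};

// ... versus a hard-sigmoid-style op whose Map takes three inputs (data, alpha, beta),
// which is what would require the hypothetical tertiary workload macros / tuning class.
struct hard_sigmoid_op_sketch {
  template <typename DType>
  static DType Map(DType a, DType alpha, DType beta) {
    DType y = alpha * a + beta;
    return y < DType(0) ? DType(0) : (y > DType(1) ? DType(1) : y);
  }
};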

Contributor

@cjolivier01 would know more about this. He is drafting a doc for autotune. I'll merge this for now. Please revisit after you go through that doc.

Contributor Author

Sure, I actually talked with him yesterday after seeing your comments, and that was what he told me. I think if we see more functions similar to this one, then we can definitely get that done.

@piiswrong piiswrong merged commit efe5314 into apache:master Apr 25, 2018
rahul003 pushed a commit to rahul003/mxnet that referenced this pull request Jun 4, 2018
zheng-da pushed a commit to zheng-da/incubator-mxnet that referenced this pull request Jun 28, 2018