
SINGA -475 add Sign operator to singa #488

Merged
merged 6 commits into from Aug 14, 2019

Conversation

pinpom
Contributor

@pinpom pinpom commented Jul 29, 2019

No description provided.

@pinpom pinpom changed the title from "add Sign operator to singa" to "SINGA SINGA-475 - add Sign operator to singa" on Jul 29, 2019
@pinpom pinpom changed the title from "SINGA SINGA-475 - add Sign operator to singa" to "SINGA -475 add Sign operator to singa" on Jul 29, 2019
@joddiy
Member

joddiy commented Aug 5, 2019

ready for merge

Contributor

@chrishkchris chrishkchris left a comment


I have run the tests in a Docker container using a cloud K80 GPU, and they pass.
I have left four comments above, which are minor issues. It will be ready for merge after these minor changes.


result = autograd.sign(x)
dx = result.creator.backward(dy.data)
#dx = [x/|x|]'
Contributor


delete the comment line at 625:
#dx = [x/|x|]'


result = autograd.sign(x)
dx = result.creator.backward(dy.data)
#dx = [x/|x|]'
Contributor


delete the comment line at 644:
#dx = [x/|x|]'

Contributor Author


fix the code



class Sign(Operation):
    def forward(self, a):
Contributor


Add the default initializer:

def __init__(self):
    super(Sign, self).__init__() 

Contributor Author


fix the code

Contributor


Thanks for the modification. Below is the reason why this is needed.

This initializer delegates to the superclass (Operation) initializer, which generates the operation name from the subclass name "Sign".
The superclass Operation initializer is as follows:

class Operation(object):
    op_count = 0
    def __init__(self, name=None):
        if name is None:
            self.name = "{}#{}".format(
                self.__class__.__name__, Operation.op_count
            )
            Operation.op_count += 1
        else:
            self.name = name

If there is no initializer in the subclass, the superclass initializer is used directly with name=None.
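
For reference, here is a minimal runnable sketch (using the Operation base class quoted above together with the suggested Sign.__init__) of how the automatic naming behaves; the printed counter values assume no other operations have been constructed yet:

class Operation(object):
    op_count = 0

    def __init__(self, name=None):
        if name is None:
            self.name = "{}#{}".format(
                self.__class__.__name__, Operation.op_count
            )
            Operation.op_count += 1
        else:
            self.name = name


class Sign(Operation):
    def __init__(self):
        # name=None, so the base class auto-generates the name from the class name
        super(Sign, self).__init__()


print(Sign().name)  # "Sign#0"
print(Sign().name)  # "Sign#1"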

@chrishkchris
Contributor

Thanks! Ready for merge

        return singa.Sign(a)

    def backward(self, dy):
        dx = singa.MultFloat(dy, 0.0)
Member


the dx is always 0?

@chrishkchris
Contributor

chrishkchris commented Aug 9, 2019

In my opinion:
For the function y = sign(x), y = 1 when x > 0 and y = -1 when x < 0. Therefore, the derivative is 0 everywhere except at x = 0.
The function y = sign(x) is discontinuous at x = 0, where the derivative is mathematically undefined. However, an undefined derivative leads to a NaN if we use it for backpropagation, so we use a derivative of zero at x = 0 as well.

As a result, dx is always 0, while the output shape (size of the array) is the same as the input shape.
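
For illustration, here is a minimal numpy sketch (not the SINGA code itself, but mirroring dx = singa.MultFloat(dy, 0.0)) of the convention described above: the gradient of sign(x) is taken to be zero everywhere, including at x = 0, and keeps the same shape as the input.

import numpy as np

def sign_forward(x):
    # y = 1 for x > 0, y = -1 for x < 0 (numpy also returns 0 at x = 0)
    return np.sign(x)

def sign_backward(dy):
    # multiplying dy by 0.0 gives an all-zero array with the same shape as dy
    return dy * 0.0

x = np.array([-2.0, 0.0, 3.0])
dy = np.ones_like(x)
print(sign_forward(x))    # [-1.  0.  1.]
print(sign_backward(dy))  # [0. 0. 0.]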

@chrishkchris
Contributor

chrishkchris commented Aug 9, 2019

This is what TensorFlow uses: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/cc/gradients/math_grad.cc#L268

Status SignGrad(const Scope& scope, const Operation& op,
                const std::vector<Output>& grad_inputs,
                std::vector<Output>* grad_outputs) {
  auto shape = Shape(scope, op.input(0));
  auto zero = Cast(scope, Const(scope, 0.0), op.input(0).type());
  auto dx = Fill(scope, shape, zero);
  grad_outputs->push_back(dx);
  return scope.status();
}
REGISTER_GRADIENT_OP("Sign", SignGrad);

It seems to fill dx with all zeros, using the shape of the forward input (which is the same as the shape of dy).

@chrishkchris
Contributor

chrishkchris commented Aug 9, 2019

There are conflicts, and there are two solutions (the same as what happened in the log operator):

  1. Merge this branch with the latest master.
  2. Check out a new branch from the latest master, then copy the added code into autograd.py and test_operations.py and send a new PR.

(I guess this may be because different PRs put their new functions at the same line, so git detects them as conflicting ways of changing the code.)

@nudles nudles merged commit 4d81e98 into apache:master Aug 14, 2019