Python implementation of softmax loss layer #4023

Closed
biprajiman opened this Issue Apr 20, 2016 · 1 comment

@biprajiman

biprajiman commented Apr 20, 2016

Hi,

I am trying to implement the softmax loss layer in Python to use with pycaffe. I followed the Euclidean loss example and wrote some simple code as a starting point:

```python
import caffe
import numpy as np


class SoftmaxLossLayer(caffe.Layer):

    def setup(self, bottom, top):
        # check input pair: scores and labels
        if len(bottom) != 2:
            raise Exception("Need two inputs (scores and labels) to compute the loss.")

    def reshape(self, bottom, top):
        # check that the batch sizes match
        if bottom[0].num != bottom[1].num:
            raise Exception("Inputs must have the same number of examples.")
        # gradient buffer has the same shape as the score input
        self.diff = np.zeros_like(bottom[0].data, dtype=np.float32)
        # loss output is a scalar
        top[0].reshape(1)

    def forward(self, bottom, top):
        scores = bottom[0].data
        exp_scores = np.exp(scores)
        probs = exp_scores / np.sum(exp_scores, axis=1, keepdims=True)
        labels = np.array(bottom[1].data, dtype=np.int32)
        correct_logprobs = -np.log(probs[range(bottom[0].num), labels])
        data_loss = np.sum(correct_logprobs) / bottom[0].num

        # cache the probabilities for the backward pass
        self.diff[...] = probs
        top[0].data[...] = data_loss

    def backward(self, top, propagate_down, bottom):
        delta = self.diff
        for i in range(2):
            if not propagate_down[i]:
                continue
            if i == 0:
                # gradient w.r.t. the scores: probs, minus 1 at the true class
                labels = np.array(bottom[1].data, dtype=np.int32)
                delta[range(bottom[0].num), labels] -= 1
            bottom[i].diff[...] = delta / bottom[0].num
```

The code works for a simple LeNet, and the loss seems to be decreasing. I would be willing to bring this code up to standard and share it. I need guidance on what I am missing (I have read the C++ code, and mine is far from what it does) so that I can make this implementation match the C++ one and be more generic.
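One standard way to check a hand-written loss layer against a reference implementation is a finite-difference gradient check on the forward/backward pair. This is a sketch in plain NumPy (the function names are mine, not Caffe's, and it reproduces the forward/backward math from the layer above rather than calling pycaffe):

```python
import numpy as np

def softmax_loss(scores, labels):
    # forward: softmax followed by mean cross-entropy loss
    exp_scores = np.exp(scores)
    probs = exp_scores / np.sum(exp_scores, axis=1, keepdims=True)
    n = scores.shape[0]
    loss = -np.log(probs[range(n), labels]).sum() / n
    # analytic backward: (probs - one_hot(labels)) / n
    grad = probs.copy()
    grad[range(n), labels] -= 1
    return loss, grad / n

def numeric_grad(scores, labels, eps=1e-5):
    # central finite differences, perturbing one score at a time
    g = np.zeros_like(scores)
    for idx in np.ndindex(*scores.shape):
        orig = scores[idx]
        scores[idx] = orig + eps
        lp, _ = softmax_loss(scores, labels)
        scores[idx] = orig - eps
        lm, _ = softmax_loss(scores, labels)
        scores[idx] = orig
        g[idx] = (lp - lm) / (2 * eps)
    return g

rng = np.random.RandomState(0)
scores = rng.randn(4, 3)
labels = np.array([0, 2, 1, 1])
loss, analytic = softmax_loss(scores, labels)
numeric = numeric_grad(scores, labels)
# analytic and numeric gradients should agree to high precision
```

If the two gradients disagree, the backward pass does not match the forward pass, which is usually the first thing to rule out before comparing against the C++ layer.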

You may ask why I would go through this trouble: modifying Python code to create a new loss is easier for me than working through the C++ code, which could take a long time.

Thank you in advance for any help.

@seanbell

Contributor

seanbell commented Apr 20, 2016

While a Python layer is nice for academic/learning purposes, there's no need for one in Caffe, since the C++ implementation is faster and uses the GPU.

Also note that your forward expression is numerically unstable; you should look at lectures explaining softmax (e.g. http://cs231n.github.io/linear-classify/#softmax) to see how to fix it.

I'm closing this since it's a modeling/usage question. Please continue the discussion on the mailing list.
From https://github.com/BVLC/caffe/blob/master/CONTRIBUTING.md:

Please do not post usage, installation, or modeling questions, or other requests for help to Issues.
Use the caffe-users list instead. This helps developers maintain a clear, uncluttered, and efficient view of the state of Caffe.
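(Editor's note, not part of the original comment: the standard fix for the instability mentioned above is to subtract each row's maximum score before exponentiating. This leaves the softmax output unchanged, because the constant cancels in the ratio, but keeps `np.exp` from overflowing. A minimal NumPy sketch:)

```python
import numpy as np

def stable_softmax(scores):
    # shifting each row by its max is mathematically a no-op for softmax,
    # but it bounds the largest exponent at 0 and so avoids overflow
    shifted = scores - np.max(scores, axis=1, keepdims=True)
    exp_scores = np.exp(shifted)
    return exp_scores / np.sum(exp_scores, axis=1, keepdims=True)

# with large raw scores, np.exp(scores) alone would overflow to inf,
# while the shifted version stays finite
scores = np.array([[1000.0, 1001.0, 1002.0]])
probs = stable_softmax(scores)
```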


@seanbell seanbell closed this Apr 20, 2016
