
What is MultinomialLogisticLossLayer for, since it doesn't support Backprop #209

Open
davidparks21 opened this issue Aug 12, 2016 · 4 comments

davidparks21 commented Aug 12, 2016

I'm confused about MultinomialLogisticLossLayer: it's documented as a normal loss layer and does what I want, but it doesn't have backprop implemented. Is it intended to be used with another layer, perhaps something not referenced in the documentation?

I'm trying to take an image as input and output a (downsampled) heatmap of that image, with higher values in locations where the model matches a desired object. My labels have shape [100 x N_SAMPLES], where 100 represents a flattened 10x10 heatmap output.
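For concreteness, a sketch of that label layout (the array names and the sample count here are illustrative):

```julia
N_SAMPLES = 256                                # illustrative batch size
heatmaps  = rand(Float32, 10, 10, N_SAMPLES)   # per-sample 10x10 target maps
labels    = reshape(heatmaps, 100, N_SAMPLES)  # flattened to [100 x N_SAMPLES]
```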

Thanks,
David

@CorticalComputer

Hello David,

Were you ever able to resolve your issue? I'm currently trying to apply Mocha to a problem where I also need to output an X by Y map (of the same dimensions as the input, since I'm classifying every pixel). Any suggestions on how to do this?

@davidparks21 (Author)

Unfortunately not. I tried the softmax loss, but it performed much worse than the logistic loss I now use in TensorFlow, and I couldn't get the logistic loss working in Mocha because of this issue. It shouldn't be hard to fix: the derivative of the logistic loss is just (y - sigmoid(z))x, so the backward pass should only be a few lines of code (one, if you're a neat freak). I initially assumed I didn't understand how the layer was meant to be used, since it was so unexpected that a non-working loss function would be committed. So if it does in fact work, I still don't understand how it's meant to be used; and if it's simply incomplete code, then that's its own obvious issue.
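For reference, a minimal sketch of the backward pass being described here, for an element-wise sigmoid (logistic) loss; this is illustrative Julia, not Mocha's internal API:

```julia
# Scalar logistic function; broadcast below for arrays.
sigmoid(z) = 1 / (1 + exp(-z))

# For L = -[y*log(sigmoid(z)) + (1 - y)*log(1 - sigmoid(z))], the gradient
# with respect to the pre-activation z is sigmoid(z) - y. The expression
# (y - sigmoid(z))x above is that quantity negated and multiplied by the
# input, i.e. the descent direction for the weights.
loss_backward(z, y) = sigmoid.(z) .- y
```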

I really liked Mocha's structure and how easy it looked to extend, but to be a viable framework with a decent community it's going to need some basics, like a forum where we can discuss questions like this. Posting GitHub issues isn't very effective.

As for me, I'm working in TensorFlow for the moment. My impression is that, among Julia deep learning frameworks, MXNet and the TensorFlow wrapper will be the actively maintained projects.


CorticalComputer commented Oct 8, 2016

I'm having problems using Mocha for anything other than standard classification. Unfortunately I've sunk a lot of time into Mocha, and it looks like I will now have to move everything to TensorFlow...

pluskid (Owner) commented Oct 14, 2016

The SoftmaxLossLayer in Mocha is essentially a combination of a softmax layer with a MultinomialLogisticLossLayer. So instead of using MultinomialLogisticLossLayer on its own, you can use SoftmaxLossLayer directly, which has a proper backward function implemented. It can handle multi-dimensional outputs; see the dim parameter in the docs.
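For example, a minimal sketch of wiring that up (blob names are illustrative; check the Mocha documentation for the exact dim semantics):

```julia
using Mocha

# SoftmaxLossLayer fuses the softmax with the multinomial logistic loss
# and implements the backward pass. The `dim` attribute selects which
# dimension of the prediction blob holds the class scores, so
# multi-dimensional (e.g. per-pixel) outputs can be handled.
loss = SoftmaxLossLayer(name="loss", bottoms=[:pred, :label], dim=1)
```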
