Add Loss Functions Folder, KL Divergence Loss #1365

Merged
Merged 14 commits into mlpack:master from sourabhvarshney111:loss_functions on Apr 30, 2018

@sourabhvarshney111
Contributor

sourabhvarshney111 commented Apr 13, 2018

This is towards completion of #1294. I have moved some of the loss functions in the loss functions folder. I have tried to refactor the code as much as possible.

dakshitagrawal97 and others added some commits Mar 4, 2018

@sourabhvarshney111

Contributor

sourabhvarshney111 commented Apr 13, 2018

@zoq The async learning test uses mean squared error as its layer. I have only a little knowledge of reinforcement learning. Can you help me deal with this?

@zoq

Member

zoq commented Apr 16, 2018

It's still fine to move the layer/loss function; we just have to adjust the include path.

@sourabhvarshney111

Contributor

sourabhvarshney111 commented Apr 16, 2018

I think I got it. I just have to make a loss_function.hpp and loss_function_types.hpp in the loss_functions folder, right? And make the suitable changes.

@zoq

Member

zoq commented Apr 17, 2018

I'm not sure I get the issue with the MeanSquaredError method.

You are referring to:

  FFN<MeanSquaredError<>, GaussianInitialization> model(MeanSquaredError<>(),
      GaussianInitialization(0, 0.001));

right? If that is the case, I don't see any issue with moving the MeanSquaredError code into another folder.
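
For illustration, the change being discussed is only an include-path update; the test code itself does not change. A minimal sketch, assuming the layer ends up under the new loss_functions/ directory (the exact paths are my assumption based on the folder added in this PR):

  #include <mlpack/core.hpp>
  #include <mlpack/methods/ann/ffn.hpp>
  #include <mlpack/methods/ann/init_rules/gaussian_init.hpp>
  // Previously: #include <mlpack/methods/ann/layer/mean_squared_error.hpp>
  #include <mlpack/methods/ann/loss_functions/mean_squared_error.hpp>

  using namespace mlpack::ann;

  // Construction of the model is unchanged after the move.
  FFN<MeanSquaredError<>, GaussianInitialization> model(MeanSquaredError<>(),
      GaussianInitialization(0, 0.001));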

@sourabhvarshney111

Contributor

sourabhvarshney111 commented Apr 17, 2018

@zoq I was wondering whether we should make a loss_functions.hpp, like layer.hpp, so that we can include it directly without needing to include each file separately. Does that make sense?

@zoq

Member

zoq commented Apr 17, 2018

I see. I guess in this case it's fine to include a specific layer, since in most cases only one loss function is used. A consolidated header file might increase the build time.
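
For context, the consolidated header idea mirrors layer.hpp: a single file that pulls in every loss function. A rough sketch of what it would have looked like (file names are assumptions; as noted above, it was not added, to avoid increasing build times):

  // loss_functions.hpp (hypothetical convenience header, analogous to layer.hpp)
  #ifndef MLPACK_METHODS_ANN_LOSS_FUNCTIONS_LOSS_FUNCTIONS_HPP
  #define MLPACK_METHODS_ANN_LOSS_FUNCTIONS_LOSS_FUNCTIONS_HPP

  #include "cross_entropy_error.hpp"
  #include "kl_divergence.hpp"
  #include "mean_squared_error.hpp"

  #endif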

@sourabhvarshney111

Contributor

sourabhvarshney111 commented Apr 17, 2018

@zoq I got your point. I have made the changes and am waiting for the tests to complete.

sourabhvarshney111 added some commits Apr 17, 2018

@sourabhvarshney111

Contributor

sourabhvarshney111 commented Apr 18, 2018

@zoq I think the PR is ready for review. Please review whenever you get a chance.

* @file KLDivergence.hpp
* @author Dakshit Agrawal
*
* Definition of the KL Divergence error function.

@zoq

zoq Apr 18, 2018

Member

Might be a good idea to call this Kullback–Leibler divergence.

@sourabhvarshney111

sourabhvarshney111 Apr 18, 2018

Contributor

I have changed the file name. Should I change the class name too?

@zoq

zoq Apr 18, 2018

Member

I think the class name and filename are just fine; it's a pretty common alias, but we should mention Kullback–Leibler divergence at least once in the comments.

namespace ann /** Artificial Neural Network. */ {
/**
* The KL divergence error function measures the network's

@zoq

zoq Apr 18, 2018

Member

I think this one could be improved, e.g. "The Kullback–Leibler divergence is often used for continuous distributions (direct regression)." Perhaps adding a reference (paper) would be useful as well.
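
For reference, the quantity being documented is, for two distributions P and Q,

  $D_{\mathrm{KL}}(P \,\|\, Q) = \sum_i P(i) \log \frac{P(i)}{Q(i)}$

with the sum replaced by an integral for continuous densities. This is consistent with the log(input) - log(target) expressions further down in the diff, with input playing the role of P and target the role of Q (my reading of the diff, not a statement from the authors).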

@sourabhvarshney111

sourabhvarshney111 Apr 18, 2018

Contributor

I could not find an arXiv paper for this, but I found https://projecteuclid.org/download/pdf_1/euclid.aoms/1177729694 . Can I use this as the reference?

@zoq

zoq Apr 18, 2018

Member

Yes, I would go for the same article.

@sourabhvarshney111

sourabhvarshney111 Apr 18, 2018

Contributor

Should the reference be like this:

  @code
  @article{
    title   = {On Information and Sufficiency},
    author  = {S. Kullback and R. A. Leibler},
    journal = {The Annals of Mathematical Statistics},
    year    = {1951},
    url     = {https://projecteuclid.org/euclid.aoms/1177729694}
  }
  @endcode

*
* @param takeMean Boolean variable to specify whether to take mean or not.
*/
KLDivergence(const bool takeMean = false);

@zoq

zoq Apr 18, 2018

Member

What do you think, should we just call it mean?

@sourabhvarshney111

sourabhvarshney111 Apr 18, 2018

Contributor

I think that would not convey its real purpose. It is meant to specify whether we take the mean or use the sum of the individual losses.
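
A minimal sketch of the two modes being described, using Armadillo directly (the function name and element-wise term are illustrative, not the final implementation):

  #include <armadillo>

  // KL term per element: p * (log(p) - log(q)); takeMean picks the reduction.
  template<typename MatType>
  double KlLossSketch(const MatType& input, const MatType& target,
                      const bool takeMean)
  {
    const MatType terms = input % (arma::log(input) - arma::log(target));
    // takeMean == true: average over all elements; false: plain sum.
    return takeMean ? arma::accu(terms) / terms.n_elem : arma::accu(terms);
  }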

@zoq

zoq Apr 18, 2018

Member

I see the point, thanks for the clarification.

const InputType&& input, const TargetType&& target)
{
// Input and Target should be of same size.
double loss;

@zoq

zoq Apr 18, 2018

Member

No need to define the variable here; we can return right away.

// Input and Target should be of same size.
double loss;
if (takeMean) {

@zoq

zoq Apr 18, 2018

Member

Can you put the { on a new line?

const TargetType&& target,
OutputType&& output)
{
if (takeMean) {

@zoq

zoq Apr 18, 2018

Member

See the comment above.

@sourabhvarshney111

Contributor

sourabhvarshney111 commented Apr 21, 2018

@zoq I have made the requested changes.

}
else
{
output = arma::accu(arma::log(input) - arma::log(target) + 1);

@zoq

zoq Apr 22, 2018

Member

I'm not sure this is correct; I thought it should be log(input/target). Maybe I missed something?

@sourabhvarshney111

sourabhvarshney111 Apr 22, 2018

Contributor

But isn't log(a/b) = log a - log b?

@zoq

zoq Apr 22, 2018

Member

Ah, sorry, I was talking about mean(log(input/target)), but I realized that NLL also uses accu, so I think we should stick with that metric.
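
To make the two points concrete (the log identity and the choice of reduction), a small self-contained Armadillo example, purely for illustration:

  #include <armadillo>
  #include <iostream>

  int main()
  {
    // Two small positive matrices standing in for input and target.
    arma::mat input  = {{0.2, 0.3}, {0.1, 0.4}};
    arma::mat target = {{0.25, 0.25}, {0.25, 0.25}};

    // log(input / target) and log(input) - log(target) agree element-wise.
    arma::mat ratioForm = arma::log(input / target);
    arma::mat diffForm  = arma::log(input) - arma::log(target);
    arma::mat gap = arma::abs(ratioForm - diffForm);

    // The remaining choice is the reduction: accu() sums every element
    // (matching NLL), while a mean would divide by the element count.
    const double summed   = arma::accu(diffForm + 1);
    const double averaged = summed / diffForm.n_elem;

    std::cout << "max |difference| = " << gap.max()
              << ", accu = " << summed << ", mean = " << averaged << std::endl;
    return 0;
  }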

@sourabhvarshney111

sourabhvarshney111 Apr 22, 2018

Contributor

So, is this ready for merge?

@zoq

zoq approved these changes Apr 23, 2018

No more comments from my side, thanks for looking into this one; I'll go ahead and merge this in 3 days to leave time for any other comments.

@rcurtin

Looks good to me. Thanks for the contribution!

set(SOURCES
cross_entropy_error.hpp
cross_entropy_error_impl.hpp
KLDivergence.hpp

@rcurtin

rcurtin Apr 26, 2018

Member

This is a minor comment, but could we rename this to kl_divergence.hpp during the merge (or now) to match the other filenames in mlpack?

sourabhvarshney111 added some commits Apr 27, 2018

@zoq zoq merged commit 1e69262 into mlpack:master Apr 30, 2018

5 checks passed

Memory Checks
Static Code Analysis Checks: Build finished.
Style Checks: Build finished.
continuous-integration/appveyor/pr: AppVeyor build succeeded
continuous-integration/travis-ci/pr: The Travis CI build passed

@zoq

Member

zoq commented Apr 30, 2018

Thanks again for another great contribution; I made some really minor style fixes in a28143b.

@sourabhvarshney111 sourabhvarshney111 deleted the sourabhvarshney111:loss_functions branch Apr 30, 2018
