Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix derivative of LogSoftMax (Resolves #3468) #3469

Merged
merged 4 commits into from May 2, 2023

Conversation

mrdaybird
Copy link
Contributor

@mrdaybird mrdaybird commented Apr 10, 2023

Update the Backward method of LogSoftMax and fix test corresponding to it. Fixes #3468 (See issue for more details)

  • Fix derivative
  • Fix tests

See this colab file for my explanation and with references for the new derivative : LogSoftMax.
Also to be noted, the new derivative was tested against JacobianTest(corrected version).

@shubham1206agra
Copy link
Contributor

Can you cite the paper? So I can double check this.

@mrdaybird
Copy link
Contributor Author

mrdaybird commented Apr 11, 2023

See this colab file for my explanation with references to paper: LogSoftMax.
Also to be noted, the new derivative was tested against JacobianTest(corrected version).

@mrdaybird
Copy link
Contributor Author

I have updated the colab file(LogSoftMax) to include a simpler and a more complete calculation of the derivative. Also, an important reference in understanding the derivative of LogSoftMax is understanding how the derivative of Softmax is calculated. For that, you can see the mlpack implementation(great job to whoever wrote) and also this blog is a good reference to understand how the softmax derivative is calculated as well as how it is multiplied with the back-propagated derivatives. The derivative in the mlpack is a further simplified form.

Copy link
Member

@rcurtin rcurtin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Want to add a note to HISTORY.md about this?

I found that the tests still pass with both implementations. Maybe we should add a Jacobian test to log_softmax.cpp too? I tried that, but it fails---I haven't dug into exactly why yet:

/**
 * Jacobian log-softmax module test.
 */
TEST_CASE("JacobianLogSoftMaxLayerTest", "[ANNLayerTest]")
{
  for (size_t i = 0; i < 5; ++i)
  {
    const size_t inputElements = RandInt(2, 1000);

    arma::mat input;
    input.set_size(inputElements, 1);
    input.randu();

    LogSoftMax module;
    module.InputDimensions() = std::vector<size_t>({ inputElements });
    module.ComputeOutputDimensions();

    double error = JacobianTest(module, input);
    REQUIRE(error <= 1e-5);
  }
}

Sorry it took so long to get to this. I've been so heads-down on bandicoot that a lot of mlpack stuff has gotten put on the back burner.

@mrdaybird
Copy link
Contributor Author

I will add the Jacobian test after the PR correcting it is merged.

@mrdaybird
Copy link
Contributor Author

@rcurtin I have added the JacobianTest. It is mostly your implementation from your previous comment, because after looking at it, it was probably better than anything I could think of 😅. Hope you don't have a problem with that. One slight change is that, JacobianTest initializes the input matrix, so there's no need to initialize the matrix before passing.

@rcurtin
Copy link
Member

rcurtin commented May 1, 2023

Sounds good @mrdaybird; thanks for adding it. I'm going to debug the static code analysis checks job on this PR, so, don't mind the noise...

@rcurtin
Copy link
Member

rcurtin commented May 1, 2023

@mlpack-jenkins test this please

Copy link
Member

@rcurtin rcurtin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sweet, got the static code analysis job fixed on the first try. :)

We should add a note to HISTORY.md; if you want to do that, feel free, otherwise I'll handle it during merge. 👍

Thanks for finding and fixing this important issue!

Copy link

@mlpack-bot mlpack-bot bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Second approval provided automatically after 24 hours. 👍

@rcurtin rcurtin merged commit 59a3b84 into mlpack:master May 2, 2023
19 checks passed
rcurtin added a commit to rcurtin/mlpack that referenced this pull request May 2, 2023
@rcurtin rcurtin mentioned this pull request May 2, 2023
@rcurtin
Copy link
Member

rcurtin commented May 2, 2023

Forgot to update the history during merge... oops... so I opened #3479.

conradsnicta pushed a commit that referenced this pull request May 7, 2023
This was referenced Jun 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Incorrect Derivative of LogSoftMax layer
4 participants