
fix softmax nan problem #598

Merged
nudles merged 2 commits into apache:dev from XJDKC:SoftMax_Algorithm
Feb 13, 2020

Conversation

@XJDKC
Member

XJDKC commented Feb 12, 2020

Here I change the algorithm used by the cudnnSoftmaxForward function.

The original algorithm was CUDNN_SOFTMAX_FAST, which can overflow when the input values are too large. Changing it to CUDNN_SOFTMAX_ACCURATE solves this problem.

For example, if the cudnn softmax is used in mlp.py, it triggers this problem.
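To illustrate the difference (a minimal NumPy sketch, not SINGA code): CUDNN_SOFTMAX_FAST exponentiates the inputs directly, so a large input makes exp() overflow to inf and the normalization produces inf/inf = nan, while CUDNN_SOFTMAX_ACCURATE first subtracts the per-row maximum so every exponent is at most 0.

```python
import numpy as np

def softmax_fast(x):
    # Analogous to CUDNN_SOFTMAX_FAST: exponentiate directly.
    # For large x, np.exp overflows to inf and inf/inf gives nan.
    with np.errstate(over="ignore", invalid="ignore"):
        e = np.exp(x)
        return e / e.sum()

def softmax_accurate(x):
    # Analogous to CUDNN_SOFTMAX_ACCURATE: subtract the max first,
    # so every exponent is <= 0 and cannot overflow.
    e = np.exp(x - x.max())
    return e / e.sum()

x = np.array([1000.0, 1001.0, 1002.0])
print(softmax_fast(x))      # all nan: exp(1000) overflows to inf
print(softmax_accurate(x))  # finite probabilities summing to 1
```

Subtracting the maximum leaves the result mathematically unchanged, since softmax is invariant to adding a constant to all inputs; only the floating-point behavior differs.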

@chrishkchris
Contributor

I strongly support this change for two reasons:

  1. The current setting fails on some network structures/datasets, which makes the function not universal across applications.
  2. Softmax forward takes only a very small fraction of CNN training time; e.g. ResNet-50 has many conv layers but only one softmax layer, so the time spent in softmax is not the bottleneck.

@chrishkchris
Contributor

Sorry, could you also change CUDNN_SOFTMAX_FAST to CUDNN_SOFTMAX_ACCURATE in SoftMaxbackward in tensor_math_cuda.cc? I am not sure whether both need to match.

@chrishkchris
Contributor

Thanks. I think it is ready to merge.

@XJDKC
Member Author

XJDKC commented Feb 12, 2020

Thank you for the reminder.

@nudles nudles merged commit bc5df6e into apache:dev Feb 13, 2020
@XJDKC XJDKC deleted the SoftMax_Algorithm branch February 13, 2020 17:16
@XJDKC XJDKC restored the SoftMax_Algorithm branch February 13, 2020 17:17
Labels: none. Projects: none. 3 participants.