For a word in a tail cluster C, its probability is p(C) * p(x = target | C), so its cross-entropy term is target * log(p(C) * p(x = target | C)) = target * log(p(C)) + target * log(p(x = target | C)).
We can simply add the cross entropy over the head (including the tombstone entries), then compute the cross entropy on each tail separately, so there is no need to pass head_entropy further down.
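A minimal numeric sketch of this decomposition, with one tail cluster and made-up shapes (the tombstone index and tensor sizes here are illustrative, not the repo's actual configuration):

```python
import torch
import torch.nn.functional as F

# Head softmax over frequent words plus one "tombstone" logit per tail
# cluster; a separate softmax over the words of one tail cluster.
head_logits = torch.randn(4, 10 + 2)   # batch of 4: 10 head words + 2 tombstones
tail_logits = torch.randn(4, 50)       # 50 words in this tail cluster

head_logprob = F.log_softmax(head_logits, dim=-1)
tail_logprob = F.log_softmax(tail_logits, dim=-1)

# For target words in the tail cluster whose tombstone sits at head index 10:
# log p(x = target) = log p(C) + log p(x = target | C)
tombstone_idx = 10
targets_in_tail = torch.tensor([3, 7, 0, 42])
word_logprob = head_logprob[:, tombstone_idx] + \
    tail_logprob.gather(1, targets_in_tail.view(-1, 1)).squeeze(1)

# Cross entropy for these tokens: the head (tombstone) term is counted
# exactly once per token, alongside the tail term.
loss = -word_logprob.mean()
```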
At line 166 of splitcross.py, `entropy = -(head_entropy + tail_entropy)`, we need not add head_entropy again, because it has already been added into logprob: `results.append(head_entropy.view(-1, 1) + tail_entropy)`.
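A toy numeric illustration of the double count described in this report (the values are made up; `word_logprob` stands for the tail_entropy that logprob returns with the head term already folded in):

```python
import torch

# Suppose for one tail token: log p(C) = -1.0 (head/tombstone term)
# and log p(x | C) = -2.0 (tail term).
head_entropy = torch.tensor([-1.0])
tail_only = torch.tensor([-2.0])

# logprob already folds the head term in, mirroring
# results.append(head_entropy.view(-1, 1) + tail_entropy):
word_logprob = head_entropy + tail_only          # -3.0 == log p(x)

# Line 166 then adds head_entropy a second time:
double_counted = -(head_entropy + word_logprob)  # 4.0 (head term twice)

# Negating the joint log-probability once gives the correct cross entropy:
correct = -word_logprob                          # 3.0
```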
I replaced this complicated SplitCrossEntropyLoss with PyTorch's built-in cross entropy loss, which produces the same results and seems to be only slightly slower.
awd-lstm-lm/splitcross.py, line 137 at commit 32fcb42
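A rough sketch of that replacement, assuming a single `nn.Linear` decoder over the full vocabulary (the function name and the sizes below are hypothetical, not the repo's actual configuration):

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

def full_softmax_loss(decoder: nn.Linear, hiddens: torch.Tensor,
                      targets: torch.Tensor) -> torch.Tensor:
    # hiddens: (seq_len * batch, nhid), targets: (seq_len * batch,)
    logits = decoder(hiddens)   # (seq_len * batch, ntokens)
    return criterion(logits, targets)

# Example with made-up sizes:
decoder = nn.Linear(400, 10000)        # emsize=400, ntokens=10000
hiddens = torch.randn(70 * 20, 400)    # bptt=70, batch=20
targets = torch.randint(0, 10000, (70 * 20,))
loss = full_softmax_loss(decoder, hiddens, targets)
```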