For a word in a tail cluster C, its probability is p(C) * p(x = target | C), so its cross-entropy term is target * log(p(C) * p(x = target | C)) = target * log(p(C)) + target * log(p(x = target | C)).
We can simply add the cross entropy over the head (including the tombstone entries), then compute the cross entropy on each tail separately, so there is no need to pass head_entropy further down.
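A minimal numeric sketch of this decomposition, with one tail cluster and made-up shapes (the tombstone index and tensor sizes here are illustrative, not the repo's actual configuration):

```python
import torch
import torch.nn.functional as F

# Head softmax over frequent words plus one "tombstone" logit per tail
# cluster; a separate softmax over the words of one tail cluster.
head_logits = torch.randn(4, 10 + 2)   # batch of 4: 10 head words + 2 tombstones
tail_logits = torch.randn(4, 50)       # 50 words in this tail cluster

head_logprob = F.log_softmax(head_logits, dim=-1)
tail_logprob = F.log_softmax(tail_logits, dim=-1)

# For target words in the tail cluster whose tombstone sits at head index 10:
# log p(x = target) = log p(C) + log p(x = target | C)
tombstone_idx = 10
targets_in_tail = torch.tensor([3, 7, 0, 42])
word_logprob = head_logprob[:, tombstone_idx] + \
    tail_logprob.gather(1, targets_in_tail.view(-1, 1)).squeeze(1)

# Cross entropy for these tokens: the head (tombstone) term is counted
# exactly once per token, alongside the tail term.
loss = -word_logprob.mean()
```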
At line 166 of splitcross.py, `entropy = -(head_entropy + tail_entropy)`, we need not add head_entropy again, because it has already been added into logprob: `results.append(head_entropy.view(-1, 1) + tail_entropy)`.
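A toy numeric illustration of the double count described in this report (the values are made up; `word_logprob` stands for the tail_entropy that logprob returns with the head term already folded in):

```python
import torch

# Suppose for one tail token: log p(C) = -1.0 (head/tombstone term)
# and log p(x | C) = -2.0 (tail term).
head_entropy = torch.tensor([-1.0])
tail_only = torch.tensor([-2.0])

# logprob already folds the head term in, mirroring
# results.append(head_entropy.view(-1, 1) + tail_entropy):
word_logprob = head_entropy + tail_only          # -3.0 == log p(x)

# Line 166 then adds head_entropy a second time:
double_counted = -(head_entropy + word_logprob)  # 4.0 (head term twice)

# Negating the joint log-probability once gives the correct cross entropy:
correct = -word_logprob                          # 3.0
```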
I replaced this complicated SplitCrossEntropyLoss with PyTorch's built-in cross entropy loss, which produces the same results and seems to be only slightly slower.
awd-lstm-lm/splitcross.py, line 137 at commit 32fcb42
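A rough sketch of that replacement, assuming a single `nn.Linear` decoder over the full vocabulary (the function name and the sizes below are hypothetical, not the repo's actual configuration):

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

def full_softmax_loss(decoder: nn.Linear, hiddens: torch.Tensor,
                      targets: torch.Tensor) -> torch.Tensor:
    # hiddens: (seq_len * batch, nhid), targets: (seq_len * batch,)
    logits = decoder(hiddens)   # (seq_len * batch, ntokens)
    return criterion(logits, targets)

# Example with made-up sizes:
decoder = nn.Linear(400, 10000)        # emsize=400, ntokens=10000
hiddens = torch.randn(70 * 20, 400)    # bptt=70, batch=20
targets = torch.randint(0, 10000, (70 * 20,))
loss = full_softmax_loss(decoder, hiddens, targets)
```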