This repository has been archived by the owner on Jun 8, 2022. It is now read-only.

I think there is a bug in the implementation of bpc #2

Closed
OleNet opened this issue Apr 5, 2021 · 2 comments

Comments

OleNet commented Apr 5, 2021

According to the material I found here and here, bpc = log2(NLL).
But in your code, I found that bpc = NLL / log(2).
Is there something wrong with the calculation of bpc, or have I missed something?

yzh119 (Owner) commented Apr 5, 2021

My implementation follows the definition of BPC in https://arxiv.org/pdf/1308.0850.pdf (page 8) and is consistent with the implementations in Transformer-XL and adaptive-span transformer, as well as with the StackOverflow thread you posted.

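As for the paper you mentioned, I think that is a typo in the paper. NLL is already computed from log-probabilities, so no second log is needed; dividing by log(2) only converts the natural-log value to base 2: https://pytorch.org/docs/stable/generated/torch.nn.NLLLoss.html#torch.nn.NLLLoss.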
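
To make the unit conversion concrete, here is a minimal sketch in plain Python. It is not the repository's code: the helper names `nll_nats` and `bpc_from_nll` and the sample probabilities are made up for illustration, and it assumes the NLL is a per-character mean measured in nats (natural log).

```python
import math


def nll_nats(probs):
    """Mean negative log-likelihood in nats, given the probability
    the model assigned to each true character (hypothetical helper)."""
    return -sum(math.log(p) for p in probs) / len(probs)


def bpc_from_nll(nll):
    """Convert a mean NLL in nats to bits per character (hypothetical helper)."""
    return nll / math.log(2)


# Illustrative probabilities assigned to four true characters.
probs = [0.5, 0.25, 0.125, 0.5]

nll = nll_nats(probs)
bpc = bpc_from_nll(nll)

# Computing the same quantity directly with base-2 logs gives the same value.
direct = -sum(math.log2(p) for p in probs) / len(probs)

print(f"NLL (nats) = {nll:.4f}")
print(f"BPC        = {bpc:.4f}")
print(f"direct     = {direct:.4f}")
```

Both routes agree because log2(p) = ln(p) / ln(2), so the division is a change of base rather than an extra logarithm.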

OleNet (Author) commented Apr 6, 2021

OK, I got it, thanks.

OleNet closed this as completed Apr 6, 2021