Hello, thanks a lot for the great repo!
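I'm trying to train a language model on different k-mers with stride=1. As you already discussed, the loss for this LM should be around ln(4^1) ≈ 1.39, since with stride 1 each k-mer is almost fully determined by its overlapping neighbors.
However, when I train on 6-mers with stride 1, the loss is around 8.3, close to ln(4^6) ≈ 8.3, and when I train on 4-mers with stride 1, the loss is around 5.2, close to ln(4^4) ≈ 5.5. In other words, the loss stays near what a uniform guess over all 4^k k-mers would give. Do you have an idea why this happens?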
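For reference, here is a minimal sketch of the uniform-guess baselines I'm comparing against. It assumes the loss is cross-entropy in nats over a vocabulary of exactly 4^k k-mer tokens (special tokens ignored); `uniform_loss` is just a hypothetical helper name, not something from the repo.

```python
import math


def uniform_loss(k: int, alphabet_size: int = 4) -> float:
    """Cross-entropy in nats of a uniform guess over all alphabet_size**k k-mers.

    ln(alphabet_size ** k) == k * ln(alphabet_size)
    """
    return k * math.log(alphabet_size)


if __name__ == "__main__":
    # k=1 is the expected level if only one new nucleotide is uncertain;
    # k=4 and k=6 are the levels for a model that has learned nothing.
    for k in (1, 4, 6):
        print(f"k={k}: ln(4^{k}) = {uniform_loss(k):.2f}")
```

If the overlap between neighboring k-mers were being used, I would expect the loss to drop toward the k=1 value regardless of k.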
Thanks!