Dear authors:
Thanks for your BERT implementation on DocRED. I have a question: the value of 'not NA acc' is quite large during training, and it even approaches 1 when the model converges, yet the test F1 is more ordinary, at about 0.54. Beyond that, I find that this value in the original implementation (ACL-19) with LSTM seems in line with the final test F1. So I would like to know why 'not NA acc' and 'test F1' diverge so much during training.
Looking forward to your reply!
Thanks for pointing that out! We think this is caused by overfitting. 'Not NA acc' is computed on the training data, while 'test F1' is computed on the development data. It looks like the BERT model is overfitting the training data and reaching nearly 100% accuracy on it.
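Beyond the train/dev split, the two numbers also measure different things, which can make a large gap look less surprising. The sketch below is a hypothetical illustration (not the repo's actual code) of how such metrics are typically computed: 'not NA acc' only scores instances whose *gold* label is a real relation, so it never penalizes false positives on NA instances, while F1 does.

```python
# Hypothetical sketch of the two metrics; function names and the toy
# data are illustrative, not taken from the DocRED codebase.

NA = "NA"

def not_na_accuracy(gold, pred):
    """Accuracy restricted to instances whose gold label is not NA."""
    pairs = [(g, p) for g, p in zip(gold, pred) if g != NA]
    return sum(g == p for g, p in pairs) / len(pairs)

def micro_f1(gold, pred):
    """Micro-averaged F1 over non-NA predictions (common RE scoring)."""
    tp = sum(g == p and g != NA for g, p in zip(gold, pred))
    pred_pos = sum(p != NA for p in pred)
    gold_pos = sum(g != NA for g in gold)
    precision = tp / pred_pos if pred_pos else 0.0
    recall = tp / gold_pos if gold_pos else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Toy example: the model is perfect on the two non-NA gold instances
# (so not-NA accuracy is 1.0), but two false positives on NA
# instances drag precision down and F1 with it.
gold = ["r1", "r2", "NA", "NA", "NA", "NA"]
pred = ["r1", "r2", "r3", "r3", "NA", "NA"]
print(not_na_accuracy(gold, pred))  # 1.0
print(micro_f1(gold, pred))         # 0.666...
```

So even on the same data, a near-1.0 'not NA acc' is compatible with a much lower F1 whenever the model over-predicts relations on NA pairs.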