Dear authors:
Thanks for your BERT implementation on DocRED. I have a question: the value of 'not NA acc' is quite large during training, and it even approaches 1 when the model converges, yet the test F1 is more ordinary, at about 0.54. Beyond that, I find that this value in the original implementation (ACL-19) with LSTM seems in line with the final test F1. So I would like to know why 'not NA acc' and 'test F1' diverge so much during training.
Looking forward to your reply!
Thanks for pointing that out! We think this is caused by overfitting. 'Not NA acc' is computed on the training data, while 'test F1' is computed on the development data. It looks like the BERT model is overfitting the training data and reaching nearly 100% accuracy on it.
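Beyond the train/dev split, the two numbers also measure different things, which can make a large gap look less surprising. The sketch below is a hypothetical illustration (not the repo's actual code) of how such metrics are typically computed: 'not NA acc' only scores instances whose *gold* label is a real relation, so it never penalizes false positives on NA instances, while F1 does.

```python
# Hypothetical sketch of the two metrics; function names and the toy
# data are illustrative, not taken from the DocRED codebase.

NA = "NA"

def not_na_accuracy(gold, pred):
    """Accuracy restricted to instances whose gold label is not NA."""
    pairs = [(g, p) for g, p in zip(gold, pred) if g != NA]
    return sum(g == p for g, p in pairs) / len(pairs)

def micro_f1(gold, pred):
    """Micro-averaged F1 over non-NA predictions (common RE scoring)."""
    tp = sum(g == p and g != NA for g, p in zip(gold, pred))
    pred_pos = sum(p != NA for p in pred)
    gold_pos = sum(g != NA for g in gold)
    precision = tp / pred_pos if pred_pos else 0.0
    recall = tp / gold_pos if gold_pos else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Toy example: the model is perfect on the two non-NA gold instances
# (so not-NA accuracy is 1.0), but two false positives on NA
# instances drag precision down and F1 with it.
gold = ["r1", "r2", "NA", "NA", "NA", "NA"]
pred = ["r1", "r2", "r3", "r3", "NA", "NA"]
print(not_na_accuracy(gold, pred))  # 1.0
print(micro_f1(gold, pred))         # 0.666...
```

So even on the same data, a near-1.0 'not NA acc' is compatible with a much lower F1 whenever the model over-predicts relations on NA pairs.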