Hi team.
I have been stuck on this problem for a whole week and still cannot figure out the cause.
Env: Python 3.8, transformers 4.28
I am fine-tuning XLM-RoBERTa Base for multi-class classification.
However, when I run trainer.evaluate() during the training step, it reports an accuracy of 68%, whereas in a standalone evaluation, which reads the base model, makes predictions, and then evaluates them, the accuracy drops to 30%. Is there a reason why this happens, or is it a bug?
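
One thing I can rule out on my side is a label-to-id mapping that differs between the training script and the standalone script, since that alone could lower the accuracy. Below is a minimal sketch of that check, not my actual code: the label names are made up, and build_label2id and accuracy are hypothetical helpers, not part of transformers.

```python
from typing import Dict, Sequence


def build_label2id(labels: Sequence[str]) -> Dict[str, int]:
    """Map each distinct label to an id in sorted order.

    Sorting makes the mapping independent of the order in which
    labels happen to appear in a given data split.
    """
    return {label: idx for idx, label in enumerate(sorted(set(labels)))}


def accuracy(pred_ids: Sequence[int], gold_ids: Sequence[int]) -> float:
    """Fraction of positions where the predicted id equals the gold id."""
    if len(pred_ids) != len(gold_ids):
        raise ValueError("pred_ids and gold_ids must have the same length")
    if len(gold_ids) == 0:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(p == g for p, g in zip(pred_ids, gold_ids))
    return correct / len(gold_ids)


# Hypothetical label lists standing in for the two scripts' data.
train_labels = ["sports", "politics", "tech", "sports"]
eval_labels = ["tech", "sports", "politics"]

train_map = build_label2id(train_labels)
eval_map = build_label2id(eval_labels)

# If this is False, the two scripts disagree on what each class id means.
mappings_match = train_map == eval_map
print("mappings match:", mappings_match)
```

If the mappings match in both scripts, the label encoding is probably not the cause, and the next thing to compare would be which checkpoint each script actually loads.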