This repository was archived by the owner on Sep 25, 2025. It is now read-only.
I fine-tuned BERT on a classification task and got better results, but I have a few questions.
I found that fine-tuning is unstable: the F1 score differs by up to 2% across runs with the same hyperparameters. (I tried tf.set_random_seed(1024), but the results still differ.)
It is a classification task with about 80 million training examples. In the first epoch, the training loss keeps descending, while the validation loss descends at first and then starts increasing after about 1 million training examples have been fed. Does this mean the model overfits within the first epoch?
Is there any guidance on how to choose a better warmup step count? I found that fine-tuning performance depends heavily on the number of epochs and the warmup.
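For reference, the schedule BERT fine-tuning commonly uses is a linear warmup followed by a linear decay, with warmup_steps often set to roughly 10% of the total steps. A small sketch of that schedule (the function name lr_at_step and the 2e-5 peak learning rate are illustrative assumptions, not values from the question):

```python
def lr_at_step(step, total_steps, warmup_steps, peak_lr=2e-5):
    """Linear warmup to peak_lr, then linear decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps  # ramp up from 0 to peak_lr
    # decay linearly from peak_lr down to 0 over the remaining steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)


total_steps = 10_000
warmup_steps = int(0.1 * total_steps)  # common 10% warmup heuristic
```

With a very large training set like 80 million examples, total_steps in one epoch is already huge, so a fixed warmup fraction can mean a very long warmup; some practitioners instead cap warmup at a fixed step count and tune it together with the epoch number.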