
Training loss decreases, but model doesn't learn #19

Closed
sabetAI opened this issue Jul 11, 2020 · 1 comment

Comments


sabetAI commented Jul 11, 2020

I'm training the BERT GECToR model (using train.py) on an edit dataset similar to those used in the GECToR paper (i.e. NUCLE 3.3 or CoNLL-14), but the model's predictions degenerate into predicting $KEEP for every token. This minimizes the loss, since most of the labels are $KEEP, but the model doesn't actually learn anything. Usually this is solved by re-weighting the class losses to correct for the class imbalance, but that isn't done in your implementation.
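For context, by re-weighting the class losses I mean something along these lines; this is a generic PyTorch sketch with made-up sizes, not code from this repo:

import torch
import torch.nn as nn

# Hypothetical sketch: down-weight $KEEP relative to the edit labels to offset the class imbalance.
num_labels = 5000                        # size of the edit-label vocabulary (made up for illustration)
keep_index = 0                           # index of the $KEEP label (made up for illustration)
class_weights = torch.ones(num_labels)
class_weights[keep_index] = 0.1          # $KEEP contributes 10x less to the loss than any edit label
criterion = nn.CrossEntropyLoss(weight=class_weights)
# loss = criterion(logits.view(-1, num_labels), labels.view(-1))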

How did you originally resolve this?

@komelianchuk
Collaborator

Hi, @sabetAI.
Sorry for the slow reply.

We did not face this problem in our experiments (after some number of updates the model starts to produce other tags as well).
I think that the following could be helpful.

  1. Exclude true negatives from the data (tn_prob=0) during the pretraining stage.
  2. Use a bigger batch_size (at least 128; better 256).
  3. Use more data if possible.
  4. Freeze the encoder weights during the first couple of epochs (cold_step_count in [2, 4]); see the sketch after this list for what that looks like.
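Freezing the encoder for the cold steps amounts to something like this (a minimal PyTorch sketch; text_field_embedder is an assumed attribute name, so check how the encoder is actually exposed in the model):

def set_encoder_trainable(model, trainable):
    # Assumed attribute name for the BERT encoder inside the model.
    for param in model.text_field_embedder.parameters():
        param.requires_grad = trainable

set_encoder_trainable(model, False)   # cold steps: update only the tag classifiers
# ... train for cold_step_count epochs ...
set_encoder_trainable(model, True)    # then unfreeze and fine-tune the full model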

Additionally, you could modify the mask to lower the weight of the $KEEP operation in the loss.
Something like this:

keep_bias = -0.5  # negative bias lowers the weight of $KEEP labels
weights = (labels == self.keep_index).float() * keep_bias + mask
# sequence_cross_entropy_with_logits comes from allennlp.nn.util
loss_labels = sequence_cross_entropy_with_logits(logits_labels, labels, weights, label_smoothing=self.label_smoothing)
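With a 0/1 padding mask this gives $KEEP positions a weight of 1 + keep_bias = 0.5, while every other non-padded token keeps weight 1, so the rarer edit labels contribute relatively more to the gradient.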

I hope that this will be useful to you.
