
TF Camembert not improving over epochs #3361

Closed
2 of 4 tasks
bourrel opened this issue Mar 20, 2020 · 3 comments

bourrel commented Mar 20, 2020

🐛 Bug

Information

Model I am using (Bert, XLNet ...):
jplu/tf-camembert-base

Language I am using the model on (English, Chinese ...):
French

The problem arises when using:

  • the official example scripts: (give details below)
  • my own modified scripts: (give details below)

The task I am working on is:

  • an official GLUE/SQUaD task: (give the name)
  • my own task or dataset: (give details below)

To reproduce

Steps to reproduce the behavior:

  1. Get a custom multi-class dataset with imbalanced data
  2. Train TFCamembertForSequenceClassification on this dataset
  3. Try with and without class_weight, or with the biggest classes under-sampled (accuracy and loss change, but still don't improve over epochs); a sketch of one possible weighting scheme follows
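
For reference, a minimal sketch of one way class_weights could be computed (inverse-frequency weighting, mirroring sklearn's "balanced" heuristic; text_labels and the weighting scheme here are assumptions, not necessarily what was actually used):

import numpy as np

# Hypothetical per-sample label list; the real dataset is not shown.
text_labels = ["billing", "billing", "billing", "support", "other"]
labels = sorted(set(text_labels))

# Count samples per class, then weight each class by its inverse frequency,
# so under-represented classes contribute more to the loss.
counts = np.bincount([labels.index(l) for l in text_labels], minlength=len(labels))
class_weights = {i: len(text_labels) / (len(labels) * c) for i, c in enumerate(counts)}

Training code: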
import tensorflow as tf
from transformers import TFCamembertForSequenceClassification, CamembertTokenizer

model = TFCamembertForSequenceClassification.from_pretrained("jplu/tf-camembert-base", num_labels=len(labels))
tokenizer = CamembertTokenizer.from_pretrained("jplu/tf-camembert-base")

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=3e-5, epsilon=1e-08, clipnorm=1.0),
    loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)

model.fit(
    custom_generator(),  # generator yields samples encoded by the tokenizer and labels encoded by OneHotEncoder
    epochs=10,
    max_queue_size=2,
    steps_per_epoch=25,
    #class_weight=class_weights,
    validation_data=custom_generator(),
    validation_steps=4
)
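
custom_generator is not shown above; here is a minimal sketch of a generator matching the comment in the fit call (the example texts, batch size, and max_length are assumptions):

import numpy as np
from transformers import CamembertTokenizer

# Hypothetical stand-ins for the real dataset.
texts = ["bonjour le monde", "ceci est un exemple", "encore une phrase"]
text_labels = ["greeting", "meta", "other"]
labels = sorted(set(text_labels))

tokenizer = CamembertTokenizer.from_pretrained("jplu/tf-camembert-base")

def custom_generator(batch_size=8):
    # Infinite generator so Keras can draw steps_per_epoch batches per epoch.
    while True:
        idx = np.random.randint(0, len(texts), size=batch_size)
        batch_texts = [texts[i] for i in idx]
        # Modern tokenizer __call__ API (transformers v3+); on the reported
        # 2.5.1 this would be tokenizer.batch_encode_plus(...).
        enc = tokenizer(batch_texts, padding=True, truncation=True,
                        max_length=128, return_tensors="tf")
        # One-hot labels (the original used sklearn's OneHotEncoder).
        y = np.eye(len(labels))[[labels.index(text_labels[i]) for i in idx]]
        yield dict(enc), y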

Expected behavior

The classifier should improve over each epoch. In this case it stays at the same accuracy and loss, only varying by about ±5% accuracy.
To compare, I ran the same code with TFFlaubertForSequenceClassification.from_pretrained("jplu/tf-flaubert-base-cased") and it worked as expected.

Environment info

  • transformers version: 2.5.1
  • Platform: Linux-4.9.0-12-amd64-x86_64-with-debian-9.12 (Google AI Platform)
  • Python version: 3.7.6
  • PyTorch version (GPU?): 1.4.0 (True)
  • Tensorflow version (GPU?): 2.1.0 (True)
  • Using GPU in script?: Yes
  • Using distributed or parallel set-up in script?: No

For information, I already posted this problem on Stack Overflow, which led me here.

stale bot commented May 19, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the wontfix label May 19, 2020
stale bot closed this as completed May 26, 2020
LeMoussel commented

@bourrel
Why do you use jplu/tf-flaubert-base-cased? (https://huggingface.co/jplu/tf-flaubert-base-cased)
Any particular reason not to use flaubert/flaubert_base_cased? (https://huggingface.co/flaubert/flaubert_base_cased)

bourrel (Author) commented Aug 9, 2022

It was 2 years ago, I don't remember, sorry 😅
