
TF Camembert not improving over epochs #3361

Closed
2 of 4 tasks
bourrel opened this issue Mar 20, 2020 · 3 comments

bourrel commented Mar 20, 2020

🐛 Bug

Information

Model I am using (Bert, XLNet ...):
jplu/tf-camembert-base

Language I am using the model on (English, Chinese ...):
French

The problem arises when using:

  • the official example scripts: (give details below)
  • my own modified scripts: (give details below)

The task I am working on is:

  • an official GLUE/SQUaD task: (give the name)
  • my own task or dataset: (give details below)

To reproduce

Steps to reproduce the behavior:

  1. Get a custom multi-class dataset with imbalanced data
  2. Train TFCamembertForSequenceClassification on this dataset
  3. Try with and without class_weight, or with the biggest classes under-sampled (accuracy and loss change, but still don't improve over epochs); a sketch of one possible weighting scheme follows
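
For reference, a minimal sketch of one way class_weights could be computed (inverse-frequency weighting, mirroring sklearn's "balanced" heuristic; text_labels and the weighting scheme here are assumptions, not necessarily what was actually used):

import numpy as np

# Hypothetical per-sample label list; the real dataset is not shown.
text_labels = ["billing", "billing", "billing", "support", "other"]
labels = sorted(set(text_labels))

# Count samples per class, then weight each class by its inverse frequency,
# so under-represented classes contribute more to the loss.
counts = np.bincount([labels.index(l) for l in text_labels], minlength=len(labels))
class_weights = {i: len(text_labels) / (len(labels) * c) for i, c in enumerate(counts)}

Training code: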
import tensorflow as tf
from transformers import TFCamembertForSequenceClassification, CamembertTokenizer

model = TFCamembertForSequenceClassification.from_pretrained("jplu/tf-camembert-base", num_labels=len(labels))
tokenizer = CamembertTokenizer.from_pretrained("jplu/tf-camembert-base")

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=3e-5, epsilon=1e-08, clipnorm=1.0),
    loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)

model.fit(
    custom_generator(),  # generator yields samples encoded by the tokenizer and labels encoded by OneHotEncoder
    epochs=10,
    max_queue_size=2,
    steps_per_epoch=25,
    #class_weight=class_weights,
    validation_data=custom_generator(),
    validation_steps=4
)
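
custom_generator is not shown above; here is a minimal sketch of a generator matching the comment in the fit call (the example texts, batch size, and max_length are assumptions):

import numpy as np
from transformers import CamembertTokenizer

# Hypothetical stand-ins for the real dataset.
texts = ["bonjour le monde", "ceci est un exemple", "encore une phrase"]
text_labels = ["greeting", "meta", "other"]
labels = sorted(set(text_labels))

tokenizer = CamembertTokenizer.from_pretrained("jplu/tf-camembert-base")

def custom_generator(batch_size=8):
    # Infinite generator so Keras can draw steps_per_epoch batches per epoch.
    while True:
        idx = np.random.randint(0, len(texts), size=batch_size)
        batch_texts = [texts[i] for i in idx]
        # Modern tokenizer __call__ API (transformers v3+); on the reported
        # 2.5.1 this would be tokenizer.batch_encode_plus(...).
        enc = tokenizer(batch_texts, padding=True, truncation=True,
                        max_length=128, return_tensors="tf")
        # One-hot labels (the original used sklearn's OneHotEncoder).
        y = np.eye(len(labels))[[labels.index(text_labels[i]) for i in idx]]
        yield dict(enc), y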

Expected behavior

The classifier should improve over each epoch. In this case it stays at the same accuracy and loss, only varying by about ±5% accuracy.
To compare, I ran the same code with TFFlaubertForSequenceClassification.from_pretrained("jplu/tf-flaubert-base-cased") and it worked as expected.

Environment info

  • transformers version: 2.5.1
  • Platform: Linux-4.9.0-12-amd64-x86_64-with-debian-9.12 (Google AI Platform)
  • Python version: 3.7.6
  • PyTorch version (GPU?): 1.4.0 (True)
  • Tensorflow version (GPU?): 2.1.0 (True)
  • Using GPU in script?: Yes
  • Using distributed or parallel set-up in script?: No

For information, I already posted this problem on Stack Overflow, which led me here.

stale bot commented May 19, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the wontfix label May 19, 2020
stale bot closed this as completed May 26, 2020
LeMoussel commented

@bourrel
Why do you use jplu/tf-flaubert-base-cased? (https://huggingface.co/jplu/tf-flaubert-base-cased)
Any particular reason not to use flaubert/flaubert_base_cased? (https://huggingface.co/flaubert/flaubert_base_cased)

bourrel (Author) commented Aug 9, 2022

It was 2 years ago, I don't remember, sorry 😅
