`RichProgressCallback` would break model evaluation and prediction #1495

eggry · 2024-03-31T07:27:04Z

Hi! It's awesome to have a CLI for trl.
However, there seems to be a problem with the newly introduced `RichProgressCallback`. The issue affects both the evaluation and prediction stages.

To reproduce the issue, run:

trl sft --model_name_or_path facebook/opt-125m --dataset_name imdb --output_dir opt-sft-imdb --evaluation_strategy steps --eval_steps 1

This leads to the following error:

Traceback (most recent call last):
  File "**************/lib/python3.11/site-packages/trl/commands/scripts/sft.py", line 148, in <module>
    trainer.train()
  File "**************/lib/python3.11/site-packages/trl/trainer/sft_trainer.py", line 360, in train
    output = super().train(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "**************/lib/python3.11/site-packages/transformers/trainer.py", line 1780, in train
    return inner_training_loop(
           ^^^^^^^^^^^^^^^^^^^^
  File "**************/lib/python3.11/site-packages/transformers/trainer.py", line 2193, in _inner_training_loop
    self._maybe_log_save_evaluate(tr_loss, grad_norm, model, trial, epoch, ignore_keys_for_eval)
  File "**************/lib/python3.11/site-packages/transformers/trainer.py", line 2577, in _maybe_log_save_evaluate
    metrics = self.evaluate(ignore_keys=ignore_keys_for_eval)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "**************/lib/python3.11/site-packages/transformers/trainer.py", line 3365, in evaluate
    output = eval_loop(
             ^^^^^^^^^^
  File "**************/lib/python3.11/site-packages/transformers/trainer.py", line 3586, in evaluation_loop
    self.control = self.callback_handler.on_prediction_step(args, self.state, self.control)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "**************/lib/python3.11/site-packages/transformers/trainer_callback.py", line 410, in on_prediction_step
    return self.call_event("on_prediction_step", args, state, control)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "**************/lib/python3.11/site-packages/transformers/trainer_callback.py", line 414, in call_event
    result = getattr(callback, event)(
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "**************/lib/python3.11/site-packages/trl/trainer/utils.py", line 783, in on_prediction_step
    self.prediction_bar.update(self.prediction_task_id, advance=1, update=True)
  File "**************/lib/python3.11/site-packages/rich/progress.py", line 1425, in update
    task = self._tasks[task_id]
           ~~~~~~~~~~~^^^^^^^^^
KeyError: None
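
The last two frames suggest the failure mode: `on_prediction_step` calls `update` with `self.prediction_task_id`, and the lookup `self._tasks[task_id]` fails because that id is `None`, meaning no task was registered before the first prediction step. The sketch below reproduces that pattern without any dependencies and shows one possible guard. `MiniProgress` and `GuardedCallback` are hypothetical stand-ins written for illustration; they are not the actual `rich` or `trl` classes, and this is not necessarily how the fix in #1496 works.

```python
class MiniProgress:
    """Hypothetical stand-in for a progress display that stores tasks by id."""

    def __init__(self):
        self._tasks = {}

    def add_task(self, description, total=None):
        # Register a new task and hand back its id.
        task_id = len(self._tasks)
        self._tasks[task_id] = {"description": description, "total": total, "completed": 0}
        return task_id

    def update(self, task_id, advance=0):
        # Raises KeyError if task_id was never registered (for example, None).
        task = self._tasks[task_id]
        task["completed"] += advance


class GuardedCallback:
    """Hypothetical callback that registers its task lazily before updating it."""

    def __init__(self):
        self.prediction_bar = MiniProgress()
        self.prediction_task_id = None  # nothing registered yet

    def on_prediction_step(self, total=None):
        # Guard: create the task on the first step instead of assuming it exists.
        if self.prediction_task_id is None:
            self.prediction_task_id = self.prediction_bar.add_task("Prediction", total=total)
        self.prediction_bar.update(self.prediction_task_id, advance=1)


def reproduce_key_error():
    """Update a task that was never added, as in the traceback above."""
    bar = MiniProgress()
    try:
        bar.update(None, advance=1)
    except KeyError as exc:
        return repr(exc)
    return None


if __name__ == "__main__":
    print(reproduce_key_error())

    callback = GuardedCallback()
    callback.on_prediction_step(total=2)
    callback.on_prediction_step(total=2)
    print(callback.prediction_bar._tasks[0]["completed"])
```

With the guard in place, the first prediction step registers the task and later steps only advance it, so the lookup never sees `None`.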


eggry · 2024-03-31T07:31:27Z

I’m working on a PR to address this issue.

For those with urgent needs, a simple workaround is to comment out this line.

trl/trl/commands/cli.py, line 45 in 0ee349d:

os.environ["TRL_USE_RICH"] = "1"

eggry mentioned this issue Mar 31, 2024

Fix RichProgressCallback #1496

Merged

younesbelkada closed this as completed in #1496 Apr 4, 2024
