Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RichProgressCallback would break model evaluation and prediction #1495

Closed
eggry opened this issue Mar 31, 2024 · 1 comment · Fixed by #1496
Closed

RichProgressCallback would break model evaluation and prediction #1495

eggry opened this issue Mar 31, 2024 · 1 comment · Fixed by #1496

Comments

@eggry
Copy link
Contributor

eggry commented Mar 31, 2024

Hi! It's awesome to have a CLI for trl.
However, there seems to be a problem with the newly introduced ProgressCallback. This issue affects both the evaluation and prediction stages.

To reproduce the issue, simply run

trl sft --model_name_or_path facebook/opt-125m --dataset_name imdb --output_dir opt-sft-imdb --evaluation_strategy steps --eval_steps 1

this would leads to the following error message:

Traceback (most recent call last):
  File "**************/lib/python3.11/site-packages/trl/commands/scripts/sft.py", line 148, in <module>
    trainer.train()
  File "**************/lib/python3.11/site-packages/trl/trainer/sft_trainer.py", line 360, in train
    output = super().train(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "**************/lib/python3.11/site-packages/transformers/trainer.py", line 1780, in train
    return inner_training_loop(
           ^^^^^^^^^^^^^^^^^^^^
  File "**************/lib/python3.11/site-packages/transformers/trainer.py", line 2193, in _inner_training_loop
    self._maybe_log_save_evaluate(tr_loss, grad_norm, model, trial, epoch, ignore_keys_for_eval)
  File "**************/lib/python3.11/site-packages/transformers/trainer.py", line 2577, in _maybe_log_save_evaluate
    metrics = self.evaluate(ignore_keys=ignore_keys_for_eval)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "**************/lib/python3.11/site-packages/transformers/trainer.py", line 3365, in evaluate
    output = eval_loop(
             ^^^^^^^^^^
  File "**************/lib/python3.11/site-packages/transformers/trainer.py", line 3586, in evaluation_loop
    self.control = self.callback_handler.on_prediction_step(args, self.state, self.control)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "**************/lib/python3.11/site-packages/transformers/trainer_callback.py", line 410, in on_prediction_step
    return self.call_event("on_prediction_step", args, state, control)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "**************/lib/python3.11/site-packages/transformers/trainer_callback.py", line 414, in call_event
    result = getattr(callback, event)(
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "**************/lib/python3.11/site-packages/trl/trainer/utils.py", line 783, in on_prediction_step
    self.prediction_bar.update(self.prediction_task_id, advance=1, update=True)
  File "**************/lib/python3.11/site-packages/rich/progress.py", line 1425, in update
    task = self._tasks[task_id]
           ~~~~~~~~~~~^^^^^^^^^
KeyError: None
@eggry
Copy link
Contributor Author

eggry commented Mar 31, 2024

I’m working on a PR to address this issue.

For those with urgent needs, a simple workaround is to comment out this line.

os.environ["TRL_USE_RICH"] = "1"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant