Status: Closed
Labels: bug (Something isn't working)
Description
🐛 Describe the bug
INFO colossalai - colossalai - INFO: Tokenizing finish.
Collect steps: 100%|███████████████████████████████████████████████████████████████████████████████████| 2/2 [00:02<00:00, 1.42s/it]
Train epoch [1]: 0%| | 0/2 [00:00<?, ?it/s]
Update steps: 0%| | 0/1 [00:01<?, ?it/s]
Episodes: 0%| | 0/1 [00:04<?, ?it/s]
Traceback (most recent call last):
File "/ColossalAI-main/applications/Chat/examples/train_prompts.py", line 222, in <module>
main(args)
File "/ColossalAI-main/applications/Chat/examples/train_prompts.py", line 179, in main
trainer.fit(prompt_dataloader=prompt_dataloader,
File "/opt/conda/lib/python3.10/site-packages/coati/trainer/base.py", line 186, in fit
self._update_phase(update_step)
File "/opt/conda/lib/python3.10/site-packages/coati/trainer/base.py", line 151, in _update_phase
self._learn(update_step)
File "/opt/conda/lib/python3.10/site-packages/coati/trainer/ppo.py", line 189, in _learn
metrics = self._training_step(experience)
File "/opt/conda/lib/python3.10/site-packages/coati/trainer/ppo.py", line 143, in _training_step
attention_mask=batch['attention_mask'])['logits']
KeyError: 'attention_mask'
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 192) of binary: /opt/conda/bin/python3.10

Thank You!
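For context on the failure: the KeyError shows that the batch dict reaching `_training_step` in coati/trainer/ppo.py has no 'attention_mask' entry, so the update phase crashes on its first step. Below is a minimal sketch of a defensive workaround, assuming the batch is a plain dict that at least contains 'input_ids' and that the tokenizer's pad token id is available; neither assumption is confirmed against coati's actual collator, so treat this as illustration only.

```python
import torch

def get_attention_mask(batch: dict, pad_token_id: int) -> torch.Tensor:
    """Return the batch's attention mask, deriving one when absent.

    The 'input_ids' key and `pad_token_id` argument are assumptions about
    the dataloader output, not coati's documented API.
    """
    if 'attention_mask' in batch:
        return batch['attention_mask']
    # Fall back to marking non-padding positions as 1 and padding as 0,
    # a common convention for HuggingFace-style models.
    return (batch['input_ids'] != pad_token_id).long()
```

If this hypothesis is right, calling `get_attention_mask(batch, tokenizer.pad_token_id)` in place of `batch['attention_mask']` would avoid the crash, though the proper fix is likely for the dataset/collator to emit the mask in the first place.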
Environment
CUDA: 11.7
Python: 3.10.12
PyTorch: 1.13.1+cu117
GPU: A100-40G
ColossalAI: v0.3.1