
[BUG] macOS mps not supported although configured with accelerate config #566

Closed · 2 tasks done
petergreis opened this issue Apr 1, 2024 · 3 comments
Labels: bug (Something isn't working)

@petergreis

Prerequisites

  • I have read the documentation.
  • I have checked other issues for similar problems.

Backend

Local

Interface Used

CLI

CLI Command

```
autotrain llm \
  --train \
  --project-name masters-work \
  --model ./chatmusician_model_tokenizer \
  --data-path . \
  --text_column output \
  --peft \
  --lr 2e-4 \
  --batch-size 2 \
  --epochs 3 \
  --trainer sft \
  --model_max_length 2048 \
  --block_size 2048 \
  --save_strategy epoch \
  --log wandb
```

UI Screenshots & Parameters

No response

Error Logs

```
warnings.warn(
❌ ERROR | 2024-04-01 11:00:27 | autotrain.trainers.common:wrapper:91 - train has failed due to an exception: Traceback (most recent call last):
File "/Users/petergreis/anaconda3/envs/autotrain/lib/python3.10/site-packages/autotrain/trainers/common.py", line 88, in wrapper
return func(*args, **kwargs)
File "/Users/petergreis/anaconda3/envs/autotrain/lib/python3.10/site-packages/autotrain/trainers/clm/main.py", line 519, in train
trainer.train()
File "/Users/petergreis/anaconda3/envs/autotrain/lib/python3.10/site-packages/trl/trainer/sft_trainer.py", line 331, in train
output = super().train(*args, **kwargs)
File "/Users/petergreis/anaconda3/envs/autotrain/lib/python3.10/site-packages/transformers/trainer.py", line 1624, in train
return inner_training_loop(
File "/Users/petergreis/anaconda3/envs/autotrain/lib/python3.10/site-packages/transformers/trainer.py", line 1961, in _inner_training_loop
tr_loss_step = self.training_step(model, inputs)
File "/Users/petergreis/anaconda3/envs/autotrain/lib/python3.10/site-packages/transformers/trainer.py", line 2902, in training_step
loss = self.compute_loss(model, inputs)
File "/Users/petergreis/anaconda3/envs/autotrain/lib/python3.10/site-packages/transformers/trainer.py", line 2925, in compute_loss
outputs = model(**inputs)
File "/Users/petergreis/anaconda3/envs/autotrain/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/Users/petergreis/anaconda3/envs/autotrain/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/Users/petergreis/anaconda3/envs/autotrain/lib/python3.10/site-packages/peft/peft_model.py", line 1091, in forward
return self.base_model(
File "/Users/petergreis/anaconda3/envs/autotrain/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/Users/petergreis/anaconda3/envs/autotrain/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/Users/petergreis/anaconda3/envs/autotrain/lib/python3.10/site-packages/peft/tuners/tuners_utils.py", line 160, in forward
return self.model.forward(*args, **kwargs)
File "/Users/petergreis/anaconda3/envs/autotrain/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 1176, in forward
outputs = self.model(
File "/Users/petergreis/anaconda3/envs/autotrain/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/Users/petergreis/anaconda3/envs/autotrain/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/Users/petergreis/anaconda3/envs/autotrain/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 1008, in forward
layer_outputs = self._gradient_checkpointing_func(
File "/Users/petergreis/anaconda3/envs/autotrain/lib/python3.10/site-packages/torch/_compile.py", line 24, in inner
return torch._dynamo.disable(fn, recursive)(*args, **kwargs)
File "/Users/petergreis/anaconda3/envs/autotrain/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 489, in _fn
return fn(*args, **kwargs)
File "/Users/petergreis/anaconda3/envs/autotrain/lib/python3.10/site-packages/torch/_dynamo/external_utils.py", line 17, in inner
return fn(*args, **kwargs)
File "/Users/petergreis/anaconda3/envs/autotrain/lib/python3.10/site-packages/torch/utils/checkpoint.py", line 482, in checkpoint
return CheckpointFunction.apply(function, preserve, *args)
File "/Users/petergreis/anaconda3/envs/autotrain/lib/python3.10/site-packages/torch/autograd/function.py", line 553, in apply
return super().apply(*args, **kwargs) # type: ignore[misc]
File "/Users/petergreis/anaconda3/envs/autotrain/lib/python3.10/site-packages/torch/utils/checkpoint.py", line 261, in forward
outputs = run_function(*args)
File "/Users/petergreis/anaconda3/envs/autotrain/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/Users/petergreis/anaconda3/envs/autotrain/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/Users/petergreis/anaconda3/envs/autotrain/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 740, in forward
hidden_states, self_attn_weights, present_key_value = self.self_attn(
File "/Users/petergreis/anaconda3/envs/autotrain/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/Users/petergreis/anaconda3/envs/autotrain/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/Users/petergreis/anaconda3/envs/autotrain/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 647, in forward
cos, sin = self.rotary_emb(value_states, position_ids)
File "/Users/petergreis/anaconda3/envs/autotrain/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/Users/petergreis/anaconda3/envs/autotrain/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "/Users/petergreis/anaconda3/envs/autotrain/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/Users/petergreis/anaconda3/envs/autotrain/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 141, in forward
with torch.autocast(device_type=device_type, enabled=False):
File "/Users/petergreis/anaconda3/envs/autotrain/lib/python3.10/site-packages/torch/amp/autocast_mode.py", line 241, in init
raise RuntimeError(
RuntimeError: User specified an unsupported autocast device_type 'mps'

❌ ERROR | 2024-04-01 11:00:27 | autotrain.trainers.common:wrapper:92 - User specified an unsupported autocast device_type 'mps'
```
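The failure originates in `torch.autocast` rejecting `device_type='mps'` when `modeling_llama.py` wraps the rotary embedding in an autocast-disabled context. A minimal sketch that reproduces the same RuntimeError, assuming a PyTorch version from this era (~2.2) in which autocast validated the device type in `__init__` even with `enabled=False`:

```python
import torch

# Minimal reproduction sketch (assumption: torch ~2.2, where
# torch.autocast validated device_type on construction even when
# enabled=False, and "mps" was not an accepted device type).
# On affected versions this raises:
#   RuntimeError: User specified an unsupported autocast device_type 'mps'
with torch.autocast(device_type="mps", enabled=False):
    pass
```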

Additional Information

```
(autotrain) petergreis@MacBook-Pro-M1-Max-2021 Project % accelerate config
In which compute environment are you running? This machine
Which type of machine are you using? No distributed training
Do you want to run your training on CPU only (even if a GPU / Apple Silicon / Ascend NPU device is available)? [yes/NO]: no
Do you want to use Intel PyTorch Extension (IPEX) to speed up training on CPU? [yes/NO]: no
Do you wish to optimize your script with torch dynamo? [yes/NO]: no
Do you want to use DeepSpeed? [yes/NO]: no
Do you wish to use FP16 or BF16 (mixed precision)? fp16
```
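For completeness, a quick sanity check (standard PyTorch API) confirms whether the Apple Silicon (MPS) backend is actually visible to the environment that `accelerate` configured:

```python
import torch

# Standard PyTorch API: verify the Apple Silicon (MPS) backend is both
# compiled into this torch build and usable on the current machine.
print(torch.backends.mps.is_built())      # torch was built with MPS support
print(torch.backends.mps.is_available())  # the backend is usable right now
```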

petergreis added the bug label on Apr 1, 2024
@magusCoder-official

I think macOS doesn't support fp16 or bf16. Can you try using Docker? That might help.
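A hedged aside: half-precision tensor storage generally does work on MPS, so the traceback above points at `torch.autocast` rather than fp16 itself (the snippet assumes an MPS-enabled torch build; bf16 availability depends on the macOS and torch versions):

```python
import torch

# Assumes an MPS-enabled torch build. fp16 *storage* works on MPS;
# the issue's RuntimeError comes from torch.autocast rejecting the
# "mps" device type, not from half precision itself.
x = torch.ones(2, 2, device="mps", dtype=torch.float16)
print(x.dtype, x.device)  # torch.float16 mps:0
```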

@abhishekkrthakur
Member

seems like a transformers issue: huggingface/transformers#29431
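The workaround discussed in threads like that one is to avoid handing "mps" to autocast in the rotary-embedding forward. A sketch of that style of guard, as an illustration rather than the exact upstream patch:

```python
import torch

def autocast_safe_device_type(tensor: torch.Tensor) -> str:
    # Illustration of the guard style discussed in the linked issue
    # (not necessarily the exact upstream patch): fall back to "cpu"
    # for autocast when the tensor lives on a device type that
    # torch.autocast does not accept, such as "mps" on older torch.
    device_type = tensor.device.type
    return device_type if device_type != "mps" else "cpu"

# Usage inside a forward pass:
# with torch.autocast(device_type=autocast_safe_device_type(x), enabled=False):
#     ...
```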

@abhishekkrthakur
Member

abhishekkrthakur commented Apr 1, 2024

```
autotrain llm \
  --train \
  --model gpt2 \
  --data-path timdettmers/openassistant-guanaco \
  --lr 2e-4 \
  --batch-size 2 \
  --epochs 1 \
  --trainer sft \
  --peft \
  --project-name ms-re-1
```

seems to be working fine.
