
Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! #176

Open
gody7334 opened this issue Apr 26, 2024 · 3 comments

Comments

@gody7334

gody7334 commented Apr 26, 2024

Notebook to reproduce: please use a GPU runtime and set up the accelerate config.

When I use a quantized model, I get this error:

```
Traceback (most recent call last):
  File "/content/./lighteval/run_evals_accelerate.py", line 82, in <module>
    main(args)
  File "/usr/local/lib/python3.10/dist-packages/lighteval/logging/hierarchical_logger.py", line 166, in wrapper
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/lighteval/main_accelerate.py", line 111, in main
    evaluation_tracker = evaluate(
  File "/usr/local/lib/python3.10/dist-packages/lighteval/evaluator.py", line 86, in evaluate
    full_resps = lm.greedy_until(requests, override_bs=override_bs)
  File "/usr/local/lib/python3.10/dist-packages/lighteval/models/base_model.py", line 594, in greedy_until
    cur_reponses = self._generate(
  File "/usr/local/lib/python3.10/dist-packages/lighteval/models/base_model.py", line 617, in _generate
    outputs = self.model.generate(
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 1576, in generate
    result = self._greedy_search(
  File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 2494, in _greedy_search
    outputs = self(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/mistral/modeling_mistral.py", line 1158, in forward
    outputs = self.model(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/mistral/modeling_mistral.py", line 987, in forward
    inputs_embeds = self.embed_tokens(input_ids)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/sparse.py", line 163, in forward
    return F.embedding(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py", line 2237, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument index in method wrapper_CUDA__index_select)
Traceback (most recent call last):
  File "/usr/local/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/accelerate_cli.py", line 46, in main
    args.func(args)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py", line 1075, in launch_command
    simple_launcher(args)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py", line 681, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', './lighteval/run_evals_accelerate.py', '--model_args', 'pretrained=TheBloke/Mistral-7B-Instruct-v0.2-GPTQ', '--tasks', 'leaderboard|gsm8k|0|0', '--override_batch_size', '1', '--output_dir=./evals/']' returned non-zero exit status 1.
```

Did I do anything wrong?
Thanks for your help!
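The error suggests the embedding weights are still on the CPU while the input ids are on `cuda:0`. A quick way to diagnose this (a minimal sketch, not lighteval code; the toy model only stands in for the real loaded model) is to group a model's parameters by device:

```python
import torch
from torch import nn
from collections import defaultdict

def params_by_device(model: nn.Module) -> dict:
    """Group parameter names by the device their tensors live on."""
    devices = defaultdict(list)
    for name, param in model.named_parameters():
        devices[str(param.device)].append(name)
    return dict(devices)

# Demo with a tiny stand-in model; the real check would run on the lighteval model.
# After a correct GPU load, every group should be a cuda device, not "cpu".
toy = nn.Sequential(nn.Embedding(10, 4), nn.Linear(4, 2))
print(params_by_device(toy))
```

If any entry maps to `"cpu"` after loading, generation with CUDA inputs will fail exactly as in the traceback above.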

@gody7334
Author

gody7334 commented Apr 30, 2024

I can resolve the issue for GPTQ models by adding `.cuda()` in `main_accelerate.py` (line 77), after `load_model()`:

```python
with htrack_block("Model loading"):
    with accelerator.main_process_first() if accelerator is not None else nullcontext():
        model, model_info = load_model(config=model_config, env_config=env_config)
        # for i in model.model.named_parameters():
        #     print(f"{i[0]} -> {i[1].device}")
        # import ipdb; from pprint import pprint as pp; ipdb.set_trace();
        model.model.cuda()
        evaluation_tracker.general_config_logger.log_model_info(model_info)
```

But I don't think this is the proper way to resolve the issue. If anyone can have a look at why the quantized model stays on the CPU instead of being moved to the GPU during the load_model procedure, that would be very helpful.

Models that pass with the above fix:

  • TechxGenus/Meta-Llama-3-8B-GPTQ
  • TheBloke/Mistral-7B-Instruct-v0.2-GPTQ

Model that doesn't pass:

  • 01-ai/Yi-6B-Chat-4bits
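An alternative to calling `.cuda()` after the fact could be to request GPU placement at load time. With transformers, a checkpoint can be dispatched onto available GPUs via the `device_map` argument of `from_pretrained` (a sketch, not the lighteval loading path; the model name is taken from the command above, and it needs a GPU plus network access to actually run):

```python
# device_map="auto" lets accelerate dispatch the weights across available devices at load time
LOAD_KWARGS = {"device_map": "auto"}

def load_quantized(model_name: str):
    """Load a checkpoint, letting accelerate place the weights (requires transformers + accelerate)."""
    from transformers import AutoModelForCausalLM  # imported lazily, only when actually loading

    return AutoModelForCausalLM.from_pretrained(model_name, **LOAD_KWARGS)

# Usage (not run here: needs a GPU and downloads the checkpoint):
# model = load_quantized("TheBloke/Mistral-7B-Instruct-v0.2-GPTQ")
```

Whether lighteval's own model config already exposes such an option is a separate question; this only shows the underlying transformers mechanism.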

@NathanHB
Member

Hi! Thanks for your interest in lighteval, and sorry for the delayed answer! Just to be sure: does the model that does not pass usually work, but stop working when adding your fix?
As for the reason, I would guess that GPTQ models are loaded differently and there is a bug where they are simply not loaded to the GPU.
Also, how many GPUs do you have available in your Colab notebook?

@gody7334
Author

Hi NathanHB,
Thanks for the reply.

  • No, I don't have any problem with normal models, as the added code simply pushes the model onto the GPU.
  • I only have a single GPU; Colab is a single-GPU (T4) setup.
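For reference, the GPU count visible to PyTorch can be checked directly (a trivial sketch; on a single-T4 Colab runtime this would report one CUDA device):

```python
import torch

def gpu_summary() -> str:
    """Report how many CUDA devices PyTorch can see."""
    if not torch.cuda.is_available():
        return "no CUDA devices"
    return f"{torch.cuda.device_count()} CUDA device(s) available"

print(gpu_summary())
```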
