```
Traceback (most recent call last):
  File "/content/./lighteval/run_evals_accelerate.py", line 82, in <module>
    main(args)
  File "/usr/local/lib/python3.10/dist-packages/lighteval/logging/hierarchical_logger.py", line 166, in wrapper
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/lighteval/main_accelerate.py", line 111, in main
    evaluation_tracker = evaluate(
  File "/usr/local/lib/python3.10/dist-packages/lighteval/evaluator.py", line 86, in evaluate
    full_resps = lm.greedy_until(requests, override_bs=override_bs)
  File "/usr/local/lib/python3.10/dist-packages/lighteval/models/base_model.py", line 594, in greedy_until
    cur_reponses = self._generate(
  File "/usr/local/lib/python3.10/dist-packages/lighteval/models/base_model.py", line 617, in _generate
    outputs = self.model.generate(
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 1576, in generate
    result = self._greedy_search(
  File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 2494, in _greedy_search
    outputs = self(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/mistral/modeling_mistral.py", line 1158, in forward
    outputs = self.model(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/mistral/modeling_mistral.py", line 987, in forward
    inputs_embeds = self.embed_tokens(input_ids)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/sparse.py", line 163, in forward
    return F.embedding(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py", line 2237, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument index in method wrapper_CUDA__index_select)
Traceback (most recent call last):
  File "/usr/local/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/accelerate_cli.py", line 46, in main
    args.func(args)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py", line 1075, in launch_command
    simple_launcher(args)
  File "/usr/local/lib/python3.10/dist-packages/accelerate/commands/launch.py", line 681, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3', './lighteval/run_evals_accelerate.py', '--model_args', 'pretrained=TheBloke/Mistral-7B-Instruct-v0.2-GPTQ', '--tasks', 'leaderboard|gsm8k|0|0', '--override_batch_size', '1', '--output_dir=./evals/']' returned non-zero exit status 1.
```
Did I do anything wrong?
Thanks for your help!
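The `RuntimeError` above means the embedding weights were still on the CPU while the input ids were on `cuda:0`. A quick way to see which side is misplaced is to list the devices a module's parameters live on. This is a generic helper, not part of lighteval:

```python
import torch.nn as nn


def param_devices(module: nn.Module) -> set:
    """Return the set of device types (e.g. 'cpu', 'cuda') that the
    module's parameters live on."""
    return {p.device.type for p in module.parameters()}


# Demo on a tiny stand-in module: freshly created modules live on the CPU,
# which is exactly the state the traceback above complains about.
tiny = nn.Embedding(10, 4)
print(param_devices(tiny))  # {'cpu'}
```

Calling `param_devices(model.model)` right after the model is loaded would show whether the GPTQ weights were ever moved to the GPU.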
I can work around the issue for GPTQ models by adding `.cuda()` in
`main_accelerate.py` (line 77), after `load_model()`:
```python
with htrack_block("Model loading"):
    with accelerator.main_process_first() if accelerator is not None else nullcontext():
        model, model_info = load_model(config=model_config, env_config=env_config)
        # for name, param in model.model.named_parameters():
        #     print(f"{name} -> {param.device}")
        model.model.cuda()  # workaround: move the quantized model onto the GPU
    evaluation_tracker.general_config_logger.log_model_info(model_info)
```
But I don't think this is the proper way to resolve the issue.
If anyone can look into why the quantized model stays on the CPU instead of being moved to the GPU during the `load_model` procedure,
that would be very helpful.
The following models passed with the fix above:
TechxGenus/Meta-Llama-3-8B-GPTQ
TheBloke/Mistral-7B-Instruct-v0.2-GPTQ
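One caveat with the patch above: an unconditional `.cuda()` crashes on CPU-only machines. A more defensive variant of the same workaround (a sketch, not an official lighteval fix) guards the move behind an availability check:

```python
import torch


def ensure_on_gpu(module: torch.nn.Module) -> torch.nn.Module:
    """Move the module to the default CUDA device when one is available;
    otherwise leave it where it is."""
    if torch.cuda.is_available():
        return module.cuda()
    return module
```

With this helper, the line in the patch becomes `model.model = ensure_on_gpu(model.model)`.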
Hi! Thanks for your interest in lighteval, and sorry for the delayed answer! Just to be sure: the model that does not pass usually works, but stops working when you add your fix?
As for the reason, I would guess that GPTQ models are loaded differently, and there is a bug where they are simply never moved to the GPU.
Also, how many GPUs do you have available in your Colab notebook?
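For reference, the number of GPUs visible to the process can be checked directly with PyTorch; this returns 0 on a CPU-only runtime:

```python
import torch

# Number of CUDA devices visible to this process.
print(torch.cuda.device_count())
```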
Notebook to reproduce:
Please use a GPU runtime and set up the accelerate config first.
When I use a quantized model, I get the error shown in the traceback at the top of this issue.