Hello, I downloaded the code. When I run with mp=1 and the 7B model, my command is

torchrun --nproc_per_node 1 example.py --ckpt_dir D:/llama/7B --tokenizer_path D:/llama/tokenizer.model

and it works well. But when I change to mp=2 and the 13B model, with the command

torchrun --nproc_per_node 2 example.py --ckpt_dir D:/llama/13B --tokenizer_path D:/llama/tokenizer.model
the model loads correctly onto the 2 GPUs, but when generating there is an error:

Traceback (most recent call last):
  File "example.py", line 165, in <module>
    fire.Fire(main)
  File "C:\Users\sunbi\Anaconda3\lib\site-packages\fire\core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "C:\Users\sunbi\Anaconda3\lib\site-packages\fire\core.py", line 480, in _Fire
    target=component.__name__)
  File "C:\Users\sunbi\Anaconda3\lib\site-packages\fire\core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "example.py", line 160, in main
    [prompt], max_gen_len=max_gen_len, temperature=temperature, top_p=top_p, top_k=top_k, repetition_penalty=repetition_penalty, token_callback=callback,
  File "D:\LLaMA\llama\llama\generation.py", line 46, in generate
    logits = self.model.forward(tokens[:, prev_pos:cur_pos], prev_pos)
  File "C:\Users\sunbi\Anaconda3\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "D:\LLaMA\llama\llama\model.py", line 225, in forward
    h = self.tok_embeddings(tokens)
  File "C:\Users\sunbi\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\sunbi\Anaconda3\lib\site-packages\fairscale\nn\model_parallel\layers.py", line 214, in forward
    output = gather_from_model_parallel_region(output_parallel)
  File "C:\Users\sunbi\Anaconda3\lib\site-packages\fairscale\nn\model_parallel\mappings.py", line 156, in gather_from_model_parallel_region
    return _GatherFromModelParallelRegion.apply(input_)
  File "C:\Users\sunbi\Anaconda3\lib\site-packages\fairscale\nn\model_parallel\mappings.py", line 131, in forward
    return _gather(input_)
  File "C:\Users\sunbi\Anaconda3\lib\site-packages\fairscale\nn\model_parallel\mappings.py", line 82, in _gather
    torch.distributed.all_gather(tensor_list, input_, group=group)
  File "C:\Users\sunbi\Anaconda3\lib\site-packages\torch\distributed\distributed_c10d.py", line 2282, in all_gather
    work.wait()
RuntimeError: Inplace update to inference tensor outside InferenceMode is not allowed. You can make a clone to get a normal tensor before doing inplace update. See https://github.com/pytorch/rfcs/pull/17 for more details.
I changed

logits = self.model.forward(tokens[:, prev_pos:cur_pos], prev_pos)

into

temp = tokens[:, prev_pos:cur_pos].clone()
logits = self.model.forward(temp, prev_pos)

but the problem remains.
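A possible explanation for why the clone didn't help (an assumption about this code path, since generate() appears to run under torch.inference_mode()): a clone made while still *inside* the inference_mode context is itself an inference tensor, so in-place updates on it fail just the same. This can be checked with Tensor.is_inference() (available since PyTorch 1.9):

```python
import torch

with torch.inference_mode():
    t = torch.zeros(2)
    c_inside = t.clone()   # cloned inside the context
c_outside = t.clone()      # cloned outside the context

print(t.is_inference())          # True
print(c_inside.is_inference())   # True  -> in-place updates still fail
print(c_outside.is_inference())  # False -> a normal tensor
```

So a clone inside generate() would only fix the error if the clone happens outside the inference_mode region, which is not the case when the whole generation loop runs under the decorator.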
environment:
Intel i7-8700K
32 GB of RAM
2 Tesla P40 GPUs (24 GB of video memory each)
Windows 11 22H2
I also tried cloning tokens before h = self.tok_embeddings(tokens). It doesn't resolve the problem, so I had to remove inference_mode; it's slower, but acceptable.
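For reference, a middle ground between removing the guard entirely and keeping inference_mode is torch.no_grad(): tensors created under no_grad are ordinary tensors (no inference-tensor restriction on later in-place updates), while autograd tracking is still disabled, so most of the inference speed is kept. A minimal sketch, not a claim about the repo's actual decorators:

```python
import torch

# Under no_grad, autograd is disabled but the resulting tensor
# is a normal tensor, not an inference tensor.
with torch.no_grad():
    t = torch.zeros(4)

# Unlike a tensor created under inference_mode, this one accepts
# in-place updates afterwards.
t.add_(1)
print(t)  # tensor([1., 1., 1., 1.])
```

Swapping @torch.inference_mode() for @torch.no_grad() on the generation function may therefore avoid the all_gather failure with less overhead than removing the guard altogether.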
I found the same error on Stack Overflow:
https://stackoverflow.com/questions/71223747/pytorch-error-when-modifying-unpacked-tensor
conda version: 4.5.11
python version: 3.7
torch version: 1.13.1+cu117
cuda version: 11.7