RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:2 and cuda:0! #103

@haoran1062

Hello, I'm trying to edit some knowledge in Qwen-14B. I have 8 x A100 GPUs available. Depending on the model_parallel setting in the YAML file:

  • With model_parallel: false, an OOM error occurs.
  • With model_parallel: true (multi-GPU), running ROME/MEMIT fails with: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:2 and cuda:0!
  • qwen-14b.yaml
alg_name: "ROME"
model_name: "/data/weights/trained/sft/qwen-14b_hf/"
stats_dir: "./data/stats"
device: 0
layers: [10]
fact_token: "subject_last"
v_num_grad_steps: 20
v_lr: 5e-1
v_loss_layer: 39
v_weight_decay: 0.5
clamp_norm_factor: 4
kl_factor: 0.0625
mom2_adjustment: false
context_template_length_params: [[5, 10], [10, 10]]
rewrite_module_tmp: "transformer.h.{}.mlp.c_proj"
layer_module_tmp: "transformer.h.{}"
mlp_module_tmp: "transformer.h.{}.mlp"
attn_module_tmp: "transformer.h.{}.attn"
ln_f_module: "transformer.ln_f"
lm_head_module: "lm_head"
mom2_dataset: "wikipedia"
mom2_n_samples: 100000
mom2_dtype: "float32"
model_parallel: true
  • output
Rewrite layer is 10
Tying optimization objective to 39
Recording initial value of v*
Traceback (most recent call last):
  File "/data/projects/EasyEdit/play_edit.py", line 88, in <module>
    test_ROME_Qwen(cfg_path, model_path)
  File "/data/projects/EasyEdit/play_edit.py", line 32, in test_ROME_Qwen
    metrics, edited_model, _ = editor.edit(
  File "/data/projects/EasyEdit/easyeditor/editors/editor.py", line 247, in edit
    edited_model, weights_copy = self.apply_algo(
  File "/data/projects/EasyEdit/easyeditor/models/rome/rome_main.py", line 41, in apply_rome_to_model
    deltas = execute_rome(model, tok, request, hparams)
  File "/data/projects/EasyEdit/easyeditor/models/rome/rome_main.py", line 113, in execute_rome
    right_vector: torch.Tensor = compute_v(
  File "/data/projects/EasyEdit/easyeditor/models/rome/compute_v.py", line 121, in compute_v
    logits = model(**input_tok).logits
  File "/opt/conda/envs/EasyEdit/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/envs/EasyEdit/lib/python3.9/site-packages/accelerate/hooks.py", line 164, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/modeling_qwen.py", line 1104, in forward
    transformer_outputs = self.transformer(
  File "/opt/conda/envs/EasyEdit/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/modeling_qwen.py", line 934, in forward
    outputs = block(
  File "/opt/conda/envs/EasyEdit/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/envs/EasyEdit/lib/python3.9/site-packages/accelerate/hooks.py", line 164, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/modeling_qwen.py", line 655, in forward
    mlp_output = self.mlp(layernorm_output)
  File "/opt/conda/envs/EasyEdit/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1547, in _call_impl
    hook_result = hook(self, args, result)
  File "/data/projects/EasyEdit/easyeditor/util/nethook.py", line 80, in retain_hook
    output = invoke_with_optional_args(
  File "/data/projects/EasyEdit/easyeditor/util/nethook.py", line 451, in invoke_with_optional_args
    return fn(*pass_args, **pass_kw)
  File "/data/projects/EasyEdit/easyeditor/models/rome/compute_v.py", line 98, in edit_output_fn
    cur_out[i, idx, :] += delta
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:2 and cuda:0!
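
The last frame points at the in-place add `cur_out[i, idx, :] += delta` inside the forward hook. One plausible reading, which I have not confirmed, is that `delta` sits on cuda:0 (matching `device: 0` in the YAML) while the hooked layer's output was placed on cuda:2 when the model was sharded across GPUs. Below is a minimal sketch of a possible workaround under that assumption: move `delta` to whatever device `cur_out` is on before adding. The helper name `add_delta_on_output_device` is hypothetical and not part of EasyEdit; only `torch.Tensor.to` is a real API here.

```python
import torch


def add_delta_on_output_device(
    cur_out: torch.Tensor, delta: torch.Tensor, i: int, idx: int
) -> torch.Tensor:
    """Add `delta` to cur_out[i, idx, :] in place, on cur_out's device.

    Hypothetical helper (not EasyEdit code). Tensor.to() returns the same
    tensor when the device already matches, and it is differentiable, so
    gradients still reach the original `delta`.
    """
    cur_out[i, idx, :] += delta.to(cur_out.device)
    return cur_out


if __name__ == "__main__":
    # CPU-only demo so it runs anywhere; with a sharded model, cur_out
    # would live on the GPU that holds the hooked layer.
    cur_out = torch.zeros(2, 3, 4)             # (batch, seq, hidden)
    delta = torch.ones(4, requires_grad=True)  # stands in for the optimized vector
    out = add_delta_on_output_device(cur_out, delta, i=0, idx=1)
    out.sum().backward()
    print(out[0, 1], delta.grad)
```

If this reading is right, the same one-line change inside `edit_output_fn` might be enough, though other tensors in `compute_v` could hit the same mismatch further on.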
