[BUG] Can't load OPT-30B and OPT-66B through checkpoints.json #2616
Comments
I can confirm that I'm able to replicate this. Interestingly, I'm finding that smaller OPT models load fine with meta tensors. It appears that models whose HuggingFace checkpoints are split into multiple .bin files are causing this error. @RezaYazdaniAminabadi any idea of the cause? I'm guessing we don't catch this in our unit tests because we use small versions of these larger models to save time.
@anselmwang I see you mentioned you are only trying to load the models with meta tensor on your production node. One possible solution (until we determine the cause of this error) would be to create a pre-sharded version of each model on your dev node and copy that over to the production node. I'm able to properly load these models from DeepSpeed-sharded checkpoints. See my comment here on how to generate those sharded checkpoints: #2379 (comment)
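For anyone who wants to try that route, here is a minimal sketch of generating a pre-sharded checkpoint on the dev node (model name, paths, and `mp_size` are illustrative; `save_mp_checkpoint_path` is the DeepSpeed-Inference option the linked comment relies on):

```python
import torch
import deepspeed
from transformers import AutoModelForCausalLM

# Load the HF checkpoint once (this step still needs enough CPU RAM).
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-30b", torch_dtype=torch.float16
)

# DeepSpeed injects its kernels and writes tensor-parallel shards to disk;
# run under the launcher, e.g. `deepspeed --num_gpus 4 shard_model.py`.
model = deepspeed.init_inference(
    model,
    mp_size=4,                                   # illustrative GPU count
    dtype=torch.float16,
    replace_with_kernel_inject=True,
    save_mp_checkpoint_path="/data/opt-30b-ds",  # shards land here
)
```

The folder written to `/data/opt-30b-ds` can then be copied to the production node and loaded there without the full-model CPU load.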
I'm experiencing the same issue with the BLOOM models.
Regarding Bloom models, downgrading deepspeed to 0.7.6 works for me. |
Also encountered this when upgrading from 0.7.6 to 0.7.7, with BLOOM 176B. |
Hi, I have fixed some bugs regarding the checkpoint loading for these model architectures. Could you please retry using this PR? You can also try our updated test-suite here. |
Hi @niumanar, @asafkar and @anselmwang, I just wanted to see if you got a chance to try this PR and whether it fixed the issue. Thanks,
@RezaYazdaniAminabadi I can confirm that version 0.8.0 fixed the issue for me. |
@RezaYazdaniAminabadi, @njhill said version 0.8.0 fixed the issue; unfortunately, this version doesn't fix it for me. PR #2662 fixes OPT-30B, but not OPT-66B.
@RezaYazdaniAminabadi apologies, I spoke too soon... it's now working for BLOOM 175B with the pre-sharded fp16 weights, but not with the original weights.
Me too: Any idea when a fix might be available? |
Also, I seem to get the same "NotImplementedError: Cannot copy out of meta tensor; no data!" error even when I roll back to 0.7.6. Is that expected? How can I get this working? P.S.: I am attempting to load a model with checkpoints that are split into two .bin files. |
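For what it's worth, that error is generic PyTorch behavior: a meta tensor has shape and dtype but no storage, so anything that tries to copy its (nonexistent) data raises it. A minimal standalone repro:

```python
import torch

# Meta tensors allocate no data, so any copy out of them fails.
t = torch.empty(4, 4, device="meta")
t.to("cpu")  # NotImplementedError: Cannot copy out of meta tensor; no data!
```

In this issue it means some weights were still on the meta device (i.e. never actually loaded from the checkpoint) when DeepSpeed tried to move them.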
Same on my end.
Same for me. Everything works on 0.7.6 now, and before it didn't. However, 0.8.0 does not resolve the issue and gives similar behavior to what others have shown.
@asafkar @felifri Have you tried with …? But ultimately, what I did that I think got it loading correctly (on 0.8.0) was to load the model once on CPU (and thus RAM), and to re-save the checkpoints to a local folder in sharded form using save_pretrained. I am using Huggingface Accelerate for handling config and initialization, so I am not using deepspeed.initialize() or deepspeed.init_inference() at all; instead I'm simply passing my deepspeed config to the huggingface deepspeed config object (something like …).
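A minimal sketch of that re-save step, assuming a BLOOM-style model (model name, target folder, and shard size are illustrative):

```python
import torch
from transformers import AutoModelForCausalLM

# Load once on a machine with enough CPU RAM...
model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom",
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
)

# ...then re-save locally in sharded form; no shard exceeds the size limit.
model.save_pretrained("/local/bloom-sharded", max_shard_size="10GB")
```

After that, `from_pretrained("/local/bloom-sharded", ...)` picks up the smaller shards.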
On DS 0.9.2, I tried with opt-350m, which only has one .bin file, and it doesn't work (it throws the same NotImplementedError: Cannot copy out of meta tensor; no data! error).
What is low_cpu_mem_usage set to? |
If I set low_cpu_mem_usage, …
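For reference, the flag under discussion is a from_pretrained argument; a minimal sketch with the opt-350m checkpoint mentioned above:

```python
from transformers import AutoModelForCausalLM

# low_cpu_mem_usage=True loads weights shard-by-shard instead of first
# materializing the whole state dict (and a second model copy) in CPU RAM.
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m", low_cpu_mem_usage=True
)
```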
I got the same error with NousResearch/Nous-Capybara-34B.
Describe the bug
I can't load OPT-30B and OPT-66B through checkpoints.json. If I load them with Huggingface from_pretrained, everything works fine. This bug is troublesome because my production nodes have far less memory than my dev node, so they don't have enough CPU memory to load OPT-30B and OPT-66B.

To Reproduce
python 3.7.7
Without checkpoints_json, this command works
date; deepspeed --num_gpus 4 bloom-inference-scripts/bloom-ds-inference.py --name facebook/opt-30b; date
Below is the stack trace when using checkpoints.json
date; deepspeed --num_gpus 4 bloom-inference-scripts/bloom-ds-inference.py --name facebook/opt-30b --use_checkpoints_json; date
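(For context: checkpoints.json is the small file bloom-ds-inference.py writes so deepspeed.init_inference can stream the HF checkpoint shards straight to the GPUs instead of replicating the whole model in CPU RAM first. A minimal sketch of that construction, with an illustrative cache path and the "type"/"version" fields the script uses:)

```python
import json
from pathlib import Path

# Illustrative snapshot path inside the HF cache; "xxx" stands in for the
# actual commit hash of the downloaded model.
ckpt_dir = Path(
    "~/.cache/huggingface/hub/models--facebook--opt-30b/snapshots/xxx"
).expanduser()
checkpoint_files = sorted(str(p) for p in ckpt_dir.glob("*.bin"))

with open("checkpoints.json", "w") as f:
    json.dump(
        {"type": "BLOOM", "checkpoints": checkpoint_files, "version": 1.0}, f
    )

# The script then passes this file via
# deepspeed.init_inference(model, ..., checkpoint="checkpoints.json").
```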
For OPT-66B, this command works
date; deepspeed --num_gpus 4 bloom-inference-scripts/bloom-ds-inference.py --name facebook/opt-66b; date
But when turning on checkpoints.json,
date; deepspeed --num_gpus 4 bloom-inference-scripts/bloom-ds-inference.py --name facebook/opt-66b --use_checkpoints_json; date
below is the stack trace.

Expected behavior
ds_report output
Please run ds_report to give us details about your setup.

Screenshots
If applicable, add screenshots to help explain your problem.
System info (please complete the following information):
Docker context
Are you using a specific docker image that you can share?
Not using Docker.
Additional context
Add any other context about the problem here.