
Could not load model deepset/minilm-uncased-squad2 #16849

Closed · 2 of 4 tasks
IavTavares opened this issue Apr 20, 2022 · 9 comments
Labels: bug

IavTavares commented Apr 20, 2022

System Info

I'm trying to load the model "deepset/minilm-uncased-squad2".
On my laptop (Ubuntu 20.04 LTS), there's no problem.
The error happens when I run the exact same code on a server running Linux (see version below).

Here's the output of the transformers-cli env command:
- `transformers` version: 4.18.0
- Platform: Linux-5.13.0-1021-aws-x86_64-with-glibc2.29
- Python version: 3.8.10
- Huggingface_hub version: 0.5.1
- PyTorch version (GPU?): not installed (NA)
- Tensorflow version (GPU?): 2.8.0 (False)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?: No (I'm manually filling this in)
- Using distributed or parallel set-up in script?: No (I'm manually filling this in)


Here's the error message:
ValueError: Could not load model deepset/minilm-uncased-squad2 with any of the following classes: (<class 'transformers.models.auto.modeling_tf_auto.TFAutoModelForQuestionAnswering'>, <class 'transformers.models.bert.modeling_tf_bert.TFBertForQuestionAnswering'>

This [GitHub issue](https://github.com/huggingface/transformers/issues/353) points to a memory failure as a possible cause.


However, to solve this, I had to install the CPU version of PyTorch!
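
For reference, a quick way to check which frameworks transformers can actually see in an environment (just a diagnostic sketch; it assumes a transformers version that exports these helpers, which recent releases do):

from transformers import is_tf_available, is_torch_available

# pipeline() picks its framework based on whichever of these is importable
print("PyTorch available:", is_torch_available())
print("TensorFlow available:", is_tf_available())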

Who can help?

@Rocketknight1, @LysandreJik, @Narsil

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

from transformers import pipeline

model_checkpoint = "deepset/minilm-uncased-squad2"
device = -1  # -1 = CPU

qa_pipeline = pipeline(
    "question-answering",
    model=model_checkpoint,
    tokenizer=model_checkpoint,
    device=device,
)

Expected behavior

No error output, and correct loading of the model.
IavTavares added the bug label on Apr 20, 2022
@Rocketknight1 (Member) commented:

I suspect the cause of this is that the deepset/minilm-uncased-squad2 model only exists as a PyTorch model. When you call pipeline(), it selects the framework (TF or PyTorch) based on what is installed on your machine. If your laptop has both TF and PyTorch installed, it will probably select PyTorch and load the model correctly, but if the server only has TensorFlow, it will fail to load the model. To resolve this, you can either load the model in TF with from_pt=True and save a personal copy as a TF model with save_pretrained and push_to_hub, or you can switch to using PyTorch for the pipeline.
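
Something along these lines should work for the conversion route (untested sketch; it assumes an environment with both PyTorch and TensorFlow installed, and the local/Hub repo names are just examples):

from transformers import AutoTokenizer, TFAutoModelForQuestionAnswering

model_id = "deepset/minilm-uncased-squad2"

# Load the PyTorch checkpoint and convert the weights to TensorFlow on the fly
tf_model = TFAutoModelForQuestionAnswering.from_pretrained(model_id, from_pt=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Save a personal TF copy locally, or push it to your own Hub repository
tf_model.save_pretrained("./minilm-uncased-squad2-tf")
tokenizer.save_pretrained("./minilm-uncased-squad2-tf")
# tf_model.push_to_hub("your-username/minilm-uncased-squad2-tf")  # example repo name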

@Narsil (Contributor) commented Apr 20, 2022:

Exactly that.

And looking at the error:

with any of the following classes: (<class 'transformers.models.auto.modeling_tf_auto.TFAutoModelForQuestionAnswering'>, <class 'transformers.models.bert.modeling_tf_bert.TFBertForQuestionAnswering'>

I can tell you that for some reason your environment could not see AutoModelForQuestionAnswering (the PyTorch version of the model). So it's probably not linked to GPU vs CPU, but just that the GPU install was not functional somehow.
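
Once a working PyTorch install is present, you can also force the framework explicitly so the pipeline doesn't have to guess (sketch, assuming torch is installed):

from transformers import pipeline

qa = pipeline(
    "question-answering",
    model="deepset/minilm-uncased-squad2",
    framework="pt",  # force PyTorch; "tf" would force TensorFlow
)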

@github-actions (bot) commented:

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@dmartines commented:

I am still having this issue with model tiiuae/falcon-40b-instruct

I copied the sample code from the example.

ERROR:
ValueError: Could not load model tiiuae/falcon-40b-instruct with any of the following classes: (<class 'transformers.models.auto.modeling_auto.AutoModelForCausalLM'>,).

@Narsil (Contributor) commented Jul 17, 2023:

What hardware do you have? Loading tiiuae/falcon-40b-instruct will not work on most GPUs.

You need to do some sharding: either use accelerate via pipeline(..., device_map="auto"), which should work very easily, or do something a bit fancier like TP (tensor-parallel) sharding to get more performance out of it.
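
A minimal sketch of the accelerate route (it assumes accelerate is installed and there is enough combined GPU/CPU memory for the weights):

from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="tiiuae/falcon-40b-instruct",
    trust_remote_code=True,
    device_map="auto",  # let accelerate place/shard the weights across devices
)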

@dmartines commented:

I am on a MacBook Air: Apple M2, 16 GB RAM, 500 GB+ disk available.

Do you have the code samples for accelerate or TP sharding?

@Narsil (Contributor) commented Jul 17, 2023:

pipeline(..., device_map="auto")

This should be enough for accelerate.
On an M2 I think it's a bit tight for falcon-40b; you will most likely get a lot of offloading and therefore quite slow inference (and TP cannot help with that).
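
If you want to see (and limit) where the weights end up, something like this should work (sketch; the max_memory value is just an example to tune for your machine):

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-7b-instruct",
    trust_remote_code=True,
    device_map="auto",
    max_memory={"cpu": "12GiB"},  # example cap; adjust to your hardware
    offload_folder="offload",     # spill anything that doesn't fit to disk
)
print(model.hf_device_map)  # shows which device each module was placed on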

@dmartines commented:

Thanks @Narsil. I'm still getting an error:

Downloading (…)l-00001-of-00002.bin:  65%|████████████████████████████████████████████████████▉                             | 6.43G/9.95G [33:12<18:12, 3.23MB/s]
Traceback (most recent call last):
  File "/Users/martinesdaniel/dev/yt-transcript/falcon.py", line 10, in <module>
    pipeline = transformers.pipeline(
               ^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/transformers/pipelines/__init__.py", line 788, in pipeline
    framework, model = infer_framework_load_model(
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/transformers/pipelines/base.py", line 278, in infer_framework_load_model
    raise ValueError(f"Could not load model {model} with any of the following classes: {class_tuple}.")
ValueError: Could not load model tiiuae/falcon-7b-instruct with any of the following classes: (<class 'transformers.models.auto.modeling_auto.AutoModelForCausalLM'>,).

Here is my code:

from transformers import AutoTokenizer, AutoModelForCausalLM
import transformers
import torch

model = "tiiuae/falcon-7b-instruct"

tokenizer = AutoTokenizer.from_pretrained(model)
tokenizer.save_pretrained("./model/")

pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.float32,
    trust_remote_code=True,
    device_map="auto",
)
sequences = pipeline(
   "Girafatron is obsessed with giraffes, the most glorious animal on the face of this Earth. Giraftron believes all other animals are irrelevant when compared to the glorious majesty of the giraffe.\nDaniel: Hello, Girafatron!\nGirafatron:",
    max_length=200,
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")

Could this be an internet bandwidth issue?

@Narsil (Contributor) commented Jul 17, 2023:

ValueError: Could not load model tiiuae/falcon-7b-instruct with any of the following classes: (<class 'transformers.models.auto.modeling_auto.AutoModelForCausalLM'>,).

That's the issue, but I'm not sure what's happening.

Can you try:

model = "tiiuae/falcon-7b-instruct"

tokenizer = AutoTokenizer.from_pretrained(model)
tokenizer.save_pretrained("./model/")

model = AutoModelForCausalLM.from_pretrained(model=model, trust_remote_code=True, device_map="auto",torch_dtype=torch.float32,)

pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)

Try removing the torch_dtype=torch.float32 too; these models are meant to be used in half precision.
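
For example, a half-precision version of the same load would look like this (sketch; assumes accelerate is installed and there is enough memory for the model in half precision):

import torch
import transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-7b-instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    device_map="auto",
    torch_dtype=torch.bfloat16,  # half precision instead of float32
)

pipe = transformers.pipeline("text-generation", model=model, tokenizer=tokenizer)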
