
Could not load model deepset/minilm-uncased-squad2 #16849

Closed · 2 of 4 tasks
IavTavares opened this issue Apr 20, 2022 · 9 comments
Labels: bug

IavTavares commented Apr 20, 2022

System Info

I'm trying to load the model "deepset/minilm-uncased-squad2".
On my laptop (Ubuntu 20.04 LTS), there's no problem.
The error happens when I run the exact same code on a server running Linux (see version below).

Here's the output of the transformers-cli env command:
- `transformers` version: 4.18.0
- Platform: Linux-5.13.0-1021-aws-x86_64-with-glibc2.29
- Python version: 3.8.10
- Huggingface_hub version: 0.5.1
- PyTorch version (GPU?): not installed (NA)
- Tensorflow version (GPU?): 2.8.0 (False)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?: No (I'm manually filling this in)
- Using distributed or parallel set-up in script?: No (I'm manually filling this in)


Here's the error message:
ValueError: Could not load model deepset/minilm-uncased-squad2 with any of the following classes: (<class 'transformers.models.auto.modeling_tf_auto.TFAutoModelForQuestionAnswering'>, <class 'transformers.models.bert.modeling_tf_bert.TFBertForQuestionAnswering'>

This [GitHub issue](https://github.com/huggingface/transformers/issues/353) points to a memory failure as a possible cause.


However, to solve this, I had to install the CPU version of PyTorch!
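
For reference, a quick way to check which frameworks transformers can actually see in an environment (just a diagnostic sketch; it assumes a transformers version that exports these helpers, which recent releases do):

from transformers import is_tf_available, is_torch_available

# pipeline() picks its framework based on whichever of these is importable
print("PyTorch available:", is_torch_available())
print("TensorFlow available:", is_tf_available())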

Who can help?

@Rocketknight1, @LysandreJik, @Narsil

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

from transformers import pipeline

model_checkpoint = "deepset/minilm-uncased-squad2"
device = -1  # -1 = CPU

qa_pipeline = pipeline(
    "question-answering",
    model=model_checkpoint,
    tokenizer=model_checkpoint,
    device=device,
)

Expected behavior

No error output, and correct loading of the model.
IavTavares added the bug label on Apr 20, 2022
@Rocketknight1 (Member) commented:

I suspect the cause of this is that the deepset/minilm-uncased-squad2 model only exists as a PyTorch model. When you call pipeline(), it selects the framework (TF or PyTorch) based on what is installed on your machine. If your laptop has both TF and PyTorch installed, it will probably select PyTorch and load the model correctly, but if the server only has TensorFlow, it will fail to load the model. To resolve this, you can either load the model in TF with from_pt=True and save a personal copy as a TF model with save_pretrained and push_to_hub, or you can switch to using PyTorch for the pipeline.
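
Something along these lines should work for the conversion route (untested sketch; it assumes an environment with both PyTorch and TensorFlow installed, and the local/Hub repo names are just examples):

from transformers import AutoTokenizer, TFAutoModelForQuestionAnswering

model_id = "deepset/minilm-uncased-squad2"

# Load the PyTorch checkpoint and convert the weights to TensorFlow on the fly
tf_model = TFAutoModelForQuestionAnswering.from_pretrained(model_id, from_pt=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Save a personal TF copy locally, or push it to your own Hub repository
tf_model.save_pretrained("./minilm-uncased-squad2-tf")
tokenizer.save_pretrained("./minilm-uncased-squad2-tf")
# tf_model.push_to_hub("your-username/minilm-uncased-squad2-tf")  # example repo name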

@Narsil (Contributor) commented Apr 20, 2022:

Exactly that.

And looking at the error:

with any of the following classes: (<class 'transformers.models.auto.modeling_tf_auto.TFAutoModelForQuestionAnswering'>, <class 'transformers.models.bert.modeling_tf_bert.TFBertForQuestionAnswering'>

I can tell you that for some reason your environment could not see AutoModelForQuestionAnswering (the PyTorch version of the model). So it's probably not linked to GPU vs CPU, but just that the GPU install was not functional somehow.
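
Once a working PyTorch install is present, you can also force the framework explicitly so the pipeline doesn't have to guess (sketch, assuming torch is installed):

from transformers import pipeline

qa = pipeline(
    "question-answering",
    model="deepset/minilm-uncased-squad2",
    framework="pt",  # force PyTorch; "tf" would force TensorFlow
)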

@github-actions (bot) commented:

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@dmartines commented:

I am still having this issue with model tiiuae/falcon-40b-instruct

I copied the sample code from the example.

ERROR:
ValueError: Could not load model tiiuae/falcon-40b-instruct with any of the following classes: (<class 'transformers.models.auto.modeling_auto.AutoModelForCausalLM'>,).

@Narsil (Contributor) commented Jul 17, 2023:

What hardware do you have? Loading tiiuae/falcon-40b-instruct will not work on most GPUs.

You need to do some sharding: either use accelerate via pipeline(..., device_map="auto"), which should work very easily, or do something a bit fancier like TP (tensor-parallel) sharding to get more performance out of it.
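
A minimal sketch of the accelerate route (it assumes accelerate is installed and there is enough combined GPU/CPU memory for the weights):

from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="tiiuae/falcon-40b-instruct",
    trust_remote_code=True,
    device_map="auto",  # let accelerate place/shard the weights across devices
)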

@dmartines commented:

I am on a MacBook Air: Apple M2, 16 GB RAM, 500 GB+ disk available.

Do you have the code samples for accelerate or TP sharding?

@Narsil (Contributor) commented Jul 17, 2023:

pipeline(..., device_map="auto")

This should be enough for accelerate.
On an M2 I think it's a bit tight for falcon-40b; you will most likely get a lot of offloading and therefore quite slow inference (and TP cannot help with that).
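
If you want to see (and limit) where the weights end up, something like this should work (sketch; the max_memory value is just an example to tune for your machine):

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-7b-instruct",
    trust_remote_code=True,
    device_map="auto",
    max_memory={"cpu": "12GiB"},  # example cap; adjust to your hardware
    offload_folder="offload",     # spill anything that doesn't fit to disk
)
print(model.hf_device_map)  # shows which device each module was placed on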

@dmartines commented:

Thanks @Narsil. I'm still getting an error:

Downloading (…)l-00001-of-00002.bin:  65%|████████████████████████████████████████████████████▉                             | 6.43G/9.95G [33:12<18:12, 3.23MB/s]
Traceback (most recent call last):
  File "/Users/martinesdaniel/dev/yt-transcript/falcon.py", line 10, in <module>
    pipeline = transformers.pipeline(
               ^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/transformers/pipelines/__init__.py", line 788, in pipeline
    framework, model = infer_framework_load_model(
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/lib/python3.11/site-packages/transformers/pipelines/base.py", line 278, in infer_framework_load_model
    raise ValueError(f"Could not load model {model} with any of the following classes: {class_tuple}.")
ValueError: Could not load model tiiuae/falcon-7b-instruct with any of the following classes: (<class 'transformers.models.auto.modeling_auto.AutoModelForCausalLM'>,).

Here is my code:

from transformers import AutoTokenizer, AutoModelForCausalLM
import transformers
import torch

model = "tiiuae/falcon-7b-instruct"

tokenizer = AutoTokenizer.from_pretrained(model)
tokenizer.save_pretrained("./model/")

pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    torch_dtype=torch.float32,
    trust_remote_code=True,
    device_map="auto",
)
sequences = pipeline(
   "Girafatron is obsessed with giraffes, the most glorious animal on the face of this Earth. Giraftron believes all other animals are irrelevant when compared to the glorious majesty of the giraffe.\nDaniel: Hello, Girafatron!\nGirafatron:",
    max_length=200,
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")

Could this be an internet bandwidth issue?

@Narsil (Contributor) commented Jul 17, 2023:

ValueError: Could not load model tiiuae/falcon-7b-instruct with any of the following classes: (<class 'transformers.models.auto.modeling_auto.AutoModelForCausalLM'>,).

That's the issue, but I'm not sure what's happening.

Can you try:

model = "tiiuae/falcon-7b-instruct"

tokenizer = AutoTokenizer.from_pretrained(model)
tokenizer.save_pretrained("./model/")

model = AutoModelForCausalLM.from_pretrained(model=model, trust_remote_code=True, device_map="auto",torch_dtype=torch.float32,)

pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)

Try removing the torch_dtype=torch.float32 too; these models are meant to be used in half precision.
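
For example, a half-precision version of the same load would look like this (sketch; assumes accelerate is installed and there is enough memory for the model in half precision):

import torch
import transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-7b-instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    device_map="auto",
    torch_dtype=torch.bfloat16,  # half precision instead of float32
)

pipe = transformers.pipeline("text-generation", model=model, tokenizer=tokenizer)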
