
Error when loading embedding pre-trained model #143

Open
pdhung3012 opened this issue Aug 25, 2023 · 5 comments
@pdhung3012

Hello. I tried this simple code snippet to get the embedding from a pre-trained CodeT5+ model:

```python
from transformers import AutoModel, AutoTokenizer

checkpoint = "/home/hungphd/media/git/codet5p-110m-embedding"
device = "cuda"  # for GPU usage, or "cpu" for CPU usage

tokenizer = AutoTokenizer.from_pretrained(checkpoint, trust_remote_code=True)
model = AutoModel.from_pretrained(checkpoint, trust_remote_code=True).to(device)

inputs = tokenizer.encode("def print_hello_world():\tprint('Hello World!')", return_tensors="pt").to(device)
embedding = model(inputs)[0]
print(f'Dimension of the embedding: {embedding.size()[0]}, with norm={embedding.norm().item()}')
```

However, I got this error:

```
Traceback (most recent call last):
  File "/home/hungphd/media/git/CodeT5/CodeT5+/code_retrieval/examplePretrainedModel.py", line 8, in <module>
    model = AutoModel.from_pretrained(checkpoint, trust_remote_code=True).to(device)
  File "/home/hungphd/anaconda3/envs/py38v2/lib/python3.8/site-packages/transformers/models/auto/auto_factory.py", line 396, in from_pretrained
    config, kwargs = AutoConfig.from_pretrained(
  File "/home/hungphd/anaconda3/envs/py38v2/lib/python3.8/site-packages/transformers/models/auto/configuration_auto.py", line 529, in from_pretrained
    config_class = CONFIG_MAPPING[config_dict["model_type"]]
  File "/home/hungphd/anaconda3/envs/py38v2/lib/python3.8/site-packages/transformers/models/auto/configuration_auto.py", line 278, in __getitem__
    raise KeyError(key)
KeyError: 'codet5p_embedding'
```
I downloaded the pre-trained model from Hugging Face. The folder of the pre-trained model looks like this:

https://drive.google.com/file/d/1CdLv5GyNFeIPPufLcUS-5TT53W_4fes4/view?usp=drive_link

Did I do it correctly?

@pdhung3012
Author

Hello. After checking the error, I found that the cause is that codet5p_embedding is not a key defined in this file:
https://github.com/huggingface/transformers/blob/main/src/transformers/models/auto/configuration_auto.py

I checked the latest version on GitHub and they don't define that key. Is that key specific to your machine (or did you modify configuration_auto.py compared to the released version)?

@yuewang-cuhk
Contributor

Hello, are you able to run the example script here? I've double-checked it and can run it successfully.

I guess the error you faced is because you loaded the model from a local folder instead of from the remote Hugging Face Hub. To load a local model checkpoint, you'll have to download the class and config files. Below is example code to load the local model:

```python
from modeling_codet5p_embedding import CodeT5pEmbeddingModel
model = CodeT5pEmbeddingModel.from_pretrained(checkpoint)
```
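As a side note for anyone loading from a local folder: the checkpoint directory needs the custom modeling file alongside the usual config and weights for `trust_remote_code=True` to resolve the `codet5p_embedding` model type. A minimal sketch to sanity-check a downloaded folder before loading (the `missing_files` helper is hypothetical, and the exact file list is an assumption based on common Hugging Face checkpoint layouts):

```python
from pathlib import Path

def missing_files(checkpoint_dir: str) -> list:
    """Return the expected checkpoint files that are absent from the folder."""
    # config.json carries the model_type, and the custom modeling file defines
    # the class that trust_remote_code=True loads; pytorch_model.bin holds weights.
    expected = ["config.json", "modeling_codet5p_embedding.py", "pytorch_model.bin"]
    return [name for name in expected if not (Path(checkpoint_dir) / name).exists()]

# e.g. missing_files("/home/hungphd/media/git/codet5p-110m-embedding")
# an empty list means all expected files are present
```

If the modeling file is missing, downloading it from the model's Hub page next to config.json is usually enough.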

@pdhung3012
Author

Thank you for your help. Yes, at first I ran the script on my local Ubuntu 22.04 server (where the code called the remote Hugging Face Hub model "Salesforce/codet5p-110m-embedding"), but it returned the same error. So I downloaded the model to my local machine and loaded it from there, but the error remained.
Let me try again.

@pdhung3012
Author

I tried to load it remotely from the Hugging Face Hub. It shows this error:

```
OSError: Can't load 'Salesforce/codet5p-110m-embedding'. Make sure that:

  • 'Salesforce/codet5p-110m-embedding' is a correct model identifier listed on 'https://huggingface.co/models'

  • or 'Salesforce/codet5p-110m-embedding' is the correct path to a directory containing a 'config.json' file
```

@pdhung3012
Author

I checked the version of my transformers library. I was using an old version, 4.11.3, while the current version is 4.32.1. I reinstalled transformers and it worked. Thanks for your help.
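For anyone hitting the same KeyError: models with custom code need a transformers version recent enough to support `trust_remote_code` for that architecture, and 4.11.3 predates it. A dependency-free sketch of the version comparison at play (the `parse_version` helper is hypothetical, written just for illustration; in practice `pip install -U transformers` and checking `transformers.__version__` is the quick fix the author applied):

```python
def parse_version(v: str) -> tuple:
    """Convert a version string like '4.11.3' into an integer tuple for comparison."""
    return tuple(int(part) for part in v.split(".")[:3])

# The version that failed vs. the version that worked in this thread.
failing, working = parse_version("4.11.3"), parse_version("4.32.1")
print(failing < working)  # True: the installed library was well behind the release that works
```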
