Value Error :"Using pad_token, but it is not set yet." While using GPT-Neo model from Hugging Face #1418

gousemd73 · 2022-02-11T13:37:09Z

Hi,
I am trying to use the GPT-Neo model from the Hugging Face library to generate sentence embeddings with the Sentence Transformers library.

from sentence_transformers import SentenceTransformer
gpt = SentenceTransformer('EleutherAI/gpt-neo-1.3B')
embeddings = gpt.encode(['This is example of using GPT'])

The above code for generating sentence embeddings gives the following error:
ValueError: Asking to pad but the tokenizer does not have a padding token. Please select a token to use as pad_token (tokenizer.pad_token = tokenizer.eos_token e.g.) or add a new pad token via `tokenizer.add_special_tokens({'pad_token': '[PAD]'})

As far as I can tell, there is no way to set the required special token on the tokenizer through the sentence-transformers library. Can anyone please help me with this error?

Environment:
Python - 3.8
sentence-transformers - 2.0.0
transformers - 4.11.1

nreimers · 2022-02-11T13:43:57Z

This model is not supported. I also don't think it will work well. Encoder models like bert/roberta/mpnet-base work better.

gousemd73 · 2022-02-11T13:46:29Z

Is there a chance of including this model in the sentence-transformers package, so that we can generate sentence embeddings as easily as with encoder models?

nreimers · 2022-02-11T13:55:39Z

Hmm, not sure if it will be easy. As it lacks a padding token, it is hard to use it in a batched fashion.
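
To illustrate why batching depends on a padding token, here is a minimal pure-Python sketch of what padding does: shorter sequences are filled up to the longest one with a pad id, and an attention mask records which positions are real. The helper `pad_batch` and the token ids are hypothetical, and the real tokenizer does this internally; the value 50256 is assumed here as the end-of-text id reused for padding.

```python
from typing import List, Tuple


def pad_batch(
    sequences: List[List[int]], pad_id: int
) -> Tuple[List[List[int]], List[List[int]]]:
    """Right-pad token-id sequences to a common length and build attention masks."""
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        # Fill the tail with the pad id so every row has the same length.
        input_ids.append(seq + [pad_id] * n_pad)
        # 1 marks a real token, 0 marks a padded position.
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return input_ids, attention_mask


# Hypothetical token ids for two sentences of different length.
batch = [[11, 12, 13, 14], [21, 22]]
EOS_ID = 50256  # assumed end-of-text id, reused as the pad id

input_ids, attention_mask = pad_batch(batch, pad_id=EOS_ID)
print(input_ids)
print(attention_mask)
```

Without a pad id there is nothing to fill the shorter rows with, which is what the ValueError above is complaining about.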

gousemd73 · 2022-02-15T15:11:06Z

The issue was solved by setting the tokenizer's padding token before generating embeddings.
Modified code:

from sentence_transformers import SentenceTransformer
gpt = SentenceTransformer('EleutherAI/gpt-neo-1.3B')
gpt.tokenizer.pad_token = gpt.tokenizer.eos_token
embeddings = gpt.encode(['This is example of using GPT'])
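
A rough sketch of why reusing the end-of-text token as padding can be acceptable, assuming the sentence embedding is a mean over token vectors weighted by the attention mask (an assumption about the pooling setup, not something confirmed in this thread): padded positions get mask 0 and drop out of the average. The function `masked_mean_pool` and the vectors below are hypothetical.

```python
from typing import List


def masked_mean_pool(
    token_embeddings: List[List[float]], attention_mask: List[int]
) -> List[float]:
    """Average token vectors, skipping positions where the mask is 0."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    count = max(count, 1)  # avoid division by zero for an all-padding row
    return [t / count for t in total]


# Two real tokens followed by one padded position with an arbitrary vector.
token_vectors = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]

pooled = masked_mean_pool(token_vectors, mask)
print(pooled)  # the padded vector does not influence the result
```

Whether embeddings from a decoder-only model are useful for a given task is a separate question, as noted earlier in the thread.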

Thanks @nreimers for your quick reply and help.

gousemd73 closed this as completed Feb 15, 2022
