Support for TaylorAI/gte-tiny #680

Closed
Shifter2600 opened this issue Mar 23, 2024 · 1 comment

Comments

@Shifter2600

I receive the following error when loading the model with eland_import_hub_model:

Downloading: 100%|██████████████████████████████████████████████████████████████████████| 1.50k/1.50k [00:00<00:00, 980kB/s]
Downloading: 100%|███████████████████████████████████████████████████████████████████████| 226k/226k [00:00<00:00, 1.97MB/s]
Downloading: 100%|███████████████████████████████████████████████████████████████████████| 82.0/82.0 [00:00<00:00, 92.6kB/s]
Downloading: 100%|██████████████████████████████████████████████████████████████████████████| 228/228 [00:00<00:00, 111kB/s]
Traceback (most recent call last):
  File "/usr/local/bin/eland_import_hub_model", line 197, in <module>
    tm = TransformerModel(args.hub_model_id, args.task_type, args.quantize)
  File "/usr/local/lib/python3.9/dist-packages/eland/ml/pytorch/transformers.py", line 567, in __init__
    self._tokenizer = transformers.AutoTokenizer.from_pretrained(
  File "/usr/local/lib/python3.9/dist-packages/transformers/models/auto/tokenization_auto.py", line 579, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/usr/local/lib/python3.9/dist-packages/transformers/tokenization_utils_base.py", line 1783, in from_pretrained
    return cls._from_pretrained(
  File "/usr/local/lib/python3.9/dist-packages/transformers/tokenization_utils_base.py", line 1984, in _from_pretrained
    raise ValueError(
ValueError: Non-consecutive added token '[PAD]' found. Should have index 30522 but has index 0 in saved vocabulary.
@davidkyle
Member

I was able to install this model using the 8.14 Docker image:

docker pull docker.elastic.co/eland/eland:8.14.0

I then installed the model with:

docker run -it --rm docker.elastic.co/eland/eland:8.14.0 \
    eland_import_hub_model \
      --cloud-id $CLOUD_ID \
      -u elastic -p $CLOUD_PWD \
      --hub-model-id 'TaylorAI/gte-tiny' \
      --task-type text_embedding

Closing this issue, as the error comes from the Transformers library and appears to be fixed now.
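
For readers who hit the same message, here is a minimal sketch of the kind of consistency check the error text describes: each added token is expected to take the next free index after the base vocabulary. This is an illustration only, not the Transformers implementation. The function name `check_added_tokens` and its arguments are hypothetical, and the base vocabulary size of 30522 is taken from the error message above.

```python
def check_added_tokens(base_vocab_size, added_tokens):
    """Hypothetical helper: verify that added tokens continue the index range.

    base_vocab_size: number of tokens in the base vocabulary.
    added_tokens: mapping of token string to its saved index.
    Raises ValueError on the first token whose index is not the next free one.
    """
    expected = base_vocab_size
    # Walk the added tokens in order of their saved index.
    for token, index in sorted(added_tokens.items(), key=lambda kv: kv[1]):
        if index != expected:
            raise ValueError(
                f"Non-consecutive added token '{token}' found. "
                f"Should have index {expected} but has index {index} "
                "in saved vocabulary."
            )
        expected += 1


if __name__ == "__main__":
    # '[PAD]' saved with index 0 while the base vocabulary has 30522 entries,
    # which mirrors the situation reported in the traceback.
    try:
        check_added_tokens(30522, {"[PAD]": 0})
    except ValueError as err:
        print(err)
```

Under this reading, a saved tokenizer that lists `[PAD]` as an added token with index 0 fails the check, which is consistent with the problem going away once the tokenizer loading code changed.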
