
Why is the openai/clip-vit-base-patch32 model not supported? #662

Closed
RobinYang11 opened this issue Feb 10, 2024 · 2 comments

RobinYang11 commented Feb 10, 2024

eland_import_hub_model --url  https://elastic:q=MYL8TsnSVUhmlwOIWa@localhost:9200 \
 --hub-model-id  openai/clip-vit-base-patch32   \
 --task-type text_embedding  \
 --ca-certs /Users/robinyang/http_ca.crt  \
  --start

2024-02-10 16:04:17,677 INFO : Establishing connection to Elasticsearch
2024-02-10 16:04:17,728 INFO : Connected to cluster named 'docker-cluster' (version: 8.12.0)
2024-02-10 16:04:17,729 INFO : Loading HuggingFace transformer tokenizer and model 'openai/clip-vit-base-patch32'
Traceback (most recent call last):
  File "/Users/robinyang/Library/Python/3.9/bin/eland_import_hub_model", line 8, in <module>
    sys.exit(main())
  File "/Users/robinyang/Library/Python/3.9/lib/python/site-packages/eland/cli/eland_import_hub_model.py", line 254, in main
    tm = TransformerModel(
  File "/Users/robinyang/Library/Python/3.9/lib/python/site-packages/eland/ml/pytorch/transformers.py", line 649, in __init__
    raise TypeError(
TypeError: Tokenizer type CLIPTokenizer(name_or_path='openai/clip-vit-base-patch32', vocab_size=49408, model_max_length=77, is_fast=False, padding_side='right', truncation_side='right', special_tokens={'bos_token': AddedToken("<|startoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=True), 'eos_token': AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=True), 'unk_token': AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=True), 'pad_token': '<|endoftext|>'}, clean_up_tokenization_spaces=True) not supported, must be one of: <class 'transformers.models.bart.tokenization_bart.BartTokenizer'>, <class 'transformers.models.bert.tokenization_bert.BertTokenizer'>, <class 'transformers.models.bert_japanese.tokenization_bert_japanese.BertJapaneseTokenizer'>, <class 'transformers.models.deprecated.retribert.tokenization_retribert.RetriBertTokenizer'>, <class 'transformers.models.distilbert.tokenization_distilbert.DistilBertTokenizer'>, <class 'transformers.models.dpr.tokenization_dpr.DPRContextEncoderTokenizer'>, <class 'transformers.models.dpr.tokenization_dpr.DPRQuestionEncoderTokenizer'>, <class 'transformers.models.electra.tokenization_electra.ElectraTokenizer'>, <class 'transformers.models.mobilebert.tokenization_mobilebert.MobileBertTokenizer'>, <class 'transformers.models.mpnet.tokenization_mpnet.MPNetTokenizer'>, <class 'transformers.models.roberta.tokenization_roberta.RobertaTokenizer'>, <class 'transformers.models.squeezebert.tokenization_squeezebert.SqueezeBertTokenizer'>, <class 'transformers.models.xlm_roberta.tokenization_xlm_roberta.XLMRobertaTokenizer'>

pquentin (Member) commented

Hello! This was already reported in #546. Closing this issue as a duplicate. Thank you.

pquentin closed this as not planned (duplicate) on Feb 12, 2024
davidkyle (Member) commented

Cross-posting from #546 (comment) for visibility:

CLIP consists of two models: an image-encoder model and a text-embedding model. Elastic does not support image processing models, but if you want to use the text-embedding model you can import the Sentence Transformers implementation: https://huggingface.co/sentence-transformers/clip-ViT-B-32-multilingual-v1
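The text half of that model can also be exercised locally before importing it; below is a minimal sketch, assuming the sentence-transformers package is installed (the input sentences are arbitrary examples, not from this thread):

# Minimal local check of the text-embedding model only, i.e. the part Elastic can import.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/clip-ViT-B-32-multilingual-v1")
embeddings = model.encode(["a photo of a cat", "una foto de un gato"])
print(embeddings.shape)  # expected (2, 512) for this ViT-B/32 text encoder

The eland import command for this model is then: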

eland_import_hub_model --url  https://elastic:q=MYL8TsnSVUhmlwOIWa@localhost:9200 \
 --hub-model-id  sentence-transformers/clip-ViT-B-32-multilingual-v1   \
 --task-type text_embedding  \
 --ca-certs /Users/robinyang/http_ca.crt  \
  --start
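Once the import and deployment succeed, the model can be queried through the Elasticsearch inference API. The following is a rough sketch using the official Python client; the password, CA path, and model id are assumptions carried over from the commands above (eland typically derives the model id by lowercasing the hub id and replacing "/" with "__"), so adjust them to your deployment:

# Sketch of calling the deployed text_embedding model via the trained-model inference API.
# The credentials, CA path, and model id below are placeholders taken from this thread;
# verify the actual model id with GET _ml/trained_models.
from elasticsearch import Elasticsearch

es = Elasticsearch(
    "https://localhost:9200",
    basic_auth=("elastic", "q=MYL8TsnSVUhmlwOIWa"),
    ca_certs="/Users/robinyang/http_ca.crt",
)

resp = es.ml.infer_trained_model(
    model_id="sentence-transformers__clip-vit-b-32-multilingual-v1",
    docs=[{"text_field": "a photo of a cat"}],
)
print(resp["inference_results"][0]["predicted_value"])  # the text embedding vector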
