Add get_logits method and NLLB tokenizer #756
Conversation
@visheratin can we do this without running Black on it and changing more lines than are being added?

Also, it should probably be rebased against latest.

@rwightman Left only my changes and rebased.
    image_logits += self.logit_bias
    text_logits = image_logits.T
    return image_logits, text_logits
This should be:

    def get_logits(self, image, text):
        image_features = self.encode_image(image, normalize=True)
        text_features = self.encode_text(text, normalize=True)
        image_logits = self.logit_scale.exp() * image_features @ text_features.T
        if self.logit_bias is not None:
            image_logits += self.logit_bias
        text_logits = image_logits.T
        return image_logits, text_logits
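The computation above can be sketched in dependency-free Python (an illustrative sketch, not the actual open_clip implementation; feature vectors are assumed already normalized, and `logit_scale` is stored as a log value and exponentiated, matching the snippet above):

```python
import math

def get_logits(image_features, text_features, logit_scale, logit_bias=None):
    """Illustrative sketch of the logits computation discussed above.

    image_features / text_features: lists of already-normalized vectors.
    logit_scale is stored as a log value, so it is exponentiated first.
    """
    scale = math.exp(logit_scale)
    bias = logit_bias if logit_bias is not None else 0.0
    # image_logits[i][j] = scale * <image_i, text_j> + bias
    image_logits = [
        [scale * sum(a * b for a, b in zip(img, txt)) + bias
         for txt in text_features]
        for img in image_features
    ]
    # text_logits is simply the transpose of image_logits
    text_logits = [list(row) for row in zip(*image_logits)]
    return image_logits, text_logits
```

With orthonormal image features and a single text feature, each image's logit is just the scaled dot product, plus the optional bias.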
My bad. Fixed.
src/open_clip/factory.py (outdated)

    @@ -111,11 +110,18 @@ def get_tokenizer(
        context_length = text_config.get('context_length', DEFAULT_CONTEXT_LENGTH)

        if 'hf_tokenizer_name' in text_config:
            tokenizer = HFTokenizer(
        if model_name.startswith("nllb"):
            tokenizer = NLLBTokenizer(
really not a fan of having a model-name-based hack

So, for the tokenizer, I'm not convinced it warrants a new tokenizer and the associated maintenance. Isn't it pretty standard for multilingual use to manually insert the language token per text? Are there any popular implementations which do it this way?

On `get_logits`: it's useful to have, but I'll point out that this won't work with torch.compile or FSDP, which only wrap forward() methods.
Regarding the tokenizer: when making an additional tokenizer, I tried to look at the problem from the end-user perspective.
Regarding `get_logits`: an example of such usage can be found in the roboflow/supervision library.
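For context, the typical downstream zero-shot step takes one image's row of logits and turns it into class probabilities with a softmax. A minimal sketch (illustrative only; `zero_shot_probs` is a hypothetical helper, not supervision's or open_clip's code):

```python
import math

def zero_shot_probs(image_logits_row):
    """Softmax over one image's class logits (hypothetical helper).

    Subtracting the max first is the standard numerically stable form.
    """
    m = max(image_logits_row)
    exps = [math.exp(x - m) for x in image_logits_row]
    total = sum(exps)
    return [e / total for e in exps]
```

Exposing a single `get_logits` means a downstream library only needs this one post-processing step instead of re-implementing the encode/matmul/scale/bias chain.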
@visheratin rather than make a whole new tokenizer for this with a different tokenize interface, couldn't we pass through the set_src_lang methods and/or src-lang init kwargs? Assert/report an error if it's called on an underlying HF tokenizer that doesn't have it?
Passing through the set_src_lang method is an option. Maybe multilingual input texts are too much of an edge case. If you think so, I can remove it.
Isn't this how the HF tokenizers work, though? I think they have some sort of specific tokenization method with src/target lang as args, but usually it's either set on construction of the tokenizer or via the set method, no?
In the case of NLLB, the language token is controlled via the `set_src_lang_special_tokens` method. I removed the separate `NLLBTokenizer` class.
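The interface being discussed can be sketched with a toy stand-in (names modeled loosely on the HF NLLB tokenizer; `StubNLLBTokenizer` and `tokenize_multilingual` are hypothetical, not open_clip code). The source language is tokenizer state, set once on construction or via a set method, and per-text languages are handled by switching that state:

```python
class StubNLLBTokenizer:
    """Toy stand-in for an NLLB-style tokenizer (hypothetical).

    The source-language token is state on the tokenizer rather than a
    per-call argument, mirroring the pattern described in the thread.
    """
    def __init__(self, src_lang="eng_Latn"):
        self.src_lang = src_lang

    def set_src_lang_special_tokens(self, src_lang):
        self.src_lang = src_lang

    def __call__(self, texts):
        # Prepend the current language token, as NLLB tokenizers do.
        return [f"<{self.src_lang}> {t}" for t in texts]

def tokenize_multilingual(tokenizer, texts, langs):
    """Hypothetical per-text language handling via the set method:
    switch the tokenizer's language state before tokenizing each text."""
    out = []
    for text, lang in zip(texts, langs):
        tokenizer.set_src_lang_special_tokens(lang)
        out.extend(tokenizer([text]))
    return out
```

The trade-off raised in the review is visible here: the wrapper achieves per-text languages without a new tokenizer class, at the cost of mutating tokenizer state in a loop.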
* Get logits method and set_language for tokenizer.
Hi!

I want to make OpenCLIP more usable for downstream applications, like zero-shot classification. Right now, to get the logits, the user has to call `encode_image` and `encode_text`, then matmul them, multiply the result by `logit_scale`, and optionally add `logit_bias`. I think it makes sense to have one method to get logits, as in OpenAI and HuggingFace. So I added the `get_logits` method to both the `CLIP` and `CustomTextCLIP` classes.

I also added the `NLLBTokenizer` class, which has an additional `langs` parameter in the `__call__` method. This is needed because the tokenizer for NLLB models adds a language token to the beginning of the sequence. The token that is added is controlled via the `set_src_lang_special_tokens` method. If the language is not set via this method, the tokenizer will add an English token to all sequences.

The PR also contains some formatting changes performed by Ruff. For me personally, the formatted code looks nicer, but if you don't like it, I can roll it back.
@gabrielilharco @rwightman