
Issue in M-Bert-Base-ViT-B clip head linear layer size #9

Open

shreyajain4 opened this issue Jan 12, 2022 · 2 comments

shreyajain4 commented Jan 12, 2022

I tried the following piece of code present in the repo at location https://github.com/FreddeFrallan/Multilingual-CLIP/blob/main/src/multilingual_clip.py

The only change I made is that I added print statements in between.


```python
import pickle

import torch
import transformers

AVAILABLE_MODELS = {
    'M-BERT-Distil-40': {
        'model_name': 'M-CLIP/M-BERT-Distil-40',
        'tokenizer_name': 'M-CLIP/M-BERT-Distil-40',
        'head_name': 'M-BERT Distil 40 Linear Weights.pkl'
    },

    'M-BERT-Base-69': {
        'model_name': 'M-CLIP/M-BERT-Base-69',
        'tokenizer_name': 'M-CLIP/M-BERT-Base-69',
        'head_name': 'M-BERT-Base-69 Linear Weights.pkl'
    },

    'Swe-CLIP-500k': {
        'model_name': 'M-CLIP/Swedish-500k',
        'tokenizer_name': 'M-CLIP/Swedish-500k',
        'head_name': 'Swedish-500k Linear Weights.pkl'
    },

    'Swe-CLIP-2M': {
        'model_name': 'M-CLIP/Swedish-2M',
        'tokenizer_name': 'M-CLIP/Swedish-2M',
        'head_name': 'Swedish-2M Linear Weights.pkl'
    },

    'M-BERT-Base-ViT-B': {
        'model_name': 'M-CLIP/M-BERT-Base-ViT-B',
        'tokenizer_name': 'M-CLIP/M-BERT-Base-ViT-B',
        'head_name': 'M-BERT-Base-69-ViT Linear Weights.pkl'
    },
}


class MultilingualClip2(torch.nn.Module):
    def __init__(self, model_name, tokenizer_name, head_name, weights_dir='data/weights/'):
        super().__init__()
        self.model_name = model_name
        self.tokenizer_name = tokenizer_name
        self.head_path = weights_dir + head_name

        self.tokenizer = transformers.AutoTokenizer.from_pretrained(tokenizer_name)
        self.transformer = transformers.AutoModel.from_pretrained(model_name)
        self.clip_head = torch.nn.Linear(in_features=768, out_features=640)
        self._load_head()

    def forward(self, txt):
        # `device` is assumed to be defined elsewhere in the calling script
        txt_tok = self.tokenizer(txt, padding=True, return_tensors='pt').to(device)
        embs = self.transformer(**txt_tok)[0]
        print('embs_text')
        print(embs.size())
        att = txt_tok['attention_mask']
        print('att_text')
        print(att.size())
        # Mean-pool the token embeddings using the attention mask
        embs = (embs * att.unsqueeze(2)).sum(dim=1) / att.sum(dim=1)[:, None]
        print('embs_text')
        print(embs.size())
        p = self.clip_head(embs)
        print('clip head obj')
        print(self.clip_head)
        print('cliphed_text')
        print(p.size())
        return p

    def _load_head(self):
        with open(self.head_path, 'rb') as f:
            lin_weights = pickle.loads(f.read())
        self.clip_head.weight = torch.nn.Parameter(torch.tensor(lin_weights[0]).float().t())
        self.clip_head.bias = torch.nn.Parameter(torch.tensor(lin_weights[1]).float())
        print('ok')
        print(self.clip_head.weight.size())
        print(self.clip_head.bias.size())


def load_model2(name):
    config = AVAILABLE_MODELS[name]
    return MultilingualClip2(**config)


mod = load_model2('M-BERT-Base-ViT-B')
z = mod(Query[0])  # `Query` is defined elsewhere in the calling script
```

Output for this code:

```
ok
torch.Size([512, 768])
torch.Size([512])
embs_text
torch.Size([1, 6, 768])
att_text
torch.Size([1, 6])
embs_text
torch.Size([1, 768])
clip head obj
Linear(in_features=768, out_features=640, bias=True)
cliphed_text
torch.Size([1, 512])
```


This output suggests that the weight matrix in 'M-BERT-Base-69-ViT Linear Weights.pkl' does not have size 640 × 768 but rather 512 × 768.

Is there an issue with the config then?
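
For reference, the head shapes can be checked without building the whole model. A minimal sketch, assuming the weight file has already been downloaded to `data/weights/` as in the code above:

```python
import pickle

import torch

# Hypothetical sanity check of the pickled linear head, mirroring _load_head()
head_path = 'data/weights/M-BERT-Base-69-ViT Linear Weights.pkl'

with open(head_path, 'rb') as f:
    lin_weights = pickle.load(f)

weight = torch.tensor(lin_weights[0]).float().t()  # same transpose as in _load_head()
bias = torch.tensor(lin_weights[1]).float()
print(weight.size())  # reportedly torch.Size([512, 768]), not [640, 768]
print(bias.size())    # reportedly torch.Size([512])
```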

shreyajain4 (Author) commented
@FreddeFrallan please have a look


ozmig77 commented Jan 21, 2022

Hello, I looked into this issue too.

I think the issue is related to the CLIP embedding size: 512 for the ViT backbone and 640 for the ResNet one. Since M-BERT-Base-69-ViT uses CLIP ViT, 512 seems right.

However, I think out_features should be included in the configuration to prevent misunderstanding.
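
As an illustration only (not the repository's actual code; the `out_features` key below is hypothetical), each config entry could carry the head size and the class could use it instead of the hard-coded 640:

```python
import torch

# Illustrative config: 'out_features' is an assumed extra key, not part of the
# current AVAILABLE_MODELS dictionary in multilingual_clip.py.
AVAILABLE_MODELS = {
    'M-BERT-Base-ViT-B': {
        'model_name': 'M-CLIP/M-BERT-Base-ViT-B',
        'tokenizer_name': 'M-CLIP/M-BERT-Base-ViT-B',
        'head_name': 'M-BERT-Base-69-ViT Linear Weights.pkl',
        'out_features': 512,  # CLIP ViT embedding size
    },
    'M-BERT-Distil-40': {
        'model_name': 'M-CLIP/M-BERT-Distil-40',
        'tokenizer_name': 'M-CLIP/M-BERT-Distil-40',
        'head_name': 'M-BERT Distil 40 Linear Weights.pkl',
        'out_features': 640,  # matches the original hard-coded default
    },
}


class MultilingualClip(torch.nn.Module):
    def __init__(self, model_name, tokenizer_name, head_name,
                 weights_dir='data/weights/', out_features=640):
        super().__init__()
        # Build the head with the configured size instead of a hard-coded 640,
        # so its repr matches the weights loaded afterwards by _load_head().
        self.clip_head = torch.nn.Linear(in_features=768, out_features=out_features)
        # ... tokenizer/transformer setup and _load_head() as in the original class
```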
