BERT output is a tuple (in LAMA) #24

Closed
olenmg opened this issue Jul 7, 2021 · 3 comments
Comments


olenmg commented Jul 7, 2021

Hi, thanks for the great codebase!

In the bert_out method of the PTuneForLAMA class:

# LAMA/p_tuning/modeling.py (line 124 onward)
def bert_out():
    label_mask = (queries == self.tokenizer.mask_token_id).nonzero().reshape(bz, -1)[:, 1].unsqueeze(
        1).to(self.device)  # bz * 1
    labels = torch.empty_like(queries).fill_(-100).long().to(self.device)  # bz * seq_len
    labels = labels.scatter_(1, label_mask, label_ids)
    output = self.model(inputs_embeds=inputs_embeds.to(self.device),
                        attention_mask=attention_mask.to(self.device).bool(),
                        labels=labels.to(self.device))
    loss, logits = output.loss, output.logits

The output object has no loss or logits attributes, since it is a tuple.

I think it should be changed as below:

def bert_out():
    label_mask = (queries == self.tokenizer.mask_token_id).nonzero().reshape(bz, -1)[:, 1].unsqueeze(
        1).to(self.device)  # bz * 1
    labels = torch.empty_like(queries).fill_(-100).long().to(self.device)  # bz * seq_len
    labels = labels.scatter_(1, label_mask, label_ids)
    loss, logits = self.model(inputs_embeds=inputs_embeds.to(self.device),
                              attention_mask=attention_mask.to(self.device).bool(),
                              labels=labels.to(self.device))

I checked that this code works fine on my machine.
Thank you again.


Edit (07.08):
gpt_out() also has the same issue:

loss, logits, _ = self.model(inputs_embeds=inputs_embeds.to(self.device).half(),
                             attention_mask=attention_mask.to(self.device).half(),
                             labels=labels.to(self.device))

With a newer Hugging Face Transformers version, this can also be solved by passing return_dict=True to the model call.
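
For example, a minimal sketch of the bert_out call with return_dict=True (assuming a Transformers release that supports this argument; variable names follow the snippets above):

# Minimal sketch: request a ModelOutput explicitly so that .loss and .logits exist.
output = self.model(inputs_embeds=inputs_embeds.to(self.device),
                    attention_mask=attention_mask.to(self.device).bool(),
                    labels=labels.to(self.device),
                    return_dict=True)  # returns a ModelOutput instead of a tuple
loss, logits = output.loss, output.logits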

Xiao9905 (Member) commented

Thanks for your report. I think this is a package version problem with Hugging Face Transformers; the code should work fine with versions 4.3.0 and later.
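
For readers who cannot upgrade, here is a minimal, version-agnostic sketch (an illustration, not code from the repository) that unpacks either return type:

# Version-agnostic sketch: older Transformers returns a tuple,
# newer versions return a ModelOutput when return_dict is enabled.
output = self.model(inputs_embeds=inputs_embeds.to(self.device),
                    attention_mask=attention_mask.to(self.device).bool(),
                    labels=labels.to(self.device))
if isinstance(output, tuple):
    loss, logits = output[0], output[1]        # e.g. (loss, prediction_scores, ...)
else:
    loss, logits = output.loss, output.logits  # e.g. MaskedLMOutput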

Xiao9905 (Member) commented

Thanks again for your careful checking! I think this will help other users run the code.

olenmg (Author) commented Jul 11, 2021

I raised this issue because the Transformers version pinned in requirements.txt is 3.0.2.
Thanks for answering.
