Fix download model #490

Merged
merged 6 commits into from Nov 13, 2023

maxjeblick
Contributor

Make Download Model functionality robust against various model backbones/tokenizers.

  • Fetch tokenizer files from tokenizer.save_pretrained
  • Monitor the experiment folder and add files that were modified after the backbone/tokenizer has been saved
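
The second bullet can be illustrated with a minimal sketch. This is not the code from this PR: the helper name `files_modified_since`, the file names, and the fixed timestamps are all hypothetical, and the sketch only assumes that "modified after saving" is decided by comparing file modification times against a recorded start time.

```python
import os
import tempfile
from pathlib import Path


def files_modified_since(folder: str, start_time: float) -> list[str]:
    """Return relative paths of files under `folder` with mtime >= `start_time`."""
    modified = []
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file() and path.stat().st_mtime >= start_time:
            modified.append(str(path.relative_to(folder)))
    return modified


# Demo with explicit, made-up timestamps so the result does not depend on clock resolution.
with tempfile.TemporaryDirectory() as experiment_dir:
    # A file that existed before saving started (hypothetical name).
    old_file = Path(experiment_dir) / "checkpoint.pth"
    old_file.write_text("weights")
    os.utime(old_file, (1_000, 1_000))

    # Moment at which saving the backbone/tokenizer is assumed to start.
    save_started = 2_000.0

    # A file written during saving (hypothetical name).
    new_file = Path(experiment_dir) / "tokenizer.json"
    new_file.write_text("{}")
    os.utime(new_file, (3_000, 3_000))

    picked = files_modified_since(experiment_dir, save_started)

print(picked)
```

In real use, `save_started` would presumably be taken with `time.time()` right before the save calls. The appeal of this approach is that it does not need a hard-coded list of file names per backbone or tokenizer.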

Fixes #489

Collaborator

@pascal-pfeiffer pascal-pfeiffer left a comment

classification_head.pth seems to be missing in the download, too, doesn't it?

@maxjeblick
Contributor Author

Yes, you are completely right. I fixed it.

Collaborator

@pascal-pfeiffer pascal-pfeiffer left a comment

lgtm!

I think the model_card.md doesn't account for the option of having multiple classes and always assumes a (1, 1) linear layer, so it is actually also wrong about the input shape. But this is unrelated to this PR.

# settings can be arbitrary here as we overwrite with saved weights
head = torch.nn.Linear(1, 1, bias=False).to("cuda")
head.weight.data = head_weights

inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to("cuda")

out = model(**inputs).logits

logits = head(out[:,-1])

@maxjeblick
Contributor Author

I think the model_card.md doesn't account for the option of having multiple classes and always assumes a (1, 1) linear layer, so it is actually also wrong about the input shape. But this is unrelated to this PR.

Yeah, I was also confused, but it is actually OK: assigning to head.weight.data replaces the weight tensor, including its shape, so the (1, 1) in the constructor does not matter. See also the comment in the md file.

@maxjeblick maxjeblick merged commit 751ba79 into main Nov 13, 2023
5 checks passed
@maxjeblick maxjeblick deleted the max/fix_download_model branch November 13, 2023 16:20
Successfully merging this pull request may close these issues.

[BUG] Downloaded model may miss files
2 participants