
Using microsoft/BioGPT for sentence similarity tasks #1824

Closed
nleroy917 opened this issue Feb 7, 2023 · 2 comments
Comments

@nleroy917

Am I able to pull in Microsoft's new BioGPT model from Hugging Face for sentence similarity tasks?

I've run the following:

from sentence_transformers import SentenceTransformer
model = SentenceTransformer("microsoft/biogpt")
model.encode("K562 cells")

And it does work. But I haven't done any robust testing to see if it's producing meaningful results... Before I go down this path, is this valid? Can I pull in any of these models from Hugging Face? I do get this warning when instantiating the model:

No sentence-transformers model found with name ~/.cache/torch/sentence_transformers/microsoft_biogpt. Creating a new one with MEAN pooling.
Some weights of the model checkpoint at ~/.cache/torch/sentence_transformers/microsoft_biogpt were not used when initializing BioGptModel: ['output_projection.weight']
- This IS expected if you are initializing BioGptModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BioGptModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).

But it seems like I fall into the "IS expected" camp, since I am instantiating a SentenceTransformer from a BertForPreTraining-style checkpoint (I think). Is this correct?

@sadaqabdo

Yes, you can pull in a Hugging Face model for sentence similarity tasks.
And also yes, the warning is expected: you are not initializing a sentence-transformers model but a plain BioGptModel, which has no sentence-transformers config as of now, so SentenceTransformer creates a mean pooling model on top of the transformer model:

logger.warning("No sentence-transformers model found with name {}. Creating a new one with MEAN pooling.".format(model_name_or_path))
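You can confirm what the fallback built by printing the model (a quick check; the exact repr varies across sentence-transformers versions):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("microsoft/biogpt")
print(model)
# Expect something like:
#   SentenceTransformer(
#     (0): Transformer({...}) with Transformer model: BioGptModel
#     (1): Pooling({..., 'pooling_mode_mean_tokens': True, ...})
#   )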

However, you can avoid the warning by simply creating the mean pooling model yourself, as follows:

import torch
from sentence_transformers import SentenceTransformer, models

# Load the raw transformer; pick a max_seq_length that fits your inputs.
max_seq_length = 256
word_embedding_model = models.Transformer("microsoft/biogpt", max_seq_length=max_seq_length)

# Mean-pool the token embeddings into a single sentence embedding.
pooling_model = models.Pooling(word_embedding_model.get_word_embedding_dimension(), pooling_mode='mean')
model = SentenceTransformer(modules=[word_embedding_model, pooling_model])

sentence1 = torch.Tensor(model.encode("the patient has no fever"))
sentence2 = torch.Tensor(model.encode("They report symptoms of hypertension"))
torch.nn.functional.cosine_similarity(sentence1, sentence2, dim=0)
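To sanity-check that the embeddings carry meaning, you can also compare a related pair against an unrelated one with the built-in util.cos_sim helper (a minimal sketch; the sentences are just illustrative, and only the relative scores matter):

from sentence_transformers import SentenceTransformer, util

# Uses the mean pooling fallback; you can also reuse the model built above.
model = SentenceTransformer("microsoft/biogpt")

related = model.encode(["K562 cells", "chronic myelogenous leukemia cell line"])
unrelated = model.encode(["K562 cells", "the weather is nice today"])

print(util.cos_sim(related[0], related[1]))      # expect this score to be higher...
print(util.cos_sim(unrelated[0], unrelated[1]))  # ...than this one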

@nleroy917
Author

This is great info! Thanks.
