Embeddings with StableLM? #20

Open
enjalot opened this issue Apr 19, 2023 · 5 comments

Comments

@enjalot

enjalot commented Apr 19, 2023

Is it possible to get embeddings from the model for my input text?

I.e., could I replace GPT-3 calls to OpenAI with some Python code and this model?

@sirwalt

sirwalt commented Apr 19, 2023

I would recommend taking a look at https://www.sbert.net/. To the best of my knowledge, the OpenAI models are not outstanding for embeddings (see https://huggingface.co/spaces/mteb/leaderboard), but their API is convenient to use, at least for us.

@lingster

If it helps, I have successfully used sentence-transformers/all-mpnet-base-v2 as an alternative to OpenAI's text-embedding-ada-002.
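A minimal sketch of how that model could stand in for an embeddings API call, assuming the sentence-transformers and numpy packages are installed. The helper names `load_encoder`, `cosine_rank`, and `demo` are hypothetical, and the example sentences are made up for illustration.

```python
import numpy as np


def load_encoder(model_name="sentence-transformers/all-mpnet-base-v2"):
    """Load the encoder. The import is deferred so the ranking helper
    below can be used without sentence-transformers installed."""
    from sentence_transformers import SentenceTransformer
    return SentenceTransformer(model_name)


def cosine_rank(query_vec, doc_vecs):
    """Return (indices, scores) of documents sorted by cosine similarity
    to the query, most similar first."""
    query_vec = np.asarray(query_vec, dtype=float)
    doc_vecs = np.asarray(doc_vecs, dtype=float)
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q
    order = np.argsort(-scores)
    return order, scores[order]


def demo():
    """Embed a few illustrative sentences and rank them against a query.
    Downloads the model weights on first use."""
    encoder = load_encoder()
    docs = [
        "How do I get embeddings from a language model?",
        "The weather is nice today.",
    ]
    doc_vecs = encoder.encode(docs)  # numpy array, one row per sentence
    query_vec = encoder.encode("extracting sentence vectors")
    return cosine_rank(query_vec, doc_vecs)
```

Calling `demo()` returns the document indices ordered by similarity along with their scores. No API key is involved, since the model runs locally.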

@twmmason twmmason reopened this Apr 25, 2023
@sandyflute

sandyflute commented Apr 28, 2023

Hello, I am able to extract the embeddings from the model:

from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "path/to/the/model"

# from_pretrained loads the trained weights; from_config would only build
# the architecture with randomly initialised weights
model = AutoModelForCausalLM.from_pretrained(checkpoint)
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

inputs = tokenizer.encode('Stability AI democratised AI by open sourcing large models', return_tensors="pt")
# hidden states are only returned when requested explicitly
outputs = model(inputs, output_hidden_states=True)
hidden_states = outputs.hidden_states

Now hidden_states holds the output of every layer. You can use the output of the last layer, hidden_states[-1].

Since I am a newbie to Hugging Face, there might be better ways to do this. Please share if you find something better.
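The last-layer hidden states mentioned above contain one vector per token, so a further step is needed to get a single embedding per input. One common option (not something the thread confirms works well for StableLM) is mean pooling over the non-padding tokens. A minimal sketch, assuming PyTorch is installed; `mean_pool` is a hypothetical helper name:

```python
import torch


def mean_pool(last_hidden: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token vectors into one vector per sequence, ignoring padding.

    last_hidden:    (batch, seq_len, hidden_size) last-layer hidden states
    attention_mask: (batch, seq_len) with 1 for real tokens, 0 for padding
    """
    # Broadcast the mask over the hidden dimension
    mask = attention_mask.unsqueeze(-1).to(last_hidden.dtype)
    summed = (last_hidden * mask).sum(dim=1)
    # Guard against division by zero for fully masked rows
    counts = mask.sum(dim=1).clamp(min=1e-9)
    return summed / counts
```

With a single unpadded input, as in the snippet above, the mask is all ones (for example `torch.ones_like(inputs)`), and the result is simply the average of the token vectors. Causal language models are not trained to produce sentence embeddings, so the quality of such vectors should be checked against a dedicated embedding model.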

@wajihullahbaig

Are these models multilingual?

@juliuslipp

If you are looking for another convenient API, you might consider embaas. They offer a similar structure to OpenAI's, and you can use the top models from the MTEB leaderboard. They have some multilingual models as well, integrate with LangChain, and have an easy-to-use Python client.


7 participants