## Local Embeddings with [HuggingFace](https://docs.llamaindex.ai/en/stable/api_reference/embeddings/huggingface/#llama_index.embeddings.huggingface.HuggingFaceEmbedding)

To compare embedding models, see the Massive Text Embedding Benchmark (MTEB) [Leaderboard](https://huggingface.co/spaces/mteb/leaderboard).

### HuggingFaceEmbedding

The base HuggingFaceEmbedding class is a generic wrapper around any Hugging Face embedding model. All [embedding models](https://huggingface.co/models?library=sentence-transformers) on Hugging Face should work. See the [embeddings leaderboard](https://huggingface.co/spaces/mteb/leaderboard) for recommendations.


In [2]:
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core import Settings

# Load the BGE model locally and register it as the default embedding model.
embed_model_bge = HuggingFaceEmbedding(model_name="BAAI/bge-large-en-v1.5")
Settings.embed_model = embed_model_bge

# Embed a single string; the result is a list of floats.
text_embedding = embed_model_bge.get_text_embedding("Hello, world!")
print(text_embedding)
print(len(text_embedding))

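ImportError: cannot import name 'Settings' from 'llama_index.core' (/usr/local/lib/python3.11/site-packages/llama_index/core/__init__.py)

This error usually indicates that the installed llama-index release predates the Settings object, which was introduced in version 0.10; upgrading the package is the likely fix.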
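The vectors returned by get_text_embedding in the cell above are plain Python lists of floats, so they can be compared without any extra library. The block below is a minimal sketch of cosine similarity, a common way to compare embeddings. The `cosine_similarity` helper and the toy three-dimensional vectors are illustrative assumptions, not part of LlamaIndex.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings (assumption: real model
# outputs are much longer, but the arithmetic is the same).
vec_a = [1.0, 0.0, 1.0]
vec_b = [1.0, 0.0, 1.0]
vec_c = [0.0, 1.0, 0.0]

print(cosine_similarity(vec_a, vec_b))  # same direction, close to 1
print(cosine_similarity(vec_a, vec_c))  # orthogonal vectors
```

With a working install, the same helper could be applied to two results of `embed_model_bge.get_text_embedding(...)` to see how close two strings are in embedding space.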

### [InstructorEmbedding](https://docs.llamaindex.ai/en/stable/examples/embeddings/huggingface/#instructorembedding)

Instructor embeddings come from models trained to tailor their output to an instruction supplied with each input. By default, queries are given `query_instruction="Represent the question for retrieving supporting documents: "` and text is given `text_instruction="Represent the document for retrieval: "`.
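As a rough illustration of what the instruction does in practice, the underlying INSTRUCTOR model receives each input as an `[instruction, text]` pair. The sketch below only builds such pairs in plain Python and does not load a model. `build_instructor_inputs` is a hypothetical helper, not a function from LlamaIndex or the InstructorEmbedding package; the two instruction strings are the defaults quoted above.

```python
from typing import List

# Default instructions quoted in the text above.
QUERY_INSTRUCTION = "Represent the question for retrieving supporting documents: "
TEXT_INSTRUCTION = "Represent the document for retrieval: "


def build_instructor_inputs(instruction: str, texts: List[str]) -> List[List[str]]:
    """Pair every text with the same instruction (hypothetical helper)."""
    return [[instruction, text] for text in texts]


# Queries and documents get different instructions, so the same string
# can be embedded differently depending on its role.
query_inputs = build_instructor_inputs(QUERY_INSTRUCTION, ["What is an embedding?"])
doc_inputs = build_instructor_inputs(
    TEXT_INSTRUCTION,
    ["An embedding maps text to a vector.", "Hello, world!"],
)

print(query_inputs)
print(doc_inputs)
```

Keeping the query and document instructions separate is the point of the design: retrieval compares a query vector against document vectors, and each side is encoded with the instruction that matches its role.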

They rely on the InstructorEmbedding and sentence-transformers (version 2.2.2) pip packages, which you can install with `pip install InstructorEmbedding` and `pip install -U sentence-transformers==2.2.2`.

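Model card: [hkunlp/instructor-large](https://huggingface.co/hkunlp/instructor-large)

For the list of available models, see the [InstructorEmbedding package page](https://pypi.org/project/InstructorEmbedding/).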

In [1]:
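from llama_index.embeddings.instructor import InstructorEmbedding

# Load the Instructor model; the default query and text instructions apply.
embed_model = InstructorEmbedding(model_name="hkunlp/instructor-base")

# Embed a single string and inspect the resulting vector.
text_embedding = embed_model.get_text_embedding("Hello, world!")
print(len(text_embedding))
print(text_embedding)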



TypeError: INSTRUCTOR._load_sbert_model() got an unexpected keyword argument 'token'
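A mismatch between the installed sentence-transformers release and the 2.2.2 pin mentioned earlier is one plausible cause of a `TypeError` like the one above. The sketch below checks the installed version using only the standard library. `installed_version` and `check_pin` are hypothetical helpers written for this note, and the result depends on the local environment.

```python
from importlib import metadata
from typing import Optional

REQUIRED_VERSION = "2.2.2"  # pin stated in the install instructions above


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_pin(found: Optional[str], required: str = REQUIRED_VERSION) -> str:
    """Describe whether `found` matches the required pin (hypothetical helper)."""
    if found is None:
        return "not installed"
    if found == required:
        return "ok"
    return f"mismatch: found {found}, expected {required}"


# Output varies by environment, so no expected value is shown here.
print(check_pin(installed_version("sentence-transformers")))
```

If the check reports a mismatch, reinstalling with the pinned command from the install instructions is the first thing to try before re-running the cell.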