# OpenVINO
[OpenVINO™](https://github.com/openvinotoolkit/openvino) is an open-source toolkit for optimizing and deploying AI inference. The OpenVINO™ Runtime supports various hardware [devices](https://github.com/openvinotoolkit/openvino?tab=readme-ov-file#supported-hardware-matrix) including x86 and ARM CPUs, and Intel GPUs. It can help to boost deep learning performance in Computer Vision, Automatic Speech Recognition, Natural Language Processing and other common tasks.

Hugging Face embedding models can run on OpenVINO through the ``OpenVINOEmbeddings`` class. If you have an Intel GPU, you can specify `model_kwargs={"device": "GPU"}` to run inference on it.
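
As a minimal sketch of choosing the device at runtime instead of hard-coding it, the helper below prefers a GPU when OpenVINO reports one and falls back to CPU otherwise. The names `choose_device` and `detect_devices` are hypothetical helpers, not part of LangChain or OpenVINO, and the sketch assumes that `openvino.Core().available_devices` lists names such as `"CPU"`, `"GPU"`, or `"GPU.0"`.

```python
def choose_device(available, preferred="GPU", fallback="CPU"):
    """Return the first device matching `preferred`, else `fallback`."""
    for name in available:
        # Assumption: one GPU is reported as "GPU", several as "GPU.0", "GPU.1", ...
        if name == preferred or name.startswith(preferred + "."):
            return name
    return fallback


def detect_devices():
    """List devices visible to OpenVINO Runtime, or just CPU if it cannot be queried."""
    try:
        from openvino import Core

        return list(Core().available_devices)
    except Exception:
        # OpenVINO is missing or failed to initialize: assume CPU only.
        return ["CPU"]


model_kwargs = {"device": choose_device(detect_devices())}
print(model_kwargs)
```

The resulting `model_kwargs` can be passed to `OpenVINOEmbeddings` in place of a fixed `{"device": "CPU"}`.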

In [3]:
%pip install --upgrade-strategy eager "optimum[openvino,nncf]" --quiet

Note: you may need to restart the kernel to use updated packages.


In [1]:
from langchain_community.embeddings import OpenVINOEmbeddings

In [2]:
model_name = "sentence-transformers/all-mpnet-base-v2"
model_kwargs = {"device": "CPU"}
encode_kwargs = {"mean_pooling": True, "normalize_embeddings": True}

ov_embeddings = OpenVINOEmbeddings(
    model_name_or_path=model_name,
    model_kwargs=model_kwargs,
    encode_kwargs=encode_kwargs,
)


INFO:nncf:NNCF initialized successfully. Supported frameworks detected: torch, onnx, openvino


Framework not specified. Using pt to export the model.
Using the export variant default. Available variants are:
    - default: The default ONNX variant.
Using framework PyTorch: 2.2.1+cu121
Compiling the model to CPU ...
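
The `encode_kwargs` above enable mean pooling and normalization. The following is a rough pure-Python sketch of what those two steps typically compute, assuming that pooling averages token vectors over the non-padding positions given by the attention mask and that normalization is L2. It illustrates the idea only and is not the library's implementation; `mean_pool` and `l2_normalize` are hypothetical names.

```python
import math


def mean_pool(token_vectors, attention_mask):
    """Average token vectors, counting only positions where the mask is 1."""
    dim = len(token_vectors[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_vectors, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    count = max(count, 1)  # avoid dividing by zero for an all-padding input
    return [value / count for value in total]


def l2_normalize(vector, eps=1e-12):
    """Scale the vector to unit L2 length."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / max(norm, eps) for v in vector]


# Toy input: three token vectors, the last one is padding and is ignored.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
sentence_vector = l2_normalize(pooled)
print(pooled, sentence_vector)
```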


In [3]:
text = "This is a test document."

In [4]:
query_result = ov_embeddings.embed_query(text)

In [5]:
query_result[:3]

[-0.048951778560876846, -0.03986183926463127, -0.02156277745962143]

In [7]:
doc_result = ov_embeddings.embed_documents([text])
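
`embed_query` returns a single vector, while `embed_documents` returns one vector per input text. As a small sketch of how such vectors can be compared, the code below checks the L2 norm and computes cosine similarity in plain Python. The two vectors are hypothetical stand-ins for `query_result` and `doc_result[0]`, so the sketch runs without a model.

```python
import math


def l2_norm(vector):
    """Euclidean length of a vector."""
    return math.sqrt(sum(v * v for v in vector))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    denom = l2_norm(a) * l2_norm(b)
    return dot / denom if denom else 0.0


# Hypothetical stand-ins for `query_result` and `doc_result[0]`.
query_vec = [0.6, 0.8]
doc_vec = [0.8, 0.6]

print(l2_norm(query_vec))
print(cosine_similarity(query_vec, doc_vec))
```

With `normalize_embeddings` set to `True`, the norm of each embedding should be close to 1, so the dot product alone gives the cosine similarity. Because the notebook embeds the same text as both query and document, the similarity between `query_result` and `doc_result[0]` would be expected to be close to 1.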

## Export IR model
You can export your embedding model to the OpenVINO IR format with ``OVModelForFeatureExtraction`` and then load the model from a local folder.

In [None]:
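from pathlib import Path

ov_model_dir = "all-mpnet-base-v2-ov"
# Export only once; later runs reuse the saved IR files.
if not Path(ov_model_dir).exists():
    from optimum.intel.openvino import OVModelForFeatureExtraction
    from transformers import AutoTokenizer

    # export=True converts the model to OpenVINO IR;
    # compile=False defers compilation, which this export step does not need.
    ov_model = OVModelForFeatureExtraction.from_pretrained(
        model_name, compile=False, export=True
    )
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    ov_model.half()  # convert weights to FP16 to reduce model size
    ov_model.save_pretrained(ov_model_dir)
    tokenizer.save_pretrained(ov_model_dir)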

In [None]:
ov_embeddings = OpenVINOEmbeddings(
    model_name_or_path=ov_model_dir,
    model_kwargs=model_kwargs,
    encode_kwargs=encode_kwargs,
)
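
An OpenVINO IR model is stored as an `.xml` file (the network topology) paired with a `.bin` file (the weights). As a quick sanity check before loading from a local folder, the sketch below lists the `.xml` files that have a matching `.bin` file. `find_ir_models` is a hypothetical helper, and the exact file names written by `save_pretrained` are not assumed here.

```python
from pathlib import Path


def find_ir_models(model_dir):
    """Return names of .xml files in `model_dir` that have a matching .bin file."""
    directory = Path(model_dir)
    if not directory.is_dir():
        return []
    return sorted(
        xml.name
        for xml in directory.glob("*.xml")
        if xml.with_suffix(".bin").exists()
    )


# Prints an empty list if the export step has not been run yet.
print(find_ir_models("all-mpnet-base-v2-ov"))
```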

## BGE with OpenVINO
BGE embedding models can also be run with OpenVINO through the ``OpenVINOBgeEmbeddings`` class.

In [1]:
from langchain_community.embeddings import OpenVINOBgeEmbeddings

model_name = "BAAI/bge-small-en"
model_kwargs = {"device": "CPU"}
encode_kwargs = {"normalize_embeddings": True}
ov_embeddings = OpenVINOBgeEmbeddings(
    model_name_or_path=model_name,
    model_kwargs=model_kwargs,
    encode_kwargs=encode_kwargs,
)


INFO:nncf:NNCF initialized successfully. Supported frameworks detected: torch, onnx, openvino


Framework not specified. Using pt to export the model.
Using the export variant default. Available variants are:
    - default: The default ONNX variant.
Using framework PyTorch: 2.2.1+cu121
Overriding 1 configuration item(s)
	- use_cache -> False
Compiling the model to CPU ...


In [2]:
embedding = ov_embeddings.embed_query("hi this is harrison")
len(embedding)

384
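
A common use of such embeddings is ranking documents against a query. The sketch below does this with a plain dot product, which equals cosine similarity when the embeddings are normalized. It relies only on the `embed_query` and `embed_documents` methods used above. `rank_documents` and `ToyEmbedder` are hypothetical names, and the toy embedder exists only so the sketch runs without downloading a model.

```python
def rank_documents(embedder, query, documents, top_k=3):
    """Return up to `top_k` (score, document) pairs, highest dot product first."""
    query_vec = embedder.embed_query(query)
    doc_vecs = embedder.embed_documents(documents)
    scored = [
        (sum(q * d for q, d in zip(query_vec, vec)), doc)
        for vec, doc in zip(doc_vecs, documents)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


class ToyEmbedder:
    """Hypothetical stand-in: normalized word counts over a tiny vocabulary."""

    vocabulary = ("cat", "dog", "car")

    def _embed(self, text):
        words = text.lower().split()
        counts = [float(words.count(term)) for term in self.vocabulary]
        norm = sum(c * c for c in counts) ** 0.5 or 1.0
        return [c / norm for c in counts]

    def embed_query(self, text):
        return self._embed(text)

    def embed_documents(self, texts):
        return [self._embed(t) for t in texts]


docs = ["the cat sat", "a dog and a cat", "fast car"]
results = rank_documents(ToyEmbedder(), "cat", docs, top_k=2)
print(results)
```

In the notebook, the `ov_embeddings` object could be passed in place of `ToyEmbedder()`.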

For more information, refer to:

* [OpenVINO LLM guide](https://docs.openvino.ai/2024/learn-openvino/llm_inference_guide.html).

* [OpenVINO Documentation](https://docs.openvino.ai/2024/home.html).

* [OpenVINO Get Started Guide](https://www.intel.com/content/www/us/en/content-details/819067/openvino-get-started-guide.html).

* [RAG Notebook with LangChain](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/llm-chatbot/rag-chatbot.ipynb).