# Using OpenAI's Latest Embedding Model with Milvus

Milvus is the world's first open-source vector database. With its scalability and advanced features such as metadata filtering, Milvus has become a crucial component for semantic search powered by deep learning embedding models.

On January 25, 2024, OpenAI released two new embedding models, `text-embedding-3-small` and `text-embedding-3-large`. Both outperform `text-embedding-ada-002`. `text-embedding-3-small` is a highly efficient model: at one fifth of the price, it achieves a slightly higher [MTEB](https://huggingface.co/spaces/mteb/leaderboard) score of 62.3%, compared to 61% for `text-embedding-ada-002`. `text-embedding-3-large` is OpenAI's best-performing model, with an MTEB score of 64.6%.

![](../../pics/openai_embedding_scores.png)

More impressively, both models support trading off performance against cost with a technique called "Matryoshka Representation Learning". Users can request shortened embeddings to cut vector storage cost substantially without sacrificing much retrieval quality. For example, reducing the vector dimension from 3072 to 256 lowers the MTEB score only from 64.6% to 62%, while the vectors take 12X less storage.

![](../../pics/openai_embedding_vector_size.png)
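
To make the storage trade-off concrete, here is a minimal sketch in plain Python. The `text-embedding-3` models accept a `dimensions` parameter in the embeddings request; an alternative is to keep the first values of a full embedding and rescale the result to unit length, which is what the hypothetical helper `shorten_embedding` does. `storage_bytes` is also a hypothetical helper, and it assumes vectors are stored as 4-byte floats and ignores any index overhead.

```python
import math


def shorten_embedding(embedding, dim):
    """Keep the first `dim` values and rescale the result to unit length."""
    truncated = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in truncated))
    if norm == 0.0:
        return truncated  # avoid dividing by zero on an all-zero prefix
    return [x / norm for x in truncated]


def storage_bytes(num_vectors, dim, bytes_per_value=4):
    """Raw size of `num_vectors` vectors (assumed float32), without index overhead."""
    return num_vectors * dim * bytes_per_value


full = storage_bytes(1_000_000, 3072)
short = storage_bytes(1_000_000, 256)
print(full / short)  # 12.0
```

The ratio depends only on the two dimensions, so the 12X figure holds for any number of vectors.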

This tutorial shows how to use OpenAI's newest embedding models with Milvus for semantic similarity search.

## Preparations

We will demonstrate with the `text-embedding-3-small` model and Milvus in Standalone mode. The text to search comes from the [blog post](https://openai.com/blog/new-embedding-models-and-api-updates) that announces the new OpenAI models and API updates. For each sentence in the blog, we use `text-embedding-3-small` to convert the text string into a 1536-dimensional vector embedding, and store each embedding in Milvus.

To search, we convert the query text into a vector embedding and run an Approximate Nearest Neighbor (ANN) search to find the stored text strings that are closest in meaning.

To run this demo you'll need to obtain an API key from the [OpenAI website](https://openai.com/product). Be sure you have already [started up a Milvus instance](https://milvus.io/docs/install_standalone-docker.md) and installed the Python client libraries with `pip install pymilvus openai`.

Import packages.

In [1]:
import os
from openai import OpenAI
from pymilvus import (
    connections,
    utility,
    FieldSchema,
    CollectionSchema,
    DataType,
    Collection,
)

Set up the options for Milvus, specify the OpenAI model name as `text-embedding-3-small`, and provide your OpenAI API key through the `OPENAI_API_KEY` environment variable.

In [2]:
MILVUS_HOST = "localhost"
MILVUS_PORT = "19530"
COLLECTION_NAME = "openai_doc_collection"  # Milvus collection name
EMBEDDING_MODEL = "text-embedding-3-small"  # OpenAI embedding model name; you can change it to `text-embedding-3-large` or `text-embedding-ada-002`

# Initialize an OpenAI client with the API key from the OPENAI_API_KEY environment variable.
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

Let's try the OpenAI embedding service on a text string, print the first 20 values of the resulting vector embedding, and read off the model's output dimension.

In [3]:
response = client.embeddings.create(
    input="Your text string goes here",
    model=EMBEDDING_MODEL
)
res_embedding = response.data[0].embedding
print(f'{res_embedding[:20]} ...')
dimension = len(res_embedding)
print(f'\nDimensions of `{EMBEDDING_MODEL}` embedding model is: {dimension}')

[0.00514861848205328, 0.017234396189451218, -0.018690429627895355, -0.01859242655336857, -0.04732108861207962, -0.030296696349978447, 0.027692636474967003, 0.003640083596110344, 0.011249258182942867, 0.006401647347956896, -0.0016966640250757337, 0.0157923623919487, -0.0013186553260311484, -0.007833180017769337, 0.059921376407146454, 0.050261154770851135, -0.027538632974028587, 0.009940228424966335, -0.04040492698550224, 0.05000915005803108] ...

Dimensions of `text-embedding-3-small` embedding model is: 1536


## Load vectors to Milvus

We set up a collection in Milvus and build an index so that we can search the vectors efficiently. For more information on how to use Milvus, see the [example code](https://milvus.io/docs/example_code.md).


In [4]:
# Connect to Milvus
connections.connect(host=MILVUS_HOST, port=MILVUS_PORT)

# Remove collection if it already exists
if utility.has_collection(COLLECTION_NAME):
    utility.drop_collection(COLLECTION_NAME)

# Define a schema with 3 fields: pk (int), text (string), and embeddings (float vector).
fields = [
    FieldSchema(name="pk", dtype=DataType.INT64, is_primary=True, auto_id=False),
    FieldSchema(name="text", dtype=DataType.VARCHAR, max_length=65_535),
    FieldSchema(name="embeddings", dtype=DataType.FLOAT_VECTOR, dim=dimension)
]
schema = CollectionSchema(fields, "Here is description of this collection.")
# Create a collection with above schema.
doc_collection = Collection(COLLECTION_NAME, schema)

# Create an index for the collection.
index = {
    "index_type": "IVF_FLAT",
    "metric_type": "L2",
    "params": {"nlist": 128},
}
doc_collection.create_index("embeddings", index)

Status(code=0, message=)

We have prepared a data source named `openai_blog.txt`, crawled from the latest OpenAI [blog post](https://openai.com/blog/new-embedding-models-and-api-updates#fn-A). It stores one sentence per line. We convert each line into a vector with `text-embedding-3-small` and then insert the embeddings into the Milvus collection.

In [5]:
with open('./docs/openai_blog.txt', 'r') as f:
    lines = f.readlines()

embeddings = []
for line in lines:
    response = client.embeddings.create(
        input=line,
        model=EMBEDDING_MODEL
    )
    embeddings.append(response.data[0].embedding)

entities = [
    list(range(len(lines))),  # field pk (primary key)
    lines,  # field text
    embeddings,  # field embeddings
]
insert_result = doc_collection.insert(entities)

# After the final entity is inserted, call flush so that no growing segments are left in memory
doc_collection.flush()
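
The loop above sends one request per line. The embeddings endpoint also accepts a list of strings as `input`, so the same work can be done with fewer round trips. The following is a sketch under assumptions, not part of the original notebook: `chunked` and `embed_in_batches` are hypothetical helper names, and the batch size of 64 is an arbitrary choice.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def embed_in_batches(client, texts, model, batch_size=64):
    """Embed `texts` with one API request per batch instead of one per string."""
    embeddings = []
    for batch in chunked(texts, batch_size):
        response = client.embeddings.create(input=batch, model=model)
        # Sort by `index` so that the output order matches the input order.
        ordered = sorted(response.data, key=lambda item: item.index)
        embeddings.extend(item.embedding for item in ordered)
    return embeddings
```

With the names used in this tutorial, the call would be `embeddings = embed_in_batches(client, lines, EMBEDDING_MODEL)`.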

## Query

Here we build a `semantic_search` function that retrieves the top-K most semantically similar documents from a Milvus collection.


In [6]:
# Load the collection into memory for searching
doc_collection.load()


def semantic_search(query, top_k=3):
    response = client.embeddings.create(
        input=query,
        model=EMBEDDING_MODEL
    )
    vectors_to_search = [response.data[0].embedding]
    search_params = {
        "metric_type": "L2",
        "params": {"nprobe": 10},
    }
    result = doc_collection.search(vectors_to_search, "embeddings", search_params, limit=top_k, output_fields=["text"])
    return result[0]
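
To give some intuition for the `nlist` index parameter and the `nprobe` search parameter used in this tutorial, here is a heavily simplified sketch of the inverted-file idea behind `IVF_FLAT`. It is not Milvus's implementation: the helper names are hypothetical, and the centroids are simply sampled from the data, whereas a real index would train them with a clustering step.

```python
import random


def sq_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_ivf(vectors, centroids):
    """Assign each vector index to the inverted list of its nearest centroid."""
    lists = [[] for _ in centroids]
    for i, vector in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: sq_l2(vector, centroids[c]))
        lists[nearest].append(i)
    return lists


def ivf_search(query, vectors, centroids, lists, nprobe, top_k):
    """Scan only the `nprobe` lists whose centroids are closest to the query."""
    probed = sorted(range(len(centroids)), key=lambda c: sq_l2(query, centroids[c]))[:nprobe]
    candidates = [i for c in probed for i in lists[c]]
    return sorted(candidates, key=lambda i: sq_l2(query, vectors[i]))[:top_k]


random.seed(0)
vectors = [[random.random() for _ in range(4)] for _ in range(50)]
centroids = random.sample(vectors, 5)  # plays the role of nlist = 5
lists = build_ivf(vectors, centroids)
print(ivf_search([0.5] * 4, vectors, centroids, lists, nprobe=2, top_k=3))
```

A larger `nprobe` scans more lists, which tends to improve recall at the cost of more distance computations. When `nprobe` equals the number of lists, the search degenerates into an exact scan.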

First, we ask about the price of the latest embedding models.

In [7]:
question = 'What is the price of the `text-embedding-3-small` model?'

match_results = semantic_search(question, top_k=3)
for match in match_results:
    print(f"distance = {match.distance:.2f}\n{match.text}")

distance = 0.50
Pricing for `text-embedding-3-small` has therefore been reduced by 5X compared to `text-embedding-ada-002`, from a price per 1k tokens of $0.0001 to $0.00002.

distance = 0.56
`text-embedding-3-small` is our new highly efficient embedding model and provides a significant upgrade over its predecessor, the `text-embedding- ada-002` model released in December 2022.

distance = 0.56
**`text-embedding-3-large` is our new best performing model.



The smaller the distance, the closer the vectors are and the more semantically similar the texts. Here, the top result answers the question directly.
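
The distances printed above can be related to cosine similarity. The sketch below rests on two assumptions that are worth checking against the official documentation: that OpenAI embeddings are normalized to unit length, and that Milvus reports the squared Euclidean distance for the `L2` metric. Under those assumptions, `||a - b||^2 = 2 - 2 * cos(a, b)`. The helper names are hypothetical.

```python
import math


def normalize(v):
    """Rescale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cosine_from_squared_l2(distance):
    """Valid for unit-length vectors only: cos = 1 - d / 2."""
    return 1.0 - distance / 2.0


a = normalize([1.0, 2.0, 3.0])
b = normalize([2.0, 1.0, 3.0])
d = squared_l2(a, b)
print(abs(cosine_from_squared_l2(d) - cosine_similarity(a, b)) < 1e-9)  # True
print(cosine_from_squared_l2(0.50))  # 0.75
```

Read this way, the top result's distance of 0.50 would correspond to a cosine similarity of 0.75.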

Let's try another question, this time about the new GPT-4.

In [8]:
question = 'What is the context window size of GPT-4??'

match_results = semantic_search(question, top_k=3)
for match in match_results:
    print(f"distance = {match.distance:.2f}\n{match.text}")

distance = 0.97
Over 70% of requests from GPT-4 API customers have transitioned to GPT-4 Turbo since its release, as developers take advantage of its updated knowledge cutoff, larger 128k context windows, and lower prices.

distance = 1.02
Today, we are releasing an updated GPT-4 Turbo preview model, `gpt-4-0125-preview`.

distance = 1.02
* Overview * Index * GPT-4 * DALL·E 3



Our semantic retrieval captures the meaning of each query and returns the most semantically similar documents from the Milvus collection.

We can delete this collection to save resources.

In [9]:
# Drops the collection
utility.drop_collection(COLLECTION_NAME)

This is how to use an OpenAI embedding model with Milvus to perform semantic search. Milvus also integrates with other model providers such as Cohere and HuggingFace. You can learn more at https://milvus.io/docs.