# Text Embedding

An embedding is a vector (a list of floats) that represents a piece of text. Representing text as a vector is useful because it lets us apply the same numerical tools we use for other kinds of data. For example, we can use [cosine similarity](https://en.wikipedia.org/wiki/Cosine_similarity) to measure how similar two pieces of text are. This notebook shows how to use [OpenAI's embedding model](https://platform.openai.com/docs/guides/embeddings) to transform a string into a vector.
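To make the cosine similarity idea concrete, here is a minimal sketch in plain Python. The `cosine_similarity` helper and the toy vectors are illustrative assumptions, not part of the notebook or of any library; real embeddings have many more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (hypothetical helper)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy vectors standing in for embeddings.
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])  # close to 1
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])                # 0
opposite = cosine_similarity([1.0, 0.0], [-1.0, 0.0])                 # -1
```

Values near 1 mean the vectors point in nearly the same direction, values near 0 mean they are unrelated, and negative values mean they point in opposing directions.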

In [1]:
import os
from langchain.embeddings import OpenAIEmbeddings

The embedding model `OpenAIEmbeddings` needs an OpenAI API key. Here the key is passed explicitly through the `openai_api_key` argument, read from the `OPENAI_API_KEY` environment variable. You can get an API key by signing up at [OpenAI](https://openai.com/) and then going to the [API keys page](https://platform.openai.com/account/api-keys).

In [2]:
embeddings_model = OpenAIEmbeddings(openai_api_key=os.getenv('OPENAI_API_KEY'))
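Note that `os.getenv` returns `None` when the variable is not set. A small guard can surface that problem early with a clearer message. The sketch below is one possible approach; `require_env` is a hypothetical helper, not part of LangChain or OpenAI's client.

```python
import os

def require_env(name, environ=None):
    """Return the value of environment variable `name`, or raise if it is unset or empty.

    Hypothetical helper: `environ` defaults to os.environ and can be replaced
    with a plain dict for testing.
    """
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set; export it before creating the embeddings model."
        )
    return value

# Possible usage with the model from the cell above (left commented out
# because it needs a real key):
# embeddings_model = OpenAIEmbeddings(openai_api_key=require_env("OPENAI_API_KEY"))
```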

You can embed a single piece of text using the `embed_query` method of an embeddings model.

In [3]:
embedded_query = embeddings_model.embed_query("What is vector search?")
len(embedded_query), embedded_query[:3]

(1536, [-0.032973344502955255, 0.002587942105724324, -0.007404194018956249])

You can also embed a list of strings in one call using the `embed_documents` method, which returns one vector per input string.

In [4]:
embeddings = embeddings_model.embed_documents(
    [
        "Hi there!",
        "Oh, hello!",
        "What's your name?",
        "My friends call me World",
        "Hello World!"
    ]
)
len(embeddings), len(embeddings[0])

(5, 1536)
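Once a query and a set of documents are embedded, the vectors can be compared to find the documents closest to the query. The following is a minimal sketch, assuming plain Python lists of floats like those returned above; `normalize` and `rank_by_similarity` are hypothetical helpers, and the two-dimensional toy vectors stand in for real 1536-dimensional embeddings so the example runs without an API key.

```python
import math

def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]

def rank_by_similarity(query_vec, doc_vecs):
    """Return (index, score) pairs sorted from most to least similar to the query."""
    q = normalize(query_vec)
    scores = []
    for i, doc in enumerate(doc_vecs):
        d = normalize(doc)
        scores.append((i, sum(x * y for x, y in zip(q, d))))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Toy data; with real data, pass `embedded_query` and `embeddings` instead.
toy_docs = [[1.0, 0.0], [0.7, 0.7], [0.0, 1.0]]
toy_query = [0.9, 0.1]
ranking = rank_by_similarity(toy_query, toy_docs)
best_index, best_score = ranking[0]
```

With the notebook's own variables, `rank_by_similarity(embedded_query, embeddings)` would order the five example strings by how close each is to "What is vector search?".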