# OpenAI

This guide will help you get started with `OpenAIEmbeddings` [embedding models](/docs/concepts/embedding_models) using LangChain. For detailed documentation on `OpenAIEmbeddings` features and configuration options, please refer to the [API reference](https://api.js.langchain.com/classes/langchain_openai.OpenAIEmbeddings.html).

## Overview
### Integration details

| Class | Package | Local | [Py support](https://python.langchain.com/docs/integrations/text_embedding/openai/) | Package downloads | Package latest |
| :--- | :--- | :---: | :---: |  :---: | :---: |
| [OpenAIEmbeddings](https://api.js.langchain.com/classes/langchain_openai.OpenAIEmbeddings.html) | [@langchain/openai](https://api.js.langchain.com/modules/langchain_openai.html) | ❌ | ✅ | ![NPM - Downloads](https://img.shields.io/npm/dm/@langchain/openai?style=flat-square&label=%20&) | ![NPM - Version](https://img.shields.io/npm/v/@langchain/openai?style=flat-square&label=%20&) |

## Setup

To access OpenAI embedding models, you'll need to create an OpenAI account, get an API key, and install the `@langchain/openai` integration package.

### Credentials

Head to [platform.openai.com](https://platform.openai.com) to sign up for OpenAI and generate an API key. Once you've done this, set the `OPENAI_API_KEY` environment variable:

```bash
export OPENAI_API_KEY="your-api-key"
```
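
If you would rather fail fast than hit an authentication error on the first request, a small check along these lines can help. This is a minimal sketch: `requireOpenAIKey` is a hypothetical helper, not part of LangChain or the OpenAI SDK, and it takes the environment as an argument so it can be exercised without touching real credentials.

```typescript
// Hypothetical helper (not a LangChain API): fail fast when the key is missing.
type Env = Record<string, string | undefined>;

function requireOpenAIKey(env: Env): string {
  // Treat an empty or whitespace-only value the same as a missing one
  const key = env.OPENAI_API_KEY?.trim();
  if (!key) {
    throw new Error("OPENAI_API_KEY is not set; export it before running.");
  }
  return key;
}

// In Node.js you would typically call requireOpenAIKey(process.env)
const checkedKey = requireOpenAIKey({ OPENAI_API_KEY: "your-api-key" });
console.log(checkedKey.length > 0); // true
```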

If you want automated tracing of your model calls, you can also set your [LangSmith](https://docs.smith.langchain.com/) API key by uncommenting the lines below:

```bash
# export LANGSMITH_TRACING="true"
# export LANGSMITH_API_KEY="your-api-key"
```

### Installation

The LangChain OpenAIEmbeddings integration lives in the `@langchain/openai` package:

```{=mdx}
import IntegrationInstallTooltip from "@mdx_components/integration_install_tooltip.mdx";
import Npm2Yarn from "@theme/Npm2Yarn";

<IntegrationInstallTooltip></IntegrationInstallTooltip>

<Npm2Yarn>
  @langchain/openai @langchain/core
</Npm2Yarn>
```

## Instantiation

Now we can instantiate our model object and use it to embed text:

```typescript
import { OpenAIEmbeddings } from "@langchain/openai";

const embeddings = new OpenAIEmbeddings({
  apiKey: "YOUR-API-KEY", // In Node.js defaults to process.env.OPENAI_API_KEY
  batchSize: 512, // Default value if omitted is 512. Max is 2048
  model: "text-embedding-3-large",
});
```

If you're part of an organization, you can set `process.env.OPENAI_ORGANIZATION` to your OpenAI organization ID, or pass it in as `organization` when initializing the model.
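
The `batchSize` option above limits how many texts are sent in a single embeddings request. As a rough illustration of what that means for request counts, the sketch below splits a list of texts into consecutive groups. `chunkIntoBatches` is a hypothetical helper written for this page, not the library's internal code, and the texts are placeholders.

```typescript
// Illustrative only: a hypothetical helper, not the library's internal implementation.
function chunkIntoBatches<T>(items: T[], batchSize: number): T[][] {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches: T[][] = [];
  // Walk the input in steps of batchSize, keeping the original order
  for (let start = 0; start < items.length; start += batchSize) {
    batches.push(items.slice(start, start + batchSize));
  }
  return batches;
}

// 1,200 placeholder texts with a batch size of 512
const placeholderTexts = Array.from({ length: 1200 }, (_, i) => `document ${i}`);
const batches = chunkIntoBatches(placeholderTexts, 512);
console.log(batches.map((batch) => batch.length)); // [ 512, 512, 176 ]
```

Under this assumption, 1,200 texts would take three requests at a batch size of 512, and a single request at the maximum of 2048.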

## Indexing and Retrieval

Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our RAG tutorials under the [working with external knowledge tutorials](/docs/tutorials/#working-with-external-knowledge).

Below, see how to index and retrieve data using the `embeddings` object we initialized above. In this example, we will index and retrieve a sample document using the demo [`MemoryVectorStore`](/docs/integrations/vectorstores/memory).

```typescript
// Create a vector store with a sample text
import { MemoryVectorStore } from "langchain/vectorstores/memory";

const text = "LangChain is the framework for building context-aware reasoning applications";

const vectorstore = await MemoryVectorStore.fromDocuments(
  [{ pageContent: text, metadata: {} }],
  embeddings,
);

// Use the vector store as a retriever that returns a single document
const retriever = vectorstore.asRetriever(1);

// Retrieve the most similar text
const retrievedDocuments = await retriever.invoke("What is LangChain?");

retrievedDocuments[0].pageContent;
```

```text
LangChain is the framework for building context-aware reasoning applications
```
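
To make "most similar" concrete: a vector store compares the query embedding with each stored embedding and returns the closest matches. The sketch below assumes cosine similarity as the comparison and uses made-up 3-dimensional vectors in place of real embeddings. `cosineSimilarity` and the sample data are hypothetical and are not taken from the `MemoryVectorStore` implementation.

```typescript
// Hypothetical helper: cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("Vectors must be non-empty and have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat it as dissimilar to everything
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Made-up vectors standing in for real embeddings
const sampleQueryVector = [1, 0, 0];
const sampleCandidates = [
  { pageContent: "doc A", vector: [0.9, 0.1, 0] },
  { pageContent: "doc B", vector: [0, 1, 0] },
];

// Score every candidate, then sort from most to least similar
const ranked = sampleCandidates
  .map((c) => ({ ...c, score: cosineSimilarity(sampleQueryVector, c.vector) }))
  .sort((x, y) => y.score - x.score);

console.log(ranked[0].pageContent); // doc A
```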


## Direct Usage

Under the hood, the vectorstore and retriever implementations are calling `embeddings.embedDocuments(...)` and `embeddings.embedQuery(...)` to create embeddings for the text(s) used in `fromDocuments` and the retriever's `invoke` operations, respectively.

You can directly call these methods to get embeddings for your own use cases.
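
Before calling the real API, it can help to see the shape of the two methods: `embedQuery` takes one string and resolves to one vector, while `embedDocuments` takes an array of strings and resolves to one vector per input. The sketch below mimics that shape with a hypothetical `ToyEmbeddings` class. It is not a LangChain class, and its character-code vectors carry no semantic meaning; it only serves as an offline stand-in.

```typescript
// Hypothetical stand-in (not a LangChain class): deterministic toy vectors, no API calls.
class ToyEmbeddings {
  private readonly dimensions: number;

  constructor(dimensions: number = 8) {
    this.dimensions = dimensions;
  }

  // One text in, one vector out
  async embedQuery(input: string): Promise<number[]> {
    const vector = new Array<number>(this.dimensions).fill(0);
    for (let i = 0; i < input.length; i++) {
      // Spread character codes across the vector slots
      vector[i % this.dimensions] += input.charCodeAt(i) / 1000;
    }
    return vector;
  }

  // Many texts in, one vector per text out, in the same order
  async embedDocuments(inputs: string[]): Promise<number[][]> {
    return Promise.all(inputs.map((input) => this.embedQuery(input)));
  }
}

async function demoToyEmbeddings(): Promise<void> {
  const toy = new ToyEmbeddings(4);
  const queryVector = await toy.embedQuery("hello");
  const documentVectors = await toy.embedDocuments(["hello", "world"]);
  console.log(queryVector.length); // 4
  console.log(documentVectors.length); // 2
}

void demoToyEmbeddings();
```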

### Embed single texts

You can embed queries for search with `embedQuery`. This generates a vector representation specific to the query:

```typescript
const singleVector = await embeddings.embedQuery(text);

console.log(singleVector.slice(0, 100));
```

```text
[
    -0.01927683,  0.0037708976,  -0.032942563,  0.0037671267,  0.008175306,
   -0.012511838,  -0.009713832,   0.021403614,  -0.015377721, 0.0018684798,
    0.020574018,   0.022399133,   -0.02322873,   -0.01524951,  -0.00504169,
   -0.007375876,   -0.03448109, 0.00015130726,   0.021388533, -0.012564631,
   -0.020031009,   0.027406884,  -0.039217334,    0.03036327,  0.030393435,
   -0.021750538,   0.032610722,  -0.021162277,  -0.025898525,  0.018869571,
    0.034179416,  -0.013371604,  0.0037652412,   -0.02146395, 0.0012641934,
   -0.055688616,    0.05104287,  0.0024982197,  -0.019095825, 0.0037369595,
  0.00088757504,   0.025189597,  -0.018779071,   0.024978427,  0.016833287,
  -0.0025868358,  -0.011727491, -0.0021154736,  -0.017738303, 0.0013839195,
  -0.0131151825,   -0.05405959,   0.029729757,  -0.003393808,  0.019774588,
    0.028885076,   0.004355387,   0.026094612,    0.06479911,  0.038040817,
    -0.03478276,  -0.012594799,  -0.024767255, -0.0031430433,  0.017874055,
   -0.0152
```

### Embed multiple texts

You can embed multiple texts for indexing with `embedDocuments`. The internals used for this method may (but do not have to) differ from embedding queries:

```typescript
const text2 = "LangGraph is a library for building stateful, multi-actor applications with LLMs";

const vectors = await embeddings.embedDocuments([text, text2]);

console.log(vectors[0].slice(0, 100));
console.log(vectors[1].slice(0, 100));
```

```text
[
    -0.01927683,  0.0037708976,  -0.032942563,  0.0037671267,  0.008175306,
   -0.012511838,  -0.009713832,   0.021403614,  -0.015377721, 0.0018684798,
    0.020574018,   0.022399133,   -0.02322873,   -0.01524951,  -0.00504169,
   -0.007375876,   -0.03448109, 0.00015130726,   0.021388533, -0.012564631,
   -0.020031009,   0.027406884,  -0.039217334,    0.03036327,  0.030393435,
   -0.021750538,   0.032610722,  -0.021162277,  -0.025898525,  0.018869571,
    0.034179416,  -0.013371604,  0.0037652412,   -0.02146395, 0.0012641934,
   -0.055688616,    0.05104287,  0.0024982197,  -0.019095825, 0.0037369595,
  0.00088757504,   0.025189597,  -0.018779071,   0.024978427,  0.016833287,
  -0.0025868358,  -0.011727491, -0.0021154736,  -0.017738303, 0.0013839195,
  -0.0131151825,   -0.05405959,   0.029729757,  -0.003393808,  0.019774588,
    0.028885076,   0.004355387,   0.026094612,    0.06479911,  0.038040817,
    -0.03478276,  -0.012594799,  -0.024767255, -0.0031430433,  0.017874055,
   -0.0152
```

## Specifying dimensions

With the `text-embedding-3` class of models, you can specify the size of the embeddings you want returned. For example, by default `text-embedding-3-large` returns embeddings of dimension 3072:


```typescript
import { OpenAIEmbeddings } from "@langchain/openai";

const embeddingsDefaultDimensions = new OpenAIEmbeddings({
  model: "text-embedding-3-large",
});

const vectorsDefaultDimensions = await embeddingsDefaultDimensions.embedDocuments(["some text"]);
console.log(vectorsDefaultDimensions[0].length);
```

```text
3072
```


By passing in `dimensions: 1024`, we can reduce the size of our embeddings to 1024:

```typescript
import { OpenAIEmbeddings } from "@langchain/openai";

const embeddings1024 = new OpenAIEmbeddings({
  model: "text-embedding-3-large",
  dimensions: 1024,
});

const vectors1024 = await embeddings1024.embedDocuments(["some text"]);
console.log(vectors1024[0].length);
```

```text
1024
```
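
One practical reason to reduce dimensions is storage. As a back-of-the-envelope sketch, the snippet below estimates the raw size of one million vectors at 3072 and at 1024 dimensions. It assumes each value is stored as a 4-byte float32 and ignores index and metadata overhead; `embeddingStorageBytes` and the vector count are illustrative, not taken from the library.

```typescript
// Assumption: each dimension is stored as a 4-byte float32.
const BYTES_PER_FLOAT32 = 4;

// Hypothetical helper: raw bytes needed for the vectors alone
function embeddingStorageBytes(vectorCount: number, dimensions: number): number {
  return vectorCount * dimensions * BYTES_PER_FLOAT32;
}

const vectorCount = 1_000_000;
const fullSizeBytes = embeddingStorageBytes(vectorCount, 3072);
const reducedBytes = embeddingStorageBytes(vectorCount, 1024);

console.log(fullSizeBytes / 1e9); // 12.288 (GB)
console.log(reducedBytes / 1e9); // 4.096 (GB)
```

Under these assumptions, going from 3072 to 1024 dimensions cuts raw vector storage to one third.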


## Custom URLs

You can customize the base URL the SDK sends requests to by passing a `configuration` parameter like this:

```typescript
import { OpenAIEmbeddings } from "@langchain/openai";

const model = new OpenAIEmbeddings({
  configuration: {
    baseURL: "https://your_custom_url.com",
  },
});
```

You can also pass other `ClientOptions` parameters accepted by the official SDK.

If you are hosting on Azure OpenAI, see the [dedicated page instead](/docs/integrations/text_embedding/azure_openai).

## API reference

For detailed documentation of all `OpenAIEmbeddings` features and configurations, head to the [API reference](https://api.js.langchain.com/classes/langchain_openai.OpenAIEmbeddings.html).