Update OpenAI embeddings stripNewLines to be default false (#3612)
* Updating the default value for `stripNewLines` to `false` in the `openai` embeddings. Removing newlines was beneficial for `V1` (`-001`) models, but should not be mandatory for `V2` (`-002`) models. This is explained in openai/openai-python#418 (comment)

Therefore updating this field to be in line with the default model, `text-embedding-ada-002`.

Also, the LangChain Python library only enables this for `-001` models: https://github.com/langchain-ai/langchain/blob/c0f4b95aa9961724ab4569049b4c3bc12ebbacfc/libs/langchain/langchain/embeddings/openai.py#L466

* Reverting the default value, so it's `true` by default again.
Marked with a comment to indicate that it should be changed to `false` in a future minor release.
Referenced the PR, as it explains why the default should be updated.

* Resolving conflicts, adding changes again to new location.
Knordy committed Dec 14, 2023
1 parent ad2da87 commit 1c486db
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion libs/langchain-openai/src/embeddings.ts
@@ -31,7 +31,8 @@ export interface OpenAIEmbeddingsParams extends EmbeddingsParams {

   /**
    * Whether to strip new lines from the input text. This is recommended by
-   * OpenAI, but may not be suitable for all use cases.
+   * OpenAI for older models, but may not be suitable for all use cases.
+   * See: https://github.com/openai/openai-python/issues/418#issuecomment-1525939500
    */
   stripNewLines?: boolean;
 }
@@ -59,6 +60,7 @@ export class OpenAIEmbeddings

   batchSize = 512;

+  // TODO: Update to `false` on next minor release (see: https://github.com/langchain-ai/langchainjs/pull/3612)
   stripNewLines = true;

   timeout?: number;
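
For readers unfamiliar with the option, here is a minimal sketch of the behavior that `stripNewLines` describes, together with a model-based default along the lines of what the commit message says the Python library does. Both `prepareTexts` and `shouldStripNewLines` are hypothetical helpers written for illustration; they are not part of `langchainjs`, and the actual implementation in `embeddings.ts` may differ.

```typescript
// Hypothetical helper (not langchainjs code): replaces every newline with a
// single space when stripping is enabled, otherwise returns the input as-is.
function prepareTexts(texts: string[], stripNewLines: boolean): string[] {
  return stripNewLines ? texts.map((t) => t.replace(/\n/g, " ")) : texts;
}

// Hypothetical policy (an assumption, not the library's logic): strip
// newlines only for first-generation models, whose names end in "-001".
function shouldStripNewLines(modelName: string): boolean {
  return modelName.endsWith("-001");
}

const input = ["line one\nline two"];

// With stripping enabled, the newline becomes a space.
const stripped = prepareTexts(input, true);

// With stripping disabled, the text reaches the model unchanged.
const untouched = prepareTexts(input, false);

// The default model named in this commit would not be stripped under this policy.
const stripForDefaultModel = shouldStripNewLines("text-embedding-ada-002");
```

Because the default remains `true` after this commit, callers who want newlines preserved today would presumably need to pass `stripNewLines: false` explicitly when constructing `OpenAIEmbeddings`.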
