Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle length safe embedding only if needed #3723

Merged

Conversation

ravwojdyla
Copy link
Contributor

Re: #3722

Copy pasting context from the issue:

https://github.com/hwchase17/langchain/blob/1bf1c37c0cccb7c8c73d87ace27cf742f814dbe5/langchain/embeddings/openai.py#L210-L211

Means that the length safe embedding method is "always" used, initial implementation #991 has the embedding_ctx_length set to -1 (meaning you had to opt-in for the length safe method), #2330 changed that to max length of OpenAI embeddings v2, meaning the length safe method is used at all times.

How about changing that if branch to use length safe method only when needed, meaning when the text is longer than the max context length?

Copy link
Contributor

@hwchase17 hwchase17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense, thanks!

@hwchase17 hwchase17 merged commit 37ed6f2 into langchain-ai:master Apr 29, 2023
9 checks passed
samching pushed a commit to samching/langchain that referenced this pull request May 1, 2023
Re: langchain-ai#3722

Copy pasting context from the issue:


https://github.com/hwchase17/langchain/blob/1bf1c37c0cccb7c8c73d87ace27cf742f814dbe5/langchain/embeddings/openai.py#L210-L211

Means that the length safe embedding method is "always" used, initial
implementation langchain-ai#991 has the
`embedding_ctx_length` set to -1 (meaning you had to opt-in for the
length safe method), langchain-ai#2330
changed that to max length of OpenAI embeddings v2, meaning the length
safe method is used at all times.

How about changing that if branch to use length safe method only when
needed, meaning when the text is longer than the max context length?
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants