Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

langchain[minor]: Make EmbeddingsFilters async #22737

Merged
merged 18 commits into from
Jun 12, 2024

Conversation

pprados
Copy link
Contributor

@pprados pprados commented Jun 10, 2024

Thank you for contributing to LangChain!

  • PR title: community: EmbeddingsFilters is not compatible with async

  • PR message:

    • Description: EmbeddingsFilters is not compatible with async
    • Issue: Current implementation call sync methods
    • Twitter handle: pprados
  • Add tests and docs: Test were updated

  • Lint and test: Run make format, make lint and make test from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/

@eyurtsev, another review ?

Copy link

vercel bot commented Jun 10, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Ignored Deployment
Name Status Preview Comments Updated (UTC)
langchain ⬜️ Ignored (Inspect) Visit Preview Jun 12, 2024 10:50am

@pprados pprados force-pushed the pprados/fix_embeddings_filter branch from 7bd7f32 to d9f4eff Compare June 10, 2024 14:14
@pprados
Copy link
Contributor Author

pprados commented Jun 10, 2024

@eyurtsev
Once again, an async bug

@pprados pprados marked this pull request as ready for review June 10, 2024 14:20
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. Ɑ: embeddings Related to text embedding models module 🤖:improvement Medium size change to existing code to handle new use-cases labels Jun 10, 2024
Copy link
Collaborator

@eyurtsev eyurtsev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good -- could you undo changes in pyproject.toml and poetry.lock -- most PRs should leave those unomdified

@eyurtsev eyurtsev self-assigned this Jun 10, 2024
@pprados pprados changed the title Make EmbeddingsFilters async langchain[minor]: Make EmbeddingsFilters async Jun 11, 2024
@pprados pprados marked this pull request as draft June 11, 2024 07:16
@pprados
Copy link
Contributor Author

pprados commented Jun 11, 2024

Hello @eyurtsev

In test_embeddings_filter.py, I change
from langchain_community.embeddings import OpenAIEmbeddings
to
from langchain_openai.embeddings import OpenAIEmbeddings

But, for that, I need to add a dependency in pyproject.tml

[tool.poetry.group.test_integration.dependencies]
...
langchain-openai = ">=0.1.8"

But, the poetry lock --no-update return:

Because langchain-openai (0.1.8) depends on tiktoken (>=0.7,<1)
 and no versions of langchain-openai match >0.1.8, langchain-openai (>=0.1.8) requires tiktoken (>=0.7,<1).
So, because langchain-community depends on both tiktoken (>=0.3.2,<0.6.0) and langchain-openai (>=0.1.8), version solving failed.

Do you think I should keep the original import?

@pprados pprados marked this pull request as ready for review June 11, 2024 07:37
@eyurtsev
Copy link
Collaborator

Hello @eyurtsev

In test_embeddings_filter.py, I change from langchain_community.embeddings import OpenAIEmbeddings to from langchain_openai.embeddings import OpenAIEmbeddings

But, for that, I need to add a dependency in pyproject.tml

[tool.poetry.group.test_integration.dependencies]
...
langchain-openai = ">=0.1.8"

But, the poetry lock --no-update return:

Because langchain-openai (0.1.8) depends on tiktoken (>=0.7,<1)
 and no versions of langchain-openai match >0.1.8, langchain-openai (>=0.1.8) requires tiktoken (>=0.7,<1).
So, because langchain-community depends on both tiktoken (>=0.3.2,<0.6.0) and langchain-openai (>=0.1.8), version solving failed.

Do you think I should keep the original import?

Yes, please keep the original import.

If you want to spend more effort, you could fix the test logic itself (not required for this PR) -- the test should not depend on the embedding provider, and instead can leverage fake embeddings in langchain_core. If that's done, then the test can be moved into unit tests.

@pprados
Copy link
Contributor Author

pprados commented Jun 12, 2024

@eyurtsev
"The Integration docs lint" checks is blocked in all the worlflow.

@@ -67,6 +67,7 @@ tiktoken = ">=0.3.2,<0.6.0"
anthropic = "^0.3.11"
langchain-core = { path = "../core", develop = true }
langchain = { path = "../langchain", develop = true }
#langchain-openai = ">=0.1.8"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you revert changes in pyproject toml and poetry lock please? Most PRs should not be modifying these files

@dosubot dosubot bot added the lgtm PR looks good. Use to confirm that a PR is ready for merging. label Jun 12, 2024
@eyurtsev eyurtsev merged commit 23c22fc into langchain-ai:master Jun 12, 2024
61 checks passed
@pprados pprados deleted the pprados/fix_embeddings_filter branch June 18, 2024 06:33
hinthornw pushed a commit that referenced this pull request Jun 20, 2024
Add native async implementation for EmbeddingsFilter
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Ɑ: embeddings Related to text embedding models module 🤖:improvement Medium size change to existing code to handle new use-cases lgtm PR looks good. Use to confirm that a PR is ready for merging. size:L This PR changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants