Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

community[minor]: Add SQL storage implementation #22207

Merged
merged 16 commits into from
Jun 7, 2024

Conversation

pprados
Copy link
Contributor

@pprados pprados commented May 27, 2024

Hello @eyurtsev

  • package: langchain-comminity
  • Description: Add SQL implementation for docstore. A new implementation, in line with my other PR (async PGVector, SQLChatMessageMemory)
  • Twitter handler: pprados

Copy link

vercel bot commented May 27, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
langchain ✅ Ready (Inspect) Visit Preview 💬 Add feedback Jun 7, 2024 9:17pm

@pprados pprados changed the title Pprados/new sql docstore Add SQL docstore May 27, 2024
@pprados pprados force-pushed the pprados/new_sql_docstore branch 4 times, most recently from cbc83bb to 2671786 Compare May 27, 2024 13:16
@pprados pprados marked this pull request as ready for review May 27, 2024 13:32
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. 🤖:improvement Medium size change to existing code to handle new use-cases labels May 27, 2024
@dosubot dosubot bot added the lgtm PR looks good. Use to confirm that a PR is ready for merging. label Jun 5, 2024
@eyurtsev eyurtsev changed the title Add SQL docstore community[minor]: Add SQL storage implementation Jun 5, 2024
libs/community/langchain_community/storage/sql.py Outdated Show resolved Hide resolved
libs/community/langchain_community/storage/sql.py Outdated Show resolved Hide resolved
libs/community/langchain_community/storage/sql.py Outdated Show resolved Hide resolved
libs/community/langchain_community/storage/sql.py Outdated Show resolved Hide resolved
libs/community/langchain_community/storage/sql.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@eyurtsev eyurtsev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically ready to merge, we just need to resolve a few issues with the sql alchemy schema

@dosubot dosubot bot removed the lgtm PR looks good. Use to confirm that a PR is ready for merging. label Jun 5, 2024
pprados and others added 8 commits June 7, 2024 12:31
…ins/agents/docs (langchain-ai#16168)

Revamp SQL use cases docs. In the process update SQL chains and agents.
…corator (langchain-ai#16295)

Adjusted `deprecate` decorator to make sure decorated async functions
are still recognized as "coroutinefunction" by `inspect`.

Before change, functions such as `LLMChain.acall` which are decorated as
deprecated are not recognized as coroutine functions. After the change,
they are recognized:

```python
import inspect
from langchain import LLMChain

# Is false before change but true after.
inspect.iscoroutinefunction(LLMChain.acall)
```
- **Description:** add milvus multitenancy doc, it is an example for
this [pr](langchain-ai#15740) .
  - **Issue:** No,
  - **Dependencies:** No,
  - **Twitter handle:** No

Signed-off-by: ChengZi <chen.zhang@zilliz.com>
@dosubot dosubot bot added size:XL This PR changes 500-999 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Jun 7, 2024
@pprados pprados marked this pull request as draft June 7, 2024 10:49
@pprados pprados marked this pull request as ready for review June 7, 2024 11:38
@pprados
Copy link
Contributor Author

pprados commented Jun 7, 2024

@eyurtsev
It's correct now, but Merging is blocked because "@eyurtsev requested changes".

Please, can you check this other Add async mode for pgvector ? it has been blocked for 3 weeks because "@eyurtsev requested changes", but I can't find the changes I want, so I can't validate them

@@ -31,6 +31,7 @@ optional = true
pytest = "^7.3.0"
pytest-cov = "^4.1.0"
pytest-dotenv = "^0.5.2"
aiosqlite = "^0.19.0"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI @pprados aiosqlite should be in extended testing only not in this test group -- I'll move it to the appropriate place

this is reserved for testing infrastructure (pytest, pytest-cov, etc.)

There are a few things that don't belong here (e.g., duckdb-engine, lark), but we'll remove those as well at some point

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK it's already in extended testing, so we should be all set 🤞

@dosubot dosubot bot added the lgtm PR looks good. Use to confirm that a PR is ready for merging. label Jun 7, 2024
@eyurtsev eyurtsev enabled auto-merge (squash) June 7, 2024 21:04
@@ -0,0 +1,186 @@
"""Implement integration tests for Redis storage."""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd recommend removing integration tests since they duplicate the content of unit tests.

unit tests will run on CI right now under the extended tests run.

At the moment, integration tests are not run as part of CI (since they typically require access to external resources that are not set up)

@eyurtsev eyurtsev merged commit 9aabb44 into langchain-ai:master Jun 7, 2024
44 checks passed
eyurtsev pushed a commit that referenced this pull request Jun 10, 2024
…elf query retriever (#22678)

The fact that we outsourced pgvector to another project has an
unintended effect. The mapping dictionary found by
`_get_builtin_translator()` cannot recognize the new version of pgvector
because it comes from another package.
`SelfQueryRetriever` no longer knows `PGVector`.

I propose to fix this by creating a global dictionary that can be
populated by various database implementations. Thus, importing
`langchain_postgres` will allow the registration of the `PGvector`
mapping.

But for the moment I'm just adding a lazy import

Furthermore, the implementation of _get_builtin_translator()
reconstructs the BUILTIN_TRANSLATORS variable with each invocation,
which is not very efficient. A global map would be an optimization.

- **Twitter handle:** pprados

@eyurtsev, can you review this PR? And unlock the PR [Add async mode for
pgvector](langchain-ai/langchain-postgres#32)
and PR [community[minor]: Add SQL storage
implementation](#22207)?

Are you in favour of a global dictionary-based implementation of
Translator?
pprados added a commit to pprados/langchain that referenced this pull request Jun 11, 2024
…elf query retriever (langchain-ai#22678)

The fact that we outsourced pgvector to another project has an
unintended effect. The mapping dictionary found by
`_get_builtin_translator()` cannot recognize the new version of pgvector
because it comes from another package.
`SelfQueryRetriever` no longer knows `PGVector`.

I propose to fix this by creating a global dictionary that can be
populated by various database implementations. Thus, importing
`langchain_postgres` will allow the registration of the `PGvector`
mapping.

But for the moment I'm just adding a lazy import

Furthermore, the implementation of _get_builtin_translator()
reconstructs the BUILTIN_TRANSLATORS variable with each invocation,
which is not very efficient. A global map would be an optimization.

- **Twitter handle:** pprados

@eyurtsev, can you review this PR? And unlock the PR [Add async mode for
pgvector](langchain-ai/langchain-postgres#32)
and PR [community[minor]: Add SQL storage
implementation](langchain-ai#22207)?

Are you in favour of a global dictionary-based implementation of
Translator?
@pprados pprados deleted the pprados/new_sql_docstore branch June 18, 2024 06:34
hinthornw pushed a commit that referenced this pull request Jun 20, 2024
Hello @eyurtsev

- package: langchain-comminity
- **Description**: Add SQL implementation for docstore. A new
implementation, in line with my other PR ([async
PGVector](langchain-ai/langchain-postgres#32),
[SQLChatMessageMemory](#22065))
- Twitter handler: pprados

---------

Signed-off-by: ChengZi <chen.zhang@zilliz.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Piotr Mardziel <piotrm@gmail.com>
Co-authored-by: ChengZi <chen.zhang@zilliz.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
hinthornw pushed a commit that referenced this pull request Jun 20, 2024
…elf query retriever (#22678)

The fact that we outsourced pgvector to another project has an
unintended effect. The mapping dictionary found by
`_get_builtin_translator()` cannot recognize the new version of pgvector
because it comes from another package.
`SelfQueryRetriever` no longer knows `PGVector`.

I propose to fix this by creating a global dictionary that can be
populated by various database implementations. Thus, importing
`langchain_postgres` will allow the registration of the `PGvector`
mapping.

But for the moment I'm just adding a lazy import

Furthermore, the implementation of _get_builtin_translator()
reconstructs the BUILTIN_TRANSLATORS variable with each invocation,
which is not very efficient. A global map would be an optimization.

- **Twitter handle:** pprados

@eyurtsev, can you review this PR? And unlock the PR [Add async mode for
pgvector](langchain-ai/langchain-postgres#32)
and PR [community[minor]: Add SQL storage
implementation](#22207)?

Are you in favour of a global dictionary-based implementation of
Translator?
efriis added a commit that referenced this pull request Jul 19, 2024
xfailing some sql tests that do not currently work on sqlalchemy v1

#22207 was very much not sqlalchemy v1 compatible. 

Moving forward, implementations should be compatible with both to pass
CI
olgamurraft pushed a commit to olgamurraft/langchain that referenced this pull request Aug 16, 2024
xfailing some sql tests that do not currently work on sqlalchemy v1

langchain-ai#22207 was very much not sqlalchemy v1 compatible. 

Moving forward, implementations should be compatible with both to pass
CI
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🤖:improvement Medium size change to existing code to handle new use-cases lgtm PR looks good. Use to confirm that a PR is ready for merging. size:XL This PR changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants