core[minor]: Adds an in-memory implementation of RecordManager #13200

pprados · 2023-11-10T14:36:01Z

Description:
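langchain offers three technologies to save data:

- [vectorstore](https://python.langchain.com/docs/modules/data_connection/vectorstores/)
- [docstore](https://js.langchain.com/docs/api/schema/classes/Docstore)
- [record manager](https://python.langchain.com/docs/modules/data_connection/indexing)

If you want to combine these technologies in a sample persistence strategy, you need a common implementation for each. `DocStore` already provides `InMemoryDocstore`.

We propose the class `MemoryRecordManager` to complete the system.

This is the prelude to another pull request, which needs a consistent combination of persistence components.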

Tag maintainer:
@baskaryan

Twitter handle:
@pprados

- [Add a Wrapper vectorstore, compatible with SelfQueryRetriever](langchain-ai/langchain#13190)
- [Adds an in-memory implementation of RecordStore](langchain-ai/langchain#13200)
- [Add SQLDocStore](langchain-ai/langchain#13181)

hwchase17

this is good, but the async methods aren't actually async right?

pprados · 2023-12-06T08:30:57Z

No. Everything is async, because all methods work in memory, without any I/O.
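To make the exchange above concrete, here is a minimal sketch of what "async methods over purely in-memory state" can look like. The class name `SketchInMemoryRecordManager` and its method set are assumptions for illustration only; this is not the code from this PR. Each coroutine simply delegates to its synchronous counterpart, so it never awaits any I/O and runs to completion without yielding to the event loop.

```python
import asyncio
import time
from typing import Dict, List, Sequence


class SketchInMemoryRecordManager:
    """Hypothetical stand-in for illustration; not the class added by this PR."""

    def __init__(self, namespace: str) -> None:
        self.namespace = namespace
        # key -> timestamp of the last update, held in a plain dict
        self._records: Dict[str, float] = {}

    def update(self, keys: Sequence[str]) -> None:
        now = time.time()
        for key in keys:
            self._records[key] = now

    def exists(self, keys: Sequence[str]) -> List[bool]:
        return [key in self._records for key in keys]

    def delete_keys(self, keys: Sequence[str]) -> None:
        for key in keys:
            self._records.pop(key, None)

    # The async variants only wrap the sync ones: there is nothing to await,
    # because every operation is a dictionary access.
    async def aupdate(self, keys: Sequence[str]) -> None:
        self.update(keys)

    async def aexists(self, keys: Sequence[str]) -> List[bool]:
        return self.exists(keys)

    async def adelete_keys(self, keys: Sequence[str]) -> None:
        self.delete_keys(keys)


async def demo() -> List[bool]:
    manager = SketchInMemoryRecordManager("demo")
    await manager.aupdate(["doc-1", "doc-2"])
    await manager.adelete_keys(["doc-2"])
    return await manager.aexists(["doc-1", "doc-2", "doc-3"])


if __name__ == "__main__":
    print(asyncio.run(demo()))  # [True, False, False]
```

Whether such wrappers count as "actually async" is a matter of definition: they are awaitable and safe to call from async code, but they offer no concurrency benefit because no step ever suspends.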

pprados · 2023-12-15T12:59:05Z

@hwchase17
Any news?

libs/community/langchain_community/indexes/memory_recordmanager.py

pprados · 2024-06-13T15:46:24Z

@eyurtsev
I can reproduce this bug with:

git checkout master
cd /libs/core
poetry lock --no-update
Resolving dependencies... (0.0s)

Because langchain-core depends on langchain-text-splitters (0.2.1) @ file:///home/pprados/workspace.bda/langchain/libs/text-splitters which depends on langchain-core (^0.2.0), langchain-core is required.
So, because langchain-core is 0.0.0a64, version solving failed.

pprados · 2024-06-14T05:54:05Z

@eyurtsev
In the CI test run (`cd libs/core`, then `make extended_tests`), I get an error:

INTERNALERROR>     raise Failed(msg=reason, pytrace=pytrace)
INTERNALERROR> Failed: Package `aiosqlite` is not installed but is required for extended tests. Please install the given package and try again.

It has nothing to do with my patch.

eyurtsev · 2024-06-14T16:12:04Z

@pprados please review

I reused the existing implementation of the in-memory record manager from unit tests:

- data is scoped to the instance, not to the class (we generally don't want to scope data to the class namespace)
- it has slightly more documentation

It's moved to base.py together with the abstract interface. We can move it into a separate module as well (doesn't really matter); we just need it to be a non-private module for it to appear in the API Reference documentation.

The typical name for memory implementations is "InMemory"
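The point above about scoping data to the instance rather than the class can be illustrated with a small sketch. Both class names below are hypothetical and exist only for this comparison; neither is taken from the PR. A mutable attribute declared in the class body is shared by every instance, while one created in `__init__` is private to each instance.

```python
from typing import Dict


class ClassScopedStore:
    """Hypothetical example: the dict lives in the class namespace."""

    # Declared once on the class, so every instance mutates the same dict.
    records: Dict[str, float] = {}

    def add(self, key: str, timestamp: float) -> None:
        self.records[key] = timestamp


class InstanceScopedStore:
    """Hypothetical example: each instance owns its own dict."""

    def __init__(self) -> None:
        # Created per instance, so stores do not leak state into each other.
        self.records: Dict[str, float] = {}

    def add(self, key: str, timestamp: float) -> None:
        self.records[key] = timestamp


if __name__ == "__main__":
    a, b = ClassScopedStore(), ClassScopedStore()
    a.add("doc-1", 1.0)
    print("doc-1" in b.records)  # True: b sees the key written through a

    c, d = InstanceScopedStore(), InstanceScopedStore()
    c.add("doc-1", 1.0)
    print("doc-1" in d.records)  # False: d has its own empty dict
```

With class-scoped data, two record managers created for different namespaces or different tests would silently share records, which is the kind of surprise instance scoping avoids.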

pprados · 2024-06-17T06:02:48Z

@eyurtsev
It's perfect for me.

pprados

Fixed

libs/core/langchain_core/indexing/_memory_recordmanager.py

@baskaryan

…22065)

# package community: Fix SQLChatMessageHistory

## Description

Here is a rewrite of `SQLChatMessageHistory` to properly implement the asynchronous approach. The code circumvents [issue 22021](#22021) by accepting a synchronous call to `def add_messages()` in an asynchronous scenario. This bypasses the bug.

For the same reasons as in [PR 32](langchain-ai/langchain-postgres#32) of `langchain-postgres`, we use a lazy strategy for table creation. Indeed, the promise of the constructor cannot be fulfilled without this: it is not possible to invoke an asynchronous call in a constructor. We compensate for this by waiting for the next asynchronous method call to create the table.

The goal of the `PostgresChatMessageHistory` class (in `langchain-postgres`) is, among other things, to be able to recycle database connections. The implementation of the class is problematic, as we have demonstrated in [issue 22021](#22021).

Our new implementation of `SQLChatMessageHistory` achieves this by using a singleton of type (`Async`)`Engine` for the database connection. The connection pool is managed by this singleton, and the code is then reentrant.

We also accept the type `str` (optionally complemented by `async_mode`. I know you don't like this much, but it's the only way to allow an asynchronous connection string).

In order to unify the different classes handling database connections, we have renamed `connection_string` to `connection`, and `Session` to `session_maker`.

Now, a single transaction is used to add a list of messages. Thus, a crash during this write operation will not leave the database in an unstable state with a partially added message list. This makes the code resilient.

We believe that the `PostgresChatMessageHistory` class is no longer necessary and can be replaced by:

```
PostgresChatMessageHistory = SQLChatMessageHistory
```

This also fixes the bug.

## Issue

- [issue 22021](#22021)
- Bug in _exit_history()
- Bugs in PostgresChatMessageHistory and sync usage
- Bugs in PostgresChatMessageHistory and async usage
- [issue 36](langchain-ai/langchain-postgres#36)

## Twitter handle:

pprados

## Tests

- libs/community/tests/unit_tests/chat_message_histories/test_sql.py (add async test)

@baskaryan, @eyurtsev or @hwchase17 can you check this PR? Also, I've been waiting a long time for validation of other PRs. Can you take a look?

- [PR 32](langchain-ai/langchain-postgres#32)
- [PR 15575](#15575)
- [PR 13200](#13200)

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
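The lazy table-creation strategy described in the commit message above can be sketched as follows. This is only an illustration under stated assumptions: `LazyTableHistory`, `_acreate_table_if_needed`, and the in-memory list standing in for a database table are all hypothetical, and the real `SQLChatMessageHistory` code is not reproduced here. The idea is that `__init__` does no asynchronous work, and the first async method call creates the table exactly once, guarded by a lock.

```python
import asyncio
from typing import List


class LazyTableHistory:
    """Hypothetical sketch of lazy initialization; not the library's code."""

    def __init__(self) -> None:
        # A constructor cannot await, so table creation is deferred.
        self._table_created = False
        self._lock = asyncio.Lock()
        self.create_calls = 0  # counts how often the "table" was created
        self._rows: List[str] = []  # stands in for the database table

    async def _acreate_table_if_needed(self) -> None:
        if self._table_created:
            return
        async with self._lock:
            # Re-check inside the lock: another task may have created it.
            if not self._table_created:
                await asyncio.sleep(0)  # placeholder for the real async DDL call
                self.create_calls += 1
                self._table_created = True

    async def aadd_messages(self, messages: List[str]) -> None:
        await self._acreate_table_if_needed()
        self._rows.extend(messages)


async def demo() -> int:
    # Constructed inside the running loop so the lock belongs to that loop.
    history = LazyTableHistory()
    await asyncio.gather(
        history.aadd_messages(["hello"]),
        history.aadd_messages(["world"]),
    )
    return history.create_calls


if __name__ == "__main__":
    print(asyncio.run(demo()))  # 1
```

The double check around the lock matters when several coroutines hit the object before the table exists: without it, each of them could run the creation step.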

@baskaryan

**Description:**
langchain offers three technologies to save data:

- [vectorstore](https://python.langchain.com/docs/modules/data_connection/vectorstores/)
- [docstore](https://js.langchain.com/docs/api/schema/classes/Docstore)
- [record manager](https://python.langchain.com/docs/modules/data_connection/indexing)

If you want to combine these technologies in a sample persistence strategy, you need a common implementation for each. `DocStore` already provides `InMemoryDocstore`.

We propose the class `MemoryRecordManager` to complete the system.

This is the prelude to another pull request, which needs a consistent combination of persistence components.

**Tag maintainer:** @baskaryan

**Twitter handle:** @pprados

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>

dosubot bot added Ɑ: memory Related to memory module 🤖:enhancement A large net-new component, integration, or chain. Use sparingly. The largest features labels Nov 10, 2023

pprados force-pushed the pprados/memory_recordmanager branch from e3814f2 to d6b1611 Compare November 10, 2023 14:49

pprados mentioned this pull request Nov 10, 2023

ParentDocumentRetriever: parent_splitter and ids are incompatible #11982

Closed

14 tasks

pprados mentioned this pull request Nov 10, 2023

ParentDocumentRetriever need splitter and not transformer #11968

Closed

vercel bot deployed to Preview November 20, 2023 10:28 View deployment

pprados force-pushed the pprados/memory_recordmanager branch from cf2d422 to a9e342a Compare November 23, 2023 10:54

dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Nov 23, 2023

pprados force-pushed the pprados/memory_recordmanager branch from c041b27 to b069a09 Compare November 27, 2023 09:14

pprados changed the title ~~Adds an in-memory implementation of RecordStore~~ Adds an in-memory implementation of RecordManager Nov 27, 2023

pprados mentioned this pull request Nov 27, 2023

Pprados/rag vectorstore #13910

Closed

hwchase17 reviewed Dec 5, 2023

pprados added 4 commits January 23, 2024 13:15

Adds an in-memory implementation of RecordStore

249d8a1

Fix spell

0d5f09f

Update to last langchain version

3d9e062

Fix a race condition in SQLAlchemyMd5Cache

f667410

pprados force-pushed the pprados/memory_recordmanager branch from b069a09 to f667410 Compare January 23, 2024 13:39

Fix a race condition in SQLAlchemyMd5Cache

ec53014

dosubot bot added size:S This PR changes 10-29 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Jan 23, 2024

Add MemoryRecordManager

cb37574

dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:S This PR changes 10-29 lines, ignoring generated files. labels Jan 23, 2024

pprados commented Jan 23, 2024

libs/community/langchain_community/indexes/memory_recordmanager.py (outdated)

pprados marked this pull request as ready for review June 13, 2024 08:56

pprados added 2 commits June 13, 2024 13:49

Merge branch 'master' into pprados/memory_recordmanager

c12c666

Merge branch 'master' into pprados/memory_recordmanager

d97dcc1

Merge remote-tracking branch 'upstream/master' into pprados/memory_recordmanager

bd24d2a

pprados added 2 commits June 14, 2024 16:47

Merge branch 'master' into pprados/memory_recordmanager

1bf3821

Merge branch 'master' into pprados/memory_recordmanager

1ce6aed

pprados marked this pull request as draft June 14, 2024 15:47

pprados marked this pull request as ready for review June 14, 2024 15:47

qxqx

c62bb8c

dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:XL This PR changes 500-999 lines, ignoring generated files. labels Jun 14, 2024

x

ca79b22

pprados added a commit to pprados/langchain-rag that referenced this pull request Jun 17, 2024

Sync with langchain-ai/langchain#13200

0afcc00

Merge branch 'master' into pprados/memory_recordmanager

bfc83ea

pprados marked this pull request as draft June 19, 2024 05:46

Merge branch 'master' into pprados/memory_recordmanager

f841067

pprados marked this pull request as ready for review June 19, 2024 14:07

ccurme added the Ɑ: core Related to langchain-core label Jun 19, 2024

pprados commented Jun 20, 2024

eyurtsev approved these changes Jun 20, 2024

libs/core/langchain_core/indexing/_memory_recordmanager.py (outdated)

dosubot bot added the lgtm PR looks good. Use to confirm that a PR is ready for merging. label Jun 20, 2024

eyurtsev merged commit 8711c61 into langchain-ai:master Jun 20, 2024
134 checks passed

pprados deleted the pprados/memory_recordmanager branch June 21, 2024 13:53
