feat: Add a feature to serialize and deserialize memory types to and from JSON #14032

Closed
wants to merge 5 commits into from

Aditya-k-23

Description: Add a feature to serialize and deserialize the memory types to and from JSON
Issue: #11275
Dependencies: No new dependencies
Tag maintainer: @baskaryan, @eyurtsev, @hwchase17
Co-Authors: @D3nam, @avaove, @malharpandya

…from JSON

* feat: Add serilization for memory

* nit: added dummy file for testing

* added tests and bug with vectorstore

* added unit tests for ConversationEntityMemory and ConversationTokenBufferMemory

* fix test cases

* feat: added from_json

Co-authored-by: Aditya Kulkarni <Aditya-k-23@users.noreply.github.com>

* small fix to ConversationEntityMemory unit test

* fix: fix vectorStoreRetriever

* turned string assert to json assert

* Add Conversation Summary Buffer

* Serialize Conversation Summary

* small fixes to unit tests

* feat: fix from_json()

Co-authored-by: Aditya Kulkarni <Aditya-k-23@users.noreply.github.com>

* feat: Added from_json for memories with llms

Co-authored-by: Aditya Kulkarni <Aditya-k-23@users.noreply.github.com>

* feat: fix test_cases

* feat: fix test cases

* linting: fix linting

* docs: started work on docs

* Add documentation

- Remove `trial.py`
- Add serialization.ipynb

* Update Formatting

---------

Co-authored-by: D3nam <rahl302000@gmail.com>
Co-authored-by: Ava Oveisi <87161280+avaove@users.noreply.github.com>
Co-authored-by: Aditya Kulkarni <Aditya-k-23@users.noreply.github.com>
@dosubot dosubot bot added the size:XXL This PR changes 1000+ lines, ignoring generated files. label Nov 29, 2023
@dosubot dosubot bot added Ɑ: memory Related to memory module 🤖:enhancement A large net-new component, integration, or chain. Use sparingly. The largest features labels Nov 29, 2023
@Aditya-k-23
Author

Our changes add serialization to, and deserialization from, JSON for the following memory types:

  • ConversationBufferWindowMemory
  • ConversationBufferMemory
  • ConversationSummaryBufferMemory
  • ConversationSummaryMemory
  • ConversationEntityMemory
  • ConversationTokenBufferMemory

For the following types, an additional llm argument must be passed for deserialization, because the current BaseLLM class is not lc_serializable:

  • ConversationSummaryBufferMemory
  • ConversationSummaryMemory
  • ConversationEntityMemory
  • ConversationTokenBufferMemory

Our implementation can be extended in the future to cover knowledge graphs and vector stores, provided their lc_serializable methods are implemented to support both serialization and deserialization.
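
A minimal sketch of the round trip described above, assuming a hypothetical `SummaryMemorySketch` class rather than the actual classes changed in this PR. The `to_json` and `from_json` names follow the commit messages, but the signatures shown here are assumptions. The only point it illustrates is that the non-serializable `llm` is left out of the JSON payload and supplied again by the caller on load.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Hypothetical stand-in for an LLM: any callable from prompt text to text.
LLM = Callable[[str], str]


@dataclass
class SummaryMemorySketch:
    """Illustrative memory object; not a LangChain class."""

    llm: Optional[LLM] = None  # live object, never written to JSON
    summary: str = ""
    messages: List[dict] = field(default_factory=list)

    def add(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})

    def to_json(self) -> str:
        # Only plain data goes into the payload; the llm is skipped.
        return json.dumps({"summary": self.summary, "messages": self.messages})

    @classmethod
    def from_json(
        cls, payload: str, llm: Optional[LLM] = None
    ) -> "SummaryMemorySketch":
        # The caller re-supplies the llm because it was never serialized.
        data = json.loads(payload)
        return cls(llm=llm, summary=data["summary"], messages=data["messages"])


def fake_llm(prompt: str) -> str:
    """Stub used in place of a real model."""
    return "stub reply"


memory = SummaryMemorySketch(llm=fake_llm)
memory.add("human", "hi")
memory.add("ai", "hello")

payload = memory.to_json()
restored = SummaryMemorySketch.from_json(payload, llm=fake_llm)
```

Passing `llm` separately keeps the payload portable, at the cost of requiring the caller to know which model the memory was originally built with.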

Contributor

@hwchase17 hwchase17 left a comment

thanks for opening this up. i don't think we want to do this quite yet. a few reasons:

  1. the memory abstractions may not be the best/most stable, so we want to rethink
  2. we generally make things serializable in both python and js at the same time

Curious to hear - why the desire to have them serializable?

@Aditya-k-23
Author

Aditya-k-23 commented Nov 30, 2023

We understand. We chose this issue because there were many user requests to make memory types storable in JSON format. We coupled it with deserialization so that users can pick up their work where they left off. We think this feature makes it possible to efficiently share, reuse, and manage memory objects across sessions and even with other users. For a particular use case, I would like to reference issue #11275. Maybe @lameTookan can shed more light on what their use case was.

@rupertlssmith

@hwchase17 "Curious to hear - why the desire to have them serializable?"

I am currently building a system where each step in the conversation is produced by invoking a stateless API, so any conversation state must be passed in through the parameters used to invoke the API.
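
A rough sketch of that stateless pattern, assuming a hypothetical `handle_turn` function and a plain JSON list of messages as the state format (neither comes from this PR or from LangChain). Each call receives the whole conversation state as an argument and returns the updated state alongside the reply, so nothing is kept on the server between calls.

```python
import json
from typing import List, Tuple


def handle_turn(state_json: str, user_input: str) -> Tuple[str, str]:
    """One stateless step: history comes in and goes out through the arguments."""
    # An empty string means the conversation is just starting.
    history: List[dict] = json.loads(state_json) if state_json else []
    history.append({"role": "human", "content": user_input})

    # Placeholder for a model call; a real service would generate the reply here.
    reply = f"echo: {user_input}"
    history.append({"role": "ai", "content": reply})

    # The caller stores the returned JSON and sends it back on the next call.
    return reply, json.dumps(history)


state = ""
reply1, state = handle_turn(state, "hello")
reply2, state = handle_turn(state, "again")
```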

@Aditya-k-23 Why do these memory types require an llm to deserialize?

  • ConversationSummaryBufferMemory
  • ConversationSummaryMemory

For example, as I understand it, ConversationSummaryBufferMemory contains a list of Human and AI conversation steps, a current buffer size or token count, and possibly a summary of the conversation so far that captures the overflow from the buffer. A serialized representation could hold these three things, and the state could be completely recovered from them.

What we want to avoid is having to make extra calls to the llm to re-create the summary from the conversation steps when deserializing, as this is potentially costly and time-consuming.
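
The idea can be sketched as follows, assuming a hypothetical `SummaryBufferState` snapshot with illustrative field names (not the PR's actual format). A counting stub stands in for the llm to show that restoring from JSON copies the stored summary back rather than regenerating it.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class SummaryBufferState:
    """Hypothetical snapshot of the three pieces of state named above."""

    messages: List[dict] = field(default_factory=list)  # recent Human/AI steps
    max_token_limit: int = 2000  # buffer size
    moving_summary: str = ""  # summary of steps that overflowed the buffer


class CountingSummarizer:
    """Stub standing in for an llm; counts how often it is called."""

    def __init__(self) -> None:
        self.calls = 0

    def summarize(self, text: str) -> str:
        self.calls += 1
        return text[:40]


def dump_state(state: SummaryBufferState) -> str:
    return json.dumps(asdict(state))


def load_state(payload: str) -> SummaryBufferState:
    # Pure data copy: no summarizer is involved in restoring the state.
    return SummaryBufferState(**json.loads(payload))


summarizer = CountingSummarizer()
state = SummaryBufferState(
    messages=[
        {"role": "human", "content": "Which plan is cheapest?"},
        {"role": "ai", "content": "The basic plan."},
    ],
    max_token_limit=50,
    # The one summarization call happens before serialization.
    moving_summary=summarizer.summarize("The user asked about pricing plans."),
)

restored = load_state(dump_state(state))
calls_after_restore = summarizer.calls
```

After the round trip, the call counter still reflects only the summarization that happened before serialization.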

@hwchase17
Contributor

i am going to close this. this isn't something we want to make serializable yet

@hwchase17 hwchase17 closed this Jun 19, 2024
Labels
Ɑ: core Related to langchain-core 🤖:enhancement A large net-new component, integration, or chain. Use sparingly. The largest features Ɑ: memory Related to memory module size:XXL This PR changes 1000+ lines, ignoring generated files.

5 participants