In [7]:
# !pip install langchain
# !pip install langchain_community
# !pip install langchain-groq


In [None]:
# What is persisting?
    # Persisting means saving data so that it can be reused later.

# What Does "Persisting Memory" Mean?
    # When you persist memory, you're storing it in a permanent or semi-permanent storage like:
        # - A database (e.g., PostgreSQL, Redis)
        # - A file (e.g., JSON, CSV)
        # - A cloud service (e.g., Pinecone, Zep)
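
In [None]:
# A minimal sketch of persisting memory to a file, using only the standard library.
# The helper names (save_history, load_history) and the file name are hypothetical,
# chosen for illustration; they are not part of LangChain.

```python
import json
import tempfile
from pathlib import Path


def save_history(path, messages):
    """Write the chat history to disk so it survives a restart."""
    path.write_text(json.dumps(messages, indent=2), encoding="utf-8")


def load_history(path):
    """Read the chat history back; return an empty list if nothing was saved."""
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))


# Hypothetical file location inside a fresh temporary directory (demo only).
history_file = Path(tempfile.mkdtemp()) / "chat_history_demo.json"

messages = [
    {"role": "human", "content": "Hi, my name is Santhosh Kumar"},
    {"role": "ai", "content": "Hello Santhosh Kumar! How can I assist you today?"},
]
save_history(history_file, messages)

# Simulate a new session: the in-memory list could be gone, the file is not.
restored = load_history(history_file)
print(restored[0]["content"])
```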

In [None]:
# The Memory module persists state between calls of a chain or agent through a standard interface, so the language model can remember
# previous interactions and use that information to make better decisions.

# Memory is critical for personal assistants, autonomous agents, and agent simulations, where the language model needs to remember previous 
# interactions to make informed decisions.

In [None]:
# What does it do?
    # The Memory module stores user inputs, system responses, and any other relevant information from a conversation.
    # The stored information is then supplied to the model during subsequent interactions, so its answers reflect earlier context.

In [None]:
# Why do I need it?
    # The Memory module helps you build more interactive and personalized applications.

In [None]:
# ### Why Memory Is Needed in GenAI Applications

#  1.  `Maintaining Context`
        # -   Memory allows the AI to  `remember previous messages`  in a conversation.
        # -   This is crucial for  `multi-turn interactions`, like chatbots or virtual assistants.

#  2.  `Personalization`
        # -   Memory helps GenAI apps  `remember user preferences`, names, goals, or past behavior.

#  3.  `Long-Term Knowledge`
        # -   Some applications need to  `recall facts or decisions`  made in earlier sessions.

#  4.  `Efficient Reasoning`
        # -   Memory can store  `intermediate steps or summaries`, helping the AI reason better over time.


#  5.  `Semantic Recall`
        # -   With vector-based memory, GenAI can  `recall similar past conversations`  even if phrased differently.

#  6.  `Persistence Across Sessions`
        # -   Memory enables  `continuity`  so users don’t have to repeat themselves every time they return.

In [None]:
# Memory Storage types
#  - Permanent Storage
#  - Temporary Storage

# Permanent Storage Types
    # 1. Databases
    #     - PostgreSQL, MySQL, MongoDB, Redis (Structured data, chat history, user profiles)
    # 2. File Systems	
    #     - JSON, CSV, TXT, XML files	(Logs, configs, lightweight memory)
    # 3. Vector Stores	
    #     - FAISS, Pinecone, Chroma	 (Embedding-based memory, semantic search)
    # 4. Cloud Storage	
    #     - AWS S3, Google Cloud Storage, Azure Blob	(Large files, backups, logs)
    # 5. Key-Value Stores	
    #     - Redis, DynamoDB	 (Fast access to session or user data)
    # 6. Memory Services
    #     - Zep (Specialized memory for LLMs and agents)


# Temporary Storage Types
    # 1. In-Memory Variables	
    #     - Python dictionaries, lists, objects
    # 2. Session Memory	
    #     - ConversationBufferMemory, ConversationEntityMemory (without persistence)
    # 3. Cache	
    #     - RAM-based cache, browser cache	
    # 4. Temporary Files	
    #     - OS temp folders, tempfile in Python	
    # 5. Ephemeral Databases	
    #     - In-memory SQLite, Redis (without persistence)	
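
In [None]:
# A small sketch contrasting temporary and permanent storage with the standard-library
# sqlite3 module. The table layout and file name are assumptions made for this demo.
# An in-memory database starts empty on every connection; a file-backed one keeps its rows.

```python
import os
import sqlite3
import tempfile


def store_and_count(conn, text):
    """Create the table if needed, add one message, return the row count."""
    conn.execute("CREATE TABLE IF NOT EXISTS messages (role TEXT, content TEXT)")
    conn.execute("INSERT INTO messages VALUES (?, ?)", ("human", text))
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM messages").fetchone()[0]


# Temporary: an in-memory database disappears when the connection closes.
conn = sqlite3.connect(":memory:")
store_and_count(conn, "Hi")
conn.close()
conn = sqlite3.connect(":memory:")  # a fresh, empty database
temporary_count = store_and_count(conn, "Hi again")
conn.close()

# Permanent: a file-backed database keeps rows written earlier.
db_path = os.path.join(tempfile.mkdtemp(), "chat_demo.db")  # hypothetical path
conn = sqlite3.connect(db_path)
store_and_count(conn, "Hi")
conn.close()
conn = sqlite3.connect(db_path)  # reopen, as a restarted app would
permanent_count = store_and_count(conn, "Hi again")
conn.close()

print(temporary_count, permanent_count)
```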

In [None]:
# ### Types of memory in LangChain
    # 1.  `ConversationBufferMemory`  – Stores the full conversation history in memory.
    # 2.  `ConversationBufferWindowMemory`  – Keeps only the last  _k_  messages in memory.
    # 3.  `ConversationSummaryMemory`  – Summarizes the conversation using an LLM to reduce memory size.
    # 4.  `ConversationTokenBufferMemory`  – Maintains memory based on token count rather than message count.
    # 5.  `ConversationEntityMemory`  – Extracts named entities from the conversation and stores facts about them.
    # 6.  `VectorStoreRetrieverMemory`  – Uses a vector store to retrieve relevant past interactions based on similarity.
    # 7.  `ZepMemory`  – Integrates with Zep for long-term memory storage and retrieval.
    # 8.  `PostgresChatMessageHistory`  /  `RedisChatMessageHistory`  – Chat message history backends that store messages in external databases like PostgreSQL or Redis.

In [None]:
## What is a Buffer?
    # A buffer is a temporary storage area used to hold data.
    # Why Use a Buffer?
        # To maintain context across multiple turns in a conversation.
        # To allow the model to generate more coherent and relevant responses by remembering past inputs and outputs.

In [None]:
# ### LangChain Memory Types & Usefulness

# -   `ConversationBufferMemory`  
#     Stores the full conversation history.  
#     ➤ Useful for short to medium conversations where full context is needed.
    
# -   `ConversationBufferWindowMemory`  
#     Keeps only the last  _k_  messages.  
#     ➤ Useful for long conversations where only recent context matters.
    
# -   `ConversationSummaryMemory`  
#     Summarizes the conversation using an LLM.  
#     ➤ Useful for long conversations to retain key points without exceeding token limits.
    
# -   `ConversationTokenBufferMemory`  
#     Stores messages based on token count.  
#     ➤ Useful when working with models that have strict token limits.
    
# -   `VectorStoreRetrieverMemory`  
#     Retrieves relevant past interactions using vector similarity.  
#     ➤ Useful for semantic memory and long-term knowledge retrieval.
    
# -   `ZepMemory`  
#     Integrates with Zep for persistent memory and search.  
#     ➤ Useful for scalable, long-term memory across sessions.
    
# -   `PostgresChatMessageHistory`  /  `RedisChatMessageHistory`  
#     Stores chat history in external databases.  
#     ➤ Useful for persistent memory across users or sessions.

# - `ConversationEntityMemory`
#  - Tracks entities (like people, places, or things) mentioned during a conversation and stores information about them.
#  - Extracts named entities from the conversation.
#  - ➤ Useful for personal assistants, customer support bots, or multi-turn agents where tracking people, products, or topics is important.

# - `VectorStoreRetrieverMemory`
#  - Enables semantic memory using embeddings and vector similarity search.
#  - Rather than replaying the whole history, it indexes past interactions by their vector representations and retrieves the most relevant ones based on
#    similarity to the current input.
#  - ➤ Useful for 
# 	- Chatbots that need to remember facts across sessions.
# 	- Assistants that need to recall similar past queries.
# 	- Applications where semantic similarity is more important than exact match.
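
In [None]:
# A toy sketch of similarity-based recall, to make the retrieval idea above concrete.
# Assumption: a bag-of-words count vector stands in for a real embedding model, so this
# only matches on shared words; real embeddings also catch paraphrases.
# ToyRetrieverMemory is a hypothetical class, not LangChain's VectorStoreRetrieverMemory.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyRetrieverMemory:
    def __init__(self, k=1):
        self.k = k
        self.entries = []  # list of (text, vector) pairs

    def save(self, text):
        self.entries.append((text, embed(text)))

    def recall(self, query):
        """Return the k stored texts most similar to the query."""
        query_vec = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(query_vec, e[1]), reverse=True)
        return [text for text, _ in ranked[: self.k]]


toy_memory = ToyRetrieverMemory(k=1)
toy_memory.save("My favorite food is pizza")
toy_memory.save("I work as a software engineer")
toy_memory.save("My dog is called Bruno")

print(toy_memory.recall("What is my favorite food?"))
```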

In [None]:
# ### Temporary Memory Types

    # These exist  only during the session  and are lost when the app stops or restarts (unless explicitly persisted).
    
    # -   `ConversationBufferMemory`
    # -   `ConversationBufferWindowMemory`
    # -   `ConversationSummaryMemory`
    # -   `ConversationTokenBufferMemory`
    # -   `ConversationEntityMemory`
    # -   `VectorStoreRetrieverMemory`  (if using in-memory vector store like FAISS without saving)

# ### Permanent Memory Types

    # These are  persisted  to external storage and survive across sessions.
    
    # -   `PostgresChatMessageHistory`
    # -   `RedisChatMessageHistory`
    # -   `ZepMemory`
    # -   `VectorStoreRetrieverMemory`  (when using persistent vector stores like Pinecone, Chroma with disk storage, etc.)

In [None]:
# Which memory type should you use?
    #  - Short, exact recall: `ConversationBufferMemory`
    #  - Long conversations: `ConversationSummaryMemory / ConversationSummaryBufferMemory`
    #  - Short-term memory efficiency: `ConversationBufferWindowMemory`
    #  - Entity-focused structured recall: `ConversationEntityMemory`
    #  - Deep, semantic retrieval/connections: `VectorStoreRetrieverMemory`

In [None]:
# ### Example Use Cases

# -   **Chatbot for customer support**  →  `ConversationEntityMemory`  +  `VectorStoreRetrieverMemory`
# -   **Personal assistant**  →  `ConversationSummaryMemory`  +  `ConversationEntityMemory`
# -   **FAQ bot with long-term recall**  →  `VectorStoreRetrieverMemory`
# -   **Multi-user app with login sessions**  →  `RedisChatMessageHistory`  or  `PostgresChatMessageHistory`

In [None]:
# ### Choosing the Right Memory Type for Your Use Case

# Selecting the appropriate memory type depends on several factors:

# -   `Duration and Complexity`: Short sessions benefit from detailed context retention with ConversationBufferMemory, while 
#      long-term interactions may require summarization via ConversationSummaryMemory.
# -   `Detail vs. Overview`: Determine whether detailed interaction history or high-level summaries are more valuable for your application.
# -   `Performance`: Consider the trade-offs between the memory size and retrieval speed.

# `Use Cases:`

# -   `ConversationBufferMemory`: Ideal for quick customer support or FAQ-style interactions.
# -   `ConversationSummaryMemory`: Best suited for long-term engagements like project management or ongoing customer interactions.

In [93]:
# Common issues/problems 

# Memory Bloat: Happens when too much conversation history is stored.
    # ➤ Solution: Limit memory by number of messages (windowed) or token count.

# Context Drift: The model loses track of the topic or entities.
    # ➤ Solution: Use summarization or track entities explicitly.

# High Token Costs: Large memory increases API costs.
    # ➤ Solution: Keep memory compact and summarized.

# Privacy Concerns: Storing user data can raise privacy issues.
    # ➤ Solution: Use temporary (session-based) memory and define clear data policies.

In [None]:
# `Memory bloat`  refers to the problem of storing  `too much conversation history or context`, which can lead to:

# -   `High token usage`  (increased cost and slower responses)
# -   `Reduced model performance`  (due to exceeding token limits)
# -   `Unnecessary or irrelevant context`  being included in prompts

# ### Why It Happens:

# -   Using memory types that store  `entire conversation history`  (like  `ConversationBufferMemory`) without limits.
# -   Not summarizing or trimming old messages.
# -   Storing repeated or redundant information.

# ### How to Avoid It:

# -   Use  `windowed memory`  (`ConversationBufferWindowMemory`) to keep only the last  _k_  messages.
# -   Use  `token-limited memory`  (`ConversationTokenBufferMemory`) to control memory size by token count.
# -   Use  `summarization`  (`ConversationSummaryMemory`) to condense older parts of the conversation.


In [None]:
# `Context drift` happens when a generative AI model `loses track of the topic or entities` being discussed over the course of a conversation. 
#  This can lead to `irrelevant, confusing, or incorrect responses`.

# ### Why It Happens:

# -   The conversation becomes  `too long`, and earlier context is forgotten or pushed out.
# -   The model  `misinterprets references`  like “he,” “it,” or “that” due to missing background.
# -   The memory is  `not structured`  to track key entities or topics.

# ### Example of Context Drift:

# `User`: "Tell me about Alice. She’s a software engineer."  
# `Later`: "Where does she work?"  
# → If the model forgets who “she” is, it may give a wrong or generic answer.


# ### How to Prevent It:
#  1. `Summarize past context`
#          - `ConversationSummaryMemory`
#  2. `Track entities explicitly`
#          - `ConversationEntityMemory`
#  3. `Limit memory to recent turns`
#          - `ConversationBufferWindowMemory`
#  4. `Use semantic memory`
#          - `VectorStoreRetrieverMemory`
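
In [None]:
# A rough sketch of explicit entity tracking, applied to the Alice example above.
# ToyEntityMemory is a hypothetical helper: facts are added by hand, and a pronoun is
# resolved by falling back to the most recently mentioned entity. ConversationEntityMemory
# relies on an LLM for extraction, so treat this only as an illustration of the idea.

```python
from collections import defaultdict


class ToyEntityMemory:
    """Keeps a list of facts per entity plus the most recently mentioned entity."""

    def __init__(self):
        self.facts = defaultdict(list)
        self.last_entity = None

    def remember(self, entity, fact):
        self.facts[entity].append(fact)
        self.last_entity = entity

    def context_for(self, user_input):
        # Use entities named in the input; otherwise fall back to the last one
        # mentioned (a crude stand-in for resolving "she", "he", or "it").
        mentioned = [e for e in self.facts if e.lower() in user_input.lower()]
        if not mentioned and self.last_entity is not None:
            mentioned = [self.last_entity]
        return "\n".join(f"{e}: {'; '.join(self.facts[e])}" for e in mentioned)


entities = ToyEntityMemory()
entities.remember("Alice", "is a software engineer")

# The follow-up question never names Alice, yet her facts are still supplied.
print(entities.context_for("Where does she work?"))
```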


In [None]:

# `High token cost` refers to the `increased computational and financial cost` of using large numbers of tokens (the sub-word units of text
#  a model reads and writes) in prompts and responses when interacting with language models like GPT.

# ### Why It Happens:

# -   Storing and sending  `long conversation histories`  to the model.
# -   Using  `verbose prompts`  or  `unoptimized memory`.
# -   Including  `irrelevant or redundant context`  in each request.

# ### Why It Matters:

# -   Most LLM APIs (like OpenAI, Anthropic)  `charge per token`.
# -   More tokens =  `higher cost`  and  `slower response times`.
# -   Can hit  `token limits`, causing errors or truncated outputs.

# ### How to Reduce Token Costs:

# -   `Summarize memory`: Use  `ConversationSummaryMemory`  to condense older messages.
# -   `Limit memory size`: Use  `ConversationBufferWindowMemory`  or  `ConversationTokenBufferMemory`.
# -   `Trim unnecessary context`: Only include relevant parts of the conversation.
# -   `Use semantic memory`: Retrieve only the most relevant past messages using  `VectorStoreRetrieverMemory`.
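
In [None]:
# A worked arithmetic sketch of why history size drives token usage.
# Assumptions: every exchange costs the same number of tokens (50 here, an invented figure),
# and each request resends the stored history plus the new turn. No real pricing is used.

```python
from typing import Optional


def tokens_sent(turn_tokens: int, turns: int, window: Optional[int] = None) -> int:
    """Total tokens sent over `turns` requests.

    On request n the prompt carries the previous n-1 turns (full buffer), or at
    most `window` of them (windowed memory), plus the new turn itself.
    """
    total = 0
    for n in range(1, turns + 1):
        history_turns = n - 1 if window is None else min(n - 1, window)
        total += (history_turns + 1) * turn_tokens
    return total


TURN_TOKENS = 50  # assumed average tokens per exchange
TURNS = 20

full = tokens_sent(TURN_TOKENS, TURNS)               # grows quadratically with turns
windowed = tokens_sent(TURN_TOKENS, TURNS, window=3)  # grows linearly after the window fills
print(full, windowed)
```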

In [None]:
# ### `Benefits of Optimizing Memory in LLMs`

# -   `Reduced Token Costs`  
#     By trimming unnecessary context and summarizing conversations, you reduce the number of tokens sent to the model, saving money and API usage.

# -   `Faster Response Times`  
#     Smaller prompts mean quicker processing, leading to faster replies and a smoother user experience.

# -   `Improved Relevance and Accuracy`  
#     Optimized memory helps the model focus on the most important and recent context, reducing confusion and improving the quality of responses.

# -   `Better Token Management`  
#     Helps stay within the model’s token limits, avoiding errors like truncated outputs or context loss.

# -   `Enhanced Privacy Control`  
#     By storing only essential or summarized data, you reduce the risk of exposing sensitive user information.

# -   `Scalability`  
#     Efficient memory handling allows your application to support more users and longer sessions without performance degradation.

# -   `Easier Debugging and Maintenance`  
#     Cleaner, more structured memory makes it easier to understand and debug the model’s behavior.

In [None]:


# ### `Strategies to Optimize Memory in LLMs`

# #### 1.  `Use Windowed or Token-Limited Memory`

# -   Limit memory to the  `last few messages`  (`ConversationBufferWindowMemory`) or a  `maximum token count`  (`ConversationTokenBufferMemory`).
# -   Prevents memory bloat and keeps prompts within model limits.

# #### 2.  `Summarize Older Context`

# -   Use  `ConversationSummaryMemory`  to  `compress earlier parts`  of the conversation into a summary.
# -   Keeps the essence of the conversation while reducing token usage.

# #### 3.  `Track Entities Instead of Full History`

# -   Use  `ConversationEntityMemory`  to  `store facts about key entities`  (like people or places) instead of full messages.
# -   Helps maintain relevance without storing everything.

# #### 4.  `Use Semantic Memory for Relevance`

# -   Implement  `VectorStoreRetrieverMemory`  to  `retrieve only relevant past interactions`  using embeddings.
# -   Avoids sending irrelevant history to the model.

# #### 5.  `Clear or Reset Memory When Needed`

# -   For privacy or performance,  `reset memory`  at logical points (e.g., end of session, task completion).

# #### 6.  `Avoid Redundant Information`

# -   Don’t repeat static instructions or facts in every prompt.
# -   Use system prompts or separate context injection when needed.

# #### 7.  `Persist Only What Matters`

# -   Store only  `important facts or summaries`  in persistent memory (e.g., Redis, Postgres, Pinecone).
# -   Avoid saving full chat logs unless necessary.
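
In [None]:
# A sketch of strategy 2 (summarize older context) combined with a short verbatim window.
# ToySummaryBuffer and naive_summarizer are hypothetical: the summarizer just keeps the
# first sentence of each line, where a real application would call an LLM instead.
# keep_last must be at least 1 in this sketch.

```python
from typing import Callable, List


def naive_summarizer(summary: str, new_lines: List[str]) -> str:
    """Stand-in for an LLM call: keeps only the first sentence of each line."""
    pieces = [line.split(".")[0].strip() for line in new_lines]
    return " | ".join(p for p in [summary, *pieces] if p)


class ToySummaryBuffer:
    def __init__(self, summarize: Callable[[str, List[str]], str], keep_last: int = 2):
        self.summarize = summarize
        self.keep_last = keep_last
        self.summary = ""
        self.recent: List[str] = []

    def add(self, line: str) -> None:
        self.recent.append(line)
        # Anything older than the last `keep_last` lines is folded into the summary.
        overflow = self.recent[: -self.keep_last]
        if overflow:
            self.summary = self.summarize(self.summary, overflow)
            self.recent = self.recent[-self.keep_last :]

    def prompt_context(self) -> str:
        parts = ([f"Summary: {self.summary}"] if self.summary else []) + self.recent
        return "\n".join(parts)


buffer = ToySummaryBuffer(naive_summarizer, keep_last=2)
for line in [
    "Human: Hello. How are you?",
    "AI: Fine. What is on your mind?",
    "Human: What is on the schedule today?",
    "AI: A meeting at 8am with your product team.",
]:
    buffer.add(line)

print(buffer.prompt_context())
```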

In [None]:
from utils import read_json
from utils import (
            get_model_openAI,
            get_model_groq,
            get_openAI_embeddings
)

In [None]:
model = get_model_openAI(model="gpt-4")

In [9]:
model.invoke("Hi, what's up?").content

"Hello! I'm an AI and I'm here to assist you. How can I help you today?"

In [74]:
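# Using ConversationBufferMemory with a ConversationChain
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory()
conversation = ConversationChain(memory=memory,
                  llm=model,
                  verbose=True)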

conversation.predict(input="Hi, my name is Santhosh Kumar")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:

H: Hi, my name is Santhosh Kumar
AI:

> Finished chain.


"Hello Santhosh Kumar! It's nice to meet you. How can I assist you today?"

In [76]:
conversation.predict(input="How many emotional states are there?")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: Hi, my name is Santhosh Kumar
AI: Hello Santhosh Kumar! It's nice to meet you. How can I assist you today?
Human: How many emotional states are there?
AI: Emotions are complex and can be categorized in many ways. However, most psychologists agree on a set of six basic emotions that are universally experienced in all human cultures. These are: happiness, sadness, fear, anger, surprise, and disgust. Some researchers have expanded this list to include others like anticipation, trust, and love. However, it's important to remember that emotions can be mixed and nuanced, creating a vast array of emotional states.
Human: How many emotional states a

'As mentioned before, there are six basic emotions that are universally recognized: happiness, sadness, fear, anger, surprise, and disgust. However, this is a simplified model and some researchers argue for a more complex range of emotions. For instance, Robert Plutchik proposed an "emotion wheel" that includes 8 primary emotions and a variety of secondary and tertiary emotions, resulting in a total of 32 emotional states. In reality, the number can be even larger when you consider nuanced and mixed emotions. So, the exact number of emotional states can be quite vast and hard to quantify.'

In [77]:
conversation.predict(input="What is my name?")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: Hi, my name is Santhosh Kumar
AI: Hello Santhosh Kumar! It's nice to meet you. How can I assist you today?
Human: How many emotional states are there?
AI: Emotions are complex and can be categorized in many ways. However, most psychologists agree on a set of six basic emotions that are universally experienced in all human cultures. These are: happiness, sadness, fear, anger, surprise, and disgust. Some researchers have expanded this list to include others like anticipation, trust, and love. However, it's important to remember that emotions can be mixed and nuanced, creating a vast array of emotional states.
Human: How many emotional states a

'Your name is Santhosh Kumar, as you introduced yourself earlier in the conversation.'

In [78]:
from langchain.memory import ConversationBufferWindowMemory

memory = ConversationBufferWindowMemory(k=1)

## Add example exchanges to memory
memory.save_context({"input": "Hi"},
                    {"output": "What's up"})

memory.save_context({"input": "Not much, just hanging"},
                    {"output": "Cool"})


conversation = ConversationChain(
    llm=model, 
    memory = memory,
    verbose=False
)


memory.load_memory_variables({})

{'history': 'Human: Not much, just hanging\nAI: Cool'}
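
In [None]:
# A plain-Python sketch of the windowing behaviour shown above: with k=1 only the latest
# exchange survives. ToyWindowMemory is a hypothetical class built on collections.deque;
# it mirrors the idea, not LangChain's internal implementation.

```python
from collections import deque


class ToyWindowMemory:
    """Keeps only the last k (human, ai) exchanges."""

    def __init__(self, k):
        self.exchanges = deque(maxlen=k)  # older exchanges fall off the left end

    def save_context(self, human, ai):
        self.exchanges.append((human, ai))

    def history(self):
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.exchanges)


window = ToyWindowMemory(k=1)
window.save_context("Hi, my name is Santhosh Kumar", "Hello Santhosh Kumar!")
window.save_context("Not much, just hanging", "Cool")

# The first exchange, including the name, has been dropped.
print(window.history())
```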

In [79]:
conversation.predict(input="What is my name?")

"I'm sorry, but as an AI, I don't have access to personal data about individuals unless it has been shared with me in the course of our conversation. I am designed to respect user privacy and confidentiality."

In [80]:
from langchain.memory import ConversationSummaryMemory

# create a long string
schedule = "There is a meeting at 8am with your product team. \
You will need your powerpoint presentation prepared. \
9am-12pm have time to work on your LangChain \
project which will go quickly because LangChain is such a powerful tool. \
At Noon, lunch at the Italian restaurant with a customer who is driving \
from over an hour away to meet you to understand the latest in AI. \
Be sure to bring your laptop to show the latest LLM demo."


# max_token_limit is a ConversationSummaryBufferMemory setting and has no effect here, so it is omitted.
memory = ConversationSummaryMemory(llm=model)

memory.save_context({"input": "Hello"}, {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"},
                    {"output": "Cool"})
memory.save_context({"input": "What is on the schedule today?"}, 
                    {"output": f"{schedule}"})

memory.load_memory_variables({})

{'history': 'The human greets the AI casually and the AI responds in kind. The conversation continues in a laid-back manner. When asked about the schedule for the day, the AI informs the human of a morning meeting with the product team, time set aside for work on the LangChain project, and a lunch meeting at an Italian restaurant with a customer interested in AI. The AI stresses the importance of being prepared with a PowerPoint presentation and a laptop for the LLM demo.'}

In [81]:
conversation=ConversationChain(memory=memory, 
                  llm=model,
                  verbose=True)

conversation.predict(input="What cool demo in laptop can he show")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
The human greets the AI casually and the AI responds in kind. The conversation continues in a laid-back manner. When asked about the schedule for the day, the AI informs the human of a morning meeting with the product team, time set aside for work on the LangChain project, and a lunch meeting at an Italian restaurant with a customer interested in AI. The AI stresses the importance of being prepared with a PowerPoint presentation and a laptop for the LLM demo.
Human: What cool demo in laptop can he show
AI:[0m

[1m> Finished chain.[0m


"The LLM (Language Learning Machine) demo that he can show is really impressive. It's a machine learning model that can translate languages in real time. It can also analyze and understand natural language, making it capable of tasks like summarizing articles, answering questions, and even generating human-like text. It's a great illustration of the capabilities of AI and machine learning."

In [83]:
# ConversationTokenBufferMemory

# Instead of remembering the last “k” exchanges as ConversationBufferWindowMemory does,
# this memory keeps the most recent messages that fit within “max_token_limit”.

from langchain.memory import ConversationTokenBufferMemory

memory=ConversationTokenBufferMemory(llm=model, max_token_limit=25)

memory.save_context({"input": "AI is what?!"},
                    {"output": "Amazing!"})
memory.save_context({"input": "Backpropagation is what?"},
                    {"output": "Beautiful!"})
memory.save_context({"input": "Chatbots are what?"}, 
                    {"output": "Charming!"})

memory.load_memory_variables({})


{'history': 'AI: Beautiful!\nHuman: Chatbots are what?\nAI: Charming!'}

In [84]:


memory=ConversationTokenBufferMemory(llm=model, max_token_limit=50)

memory.save_context({"input": "AI is what?!"},
                    {"output": "Amazing!"})
memory.save_context({"input": "Backpropagation is what?"},
                    {"output": "Beautiful!"})
memory.save_context({"input": "Chatbots are what?"}, 
                    {"output": "Charming!"})

memory.load_memory_variables({})



{'history': 'Human: AI is what?!\nAI: Amazing!\nHuman: Backpropagation is what?\nAI: Beautiful!\nHuman: Chatbots are what?\nAI: Charming!'}
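
In [None]:
# A plain-Python sketch of token-based pruning, to show why the smaller limit above starts
# the history in the middle of an exchange. Assumption: tokens are approximated by
# whitespace-separated words, whereas the real class uses the model's tokenizer, so the
# limit of 8 here is not comparable to the limits of 25 and 50 used above.

```python
def count_tokens(text):
    """Crude token estimate: whitespace-separated words (an assumption)."""
    return len(text.split())


class ToyTokenBuffer:
    def __init__(self, max_token_limit):
        self.max_token_limit = max_token_limit
        self.messages = []

    def save_context(self, human, ai):
        self.messages += [f"Human: {human}", f"AI: {ai}"]
        # Drop the oldest message, one at a time, until the buffer fits the budget.
        while self.messages and sum(map(count_tokens, self.messages)) > self.max_token_limit:
            self.messages.pop(0)

    def history(self):
        return "\n".join(self.messages)


token_buffer = ToyTokenBuffer(max_token_limit=8)
token_buffer.save_context("AI is what?!", "Amazing!")
token_buffer.save_context("Backpropagation is what?", "Beautiful!")
token_buffer.save_context("Chatbots are what?", "Charming!")

print(token_buffer.history())
```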

In [20]:

# from langchain.memory import ConversationSummaryMemory  # Deprecated
from langchain.memory import ConversationBufferMemory
from langchain.memory import ChatMessageHistory

In [24]:
memory = ConversationBufferMemory(return_messages=True)  # no llm needed: the buffer stores raw messages

memory.save_context(
    {"input": "Santhosh and MayaManish are working on a hackathon project."},
    {"output": "That's awesome! What's the hackathon project about?"}
)

In [25]:
memory.load_memory_variables({})

{'history': [HumanMessage(content='Santhosh and MayaManish are working on a hackathon project.', additional_kwargs={}, response_metadata={}),
  AIMessage(content="That's awesome! What's the hackathon project about?", additional_kwargs={}, response_metadata={})]}

In [26]:

summary=memory.load_memory_variables({})

In [27]:

print(summary["history"][0].content)

Santhosh and MayaManish are working on a hackathon project.


In [28]:

memory.save_context(
    {"input": "It's a machine learning project focused on healthcare."},
    {"output": "Sounds exciting! Are they building a prediction model or something else"}
)
memory.save_context(
    {"input": "Yes, they’re working on a model to predict patient outcomes."},
    {"output": "Impressive! How far along are they with the project?"}
)
     

In [29]:

memory.load_memory_variables({})

{'history': [HumanMessage(content='Santhosh and MayaManish are working on a hackathon project.', additional_kwargs={}, response_metadata={}),
  AIMessage(content="That's awesome! What's the hackathon project about?", additional_kwargs={}, response_metadata={}),
  HumanMessage(content="It's a machine learning project focused on healthcare.", additional_kwargs={}, response_metadata={}),
  AIMessage(content='Sounds exciting! Are they building a prediction model or something else', additional_kwargs={}, response_metadata={}),
  HumanMessage(content='Yes, they’re working on a model to predict patient outcomes.', additional_kwargs={}, response_metadata={}),
  AIMessage(content='Impressive! How far along are they with the project?', additional_kwargs={}, response_metadata={})]}

In [32]:
summary=memory.load_memory_variables({})

print(summary["history"][0].content)

Santhosh and MayaManish are working on a hackathon project.


In [33]:
memory.load_memory_variables({})

{'history': [HumanMessage(content='Santhosh and MayaManish are working on a hackathon project.', additional_kwargs={}, response_metadata={}),
  AIMessage(content="That's awesome! What's the hackathon project about?", additional_kwargs={}, response_metadata={}),
  HumanMessage(content="It's a machine learning project focused on healthcare.", additional_kwargs={}, response_metadata={}),
  AIMessage(content='Sounds exciting! Are they building a prediction model or something else', additional_kwargs={}, response_metadata={}),
  HumanMessage(content='Yes, they’re working on a model to predict patient outcomes.', additional_kwargs={}, response_metadata={}),
  AIMessage(content='Impressive! How far along are they with the project?', additional_kwargs={}, response_metadata={})]}

In [34]:
print('Progressively summarize the lines of conversation provided, adding onto the previous summary returning a new summary.\n\nEXAMPLE\nCurrent summary:\nThe human asks what the AI thinks of artificial intelligence. The AI thinks artificial intelligence is a force for good.\n\nNew lines of conversation:\nHuman: Why do you think artificial intelligence is a force for good?\nAI: Because artificial intelligence will help humans reach their full potential.\n\nNew summary:\nThe human asks what the AI thinks of artificial intelligence. The AI thinks artificial intelligence is a force for good because it will help humans reach their full potential.\nEND OF EXAMPLE\n\nCurrent summary:\n{summary}\n\nNew lines of conversation:\n{new_lines}\n\nNew summary:')
     

Progressively summarize the lines of conversation provided, adding onto the previous summary returning a new summary.

EXAMPLE
Current summary:
The human asks what the AI thinks of artificial intelligence. The AI thinks artificial intelligence is a force for good.

New lines of conversation:
Human: Why do you think artificial intelligence is a force for good?
AI: Because artificial intelligence will help humans reach their full potential.

New summary:
The human asks what the AI thinks of artificial intelligence. The AI thinks artificial intelligence is a force for good because it will help humans reach their full potential.
END OF EXAMPLE

Current summary:
{summary}

New lines of conversation:
{new_lines}

New summary:
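
In [None]:
# A small sketch of how the {summary} and {new_lines} placeholders in the prompt above get
# filled before the text is sent to a model. SUMMARY_TEMPLATE is an abbreviated copy of that
# prompt (the worked EXAMPLE is left out), and demo_summary is an invented starting summary.

```python
SUMMARY_TEMPLATE = (
    "Progressively summarize the lines of conversation provided, "
    "adding onto the previous summary returning a new summary.\n\n"
    "Current summary:\n{summary}\n\n"
    "New lines of conversation:\n{new_lines}\n\n"
    "New summary:"
)

# Hypothetical summary of the conversation so far.
demo_summary = "The human says Santhosh and MayaManish are working on a hackathon project."

# The newest exchange, rendered as plain text lines.
new_lines = "\n".join([
    "Human: It's a machine learning project focused on healthcare.",
    "AI: Sounds exciting! Are they building a prediction model or something else",
])

prompt = SUMMARY_TEMPLATE.format(summary=demo_summary, new_lines=new_lines)
print(prompt)
```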


In [35]:
memory.chat_memory


InMemoryChatMessageHistory(messages=[HumanMessage(content='Santhosh and MayaManish are working on a hackathon project.', additional_kwargs={}, response_metadata={}), AIMessage(content="That's awesome! What's the hackathon project about?", additional_kwargs={}, response_metadata={}), HumanMessage(content="It's a machine learning project focused on healthcare.", additional_kwargs={}, response_metadata={}), AIMessage(content='Sounds exciting! Are they building a prediction model or something else', additional_kwargs={}, response_metadata={}), HumanMessage(content='Yes, they’re working on a model to predict patient outcomes.', additional_kwargs={}, response_metadata={}), AIMessage(content='Impressive! How far along are they with the project?', additional_kwargs={}, response_metadata={})])

In [36]:
memory.chat_memory.messages

[HumanMessage(content='Santhosh and MayaManish are working on a hackathon project.', additional_kwargs={}, response_metadata={}),
 AIMessage(content="That's awesome! What's the hackathon project about?", additional_kwargs={}, response_metadata={}),
 HumanMessage(content="It's a machine learning project focused on healthcare.", additional_kwargs={}, response_metadata={}),
 AIMessage(content='Sounds exciting! Are they building a prediction model or something else', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='Yes, they’re working on a model to predict patient outcomes.', additional_kwargs={}, response_metadata={}),
 AIMessage(content='Impressive! How far along are they with the project?', additional_kwargs={}, response_metadata={})]

In [41]:

previous_summary=""

In [42]:
messages = memory.chat_memory.messages

messages

[HumanMessage(content='Santhosh and MayaManish are working on a hackathon project.', additional_kwargs={}, response_metadata={}),
 AIMessage(content="That's awesome! What's the hackathon project about?", additional_kwargs={}, response_metadata={}),
 HumanMessage(content="It's a machine learning project focused on healthcare.", additional_kwargs={}, response_metadata={}),
 AIMessage(content='Sounds exciting! Are they building a prediction model or something else', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='Yes, they’re working on a model to predict patient outcomes.', additional_kwargs={}, response_metadata={}),
 AIMessage(content='Impressive! How far along are they with the project?', additional_kwargs={}, response_metadata={})]

In [43]:
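# predict_new_summary belongs to ConversationSummaryMemory; on the ConversationBufferMemory above it raises
# AttributeError: 'ConversationBufferMemory' object has no attribute 'predict_new_summary'
from langchain.memory import ConversationSummaryMemory

ConversationSummaryMemory(llm=model).predict_new_summary(messages, previous_summary)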

In [47]:

history = ChatMessageHistory()
history.add_user_message("Hi")

history.add_ai_message("Hello, how can I assist you today?")

ConversationSummaryMemory.from_messages(
    llm=model,
    chat_memory=history,
    memory_key="summary",
    human_prefix="User",
    ai_prefix="AI"
)

ConversationSummaryMemory(human_prefix='User', llm=ChatOpenAI(client=<openai.resources.chat.completions.completions.Completions object at 0x000001382CC53110>, async_client=<openai.resources.chat.completions.completions.AsyncCompletions object at 0x000001382CC5B4D0>, root_client=<openai.OpenAI object at 0x000001382B52AAD0>, root_async_client=<openai.AsyncOpenAI object at 0x000001382CC53250>, model_name='gpt-4', temperature=0.7, model_kwargs={}, openai_api_key=SecretStr('**********')), chat_memory=InMemoryChatMessageHistory(messages=[HumanMessage(content='Hi', additional_kwargs={}, response_metadata={}), AIMessage(content='Hello, how can I assist you today?', additional_kwargs={}, response_metadata={})]), buffer='The user greets the AI and the AI responds by asking how it can assist the user today.', memory_key='summary')

In [48]:
memory = ConversationSummaryMemory.from_messages(
    llm=model,
    chat_memory=history,
    memory_key="summary",
    human_prefix="User",
    ai_prefix="AI"
)


In [49]:

memory.buffer

'The user greets the AI and the AI responds by asking how it can assist the user today.'
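The buffer above already holds a summary because `from_messages` summarizes the existing history while constructing the memory. As far as I understand it, the history is walked in small steps and the running summary is updated at each step. The sketch below shows that folding pattern in plain Python. `fold_summary` and `toy_summarize` are hypothetical names, the step size of 2 is an assumption, and `toy_summarize` stands in for the LLM call by only tracking how many lines it has absorbed.

```python
from typing import Callable, List


def fold_summary(
    lines: List[str],
    summarize: Callable[[str, List[str]], str],
    step: int = 2,
) -> str:
    """Walk the history `step` lines at a time, updating a running summary."""
    summary = ""
    for i in range(0, len(lines), step):
        summary = summarize(summary, lines[i:i + step])
    return summary


def toy_summarize(previous: str, new_lines: List[str]) -> str:
    """Stand-in for the LLM: records how many lines have been absorbed so far."""
    seen = int(previous.split()[0]) if previous else 0
    return f"{seen + len(new_lines)} lines summarized"


history_lines = ["User: Hi", "AI: Hello, how can I assist you today?"]
print(fold_summary(history_lines, toy_summarize))
```

The point of the pattern is that each step only needs the previous summary and the newest lines, so the input to the model stays small even when the full history is long.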

In [55]:
## The chaining starts from here

In [56]:
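from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationChain

# Despite the variable name, this chain uses ConversationBufferMemory.
# Buffer memory stores the raw transcript and does not need an LLM of its own.
conversation_with_summary = ConversationChain(
    llm=model,
    memory=ConversationBufferMemory(),
    verbose=True  # print the fully formatted prompt on every call
)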

conversation_with_summary.predict(input="Hi, what's up?")

> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:

Human: Hi, what's up?
AI:

> Finished chain.


"Hello! I'm here, processing information and assisting users like you. How can I help you today?"

In [57]:
conversation_with_summary.predict(input="Sunny and Mayank are working on a mlops production ready project.")

> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: Hi, what's up?
AI: Hello! I'm here, processing information and assisting users like you. How can I help you today?
Human: Sunny and Mayank are working on a mlops production ready project.
AI:

> Finished chain.


"That's interesting! MLOps, or DevOps for machine learning, is all about bridging the gap between the development (Dev) and operation (Ops) sides of machine learning. It's a practice for collaboration and communication between data scientists and operations professionals to help manage production ML lifecycle. \n\nSunny and Mayank must be involved in tasks like building, testing, and deploying machine learning models efficiently and reliably. They may also be working on automating various steps in ML lifecycle such as data collection, model training, model serving, monitoring, etc. \n\nWould you like to know more about MLOps or details related to their project?"

In [58]:
conversation_with_summary.predict(input="so can you describe mlops pipeline to me with in six point.")

> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: Hi, what's up?
AI: Hello! I'm here, processing information and assisting users like you. How can I help you today?
Human: Sunny and Mayank are working on a mlops production ready project.
AI: That's interesting! MLOps, or DevOps for machine learning, is all about bridging the gap between the development (Dev) and operation (Ops) sides of machine learning. It's a practice for collaboration and communication between data scientists and operations professionals to help manage production ML lifecycle. 

Sunny and Mayank must be involved in tasks like building, testing, and deploying machine learning models efficiently and reliably. They may also

"Absolutely, here is a simplified description of an MLOps pipeline broken down into six points:\n\n1. **Data Collection and Preparation**: This is the first step where raw data is collected from various sources, cleaned, and transformed into a format that can be used by machine learning models.\n\n2. **Model Development and Training**: In this stage, data scientists use the prepared data to develop and train machine learning models. They experiment with different algorithms and tune parameters to find the most effective model.\n\n3. **Model Validation**: Once a model is trained, it is validated using a different set of data to assess its performance and accuracy. This can help identify any issues or biases in the model.\n\n4. **Model Deployment**: After validation, the model is deployed into a production environment. This might involve integrating the model into existing systems and setting up APIs for other applications to use.\n\n5. **Model Monitoring and Management**: After deployme

In [59]:
conversation_with_summary.predict(input="How many total points are there?")

> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: Hi, what's up?
AI: Hello! I'm here, processing information and assisting users like you. How can I help you today?
Human: Sunny and Mayank are working on a mlops production ready project.
AI: That's interesting! MLOps, or DevOps for machine learning, is all about bridging the gap between the development (Dev) and operation (Ops) sides of machine learning. It's a practice for collaboration and communication between data scientists and operations professionals to help manage production ML lifecycle. 

Sunny and Mayank must be involved in tasks like building, testing, and deploying machine learning models efficiently and reliably. They may also

'In the description I provided, there are six main points or stages in the MLOps pipeline. However, depending on the complexity of the project and the specific needs of a team, there might be more detailed steps or sub-steps within each of these main stages. In general, the six stages I mentioned cover the essential elements of an MLOps pipeline.'

In [60]:

conversation_with_summary.predict(input="can you give me 5th point with explanation")

> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: Hi, what's up?
AI: Hello! I'm here, processing information and assisting users like you. How can I help you today?
Human: Sunny and Mayank are working on a mlops production ready project.
AI: That's interesting! MLOps, or DevOps for machine learning, is all about bridging the gap between the development (Dev) and operation (Ops) sides of machine learning. It's a practice for collaboration and communication between data scientists and operations professionals to help manage production ML lifecycle. 

Sunny and Mayank must be involved in tasks like building, testing, and deploying machine learning models efficiently and reliably. They may also

'Sure, the 5th point in the MLOps pipeline is "Model Monitoring and Management". \n\nAfter a machine learning model is deployed into a production environment, it doesn\'t mean the work is done. It\'s crucial to continuously monitor the model\'s performance to ensure it is still providing accurate and reliable predictions. This is what we mean by "Model Monitoring".\n\nThe performance of a model might degrade over time due to various factors, including changes in the underlying data it was trained on, also known as concept drift. This is why ongoing monitoring is so important, as it allows data scientists to detect any significant changes in model performance or accuracy.\n\n"Management" refers to the actions taken based on the results of monitoring. If monitoring reveals that the model\'s performance is declining, it might need to be retrained with new data, refined, or replaced entirely. This is a critical part of maintaining the reliability and usefulness of machine learning applicat

In [61]:

conversation_with_summary.memory.buffer

'Human: Hi, what\'s up?\nAI: Hello! I\'m here, processing information and assisting users like you. How can I help you today?\nHuman: Sunny and Mayank are working on a mlops production ready project.\nAI: That\'s interesting! MLOps, or DevOps for machine learning, is all about bridging the gap between the development (Dev) and operation (Ops) sides of machine learning. It\'s a practice for collaboration and communication between data scientists and operations professionals to help manage production ML lifecycle. \n\nSunny and Mayank must be involved in tasks like building, testing, and deploying machine learning models efficiently and reliably. They may also be working on automating various steps in ML lifecycle such as data collection, model training, model serving, monitoring, etc. \n\nWould you like to know more about MLOps or details related to their project?\nHuman: so can you describe mlops pipeline to me with in six point.\nAI: Absolutely, here is a simplified description of an 
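The verbose prompts above show the key property of buffer memory: every earlier turn is replayed verbatim, so the prompt grows with each call. The following sketch reproduces that behaviour in plain Python so the growth can be measured. `TinyBufferMemory` and `add_turn` are hypothetical names and not the LangChain API, and the AI replies are shortened placeholder strings, not the model's real answers.

```python
from typing import List, Tuple


class TinyBufferMemory:
    """Hypothetical stand-in for buffer memory: keeps every turn verbatim."""

    def __init__(self, human_prefix: str = "Human", ai_prefix: str = "AI") -> None:
        self.human_prefix = human_prefix
        self.ai_prefix = ai_prefix
        self.turns: List[Tuple[str, str]] = []

    def add_turn(self, user_input: str, ai_output: str) -> None:
        """Store one (user input, AI reply) exchange."""
        self.turns.append((user_input, ai_output))

    @property
    def buffer(self) -> str:
        """The full transcript as 'Human: ...' / 'AI: ...' lines."""
        lines = []
        for user_input, ai_output in self.turns:
            lines.append(f"{self.human_prefix}: {user_input}")
            lines.append(f"{self.ai_prefix}: {ai_output}")
        return "\n".join(lines)


memory_sketch = TinyBufferMemory()
sizes = []  # transcript length in characters after each turn
for user_input, ai_output in [
    ("Hi, what's up?", "Hello! How can I help you today?"),
    ("Describe an MLOps pipeline.", "It has six stages, from data collection to retraining."),
    ("How many total points are there?", "Six."),
]:
    memory_sketch.add_turn(user_input, ai_output)
    sizes.append(len(memory_sketch.buffer))

print(sizes)  # each entry is larger than the one before it
```

Because the transcript only ever grows, long conversations eventually become expensive or exceed the model's context window, which is the motivation for summary-based memory.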