# Conversation Memory

Conversation memory is the mechanism that lets an LLM-powered app (such as a chatbot or agent) remember previous interactions within a session. It is what makes a conversation feel coherent, contextual, and “alive.”

- Prepare a Chat Model

In [69]:
import warnings
warnings.filterwarnings("ignore")


In [70]:
# Change this section to match the model API you have access to.
from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint
repo_id = "mistralai/Mistral-7B-Instruct-v0.3"
# Text-generation endpoint served through the Hugging Face Hub
llm = HuggingFaceEndpoint(
    repo_id=repo_id,
    task="text-generation",  # Ensure the correct task type
    temperature=0.5,
    max_new_tokens=128  # Adjust max_new_tokens for generation
)

# Set up ChatHuggingFace
chatllm = ChatHuggingFace(llm=llm)

# Test with a prompt
try:
    response = chatllm.invoke("Explain the concept of quantum computing.")
    print(response.content)
except Exception as e:
    print(e)

Note: Environment variable`HF_TOKEN` is set and is the current active token independently from the token you've just configured.


Quantum computing is a type of computation that utilizes the principles of quantum mechanics to perform operations on data. Unlike classical computers that use bits (0s and 1s) to represent and manipulate data, quantum computers use quantum bits, or qubits.

A qubit can exist in multiple states simultaneously, thanks to a property known as superposition. This means that a qubit can be both 0 and 1 at the same time - a concept known as quantum superposition.

Another key principle in quantum computing is entanglement. This occurs when pairs or groups of qubits become interconnected and the state of one qubit can instantly affect the state of the others, no matter the physical distance between them. This phenomenon allows quantum computers to perform certain calculations much faster than classical computers.

Quantum computers also utilize a third state known as the quantum oscillator, or quantum phase, which can further enhance their computational power.

These unique properties of qubi

## Simple conversation memory
ConversationBufferMemory is a LangChain memory class that stores the full chat history between the user and the LLM for the duration of a conversation. It is used here with ConversationChain, which passes the stored history to the model on every call.

In [77]:
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory


memory = ConversationBufferMemory()
conversation = ConversationChain(
    llm=chatllm,
    memory=memory,
)

try:
    conversation.invoke("Hi, my name is Alice.")
    response = conversation.invoke("What's my name?")  # Should remember "Alice"
    print(response['response'])
except Exception as e:
    print(e)

Based on our previous conversation, your name is Alice. How can I assist you today?
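
To make the idea behind a full-history buffer concrete, here is a minimal pure-Python sketch. It is not LangChain's implementation: `BufferMemory`, `build_prompt`, and `fake_model` are hypothetical names invented for illustration, and `fake_model` is a stand-in that can only "know" what appears in its prompt.

```python
# Conceptual sketch only: this is NOT LangChain's implementation.
class BufferMemory:
    """Keeps every turn of the conversation, in order."""

    def __init__(self):
        self.turns = []  # list of (speaker, text) tuples

    def save_context(self, user_text, ai_text):
        self.turns.append(("Human", user_text))
        self.turns.append(("AI", ai_text))

    def history(self):
        return "\n".join(f"{speaker}: {text}" for speaker, text in self.turns)


def build_prompt(memory, user_text):
    """Prepend the stored history to the new user message."""
    return f"{memory.history()}\nHuman: {user_text}\nAI:".lstrip()


def fake_model(prompt):
    """Hypothetical stand-in for an LLM: it only sees the prompt text."""
    return "Your name is Alice." if "Alice" in prompt else "I don't know your name."


memory = BufferMemory()
first_input = "Hi, my name is Alice."
memory.save_context(first_input, fake_model(build_prompt(memory, first_input)))

# With memory, the earlier turn is part of the prompt; without it, it is not.
second_prompt = build_prompt(memory, "What's my name?")
answer_with_memory = fake_model(second_prompt)
answer_without_memory = fake_model("Human: What's my name?\nAI:")
print(answer_with_memory)
print(answer_without_memory)
```

The model itself is stateless in this sketch. The "memory" is just text that gets re-sent with every request, which is why a full buffer grows with the length of the conversation.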


- Memory with specific key names

In [78]:
memory = ConversationBufferMemory(
    memory_key="chat_history",
    return_messages=True
)
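
The two arguments above change how the history is exposed: memory_key names the variable under which the history is returned, and return_messages=True returns a list of messages instead of one formatted string. The sketch below mimics that behaviour in plain Python. The function is a hypothetical stand-in written for illustration, and the `(role, text)` tuples stand in for LangChain's message objects.

```python
# Conceptual sketch only; tuples stand in for LangChain's message objects.
def load_memory_variables(turns, memory_key="history", return_messages=False):
    """Return the stored turns under `memory_key`, as a string or as a list."""
    if return_messages:
        value = list(turns)  # one entry per message
    else:
        value = "\n".join(f"{role}: {text}" for role, text in turns)
    return {memory_key: value}


turns = [("Human", "Hi, my name is Alice."), ("AI", "Nice to meet you, Alice.")]

as_string = load_memory_variables(turns)
as_messages = load_memory_variables(
    turns, memory_key="chat_history", return_messages=True
)

print(as_string)
print(as_messages)
```

Whichever key name is chosen, the prompt that consumes the memory needs a placeholder with the same name; the message-list form is the one that usually suits chat models.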

- ConversationBufferWindowMemory (keeps only last k interactions)


In [79]:
from langchain.memory import ConversationBufferWindowMemory
window_memory = ConversationBufferWindowMemory(k=2)
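
A sliding window like this can be sketched with a bounded deque. This is a conceptual illustration under the assumption that one "interaction" means one user message plus one reply; `WindowMemory` is a hypothetical class, not part of LangChain.

```python
from collections import deque


class WindowMemory:
    """Keeps only the last k exchanges (user message + AI reply)."""

    def __init__(self, k):
        # A deque with maxlen drops the oldest exchange automatically.
        self.exchanges = deque(maxlen=k)

    def save_context(self, user_text, ai_text):
        self.exchanges.append((user_text, ai_text))

    def history(self):
        lines = []
        for user_text, ai_text in self.exchanges:
            lines.append(f"Human: {user_text}")
            lines.append(f"AI: {ai_text}")
        return "\n".join(lines)


window = WindowMemory(k=2)
window.save_context("My name is Alice.", "Nice to meet you, Alice.")
window.save_context("I live in Paris.", "Paris is lovely.")
window.save_context("I like tea.", "Tea is a good choice.")

window_history = window.history()
print(window_history)  # the first exchange (the name) is no longer included
```

The trade-off is visible here: the prompt stays small, but anything said more than k exchanges ago, such as the user's name, is forgotten.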

- ConversationSummaryMemory (summarizes older context)

  It uses an LLM to produce the summary.

In [80]:
from langchain.memory import ConversationSummaryMemory
summary_memory = ConversationSummaryMemory(llm=chatllm)
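
The data flow of a running summary can be sketched as follows. This is only an illustration: `SummaryMemory` is a hypothetical class, and `fake_summarizer` is a deliberately crude stand-in for the LLM call (it simply keeps what the human said), so the output is not what a real model would produce.

```python
class SummaryMemory:
    """Replaces the raw history with a single summary string updated each turn."""

    def __init__(self, summarize):
        self.summarize = summarize  # callable(previous_summary, new_lines) -> str
        self.summary = ""

    def save_context(self, user_text, ai_text):
        new_lines = [f"Human: {user_text}", f"AI: {ai_text}"]
        # The summarizer sees the old summary plus only the newest exchange.
        self.summary = self.summarize(self.summary, new_lines)

    def history(self):
        return self.summary


def fake_summarizer(previous_summary, new_lines):
    """Hypothetical stand-in for an LLM: keeps only what the human said."""
    human_text = new_lines[0][len("Human: "):]
    return f"{previous_summary} The human said: {human_text}".strip()


sketch_memory = SummaryMemory(fake_summarizer)
sketch_memory.save_context("My name is Alice.", "Nice to meet you, Alice.")
sketch_memory.save_context("I live in Paris.", "Paris is lovely.")

summary_text = sketch_memory.history()
print(summary_text)
```

With a real LLM as the summarizer, each saved turn costs an extra model call, in exchange for a history whose size does not grow linearly with the conversation.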

## Vector Store Memory

Vector Store Memory is a type of LangChain memory that stores pieces of past conversation (or other context) in a vector store, such as FAISS, Chroma, or Weaviate, and retrieves the most relevant ones by semantic similarity rather than by recency.

In [81]:
from langchain.memory import VectorStoreRetrieverMemory
from langchain_community.vectorstores import FAISS
from langchain_huggingface import HuggingFaceEmbeddings

model_name = "sentence-transformers/all-MiniLM-L6-v2"
embeddings = HuggingFaceEmbeddings(model_name=model_name)

# Create vector store
vector_store = FAISS.from_texts(["User likes red and blue colors"], embeddings)
retriever = vector_store.as_retriever(search_kwargs=dict(k=1))

# Create memory
memory = VectorStoreRetrieverMemory(retriever=retriever)

# Save information to memory
memory.save_context(
    {"input": "My favorite colors are red and blue"},
    {"output": "I'll remember that you like red and blue."}
)

# Retrieve from memory
memory.load_memory_variables({"prompt": "What colors do I like?"})

{'history': 'User likes red and blue colors'}
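
To show what similarity-based retrieval means, here is a toy version in plain Python. It is a sketch under strong assumptions: `embed` uses simple word counts instead of a sentence-embedding model, so it matches shared words rather than meaning, and `ToyVectorMemory` is a hypothetical class that does not reflect how FAISS or LangChain work internally.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: word counts. A real setup would use an embedding model."""
    return Counter(text.lower().replace("?", "").replace(".", "").split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ToyVectorMemory:
    def __init__(self, k=1):
        self.k = k
        self.entries = []  # list of (vector, text) pairs

    def save(self, text):
        self.entries.append((embed(text), text))

    def load(self, query):
        query_vec = embed(query)
        ranked = sorted(
            self.entries, key=lambda e: cosine(query_vec, e[0]), reverse=True
        )
        return [text for _, text in ranked[: self.k]]


toy_memory = ToyVectorMemory(k=1)
toy_memory.save("The user lives in Paris")
toy_memory.save("The user likes red and blue colors")
toy_memory.save("The user has a dog named Rex")

# The best match is returned regardless of when it was stored.
retrieved = toy_memory.load("What colors do I like?")
print(retrieved)
```

Unlike the buffer and window variants, the result depends on the query rather than on the order of the conversation, which is what lets this kind of memory surface an old but relevant fact.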