In [0]:
# %pip install "typing_extensions==4.5.0" "accelerate>=0.16.0,<1" "transformers[torch]>=4.28.1,<5" "torch>=1.13.1,<2" "langchain"

Memory:

LLMs are "stateless":
- Each request is processed independently; the model retains nothing between calls.

Chatbots appear to have memory because the full conversation so far is passed back to the model as "context" on every turn.
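As a rough illustration of this idea, the sketch below uses a hypothetical `fake_llm` function (a stand-in, not a real model) whose reply depends only on the prompt it receives. The `chat` helper, also hypothetical, creates the appearance of memory by resending the whole transcript on every turn.

```python
# Illustrative sketch only: `fake_llm` and `chat` are hypothetical helpers,
# not part of LangChain or transformers.

def fake_llm(prompt: str) -> str:
    """A stateless 'model': the reply depends only on the prompt text."""
    marker = "my name is "
    if "What is my name?" in prompt:
        idx = prompt.find(marker)
        if idx == -1:
            return "I do not know your name."
        # The name is read from the prompt itself; nothing is stored between calls.
        name = prompt[idx + len(marker):].split("\n")[0].strip()
        return f"Your name is {name}."
    return "Nice to meet you."


def chat(history: list, user_input: str) -> str:
    """Append the new turn, send the full transcript, and record the reply."""
    history.append(f"Human: {user_input}")
    prompt = "\n".join(history) + "\nAI:"
    reply = fake_llm(prompt)
    history.append(f"AI: {reply}")
    return reply


# Asking without any context: the model has nothing to go on.
no_context = fake_llm("Human: What is my name?\nAI:")

# Asking with the transcript resent on each turn.
history = []
chat(history, "Hi, my name is Sriharsha")
with_context = chat(history, "What is my name?")

print(no_context)
print(with_context)
```

The only state lives in the `history` list held by the caller, which is the role the memory classes play in the rest of this notebook.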

# Imports

In [0]:
import torch
from transformers import pipeline

# langchain
from langchain.llms import HuggingFacePipeline

# langchain memory
from langchain.chains import ConversationChain
from langchain.memory import (
    ConversationBufferMemory,
    ConversationBufferWindowMemory,
    ConversationTokenBufferMemory,
    ConversationSummaryBufferMemory,
)

# Models

In [0]:
architecture = "databricks/dolly-v2-7b"

In [0]:
dolly = pipeline(
    model=architecture,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
    return_full_text=True,
)

In [0]:
hf_pipeline = HuggingFacePipeline(pipeline=dolly)

# ConversationBufferMemory

In [0]:
memory = ConversationBufferMemory()
conversation = ConversationChain(
    llm=hf_pipeline,
    memory=memory,
    verbose=True,
)

## Example Chat - 1

In [0]:
conversation.predict(input="Hi, my name is Sriharsha")

> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:

H: Hi, my name is Sriharsha
AI:

> Finished chain.
Out[7]: '\nHi, my name is SRI HARSHA. Nice to meet you. I am an AI conversing with you today.'

In [0]:
conversation.predict(input="What is 1+1?")

> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: Hi, my name is Sriharsha
AI: 
Hi, my name is SRI HARSHA. Nice to meet you. I am an AI conversing with you today.
Human: What is 1+1?
AI:

> Finished chain.
Out[8]: '\nThe answer to the question "What is 1+1?" is 2. SRI HARSHA knows the answer to this question as it is a conversational AI tool that SRI HARSHA created.'

In [0]:
conversation.predict(input="What is my name?")

> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: Hi, my name is Sriharsha
AI: 
Hi, my name is SRI HARSHA. Nice to meet you. I am an AI conversing with you today.
Human: What is 1+1?
AI: 
The answer to the question "What is 1+1?" is 2. SRI HARSHA knows the answer to this question as it is a conversational AI tool that SRI HARSHA created.
Human: What is my name?
AI:

> Finished chain.
Out[9]: '\nSRI HARSHA.'

In [0]:
print(memory.buffer)

Human: Hi, my name is Sriharsha
AI: 
Hi, my name is SRI HARSHA. Nice to meet you. I am an AI conversing with you today.
Human: What is 1+1?
AI: 
The answer to the question "What is 1+1?" is 2. SRI HARSHA knows the answer to this question as it is a conversational AI tool that SRI HARSHA created.
Human: What is my name?
AI: 
SRI HARSHA.


In [0]:
memory.load_memory_variables({})

Out[11]: {'history': 'Human: Hi, my name is Sriharsha\nAI: \nHi, my name is SRI HARSHA. Nice to meet you. I am an AI conversing with you today.\nHuman: What is 1+1?\nAI: \nThe answer to the question "What is 1+1?" is 2. SRI HARSHA knows the answer to this question as it is a conversational AI tool that SRI HARSHA created.\nHuman: What is my name?\nAI: \nSRI HARSHA.'}

## Example Chat - 2

In [0]:
memory = ConversationBufferMemory()

memory.save_context({"input": "Hi"}, {"output": "What's up"})
memory.load_memory_variables({})

memory.save_context({"input": "Not much, just hanging"}, {"output": "Cool"})
memory.load_memory_variables({})

Out[12]: {'history': "Human: Hi\nAI: What's up\nHuman: Not much, just hanging\nAI: Cool"}

# ConversationBufferWindowMemory

In [0]:
memory = ConversationBufferWindowMemory(k=1)

memory.save_context({"input": "Hi"}, {"output": "What's up"})
memory.load_memory_variables({})

memory.save_context({"input": "Not much, just hanging"}, {"output": "Cool"})
memory.load_memory_variables({})

Out[13]: {'history': 'Human: Not much, just hanging\nAI: Cool'}
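The windowing behaviour seen in the output above can be sketched in plain Python. This is a toy model under stated assumptions, not LangChain's implementation: `WindowBuffer` is a hypothetical class that keeps only the last `k` exchanges using a bounded `deque`.

```python
from collections import deque


class WindowBuffer:
    """Toy sliding-window memory (hypothetical): keeps the last k exchanges."""

    def __init__(self, k: int):
        # A deque with maxlen silently discards the oldest item when full.
        self.turns = deque(maxlen=k)

    def save_context(self, human: str, ai: str) -> None:
        self.turns.append((human, ai))

    def history(self) -> str:
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)


buf = WindowBuffer(k=1)
buf.save_context("Hi", "What's up")
buf.save_context("Not much, just hanging", "Cool")
window_history = buf.history()
print(window_history)
```

With `k=1`, any fact stated two or more exchanges ago, such as the user's name, is no longer in the prompt, so the model can only guess at it.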

In [0]:
memory = ConversationBufferWindowMemory(k=1)
window_conversation = ConversationChain(
    llm=hf_pipeline,
    memory=memory,
    verbose=False,
)

In [0]:
window_conversation.predict(input="Hi, my name is Sriharsha")

Out[15]: "\nHi, it's a pleasure to meet you. My name is Swaraj. How can I help you?"

In [0]:
window_conversation.predict(input="What is 1+1?")

Out[16]: '\nWell, I do know the answer to this question. The answer is 2.\n\nThe context from which I extracted this answer is that I was trained on general AI datasets and I do know how to answer questions from these datasets with high accuracy.\n\nOn the other hand, I am aware that humans will generally ask much more focused questions. For example, I do not know the weather in New York in March or the programming language JavaScript. I would appreciate it if you would describe the context in which you would like me to answer these questions.'

In [0]:
window_conversation.predict(input="What is my name?")

Out[17]: '\nYour name is Jack. I am pleased to provide you with answers about the weather in New York in March and the programming language JavaScript when the context is that you would like these answers.'

# ConversationTokenBufferMemory

In [0]:
memory = ConversationTokenBufferMemory(llm=hf_pipeline, max_token_limit=30)

memory.save_context({"input": "AI is what?!"}, {"output": "Amazing!"})
memory.save_context({"input": "Backpropagation is what?"}, {"output": "Beautiful!"})
memory.save_context({"input": "Chatbots are what?"}, {"output": "Charming!"})

memory.load_memory_variables({})

Out[18]: {'history': 'AI: Amazing!\nHuman: Backpropagation is what?\nAI: Beautiful!\nHuman: Chatbots are what?\nAI: Charming!'}
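Note that the output above starts with an `AI:` line: the oldest human message was dropped on its own, which suggests pruning happens message by message and not exchange by exchange. The sketch below mimics that behaviour. It is a toy under stated assumptions: `TokenBuffer` and `count_tokens` are hypothetical, whitespace splitting stands in for the model's real tokenizer, and the limit of 14 was chosen to suit that toy tokenizer.

```python
def count_tokens(text: str) -> int:
    # Assumption: whitespace splitting stands in for a real tokenizer.
    return len(text.split())


class TokenBuffer:
    """Toy token-limited memory (hypothetical): drops oldest messages first."""

    def __init__(self, max_token_limit: int):
        self.max_token_limit = max_token_limit
        self.messages = []

    def save_context(self, human: str, ai: str) -> None:
        self.messages.append(f"Human: {human}")
        self.messages.append(f"AI: {ai}")
        # Remove one message at a time from the front until within budget.
        while sum(count_tokens(m) for m in self.messages) > self.max_token_limit:
            self.messages.pop(0)

    def history(self) -> str:
        return "\n".join(self.messages)


token_buf = TokenBuffer(max_token_limit=14)
token_buf.save_context("AI is what?!", "Amazing!")
token_buf.save_context("Backpropagation is what?", "Beautiful!")
token_buf.save_context("Chatbots are what?", "Charming!")
print(token_buf.history())
```

Because the budget is counted in tokens and not turns, long messages push out more history than short ones.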

# ConversationSummaryBufferMemory

Instead of limiting the memory to a fixed number of tokens based on the most recent utterances <b>(ConversationTokenBufferMemory)</b> or a fixed number of conversational exchanges <b>(ConversationBufferWindowMemory)</b>, let's use an LLM to write a summary of the conversation so far and let that be the memory. <b>ConversationSummaryBufferMemory</b> keeps the most recent messages verbatim, up to `max_token_limit`, and folds older messages into the summary.
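The summarize-on-overflow idea can be sketched as follows. This is a toy under stated assumptions: `SummaryBuffer` is a hypothetical class, whitespace splitting stands in for a tokenizer, and `stub_summarize` merely concatenates the pruned text where a real system would ask an LLM to condense it.

```python
def count_tokens(text: str) -> int:
    # Assumption: whitespace splitting stands in for a real tokenizer.
    return len(text.split())


def stub_summarize(previous_summary: str, pruned: list) -> str:
    # Hypothetical stand-in for an LLM call that would condense this text.
    earlier = [previous_summary] if previous_summary else []
    return " / ".join(earlier + pruned)


class SummaryBuffer:
    """Toy memory: recent messages verbatim, older ones folded into a summary."""

    def __init__(self, max_token_limit: int, summarize=stub_summarize):
        self.max_token_limit = max_token_limit
        self.summarize = summarize
        self.summary = ""
        self.messages = []

    def save_context(self, human: str, ai: str) -> None:
        self.messages += [f"Human: {human}", f"AI: {ai}"]
        pruned = []
        # Move the oldest messages out of the buffer until it fits the budget.
        while self.messages and (
            sum(count_tokens(m) for m in self.messages) > self.max_token_limit
        ):
            pruned.append(self.messages.pop(0))
        if pruned:
            self.summary = self.summarize(self.summary, pruned)

    def history(self) -> str:
        head = [f"System: {self.summary}"] if self.summary else []
        return "\n".join(head + self.messages)


mem = SummaryBuffer(max_token_limit=8)
mem.save_context("Hello", "What's up")
mem.save_context("Not much, just hanging", "Cool")
print(mem.history())
```

The quality of this kind of memory depends on the summarizer: if the summarizing model drifts or invents content, that text becomes part of every later prompt.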

In [0]:
# create a long string
schedule = "There is a meeting at 8am with your product team. \
You will need your powerpoint presentation prepared. \
9am-12pm have time to work on your LangChain \
project which will go quickly because Langchain is such a powerful tool. \
At Noon, lunch at the italian resturant with a customer who is driving \
from over an hour away to meet you to understand the latest in AI. \
Be sure to bring your laptop to show the latest LLM demo."

memory = ConversationSummaryBufferMemory(llm=hf_pipeline, max_token_limit=100)
memory.save_context({"input": "Hello"}, {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"}, {"output": "Cool"})
memory.save_context(
    {"input": "What is on the schedule today?"}, {"output": f"{schedule}"}
)

memory.load_memory_variables({})

Out[19]: {'history': 'System: \nThe human asks what the AI thinks of artificial intelligence. The AI thinks artificial intelligence is a force for good because it will help humans reach their full potential.\n\nThe AI was not sure what the human meant by full potential, but it explored the meaning of the word through a series of human interactions. Humans use the word full potential to describe their best, most impressive, or most successful self. For example, full potential may refer to a sports star who has reached the highest level of performance or a scholar who has achieved the greatest number of high-impact publications. The AI believes that human potential is naturally limitless and sees artificial intelligence as a force for expanding human potential.\nAI: There is a meeting at 8am with your product team. You will need your powerpoint presentation prepared. 9am-12pm have time to work on your LangChain project which will go quickly because Langchain is such a powerful tool. At N

In [0]:
summary_conversation = ConversationChain(
    llm=hf_pipeline,
    memory=memory,
    verbose=True,
)

In [0]:
summary_conversation.predict(input="What would be a good demo to show?")

> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
System: 
The human asks what the AI thinks of artificial intelligence. The AI thinks artificial intelligence is a force for good because it will help humans reach their full potential.

The AI was not sure what the human meant by full potential, but it explored the meaning of the word through a series of human interactions. Humans use the word full potential to describe their best, most impressive, or most successful self. For example, full potential may refer to a sports star who has reached the highest level of performance or a scholar who has achieved the greatest number of high-impact publications. The AI believes that human potential is natura

In [0]:
memory.load_memory_variables({})

Out[22]: {'history': 'System: \nThe human asks what the AI thinks of artificial intelligence. The AI thinks artificial intelligence is a force for good because it will help humans reach their full potential. The AI believes human potential is naturally limitless and sees artificial intelligence as a force for expanding human potential. The AI thinks LangChain is a powerful tool, and it wants to go quickly to show the latest LLM demo at noon.\nHuman: What would be a good demo to show?\nAI: \nAt noon, lunch at the italian resturant with a customer who is driving from over an hour away to meet you to understand the latest in AI. Be sure to bring your laptop to show the latest LLM demo.'}

# Additional Memory Types

Vector Data Memory
- Stores text (from the conversation or elsewhere) in a vector database and retrieves the most relevant blocks of text.
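A minimal sketch of the retrieval idea, assuming word counts as a stand-in for learned embeddings and an in-memory list in place of a vector database. `ToyVectorMemory`, `embed`, and `cosine` are hypothetical names used only for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    # Assumption: bag-of-words counts stand in for a learned embedding.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorMemory:
    """Stores text with its vector and returns the most similar entries."""

    def __init__(self):
        self.entries = []  # list of (text, vector) pairs

    def add(self, text: str) -> None:
        self.entries.append((text, embed(text)))

    def retrieve(self, query: str, k: int = 1) -> list:
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


store = ToyVectorMemory()
store.add("My favorite sport is tennis")
store.add("The meeting with the product team is at 8am")
store.add("Lunch is at the Italian restaurant at noon")

top = store.retrieve("When is the meeting?", k=1)
print(top)
```

Only the retrieved blocks are placed in the prompt, so the memory can grow without the prompt growing with it.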

Entity Memory
- Uses an LLM to remember details about specific entities.

We can also use multiple memories at one time.
- E.g., conversation memory + entity memory to recall individuals.

We can also store the memory in a conventional database (such as a key-value store).
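One way this might look, as a sketch under stated assumptions: an in-memory SQLite table serves as the key-value store, the session ID is the key, and the JSON-encoded message list is the value. The table name and the `save_history` and `load_history` helpers are hypothetical.

```python
import json
import sqlite3

# Assumption: an in-memory SQLite database stands in for a real key-value store.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IF NOT EXISTS memory (session_id TEXT PRIMARY KEY, messages TEXT)"
)


def save_history(conn, session_id: str, messages: list) -> None:
    """Store the whole message list as JSON under the session key."""
    conn.execute(
        "INSERT OR REPLACE INTO memory (session_id, messages) VALUES (?, ?)",
        (session_id, json.dumps(messages)),
    )
    conn.commit()


def load_history(conn, session_id: str) -> list:
    """Return the stored messages, or an empty list for an unknown session."""
    row = conn.execute(
        "SELECT messages FROM memory WHERE session_id = ?", (session_id,)
    ).fetchone()
    return json.loads(row[0]) if row else []


messages = [
    {"role": "human", "content": "Hi"},
    {"role": "ai", "content": "What's up"},
]
save_history(conn, "session-1", messages)

restored = load_history(conn, "session-1")
missing = load_history(conn, "session-2")
print(restored)
```

Persisting history this way lets a conversation resume after a restart, since the transcript can be reloaded and passed back to the model as context.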