In [1]:
!pip install langchain==0.2.0 --quiet
!pip install langchain-openai==0.1.7 --quiet
!pip install langchain-community==0.2.0 --quiet
!pip install langchain-chroma==0.1.1 --quiet


In [2]:
import os
from google.colab import userdata

os.environ['OPENAI_API_KEY'] = userdata.get('OPEN_API_KEY')
os.environ['HUGGINGFACEHUB_API_TOKEN'] = userdata.get('HF_TOKEN')

In [3]:
from langchain_openai import ChatOpenAI

chatgpt = ChatOpenAI(model_name='gpt-3.5-turbo', temperature=0)

## Working with LangChain Chains

Using an LLM in isolation is fine for simple applications, but more complex applications require chaining LLMs, either with each other or with other components. Chains also make it easy to run the same pipeline over multiple data points.

Chains are the legacy interface for "chained" applications. A Chain is defined very generically as a sequence of calls to components, which can include other chains.

Here we will be using LCEL chains exclusively.

### The Problem with Simple LLM Chains

Simple LLM chains cannot keep track of past conversation history, so every call is answered in isolation.

In [4]:
from langchain_core.prompts import ChatPromptTemplate


prompt_txt = """{query}"""
prompt = ChatPromptTemplate.from_template(prompt_txt)

llm_chain = (prompt
             |
             chatgpt
             )

In [5]:
response = llm_chain.invoke({'query': 'What are the first four colors of a rainbow?'})
print(response.content)

The first four colors of a rainbow are red, orange, yellow, and green.


In [6]:
response = llm_chain.invoke({'query': 'and the other 3?'})
print(response.content)

I'm sorry, I'm not sure what you are referring to. Can you please provide more context or clarify your question?


### Conversation Chains with LCEL

LangChain Expression Language (LCEL) connects prompts, models, parsers and retrieval components using a `|` pipe operator.
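To make the pipe idea concrete, here is a minimal pure-Python sketch of how a `|` operator can compose steps. The `Step` class and the toy components are hypothetical stand-ins for illustration only; they are not LangChain's actual `Runnable` implementation.

```python
class Step:
    """Hypothetical stand-in for a chain component; not LangChain's Runnable."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # a | b builds a new step that runs a, then feeds its output to b
        return Step(lambda value: other.invoke(self.invoke(value)))


# toy components: a "prompt" that formats text and a fake "model" that upper-cases it
format_prompt = Step(lambda inputs: f"Question: {inputs['query']}")
fake_model = Step(lambda text: text.upper())

toy_chain = format_prompt | fake_model
toy_result = toy_chain.invoke({"query": "hello"})
print(toy_result)
```

Each `|` simply wraps the left and right steps in a new step, which is why chains can be nested and extended freely.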

A conversation chain basically consists of user prompts, historical conversation memory and the LLM. The LLM uses the history memory to give more contextual answers to every new prompt or user query.

Memory is very important for having a true conversation with LLMs. LangChain allows us to manage conversation memory using various constructs. The main ones we will cover include:

- ConversationBufferMemory
- ConversationBufferWindowMemory
- ConversationSummaryMemory
- VectorStoreRetrieverMemory
- ChatMessageHistory
- SQLChatMessageHistory

### Conversation Chains with ConversationBufferMemory

This is the simplest version of in-memory storage of historical conversation messages. It is basically a buffer for storing conversation memory.

Remember that if you have a really long conversation, the stored history might exceed the maximum token limit of the LLM's context window.
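The growth problem can be seen with a small pure-Python sketch. `ToyBufferMemory` is a hypothetical class written for this illustration, and the word count is only a crude stand-in for real tokenization.

```python
class ToyBufferMemory:
    """Hypothetical illustration of an unbounded message buffer."""

    def __init__(self):
        self.messages = []  # list of (role, text) tuples, oldest first

    def save_context(self, user_text, ai_text):
        self.messages.append(("human", user_text))
        self.messages.append(("ai", ai_text))

    def approx_tokens(self):
        # crude estimate: one token per whitespace-separated word
        return sum(len(text.split()) for _, text in self.messages)


toy_buffer = ToyBufferMemory()
sizes = []
for turn in range(3):
    toy_buffer.save_context("What is AI?", "AI is the simulation of human intelligence.")
    sizes.append(toy_buffer.approx_tokens())

print(sizes)  # the estimate grows with every turn
```

Because nothing is ever dropped, the prompt sent to the LLM grows linearly with the number of turns.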

In [7]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.memory import ConversationBufferMemory
from langchain.schema.runnable import RunnablePassthrough, RunnableLambda

SYS_PROMPT = """Act as a helpful assistant and give brief answers"""
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", SYS_PROMPT),
        MessagesPlaceholder(variable_name="history"),
        ("human", "{query}"),
    ]
)

memory = ConversationBufferMemory(return_messages=True)

In [8]:
# function to get historical conversation messages from the memory
memory.load_memory_variables({})

{'history': []}

In [9]:
# let's now create a function which returns the list of messages from memory
def get_memory_messages(query):
    return memory.load_memory_variables(query)['history']

get_memory_messages('What are the first four colors of a rainbow?')

[]

In [10]:
# testing out the function with a runnable lambda which will go into our chain
# this returns the history but we also need to send our current query to the prompt
RunnableLambda(get_memory_messages).invoke({'query': 'What are the first four colors of a rainbow?'})

[]

In [11]:
# we use a runnable passthrough to pass our current query untouched
# along with the history messages to the next step in the chain
RunnablePassthrough.assign(
        history=RunnableLambda(get_memory_messages)
    ).invoke({'query': 'What are the first four colors of a rainbow?'})

{'query': 'What are the first four colors of a rainbow?', 'history': []}

In [12]:
# creating our conversation chain now
def get_memory_messages(query):
    return memory.load_memory_variables(query)['history']

conversation_chain = (
    RunnablePassthrough.assign(
        history=RunnableLambda(get_memory_messages)
    ) # sends current query (input by user at runtime) and history messages to next step
      |
    prompt # creates prompt using the previous two variables
      |
    chatgpt # generates response using the prompt from previous step
)

In [13]:
query = {'query': 'What are the first four colors of a rainbow?'}
response = conversation_chain.invoke(query)
response

AIMessage(content='The first four colors of a rainbow are red, orange, yellow, and green.', response_metadata={'token_usage': {'completion_tokens': 18, 'prompt_tokens': 30, 'total_tokens': 48, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-93699e16-adaf-4791-a73b-847abeeb5809-0')

In [14]:
print(response.content)

The first four colors of a rainbow are red, orange, yellow, and green.


In [15]:
memory.load_memory_variables({})

{'history': []}

In [16]:
memory.save_context(query, {"output": response.content})

In [17]:
memory.load_memory_variables({})

{'history': [HumanMessage(content='What are the first four colors of a rainbow?'),
  AIMessage(content='The first four colors of a rainbow are red, orange, yellow, and green.')]}

In [18]:
query = {'query': 'and the other 3?'}
response = conversation_chain.invoke(query)
memory.save_context(query, {"output": response.content}) # remember to save your current conversation in memory
print(response.content)

The other three colors of a rainbow are blue, indigo, and violet.


In [19]:
memory.load_memory_variables({})

{'history': [HumanMessage(content='What are the first four colors of a rainbow?'),
  AIMessage(content='The first four colors of a rainbow are red, orange, yellow, and green.'),
  HumanMessage(content='and the other 3?'),
  AIMessage(content='The other three colors of a rainbow are blue, indigo, and violet.')]}

In [20]:
query = {'query': 'Explain AI in 2 bullet points'}
response = conversation_chain.invoke(query)
memory.save_context(query, {"output": response.content}) # remember to save your current conversation in memory
print(response.content)

- AI, or artificial intelligence, refers to the simulation of human intelligence processes by machines, such as learning, reasoning, and problem-solving.
- AI technologies include machine learning, natural language processing, computer vision, and robotics, among others.


In [21]:
query = {'query': 'Now do the same for Deep Learning'}
response = conversation_chain.invoke(query)
memory.save_context(query, {"output": response.content}) # remember to save your current conversation in memory
print(response.content)

- Deep learning is a subset of artificial intelligence that uses neural networks with multiple layers to learn and make decisions from large amounts of data.
- Deep learning is particularly effective for tasks such as image and speech recognition, natural language processing, and autonomous driving.


In [22]:
memory.load_memory_variables({})

{'history': [HumanMessage(content='What are the first four colors of a rainbow?'),
  AIMessage(content='The first four colors of a rainbow are red, orange, yellow, and green.'),
  HumanMessage(content='and the other 3?'),
  AIMessage(content='The other three colors of a rainbow are blue, indigo, and violet.'),
  HumanMessage(content='Explain AI in 2 bullet points'),
  AIMessage(content='- AI, or artificial intelligence, refers to the simulation of human intelligence processes by machines, such as learning, reasoning, and problem-solving.\n- AI technologies include machine learning, natural language processing, computer vision, and robotics, among others.'),
  HumanMessage(content='Now do the same for Deep Learning'),
  AIMessage(content='- Deep learning is a subset of artificial intelligence that uses neural networks with multiple layers to learn and make decisions from large amounts of data.\n- Deep learning is particularly effective for tasks such as image and speech recognition, nat

In [23]:
query = {'query': 'What have we discussed so far?'}
response = conversation_chain.invoke(query)
memory.save_context(query, {"output": response.content}) # remember to save your current conversation in memory
print(response.content)

We have discussed the colors of a rainbow, artificial intelligence (AI), and deep learning.


### Conversation Chains with ConversationBufferWindowMemory

If you have a really long conversation, `ConversationBufferMemory` might exceed the maximum token limit of the LLM's context window. `ConversationBufferWindowMemory` keeps only the last K conversation exchanges in the history it returns (one exchange is one user message and the corresponding AI message from the LLM), which helps you manage token limits and costs.
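The windowing idea itself is just a slice over the message list. The helper below is a hypothetical sketch, assuming the history strictly alternates human and AI messages; it is not the library's code.

```python
def last_k_exchanges(messages, k):
    """Return the messages belonging to the last k human/AI exchanges."""
    if k <= 0:
        return []
    # one exchange is two messages, so keep the last 2 * k entries
    return messages[-2 * k:]


toy_history = [
    ("human", "q1"), ("ai", "a1"),
    ("human", "q2"), ("ai", "a2"),
    ("human", "q3"), ("ai", "a3"),
]
windowed = last_k_exchanges(toy_history, k=2)
print(windowed)
```

With `k=2`, the first exchange is dropped and only the two most recent question/answer pairs remain.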

In [24]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.memory import ConversationBufferWindowMemory
from langchain.schema.runnable import RunnablePassthrough, RunnableLambda

SYS_PROMPT = """Act as a helpful assistant and give brief answers"""
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", SYS_PROMPT),
        MessagesPlaceholder(variable_name="history"),
        ("human", "{query}"),
    ]
)

# stores last 2 sets of human-AI conversations
memory = ConversationBufferWindowMemory(return_messages=True, k=2)

# creating our conversation chain now
def get_memory_messages(query):
    return memory.load_memory_variables(query)['history']

conversation_chain = (
    RunnablePassthrough.assign(
        history=RunnableLambda(get_memory_messages)
    ) # sends current query (input by user at runtime) and history messages to next step
      |
    prompt # creates prompt using the previous two variables
      |
    chatgpt # generates response using the prompt from previous step
)

In [25]:
query = {'query': 'What are the first four colors of a rainbow?'}
response = conversation_chain.invoke(query)
memory.save_context(query, {"output": response.content}) # remember to save your current conversation in memory
print(response.content)

The first four colors of a rainbow are red, orange, yellow, and green.


In [26]:
query = {'query': 'and the other 3?'}
response = conversation_chain.invoke(query)
memory.save_context(query, {"output": response.content}) # remember to save your current conversation in memory
print(response.content)

The other three colors of a rainbow are blue, indigo, and violet.


In [27]:
query = {'query': 'Explain AI in 2 bullet points'}
response = conversation_chain.invoke(query)
memory.save_context(query, {"output": response.content}) # remember to save your current conversation in memory
print(response.content)

- AI, or artificial intelligence, refers to the simulation of human intelligence processes by machines, such as learning, reasoning, and problem-solving.
- AI technologies include machine learning, natural language processing, computer vision, and robotics, among others.


In [28]:
query = {'query': 'Now do the same for Deep Learning'}
response = conversation_chain.invoke(query)
memory.save_context(query, {"output": response.content}) # remember to save your current conversation in memory
print(response.content)

- Deep learning is a subset of machine learning that uses artificial neural networks to model and solve complex problems.
- It is particularly effective for tasks such as image and speech recognition, natural language processing, and autonomous driving.


In [29]:
memory.load_memory_variables({})

{'history': [HumanMessage(content='Explain AI in 2 bullet points'),
  AIMessage(content='- AI, or artificial intelligence, refers to the simulation of human intelligence processes by machines, such as learning, reasoning, and problem-solving.\n- AI technologies include machine learning, natural language processing, computer vision, and robotics, among others.'),
  HumanMessage(content='Now do the same for Deep Learning'),
  AIMessage(content='- Deep learning is a subset of machine learning that uses artificial neural networks to model and solve complex problems.\n- It is particularly effective for tasks such as image and speech recognition, natural language processing, and autonomous driving.')]}

In [30]:
query = {'query': 'What have we discussed so far?'}
response = conversation_chain.invoke(query)
memory.save_context(query, {"output": response.content}) # remember to save your current conversation in memory
print(response.content)

We have discussed artificial intelligence (AI) and deep learning, including their definitions and key characteristics.


### Conversation Chains with ConversationSummaryMemory

If you have a really long conversation or a lot of messages, you might exceed the maximum token limit of the LLM's context window when using `ConversationBufferMemory`.

`ConversationSummaryMemory` creates a summary of the conversation history over time. This can be useful for condensing information from the conversation messages over time.

This memory is most useful for longer conversations, where keeping the past message history in the prompt verbatim would take up too many tokens.
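The mechanism can be sketched in pure Python as a single running summary that is updated after every exchange. `ToySummaryMemory` and `toy_summarize` are hypothetical names for this illustration; the toy summarizer merely truncates text, whereas the real memory asks an LLM to write the summary.

```python
def toy_summarize(previous_summary, user_text, ai_text):
    """Hypothetical summarizer; the real memory asks an LLM to do this step."""
    new_line = f"Human asked: {user_text[:30]} | AI answered: {ai_text[:30]}"
    return (previous_summary + " " + new_line).strip()


class ToySummaryMemory:
    def __init__(self, summarize):
        self.summarize = summarize
        self.summary = ""  # one running summary instead of a message list

    def save_context(self, user_text, ai_text):
        # fold the newest exchange into the running summary
        self.summary = self.summarize(self.summary, user_text, ai_text)

    def load(self):
        return self.summary


toy_summary = ToySummaryMemory(toy_summarize)
toy_summary.save_context("Explain AI", "AI simulates human intelligence.")
toy_summary.save_context("Explain deep learning", "It uses layered neural networks.")
print(toy_summary.load())
```

The trade-off is that each save costs an extra summarization call, and details can be lost in the condensed text.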

In [31]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.memory import ConversationSummaryMemory
from langchain.schema.runnable import RunnablePassthrough, RunnableLambda

SYS_PROMPT = """Act as a helpful assistant and give brief answers"""
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", SYS_PROMPT),
        MessagesPlaceholder(variable_name="history"),
        ("human", "{query}"),
    ]
)

memory = ConversationSummaryMemory(return_messages=True, llm=chatgpt)
# creating our conversation chain now
def get_memory_messages(query):
    return memory.load_memory_variables(query)['history']

conversation_chain = (
    RunnablePassthrough.assign(
        history=RunnableLambda(get_memory_messages)
    ) # sends current query (input by user at runtime) and history messages as a summary to next step
      |
    prompt # creates prompt using the previous two variables
      |
    chatgpt # generates response using the prompt from previous step
)

In [32]:
query = {'query': 'Explain AI in 2 bullet points'}
response = conversation_chain.invoke(query)
memory.save_context(query, {"output": response.content}) # remember to save your current conversation in memory
print(response.content)

1. AI, or artificial intelligence, refers to the simulation of human intelligence processes by machines, such as learning, reasoning, and problem-solving.
2. AI technologies include machine learning, natural language processing, computer vision, and robotics, and are used in various applications like virtual assistants, autonomous vehicles, and medical diagnosis.


In [33]:
query = {'query': 'Now do the same for Deep Learning'}
response = conversation_chain.invoke(query)
memory.save_context(query, {"output": response.content}) # remember to save your current conversation in memory
print(response.content)

- Deep learning is a subset of artificial intelligence that involves training artificial neural networks to learn and make decisions on their own.
- It is used in various applications such as image and speech recognition, natural language processing, and autonomous vehicles.


In [34]:
memory.load_memory_variables({})

{'history': [SystemMessage(content='The human asks the AI to explain AI in 2 bullet points. The AI explains that AI refers to the simulation of human intelligence processes by machines and lists various AI technologies and applications. The human then asks the AI to do the same for Deep Learning. The AI explains that Deep Learning is a subset of artificial intelligence that involves training artificial neural networks to learn and make decisions on their own, and it is used in various applications such as image and speech recognition, natural language processing, and autonomous vehicles.')]}

In [35]:
query = {'query': 'What have we discussed so far?'}
response = conversation_chain.invoke(query)
memory.save_context(query, {"output": response.content}) # remember to save your current conversation in memory
print(response.content)

We have discussed artificial intelligence (AI) and deep learning, including their definitions and applications.


### Conversation Chains with VectorStoreRetrieverMemory

`VectorStoreRetrieverMemory` stores historical conversation messages in a vector store and queries the top-K most "relevant" history messages every time it is called.

This differs from most of the other Memory classes in that it doesn't explicitly track the order of interactions but retrieves history based on embedding similarity to the current question or prompt.

In this case, the "docs" are previous conversation snippets. This can be useful to refer to relevant pieces of information that the AI was told earlier in the conversation.
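Similarity-based retrieval can be illustrated without a vector database. The sketch below uses a hypothetical bag-of-words "embedding" and cosine similarity purely for demonstration; a real setup would use a learned embedding model and a vector store.

```python
import math
from collections import Counter


def toy_embed(text):
    """Hypothetical embedding: a bag-of-words count vector."""
    return Counter(text.lower().replace("?", "").split())


def cosine(a, b):
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k_snippets(snippets, query, k=2):
    """Return the k stored snippets most similar to the query, ignoring arrival order."""
    query_vec = toy_embed(query)
    ranked = sorted(snippets, key=lambda s: cosine(toy_embed(s), query_vec), reverse=True)
    return ranked[:k]


stored = [
    "tell me about deep learning",
    "tell me about the fastest animal",
    "what about machine learning",
    "what about the cheetah",
]
relevant = top_k_snippets(stored, "explain machine learning", k=2)
print(relevant)
```

Note that the most recent snippet (about the cheetah) is not returned, because relevance rather than recency decides what is recalled.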

In [36]:
from langchain_openai import OpenAIEmbeddings

# details here: https://openai.com/blog/new-embedding-models-and-api-updates
openai_embed_model = OpenAIEmbeddings(model='text-embedding-3-small')

#### Create a Vector Database to store conversation history

Here we use the Chroma vector DB and initialize an empty database collection to store conversation messages

In [37]:
from langchain_chroma import Chroma

# create empty vector DB
chroma_db = Chroma(collection_name='history_db',
                   embedding_function=openai_embed_model)

In [38]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.memory import VectorStoreRetrieverMemory
from langchain.schema.runnable import RunnablePassthrough, RunnableLambda

SYS_PROMPT = """Act as a helpful assistant and give brief answers"""
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", SYS_PROMPT),
        MessagesPlaceholder(variable_name="history"),
        ("human", "{query}"),
    ]
)

# load the 2 most similar conversation messages from the vector DB history for each new message / prompt
# this uses embedding similarity to load the top 2 messages most similar to the input prompt / query
retriever = chroma_db.as_retriever(search_type="similarity",
                                   search_kwargs={"k": 2})
memory = VectorStoreRetrieverMemory(retriever=retriever, return_messages=True)

# creating our conversation chain now
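def get_memory_messages(query):
    # this memory returns the retrieved history as one string,
    # so wrap it in a list for MessagesPlaceholder
    return [memory.load_memory_variables(query)['history']]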

conversation_chain = (
    RunnablePassthrough.assign(
        history=RunnableLambda(get_memory_messages)
    ) # sends current query (input by user at runtime) and history messages to next step
      |
    prompt # creates prompt using the previous two variables
      |
    chatgpt # generates response using the prompt from previous step
)

In [39]:
query = {'query': 'Tell me about AI'}
response = conversation_chain.invoke(query)
memory.save_context(query, {"output": response.content}) # remember to save your current conversation in memory
print(response.content)

AI, or artificial intelligence, refers to the simulation of human intelligence processes by machines, especially computer systems. AI technologies enable machines to learn from experience, adjust to new inputs, and perform tasks that typically require human intelligence, such as visual perception, speech recognition, decision-making, and language translation. AI is used in various industries, including healthcare, finance, transportation, and entertainment, to improve efficiency, accuracy, and decision-making processes.


In [40]:
query = {'query': 'What about deep learning'}
response = conversation_chain.invoke(query)
memory.save_context(query, {"output": response.content}) # remember to save your current conversation in memory
print(response.content)



Deep learning is a subset of artificial intelligence that involves training artificial neural networks with multiple layers to learn and make decisions on its own. It is inspired by the structure and function of the human brain and is particularly effective for tasks such as image and speech recognition, natural language processing, and autonomous driving. Deep learning has shown significant advancements in recent years and is a key technology driving many AI applications.


In [41]:
query = {'query': 'Tell me about the fastest animal in the world in 2 lines'}
response = conversation_chain.invoke(query)
memory.save_context(query, {"output": response.content}) # remember to save your current conversation in memory
print(response.content)

The fastest animal in the world is the peregrine falcon, which can reach speeds of over 240 miles per hour when diving to catch prey. Its incredible speed and agility make it one of the most efficient hunters in the animal kingdom.


In [42]:
query = {'query': 'What about the cheetah?'}
response = conversation_chain.invoke(query)
memory.save_context(query, {"output": response.content}) # remember to save your current conversation in memory
print(response.content)

The cheetah is the fastest land animal, capable of reaching speeds up to 60-70 miles per hour in short bursts. Its incredible speed and agility make it a formidable predator, allowing it to chase down prey with remarkable efficiency.


In [43]:
print(memory.load_memory_variables({'query': 'What about machine learning?'})['history'])

query: What about deep learning
output: Deep learning is a subset of artificial intelligence that involves training artificial neural networks with multiple layers to learn and make decisions on its own. It is inspired by the structure and function of the human brain and is particularly effective for tasks such as image and speech recognition, natural language processing, and autonomous driving. Deep learning has shown significant advancements in recent years and is a key technology driving many AI applications.
query: Tell me about AI
output: AI, or artificial intelligence, refers to the simulation of human intelligence processes by machines, especially computer systems. AI technologies enable machines to learn from experience, adjust to new inputs, and perform tasks that typically require human intelligence, such as visual perception, speech recognition, decision-making, and language translation. AI is used in various industries, including healthcare, finance, transportation, and ent

In [44]:
query = {'query': 'What about machine learning?'}
response = conversation_chain.invoke(query)
memory.save_context(query, {"output": response.content}) # remember to save your current conversation in memory
print(response.content)

Machine learning is a subset of artificial intelligence that focuses on developing algorithms and models that enable computers to learn from and make predictions or decisions based on data. It involves training machines to recognize patterns in data and make informed decisions without being explicitly programmed to do so. Machine learning algorithms are used in various applications, such as recommendation systems, predictive analytics, fraud detection, and autonomous vehicles.


### Multi-user Conversation Chains with ChatMessageHistory

`RunnableWithMessageHistory` is a class in LangChain which can be used to wrap an arbitrary chain. It keeps track of the inputs and outputs of the underlying chain and appends them as messages to a message store. Future interactions then load those messages and pass them into the chain as part of the input.

The beauty of `ChatMessageHistory` is that we can store a separate conversation history per user or session, which is what real-world chatbots need when many users access them at the same time.

We use a `get_session_history` function which is expected to take in a `session_id` and return a message history object. Everything is stored in memory. This `session_id` is used to distinguish between separate conversations and should be passed in as part of the config when calling the new chain.
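Conceptually, the wrapper loads a session's history, calls the chain, and then records the new exchange. The following pure-Python sketch shows that flow; `with_history`, `get_toy_history`, and the fake chain are hypothetical names for illustration, not LangChain APIs.

```python
toy_sessions = {}  # session_id -> list of (role, text) messages


def get_toy_history(session_id):
    # create an empty history the first time a session is seen
    return toy_sessions.setdefault(session_id, [])


def with_history(chain_fn):
    """Wrap chain_fn so it receives prior messages and records the new exchange."""
    def wrapped(user_text, session_id):
        history = get_toy_history(session_id)
        answer = chain_fn(history, user_text)
        history.append(("human", user_text))
        history.append(("ai", answer))
        return answer
    return wrapped


# fake "chain" that just reports how many earlier messages it was given
fake_chain = with_history(lambda history, text: f"seen {len(history)} earlier messages")

first = fake_chain("hi", "bob123")
second = fake_chain("again", "bob123")
other = fake_chain("hello", "james007")
print(first, second, other, sep="\n")
```

The second call for `bob123` sees the first exchange, while `james007` starts with an empty history.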

In [46]:
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

# used to retrieve conversation history from memory
# based on a specific user or session ID
history_store = {}
def get_session_history(session_id):
    if session_id not in history_store:
        history_store[session_id] = ChatMessageHistory()
    return history_store[session_id]

# prompt to load in history and current input from the user
prompt_template = ChatPromptTemplate.from_messages(
    [
        ("system", "Act as a helpful AI Assistant"),
        MessagesPlaceholder(variable_name="history"),
        ("human", "{human_input}"),
    ]
)

# create a basic LLM Chain
llm_chain = (prompt_template
                |
             chatgpt)

# create a conversation chain which can load memory based on specific user or session id
conv_chain = RunnableWithMessageHistory(
    llm_chain,
    get_session_history,
    input_messages_key="human_input",
    history_messages_key="history",
)

# create a utility function to take in current user input prompt and their session ID
# streams result live back to the user from the LLM
def chat_with_llm(prompt: str, session_id: str):
    for chunk in conv_chain.stream({"human_input": prompt},
                                   {'configurable': { 'session_id': session_id}}):
        print(chunk.content, end="")

In [47]:
user_id = 'bob123'
prompt = "Hi I am Bob, can you explain AI in 3 bullet points?"
chat_with_llm(prompt, user_id)

- AI, or artificial intelligence, refers to the simulation of human intelligence processes by machines, such as learning, reasoning, and problem-solving.
- AI technologies include machine learning, neural networks, natural language processing, and computer vision, among others.
- AI is used in various applications, such as virtual assistants, autonomous vehicles, healthcare diagnostics, and fraud detection, to improve efficiency and decision-making.

In [48]:
prompt = "Now do the same for deep learning"
chat_with_llm(prompt, user_id)

- Deep learning is a subset of machine learning that uses artificial neural networks to model and solve complex problems.
- Deep learning algorithms are designed to automatically learn and extract features from data, without the need for explicit programming.
- Deep learning has been successful in tasks such as image and speech recognition, natural language processing, and autonomous driving, due to its ability to handle large amounts of data and learn intricate patterns.

In [49]:
prompt = "Discuss briefly what have we discussed so far is bullet points?"
chat_with_llm(prompt, user_id)

- AI refers to the simulation of human intelligence processes by machines, utilizing technologies like machine learning, neural networks, and natural language processing.
- Deep learning is a subset of AI that uses artificial neural networks to automatically learn and extract features from data, enabling complex problem-solving without explicit programming.
- Both AI and deep learning have diverse applications in various fields, such as virtual assistants, healthcare diagnostics, image recognition, and autonomous driving, to enhance efficiency and decision-making processes.

In [50]:
user_id = 'james007'
prompt = "Hi can you explain what is an LLM in 2 bullet points?"
chat_with_llm(prompt, user_id)

- LLM stands for Master of Laws, which is a postgraduate degree typically pursued by individuals who already hold a law degree and wish to specialize in a specific area of law.
- The LLM program allows students to deepen their knowledge and understanding of legal principles, gain expertise in a particular legal field, and enhance their career prospects in the legal profession.

In [51]:
prompt = "Actually I meant in the context of AI?"
chat_with_llm(prompt, user_id)

- In the context of AI, LLM can refer to Large Language Models, which are advanced AI models designed to understand and generate human language at a sophisticated level.
- LLMs, such as GPT-3 (Generative Pre-trained Transformer 3), have been developed by companies like OpenAI and are used for various applications, including natural language processing, text generation, and conversational AI.

In [52]:
prompt = "Summarize briefly what we have discussed so far?"
chat_with_llm(prompt, user_id)

We have discussed two different meanings of LLM: Master of Laws (LLM) as a postgraduate degree for legal specialization, and Large Language Models (LLMs) in the context of AI as advanced models for understanding and generating human language.

In [53]:
user_id = 'bob123'
prompt = "Discuss briefly what have we discussed so far is bullet points?"
chat_with_llm(prompt, user_id)

- AI refers to the simulation of human intelligence processes by machines, utilizing technologies like machine learning, neural networks, and natural language processing.
- Deep learning is a subset of AI that uses artificial neural networks to automatically learn and extract features from data, enabling complex problem-solving without explicit programming.
- Both AI and deep learning have diverse applications in various fields, such as virtual assistants, healthcare diagnostics, image recognition, and autonomous driving, to enhance efficiency and decision-making processes.

### Multi-user Window-based Conversation Chains with persistence - SQLChatMessageHistory

Like `ChatMessageHistory`, `SQLChatMessageHistory` lets us store a separate conversation history per user or session, which is what real-world chatbots need when many users access them at the same time. Instead of keeping the history in memory, it stores messages in a SQL database, which can hold a large number of conversations.

We use a `get_session_history` function which is expected to take in a `session_id` and return a message history object. Everything is stored in a SQL database. This `session_id` is used to distinguish between separate conversations and should be passed in as part of the config when calling the new chain.

We also use a `memory_buffer_window` function to send only the last K historical conversations to the LLM, which is basically our own implementation of `ConversationBufferWindowMemory`.
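The combination of per-session persistence and windowing can be sketched with the standard-library `sqlite3` module. The table layout and the helper names (`add_message`, `load_window`) are assumptions made for this illustration and do not reflect the schema `SQLChatMessageHistory` actually uses.

```python
import sqlite3

# in-memory database for the demo; a file path would persist across runs
toy_conn = sqlite3.connect(":memory:")
toy_conn.execute(
    "CREATE TABLE messages ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, session_id TEXT, role TEXT, content TEXT)"
)


def add_message(session_id, role, content):
    toy_conn.execute(
        "INSERT INTO messages (session_id, role, content) VALUES (?, ?, ?)",
        (session_id, role, content),
    )
    toy_conn.commit()


def load_window(session_id, k=2):
    """Load the last k human/AI exchanges (2 * k messages) for one session, oldest first."""
    rows = toy_conn.execute(
        "SELECT role, content FROM messages WHERE session_id = ? ORDER BY id DESC LIMIT ?",
        (session_id, 2 * k),
    ).fetchall()
    return rows[::-1]


for i in range(1, 4):
    add_message("jim001", "human", f"q{i}")
    add_message("jim001", "ai", f"a{i}")
add_message("bob123", "human", "unrelated")

jim_window = load_window("jim001", k=2)
print(jim_window)
```

Filtering by `session_id` keeps users apart, and the `LIMIT` clause plays the role of the window.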

In [54]:
# removes the memory database file - usually not needed
# you can run this only when you want to remove all conversation histories
!rm memory.db

rm: cannot remove 'memory.db': No such file or directory


In [55]:
from langchain_community.chat_message_histories import SQLChatMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables import RunnablePassthrough

# retrieve the conversation history for a specific user or session ID
# from the SQLite database file memory.db
def get_session_history_db(session_id: str) -> BaseChatMessageHistory:
    return SQLChatMessageHistory(session_id, "sqlite:///memory.db")

# prompt to load in history and current input from the user
prompt_template = ChatPromptTemplate.from_messages(
    [
        ("system", "Act as a helpful AI Assistant"),
        MessagesPlaceholder(variable_name="history"),
        ("human", "{human_input}"),
    ]
)

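# create a memory buffer window function to return only the most recent messages
# note: the slice keeps the last k+1 messages, so with k=2 the LLM typically sees
# the previous AI reply plus the latest human/AI exchange
def memory_buffer_window(messages, k=2):
    return messages[-(k+1):]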

# create a basic LLM chain which sends only the windowed history for each user
llm_chain = (
    RunnablePassthrough.assign(history=lambda x: memory_buffer_window(x["history"]))
      |
    prompt_template
      |
    chatgpt
)

In [56]:
# create a conversation chain which can load memory based on specific user or session id
conv_chain = RunnableWithMessageHistory(
    llm_chain,
    get_session_history_db,
    input_messages_key="human_input",
    history_messages_key="history",
)

# create a utility function to take in current user input prompt and their session ID
# streams result live back to the user from the LLM
def chat_with_llm(prompt: str, session_id: str):
    for chunk in conv_chain.stream({"human_input": prompt},
                                   {'configurable': { 'session_id': session_id}}):
        print(chunk.content, end="")

In [57]:
user_id = 'jim001'
prompt = "Hi can you tell me which is the fastest animal?"
chat_with_llm(prompt, user_id)

The fastest animal on land is the cheetah, which can reach speeds of up to 60-70 miles per hour (96-112 km/h) in short bursts covering distances up to 500 meters. In the air, the peregrine falcon holds the title for the fastest animal, reaching speeds of over 240 miles per hour (386 km/h) when diving to catch prey.

In [58]:
prompt = "what about the slowest animal?"
chat_with_llm(prompt, user_id)

The slowest animal is the garden snail, which moves at a leisurely pace of about 0.03 miles per hour (0.048 km/h). Garden snails are known for their slow and deliberate movements, using their muscular foot to glide along a trail of mucus they secrete.

In [59]:
prompt = "what about the largest animal?"
chat_with_llm(prompt, user_id)

The largest animal on Earth is the blue whale. Blue whales can grow up to 100 feet (30 meters) in length and weigh as much as 200 tons. These magnificent creatures are known for their massive size and are the largest animals to have ever lived on our planet.

In [60]:
prompt = "what topics have we discussed, show briefly as bullet points"
chat_with_llm(prompt, user_id)

- The slowest animal
- The largest animal
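The summary above leaves out the fastest-animal question. Assuming the stored history at this point holds the three earlier exchanges in order, the sketch below replays the notebook's `memory_buffer_window` slice on short stand-in labels (not the real message objects) to show which messages reach the LLM.

```python
# Same slice as the notebook's memory_buffer_window.
def memory_buffer_window(messages, k=2):
    return messages[-(k+1):]

# Stand-in labels for the six messages stored before the summary request.
stored = [
    "human: fastest animal?",
    "ai: cheetah / peregrine falcon",
    "human: slowest animal?",
    "ai: garden snail",
    "human: largest animal?",
    "ai: blue whale",
]

# With the default k=2 the slice keeps the last three messages.
visible = memory_buffer_window(stored)
for message in visible:
    print(message)
```

Only the snail answer and the blue whale exchange survive the window, which matches the two topics the model listed. The full history is still in the database; the window only limits what is sent with each request.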

In [61]:
user_id = 'john005'
prompt = "Explain AI in 3 bullets to a child"
chat_with_llm(prompt, user_id)

- AI is like a smart robot that can learn and think on its own.
- It can help us do things faster and better, like playing games or finding information.
- AI is like having a super smart friend who can help us with different tasks.

In [62]:
prompt = "Now do the same for Generative AI"
chat_with_llm(prompt, user_id)

- Generative AI is a special type of AI that can create new things, like art or music.
- It uses algorithms to come up with new ideas and designs.
- Generative AI is like a magical artist that can make beautiful things out of nothing.

In [63]:
prompt = "Now do the same for machine learning"
chat_with_llm(prompt, user_id)

- Machine learning is a type of AI that can learn from data and improve over time.
- It uses algorithms to analyze patterns in data and make predictions or decisions.
- Machine learning is like having a super smart assistant that can help us make sense of complex information and make better decisions.

In [64]:
prompt = "what topics have we discussed, show briefly as bullet points"
chat_with_llm(prompt, user_id)

- Generative AI: AI that can create new things like art or music using algorithms.
- Machine learning: AI that can learn from data, analyze patterns, and make predictions or decisions to improve over time.