<a href="https://colab.research.google.com/github/jeffheaton/app_generative_ai/blob/main/t81_559_class_04_1_langchain_chat.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# T81-559: Applications of Generative Artificial Intelligence
**Module 4: LangChain: Chat and Memory**
* Instructor: [Jeff Heaton](https://sites.wustl.edu/jeffheaton/), McKelvey School of Engineering, [Washington University in St. Louis](https://engineering.wustl.edu/Programs/Pages/default.aspx)
* For more information visit the [class website](https://sites.wustl.edu/jeffheaton/t81-558/).

# Module 4 Material

* **Part 4.1: LangChain Conversations** [[Video]](https://www.youtube.com/watch?v=Effbhxq07Ag) [[Notebook]](t81_559_class_04_1_langchain_chat.ipynb)
* Part 4.2: Conversation Buffer Window Memory [[Video]](https://www.youtube.com/watch?v=14RgiFVGfAA) [[Notebook]](t81_559_class_04_2_memory_buffer.ipynb)
* Part 4.3: Conversation Token Buffer Memory [[Video]](https://www.youtube.com/watch?v=QTe5g2c3bSM) [[Notebook]](t81_559_class_04_3_memory_token.ipynb)
* Part 4.4: Conversation Summary Memory [[Video]](https://www.youtube.com/watch?v=asZQ8Ktqmt8) [[Notebook]](t81_559_class_04_4_memory_summary.ipynb)
* Part 4.5: Persisting LangChain Memory [[Video]](https://www.youtube.com/watch?v=sjCyqqOQcPA) [[Notebook]](t81_559_class_04_5_memory_persist.ipynb)

# Google CoLab Instructions

The following code ensures that Google CoLab is running and maps Google Drive if needed.

In [None]:
import os

try:
    from google.colab import drive, userdata
    COLAB = True
    print("Note: using Google CoLab")
except ImportError:
    # google.colab can only be imported when running inside CoLab
    print("Note: not using Google CoLab")
    COLAB = False

# OpenAI Secrets
if COLAB:
    os.environ["OPENAI_API_KEY"] = userdata.get('OPENAI_API_KEY')

# Install needed libraries in CoLab
if COLAB:
    !pip install langchain langchain_openai

Note: using Google CoLab
Collecting langchain
  Downloading langchain-0.1.17-py3-none-any.whl (867 kB)
Collecting langchain_openai
  Downloading langchain_openai-0.1.4-py3-none-any.whl (33 kB)
Collecting dataclasses-json<0.7,>=0.5.7 (from langchain)
  Downloading dataclasses_json-0.6.5-py3-none-any.whl (28 kB)
Collecting jsonpatch<2.0,>=1.33 (from langchain)
  Downloading jsonpatch-1.33-py2.py3-none-any.whl (12 kB)
Collecting langchain-community<0.1,>=0.0.36 (from langchain)
  Downloading langchain_community-0.0.36-py3-none-any.whl (2.0 MB)
Collecting langchain-core<0.2.0,>=0.1.48 (from langchain)
  Downloading langchain_core-0.1.48-py3-none-any.whl (302 kB)

# 4.1: LangChain Conversations

Large language models (LLMs) support interaction that resembles human conversation, including references to information shared earlier in the dialogue. In this module, we will explore how to manage an LLM's memory. Notably, LLMs have no inherent memory beyond their immediate context window; each call sees only the text it is given. Consequently, it falls to external systems like LangChain to include the previous conversation in each new prompt. To remember past interactions, LangChain maintains a transcript, containing both user inputs and LLM responses, that accumulates over time and is sent to the LLM again with each exchange. Given this transcript and the new prompt, the LLM generates the next response, and the cycle repeats.
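
The point that the model only "remembers" what is sent to it again can be sketched without any API call. The following is a minimal illustration, not LangChain code: `fake_llm` and `ask` are hypothetical stand-ins, and the fake model can see only the messages passed to it on each call.

```python
import re

def fake_llm(messages):
    """Hypothetical stand-in for an LLM: it sees only the messages it is given."""
    for role, text in messages:
        match = re.search(r"my name is (\w+)", text)
        if role == "human" and match:
            # The name is known only if it appears in the visible transcript
            return f"Your name is {match.group(1)}."
    return "I do not know your name."

transcript = []  # accumulates (role, text) pairs over the whole conversation

def ask(prompt, use_history=True):
    """Record the prompt, call the fake model, and record its reply."""
    transcript.append(("human", prompt))
    # Either send the full transcript or only the newest message
    visible = transcript if use_history else transcript[-1:]
    reply = fake_llm(visible)
    transcript.append(("ai", reply))
    return reply

ask("Oh sorry, my name is Jeff.")
with_history = ask("What is my name?")
without_history = ask("What is my name?", use_history=False)
print(with_history)
print(without_history)
```

When the full transcript is sent, the stand-in model can answer; when only the latest message is sent, the name is no longer available to it.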

## Creating a Chat Conversation

The code below defines two functions that form a basic conversation utility for a Large Language Model (LLM). These foundational utilities come first; later sections introduce more complex memory capabilities using LangChain classes.

First, the necessary modules and classes are imported. HumanMessage and SystemMessage from langchain_core.messages represent messages from the human user and system-level instructions, respectively. ChatPromptTemplate, HumanMessagePromptTemplate, and SystemMessagePromptTemplate from langchain_core.prompts.chat are helpers for formatting chat prompts; they are imported here but not used by the two functions below. Additionally, ChatOpenAI from langchain_openai is used to interact with OpenAI's language models, and display_markdown from IPython.display renders Markdown within IPython environments.

The first function, begin_conversation, initializes a conversation with a system prompt, a message that tells the model what role to play. The prompt is passed in as the sys_prompt argument; later cells pass the DEFAULT_SYSTEM constant, which declares that the model is a helpful assistant. The function wraps the prompt in a SystemMessage and returns a list containing only this message, setting the stage for a conversation.

The second function, converse, handles one exchange between the human user and the LLM. It takes an LLM instance, the ongoing conversation (a list of messages), and the user's prompt as inputs. It begins by appending the user's message (wrapped in a HumanMessage) to the conversation. It then invokes the LLM on the entire conversation history up to that point. The LLM's response message is appended to the same list, so the caller's conversation is updated in place. Finally, the function returns the content of the response for the user to see.

In [None]:
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_core.prompts.chat import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    SystemMessagePromptTemplate,
)
from langchain_openai import ChatOpenAI
from IPython.display import display_markdown

DEFAULT_SYSTEM = "You are a helpful assistant. Format answers with markdown."

def begin_conversation(sys_prompt):
  messages = [
      SystemMessage(content=sys_prompt)
  ]
  return messages

def converse(llm, conversation, prompt):
  conversation.append(HumanMessage(content=prompt))
  output = llm.invoke(conversation)
  conversation.append(output)
  return output.content

The provided code builds upon the initial snippet to create a simple conversation where the Large Language Model (LLM) recalls the user's name through a series of exchanges.

The conversation starts by initializing an instance of ChatOpenAI, a class from the LangChain library, with a specific model ('gpt-4o-mini'). Several parameters are set for the LLM, including the temperature, which controls the randomness of the response, and n, which specifies the number of responses to generate (in this case, a single response).
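
The effect of temperature can be illustrated with a small worked example. The sketch below assumes the common formulation in which token scores (logits) are divided by the temperature before a softmax is applied. The scores are made up, the function name `softmax_with_temperature` is hypothetical, and this is not a description of how OpenAI implements sampling internally.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens the result."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
low = softmax_with_temperature(logits, 0.3)   # the temperature used in this notebook
high = softmax_with_temperature(logits, 1.5)  # a higher setting for comparison
print([round(p, 3) for p in low])
print([round(p, 3) for p in high])
```

At the lower temperature, most of the probability sits on the highest-scoring token, so responses vary less from run to run; at the higher temperature, the probabilities are spread more evenly.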

The function begin_conversation is called with the constant DEFAULT_SYSTEM as its argument. This initializes the conversation with a system message stating, "You are a helpful assistant. Format answers with markdown." The resulting conversation list, which at first contains only this system message, will be used to track the entire conversation history.

Next, the converse function is used to facilitate the dialogue between the user and the LLM. The user begins by sending a prompt: "Hello, what is my name?" This user's message is appended to the conversation history, and the LLM is invoked to generate a response based on the conversation so far. The LLM's response is added to the conversation and displayed using display_markdown to format the output appropriately.

Since the LLM does not yet know the user's name, the user then provides it with the statement, "Oh sorry, my name is Jeff." This message is processed the same way: it is added to the conversation, and the LLM generates a new response that acknowledges the name. Again, the response is displayed.

Finally, the user asks again, "What is my name?" to test if the LLM recalls the name from earlier in the conversation. The same process ensues, with the LLM generating a response based on the updated conversation history that now includes the user's name.

This sequence effectively demonstrates a simple use case where the conversation history maintained in the list enables the LLM to recall and utilize context from earlier exchanges, such as remembering the user's name. The ability to recall details like this is crucial for creating more engaging and personalized user interactions with LLMs.

In [None]:
MODEL = 'gpt-4o-mini'

# Initialize the OpenAI chat model; the API key is read from the
# OPENAI_API_KEY environment variable set earlier
llm = ChatOpenAI(
  model=MODEL,
  temperature=0.3,
  n=1)

conversation = begin_conversation(DEFAULT_SYSTEM)
output = converse(llm, conversation, "Hello, what is my name?")
display_markdown(output,raw=True)
output = converse(llm, conversation, "Oh sorry, my name is Jeff.")
display_markdown(output,raw=True)
output = converse(llm, conversation, "What is my name?")
display_markdown(output,raw=True)

I'm sorry, but I don't have access to your personal information such as your name. How can I assist you today?

Nice to meet you, Jeff! How can I help you today?

Your name is Jeff. How can I assist you further, Jeff?
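
To see what the conversation list holds after these exchanges, it can help to print it as a transcript. The helper below is a sketch that assumes each message object exposes `type` and `content` attributes, as LangChain's message classes do (for example, "system", "human", and "ai"). The name `format_transcript` is hypothetical, and the demo uses stand-in objects so that it runs without an API key.

```python
from types import SimpleNamespace

def format_transcript(conversation):
    """Return one 'role: content' line per message in the conversation list."""
    return [f"{msg.type}: {msg.content}" for msg in conversation]

# Stand-in messages for illustration only; in the notebook, pass the real
# `conversation` list that converse() has been updating instead.
demo = [
    SimpleNamespace(type="system", content="You are a helpful assistant."),
    SimpleNamespace(type="human", content="Oh sorry, my name is Jeff."),
    SimpleNamespace(type="ai", content="Nice to meet you, Jeff!"),
]
lines = format_transcript(demo)
print("\n".join(lines))
```

Calling `format_transcript(conversation)` on the real list should show the system message first, followed by alternating human and LLM messages, which is exactly what the model receives on the next call.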

## Conversing with the LLM in Markdown

The function chat serves as a facilitator for conversation between a human user and a Large Language Model (LLM) and enhances the display of the chat responses using Markdown formatting. It is built upon the earlier provided code snippets that set up a basic conversational framework with an LLM.

Markdown is a lightweight markup language with plain-text formatting syntax. It is designed to be converted into HTML or other formats while remaining easy to read and write in its raw form. This feature makes it popular for writing on the web, as users can create formatted text (like headers, lists, italics, and bold text) using simple and readable symbols.

In the chat function, the process starts by printing the user's prompt prefixed with "Human: ". This mimics a real chat interface, clearly delineating the messages that are input by the user. Then, the function calls converse, which was previously described, to process the prompt within the ongoing conversation with the LLM. As before, converse appends the human message to the conversation, invokes the LLM to generate a response based on the conversation's context, and then appends the LLM-generated message to the conversation history.

After obtaining the LLM's response through converse, the function display_markdown is used to render this response. The raw=True parameter tells the IPython display function to treat the string as raw Markdown. By using Markdown formatting, responses can include enhanced textual features such as italics, bolding, and lists, which can make the output more readable and engaging.

Formatting responses in Markdown fits the text-based way LLMs communicate. LLMs, such as those provided by OpenAI, often format their responses in Markdown, and the DEFAULT_SYSTEM prompt used here explicitly asks for it. Rendering that Markdown, rather than printing it as plain text, lets the conversation utility show headers, lists, and tables as intended, which makes the responses easier to read.

In [None]:
def chat(llm, conversation, prompt):
  print(f"Human: {prompt}")
  output = converse(llm, conversation, prompt)
  display_markdown(output,raw=True)

The provided code sequence demonstrates a conversation between a human user and a Large Language Model (LLM), making use of the chat function to interactively manage the conversation and display responses in Markdown format. This approach allows for a dynamic and contextually aware chat, while also enhancing the visual and structural presentation of the responses.

In [None]:
conversation = begin_conversation(DEFAULT_SYSTEM)
chat(llm, conversation, "What is my name?")
chat(llm, conversation, "Okay, then let me introduce myself, my name is Jeff")
chat(llm, conversation, "What is my name?")
chat(llm, conversation, "Give me a table of the 5 most populous cities with population and country.")


H: What is my name?


I'm sorry, but I don't have access to your personal information such as your name. How can I assist you today?

H: Okay, then let me introduce myself, my name is Jeff


Nice to meet you, Jeff! How can I assist you today?

H: What is my name?


Your name is Jeff. How can I assist you today, Jeff?

H: Give me a table of the 5 most populous cities with population and country.


| City          | Population     | Country |
|---------------|----------------|---------|
| Tokyo         | 37,833,000     | Japan   |
| Delhi         | 30,291,000     | India   |
| Shanghai      | 27,058,000     | China   |
| Sao Paulo     | 22,043,000     | Brazil  |
| Mumbai        | 21,042,000     | India   |

## Constraining the Conversation with a System Prompt

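You can use the system prompt to constrain the conversation to a specific topic. Here, we provide a simple agent that will only discuss life insurance. As the output shows, it declines every off-topic request, including the questions about the user's name, and answers only the final question about policy types.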

In [None]:
conversation = begin_conversation("""
You are a helpful agent to answer questions about life insurance. Do not talk
about anything else with users. Format answers with markdown.""")
chat(llm, conversation, "What is my name?")
chat(llm, conversation, "Okay, then let me introduce myself, my name is Jeff")
chat(llm, conversation, "What is my name?")
chat(llm, conversation, "What is your favorite programming language?")
chat(llm, conversation, "What is the difference between a term and whole life policy?")

Human: What is my name?


I'm here to help answer questions about life insurance. How can I assist you today with life insurance-related queries?

Human: Okay, then let me introduce myself, my name is Jeff


I'm here to help answer questions about life insurance. How can I assist you today with life insurance-related queries?

Human: What is my name?


I'm here to help answer questions about life insurance. How can I assist you today with life insurance-related queries?

Human: What is your favorite programming language?


I'm here to help answer questions about life insurance. How can I assist you today with life insurance-related queries?

Human: What is the difference between a term and whole life policy?


- **Term Life Insurance:**
  - Provides coverage for a specific period, such as 10, 20, or 30 years.
  - Typically more affordable than whole life insurance.
  - Does not build cash value.
  - Coverage ends when the term expires unless renewed or converted.
  
- **Whole Life Insurance:**
  - Provides coverage for your entire life.
  - Premiums remain level throughout the policy's lifetime.
  - Builds cash value over time that you can borrow against or withdraw.
  - Generally more expensive than term life insurance.
  
It's important to consider your financial goals and needs when deciding between term and whole life insurance.
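
The same pattern can be reused for other topics by building the system prompt from a parameter. The helper below is a small sketch; the name `make_topic_prompt` is hypothetical, and the wording simply mirrors the life insurance prompt used above.

```python
def make_topic_prompt(topic):
    """Build a system prompt that restricts the assistant to a single topic."""
    return (
        f"You are a helpful agent to answer questions about {topic}. "
        "Do not talk about anything else with users. "
        "Format answers with markdown."
    )

prompt = make_topic_prompt("life insurance")
print(prompt)
```

In the notebook, the result could be passed directly to `begin_conversation`, for example `conversation = begin_conversation(make_topic_prompt("life insurance"))`. How strictly a model follows such a constraint can vary, so it is worth testing with off-topic questions as done above.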

# Module 4 Assignment

You can find the Module 4 assignment here: [assignment 4](https://github.com/jeffheaton/app_generative_ai/blob/main/assignments/assignment_yourname_class4.ipynb)