<a href="https://colab.research.google.com/github/jeffheaton/app_generative_ai/blob/main/t81_559_class_04_1_langchain_chat.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# T81-559: Applications of Generative Artificial Intelligence
**Module 4: LangChain: Chat and Memory**
* Instructor: [Jeff Heaton](https://sites.wustl.edu/jeffheaton/), McKelvey School of Engineering, [Washington University in St. Louis](https://engineering.wustl.edu/Programs/Pages/default.aspx)
* For more information visit the [class website](https://sites.wustl.edu/jeffheaton/t81-558/).

# Module 4 Material

* **Part 4.1: LangChain Conversations** [[Video]](https://www.youtube.com/watch?v=-wXT2RlzJec&ab_channel=JeffHeaton) [[Notebook]](t81_559_class_04_1_langchain_chat.ipynb)
* Part 4.2: Conversation Buffer Window Memory [[Video]](https://www.youtube.com/watch?v=G-l3T1Z9CHc&ab_channel=JeffHeaton) [[Notebook]](t81_559_class_04_2_memory_buffer.ipynb)
* Part 4.3: Chat with Summary and Fixed Window [[Video]](https://www.youtube.com/watch?v=z0iTmoEgn9U&ab_channel=JeffHeaton) [[Notebook]](t81_559_class_04_3_summary.ipynb)
* Part 4.4: Chat with Persistence, Rollback and Regeneration [[Video]](https://www.youtube.com/watch?v=7QEFjNE6wxs&ab_channel=JeffHeaton) [[Notebook]](t81_559_class_04_4_persistence.ipynb)
* Part 4.5: Automated Coder Application [[Video]](https://www.youtube.com/watch?v=pHcKXOMDZKU&ab_channel=JeffHeaton) [[Notebook]](t81_559_class_04_5_coder.ipynb)

# Google CoLab Instructions

The following code detects whether the notebook is running in Google CoLab. If it is, the code loads the OpenAI API key from the CoLab secrets and installs the required libraries.

In [None]:
import os

try:
    from google.colab import drive, userdata
    COLAB = True
    print("Note: using Google CoLab")
except ImportError:
    print("Note: not using Google CoLab")
    COLAB = False

# OpenAI Secrets
if COLAB:
    os.environ["OPENAI_API_KEY"] = userdata.get('OPENAI_API_KEY')

# Install needed libraries in CoLab
if COLAB:
    !pip install langchain langchain_openai

Note: using Google CoLab


# 4.1: LangChain Conversations

Large language models (LLMs) support interaction that resembles human conversation, including references to information shared earlier in the dialogue. In this module, we will explore how to manage an LLM's memory. Notably, LLMs lack an inherent memory system beyond their immediate context window. Consequently, it falls upon external systems like LangChain to include the previous conversational history in each new prompt. To remember past interactions, the application maintains a transcript that accumulates over time and contains both user inputs and LLM responses. This transcript is sent to the LLM again with each exchange, and the LLM generates a suitable next response from it, which continues the conversation.
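
The transcript mechanism described above can be sketched in plain Python without calling any API. In this minimal sketch, `fake_llm` is a hypothetical stand-in for a real model: it is stateless and sees only the message list passed to it on each call. The role names and tuple layout are illustrative assumptions, not LangChain's internal format.

```python
# A stateless stand-in for an LLM (hypothetical; a real model would
# generate text). It only knows what is in the transcript it receives.
def fake_llm(transcript):
    return f"I received {len(transcript)} message(s)."

def send(transcript, user_text):
    """Append the user's message, call the model with the FULL
    transcript, then append the model's reply."""
    transcript.append(("human", user_text))
    reply = fake_llm(transcript)
    transcript.append(("ai", reply))
    return reply

transcript = [("system", "You are a helpful assistant.")]
first = send(transcript, "Hello!")
second = send(transcript, "Are you still there?")

print(first)
print(second)
print(len(transcript))
```

Each call passes a longer transcript than the one before, which is why long conversations eventually need the memory-management techniques covered later in this module.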

## Creating a Chat Conversation

The following code provides two functions that form a basic conversation utility for a Large Language Model (LLM). These foundational utilities come first; later sections introduce more complex memory capabilities using LangChain classes.

First, the necessary modules and classes are imported. HumanMessage and SystemMessage from langchain_core.messages represent messages from the human user and system-level instructions to the model, respectively. ChatPromptTemplate, HumanMessagePromptTemplate, and SystemMessagePromptTemplate from langchain_core.prompts.chat are imported for building prompt templates, although the code in this part does not call them. Additionally, ChatOpenAI from langchain_openai is used to interact with OpenAI's language models, and display_markdown from IPython.display renders Markdown within IPython environments.

The first function, begin_conversation, initializes a conversation with the system prompt passed as its argument. Later cells pass DEFAULT_SYSTEM, a predefined message declaring that the assistant's role is to help. The function wraps the prompt in a SystemMessage and returns a list containing this message, setting the stage for a conversation.

The second function, converse, facilitates the interaction between the human user and the LLM. It takes an LLM instance, the ongoing conversation (a list of messages), and the user's prompt as inputs. It begins by appending the user's message (encapsulated as a HumanMessage) to the conversation. It then invokes the LLM to respond based on the entire conversation history up to that point. The LLM's response is captured and added back to the conversation. Finally, the content of the LLM's response is returned, providing the output for the user to see. This function enables a dynamic and continuous exchange between the user and the system, leveraging the LLM's capabilities to generate relevant responses.

In [None]:
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_core.prompts.chat import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    SystemMessagePromptTemplate,
)
from langchain_openai import ChatOpenAI
from IPython.display import display_markdown

DEFAULT_SYSTEM = "You are a helpful assistant. Format answers with markdown."

def begin_conversation(sys_prompt):
  messages = [
      SystemMessage(content=sys_prompt)
  ]
  return messages

def converse(llm, conversation, prompt):
  conversation.append(HumanMessage(content=prompt))
  output = llm.invoke(conversation)
  conversation.append(output)
  return output.content

The following code builds upon the two functions above to create a simple conversation in which the Large Language Model (LLM) recalls the user's name across a series of exchanges.

The conversation starts by initializing an instance of ChatOpenAI, a class from the LangChain library, with a specific model ('gpt-5-mini'). Several parameters are set for the LLM, including the temperature, which controls the randomness of the response, and n, which specifies the number of responses to generate (in this case, a single response).
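
To build intuition for the temperature parameter, the sketch below applies the standard softmax-with-temperature formula to three made-up scores. The scores and the function name `softmax_with_temperature` are illustrative assumptions; this shows the general technique, not OpenAI's actual sampling code.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, dividing by temperature first."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens

low = softmax_with_temperature(logits, 0.3)   # sharper distribution
high = softmax_with_temperature(logits, 1.5)  # flatter distribution

print([round(p, 3) for p in low])
print([round(p, 3) for p in high])
```

Lower temperatures concentrate probability on the top-scoring token, which is why a value such as 0.3 tends to produce more consistent answers.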

The function begin_conversation is called with the constant DEFAULT_SYSTEM as its argument. This function initializes the conversation with a system message stating, "You are a helpful assistant. Format answers with markdown." The resulting conversation list, which contains this initial system message, will be used to track the entire conversation history.

Next, the converse function is used to facilitate the dialogue between the user and the LLM. The user begins by sending a prompt: "Hello, what is my name?" This user's message is appended to the conversation history, and the LLM is invoked to generate a response based on the conversation so far. The LLM's response is added to the conversation and displayed using display_markdown to format the output appropriately.

In response to realizing the LLM does not yet know the user's name, the user then provides their name with the statement, "Oh sorry, my name is Jeff." This message is similarly processed: added to the conversation, and the LLM generates a new response acknowledging the name or continuing the conversation. Again, the response is displayed.

Finally, the user asks again, "What is my name?" to test if the LLM recalls the name from earlier in the conversation. The same process ensues, with the LLM generating a response based on the updated conversation history that now includes the user's name.

This sequence effectively demonstrates a simple use case where the conversation history maintained in the list enables the LLM to recall and utilize context from earlier exchanges, such as remembering the user's name. The ability to recall details like this is crucial for creating more engaging and personalized user interactions with LLMs.
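
The dependence on the history list can be checked offline with a stub model. In this minimal sketch, `NameStubLLM`, `Message`, and `stub_converse` are hypothetical stand-ins written for illustration (they are not LangChain classes); the stub can only "recall" a name by scanning the messages it is handed.

```python
import re
from dataclasses import dataclass

@dataclass
class Message:
    """Hypothetical minimal message: a role plus its text."""
    role: str
    content: str

class NameStubLLM:
    """Hypothetical stub that mimics a chat model's invoke() method.
    It keeps no state between calls and only scans the messages it is given."""
    def invoke(self, conversation):
        for msg in conversation:
            match = re.search(r"my name is (\w+)", msg.content, re.IGNORECASE)
            if msg.role == "human" and match:
                return Message("ai", f"Your name is {match.group(1)}.")
        return Message("ai", "I don't know your name.")

def stub_converse(llm, conversation, prompt):
    # Same shape as the notebook's converse(), using the stub types above.
    conversation.append(Message("human", prompt))
    output = llm.invoke(conversation)
    conversation.append(output)
    return output.content

stub_llm = NameStubLLM()

# One shared history: the name is visible on the second call.
shared = [Message("system", "You are a helpful assistant.")]
stub_converse(stub_llm, shared, "Oh sorry, my name is Jeff.")
with_history = stub_converse(stub_llm, shared, "What is my name?")

# A fresh history: the earlier exchange is gone.
fresh = [Message("system", "You are a helpful assistant.")]
without_history = stub_converse(stub_llm, fresh, "What is my name?")

print(with_history)
print(without_history)
```

The stub answers correctly only when the earlier exchange is present in the list it receives, mirroring how the real model depends on the conversation history.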

In [None]:
MODEL = 'gpt-5-mini'

# Initialize the OpenAI LLM; the API key is read from the
# OPENAI_API_KEY environment variable
llm = ChatOpenAI(
  model=MODEL,
  temperature=0.3,
  n=1)

conversation = begin_conversation(DEFAULT_SYSTEM)
output = converse(llm, conversation, "Hello, what is my name?")
display_markdown(output,raw=True)
output = converse(llm, conversation, "Oh sorry, my name is Jeff.")
display_markdown(output,raw=True)
output = converse(llm, conversation, "What is my name?")
display_markdown(output,raw=True)

I don't know your name — you haven't told me yet. If you'd like, tell me now (for example: "My name is Alex") and I'll use it during this conversation.  

Note: I can't access your accounts or personal data outside of what you share here, and I don't retain personal information between sessions unless you tell me again.

Nice to meet you, Jeff! I'll remember your name for the rest of this conversation. How can I help you today?

Your name is Jeff.

## Conversing with the LLM in Markdown

The function chat serves as a facilitator for conversation between a human user and a Large Language Model (LLM) and enhances the display of the chat responses using Markdown formatting. It is built upon the earlier provided code snippets that set up a basic conversational framework with an LLM.

Markdown is a lightweight markup language with plain-text formatting syntax. It is designed to be converted into HTML or other formats while remaining easy to read and write in its raw form. This feature makes it popular for writing on the web, as users can create formatted text (like headers, lists, italics, and bold text) using simple and readable symbols.

In the chat function, the process starts by printing the user's prompt prefixed with "Human: ". This mimics a real chat interface, clearly delineating the messages entered by the user. Then, the function calls converse, which was previously described, to process the prompt within the ongoing conversation with the LLM. That function appends the human message to the conversation, invokes the LLM to generate a response based on the conversation's context, and then appends the LLM-generated message to the conversation history.

After obtaining the LLM's response through converse, the function display_markdown is used to render this response. The raw=True parameter tells the IPython display function to treat the string as raw Markdown. By using Markdown formatting, responses can include enhanced textual features such as italics, bolding, and lists, which can make the output more readable and engaging.
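
As a small illustration of what a raw Markdown string looks like before rendering, the sketch below builds a Markdown table from Python data. The helper `to_markdown_table` and the sample rows are hypothetical, introduced only for this example.

```python
def to_markdown_table(headers, rows):
    """Build a Markdown table string from a header list and row lists."""
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(str(cell) for cell in row) + " |")
    return "\n".join(lines)

# Placeholder data for illustration only.
table = to_markdown_table(
    ["Item", "Count"],
    [["apples", 3], ["pears", 5]],
)
print(table)
```

In a notebook, passing this string to `display_markdown(table, raw=True)` renders it as a formatted table, whereas `print` shows the raw pipe and dash symbols.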

Rendering responses as Markdown suits the way LLMs write. Models such as those provided by OpenAI often format their responses in Markdown, and the system prompt used here explicitly requests it. Rendering the output therefore displays headers, lists, and tables as intended rather than as raw symbols, which makes the responses easier to read.

In [None]:
def chat(llm, conversation, prompt):
  print(f"Human: {prompt}")
  output = converse(llm, conversation, prompt)
  display_markdown(output,raw=True)

The following code uses the chat function to hold a conversation with the LLM, printing each prompt and rendering each response as Markdown. Because the same conversation list is passed to every call, the model has the context of the earlier exchanges.

In [None]:
conversation = begin_conversation(DEFAULT_SYSTEM)
chat(llm, conversation, "What is my name?")
chat(llm, conversation, "Okay, then let me introduce myself, my name is Jeff")
chat(llm, conversation, "What is my name?")
chat(llm, conversation, "Give me a table of the 5 most populus cities with population and country.")


Human: What is my name?


I don't know your name — I don't have access to personal data unless you tell me. Would you like to tell me now? If so, how should I address you (full name, first name, nickname)?  

Options:
- Tell me your name and I'll use it in this conversation.  
- Ask me to remember it for future chats (note: I can only remember if you enable a memory feature; otherwise I won't retain it).  
- Stay anonymous — I can call you "Friend" or "User."  

Which do you prefer?

Human: Okay, then let me introduce myself, my name is Jeff


Nice to meet you, Jeff — I’ll call you that in this chat.

Would you like me to remember your name for future conversations?  
- If yes: enable the memory feature (in your settings) and I’ll keep it.  
- If no: I won’t retain it beyond this session.

Anything you’d like help with right now?

Human: What is my name?


Your name is Jeff.  

Would you like me to remember it for future conversations?

Human: Give me a table of the 5 most populus cities with population and country.


Do you mean the 5 most populous cities by "city proper" (administrative boundaries) or by "metropolitan area"/"urban agglomeration"? Also which year or data source should I use (most recent estimates, UN 2022/2023, World Population Review, etc.)?  

If you want, I can proceed now with a default: city proper using the most recent commonly cited estimates (and include the year/source). Which do you prefer?

## Constraining the Conversation with a System Prompt

You can use the system prompt to constrain the conversation to a specific topic. Here, we provide a simple agent that will only discuss life insurance.
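
The same constraint can be expressed as a reusable template. The helper `topic_system_prompt` below is a hypothetical convenience function, not part of LangChain; its wording mirrors the life insurance prompt used in this section.

```python
def topic_system_prompt(topic):
    """Return a system prompt that restricts the assistant to one topic."""
    return (
        f"You are a helpful agent to answer questions about {topic}. "
        "Do not talk about anything else with users. "
        "Format answers with markdown."
    )

prompt = topic_system_prompt("life insurance")
print(prompt)
```

The resulting string can be passed to `begin_conversation` in place of a literal prompt.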

In [None]:
conversation = begin_conversation("""
You are a helpful agent to answer questions about life insurance. Do not talk
about anything else with users. Format answers with markdown.""")
chat(llm, conversation, "What is my name?")
chat(llm, conversation, "Okay, then let me introduce myself, my name is Jeff")
chat(llm, conversation, "What is my name?")
chat(llm, conversation, "What is your favorite programming language?")
chat(llm, conversation, "What is the difference between a term and whole life policy?")

Human: What is my name?


**I don’t know your name.** I don’t have access to personal data or identifiers unless you tell me.

I’m here to help with life insurance questions. If you’d like to be called something specific, tell me how you’d like to be addressed and then ask a life-insurance question (examples you can ask):

- Differences between term and whole life insurance  
- How much coverage you might need  
- How beneficiaries and payouts work  
- What affects premiums and how to lower them  
- Riders, conversions, and policy options

Which life insurance question can I help with?

Human: Okay, then let me introduce myself, my name is Jeff


**Hi Jeff — nice to meet you.**  

How can I help with life insurance today? Here are some things I can do — pick one or tell me more about your situation:

- Explain differences between term, whole, universal, and variable life insurance  
- Estimate how much coverage you might need (I can use your age, income, debts, dependents, mortgage, savings)  
- Explain beneficiaries, payout timing, and tax treatment  
- Describe what affects premiums and how to lower them (medical exams, tobacco use, occupation, hobbies)  
- Review riders (accelerated death benefit, waiver of premium, child rider, etc.) and conversion options  
- Help compare quotes or evaluate an existing policy

If you want a personalized estimate, tell me your age, general health (smoker? major conditions?), marital/dependent status, annual income, and any big debts (mortgage, loans).

Human: What is my name?


**Your name is Jeff.**

How can I help with life insurance today?

Human: What is your favorite programming language?


**I can’t answer that.** I only discuss life insurance topics.

How can I help with life insurance today? Examples:
- Explain term vs. whole life insurance  
- Estimate how much coverage you may need (tell me age, health, income, debts, dependents)  
- Explain beneficiaries, payouts, and taxes  
- Describe what affects premiums and ways to lower them  
- Review common riders (accelerated death benefit, waiver of premium, etc.)

Which would you like?

Human: What is the difference between a term and whole life policy?


Below is a clear comparison of term vs. whole life insurance to help you decide which fits your needs.

What they are
- Term life: Provides a death benefit for a fixed period (common terms: 10, 15, 20, 30 years). If you die during the term, the policy pays the benefit; if you outlive the term, the coverage stops (unless renewed or converted).
- Whole life: A permanent policy that stays in force for your lifetime as long as premiums are paid. It provides a guaranteed death benefit plus a cash value component that grows over time.

Key differences at a glance
- Duration
  - Term: Temporary coverage for a set number of years.
  - Whole: Permanent coverage for life.
- Premiums
  - Term: Much lower initially; level for the term length, then usually rises sharply on renewal.
  - Whole: Higher, usually level for life (can be structured different ways).
- Cash value
  - Term: No cash value.
  - Whole: Accumulates cash value you can access via loans or withdrawals.
- Purpose
  - Term: Income replacement, mortgage protection, temporary needs.
  - Whole: Lifetime protection, estate planning, tax-deferred savings component.
- Cost
  - Term: More affordable coverage per dollar of death benefit.
  - Whole: Significantly more expensive for the same death benefit amount.

Other important considerations
- Conversions: Many term policies include a conversion option to convert to a permanent policy without new underwriting—useful if health changes.
- Renewability: Term policies often allow renewal at term end, but premiums may be much higher based on age.
- Cash value access: Whole life cash value can be borrowed against or withdrawn; loans accrue interest and unpaid loans reduce the death benefit.
- Guarantees vs. variability: Whole life offers more guarantees (premiums, death benefit, cash-value growth in traditional whole life). Universal and variable life are other permanent types with different risk/crediting features.
- Tax treatment: Death benefits are generally income-tax-free to beneficiaries. Cash value growth in permanent policies is tax-deferred; loans typically aren’t taxed unless policy lapses.
- Dividends: Some whole life policies (participating) may pay dividends—not guaranteed but can increase cash value or be used to lower premiums.

Pros & cons
- Term life
  - Pros: Affordable, simple, good for temporary large needs (young families, mortgage).
  - Cons: No savings/cash value; coverage can become expensive as you age.
- Whole life
  - Pros: Lifetime coverage, forced savings/cash value, predictable premiums (in many policies), useful for estate planning.
  - Cons: Much higher premiums; complexity; lower effective investment returns compared with some alternatives.

Which is right for you?
- Choose term if: You need affordable coverage to protect dependents during working years, pay off a mortgage, or cover college costs.
- Choose whole (or another permanent policy) if: You need guaranteed lifetime coverage, want a tax-deferred cash-value accumulation, have estate-planning needs, or want to lock in coverage while young and healthy.

Next steps
- Estimate how much coverage you need and for how long (I can help with a needs calculation if you give age, income, debts, dependents, mortgage).
- If cost is a concern, compare term quotes first; if you value lifelong coverage or cash value, get quotes on whole/other permanent types and compare illustrated cash-value projections and fees.
- If you already have a term policy, check whether it’s convertible and the conversion deadline.

Would you like a quick coverage-need estimate or sample quote comparison based on your age and situation?

# Module 4 Assignment

You can find the Module 4 assignment here: [assignment 4](https://github.com/jeffheaton/app_generative_ai/blob/main/assignments/assignment_yourname_t81_559_class4.ipynb)