<a href="https://colab.research.google.com/github/GenAIUnplugged/langchain_series/blob/main/Gradio_langchain_basic_chatbot.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
!pip install langchain langchain-core langchain-community langchain_openai pyMuPDF gradio


In [2]:
import gradio as gr
from langchain_openai import ChatOpenAI
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

In [3]:
from google.colab import userdata
import os
os.environ['OPENAI_API_KEY'] = userdata.get('OPENAI_API_KEY')

In [4]:
llm = ChatOpenAI(temperature=0,model="gpt-4o-mini")

In [5]:
memory = ConversationBufferMemory()



The code below connects the LLM and the memory. \
Every time you call conversation.predict(input=...), the chain inserts the entire conversation stored in memory into the prompt, so the model can give a contextual reply.

In [6]:
conversation = ConversationChain(
    llm=llm,
    verbose=True,
    memory=memory
)



This is the main chatbot logic that runs on every message:

* Takes user_input from the text box.
* Calls conversation.predict(...) to get the model’s reply using full memory.
* Appends the input-output pair to the history list (which Gradio uses to render the chat visually).
* Returns the updated history twice — once for display in the UI and once for internal state management.

The history variable here is used only by Gradio, not by ConversationBufferMemory. \
history exists for the Gradio UI: \
**In its tuple format, gr.Chatbot() expects a list of (user_input, bot_response) tuples to display the conversation.** \
So history.append((user_input, response)) updates the visible chat window. \
**The format is a list of tuples like [(user, bot), (user2, bot2)].** \
Gradio 5 still defaults to this tuple format but warns that it is deprecated in favor of type="messages".

**ConversationBufferMemory is internal to LangChain:**
* It does not use the history variable from Gradio.
* Instead, it records each turn automatically whenever you call response = conversation.predict(input=user_input).
* It stores the messages internally and appends new ones behind the scenes.
* With ConversationBufferMemory attached, the chain formats the prompt sent to the LLM so that it includes the full conversation history.

**Under the Hood** \
When we call response = conversation.predict(input="What's the capital of Italy?"), the chain:
1. Retrieves prior messages from ConversationBufferMemory.
2. Formats them like: \
Human: Hi \
AI: Hello! \
Human: What's my name? \
AI: You said it's Sarah. \
Human: What's the capital of Italy?
3. Sends the whole thing as a prompt to the LLM.
4. Gets the response and updates memory.
If you don’t want this default behavior, you can:
* Use your own memory class.
* Override how the prompt is built.

**Examples:**
* ConversationSummaryMemory: Keeps a running summary instead of the verbatim history.
* ChatMessageHistory: Keeps messages structured (like OpenAI chat format).
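
To make the contrast with a verbatim buffer concrete, here is a toy sketch of a summary-style memory. It does not use LangChain: SummaryMemorySketch and keep_last_words are hypothetical names, and keep_last_words merely truncates text as a stand-in for the LLM call a real summarizing memory would make.

```python
class SummaryMemorySketch:
    """Toy memory that keeps one running summary instead of every turn."""

    def __init__(self, summarize):
        # summarize is any callable(old_summary, new_lines) -> new_summary
        self.summarize = summarize
        self.summary = ""

    def save(self, human_text, ai_text):
        new_lines = f"Human: {human_text}\nAI: {ai_text}"
        self.summary = self.summarize(self.summary, new_lines)

    def as_text(self):
        return self.summary


def keep_last_words(old_summary, new_lines, limit=12):
    # Hypothetical stand-in for an LLM summarizer: keep only the last `limit` words.
    words = (old_summary + " " + new_lines).split()
    return " ".join(words[-limit:])


memory = SummaryMemorySketch(keep_last_words)
memory.save("Hi, my name is Sarah", "Hello Sarah!")
memory.save("What's the capital of Italy?", "Rome.")
print(memory.as_text())
```

The point of the design is that the stored context stays bounded no matter how long the chat runs, at the cost of losing detail from earlier turns.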

In [10]:
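def chat(user_input, history):
    # LangChain reads prior turns from its own memory; `history` is not sent to the LLM.
    response = conversation.predict(input=user_input)
    # Append the new (user, bot) pair so gr.Chatbot() can render it.
    history.append((user_input, response))
    # The first value updates the chat window, the second updates gr.State.
    return history, history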

* gr.Blocks(): A flexible layout system for building interfaces.
* gr.Markdown(...): Adds a title at the top.
* gr.Chatbot(): The UI element that shows the conversation.
* gr.Textbox(): Where users type their messages.
* gr.State([]): Keeps the chat history across interactions (Gradio needs this to remember prior messages).
* **msg.submit(...): When the user hits Enter:**
  * It calls the chat() function.
  * It passes the current message and history.
  * It updates both the visible chatbot and the stored state.
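
The data flow behind msg.submit(chat, [msg, state], [chatbot, state]) can be imitated without Gradio. The sketch below is a simplification under stated assumptions: simulate_submit and fake_predict are hypothetical helpers, and a real Gradio event listener does considerably more than a direct function call.

```python
def fake_predict(user_input):
    # Hypothetical stand-in for conversation.predict(input=...).
    return f"You said: {user_input}"


def chat(user_input, history):
    response = fake_predict(user_input)
    history.append((user_input, response))
    return history, history


def simulate_submit(fn, input_values):
    """Call fn with the current input values and return one value per output."""
    return fn(*input_values)


state_value = []    # plays the role of gr.State([])
chatbot_value = []  # plays the role of what gr.Chatbot() displays

for message in ["Hi", "What's my name?"]:
    # Inputs [msg, state] go in; outputs [chatbot, state] come back.
    chatbot_value, state_value = simulate_submit(chat, [message, state_value])

print(chatbot_value)
```

Because chat() returns the same list twice, the displayed conversation and the stored state always match after each submit.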

In [11]:
# Gradio interface
with gr.Blocks() as demo:
    gr.Markdown("# 🧠 LangChain Chatbot with Memory")
    chatbot = gr.Chatbot()
    msg = gr.Textbox(label="Your Message")
    state = gr.State([])  # This will store the chat history
    msg.submit(chat, [msg, state], [chatbot, state])



In [12]:
demo.launch()

It looks like you are running Gradio on a hosted a Jupyter notebook. For the Gradio app to work, sharing must be enabled. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://4da84bef06095bd25d.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)




**Understanding the difference between Gradio State and LangChain’s ConversationBufferMemory is key to building memory-aware chatbots.**

**gr.State (Gradio)**
* Gradio’s utility for storing temporary values between interactions.
* Think of it as a per-session variable that survives UI updates.
* Not visible to the LLM. It is backend data that holds the conversation history shown in the front end.

**ConversationBufferMemory (LangChain)**
* Keeps the raw input-output message history in a format the LLM can understand.
* Automatically inserts previous turns into the prompt passed to the LLM

**In conclusion: Use Both Together**
* gr.State → Keeps chat history for the UI (gr.Chatbot() needs it).
* ConversationBufferMemory → Keeps message history for the LLM so it responds contextually.
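
A small simulation shows that the two stores really are independent. It assumes, as in this notebook, that the LLM-side memory is a module-level object while the UI history is created fresh per session. llm_memory and fake_predict are hypothetical stand-ins, so nothing here calls Gradio, LangChain, or OpenAI.

```python
llm_memory = []  # plays the role of ConversationBufferMemory (what the model sees)


def fake_predict(user_input):
    # Hypothetical stand-in for conversation.predict: it consults llm_memory only.
    earlier = [text for role, text in llm_memory if role == "Human"]
    reply = f"Earlier human messages I can see: {len(earlier)}"
    llm_memory.append(("Human", user_input))
    llm_memory.append(("AI", reply))
    return reply


def chat(user_input, history):
    response = fake_predict(user_input)
    history.append((user_input, response))
    return history, history


ui_history = []  # plays the role of gr.State([])
chat("Hi, I'm Sarah", ui_history)
chat("What's my name?", ui_history)

ui_history = []  # e.g. a new browser session starts with a fresh gr.State
_, ui_history = chat("Still there?", ui_history)

print(ui_history)
print(len(llm_memory))
```

After the reset the chat window shows a single exchange, yet the stand-in model still sees the two earlier human messages. Clearing one store does not clear the other.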