<a href="https://colab.research.google.com/github/jeffheaton/app_generative_ai/blob/main/t81_559_class_10_5_chat_multimodal.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# T81-559: Applications of Generative Artificial Intelligence
**Module 10: StreamLit**
* Instructor: [Jeff Heaton](https://sites.wustl.edu/jeffheaton/), McKelvey School of Engineering, [Washington University in St. Louis](https://engineering.wustl.edu/Programs/Pages/default.aspx)
* For more information visit the [class website](https://sites.wustl.edu/jeffheaton/t81-558/).

# Module 10 Material

Module 10: StreamLit

* Part 10.1: Running StreamLit in Google Colab [[Video]]() [[Notebook]](t81_559_class_10_1_streamlit.ipynb)
* Part 10.2: StreamLit Introduction [[Video]]() [[Notebook]](t81_559_class_10_2_streamlit_intro.ipynb)
* Part 10.3: Understanding Streamlit State [[Video]]() [[Notebook]](t81_559_class_10_3_streamlit_state.ipynb)
* Part 10.4: Creating a Chat Application [[Video]]() [[Notebook]](t81_559_class_10_4_chat.ipynb)
* **Part 10.5: MultiModal Chat Application** [[Video]]() [[Notebook]](t81_559_class_10_5_chat_multimodal.ipynb)


# Google CoLab Instructions

The following code detects whether the notebook is running in Google Colab. If it is, the code loads the OpenAI API key from Colab secrets and installs the required libraries.

In [None]:
import os

try:
    from google.colab import drive, userdata
    COLAB = True
    print("Note: using Google CoLab")
except ImportError:
    print("Note: not using Google CoLab")
    COLAB = False

# OpenAI Secrets
if COLAB:
    os.environ["OPENAI_API_KEY"] = userdata.get('OPENAI_API_KEY')

# Install needed libraries in CoLab
if COLAB:
    !pip install langchain langchain_openai openai streamlit

# Part 10.5: MultiModal Chat Application

In this part, we build a multimodal Streamlit-based LLM chat application and run it from Google Colab to keep things accessible. We also introduce llm_util.py, a utility script that makes it easier to work with LangChain-compatible large language models (LLMs). For this example, an OpenAI model powers the chat application. By the end, you will have a functional, interactive chat app and a solid understanding of how to integrate LLMs into your projects using Streamlit.

We will now create three files:

* **app.py** - The main Streamlit chat application.
* **llm_util.py** - The LLM utility that lets the chat application use any LangChain LLM.
* **llms.yaml** - A config file that defines which LLMs we will use; for this example it will be OpenAI.

This chat application allows images to be attached to the conversation.

## Chat Application

To enhance your chatbot with multimodal capabilities, you need to modify your Streamlit application to accept both text and image inputs from users. Start by replacing the simple text input with a form that includes a text field and an image uploader. This allows users to type messages and optionally attach images.

When a user submits the form, process the inputs by creating a message content structure that includes both the text and the image data. For the image, read the uploaded file, encode it in base64, and format it into a data URL. This prepares the image data to be sent to the language model in a compatible format.
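
The encoding step can be sketched in isolation. This is a minimal example, assuming the raw bytes and MIME type are already in hand; the helper name `to_data_url` and the sample bytes are hypothetical stand-ins for what the uploader provides.

```python
import base64


def to_data_url(image_bytes: bytes, mime_type: str) -> str:
    """Encode raw image bytes as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"


# Stand-in bytes (the PNG file signature); in the app these would come
# from uploaded_file.read(), and the MIME type from uploaded_file.type.
sample_bytes = b"\x89PNG\r\n\x1a\n"
data_url = to_data_url(sample_bytes, "image/png")
print(data_url)
```

The MIME type in the URL prefix tells the receiver how to interpret the payload, so it should match the uploaded file rather than being hard-coded.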

Instead of using a basic conversation chain, utilize a language model interface like ChatOpenAI that can handle messages containing both text and images. Invoke the language model with the properly formatted message, which includes the user's text and the encoded image data.
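
A multimodal message is a list of typed parts rather than a single string. The sketch below builds that list with plain dictionaries; `build_content` is a hypothetical helper name, and the shortened data URL is a placeholder rather than a real image.

```python
def build_content(text, image_data_url=None):
    """Build a list of content parts holding optional text and an optional image."""
    parts = []
    if text:
        parts.append({"type": "text", "text": text})
    if image_data_url:
        parts.append({"type": "image_url", "image_url": {"url": image_data_url}})
    return parts


# Placeholder data URL; a real one would carry the full base64 payload.
content = build_content("What is in this image?", "data:image/png;base64,iVBORw0KGgo=")
for part in content:
    print(part["type"])

# In the app, a list like this is wrapped as HumanMessage(content=content)
# and passed to the chat model's invoke() method.
```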

Manage the conversation history using session state to maintain context across interactions. Update the chat interface to display both the user's inputs (text and images) and the assistant's responses. Ensure that the assistant's replies are rendered appropriately in the interface.
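
Stored history only gives the model context if it is sent along with each request. The following is a minimal sketch, assuming history entries are role/content dictionaries like those kept in `st.session_state.messages`; the helper name `history_to_messages` and the sample turns are hypothetical. LangChain chat models generally accept `(role, content)` tuples as input, though this is worth confirming against the installed version.

```python
def history_to_messages(history):
    """Convert stored {"role", "content"} dicts into (role, content) tuples."""
    return [(turn["role"], turn["content"]) for turn in history]


# Hypothetical earlier turns, stored as plain text.
history = [
    {"role": "user", "content": "What is in this picture?"},
    {"role": "assistant", "content": "A red bicycle leaning on a wall."},
]

# Earlier turns first, then the new user message.
messages = history_to_messages(history) + [("user", "What color is the wall?")]
for role, content in messages:
    print(f"{role}: {content}")

# With a LangChain chat model, this list could be passed as chat.invoke(messages).
```

Sending the full history increases the token count of every request, so long conversations may need trimming.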

By implementing these changes, your chatbot will support multimodal interactions, allowing users to engage in richer conversations that combine textual and visual information. This setup leverages Streamlit for the user interface and LangChain along with OpenAI's language models for processing, enabling the chatbot to interpret and respond to images attached by users.

In [None]:
%%writefile app.py
import base64
import sys

import streamlit as st
from langchain_core.messages import HumanMessage

from llm_util import open_llm

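# Arguments placed after the script name in `streamlit run` arrive in sys.argv
if len(sys.argv) != 2:
    # Show the problem in the app itself, not only in the server log
    st.error("Please specify the LLM to use as the first argument, e.g. `streamlit run app.py server1`")
    st.stop()
profile = sys.argv[1]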

st.title("Chat with Image Support")

if "chat" not in st.session_state:
    client = open_llm(profile)
    st.session_state.chat = client  # Assume this returns a ChatOpenAI instance

if "messages" not in st.session_state:
    st.session_state.messages = []

# Display existing messages
for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.markdown(message["content"])

# Create a form for user input
with st.form("chat_form"):
    prompt = st.text_input("Enter your message:")
    uploaded_file = st.file_uploader("Upload an image (optional)", type=["jpg", "jpeg", "png"])
    submit_button = st.form_submit_button("Send")

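# Ignore submissions that contain neither text nor an image
if submit_button and (prompt or uploaded_file is not None):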
    # Build the user message content
    message_content = []
    if prompt:
        message_content.append({"type": "text", "text": prompt})
    if uploaded_file is not None:
        # Read the image data and encode it in base64
        image_bytes = uploaded_file.read()
        image_type = uploaded_file.type  # e.g., 'image/jpeg'
        image_data = base64.b64encode(image_bytes).decode("utf-8")
        # Include the image data in the message content
        message_content.append(
            {"type": "image_url", "image_url": {"url": f"data:{image_type};base64,{image_data}"}}
        )
    # Create the HumanMessage
    message = HumanMessage(content=message_content)
    # Append the user's text to the history (the image is shown for this turn but not stored)
    st.session_state.messages.append({"role": "user", "content": prompt})
    with st.chat_message("user"):
        if prompt:
            st.markdown(prompt)
        if uploaded_file is not None:
            st.image(uploaded_file)
    # Get response from the LLM
    response = st.session_state.chat.invoke([message])
    # Append assistant's response to messages
    st.session_state.messages.append({"role": "assistant", "content": response.content})
    with st.chat_message("assistant"):
        st.markdown(response.content)



## LLM Utility


The following code lets the chat application use any language model supported by LangChain, as selected by a configuration file. The llm_util.py script dynamically loads and initializes a language model from the settings in a YAML file (llms.yaml). This makes it possible to change the language model without modifying the main application code, which simplifies experimentation and customization.

In [None]:
%%writefile llm_util.py
import yaml

# Load the YAML file
def load_yaml(file_path):
    with open(file_path, "r") as file:
        return yaml.safe_load(file)


# Function to dynamically import a class based on a string path (e.g., "langchain_community.chat_models.ChatOllama")
def get_class(class_path):
    module_path, class_name = class_path.rsplit(".", 1)
    module = __import__(module_path, fromlist=[class_name])
    return getattr(module, class_name)


# Open Language Model Server function
def open_llm(server_name):
    config = load_yaml("llms.yaml")
    for server in config["servers"]:
        if server["name"] == server_name:
            class_path = server["class"]
            clazz = get_class(class_path)
            # Remove 'class' and 'name' from the parameters as they're not needed for initialization
            params = {k: v for k, v in server.items() if k not in ["class", "name"]}

            return clazz(**params)
    raise ValueError(f"Server '{server_name}' not found")
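
The same lookup pattern can be tried without a YAML file or an API key. This is a minimal sketch rather than the utility itself: it uses `importlib.import_module` for the import, an in-memory dictionary that mirrors the llms.yaml layout, and the standard-library class `fractions.Fraction` as a stand-in for a LangChain model. The names `open_from_config` and `demo` are hypothetical.

```python
import importlib


def get_class(class_path):
    """Resolve a dotted path such as "package.module.ClassName" to a class."""
    module_path, class_name = class_path.rsplit(".", 1)
    module = importlib.import_module(module_path)
    return getattr(module, class_name)


def open_from_config(config, server_name):
    """Instantiate the class configured under the given server name."""
    for server in config["servers"]:
        if server["name"] == server_name:
            clazz = get_class(server["class"])
            # Every key other than 'class' and 'name' becomes a constructor argument
            params = {k: v for k, v in server.items() if k not in ("class", "name")}
            return clazz(**params)
    raise ValueError(f"Server '{server_name}' not found")


# Hypothetical in-memory config; Fraction stands in for an LLM class.
config = {
    "servers": [
        {"name": "demo", "class": "fractions.Fraction", "numerator": 3, "denominator": 4}
    ]
}

obj = open_from_config(config, "demo")
print(type(obj).__name__, obj)
```

Because the remaining keys are passed straight to the constructor, each one must match a parameter name that the configured class accepts.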

In [None]:
%%writefile llms.yaml

servers:
  - name: server1
    class: langchain_openai.ChatOpenAI
    model: gpt-4o-mini
    temperature: 0



Next, we obtain the localtunnel password that is required to reach the Streamlit server we are about to launch.

In [None]:
!curl https://loca.lt/mytunnelpassword

We launch the StreamLit server and obtain its URL. You will need the above password when you access the URL it gives you.

In [None]:
!streamlit run app.py server1 &>/content/logs.txt &
!npx --yes localtunnel --port 8501