## RAG and Conversational Agents Integrated with MLLMs
This demo application shows how to combine RAG pipelines and a conversational agent with MLLMs, using GPT-4o and GPT-4o-mini. To use it, you first need to create your own index. Then open the link printed by the last code cell and explore the UI, asking questions with the different setups.

In [1]:
# Clone the main repository to retrieve the structured folders
!git clone https://github.com/gpapageorgiouedu/A-Multimodal-Framework-Embedding-Retrieval-Augmented-Generation-with-MLLMs-for-Eurobarometer-Data.git
!mv /content/A-Multimodal-Framework-Embedding-Retrieval-Augmented-Generation-with-MLLMs-for-Eurobarometer-Data/* /content/

Cloning into 'A-Multimodal-Framework-Embedding-Retrieval-Augmented-Generation-with-MLLMs-for-Eurobarometer-Data'...
remote: Enumerating objects: 11, done.
remote: Counting objects: 100% (11/11), done.
remote: Compressing objects: 100% (9/9), done.
remote: Total 11 (delta 1), reused 0 (delta 0), pack-reused 0 (from 0)
Receiving objects: 100% (11/11), 36.21 KiB | 579.00 KiB/s, done.
Resolving deltas: 100% (1/1), done.


#### Install dependencies

In [2]:
# Update package list
!apt-get update -y -q

# Install required libs
!apt-get install -y -q libwoff1 libharfbuzz-icu0 libenchant-2-2 libsecret-1-0 libhyphen0 gstreamer1.0-gl gstreamer1.0-plugins-bad libmanette-0.2-0

# Install additional Python libs for LLM applications
!pip install transformers sentence-transformers faiss-gpu pillow PyPDF2 pymupdf -q
!pip install farm-haystack[all] -q
!pip install fpdf -q
!pip install openai -q

# Install FastAPI, Uvicorn, and Python Multipart
!pip install -q fastapi==0.112.0 uvicorn==0.30.5 python-multipart==0.0.9

Hit:1 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64  InRelease
Get:2 http://security.ubuntu.com/ubuntu jammy-security InRelease [129 kB]
Get:3 https://r2u.stat.illinois.edu/ubuntu jammy InRelease [6,555 B]
Get:4 https://cloud.r-project.org/bin/linux/ubuntu jammy-cran40/ InRelease [3,626 B]
Hit:5 http://archive.ubuntu.com/ubuntu jammy InRelease
Get:6 https://r2u.stat.illinois.edu/ubuntu jammy/main amd64 Packages [2,632 kB]
Get:7 http://archive.ubuntu.com/ubuntu jammy-updates InRelease [128 kB]
Get:8 http://security.ubuntu.com/ubuntu jammy-security/main amd64 Packages [2,505 kB]
Get:9 https://ppa.launchpadcontent.net/deadsnakes/ppa/ubuntu jammy InRelease [18.1 kB]
Hit:10 https://ppa.launchpadcontent.net/graphics-drivers/ppa/ubuntu jammy InRelease
Get:11 https://r2u.stat.illinois.edu/ubuntu jammy/main all Packages [8,551 kB]
Get:12 http://security.ubuntu.com/ubuntu jammy-security/restricted amd64 Packages [3,436 kB]
Hit:13 https://ppa.launchpadcontent.net/ubun

In [3]:
# Install required packages
!apt-get install -y poppler-utils

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following NEW packages will be installed:
  poppler-utils
0 upgraded, 1 newly installed, 0 to remove and 51 not upgraded.
Need to get 186 kB of archives.
After this operation, 696 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu jammy-updates/main amd64 poppler-utils amd64 22.02.0-2ubuntu0.5 [186 kB]
Fetched 186 kB in 0s (402 kB/s)
Selecting previously unselected package poppler-utils.
(Reading database ... 126427 files and directories currently installed.)
Preparing to unpack .../poppler-utils_22.02.0-2ubuntu0.5_amd64.deb ...
Unpacking poppler-utils (22.02.0-2ubuntu0.5) ...
Setting up poppler-utils (22.02.0-2ubuntu0.5) ...
Processing triggers for man-db (2.10.2-1) ...


#### Imports

In [4]:
# Import required libs
import base64
import os

import torch
import numpy as np
from PIL import Image
from transformers import CLIPProcessor, CLIPModel
from haystack.document_stores.faiss import FAISSDocumentStore
from haystack.nodes import PromptNode, PromptTemplate
from haystack.pipelines import Pipeline
from haystack import Document
import openai
from haystack.agents import Tool
from haystack.agents.conversational import ConversationalAgent
from haystack.agents.memory import ConversationSummaryMemory
from haystack.nodes import (PreProcessor, EmbeddingRetriever, AnswerParser)
from haystack.utils import convert_files_to_docs

from fastapi import FastAPI, Request, HTTPException
from fastapi.responses import HTMLResponse, RedirectResponse
from pydantic import BaseModel
from typing import List, Dict, Any, Tuple
from threading import Thread
import uvicorn
from google.colab.output import eval_js
import re
import requests
from fastapi.staticfiles import StaticFiles
from enum import Enum
import mimetypes

from google.colab import userdata
openai_key = userdata.get('OPENAI')
openai.api_key = openai_key
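
The `userdata.get('OPENAI')` call above only works inside Google Colab. When running elsewhere, one possible fallback is to read the key from an environment variable. The sketch below is an illustration only: the helper name `resolve_openai_key` and the variable name `OPENAI_API_KEY` are assumptions and do not appear in the original notebook.

```python
import os
from typing import Mapping


def resolve_openai_key(env: Mapping[str, str] = os.environ,
                       name: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored under `name` in `env` (hypothetical helper).

    Raises a RuntimeError with a readable message when the key is missing
    or empty, instead of failing later inside an API call.
    """
    key = env.get(name)
    if not key:
        raise RuntimeError(f"Set the {name} environment variable before running the notebook.")
    return key


# Example with an explicit mapping, so no real secret is needed
print(resolve_openai_key({"OPENAI_API_KEY": "sk-example"}))
```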

### Helper Functions

In [5]:
class PipelineType(str, Enum):
    """
    An enumeration representing the types of pipelines that can be used in the system.

    Attributes:
        conversational (str): Represents the conversational agent pipeline, using GPT-4o as the agent model.
        rag (str): Represents the retrieval-augmented generation (RAG) pipeline for text data, using GPT-4o-mini as the generator.
    """
    conversational = "conversational_agent_gpt_4"
    rag = "rag_pipeline_gpt_4_conversational"

class ChatRequest(BaseModel):
    """
    A model representing a chat request.

    Attributes:
        question (str): The user's question to be processed by the system.
        pipeline (PipelineType): The type of pipeline to use when answering the question, either 'conversational' or 'rag'.
    """
    question: str
    pipeline: PipelineType

class ChatImageRequest(BaseModel):
    """
    A model representing a chat request specifically related to images.

    Attributes:
        question (str): The user's question related to the provided images.
    """
    question: str

class ImageAnalysisRequest(BaseModel):
    """
    A model representing a request to analyze images.

    Attributes:
        selected_images (List[Dict[str, str]]): A list of dictionaries, each containing details about the selected images (e.g., title and file path).
        query (str): The query or request to analyze the images, provided by the user.
    """
    selected_images: List[Dict[str, str]]
    query: str

def analyze_images(selected_images: List[Dict[str, str]], query: str) -> Dict[str, str]:
    """
    Encodes images to base64, sends them to GPT-4o for analysis, and retrieves answers.

    Parameters:
        selected_images (List[Dict[str, str]]): A list of dictionaries, where each dictionary contains 'title' and 'path' of an image to be analyzed.
        query (str): The analysis query to send to GPT-4o.

    Returns:
        Dict[str, str]: A dictionary where the keys are image titles and the values are the corresponding analysis answers from GPT-4o.
    """
    answers = {}

    for image_info in selected_images:
        image_path = image_info['path']
        image_title = image_info['title']

        full_query = f"The image titled '{image_title}' refers to: {query}"
        mime_type, _ = mimetypes.guess_type(image_path)

        if mime_type not in ["image/jpeg", "image/png"]:
            raise ValueError("Unsupported image format. Please use JPEG or PNG.")

        with open(image_path, "rb") as image_file:
            image_bytes = base64.b64encode(image_file.read())

        response =  openai.chat.completions.create(
            model="gpt-4o",
            messages=[
                {
                    "role": "user",
                    "content": [
                        {"type": "text", "text": full_query},
                        {
                            "type": "image_url",
                            "image_url": {"url": f"data:{mime_type};base64,{image_bytes.decode('utf-8')}"}
                        }
                    ],
                }
            ]
        )

        answer = response.choices[0].message.content
        answers[image_title] = answer

    return answers

def write_to_debug_file(message: str) -> None:
    """
    Writes a message to the debug output file for logging purposes.

    Parameters:
        message (str): The message to write to the debug file.

    Returns:
        None
    """
    with open("/content/debug_output.txt", "a") as f:
        f.write(message + "\n")

def index_pdf(pdf_metadata: Dict[str, Any], pdf_path: str) -> None:
    """
    Indexes a PDF from the repository.

    Note: every supported file in the directory that contains pdf_path is converted and indexed, not only pdf_path itself.

    Parameters:
        pdf_metadata (dict): Metadata related to the PDF, such as title.
        pdf_path (str): The path to the downloaded PDF file.

    Returns:
        None

    Raises:
        Exception: If an error occurs during the processing or indexing of the PDF.
    """
    try:
        documents = convert_files_to_docs(dir_path=os.path.dirname(pdf_path), split_paragraphs=True)
        if not documents:
            write_to_debug_file(f"No documents found in PDF at {pdf_path}")
            return

        for doc in documents:
            write_to_debug_file(f"Indexing document: {doc}")
            indexing_pipeline_text.run(documents=[doc])
        write_to_debug_file(f"Successfully processed and indexed PDF: {pdf_metadata.get('title', 'Unknown Title')}")
    except Exception as e:
        write_to_debug_file(f"Error during processing or indexing PDF '{pdf_metadata.get('title', 'Unknown Title')}': {repr(e)}")
        raise


def extract_first_answer_image_title(haystack_response: Dict[str, Any]) -> List[str]:
    """
    Extracts the titles of images from documents related to the first answer in the Haystack response.

    Parameters:
        haystack_response (Dict[str, Any]): The response dictionary from Haystack, which includes answers and documents.

    Returns:
        List[str]: A list of image titles extracted from the relevant documents of the first answer.
    """
    image_titles = []
    if 'answers' in haystack_response and haystack_response['answers']:
        first_answer = haystack_response['answers'][0]
        document_ids = first_answer.document_ids if hasattr(first_answer, 'document_ids') else []

        if 'invocation_context' in haystack_response and 'documents' in haystack_response['invocation_context']:
            for document in haystack_response['invocation_context']['documents']:
                if hasattr(document, 'id') and document.id in document_ids:
                    if hasattr(document, 'meta') and 'title' in document.meta:
                        image_titles.append(document.meta['title'])

    return image_titles

def integrate_image_paths(image_titles: List[str], session_images: Dict[str, str]) -> List[Dict[str, str]]:
    """
    Integrates image paths from session_images into image_titles if a match is found.

    Parameters:
        image_titles (List[str]): A list of image titles.
        session_images (Dict[str, str]): A dictionary where keys are image titles and values are the image file paths.

    Returns:
        List[Dict[str, str]]: A list of dictionaries with the image title and the corresponding image path if found.
    """
    return [{"title": title, "path": f"/static/{os.path.basename(session_images[title])}"}
            for title in image_titles if title in session_images]

def process_text_from_json(json_data: Dict[str, Any]) -> None:
    """
    Processes text and metadata from a JSON object, creating a Document and sending it for indexing.

    Parameters:
        json_data (dict): The JSON data containing 'plain_text' and 'metadata' for creating a Document.

    Returns:
        None
    """
    plain_text = json_data.get('plain_text', '')
    metadata_list = json_data.get('metadata', [])
    metadata = {item['term']: item['definition'] for item in metadata_list}

    document = Document(content=plain_text, meta=metadata)
    documents = [document]
    indexing_pipeline_text.run(documents=documents)

def process_image(image_path: str, alt_text: str, metadata: Dict[str, Any]) -> None:
    """
    Processes an image by generating an embedding using CLIP, and creates a Document to be stored.

    Parameters:
        image_path (str): The path to the image file.
        alt_text (str): The alternative text for the image.
        metadata (dict): Metadata for the image to be stored alongside the content.

    Returns:
        None
    """
    image = Image.open(image_path)
    clip_processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
    clip_model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").to(device)

    inputs = clip_processor(images=image, return_tensors="pt").to(device)
    image_features = clip_model.get_image_features(**inputs)
    image_embedding = image_features.detach().cpu().numpy().flatten()

    if len(image_embedding) != 1536:
        image_embedding = np.pad(image_embedding, (0, 1536 - len(image_embedding)), 'constant')

    document = Document(content=alt_text, meta=metadata, embedding=image_embedding)
    document_store_images.write_documents([document])
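
`process_image` pads the CLIP image embedding with zeros so that it fits the 1536-dimensional FAISS index. The following is a minimal, dependency-free sketch of that padding step, assuming the 512-dimensional output of `openai/clip-vit-base-patch32`. The helper name `pad_embedding` is hypothetical, and plain lists stand in for the NumPy arrays used in the notebook.

```python
from typing import List, Sequence

TARGET_DIM = 1536  # embedding_dim configured for the FAISS stores in this notebook


def pad_embedding(vector: Sequence[float], target_dim: int = TARGET_DIM) -> List[float]:
    """Right-pad `vector` with zeros up to `target_dim` (hypothetical helper)."""
    if len(vector) > target_dim:
        # Padding cannot shrink a vector, so fail loudly instead of truncating
        raise ValueError(f"Embedding has {len(vector)} dimensions, more than {target_dim}.")
    return list(vector) + [0.0] * (target_dim - len(vector))


clip_like = [0.5] * 512          # stand-in for a 512-dimensional CLIP image embedding
padded = pad_embedding(clip_like)
print(len(padded), padded[511], padded[512])  # 1536 0.5 0.0
```

Zero-padding leaves dot products and norms between padded vectors unchanged, so it only makes the vector fit the index shape. It does not by itself make CLIP vectors comparable to `text-embedding-ada-002` embeddings.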

### LLMs and MLLMs Orchestration

In [6]:
# Set device for torch
device = "cuda" if torch.cuda.is_available() else "cpu"

# Document Store Setup (FAISS) for text data
document_store_path = "faiss_index_file"
if os.path.exists(document_store_path):
    document_store_text = FAISSDocumentStore.load(document_store_path)
else:
    document_store_text = FAISSDocumentStore(
        faiss_index_factory_str="Flat",
        embedding_dim=1536,
        sql_url="sqlite:///faiss_document_store_text.db"
    )
    document_store_text.save(document_store_path)

# Document Store Setup (FAISS) for images data
document_store_path = "faiss_index_file_images"
if os.path.exists(document_store_path):
    document_store_images = FAISSDocumentStore.load(document_store_path)
else:
    document_store_images = FAISSDocumentStore(
        faiss_index_factory_str="Flat",
        embedding_dim=1536,
        sql_url="sqlite:///faiss_document_store_images.db"  # Specify unique database file
    )
    document_store_images.save(document_store_path)


# Pre-Processor and Retriever Setup
preprocessor = PreProcessor(
     clean_empty_lines=True,
     clean_whitespace=False,
     clean_header_footer=True,
     split_by="word",
     split_length=200,
     split_overlap=30,
     split_respect_sentence_boundary=True,
)

# Retriever for solely text content
retriever_text = EmbeddingRetriever(
    document_store=document_store_text,
    embedding_model="text-embedding-ada-002",  # Use OpenAI's ada-002 model
    api_key=openai.api_key,
    batch_size=8,
    max_seq_len=1536
)

# Retriever for image metadata content
retriever_images = EmbeddingRetriever(
    document_store=document_store_images,
    embedding_model="text-embedding-ada-002",  # Use OpenAI's ada-002 model
    api_key=openai.api_key,
    batch_size=8,
    max_seq_len=1536
)

# Update the embeddings if needed
# document_store_text.update_embeddings(retriever_text)
# document_store_images.update_embeddings(retriever_images)

# Set Up Indexing Pipeline
indexing_pipeline_text = Pipeline()
indexing_pipeline_text.add_node(component=preprocessor, name="PreProcessor", inputs=["File"])
indexing_pipeline_text.add_node(component=retriever_text, name="Retriever", inputs=["PreProcessor"])
indexing_pipeline_text.add_node(component=document_store_text, name="DocumentStore", inputs=["Retriever"])


# RAG Pipelines for Conversational and Image Queries

# Text data RAG prompt config
conversational_prompt = """
In the following conversation, a human user interacts with the AI virtual assistant that has access to the survey data of Eurobarometer.
Eurobarometer is a collection of cross-country public opinion surveys conducted regularly on behalf of the EU Institutions since 1974.
Based on the survey data, generate a response that answers the question.
If the information is not sufficient, say that the answer is not possible from the documents alone.
You should ignore your knowledge when answering the questions, and be based solely on the below documents.
Provide a clear and concise answer, no longer than 200 words.

Always include a disclaimer in your answer regarding AI generated content.
Disclaimer: This is AI generated content — please use it with caution.

\n\n Context: {join(documents)} \n\n Question: {query} \n\n Answer:
"""
template_conversational = PromptTemplate(prompt=conversational_prompt, output_parser=AnswerParser())
prompt_node_gpt_4_conversational = PromptNode(
    model_name_or_path="gpt-4o-mini",
    default_prompt_template=template_conversational,
    api_key=openai.api_key,
    max_length=4096,
    model_kwargs={
        "temperature": 0.1,
        "top_p": 0.9
    }
)
# Image data RAG prompt config
image_prompt = """
In the following conversation, a human user interacts with the AI virtual assistant that has access to the survey images metadata of Eurobarometer.
Eurobarometer is a collection of cross-country public opinion surveys conducted regularly on behalf of the EU Institutions since 1974.
Based on the image metadata, generate a response that answers the question.
If the image information is not sufficient, say that the answer is not possible from the image metadata alone.
You should ignore your knowledge when answering the questions, and be based solely on the below documents.
Provide a clear and concise answer, no longer than 200 words.

Always include a disclaimer in your answer regarding AI generated content.
Disclaimer: This is AI generated content — please use it with caution.

\n\n Context: {join(documents)} \n\n Question: {query} \n\n Answer:
"""
template_image = PromptTemplate(prompt=image_prompt, output_parser=AnswerParser())
prompt_node_gpt_4_image = PromptNode(
    model_name_or_path="gpt-4o-mini",
    default_prompt_template=template_image,
    api_key=openai.api_key,
    max_length=4096,
    model_kwargs={
        "temperature": 0.1,
        "top_p": 0.9
    }
)
# Setting up pipelines
rag_pipeline_gpt_4_conversational = Pipeline()
rag_pipeline_gpt_4_conversational.add_node(component=retriever_text, name="Retriever", inputs=["Query"])
rag_pipeline_gpt_4_conversational.add_node(component=prompt_node_gpt_4_conversational, name="PromptNode", inputs=["Retriever"])

rag_pipeline_gpt_4_image = Pipeline()
rag_pipeline_gpt_4_image.add_node(component=retriever_images, name="Retriever", inputs=["Query"])
rag_pipeline_gpt_4_image.add_node(component=prompt_node_gpt_4_image, name="PromptNode", inputs=["Retriever"])


# Agent prompt config
agent_prompt_node_gpt_4 = PromptNode(
    model_name_or_path="gpt-4o", #gpt-4o-mini
    api_key=openai_key,
    max_length=256,
    stop_words=["Observation:"],
    model_kwargs={
        "temperature": 0.1,
        "top_p": 0.9
    },
)

# Memory setup
memory_prompt_node = PromptNode(
    "philschmid/flan-t5-base-samsum",  # or bart-large-cnn-samsum, or any model that performs well at summarization
    max_length=256,
    model_kwargs={"task_name": "text2text-generation"}
)
memory = ConversationSummaryMemory(memory_prompt_node, prompt_template="{chat_transcript}")

# Conversational prompt
agent_prompt = """
In the following conversation, a human user interacts with the AI Agent that has access to the documentation of Eurobarometer.
Eurobarometer is a collection of cross-country public opinion surveys conducted regularly on behalf of the EU Institutions since 1974.
The human poses questions and AI Agent should try to find an answer to every question.
The final answer to the question should be truthfully based solely on the output of the tool.
The AI Agent should ignore its knowledge when answering the questions.

You can use each tool only one time!

The AI Agent has access to this tool:
{tool_names_with_descriptions}

The following is the previous conversation between a human and The AI Agent:
{memory}

AI Agent responses must start with one of the following:

Thought: [the AI Agent's reasoning process]
Tool: [tool names] (on a new line) Tool Input: [input as a question for the selected tool WITHOUT quotation marks and on a new line] (These must always be provided together and on separate lines.)
Observation: [tool's result]
Final Answer: (on a new line) [final answer to the human user's question]
When selecting a tool, the AI Agent must provide both the "Tool:" and "Tool Input:" pair in the same response, but on separate lines.

The AI Agent should not ask the human user for additional information, clarification, or context.
If the AI Agent cannot find a specific answer after exhausting available tools and approaches, it answers with Final Answer: inconclusive

Always include a disclaimer in your answer regarding AI generated content.
Disclaimer: This is AI generated content — please use it with caution.

Question: {query}
Thought:
{transcript}
"""

def create_tool(pipeline, name, description):
    return Tool(
        name=name,
        pipeline_or_node=pipeline,
        description=description,
        output_variable="answers",
    )

# Tools for agents config
tools = [
    create_tool(
        rag_pipeline_gpt_4_conversational, "Eurobarometer", "useful for when you need to answer questions about the Eurobarometer surveys of the European Commission and European Parliament."
    ),
    create_tool(
        rag_pipeline_gpt_4_image,
        "EurobarometerImage",
        "useful for when you need to answer questions about the Eurobarometer surveys based on Images' metadata of the European Commission and European Parliament.",
    ),
]

# Conversational Agent final config, exposing both RAG pipelines as tools
conversational_agent_gpt_4 = ConversationalAgent(
    prompt_node=agent_prompt_node_gpt_4,
    tools=tools,
    memory=memory,
    prompt_template=agent_prompt,
)

# Function for image analysis
def analyze_single_image(image_path: str, image_title: str, query: str) -> str:
    """
    Analyzes a single image using GPT-4o-mini and a given query.

    Parameters:
        image_path (str): The path to the image file.
        image_title (str): The title of the image.
        query (str): The query for analyzing the image.

    Returns:
        str: The analysis result for the image.
    """
    try:
        prompt = f"""
        In the following conversation, a human user interacts with the AI virtual assistant that has access to the survey images of Eurobarometer.
        Eurobarometer is a collection of cross-country public opinion surveys conducted regularly on behalf of the EU Institutions since 1974.
        Based on the image titled '{image_title}', generate a response that answers the question.
        If the image information is not sufficient, say that the answer is not possible from the image alone.
        You should ignore your knowledge when answering the questions, and be based solely on the attached image.

        Always include a disclaimer in your answer regarding AI generated content.
        Disclaimer: This is AI generated content — please use it with caution.

        Question: {query}

        """

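        # Guess the MIME type from the file name so PNG images are not labelled as JPEG;
        # fall back to JPEG when the type cannot be determined
        mime_type = mimetypes.guess_type(image_path)[0] or "image/jpeg"

        with open(image_path, "rb") as image_file:
            image_bytes = base64.b64encode(image_file.read()).decode("utf-8")

        response = openai.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "user", "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": f"data:{mime_type};base64,{image_bytes}"}}
                ]}
            ]
        )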

        return response.choices[0].message.content
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Error analyzing image: {e}")

[nltk_data] Downloading package punkt_tab to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt_tab.zip.
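
The `PreProcessor` above is configured with `split_by="word"`, `split_length=200`, and `split_overlap=30`. As a rough illustration of what those numbers mean, the sketch below splits text into overlapping word windows. It is a simplified stand-in, not Haystack's implementation: it ignores `split_respect_sentence_boundary` and the cleaning options, and the function name `split_words` is hypothetical.

```python
from typing import List


def split_words(text: str, split_length: int = 200, split_overlap: int = 30) -> List[str]:
    """Split `text` into windows of `split_length` words sharing `split_overlap` words."""
    if split_overlap >= split_length:
        raise ValueError("split_overlap must be smaller than split_length")

    words = text.split()
    step = split_length - split_overlap  # how far each window advances
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + split_length]))
        if start + split_length >= len(words):
            break  # the current window already reaches the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(400))
chunks = split_words(sample)
print(len(chunks))             # 3
print(chunks[1].split()[0])    # w170
```

With these settings each chunk starts 170 words after the previous one, so consecutive chunks share 30 words of context.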



### FastAPI and UI

In [7]:
# Initialize FastAPI application
app = FastAPI()

# Mount the /content/static directory as /static to serve CSS
app.mount("/static", StaticFiles(directory="/content/static"), name="static")

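# Track indexed deliverableId values to avoid re-indexing
indexed_files = set()

# Map of image title -> image file path, used by the /chat_image endpoint.
# Reuse it if an indexing step already defined it; otherwise start empty to avoid a NameError.
session_images: Dict[str, str] = globals().get("session_images", {})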

PDF_DOWNLOAD_DIR = "/content/pdf_downloads"
os.makedirs(PDF_DOWNLOAD_DIR, exist_ok=True)

@app.get("/", response_class=HTMLResponse)
async def root() -> HTMLResponse:
    """
    Serves the 'chat.html' file located in the /content/templates directory at the root URL.

    Returns:
        HTMLResponse: The HTML content of the 'chat.html' file if found,
                      otherwise a 404 status code with a "File not found" message.
    """
    full_path: str = "/content/templates/chat.html"
    print(f"Serving {full_path} for root URL")
    try:
        with open(full_path, "r") as file:
            html_content = file.read()
        return HTMLResponse(html_content)
    except FileNotFoundError:
        return HTMLResponse("File not found", status_code=404)

@app.post("/chat")
async def chat_with_pipeline(data: ChatRequest) -> Dict[str, str]:
    """
    Handles chat requests using either a conversational agent or a retrieval-augmented generation (RAG) pipeline.

    Parameters:
        data (ChatRequest): A pydantic model containing the user's question and the selected pipeline type.

    Returns:
        Dict[str, str]: A dictionary containing the response generated by the chosen pipeline.

    Raises:
        HTTPException: If an error occurs during the processing of the chat request, returns a 500 status code with an error message.
    """
    question = data.question
    pipeline = data.pipeline

    write_to_debug_file(f"Received chat question: '{question}' with pipeline '{pipeline}'")

    try:
        if pipeline == PipelineType.conversational:
            response = conversational_agent_gpt_4.run(question)
            response_text = response.get("transcript", "No transcript available.")
        elif pipeline == PipelineType.rag:
            response = rag_pipeline_gpt_4_conversational.run(question)
            first_answer = response['answers'][0]
            response_text = first_answer.answer if hasattr(first_answer, 'answer') else first_answer['answer']

        return {"response": response_text}
    except Exception as e:
        write_to_debug_file(f"Error in chat_with_pipeline: {repr(e)}")
        raise HTTPException(status_code=500, detail=f"Chat error: {repr(e)}")

@app.post("/chat_image")
async def chat_image(data: ChatImageRequest) -> Dict[str, Any]:
    """
    Handles image-related chat requests using a retrieval-augmented generation (RAG) pipeline.

    Parameters:
        data (ChatImageRequest): A pydantic model containing the user's question related to images.

    Returns:
        Dict[str, Any]: A dictionary containing the chat response and a list of image paths integrated based on the analysis.

    Raises:
        HTTPException: If an error occurs during processing of the chat request, returns a 500 status code with an error message.
    """
    question = data.question
    try:
        response = rag_pipeline_gpt_4_image.run(question)
        first_answer = response['answers'][0]
        response_text = first_answer.answer if hasattr(first_answer, 'answer') else first_answer['answer']

        image_titles = extract_first_answer_image_title(response)
        integrated_images = integrate_image_paths(image_titles, session_images)

        return {
            "response": response_text,
            "image_paths": integrated_images
        }
    except Exception as e:
        print(f"Error in chat_image: {repr(e)}")
        raise HTTPException(status_code=500, detail="Image chat error.")

@app.post("/analyze_images")
async def analyze_images_endpoint(data: ImageAnalysisRequest) -> Dict[str, Any]:
    """
    Analyzes selected images using GPT-4o-mini and returns answers for each image based on a provided query.

    Parameters:
        data (ImageAnalysisRequest): A pydantic model containing the selected images and the query to analyze.

    Returns:
        Dict[str, Any]: A dictionary containing the analysis results for each image title.

    Raises:
        HTTPException: If an error occurs during the image analysis process, returns a 500 status code with an error message.
    """
    selected_images = data.selected_images
    query = data.query
    answers = {}

    for image_info in selected_images:
        image_path = image_info['path'].replace("/static/", "/content/")
        image_title = image_info['title']

        answers[image_title] = analyze_single_image(image_path, image_title, query)

    return {"results": answers}

@app.get("/{file_path:path}", response_class=HTMLResponse)
async def get_html(file_path: str) -> HTMLResponse:
    """
    Serves an HTML file dynamically from the /content/templates directory.

    Parameters:
        file_path (str): The path to the HTML file, relative to the /content/templates directory.

    Returns:
        HTMLResponse: The HTML content of the requested file if found, otherwise a 404 response with a "File not found" message.
    """
    full_path = f"/content/templates/{file_path}"
    try:
        with open(full_path, "r") as file:
            html_content = file.read()
        return HTMLResponse(html_content)
    except FileNotFoundError:
        return HTMLResponse("File not found", status_code=404)

def run_app() -> None:
    """
    Starts the FastAPI application using Uvicorn.

    Configures the host, port, and timeout settings for the application.

    Parameters:
        None

    Returns:
        None
    """
    uvicorn.run(app, host="0.0.0.0", port=8000, timeout_keep_alive=600)

# Start the app in a new thread
thread = Thread(target=run_app)
thread.start()

# Display the server URL in Google Colab
print(eval_js("google.colab.kernel.proxyPort(8000)"))

INFO:     Started server process [316]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)


https://w9eexbawnl-496ff2e9c6d22116-8000-colab.googleusercontent.com/
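
Besides the UI, the `/chat` endpoint can be called directly with a JSON body that matches `ChatRequest`. The sketch below builds such a request with the standard library and pulls the final answer out of an agent transcript. The two pipeline values come from `PipelineType`; the helper names, the placeholder base URL, and the assumption that the transcript contains a `Final Answer:` marker (as the agent prompt requests) are illustrative only.

```python
import json
import re
import urllib.request
from typing import Optional

# Values of PipelineType defined earlier in the notebook
PIPELINES = {"conversational_agent_gpt_4", "rag_pipeline_gpt_4_conversational"}


def build_chat_request(base_url: str, question: str, pipeline: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /chat endpoint."""
    if pipeline not in PIPELINES:
        raise ValueError(f"Unknown pipeline: {pipeline}")
    body = json.dumps({"question": question, "pipeline": pipeline}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_final_answer(transcript: str) -> Optional[str]:
    """Return the text after 'Final Answer:', or None if the marker is missing."""
    match = re.search(r"Final Answer:\s*(.+)", transcript, flags=re.DOTALL)
    return match.group(1).strip() if match else None


# Placeholder URL; replace it with the proxy URL printed by the cell above
request = build_chat_request(
    "https://example.invalid/", "What is Eurobarometer?", "rag_pipeline_gpt_4_conversational"
)
print(request.get_method(), request.full_url)

# Sending requires the running server, so it is left commented out:
# with urllib.request.urlopen(request) as resp:
#     print(json.load(resp)["response"])
```

For the conversational pipeline the endpoint returns the agent transcript, so `extract_final_answer` can be applied to the `response` field when only the final answer is needed.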
