<a href="https://colab.research.google.com/github/Loiruck/CORD-19-LLM-Agent/blob/main/CORD_19_Medical_assistant_agent.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Agent

# LLM Agent for the CORD-19 Dataset.

Navigating the research on COVID-19 and its links to smoking or high blood pressure can be overwhelming. This AI agent acts as a guide grounded in the CORD-19 research dataset: it retrieves relevant passages from the dataset's descriptions and uses them to answer your questions.

# SECTION A: SETUP AND DEPENDENCIES

In [1]:
!pip install llama-index llama-index-embeddings-huggingface llama-index-llms-huggingface bitsandbytes

Collecting llama-index
  Downloading llama_index-0.12.41-py3-none-any.whl.metadata (12 kB)
Collecting llama-index-embeddings-huggingface
  Downloading llama_index_embeddings_huggingface-0.5.4-py3-none-any.whl.metadata (458 bytes)
Collecting llama-index-llms-huggingface
  Downloading llama_index_llms_huggingface-0.5.0-py3-none-any.whl.metadata (2.8 kB)
Collecting bitsandbytes
  Downloading bitsandbytes-0.46.0-py3-none-manylinux_2_24_x86_64.whl.metadata (10 kB)

In [2]:
import os
import kagglehub
import pandas as pd

from tqdm import tqdm

from llama_index.core import Settings
from llama_index.core import Document
from llama_index.core import StorageContext
from llama_index.core import VectorStoreIndex
from llama_index.core import load_index_from_storage
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
# LLM wrapper and chat memory used by the agent
from llama_index.llms.huggingface import HuggingFaceLLM
from llama_index.core.memory import ChatMemoryBuffer

from matplotlib import pyplot as plt

# SECTION B: DATA ACQUISITION AND INITIAL LOADING

In [3]:
# Download latest version
path = kagglehub.dataset_download(handle="googleai/dataset-metadata-for-cord19")

print("Path to dataset files:", path)

Downloading from https://www.kaggle.com/api/v1/datasets/download/googleai/dataset-metadata-for-cord19?dataset_version_number=1...


100%|██████████| 5.89M/5.89M [00:00<00:00, 119MB/s]

Extracting files...





Path to dataset files: /root/.cache/kagglehub/datasets/googleai/dataset-metadata-for-cord19/versions/1


In [4]:
os.listdir(path)

['CORD19 datasets - Sheet 1.csv']

In [5]:
filename_with_path = os.path.join(path, os.listdir(path)[0])
filename_with_path

'/root/.cache/kagglehub/datasets/googleai/dataset-metadata-for-cord19/versions/1/CORD19 datasets - Sheet 1.csv'

In [6]:
df_meta_cord19 = pd.read_csv(filename_with_path)
df_meta_cord19.head()

Unnamed: 0,cord_uid,paper_url,paper_title,dataset_url,dataset_name,alternate_name,description,author_list,last_updated,license,source_organization,doi,compact_identifier,data_download
0,rmzpiyqj,https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6...,"Nipah virus: epidemiology, pathology, immunobi...",https://data.csiro.au/dap/landingpage?pid=csir...,Nature of exposure drives transmission of Nipa...,,"[""RT-PCR data of comparative viral loads/ tiss...",Bronwyn Clayton; Deborah Middleton; Rachel Ark...,2016,"[{""url"":""https://confluence.csiro.au/display/d...",CSIRO,10.4225/08/56806AAEAD713,,
1,h7g5ecc0,http://europepmc.org/articles/pmc4052367?pdf=r...,Novel approaches and challenges to treatment o...,https://datamed.org/display-item.php?repositor...,Key Role of T cell Defects in Age-Related Vuln...,,"[""In a mouse model of age-related vulnerabilit...",,2019-05-06,,,,,
2,3uvlmww0,https://jvi.asm.org/content/jvi/88/17/10228.fu...,"Verdinexor, a Novel Selective Inhibitor of Nuc...",https://datamed.org/display-item.php?repositor...,MicroRNA Regulation of Human Protease Genes,,the human protease genes required for influenz...,,2011-10-13,,,,,
3,xzps65et,https://doi.org/10.14745/ccdr.v45i04a01,Climate change and infectious diseases: What c...,https://search.datacite.org/works/10.5065/d6sj...,The NA-CORDEX dataset,,"[""The NA-CORDEX data archive contains output f...",Linda Mearns; Seth McGinnis; Daniel Korytina; ...,2017,"[{""url"":""http://na-cordex.org/terms-use""}]",UCAR/NCAR,10.5065/d6sj1jch,,
4,a6p8te8q,https://jvi.asm.org/content/jvi/79/6/3370.full...,Increased Epitope-Specific CD8(+) T Cells Prev...,http://www.immunedata.org/display-item.php?rep...,CMV CD8 T Cells,,"[""We present human T cell responses in multipl...",,2018-09-17,,,,,


In [7]:
df_meta_cord19.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 16070 entries, 0 to 16069
Data columns (total 14 columns):
 #   Column               Non-Null Count  Dtype 
---  ------               --------------  ----- 
 0   cord_uid             16070 non-null  object
 1   paper_url            16070 non-null  object
 2   paper_title          16070 non-null  object
 3   dataset_url          16070 non-null  object
 4   dataset_name         16070 non-null  object
 5   alternate_name       132 non-null    object
 6   description          14126 non-null  object
 7   author_list          6052 non-null   object
 8   last_updated         12548 non-null  object
 9   license              12788 non-null  object
 10  source_organization  12518 non-null  object
 11  doi                  5923 non-null   object
 12  compact_identifier   2080 non-null   object
 13  data_download        5500 non-null   object
dtypes: object(14)
memory usage: 1.7+ MB


# SECTION C: DATA CLEANING AND PREPARATION

In [8]:
# Keep only rows that have a description; this column supplies the document text.
print("Cleaning and preparing 'description' column...")
df_meta_cord19_cleaned = df_meta_cord19[df_meta_cord19['description'].notnull()].copy()
df_meta_cord19_cleaned['description'] = df_meta_cord19_cleaned['description'].astype(str)
# Cast 'paper_title' to str as well, since it is stored as document metadata later.
if 'paper_title' in df_meta_cord19_cleaned.columns:
    df_meta_cord19_cleaned['paper_title'] = df_meta_cord19_cleaned['paper_title'].astype(str)
print(f"Working with {len(df_meta_cord19_cleaned)} documents after initial cleaning.")

Cleaning and preparing 'description' column...
Working with 14126 documents after initial cleaning.


# SECTION D: CONTENT-BASED DOCUMENT FILTERING

In [9]:
# Keep only descriptions that mention smoking or high blood pressure.
print("Filtering documents for relevance to smoking or high blood pressure...")
keywords_smoking = ['smoking', 'cigarette', 'nicotine', 'tobacco', 'vaping']
# Note: 'bp' is matched as a plain substring, so it can also hit unrelated text
# (for example "bp" used for base pairs, or the letters inside a longer word).
keywords_hbp = ['high blood pressure', 'hypertension', 'bp']

# Case-insensitive search over the 'description' column of df_meta_cord19_cleaned.
contains_smoking = df_meta_cord19_cleaned['description'].str.contains('|'.join(keywords_smoking), case=False, na=False)
contains_hbp = df_meta_cord19_cleaned['description'].str.contains('|'.join(keywords_hbp), case=False, na=False)

df_relevant_docs = df_meta_cord19_cleaned[contains_smoking | contains_hbp].copy()
print(f"Found {len(df_relevant_docs)} documents relevant to smoking or high blood pressure.")

Filtering documents for relevance to smoking or high blood pressure...
Found 346 documents relevant to smoking or high blood pressure.
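
The keyword filter above joins the keywords into one regular expression and matches them as substrings. A minimal sketch of a stricter, whole-word variant follows; it is an illustration rather than the filter used in this notebook, and the sample sentences and the name `word_pattern` are made up for the example. Note that a whole-word match still accepts "300 bp" written with a space, so it reduces false positives without eliminating them.

```python
import re

keywords_hbp = ["high blood pressure", "hypertension", "bp"]

# Plain alternation, as in the cell above: "bp" matches anywhere, even inside words.
substring_pattern = re.compile("|".join(keywords_hbp), re.IGNORECASE)

# Hypothetical stricter variant: each keyword must appear as a whole word or phrase.
word_pattern = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in keywords_hbp) + r")\b",
    re.IGNORECASE,
)

# Made-up sample sentences, not rows from the dataset.
samples = [
    "Patients with hypertension were included.",
    "Mean BP was recorded at admission.",
    "A 300bp amplicon was sequenced.",
    "Subpopulation analysis of viral loads.",
]

for text in samples:
    print(bool(substring_pattern.search(text)), bool(word_pattern.search(text)), text)
```

Because `word_pattern` uses a non-capturing group, its pattern string could also be passed to `Series.str.contains(..., case=False, na=False)` in place of the joined keyword list. Doing so would change the number of matching rows reported above.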


# SECTION E: EMBEDDING MODEL INITIALIZATION

In [10]:
# Embedding model used to vectorize the documents; other Hugging Face embedding models can be substituted.
# HuggingFaceEmbedding and Settings were already imported in Section A.
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
Settings.embed_model = embed_model

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



# SECTION F: VECTOR DATABASE USING THE LLAMAINDEX FUNCTION LIBRARY

In [11]:
# Convert the topic-filtered DataFrame into LlamaIndex Documents.
# The 'description' column supplies the text; 'paper_title' is kept as metadata.
documents = []
if not df_relevant_docs.empty:
    print(f"Creating LlamaIndex Documents from {len(df_relevant_docs)} relevant rows...")
    for _, row in tqdm(df_relevant_docs.iterrows(), total=df_relevant_docs.shape[0], desc="Creating Documents"):
        text_content = str(row['description'])
        metadata = {}
        if 'paper_title' in row and pd.notnull(row['paper_title']):
            metadata["title"] = str(row['paper_title'])
        documents.append(Document(text=text_content, metadata=metadata))
else:
    print("No relevant documents found after filtering. The index will be empty.")

print(f"Created {len(documents)} LlamaIndex Documents.")

# Build the vector index; each document is embedded with Settings.embed_model.
print("Creating the VectorStoreIndex. This might take a while...")
index = VectorStoreIndex.from_documents(
    documents,
    show_progress=True
)
print("VectorStoreIndex created successfully.")

Creating LlamaIndex Documents from 346 relevant rows...

Creating Documents: 100%|██████████| 346/346 [00:00<00:00, 12212.75it/s]

Created 346 LlamaIndex Documents.
Creating the VectorStoreIndex. This might take a while...





Parsing nodes:   0%|          | 0/346 [00:00<?, ?it/s]

Generating embeddings:   0%|          | 0/348 [00:00<?, ?it/s]

VectorStoreIndex created successfully.
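
To make the idea of a vector index more concrete, here is a toy sketch of similarity search over embeddings. It is not how `VectorStoreIndex` is implemented; the three-dimensional vectors, the document names, and the helper `top_k` are all invented for illustration, and real embeddings from the model above are much longer.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    """Return the ids of the k documents most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical toy embeddings, chosen by hand for this example.
doc_vecs = {
    "smoking_study": [0.9, 0.1, 0.0],
    "hypertension_study": [0.1, 0.9, 0.0],
    "climate_dataset": [0.0, 0.1, 0.9],
}
query_vec = [0.8, 0.3, 0.0]

print(top_k(query_vec, doc_vecs, k=2))
```

The query vector points mostly in the same direction as the first document, so that document ranks first. A chat engine built on the index relies on the same principle to pick which descriptions to hand to the LLM.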


# SECTION G: LANGUAGE MODEL (LLM) INITIALIZATION

In [12]:
# Create the language model (LLM) that the agent will use.
llm = HuggingFaceLLM(
    model_name="colesmcintosh/Llama-3.2-1B-Instruct-Mango",       # Language model to load
    tokenizer_name="colesmcintosh/Llama-3.2-1B-Instruct-Mango",   # Tokenizer for the language model
    context_window=2048,                                          # Maximum token limit
    max_new_tokens=256,                                           # Maximum length of the response
    device_map="cuda:0",                                          # Run on the GPU
    generate_kwargs={"temperature": 0.95, "do_sample": True},     # These parameters influence the randomness and creativity of the model's responses.
)


In [13]:
Settings.llm = llm
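
The `temperature` value passed in `generate_kwargs` controls how sharply the model prefers its top-scoring tokens when `do_sample` is enabled. The sketch below illustrates the usual mechanism, dividing scores by the temperature before a softmax; the logits are made up, and the function is a simplified stand-in rather than the code the model library runs.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after scaling by 1 / temperature."""
    scaled = [value / temperature for value in logits]
    max_scaled = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(value - max_scaled) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


logits = [2.0, 1.0, 0.0]  # hypothetical scores for three candidate tokens

for temperature in (0.2, 0.95):
    probs = softmax_with_temperature(logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

At the lower temperature almost all probability goes to the top token, while at 0.95 the alternatives keep a noticeable share. For a medical assistant, a lower temperature may be worth trying if answers vary too much between runs.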

# SECTION H: CHAT ENGINE CONFIGURATION

In [14]:
# Create the chat engine that handles the agent's conversation.
chat_engine = index.as_chat_engine(
    # "context" mode makes the chat engine use the vector index created earlier to ground its answers.
    chat_mode="context",
    # ChatMemoryBuffer remembers earlier turns of the conversation, up to token_limit tokens.
    # Note: this limit is larger than the context_window=2048 set on the LLM, so a long history may not fit.
    memory=ChatMemoryBuffer.from_defaults(token_limit=32000),
    # The system prompt steers the agent's behavior: a medical chatbot that answers from the CORD-19 documents.
    system_prompt=(
        "You are a specialized medical information assistant. "
        "Your knowledge is strictly limited to information about the relationship between COVID-19 and smoking, and COVID-19 and high blood pressure, based on the provided CORD-19 research documents. "
    ),
)
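
The chat memory is bounded by a token limit, which means older turns are dropped once the history grows too long. The following toy sketch shows that idea only; it is not the `ChatMemoryBuffer` implementation, the word-count tokenizer is a crude assumption, and `trim_history` is a hypothetical helper.

```python
def count_tokens(text):
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


def trim_history(messages, token_limit):
    """Return the most recent messages whose combined token count fits the limit."""
    kept = []
    used = 0
    for message in reversed(messages):
        cost = count_tokens(message)
        if used + cost > token_limit:
            break
        kept.append(message)
        used += cost
    return list(reversed(kept))


# Made-up conversation turns with 5, 7, and 5 word-tokens respectively.
history = [
    "Does smoking raise COVID-19 risk?",
    "The retrieved descriptions mention several smoking studies.",
    "What about high blood pressure?",
]

print(trim_history(history, token_limit=12))
```

With a limit of 12 the oldest turn no longer fits and only the last two are kept. Choosing a limit below the model's context window leaves room for the system prompt and the retrieved context.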

# SECTION I: WEB APPLICATION (FLASK SETUP)

In [15]:
# Install the web dependencies (only needed once per session).
!pip install Flask flask-cors
!pip install flask-ngrok

import threading
import time

from flask import Flask, render_template_string, request, jsonify
from flask_cors import CORS
from flask_ngrok import run_with_ngrok

try:
    from google.colab.output import eval_js  # Used to look up the Colab proxy URL
except ImportError:
    eval_js = None  # Not running in Colab

print("All required modules have been imported/defined.")

app = Flask(__name__)
# Caution: run_with_ngrok replaces app.run with a wrapper that takes no keyword arguments,
# so the app.run(host=..., port=...) call in the thread below fails with the TypeError
# shown in this cell's output. The server that handles requests is started in Section J.
run_with_ngrok(app)
CORS(app)

if __name__ == '__main__':
    print("Preparing to start Flask app...")

    # Run Flask in a background thread so this cell can go on to print the Colab URL.
    flask_thread = threading.Thread(target=app.run, kwargs={'host': '0.0.0.0', 'port': 5000, 'debug': False, 'use_reloader': False})
    flask_thread.start()
    print("Flask app thread started. It might take a few seconds for the server to be ready.")

    if eval_js:  # Only available when running in Colab
        print("Attempting to get Colab proxy URL...")
        time.sleep(5)  # Give the server time to start before asking for the proxy URL
        try:
            colab_url = eval_js('google.colab.kernel.proxyPort(5000)')
            print(f"Your Colab app should be accessible at: {colab_url}")
            print("If the link doesn't work immediately, please wait a few more seconds for the server to fully initialize.")
        except Exception as e:
            print(f"Could not get Colab URL via eval_js: {e}")
            print("You might need to manually check Colab's output for a forwarded port if the Flask server messages appear.")
    else:
        print("Not running in Colab or google.colab.output not available.")
        print("If running locally, try accessing the app via http://127.0.0.1:5000 or http://0.0.0.0:5000")

    print("Main script setup complete. Flask server is running in a background thread.")
    print("You can now try accessing the URL provided above.")

Collecting flask-cors
  Downloading flask_cors-6.0.1-py3-none-any.whl.metadata (5.3 kB)
Downloading flask_cors-6.0.1-py3-none-any.whl (13 kB)
Installing collected packages: flask-cors
Successfully installed flask-cors-6.0.1
Collecting flask-ngrok
  Downloading flask_ngrok-0.0.25-py3-none-any.whl.metadata (1.8 kB)
Downloading flask_ngrok-0.0.25-py3-none-any.whl (3.1 kB)
Installing collected packages: flask-ngrok
Successfully installed flask-ngrok-0.0.25


Exception in thread Thread-9 (new_run):
Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
TypeError: run_with_ngrok.<locals>.new_run() got an unexpected keyword argument 'host'


All required modules have been imported/defined.
Preparing to start Flask app...
Flask app thread started. It might take a few seconds for the server to be ready.
Attempting to get Colab proxy URL...
Your Colab app should be accessible at: https://5000-gpu-t4-s-1jlq3evkyojll-a.us-west4-0.prod.colab.dev
If the link doesn't work immediately, please wait a few more seconds for the server to fully initialize.
Main script setup complete. Flask server is running in a background thread.
You can now try accessing the URL provided above.


# SECTION J: WEB APPLICATION, FRONTEND HTML AND JAVASCRIPT

In [None]:
# Flask app that serves the chat frontend and the JSON endpoints.
app = Flask(__name__)
CORS(app)  # Enable Cross-Origin Resource Sharing

FRONTEND_HTML = """
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Medical Chatbot</title>
    <style>
        body { font-family: Arial, sans-serif; margin: 0; padding: 0; background-color: #f0f2f5; display: flex; flex-direction: column; align-items: center; min-height: 100vh; }
        header { background-color: #007bff; color: white; padding: 15px 0; text-align: center; width: 100%; box-shadow: 0 2px 5px rgba(0,0,0,0.1); position: fixed; top: 0; left: 0; z-index: 1000;}
        header h1 { margin: 0; font-size: 1.6em; }
        #chat-container { width: 95%; max-width: 800px; background-color: #ffffff; border-radius: 8px; box-shadow: 0 4px 12px rgba(0,0,0,0.1); margin-top: 100px; /* Adjusted for fixed header */ margin-bottom: 20px; display: flex; flex-direction: column; flex-grow: 1; overflow: hidden; height: calc(100vh - 140px); /* Adjusted for fixed header and input */ }
        #chatbox { flex-grow: 1; padding: 20px; overflow-y: auto; border-bottom: 1px solid #e0e0e0; }
        .message { margin-bottom: 12px; padding: 10px 15px; border-radius: 18px; line-height: 1.5; max-width: 80%; word-wrap: break-word; }
        .user-message { background-color: #007bff; color: white; margin-left: auto; border-bottom-right-radius: 5px; align-self: flex-end; }
        .agent-message { background-color: #e9ecef; color: #212529; margin-right: auto; border-bottom-left-radius: 5px; align-self: flex-start; }
        .system-message { background-color: #fff3cd; color: #856404; text-align: center; font-style: italic; padding: 8px; border-radius: 5px; }
        #input-area { display: flex; padding: 15px; border-top: 1px solid #e0e0e0; background-color: #f8f9fa;}
        #queryInput { flex-grow: 1; padding: 10px 15px; border: 1px solid #ced4da; border-radius: 20px; margin-right: 10px; font-size: 0.95em; }
        #queryInput:focus { border-color: #80bdff; box-shadow: 0 0 0 0.2rem rgba(0,123,255,.25); outline: none; }
        .button { padding: 10px 18px; color: white; border: none; border-radius: 20px; cursor: pointer; font-size: 0.95em; transition: background-color 0.2s ease-in-out; }
        #sendButton { background-color: #007bff; }
        #sendButton:hover { background-color: #0056b3; }
        #resetButton { background-color: #6c757d; margin-left: 8px; }
        #resetButton:hover { background-color: #545b62; }
        .thinking { font-style: italic; color: #6c757d; padding: 8px 0; text-align: left; }
    </style>
</head>
<body>
    <header><h1>Medical Chatbot (CORD-19)</h1></header>
    <div id="chat-container">
        <div id="chatbox">
            <div class="message agent-message">Hello! Ask me about COVID-19 in relation to smoking or high blood pressure.</div>
        </div>
        <div id="input-area">
            <input type="text" id="queryInput" placeholder="Type your question..." onkeypress="handleKeyPress(event)">
            <button id="sendButton" class="button" onclick="askQuestion()">Send</button>
            <button id="resetButton" class="button" onclick="resetChat()">Reset</button>
        </div>
    </div>

    <script>
        const chatbox = document.getElementById('chatbox');
        const queryInput = document.getElementById('queryInput');
        const sendButton = document.getElementById('sendButton');
        let thinkingDiv = null; // To hold the "thinking..." message

        function addMessage(text, type) {
            const messageDiv = document.createElement('div');
            messageDiv.classList.add('message', type + '-message');
            messageDiv.textContent = text; // Using textContent for security
            chatbox.appendChild(messageDiv);
            chatbox.scrollTop = chatbox.scrollHeight; // Auto-scroll
        }

        function showThinking() {
            if (thinkingDiv) return; // Already showing
            thinkingDiv = document.createElement('div');
            thinkingDiv.classList.add('thinking', 'agent-message'); // Style like an agent message
            thinkingDiv.textContent = 'Agent is thinking...';
            chatbox.appendChild(thinkingDiv);
            chatbox.scrollTop = chatbox.scrollHeight;
        }

        function hideThinking() {
            if (thinkingDiv) {
                chatbox.removeChild(thinkingDiv);
                thinkingDiv = null;
            }
        }

        async function askQuestion() {
            const query = queryInput.value.trim();
            if (!query) return;

            addMessage(query, 'user');
            queryInput.value = '';
            queryInput.disabled = true;
            sendButton.disabled = true;
            showThinking();

            try {
                const response = await fetch('/chat', {
                    method: 'POST',
                    headers: { 'Content-Type': 'application/json' },
                    body: JSON.stringify({ query: query })
                });

                hideThinking();

                if (!response.ok) {
                    const errorData = await response.json().catch(() => ({ error: "Failed to parse error response." }));
                    addMessage('Error: ' + (errorData.error || 'Failed to get response from server.'), 'agent');
                    return;
                }
                const data = await response.json();
                addMessage(data.response, 'agent');

            } catch (error) {
                hideThinking();
                console.error('Error during fetch:', error);
                addMessage('Error: Could not connect to the agent. Please check the server.', 'agent');
            } finally {
                queryInput.disabled = false;
                sendButton.disabled = false;
                queryInput.focus();
            }
        }

        async function resetChat() {
            addMessage('Resetting chat history...', 'system');
            queryInput.disabled = true;
            sendButton.disabled = true;

            try {
                const response = await fetch('/reset', {
                    method: 'POST',
                    headers: { 'Content-Type': 'application/json' },
                });
                if (!response.ok) {
                    const errorData = await response.json().catch(() => ({ error: "Failed to parse error response." }));
                    addMessage('Error: ' + (errorData.error || 'Failed to reset.'), 'agent');
                    return;
                }
                const data = await response.json();
                addMessage(data.response, 'agent');
                // Optionally, you could clear more messages from the frontend if desired:
                // chatbox.innerHTML = ""; // Clears all messages
                // addMessage(data.response, 'agent'); // Then add the confirmation
            } catch (error) {
                console.error('Error during reset:', error);
                addMessage('Error: Could not connect to reset chat.', 'agent');
            } finally {
                queryInput.disabled = false;
                sendButton.disabled = false;
                queryInput.focus();
            }
        }

        function handleKeyPress(event) {
            if (event.key === 'Enter') {
                askQuestion();
            }
        }
        queryInput.focus(); // Focus on input on load
    </script>
</body>
</html>
"""

@app.route('/')
def home():
    return render_template_string(FRONTEND_HTML)

@app.route('/chat', methods=['POST'])
def chat_handler_route():
    try:
        # silent=True returns None instead of raising on a missing or malformed JSON body
        data = request.get_json(silent=True) or {}
        query = data.get('query')

        if not query:
            return jsonify({"error": "No query provided"}), 400

        # Stream from the globally defined chat_engine and join the tokens into one string
        response_stream = chat_engine.stream_chat(query)
        full_response = "".join(response_stream.response_gen)

        return jsonify({"response": full_response})
    except Exception as e:
        print(f"Error in /chat endpoint: {e}")  # Log error to server console
        return jsonify({"error": "An internal server error occurred."}), 500

@app.route('/reset', methods=['POST'])
def reset_handler_route():
    try:
        chat_engine.reset() # Reset the chat engine's memory
        return jsonify({"response": "Chat history has been successfully reset."})
    except Exception as e:
        print(f"Error in /reset endpoint: {str(e)}") # Log error to server console
        return jsonify({"error": "An internal server error occurred during reset."}), 500

if __name__ == '__main__':
    # In Google Colab, print the proxy URL that forwards to port 5000.
    try:
        from google.colab.output import eval_js
        print(eval_js("google.colab.kernel.proxyPort(5000)"))
    except ImportError:
        # Local development or other environments
        print("Starting Flask app on http://127.0.0.1:5000/")

    # host='0.0.0.0' makes the server reachable on your network;
    # debug=True can help during development, but debug=False is more stable.
    app.run(host='0.0.0.0', port=5000, debug=False)

https://5000-gpu-t4-s-1jlq3evkyojll-a.us-west4-0.prod.colab.dev
 * Serving Flask app '__main__'
 * Debug mode: off


 * Running on all addresses (0.0.0.0)
 * Running on http://127.0.0.1:5000
 * Running on http://172.28.0.12:5000
INFO:werkzeug:Press CTRL+C to quit
INFO:werkzeug:127.0.0.1 - - [11/Jun/2025 17:05:37] "GET / HTTP/1.1" 200 -
INFO:werkzeug:127.0.0.1 - - [11/Jun/2025 17:05:38] "GET /favicon.ico HTTP/1.1" 404 -
INFO:werkzeug:127.0.0.1 - - [11/Jun/2025 17:05:54] "POST /chat HTTP/1.1" 200 -
INFO:werkzeug:127.0.0.1 - - [11/Jun/2025 17:06:03] "POST /chat HTTP/1.1" 200 -
INFO:werkzeug:127.0.0.1 - - [11/Jun/2025 17:06:49] "GET / HTTP/1.1" 200 -
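
The `/chat` endpoint can also be called without the browser frontend. Below is a minimal client sketch using only the standard library; the helper names `build_chat_request` and `ask`, the timeout, and the local base URL are assumptions for illustration, and in Colab the proxy URL printed above would take the place of the base URL.

```python
import json
import urllib.request


def build_chat_request(base_url, query):
    """Build (but do not send) a POST request for the /chat endpoint."""
    payload = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(base_url, query, timeout=120):
    """Send the question and return the 'response' field of the JSON reply."""
    chat_request = build_chat_request(base_url, query)
    with urllib.request.urlopen(chat_request, timeout=timeout) as reply:
        return json.loads(reply.read().decode("utf-8"))["response"]


# Building the request needs no running server; the base URL assumes a local run.
req = build_chat_request("http://127.0.0.1:5000", "Does smoking affect COVID-19 outcomes?")
print(req.get_method(), req.full_url)
```

Once the Flask server is running, `ask("http://127.0.0.1:5000", "...")` would return the agent's answer as a string. The generous timeout reflects that generation on a small GPU can take a while.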
