<header>
   <p  style='font-size:36px;font-family:Arial; color:#F0F0F0; background-color: #00233c; padding-left: 20pt; padding-top: 20pt;padding-bottom: 10pt; padding-right: 20pt;'>
       ChatBot using Teradata Vector Store
  <br>
       <img id="teradata-logo" src="https://storage.googleapis.com/clearscape_analytics_demo_data/DEMO_Logo/teradata.svg" alt="Teradata" style="width: 125px; height: auto; margin-top: 20pt;">
    </p>
</header>

<p style = 'font-size:20px;font-family:Arial'><b>Introduction:</b></p>

<p style = 'font-size:16px;font-family:Arial'>In today’s information-driven world, organizations need smarter, faster, and more efficient ways to interact with vast amounts of data. Traditional search systems, which rely solely on keywords, often fail to return the most relevant results, especially when dealing with complex or unstructured data. Teradata's Enterprise Vector Store addresses this problem by enabling semantic search: the ability to search based on meaning rather than keywords.
</p>

<p style = 'font-size:16px;font-family:Arial'>
In this demo, we showcase the power of Teradata's Enterprise Vector Store combined with conversational AI, demonstrating how the system can intelligently retrieve and present relevant data, all within a chat interface.</p>


<center><img src="images/header.png" alt="Chatbot using Teradata Vector Store"  width=800 height=800/></center>

<br>
<p style = 'font-size:18px;font-family:Arial'><b>Architecture Overview</b></p>

    
<ul style="font-size:16px;font-family:Arial"> 
    <li>Teradata Enterprise Vector Store: A high-performance, scalable database optimized for storing and searching vectorized data, enabling semantic search over vast datasets.</li>
    <li>Chat Application Interface: A user-friendly interface where users interact by typing queries, receiving intelligent responses in return.</li>
    <li>Vector Embedding Model: A model that converts textual data into numerical vectors, allowing the system to understand the semantic meaning of queries and data.</li>
    <li>Retrieval Mechanism: This mechanism enables the search of semantically relevant information from the stored data, enabling context-aware responses based on user input.</li>
</ul>
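
<p style = 'font-size:16px;font-family:Arial'>To make the embedding and retrieval components above concrete, here is a minimal sketch of similarity search over a handful of hand-written vectors. The three-dimensional vectors, the sample texts, and the helper names <code>cosine_similarity</code> and <code>top_k</code> are illustrative assumptions. They are not produced by a real embedding model and do not represent how Teradata Enterprise Vector Store implements search.</p>

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, store, k=2):
    """Return the k stored texts whose vectors are most similar to query_vec."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]

# Toy 3-dimensional "embeddings" (assumption: real models use many more dimensions)
store = [
    ("quarterly financial analysis", [0.9, 0.1, 0.0]),
    ("revenue and cost breakdown", [0.8, 0.2, 0.1]),
    ("climate change impact", [0.0, 0.1, 0.9]),
]

# Stand-in for the embedding of a finance-related question
query = [0.85, 0.15, 0.05]
results = top_k(query, store, k=2)
print(results)
```

<p style = 'font-size:16px;font-family:Arial'>The query shares no keywords with the stored texts, yet the two finance-related entries rank above the climate entry because their vectors point in a similar direction.</p>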

<p style = 'font-size:18px;font-family:Arial'><b>Key Components</b></p>

    
<ul style="font-size:16px;font-family:Arial"> 
    <li><b>Teradata VectorStore:</b>
Teradata's VectorStore is an enterprise-grade vector database that allows for fast indexing and searching of document content. It uses embedding models and search algorithms to find relevant information in large collections of text. In this demo, we use the Amazon Titan-Embed-Text model for embedding and <code>VECTORDISTANCE</code> as the search algorithm.</li>
    <li><b>Panel:</b> A Python library for building interactive interfaces. It lets us create widgets such as text inputs, buttons, and display areas directly in the notebook. By integrating Panel with Teradata’s VectorStore, we can build a chatbot interface for interacting with the document content.</li>
</ul>

<hr style="height:2px;border:none">
<b style = 'font-size:20px;font-family:Arial'>1. Configuring the environment</b>
<div class="alert alert-block alert-info">
    <p style = 'font-size:16px;font-family:Arial'><i><b>Note:</b> Installing the required libraries takes approximately <b>4 to 5 minutes</b> the first time. If the libraries are already installed, the cell completes within 5 seconds.</i></p>
</div>

In [None]:
%%capture
!pip install -r requirements.txt

<p style = 'font-size:16px;font-family:Arial'>
    <i>The above command installs the libraries required to run this demo. Restart the kernel after running it so that the installed libraries become available.</i></p>

<div class="alert alert-block alert-info">
    <p style = 'font-size:16px;font-family:Arial'><i><b>Note:</b> The install command above is needed when you run the notebook on a platform other than ClearScape Analytics Experience that does not have the libraries installed. After running it, be sure to restart the kernel to bring the installed libraries into memory. The simplest way to restart the kernel is by typing zero zero: <b>0 0</b></i></p></div>

<div class="alert alert-block alert-info">
    <p style = 'font-size:16px;font-family:Arial'><i><b>Note:</b> To ensure that the Chatbot interface reflects the latest changes, reload the page by clicking the 'Reload' button or pressing F5 on your keyboard. This is needed the <b>first time only</b>. It updates the notebook with the latest modifications so that you can interact with the Chatbot using the new libraries.</i></p></div>

<hr style="height:2px;border:none">
<p style = 'font-size:18px;font-family:Arial'><b>1.1 Import the required libraries</b></p>

<p style = 'font-size:16px;font-family:Arial'>Here, we import the required libraries, set environment variables and environment paths (if required).</p>

In [None]:
# Standard library
import os
import sys
import glob
import time
import logging

# Third-party libraries
import panel as pn
from dotenv import dotenv_values  # used below to read the .env file into a dictionary
from teradataml import *
from teradatagenai import VSManager, VectorStore
from teradataml import create_context, set_auth_token

<hr style="height:2px;border:none">
<b style = 'font-size:20px;font-family:Arial'>2. Connect to VantageCloud Lake</b>
<p style = 'font-size:16px;font-family:Arial'>Connect to VantageCloud using <code>create_context</code> from the teradataml Python library. If this environment has been prepared for connecting to a VantageCloud Lake OAF Container, all the details required will be loaded and you will see an acknowledgement after executing this cell.</p>

In [None]:
print("Checking if this environment is ready to connect to VantageCloud Lake...")

if os.path.exists("/home/jovyan/JupyterLabRoot/VantageCloud_Lake/.config/.env"):
    print("Your environment parameter file exists. Please proceed with this use case.")
    # Load all the variables from the .env file into a dictionary
    env_vars = dotenv_values("/home/jovyan/JupyterLabRoot/VantageCloud_Lake/.config/.env")
    # Create the Context
    eng = create_context(host=env_vars.get("host"), username=env_vars.get("username"), password=env_vars.get("my_variable"))
    execute_sql('''SET query_band='DEMO=Chatbot_VS.ipynb;' UPDATE FOR SESSION;''')
    print("Connected to VantageCloud Lake with:", eng)
else:
    print("Your environment has not been prepared for connecting to VantageCloud Lake.")
    print("Please contact the support team.")
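
<p style = 'font-size:16px;font-family:Arial'>Before creating the connection, it can help to confirm that the <code>.env</code> file supplies every setting this notebook reads. The sketch below is one possible check. The helper name <code>missing_keys</code> and the placeholder values are assumptions for illustration; the key names are the ones this notebook passes to <code>create_context</code> and <code>set_auth_token</code>.</p>

```python
# Keys this notebook reads from the .env dictionary
REQUIRED_KEYS = ("host", "username", "my_variable", "ues_uri", "access_token", "pem_file")

def missing_keys(env_vars, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty in env_vars."""
    return [key for key in required if not env_vars.get(key)]

# Example with a partial configuration (placeholder values, not real settings)
example_env = {"host": "example-host", "username": "demo_user", "my_variable": ""}
print("Missing settings:", missing_keys(example_env))
```

<p style = 'font-size:16px;font-family:Arial'>In the notebook, the same function could be called on <code>env_vars</code> so that a missing value is reported by name instead of surfacing later as a connection error.</p>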

In [None]:
# We've already loaded all the values into our environment variables and into a dictionary, env_vars.

if set_auth_token(base_url=env_vars.get("ues_uri"),
                  pat_token=env_vars.get("access_token"), 
                  pem_file=env_vars.get("pem_file"),
                  valid_from=int(time.time())
                 ):
    print("UES Authentication successful")
else:
    print("UES Authentication failed. Check credentials.")
    sys.exit(1)

In [None]:
VSManager.health()

<hr style="height:2px;border:none">
<b style = 'font-size:20px;font-family:Arial'>3. Initializing the Vector Store</b>
<p style = 'font-size:16px;font-family:Arial'>Here, we initialize the Vector Store, which will store the document embeddings. This vector store will be used to index and search the uploaded documents efficiently.</p>

In [None]:
# Create the vector store
vs_ti = VectorStore("testing")

<hr style="height:2px;border:none">
<p style = 'font-size:18px;font-family:Arial'><b>3.1 File Upload Setup</b></p>

<p style = 'font-size:16px;font-family:Arial'>We initialize the Panel extension to create a user interface that allows document uploads. The panel interface enables users to select and upload documents.</p>

In [None]:
# File upload functionality using Panel
pn.extension()

<hr style="height:2px;border:none">
<p style = 'font-size:18px;font-family:Arial'><b>3.2 File Upload Handling</b></p>

<p style = 'font-size:16px;font-family:Arial'>This function saves the uploaded file into a local directory called <code>data</code>. It checks for supported file types and ensures that the file is saved correctly.</p>

In [None]:
# Global variable to track upload status
upload_completed = False

In [None]:
def save_uploaded_file(file_content, filename):
    """Save uploaded file to data folder"""
    global upload_completed
    
    if file_content is None or filename is None:
        return "No file selected"
    
    # Create data folder if it doesn't exist
    data_folder = os.path.join(os.getcwd(), "data")
    if not os.path.exists(data_folder):
        os.makedirs(data_folder)
    
    # List of supported document extensions
    supported_extensions = ['.pdf', '.doc', '.docx', '.txt', '.rtf', '.odt', '.html', '.htm', '.xml', '.csv', '.md']
    
    # Check if it's a supported document file
    file_extension = os.path.splitext(filename.lower())[1]
    if file_extension not in supported_extensions:
        return f"Error: {filename} is not a supported document type. Supported types: {', '.join(supported_extensions)}"
    
    # Save file to data folder
    file_path = os.path.join(data_folder, filename)
    
    try:
        with open(file_path, 'wb') as f:
            f.write(file_content)
        upload_completed = True
        return f"Successfully uploaded: {filename} to data folder"
    except Exception as e:
        return f"Error saving file: {str(e)}"
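
<p style = 'font-size:16px;font-family:Arial'>The function above joins the uploaded file name directly onto the <code>data</code> folder. If file names could ever arrive with directory components, one way to keep the saved file inside that folder is to strip everything except the final path component first. The helper <code>safe_target_path</code> below is a hypothetical addition for illustration, not part of the original demo.</p>

```python
import os

def safe_target_path(data_folder, filename):
    """Build a save path that always stays inside data_folder."""
    # Normalize Windows separators, then keep only the final path component
    name = os.path.basename(filename.replace("\\", "/"))
    if name in ("", ".", ".."):
        raise ValueError(f"Invalid file name: {filename!r}")
    return os.path.join(data_folder, name)

print(safe_target_path("data", "report.pdf"))
print(safe_target_path("data", "../../etc/passwd"))
```

<p style = 'font-size:16px;font-family:Arial'>The second call shows that leading <code>../</code> segments are dropped, so the result still points inside <code>data</code>.</p>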

<hr style="height:2px;border:none">
<b style = 'font-size:20px;font-family:Arial'>4. File Input Widget</b>
<p style = 'font-size:16px;font-family:Arial'>We create a File Input widget using Panel, allowing users to select multiple document files for upload. Supported file types include PDF, DOC, TXT, CSV, and others.</p>

In [None]:
# Create file input widget
file_input = pn.widgets.FileInput(
    accept='.pdf,.doc,.docx,.txt,.rtf,.odt,.html,.htm,.xml,.csv,.md',
    multiple=True,
    sizing_mode='stretch_width'
)

# Create upload button and status
upload_button = pn.widgets.Button(name="Upload Documents", button_type="primary")
status_pane = pn.pane.HTML("<p>Please select document files to upload.</p>")

In [None]:
def handle_upload(event):
    """Handle the upload button click"""
    global upload_completed
    
    if file_input.value is None:
        status_pane.object = "<p style='color: red;'>Please select a file first.</p>"
        return
    
    if isinstance(file_input.value, list):
        # Multiple files
        results = []
        filenames = file_input.filename if isinstance(file_input.filename, list) else [file_input.filename]
        for i, file_content in enumerate(file_input.value):
            filename = filenames[i] if i < len(filenames) else f"file_{i}"
            result = save_uploaded_file(file_content, filename)
            results.append(result)
        status_messages = "<br>".join([f"<p>{result}</p>" for result in results])
        status_pane.object = status_messages
    else:
        # Single file
        result = save_uploaded_file(file_input.value, file_input.filename)
        if "Successfully" in result:
            status_pane.object = f"<p style='color: green;'>{result}</p>"
        else:
            status_pane.object = f"<p style='color: red;'>{result}</p>"
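
<p style = 'font-size:16px;font-family:Arial'>The handler above treats single and multiple uploads in separate branches. A possible simplification is to normalize both cases into one list of (file name, content) pairs first and then loop once. The helper <code>pair_uploads</code> is a hypothetical name, and unlike the handler above it raises an error when the counts differ instead of inventing a fallback file name.</p>

```python
def pair_uploads(value, filename):
    """Return a list of (filename, content) pairs for one or many uploads."""
    if value is None:
        return []
    # Wrap single values so that both cases are handled as lists
    values = value if isinstance(value, list) else [value]
    names = filename if isinstance(filename, list) else [filename]
    if len(values) != len(names):
        raise ValueError("Number of files and number of file names differ")
    return list(zip(names, values))

print(pair_uploads(b"abc", "a.txt"))
print(pair_uploads([b"abc", b"xyz"], ["a.txt", "b.md"]))
```

<p style = 'font-size:16px;font-family:Arial'>With this shape, each pair can be passed to <code>save_uploaded_file</code> in a single loop, and the same green or red status styling can be applied to every result.</p>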

<hr style="height:2px;border:none">
<p style = 'font-size:18px;font-family:Arial'><b>4.1 Upload Button and Status Display</b></p>

<p style = 'font-size:16px;font-family:Arial'>This section sets up a button to trigger the file upload process and a status pane that shows the upload progress or completion messages.</p>

In [None]:
# Bind the upload function to the button
upload_button.on_click(handle_upload)

# Create upload interface
upload_interface = pn.Column(
    pn.pane.HTML("<h3>📁 Upload Document Files</h3>"),
    pn.pane.HTML("<p><b>Supported formats:</b> PDF, DOC, DOCX, TXT, RTF, ODT, HTML, HTM, XML, CSV, MD</p>"),
    pn.pane.HTML("<p>Select one or more document files to upload:</p>"),
    file_input,
    pn.Spacer(height=10),
    upload_button,
    pn.Spacer(height=10),
    status_pane,
    width=600,
    margin=(10, 10)
)

# Display the upload interface in the notebook
upload_interface

<p style = 'font-size:16px;font-family:Arial'>After uploading, we collect the document files saved in the <code>data</code> directory. Supported file types include <code>PDF, DOC, TXT, CSV, and others</code>.</p>

In [None]:
# Fetch multiple document files from the data directory
data_folder = os.path.join(os.getcwd(), "data")
supported_patterns = ["*.pdf", "*.doc", "*.docx", "*.txt", "*.rtf", "*.odt", "*.html", "*.htm", "*.xml", "*.csv", "*.md"]
files = []
for pattern in supported_patterns:
    files.extend(glob.glob(os.path.join(data_folder, pattern)))

In [None]:
# Check if there are document files available in data folder
if len(files) == 0:
    raise FileNotFoundError("No document files found in the data directory.")
else:
    print("Input document files from data folder:")
    for file in files:
        print(os.path.basename(file))
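
<p style = 'font-size:16px;font-family:Arial'>One detail worth noting: the upload function lowercases the file name before checking its extension, while glob patterns such as <code>*.pdf</code> match case-sensitively on case-sensitive file systems. A file saved as <code>Report.PDF</code> could therefore be uploaded but not collected. The sketch below shows one way to match extensions regardless of case. The function name <code>list_documents</code> and the temporary demo files are assumptions for illustration.</p>

```python
import tempfile
from pathlib import Path

SUPPORTED_SUFFIXES = {".pdf", ".doc", ".docx", ".txt", ".rtf", ".odt",
                      ".html", ".htm", ".xml", ".csv", ".md"}

def list_documents(folder):
    """Return sorted paths of supported documents, ignoring extension case."""
    return sorted(
        str(path)
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in SUPPORTED_SUFFIXES
    )

# Demonstration in a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    for name in ("Report.PDF", "notes.txt", "image.png"):
        (Path(tmp) / name).write_bytes(b"demo")
    found = [Path(p).name for p in list_documents(tmp)]
    print(found)
```

<p style = 'font-size:16px;font-family:Arial'>The unsupported <code>image.png</code> is skipped, and the upper-case <code>.PDF</code> file is included.</p>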

<hr style="height:2px;border:none">
<p style = 'font-size:18px;font-family:Arial'><b>4.2 Creating Vector Store</b></p>

<p style = 'font-size:16px;font-family:Arial'>Initialize and configure the <b>Teradata Vector Store</b> with the required parameters. This is the core step where we set up the vector store with the relevant models, algorithms, and document files. The Vector Store will index the uploaded documents and prepare them for fast retrieval using similarity search.</p>

In [None]:
vs_ti.create(
    embeddings_model="amazon.titan-embed-text-v2:0",
    chat_completion_model="anthropic.claude-3-5-sonnet-20240620-v1:0",
    search_algorithm="VECTORDISTANCE",
    top_k=10,
    object_names="tbl_testing",
    data_columns=["chunks"],
    vector_column="VectorIndex",
    chunk_size=100,
    optimized_chunking=False,
    document_files=files,
)
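
<p style = 'font-size:16px;font-family:Arial'>The <code>chunk_size</code> parameter controls how documents are split before embedding. To give a feel for what chunking means, here is a simplified sketch that splits text into fixed-size pieces. It assumes the size is measured in characters, which this notebook does not confirm, and it is not the chunking logic used by Teradata Vector Store. The function name <code>chunk_text</code> is hypothetical.</p>

```python
def chunk_text(text, chunk_size=100):
    """Split text into consecutive pieces of at most chunk_size characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

sample = "Teradata Vector Store indexes document chunks. " * 5
chunks = chunk_text(sample, chunk_size=100)
print(len(chunks), [len(chunk) for chunk in chunks])
```

<p style = 'font-size:16px;font-family:Arial'>Smaller chunks give more precise matches but less surrounding context per match; larger chunks do the opposite.</p>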

<p style = 'font-size:16px;font-family:Arial'>Check the current status of the <b>Teradata Vector Store</b> after it has been created. This step ensures that the Vector Store has been successfully initialized and is ready for processing queries.</p>

In [None]:
vs_ti.status()
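
<p style = 'font-size:16px;font-family:Arial'>If creation takes a while, the status check may need to be repeated until the store is ready. The sketch below shows a generic polling loop. Because this notebook does not show what <code>status()</code> returns, the readiness test is left to a caller-supplied function; <code>wait_until</code> and <code>fake_ready</code> are hypothetical names used only for this illustration.</p>

```python
import time

def wait_until(is_ready, timeout=300.0, interval=5.0):
    """Call is_ready() repeatedly until it returns True or timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        if is_ready():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)

# Stand-in readiness check that succeeds on the third call
calls = {"count": 0}

def fake_ready():
    calls["count"] += 1
    return calls["count"] >= 3

ready = wait_until(fake_ready, timeout=1.0, interval=0.01)
print(ready, calls["count"])
```

<p style = 'font-size:16px;font-family:Arial'>In the notebook, <code>is_ready</code> would wrap a call to <code>vs_ti.status()</code> and inspect whatever that call returns.</p>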

<p style = 'font-size:16px;font-family:Arial'>The <code>run_query</code> function is designed to process and answer user queries based on the document content stored in the <b>Teradata Vector Store</b>. This function leverages the embeddings created from the uploaded documents to retrieve relevant information and provide answers.</p>

In [None]:
# Function to run a query against the uploaded document content
def run_query(query: str):
    res = vs_ti.ask(question=query)
    return res

<p style = 'font-size:16px;font-family:Arial'>The <code>callback</code> function is responsible for handling the chat messages from the user and providing appropriate responses. It acts as the core mechanism for processing user input and querying the <b>Teradata Vector Store</b> to generate responses based on the uploaded document content.</p>

In [None]:
# Callback function for handling chat messages and providing responses
def callback(contents, user, instance):
    """Handles the chat interaction and returns the response."""
    # Process the contents of the message
    response = run_query(contents) 
    return response
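
<p style = 'font-size:16px;font-family:Arial'>Because the callback depends on a database and an LLM, a failed query would otherwise surface as an unhandled exception. One possible hardening is to wrap the query function so that empty input and backend errors are returned as chat messages. In this sketch, <code>make_callback</code> and <code>fake_ask</code> are hypothetical names, <code>fake_ask</code> stands in for <code>run_query</code>, and messages are assumed to be plain text.</p>

```python
def make_callback(ask_fn):
    """Wrap ask_fn in a chat callback that reports failures as a message."""
    def callback(contents, user, instance):
        question = (contents or "").strip()
        if not question:
            return "Please enter a question about the uploaded documents."
        try:
            return ask_fn(question)
        except Exception as exc:  # keep the chat responsive on any backend error
            return f"Sorry, the query failed: {exc}"
    return callback

# Stand-in for run_query that fails on demand
def fake_ask(question):
    if "fail" in question:
        raise RuntimeError("backend unavailable")
    return f"Answer to: {question}"

safe_callback = make_callback(fake_ask)
print(safe_callback("What is the report about?", "User", None))
print(safe_callback("please fail", "User", None))
```

<p style = 'font-size:16px;font-family:Arial'>In the notebook, <code>make_callback(run_query)</code> would produce a callback with the same three-argument signature as the one defined above.</p>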

<div class="alert alert-block alert-info">
    <p style = 'font-size:16px;font-family:Arial'><i><b>Note:</b> The chatbot accesses multiple components, including databases and LLMs. This may cause a brief delay in responses. Your patience is appreciated.</i></p>
</div>

<hr style="height:2px;border:none">
<b style = 'font-size:20px;font-family:Arial'>5. Panel's Chat Interface</b>
<p style = 'font-size:16px;font-family:Arial'>The chatbot uses Panel's <code>ChatInterface</code> to handle the user interface for interactions. This interface allows users to input questions and view responses in real-time, providing an intuitive and smooth experience for engaging with the documents.</p>

<p style='font-size:16px;font-family:Arial'>
    You can ask the chatbot about anything in the documents you uploaded. Here are some example queries:
</p>

<ul style="font-size:16px;font-family:Arial"> 
    <li><b>Ask a question:</b>
    <br>Can you summarize the introduction of the report? <br>
    Where in the document is the financial analysis mentioned?</li>
    <li><b>Request specific information:</b>
    <br>Can you give me the names of the authors of the research? <br>
    What does the document say about climate change?</li>
</ul>


In [None]:
# Using Panel's ChatInterface for the chatbot UI
pn.chat.ChatInterface(
    callback=callback,
    show_rerun=False,  # Hide rerun button
    show_undo=False,   # Hide undo button
    show_clear=False,  # Hide clear button
    width=800,
    height=400
).servable()

<i>If the chatbot did not respond when you pressed ENTER the first time you used this demo in your environment, reload the page with F5 as described in the instructions at the top of the notebook.<br>
If you asked a question and got no response after a few minutes, you may need to type 0 0 to restart the kernel and re-run the demo. Questions outside the scope of the uploaded documents seem to confuse the chatbot.</i>

<hr style="height:2px;border:none">
<b style = 'font-size:20px;font-family:Arial'>6. Cleanup</b>
<p style = 'font-size:16px;font-family:Arial'>Call the destroy() method of the VS object to clean up the objects created during this demo.</p>

In [None]:
# Destroy the vector store after use
vs_ti.destroy()

In [None]:
remove_context()

<p style = 'font-size:16px;font-family:Arial'><b>Link:</b></p>
<ul style = 'font-size:16px;font-family:Arial'>
    <li>Teradata Enterprise Vector Store: <a href = 'https://docs.teradata.com/search/all?query=Teradata+Enterprise+Vector+Store&content-lang=en-US'>here</a></li>
    
</ul>

<footer style="padding-bottom:35px; border-bottom:3px solid">
    <div style="float:left;margin-top:14px">ClearScape Analytics™</div>
    <div style="float:right;">
        <div style="float:left; margin-top:14px">
            Copyright © Teradata Corporation - 2025. All Rights Reserved
        </div>
    </div>
</footer>