<header>
   <p  style='font-size:36px;font-family:Arial; color:#F0F0F0; background-color: #00233c; padding-left: 20pt; padding-top: 20pt;padding-bottom: 10pt; padding-right: 20pt;'>
       ChatBot using Teradata's Enterprise Vector Store
  <br>
       <img id="teradata-logo" src="https://storage.googleapis.com/clearscape_analytics_demo_data/DEMO_Logo/teradata.svg" alt="Teradata" style="width: 125px; height: auto; margin-top: 20pt;">
    </p>
</header>

<p style = 'font-size:20px;font-family:Arial'><b>Introduction:</b></p>

<p style = 'font-size:16px;font-family:Arial'>In today's information-driven world, organizations need smarter, faster, and more efficient ways to interact with vast amounts of data. Traditional search systems, which rely solely on keywords, often fail to provide the most relevant results, especially when dealing with complex or unstructured data. Teradata's Enterprise Vector Store addresses this problem by enabling semantic search: the ability to search based on meaning rather than keywords.
</p>

<p style = 'font-size:16px;font-family:Arial'>
In this demo, we showcase the power of Teradata's Enterprise Vector Store combined with conversational AI, demonstrating how the system can intelligently retrieve and present relevant data, all within a chat interface.</p>


<center><img src="images/header.png" alt="Demo header image"  width=500 height=500 style="border: 4px solid #404040; border-radius: 10px;"/></center>

<br>
<p style = 'font-size:18px;font-family:Arial'><b>Architecture Overview</b></p>

    
<ul style="font-size:16px;font-family:Arial"> 
    <li>Teradata Enterprise Vector Store: A high-performance, scalable database optimized for storing and searching vectorized data, enabling semantic search over vast datasets.</li>
    <li>Chat Application Interface: A user-friendly interface where users interact by typing queries, receiving intelligent responses in return.</li>
    <li>Vector Embedding Model: A model that converts textual data into numerical vectors, allowing the system to understand the semantic meaning of queries and data.</li>
    <li>Retrieval Mechanism: This mechanism enables the search of semantically relevant information from the stored data, enabling context-aware responses based on user input.</li>
</ul>
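
<p style = 'font-size:16px;font-family:Arial'>To make the embedding and retrieval components concrete, here is a minimal sketch of similarity ranking over toy vectors. It is not how the Enterprise Vector Store is implemented. The three-dimensional vectors, the document names, and the helper functions <code>cosine_similarity</code> and <code>top_k</code> are all made up for illustration, and cosine similarity is used here only as one common choice of distance measure.</p>

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=2):
    """Return the names of the k documents most similar to the query vector."""
    ranked = sorted(
        doc_vecs,
        key=lambda name: cosine_similarity(query_vec, doc_vecs[name]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical embeddings; a real embedding model produces far more dimensions.
toy_docs = {
    "attention": [0.9, 0.1, 0.0],
    "cooking": [0.0, 0.2, 0.9],
    "translation": [0.7, 0.3, 0.1],
}
toy_query = [1.0, 0.0, 0.0]

print(top_k(toy_query, toy_docs, k=2))
```

<p style = 'font-size:16px;font-family:Arial'>The query vector points in almost the same direction as the "attention" and "translation" vectors, so those two rank first even though no keywords are compared.</p>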

<p style = 'font-size:18px;font-family:Arial'><b>Key Components</b></p>

    
<ul style="font-size:16px;font-family:Arial"> 
    <li><b>Teradata VectorStore :</b>
Teradata's VectorStore is an enterprise-grade vector database that allows for fast indexing and searching of document content. It uses embedding models and search algorithms to find relevant information from large collections of text efficiently. In this demo, we will leverage the Amazon Titan-Embed-Text model for embedding and <code>VECTORDISTANCE</code> as the search algorithm.</li>
    <li><b>Panel : </b>is a Python library that enables us to build interactive interfaces. It allows us to create widgets such as text inputs, buttons, and display areas directly in the notebook. By integrating Panel with Teradata's VectorStore, we can create an engaging, user-friendly chatbot interface that lets you interact with the document content effortlessly.</li>
</ul>

<hr style="height:2px;border:none">
<b style = 'font-size:20px;font-family:Arial'>1. Configuring the environment</b>
<div class="alert alert-block alert-info">
    <p style = 'font-size:16px;font-family:Arial'><i><b>Note:</b> Installing the required libraries takes approximately <b>4 to 5 minutes</b> the first time. If the libraries are already installed, the cell completes within about 5 seconds.</i></p>
</div>

In [None]:
%%capture
!pip install -r requirements.txt --quiet

<div class="alert alert-block alert-info">
<p style = 'font-size:16px;font-family:Arial;'><b>Note: </b><i>Please restart the kernel after executing a </i><code>!pip install</code>. <i>The simplest way to restart the kernel is to type zero zero (<b>0 0</b>)</i> and then click <b><i>Restart</i></b>.</p>
</div>

<div class="alert alert-block alert-info">
    <p style = 'font-size:16px;font-family:Arial'><i><b>Note:</b> To ensure that the Chatbot interface reflects the latest changes, please reload the page by clicking the <b>Reload</b> or <b>Refresh</b> button or pressing F5 on your keyboard. This is needed the <b>first time only</b>. It updates the notebook with the latest modifications so that you can interact with the Chatbot using the new libraries.</i></p></div>

<hr style="height:2px;border:none">
<p style = 'font-size:18px;font-family:Arial'><b>1.1 Import the required libraries</b></p>

<p style = 'font-size:16px;font-family:Arial'>Here, we import the required libraries, set environment variables and environment paths (if required).</p>

In [None]:
import os
import sys  # needed for sys.exit() in the authentication cell
from IPython.display import display, HTML 
import ipywidgets as widgets
import glob
from dotenv import dotenv_values
import panel as pn
from teradataml import *
from teradatagenai import VSManager, VectorStore
from teradataml import create_context, set_auth_token
import logging
import time
import pandas

<hr style="height:2px;border:none">
<p style = 'font-size:20px;font-family:Arial'><b>2. Connect to VantageCloud</b></p>
<p style = 'font-size:16px;font-family:Arial'>Connect to VantageCloud using <code>create_context</code> from the teradataml Python library. This environment has been prepared for connecting to a VantageCloud OAF Container. All the details required have been provided.</p>

<p style = 'font-size:18px;font-family:Arial;'><b>2.1 Load the Environment Variables and Connect to Vantage</b></p>
<p style = 'font-size:16px;font-family:Arial;'>Load the environment variables from a .env file and use them to create a connection context to VantageCloud.</p>


In [None]:
print("Checking if this environment is ready to connect to VantageCloud Lake...")

if os.path.exists("/home/jovyan/JupyterLabRoot/VantageCloud_Lake/.config/.env"):
    print("Your environment parameter file exists.  Please proceed with this use case.")
    # Load all the variables from the .env file into a dictionary
    env_vars = dotenv_values("/home/jovyan/JupyterLabRoot/VantageCloud_Lake/.config/.env")
    # Create the Context
    eng = create_context(host=env_vars.get("host"), username=env_vars.get("username"), password=env_vars.get("my_variable"))
    execute_sql('''SET query_band='DEMO=Chatbot_Teradata_Vector_Store.ipynb;' UPDATE FOR SESSION;''')
    print("Connected to VantageCloud Lake with:", eng)
else:
    print("Your environment has not been prepared for connecting to VantageCloud Lake.")
    print("Please contact the support team.")
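
<p style = 'font-size:16px;font-family:Arial'>The connection cells in this section read several settings from <code>env_vars</code>, and a missing entry only surfaces later as a failed connection. As an optional aid, the sketch below reports which settings are absent or empty before any connection is attempted. The helper <code>missing_settings</code> and the sample dictionary are hypothetical; the key names are the ones used by the cells in this notebook.</p>

```python
def missing_settings(settings, required):
    """Return the required keys that are absent or empty in settings."""
    return [key for key in required if not settings.get(key)]

# Key names copied from the connection and authentication cells of this notebook.
REQUIRED_KEYS = ["host", "username", "my_variable", "ues_uri", "access_token", "pem_file"]

# Hypothetical, incomplete configuration used only to demonstrate the check.
example = {"host": "example-host", "username": "demo_user", "my_variable": ""}

print(missing_settings(example, REQUIRED_KEYS))
```

<p style = 'font-size:16px;font-family:Arial'>In the notebook, the same check could be run as <code>missing_settings(env_vars, REQUIRED_KEYS)</code> right after the <code>.env</code> file is loaded.</p>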

In [None]:
# We've already loaded all the values into our environment variables and into a dictionary, env_vars.

if set_auth_token(base_url=env_vars.get("ues_uri"),
                  pat_token=env_vars.get("access_token"), 
                  pem_file=env_vars.get("pem_file"),
                  valid_from=int(time.time())
                 ):
    print("UES Authentication successful")
else:
    print("UES Authentication failed. Check credentials.")
    sys.exit(1)

<p style = 'font-size:18px;font-family:Arial;'><b>2.2 Check the connectivity to our Vector Store Database</b></p>
<p style = 'font-size:16px;font-family:Arial;'>Execute this statement to test the connection.</p>

In [None]:
VSManager.health()

<hr style="height:2px;border:none">
<b style = 'font-size:20px;font-family:Arial'>3. Initializing the Vector Store</b>
<p style = 'font-size:16px;font-family:Arial'>Here, we initialize the Vector Store, which will store the document embeddings. This vector store will be used to index and search the uploaded documents efficiently.</p>

In [None]:
# Create the vector store
document_vector_store = VectorStore(env_vars.get("username"))

<hr style="height:2px;border:none">
<p style = 'font-size:18px;font-family:Arial'><b>3.1 File Upload Setup</b></p>

<p style = 'font-size:16px;font-family:Arial'>We initialize the Panel extension so that Panel components render in the notebook. The chatbot interface in section 6 is built with Panel; the upload widgets in section 4 are built with ipywidgets.</p>

In [None]:
# Load the Panel extension so Panel components render in the notebook
pn.extension()

<hr style="height:2px;border:none">
<p style = 'font-size:18px;font-family:Arial'><b>3.2 File Upload Handling</b></p>

<p style = 'font-size:16px;font-family:Arial'>Creates a <code>data</code> folder inside the current working directory if it doesn't already exist.</p>

In [None]:
cwd = os.getcwd()
data_folder = os.path.join(cwd, "data")
os.makedirs(data_folder, exist_ok=True)

<hr style="height:2px;border:none">
<b style = 'font-size:20px;font-family:Arial'>4. File Input Widget</b>
<p style = 'font-size:16px;font-family:Arial'>We create a File Input widget, allowing users to select multiple document files for upload. Supported file types include PDF.</p>

In [None]:
upload_btn = widgets.FileUpload(
    accept='.pdf',
    multiple=True,
    layout=widgets.Layout(width='60%')
)

upload_label = widgets.Label(
    value="Select one or more PDF files to upload.",
    layout=widgets.Layout(padding='6px 0px 10px 0px')
)

output = widgets.Output()

In [None]:
def on_upload_change(change):
    with output:
        output.clear_output()
        files = upload_btn.value
        if not files:
            display(HTML("<p style='color:#cc0000;font-weight:500;'>⚠️ No files selected.</p>"))
            return

        display(HTML("<p style='color:#005f9e;font-weight:500;'>⏳ Uploading files...</p>"))

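        # ipywidgets 8 returns a tuple of dicts that each carry 'name' and 'content';
        # ipywidgets 7 returns a dict keyed by file name whose values carry 'content'.
        if isinstance(files, tuple):
            file_list = [(f['name'], f['content']) for f in files]
        else:
            file_list = [(name, f['content']) for name, f in files.items()]

        for filename, content in file_list: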
            save_path = os.path.join(data_folder, filename)
            with open(save_path, 'wb') as f:
                f.write(content)
            display(HTML(f"<p style='color:#008000;'>✅ Saved: <b>{filename}</b></p>"))

        display(HTML(f"""
        <div style="
            background-color:#00233c;
            color:white;
            padding:10px;
            border-radius:8px;
            margin-top:10px;
            font-weight:600;
            font-family:Segoe UI, sans-serif;">
         Upload Completed Successfully!
        </div>
        <p style='font-size:13px;color:gray;margin-top:5px;'>
        Files saved to: <code>{data_folder}</code>
        </p>
        """))

<hr style="height:2px;border:none">
<p style = 'font-size:18px;font-family:Arial'><b>4.1 **Optional** Upload your own PDF files!</b></p>

<p style = 'font-size:16px;font-family:Arial'>This section is not required to continue. It provides a button that opens your File Explorer and lets you select one or more PDF files. Continue to 4.2 to use the provided PDF file.</p>

In [None]:
from IPython.display import display, HTML 

In [None]:
display(HTML(f"""
<div style="
    background-color:#00233c;
    color:white;
    padding:14px;
    border-radius:10px;
    font-family:Segoe UI, sans-serif;
    font-size:18px;
    font-weight:500;
    margin-bottom:12px;
    box-shadow:0px 2px 5px rgba(0,0,0,0.2);">
📄 Upload PDF Files
</div>
<br>
"""))

upload_btn.observe(on_upload_change, names='value')

ui = widgets.VBox([
    upload_label,
    upload_btn,
    output
])
display(ui)
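
<p style = 'font-size:16px;font-family:Arial'>The upload handler joins each uploaded file name onto <code>data_folder</code> as it arrives. If a name ever contained directory parts, the file could land outside that folder. The sketch below shows one way to guard against this by keeping only the final path component. The helper <code>safe_save_path</code> is hypothetical and not part of the original demo.</p>

```python
import os

def safe_save_path(folder, filename):
    """Build a path inside folder, discarding any directory part of filename."""
    # Normalise Windows separators so basename also strips prefixes like "C:\\docs\\".
    name = os.path.basename(filename.replace("\\", "/"))
    if not name or name in (".", ".."):
        raise ValueError(f"Unusable file name: {filename!r}")
    return os.path.join(folder, name)

print(safe_save_path("data", "../../etc/passwd"))
print(safe_save_path("data", "C:\\docs\\paper.pdf"))
```

<p style = 'font-size:16px;font-family:Arial'>Inside the handler, <code>save_path</code> could then be computed as <code>safe_save_path(data_folder, filename)</code>.</p>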

<hr style="height:2px;border:none">
<p style = 'font-size:18px;font-family:Arial'><b>4.2 Validate the PDF files</b></p>

<p style = 'font-size:16px;font-family:Arial'>Scan the files in the included /data folder and set the Project Directory.</p>

In [None]:
import os

# Full path to your desired project folder
PROJECT_DIR = "/home/jovyan/JupyterLabRoot/VantageCloud_Lake/UseCases/Chatbot_Teradata_Vector_Store"

# Ensure current working directory is valid and set to the project folder
try:
    _ = os.getcwd()
except FileNotFoundError:
    os.chdir(PROJECT_DIR)
else:
    if os.getcwd() != PROJECT_DIR:
        os.chdir(PROJECT_DIR)

print("Current Working Directory:", os.getcwd())

<p style = 'font-size:16px;font-family:Arial'>List all the files in the data folder that are available to store in the Teradata Vector Store.</p>

In [None]:

data_folder = os.path.join(PROJECT_DIR, "data")
supported_patterns = ["*.pdf"]
files = []
for pattern in supported_patterns:
    files.extend(glob.glob(os.path.join(data_folder, pattern)))

if len(files) == 0:
    raise FileNotFoundError("No PDF files found in the data directory.")
else:
    print("Input PDF files from data folder:")
    for file in files:
        print(os.path.basename(file))
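
<p style = 'font-size:16px;font-family:Arial'>On case-sensitive file systems the pattern <code>*.pdf</code> does not match a file named <code>REPORT.PDF</code>. As a hedged alternative, the sketch below lists PDF files regardless of the letter case of the extension. The helper <code>find_pdfs</code> is hypothetical, and the temporary directory exists only so the example runs without the demo's data folder.</p>

```python
import os
import tempfile

def find_pdfs(folder):
    """Return sorted paths of files in folder whose extension is .pdf in any letter case."""
    return sorted(
        os.path.join(folder, name)
        for name in os.listdir(folder)
        if name.lower().endswith(".pdf") and os.path.isfile(os.path.join(folder, name))
    )

# Demonstrate on a throwaway directory containing two PDFs and one text file.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.pdf", "B.PDF", "notes.txt"):
        open(os.path.join(tmp, name), "wb").close()
    found = [os.path.basename(path) for path in find_pdfs(tmp)]

print(found)
```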

<hr style="height:2px;border:none">
<p style = 'font-size:20px;font-family:Arial'><b>5. Creating the Vector Store</b></p>

<p style = 'font-size:18px;font-family:Arial'><b>5.1 Create</b></p>
<p style = 'font-size:16px;font-family:Arial'>Use the <code>create</code> method to initialize and configure the <b>Teradata Vector Store</b> with the required parameters. This is the core step where we set up the vector store with the relevant models, algorithms, and document files. The Vector Store will index the uploaded documents and prepare them for fast retrieval using similarity search.</p>

In [None]:
df=document_vector_store.status()
if df is None:
    document_vector_store.create(
        embeddings_model="amazon.titan-embed-text-v2:0",
        chat_completion_model="anthropic.claude-3-5-sonnet-20240620-v1:0",
        search_algorithm="VECTORDISTANCE",
        top_k=10,
        object_names="tbl_testing",
        data_columns=["chunks"],
        vector_column="VectorIndex",
        chunk_size=100,
        optimized_chunking=False,
        document_files=files,
    )
else:
    print("Our Vector Store Database already exists!")

<p style = 'font-size:16px;font-family:Arial'>Check the current status of the <b>Teradata Vector Store</b> after it has been created. This step ensures that the Vector Store has been successfully initialized and is ready for processing queries. <br>
</p>

<p style = 'font-size:16px;font-family:Arial'>
This cell checks the status every 15 seconds. Move on to the next cell when the status shows as <b>"READY"</b>.
</p>

In [None]:
df = document_vector_store.status()

while True:
    if df.loc[0, 'status'] == 'READY':
        break
    else:
        print(f"Current status: {df.loc[0, 'status']}. Waiting 15 seconds...")
        time.sleep(15)
        df = document_vector_store.status()

print(f"The Vector Store Database: {df.loc[0,'vs_name']} is {df.loc[0, 'status']}!")
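
<p style = 'font-size:16px;font-family:Arial'>The loop above waits indefinitely if the status never becomes <b>"READY"</b>. The sketch below shows the same polling pattern with an upper time limit. The helper <code>wait_for_status</code>, its parameters, and the default timeout are assumptions made for illustration. A stand-in status function replaces the real vector store so the example runs on its own.</p>

```python
import time

def wait_for_status(get_status, target="READY", interval=15, timeout=1800, sleep=time.sleep):
    """Poll get_status() until it returns target; raise TimeoutError after timeout seconds."""
    waited = 0
    while True:
        current = get_status()
        if current == target:
            return waited
        if waited >= timeout:
            raise TimeoutError(f"Status is still {current!r} after {waited} seconds")
        sleep(interval)
        waited += interval

# Stand-in for the real status call: reports CREATING twice, then READY.
fake_statuses = iter(["CREATING", "CREATING", "READY"])

# A no-op sleep keeps the demonstration instant; omit it to really wait.
elapsed = wait_for_status(lambda: next(fake_statuses), interval=15, sleep=lambda seconds: None)
print(f"Ready after {elapsed} simulated seconds")
```

<p style = 'font-size:16px;font-family:Arial'>In the notebook, the status function could be <code>lambda: document_vector_store.status().loc[0, 'status']</code>, which is the same lookup the loop above performs.</p>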


<hr style="height:2px;border:none">
<p style = 'font-size:18px;font-family:Arial'><b>5.2 Run_Query</b></p>

<p style = 'font-size:16px;font-family:Arial'>The <code>run_query</code> function answers user queries based on the document content stored in the Teradata Vector Store. It passes the question to the vector store's <code>ask</code> method, which uses the embeddings created from the uploaded documents to retrieve relevant information and provide an answer.</p>

In [None]:
# Function to run a query from the PDF content
def run_query(query: str):
    res = document_vector_store.ask(question=query)
    return res

<hr style="height:2px;border:none">
<p style = 'font-size:18px;font-family:Arial'><b>5.3 Callback</b></p>

<p style = 'font-size:16px;font-family:Arial'>The <code>callback</code> function is responsible for handling the chat messages from the user and providing appropriate responses. It acts as the core mechanism for processing user input and querying the <b>Teradata Vector Store</b> to generate responses based on the uploaded document content.</p>

In [None]:
# Callback function for handling chat messages and providing responses
def callback(contents, user, instance):
    """Handles the chat interaction and returns the response."""
    # Process the contents of the message
    response = run_query(contents) 
    return response
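
<p style = 'font-size:16px;font-family:Arial'>The callback above forwards every message directly to the query function. As an optional variation, the sketch below wraps a query function so that blank messages and exceptions produce a readable reply in the chat. The factory <code>make_guarded_callback</code> and the stub <code>fake_answer</code> are hypothetical. The stub stands in for <code>run_query</code> so the example runs without a database connection.</p>

```python
def make_guarded_callback(answer):
    """Wrap answer(question) so blank input and exceptions yield a readable message."""
    def guarded(contents, user=None, instance=None):
        question = (contents or "").strip()
        if not question:
            return "Please type a question about the uploaded documents."
        try:
            return answer(question)
        except Exception as exc:  # report the failure in the chat instead of a traceback
            return f"Sorry, the query failed: {exc}"
    return guarded

# Stub standing in for run_query; it fails on purpose when asked to.
def fake_answer(question):
    if "fail" in question:
        raise RuntimeError("vector store unavailable")
    return f"Answer to: {question}"

guarded = make_guarded_callback(fake_answer)
print(guarded("  What is attention?  ", "User", None))
print(guarded("   ", "User", None))
print(guarded("please fail", "User", None))
```

<p style = 'font-size:16px;font-family:Arial'>In the notebook, <code>make_guarded_callback(run_query)</code> would return a function with the same three-argument signature as <code>callback</code>.</p>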

<div class="alert alert-block alert-info">
    <p style = 'font-size:16px;font-family:Arial'><i><b>Note:</b> The chatbot accesses multiple components, including databases and LLMs. This may cause a brief delay in responses. Your patience is appreciated.</i></p>
</div>

<hr style="height:2px;border:none">
<b style = 'font-size:20px;font-family:Arial'>6. Create the Chatbot Interface</b>
<p style = 'font-size:16px;font-family:Arial'>The chatbot uses Panel's <code>ChatInterface</code> to handle the user interface for interactions. This interface allows users to input questions and view responses in real-time, providing an intuitive and smooth experience for engaging with the documents.</p>

<p style='font-size:16px;font-family:Arial'>
    Default file uploaded for testing RAG:
</p>

<p style='font-size:16px;font-family:Arial;'>
    <b>File:</b> Attention_is_all_you_need.pdf<br>
    <b>Summary:</b> The 2017 paper by Vaswani et al. introducing the Transformer model,
    a neural network based entirely on self-attention, achieving faster training and state-of-the-art results in machine translation.
</p>

<p style='font-size:16px;font-family:Arial'>
    You can ask the chatbot about anything in the documents you have uploaded, or try one of the sample questions below.
</p>

<ul style="font-size:16px;font-family:Arial"> 
    <li><b>Ask a question:</b>
    <br>What is the main innovation introduced by the Transformer model? <br>
    How does multi-head attention work in the Transformer? <br>
    What were the training results and efficiency improvements reported?
    </li>
</ul>


In [None]:
# Using Panel's ChatInterface for the chatbot UI
pn.chat.ChatInterface(
    callback=callback,
    show_rerun=False,  # Hide rerun button
    show_undo=False,   # Hide undo button
    show_clear=False,  # Hide clear button
    width=800,
    height=400
).servable()

<i>If the chatbot did not respond when you pressed ENTER the first time you ran this demo in your environment, check that you reloaded the page with F5. See the instructions at the top of the notebook.<br>
If you asked a question and got no response after a few minutes, you may need to type 0 0 to restart the kernel and re-run the demo. Questions unrelated to the uploaded documents seem to confuse the chatbot.  </i>

<hr style="height:2px;border:none">
<b style = 'font-size:20px;font-family:Arial'>7. Cleanup</b>
<p style = 'font-size:16px;font-family:Arial'>Call the destroy() method of the VS object to clean up the objects created during this demo.</p>

In [None]:
# Destroy the vector store after use
document_vector_store.destroy()

In [None]:
df = document_vector_store.status()

while True:
    if df is None:
        break
    else:
        print(f"Current status: {df}. Waiting 10 seconds...")
        time.sleep(10)
        df = document_vector_store.status()

print(f"The Vector Store Database has been successfully destroyed!")

In [None]:
remove_context()

<p style = 'font-size:16px;font-family:Arial'><b>Link:</b></p>
<ul style = 'font-size:16px;font-family:Arial'>
    <li>Teradata Enterprise Vector Store: <a href = 'https://docs.teradata.com/search/all?query=Teradata+Enterprise+Vector+Store&content-lang=en-US'>here</a></li>
    
</ul>

<footer style="padding-bottom:35px; border-bottom:3px solid">
    <div style="float:left;margin-top:14px">ClearScape Analytics™</div>
    <div style="float:right;">
        <div style="float:left; margin-top:14px">
            Copyright © Teradata Corporation - 2025. All Rights Reserved
        </div>
    </div>
</footer>