<header>
   <p  style='font-size:36px;font-family:Arial; color:#F0F0F0; background-color: #00233c; padding-left: 20pt; padding-top: 20pt;padding-bottom: 10pt; padding-right: 20pt;'>
       Regulatory compliance and customer complaint analytics powered by ClearScape Analytics In-db functions, BYO-LLM, and GPU compute
  <br>
       <img id="teradata-logo" src="images/TeradataLogo.png" alt="Teradata" style="width: 125px; height: auto; margin-top: 20pt;">
    </p>
</header>
<hr>
<p style = 'font-size:28px;font-family:Arial;color:#00233C'><b>Use case</b></p>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>For financial institutions, failure to comply with regulations can lead to severe financial penalties and reputational damage. Monitoring employee-client communications helps to prevent irresponsible behavior. Institutions are also required to manage customer complaints efficiently.
In this demo you will see how to use the power of ClearScape Analytics In-db functions, BYO-LLM, and GPU compute to analyze text related to irresponsible behavior and to run semantic search on customer complaints.</p>

<p style = 'font-size:28px;font-family:Arial;color:#00233C'><b>Demonstration Overview</b></p>

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>The following demonstration illustrates an operationalized end-to-end process of utilizing VantageCloud Lake <b>GPU-enabled Analytic Cluster</b> architecture to run open-source large language models at massive parallelism and scale.</p>

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>This notebook illustrates the final step of a GPU-augmented analytic pipeline.  In the previous demonstration, we reviewed:</p>
<ol style = 'font-size:16px;font-family:Arial;color:#00233C'>
    <li><b>Container Management</b>.  Administrators can create and manage <b>secure, custom</b> runtime containers that will host any number of models and model artifacts to unlock GPU-augmented analytics</li>
    <li><b>Data Prep with Vector Embeddings</b>. Developers will use the Hugging Face BAAI/bge-small-en-v1.5 model to generate vector embeddings for two data sets:<ul style = 'font-size:16px;font-family:Arial;color:#00233C'>
        <li><b>Consumer Complaints</b>. A subset of the Consumer Financial Protection Bureau Complaints <a href = 'https://catalog.data.gov/dataset/consumer-complaint-database'>database</a> data has been loaded into the VantageCloud Lake database for this demonstration</li><li><b>Semantic Search Terms</b>.  A user-defined list of natural language topics to use to perform a Semantic Search</li></ul></li>
    </ol>

<p style = 'font-size:20px;font-family:Arial;color:#00233C'><b>Operationalization with interactive search</b></p>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>In this final demonstration, we will illustrate two interactive use cases for the semantic embeddings generated above:</p>
<ol style = 'font-size:16px;font-family:Arial;color:#00233C'>
    <li><b>Calculate Semantic Similarity</b> for all pre-staged search or compliance topics.  Generate an interactive UI to allow rapid browsing of top search results</li>
    <li><b>Execute the pipeline on-demand</b>.  Create an interactive experience that performs the Hugging Face embeddings, vector distance, and data formatting all in a single call to the analytic engine</li>
    </ol>
<hr>

<p style = 'font-size:18px;font-family:Arial;color:#00233C'><b>Python Package Installation</b></p>

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>If necessary, install required client packages for the demonstrations.  User may need to restart the Jupyter kernel after installation.</p> 

In [None]:
%pip install -r requirements.txt

<hr>

<p style = 'font-size:18px;font-family:Arial;color:#00233C'><b>Python Package Imports</b></p>

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Execute this cell to import the packages for Teradata automation as well as the machine learning, analytics, utility, and data management packages used in this notebook.</p> 

In [None]:
from teradataml import *
from oaf_utils import *
from teradatasqlalchemy.types import *
from time import sleep
import pandas as pd
import csv, json, sys, os, warnings
from os.path import expanduser
from collections import OrderedDict
import ipywidgets as widgets
from wordcloud import WordCloud, STOPWORDS

from IPython.display import clear_output , display as ipydisplay
import matplotlib.pyplot as plt
from itables import init_notebook_mode
import itables.options as opt

# Set display options for dataframes, plots, and warnings

opt.style="table-layout:auto;width:auto;float:left"
opt.columnDefs = [{"className": "dt-left", "targets": "_all"}]
init_notebook_mode(all_interactive=True)
%matplotlib inline
warnings.filterwarnings('ignore')
display.suppress_vantage_runtime_warnings = True

# load vars json
with open('vars_gpu_SEP_24.json', 'r') as f:
    session_vars = json.load(f)

# Database login information
host = session_vars['environment']['host']
username = session_vars['hierarchy']['users']['business_users'][1]['username']
password = session_vars['hierarchy']['users']['business_users'][1]['password']

# UES Authentication information
ues_url = session_vars['environment']['UES_URI']
configure.ues_url = ues_url
pat_token = session_vars['hierarchy']['users']['business_users'][1]['pat_token']
pem_file = session_vars['hierarchy']['users']['business_users'][1]['key_file']


compute_group = session_vars['hierarchy']['users']['business_users'][1]['compute_group']


# container name - set here for easier notebook navigation
### User will also be asked to change it ###
oaf_name = 'embeddings_env_demo'
###########################
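
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>The cell above assumes a particular layout for the variables file.  As a minimal sketch, the check below lists any expected keys that are missing before a connection is attempted.  The helper name <code>missing_session_keys</code> and the sample values are hypothetical; only the key names are taken from the lookups in the cell above.</p>

```python
# Hypothetical helper: report keys that the configuration cell reads
# but that are absent from the loaded JSON document.
USER_KEYS = ['username', 'password', 'pat_token', 'key_file', 'compute_group']
ENV_KEYS = ['host', 'UES_URI']

def missing_session_keys(session_vars, user_index=1):
    missing = []
    env = session_vars.get('environment', {})
    missing += [f'environment.{k}' for k in ENV_KEYS if k not in env]
    users = session_vars.get('hierarchy', {}).get('users', {}).get('business_users', [])
    if len(users) <= user_index:
        # The notebook reads the second business user (index 1)
        missing.append(f'hierarchy.users.business_users[{user_index}]')
    else:
        missing += [f'business_users[{user_index}].{k}'
                    for k in USER_KEYS if k not in users[user_index]]
    return missing

# Illustrative input only; real values come from the variables file.
sample = {
    'environment': {'host': 'example-host'},
    'hierarchy': {'users': {'business_users': [{}, {'username': 'u', 'password': 'p'}]}},
}
print(missing_session_keys(sample))
```

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>An empty list means every key the notebook reads is present; it does not confirm that the values themselves are valid.</p>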

<hr>
   <p  style='font-size:36px;font-family:Arial; color:#F0F0F0; background-color: #00233c; padding-left: 20pt; padding-top: 20pt;padding-bottom: 10pt; padding-right: 20pt;'>
       Operationalizing interactive AI-powered analytics
  <br>
    </p>

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>The following demonstration will illustrate how developers can <b>operationalize</b> end-to-end GPU-augmented processing, enabling the entire organization to leverage AI across the data lifecycle.  The following steps illustrate how to:</p>

<table style = 'width:100%;table-layout:fixed;'>
    <tr>
        <td style = 'vertical-align:top' width = '30%'>
           <ol style = 'font-size:16px;font-family:Arial;color:#00233C'>
               <li><b>Calculate Semantic Similarity</b>.  Execute native vector distance functions to check similarity between the search topics and complaints embeddings</li>
            <br>
            <br>
               <li><b>Examine Results</b>.  Generate a data table and word cloud to check for other contextual patterns</li>
            <br>
            <br>
               <li><b>Create an interactive experience</b>.  Create an operational pipeline to take interactive semantic search terms</li>
        </ol>
        </td>
        <td width = '20%'></td>
        <td style = 'vertical-align:top'><img src = 'images/cos_similarity.jpeg' width = 350 ></td>
    </tr>
</table>


<hr>

<p style = 'font-size:18px;font-family:Arial;color:#00233C'><b>Check connection</b></p>

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Connect to the database and UES, and start the cluster if necessary.</p> 

In [None]:
# check for existing connection
eng = check_and_connect(host=host, username=username, password=password, compute_group = compute_group)
print(eng)
    

# check to see if there is a valid UES auth
# if not, authenticate
try:
    demo_env = get_env(oaf_name)

except Exception as e:
    if '''NoneType' object has no attribute 'value''' in str(e): #UES auth expired/required
        if set_auth_token(ues_url = ues_url, username = username, pat_token = pat_token, pem_file = pem_file):
            print('UES Authentication successful')
            try:
                demo_env = get_env(oaf_name)
                pass
            except Exception as l:
                if f'''User environment '{oaf_name}' not found''' in str(l):
                    print('User environment not found')
                    pass
                else:
                    raise
        else:
            print('UES Authentication failed, check URL and account info')
        pass
    elif f'''User environment '{oaf_name}' not found''' in str(e):
        print('User environment not found')
        pass
    else:
        raise
try:
    ipydisplay(list_user_envs())
except Exception as e:
    if 'No user environments found' in str(e):
        print('No user environments found')
        pass
    else:
        raise

print('Use an existing environment')
print(f'OAF Environment is set to {oaf_name}.')
print('Enter to accept, or input a new value.')
i = input()
if len(i) != 0:
    oaf_name = i
    demo_env = get_env(oaf_name)
    print(f'OAF Environment is now {oaf_name}')
    

# check cluster status
check_cluster_start(compute_group = compute_group)

<hr>

<p style = 'font-size:20px;font-family:Arial;color:#00233C'><b>1.  Calculate semantic similarity</b></p>

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>First, create a reference to the prepared data sets.  Next, execute the native <b>ClearScape Analytics</b> VectorDistance function.</p> 

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>The <a href = 'https://docs.teradata.com/r/Enterprise_IntelliFlex_VMware/Database-Analytic-Functions/Model-Training-Functions/TD_VectorDistance'>TD_VectorDistance</a> function accepts a table of target vectors and a table of reference vectors and returns a table that contains the distance between target-reference pairs. If you provide only one table as input, the function computes the distances between target and reference pairs drawn from that same table.  This function can be called using the teradataml Python package, or native SQL.</p>
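
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>To make the calculation concrete, the sketch below computes cosine distance between small target and reference vectors in plain Python and keeps the closest reference for each target.  It assumes cosine distance is defined as one minus cosine similarity, which is the convention implied by the <code>1 - distance</code> similarity calculation used in this notebook.  The function names and toy vectors are illustrative and are not part of any Teradata API.</p>

```python
import math

def cosine_distance(u, v):
    # 1 - cos(angle between u and v); 0 means the vectors point the same way
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def nearest_references(targets, references, topk=1):
    # For each target id, return the topk (reference_id, distance) pairs,
    # closest first
    result = {}
    for t_id, t_vec in targets.items():
        pairs = [(r_id, cosine_distance(t_vec, r_vec))
                 for r_id, r_vec in references.items()]
        result[t_id] = sorted(pairs, key=lambda p: p[1])[:topk]
    return result

# Toy 3-dimensional vectors; the demonstration uses 384 dimensions.
targets = {101: [1.0, 0.0, 0.0], 102: [0.0, 1.0, 1.0]}
references = {1: [2.0, 0.0, 0.0], 2: [0.0, 3.0, 3.0]}
for t_id, matches in nearest_references(targets, references).items():
    print(t_id, matches)
```

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Because cosine distance ignores vector length, each toy target is matched to the reference that points in the same direction, even though the magnitudes differ.</p>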

In [None]:
# check for the required tables
# if any are missing, run the appropriate sections of the 
# first demo notebook

# Required tables
reqs = ['CFPB_embeddings', 'topics_embeddings', 'topics_of_interest', 'CFPB_Complaints']

# tables in the database demo_ofs
tbls = db_list_tables(schema_name = 'demo_ofs')['TableName'].to_list()

if len(set(reqs).intersection(tbls)) == len(reqs):
    print('Required tables exist')
else:
    print('The following tables are missing. Please run the previous notebook to generate them:')
    ipydisplay(set(reqs) - set(tbls))

In [None]:
print('Complaints embeddings:')
tdf_target = DataFrame('"demo_ofs"."CFPB_embeddings"')
ipydisplay(tdf_target.sample(1))
print('Topics embeddings:')
tdf_reference = DataFrame('"demo_ofs"."topics_embeddings"')
ipydisplay(tdf_reference.sample(1))

print('Calculating vector distance')
feature_list = ['"' + str(i) + '"' for i in range(384)]
res_tdf = VectorDistance(target_data = tdf_target, 
                   reference_data = tdf_reference, 
                   target_id_column = 'id',
                   target_feature_columns = feature_list,
                   ref_id_column = 'id',
                   ref_feature_columns = feature_list,
                   distance_measure = 'cosine',
                   topk = 1
                  ).result
print('Analysis complete, raw output: ')
res_tdf

<hr>

<p style = 'font-size:20px;font-family:Arial;color:#00233C'><b>2.  Create user-friendly results</b></p>

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>First, use various teradataml methods to join the original comments data to the distance calculations.</p> 

In [None]:
# Get DataFrame references to the original data sets
tdf_topics = DataFrame('"demo_ofs"."topics_of_interest"')
tdf_full = DataFrame.from_query('''SELECT id,
    CASE 
        WHEN txt IS NULL THEN ' '
        ELSE regexp_replace(regexp_replace(regexp_replace(regexp_replace(regexp_replace(txt , X'0d' , ' ') , X'0a' , ' ') , X'09', ' '), ',', ' '), '"', ' ')
    END text
    FROM demo_ofs.CFPB_Complaints;''')

print('Joining topics text to distance calculations')
tdf_final = res_tdf.join(tdf_topics, on = 'reference_id = id')
print('Joining comments text to distance calculations')
tdf_final = tdf_final.join(tdf_full, on = 'target_id = id', lsuffix = 'l')
tdf_final = tdf_final.assign(similarity = 1 - tdf_final['distance'])
tdf_final = tdf_final.assign(topic = tdf_final['txt'])
tdf_final = tdf_final.assign(comment = tdf_final['text'])
tdf_final = tdf_final.assign(topic_id = tdf_final['reference_id'])
tdf_final = tdf_final.drop(['target_id', 'reference_id', 'id_l', 'txt', 'text'], axis = 1)

print('Writing final results to a table')
copy_to_sql(tdf_final, table_name = 'similarity_results', schema_name = 'demo_ofs', if_exists = 'replace')
ipydisplay(DataFrame('"demo_ofs"."similarity_results"').head(2).to_pandas())
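
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>The nested <code>regexp_replace</code> calls in the query above replace carriage returns, line feeds, tabs, commas, and double quotes with spaces, and map NULL text to a single space.  For readers more familiar with Python, a client-side equivalent might look like the sketch below.  It assumes each SQL call replaces every occurrence of its pattern; the function name <code>clean_text</code> is hypothetical.</p>

```python
import re

def clean_text(txt):
    # NULL (None) becomes a single space, as in the CASE expression
    if txt is None:
        return ' '
    # Replace CR, LF, tab, comma, and double quote with a space
    return re.sub(r'[\r\n\t,"]', ' ', txt)

print(repr(clean_text('Late fee,\r\n"charged" twice\tin May')))
```

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Doing this cleanup in the database, as the notebook does, avoids moving the raw complaint text to the client.</p>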

<hr>

<p style = 'font-size:18px;font-family:Arial;color:#00233C'><b>Visualize results</b></p>

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Use client-side widgets to provide an interactive experience.  Return ten rows of selected results <b>only</b>.  Then, generate a Word Cloud to represent frequent terms.</p> 

In [None]:
tdf_results = DataFrame('"demo_ofs"."similarity_results"')
topics_list = DataFrame('"demo_ofs"."topics_of_interest"').to_pandas()['txt'].to_list()

d = widgets.Dropdown(options=topics_list)

output = widgets.Output()

def on_change(change):
    if change['type'] == 'change' and change['name'] == 'value':
        with output:
            clear_output()
            local_df = tdf_results[tdf_results['topic'] == change['new']].set_index('distance').iloc[:10].to_pandas().drop('topic', axis = 1)
            ipydisplay(local_df)
            text = " ".join(comment for comment in local_df['comment'])
            stopwords = set(STOPWORDS)
            wordcloud = WordCloud(stopwords = stopwords, width = 800, height=400, 
                                  max_words=100, min_word_length=1, 
                                  collocations = True, background_color = 'white').generate(text)
            plt.figure(figsize=(10, 5))
            plt.imshow(wordcloud, interpolation='gaussian')
            plt.axis('off')
            plt.title('Word Cloud for Search Topic')
            plt.show()


d.observe(on_change)

print('Select Topic to Search:')
ipydisplay(d, output)
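
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>A word cloud sizes terms roughly by how often they occur once stop words are removed.  The sketch below shows that underlying count using only the Python standard library.  The short stop-word set and sample comments are made up for illustration; the <code>wordcloud</code> package uses its own, larger stop-word list and can also count two-word phrases.</p>

```python
from collections import Counter
import re

# Small illustrative stop-word list (an assumption, not the wordcloud list)
STOP = {'the', 'a', 'to', 'my', 'and', 'was', 'i', 'on', 'of'}

def top_terms(comments, n=3):
    # Lower-case, split into alphabetic tokens, drop stop words, count the rest
    words = re.findall(r"[a-z']+", ' '.join(comments).lower())
    counts = Counter(w for w in words if w not in STOP)
    return counts.most_common(n)

comments = [
    'The bank charged a late fee on my account',
    'I was charged a fee twice and the bank refused to refund the fee',
]
print(top_terms(comments))
```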


<hr>
<p style = 'font-size:20px;font-family:Arial;color:#00233C'><b>3.  Interactive Semantic Search</b></p>

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>The following demonstration illustrates how to <b>operationalize</b> the combined Hugging Face embeddings and native analytics performed in the steps above.  In this section, the Python methods are replaced with SQL, which allows for broad adoption and re-use by many tools and applications.  The steps are as follows:</p>

<ul style = 'font-size:16px;font-family:Arial;color:#00233C'>
    <li><b>Construct the SQL pipeline</b>.  This query will combine all the analytic steps above into a single expression:<ol style = 'font-size:16px;font-family:Arial;color:#00233C'><li>Embedding incoming search term using the Hugging Face model</li>
        <li>Passing the vector values to the TD_VectorDistance function</li>
        <li>Joining the distance measurements to the original comments</li>
        <li>Returning search results back to the user</li></ol></li>
    <li><b>User</b> submits a search term to the query</li>
    <li><b>Responses</b> are sent back to the client</li>
    </ul>
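
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>The data flow of the steps above can be sketched in plain Python.  The <code>toy_embed</code> function below is a deliberately crude stand-in for the Hugging Face model (it only counts a few fixed vocabulary words) and exists solely to make the flow runnable; the names, vocabulary, and sample complaints are all hypothetical.  In the actual pipeline the complaint embeddings are precomputed and the comparison runs in the database.</p>

```python
import math

# Stand-in for the Hugging Face model: counts of a few fixed vocabulary words.
# This is NOT a semantic embedding; it only makes the data flow runnable.
VOCAB = ['fee', 'loan', 'card', 'fraud', 'payment']

def toy_embed(text):
    words = text.lower().split()
    return [float(words.count(v)) for v in VOCAB]

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def search(term, complaints, topk=2):
    # Step 1: embed the incoming search term
    query_vec = toy_embed(term)
    # Step 2: compare the term vector with each complaint vector
    scored = [(cid, cosine_similarity(toy_embed(txt), query_vec))
              for cid, txt in complaints.items()]
    scored.sort(key=lambda s: s[1], reverse=True)
    # Steps 3 and 4: join back to the complaint text and return the best matches
    return [(cid, complaints[cid], sim) for cid, sim in scored[:topk]]

complaints = {
    1: 'unexpected fee on my card',
    2: 'loan payment was lost',
    3: 'fraud on my card',
}
for row in search('card fraud', complaints):
    print(row)
```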

<hr>

<p style = 'font-size:18px;font-family:Arial;color:#00233C'><b>Create an interactive search experience</b></p>

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Use this query in a dynamic UI - this could be used in a BI tool or other application.</p> 

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>This first cell defines functions we will use with the Jupyter Widgets</p>

In [None]:
def build_search(term, oaf_name):
    # Escape single quotes so the search term is a valid SQL string literal
    term = term.replace("'", "''")
    # One FLOAT output column per embedding dimension ("0" through "383")
    returns_cols = ', '.join(f'"{i}" FLOAT' for i in range(384))
    return f'''
WITH aq AS (SELECT * FROM Apply(
    ON (SELECT 1 as id, '{term}' as txt)
    PARTITION BY ANY
    
    returns(id BIGINT, {returns_cols})
    USING
    
    APPLY_COMMAND('python embedding_bge_small_en.py')
    ENVIRONMENT('{oaf_name}')
    STYLE('csv')
    delimiter(',') 
) as d),

dt AS (SELECT TOP 10 Target_ID, Distance, 1-Distance AS Similarity FROM TD_VectorDistance(
    ON demo_ofs.CFPB_embeddings AS TargetTable
    PARTITION BY ANY 
    ON aq AS ReferenceTable
    DIMENSION
    USING
    TargetIDColumn('id')
    TargetFeatureColumns('[1:384]')
    RefIDColumn('id')
    RefFeatureColumns('[1:384]')
    DistanceMeasure('cosine')
    TopK(1)
) as dst
ORDER BY Similarity DESC)

SELECT dt.Target_ID, cpl.txt, dt.Similarity
FROM demo_ofs.CFPB_Complaints cpl
 INNER JOIN  dt
     ON dt.Target_ID = cpl.id
ORDER BY dt.Similarity DESC;
'''

def on_button_clicked(b):
    if len(t.value) > 0:
        with o:
            clear_output()
            print('Starting Search...')
            end_to_end = build_search(t.value, oaf_name)
            #print('Starting Search...')
            local_df = pd.read_sql(end_to_end, eng)[['target_id', 'Similarity', 'txt']]
            ipydisplay(local_df)
            text = " ".join(comment for comment in local_df['txt'])
            stopwords = set(STOPWORDS)
            wordcloud = WordCloud(stopwords = stopwords, width = 800, height=400, 
                                  max_words=100, min_word_length=1, 
                                  collocations = True, background_color = 'white').generate(text)
            plt.figure(figsize=(10, 5))
            plt.imshow(wordcloud, interpolation='gaussian')
            plt.axis('off')
            plt.title('Word Cloud for Search Topic')
            plt.show()



<hr>

<p style = 'font-size:18px;font-family:Arial;color:#00233C'><b>Launch the UI</b></p>

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Enter a term and click "Search".  Try misspellings to illustrate this is semantic vs. literal search.</p> 

In [None]:
print('Enter Search Term: ')
t = widgets.Text(
    value='',
    placeholder='Type something')
b  = widgets.Button(
    description='Search',
    tooltip='Search')
o = widgets.Output()

ipydisplay(t, b, o)
b.on_click(on_button_clicked)

<hr>
<p style = 'font-size:24px;font-family:Arial;color:#00233C'><b>Conclusion - Operationalizing AI-powered analytics</b></p>



<p style = 'font-size:16px;font-family:Arial;color:#00233C'>The preceding demo showed two methods for operationalizing model execution: using Python syntax, or embedding it in a single SQL query.  The former allows developers and data scientists to easily embed this processing in their existing or new applications and workflows.  The latter allows for broad, democratized adoption across the data lifecycle and enterprise, enabling this analytic processing in ETL for data prep and transformation tasks, and in production to power dashboards and/or BI tools.</p>

<hr>
<p style = 'font-size:18px;font-family:Arial;color:#00233C'><b>Cleanup</b></p>

<ol style = 'font-size:16px;font-family:Arial;color:#00233C'><li>Stop the cluster</li>
    <li><b>Not required</b>, but if desired remove the environment and drop tables created in the demonstrations</li>
    <li>Disconnect from the database.</li>
    </ol>

In [None]:
res = check_cluster_stop(compute_group)

In [None]:
# uninstall the libraries from the environment first before removing it
# demo_env.uninstall_lib(libs = demo_env.libs['name'].to_list())
# remove_env(oaf_name)

In [None]:
# Remove the tables created during this demo

# db_drop_table('similarity_results', schema_name = 'demo_ofs')

In [None]:
remove_context()