<header>
   <p  style='font-size:36px;font-family:Arial; color:#F0F0F0; background-color: #00233c; padding-left: 20pt; padding-top: 20pt;padding-bottom: 10pt; padding-right: 20pt;'>
       Complaints Summarization Using Amazon Bedrock
  <br>
       <img id="teradata-logo" src="https://storage.googleapis.com/clearscape_analytics_demo_data/DEMO_Logo/teradata.svg" alt="Teradata" style="width: 125px; height: auto; margin-top: 20pt;">
    </p>
</header>

<hr style="height:2px;border:none;background-color:#00233C;">
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Here, we import the required libraries, set environment variables and environment paths (if required).</p>

In [1]:
import numpy as np
import pandas as pd
import timeit
import boto3
from tqdm import tqdm
from teradataml import *
from langchain.llms.bedrock import Bedrock

display.max_rows = 5
pd.set_option('display.max_colwidth', None)

<hr style="height:2px;border:none;background-color:#00233C;">
<b style = 'font-size:20px;font-family:Arial;color:#00233C'>1. Connect to Vantage</b>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>The next cell prompts us for the Vantage password. We enter the password, press the Enter key, and then use the down arrow to move to the next cell.</p>

In [2]:
%run -i ../startup.ipynb
eng = create_context(host = 'host.docker.internal', username='demo_user', password = password)
print(eng)
execute_sql('''SET query_band='DEMO=Complaint_Summarization.ipynb;' UPDATE FOR SESSION;''')

Performing setup ...
Setup complete



Enter password:  ········


... Logon successful
Connected as: xxxxxsql://demo_user:xxxxx@host.docker.internal/dbc
Engine(teradatasql://demo_user:***@host.docker.internal)


TeradataCursor uRowsHandle=15 bClosed=False

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Run each of the following steps by pressing Shift + Enter.</p>

<p style = 'font-size:20px;font-family:Arial;color:#00233C'><b>Getting Data for This Demo</b></p>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>We have provided data for this demo on cloud storage. We can either run the demo using foreign tables, which access the data without using any storage in our environment, or download the data to local storage, which may yield somewhat faster execution but consumes available storage. The following cell contains two statements, one of which is commented out. To switch modes, we move the comment marker from one statement to the other.</p>

In [3]:
# %run -i ../run_procedure.py "call get_data('DEMO_ComplaintAnalysis_cloud');"        # Takes 1 minute
%run -i ../run_procedure.py "call get_data('DEMO_ComplaintAnalysis_local');"        # Takes 2 minutes

That ran for   0:01:35.58 with 10 statements and 0 errors. 


<hr style="height:2px;border:none;background-color:#00233C;">
<b style = 'font-size:20px;font-family:Arial;color:#00233C'>2. Configuring AWS CLI</b>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>The following cells check whether an AWS access key is already configured. If none is found, we are prompted for the following information:</p>
<ol style = 'font-size:16px;font-family:Arial;color:#00233C'>
<li><b>aws_access_key_id</b>: Enter your AWS access key ID</li>
<li><b>aws_secret_access_key</b>: Enter your AWS secret access key</li>
<li><b>region name</b>: Enter the AWS region you want to configure (e.g., us-east-1)</li>
</ol>

In [4]:
import getpass

def configure_aws():
    print("configure the AWS CLI")
    # Prompt for the credentials without echoing them to the screen
    access_key = getpass.getpass("aws_access_key_id: ")
    secret_key = getpass.getpass("aws_secret_access_key: ")
    # The region is not a secret, so a plain prompt lets us see what we type
    region_name = input("region name: ")

    # Store the values in the AWS CLI configuration
    !aws configure set aws_access_key_id {access_key}
    !aws configure set aws_secret_access_key {secret_key}
    !aws configure set default.region {region_name}

In [5]:
# `aws configure get` prints nothing when no access key has been configured
existing_access_key = !aws configure get aws_access_key_id

if len(existing_access_key) == 0:
    configure_aws()

In [6]:
!aws configure list

      Name                    Value             Type    Location
      ----                    -----             ----    --------
   profile                <not set>             None    None
access_key     ****************GXKN shared-credentials-file    
secret_key     ****************u8mf shared-credentials-file    
    region                us-east-1      config-file    ~/.aws/config
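
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>As an alternative to shelling out to the AWS CLI, the same check can be sketched in plain Python by reading the shared credentials file. This is a minimal sketch, not part of the original demo: the helper name <code>has_aws_credentials</code> is hypothetical, it assumes the default file location <code>~/.aws/credentials</code>, and it does not detect credentials supplied through environment variables or other sources.</p>

```python
import configparser
from pathlib import Path


def has_aws_credentials(credentials_path=None, profile="default"):
    """Return True if the shared credentials file holds both keys for `profile`.

    Hypothetical helper for illustration; it only inspects the INI-style
    shared credentials file and ignores every other credential source.
    """
    # Assumed default location of the shared credentials file
    if credentials_path:
        path = Path(credentials_path)
    else:
        path = Path.home() / ".aws" / "credentials"
    if not path.is_file():
        return False

    parser = configparser.ConfigParser()
    parser.read(path)
    if not parser.has_section(profile):
        return False

    section = parser[profile]
    has_key = bool(section.get("aws_access_key_id"))
    has_secret = bool(section.get("aws_secret_access_key"))
    return has_key and has_secret
```

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>With such a helper, the prompt could be skipped whenever <code>has_aws_credentials()</code> returns <code>True</code>.</p>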


<hr style="height:1px;border:none;background-color:#00233C;">
<b style = 'font-size:18px;font-family:Arial;color:#00233C'>2.1 Initialize the Bedrock Model</b>
<ul style = 'font-size:16px;font-family:Arial;color:#00233C'>
<li>The code below initializes a Boto3 client for the “bedrock-runtime” service.</li>
<li>The get_llm() function creates a Bedrock language model with specific configuration options.</li>
<li>The model can be used for natural language generation tasks.</li>
</ul>

In [7]:
# Create a Boto3 client for the "bedrock-runtime" service in the us-east-1 region
bedrock = boto3.client(service_name="bedrock-runtime", region_name='us-east-1')

def get_llm():
    # Create a Bedrock model with specific configuration options
    return Bedrock(
        model_id="ai21.j2-mid-v1",
        client=bedrock,
        model_kwargs={
            'temperature': 0.7,
            'maxTokens': 50,
            'stopSequences': ["$$"],
            'countPenalty': {'scale': 0},
            'presencePenalty': {'scale': 0}
        }
    )

# Get the Bedrock model
ai21 = get_llm()
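
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>To make the configuration options above more concrete, the following sketch shows roughly what a direct call to the "bedrock-runtime" client could look like without the LangChain wrapper. It is an illustration under assumptions, not the demo's method: the helper names <code>build_j2_body</code> and <code>invoke_j2</code> are hypothetical, and the response layout (<code>completions[0].data.text</code>) is assumed from the Jurassic-2 response format.</p>

```python
import json

MODEL_ID = "ai21.j2-mid-v1"  # same model ID as in get_llm()


def build_j2_body(prompt, max_tokens=50, temperature=0.7, stop_sequences=("$$",)):
    """Serialize the prompt and the options used in get_llm() as a JSON string."""
    return json.dumps({
        "prompt": prompt,
        "temperature": temperature,
        "maxTokens": max_tokens,
        "stopSequences": list(stop_sequences),
        "countPenalty": {"scale": 0},
        "presencePenalty": {"scale": 0},
    })


def invoke_j2(client, prompt):
    """Send the prompt through a bedrock-runtime client and return the generated text.

    `client` is expected to be the boto3 "bedrock-runtime" client created above.
    """
    response = client.invoke_model(
        modelId=MODEL_ID,
        body=build_j2_body(prompt),
        contentType="application/json",
        accept="application/json",
    )
    payload = json.loads(response["body"].read())
    # Assumed response layout for Jurassic-2 models
    return payload["completions"][0]["data"]["text"]
```

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>With the client from the cell above, a call would look like <code>invoke_j2(bedrock, "Summarize: ...")</code>. The LangChain wrapper remains the simpler choice for this demo because it hides the request and response handling.</p>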

<hr style="height:2px;border:none;background-color:#00233C;">
<b style = 'font-size:20px;font-family:Arial;color:#00233C'>3. Generate Summaries</b>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>We'll use a sample of 10 rows of the data to generate summaries.</p>

In [8]:
df = DataFrame(in_schema('DEMO_ComplaintAnalysis', 'Consumer_Complaints')).to_pandas(num_rows = 10)
df['Summary'] = ""

In [9]:
for i in tqdm(range(len(df))):
    prompt = f'''
        The following is text from a Bank Review:
        “{df['consumer_complaint_narrative'][i]}”
        Summarize the Bank Review in one sentence
    '''

    # Assign with .loc to avoid pandas chained assignment
    df.loc[i, 'Summary'] = ai21(prompt = prompt)

100%|██████████| 10/10 [00:09<00:00,  1.07it/s]
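
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>The loop above stops at the first failed model call. A more defensive variant might look like the sketch below, which separates prompt construction from the loop and retries failed calls. The names <code>build_prompt</code> and <code>summarize_all</code>, the <code>max_chars</code> limit, and the retry count are all hypothetical choices for illustration; <code>llm</code> can be any callable that takes a prompt string and returns text.</p>

```python
PROMPT_TEMPLATE = (
    "The following is text from a Bank Review:\n"
    "\"{narrative}\"\n"
    "Summarize the Bank Review in one sentence"
)


def build_prompt(narrative, max_chars=4000):
    """Fill the prompt template, truncating very long narratives.

    The max_chars default is an arbitrary illustrative limit.
    """
    text = str(narrative).strip()
    if len(text) > max_chars:
        text = text[:max_chars]
    return PROMPT_TEMPLATE.format(narrative=text)


def summarize_all(narratives, llm, retries=2):
    """Return one summary per narrative; rows that keep failing get an empty string."""
    summaries = []
    for narrative in narratives:
        summary = ""
        for _attempt in range(retries + 1):
            try:
                summary = llm(build_prompt(narrative)).strip()
                break
            except Exception:
                continue  # try again; summary stays empty if every attempt fails
        summaries.append(summary)
    return summaries
```

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Assuming the model object accepts the prompt as its first argument, the loop could then be replaced by <code>df['Summary'] = summarize_all(df['consumer_complaint_narrative'], ai21)</code>.</p>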


In [10]:
df['Summary'] = df['Summary'].apply(lambda x: x.strip())

In [11]:
df[['complaint_id', 'Summary']]

Unnamed: 0,complaint_id,Summary
0,5921107,"The Bank Review states that a Discover credit card customer was behind one payment due to illness/accident, and chose an option to decrease interest rate, but was later informed that the card would be revoked, which was deceptive and fraudulent."
1,8119363,"The Bank Customer is frustrated because the Bank has acknowledged the complaint, but has not acted on the fraudulent account and balance reported on the Customer's credit report."
2,5994654,The Bank Review is about a customer who had problems with Discover Bank sending their payment to the wrong account twice.
3,6048305,"Bank Review states that Discover Card closed an account due to two payments being returned and inability to verify identity, although the payments were returned due to a lost debit card and that the account was settled."
4,3212475,"Bank review: Discover credit was discharged in chapter XXXX bankruptcy, but is now reporting a balance."
5,5694588,"I opened an online savings account with Discover Bank, and they froze my account, and I cannot access my funds because their security number is impossible to to get through."
6,4788307,"The Bank Review states that Discover Financial repeatedly called the customer to pay their credit card, and their personal information was shared without their consent, resulting in a dip in credit score."
7,3298931,"The Bank Review is concerned about a Discover Credit Card company giving a XXXX senior on a fixed income a {$30000.00} credit line, and suggests that seniors are easy targets for scams."
8,4737435,"Discover has neglected abiding by federal law, and has sent debt collection notices to a customer's address without the customer's permission."
9,4681764,"The Bank Review is a complaint that the customer requested an investigation and the removal of the charge-off status, but the bank did not review the matter or make the appropriate corrections."
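
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Because the model is configured with <code>maxTokens</code> set to 50, a summary could in principle be cut off before the sentence ends. The sketch below is a simple heuristic for flagging such rows; the helper name <code>looks_truncated</code> is hypothetical, and the check only looks at the final punctuation, so it may misjudge some summaries.</p>

```python
def looks_truncated(summary):
    """Heuristic: flag summaries that are empty or do not end in sentence punctuation."""
    text = summary.strip()
    if not text:
        return True
    # Accept a closing quote directly after the final period as well
    return not text.endswith((".", "!", "?", '."', ".'"))
```

<p style = 'font-size:16px;font-family:Arial;color:#00233C'>For example, <code>df[df['Summary'].apply(looks_truncated)]</code> would list the rows worth reviewing or regenerating with a larger token limit.</p>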


<p style = 'font-size:16px;font-family:Arial;color:#00233C'>Save the results back to Vantage.</p>

In [12]:
copy_to_sql(df = df, table_name = 'Complaints_Summaries', if_exists = 'replace')

<hr style="height:2px;border:none;background-color:#00233C;">
<b style = 'font-size:20px;font-family:Arial;color:#00233C'>4. Cleanup</b>

<p style = 'font-size:18px;font-family:Arial;color:#00233C'> <b>Databases and Tables </b></p>
<p style = 'font-size:16px;font-family:Arial;color:#00233C'>The following code will clean up tables and databases created above.</p>

In [13]:
%run -i ../run_procedure.py "call remove_data('DEMO_ComplaintAnalysis');"        # Takes 10 seconds

Removed objects related to DEMO_ComplaintAnalysis. That ran for 0:00:02.22


In [14]:
remove_context()

True

<footer style="padding-bottom:35px; background:#f9f9f9; border-bottom:3px solid #00233C">
    <div style="float:left;margin-top:14px">ClearScape Analytics™</div>
    <div style="float:right;">
        <div style="float:left; margin-top:14px">
            Copyright © Teradata Corporation - 2024. All Rights Reserved
        </div>
    </div>
</footer>