# Post Call Analytics

Welcome to this training module on post-call analytics use cases using Amazon SageMaker JumpStart. 

As businesses continue to interact with customers through various channels, it becomes increasingly important to analyze these interactions to gain insights into customer behavior and preferences. Post-call analytics is one such method that involves analyzing customer interactions after the call has ended. The use of large language models can greatly enhance the effectiveness of post-call analytics by enabling more accurate sentiment analysis, identifying specific customer needs and preferences, and improving overall customer experience. 

In this sample notebook, we explore the following topics to demonstrate how large language models benefit post-call analytics and help businesses gain a competitive edge in the modern marketplace.

- Choice of LLM models in Amazon SageMaker JumpStart
- One model handling multiple PCA tasks
- Handling long call transcripts

## Step 0. Install packages

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
import sys
sys.path.append('../utils')
sys.path.append('../templates')

In [3]:
install_needed = True  # should only be True once

In [None]:
import sys
import IPython

if install_needed:
    print("installing deps and restarting kernel")
    !{sys.executable} -m pip install -U pip
    !{sys.executable} -m pip install -U termcolor
    !{sys.executable} -m pip install -U langchain

    IPython.Application.instance().kernel.do_shutdown(True)

## Step 1. Prepare Large Language Model (LLM)

In [5]:
import boto3
from termcolor import colored
from sagemaker.session import Session
from langchain.llms import AmazonAPIGateway
from lib_en import Llama2ContentHandlerAmazonAPIGateway, FalconContentHandlerEndpoint, FalconContentHandlerAmazonAPIGateway



In [6]:
sagemaker_session = Session()
aws_role = sagemaker_session.get_caller_identity_arn()
aws_region = boto3.Session().region_name

In [7]:
MODEL_NAME = "FALCON-40B" #LLAMA2-7B, FALCON-40B

In [8]:
#RESTAPI_ID = "6bk4r5mo4f" ## us-east-1
RESTAPI_ID = "tgegz13tr1" ## us-west-2

URL = f'https://{RESTAPI_ID}.execute-api.{aws_region}.amazonaws.com/api/'.replace('"','')
LLM_INFO = {
    "FALCON-40B": f"{URL}llm/falcon_40b",
    "LLAMA2-7B": f"{URL}llm/llama2_7b",
}
LLM_URL = LLM_INFO[MODEL_NAME]
HEADERS = {    
    'Content-Type': 'application/json',
    'Accept': 'application/json',
}

print (f'MODEL_NAME: {MODEL_NAME}\nLLM_URL: {LLM_URL}')

MODEL_NAME: FALCON-40B
LLM_URL: https://tgegz13tr1.execute-api.us-west-2.amazonaws.com/api/llm/falcon_40b


In [9]:
llm = AmazonAPIGateway(api_url=LLM_URL, headers=HEADERS)

if MODEL_NAME == "FALCON-40B":
    llm.content_handler = FalconContentHandlerAmazonAPIGateway()
elif MODEL_NAME == "LLAMA2-7B":
    llm.content_handler = Llama2ContentHandlerAmazonAPIGateway()
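
The content handlers imported from `lib_en` are not shown in this notebook. As a rough sketch of the role such a handler plays, the hypothetical class below turns a prompt plus generation parameters into a request body and pulls the generated text out of a decoded response. The payload shape (`inputs` and `parameters` going in, a list with a `generated_text` field coming out) is an assumption modeled on common text-generation endpoints, not the actual `lib_en` implementation.

```python
import json
from typing import Any, Dict, List


class SketchContentHandler:
    """Hypothetical stand-in for a content handler; not the lib_en class."""

    @classmethod
    def transform_input(cls, prompt: str, model_kwargs: Dict[str, Any]) -> Dict[str, Any]:
        # Assumed request shape: the prompt plus the generation parameters.
        return {"inputs": prompt, "parameters": model_kwargs}

    @classmethod
    def transform_output(cls, payload: List[Dict[str, Any]]) -> str:
        # Assumed response shape: a list whose first item holds the text.
        return payload[0]["generated_text"].strip()


# Round trip with a hand-written response, so no endpoint is called.
request_body = SketchContentHandler.transform_input("Hello", {"max_new_tokens": 16})
fake_response = json.loads('[{"generated_text": " Hi there "}]')
generated = SketchContentHandler.transform_output(fake_response)
print(json.dumps(request_body))
print(generated)
```

Keeping the request and response translation in one small class is what lets the same `llm` object switch between models by swapping only the handler.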

## Step 2. Load transcript files

In [10]:
transcript_files = [
    "./call_transcripts/negative-refund.txt",
    "./call_transcripts/neutral-short.txt",
    "./call_transcripts/positive-partial-refund.txt",
    "./call_transcripts/aws-short.txt",
    "./call_transcripts/aws.txt"
]
transcripts = []

for file_name in transcript_files:
    with open(file_name, "r") as file:
        transcripts.append(file.read())

In [11]:
for i, trans in enumerate(transcripts):
    print(f"transcript #{i+1}: {trans[:300]}\n")
    print("====================\n\n")

transcript #1: timestamp: 2022-12-27 08:26:49.219717

Agent: Thank you for calling our retail support line. My name is ABC. How can I assist you today?

Customer: Yes, I have received a defective product, and I am extremely angry about it! This is unacceptable, and I want it resolved immediately!

Agent: I'm sorry



transcript #2: timestamp: 2023-01-28 08:26:49.219717

Customer: Hi, I'd like to check the balance on my account.

Retail Support: Sure thing! Can I have your account number or phone number associated with the account?

Customer: My phone number is (123) 456-7890.

Retail Support: Great, thank you. Let me pull up y



transcript #3: timestamp: 2022-12-28 08:26:49.219717

Agent: Thank you for calling [Retailer], my name is [Agent Name]. How may I assist you today?

Customer: Hi, I wanted to check on the status of my order. It was supposed to arrive today, but I haven't received it yet.

Agent: I'm sorry to hear that. Can I have 



transcript #4: What is AWS? AWS or Amazon W

## Step 3. Post Call Analysis

In [12]:
from langchain import PromptTemplate

### Step 3.1. Prompt Template
This notebook targets four analyses (**Summary, Sentiment, Intent, and Resolution**), and each one needs its own prompt template. Templates for the first three are defined below.

* Summary template

In [13]:
summary_template = """
Analyze the retail support call transcript below. Provide a detail summary of the conversation in complete sentence:

context: {transcript}

summary:"""

* Sentiment template

In [14]:
sentiment_template = """
This is a sentiment analysis program. What is the customer sentiment using following classes 
["POSITIVE", "NEUTRAL", "NEGATIVE"]. classify the conversation into one and exact one of these classes. 
If you don't know or not sure, please use ["NEUTRAL"] class. Do not try to make up a class:

context: {transcript}

sentiment: """

* Intent template

In [15]:
intent_template = """
This is a intent classification program. What is the purpose of the customer call using following classes
["SHIPMENT_DELAY", "COMPLAIN_PRODUCT_DEFECT", "ACCOUNT_QUESTION"]. classify the conversation into one and exact one of these classes.
If you don't know, please use ["UNKNOWN"] class. Do not try to make up a class. 

context: {transcript}

intent: """
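
The sentiment and intent prompts ask the model for exactly one class, but a model may still reply with extra words, brackets, or different casing. A minimal post-processing sketch, assuming the class lists from the templates above, maps a raw reply onto an allowed label and falls back to a default otherwise. The helper name `normalize_label` is hypothetical and not part of the notebook or of LangChain.

```python
import re
from typing import Sequence

SENTIMENT_CLASSES = ("POSITIVE", "NEUTRAL", "NEGATIVE")
INTENT_CLASSES = ("SHIPMENT_DELAY", "COMPLAIN_PRODUCT_DEFECT", "ACCOUNT_QUESTION")


def normalize_label(raw: str, classes: Sequence[str], fallback: str) -> str:
    """Return the first allowed class named in `raw`, else `fallback`."""
    # Split the upper-cased reply into word tokens (underscores kept), so
    # replies with brackets, quotes, or trailing punctuation still match.
    tokens = re.findall(r"[A-Z_]+", raw.upper())
    for token in tokens:
        if token in classes:
            return token
    return fallback


print(normalize_label('["NEGATIVE"]', SENTIMENT_CLASSES, "NEUTRAL"))
print(normalize_label("I am not sure.", SENTIMENT_CLASSES, "NEUTRAL"))
print(normalize_label("complain_product_defect", INTENT_CLASSES, "UNKNOWN"))
```

Applying the fallback in code, rather than relying only on the prompt instruction, keeps downstream aggregation limited to the known label set.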

### Step 3.2. Analysis

In [16]:
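def analysis(llm, transcript, params, template="", max_tokens=50):
    """Fill `template` with `transcript`, send it to `llm`, and return the reply.

    `params` replaces `llm.model_kwargs`; `max_tokens` is currently unused.
    """
    prompt = PromptTemplate(template=template, input_variables=["transcript"])
    analysis_prompt = prompt.format(transcript=transcript)
    llm.model_kwargs = params

    # Echo the full prompt in green so it stands out from the model output.
    print(colored(analysis_prompt, 'green'))

    response = llm(analysis_prompt)

    return response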
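
Because `analysis` only varies the template, one model can serve every post-call task. The sketch below illustrates that idea without calling an endpoint: it loops over a task-to-template mapping and sends each filled prompt to the same callable. The shortened templates, the `run_all_tasks` helper, and the `echo_llm` stub are all hypothetical placeholders, not the notebook's own objects.

```python
from typing import Callable, Dict

# Shortened stand-ins for the notebook's templates (assumed wording).
TASK_TEMPLATES: Dict[str, str] = {
    "summary": "Summarize the call.\n\ncontext: {transcript}\n\nsummary:",
    "sentiment": "Classify the customer sentiment.\n\ncontext: {transcript}\n\nsentiment:",
    "intent": "Classify the purpose of the call.\n\ncontext: {transcript}\n\nintent:",
}


def run_all_tasks(llm_call: Callable[[str], str], transcript: str) -> Dict[str, str]:
    """Send one prompt per task to the same model and collect the replies."""
    results = {}
    for task, template in TASK_TEMPLATES.items():
        prompt = template.format(transcript=transcript)
        results[task] = llm_call(prompt).strip()
    return results


def echo_llm(prompt: str) -> str:
    # Stub model: echoes the final line of the prompt (the task cue).
    return f"<reply for {prompt.splitlines()[-1]}>"


demo = run_all_tasks(echo_llm, "Customer: My order is late.")
for task, reply in demo.items():
    print(task, "->", reply)
```

In the notebook itself, the stub would be replaced by the configured `llm` object, which is also callable with a prompt string.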

In [17]:
PARAMS = {
    "FALCON-40B": {
        "max_new_tokens": 200,
        "max_length": 1024,
        "top_p": 0.95,
        "do_sample": False,
        "temperature": 0.01,
        "return_full_text": False,
        "include_prompt_in_result": False
    },
    "LLAMA2-7B": {
        'max_new_tokens': 128,
        'top_p': 0.9,
        'temperature': 0.1,
        'return_full_text': False
    },
}

* Summary analysis

In [18]:
%%time

res = analysis(
    llm=llm,
    transcript=transcripts[0],
    params=PARAMS[MODEL_NAME],
    template=summary_template
)

print (res)

Analyze the retail support call transcript below. Provide a detail summary of the conversation in complete sentence:

context: timestamp: 2022-12-27 08:26:49.219717

Agent: Thank you for calling our retail support line. My name is ABC. How can I assist you today?

Customer: Yes, I have received a defective product, and I am extremely angry about it! This is unacceptable, and I want it resolved immediately!

Agent: I'm sorry to hear that you received a defective product. Can you please let me know what the issue is?

Customer: The product I received is broken and unusable. I spent a lot of money on it, and now I can't even use it! This is unacceptable, and I demand a solution right now!

Agent: I completely understand your frustration, and I'm sorry for any inconvenience this has caused you. Can you please provide me with your order number so that I can look into this for you?

Customer: 2357894561

Agent: Thank you. I'm sorry to hear about the defective product you received. We c

* Sentiment analysis

In [19]:
%%time

res = analysis(
    llm=llm,
    transcript=transcripts[0],
    params=PARAMS[MODEL_NAME],
    template=sentiment_template
)

print (res)

This is a sentiment analysis program. What is the customer sentiment using following classes 
["POSITIVE", "NEUTRAL", "NEGATIVE"]. classify the conversation into one and exact one of these classes. 
If you don't know or not sure, please use ["NEUTRAL"] class. Do not try to make up a class:

context: timestamp: 2022-12-27 08:26:49.219717

Agent: Thank you for calling our retail support line. My name is ABC. How can I assist you today?

Customer: Yes, I have received a defective product, and I am extremely angry about it! This is unacceptable, and I want it resolved immediately!

Agent: I'm sorry to hear that you received a defective product. Can you please let me know what the issue is?

Customer: The product I received is broken and unusable. I spent a lot of money on it, and now I can't even use it! This is unacceptable, and I demand a solution right now!

Agent: I completely understand your frustration, and I'm sorry for any inconvenience this has caused you. Can you please pro

* Intent analysis

In [20]:
%%time

res = analysis(
    llm=llm,
    transcript=transcripts[0],
    params=PARAMS[MODEL_NAME],
    template=intent_template
)

print (res)

This is a intent classification program. What is the purpose of the customer call using following classes
["SHIPMENT_DELAY", "COMPLAIN_PRODUCT_DEFECT", "ACCOUNT_QUESTION"]. classify the conversation into one and exact one of these classes.
If you don't know, please use ["UNKNOWN"] class. Do not try to make up a class. 

context: timestamp: 2022-12-27 08:26:49.219717

Agent: Thank you for calling our retail support line. My name is ABC. How can I assist you today?

Customer: Yes, I have received a defective product, and I am extremely angry about it! This is unacceptable, and I want it resolved immediately!

Agent: I'm sorry to hear that you received a defective product. Can you please let me know what the issue is?

Customer: The product I received is broken and unusable. I spent a lot of money on it, and now I can't even use it! This is unacceptable, and I demand a solution right now!

Agent: I completely understand your frustration, and I'm sorry for any inconvenience this has 

## Step 4. Handling long call transcripts
This step shows how to summarize transcripts that are too long to fit in a single LLM request by splitting them into chunks and combining the per-chunk summaries.
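
To make the chunking idea concrete, here is a minimal sketch of fixed-size splitting with overlap in plain Python. It is a simplified illustration only: the cells that follow use LangChain's `RecursiveCharacterTextSplitter`, which splits on separators rather than at fixed character offsets. The function name `split_with_overlap` is hypothetical; its default sizes mirror the `chunk_size=500` and `chunk_overlap=100` values used later in this notebook.

```python
from typing import List


def split_with_overlap(text: str, chunk_size: int = 500, overlap: int = 100) -> List[str]:
    """Cut `text` into character windows that share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "".join(str(i % 10) for i in range(1200))
pieces = split_with_overlap(sample)
print([len(piece) for piece in pieces])
```

The overlap repeats the tail of one chunk at the head of the next, so a sentence cut at a boundary still appears whole in at least one chunk.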

In [None]:
from langchain.chains.summarize import load_summarize_chain
from langchain.text_splitter import RecursiveCharacterTextSplitter

* Prompts for divide and conquer

In [None]:
stuff_prompt_template = """
Please provide a summary of the following text.
TEXT: {text}
SUMMARY:
"""

chunk_prompt_template = """
Please provide a summary of the following text.
Please answer in one sentence.
TEXT: {text}
SUMMARY:
"""

chunk_prompt = PromptTemplate(
    template=chunk_prompt_template,
    input_variables=["text"]
)

combine_prompt_template = """
Write a concise summary of the following text.
Return your response in bullet points which covers the key points of the text.
TEXT: {text}
SUMMARY:
"""

combine_prompt = PromptTemplate(
    template=combine_prompt_template,
    input_variables=["text"]
)

* Summarize chain

In [None]:
'''
# summary_chain = load_summarize_chain(
#     llm=llm,
#     chain_type="map_reduce",
#     verbose=True
# ) # map_reduce, refine
# transcript = summary_chain(docs)
'''


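def summary_chain_init(chain_type, llm):
    """Build a LangChain summarize chain for "STUFF", "MAP_REDUCE", or "REFINE"."""
    if chain_type == "STUFF":
        # Single call: all documents are placed into one prompt.
        chain = load_summarize_chain(
            llm,
            chain_type="stuff",
            verbose=True
        )
    elif chain_type == "MAP_REDUCE":
        # Summarize each chunk, then summarize the combined chunk summaries.
        chain = load_summarize_chain(
            llm,
            chain_type="map_reduce",
            map_prompt=chunk_prompt,
            combine_prompt=combine_prompt,
            return_intermediate_steps=True,
            verbose=True
        )
    elif chain_type == "REFINE":
        # Note: combine_prompt only exposes {text}; a refine prompt usually also
        # references {existing_answer} so the running summary carries forward.
        chain = load_summarize_chain(
            llm,
            chain_type="refine",
            question_prompt=chunk_prompt,
            refine_prompt=combine_prompt,
            return_intermediate_steps=True,
            verbose=True
        )
    else:
        # Fail early instead of hitting an unbound `chain` at the return below.
        raise ValueError(f"Unsupported chain_type: {chain_type!r}")

    return chain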
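
The difference between the `map_reduce` and `refine` strategies is easier to see in plain Python. The sketch below is an illustration of the two control flows only, not LangChain's implementation; the `summarize` callable stands in for an LLM call, and the `first_word_per_line` stub is a hypothetical toy "model" that keeps the first word of every line.

```python
from typing import Callable, List

Summarizer = Callable[[str], str]


def map_reduce_summary(chunks: List[str], summarize: Summarizer) -> str:
    # Map: summarize each chunk independently.
    partials = [summarize(chunk) for chunk in chunks]
    # Reduce: summarize the concatenated partial summaries.
    return summarize("\n".join(partials))


def refine_summary(chunks: List[str], summarize: Summarizer) -> str:
    # Start from the first chunk, then fold each later chunk into
    # the running summary, one call per chunk.
    if not chunks:
        return ""
    running = summarize(chunks[0])
    for chunk in chunks[1:]:
        running = summarize(f"{running}\n{chunk}")
    return running


def first_word_per_line(text: str) -> str:
    # Toy stand-in for an LLM: keep the first word of every non-empty line.
    return " ".join(line.split()[0] for line in text.splitlines() if line.split())


chunks = ["alpha beta", "one two", "red blue"]
print(map_reduce_summary(chunks, first_word_per_line))
print(refine_summary(chunks, first_word_per_line))
```

Map-reduce calls the model once per chunk plus once to combine, and the per-chunk calls are independent of each other. Refine makes one call per chunk in sequence, so each step depends on the previous one and on how well the running summary preserves earlier content.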

In [None]:
def long_call_analysis(llm, transcript, params, template="", chain_type="MAP_REDUCE", max_tokens=50):
    """Summarize `transcript`, chunking it first when it exceeds `max_tokens`.

    `template` is currently unused; the prompts defined above are used instead.
    """
    llm.model_kwargs = params
    num_tokens = llm.get_num_tokens(transcript)  # raises a warning

    if num_tokens > max_tokens:
        # Long transcript: split it into overlapping chunks and run a chain.
        text_splitter = RecursiveCharacterTextSplitter(
            separators=["\n\n\n"],
            chunk_size=500,
            chunk_overlap=100
        )
        docs = text_splitter.create_documents([transcript])
        num_docs = len(docs)
        num_tokens_first_doc = llm.get_num_tokens(docs[0].page_content)

        print(f"Now we have {num_docs} documents and the first one has {num_tokens_first_doc} tokens")

        summary_chain = summary_chain_init(
            chain_type=chain_type,
            llm=llm
        )
        response = summary_chain(
            {"input_documents": docs}
        )

        # The "stuff" chain returns no intermediate steps, so default to an empty list.
        print("Intermediate_steps: \n")
        for idx, step in enumerate(response.get("intermediate_steps", [])):
            print(colored(f'step {idx}: \n', "green"))
            print(colored(f'{step}\n', "green"))

        return response["output_text"]

    else:
        # Short transcript: a single prompt is enough.
        prompt = PromptTemplate(template=stuff_prompt_template, input_variables=["text"])
        analysis_prompt = prompt.format(text=transcript)
        print(colored(analysis_prompt, 'green'))

        response = llm(analysis_prompt)

        return response

In [None]:
PARAMS = {
    "FALCON-40B": {
        "max_new_tokens": 1024,
        "max_length": 1024,
        "top_p": 0.95,
        "do_sample": False,
        "temperature": 0.2,
        "return_full_text": False,
        "include_prompt_in_result": False
    },
    "LLAMA2-7B": {
        'max_new_tokens': 128,
        'top_p': 0.9,
        'temperature': 0.1,
        'return_full_text': False
    },
}

In [None]:
%%time

res = long_call_analysis(
    llm=llm,
    transcript=transcripts[3],
    params=PARAMS[MODEL_NAME],
    template=summary_template,
    chain_type="REFINE" # REFINE, MAP_REDUCE
)

print ("Results: \n")
print (res)