## Post Call Analytics (PCA) Using Amazon Bedrock

Welcome to this training module on post-call analytics use cases using Amazon Bedrock. 

As businesses continue to interact with customers through various channels, it becomes increasingly important to analyze these interactions to gain insights into customer behavior and preferences. Post-call analytics is one such method that involves analyzing customer interactions after the call has ended. The use of large language models can greatly enhance the effectiveness of post-call analytics by enabling more accurate sentiment analysis, identifying specific customer needs and preferences, and improving overall customer experience. 

In this sample notebook, we explore the following topics to show how Amazon Bedrock can support post-call analytics and help businesses gain a competitive edge in the modern marketplace:

- Choice of LLMs in Bedrock (Titan Text and Anthropic Claude)
- One model handling multiple PCA tasks
- Handling long call transcripts
- [Stretch] Architecture pattern for production workloads

# Environment Setup
Install and upgrade the packages required to run the sample code. <BR>
**Note: you may need to restart the kernel to use updated packages.**
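Before running the rest of the notebook, it can help to confirm which package versions are actually installed. The following is a minimal sketch that uses only the standard library; the helper name `installed_version` is an assumption made for illustration and is not part of the original notebook.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Packages that this notebook imports later on.
for name in ("langchain", "boto3"):
    version = installed_version(name)
    print(f"{name}: {version if version else 'not installed'}")
```

If a package shows as not installed, or you upgrade one, restart the kernel before continuing.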

# Import packages

In [None]:
import langchain
from langchain.llms.bedrock import Bedrock
from langchain import PromptTemplate
import boto3

print(f"langchain version check: {langchain.__version__}")
print(f"boto3 version check: {boto3.__version__}")

# Load transcript files

In [None]:
transcript_files = [
    #"./call_transcripts/negative-refund.txt", 
    #"./call_transcripts/neutral-short.txt",
    #"./call_transcripts/positive-partial-refund.txt",
    #"./call_transcripts/aws.txt",
    "./call_transcripts/negative-refund-ko.txt",
    "./call_transcripts/neutral-short-ko.txt",
    "./call_transcripts/positive-partial-refund-ko.txt",
    "./call_transcripts/aws-ko.txt"
]

transcripts = []

for file_name in transcript_files:
    # Read as UTF-8 explicitly so the Korean transcripts load on any platform
    with open(file_name, "r", encoding="utf-8") as file:
        transcripts.append(file.read())

In [None]:
for i, trans in enumerate(transcripts):
    print(f"transcript #{i+1}: {trans[:300]}\n")
    print("====================\n\n")

# Post Call Analysis

## Choice of models in Bedrock
Bedrock offers foundation models (FMs) from Amazon, AI21 Labs, and Anthropic, so you can pick the one that best fits your use case.

**Select region: "us-east-1"(M1), "us-west-2"(M2)**

In [None]:
bedrock_region = "us-east-1" 

In [None]:
if bedrock_region == "us-east-1":
    bedrock_config = {
        "region_name": bedrock_region,
        "endpoint_url": "https://bedrock.us-east-1.amazonaws.com"
    }
elif bedrock_region == "us-west-2":
    bedrock_config = {
        "region_name": bedrock_region,
        "endpoint_url": "https://prod.us-west-2.frontend.bedrock.aws.dev"
    }
else:
    # Fail early instead of leaving bedrock_config undefined
    raise ValueError(f"Unsupported region: {bedrock_region}")

In [None]:
bedrock = boto3.client(
    service_name='bedrock',
    region_name=bedrock_config["region_name"],
    endpoint_url=bedrock_config["endpoint_url"]
)

bedrock_models = {
    "Claude" : "anthropic.claude-v1",
    "TitanText": "amazon.titan-tg1-large", 
    "Claude-instant":"anthropic.claude-instant-v1",
    "Claude-V2" : "anthropic.claude-v2",
}

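# Prompt-size limits (in tokens) used to decide when a transcript must be chunked.
max_tokens = {
    "Claude" : 12000,
    "TitanText": 4096,
    "Claude-instant": 9000,
    "Claude-V2" : 12000,
}

# NOTE: this second assignment overrides the limits above with very small values,
# which forces the chunking path in "Handling long call transcripts" for any
# prompt longer than about 120 tokens. Remove it to use the limits above.
max_tokens = {"Claude" : 120, "TitanText": 130, "Claude-instant": 120, "Claude-V2" : 120}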


In [None]:
from langchain.llms.bedrock import Bedrock
from langchain import PromptTemplate

In [None]:
# Choose one of the Bedrock models
model = "Claude-V2" # other options: "Claude", "TitanText", "Claude-instant"
if model in ["Claude", "Claude-instant", "Claude-V2"]:
    llm = Bedrock(
        model_id=bedrock_models[model],
        client=bedrock,
        model_kwargs={
            "max_tokens_to_sample":512,
            # Stop sequences: "human", plus Korean for "human" (인간) and "agent" (상담원)
            "stop_sequences":["\n\nhuman", "\n\n인간", "\n\n상담원"],
            "temperature":0,
            "top_p":0.9
        },
        #endpoint_url='https://prod.us-west-2.frontend.bedrock.aws.dev'
    )
elif model == "TitanText":
    llm = Bedrock(
        model_id=bedrock_models[model],
        client=bedrock,
        model_kwargs={
            "maxTokenCount":4096,
            "stopSequences":[],
            "temperature":0,
            "topP":0.9
        },
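        #endpoint_url='https://prod.us-west-2.frontend.bedrock.aws.dev'
    )
else:
    # Fail early instead of leaving llm undefined
    raise ValueError(f"Unsupported model: {model}")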
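The two branches above differ mainly in the parameter names each model family expects. As a minimal sketch, the mapping can be factored into one helper. The key names are copied from the cell above; the function name `build_model_kwargs` and its arguments are hypothetical, and only the English stop sequence is kept for brevity.

```python
def build_model_kwargs(model_name, max_output_tokens=512, temperature=0, top_p=0.9):
    """Return inference parameters using the key names each model family expects."""
    if model_name.startswith("Claude"):
        # Claude models use snake_case parameter names.
        return {
            "max_tokens_to_sample": max_output_tokens,
            "stop_sequences": ["\n\nhuman"],
            "temperature": temperature,
            "top_p": top_p,
        }
    if model_name == "TitanText":
        # Titan Text uses camelCase parameter names.
        return {
            "maxTokenCount": max_output_tokens,
            "stopSequences": [],
            "temperature": temperature,
            "topP": top_p,
        }
    raise ValueError(f"Unsupported model: {model_name}")


print(build_model_kwargs("Claude-V2"))
print(build_model_kwargs("TitanText", max_output_tokens=4096))
```

The returned dictionary could then be passed as `model_kwargs` when constructing the `Bedrock` LLM, which keeps the model choice in a single place.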

## Prompt Template
In this notebook, we'll perform four main analyses (**Summary, Sentiment, Intent, and Resolution**), each with its own prompt template. The cell below also defines templates for topic, escalation, and hold-up detection.
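Each template contains a `{transcript}` placeholder that is replaced with the call text before the prompt is sent to the model. The sketch below illustrates that substitution with plain `str.format`, used here as a stand-in for `PromptTemplate.format`; the template text and transcript are made-up examples.

```python
example_template = """Summarize the call below.

context: "{transcript}"

summary:"""

example_transcript = "Agent: Hello, how can I help?\nCustomer: My order is late."

# Substitute the transcript into the placeholder to build the final prompt.
filled_prompt = example_template.format(transcript=example_transcript)
print(filled_prompt)
```

Ending the template with a label such as `summary:` nudges the model to continue with just the requested output.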

In [None]:
summary_template = """Analyze the retail support call transcript below. Provide a detailed summary of the conversation in complete sentences:

context: "{transcript}"

summary:"""

sentiment_template = """
This is a sentiment analysis program. What is the customer sentiment? Use the following classes:
["POSITIVE", "NEUTRAL", "NEGATIVE"]. Classify the conversation into one and exactly one of these classes.
If you don't know or are not sure, please use the ["NEUTRAL"] class. Do not try to make up a class.

conversation: "{transcript}"

sentiment: """

intent_template = """This is an intent classification program. What is the purpose of the customer call? Use the following
classes: ["SHIPMENT_DELAY", "PRODUCT_DEFECT", "ACCOUNT_QUESTION"]. Classify the conversation into one
and exactly one of these classes. If you don't know, please use the ["UNKNOWN"] class. Do not try to make up a class.

conversation: "{transcript}"

Answer in one word, why is the customer calling today: """

resolution_template = """This is a resolution classification program. How did the agent solve the issue? Use the following
classes: ["FULL_REFUND", "PARTIAL_REFUND", "QUESTION_ANSWERED", "UNRESOLVED"]. Classify the conversation into one
and exactly one of these classes. If you don't know, please use the ["UNKNOWN"] class. Do not try to make up a class.

conversation: "{transcript}"

Answer in one word, how did the agent resolve the customer question or issue: """

topic_template = """This is a topic identification program. What specific topic did the agent observe during the call? If you don't know, please say "I don't know". Do not try to make up an answer.

conversation: "{transcript}"

Topic: """

escalation_template = """This is an escalation classification program. Did the customer ask for escalation during the call? Use the following classes: ["YES", "NO", "UNKNOWN"]. Classify the conversation into one
and exactly one of these classes. Answer in one word without any explanation. If you don't know, please use the ["UNKNOWN"] class. Do not try to make up a class.

conversation: "{transcript}"

Escalation: """


holdup_template = """This is a delay identification program. Did the agent put the customer on hold during the call?
If yes, provide the top reason concisely. If there was no wait or hold, just say "No Hold-up".

conversation: "{transcript}"

Top Reason: 

"""

In [None]:
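# Korean versions of the templates, for the Korean ("-ko") transcripts.
# The prompt text stays in Korean so that it matches the transcript language.

# Summary (English gloss): "Analyze the retail support call transcript below.
# Provide a detailed summary of the conversation in complete sentences."
summary_template_ko = """
아래의 리테일 지원 통화 기록을 분석하세요. 전체 문장으로 대화에 대한 자세한 요약을 제공하세요.

통화: "{transcript}"

요약:"""

# Sentiment (English gloss): classify the customer sentiment as exactly one of
# "긍정" (positive), "중립" (neutral), "부정" (negative); use neutral when unsure.
sentiment_template_ko = """
감성 분석 프로그램입니다. 다음 클래스를 이용하여 고객의 감성을 분류하세요. 
["긍정", "중립", "부정"]. 대화를 이 클래스 중 한 가지로 정확하게 분류합니다. 
모르거나 확실하지 않은 경우 ["중립"] 클래스를 사용하세요. 클래스를 만들려고 하지 마세요.

대화: "{transcript}"

고객 감성:"""

# Intent (English gloss): classify the purpose of the call as exactly one of
# "배송_지연" (shipment delay), "제품_결함" (product defect), "계정_질문" (account question);
# use "UNKNOWN" when unsure.
intent_template_ko = """
이것은 의도 분류 프로그램입니다. 다음 대화에서 고객의 목적은 무엇입니까? 
클래스 ["배송_지연", "제품_결함", "계정_질문"]. 대화를 다음 클래스 중 하나로 분류합니다. 
이 클래스 중 하나에 정확히 일치합니다. 모르는 경우 ["UNKNOWN"] 클래스를 사용하세요. 클래스를 만들려고 하지 마세요. 

대화: "{transcript}"

고객 목적:"""

# Resolution (English gloss): classify how the agent resolved the issue as exactly
# one of the listed classes; use "UNKNOWN" when unsure. Answer in one word.
resolution_template_ko = """
이것은 해결 분류 프로그램입니다. 상담원이 문제를 해결한 방법은 다음과 같습니다. 
클래스 ["FULL_REFUND", "PARTIAL_REFUND", "QUESTION_ANSWERED", "UNRESOLVED"]. 대화를 다음 중 하나로 분류합니다. 
그리고 이 클래스 중 하나를 정확하게 분류하세요. 모르는 경우 ["UNKNOWN"] 클래스를 사용하세요. 클래스를 만들려고 하지 마세요.

대화: "{transcript}"

상담원이 고객의 질문이나 문제를 어떻게 해결했는지 한 마디로 답하세요:"""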
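The classification templates ask the model for exactly one class, but a model may still wrap its answer in brackets, quotes, or extra words. The following is a minimal sketch of a post-processing step that maps raw output to one allowed label. The function name `normalize_label` and the fallback behavior are assumptions for illustration, not part of the original notebook.

```python
import re


def normalize_label(raw_output, allowed_classes, default="UNKNOWN"):
    """Map free-form model output to exactly one allowed class, else `default`."""
    text = raw_output.upper()
    matches = [
        label
        for label in allowed_classes
        # Match the label as a whole token, not as part of a longer identifier.
        if re.search(rf"(?<![A-Z_]){re.escape(label.upper())}(?![A-Z_])", text)
    ]
    # Accept the answer only when exactly one class name appears in it.
    return matches[0] if len(matches) == 1 else default


sentiment_classes = ["POSITIVE", "NEUTRAL", "NEGATIVE"]
print(normalize_label(' ["NEGATIVE"] ', sentiment_classes))
print(normalize_label("It could be POSITIVE or NEGATIVE.", sentiment_classes))
```

Falling back to a default when the output is ambiguous mirrors the "do not make up a class" instruction in the prompts.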

## Generate Analysis

In [None]:
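def generate_analysis(llm, transcript, max_tokens=50, template=""):
    """Fill `template` with `transcript` and return the model's response."""
    # Note: max_tokens is accepted but not used in this version of the function.
    prompt = PromptTemplate(template=template, input_variables=["transcript"])

    analysis_prompt = prompt.format(transcript=transcript)
    print(analysis_prompt)  # show the full prompt that is sent to the model

    analysis = llm(analysis_prompt)

    return analysis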

### Summary

In [None]:
generate_analysis(
    llm=llm,
    transcript=transcripts[0],
    template=summary_template_ko
)

### Sentiment Analysis

In [None]:
generate_analysis(
    llm=llm,
    transcript=transcripts[0],
    template=sentiment_template_ko
)

### Intent Analysis

In [None]:
generate_analysis(llm=llm, transcript=transcripts[0], template=intent_template_ko)

### Resolution Analysis

In [None]:
# Valid indices are 0-3 (four transcripts are loaded above)
generate_analysis(llm=llm, transcript=transcripts[2], template=resolution_template_ko)

In [None]:
generate_analysis(llm=llm, transcript=transcripts[0], template=topic_template)

In [None]:
generate_analysis(llm=llm, transcript=transcripts[1], template=escalation_template)

In [None]:
generate_analysis(llm=llm, transcript=transcripts[1], template=holdup_template)
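Because one model handles every task, the individual calls above can also be driven from a dictionary of templates. The following is a minimal sketch under stated assumptions: `run_all_analyses` and `fake_llm` are hypothetical names, and `fake_llm` is a stand-in that returns canned answers so the example runs without a Bedrock connection.

```python
def run_all_analyses(llm_callable, transcript, templates):
    """Apply every template to one transcript and collect the responses by task."""
    results = {}
    for task_name, template in templates.items():
        prompt = template.format(transcript=transcript)
        results[task_name] = llm_callable(prompt).strip()
    return results


def fake_llm(prompt):
    """Hypothetical stand-in for a model call, used only to make the sketch runnable."""
    return " NEGATIVE " if "sentiment" in prompt else " SHIPMENT_DELAY "


demo_templates = {
    "sentiment": 'conversation: "{transcript}"\n\nsentiment: ',
    "intent": 'conversation: "{transcript}"\n\nintent: ',
}

demo_results = run_all_analyses(
    fake_llm, "Customer: My package is two weeks late.", demo_templates
)
print(demo_results)
```

In the notebook, `llm` could be passed in place of `fake_llm`, with the real templates as the dictionary values.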

## Handling long call transcripts
We'll cover how to handle long transcripts that exceed the limits of the LLM. 
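The basic idea is to split the transcript into pieces that each fit within the limit, with a small overlap so that context is not lost at the boundaries. The sketch below shows fixed-size character windows with overlap. It is a simplified illustration, not the implementation of `RecursiveCharacterTextSplitter`, which additionally prefers to break on separators such as blank lines; the function name `split_with_overlap` is hypothetical.

```python
def split_with_overlap(text, chunk_size=500, chunk_overlap=20):
    """Split text into character windows where consecutive chunks share an overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


demo_chunks = split_with_overlap("0123456789", chunk_size=6, chunk_overlap=2)
print(demo_chunks)
```

Note that the sizes here are measured in characters, not tokens, so a chunk size should be chosen with some margin below the model's token limit.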

In [None]:
# If the prompt exceeds the model's token limit (about 4,000 tokens for Titan), split the transcript into chunks
from langchain.chains.summarize import load_summarize_chain
from langchain.text_splitter import RecursiveCharacterTextSplitter

In [None]:
# Generate Analysis
def generate_analysis(llm, transcript, template="", model="Claude"):
    """Run one analysis; summarize the transcript first if the prompt is too long."""
    prompt = PromptTemplate(template=template, input_variables=["transcript"])
    analysis_prompt = prompt.format(transcript=transcript)

    num_tokens = llm.get_num_tokens(analysis_prompt)
    print(f"prompt has approximately {num_tokens} tokens\n")
    print("max_tokens[model]", max_tokens[model])

    if num_tokens > max_tokens[model]:
        # Split on paragraph and line breaks into overlapping chunks.
        # Note: chunk_size and chunk_overlap are measured in characters, not tokens.
        text_splitter = RecursiveCharacterTextSplitter(
            separators=["\n\n", "\n"],
            chunk_size=500,
            chunk_overlap=20
        )
        docs = text_splitter.create_documents([transcript])

        num_docs = len(docs)
        num_tokens_first_doc = llm.get_num_tokens(docs[0].page_content)
        print(
            f"Now we have {num_docs} documents and the first one has {num_tokens_first_doc} tokens"
        )

        # "refine" summarizes the first chunk, then updates that summary with each
        # following chunk; "map_reduce" is the alternative chain type.
        summary_chain = load_summarize_chain(llm=llm, chain_type="refine", verbose=True)

        # Replace the long transcript with its summary and rebuild the prompt.
        transcript = summary_chain.run(docs)
        analysis_prompt = prompt.format(transcript=transcript)

    analysis = llm(analysis_prompt)

    return analysis
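The "refine" strategy used above can be pictured as a simple loop: summarize the first chunk, then fold each later chunk into the running summary. The following is a minimal sketch of that control flow only. The helper names are hypothetical, and the two stand-in functions keep the first sentence of each chunk instead of calling a model.

```python
def refine_summarize(chunks, summarize_first, refine):
    """Summarize the first chunk, then fold each later chunk into the running summary."""
    if not chunks:
        return ""
    summary = summarize_first(chunks[0])
    for chunk in chunks[1:]:
        summary = refine(summary, chunk)
    return summary


# Hypothetical stand-ins for model calls: keep the first sentence of each chunk.
def first_sentence(text):
    return text.split(".")[0].strip() + "."


def append_first_sentence(summary, chunk):
    return summary + " " + first_sentence(chunk)


demo_call_chunks = [
    "Customer reports a late order. Agent apologizes.",
    "Agent offers a refund. Customer accepts.",
]
demo_summary = refine_summarize(demo_call_chunks, first_sentence, append_first_sentence)
print(demo_summary)
```

Because each step depends on the previous summary, this approach makes one sequential model call per chunk, whereas a map-reduce approach can summarize chunks independently before combining them.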

### Summary

In [None]:
generate_analysis(llm=llm, transcript=transcripts[0], template=summary_template)

### Sentiment Analysis

In [None]:
generate_analysis(llm=llm, transcript=transcripts[1], template=sentiment_template)

### Intent Analysis

In [None]:
generate_analysis(llm=llm,transcript=transcripts[1], template=intent_template)

### Resolution Analysis

In [None]:
generate_analysis(llm=llm, transcript=transcripts[1], template=resolution_template)