# Chunk Summarizer

* Every LLM has a context window: the maximum number of tokens it can handle in a single request (for OpenAI chat models this covers the prompt and the generated reply together)
* Some LLMs have a 4K context window; others have 8K or more
* Call transcripts can exceed 4K or 8K tokens, so they have to be split into pieces
* These pieces are called **chunks**
* Each chunk is sized to fit the context window of the particular LLM and is summarized separately
* Finally, the summaries of all the chunks are combined and the LLM generates a final summary from them
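
The splitting step described above can be sketched in plain Python. This is a minimal illustration, not the splitter used later in the notebook: `split_with_overlap` is a hypothetical helper, and whitespace-separated words stand in for real model tokens. Neighbouring chunks share a few tokens so that a sentence cut at a boundary still appears whole in one of them.

```python
def split_with_overlap(tokens, chunk_size, overlap):
    """Split a token list into windows of at most `chunk_size` tokens.

    Consecutive windows share `overlap` tokens.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once a window has reached the end of the input.
        if start + chunk_size >= len(tokens):
            break
    return chunks


# Assumption: words are used as stand-ins for tokens, purely for illustration.
words = "a b c d e f g h i j".split()
pieces = split_with_overlap(words, chunk_size=4, overlap=1)
for piece in pieces:
    print(piece)
```

With a chunk size of 4 and an overlap of 1, the ten words yield three chunks, and the last word of each chunk is repeated as the first word of the next.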

In [None]:
!pip install --upgrade -r "/home/ec2-user/SageMaker/15. Essential Code/requirements.txt" -q

In [10]:
# Basic Imports
import os, re, time, math, asyncio, datetime, json
import pandas as pd

# LangChain
from langchain_openai import ChatOpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
from langchain_community.callbacks import get_openai_callback
from langchain_core.output_parsers import StrOutputParser
from langchain.docstore.document import Document
from langchain.text_splitter import TokenTextSplitter
from langchain.chains.summarize import load_summarize_chain

# Packages for AWS and SQL server
import boto3
import pyodbc
import textwrap
import warnings

# Set init global variables
# NOTE: `openai_key` is not defined in this notebook; it must be set before this cell runs
os.environ["OPENAI_API_KEY"] = openai_key
warnings.filterwarnings('ignore')
pd.set_option('display.max_columns', None)

# LLM variables
OUTPUT_PARSER = StrOutputParser()
MODEL_KWARGS={"seed":235, "top_p":0.01}
MODEL = "gpt-3.5-turbo"
TEMPERATURE = 0.0
MAX_TOKENS = 800
CONTEXT_WINDOW = 4000
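
As a rough sketch of how these settings relate: if the context window has to hold both the prompt and the reply, the room left for the prompt is the window size minus the tokens reserved for the reply. The helpers `input_budget` and `needs_chunking` and the `safety_margin` parameter below are hypothetical names introduced only for this illustration.

```python
CONTEXT_WINDOW = 4000  # mirrors the setting above
MAX_TOKENS = 800       # tokens reserved for the model's reply


def input_budget(context_window, max_output_tokens, safety_margin=0):
    """Return how many tokens remain for the prompt after reserving the reply."""
    budget = context_window - max_output_tokens - safety_margin
    if budget <= 0:
        raise ValueError("no room left for the prompt")
    return budget


def needs_chunking(prompt_tokens, safety_margin=0):
    """True when the prompt alone would not fit in the remaining budget."""
    return prompt_tokens > input_budget(CONTEXT_WINDOW, MAX_TOKENS, safety_margin)


print(input_budget(CONTEXT_WINDOW, MAX_TOKENS))  # 4000 - 800 = 3200
print(needs_chunking(2500))                      # fits within 3200
print(needs_chunking(3500))                      # exceeds 3200
```

A non-zero `safety_margin` can absorb small differences between an estimated token count and the count the API actually applies.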

## Sample Transcript

In [23]:
transcript = """

Agent: Good afternoon, thank you for calling MegaMart Customer Service. My name is Sarah. How can I assist you today?

Customer: Hi Sarah, this is John. I recently placed an order with you guys, but there seems to be an issue.

Agent: I'm sorry to hear that, John. Could you please provide me with your order number so I can look it up?

Customer: Sure, it’s 123456789.

Agent: Thank you, John. Let me pull up your order details. Could you please hold for a moment?

Customer: No problem.

Agent: (After a brief pause) Thanks for waiting, Emily. I’ve found your account. It looks like there were two charges for this month. Let me investigate why that happened. Can you tell me the dates of the charges?

Customer: The first charge was on the 1st of this month, and the second was on the 15th.

Agent: I see. Let me check your payment history and subscription plan. (After a brief pause) It appears there was a system error that caused the second charge. I apologize for this inconvenience.

Customer: It’s quite frustrating. I hope this won’t happen again.

Agent: I completely understand, Emily. I’ll ensure that this issue is resolved so it doesn’t happen in the future. In the meantime, I can process a refund for the duplicate charge. Would you like me to do that?

Customer: Yes, please. A refund would be great.

Agent: Absolutely. I’m processing the refund right now. You should see the credit back on your account within 3-5 business days. Additionally, I’ve flagged your account to prevent this error from happening again.

Customer: Thank you. Is there any confirmation I’ll receive about this refund?

Agent: Yes, you’ll receive an email confirmation shortly with all the details about the refund and the steps we’ve taken to ensure this doesn’t happen again.

Customer: Perfect. I appreciate your help.Customer: No problem.

Agent: (After a brief pause) Thanks for waiting, Emily. I’ve found your account. It looks like there were two charges for this month. Let me investigate why that happened. Can you tell me the dates of the charges?

Customer: The first charge was on the 1st of this month, and the second was on the 15th.

Agent: I see. Let me check your payment history and subscription plan. (After a brief pause) It appears there was a system error that caused the second charge. I apologize for this inconvenience.

Customer: It’s quite frustrating. I hope this won’t happen again.

Agent: I completely understand, Emily. I’ll ensure that this issue is resolved so it doesn’t happen in the future. In the meantime, I can process a refund for the duplicate charge. Would you like me to do that?

Customer: Yes, please. A refund would be great.

Agent: Absolutely. I’m processing the refund right now. You should see the credit back on your account within 3-5 business days. Additionally, I’ve flagged your account to prevent this error from happening again.

Customer: Thank you. Is there any confirmation I’ll receive about this refund?

Agent: Yes, you’ll receive an email confirmation shortly with all the details about the refund and the steps we’ve taken to ensure this doesn’t happen again.

Customer: Perfect. I appreciate your help.

Agent: (After a brief pause) Thank you for holding, John. I see that you placed an order for a new laptop and a set of headphones. Could you please tell me what seems to be the problem?

Customer: Yes, I received the laptop, but the headphones were missing from the package.

Agent: I apologize for the inconvenience, John. Let me verify the shipment details. It appears both items were supposed to be shipped together. I’m sorry they didn’t arrive as expected.

Customer: Yeah, it’s quite frustrating because I was really looking forward to using them together.

Agent: I completely understand, John. Let’s get this sorted out for you. I will initiate an investigation to locate the missing headphones and get them to you as quickly as possible. In the meantime, would you prefer a replacement or a refund for the headphones?

Customer: I’d like a replacement, please. I still need them.

Agent: Absolutely, I’ll arrange for a replacement to be shipped out to you right away. It should arrive within 3-5 business days. I’ll also expedite the shipping at no extra cost to you due to the inconvenience.

Customer: That sounds good. Will I get a confirmation email about this?

Agent: Yes, you will receive a confirmation email within the next hour. It will include all the details about your replacement order and the expected delivery date.

Customer: Great, thank you.

Agent: Is there anything else I can help you with today, John?

Customer: Actually, yes. I’m having some trouble setting up the laptop. It’s asking for a product key that I didn’t receive.

Agent: I’m sorry to hear that. The product key should typically be included with the packaging or in an email. Let’s see if we can locate it. Did you check the documentation that came with the laptop?

Customer: I did, but there’s nothing there.

Agent: No problem. Let me check the system here. Sometimes the product key is provided in the order confirmation email. Could you please check your email for the order confirmation while I look it up on my end?

Customer: Sure, give me a second.

Agent: Take your time.

Customer: (After a brief pause) I found the order confirmation email, but there’s no product key mentioned.

Agent: I appreciate you checking, John. Let me escalate this to our technical support team. They will be able to provide you with the product key directly. This might take a few minutes. Can you hold for a bit longer?

Customer: Yes, that’s fine.

Agent: Thank you for your patience. (After a brief pause) I’ve contacted the technical support team, and they will email you the product key shortly. You should receive it within the next 15 minutes.

Customer: Perfect, thanks for your help.

Agent: You’re welcome, John. Is there anything else I can assist you with today?

Customer: No, that’s all. Thanks again for resolving these issues.

Agent: It’s my pleasure, John. Thank you for your patience and understanding. If you have any other questions or need further assistance, please don’t hesitate to call us back. Have a great day!

Customer: You too, Sarah. Goodbye.

Agent: Goodbye, John.

Agent: Thank you for calling StreamNow Support. This is Alex. How can I assist you today?

Customer: Hi Alex, this is Emily. I’m having trouble with my StreamNow subscription. I’ve been charged twice this month.

Agent: I’m sorry to hear that, Emily. I can help you with that. Could you please provide me with your account email or username so I can look up your account?

Customer: Sure, my email is emily.jones@example.com.

Agent: Thank you, Emily. Please hold for a moment while I pull up your account details.

Customer: No problem.

Agent: (After a brief pause) Thanks for waiting, Emily. I’ve found your account. It looks like there were two charges for this month. Let me investigate why that happened. Can you tell me the dates of the charges?

Customer: The first charge was on the 1st of this month, and the second was on the 15th.

Agent: I see. Let me check your payment history and subscription plan. (After a brief pause) It appears there was a system error that caused the second charge. I apologize for this inconvenience.

Customer: It’s quite frustrating. I hope this won’t happen again.

Agent: I completely understand, Emily. I’ll ensure that this issue is resolved so it doesn’t happen in the future. In the meantime, I can process a refund for the duplicate charge. Would you like me to do that?

Customer: Yes, please. A refund would be great.

Agent: Absolutely. I’m processing the refund right now. You should see the credit back on your account within 3-5 business days. Additionally, I’ve flagged your account to prevent this error from happening again.

Customer: Thank you. Is there any confirmation I’ll receive about this refund?

Agent: Yes, you’ll receive an email confirmation shortly with all the details about the refund and the steps we’ve taken to ensure this doesn’t happen again.

Customer: Perfect. I appreciate your help.

Agent: You’re welcome, Emily. Is there anything else I can assist you with today?

Customer: Actually, yes. I’ve also been experiencing buffering issues while streaming. It happens quite frequently.

Agent: I’m sorry to hear that. Buffering can be quite annoying. Let’s troubleshoot that together. First, could you let me know what device you’re using to stream?

Customer: I usually watch on my smart TV.

Agent: Got it. Are you connected via Wi-Fi or an Ethernet cable?

Customer: I’m connected via Wi-Fi.

Agent: Thanks for the information. Buffering issues can sometimes be due to a weak Wi-Fi signal. Have you tried restarting your router and modem?

Customer: Yes, I’ve done that, but it didn’t help.

Agent: I see. Let’s try a few more things. Can you check the speed of your internet connection using a speed test app or website?

Customer: Sure, give me a moment. (After a brief pause) The speed test shows 25 Mbps download and 5 Mbps upload.

Agent: Those speeds should be sufficient for streaming. Let’s check the settings on your smart TV. Can you go to the network settings and see if there are any updates available for your TV’s firmware?

Customer: Okay, let me check. (After a brief pause) There’s an update available. Should I install it?

Agent: Yes, please go ahead and install the update. Sometimes firmware updates can improve streaming performance.

Customer: Alright, it’s installing now. (After a brief pause) The update is complete.

Agent: Great. Now, let’s try streaming again to see if the buffering issue persists.

Customer: Okay, I’ll check. (After a brief pause) It seems to be working better now. The buffering is much less frequent.

Agent: I’m glad to hear that, Emily. If the issue continues, please let us know, and we can look into it further. Is there anything else I can assist you with?

Customer: No, that’s all for now. Thanks a lot for your help, Alex.

Agent: You’re welcome, Emily. Thank you for calling StreamNow Support. Have a great day!

Customer: You too. Goodbye!

Agent: Goodbye!

"""

In [31]:
# Splits big texts
def chunks(long_text): 
    text_splitter = TokenTextSplitter(chunk_size=1500, chunk_overlap=150)
    texts = text_splitter.split_text(long_text)
    docs = [Document(page_content=text) for text in texts]
    return docs

# Summarizes chunks with LangChain's "refine" summarize chain
def chunk_summarizer(text):

    question_prompt_template = """
    Summarize the following text. Ensure the summary comprises a minimum of six lines

    Text: ```{text}```
    """
    
    prompt = PromptTemplate.from_template(template=question_prompt_template)

    llm = ChatOpenAI(model_name=MODEL, temperature=TEMPERATURE, max_tokens=MAX_TOKENS, model_kwargs=MODEL_KWARGS)
    
    refine_chain_summary = load_summarize_chain(
        llm,
        chain_type="refine",
        question_prompt=prompt,
        return_intermediate_steps=False,
        document_variable_name="text")

    # Split the input into token-bounded Documents for the chain
    docs = chunks(text)

    with get_openai_callback() as cb:
        summary_refine = refine_chain_summary.invoke({"input_documents": docs})

    # The result is a dict; the summary text is under the "output_text" key
    return summary_refine


def chunk_handler(text):
    prompt = """    
        Summarize the following text.
        
        Text : ```{text}```
        """

    llm = ChatOpenAI(model_name=MODEL, temperature=TEMPERATURE, max_tokens=MAX_TOKENS, model_kwargs=MODEL_KWARGS)

    prompt_main = PromptTemplate(
          input_variables=["text"],
          template=prompt)

    with get_openai_callback() as cb:
        llm_chain = prompt_main | llm | OUTPUT_PARSER
        all_text = str(prompt) + str(text)
        # Prompt tokens plus the tokens reserved for the reply (MAX_TOKENS)
        threshold = (llm.get_num_tokens(text=all_text) + MAX_TOKENS)
        print("Tokens of Prompt:",threshold)
        
        chunk_list = []
        # 3000 leaves headroom below CONTEXT_WINDOW (4000)
        if int(threshold) > 3000:
            text_chunks = chunks(text)
            for chunk in text_chunks:
                # Pass the chunk's text, not the Document wrapper
                chunk_list.append(llm_chain.invoke({"text":chunk.page_content}))
            # Combine the per-chunk summaries and summarize them once more
            summary = llm_chain.invoke({"text":' '.join(chunk_list)})
        else:
            summary = llm_chain.invoke({"text":text})
            
    return summary
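
The control flow of `chunk_handler` can be exercised without an API key by swapping the LLM for a stand-in. The sketch below is an illustration under stated assumptions, not the notebook's implementation: `summarize_in_chunks`, `fake_summarize`, `split_words`, and `count_words` are hypothetical helpers, and words again stand in for tokens.

```python
def summarize_in_chunks(text, summarize, split, count_tokens, limit):
    """Summarize directly if the text fits; otherwise summarize each chunk
    and then summarize the joined chunk summaries."""
    if count_tokens(text) <= limit:
        return summarize(text)
    partials = [summarize(piece) for piece in split(text)]
    return summarize(" ".join(partials))


calls = []  # records every input handed to the stand-in summarizer


def fake_summarize(text):
    """Stand-in for an LLM call: keep only the first and last word."""
    calls.append(text)
    words = text.split()
    return f"{words[0]}..{words[-1]}"


def split_words(text, size=4):
    """Split into pieces of `size` words, without overlap."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def count_words(text):
    return len(text.split())


long_text = "one two three four five six seven eight nine ten"
result = summarize_in_chunks(long_text, fake_summarize, split_words, count_words, limit=5)
print(result)
print(len(calls))
```

Ten words exceed the limit of five, so the stand-in is called once per chunk (three times) and once more on the joined partial summaries. Note that this sketch, like `chunk_handler`, does not re-check whether the joined summaries themselves fit within the limit.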

In [32]:
summary = chunk_handler(transcript)

Tokens of Prompt: 3060


In [33]:
summary

'The text describes two separate conversations between customers and customer service agents from different companies. In the first conversation, John had issues with his order from MegaMart, including duplicate charges and missing headphones. Sarah from MegaMart Customer Service resolved the issues by processing a refund, arranging for a replacement, and helping John locate a product key. John was satisfied with the assistance. In the second conversation, Emily from StreamNow Support had issues with her subscription, including being charged twice in one month. Alex from StreamNow Support investigated the issue, identified a system error, processed a refund, and ensured the error would not happen again. Emily also mentioned buffering issues while streaming, and Alex helped troubleshoot by checking internet speed, updating firmware, and resolving the problem. The conversations ended with both customers thanking the agents for their assistance.'