# 4. Challenge Lab
---

The purpose of the challenge lab is to solidify your understanding of the concepts learned in this Bootcamp. It gives you an opportunity to explore how to integrate NVIDIA NeMo into your own project workflow.

### Challenge Title: AI-Powered Chat Summarization for Customer Care Efficiency 

#### Motivation

In today's digital landscape, customer support teams handle an ever-increasing volume of interactions across multiple platforms, including chat, email, and social media. These interactions often involve repetitive queries, lengthy exchanges, and important customer context that needs to be understood and addressed efficiently. Handling these conversations manually every day leads to:
- Wasted time, as agents read long chat histories to understand customer issues.
- Inconsistent service and human error, as agents miss key details in lengthy chats.
- Slow response times, especially when chats are transferred between agents.
- Difficulty in post-chat analysis for training and compliance.
     
One solution to this is to automate chat summarization, thereby streamlining customer care workflows. Summarizing long and multi-turn conversations into concise, coherent, and actionable insights allows customer support agents and quality analysts to focus on resolution rather than review. However, traditional summarization tools struggle with the dynamic nature and complexity of customer support dialogues, as well as the domain-specific terminology associated with them.

#### Task Description

The primary objective of this project is to develop an AI-driven chat summarization system using the NVIDIA NeMo framework that can:
- Automatically generate concise summaries of customer chats in real time.
- Highlight the key intents, issues, and resolutions in each summary to facilitate faster agent response.
- Enable automated tagging for analytics (e.g., common complaints, sentiment).


##### Example

Customer Chat:  Hi, I’ve been trying to reset my password for 45 minutes. The link in the email isn’t working. I need access urgently—my payment is due today!

**AI-Generated Summary:**  
***Main Issue***: Password reset failure.  
***Urgency***: High (payment due today).  
***Attempted Solution***: Email link used (failed).  
***Sentiment***: Frustrated (escalate).  
***Suggested Action***: Manual password reset and expedite.
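
If your model emits summaries in the labeled layout shown above, you will need to turn that text into structured fields at some point. The snippet below is a minimal sketch of one way to do this with the standard library. The function name `parse_labeled_summary` and the label-to-key mapping are illustrative assumptions, not part of the challenge assets; the target key names follow the submission template fields described later in this document.

```python
import re

# Assumed mapping from the labels in the example summary to template keys.
LABEL_TO_FIELD = {
    "main issue": "main_issue",
    "urgency": "urgency",
    "attempted solution": "attempted_solution",
    "sentiment": "sentiment",
    "suggested action": "suggested_action",
}


def parse_labeled_summary(text: str) -> dict:
    """Turn 'Label: value' lines into a dict keyed by template field names."""
    fields = {}
    for line in text.splitlines():
        # Drop Markdown emphasis markers, then split on the first colon.
        match = re.match(r"\s*([^:]+):\s*(.+)", line.replace("*", ""))
        if not match:
            continue
        label = match.group(1).strip().lower()
        if label in LABEL_TO_FIELD:
            fields[LABEL_TO_FIELD[label]] = match.group(2).strip()
    return fields


example = """\
***Main Issue***: Password reset failure.
***Urgency***: High (payment due today).
***Attempted Solution***: Email link used (failed).
***Sentiment***: Frustrated (escalate).
***Suggested Action***: Manual password reset and expedite.
"""

parsed = parse_labeled_summary(example)
print(parsed["urgency"])
```

Lines that do not match a known label are skipped, so a malformed model response yields a partial dictionary rather than an exception. You may prefer stricter handling for your own pipeline.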


### Dataset

The primary dataset for the challenge is `TweetSumm`. TweetSumm comprises 1,100 dialogs reconstructed from Tweets that appear in the Kaggle Customer Support on Twitter dataset, each accompanied by three extractive and three abstractive summaries generated by human annotators. The dataset is based on conversations between consumers and customer support agents on Twitter.com. It covers a wide range of topics and services provided by various companies, including airlines, retail, gaming, and music. TweetSumm consists of train, test, and validation sets. For further details, please read the technical paper at [https://arxiv.org/abs/2111.11894](https://arxiv.org/abs/2111.11894). There are two ways to access the dataset:
- **GitHub**: You can download the dataset by cloning the repo [https://github.com/guyfe/Tweetsumm/tree/main](https://github.com/guyfe/Tweetsumm/tree/main).
- **Hugging Face**: The dataset is gated on Hugging Face and requires an HF token to access or download it. See the [User Access Token documentation](https://huggingface.co/docs/hub/en/security-tokens) for the steps to generate an HF security token. TweetSumm is included among the dialogue summarization datasets in Salesforce DialogStudio.

##### Dataset example

```text
conversation:
 user: So neither my iPhone nor my Apple Watch are recording my steps/activity, and Health doesn’t recognise either source anymore for some reason. Any ideas?   please read the above.
agent: Let’s investigate this together. To start, can you tell us the software versions your iPhone and Apple Watch are running currently?
user: My iPhone is on 11.1.2, and my watch is on 4.1.
agent: Thank you. Have you tried restarting both devices since this started happening?
user: I’ve restarted both, also un-paired then re-paired the watch.
agent: Got it. When did you first notice that the two devices were not talking to each other. Do the two devices communicate through other apps such as Messages?
user: Yes, everything seems fine, it’s just Health and activity.
agent: Let’s move to DM and look into this a bit more. When reaching out in DM, let us know when this first started happening please. For example, did it start after an update or after installing a certain app?

summary:
 Customer enquired about his Iphone and Apple watch which is not showing his any steps/activity and health activities. Agent is asking to move to DM and look into it. 
```
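
A conversation/summary pair like the one above usually has to be flattened into a prompt and a target before fine-tuning. The snippet below is a minimal sketch of that step. The function name `build_record`, the prompt wording, and the `input`/`output` key names are assumptions for illustration; check the data format expected by the NeMo data module you use before relying on them.

```python
import json


def build_record(turns, summary):
    """Format one dialog as an input/output pair for supervised fine-tuning."""
    # Render each turn as "speaker: text", one turn per line.
    dialog = "\n".join(f"{speaker}: {text.strip()}" for speaker, text in turns)
    prompt = (
        "Summarize the following customer support conversation.\n\n"
        f"{dialog}\n\nSummary:"
    )
    # The "input"/"output" key names are an assumption, not a confirmed schema.
    return {"input": prompt, "output": summary.strip()}


# Two turns taken from the dataset example above.
turns = [
    ("user", "My iPhone is on 11.1.2, and my watch is on 4.1."),
    ("agent", "Thank you. Have you tried restarting both devices since this started happening?"),
]
record = build_record(turns, " Agent is asking to move to DM and look into it. ")
print(json.dumps(record, ensure_ascii=False, indent=2))
```

Keeping the prompt wording in one function makes it easier to reuse the exact same prompt at inference time, which generally matters for fine-tuned models.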

### Submission Procedure

To submit your result, please follow the requirements below:
- **Submission Template**: You are required to use the submission template JSON file to upload your result. Follow these links to access the [submission template](../challenge/data/evaluation_asset/submission-template.json) and the [evaluation data](../challenge/data/evaluation_asset/evaluation_data.json). The template consists of the following fields (key-value pairs):
    - ***summaries***: The container for the summary fields listed below.
    - ***chat_id***: The ID for the sequence of chats between the customer and the agent.
    - ***chat_summary***: An overall summary of the sequence of chats between a customer and an agent. It can be a sentence or a paragraph.
    - ***main_issue***: The primary issue (intent) faced by the customer.
    - ***urgency***: The priority level (e.g., High, Moderate, Low).
    - ***attempted_solution***: The solution already attempted by the customer.
    - ***sentiment***: The customer’s emotion (e.g., frustrated, unhappy, happy, worried).
    - ***suggested_action***: The recommended action to be taken by an agent.


<img src="images/submission-template.png" width="800px" height="800px" />


- **Submission to the Leaderboard:** Participants can submit their result JSON file from the start date of the challenge until the deadline. Participants are ranked on the leaderboard, and the ranking is refreshed through several updates during this period.
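
As a rough illustration of the template fields described above, the snippet below assembles one entry and writes it to a JSON file. It is only a sketch: it assumes that `summaries` is a top-level key holding a list of per-chat entries, and the `chat_id` value, the summary text, and the output path are made up for the example. Always check your file against the actual `submission-template.json` before uploading.

```python
import json
import tempfile
from pathlib import Path

# Field names taken from the submission template description.
REQUIRED_FIELDS = (
    "chat_id",
    "chat_summary",
    "main_issue",
    "urgency",
    "attempted_solution",
    "sentiment",
    "suggested_action",
)


def make_entry(**fields):
    """Build one summary entry, failing early if a field is missing or empty."""
    missing = [k for k in REQUIRED_FIELDS if fields.get(k) in (None, "")]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return {k: fields[k] for k in REQUIRED_FIELDS}


entry = make_entry(
    chat_id="chat-001",  # hypothetical ID, not from the evaluation data
    chat_summary="Customer cannot reset their password and has a payment due today.",
    main_issue="Password reset failure",
    urgency="High",
    attempted_solution="Email link used (failed)",
    sentiment="Frustrated",
    suggested_action="Manual password reset and expedite",
)

# Assumed layout: a list of entries under the "summaries" key.
submission = {"summaries": [entry]}

out_path = Path(tempfile.mkdtemp()) / "submission.json"
out_path.write_text(json.dumps(submission, indent=2), encoding="utf-8")
print(out_path.read_text(encoding="utf-8"))
```

Validating each entry as it is built catches empty fields before the file is uploaded, rather than after it has been scored.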

### Getting Started With the Challenge Solution

- Examine the TweetSumm dataset at `/workspace/challenge/data/TweetSumm/`.
- Preprocess the dataset and save it in the directory `/workspace/challenge/data/finetune_module`.
- Fine-tune the existing Llama model (`Meta-Llama-3.1-8B`) in the `/workspace/model` directory using NeMo Run.
- Save your fine-tuned checkpoint in the path `/workspace/challenge/checkpoint_log`.
- Run inference against the [evaluation dataset](../challenge/data/evaluation_asset/evaluation_data.json) and process the results into the [submission template](../challenge/data/evaluation_asset/submission-template.json) JSON file.
- Upload your submission template file. **The Bootcamp instructor will provide detailed information on how and where to upload the file.**
- The top 3 candidates on the leaderboard will be required to submit their scripts and fine-tuned model (checkpoints) for verification and testing; otherwise, they will be dropped from the leaderboard.
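
The preprocessing step above can be sketched as writing one JSON Lines file per split. This is a hedged example, not the reference solution: the file names `training.jsonl`, `validation.jsonl`, and `test.jsonl` and the `input`/`output` keys are assumptions about what the fine-tuning data module expects, the records are placeholders, and a temporary directory stands in for `/workspace/challenge/data/finetune_module` so the snippet can run anywhere.

```python
import json
import tempfile
from pathlib import Path


def write_jsonl(records, path):
    """Write one JSON object per line and return the path."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return path


# Placeholder records; in practice these come from the preprocessed dataset.
splits = {
    "training": [{"input": "user: ...\nagent: ...\n\nSummary:", "output": "..."}],
    "validation": [{"input": "user: ...\n\nSummary:", "output": "..."}],
    "test": [{"input": "agent: ...\n\nSummary:", "output": "..."}],
}

# Stand-in for /workspace/challenge/data/finetune_module.
out_dir = Path(tempfile.mkdtemp()) / "finetune_module"
written = {
    name: write_jsonl(rows, out_dir / f"{name}.jsonl")
    for name, rows in splits.items()
}

for name, path in written.items():
    with path.open(encoding="utf-8") as f:
        print(name, sum(1 for _ in f))
```

Swap `out_dir` for the real directory once the output looks right, and confirm the expected file names and keys against the documentation for your NeMo version.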


---
### Licensing
Copyright © 2025 OpenACC-Standard.org. This material is released by OpenACC-Standard.org, in collaboration with NVIDIA Corporation, under the Creative Commons Attribution 4.0 International (CC BY 4.0). These materials include references to hardware and software developed by other entities; all applicable licensing and copyrights apply.