# Prompter Workflow in Label Studio for Customer Support Classification
This notebook walks you through setting up a machine learning workflow for classifying customer support requests using Label Studio. We'll start with data preparation, then create a Label Studio project with a prompt-generation workflow, and finally ingest the data into the project.

![Prompter Workflow](prompter-workflow-screenshot.png)


## Setup
Install the necessary libraries, specifically the Label Studio SDK, which is used to create projects and tasks in Label Studio.

In [None]:
!pip install label-studio-sdk

Import the Label Studio SDK and set the [API key](https://labelstud.io/guide/api.html) and URL of your Label Studio instance. 

In [30]:
# Import the SDK and the client module
from label_studio_sdk import Client

# Define the URL where Label Studio is accessible and the API key for your user account
LABEL_STUDIO_URL = 'http://localhost:8080'
API_KEY = '<YOUR_LS_API_KEY>'

# Connect to the Label Studio API and check the connection
ls = Client(url=LABEL_STUDIO_URL, api_key=API_KEY)
ls.check_connection()


{'status': 'UP'}

## Customer Service Dataset
The following cells download and prepare our dataset. We will use the [Task-Oriented Dialogue dataset](https://github.com/amazon-science/dstc11-track2-intent-induction/tree/main). The transformation process will organize the data for our simple chat multi-class labeling example. 

In [49]:
import requests

# URL of the JSONL file to download
file_url = "https://raw.githubusercontent.com/amazon-science/dstc11-track2-intent-induction/main/dstc11/development/dialogues.jsonl"
# Local path where the file will be saved
input_file_path = "dialogues.jsonl"

# Download the file
response = requests.get(file_url)

# Ensure the request was successful
if response.status_code == 200:
    # Open the local file for writing in binary mode
    with open(input_file_path, 'wb') as file:
        # Write the content of the response to the file
        file.write(response.content)
else:
    print(f"Failed to download the file. Status code: {response.status_code}")


In [50]:
import json

# Collect the dialogues in the task structure used by the Label Studio project
transformed_dialogues = []

# Process each dialogue and its turns
with open(input_file_path, 'r') as infile:
    for line in infile:
        # Parse the JSON object from the line
        dialogue = json.loads(line)
        dialogue_transformed = {'dialogue': []}
        
        # Process each turn in the dialogue
        for turn in dialogue['turns']:
            speaker_role = turn['speaker_role']
            utterance = turn['utterance']
            # Append the turn to the dialogue
            dialogue_transformed['dialogue'].append({'author': speaker_role, 'text': utterance})
        
        # Append the transformed dialogue to the list
        transformed_dialogues.append(dialogue_transformed)

print(f"Transformed {len(transformed_dialogues)} dialogues.")

Transformed 948 dialogues.
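
To make the target task shape concrete, here is a minimal sketch of the same per-dialogue transformation applied to a single hand-written record. The helper name `to_task` and the sample record are made up for illustration; only the `turns`, `speaker_role`, `utterance`, and `intents` field names come from the cells in this notebook.

```python
def to_task(dialogue):
    """Convert one raw dialogue record into a task dict with a 'dialogue' list."""
    return {
        'dialogue': [
            {'author': turn['speaker_role'], 'text': turn['utterance']}
            for turn in dialogue['turns']
        ]
    }

# Hypothetical record that mimics the structure of one line in dialogues.jsonl
sample = {
    'turns': [
        {'speaker_role': 'Agent', 'utterance': 'Thank you for calling. How can I help?'},
        {'speaker_role': 'Customer', 'utterance': 'I need to file a claim.', 'intents': ['FileClaim']},
    ]
}

sample_task = to_task(sample)
print(sample_task)
```

Each task keeps only the speaker and the text of every turn. The `intents` field is deliberately left out of the task data, since those are the labels we want annotators (or the LLM) to produce.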


In [51]:
# List possible intents
# Initialize a set to hold all unique intents
unique_intents = set()

# Process each dialogue and its turns to extract intents
with open(input_file_path, 'r') as infile:
    for line in infile:
        # Parse the JSON object from the line
        dialogue = json.loads(line)
        
        # Process each turn in the dialogue for intents
        for turn in dialogue['turns']:
            # Extract and add the intents to the set
            if 'intents' in turn:  # Check if the intents field exists
                for intent in turn['intents']:
                    unique_intents.add(intent)

# Convert the set to a sorted list for better readability
sorted_unique_intents = sorted(unique_intents)

print(f"Possible intents in dialogues: {len(sorted_unique_intents)}")

Possible intents in dialogues: 22


We can view the different labels assigned to the dataset below. We will use these labels to create our classes in the Label Studio project setup. 

In [52]:
print(sorted_unique_intents)

['AddDependent', 'CancelAutomaticBilling', 'CancelPlan', 'ChangeAddress', 'ChangePlan', 'ChangeSecurityQuestion', 'CheckAccountBalance', 'CheckPaymentStatus', 'CreateAccount', 'EnrollInPlan', 'FileClaim', 'FindAgent', 'GetPolicyNumber', 'GetQuote', 'PayBill', 'RemoveDependent', 'ReportAutomobileAccident', 'ReportBillingIssue', 'RequestProofOfInsurance', 'ResetPassword', 'UpdateBillingFrequency', 'UpdatePaymentPreference']


## Dialogue Project Setup
The following cells set up a new project in Label Studio specifically for this classification task. This section explains how to dynamically generate choice elements based on identified intents in the dataset.

The label config also includes a `prompt` text area that lets the project interact with an LLM through the [Label Studio ML Backend - LLM Interactive](https://github.com/HumanSignal/label-studio-ml-backend/tree/master/label_studio_ml/examples/llm_interactive) example. This gives us a prompt area in the Labeling Interface where we can apply LLM interactions to our output categories.

In [39]:
# Generate choice XML elements dynamically from the sorted_unique_intents list
choices_xml = '\n'.join([f'      <Choice value="{intent}" />' for intent in sorted_unique_intents])

project = ls.start_project(
    title='Finance Support Chats',
    label_config=f'''
<View>
  <Style>
    .lsf-main-content.lsf-requesting .prompt::before {{ content: ' loading...'; color: #808080; }}
  </Style>
  <Paragraphs name="chat" value="$dialogue" layout="dialogue" />
  <Header value="User prompt:" />
  <View className="prompt">
    <TextArea name="prompt" toName="chat" rows="4" editable="true" maxSubmissions="1" showSubmitButton="false" />
  </View>
  <Header value="Bot answer:" />
  <TextArea name="response" toName="chat" rows="4" editable="true" maxSubmissions="1" showSubmitButton="false" />
  <Choices name="response2" toName="chat" choice="multiple">
{choices_xml}
  </Choices>
</View>
    '''
)
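
The cell above interpolates intent names directly into XML attribute values. That works for the CamelCase names in this dataset, but a label containing `&`, `<`, or a double quote would produce an invalid label config. The following is a minimal sketch of an escaped variant using the standard library's `xml.sax.saxutils.quoteattr`. The helper name `build_choices_xml` and the sample labels are illustrative assumptions, not part of the Label Studio SDK.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

def build_choices_xml(intents, indent='      '):
    """Build one <Choice> element per intent, escaping each attribute value."""
    # quoteattr() returns the value already wrapped in quotes and XML-escaped
    return '\n'.join(f'{indent}<Choice value={quoteattr(intent)} />' for intent in intents)

# Hypothetical labels; 'Q&A' contains a character that must be escaped in XML
example_xml = build_choices_xml(['PayBill', 'Q&A'])

# Parse the fragment to confirm it is well-formed and round-trips the labels
parsed = ET.fromstring(f'<Choices name="response2" toName="chat">{example_xml}</Choices>')
print([choice.get('value') for choice in parsed])
```

In the notebook, `build_choices_xml(sorted_unique_intents)` could stand in for the list comprehension that builds `choices_xml`.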

## Ingest into Label Studio
We can now ingest the prepared data into the newly created Label Studio project.

In [None]:
# Import the transformed dialogues as tasks
project.import_tasks(transformed_dialogues) 
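
For larger datasets, it may be preferable to import tasks in batches so that each request stays small. This is a minimal sketch under that assumption: the `chunked` helper and the batch size are hypothetical, and the stand-in data lets the block run without a Label Studio instance.

```python
def chunked(items, size):
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

# Stand-in tasks so the sketch runs without a live project
demo_tasks = [{'dialogue': []} for _ in range(5)]
batch_sizes = [len(batch) for batch in chunked(demo_tasks, 2)]
print(batch_sizes)

# With a live project, each batch would go through the same call used above:
# for batch in chunked(transformed_dialogues, 100):
#     project.import_tasks(batch)
```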

Here is a sample labeling prompt to get you started. 

```text
Label the dialogue with the appropriate labels from the list below. Make sure there are no duplicates. ['AddDependent', 'CancelAutomaticBilling', 'CancelPlan', 'ChangeAddress', 'ChangePlan', 'ChangeSecurityQuestion', 'CheckAccountBalance', 'CheckPaymentStatus', 'CreateAccount', 'EnrollInPlan', 'FileClaim', 'FindAgent', 'GetPolicyNumber', 'GetQuote', 'PayBill', 'RemoveDependent', 'ReportAutomobileAccident', 'ReportBillingIssue', 'RequestProofOfInsurance', 'ResetPassword', 'UpdateBillingFrequency', 'UpdatePaymentPreference'] Output the labels only, with a newline character between each label.
```
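
Because the prompt asks for one label per line, the model's reply can be checked against the allowed classes before it is trusted. The following is a minimal sketch of such a check. The function name `parse_labels`, the short label list, and the sample reply are made up for illustration, and this is not how the LLM Interactive backend itself processes responses.

```python
def parse_labels(response_text, allowed_labels):
    """Return the allowed labels found in a newline-separated reply, without duplicates."""
    allowed = set(allowed_labels)
    seen = set()
    labels = []
    for line in response_text.splitlines():
        label = line.strip()
        # Keep only known labels, and keep each one once, in order of appearance
        if label in allowed and label not in seen:
            seen.add(label)
            labels.append(label)
    return labels

# Hypothetical reply containing a blank line, a duplicate, and an unknown label
allowed = ['FileClaim', 'GetQuote', 'PayBill']
raw_reply = "FileClaim\nPayBill\n\nFileClaim\nNotARealLabel\n"

parsed_labels = parse_labels(raw_reply, allowed)
print(parsed_labels)
```

In the notebook, `sorted_unique_intents` would take the place of the short `allowed` list.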