## **Problem Statement**

### Business Context

In today's dynamic business landscape, customer feedback plays a critical role in shaping the success of products and services. Responding swiftly and effectively to customer input enhances the customer experience, drives growth, strengthens engagement, and builds long-term relationships. For Product Managers and Product Analysts, staying aligned with the voice of the customer is not just a best practice; it is a strategic priority.

With an abundance of customer feedback and support tickets, our focus extends beyond simply processing these inputs. To truly impact customer experience and expectations, we will adopt a structured approach—one that identifies pressing issues, prioritizes effectively, and allocates resources wisely. To achieve this, we will leverage the power of Support Ticket Categorization as a key strategy.


### Objective

We are developing a support ticket categorization system that classifies incoming tickets, assigns relevant tags based on their content, assigns a priority and an estimated time of resolution (ETA) so that urgent tickets are handled promptly, and generates a first response based on the sentiment of the ticket.


## **Installing and Importing Necessary Libraries and Dependencies**

In [None]:
# Installation for GPU llama-cpp-python
# run the following command if a GPU is being used (comment it out otherwise)

!CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.85 --force-reinstall --no-cache-dir -q


In [None]:
# Installation for CPU llama-cpp-python
# uncomment and run the following code in case GPU is not being used

# !CMAKE_ARGS="-DLLAMA_CUBLAS=off" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.85 --force-reinstall --no-cache-dir -q

**Note** : There may be an error related to a dependency issue thrown by the pip package. This can be ignored as it will not impact the execution of the code.

In [None]:
# For downloading the models from HF Hub
!pip install huggingface_hub==0.20.3 -q

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
accelerate 0.32.1 requires numpy<2.0.0,>=1.17, but you have numpy 2.0.1 which is incompatible.
transformers 4.42.4 requires huggingface-hub<1.0,>=0.23.2, but you have huggingface-hub 0.20.3 which is incompatible.
transformers 4.42.4 requires numpy<2.0,>=1.17, but you have numpy 2.0.1 which is incompatible.

In [None]:
# Function to download the model from the Hugging Face model hub
from huggingface_hub import hf_hub_download

# Importing the Llama class from the llama_cpp module
from llama_cpp import Llama

# Importing the json module
import json

# for loading and manipulating data
import pandas as pd

# for time computations
import time

## **Loading the Data**

In [None]:
from google.colab import drive
drive.mount('/content/drive')


Mounted at /content/drive


In [None]:
# code to read the CSV file.
data = pd.read_csv("/content/drive/MyDrive/NLP/Support_ticket_text_data_mid_term.csv")

## **Data Overview**

### Checking the first 5 rows of the data

In [None]:
# checking the first five rows of the data
data.head()


Unnamed: 0,support_tick_id,support_ticket_text
0,ST2023-006,My internet connection has significantly slowe...
1,ST2023-007,Urgent help required! My laptop refuses to sta...
2,ST2023-008,I've accidentally deleted essential work docum...
3,ST2023-009,Despite being in close proximity to my Wi-Fi r...
4,ST2023-010,"My smartphone battery is draining rapidly, eve..."


### Checking the shape of the data

In [None]:
# checking the shape of the data
data.shape

(21, 2)

* There are 21 rows and 2 columns in the dataset.

### Checking the missing values in the data

In [None]:
# checking for missing values
data.isnull().sum()

Unnamed: 0,0
support_tick_id,0
support_ticket_text,0


* There are no missing values in the data.

## **Model Building**

### Loading the model

In [None]:
model_name_or_path = "TheBloke/Mistral-7B-Instruct-v0.2-GGUF"
model_basename = "mistral-7b-instruct-v0.2.Q6_K.gguf"

In [None]:
model_path = hf_hub_download(
    repo_id=model_name_or_path,
    filename=model_basename
)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


mistral-7b-instruct-v0.2.Q6_K.gguf:   0%|          | 0.00/5.94G [00:00<?, ?B/s]

In [None]:

llm = Llama(
     model_path=model_path,
     n_ctx=1024, # Context window
 )

AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | 


### Utility functions

In [None]:
# defining a function to parse the JSON output from the model
"""def extract_json_data(json_str):
    try:
        # Find the indices of the opening and closing curly braces
        json_start = json_str.find('{')
        json_end = json_str.rfind('}')

        if json_start != -1 and json_end != -1:
            extracted_category = json_str[json_start:json_end + 1]  # Extract the JSON object
            data_dict = json.loads(extracted_category)
            return data_dict
        else:
            print(f"Warning: JSON object not found in response: {json_str}")
            return {}
    except json.JSONDecodeError as e:
        print(f"Error parsing JSON: {e}")
        return {} """

def extract_json_data(response):
    try:
        # Attempt to load as JSON
        json_data = json.loads(response)
    except json.JSONDecodeError:
        # If not a valid JSON, parse the text format and convert it to JSON
        lines = response.splitlines()
        json_data = {}
        for line in lines:
            if ":" in line:
                key, value = line.split(":", 1)
                json_data[key.strip()] = value.strip()
            else:
                print(f"Warning: Unable to parse line: {line}")

    return json_data
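
The helper above tries strict JSON first and then falls back to `key: value` lines. A complementary approach, sketched here under the assumption that the model sometimes wraps one JSON object in extra text, is to scan for the first decodable object. The name `extract_first_json_object` is hypothetical and is not used elsewhere in this notebook.

```python
import json


def extract_first_json_object(text):
    """Return the first decodable JSON object found in text, or {} if none."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at the given index
            obj, _ = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        # Try again from the next opening brace
        start = text.find("{", start + 1)
    return {}


print(extract_first_json_object('Sure! {"Category": "Hardware Issue"} Hope this helps.'))
```

Returning an empty dictionary on failure keeps the behaviour close to the helper above, so callers can treat "nothing parsed" uniformly.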

## **Task 1: Ticket Categorization and Returning Structured Output**

In [None]:
# creating a copy of the data
data_1 = data.copy()

In [None]:
# Defining the response function for Task 1.
def response_1(prompt,ticket):
    model_output = llm(
      f"""
      Q: {prompt}
      Support ticket: {ticket}
      A:
      """,
      max_tokens=32, # setting the maximum number of tokens the model should generate for this task.
      stop=["Q:", "\n"],
      temperature=0.01, # setting the value for temperature.
      echo=False,
    )

    temp_output = model_output["choices"][0]["text"]
     # Try to parse the output as JSON
    try:
        json_output = json.loads(temp_output)
        # Extract the value (assuming it's a single key-value pair)
        category = list(json_output.values())[0]
        return category  # Return just the category value
    except json.JSONDecodeError:
        # If it's not valid JSON, try to extract the part that looks like JSON
        start_index = temp_output.find('{')
        end_index = temp_output.rfind('}') + 1  # 0 when no closing brace is found
        if start_index != -1 and end_index > start_index:
            json_part = temp_output[start_index:end_index]
            try:
                json_output = json.loads(json_part)
                category = list(json_output.values())[0]
                return category  # Return just the category value
            except json.JSONDecodeError:
                pass

        # If all else fails, return the original output
        return temp_output.strip('"{}')  # Remove quotes and braces
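
Note that the f-string in `response_1` inserts `prompt` verbatim, so a marker such as `{support_ticket_text}` inside the prompt text is not substituted; the ticket reaches the model only through the separate `Support ticket:` line. If substitution inside the prompt is wanted, `str.format` is awkward for prompts that show a JSON example, because every literal brace would have to be doubled. A minimal sketch of one alternative follows; `fill_placeholders` and the shortened template are hypothetical and not part of the notebook.

```python
def fill_placeholders(template, **values):
    """Replace {name} markers by plain string substitution.

    Unlike str.format, other braces in the template (for example a JSON
    sample) are left untouched.
    """
    filled = template
    for name, value in values.items():
        filled = filled.replace("{" + name + "}", str(value))
    return filled


# Hypothetical, shortened prompt with one marker and one literal JSON example
template = (
    "Support Ticket: {support_ticket_text}\n"
    'Return JSON such as {"Category": "Technical Issue"}'
)
filled_prompt = fill_placeholders(
    template, support_ticket_text="My laptop refuses to start."
)
print(filled_prompt)
```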


In [None]:
prompt_1 = """
You are an AI assistant specializing in support ticket categorization. Your task is to analyze the given support ticket and categorize it into the most relevant category. Choose the single most appropriate category from the list below.

Support Ticket: {support_ticket_text}

Please categorize this support ticket into one of the following categories:
1. Hardware Issue
2. Software Problem
3. Network Connectivity
4. Account Access
5. Billing Inquiry
6. Product Functionality
7. Data Loss/Recovery
8. Security Concern
9. Performance Issue
10. Update/Upgrade Request
11. User Training
12. Feature Request
13. Third-party Integration
14. Configuration
15. Technical Issue
16. Other (specify if none of the above fit)

Example:
"Support Ticket": "My smartphone battery is draining rapidly, even with minimal use. Can you help me identify and rectify this battery issue?"
"Category": "Technical Issue"

Return your response as a JSON object with a single key "Category" whose value is a string. For example:
{"Category": "Technical Issue"}

Only return the JSON, do NOT return any other text or information.
"""


In [None]:
start = time.time()
data_1['model_response'] = data_1['support_ticket_text'].apply(lambda x: response_1(prompt_1, x))
end = time.time()

Llama.generate: prefix-match hit


In [None]:
print("Time taken ",(end-start))


Time taken  290.763546705246
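
The single timer above covers all 21 tickets, which works out to roughly 13.8 seconds per ticket. To see whether particular tickets are slow, each call can be timed separately. The sketch below assumes a hypothetical helper, `timed_map`, and uses `str.upper` as a stand-in for the model call so that it runs without the model.

```python
import time


def timed_map(func, items):
    """Apply func to each item and record how long each call takes (seconds)."""
    results, durations = [], []
    for item in items:
        t0 = time.perf_counter()
        results.append(func(item))
        durations.append(time.perf_counter() - t0)
    return results, durations


# Stand-in workload; in the notebook the function would wrap the model call
results, durations = timed_map(str.upper, ["a", "b"])
print(results, [round(d, 6) for d in durations])
```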


In [None]:
#  checking the first five rows of the data to confirm whether the new column has been added
data_1['model_response'].head()

Unnamed: 0,model_response
0,Network Connectivity
1,Hardware Issue
2,Data Loss/Recovery
3,Network Connectivity
4,Hardware Issue


In [None]:
i = 1
print(data_1.loc[i, 'support_ticket_text'])

Urgent help required! My laptop refuses to start, and I have a crucial presentation scheduled for tomorrow. I've attempted a restart, but it hasn't worked. Please provide immediate assistance to resolve this hardware issue


In [None]:
print(data_1.loc[i, 'model_response'])

Hardware Issue


In [None]:
data_1['model_response'].value_counts()

Unnamed: 0_level_0,count
model_response,Unnamed: 1_level_1
Hardware Issue,7
Data Loss/Recovery,6
Network Connectivity,5
Account Access,1
Performance Issue,1
Software Problem,1


In [None]:
#  renaming the column
data_1 = data_1.rename(columns={'model_response': 'Category'})
data_1.head()

Unnamed: 0,support_tick_id,support_ticket_text,Category
0,ST2023-006,My internet connection has significantly slowe...,Network Connectivity
1,ST2023-007,Urgent help required! My laptop refuses to sta...,Hardware Issue
2,ST2023-008,I've accidentally deleted essential work docum...,Data Loss/Recovery
3,ST2023-009,Despite being in close proximity to my Wi-Fi r...,Network Connectivity
4,ST2023-010,"My smartphone battery is draining rapidly, eve...",Hardware Issue


In [None]:
final_data_1=data_1.copy()

* Category column has been generated.

## **Task 2: Creating Tags**

In [None]:
# creating a copy of the data
data_2 = data.copy()

In [None]:
def response_2(prompt,ticket,category):
    model_output = llm(
      f"""
      Q: {prompt}
      Support ticket: {ticket}
      Category: {category}
      A:
      """,
      max_tokens=150, #setting the maximum number of tokens the model should generate for this task.
      stop=["Q:", "\n"],
      temperature=0.01, # setting the value for temperature.
      echo=False,
    )

    temp_output = model_output["choices"][0]["text"]
     # Try to parse the output as JSON
    try:
        json_output = json.loads(temp_output)
        if isinstance(json_output, list):
            return ', '.join(json_output)  # Join list items with comma
        elif isinstance(json_output, dict) and 'Tags' in json_output:
            return ', '.join(json_output['Tags'])  # Join list items with comma
        else:
            return json.dumps(json_output)  # Return properly formatted JSON string
    except json.JSONDecodeError:
        # If it's not valid JSON, try to extract the part that looks like a list
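        try:
            import ast  # local import: ast is not imported elsewhere in this notebook
            list_output = ast.literal_eval(temp_output.strip())
            if isinstance(list_output, list):
                return ', '.join(map(str, list_output))  # Join list items with comma
        except (ValueError, SyntaxError, TypeError):
            pass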

        # If all else fails, clean up the output manually
        cleaned_output = temp_output.strip('[]').replace('"', '').replace("'", "")
        return cleaned_output
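
`response_2` returns the tags as one comma-separated string. If a list is more convenient downstream, the same fallbacks can be arranged to always yield a list. The sketch below is one possible arrangement; `parse_tags` is a hypothetical name, and the case-insensitive de-duplication is an added assumption rather than something the notebook does.

```python
import json


def parse_tags(raw):
    """Turn a model reply into a de-duplicated list of tag strings."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        parsed = None
    if isinstance(parsed, dict):
        parsed = parsed.get("Tags", [])
    if not isinstance(parsed, list):
        # Fallback: treat the reply as comma-separated text
        parsed = raw.strip().strip("[]{}").split(",")
    tags, seen = [], set()
    for item in parsed:
        tag = str(item).strip().strip("\"'").strip()
        key = tag.lower()
        if tag and key not in seen:
            seen.add(key)
            tags.append(tag)
    return tags


print(parse_tags('{"Tags": ["Laptop", "Startup", "laptop"]}'))
```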


In [None]:
prompt_2 = """
You are a support assistant. Based on the following support ticket description, generate relevant tags that accurately categorize the issue. The tags should be concise and reflect the key problems or themes in the ticket.

Example:
"Support Ticket": "Urgent help required! My laptop refuses to start, and I have a crucial presentation scheduled for tomorrow. I've attempted a restart, but it hasn't worked. Please provide immediate assistance to resolve this hardware issue"
"Tags": ["Laptop", "Restart", "Hardware issue"]

Return your response as a JSON object with a single key "Tags" whose value is an array of strings. For example:
{"Tags": ["Tag1", "Tag2", "Tag3"]}

Only return the JSON, do NOT return any other text or information.
"""

In [None]:
start = time.time()
data_2["model_response"] = final_data_1.apply(lambda row: response_2(prompt_2, row['support_ticket_text'], row['Category']), axis=1)
end = time.time()

Llama.generate: prefix-match hit


In [None]:
print("Time taken ",end-start)

Time taken  364.71262073516846


In [None]:
# checking the first five rows of the data to confirm whether the new column has been added
data_2['model_response'].head()

Unnamed: 0,model_response
0,"Internet, Slow connection, Disconnections, Net..."
1,"Laptop, Startup, Hardware issue"
2,"Data Loss, Data Recovery"
3,"Wi-Fi, Signal strength, Network connectivity"
4,"Smartphone, Battery, Hardware Issue"


In [None]:
i = 1
print(data_2.loc[i, 'support_ticket_text'])


Urgent help required! My laptop refuses to start, and I have a crucial presentation scheduled for tomorrow. I've attempted a restart, but it hasn't worked. Please provide immediate assistance to resolve this hardware issue


In [None]:
print(data_2.loc[i, 'model_response'])

Laptop, Startup, Hardware issue


In [None]:
# renaming the column to 'Tag', the name used in the cells that follow
data_2 = data_2.rename(columns={'model_response': 'Tag'})
data_2.head()

Unnamed: 0,support_tick_id,support_ticket_text,Tag,model_response_parsed
0,ST2023-006,My internet connection has significantly slowe...,"Internet, Slow connection, Disconnections, Net...",{}
1,ST2023-007,Urgent help required! My laptop refuses to sta...,"Laptop, Startup, Hardware issue",{}
2,ST2023-008,I've accidentally deleted essential work docum...,"Data Loss, Data Recovery",{}
3,ST2023-009,Despite being in close proximity to my Wi-Fi r...,"Wi-Fi, Signal strength, Network connectivity",{}
4,ST2023-010,"My smartphone battery is draining rapidly, eve...","Smartphone, Battery, Hardware Issue",{}


In [None]:
# Dropping the model_response_parsed column if it is present
final_data_2 = data_2.drop(columns=['model_response_parsed'], errors='ignore')
final_data_2.head()

Unnamed: 0,support_tick_id,support_ticket_text,Tag
0,ST2023-006,My internet connection has significantly slowe...,"Internet, Slow connection, Disconnections, Net..."
1,ST2023-007,Urgent help required! My laptop refuses to sta...,"Laptop, Startup, Hardware issue"
2,ST2023-008,I've accidentally deleted essential work docum...,"Data Loss, Data Recovery"
3,ST2023-009,Despite being in close proximity to my Wi-Fi r...,"Wi-Fi, Signal strength, Network connectivity"
4,ST2023-010,"My smartphone battery is draining rapidly, eve...","Smartphone, Battery, Hardware Issue"


* Tag column has been added.

In [None]:
# Checking the value counts of the Tag column
final_data_2['Tag'].value_counts()

Unnamed: 0_level_0,count
Tag,Unnamed: 1_level_1
"Internet, Slow connection, Disconnections, Network connectivity",2
"USB Drive, Data Loss, Data Recovery",2
"External Hard Drive, Data Loss, Recovery",2
"Wi-Fi, Network connectivity",1
"Internet Connection, Network Connectivity",1
"Laptop, Touchpad, Hardware issue",1
"USB flash drive, Data loss, File recovery",1
"Laptop, Water Damage, Data Recovery",1
"Computer, Screen, Hardware issue",1
"Gaming Laptop, Graphics Card, Hardware Issue",1


In [None]:
final_data_2 = pd.concat([final_data_2,final_data_1["Category"]],axis=1)

In [None]:
final_data_2 = final_data_2[["support_tick_id","support_ticket_text","Category","Tag"]]
final_data_2

Unnamed: 0,support_tick_id,support_ticket_text,Category,Tag
0,ST2023-006,My internet connection has significantly slowe...,Network Connectivity,"Internet, Slow connection, Disconnections, Net..."
1,ST2023-007,Urgent help required! My laptop refuses to sta...,Hardware Issue,"Laptop, Startup, Hardware issue"
2,ST2023-008,I've accidentally deleted essential work docum...,Data Loss/Recovery,"Data Loss, Data Recovery"
3,ST2023-009,Despite being in close proximity to my Wi-Fi r...,Network Connectivity,"Wi-Fi, Signal strength, Network connectivity"
4,ST2023-010,"My smartphone battery is draining rapidly, eve...",Hardware Issue,"Smartphone, Battery, Hardware Issue"
5,ST2023-011,I'm locked out of my online banking account an...,Account Access,"Online Banking, Account Access, Password Reset"
6,ST2023-012,"My computer's performance is sluggish, severel...",Performance Issue,"Performance, Optimization"
7,ST2023-013,I'm experiencing a recurring blue screen error...,Hardware Issue,"PC, Blue Screen Error, Hardware Issue"
8,ST2023-014,My external hard drive isn't being recognized ...,Data Loss/Recovery,"External Hard Drive, Data Loss, Recovery"
9,ST2023-015,The graphics card in my gaming laptop seems to...,Hardware Issue,"Gaming Laptop, Graphics Card, Hardware Issue"


## **Task 3: Assigning Priority and ETA**

### **Assigning Priority**

In [None]:
# creating a copy of the data
data_3 = data.copy()

In [None]:
def response_3(prompt,ticket,category,tag):
    model_output = llm(
      f"""
      Q: {prompt}
      Support ticket: {ticket}
      Category: {category}
      Tag: {tag}
      A:
      """,
      max_tokens=150,  # setting the maximum number of tokens the model should generate for this task.
      stop=["Q:", "\n"],
      temperature=0.3, #setting the value for temperature.
      echo=False,
    )

    temp_output = model_output["choices"][0]["text"]
    # Output the raw response for inspection
   # print("Raw output:", temp_output)

     # Try to parse the output as JSON
    try:
        json_output = json.loads(temp_output)
        return json_output
    except json.JSONDecodeError:
        # If parsing fails, return a structured dictionary
        return {
            "Priority": temp_output.split("Priority:")[-1].strip()
        }
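
When JSON parsing fails, the fallback above stores whatever text follows `Priority:`, which may be a full sentence rather than a single level. A minimal sketch of a stricter normalizer is given below; `normalize_priority` is a hypothetical name, and returning `None` for unrecognized replies is an assumption made so that such rows can be reviewed by hand.

```python
import re

# Checked from most to least severe, so a reply that mentions several
# levels resolves to the most severe one.
PRIORITY_LEVELS = ("Critical", "High", "Medium", "Low")


def normalize_priority(raw):
    """Return one of the four priority levels found in raw, or None."""
    text = raw if isinstance(raw, str) else str(raw)
    for level in PRIORITY_LEVELS:
        if re.search(rf"\b{level}\b", text, flags=re.IGNORECASE):
            return level
    return None


print(normalize_priority("Priority: high"))
```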

In [None]:
prompt_3 ="""
To assign priority for support tickets, you can use the following prompt:
For each support ticket, analyze the ticket text and assign:

Priority level (Low, Medium, High, Critical) based on:

Urgency expressed by the user
Impact on work or productivity
Severity of the issue
Time sensitivity of the problem

Example:
Support Ticket: Urgent help required! My laptop refuses to start, and I have a crucial presentation scheduled for tomorrow. I've attempted a restart, but it hasn't worked. Please provide immediate assistance to resolve this hardware issue
Category: Hardware Issue
Tag : Laptop, Startup, Hardware issue
Priority: High

Now, assign Priority  for all the support tickets:

Support Ticket: {support_ticket_text}
Category: {Category}
Tag : {Tag}
Priority:

Respond with only the priority level.
Consider the nature of each issue, its impact on the user, and the likely resources needed for resolution when assigning priority.
"""

In [None]:
# Applying response_3 to each ticket together with its category and tags
start = time.time()
data_3['model_response'] = final_data_2.apply(lambda row: response_3(prompt_3, row['support_ticket_text'], row['Category'], row['Tag']), axis=1)
end = time.time()

Llama.generate: prefix-match hit


In [None]:
print("Time taken ",(end-start))

Time taken  65.96540069580078


In [None]:
#  checking the first five rows of the data to confirm whether the new column has been added
data_3['model_response'].head()

Unnamed: 0,model_response
0,{'Priority': 'Medium'}
1,{'Priority': 'High'}
2,{'Priority': 'Critical'}
3,{'Priority': 'Medium'}
4,{'Priority': 'Medium'}


In [None]:
i = 2
print(data_3.loc[i, 'support_ticket_text'])

I've accidentally deleted essential work documents, causing substantial data loss. I understand the need to avoid further actions on my device. Can you please prioritize the data recovery process and guide me through it?


In [None]:
print(data_3.loc[i, 'model_response'])

{'Priority': 'Critical'}


In [None]:
# Normalizing the model_response column into a DataFrame
model_response_parsed_df_3 = pd.json_normalize(data_3['model_response'])
model_response_parsed_df_3.head(21)

Unnamed: 0,Priority
0,Medium
1,High
2,Critical
3,Medium
4,Medium
5,High
6,Medium
7,High
8,High
9,High


In [None]:
# Concatenating the two dataframes
data_with_parsed_model_output_3 = pd.concat([data_3, model_response_parsed_df_3], axis=1)
data_with_parsed_model_output_3.head()

Unnamed: 0,support_tick_id,support_ticket_text,model_response,Priority
0,ST2023-006,My internet connection has significantly slowe...,{'Priority': 'Medium'},Medium
1,ST2023-007,Urgent help required! My laptop refuses to sta...,{'Priority': 'High'},High
2,ST2023-008,I've accidentally deleted essential work docum...,{'Priority': 'Critical'},Critical
3,ST2023-009,Despite being in close proximity to my Wi-Fi r...,{'Priority': 'Medium'},Medium
4,ST2023-010,"My smartphone battery is draining rapidly, eve...",{'Priority': 'Medium'},Medium


In [None]:
# Dropping the model_response column
final_data_3 = data_with_parsed_model_output_3.drop(['model_response'], axis=1)
final_data_3.head()

Unnamed: 0,support_tick_id,support_ticket_text,Priority
0,ST2023-006,My internet connection has significantly slowe...,Medium
1,ST2023-007,Urgent help required! My laptop refuses to sta...,High
2,ST2023-008,I've accidentally deleted essential work docum...,Critical
3,ST2023-009,Despite being in close proximity to my Wi-Fi r...,Medium
4,ST2023-010,"My smartphone battery is draining rapidly, eve...",Medium


* Priority column has been added.

In [None]:
final_data_3 = pd.concat([final_data_3,final_data_2[["Category","Tag"]]],axis=1)
final_data_3.head()

Unnamed: 0,support_tick_id,support_ticket_text,Priority,Category,Tag
0,ST2023-006,My internet connection has significantly slowe...,Medium,Network Connectivity,"Internet, Slow connection, Disconnections, Net..."
1,ST2023-007,Urgent help required! My laptop refuses to sta...,High,Hardware Issue,"Laptop, Startup, Hardware issue"
2,ST2023-008,I've accidentally deleted essential work docum...,Critical,Data Loss/Recovery,"Data Loss, Data Recovery"
3,ST2023-009,Despite being in close proximity to my Wi-Fi r...,Medium,Network Connectivity,"Wi-Fi, Signal strength, Network connectivity"
4,ST2023-010,"My smartphone battery is draining rapidly, eve...",Medium,Hardware Issue,"Smartphone, Battery, Hardware Issue"


In [None]:
final_data_3

Unnamed: 0,support_tick_id,support_ticket_text,Priority,Category,Tag
0,ST2023-006,My internet connection has significantly slowe...,Medium,Network Connectivity,"Internet, Slow connection, Disconnections, Net..."
1,ST2023-007,Urgent help required! My laptop refuses to sta...,High,Hardware Issue,"Laptop, Startup, Hardware issue"
2,ST2023-008,I've accidentally deleted essential work docum...,Critical,Data Loss/Recovery,"Data Loss, Data Recovery"
3,ST2023-009,Despite being in close proximity to my Wi-Fi r...,Medium,Network Connectivity,"Wi-Fi, Signal strength, Network connectivity"
4,ST2023-010,"My smartphone battery is draining rapidly, eve...",Medium,Hardware Issue,"Smartphone, Battery, Hardware Issue"
5,ST2023-011,I'm locked out of my online banking account an...,High,Account Access,"Online Banking, Account Access, Password Reset"
6,ST2023-012,"My computer's performance is sluggish, severel...",Medium,Performance Issue,"Performance, Optimization"
7,ST2023-013,I'm experiencing a recurring blue screen error...,High,Hardware Issue,"PC, Blue Screen Error, Hardware Issue"
8,ST2023-014,My external hard drive isn't being recognized ...,High,Data Loss/Recovery,"External Hard Drive, Data Loss, Recovery"
9,ST2023-015,The graphics card in my gaming laptop seems to...,High,Hardware Issue,"Gaming Laptop, Graphics Card, Hardware Issue"


### **Assigning ETA**

In [None]:
# creating a copy of the data
df_3 = data.copy()

In [None]:
def response(prompt,ticket,category,tag,priority):
    model_output = llm(
      f"""
      Q: {prompt}
      Support ticket: {ticket}
      Category: {category}
      Tag : {tag}
      Priority : {priority}
      A:
      """,
      max_tokens=150,  # setting the maximum number of tokens the model should generate for this task.
      stop=["Q:", "\n"],
      temperature=0.3, # setting the value for temperature.
      echo=False,
    )

    temp_output = model_output["choices"][0]["text"]
    # Output the raw response for inspection
    #print("Raw output:", temp_output)

     # Try to parse the output as JSON
    try:
        json_output = json.loads(temp_output)
        return json_output
    except json.JSONDecodeError:
        # If parsing fails, return a structured dictionary
        return {
            "ETA": temp_output.split("ETA:")[-1].strip()
        }

In [None]:
prompt="""
To assign  ETA (Estimated Time of Arrival/resolution) for support tickets, you can use the following prompt:
For each support ticket, analyze the ticket text and assign:

ETA (Estimated Time of Arrival/resolution) based on:

Complexity of the issue
Typical resolution time for similar problems
Available resources and expertise
Any time-specific requirements mentioned by the user

Use the following guidelines for assigning ETAs:

Immediate: For critical issues requiring instant attention
2-4 hours: For urgent issues that can be resolved quickly
24 hours: For high-priority issues requiring more time
2-3 business days: For medium-priority issues
3-5 business days: For low-priority or complex issues

Fill the ETA columns only with the values mentioned above.

Example:
Support Ticket: Urgent help required! My laptop refuses to start, and I have a crucial presentation scheduled for tomorrow. I've attempted a restart, but it hasn't worked. Please provide immediate assistance to resolve this hardware issue
Category: Hardware Issue
Tag : Laptop, Startup, Hardware issue
Priority: High
ETA: 24 hours

Now, assign ETA  for all the support tickets:

Support Ticket: {support_ticket_text}
Category: {Category}
Tag : {Tag}
Priority : {Priority}
ETA:

Respond with only the ETA.
Consider the nature of each issue, its impact on the user, and the likely resources needed for resolution when assigning the ETA.
"""

In [None]:
# Applying response to each ticket together with its category, tags and priority
start = time.time()
df_3['model_response'] = final_data_3.apply(lambda row: response(prompt, row['support_ticket_text'], row['Category'], row['Tag'], row['Priority']), axis=1)
end = time.time()

Llama.generate: prefix-match hit


In [None]:
# end was already recorded in the previous cell; resetting it here would add idle time
print("Time taken ",(end-start))

Time taken  331.961323261261


In [None]:
df_3['model_response'].head()

Unnamed: 0,model_response
0,{'ETA': '2-3 business days'}
1,{'ETA': '24 hours'}
2,{'ETA': 'Immediate'}
3,{'ETA': '2-3 business days'}
4,{'ETA': '3-5 business days'}


In [None]:
i = 2
print(df_3.loc[i, 'support_ticket_text'])

I've accidentally deleted essential work documents, causing substantial data loss. I understand the need to avoid further actions on my device. Can you please prioritize the data recovery process and guide me through it?


In [None]:
print(df_3.loc[i, 'model_response'])

{'ETA': 'Immediate'}


In [None]:
# Normalizing the model_response column into a dataframe with one column per key
model_response_parsed_df = pd.json_normalize(df_3['model_response'])
model_response_parsed_df.head(21)

Unnamed: 0,ETA
0,2-3 business days
1,24 hours
2,Immediate
3,2-3 business days
4,3-5 business days
5,2 hours
6,2-3 business days
7,3-5 business days
8,3-5 business days
9,3-5 business days
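
To make the normalization step concrete, here is a minimal sketch of what `pd.json_normalize` does with a column of dictionaries. The first three responses are copied from the output above; the empty dictionary is an assumption added for illustration, standing in for a response that could not be parsed into JSON.

```python
import pandas as pd

# Sample responses copied from the output above, plus one empty dict standing in
# for a response that could not be parsed (an assumption for illustration)
responses = pd.Series([
    {'ETA': '2-3 business days'},
    {'ETA': '24 hours'},
    {'ETA': 'Immediate'},
    {},
])

# Each dictionary key becomes a column; missing keys become NaN
parsed = pd.json_normalize(responses.tolist())
print(parsed)

# Rows without an ETA are worth checking before concatenating with the tickets
unparsed_rows = parsed.index[parsed['ETA'].isna()].tolist()
print(unparsed_rows)
```

Checking for missing values at this point helps catch tickets whose model output did not follow the requested format.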


In [None]:
# Concatenating the two dataframes
data_with_parsed_model_output = pd.concat([df_3, model_response_parsed_df], axis=1)
data_with_parsed_model_output.head()

Unnamed: 0,support_tick_id,support_ticket_text,model_response,ETA
0,ST2023-006,My internet connection has significantly slowe...,{'ETA': '2-3 business days'},2-3 business days
1,ST2023-007,Urgent help required! My laptop refuses to sta...,{'ETA': '24 hours'},24 hours
2,ST2023-008,I've accidentally deleted essential work docum...,{'ETA': 'Immediate'},Immediate
3,ST2023-009,Despite being in close proximity to my Wi-Fi r...,{'ETA': '2-3 business days'},2-3 business days
4,ST2023-010,"My smartphone battery is draining rapidly, eve...",{'ETA': '3-5 business days'},3-5 business days


In [None]:
# Dropping model_response column
final_data = data_with_parsed_model_output.drop(['model_response'], axis=1)
final_data.head()

Unnamed: 0,support_tick_id,support_ticket_text,ETA
0,ST2023-006,My internet connection has significantly slowe...,2-3 business days
1,ST2023-007,Urgent help required! My laptop refuses to sta...,24 hours
2,ST2023-008,I've accidentally deleted essential work docum...,Immediate
3,ST2023-009,Despite being in close proximity to my Wi-Fi r...,2-3 business days
4,ST2023-010,"My smartphone battery is draining rapidly, eve...",3-5 business days


* ETA column has been added.

In [None]:
# Adding the Category, Tag, and Priority columns from the earlier tasks
final_data = pd.concat([final_data, final_data_3[["Category","Tag","Priority"]]], axis=1)

In [None]:
final_data

Unnamed: 0,support_tick_id,support_ticket_text,ETA,Category,Tag,Priority
0,ST2023-006,My internet connection has significantly slowe...,2-3 business days,Network Connectivity,"Internet, Slow connection, Disconnections, Net...",Medium
1,ST2023-007,Urgent help required! My laptop refuses to sta...,24 hours,Hardware Issue,"Laptop, Startup, Hardware issue",High
2,ST2023-008,I've accidentally deleted essential work docum...,Immediate,Data Loss/Recovery,"Data Loss, Data Recovery",Critical
3,ST2023-009,Despite being in close proximity to my Wi-Fi r...,2-3 business days,Network Connectivity,"Wi-Fi, Signal strength, Network connectivity",Medium
4,ST2023-010,"My smartphone battery is draining rapidly, eve...",3-5 business days,Hardware Issue,"Smartphone, Battery, Hardware Issue",Medium
5,ST2023-011,I'm locked out of my online banking account an...,2 hours,Account Access,"Online Banking, Account Access, Password Reset",High
6,ST2023-012,"My computer's performance is sluggish, severel...",2-3 business days,Performance Issue,"Performance, Optimization",Medium
7,ST2023-013,I'm experiencing a recurring blue screen error...,3-5 business days,Hardware Issue,"PC, Blue Screen Error, Hardware Issue",High
8,ST2023-014,My external hard drive isn't being recognized ...,3-5 business days,Data Loss/Recovery,"External Hard Drive, Data Loss, Recovery",High
9,ST2023-015,The graphics card in my gaming laptop seems to...,3-5 business days,Hardware Issue,"Gaming Laptop, Graphics Card, Hardware Issue",High


## **Task 4 - Creating a Draft Response**

In [None]:
# creating a copy of the data
data_4 = data.copy()

In [None]:
import re

def response_4(prompt,ticket,category,tag,priority,eta):
    # Generating a draft reply from the prompt and the ticket details
    model_output = llm(
      f"""
      Q: {prompt}
      Support ticket: {ticket}
      Category : {category}
      Tag : {tag}
      Priority: {priority}
      ETA: {eta}
      A:
      """,
      max_tokens=500, # maximum number of tokens the model may generate; longer replies are cut off
      stop=None,
      temperature=0.7, # sampling temperature
      echo=False,
    )

    temp_output = model_output["choices"][0]["text"].strip()

    # Print raw output for debugging
    print(f"Raw output: {temp_output}")

    # Clean up the output
    cleaned_output = re.sub(r'\s+', ' ', temp_output)  # Replace each run of whitespace (including \n) with a single space
    cleaned_output = cleaned_output.strip()  # Remove leading/trailing whitespace
    cleaned_output = re.sub(r'^Dear valued customer,?\s*', '', cleaned_output, flags=re.IGNORECASE)  # Remove greeting if present

    return f"Dear valued customer, {cleaned_output}" if cleaned_output else "No response generated"
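
The clean-up logic in `response_4` can be tried without loading the model. The sketch below pulls the same three steps into a standalone helper; the name `clean_response` and the sample string are hypothetical and used only for illustration.

```python
import re

def clean_response(raw: str) -> str:
    """Apply the same clean-up steps as response_4 to a raw model output."""
    # Collapse every run of whitespace (including newlines) into one space
    collapsed = re.sub(r"\s+", " ", raw).strip()
    # Drop a greeting the model may have added despite the prompt instructions
    body = re.sub(r"^Dear valued customer,?\s*", "", collapsed, flags=re.IGNORECASE)
    # Add the greeting back exactly once, or flag an empty generation
    return f"Dear valued customer, {body}" if body else "No response generated"

# Hypothetical raw output with a lowercase greeting and stray line breaks
sample = "dear valued customer,\n   We are looking into it.\n  Thanks."
print(clean_response(sample))
print(clean_response("  \n "))
```

Removing the greeting before adding it back keeps the final text from starting with it twice.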

In [None]:
prompt_4 = """
You are a customer support agent. Your task is to create a concise, empathetic draft response for the following support ticket.

Support Ticket: {support_ticket_text}
Category: {Category}
Tag : {Tag}
Priority: {Priority}
Estimated Time to Resolution: {ETA}

Your response should:
1. Acknowledge the issue
2. Provide a brief explanation of the next steps
3. Give an estimated timeframe for resolution based on the priority
4. Offer any immediate advice or workaround if applicable

IMPORTANT: Write your response as a single, continuous line of text. Do not include any line breaks, newline characters, or the phrase "Dear valued customer,". The response must be crisp, concise, and in one unbroken line.
"""

**Note** : For this task, we will not be using the *`extract_json_data`* function. Hence, the output from the model should be a plain string and not a JSON object.
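
Note also that `prompt_4` is passed to `response_4` as a plain string, so the `{...}` placeholders reach the model literally and the ticket fields are appended separately in the `Q: ... A:` block. One alternative, sketched here as an assumption rather than the method used in this notebook, is to fill the placeholders with `str.format` before calling the model. The template below is a shortened stand-in for `prompt_4`, and the values come from ticket ST2023-012.

```python
# Shortened stand-in for prompt_4; the placeholder names match the notebook's template
template = """
Support Ticket: {support_ticket_text}
Category: {Category}
Tag : {Tag}
Priority: {Priority}
Estimated Time to Resolution: {ETA}
"""

# Field values taken from ticket ST2023-012 in the tables above
row = {
    "support_ticket_text": "My computer's performance is sluggish, severely impacting my work. "
                           "I need help optimizing it to regain productivity.",
    "Category": "Performance Issue",
    "Tag": "Performance, Optimization",
    "Priority": "Medium",
    "ETA": "2-3 business days",
}

# Fill every placeholder from the dictionary keys
filled = template.format(**row)
print(filled)
```

If a template contains literal braces, for example a JSON sample, they would need to be doubled (`{{` and `}}`) for `str.format` to leave them in place.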

In [None]:
# Applying the response_4 function to each ticket, passing the ticket text, category, tag, priority, and ETA
start = time.time()
data_4['model_response'] = final_data[['support_ticket_text','Category','Tag','Priority','ETA']].apply(lambda x: response_4(prompt_4, x.iloc[0], x.iloc[1], x.iloc[2], x.iloc[3], x.iloc[4]), axis=1)
end = time.time()

Llama.generate: prefix-match hit


Raw output: We're truly sorry to hear about the issues you've been experiencing with your internet connection. Our team is dedicated to resolving this for you as quickly as possible. Next steps include diagnosing potential network congestion, checking your equipment, and contacting your Internet Service Provider (ISP) if necessary. Based on the priority of your ticket, we anticipate a 2-3 business day resolution timeframe. In the meantime, have you tried restarting both your modem and router? This simple step can often help restore a stable connection. If the issue persists after trying this solution, please provide us with more details about your internet plan, connection type, and any error messages you've encountered for further assistance.


Llama.generate: prefix-match hit


Raw output: Understood the urgency of your situation with your laptop not starting before an important presentation. I'll escalate this high-priority hardware issue for immediate assistance. Our team will reach out as soon as possible for a remote diagnosis and potential solution, such as troubleshooting BIOS settings or performing a hard reset if safe to do so. The estimated time to resolution is 24 hours due to the urgency and complexity of this matter. In the interim, you may consider bringing your laptop to a local repair center for an emergency assessment should time become critical. We sincerely apologize for any inconvenience caused and are committed to resolving this issue promptly.


Llama.generate: prefix-match hit


Raw output: We deeply sympathize with your predicament. Rest assured, we'll prioritize data recovery based on the critical priority. Please refrain from using your device to avoid potential overwriting of deleted files. In the meantime, you might consider checking the Recycle Bin or Cloud storage if applicable. Our team will contact you shortly for further instructions and assistance with professional data recovery tools.


Llama.generate: prefix-match hit


Raw output: I'm really sorry to hear that you're experiencing persistent weak Wi-Fi signal issues despite being near your router. Let me guide you through some steps to help strengthen your connection. First, ensure no other devices are interfering with your Wi-Fi by disconnecting them one at a time while testing your connection. Also, check for any potential physical obstructions that might be obstructing the signal. In case these steps don't yield results, please provide your router model number and I will send you specific instructions based on its capabilities to optimize its performance. We aim to have this resolved within 2-3 business days.


Llama.generate: prefix-match hit


Raw output: ☹️ We're sorry to hear about your smartphone's rapid battery drain issue. Our team will investigate this matter thoroughly. Kindly provide some details like the make and model of your device, its current software version, and if possible, the frequency of occurrence for this problem. In the meantime, try to conserve battery life by turning off background apps, reducing screen brightness, or using power saving mode. Estimated resolution time: 3-5 business days.


Llama.generate: prefix-match hit


Raw output: I'm truly sorry to hear that you are unable to access your online banking account at this moment, and I understand the urgency of your situation. We'll work efficiently to help you reset your password as soon as possible. Once we receive your verification information, our team will initiate the password reset process within 2 hours. In the meantime, please ensure that you have access to the email address associated with your account for receiving the verification code. If you're unable to access the email, please contact us through a different communication channel like phone or chat so we can discuss alternative methods for receiving the verification code.


Llama.generate: prefix-match hit


Raw output: We apologize for the inconvenience you're experiencing with your computer's performance. To help improve its speed and restore productivity, our team will conduct a thorough analysis of your system. This process may include checking for outdated software, unnecessary startup programs, or system defragmentation. Rest assured, we are committed to addressing this issue within 2-3 business days. In the meantime, you might consider closing unused applications and browsing windows to free up some memory and processing power. Thank you for your patience as we work on a solution.


Llama.generate: prefix-match hit


Raw output: Apologies for the inconvenience of your blue screen error. We'll investigate this as a hardware concern. Please send us your system specs, including make and model of your PC, graphics card, RAM size, and installed hard drives. In the meantime, you may try updating your drivers or performing a System Restore to an earlier point where the issue didn't occur. The estimated repair time is 3-5 business days due to our high priority queue.


Llama.generate: prefix-match hit


Raw output: We're truly sorry to hear about the issue with your external hard drive not being recognized. Our team will work diligently to recover your vital data. Kindly provide us with the make and model number of the hard drive, along with any relevant error messages, for a faster resolution. Please avoid writing or reading new data on the drive to prevent potential data overwrite. The recovery process may take 3-5 business days due to priority. We'll keep you updated throughout the process.


Llama.generate: prefix-match hit


Raw output: We apologize for the inconvenience you're experiencing with your laptop's graphics card. Our team will investigate and work on a solution to improve your gaming performance. Kindly prepare your system for remote support access and expect resolution within 3-5 business days, prioritized based on your ticket. If possible, try updating your graphics card drivers as a temporary workaround.


Llama.generate: prefix-match hit


Raw output: Apologies for your data loss issue with the critical work files on your USB drive. We understand how important these files are to you. Our team will initiate a thorough data recovery process starting tomorrow, given the priority. It may take up to 3-5 business days to complete the recovery. In the meantime, please avoid using the USB drive to prevent further potential data overwrites. If you require immediate access to any of these files, kindly provide their names for possible interim solutions.


Llama.generate: prefix-match hit


Raw output: We apologize for the inconvenience you're experiencing with your computer screen going black. Our team is here to help and will prioritize this as a high-level issue. We'll need some more information from you to begin troubleshooting, such as the specific model of your computer, any error messages displayed, and if the issue started after installing new hardware or software. Once we have that information, an expert technician will be assigned to your case and will work diligently to resolve it within 24 hours based on our priority. In the meantime, you may want to try a few basic steps like checking power cables, restarting your computer, or trying a different monitor if possible. If none of these steps work, please provide the requested details so we can expedite the process. Thank you for choosing our support and we're here to help get your system back to normal as soon as possible.


Llama.generate: prefix-match hit


Raw output: We're truly sorry for the distress this has caused. Immediately shut down your laptop to prevent further damage, then bring it in for professional assessment and data recovery services. Estimated turnaround time is 3-5 business days based on priority. In the meantime, please secure any important data via backup or external storage if possible.


Llama.generate: prefix-match hit


Raw output: Apologies for the inconvenience with your physically damaged USB flash drive. Our team will prioritize this issue and begin assessing options for data recovery as soon as possible. Given the priority, we estimate a resolution time of 3-5 business days. In the meantime, if you have backup copies of these critical files, please use them to avoid further disruption. Alternatively, we'd be happy to discuss any immediate steps or alternative solutions that may help mitigate potential data loss while our team works on your case.


Llama.generate: prefix-match hit


Raw output: We're truly sorry for the inconvenience you're experiencing with your laptop touchpad. Our team will investigate this hardware issue thoroughly to find a solution as soon as possible. In the meantime, if you have an external mouse, we recommend using it as a workaround until we can resolve the touchpad issue. The process is expected to take approximately 2-3 business days due to the priority level of your support request. We appreciate your patience and will keep you updated on our progress.


Llama.generate: prefix-match hit


Raw output: I'm truly sorry to hear about the disruptions you're experiencing with your internet connection that's impacting your work. Our network team is currently investigating this issue and working on a resolution. We anticipate having it resolved within the next 2-4 hours, depending on the complexity of the problem. In the meantime, some immediate troubleshooting steps you might consider are restarting your modem or router, checking your ethernet cables, or resetting any connected devices. These simple steps can often help restore connectivity while we work on a more permanent fix. If these steps don't yield results, please be assured that our team is prioritizing this issue and will keep you updated as soon as possible.


Llama.generate: prefix-match hit


Raw output: I'm truly sorry for any inconvenience you're experiencing with your Wi-Fi connection. To help diagnose the issue, let's try a few steps. First, ensure all devices are connected to the same network. You can check this by looking at the SSID on both your device and the router. Additionally, restarting your modem and router may help. Perform these steps in order: unplug the power cord, wait for 30 seconds, plug it back in, then repeat this process for the router. Allow about 5 minutes for them to fully reconnect before testing your connection again. If the issue persists after trying these steps, I'll submit a request for further investigation. Estimated time for resolution is around 3-5 business days due to the priority level. In the meantime, if your work relies heavily on Wi-Fi, consider using a wired connection as a temporary solution.


Llama.generate: prefix-match hit


Raw output: Apologies for your data loss issue with the USB drive. We'll initiate a professional data recovery process to retrieve your crucial files. Our team will contact you within 1-2 business days to provide updates and discuss further steps, with an estimated completion time of 3-5 business days based on priority. In the interim, avoid writing or saving new data onto the USB drive.


Llama.generate: prefix-match hit


Raw output: I understand the urgency of your situation regarding the unrecognized external hard drive and the need to recover vital data. Our team will prioritize your case with a high priority. The next steps involve thoroughly diagnosing the issue and initiating recovery procedures using specialized tools. Please make sure that the external hard drive is securely connected and powered on during this process, which may take 3-5 business days for completion. In the meantime, you might consider checking if the issue lies with the USB port or cable by trying it on another computer. If the issue persists, please do not attempt any manual recovery methods as they could potentially worsen the data loss situation. We sincerely apologize for the inconvenience and will keep you updated throughout the process.


Llama.generate: prefix-match hit


Raw output: We're deeply sorry for the significant disruption to your internet connection that has been affecting your ability to work effectively over the past 24 hours. Our team is working diligently to diagnose and resolve this issue, which we understand carries a critical priority due to its impact on your productivity. While our investigation continues, I recommend trying the following steps:
      1. Check your local network cables for any damages or loose connections.
      2. Try connecting another device to your network to determine if the issue is specific to your device.
      3. If possible, try connecting directly to your modem using an Ethernet cable instead of Wi-Fi.
       In the meantime, we will prioritize this ticket and work towards a resolution as soon as possible. We appreciate your patience and understanding as our team works tirelessly to restore your stable internet connection.
       Our top priority is ensuring your connectivity issue is resolved promptly. We

Llama.generate: prefix-match hit


Raw output: We sincerely apologize for the disruption you're experiencing with your work computer. Your issue involves both unusual software behavior and unexpected data loss, which falls under our "Software Problem" category. Our team will thoroughly examine this complex situation to diagnose any underlying system or software compatibility issues. While we don't have immediate solutions, we estimate a resolution timeframe of 3-5 business days based on the priority level you've assigned. In the meantime, please avoid making any changes to your computer, and be sure to save important files elsewhere to prevent further data loss. We appreciate your patience as our team works diligently to identify and address the root cause of this issue. If additional information is needed or if remote diagnostics are necessary, we will contact you promptly. Thank you for bringing this matter to our attention and for your cooperation throughout the resolution process.


In [None]:
print("Time taken",(end-start))

Time taken 2317.702466249466


In [None]:
# Checking the first five rows of the data to confirm that the new column has been added
data_4['model_response'].head()

Unnamed: 0,model_response
0,"Dear valued customer, We're truly sorry to hea..."
1,"Dear valued customer, Understood the urgency o..."
2,"Dear valued customer, We deeply sympathize wit..."
3,"Dear valued customer, I'm really sorry to hear..."
4,"Dear valued customer, ☹️ We're sorry to hear a..."


In [None]:
i = 6
print(data_4.loc[i, 'support_ticket_text'])

My computer's performance is sluggish, severely impacting my work. I need help optimizing it to regain productivity.


In [None]:
print(data_4.loc[i, 'model_response'])

Dear valued customer, We apologize for the inconvenience you're experiencing with your computer's performance. To help improve its speed and restore productivity, our team will conduct a thorough analysis of your system. This process may include checking for outdated software, unnecessary startup programs, or system defragmentation. Rest assured, we are committed to addressing this issue within 2-3 business days. In the meantime, you might consider closing unused applications and browsing windows to free up some memory and processing power. Thank you for your patience as we work on a solution.


In [None]:
final_data_4 = pd.concat([final_data,data_4["model_response"]],axis=1)

In [None]:
final_data_4.rename(columns={"model_response":"Response"},inplace=True)

In [None]:
final_data_4

Unnamed: 0,support_tick_id,support_ticket_text,ETA,Category,Tag,Priority,Response
0,ST2023-006,My internet connection has significantly slowe...,2-3 business days,Network Connectivity,"Internet, Slow connection, Disconnections, Net...",Medium,"Dear valued customer, We're truly sorry to hea..."
1,ST2023-007,Urgent help required! My laptop refuses to sta...,24 hours,Hardware Issue,"Laptop, Startup, Hardware issue",High,"Dear valued customer, Understood the urgency o..."
2,ST2023-008,I've accidentally deleted essential work docum...,Immediate,Data Loss/Recovery,"Data Loss, Data Recovery",Critical,"Dear valued customer, We deeply sympathize wit..."
3,ST2023-009,Despite being in close proximity to my Wi-Fi r...,2-3 business days,Network Connectivity,"Wi-Fi, Signal strength, Network connectivity",Medium,"Dear valued customer, I'm really sorry to hear..."
4,ST2023-010,"My smartphone battery is draining rapidly, eve...",3-5 business days,Hardware Issue,"Smartphone, Battery, Hardware Issue",Medium,"Dear valued customer, ☹️ We're sorry to hear a..."
5,ST2023-011,I'm locked out of my online banking account an...,2 hours,Account Access,"Online Banking, Account Access, Password Reset",High,"Dear valued customer, I'm truly sorry to hear ..."
6,ST2023-012,"My computer's performance is sluggish, severel...",2-3 business days,Performance Issue,"Performance, Optimization",Medium,"Dear valued customer, We apologize for the inc..."
7,ST2023-013,I'm experiencing a recurring blue screen error...,3-5 business days,Hardware Issue,"PC, Blue Screen Error, Hardware Issue",High,"Dear valued customer, Apologies for the inconv..."
8,ST2023-014,My external hard drive isn't being recognized ...,3-5 business days,Data Loss/Recovery,"External Hard Drive, Data Loss, Recovery",High,"Dear valued customer, We're truly sorry to hea..."
9,ST2023-015,The graphics card in my gaming laptop seems to...,3-5 business days,Hardware Issue,"Gaming Laptop, Graphics Card, Hardware Issue",High,"Dear valued customer, We apologize for the inc..."


* Response has been added to the dataset. This is our final dataset.

## **Model Output Analysis**

In [None]:
# Creating a copy of the dataframe of task-4
final_df = final_data_4.copy()

In [None]:
final_df['Category'].value_counts() # checking the count of tickets in each category.

Unnamed: 0_level_0,count
Category,Unnamed: 1_level_1
Hardware Issue,7
Data Loss/Recovery,6
Network Connectivity,5
Account Access,1
Performance Issue,1
Software Problem,1


In [None]:
final_df["Priority"].value_counts() # checking the count in each priority category.

Unnamed: 0_level_0,count
Priority,Unnamed: 1_level_1
High,11
Medium,8
Critical,2


In [None]:
final_df["ETA"].value_counts() # checking the count in each ETA category.

Unnamed: 0_level_0,count
ETA,Unnamed: 1_level_1
3-5 business days,11
2-3 business days,4
24 hours,2
Immediate,2
2 hours,1
2-4 hours,1


Let's dive in a bit deeper here.

In [None]:
final_df.groupby(['Category', 'ETA']).support_tick_id.count() # ticket counts grouped by category and ETA.

Unnamed: 0_level_0,Unnamed: 1_level_0,support_tick_id
Category,ETA,Unnamed: 2_level_1
Account Access,2 hours,1
Data Loss/Recovery,3-5 business days,5
Data Loss/Recovery,Immediate,1
Hardware Issue,2-3 business days,1
Hardware Issue,24 hours,2
Hardware Issue,3-5 business days,4
Network Connectivity,2-3 business days,2
Network Connectivity,2-4 hours,1
Network Connectivity,3-5 business days,1
Network Connectivity,Immediate,1


In [None]:
final_df.groupby(['Category', 'Priority']).support_tick_id.count() # ticket counts grouped by category and priority.

Unnamed: 0_level_0,Unnamed: 1_level_0,support_tick_id
Category,Priority,Unnamed: 2_level_1
Account Access,High,1
Data Loss/Recovery,Critical,1
Data Loss/Recovery,High,4
Data Loss/Recovery,Medium,1
Hardware Issue,High,5
Hardware Issue,Medium,2
Network Connectivity,Critical,1
Network Connectivity,High,1
Network Connectivity,Medium,3
Performance Issue,Medium,1


## **Actionable Insights and Recommendations**

We used an LLM to perform multiple tasks, one stage at a time:

* We first classified incoming tickets into categories, tagged them, and returned a structured output. Support ticket categorization is crucial for efficient resource allocation, prioritization, trend analysis, and performance measurement. It enables organizations to streamline operations, improve customer satisfaction, and make data-driven decisions for the continuous improvement of their support processes and overall business strategies. By attaching relevant tags, support tickets can be routed to experts with specialized knowledge in handling specific issues. Moreover, ticket tagging aids in analyzing trends.

* Then we generated a priority and an ETA for each ticket, which supports efficient resource allocation, timely resolution, and better customer satisfaction by managing expectations and workflow.

* Finally, in addition to all the above, we generated a draft response that we can share with the customer based on their ticket.



**Actionable Insights:**

* Category Distribution: We analyzed the distribution of tickets across categories to identify the most common issues customers face. Hardware issues (7 tickets), Data loss/recovery (6), and Network connectivity (5) are the top three, together accounting for 18 of the 21 tickets.
* Priority Correlation: We examined the relationship between ticket categories and their assigned priorities to understand which issues are most critical. Data loss/recovery (5 of 6 tickets) and Hardware issues (5 of 7) are mostly Critical or High priority. Network connectivity tickets are mostly Medium (3 of 5), although one of them is Critical.
* Response Time Analysis: We evaluated how ETAs vary across categories to identify areas for improvement. Of the 21 tickets, 6 have an ETA within 24 hours and the remaining 15 have an ETA of 2-3 or 3-5 business days. We would expect high-priority tickets to receive a shorter ETA because they need immediate attention, and lower-priority tickets a longer one so that the team can handle them when resources are available. The assigned ETAs only partly follow this pattern. The single Account access ticket is High priority with an ETA of 2 hours, but 5 of the 6 Data loss/recovery tickets and 4 of the 7 Hardware tickets have an ETA of 3-5 business days even though most of them are High priority. The ETA assignments therefore deserve a review against priority.
* Keyword Analysis: We identified frequently occurring keywords within each category to refine our understanding of specific issues.
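
The ETA figures above can be checked with a short sketch. It converts each ETA label to an upper bound in hours and counts the tickets on either side of the 24-hour mark, using the counts from the `value_counts` output. Treating a business day as 24 hours and "Immediate" as 0 hours are simplifying assumptions, and the helper name `eta_upper_bound_hours` is hypothetical. The cross-tabulation uses only the ten tickets displayed in the tables above.

```python
import re
import pandas as pd

def eta_upper_bound_hours(eta: str) -> float:
    """Convert an ETA label to an upper bound in hours (1 business day = 24 h, an assumption)."""
    text = eta.strip().lower()
    if text == "immediate":
        return 0.0
    numbers = [float(n) for n in re.findall(r"\d+(?:\.\d+)?", text)]
    if not numbers:
        return float("nan")
    upper = max(numbers)
    return upper * 24 if "day" in text else upper

# ETA counts copied from the value_counts output above
eta_counts = pd.Series({
    "3-5 business days": 11, "2-3 business days": 4, "24 hours": 2,
    "Immediate": 2, "2 hours": 1, "2-4 hours": 1,
})
hours = pd.Series([eta_upper_bound_hours(e) for e in eta_counts.index], index=eta_counts.index)
within_24h = int(eta_counts[hours <= 24].sum())
longer = int(eta_counts[hours > 24].sum())
print(within_24h, longer)

# Priority and ETA of the ten tickets displayed above (ST2023-006 to ST2023-015)
sample = pd.DataFrame({
    "Priority": ["Medium", "High", "Critical", "Medium", "Medium",
                 "High", "Medium", "High", "High", "High"],
    "ETA": ["2-3 business days", "24 hours", "Immediate", "2-3 business days",
            "3-5 business days", "2 hours", "2-3 business days",
            "3-5 business days", "3-5 business days", "3-5 business days"],
})
priority_vs_eta = pd.crosstab(sample["Priority"], sample["ETA"])
print(priority_vs_eta)
```

A cross-tabulation like this makes it easy to spot High-priority tickets that were given the longest ETA.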

**Business Recommendations:**

* Automated Routing: An automated ticket routing system can be implemented based on the LLM categorization to direct issues to the most appropriate team or specialist.
* Knowledge Base Enhancement: Insights from common categories can be used to expand and improve our customer-facing knowledge base and self-help resources.
* Proactive Support: For categories with high volumes or critical priorities, proactive support strategies can be developed to address issues before they escalate.
* Training Focus: Training programs should be arranged to address the most common and critical issue categories.
* Product Development: Insights on recurring issues can be shared with the product development team to inform future improvements and features.
* Resource Allocation: Staffing and resource allocation should be adjusted based on the volume and complexity of different ticket categories.
* Chatbot Enhancement: The categorization insights can improve chatbot responses and reduce the volume of human-handled tickets.
* Customer Communication: Targeted communication strategies can be developed for different issue categories to inform customers about known issues and resolutions.
* Feedback Loop: A system can be established to continuously feed human-verified categorizations back into the LLM training process to improve accuracy over time.
* Vendor Management: For categories related to third-party integrations or services, data can be used to manage vendor relationships and performance.
* Customer Segmentation: Variation of ticket categories can be analyzed across different customer segments to tailor support strategies.
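
As an illustration of the automated routing recommendation, the sketch below maps the categories produced in this notebook to support queues. The queue names, the default queue, and the escalation rule are hypothetical and would need to match the actual team structure.

```python
# Hypothetical queue names; the category labels are the ones produced in this notebook
ROUTING_TABLE = {
    "Hardware Issue": "hardware-support",
    "Data Loss/Recovery": "data-recovery",
    "Network Connectivity": "network-support",
    "Account Access": "account-security",
    "Performance Issue": "performance-support",
    "Software Problem": "software-support",
}
DEFAULT_QUEUE = "general-support"        # fallback for categories not in the table
URGENT_PRIORITIES = {"Critical", "High"}  # priorities that trigger escalation (an assumption)

def route_ticket(category: str, priority: str) -> dict:
    """Return the target queue and whether the ticket should be escalated."""
    queue = ROUTING_TABLE.get(category.strip(), DEFAULT_QUEUE)
    return {"queue": queue, "escalate": priority.strip() in URGENT_PRIORITIES}

print(route_ticket("Data Loss/Recovery", "Critical"))
print(route_ticket("Billing", "Low"))
```

Keeping a default queue means a ticket with an unexpected category label still reaches a human instead of being dropped.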


By implementing these insights and recommendations, we can significantly enhance our customer support operations, improve customer satisfaction, and drive operational efficiencies. We should regularly review and update our strategies based on ongoing analysis of the categorization data.

<font size=6 color='blue'>Power Ahead</font>
___