Develop an advanced support ticket categorization system that accurately classifies incoming tickets, assigns relevant tags based on their content, implements mechanisms for prioritizing tickets for prompt resolution, and generates a first response based on the ticket's sentiment.


## **Installing and Importing Necessary Libraries and Dependencies**

In [2]:
# for loading and manipulating data.
# try:
#   import pandas as pd
# except:
#   pip uninstall numpy
#   pip install numpy==1.15.1
#   import pandas as pd

# Ignore warnings
import warnings
warnings.filterwarnings("ignore")

import pandas as pd
# for time computations.
import time

In [2]:
# Installation for GPU llama-cpp-python.
!CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.85 --force-reinstall --no-cache-dir -q

'CMAKE_ARGS' is not recognized as an internal or external command,
operable program or batch file.
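The error above is what the Windows command interpreter prints when it meets the POSIX-style `VAR=value command` prefix. A minimal sketch of a shell-independent alternative follows, assuming the same package pin and build flags as the cell above; `build_llama_install` is a hypothetical helper, not part of any library. It passes the flags through the process environment instead of the shell prefix, and the actual install call is left commented out.

```python
import os
import sys


def build_llama_install(gpu=True):
    """Return (command, environment) for installing llama-cpp-python.

    Hypothetical helper: the build flags are placed in the environment of
    the child process, so no shell-specific `VAR=value` prefix is needed.
    """
    env = dict(os.environ)
    if gpu:
        # Same flags as the original cell, expressed as environment variables.
        env["CMAKE_ARGS"] = "-DLLAMA_CUBLAS=on"
        env["FORCE_CMAKE"] = "1"
    cmd = [
        sys.executable, "-m", "pip", "install",
        "llama-cpp-python==0.1.85",
        "--force-reinstall", "--no-cache-dir", "-q",
    ]
    return cmd, env


cmd, env = build_llama_install(gpu=True)
print(" ".join(cmd))

# To actually run the installation, uncomment the next two lines:
# import subprocess
# subprocess.run(cmd, env=env, check=True)
```

Whether the GPU build succeeds still depends on the local compiler and CUDA toolchain, which this sketch does not check.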


In [3]:
# pip install llama-cpp-python==0.1.85 --force-reinstall --no-cache-dir -q

In [3]:
# Importing the Llama class from the llama_cpp module.
from llama_cpp import Llama

In [7]:
# For downloading the models from HF Hub.
# !pip install huggingface_hub==0.20.3 pandas==1.5.3 -q

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
dask-expr 1.1.13 requires pandas>=2, but you have pandas 1.5.3 which is incompatible.


In [4]:
# Function to download the model from the Hugging Face model hub.
from huggingface_hub import hf_hub_download

# Importing the json module.
import json

## **Loading the Data**

In [6]:
# Loading the data into df
df = pd.read_csv("Support_ticket_text_data_mid_term.csv")

# Creating copy of 'df' in the variable data
data = df.copy()

## **Data Overview**

### Checking the first 5 rows of the data

In [7]:
# first 5 rows of the data
data.head(5)

Unnamed: 0,support_tick_id,support_ticket_text
0,ST2023-006,My internet connection has significantly slowe...
1,ST2023-007,Urgent help required! My laptop refuses to sta...
2,ST2023-008,I've accidentally deleted essential work docum...
3,ST2023-009,Despite being in close proximity to my Wi-Fi r...
4,ST2023-010,"My smartphone battery is draining rapidly, eve..."


### Checking the shape of the data

In [8]:
# shape of data
data.shape

(21, 2)

In [9]:
# There are 21 rows and 2 columns present in this data.

### Checking the missing values in the data

In [10]:
# Missing values in data
data.isna().sum().sum()

0

In [11]:
# From the above output we identify there are no missing values in the dataset.

## **Model Building**

### Loading the model

In [12]:
# model name and model base name
model_name_or_path = "TheBloke/Mistral-7B-Instruct-v0.2-GGUF"
model_basename = "mistral-7b-instruct-v0.2.Q6_K.gguf"

In [13]:
# Declaring repo_id and filename
model_path = hf_hub_download(
    repo_id=model_name_or_path, # repo_id = model_name_or_path
    filename=model_basename # filename = model_basename
)

mistral-7b-instruct-v0.2.Q6_K.gguf:   0%|          | 0.00/5.94G [00:00<?, ?B/s]

In [14]:
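# Defining the llm model - Llama. No n_gpu_layers argument is passed here, so inference
# runs on the CPU (the system info printed below reports BLAS = 0).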

llm = Llama(
    model_path=model_path,
    n_ctx=1024, # Context window
)

AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 0 | VSX = 0 | 


### Utility functions

In [15]:
# defining a function to parse the JSON output from the model
def extract_json_data(json_str):
    try:
        # Find the indices of the opening and closing curly braces
        json_start = json_str.find('{')
        json_end = json_str.rfind('}')

        if json_start != -1 and json_end != -1:
            extracted_category = json_str[json_start:json_end + 1]  # Extract the JSON object
            data_dict = json.loads(extracted_category)
            return data_dict
        else:
            print(f"Warning: JSON object not found in response: {json_str}")
            return {}
    except json.JSONDecodeError as e:
        print(f"Error parsing JSON: {e}")
        return {}
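One limitation of the first-`{`-to-last-`}` approach is that it fails when the model emits two JSON objects in a row, because the extracted span is then not a single valid document. A possible alternative, sketched below under the assumption that only the first object matters, uses the standard library's `json.JSONDecoder.raw_decode`; `extract_first_json` is a hypothetical name introduced for illustration.

```python
import json


def extract_first_json(text):
    """Parse the first JSON object found in `text`; return {} if none parses."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON document starting at `start`
            # and ignores whatever follows it.
            obj, _end = decoder.raw_decode(text, start)
            if isinstance(obj, dict):
                return obj
        except json.JSONDecodeError:
            pass
        # Try the next opening brace, if any.
        start = text.find("{", start + 1)
    return {}


two_objects = 'A: {"category": "Data Recovery"} {"category": "Hardware Issues"}'
print(extract_first_json(two_objects))
print(extract_first_json("no json here"))
```

This variant keeps the same contract as `extract_json_data` (a dictionary on success, an empty dictionary otherwise), so it could be swapped in without changing the later cells.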

## **Task 1: Ticket Categorization and Returning Structured Output**

In [16]:
# creating a copy of the data
data_1 = data.copy()

In [17]:
# Defining the response function for Task 1.
def response_1(prompt,ticket):
    model_output = llm(
      f"""
      Q: {prompt}
      Support ticket: {ticket}
      A:
      """,
      max_tokens=10, # defining the maximum number of tokens the model should generate for this task.
      stop=["Q:", "\n"],
      temperature=0.01, # temperature set to 0.01(low) for deterministic output.
      echo=False,
    )

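    temp_output = model_output["choices"][0]["text"]
    # Slice from the first '{'. If the model emitted no brace, keep the raw text so that
    # extract_json_data can report it instead of .index() raising a ValueError here.
    json_start = temp_output.find('{')
    final_output = temp_output[json_start:] if json_start != -1 else temp_output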

    return final_output
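The function above reads the generated text from `model_output["choices"][0]["text"]`. To exercise that logic without loading the multi-gigabyte model, a stand-in can be passed in place of `llm`. The sketch below assumes only the dictionary shape that the notebook's own code relies on; `fake_llm` and `classify` are hypothetical names.

```python
def fake_llm(prompt, **kwargs):
    """Hypothetical stand-in that mimics the completion dictionary shape."""
    return {"choices": [{"text": ' {"category": "Hardware Issues"}'}]}


def classify(llm, prompt, ticket):
    """Same flow as response_1, with the model passed in as an argument."""
    model_output = llm(
        f"Q: {prompt}\nSupport ticket: {ticket}\nA:",
        max_tokens=10,
        stop=["Q:", "\n"],
        temperature=0.01,
        echo=False,
    )
    text = model_output["choices"][0]["text"]
    start = text.find("{")
    # Keep the raw text when no brace is present so the caller can inspect it.
    return text[start:] if start != -1 else text


result = classify(fake_llm, "Categorize the ticket.", "My laptop will not start.")
print(result)
```

Passing the model as a parameter is a design choice made for this sketch; it makes the prompt-building and slicing steps testable in isolation.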

In [18]:
# Prompt creation for task 1
prompt_1 = """
   As an AI, your job is to categorize IT support tickets. 
   Please label each ticket as either a Hardware Issue, Data Recovery, or Technical Issue. 
   Your response should be in the format: {"category": "Hardware Issues"}, {"category": "Data Recovery"}, or {"category": "Technical Issues"}. 
   Keep your output simple and accurate. Ensure that all curly braces are closed and there are no additional characters in the output.
"""

**Note**: The output of the model should be in a structured format (JSON format).
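Because the prompt restricts the answer to three labels, the parsed output can also be checked against that set. The following is a minimal sketch, assuming the three category strings shown in the prompt's format examples; `validate_category` is a hypothetical helper.

```python
import json

# The three labels used in the prompt's format examples.
ALLOWED_CATEGORIES = {"Hardware Issues", "Data Recovery", "Technical Issues"}


def validate_category(raw_response):
    """Return the category if it is one of the allowed labels, else None."""
    try:
        parsed = json.loads(raw_response)
    except json.JSONDecodeError:
        return None
    if not isinstance(parsed, dict):
        return None
    category = parsed.get("category")
    if isinstance(category, str) and category in ALLOWED_CATEGORIES:
        return category
    return None


print(validate_category('{"category": "Data Recovery"}'))
print(validate_category('{"category": "Hardware Issue"}'))
print(validate_category('{"category": '))
```

Tickets for which this returns `None` could be re-prompted or routed to manual review, depending on the workflow.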

In [19]:
# Applying response_1 to every value in the support_ticket_text column
start = time.time()
data_1['model_response'] = data_1['support_ticket_text'].apply(lambda x: response_1(prompt_1, x))
end = time.time()

Llama.generate: prefix-match hit


In [20]:
# Time taken for model to return output
print("Time taken:", round((end-start)),"seconds")

Time taken: 280 seconds


In [21]:
# Initial model output
data_1['model_response'].head(5)

0    {"category": "Technical Issues"}
1     {"category": "Hardware Issues"}
2       {"category": "Data Recovery"}
3    {"category": "Technical Issues"}
4     {"category": "Hardware Issues"}
Name: model_response, dtype: object

In [22]:
# Displaying the support ticket text
i = 6
print(data_1.loc[i,'support_ticket_text'])

My computer's performance is sluggish, severely impacting my work. I need help optimizing it to regain productivity.


In [23]:
# Model output
print(data_1.loc[i, 'model_response'])

{"category": "Technical Issues"}


In [24]:
# Applying the function to the model response
data_1['model_response_parsed'] = data_1['model_response'].apply(extract_json_data)
data_1['model_response_parsed'].head()

0    {'category': 'Technical Issues'}
1     {'category': 'Hardware Issues'}
2       {'category': 'Data Recovery'}
3    {'category': 'Technical Issues'}
4     {'category': 'Hardware Issues'}
Name: model_response_parsed, dtype: object

In [25]:
# Model output after extracting JSON data
data_1['model_response_parsed'].value_counts()

{'category': 'Technical Issues'}    7
{'category': 'Hardware Issues'}     7
{'category': 'Data Recovery'}       7
Name: model_response_parsed, dtype: int64

In [26]:
# Normalizing the model_response_parsed column
model_response_parsed_df_1 = pd.json_normalize(data_1['model_response_parsed'])
model_response_parsed_df_1.head()

Unnamed: 0,category
0,Technical Issues
1,Hardware Issues
2,Data Recovery
3,Technical Issues
4,Hardware Issues


In [27]:
# Concatenating two dataframes
data_with_parsed_model_output_1 = pd.concat([data_1, model_response_parsed_df_1], axis=1)
data_with_parsed_model_output_1.head()

Unnamed: 0,support_tick_id,support_ticket_text,model_response,model_response_parsed,category
0,ST2023-006,My internet connection has significantly slowe...,"{""category"": ""Technical Issues""}",{'category': 'Technical Issues'},Technical Issues
1,ST2023-007,Urgent help required! My laptop refuses to sta...,"{""category"": ""Hardware Issues""}",{'category': 'Hardware Issues'},Hardware Issues
2,ST2023-008,I've accidentally deleted essential work docum...,"{""category"": ""Data Recovery""}",{'category': 'Data Recovery'},Data Recovery
3,ST2023-009,Despite being in close proximity to my Wi-Fi r...,"{""category"": ""Technical Issues""}",{'category': 'Technical Issues'},Technical Issues
4,ST2023-010,"My smartphone battery is draining rapidly, eve...","{""category"": ""Hardware Issues""}",{'category': 'Hardware Issues'},Hardware Issues


In [28]:
# Dropping model_response and model_response_parsed columns
final_data_1 = data_with_parsed_model_output_1.drop(['model_response','model_response_parsed'], axis=1)
final_data_1.head()

Unnamed: 0,support_tick_id,support_ticket_text,category
0,ST2023-006,My internet connection has significantly slowe...,Technical Issues
1,ST2023-007,Urgent help required! My laptop refuses to sta...,Hardware Issues
2,ST2023-008,I've accidentally deleted essential work docum...,Data Recovery
3,ST2023-009,Despite being in close proximity to my Wi-Fi r...,Technical Issues
4,ST2023-010,"My smartphone battery is draining rapidly, eve...",Hardware Issues


## **Task 2: Creating Tags**

In [29]:
# creating a copy of the data
data_2 = data.copy()

In [30]:
def response_2(prompt,ticket,category):
    model_output = llm(
      f"""
      Q: {prompt}
      Support ticket: {ticket}
      Category: {category}
      A:
      """,
      max_tokens=1024,  # defining the maximum number of tokens the model should generate for this task.
      stop=["Q:", "\n"],
      temperature=0.01,  # temperature set to 0.01(low) for deterministic output.
      echo=False,
    )

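    temp_output = model_output["choices"][0]["text"]
    # Slice from the first '{'. If the model emitted no brace, keep the raw text so that
    # extract_json_data can report it instead of .index() raising a ValueError here.
    json_start = temp_output.find('{')
    final_output = temp_output[json_start:] if json_start != -1 else temp_output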

    return final_output

In [31]:
# Prompt creation for task 2
prompt_2 = """
   As an AI, your task is to label IT support tickets with relevant tags. 
   Please identify the most appropriate keywords and include them in your response. 
   Your output should be formatted as follows: {"tags": ["Wifi", "Data Loss", "Connection Issues", "Battery"]}.
   Keep your output simple and accurate. Ensure that all curly braces are closed and there are no additional characters in the output.
"""

**Note**: The output of the model should be in a structured format (JSON format).
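Free-form tags can come back with stray whitespace, inconsistent casing, or non-string entries. The sketch below shows one way to tidy a parsed response before storing it; `clean_tags` is a hypothetical helper, and treating tags that differ only by case as duplicates is an assumption of this example.

```python
def clean_tags(parsed):
    """Return a de-duplicated list of non-empty tag strings from a parsed response."""
    tags = parsed.get("tags", []) if isinstance(parsed, dict) else []
    if not isinstance(tags, list):
        return []
    seen = set()
    cleaned = []
    for tag in tags:
        if not isinstance(tag, str):
            continue  # skip numbers, nulls, nested objects
        tag = tag.strip()
        key = tag.lower()  # case-insensitive duplicate check (assumption)
        if tag and key not in seen:
            seen.add(key)
            cleaned.append(tag)
    return cleaned


print(clean_tags({"tags": ["Wifi", " wifi ", "Connection Issues", 42, ""]}))
print(clean_tags({}))
```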

In [32]:
# Applying response_2 to each ticket's text together with its Task 1 category
start = time.time()
data_2["model_response"] = final_data_1[['support_ticket_text','category']].apply(lambda x: response_2(prompt_2, x['support_ticket_text'], x['category']), axis=1)
end = time.time()

Llama.generate: prefix-match hit


In [33]:
# Time taken for model to generate output
print("Time taken:",round((end-start))," seconds")

Time taken: 2349  seconds


In [34]:
# Initial model output
data_2['model_response'].head(5)

0    {"tags": ["Connection Issues", "Internet", "Sl...
1             {"tags": ["Hardware", "Startup Issues"]}
2                              {"tags": ["Data Loss"]}
3              {"tags": ["Wifi", "Connection Issues"]}
4                                {"tags": ["Battery"]}
Name: model_response, dtype: object

In [35]:
# Support ticket text
i = 0
print(data_2.loc[i,'support_ticket_text'])

My internet connection has significantly slowed down over the past two days, making it challenging to work efficiently from home. Frequent disconnections are causing major disruptions. Please assist in resolving this connectivity issue promptly.


In [36]:
# Model output
print(data_2.loc[i,'model_response'])

{"tags": ["Connection Issues", "Internet", "Slow Connection"]}


In [37]:
# Applying the function to the model response
data_2['model_response_parsed'] = data_2['model_response'].apply(extract_json_data)

In [38]:
# Model output after extracting JSON data
data_2["model_response_parsed"]

0     {'tags': ['Connection Issues', 'Internet', 'Sl...
1              {'tags': ['Hardware', 'Startup Issues']}
2                               {'tags': ['Data Loss']}
3               {'tags': ['Wifi', 'Connection Issues']}
4                                 {'tags': ['Battery']}
5             {'tags': ['Data Loss', 'Account Access']}
6             {'tags': ['Performance', 'Productivity']}
7           {'tags': ['Hardware', 'Blue Screen Error']}
8                               {'tags': ['Data Loss']}
9     {'tags': ['Graphics Card', 'Hardware', 'Perfor...
10                              {'tags': ['Data Loss']}
11                     {'tags': ['Screen', 'Hardware']}
12           {'tags': ['Hardware', 'Laptop', 'Damage']}
13                              {'tags': ['Data Loss']}
14                               {'tags': ['Hardware']}
15          {'tags': ['Connection Issues', 'Internet']}
16              {'tags': ['Wifi', 'Connection Issues']}
17                              {'tags': ['Data 

In [39]:
# Normalizing the model_response_parsed column
model_response_parsed_df_2 = pd.json_normalize(data_2['model_response_parsed'])
model_response_parsed_df_2.head()

Unnamed: 0,tags
0,"[Connection Issues, Internet, Slow Connection]"
1,"[Hardware, Startup Issues]"
2,[Data Loss]
3,"[Wifi, Connection Issues]"
4,[Battery]


In [40]:
# Concatenating two dataframes
data_with_parsed_model_output_2 = pd.concat([data_2, model_response_parsed_df_2], axis=1)
data_with_parsed_model_output_2.head()

Unnamed: 0,support_tick_id,support_ticket_text,model_response,model_response_parsed,tags
0,ST2023-006,My internet connection has significantly slowe...,"{""tags"": [""Connection Issues"", ""Internet"", ""Sl...","{'tags': ['Connection Issues', 'Internet', 'Sl...","[Connection Issues, Internet, Slow Connection]"
1,ST2023-007,Urgent help required! My laptop refuses to sta...,"{""tags"": [""Hardware"", ""Startup Issues""]}","{'tags': ['Hardware', 'Startup Issues']}","[Hardware, Startup Issues]"
2,ST2023-008,I've accidentally deleted essential work docum...,"{""tags"": [""Data Loss""]}",{'tags': ['Data Loss']},[Data Loss]
3,ST2023-009,Despite being in close proximity to my Wi-Fi r...,"{""tags"": [""Wifi"", ""Connection Issues""]}","{'tags': ['Wifi', 'Connection Issues']}","[Wifi, Connection Issues]"
4,ST2023-010,"My smartphone battery is draining rapidly, eve...","{""tags"": [""Battery""]}",{'tags': ['Battery']},[Battery]


In [41]:
# Dropping model_response and model_response_parsed columns
final_data_2 = data_with_parsed_model_output_2.drop(['model_response','model_response_parsed'], axis=1)
final_data_2.head()

Unnamed: 0,support_tick_id,support_ticket_text,tags
0,ST2023-006,My internet connection has significantly slowe...,"[Connection Issues, Internet, Slow Connection]"
1,ST2023-007,Urgent help required! My laptop refuses to sta...,"[Hardware, Startup Issues]"
2,ST2023-008,I've accidentally deleted essential work docum...,[Data Loss]
3,ST2023-009,Despite being in close proximity to my Wi-Fi r...,"[Wifi, Connection Issues]"
4,ST2023-010,"My smartphone battery is draining rapidly, eve...",[Battery]


In [42]:
# Checking the value counts of the tags column
final_data_2['tags'].value_counts()

[Data Loss]                                       6
[Wifi, Connection Issues]                         2
[Connection Issues, Internet, Slow Connection]    1
[Hardware, Startup Issues]                        1
[Battery]                                         1
[Data Loss, Account Access]                       1
[Performance, Productivity]                       1
[Hardware, Blue Screen Error]                     1
[Graphics Card, Hardware, Performance]            1
[Screen, Hardware]                                1
[Hardware, Laptop, Damage]                        1
[Hardware]                                        1
[Connection Issues, Internet]                     1
[Connection Issues, Internet, Slow Speed]         1
[Software Issues, Data Loss]                      1
Name: tags, dtype: int64
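The counts above treat each complete tag list as one value, so a tag such as "Hardware" is spread over several rows. Counting individual tags gives a different view. The sketch below uses only the tag lists of the first ten tickets as displayed in this notebook and the standard library's `collections.Counter`; it is an illustration, not a count over the full dataset.

```python
from collections import Counter

# Tag lists of the first ten tickets, copied from the notebook output.
tag_lists = [
    ["Connection Issues", "Internet", "Slow Connection"],
    ["Hardware", "Startup Issues"],
    ["Data Loss"],
    ["Wifi", "Connection Issues"],
    ["Battery"],
    ["Data Loss", "Account Access"],
    ["Performance", "Productivity"],
    ["Hardware", "Blue Screen Error"],
    ["Data Loss"],
    ["Graphics Card", "Hardware", "Performance"],
]

# Flatten the lists and count each tag individually.
tag_counts = Counter(tag for tags in tag_lists for tag in tags)

for tag, count in tag_counts.most_common(4):
    print(f"{tag}: {count}")
```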

In [43]:
# Concatenating two dataframes
final_data_2 = pd.concat([final_data_2,final_data_1["category"]],axis=1)

In [44]:
# viewing newly updated dataframe
final_data_2 = final_data_2[["support_tick_id","support_ticket_text","category","tags"]]
final_data_2

Unnamed: 0,support_tick_id,support_ticket_text,category,tags
0,ST2023-006,My internet connection has significantly slowe...,Technical Issues,"[Connection Issues, Internet, Slow Connection]"
1,ST2023-007,Urgent help required! My laptop refuses to sta...,Hardware Issues,"[Hardware, Startup Issues]"
2,ST2023-008,I've accidentally deleted essential work docum...,Data Recovery,[Data Loss]
3,ST2023-009,Despite being in close proximity to my Wi-Fi r...,Technical Issues,"[Wifi, Connection Issues]"
4,ST2023-010,"My smartphone battery is draining rapidly, eve...",Hardware Issues,[Battery]
5,ST2023-011,I'm locked out of my online banking account an...,Data Recovery,"[Data Loss, Account Access]"
6,ST2023-012,"My computer's performance is sluggish, severel...",Technical Issues,"[Performance, Productivity]"
7,ST2023-013,I'm experiencing a recurring blue screen error...,Hardware Issues,"[Hardware, Blue Screen Error]"
8,ST2023-014,My external hard drive isn't being recognized ...,Data Recovery,[Data Loss]
9,ST2023-015,The graphics card in my gaming laptop seems to...,Hardware Issues,"[Graphics Card, Hardware, Performance]"


## **Task 3: Assigning Priority and ETA**

In [45]:
# creating a copy of the data
data_3 = data.copy()

In [46]:
# Function created to generate an output from the model
def response_3(prompt,ticket,category,tags):
    model_output = llm(
      f"""
      Q: {prompt}
      Support ticket: {ticket}
      Category: {category}
      Tags: {tags}
      A:
      """,
      max_tokens=20,   # defining the maximum number of tokens the model should generate for this task.
      stop=["Q:", "\n"],
      temperature=0.01,  # temperature set to 0.01(low) for deterministic output.
      echo=False,
    )

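    temp_output = model_output["choices"][0]["text"]
    # Slice from the first '{'. If the model emitted no brace, keep the raw text so that
    # extract_json_data can report it instead of .index() raising a ValueError here.
    json_start = temp_output.find('{')
    final_output = temp_output[json_start:] if json_start != -1 else temp_output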

    return final_output

In [47]:
# Prompt creation for task 3
prompt_3 = """
    As an AI, your task is to determine the priority and estimated time to resolve (ETA) for IT support tickets. 
    Consider the severity of the issue, the time needed for resolution, and customer satisfaction. 
    Your response should be in the format: {"priority": "High", "eta": "2 Days"}.
    Keep your output simple and accurate. Ensure that all curly braces are closed and there are no additional characters in the output.
"""

**Note**: The output of the model should be in a structured format (JSON format).

In [48]:
# Applying response_3 to each ticket's text together with its category and tags
start = time.time()
data_3['model_response'] = final_data_2[['support_ticket_text','category','tags']].apply(lambda x: response_3(prompt_3, x['support_ticket_text'], x['category'], x['tags']), axis=1)
end = time.time()

Llama.generate: prefix-match hit


In [49]:
# Time taken for model to generate output
print("Time taken:",round((end-start))," seconds")

Time taken: 403  seconds
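To compare the three runs, the totals reported so far (280, 2349, and 403 seconds) can be divided by the 21 tickets in the dataset. The short calculation below does only that arithmetic; the task labels are descriptive names chosen for this sketch.

```python
# Total wall-clock times printed by the notebook, in seconds.
timings_seconds = {
    "Task 1 (category)": 280,
    "Task 2 (tags)": 2349,
    "Task 3 (priority/ETA)": 403,
}
n_tickets = 21  # from data.shape

per_ticket = {task: total / n_tickets for task, total in timings_seconds.items()}

for task, seconds in per_ticket.items():
    print(f"{task}: {seconds:.1f} s per ticket")
```

Task 2 stands out at well over a minute per ticket, which makes it the first candidate for profiling before the pipeline is applied to a larger dataset.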


In [50]:
# Initial model output
data_3['model_response'].head(5)

0      {"priority": "High", "eta": "3 Days"}
1    {"priority": "High", "eta": "Same Day"}
2      {"priority": "High", "eta": "3 Days"}
3    {"priority": "Medium", "eta": "3 Days"}
4    {"priority": "Medium", "eta": "3 Days"}
Name: model_response, dtype: object

In [51]:
# Support ticket text
i = 3
print(data_3.loc[i,'support_ticket_text'])

Despite being in close proximity to my Wi-Fi router, the signal remains persistently weak in my home. This issue has been ongoing, and I need assistance troubleshooting it. Please help me resolve the weak Wi-Fi signal problem.


In [52]:
# Model output
print(data_3.loc[i,'model_response'])

{"priority": "Medium", "eta": "3 Days"}


In [53]:
# Applying the function to the model response
data_3['model_response_parsed'] = data_3['model_response'].apply(extract_json_data)
data_3['model_response_parsed'].head()

0      {'priority': 'High', 'eta': '3 Days'}
1    {'priority': 'High', 'eta': 'Same Day'}
2      {'priority': 'High', 'eta': '3 Days'}
3    {'priority': 'Medium', 'eta': '3 Days'}
4    {'priority': 'Medium', 'eta': '3 Days'}
Name: model_response_parsed, dtype: object

In [54]:
# Normalizing the model_response_parsed column
model_response_parsed_df_3 = pd.json_normalize(data_3['model_response_parsed'])
model_response_parsed_df_3.head(21)

Unnamed: 0,priority,eta
0,High,3 Days
1,High,Same Day
2,High,3 Days
3,Medium,3 Days
4,Medium,3 Days
5,High,1 Day
6,High,3 Days
7,High,3 Days
8,High,3 Days
9,High,3 Days


In [55]:
# Concatenating two dataframes
data_with_parsed_model_output_3 = pd.concat([data_3, model_response_parsed_df_3], axis=1)
data_with_parsed_model_output_3.head()

Unnamed: 0,support_tick_id,support_ticket_text,model_response,model_response_parsed,priority,eta
0,ST2023-006,My internet connection has significantly slowe...,"{""priority"": ""High"", ""eta"": ""3 Days""}","{'priority': 'High', 'eta': '3 Days'}",High,3 Days
1,ST2023-007,Urgent help required! My laptop refuses to sta...,"{""priority"": ""High"", ""eta"": ""Same Day""}","{'priority': 'High', 'eta': 'Same Day'}",High,Same Day
2,ST2023-008,I've accidentally deleted essential work docum...,"{""priority"": ""High"", ""eta"": ""3 Days""}","{'priority': 'High', 'eta': '3 Days'}",High,3 Days
3,ST2023-009,Despite being in close proximity to my Wi-Fi r...,"{""priority"": ""Medium"", ""eta"": ""3 Days""}","{'priority': 'Medium', 'eta': '3 Days'}",Medium,3 Days
4,ST2023-010,"My smartphone battery is draining rapidly, eve...","{""priority"": ""Medium"", ""eta"": ""3 Days""}","{'priority': 'Medium', 'eta': '3 Days'}",Medium,3 Days


In [56]:
# Dropping model_response and model_response_parsed columns
final_data_3 = data_with_parsed_model_output_3.drop(['model_response','model_response_parsed'], axis=1)
final_data_3.head()

Unnamed: 0,support_tick_id,support_ticket_text,priority,eta
0,ST2023-006,My internet connection has significantly slowe...,High,3 Days
1,ST2023-007,Urgent help required! My laptop refuses to sta...,High,Same Day
2,ST2023-008,I've accidentally deleted essential work docum...,High,3 Days
3,ST2023-009,Despite being in close proximity to my Wi-Fi r...,Medium,3 Days
4,ST2023-010,"My smartphone battery is draining rapidly, eve...",Medium,3 Days


In [57]:
# Concatenating two dataframes
final_data_3 = pd.concat([final_data_3,final_data_2[["category","tags"]]],axis=1)

In [58]:
# Selecting and reordering the columns
final_data_3 = final_data_3[["support_tick_id","support_ticket_text","category","tags","priority","eta"]]

In [59]:
# viewing newly updated dataframe
final_data_3

Unnamed: 0,support_tick_id,support_ticket_text,category,tags,priority,eta
0,ST2023-006,My internet connection has significantly slowe...,Technical Issues,"[Connection Issues, Internet, Slow Connection]",High,3 Days
1,ST2023-007,Urgent help required! My laptop refuses to sta...,Hardware Issues,"[Hardware, Startup Issues]",High,Same Day
2,ST2023-008,I've accidentally deleted essential work docum...,Data Recovery,[Data Loss],High,3 Days
3,ST2023-009,Despite being in close proximity to my Wi-Fi r...,Technical Issues,"[Wifi, Connection Issues]",Medium,3 Days
4,ST2023-010,"My smartphone battery is draining rapidly, eve...",Hardware Issues,[Battery],Medium,3 Days
5,ST2023-011,I'm locked out of my online banking account an...,Data Recovery,"[Data Loss, Account Access]",High,1 Day
6,ST2023-012,"My computer's performance is sluggish, severel...",Technical Issues,"[Performance, Productivity]",High,3 Days
7,ST2023-013,I'm experiencing a recurring blue screen error...,Hardware Issues,"[Hardware, Blue Screen Error]",High,3 Days
8,ST2023-014,My external hard drive isn't being recognized ...,Data Recovery,[Data Loss],High,3 Days
9,ST2023-015,The graphics card in my gaming laptop seems to...,Hardware Issues,"[Graphics Card, Hardware, Performance]",High,3 Days
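The priority and ETA values in the table above are free-text strings, so they need to be converted before tickets can be ordered into a work queue. The sketch below is one possible approach: `eta_to_days`, `PRIORITY_RANK`, and `queue_key` are hypothetical names, the "Low" level and the hour/week units are assumptions not seen in the model output, and the three sample tickets are copied from the table.

```python
import re

# Lower rank is handled first. "Low" is an assumed level for completeness.
PRIORITY_RANK = {"High": 0, "Medium": 1, "Low": 2}
UNIT_IN_DAYS = {"hour": 1 / 24, "day": 1.0, "week": 7.0}


def eta_to_days(eta):
    """Convert strings such as '3 Days', '1 Day' or 'Same Day' to days, else None."""
    text = eta.strip().lower()
    if text == "same day":
        return 0.0
    match = re.fullmatch(r"(\d+(?:\.\d+)?)\s*(hour|day|week)s?", text)
    if match is None:
        return None
    return float(match.group(1)) * UNIT_IN_DAYS[match.group(2)]


def queue_key(ticket):
    """Sort by priority rank, then by ETA; unknown values go last."""
    days = eta_to_days(ticket["eta"])
    return (
        PRIORITY_RANK.get(ticket["priority"], len(PRIORITY_RANK)),
        days if days is not None else float("inf"),
    )


tickets = [
    {"id": "ST2023-006", "priority": "High", "eta": "3 Days"},
    {"id": "ST2023-007", "priority": "High", "eta": "Same Day"},
    {"id": "ST2023-009", "priority": "Medium", "eta": "3 Days"},
]
queue = sorted(tickets, key=queue_key)
print([t["id"] for t in queue])
```

Ordering by shortest ETA within a priority level is a choice made for this example; a team might equally prefer oldest-first.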


## **Task 4 - Creating a Draft Response**

In [60]:
# creating a copy of the data
data_4 = data.copy()

In [61]:
# Function to generate output from the model
def response_4(prompt,ticket,category,tags,priority,eta):
    model_output = llm(
      f"""
      Q: {prompt}
      Support ticket: {ticket}
      Category : {category}
      Tags : {tags}
      Priority: {priority}
      ETA: {eta}
      A:
      """,
      max_tokens=1024,  # defining the maximum number of tokens the model should generate for this task.
      stop=["Q:", "\n"],
      temperature=0.01,  # temperature set to 0.01(low) for deterministic output.
      echo=False,
    )

    temp_output = model_output["choices"][0]["text"]

    return temp_output

In [62]:
# Prompt creation for task 4
prompt_4 = """
    As an AI, your task is to draft a response for IT support tickets. 
    Consider customer satisfaction, the severity of the issue, and the company's responsibility. 
    Your response should be in the format: {"response": "This is a draft response"}. 
    Ensure your response is empathetic, professional, helpful, and concise.
    Please ensure that all curly braces are closed and there are no additional characters in the output.
"""

**Note**: For this task, `response_4` returns the raw model text without trimming it to the first `{`. The prompt still requests a JSON object of the form `{"response": "..."}`, which the *`extract_json_data`* function parses in a later cell.

In [63]:
# Applying response_4 to each ticket's text together with its category, tags, priority and ETA
start = time.time()
data_4['model_response'] = final_data_3[['support_ticket_text','category','tags','priority','eta']].apply(lambda x: response_4(prompt_4, x['support_ticket_text'], x['category'], x['tags'], x['priority'], x['eta']), axis=1)
end = time.time()

Llama.generate: prefix-match hit


KeyboardInterrupt: 
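Because the run above was interrupted, no responses were stored and the remaining cells have no output. For long generations, one option is to store each result as soon as it is produced so that a stopped run can be resumed. The sketch below illustrates the idea with a stand-in generator; `run_with_checkpoint` and `flaky_generate` are hypothetical names, and in the notebook the `generate` argument would wrap `response_4`.

```python
def run_with_checkpoint(items, generate, cache=None):
    """Apply `generate` to each (key, text) pair, skipping keys already in `cache`.

    Results are stored as they are produced, so calling the function again
    with the same cache resumes where the previous run stopped.
    """
    cache = {} if cache is None else cache
    try:
        for key, text in items:
            if key not in cache:
                cache[key] = generate(text)
    except KeyboardInterrupt:
        print(f"Interrupted with {len(cache)} result(s) saved; rerun to resume.")
    return cache


calls = {"n": 0}


def flaky_generate(text):
    """Stand-in for the model call that simulates an interruption on its 2nd call."""
    calls["n"] += 1
    if calls["n"] == 2:
        raise KeyboardInterrupt
    return text.upper()


tickets = [("ST2023-006", "slow internet"), ("ST2023-007", "laptop will not start")]
cache = run_with_checkpoint(tickets, flaky_generate)         # stops after one result
first_run_size = len(cache)
cache = run_with_checkpoint(tickets, flaky_generate, cache)  # resumes with the rest
print(cache)
```

Writing the cache to disk after each item (for example as JSON) would additionally protect against a kernel restart.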

In [None]:
# Time taken for output to be generated by model
print("Time taken:", round((end-start)),"seconds")

In [None]:
# Initial model output
data_4['model_response'].head(21)

In [None]:
# Support ticket text
i = 2
print(data_4.loc[i,'support_ticket_text'])

In [None]:
# Model output
print(data_4.loc[i,'model_response'])

In [None]:
# Applying the function to the model response
data_4['model_response_parsed'] = data_4['model_response'].apply(extract_json_data)
data_4['model_response_parsed'].head()

In [None]:
# Normalizing the model_response_parsed column
model_response_parsed_df_4 = pd.json_normalize(data_4['model_response_parsed'])
model_response_parsed_df_4.head(21)

In [None]:
# Concatenating two dataframes
final_data_4 = pd.concat([final_data_3,model_response_parsed_df_4],axis=1)

In [None]:
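# Renaming the column model_response_parsed to response
# (this has no effect if json_normalize already produced a column named "response")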
final_data_4.rename(columns={"model_response_parsed":"response"},inplace=True)

In [None]:
# Viewing newly updated dataframe
final_data_4

## **Model Output Analysis**

In [None]:
# Creating a copy of the dataframe of task 4
final_data = final_data_4.copy()

In [None]:
# Value counts of category
final_data['category'].value_counts()

The model output for **category**, consistent with the value counts of the parsed responses in Task 1:
> "Technical Issues" for 7 tickets

> "Hardware Issues" for 7 tickets

> "Data Recovery" for 7 tickets

In [None]:
# Value counts of priority
final_data["priority"].value_counts()

The model assigned a **priority** of:

> "High" to 19 tickets

> "Medium" to 2 tickets

In [None]:
# Value counts of ETA
final_data["eta"].value_counts()

The model assigned an **ETA** of:
> "3 Days" to 12 tickets

> "1 Day" to 9 tickets

Let's dive in a bit deeper here.

In [None]:
# Grouping the data by category and ETA.
final_data.groupby(['category', 'eta']).support_tick_id.count()

> Most "Data Recovery" tickets are estimated by the model to be resolved in "3 Days".

> Most "Hardware Issues" tickets are estimated by the model to be resolved in "3 Days".

> Most "Technical Issues" tickets are estimated by the model to be resolved in "1 Day".

In [None]:
# Final_data(output) generated by model.
final_data.head()

## **Actionable Insights and Recommendations**

**Insights:**
> Detailed company information in the prompts provides better model output.

> Priority levels should be adjusted to align with the business's actual capabilities.

> Responses can be tailored to a specific business by adjusting the prompts or the outputs.

> Categories can be adjusted or expanded to match the business's support needs.

> Overall, the model's estimated resolution times align with real-world scenarios.

**Recommendations:**
> Fine-tune the model on your company's data or profile for improved performance.

> Adjust the "priority" of support tickets to reflect priorities the business can actually support.

> Evaluate the format of the responses against the mail/response delivery methods in use.

> Test the model thoroughly on actual data before implementation.
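Testing the model on actual data can start with something as simple as comparing its categories with labels assigned by support staff. The sketch below shows that comparison. The predictions are the first five categories from Task 1, while the `expected` labels are placeholders invented for illustration and are NOT part of the dataset; `evaluate` is a hypothetical helper.

```python
from collections import Counter


def evaluate(predicted, expected):
    """Return overall accuracy and a Counter of (expected, predicted) mismatches."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    mismatches = Counter(
        (exp, pred) for pred, exp in zip(predicted, expected) if pred != exp
    )
    correct = len(predicted) - sum(mismatches.values())
    accuracy = correct / len(predicted) if predicted else 0.0
    return accuracy, mismatches


# Model categories for the first five tickets (from the Task 1 output).
predicted = ["Technical Issues", "Hardware Issues", "Data Recovery",
             "Technical Issues", "Hardware Issues"]
# Placeholder human labels, for illustration only (not from the dataset).
expected = ["Technical Issues", "Hardware Issues", "Data Recovery",
            "Technical Issues", "Technical Issues"]

accuracy, mismatches = evaluate(predicted, expected)
print(f"accuracy = {accuracy:.2f}")
print(dict(mismatches))
```

The mismatch counter shows which label pairs are confused most often, which can guide changes to the category definitions in the prompt.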