## **Problem Statement**

### Business Context

In today's dynamic business landscape, organizations are increasingly recognizing the pivotal role customer feedback plays in shaping the trajectory of their products and services. The ability to swiftly and effectively respond to customer input not only fosters enhanced customer experiences but also serves as a catalyst for growth, prolonged customer engagement, and the nurturing of lifetime value relationships.

This project looks for a structured approach: a method for identifying the most pressing issues, setting priorities, and allocating resources judiciously. One of the most effective strategies available is Support Ticket Categorization.


### Objective

Develop an advanced support ticket categorization system that accurately classifies incoming tickets, assigns relevant tags based on their content, prioritizes tickets for prompt resolution, and generates a first response based on customer sentiment.


## **Installing and Importing Necessary Libraries and Dependencies**

In [None]:
# Installation for GPU llama-cpp-python

!CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.85 --force-reinstall --no-cache-dir -q

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
torch 2.3.0+cu121 requires nvidia-cublas-cu12==12.1.3.1; platform_system == "Linux" and platform_machine == "x86_64", which is not installed.
torch 2.3.0+cu121 requires nvidia-cuda-cupti-cu12==12.1.10

In [None]:
# For downloading the models from HF Hub
!pip install huggingface_hub==0.20.3 pandas==1.5.3 -q

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
cudf-cu12 24.4.1 requires numpy<2.0a0,>=1.23, but you have numpy 2.0.0 which is incompatible.
cudf-cu12 24.4.1 requires pandas<2.2.2dev0,>=2.0, but you have pandas 1.5.3 which is incompatible.
google-colab 1.0.0 requires pandas==2.0.3, but you have pandas 1.5.3 which is incompatible.
ibis-framework 8.0.0 requires numpy<2,>=1, but you have numpy 2.0.0 which is incompatible.
transformers 4.41.2 requires huggingface-hub<1.0,>=0.23.0, but you have huggingface-hub 0.20.3 which is incompatible.

In [None]:
# Function to download the model from the Hugging Face model hub
from huggingface_hub import hf_hub_download

# Importing the Llama class from the llama_cpp module
from llama_cpp import Llama

# Importing the json module
import json

# for loading and manipulating data
import pandas as pd

# for time computations
import time

# Ignore warnings
import warnings
warnings.filterwarnings('ignore')

## **Loading the Data**

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
data = pd.read_csv('/content/drive/MyDrive/support_ticket_data.csv')

## **Data Overview**

### Checking the first 5 rows of the data

In [None]:
data.head()

Unnamed: 0,support_tick_id,support_ticket_text
0,ST2023-006,My internet connection has significantly slowe...
1,ST2023-007,Urgent help required! My laptop refuses to sta...
2,ST2023-008,I've accidentally deleted essential work docum...
3,ST2023-009,Despite being in close proximity to my Wi-Fi r...
4,ST2023-010,"My smartphone battery is draining rapidly, eve..."


### Checking the shape of the data

In [None]:
data.shape

(21, 2)

There are 21 rows and 2 columns

### Checking the missing values in the data

In [None]:
data.isnull()

Unnamed: 0,support_tick_id,support_ticket_text
0,False,False
1,False,False
2,False,False
3,False,False
4,False,False
5,False,False
6,False,False
7,False,False
8,False,False
9,False,False


There are no missing values. We can proceed with the model.
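
`data.isnull()` returns one boolean per cell, and the displayed output stops after ten rows, so per-column counts are an easier way to confirm the claim. A minimal sketch on a small stand-in DataFrame (the rows are made up for illustration; only the column names follow the notebook):

```python
import pandas as pd

# Hypothetical stand-in for the ticket data; column names follow the notebook.
sample = pd.DataFrame({
    "support_tick_id": ["ST-A", "ST-B", "ST-C"],
    "support_ticket_text": ["Laptop will not start", None, "Wi-Fi keeps dropping"],
})

# Count missing values per column instead of printing one boolean per cell.
missing_per_column = sample.isnull().sum()
print(missing_per_column)

# A single boolean answers "is anything missing at all?"
has_missing = bool(sample.isnull().values.any())
print(has_missing)
```

On the real data, `data.isnull().sum()` should report zero for both columns if the statement above holds.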

## **Model Building**

### Loading the model

In [None]:
model_name_or_path = "TheBloke/Mistral-7B-Instruct-v0.2-GGUF"
model_basename = "mistral-7b-instruct-v0.2.Q6_K.gguf"

In [None]:
model_path = hf_hub_download(
    repo_id=model_name_or_path, 
    filename=model_basename  
)

mistral-7b-instruct-v0.2.Q6_K.gguf:   0%|          | 0.00/5.94G [00:00<?, ?B/s]

In [None]:


llm = Llama(
    model_path=model_path,
    n_ctx=1024,  # Context window
)

AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | 
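
The install cell builds llama-cpp-python with cuBLAS, but the constructor call above does not pass `n_gpu_layers`, which defaults to 0 (no layers offloaded to the GPU). The sketch below shows one way the arguments could be assembled. `build_llama_kwargs` is a hypothetical helper, and the layer count of 32 is an assumption that would need tuning to the available GPU memory:

```python
def build_llama_kwargs(model_path, n_ctx=1024, n_gpu_layers=32):
    """Collect constructor arguments for llama_cpp.Llama (hypothetical helper)."""
    if n_ctx <= 0:
        raise ValueError("n_ctx must be positive")
    if n_gpu_layers < 0:
        raise ValueError("n_gpu_layers must be zero or positive in this sketch")
    return {
        "model_path": model_path,
        "n_ctx": n_ctx,                # context window, as in the notebook
        "n_gpu_layers": n_gpu_layers,  # assumed value; 0 keeps everything on the CPU
    }

kwargs = build_llama_kwargs("/path/to/model.gguf")
print(kwargs)
# In the notebook this would be used as: llm = Llama(**kwargs)
```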


### Utility functions

In [None]:
# defining a function to parse the JSON output from the model
def extract_json_data(json_str):
    try:
        # Find the indices of the opening and closing curly braces
        json_start = json_str.find('{')
        json_end = json_str.rfind('}')

        if json_start != -1 and json_end != -1:
            extracted_category = json_str[json_start:json_end + 1]  # Extract the JSON object
            data_dict = json.loads(extracted_category)
            return data_dict
        else:
            print(f"Warning: JSON object not found in response: {json_str}")
            return {}
    except json.JSONDecodeError as e:
        print(f"Error parsing JSON: {e}")
        return {}
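
To see how the parser behaves, the sketch below repeats the function in condensed form and runs it on three made-up model outputs: clean JSON, JSON wrapped in extra text, and a response truncated to a lone `{`.

```python
import json

def extract_json_data(json_str):
    """Same logic as the notebook's helper: parse the outermost {...} span."""
    try:
        json_start = json_str.find('{')
        json_end = json_str.rfind('}')
        if json_start != -1 and json_end != -1:
            return json.loads(json_str[json_start:json_end + 1])
        print(f"Warning: JSON object not found in response: {json_str}")
        return {}
    except json.JSONDecodeError as e:
        print(f"Error parsing JSON: {e}")
        return {}

# Made-up model outputs for illustration
clean = extract_json_data('{"label": "Data Issues"}')
wrapped = extract_json_data('Sure! {"label": "Hardware Issues"} Hope this helps.')
truncated = extract_json_data('{')
print(clean, wrapped, truncated)
```

Because failures return an empty dict rather than raising, a later step has to check for `{}` explicitly if incomplete rows should be noticed.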

## **Task 1: Ticket Categorization and Returning Structured Output**

In [None]:
# creating a copy of the data
data_1 = data.copy()

In [None]:
# Defining the response function for Task 1
def response_1(prompt, ticket):
    model_output = llm(
      f"""
      Q: {prompt}
      Support ticket: {ticket}
      A:
      """,
      max_tokens=1024,  # maximum number of tokens the model may generate
      stop=["Q:", "\n"],  # stop at the next question or at the first newline
      temperature=0,  # temperature 0 for repeatable output
      echo=False,
    )

    temp_output = model_output["choices"][0]["text"]
    # Keep everything from the first '{' onward; fall back to the raw text if no '{' is found
    if '{' in temp_output:
        final_output = temp_output[temp_output.index('{'):]
    else:
        final_output = temp_output

    return final_output

In [None]:
prompt_1 = """
    You are an AI analyzing customer-generated feedback and support tickets. Tag the given tickets using one or more of the below mentioned categories only depending upon the content of the article:
    - Internet and Connectivity Issues
    - Hardware Issues
    - Data Issues
    - Software and Performance Issues

    Format the output as a JSON object with a single key-value pair as shown below:
    {"label": "your_label_prediction"}
"""

The model performs better with clear, delimited instructions, so the prompt lists four categories that best cover the issues customers report in their tickets.

**Note**: The output of the model should be in a structured format (JSON format).

In [None]:
start = time.time()
data_1['model_response'] = data_1['support_ticket_text'].apply(lambda x: response_1(prompt_1, x))
end = time.time()

Llama.generate: prefix-match hit


In [None]:
print("Time taken ",(end-start))

Time taken  315.2261652946472


In [None]:
data_1['model_response'].head(21)

0         {"label": "Internet and Connectivity Issues"}
1                          {"label": "Hardware Issues"}
2                              {"label": "Data Issues"}
3         {"label": "Internet and Connectivity Issues"}
4                          {"label": "Hardware Issues"}
5                              {"label": "Data Issues"}
6          {"label": "Software and Performance Issues"}
7                          {"label": "Hardware Issues"}
8                          {"label": "Hardware Issues"}
9                          {"label": "Hardware Issues"}
10                             {"label": "Data Issues"}
11                         {"label": "Hardware Issues"}
12                         {"label": "Hardware Issues"}
13                         {"label": "Hardware Issues"}
14                         {"label": "Hardware Issues"}
15        {"label": "Internet and Connectivity Issues"}
16        {"label": "Internet and Connectivity Issues"}
17                             {"label": "Data I

Checking the response for a single ticket

In [None]:
i = 2
print(data_1.loc[i, 'support_ticket_text'])

I've accidentally deleted essential work documents, causing substantial data loss. I understand the need to avoid further actions on my device. Can you please prioritize the data recovery process and guide me through it?


In [None]:
print(data_1.loc[i, 'model_response'])

{"label": "Data Issues"}


In [None]:
# applying the function to the model response
data_1['model_response_parsed'] = data_1['model_response'].apply(extract_json_data)
data_1['model_response_parsed'].head()

0    {'label': 'Internet and Connectivity Issues'}
1                     {'label': 'Hardware Issues'}
2                         {'label': 'Data Issues'}
3    {'label': 'Internet and Connectivity Issues'}
4                     {'label': 'Hardware Issues'}
Name: model_response_parsed, dtype: object

In [None]:
data_1['model_response_parsed'].value_counts()

{'label': 'Hardware Issues'}                                 10
{'label': 'Internet and Connectivity Issues'}                 5
{'label': 'Data Issues'}                                      4
{'label': 'Software and Performance Issues'}                  1
{'label': 'Software and Performance Issues, Data Issues'}     1
Name: model_response_parsed, dtype: int64

The model used all four categories, and one response combined two of them ("Software and Performance Issues, Data Issues").
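
Since the prompt allows more than one category per ticket, a combined label arrives as a single comma-separated string. A minimal sketch of how such labels could be split and checked against the four allowed categories; `split_labels` and `unknown_labels` are hypothetical helpers, not part of the notebook:

```python
# The four categories listed in prompt_1
ALLOWED_CATEGORIES = {
    "Internet and Connectivity Issues",
    "Hardware Issues",
    "Data Issues",
    "Software and Performance Issues",
}

def split_labels(label):
    """Split a comma-separated label string into a list of trimmed labels."""
    parts = [part.strip() for part in label.split(",")]
    return [part for part in parts if part]

def unknown_labels(label):
    """Return the labels that are not among the allowed categories."""
    return [part for part in split_labels(label) if part not in ALLOWED_CATEGORIES]

combined = split_labels("Software and Performance Issues, Data Issues")
print(combined)
print(unknown_labels("Hardware Issues, Billing"))
```

Splitting on commas is safe here only because none of the category names contains a comma.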

In [None]:
# Normalizing the model_response_parsed column
model_response_parsed_df_1 = pd.json_normalize(data_1['model_response_parsed'])
model_response_parsed_df_1.head()

Unnamed: 0,label
0,Internet and Connectivity Issues
1,Hardware Issues
2,Data Issues
3,Internet and Connectivity Issues
4,Hardware Issues


In [None]:
# Concatenating the two DataFrames
data_with_parsed_model_output_1 = pd.concat([data_1, model_response_parsed_df_1], axis=1)
data_with_parsed_model_output_1.head()

Unnamed: 0,support_tick_id,support_ticket_text,model_response,model_response_parsed,label
0,ST2023-006,My internet connection has significantly slowe...,"{""label"": ""Internet and Connectivity Issues""}",{'label': 'Internet and Connectivity Issues'},Internet and Connectivity Issues
1,ST2023-007,Urgent help required! My laptop refuses to sta...,"{""label"": ""Hardware Issues""}",{'label': 'Hardware Issues'},Hardware Issues
2,ST2023-008,I've accidentally deleted essential work docum...,"{""label"": ""Data Issues""}",{'label': 'Data Issues'},Data Issues
3,ST2023-009,Despite being in close proximity to my Wi-Fi r...,"{""label"": ""Internet and Connectivity Issues""}",{'label': 'Internet and Connectivity Issues'},Internet and Connectivity Issues
4,ST2023-010,"My smartphone battery is draining rapidly, eve...","{""label"": ""Hardware Issues""}",{'label': 'Hardware Issues'},Hardware Issues


In [None]:
# Dropping model_response and model_response_parsed columns
final_data_1 = data_with_parsed_model_output_1.drop(['model_response','model_response_parsed'], axis=1)
final_data_1.head()

Unnamed: 0,support_tick_id,support_ticket_text,label
0,ST2023-006,My internet connection has significantly slowe...,Internet and Connectivity Issues
1,ST2023-007,Urgent help required! My laptop refuses to sta...,Hardware Issues
2,ST2023-008,I've accidentally deleted essential work docum...,Data Issues
3,ST2023-009,Despite being in close proximity to my Wi-Fi r...,Internet and Connectivity Issues
4,ST2023-010,"My smartphone battery is draining rapidly, eve...",Hardware Issues


In [None]:
# Renaming columns
final_data_1.rename(columns={'label': 'Category'}, inplace=True)
# Displaying the DataFrame
final_data_1.head()

Unnamed: 0,support_tick_id,support_ticket_text,Category
0,ST2023-006,My internet connection has significantly slowe...,Internet and Connectivity Issues
1,ST2023-007,Urgent help required! My laptop refuses to sta...,Hardware Issues
2,ST2023-008,I've accidentally deleted essential work docum...,Data Issues
3,ST2023-009,Despite being in close proximity to my Wi-Fi r...,Internet and Connectivity Issues
4,ST2023-010,"My smartphone battery is draining rapidly, eve...",Hardware Issues


## **Task 2: Creating Tags**

In [None]:
# creating a copy of the data
data_2 = data.copy()

In [None]:
def response_2(prompt,ticket,category):
    model_output = llm(
      f"""
      Q: {prompt}
      Support ticket: {ticket}
      Category: {category}
      A:
      """,
      max_tokens=1024,  # maximum number of tokens the model may generate
      stop=["Q:", "\n"],
      temperature=0,  # temperature 0 for repeatable output
      echo=False,
    )

    temp_output = model_output["choices"][0]["text"]
    # Check if '{' is in temp_output
    if '{' in temp_output:
        final_output = temp_output[temp_output.index('{'):]
    else:
        final_output = temp_output  # Handle the case where '{' is not found

    return final_output

In [None]:
prompt_2 = """
   You are an AI analyzing customer-generated feedback and support tickets. Generate different tags for the support ticket, as much three, summarizing the principal characteristics of the issue and do not give any empty response.

    Format the output as a JSON object with a single key-value pair as shown below:
    {"Tags": "your_tags_prediction"}
"""

In this case the prompt had to state explicitly that empty responses are not allowed, because the model otherwise did not return tags for every row.

**Note**: The output of the model should be in a structured format (JSON format).

Running the model on data

In [None]:
start = time.time()
data_2["model_response"] = final_data_1[['support_ticket_text', 'Category']].apply(lambda x: response_2(prompt_2, x['support_ticket_text'], x['Category']), axis=1)
end = time.time()

Llama.generate: prefix-match hit


In [None]:
print("Time taken ",end-start)

Time taken  433.69974541664124


In [None]:
# Writing the code to check the first five rows of the data to confirm whether the new column has been added
data_2["model_response"].head()

0    {"Tags": ["Slow Internet Connection", "Frequen...
1    {"Tags": ["Urgent", "Hardware Failure", "Prese...
2    {"Tags": ["Data Loss", "Document Recovery", "U...
3    {"Tags": ["Wi-Fi Signal Weakness", "Persistent...
4    {"Tags": ["Battery Drain", "Hardware Malfuncti...
Name: model_response, dtype: object

In [None]:
i = 2
print(data_2.loc[i, 'support_ticket_text'])

I've accidentally deleted essential work documents, causing substantial data loss. I understand the need to avoid further actions on my device. Can you please prioritize the data recovery process and guide me through it?


In [None]:
print(data_2.loc[i, 'model_response'])

{"Tags": ["Data Loss", "Document Recovery", "Urgency"]}


In [None]:
i = 4
print(data_2.loc[i, 'support_ticket_text'])

My smartphone battery is draining rapidly, even with minimal use. Can you help me identify and rectify this battery issue?


In [None]:
print(data_2.loc[i, 'model_response'])

{"Tags": ["Battery Drain", "Hardware Malfunction", "Power Management"]}
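
The prompt asks for a single string under `"Tags"`, yet the model returned a JSON list. A hedged sketch of a normalizer that accepts either shape and caps the result at three tags, as the prompt requests; `normalize_tags` is a hypothetical helper:

```python
def normalize_tags(value, max_tags=3):
    """Return a clean list of tags from either a list or a comma-separated string."""
    if isinstance(value, str):
        candidates = value.split(",")
    elif isinstance(value, (list, tuple)):
        candidates = [str(item) for item in value]
    else:
        # Missing or unexpected values become an empty tag list
        candidates = []
    cleaned = [tag.strip() for tag in candidates if tag.strip()]
    return cleaned[:max_tags]

from_list = normalize_tags(["Data Loss", "Document Recovery", "Urgency"])
from_string = normalize_tags("Battery Drain, Power Management")
from_missing = normalize_tags(None)
print(from_list, from_string, from_missing)
```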


In [None]:
# Applying the function to the model response
data_2['model_response_parsed'] = data_2['model_response'].apply(extract_json_data)

- Checking on all data

In [None]:
data_2["model_response_parsed"]

0     {'Tags': ['Slow Internet Connection', 'Frequen...
1     {'Tags': ['Urgent', 'Hardware Failure', 'Prese...
2     {'Tags': ['Data Loss', 'Document Recovery', 'U...
3     {'Tags': ['Wi-Fi Signal Weakness', 'Persistent...
4     {'Tags': ['Battery Drain', 'Hardware Malfuncti...
5     {'Tags': ['Account Access', 'Password Reset', ...
6     {'Tags': ['Software Optimization', 'Performanc...
7     {'Tags': ['Blue Screen Error', 'Hardware Malfu...
8     {'Tags': ['External Hard Drive', 'Data Recover...
9     {'Tags': ['Graphics Card Malfunction', 'Gaming...
10    {'Tags': ['Data Loss', 'File Recovery', 'USB D...
11    {'Tags': ['Display Issue', 'Urgent', 'Hardware...
12    {'Tags': ['Water Damage', 'Laptop Repair', 'Da...
13    {'Tags': ['Physical Damage', 'Data Recovery', ...
14    {'Tags': ['Touchpad Malfunction', 'Hardware Is...
15    {'Tags': ['Internet Dropouts', 'Connectivity I...
16    {'Tags': ['Wi-Fi Instability', 'Connectivity I...
17    {'Tags': ['Data Loss', 'File Recovery', 'A

In [None]:
# Normalizing the model_response_parsed column
model_response_parsed_df_2 = pd.json_normalize(data_2['model_response_parsed'])
model_response_parsed_df_2.head()

Unnamed: 0,Tags
0,"[Slow Internet Connection, Frequent Disconnect..."
1,"[Urgent, Hardware Failure, Presentation Issue]"
2,"[Data Loss, Document Recovery, Urgency]"
3,"[Wi-Fi Signal Weakness, Persistent Issue, Trou..."
4,"[Battery Drain, Hardware Malfunction, Power Ma..."


In [None]:
# Concatenating the two DataFrames
data_with_parsed_model_output_2 = pd.concat([data_2, model_response_parsed_df_2], axis=1)
data_with_parsed_model_output_2.head()

Unnamed: 0,support_tick_id,support_ticket_text,model_response,model_response_parsed,Tags
0,ST2023-006,My internet connection has significantly slowe...,"{""Tags"": [""Slow Internet Connection"", ""Frequen...","{'Tags': ['Slow Internet Connection', 'Frequen...","[Slow Internet Connection, Frequent Disconnect..."
1,ST2023-007,Urgent help required! My laptop refuses to sta...,"{""Tags"": [""Urgent"", ""Hardware Failure"", ""Prese...","{'Tags': ['Urgent', 'Hardware Failure', 'Prese...","[Urgent, Hardware Failure, Presentation Issue]"
2,ST2023-008,I've accidentally deleted essential work docum...,"{""Tags"": [""Data Loss"", ""Document Recovery"", ""U...","{'Tags': ['Data Loss', 'Document Recovery', 'U...","[Data Loss, Document Recovery, Urgency]"
3,ST2023-009,Despite being in close proximity to my Wi-Fi r...,"{""Tags"": [""Wi-Fi Signal Weakness"", ""Persistent...","{'Tags': ['Wi-Fi Signal Weakness', 'Persistent...","[Wi-Fi Signal Weakness, Persistent Issue, Trou..."
4,ST2023-010,"My smartphone battery is draining rapidly, eve...","{""Tags"": [""Battery Drain"", ""Hardware Malfuncti...","{'Tags': ['Battery Drain', 'Hardware Malfuncti...","[Battery Drain, Hardware Malfunction, Power Ma..."


In [None]:
# Dropping model_response and model_response_parsed columns
final_data_2 = data_with_parsed_model_output_2.drop(['model_response','model_response_parsed'], axis=1)
final_data_2.head()

Unnamed: 0,support_tick_id,support_ticket_text,Tags
0,ST2023-006,My internet connection has significantly slowe...,"[Slow Internet Connection, Frequent Disconnect..."
1,ST2023-007,Urgent help required! My laptop refuses to sta...,"[Urgent, Hardware Failure, Presentation Issue]"
2,ST2023-008,I've accidentally deleted essential work docum...,"[Data Loss, Document Recovery, Urgency]"
3,ST2023-009,Despite being in close proximity to my Wi-Fi r...,"[Wi-Fi Signal Weakness, Persistent Issue, Trou..."
4,ST2023-010,"My smartphone battery is draining rapidly, eve...","[Battery Drain, Hardware Malfunction, Power Ma..."


In [None]:
# Checking the value counts of the Tags column
final_data_2['Tags'].value_counts()

[External Hard Drive, Data Recovery, Hardware Malfunction]                   2
[Slow Internet Connection, Frequent Disconnections, Productivity Impact]     1
[Display Issue, Urgent, Hardware Malfunction]                                1
[Internet Speed Issue, Frequent Disconnections, Productivity Loss]           1
[Data Loss, File Recovery, Accidental Format]                                1
[Wi-Fi Instability, Connectivity Interference, Productivity Impact]          1
[Internet Dropouts, Connectivity Issues, Urgent]                             1
[Touchpad Malfunction, Hardware Issue, Laptop Usability]                     1
[Physical Damage, Data Recovery, Hard Drive]                                 1
[Water Damage, Laptop Repair, Data Recovery]                                 1
[Data Loss, File Recovery, USB Drive]                                        1
[Urgent, Hardware Failure, Presentation Issue]                               1
[Graphics Card Malfunction, Gaming Performance Issue

In [None]:
final_data_2 = pd.concat([final_data_2,final_data_1["Category"]],axis=1)
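
`pd.concat(..., axis=1)` aligns rows by index, which works here because every frame derives from `data.copy()` and keeps the same index. A sketch of an alternative that joins on the ticket ID explicitly, which may be safer if rows are ever filtered or reordered. The two small frames are made up for illustration:

```python
import pandas as pd

# Hypothetical per-task results keyed by ticket ID
categories = pd.DataFrame({
    "support_tick_id": ["ST-A", "ST-B"],
    "Category": ["Hardware Issues", "Data Issues"],
})
# Deliberately in a different row order
tags = pd.DataFrame({
    "support_tick_id": ["ST-B", "ST-A"],
    "Tags": [["Data Loss"], ["Battery Drain"]],
})

# validate="one_to_one" raises if a ticket ID appears more than once on either side
merged = categories.merge(tags, on="support_tick_id", how="left", validate="one_to_one")
print(merged)
```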

In [None]:
final_data_2 = final_data_2[["support_tick_id","support_ticket_text","Category","Tags"]]
final_data_2

Unnamed: 0,support_tick_id,support_ticket_text,Category,Tags
0,ST2023-006,My internet connection has significantly slowe...,Internet and Connectivity Issues,"[Slow Internet Connection, Frequent Disconnect..."
1,ST2023-007,Urgent help required! My laptop refuses to sta...,Hardware Issues,"[Urgent, Hardware Failure, Presentation Issue]"
2,ST2023-008,I've accidentally deleted essential work docum...,Data Issues,"[Data Loss, Document Recovery, Urgency]"
3,ST2023-009,Despite being in close proximity to my Wi-Fi r...,Internet and Connectivity Issues,"[Wi-Fi Signal Weakness, Persistent Issue, Trou..."
4,ST2023-010,"My smartphone battery is draining rapidly, eve...",Hardware Issues,"[Battery Drain, Hardware Malfunction, Power Ma..."
5,ST2023-011,I'm locked out of my online banking account an...,Data Issues,"[Account Access, Password Reset, Urgent]"
6,ST2023-012,"My computer's performance is sluggish, severel...",Software and Performance Issues,"[Software Optimization, Performance Issue, Pro..."
7,ST2023-013,I'm experiencing a recurring blue screen error...,Hardware Issues,"[Blue Screen Error, Hardware Malfunction, Cras..."
8,ST2023-014,My external hard drive isn't being recognized ...,Hardware Issues,"[External Hard Drive, Data Recovery, Hardware ..."
9,ST2023-015,The graphics card in my gaming laptop seems to...,Hardware Issues,"[Graphics Card Malfunction, Gaming Performance..."


## **Task 3: Assigning Priority and ETA**

In [None]:
# creating a copy of the data
data_3 = data.copy()

In [None]:
def response_3(prompt,ticket,category,tags):
    model_output = llm(
      f"""
      Q: {prompt}
      Support ticket: {ticket}
      Category: {category}
      Tags: {tags}
      A:
      """,
      max_tokens=1024,  # maximum number of tokens the model may generate
      stop=["Q:", "\n"],
      temperature=0.01,  # near-zero temperature for mostly repeatable output
      echo=False,
    )

    temp_output = model_output["choices"][0]["text"]
    # Check if '{' is in temp_output
    if '{' in temp_output:
        final_output = temp_output[temp_output.index('{'):]
    else:
        final_output = temp_output  # Handle the case where '{' is not found

    return final_output

In [None]:
prompt_3 = """
You are an AI analyzing customer-generated feedback and support tickets. Your task is twofold:

1. Assign an estimated time for solving the issue described, based on the severity and nature of the problem, as if you were a computer technician. The possible results for the estimated time of resolution are: 'immediate', '1 day', '2-3 days', and 'a week'.

2. Assign a priority for solving each ticket based on the urgency of the message, sentiment of the customer, and severity of the problem. Categorize the urgency of the problem as "low", "medium", or "high". Please ensure a balanced distribution and do not categorize every case as high or medium.

Consider the following examples for reference:

Example 1:
Customer Issue: "My internet connection has significantly slowed down, impacting my ability to work from home."
{"ETA": "2-3 days", "Priority": "medium"}

Example 2:
Customer Issue: "Urgent help required! My laptop refuses to start, and I have critical work deadlines."
{"ETA": "immediate", "Priority": "high"}

Example 3:
Customer Issue: "I've accidentally deleted essential work documents from my computer. Can you help me recover them?"
{"ETA": "1 day", "Priority": "medium"}

Example 4:
Customer Issue: "Despite being in close proximity to my Wi-Fi router, my laptop keeps disconnecting intermittently."
{"ETA": "2-3 days", "Priority": "medium"}

Example 5:
Customer Issue: "My smartphone battery is draining rapidly, even with minimal use."
{"ETA": "immediate", "Priority": "high"}

Example 6:
Customer Issue: "The touchpad on my laptop has stopped working, making it difficult to use. Can you help me troubleshoot this hardware issue?"
{"ETA": "1 day", "Priority": "low"}

Important:
- Only return the JSON object.
- Do not include any additional information or text.
- Ensure there are no extra spaces in the JSON object.
- Do not provide explanations, only the JSON object.
- Each response must contain a valid JSON object with both ETA and Priority. Do not leave any response empty.

Here is the support ticket for your analysis:

<support_ticket_text_here>

Please return the estimated time of resolution and priority in the following JSON format only:
{"ETA": "your_eta_prediction", "Priority": "your_priority_prediction"}

Remember:
- Only return the JSON object.
- No additional text or explanations.
- Ensure the JSON object is correctly formatted.
- Each response must contain a valid JSON object with both ETA and Priority. Every ticket must receive a response.
"""



For this task the prompt had to spell out how the information should be processed, the exact answer format expected, and several examples, because the model initially responded for only a few rows, and those answers were long, vague, and not in JSON format.


**Note**: The output of the model should be in a structured format (JSON format).

In [None]:
# Applying the response_3 function to the ticket text, category, and tags
start = time.time()
data_3['model_response'] = final_data_2[['support_ticket_text', 'Category', 'Tags']].apply(lambda x: response_3(prompt_3, x['support_ticket_text'], x['Category'], x['Tags']), axis=1)
end = time.time()

Llama.generate: prefix-match hit


In [None]:
print("Time taken ",(end-start))

Time taken  313.9327256679535
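
For a rough sense of throughput, the three reported timings can be divided by the 21 tickets. This is plain arithmetic on the numbers printed in this notebook, not a benchmark:

```python
n_tickets = 21  # from data.shape

# Wall-clock seconds printed after each task
seconds_by_task = {
    "Task 1 (categories)": 315.2261652946472,
    "Task 2 (tags)": 433.69974541664124,
    "Task 3 (priority and ETA)": 313.9327256679535,
}

per_ticket = {task: secs / n_tickets for task, secs in seconds_by_task.items()}
total_minutes = sum(seconds_by_task.values()) / 60

for task, secs in per_ticket.items():
    print(f"{task}: {secs:.1f} s per ticket")
print(f"Total: {total_minutes:.1f} minutes")
```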


In [None]:
# Checking all 21 rows to confirm that the new column has been added
data_3['model_response'].head(21)

0                                              {
1       {"ETA": "immediate", "Priority": "high"}
2           {"ETA": "1 day", "Priority": "high"}
3      {"ETA": "2-3 days", "Priority": "medium"}
4     {"ETA": "immediate", "Priority": "medium"}
5       {"ETA": "immediate", "Priority": "high"}
6      {"ETA": "2-3 days", "Priority": "medium"}
7        {"ETA": "2-3 days", "Priority": "high"}
8        {"ETA": "2-3 days", "Priority": "high"}
9      {"ETA": "2-3 days", "Priority": "medium"}
10        {"ETA": "1 day", "Priority": "medium"}
11      {"ETA": "immediate", "Priority": "high"}
12         {"ETA": "a week", "Priority": "high"}
13         {"ETA": "a week", "Priority": "high"}
14           {"ETA": "1 day", "Priority": "low"}
15      {"ETA": "immediate", "Priority": "high"}
16                                             {
17        {"ETA": "1 day", "Priority": "medium"}
18       {"ETA": "2-3 days", "Priority": "high"}
19      {"ETA": "immediate", "Priority": "high"}
20       {"ETA": "2-

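Even after several prompt revisions, the model did not return a usable response for every row: rows 0 and 16 came back as a lone `{`. A likely cause is the `"\n"` entry in the `stop` list of `response_3`: if the model starts a multi-line JSON object, generation ends at the first line break.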
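
Incomplete responses can be detected programmatically before they reach the final table. A minimal sketch that checks each parsed response against the values allowed by `prompt_3` and lists the rows that would need to be re-run; the helper names and the sample responses are hypothetical:

```python
# Values allowed by prompt_3
ALLOWED_ETA = {"immediate", "1 day", "2-3 days", "a week"}
ALLOWED_PRIORITY = {"low", "medium", "high"}

def is_valid_eta_priority(parsed):
    """True if the parsed response has an allowed ETA and an allowed Priority."""
    return (
        isinstance(parsed, dict)
        and parsed.get("ETA") in ALLOWED_ETA
        and parsed.get("Priority") in ALLOWED_PRIORITY
    )

def rows_to_rerun(parsed_responses):
    """Return the positions of responses that failed validation."""
    return [i for i, parsed in enumerate(parsed_responses) if not is_valid_eta_priority(parsed)]

# Made-up parsed responses: one empty, one valid, one with an ETA outside the allowed set
parsed_responses = [
    {},
    {"ETA": "immediate", "Priority": "high"},
    {"ETA": "soon", "Priority": "high"},
]
failed = rows_to_rerun(parsed_responses)
print(failed)
```

In the notebook, the same check could be applied to `data_3['model_response_parsed']` to find the tickets that still need an answer.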

In [None]:
i = 0
print(data_3.loc[i, 'support_ticket_text'])

My internet connection has significantly slowed down over the past two days, making it challenging to work efficiently from home. Frequent disconnections are causing major disruptions. Please assist in resolving this connectivity issue promptly.


In [None]:
print(data_3.loc[i, 'model_response'])

{


In [None]:
# Applying the function to the model response
data_3['model_response_parsed'] = data_3['model_response'].apply(extract_json_data)
data_3['model_response_parsed'].head()



0                                            {}
1      {'ETA': 'immediate', 'Priority': 'high'}
2          {'ETA': '1 day', 'Priority': 'high'}
3     {'ETA': '2-3 days', 'Priority': 'medium'}
4    {'ETA': 'immediate', 'Priority': 'medium'}
Name: model_response_parsed, dtype: object

In [None]:
# Normalizing the model_response_parsed column
model_response_parsed_df_3 = pd.json_normalize(data_3['model_response_parsed'])
model_response_parsed_df_3.head(21)

Unnamed: 0,ETA,Priority
0,,
1,immediate,high
2,1 day,high
3,2-3 days,medium
4,immediate,medium
5,immediate,high
6,2-3 days,medium
7,2-3 days,high
8,2-3 days,high
9,2-3 days,medium


In [None]:
# Concatenating the two DataFrames
data_with_parsed_model_output_3 = pd.concat([data_3, model_response_parsed_df_3], axis=1)
data_with_parsed_model_output_3.head()

Unnamed: 0,support_tick_id,support_ticket_text,model_response,model_response_parsed,ETA,Priority
0,ST2023-006,My internet connection has significantly slowe...,{,{},,
1,ST2023-007,Urgent help required! My laptop refuses to sta...,"{""ETA"": ""immediate"", ""Priority"": ""high""}","{'ETA': 'immediate', 'Priority': 'high'}",immediate,high
2,ST2023-008,I've accidentally deleted essential work docum...,"{""ETA"": ""1 day"", ""Priority"": ""high""}","{'ETA': '1 day', 'Priority': 'high'}",1 day,high
3,ST2023-009,Despite being in close proximity to my Wi-Fi r...,"{""ETA"": ""2-3 days"", ""Priority"": ""medium""}","{'ETA': '2-3 days', 'Priority': 'medium'}",2-3 days,medium
4,ST2023-010,"My smartphone battery is draining rapidly, eve...","{""ETA"": ""immediate"", ""Priority"": ""medium""}","{'ETA': 'immediate', 'Priority': 'medium'}",immediate,medium


In [None]:
# Dropping model_response and model_response_parsed columns
final_data_3 = data_with_parsed_model_output_3.drop(['model_response','model_response_parsed'], axis=1)
final_data_3.head()

Unnamed: 0,support_tick_id,support_ticket_text,ETA,Priority
0,ST2023-006,My internet connection has significantly slowe...,,
1,ST2023-007,Urgent help required! My laptop refuses to sta...,immediate,high
2,ST2023-008,I've accidentally deleted essential work docum...,1 day,high
3,ST2023-009,Despite being in close proximity to my Wi-Fi r...,2-3 days,medium
4,ST2023-010,"My smartphone battery is draining rapidly, eve...",immediate,medium


In [None]:
final_data_3 = pd.concat([final_data_3,final_data_2[["Category","Tags"]]],axis=1)

In [None]:
final_data_3 = final_data_3[["support_tick_id","support_ticket_text","Category","Tags","Priority","ETA"]]

In [None]:
final_data_3

Unnamed: 0,support_tick_id,support_ticket_text,Category,Tags,Priority,ETA
0,ST2023-006,My internet connection has significantly slowe...,Internet and Connectivity Issues,"[Slow Internet Connection, Frequent Disconnect...",,
1,ST2023-007,Urgent help required! My laptop refuses to sta...,Hardware Issues,"[Urgent, Hardware Failure, Presentation Issue]",high,immediate
2,ST2023-008,I've accidentally deleted essential work docum...,Data Issues,"[Data Loss, Document Recovery, Urgency]",high,1 day
3,ST2023-009,Despite being in close proximity to my Wi-Fi r...,Internet and Connectivity Issues,"[Wi-Fi Signal Weakness, Persistent Issue, Trou...",medium,2-3 days
4,ST2023-010,"My smartphone battery is draining rapidly, eve...",Hardware Issues,"[Battery Drain, Hardware Malfunction, Power Ma...",medium,immediate
5,ST2023-011,I'm locked out of my online banking account an...,Data Issues,"[Account Access, Password Reset, Urgent]",high,immediate
6,ST2023-012,"My computer's performance is sluggish, severel...",Software and Performance Issues,"[Software Optimization, Performance Issue, Pro...",medium,2-3 days
7,ST2023-013,I'm experiencing a recurring blue screen error...,Hardware Issues,"[Blue Screen Error, Hardware Malfunction, Cras...",high,2-3 days
8,ST2023-014,My external hard drive isn't being recognized ...,Hardware Issues,"[External Hard Drive, Data Recovery, Hardware ...",high,2-3 days
9,ST2023-015,The graphics card in my gaming laptop seems to...,Hardware Issues,"[Graphics Card Malfunction, Gaming Performance...",medium,2-3 days
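
The `pd.concat(..., axis=1)` calls above align rows by index, which works as long as every dataframe keeps the original row order. As a possible alternative, and assuming both frames carry the `support_tick_id` column, the tables could be joined on the ticket id instead. The frames below are small toy stand-ins, not the notebook's data.

```python
import pandas as pd

# Toy stand-ins for the Task 3 and Task 2 results (values abbreviated from the notebook).
eta_df = pd.DataFrame({
    "support_tick_id": ["ST2023-007", "ST2023-008"],
    "Priority": ["high", "high"],
    "ETA": ["immediate", "1 day"],
})
category_df = pd.DataFrame({
    "support_tick_id": ["ST2023-008", "ST2023-007"],  # deliberately in a different order
    "Category": ["Data Issues", "Hardware Issues"],
})

# Join on the ticket id so that row order does not matter;
# validate raises an error if an id is duplicated on either side.
merged = eta_df.merge(category_df, on="support_tick_id", how="left", validate="one_to_one")
print(merged)
```

A keyed join costs one extra argument but keeps each category attached to the right ticket even if one frame is filtered or re-sorted.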


## **Task 4 - Creating a Draft Response**

In [None]:
# creating a copy of the data
data_4 = data.copy()

In [None]:
def response_4(prompt,ticket,category,tags,priority,eta):
    model_output = llm(
      f"""
      Q: {prompt}
      Support ticket: {ticket}
      Category : {category}
      Tags : {tags}
      Priority: {priority}
      ETA: {eta}
      A:
      """,
      max_tokens=1024,  # maximum number of tokens the model may generate for this task
      stop=["Q:", "\n"],  # stop at the next question or at the first newline
      temperature=0.1,  # low temperature for more deterministic output
      echo=False,
    )

    temp_output = model_output["choices"][0]["text"]


    return temp_output

We adjusted the temperature slightly because good results were harder to achieve on this task.

In [None]:
prompt_4 = """
You are an AI analyzing customer-generated feedback and support tickets. Your task is to provide a warm and informative draft response for each case. Consider the estimated time of resolution (ETA) and the specific issue raised by the customer. Ensure your response is detailed, creative, and directly related to the customer's question.

Include in your response:
- Acknowledgment of the customer's issue.
- A warm greeting.
- Explanation of the estimated time for resolution (ETA).
- Reassurance or steps being taken to resolve the problem.

Examples:

Example 1:
Customer Issue: "My internet connection has significantly slowed down over the past two days, making it challenging to work efficiently from home. Frequent disconnections are causing major disruptions. Please assist in resolving this connectivity issue promptly."
Draft Response: "Hello [Customer Name], thank you for reaching out. We apologize for the inconvenience caused by the slowdown in your internet connection. Our technical team is investigating, and we estimate it will be resolved within 2-3 days. We appreciate your patience and value your trust in our service."

Example 2:
Customer Issue: "My laptop refuses to start, and I have critical work deadlines. Urgent help is needed!"
Draft Response: "Dear [Customer Name], we're sorry to hear about your laptop issue. We understand the urgency, especially with your critical work deadlines. Our technicians are prioritizing your case, and we will provide immediate assistance to resolve the problem within 24 hours. Thank you for your understanding."

Example 3:
Customer Issue: "I accidentally deleted essential work documents from my computer. Can you help me recover them?"
Draft Response: "Hello [Customer Name], we're here to help recover your deleted work documents. Our data recovery specialists will begin the process immediately, and we expect to restore the files within 1 day. We will keep you updated on the progress. Thank you for your patience."

Important:
- Make the response customer-centric and empathetic.
- Provide a clear timeframe (ETA) for resolution.
- Maintain a warm and professional tone.

The output should be a plain string.
"""

To get the best possible response and reduce the risk of hallucination, we gave the model specific and clear instructions, examples of possible cases with suitable responses, and a list of the points the draft should emphasize.
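
One detail worth checking before calling the model is what the assembled prompt looks like for a ticket with missing fields. Row 0 has no parsed `Priority` or `ETA`, so the f-string in `response_4` would render them as `nan`. The sketch below uses two hypothetical helpers, `fmt` and `build_prompt`, that are not part of the notebook; it lays out the same fields without calling the model and substitutes a readable default for missing values.

```python
import math

def fmt(value, default="not available"):
    """Render a prompt field, replacing None or NaN with a readable default."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return default
    return str(value)

def build_prompt(prompt, ticket, category, tags, priority, eta):
    """Assemble the same fields response_4 sends, without calling the model."""
    return (
        f"Q: {prompt}\n"
        f"Support ticket: {fmt(ticket)}\n"
        f"Category: {fmt(category)}\n"
        f"Tags: {fmt(tags)}\n"
        f"Priority: {fmt(priority)}\n"
        f"ETA: {fmt(eta)}\n"
        "A:"
    )

text = build_prompt(
    "Draft a warm reply.",  # shortened placeholder for prompt_4
    "My internet connection has significantly slowed down over the past two days.",
    "Internet and Connectivity Issues",
    ["Slow Internet Connection"],
    float("nan"),  # row 0 has no parsed Priority
    float("nan"),  # row 0 has no parsed ETA
)
print(text)
```

Printing the prompt for one or two rows is a cheap way to confirm that the model sees the intended fields before a long generation run.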

In [None]:
# Applying response_4 to the ticket text, category, tags, priority and ETA of each row
start = time.time()
data_4['model_response'] = final_data_3.apply(
    lambda x: response_4(prompt_4, x['support_ticket_text'], x['Category'], x['Tags'], x['Priority'], x['ETA']),
    axis=1,
)
end = time.time()

Llama.generate: prefix-match hit


In [None]:
print("Time taken",(end-start))

Time taken 1545.2083041667938
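
For context, the total can be converted into a rough per-ticket figure. The snippet assumes 21 tickets were processed in this run, which matches the category counts later in the notebook.

```python
total_seconds = 1545.2083041667938  # value printed above
n_tickets = 21                      # assumption: number of tickets processed in this run

per_ticket = total_seconds / n_tickets
print(f"Total: {total_seconds / 60:.1f} min, about {per_ticket:.1f} s per ticket")
```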


In [None]:
# Checking the first 21 rows of the data to confirm that the new column has been added
data_4['model_response'].head(21)

0                                Hello [Customer Name],
1     Hello [Customer Name], we're truly sorry for t...
2     Hello [Customer Name], we deeply sympathize wi...
3     Hello [Customer Name], we're sorry to hear tha...
4                                Hello [Customer Name],
5     Hello [Customer Name], we're truly sorry to he...
6                                Hello [Customer Name],
7     Hello [Customer Name], we're sorry to hear abo...
8     Hello [Customer Name], we're truly sorry for t...
9                                Hello [Customer Name],
10    Hello [Customer Name], we're sorry for any inc...
11    Hello [Customer Name], we're truly sorry for t...
12    Hello [Customer Name], we're truly sorry to he...
13    Hello [Customer Name], we're truly sorry to he...
14    Hello [Customer Name], we're sorry to hear tha...
15    Hello [Customer Name], we're truly sorry to he...
16    Hello [Customer Name], we're sorry to hear abo...
17    Hello [Customer Name], we're truly sorry t

- We got good responses for most tickets; the exceptions are the rows where only the greeting was returned (for example rows 0, 4, 6 and 9 above).
- We tried many prompt variants, and this was the best result we achieved.
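
Rows that contain only the greeting could be detected automatically and queued for another attempt. The heuristic below is a minimal sketch under stated assumptions: the helper name `looks_incomplete`, the eight-word threshold, and the second sample string are illustrative and do not come from the notebook.

```python
def looks_incomplete(response, min_words=8):
    """Heuristic (assumption): very short drafts, or drafts ending in a comma, are incomplete."""
    text = (response or "").strip()
    return len(text.split()) < min_words or text.endswith(",")

responses = [
    "Hello [Customer Name],",  # greeting-only output, as seen above
    # Illustrative complete draft, not actual model output:
    "Hello [Customer Name], thank you for reaching out. We expect to resolve this within 1 day.",
]
to_retry = [i for i, r in enumerate(responses) if looks_incomplete(r)]
print(to_retry)
```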

In [None]:
i = 20
print(data_4.loc[i, 'support_ticket_text'])

I hope this message finds you well. I am writing to report a perplexing issue I've encountered with my work computer in recent days. 
The problem seems to involve a combination of unusual software behavior and unexpected data loss.
Over the past week, I've observed that certain software applications on my computer have been 
behaving erratically. For example, some applications freeze randomly, while others exhibit unexplained crashes. Additionally, 
I've noticed that some files and documents that were previously saved on my desktop have mysteriously 
disappeared. These issues are causing significant disruptions to my daily tasks and workflow.
While I don't have specific instructions on how to resolve this complex problem,
I suspect there may be an underlying issue with the system or software compatibility. 
I kindly request your expertise and assistance in thoroughly diagnosing this intricate problem. 
Your insights and guidance would be greatly appreciated. If you require any addition

In [None]:
print(data_4.loc[i, 'model_response'])

Hello [Customer Name],


Here we can see an example of a row for which the model returned only the greeting instead of a full response.
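
One possible explanation, offered as an assumption rather than a confirmed diagnosis, is the `"\n"` entry in the `stop` list of `response_4`: if the model breaks the line right after the greeting, generation ends there. The function below is a simplified pure-Python model of how stop sequences cut a generation; it does not call `llm`, and the sample text is made up for illustration.

```python
def apply_stop_sequences(text, stop):
    """Cut text at the earliest occurrence of any stop sequence (simplified model of `stop`)."""
    cut = len(text)
    for s in stop:
        idx = text.find(s)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]

# Illustrative raw generation, not actual model output.
raw = "Hello [Customer Name],\nWe are sorry for the inconvenience."

with_newline_stop = apply_stop_sequences(raw, ["Q:", "\n"])
without_newline_stop = apply_stop_sequences(raw, ["Q:"])
print(repr(with_newline_stop))
print(repr(without_newline_stop))
```

If this is indeed the cause, dropping `"\n"` from the stop list, or asking for a single-paragraph reply in the prompt, would be worth testing on the affected rows.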

In [None]:
final_data_4 = pd.concat([final_data_3,data_4["model_response"]],axis=1)

In [None]:
final_data_4.rename(columns={"model_response":"Response"},inplace=True)

In [None]:
final_data_4

Unnamed: 0,support_tick_id,support_ticket_text,Category,Tags,Priority,ETA,Response
0,ST2023-006,My internet connection has significantly slowe...,Internet and Connectivity Issues,"[Slow Internet Connection, Frequent Disconnect...",,,"Hello [Customer Name],"
1,ST2023-007,Urgent help required! My laptop refuses to sta...,Hardware Issues,"[Urgent, Hardware Failure, Presentation Issue]",high,immediate,"Hello [Customer Name], we're truly sorry for t..."
2,ST2023-008,I've accidentally deleted essential work docum...,Data Issues,"[Data Loss, Document Recovery, Urgency]",high,1 day,"Hello [Customer Name], we deeply sympathize wi..."
3,ST2023-009,Despite being in close proximity to my Wi-Fi r...,Internet and Connectivity Issues,"[Wi-Fi Signal Weakness, Persistent Issue, Trou...",medium,2-3 days,"Hello [Customer Name], we're sorry to hear tha..."
4,ST2023-010,"My smartphone battery is draining rapidly, eve...",Hardware Issues,"[Battery Drain, Hardware Malfunction, Power Ma...",medium,immediate,"Hello [Customer Name],"
5,ST2023-011,I'm locked out of my online banking account an...,Data Issues,"[Account Access, Password Reset, Urgent]",high,immediate,"Hello [Customer Name], we're truly sorry to he..."
6,ST2023-012,"My computer's performance is sluggish, severel...",Software and Performance Issues,"[Software Optimization, Performance Issue, Pro...",medium,2-3 days,"Hello [Customer Name],"
7,ST2023-013,I'm experiencing a recurring blue screen error...,Hardware Issues,"[Blue Screen Error, Hardware Malfunction, Cras...",high,2-3 days,"Hello [Customer Name], we're sorry to hear abo..."
8,ST2023-014,My external hard drive isn't being recognized ...,Hardware Issues,"[External Hard Drive, Data Recovery, Hardware ...",high,2-3 days,"Hello [Customer Name], we're truly sorry for t..."
9,ST2023-015,The graphics card in my gaming laptop seems to...,Hardware Issues,"[Graphics Card Malfunction, Gaming Performance...",medium,2-3 days,"Hello [Customer Name],"


## **Model Output Analysis**

In [None]:
# Creating a copy of the dataframe of task-4
final_data = final_data_4.copy()

In [None]:
final_data['Category'].value_counts()

Hardware Issues                                 10
Internet and Connectivity Issues                 5
Data Issues                                      4
Software and Performance Issues                  1
Software and Performance Issues, Data Issues     1
Name: Category, dtype: int64

In [None]:
final_data["Priority"].value_counts()

high      12
medium     6
low        1
Name: Priority, dtype: int64

In [None]:
final_data["ETA"].value_counts()

2-3 days     7
immediate    6
1 day        4
a week       2
Name: ETA, dtype: int64

Let's dive in a bit deeper here.

In [None]:
final_data.groupby(['Category', 'ETA']).support_tick_id.count()

Category                                      ETA      
Data Issues                                   1 day        3
                                              immediate    1
Hardware Issues                               1 day        1
                                              2-3 days     4
                                              a week       2
                                              immediate    3
Internet and Connectivity Issues              2-3 days     1
                                              immediate    2
Software and Performance Issues               2-3 days     1
Software and Performance Issues, Data Issues  2-3 days     1
Name: support_tick_id, dtype: int64

## **Actionable Insights and Recommendations**

- The main pain point for customers is hardware, which accounts for about 48% of tickets (10 of 21) and is therefore the company's top area for improvement. Internet and connectivity issues (24%) and data issues (19%) follow; together these three categories cover about 90% of all tickets.
- The good news is that software performance generates very few tickets (2 of 21, one of them shared with data issues), which suggests that this product performs well and is worth promoting.
- About 57% of tickets (12 of 21 in the priority counts above) were assigned high priority, and roughly two thirds of those relate to hardware, which marks it as a critical component and reinforces the insight above. On the positive side, about 30% of tickets have an immediate ETA and about 50% have an ETA of 1 to 3 days.
- They could conduct a review of the hardware commonly used by customers to identify any patterns or defective products. Another solution could be establishing partnerships with hardware manufacturers for quicker replacements and better support.
- For internet and data issues, the estimated ETA is 1 day or less in about two thirds of cases (6 of 9 tickets), which suggests the company handles these problems well.
- The company should develop and promote a robust data backup and recovery service for customers, and could provide educational resources or workshops on preventing data loss and using backup solutions effectively.
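
The category shares quoted in these insights can be re-derived from the `value_counts()` and `groupby` outputs in the previous section. The snippet below hard-codes those printed counts rather than reading the dataframe, so it is a check of the arithmetic only.

```python
# Counts copied from final_data['Category'].value_counts() above.
category_counts = {
    "Hardware Issues": 10,
    "Internet and Connectivity Issues": 5,
    "Data Issues": 4,
    "Software and Performance Issues": 1,
    "Software and Performance Issues, Data Issues": 1,
}
total = sum(category_counts.values())
shares = {name: round(100 * n / total, 1) for name, n in category_counts.items()}
for name, pct in shares.items():
    print(f"{name}: {pct}%")

# Combined share of the three largest categories.
top3 = sum(sorted(category_counts.values(), reverse=True)[:3])
print(f"Top three categories: {100 * top3 / total:.1f}%")

# Internet and data tickets with an ETA of one day or less, from the groupby output:
# Data (1 day) + Data (immediate) + Internet (immediate).
fast = 3 + 1 + 2
internet_and_data = 5 + 4
print(f"Internet/data within 1 day: {100 * fast / internet_and_data:.1f}%")
```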