# Usage Instructions - MTTR Predictor
The MTTR (Mean Time to Resolution) predictor is an AI/ML-based solution that predicts the time a service agent will take to resolve a specific ticket or incident request. To make its predictions, the solution learns efficiency, experience, and workload-management metrics for the various ticket types that service agents have resolved. It helps businesses allocate tickets optimally, which lowers MTTR, shortens wait times, and reduces the number of open incidents, improving efficiency and SLA (Service Level Agreement) adherence.







## Contents

1. Import Libraries
1. Sample Input Data
1. Input Transformation
1. Encoding Features
1. Training and Optimizing the model
1. Predicting the Unassigned data

### Prerequisites

To run this algorithm, you need the following packages:
- scikit-learn
- pandas, numpy
- optuna
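
Before running the notebook, it can help to confirm that these packages are importable. The following is a minimal sketch, not part of the original solution; the helper name `missing_modules` is hypothetical, and note that scikit-learn is imported under the module name `sklearn`.

```python
from importlib.util import find_spec

# Import names (not PyPI names): scikit-learn is imported as "sklearn".
REQUIRED_MODULES = ["sklearn", "pandas", "numpy", "optuna"]


def missing_modules(modules):
    """Return the top-level module names that cannot be found."""
    return [name for name in modules if find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```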


### Input format
#### Input1:
Name of the file: <b>"Assigned.csv"</b><br>
This file contains historical incidents that have been resolved. From the incident-specific inputs listed below, the solution derives productivity measures such as efficiency, experience, and workload management across incident types for each incident manager, and uses them to make its predictions.<br><br>

<ul>
<li> Request ID: Unique alphanumeric identifier for the request, e.g. SRV101_254859</li>
<li> Request Submitted Date and Time: The date and time when the request was submitted (Preferred format: YYYY-MM-DD HH:MM:SS)</li>

<li> Request Priority: Priority of the request e.g. High, Medium, Low</li>
<li> Request Resolved Date and Time: The date and time when the request was closed (Only for closed requests, Preferred format: YYYY-MM-DD HH:MM:SS)</li>
<li> Request Category: Type of request e.g. "Authentication issue", "Server failure issue", "Access grant request"</li>
<li> Request Status: Status of the request e.g. Open/Closed</li>
<li> Request Resolved By: ID or name of the service provider who resolved the incident (for closed incidents) or to whom the incident has been assigned for resolution (for open incidents).</li>
</ul><br>
NOTE:
<ul>
<li>For assigned requests, all of the above data fields are mandatory ("Request Resolved By" must NOT be blank).</li>
<li>Provide a minimum of 10,000 records of assigned requests for better results.</li>
</ul>

#### Input2:
<br>
Name of the file: <b>"Unassigned.csv"</b>
<ul>
	
<li>This file contains incidents that have not yet been assigned to any service provider.</li>
<li>It requires the same incident-specific inputs as "Assigned.csv", except for Request Resolved Date and Time and Request Resolved By.</li>
</ul>
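
The column requirements above can be checked before any modelling starts. The sketch below is illustrative only: it uses the standard-library `csv` module, the helper name `missing_columns` is hypothetical, and the in-memory sample stands in for a real "Unassigned.csv".

```python
import csv
import io

ASSIGNED_COLUMNS = [
    "Request ID",
    "Request Submitted Date and Time",
    "Request Priority",
    "Request Resolved Date and Time",
    "Request Category",
    "Request Status",
    "Request Resolved By",
]
# Unassigned.csv omits the two resolution fields.
UNASSIGNED_COLUMNS = [
    c for c in ASSIGNED_COLUMNS
    if c not in ("Request Resolved Date and Time", "Request Resolved By")
]


def missing_columns(csv_file, required):
    """Return the required column names absent from the CSV header row."""
    header = next(csv.reader(csv_file), [])
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]


# In-memory stand-in for Unassigned.csv (illustrative data only).
sample = io.StringIO(
    "Request ID,Request Submitted Date and Time,Request Priority,"
    "Request Category,Request Status\n"
    "INC_10483,2016-04-01 14:32:00,3-Low,Request_Category_242,Open\n"
)
print(missing_columns(sample, UNASSIGNED_COLUMNS))  # → []
```

For real files, the same function can be called on `open('Assigned.csv', newline='')` with `ASSIGNED_COLUMNS`; an empty list means every mandatory column is present.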




## Import Libraries

In [1]:
import pandas as pd
import numpy as np
import optuna
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.preprocessing import OrdinalEncoder
from sklearn.model_selection import cross_val_score

## Sample Input Data

Read the input data:

In [2]:
Assigned = pd.read_csv('Assigned.csv', parse_dates=['Request Submitted Date and Time', 'Request Resolved Date and Time'])
Unassigned = pd.read_csv('Unassigned.csv', parse_dates=['Request Submitted Date and Time'])

In [3]:
Assigned.head(5)

| | Request ID | Request Resolved By | Request Submitted Date and Time | Request Priority | Request Resolved Date and Time | Request Category | Request Status |
|---|---|---|---|---|---|---|---|
| 0 | INC_1 | Solver_1 | 2016-04-01 04:04:00 | 3-Low | 2016-04-02 10:16:00 | Request_Category_1 | Closed |
| 1 | INC_2 | Solver_2 | 2016-04-01 04:52:00 | 3-Low | 2016-04-01 04:52:00 | request_Category_2 | Closed |
| 2 | INC_3 | Solver_3 | 2016-04-01 04:23:00 | 3-Low | 2016-04-01 04:23:00 | request_Category_2 | Closed |
| 3 | INC_4 | Solver_4 | 2016-04-01 04:10:00 | 3-Low | 2016-04-01 04:31:00 | Request_Category_3 | Closed |
| 4 | INC_5 | Solver_5 | 2016-04-01 05:27:00 | 3-Low | 2016-04-01 05:31:00 | Request_Category_4 | Closed |


In [4]:
Unassigned.head(5)

| | Request ID | Request Submitted Date and Time | Request Priority | Request Category | Request Status |
|---|---|---|---|---|---|
| 0 | INC_10483 | 2016-04-01 14:32:00 | 3-Low | Request_Category_242 | Open |
| 1 | INC_10484 | 2016-04-01 14:13:00 | 3-Low | Request_Category_5 | Open |
| 2 | INC_10485 | 2016-04-01 14:27:00 | 3-Low | Request_Category_3 | Open |
| 3 | INC_10486 | 2016-04-01 14:54:00 | 3-Low | Request_Category_20 | Open |
| 4 | INC_10487 | 2016-04-01 14:46:00 | 3-Low | Request_Category_33 | Open |


## Input Transformation

In [5]:
# Keep only closed requests; open requests have no resolution time.
Assigned = Assigned[Assigned['Request Status'] == 'Closed'].reset_index(drop=True)
# MTTR in minutes.
Assigned['mttr'] = Assigned['Request Resolved Date and Time'] - Assigned['Request Submitted Date and Time']
Assigned['mttr'] = Assigned['mttr'].dt.total_seconds() / 60
# Keep the numeric part of the priority, e.g. '3-Low' -> 3 (assumes this 'N-Label' format).
Assigned['Request Priority'] = Assigned['Request Priority'].apply(lambda x: int(x.split("-")[0]))
# Clamp negative durations to 0; .loc avoids chained assignment and its SettingWithCopyWarning.
Assigned.loc[Assigned['mttr'] < 0, 'mttr'] = 0
# Create the label: log(1 + MTTR).
label = np.log1p(Assigned['mttr'])
Assigned.drop(columns=['Request ID', 'Request Resolved By', 'Request Submitted Date and Time', 'Request Resolved Date and Time', 'Request Status', 'mttr'], inplace=True)
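
To make the transformation concrete, here is a minimal standard-library sketch of the same arithmetic for a single request. The helper names `mttr_minutes` and `to_label` are hypothetical and only mirror the steps in the cell above; the timestamps are those of INC_4 in the sample data.

```python
import math
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"  # the preferred timestamp format


def mttr_minutes(submitted, resolved):
    """Minutes between submission and resolution, floored at 0."""
    delta = datetime.strptime(resolved, FMT) - datetime.strptime(submitted, FMT)
    return max(delta.total_seconds() / 60, 0.0)


def to_label(minutes):
    """Training label: log(1 + minutes)."""
    return math.log1p(minutes)


# INC_4: submitted at 04:10, resolved at 04:31.
minutes = mttr_minutes("2016-04-01 04:10:00", "2016-04-01 04:31:00")
print(minutes)            # → 21.0
print(to_label(minutes))  # log(22), about 3.09
```

A request whose resolved timestamp precedes its submitted timestamp yields 0 minutes, and therefore a label of 0.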



## Encoding Features

In [6]:

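# Categories not seen during fitting are encoded as -99 at transform time.
oe = OrdinalEncoder(handle_unknown='use_encoded_value',
    unknown_value=-99)
# fit_transform returns a 2-D array; ravel() flattens it to a single column.
Assigned["Request Category"] = oe.fit_transform(Assigned["Request Category"].values.reshape(-1,1)).ravel()
Assigned.head()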

| | Request Priority | Request Category |
|---|---|---|
| 0 | 3 | 0.0 |
| 1 | 3 | 713.0 |
| 2 | 3 | 713.0 |
| 3 | 3 | 220.0 |
| 4 | 3 | 331.0 |
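
The effect of ordinal encoding with an unknown-value sentinel can be illustrated without scikit-learn. The sketch below is a plain-Python stand-in, not the library's implementation: it assigns each distinct category its rank in sorted order and maps anything unseen to -99. The function names are hypothetical.

```python
UNKNOWN = -99  # same sentinel as unknown_value in the encoder cell


def fit_ordinal(values):
    """Map each distinct category to its rank in sorted order."""
    return {cat: float(i) for i, cat in enumerate(sorted(set(values)))}


def transform_ordinal(mapping, values):
    """Encode values; categories not seen during fitting become UNKNOWN."""
    return [mapping.get(v, float(UNKNOWN)) for v in values]


train = ["Request_Category_1", "request_Category_2", "Request_Category_3"]
mapping = fit_ordinal(train)
print(transform_ordinal(mapping, ["Request_Category_3", "Request_Category_999"]))
# → [1.0, -99.0]
```

Because uppercase letters sort before lowercase ones, `request_Category_2` ranks after every `Request_Category_*` value here, which is consistent with the high code (713.0) it receives in the output above.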


## Training and Optimizing the model

In [7]:
def objective(trial):
    # Sample hyperparameters; the log-uniform floats are cast to integers before use.
    # suggest_float(..., log=True) replaces the deprecated suggest_loguniform.
    n_estimators = trial.suggest_int('n_estimators', 10, 200, step=10)
    max_depth = int(trial.suggest_float('max_depth', 1, 5, log=True))
    min_samples_leaf = round(trial.suggest_float('min_samples_leaf', 1, 5, log=True))
    min_samples_split = int(trial.suggest_float('min_samples_split', 10, 100, log=True))
    learning_rate = trial.suggest_float('learning_rate', 0.005, 0.2, log=True)
    clf = GradientBoostingRegressor(max_depth=max_depth, max_features='sqrt', min_samples_leaf=min_samples_leaf,
                           min_samples_split=min_samples_split, n_estimators=n_estimators, random_state=42, learning_rate=learning_rate)
    # cross_val_score uses the estimator's default scorer, which is R^2 for regressors.
    score = cross_val_score(clf, Assigned, label, cv=3)

    # Mean R^2 across the 3 folds; the study maximizes this value.
    return np.mean(score)



study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=100)

trial = study.best_trial
print('r2_score: {}'.format(trial.value))

print("Best hyperparameters: {}".format(trial.params))

[I 2021-11-25 14:48:04,139] A new study created in memory with name: no-name-0fbfe5aa-a90d-4399-9d6c-a59af7ae920f
[I 2021-11-25 14:48:04,264] Trial 0 finished with value: 0.031646317495677555 and parameters: {'n_estimators': 80, 'max_depth': 1.720760944399821, 'min_samples_leaf': 1.5396019866814692, 'min_samples_split': 19.507452795784488, 'learning_rate': 0.03018196791234342}. Best is trial 0 with value: 0.031646317495677555.
[I 2021-11-25 14:48:04,454] Trial 1 finished with value: 0.02035272746383676 and parameters: {'n_estimators': 70, 'max_depth': 2.6623993365732117, 'min_samples_leaf': 2.3908698763487384, 'min_samples_split': 31.467729443723183, 'learning_rate': 0.005659481164494329}. Best is trial 0 with value: 0.031646317495677555.
[I 2021-11-25 14:48:04,660] Trial 2 finished with value: 0.019361790477052887 and parameters: {'n_estimators': 140, 'max_depth': 1.1396151402150123, 'min_samples_leaf': 4.994918692793642, 'min_samples_sp

r2_score: 0.30091325551600917
Best hyperparameters: {'n_estimators': 200, 'max_depth': 4.086207589113958, 'min_samples_leaf': 1.6003936489285848, 'min_samples_split': 12.281736893410903, 'learning_rate': 0.1563741915066204}
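
Optuna stores the raw sampled values, so the best hyperparameters above are floats, whereas the objective casts them to integers before building the model. The sketch below shows one way to apply the same casts when refitting; `to_model_params` is a hypothetical helper, not part of the original notebook.

```python
def to_model_params(params):
    """Apply the objective's casts to the raw values stored by Optuna."""
    return {
        "n_estimators": params["n_estimators"],
        "max_depth": int(params["max_depth"]),                  # truncates
        "min_samples_leaf": round(params["min_samples_leaf"]),  # rounds
        "min_samples_split": int(params["min_samples_split"]),  # truncates
        "learning_rate": params["learning_rate"],
    }


# Best hyperparameters as reported by the study above.
best = {
    "n_estimators": 200,
    "max_depth": 4.086207589113958,
    "min_samples_leaf": 1.6003936489285848,
    "min_samples_split": 12.281736893410903,
    "learning_rate": 0.1563741915066204,
}
print(to_model_params(best))
# → {'n_estimators': 200, 'max_depth': 4, 'min_samples_leaf': 2,
#    'min_samples_split': 12, 'learning_rate': 0.1563741915066204}
```

Under these assumptions, the final model could be built with `GradientBoostingRegressor(max_features='sqrt', random_state=42, **to_model_params(trial.params))`, which guarantees it matches the configuration that was actually cross-validated.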


In [8]:
# Apply the same casts as in objective() so the refit model matches the tuned one;
# max_depth must be an integer.
clf = GradientBoostingRegressor(max_depth=int(trial.params['max_depth']),
                                     max_features='sqrt', 
                                     min_samples_leaf=round(trial.params['min_samples_leaf']),
                                     min_samples_split=int(trial.params['min_samples_split']), 
                                     n_estimators=trial.params['n_estimators'],
                                     random_state = 42,
                                     learning_rate=trial.params['learning_rate'])
clf.fit(Assigned,label)

GradientBoostingRegressor(learning_rate=0.1563741915066204,
                          max_depth=4, max_features='sqrt',
                          min_samples_leaf=2, min_samples_split=12,
                          n_estimators=200, random_state=42)

## Predicting the Unassigned data

### Transforming the Unassigned data

In [10]:
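# Apply the same transformations that were used for the training data.
Unassigned['Request Priority'] = Unassigned['Request Priority'].apply(lambda x: int(x.split("-")[0]))
# Reuse the fitted encoder; categories unseen in training are encoded as -99.
Unassigned['Request Category'] = oe.transform(Unassigned['Request Category'].values.reshape(-1,1)).ravel()
test = Unassigned[['Request Priority','Request Category']]
test.head()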

| | Request Priority | Request Category |
|---|---|---|
| 0 | 3 | 157.0 |
| 1 | 3 | 442.0 |
| 2 | 3 | 220.0 |
| 3 | 3 | 111.0 |
| 4 | 3 | 254.0 |


In [11]:
data = pd.DataFrame()
data['Request ID'] = Unassigned['Request ID']
# The model was trained on log(1 + MTTR), so invert the transform to report minutes.
data['MTTR'] = np.expm1(clf.predict(test))
data.to_csv('output.csv', index=False)
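
The model is fitted on `label = log(mttr + 1)`, so the values returned by `clf.predict` are on that log scale, and mapping them back to minutes requires the inverse transform. The following is a minimal standard-library sketch of that round trip; `log_to_minutes` is a hypothetical helper, and the clamp at 0 is an added assumption to guard against slightly negative predictions.

```python
import math


def log_to_minutes(log_prediction):
    """Invert label = log(1 + minutes); clamp negative results to 0."""
    return max(math.expm1(log_prediction), 0.0)


label = math.log1p(21.0)                 # label for a 21-minute resolution
print(round(log_to_minutes(label), 6))   # → 21.0
print(log_to_minutes(0.0))               # → 0.0
```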