# Absenteeism Integration

With the preprocessing steps, the trained model, and the fitted scaler in place, we can integrate them all into a single file.

Here, I run the preprocessing and the model on a new, unseen absenteeism dataset. The same pipeline can later be deployed if we want.

## Import the Module and Libraries

In [1]:
import pandas as pd

from absenteeism_module import *

## Load the Data

In [2]:
pd.read_csv('../Dataset/Absenteeism_new_data.csv')

Unnamed: 0,ID,Reason for Absence,Date,Transportation Expense,Distance to Work,Age,Daily Work Load Average,Body Mass Index,Education,Children,Pets
0,22,27,01/06/2018,179,26,30,237.656,19,3,0,0
1,10,7,04/06/2018,361,52,28,237.656,27,1,1,4
2,14,23,06/06/2018,155,12,34,237.656,25,1,2,0
3,17,25,08/06/2018,179,22,40,237.656,22,2,2,0
4,14,10,08/06/2018,155,12,34,237.656,25,1,2,0
5,28,11,11/06/2018,225,26,28,237.656,24,1,1,2
6,16,7,13/06/2018,118,15,46,275.089,25,1,2,0
7,22,27,13/06/2018,179,26,30,275.089,19,3,0,0
8,34,26,15/06/2018,118,10,37,275.089,28,1,0,0
9,34,10,20/06/2018,118,10,37,275.089,28,1,0,0


## Create an Absenteeism Model

In [3]:
# create an instance of the model
# pass the model and scaler created previously
model = absenteeism_model('model', 'scaler')
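
The source of `absenteeism_module` is not shown in this notebook, so the following is only a guess at what the constructor might do with the two names it receives. It is a minimal sketch, assuming `'model'` and `'scaler'` are pickle files saved earlier; the class `PickledPipeline` and the placeholder objects are hypothetical and stand in for the real model and scaler.

```python
import os
import pickle
import tempfile


class PickledPipeline:
    """Hypothetical stand-in for absenteeism_model; not the module's actual code."""

    def __init__(self, model_file, scaler_file):
        # Assumption: both arguments are paths to pickle files saved earlier.
        with open(model_file, "rb") as mf, open(scaler_file, "rb") as sf:
            self.reg = pickle.load(mf)
            self.scaler = pickle.load(sf)
        self.data = None  # would be filled in by a later load-and-clean step


# Demo with placeholder objects instead of a real model and scaler.
tmp_dir = tempfile.mkdtemp()
model_path = os.path.join(tmp_dir, "model")
scaler_path = os.path.join(tmp_dir, "scaler")
with open(model_path, "wb") as f:
    pickle.dump({"kind": "placeholder model"}, f)
with open(scaler_path, "wb") as f:
    pickle.dump({"kind": "placeholder scaler"}, f)

pipeline = PickledPipeline(model_path, scaler_path)
print(pipeline.reg["kind"], "|", pipeline.scaler["kind"])
```

Loading both objects once in the constructor means every later call can reuse the same fitted scaler and model.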

In [4]:
# load and clean the data
model.load_and_clean_data('../Dataset/Absenteeism_new_data.csv')

In [5]:
# check the scaled data
model.data

Unnamed: 0,Reason 1,Reason 2,Reason 3,Reason 4,Month Value,Day of the Week,Transportation Expense,Age,Body Mass Index,Education,Children,Pets
0,0,0.0,0,1,-0.102784,1.344231,-0.654143,-1.006686,-1.819793,1,-0.91903,-0.58969
1,1,0.0,0,0,-0.102784,-1.359682,2.092381,-1.320435,0.061825,0,-0.01928,2.843016
2,0,0.0,0,1,-0.102784,-0.007725,-1.016322,-0.379188,-0.40858,0,0.880469,-0.58969
3,0,0.0,0,1,-0.102784,1.344231,-0.654143,0.562059,-1.114186,1,0.880469,-0.58969
4,1,0.0,0,0,-0.102784,1.344231,-1.016322,-0.379188,-0.40858,0,0.880469,-0.58969
5,1,0.0,0,0,-0.102784,-1.359682,0.040034,-1.320435,-0.643782,0,-0.01928,1.126663
6,1,0.0,0,0,-0.102784,-0.007725,-1.574681,1.503305,-0.40858,0,0.880469,-0.58969
7,0,0.0,0,1,-0.102784,-0.007725,-0.654143,-1.006686,-1.819793,1,-0.91903,-0.58969
8,0,0.0,0,1,-0.102784,1.344231,-1.574681,0.091435,0.297027,0,-0.91903,-0.58969
9,1,0.0,0,0,-0.102784,-0.007725,-1.574681,0.091435,0.297027,0,-0.91903,-0.58969
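
The `Month Value` and `Day of the Week` columns are derived from the raw `Date` column before scaling. The snippet below is a minimal sketch of that step, assuming the dates are day-first (`dd/mm/yyyy`); it is not taken from `absenteeism_module`, whose code is not shown here. It uses the first five dates from the raw table.

```python
import pandas as pd

# The first five dates from Absenteeism_new_data.csv, as shown in the raw table.
dates = pd.Series(
    ["01/06/2018", "04/06/2018", "06/06/2018", "08/06/2018", "08/06/2018"]
)

# Assumption: dates are day-first (dd/mm/yyyy).
parsed = pd.to_datetime(dates, format="%d/%m/%Y")

features = pd.DataFrame(
    {
        "Month Value": parsed.dt.month,
        "Day of the Week": parsed.dt.weekday,  # Monday = 0, Sunday = 6
    }
)
print(features)
```

For these rows the month is 6 throughout and the weekdays are 4, 0, 2, 4, 4, which agrees with the unscaled values reported in the prediction output.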


## Predict the Outputs

1: Excessive Absenteeism
0: Moderate Absenteeism

In [6]:
model.predicted_outputs()

Unnamed: 0,Reason 1,Reason 2,Reason 3,Reason 4,Month Value,Day of the Week,Transportation Expense,Distance to Work,Age,Daily Work Load Average,Body Mass Index,Education,Children,Pets,Probability,Prediction
0,0,0.0,0,1,6,4,179,26,30,237.656,19,1,0,0,0.122031,0
1,1,0.0,0,0,6,0,361,52,28,237.656,27,0,1,4,0.92017,1
2,0,0.0,0,1,6,2,155,12,34,237.656,25,0,2,0,0.27568,0
3,0,0.0,0,1,6,4,179,22,40,237.656,22,1,2,0,0.212287,0
4,1,0.0,0,0,6,4,155,12,34,237.656,25,0,2,0,0.652933,1
5,1,0.0,0,0,6,0,225,26,28,237.656,24,0,1,2,0.803103,1
6,1,0.0,0,0,6,2,118,15,46,275.089,25,0,2,0,0.582053,1
7,0,0.0,0,1,6,2,179,26,30,275.089,19,1,0,0,0.170836,0
8,0,0.0,0,1,6,4,118,10,37,275.089,28,0,0,0,0.083704,0
9,1,0.0,0,0,6,2,118,10,37,275.089,28,0,0,0,0.498037,0


As seen above, the model predicts whether a person is likely to be excessively absent from work given the input values. Every row receives a probability and a class label, so the pipeline runs end to end on the new data as expected.
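
The `Prediction` column is consistent with applying a 0.5 cutoff to the `Probability` column. The sketch below reproduces the labels from the probabilities in the table; the cutoff of 0.5 is an assumption (the usual default for a binary classifier), not something confirmed by the module's code.

```python
# Probabilities copied from the Probability column above.
probabilities = [0.122031, 0.92017, 0.27568, 0.212287, 0.652933,
                 0.803103, 0.582053, 0.170836, 0.083704, 0.498037]

THRESHOLD = 0.5  # assumption: the usual cutoff for a binary classifier

# 1 = excessive absenteeism, 0 = moderate absenteeism
predictions = [int(p > THRESHOLD) for p in probabilities]
print(predictions)  # [0, 1, 0, 0, 1, 1, 1, 0, 0, 0]
```

Note the last row: a probability of 0.498037 falls just below the cutoff and is labelled 0, so borderline cases deserve a second look before acting on the label.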

## Export the Predicted Outputs

In [7]:
model.predicted_outputs().to_csv('../Dataset/Absenteeism_predictions_new_data.csv', index=False)
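
Passing `index=False` keeps the DataFrame index out of the exported file. The following sketch shows the difference on a two-row frame shaped like the prediction output, written to in-memory buffers rather than to disk.

```python
import io

import pandas as pd

# Two rows shaped like the prediction output (values copied from the table above).
df = pd.DataFrame({"Probability": [0.122031, 0.92017], "Prediction": [0, 1]})

with_index = io.StringIO()
df.to_csv(with_index)  # default: the index is written as an unnamed first column
with_index.seek(0)

without_index = io.StringIO()
df.to_csv(without_index, index=False)
without_index.seek(0)

cols_with = pd.read_csv(with_index).columns.tolist()
cols_without = pd.read_csv(without_index).columns.tolist()
print(cols_with)     # ['Unnamed: 0', 'Probability', 'Prediction']
print(cols_without)  # ['Probability', 'Prediction']
```

Without `index=False`, reading the file back produces an extra `Unnamed: 0` column that would have to be dropped before further use.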