### Create a Neural Network that classifies employees by job satisfaction (Tensorflow/Keras) [IBM dataset]

### Information about the dataset

- Number of inputs: **1479**
- Number of features: **33**
- Dataset: https://www.ibm.com/communities/analytics/watson-analytics-blog/hr-employee-attrition/
- Data fields description: https://www.ibm.com/communities/analytics/watson-analytics-blog/hr-employee-attrition/

In [1]:
import warnings; warnings.simplefilter('ignore')
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
#import warnings; warnings.simplefilter('ignore')

df = pd.read_csv('Classification.csv')

**DNN requires that y starts from 0 and continues like this 0,1,2,3...**

In [2]:
from sklearn import preprocessing
from sklearn.preprocessing import LabelEncoder
le = preprocessing.LabelEncoder()
df['JobSatisfaction'] = le.fit_transform(df['JobSatisfaction'])
#df['EnvironmentSatisfaction'] = le.fit_transform(df['EnvironmentSatisfaction'])

In [3]:
df['JobSatisfaction'].unique()

array([3, 1, 2, 0], dtype=int64)

**Assigning values to predict**

In [4]:
y = df['JobSatisfaction']

**For Keras y values should be encoded as dummy variables**

In [5]:
y_dummy= pd.get_dummies(df['JobSatisfaction'])

**Removing unneeded columns**

In [6]:
df = df.drop(['EmployeeCount', 'EmployeeNumber'], axis=1)

#remove columns to predict
df = df.drop(['EnvironmentSatisfaction', 'JobSatisfaction','RelationshipSatisfaction'], axis=1)

In [7]:
pd.set_option('display.max_columns', None)

In [8]:
df.head()

Unnamed: 0,Age,Attrition,BusinessTravel,DailyRate,Department,DistanceFromHome,Education,EducationField,Gender,HourlyRate,JobInvolvement,JobLevel,JobRole,MaritalStatus,MonthlyIncome,MonthlyRate,NumCompaniesWorked,OverTime,PercentSalaryHike,PerformanceRating,StockOptionLevel,TotalWorkingYears,TrainingTimesLastYear,WorkLifeBalance,YearsAtCompany,YearsInCurrentRole,YearsSinceLastPromotion,YearsWithCurrManager
0,41,Yes,Travel_Rarely,1102,Sales,1,2,Life Sciences,Female,94,3,2,Sales Executive,Single,5993,19479,8,Yes,11,3,0,8,0,1,6,4,0,5
1,49,No,Travel_Frequently,279,Research & Development,8,1,Life Sciences,Male,61,2,2,Research Scientist,Married,5130,24907,1,No,23,4,1,10,3,3,10,7,1,7
2,37,Yes,Travel_Rarely,1373,Research & Development,2,2,Other,Male,92,2,1,Laboratory Technician,Single,2090,2396,6,Yes,15,3,0,7,3,3,0,0,0,0
3,33,No,Travel_Frequently,1392,Research & Development,3,4,Life Sciences,Female,56,3,1,Research Scientist,Married,2909,23159,1,Yes,11,3,0,8,3,3,8,7,3,0
4,27,No,Travel_Rarely,591,Research & Development,2,1,Medical,Male,40,3,1,Laboratory Technician,Married,3468,16632,9,No,12,3,1,6,3,3,2,2,2,2


### Building a Keras model

**Dividing variables, putting them in categorical and nom-categorical dataframe to encode only categorical variables**

In [9]:
#indexes of columns with and without categorical variables
col_list = [1,2,4,6,7,8,10,11,12,13,17,19,20,23]
no_cat_var = [0,3,5,9,14,15,16,18, 21,22,24, 25,26,27]

df_un_cat = df.iloc[:, col_list]
df_un_non_cat = df.iloc[:, no_cat_var]

In [10]:
df_un_cat.head()

Unnamed: 0,Attrition,BusinessTravel,Department,Education,EducationField,Gender,JobInvolvement,JobLevel,JobRole,MaritalStatus,OverTime,PerformanceRating,StockOptionLevel,WorkLifeBalance
0,Yes,Travel_Rarely,Sales,2,Life Sciences,Female,3,2,Sales Executive,Single,Yes,3,0,1
1,No,Travel_Frequently,Research & Development,1,Life Sciences,Male,2,2,Research Scientist,Married,No,4,1,3
2,Yes,Travel_Rarely,Research & Development,2,Other,Male,2,1,Laboratory Technician,Single,Yes,3,0,3
3,No,Travel_Frequently,Research & Development,4,Life Sciences,Female,3,1,Research Scientist,Married,Yes,3,0,3
4,No,Travel_Rarely,Research & Development,1,Medical,Male,3,1,Laboratory Technician,Married,No,3,1,3


In [11]:
df_un_non_cat.head()

Unnamed: 0,Age,DailyRate,DistanceFromHome,HourlyRate,MonthlyIncome,MonthlyRate,NumCompaniesWorked,PercentSalaryHike,TotalWorkingYears,TrainingTimesLastYear,YearsAtCompany,YearsInCurrentRole,YearsSinceLastPromotion,YearsWithCurrManager
0,41,1102,1,94,5993,19479,8,11,8,0,6,4,0,5
1,49,279,8,61,5130,24907,1,23,10,3,10,7,1,7
2,37,1373,2,92,2090,2396,6,15,7,3,0,0,0,0
3,33,1392,3,56,2909,23159,1,11,8,3,8,7,3,0
4,27,591,2,40,3468,16632,9,12,6,3,2,2,2,2


**Conversion so get_dummies works as it should**

In [12]:
df_un_cat['Education']= df_un_cat['Education'].astype('category')
df_un_cat['JobInvolvement'] = df_un_cat['JobInvolvement'].astype('category')
df_un_cat['JobLevel']= df_un_cat['JobLevel'].astype('category')
df_un_cat['PerformanceRating']= df_un_cat['PerformanceRating'].astype('category')
df_un_cat['StockOptionLevel']= df_un_cat['StockOptionLevel'].astype('category')
df_un_cat['WorkLifeBalance'] = df_un_cat['WorkLifeBalance'].astype('category')

In [13]:
df = pd.get_dummies(df_un_cat, drop_first=True)

**Merging converted into dummies categorical variables and non-categorical variables**

In [14]:
X = pd.merge(df, df_un_non_cat, left_index=True, right_index=True)

In [15]:
X.head()

Unnamed: 0,Attrition_Yes,BusinessTravel_Travel_Frequently,BusinessTravel_Travel_Rarely,Department_Research & Development,Department_Sales,Education_2,Education_3,Education_4,Education_5,EducationField_Life Sciences,EducationField_Marketing,EducationField_Medical,EducationField_Other,EducationField_Technical Degree,Gender_Male,JobInvolvement_2,JobInvolvement_3,JobInvolvement_4,JobLevel_2,JobLevel_3,JobLevel_4,JobLevel_5,JobRole_Human Resources,JobRole_Laboratory Technician,JobRole_Manager,JobRole_Manufacturing Director,JobRole_Research Director,JobRole_Research Scientist,JobRole_Sales Executive,JobRole_Sales Representative,MaritalStatus_Married,MaritalStatus_Single,OverTime_Yes,PerformanceRating_4,StockOptionLevel_1,StockOptionLevel_2,StockOptionLevel_3,WorkLifeBalance_2,WorkLifeBalance_3,WorkLifeBalance_4,Age,DailyRate,DistanceFromHome,HourlyRate,MonthlyIncome,MonthlyRate,NumCompaniesWorked,PercentSalaryHike,TotalWorkingYears,TrainingTimesLastYear,YearsAtCompany,YearsInCurrentRole,YearsSinceLastPromotion,YearsWithCurrManager
0,1,0,1,0,1,1,0,0,0,1,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,1,0,0,1,1,0,0,0,0,0,0,0,41,1102,1,94,5993,19479,8,11,8,0,6,4,0,5
1,0,1,0,1,0,0,0,0,0,1,0,0,0,0,1,1,0,0,1,0,0,0,0,0,0,0,0,1,0,0,1,0,0,1,1,0,0,0,1,0,49,279,8,61,5130,24907,1,23,10,3,10,7,1,7
2,1,0,1,1,0,1,0,0,0,0,0,0,1,0,1,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,1,0,0,0,0,0,1,0,37,1373,2,92,2090,2396,6,15,7,3,0,0,0,0
3,0,1,0,1,0,0,0,1,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,1,0,0,0,0,0,1,0,33,1392,3,56,2909,23159,1,11,8,3,8,7,3,0
4,0,0,1,1,0,0,0,0,0,0,0,1,0,0,1,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,1,0,0,0,1,0,27,591,2,40,3468,16632,9,12,6,3,2,2,2,2


**Train-test split**

In [16]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y_dummy, test_size = 0.2)

**Scaling variables**

In [17]:
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

**Importing Keras**

In [18]:
import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout

Using TensorFlow backend.


**How the model will look like:** <br/>
4 hidden layers, each has 96 neurons, small dropout to prevent overfitting, relu as an activator, mean squared error as loss function, softmax as an optimizer, 200 epochs.

In [50]:
# Initialising the ANN
model = Sequential()    
model.add(Dense(units = 96, kernel_initializer = 'uniform', activation = 'relu', input_dim = 54))
model.add(Dropout(p = 0.1))
model.add(Dense(units = 96, kernel_initializer = 'uniform', activation = 'relu'))
model.add(Dropout(p = 0.1))
model.add(Dense(units = 96, kernel_initializer = 'uniform', activation = 'relu'))
model.add(Dropout(p = 0.1))
model.add(Dense(units = 96, kernel_initializer = 'uniform', activation = 'relu'))
model.add(Dropout(p = 0.1))
model.add(Dense(units = 4, kernel_initializer = 'uniform', activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(X_train, y_train, batch_size = 20, epochs = 200)

Epoch 1/200
Epoch 2/200
Epoch 3/200
Epoch 4/200
Epoch 5/200
Epoch 6/200
Epoch 7/200
Epoch 8/200
Epoch 9/200
Epoch 10/200
Epoch 11/200
Epoch 12/200
Epoch 13/200
Epoch 14/200
Epoch 15/200
Epoch 16/200
Epoch 17/200
Epoch 18/200
Epoch 19/200
Epoch 20/200
Epoch 21/200
Epoch 22/200
Epoch 23/200
Epoch 24/200
Epoch 25/200
Epoch 26/200
Epoch 27/200
Epoch 28/200
Epoch 29/200
Epoch 30/200
Epoch 31/200
Epoch 32/200
Epoch 33/200
Epoch 34/200
Epoch 35/200
Epoch 36/200
Epoch 37/200
Epoch 38/200
Epoch 39/200
Epoch 40/200
Epoch 41/200
Epoch 42/200
Epoch 43/200
Epoch 44/200
Epoch 45/200
Epoch 46/200
Epoch 47/200
Epoch 48/200
Epoch 49/200
Epoch 50/200
Epoch 51/200
Epoch 52/200
Epoch 53/200
Epoch 54/200
Epoch 55/200
Epoch 56/200
Epoch 57/200
Epoch 58/200
Epoch 59/200
Epoch 60/200
Epoch 61/200
Epoch 62/200
Epoch 63/200
Epoch 64/200
Epoch 65/200
Epoch 66/200
Epoch 67/200
Epoch 68/200
Epoch 69/200
Epoch 70/200
Epoch 71/200
Epoch 72/200
Epoch 73/200
Epoch 74/200
Epoch 75/200
Epoch 76/200
Epoch 77/200
Epoch 78

Epoch 82/200
Epoch 83/200
Epoch 84/200
Epoch 85/200
Epoch 86/200
Epoch 87/200
Epoch 88/200
Epoch 89/200
Epoch 90/200
Epoch 91/200
Epoch 92/200
Epoch 93/200
Epoch 94/200
Epoch 95/200
Epoch 96/200
Epoch 97/200
Epoch 98/200
Epoch 99/200
Epoch 100/200
Epoch 101/200
Epoch 102/200
Epoch 103/200
Epoch 104/200
Epoch 105/200
Epoch 106/200
Epoch 107/200
Epoch 108/200
Epoch 109/200
Epoch 110/200
Epoch 111/200
Epoch 112/200
Epoch 113/200
Epoch 114/200
Epoch 115/200
Epoch 116/200
Epoch 117/200
Epoch 118/200
Epoch 119/200
Epoch 120/200
Epoch 121/200
Epoch 122/200
Epoch 123/200
Epoch 124/200
Epoch 125/200
Epoch 126/200
Epoch 127/200
Epoch 128/200
Epoch 129/200
Epoch 130/200
Epoch 131/200
Epoch 132/200
Epoch 133/200
Epoch 134/200
Epoch 135/200
Epoch 136/200
Epoch 137/200
Epoch 138/200
Epoch 139/200
Epoch 140/200
Epoch 141/200
Epoch 142/200
Epoch 143/200
Epoch 144/200
Epoch 145/200
Epoch 146/200
Epoch 147/200
Epoch 148/200
Epoch 149/200
Epoch 150/200
Epoch 151/200
Epoch 152/200
Epoch 153/200
Epoch 154/

Epoch 162/200
Epoch 163/200
Epoch 164/200
Epoch 165/200
Epoch 166/200
Epoch 167/200
Epoch 168/200
Epoch 169/200
Epoch 170/200
Epoch 171/200
Epoch 172/200
Epoch 173/200
Epoch 174/200
Epoch 175/200
Epoch 176/200
Epoch 177/200
Epoch 178/200
Epoch 179/200
Epoch 180/200
Epoch 181/200
Epoch 182/200
Epoch 183/200
Epoch 184/200
Epoch 185/200
Epoch 186/200
Epoch 187/200
Epoch 188/200
Epoch 189/200
Epoch 190/200
Epoch 191/200
Epoch 192/200
Epoch 193/200
Epoch 194/200
Epoch 195/200
Epoch 196/200
Epoch 197/200
Epoch 198/200
Epoch 199/200
Epoch 200/200


In [51]:
y_pred = model.predict(X_test)

**Results of a model**

In [52]:
import tensorflow as tf
from keras.metrics import categorical_accuracy
accuracy = categorical_accuracy(y_test, y_pred)
session = tf.Session()
session.run(accuracy)

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 1., 0., 1., 0., 0.,
       0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 1., 0., 0., 1., 1.,
       0., 1., 0., 1., 0., 0., 1., 1., 0., 1., 1., 1., 0., 1., 0., 1., 0.,
       1., 0., 1., 0., 1., 1., 0., 1., 0., 0., 0., 0., 1., 1., 0., 1., 0.,
       1., 0., 0., 1., 1., 0., 1., 1., 1., 0., 0., 1., 0., 0., 0., 0., 0.,
       0., 1., 1., 0., 1., 1., 1., 0., 0., 0., 1., 1., 0., 0., 1., 0., 1.,
       0., 1., 0., 0., 1., 1., 0., 1., 1., 0., 1., 0., 0., 0., 1., 0., 0.,
       0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 1., 0.,
       0., 0., 0., 0., 1., 0., 1., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0.,
       0., 0., 0., 1., 0., 0., 1., 0., 0., 0., 1., 1., 1., 0., 1., 0., 1.,
       0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 1.,
       0., 0., 1., 0., 0., 0., 0., 0., 0., 1., 1., 1., 0., 0., 1., 1., 0.,
       0., 0., 0., 0., 0., 0., 1., 1., 1., 0., 0., 0., 0., 1., 0., 0., 0.,
       0., 1., 0., 0., 1.

**Accuracy of a model**

In [53]:
sum(session.run(accuracy))/len(session.run(accuracy))

0.32653061224489793

**First 10 real values**

In [57]:
y_test.round(3)[50:60]

Unnamed: 0,0,1,2,3
1143,1,0,0,0
8,0,0,1,0
104,0,0,0,1
372,0,1,0,0
367,0,0,0,1
217,0,0,1,0
342,0,0,0,1
501,0,0,1,0
437,0,1,0,0
634,1,0,0,0


**First 10 predicted values**

In [58]:
y_pred.round(3)[50:60]

array([[0.057, 0.937, 0.007, 0.   ],
       [0.   , 0.   , 0.814, 0.186],
       [0.039, 0.96 , 0.001, 0.   ],
       [0.   , 0.998, 0.   , 0.002],
       [0.   , 0.   , 1.   , 0.   ],
       [0.017, 0.   , 0.973, 0.009],
       [0.   , 0.001, 0.021, 0.977],
       [0.001, 0.   , 0.   , 0.999],
       [0.119, 0.863, 0.017, 0.001],
       [0.006, 0.001, 0.986, 0.008]], dtype=float32)

### Building a Tensorflow model

In [59]:
df = pd.read_csv('Classification.csv')
y = df['JobSatisfaction']

**DNN requires that y starts from 0 and continues like this 0,1,2,3...**

In [60]:
from sklearn import preprocessing
from sklearn.preprocessing import LabelEncoder
le = preprocessing.LabelEncoder()
df['JobSatisfaction'] = le.fit_transform(df['JobSatisfaction'])

In [61]:
#remove unneeded columns
df = df.drop(['EmployeeCount', 'EmployeeNumber'], axis=1)

#remove columns to predict
df = df.drop(['EnvironmentSatisfaction', 'JobSatisfaction','RelationshipSatisfaction'], axis=1)

**Train-test split**

In [62]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(df, y)

In [None]:
import tensorflow as tf

**Creating feature columns**

Categorical variables

In [63]:
Attrition = tf.feature_column.categorical_column_with_vocabulary_list(key="Attrition", 
                                                                      vocabulary_list=['No', 'Yes'])
BusinessTravel = tf.feature_column.categorical_column_with_vocabulary_list(key="BusinessTravel", 
                                                                       vocabulary_list=['Travel_Rarely', 'Travel_Frequently', 'Non-Travel'])
Department = tf.feature_column.categorical_column_with_vocabulary_list(key="Department", 
                                                                   vocabulary_list=['Sales', 'Research & Development', 'Human Resources'])
Education = tf.feature_column.categorical_column_with_vocabulary_list(key="Education", 
                                                                  vocabulary_list=[2, 3, 4, 1, 5])
EducationField = tf.feature_column.categorical_column_with_vocabulary_list(key="EducationField", 
                                                                       vocabulary_list=['Life Sciences', 'Medical', 'Marketing', 'Technical Degree','Other', 'Human Resources'])
Gender = tf.feature_column.categorical_column_with_vocabulary_list(key="Gender", 
                                                               vocabulary_list=['Female', 'Male'])
JobInvolvement = tf.feature_column.categorical_column_with_vocabulary_list(key="JobInvolvement", 
                                                                       vocabulary_list=[3, 1, 4, 2])
JobLevel = tf.feature_column.categorical_column_with_vocabulary_list(key="JobLevel", 
                                                                 vocabulary_list=[2, 4, 1, 3, 5])
JobRole = tf.feature_column.categorical_column_with_vocabulary_list(key="JobRole", 
                                                                vocabulary_list=['Sales Executive', 'Research Scientist', 'Healthcare Representative', 'Sales Representative','Manufacturing Director', 'Laboratory Technician', 'Manager','Research Director', 'Human Resources'])
MaritalStatus = tf.feature_column.categorical_column_with_vocabulary_list(key="MaritalStatus", 
                                                                      vocabulary_list=['Married', 'Divorced', 'Single'])
OverTime = tf.feature_column.categorical_column_with_vocabulary_list(key="OverTime", 
                                                                 vocabulary_list=['No', 'Yes'])
PerformanceRating = tf.feature_column.categorical_column_with_vocabulary_list(key="PerformanceRating", 
                                                                          vocabulary_list=[3, 4])
StockOptionLevel = tf.feature_column.categorical_column_with_vocabulary_list(key="StockOptionLevel", 
                                                                         vocabulary_list=[1, 0, 2, 3])
WorkLifeBalance = tf.feature_column.categorical_column_with_vocabulary_list(key="WorkLifeBalance", 
                                                                        vocabulary_list=[3, 1, 2, 4])

Continuous variables

In [64]:
Age = tf.feature_column.numeric_column("Age")
DailyRate = tf.feature_column.numeric_column("DailyRate")
DistanceFromHome = tf.feature_column.numeric_column("DistanceFromHome")
HourlyRate = tf.feature_column.numeric_column("HourlyRate")
MonthlyIncome = tf.feature_column.numeric_column("MonthlyIncome")
MonthlyRate = tf.feature_column.numeric_column("MonthlyRate")
NumCompaniesWorked = tf.feature_column.numeric_column("NumCompaniesWorked")
PercentSalaryHike = tf.feature_column.numeric_column("PercentSalaryHike")
TotalWorkingYears = tf.feature_column.numeric_column("TotalWorkingYears")
TrainingTimesLastYear = tf.feature_column.numeric_column("TrainingTimesLastYear")
YearsAtCompany = tf.feature_column.numeric_column("YearsAtCompany")
YearsInCurrentRole = tf.feature_column.numeric_column("YearsInCurrentRole")
YearsSinceLastPromotion = tf.feature_column.numeric_column("YearsSinceLastPromotion")
YearsWithCurrManager = tf.feature_column.numeric_column("YearsWithCurrManager")

In [65]:
feat_cols = [tf.feature_column.indicator_column(Attrition),
             tf.feature_column.indicator_column(BusinessTravel),
             tf.feature_column.indicator_column(Department),
             tf.feature_column.indicator_column(Education),
             tf.feature_column.indicator_column(EducationField),
             tf.feature_column.indicator_column(Gender),
             tf.feature_column.indicator_column(JobInvolvement),
             tf.feature_column.indicator_column(JobLevel),
             tf.feature_column.indicator_column(JobRole),
             tf.feature_column.indicator_column(MaritalStatus),
             tf.feature_column.indicator_column(OverTime),
             tf.feature_column.indicator_column(PerformanceRating), 
             tf.feature_column.indicator_column(StockOptionLevel), 
             tf.feature_column.indicator_column(WorkLifeBalance),
            Age, DailyRate, DistanceFromHome, HourlyRate, MonthlyIncome, MonthlyRate, NumCompaniesWorked, 
            PercentSalaryHike, TotalWorkingYears, TrainingTimesLastYear, YearsAtCompany, YearsInCurrentRole,
            YearsSinceLastPromotion, YearsWithCurrManager]

**Scaling of continuos columns**

In [66]:
con_col = ['Age', 'DailyRate', 'DistanceFromHome', 'HourlyRate', 'MonthlyIncome', 'MonthlyRate',
         'NumCompaniesWorked', 'PercentSalaryHike', 'TotalWorkingYears', 'TrainingTimesLastYear',
         'YearsAtCompany', 'YearsInCurrentRole',
         'YearsSinceLastPromotion', 'YearsWithCurrManager']

In [67]:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_train[con_col] = scaler.fit_transform(X_train[con_col])
X_test[con_col] = scaler.fit_transform(X_test[con_col])

**How the model will look like:** <br/>
4 hidden layers, each has 96 neurons, small dropout to prevent overfitting, relu as an activator, mean squared error as loss function, softmax as an optimizer, 200 epochs.

In [183]:
input_func = tf.estimator.inputs.pandas_input_fn(x=X_train,y=y_train ,batch_size=25,num_epochs=200,
                                            shuffle=True)

In [240]:
#Grid - learning_rate=0.001, 
#model = tf.estimator.DNNRegressor(hidden_units=[6,6,6],feature_columns=feature_columns)

model = tf.estimator.DNNClassifier(feature_columns=feat_cols, hidden_units=[96,96,96,96],
                                   optimizer=tf.train.AdamOptimizer(learning_rate=0.001),
                                   activation_fn = tf.nn.relu,
                                   n_classes=4)

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_model_dir': 'C:\\Users\\alexa\\AppData\\Local\\Temp\\tmpzqaa51pj', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x000002AD90C1E940>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


In [241]:
model.train(input_fn=input_func,steps=1000000)

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 1 into C:\Users\alexa\AppData\Local\Temp\tmpzqaa51pj\model.ckpt.
INFO:tensorflow:loss = 34.689518, step = 1
INFO:tensorflow:global_step/sec: 151.382
INFO:tensorflow:loss = 33.234344, step = 101 (0.661 sec)
INFO:tensorflow:global_step/sec: 258.519
INFO:tensorflow:loss = 31.972576, step = 201 (0.387 sec)
INFO:tensorflow:global_step/sec: 197.958
INFO:tensorflow:loss = 32.23688, step = 301 (0.506 sec)
INFO:tensorflow:global_step/sec: 239.874
INFO:tensorflow:loss = 13.2178135, step = 401 (0.417 sec)
INFO:tensorflow:global_step/sec: 169.893
INFO:tensorflow:loss = 5.809194, step = 501 (0.595 sec)
INFO:tensorflow:global_step/sec: 141.218
INFO:tensorflow:loss = 3.5631287, step = 601 (0.706 sec)
INFO:tensorflow:global_step/s

INFO:tensorflow:loss = 0.0001879914, step = 7801 (0.607 sec)
INFO:tensorflow:global_step/sec: 196.422
INFO:tensorflow:loss = 0.00017440198, step = 7901 (0.509 sec)
INFO:tensorflow:global_step/sec: 186.05
INFO:tensorflow:loss = 0.00019347534, step = 8001 (0.538 sec)
INFO:tensorflow:global_step/sec: 203.113
INFO:tensorflow:loss = 0.0001602166, step = 8101 (0.489 sec)
INFO:tensorflow:global_step/sec: 183.977
INFO:tensorflow:loss = 0.0001504411, step = 8201 (0.544 sec)
INFO:tensorflow:global_step/sec: 215.552
INFO:tensorflow:loss = 0.0001243344, step = 8301 (0.466 sec)
INFO:tensorflow:global_step/sec: 197.765
INFO:tensorflow:loss = 8.952583e-05, step = 8401 (0.506 sec)
INFO:tensorflow:global_step/sec: 163.836
INFO:tensorflow:loss = 0.00012516923, step = 8501 (0.607 sec)
INFO:tensorflow:global_step/sec: 135.586
INFO:tensorflow:loss = 8.201572e-05, step = 8601 (0.747 sec)
INFO:tensorflow:global_step/sec: 151.461
INFO:tensorflow:loss = 0.00011289062, step = 8701 (0.656 sec)
INFO:tensorflow:gl

<tensorflow.python.estimator.canned.dnn.DNNClassifier at 0x2ad90c1e7b8>

**Creating predictions**

In [242]:
pred_fn = tf.estimator.inputs.pandas_input_fn(x=X_test,batch_size=len(X_test),shuffle=False)

In [243]:
predictions = list(model.predict(input_fn=pred_fn))

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from C:\Users\alexa\AppData\Local\Temp\tmpzqaa51pj\model.ckpt-8816
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.


**Confusion matrix**

In [244]:
final_preds = []

for pred in predictions:
    final_preds.append(pred['class_ids'][0])

In [245]:
from sklearn.metrics import classification_report

print(classification_report(y_test,final_preds))

             precision    recall  f1-score   support

          0       0.22      0.22      0.22        69
          1       0.23      0.16      0.19        75
          2       0.28      0.34      0.31       107
          3       0.27      0.28      0.28       117

avg / total       0.26      0.26      0.26       368



**Summary**

With 1479 values it is only possible to classify 25-30% of employee satisfaction level. I suppose more entries are required to build a stable model