# WE07 Neural Network Model Building

This notebook predicts the service type of the electric connection given to customers, by state and electric station, using ML models: Logistic Regression, Support Vector Machine (SVM), Decision Tree, and a Neural Network. Hyperparameters are tuned with Random Search and Grid Search to improve model performance, and the F1 score is used to evaluate the models.

## Importing the preprocessed data for model training

In [1]:
import pandas as pd
import numpy as np
from sklearn.svm import SVC
from sklearn import preprocessing
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from scipy.stats import reciprocal, uniform
from sklearn.linear_model import LogisticRegression

# set random seed to ensure that results are repeatable
np.random.seed(1)

In [2]:
X_train = pd.read_csv("C:/Users/Aravind Dudam/Downloads/DSP/X_train.csv")
X_test = pd.read_csv("C:/Users/Aravind Dudam/Downloads/DSP/X_test.csv")
y_train = pd.read_csv("C:/Users/Aravind Dudam/Downloads/DSP/y_train.csv")
y_test = pd.read_csv("C:/Users/Aravind Dudam/Downloads/DSP/y_test.csv")

In [3]:
y_train.shape

(986, 1)
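
The `y = column_or_1d(y, warn=True)` line that appears in later cell outputs is the tail of scikit-learn's `DataConversionWarning`: the labels are loaded as a one-column DataFrame of shape `(986, 1)` rather than as a 1-D array. A minimal sketch of the usual fix, using a small made-up label frame (the column name `service_type` is an assumption, since the real column name is not shown):

```python
import pandas as pd

# Hypothetical stand-in for y_train; the real column name is not shown in the notebook.
y_frame = pd.DataFrame({"service_type": [0, 1, 1, 0, 1]})
print(y_frame.shape)  # a column vector, which is what triggers the warning

# Flatten to a 1-D array before passing the labels to .fit()
y_flat = y_frame.values.ravel()
print(y_flat.shape)
```

Applying the same `.values.ravel()` call to `y_train` before fitting should silence the warning without changing the results.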

## 1. Logistic Regression with Random Search for hyperparameter tuning

In [4]:
score_measure = "f1"
kfolds = 3

param_grid = {
    'C': uniform(loc=0, scale=10)
}

Lr = LogisticRegression()
rand_search = RandomizedSearchCV(estimator=Lr, param_distributions=param_grid, cv=kfolds, n_iter=100,
                                 scoring=score_measure, n_jobs=-1, random_state=42)

_ = rand_search.fit(X_train, y_train)

print(f"The best {score_measure} score is {rand_search.best_score_}")
print(f"... with parameters: {rand_search.best_params_}")

bestLr = rand_search.best_estimator_


  y = column_or_1d(y, warn=True)


The best f1 score is 0.9928636256262976
... with parameters: {'C': 9.50714306409916}


## 2. Logistic Regression with Grid Search for hyperparameter tuning

In [5]:
score_measure = "f1"
kfolds = 3

param_grid = {
    'penalty': ['l1', 'l2'],
    'C': [0.01, 0.1, 1, 10, 100]
}

Lr = LogisticRegression()
grid_search = GridSearchCV(estimator=Lr, param_grid=param_grid, cv=kfolds, 
                           scoring=score_measure, n_jobs=-1)

_ = grid_search.fit(X_train, y_train)

print(f"The best {score_measure} score is {grid_search.best_score_}")
print(f"... with parameters: {grid_search.best_params_}")

bestLr = grid_search.best_estimator_


15 fits failed out of a total of 30.
The score on these train-test partitions for these parameters will be set to nan.
If these failures are not expected, you can try to debug them by setting error_score='raise'.

Below are more details about the failures:
--------------------------------------------------------------------------------
15 fits failed with the following error:
Traceback (most recent call last):
  File "C:\Users\Aravind Dudam\anaconda3\lib\site-packages\sklearn\model_selection\_validation.py", line 686, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "C:\Users\Aravind Dudam\anaconda3\lib\site-packages\sklearn\linear_model\_logistic.py", line 1162, in fit
    solver = _check_solver(self.solver, self.penalty, self.dual)
  File "C:\Users\Aravind Dudam\anaconda3\lib\site-packages\sklearn\linear_model\_logistic.py", line 54, in _check_solver
    raise ValueError(
ValueError: Solver lbfgs supports only 'l2' or 'none' penalties, got l1 penalty.

     

The best f1 score is 0.9939019288868232
... with parameters: {'C': 100, 'penalty': 'l2'}
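
The 15 failed fits above are the `penalty='l1'` candidates: the default `lbfgs` solver supports only `l2` or no penalty, so half of the grid is never scored. One possible way to cover both penalties is to choose a solver that supports them, such as `liblinear`. The sketch below uses synthetic data from `make_classification` as a stand-in for the real training set, so its scores say nothing about this dataset.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data (assumption): not the electric-connection dataset.
X_demo, y_demo = make_classification(n_samples=300, n_features=10, random_state=1)

param_grid = {
    'penalty': ['l1', 'l2'],
    'C': [0.01, 0.1, 1, 10, 100],
}

# liblinear supports both l1 and l2 penalties for binary problems
lr = LogisticRegression(solver='liblinear')
demo_search = GridSearchCV(estimator=lr, param_grid=param_grid, cv=3,
                           scoring='f1', error_score='raise')
demo_search.fit(X_demo, y_demo)

# Every candidate now receives a score instead of nan
mean_scores = demo_search.cv_results_['mean_test_score']
print(len(mean_scores), np.isnan(mean_scores).any())
```

Setting `error_score='raise'` makes an unsupported solver and penalty combination fail immediately rather than being recorded silently as `nan`.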


# Support Vector Machine (SVM) Hyperparameter Tuning

## Random Search for Linear kernel

In [6]:
score_measure = "f1"
kfolds = 3

param_grid = {
     'C': reciprocal(0.001, 1000), 
     'kernel': ['linear']
}

rand_linear_SVC = SVC(max_iter=200)
rand_search = RandomizedSearchCV(estimator=rand_linear_SVC, param_distributions=param_grid, cv=kfolds, n_iter=100,
                                 scoring=score_measure, n_jobs=-1, random_state=42)

_ = rand_search.fit(X_train, y_train)

print(f"The best {score_measure} score is {rand_search.best_score_}")
print(f"... with parameters: {rand_search.best_params_}")

bestSVC = rand_search.best_estimator_


The best f1 score is 0.9969230769230769
... with parameters: {'C': 51.41096648805749, 'kernel': 'linear'}
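
`reciprocal(0.001, 1000)` from `scipy.stats` is a log-uniform distribution, so the random search draws `C` evenly across orders of magnitude rather than evenly across the raw range. A small sketch to illustrate the difference (the sample size and seed are arbitrary choices):

```python
import numpy as np
from scipy.stats import reciprocal, uniform

log_uniform_C = reciprocal(0.001, 1000).rvs(size=10_000, random_state=42)
uniform_C = uniform(loc=0.001, scale=999.999).rvs(size=10_000, random_state=42)

# Share of draws below 1.0: about half for log-uniform, almost none for uniform
frac_log = np.mean(log_uniform_C < 1.0)
frac_uni = np.mean(uniform_C < 1.0)
print(f"log-uniform: {frac_log:.3f}, uniform: {frac_uni:.3f}")
```

This is why a log-uniform prior is a common choice for regularization strengths such as `C`, where small and large values are equally worth exploring.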


  y = column_or_1d(y, warn=True)


## Grid Search for Linear kernel

In [7]:
score_measure = "f1"
kfolds = 3

param_grid = {
     'C': [0.001, 0.01, 0.1, 1, 10, 100], 
     'kernel': ['linear']
}

Grid_Linear_SVC = SVC(max_iter=200)
grid_search = GridSearchCV(estimator=Grid_Linear_SVC, param_grid=param_grid, cv=kfolds, 
                           scoring=score_measure, n_jobs=-1)

_ = grid_search.fit(X_train, y_train)

print(f"The best {score_measure} score is {grid_search.best_score_}")
print(f"... with parameters: {grid_search.best_params_}")

bestSVC = grid_search.best_estimator_


The best f1 score is 0.9948654056707064
... with parameters: {'C': 10, 'kernel': 'linear'}


  y = column_or_1d(y, warn=True)


## Random Search for Polynomial Kernels

In [8]:
score_measure = "f1"
kfolds = 3

param_grid = {
    'C': np.logspace(-3, 3, 7),
    'kernel': ['poly'],
    'degree': [2, 3, 4],
    'gamma': ['scale', 'auto'] + list(np.logspace(-3, 2, 6))
}

rand_poly_SVC = SVC(max_iter=50)
rand_search = RandomizedSearchCV(estimator=rand_poly_SVC, param_distributions=param_grid, cv=kfolds, n_iter=500,
                                 scoring=score_measure, n_jobs=-1)

_ = rand_search.fit(X_train, y_train)

print(f"The best {score_measure} score is {rand_search.best_score_}")
print(f"... with parameters: {rand_search.best_params_}")

bestSVC = rand_search.best_estimator_




The best f1 score is 0.9959037089312318
... with parameters: {'kernel': 'poly', 'gamma': 100.0, 'degree': 3, 'C': 0.001}


  y = column_or_1d(y, warn=True)


## Grid Search for Polynomial Kernels

In [9]:
score_measure = "f1"
kfolds = 3

param_grid = {
    'C': np.logspace(-3, 3, 7),
    'kernel': ['poly'],
    'degree': [2, 3, 4],
    'gamma': ['scale', 'auto'] + list(np.logspace(-3, 2, 6))
}

Grid_poly_SVC = SVC(max_iter=50)
grid_search = GridSearchCV(estimator=Grid_poly_SVC, param_grid=param_grid, cv=kfolds, 
                           scoring=score_measure, n_jobs=-1)

_ = grid_search.fit(X_train, y_train)

print(f"The best {score_measure} score is {grid_search.best_score_}")
print(f"... with parameters: {grid_search.best_params_}")

bestSVC = grid_search.best_estimator_


The best f1 score is 0.9959037089312318
... with parameters: {'C': 0.001, 'degree': 3, 'gamma': 100.0, 'kernel': 'poly'}


  y = column_or_1d(y, warn=True)


## Random Search for Radial basis function kernel

In [10]:
score_measure = "f1"
kfolds = 3

param_grid = {
     'C': np.logspace(-3, 3, 7), 
     'gamma': [0.0001, 0.001, 0.1, 1], 
     'kernel': ['rbf']
}

rand_rbf_SVC = SVC(max_iter=50)
rand_search = RandomizedSearchCV(estimator = rand_rbf_SVC, param_distributions=param_grid, cv=kfolds, n_iter=500,
                           scoring=score_measure, verbose=1, n_jobs=-1,
                           return_train_score=True)

_ = rand_search.fit(X_train, y_train)

print(f"The best {score_measure} score is {rand_search.best_score_}")
print(f"... with parameters: {rand_search.best_params_}")

best_model = rand_search.best_estimator_




Fitting 3 folds for each of 28 candidates, totalling 84 fits
The best f1 score is 0.9969788519637461
... with parameters: {'kernel': 'rbf', 'gamma': 0.0001, 'C': 0.001}


  y = column_or_1d(y, warn=True)


## Grid Search for Radial basis function kernel

In [11]:
score_measure = "f1"
kfolds = 3

param_grid = {
     'C': np.logspace(-3, 3, 7),
     'gamma': [0.0001, 0.001, 0.1, 1], 
     'kernel': ['rbf']
}

grid_rbf_SVC = SVC(max_iter=50)
grid_search = GridSearchCV(estimator = grid_rbf_SVC, param_grid=param_grid, cv=kfolds, 
                           scoring=score_measure, verbose=1, n_jobs=-1, 
                           return_train_score=True)

_ = grid_search.fit(X_train, y_train)

print(f"The best {score_measure} score is {grid_search.best_score_}")
print(f"... with parameters: {grid_search.best_params_}")

best_model = grid_search.best_estimator_


Fitting 3 folds for each of 28 candidates, totalling 84 fits
The best f1 score is 0.9969788519637461
... with parameters: {'C': 0.001, 'gamma': 0.0001, 'kernel': 'rbf'}


  y = column_or_1d(y, warn=True)
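
The random and grid searches for the polynomial and RBF kernels report identical scores and parameters. A likely explanation is that these parameter spaces consist only of finite lists and `n_iter=500` exceeds the number of combinations: the log line "Fitting 3 folds for each of 28 candidates" shows that the RBF random search evaluated only 28 candidates, which is the whole grid. The size of a list-based grid can be checked with `ParameterGrid`:

```python
import numpy as np
from sklearn.model_selection import ParameterGrid

rbf_grid = {
    'C': np.logspace(-3, 3, 7),
    'gamma': [0.0001, 0.001, 0.1, 1],
    'kernel': ['rbf'],
}
poly_grid = {
    'C': np.logspace(-3, 3, 7),
    'kernel': ['poly'],
    'degree': [2, 3, 4],
    'gamma': ['scale', 'auto'] + list(np.logspace(-3, 2, 6)),
}

# Number of distinct parameter combinations in each grid
n_rbf = len(ParameterGrid(rbf_grid))
n_poly = len(ParameterGrid(poly_grid))
print(n_rbf, n_poly)
```

For a random search to differ from a grid search here, either `n_iter` would need to be smaller than the grid size or at least one parameter would need to be a continuous distribution.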


# Decision Tree model using Random Search and Grid Search for hyperparameter tuning:

## Decision Tree with Random Search:

In [40]:
score_measure = "f1"
kfolds = 5

param_grid = {
    'min_samples_split': np.arange(1, 20),  # note: recent scikit-learn releases require integer values >= 2 here
    'min_samples_leaf': np.arange(1,20),
    'min_impurity_decrease': np.arange(0.0001, 0.01, 0.0005),
    'max_leaf_nodes': np.arange(5, 20), 
    'max_depth': np.arange(1,10), 
    'criterion': ['gini', 'entropy'],
}

dtree = DecisionTreeClassifier()
rand_search = RandomizedSearchCV(estimator=dtree, param_distributions=param_grid, cv=kfolds, n_iter=500,
                                 scoring=score_measure, verbose=1, n_jobs=-1, # n_jobs=-1 will utilize all available CPUs 
                                 return_train_score=True)

_ = rand_search.fit(X_train, y_train)

print(f"The best {score_measure} score is {rand_search.best_score_}")
print(f"... with parameters: {rand_search.best_params_}")

bestPrecTree = rand_search.best_estimator_

Fitting 5 folds for each of 500 candidates, totalling 2500 fits
The best f1 score is 1.0
... with parameters: {'min_samples_split': 1, 'min_samples_leaf': 1, 'min_impurity_decrease': 0.0006000000000000001, 'max_leaf_nodes': 18, 'max_depth': 8, 'criterion': 'entropy'}


## Decision Tree with Grid Search:

In [32]:
score_measure = "f1"
kfolds = 5

param_grid = {
    'min_samples_split': np.arange(26,36),  
    'min_samples_leaf': np.arange(8,16),
    'min_impurity_decrease': np.arange( 0.0005, 0.0010, 0.0020),
    'max_leaf_nodes': [10,30], 
    'max_depth': [5,15], 
    'criterion': ['entropy']
}


dtree = DecisionTreeClassifier()
grid_search = GridSearchCV(estimator = dtree, param_grid=param_grid, cv=kfolds, 
                           scoring=score_measure, verbose=1, n_jobs=-1,  # n_jobs=-1 will utilize all available CPUs 
                           return_train_score=True)

_ = grid_search.fit(X_train, y_train)

print(f"The best {score_measure} score is {grid_search.best_score_}")
print(f"... with parameters: {grid_search.best_params_}")

bestPrecisionTree = grid_search.best_estimator_

Fitting 5 folds for each of 320 candidates, totalling 1600 fits
The best f1 score is 0.9897921608280347
... with parameters: {'criterion': 'entropy', 'max_depth': 15, 'max_leaf_nodes': 10, 'min_impurity_decrease': 0.0005, 'min_samples_leaf': 10, 'min_samples_split': 30}
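
One detail in the grid above: `np.arange(0.0005, 0.0010, 0.0020)` uses a step larger than the interval, so it yields a single value and `min_impurity_decrease` is effectively fixed at 0.0005. This is consistent with the 320 candidates reported (10 × 8 × 1 × 2 × 2 × 1). If several values were intended, `np.linspace` is one way to state the count explicitly; the three-point grid below is only an illustration, not the author's intended range.

```python
import numpy as np

# The step (0.0020) is larger than the interval, so only the start value is produced
single = np.arange(0.0005, 0.0010, 0.0020)
print(single)

# Hypothetical alternative: three evenly spaced values, endpoints included
several = np.linspace(0.0005, 0.0020, 3)
print(len(several))
```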


# Neural Network

In [41]:
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score, precision_score, recall_score
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

# set random seed to ensure that results are repeatable
np.random.seed(1)

In [42]:
%%time

ann = MLPClassifier(hidden_layer_sizes=(60,50,40), solver='adam', max_iter=200)  # max_iter: maximum number of epochs (passes over the training data)
_ = ann.fit(X_train, y_train)

  y = column_or_1d(y, warn=True)


CPU times: total: 55 s
Wall time: 1min 32s
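
For the `adam` and `sgd` solvers, `max_iter` caps the number of epochs (full passes over the training data), and training may stop earlier once the loss stops improving. After fitting, the `n_iter_` attribute reports how many epochs actually ran. A minimal sketch on synthetic data (the dataset and layer size are placeholders, not the notebook's):

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data (assumption): not the electric-connection dataset.
X_demo, y_demo = make_classification(n_samples=200, n_features=10, random_state=1)

demo_ann = MLPClassifier(hidden_layer_sizes=(20,), solver='adam',
                         max_iter=200, random_state=1)
demo_ann.fit(X_demo, y_demo)

# n_iter_ equals max_iter only if the optimizer did not converge earlier
print(f"epochs run: {demo_ann.n_iter_} of at most {demo_ann.max_iter}")
```

If `n_iter_` equals `max_iter`, scikit-learn also emits a `ConvergenceWarning`, which is a hint to raise `max_iter` or adjust the learning rate.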


In [43]:
%%time
y_pred = ann.predict(X_test)

CPU times: total: 93.8 ms
Wall time: 148 ms


In [44]:
print(classification_report(y_test, y_pred))

              precision    recall  f1-score   support

           0       1.00      1.00      1.00       184
           1       1.00      1.00      1.00       888

    accuracy                           1.00      1072
   macro avg       1.00      1.00      1.00      1072
weighted avg       1.00      1.00      1.00      1072



In [45]:
%%time

score_measure = "f1"
kfolds = 5

param_grid = {
    'hidden_layer_sizes': [ (50,), (70,),(50,30), (40,20), (60,40, 20), (70,50,40)],
    'activation': ['logistic', 'tanh', 'relu'],
    'solver': ['adam', 'sgd'],
    'alpha': [0, .2, .5, .7, 1],
    'learning_rate': ['constant', 'invscaling', 'adaptive'],
    'learning_rate_init': [0.001, 0.01, 0.1, 0.2, 0.5],
    'max_iter': [5000]
}

ann = MLPClassifier()
rand_search = RandomizedSearchCV(estimator = ann, param_distributions=param_grid, cv=kfolds, n_iter=100,
                           scoring=score_measure, verbose=1, n_jobs=-1,  
                           return_train_score=True)

_ = rand_search.fit(X_train, y_train)

print(f"The best {score_measure} score is {rand_search.best_score_}")
print(f"... with parameters: {rand_search.best_params_}")

best_ann = rand_search.best_estimator_

Fitting 5 folds for each of 100 candidates, totalling 500 fits


  y = column_or_1d(y, warn=True)


The best f1 score is 1.0
... with parameters: {'solver': 'sgd', 'max_iter': 5000, 'learning_rate_init': 0.1, 'learning_rate': 'adaptive', 'hidden_layer_sizes': (70, 50, 40), 'alpha': 0.5, 'activation': 'tanh'}
CPU times: total: 5min
Wall time: 9min 45s


In [46]:
%%time

score_measure = "f1"
kfolds = 5

param_grid = {
    'hidden_layer_sizes': [ (30,), (50,), (70,), (90,)],
    'activation': ['tanh', 'relu'],
    'solver': ['adam'],
    'alpha': [.5, .7, 1],
    'learning_rate': ['adaptive', 'invscaling'],
    'learning_rate_init': [0.005, 0.01, 0.15],
    'max_iter': [5000]
}

ann = MLPClassifier()
grid_search = GridSearchCV(estimator = ann, param_grid=param_grid, cv=kfolds, 
                           scoring=score_measure, verbose=1, n_jobs=-1,  # n_jobs=-1 will utilize all available CPUs 
                           return_train_score=True)

_ = grid_search.fit(X_train, y_train)

print(f"The best {score_measure} score is {grid_search.best_score_}")
print(f"... with parameters: {grid_search.best_params_}")

best_ann = grid_search.best_estimator_

Fitting 5 folds for each of 144 candidates, totalling 720 fits


  y = column_or_1d(y, warn=True)


The best f1 score is 0.9974522292993632
... with parameters: {'activation': 'relu', 'alpha': 0.5, 'hidden_layer_sizes': (90,), 'learning_rate': 'invscaling', 'learning_rate_init': 0.01, 'max_iter': 5000, 'solver': 'adam'}
CPU times: total: 24 s
Wall time: 2min 26s


### Conclusion:
Based on the cross-validated F1 scores, the Neural Network is among the best-performing models: Random Search reached an F1 score of 1.0 and Grid Search reached 99.7%. The untuned network also scored 1.00 for precision, recall, and F1 on both classes of the test set. This suggests that the Neural Network is well suited to this classification problem.

The Logistic Regression models also performed very well, with F1 scores of 99.3% (Random Search) and 99.4% (Grid Search). The SVM models with Linear and Radial Basis Function kernels achieved F1 scores ranging from 99.5% to 99.7%.

The Polynomial kernel SVM models achieved an F1 score of 99.6% with both Random Search and Grid Search. The Decision Tree results were mixed: Random Search reached an F1 score of 1.0, while Grid Search, which explored a different parameter range, gave the lowest score of all models at 99.0%.

Overall, based on the evaluation results, the Neural Network is likely the best model for this classification problem, although the differences between the models are small.
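
The scores compared above are cross-validated F1 scores on the training data. A held-out check is a useful complement before settling on a model. The sketch below shows one way to score a tuned estimator on a test split; it uses synthetic data and a small decision-tree grid as placeholders, so the numbers it prints are unrelated to the results in this notebook.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): not the electric-connection dataset.
X_demo, y_demo = make_classification(n_samples=400, n_features=10, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.25,
                                          random_state=1, stratify=y_demo)

search = GridSearchCV(DecisionTreeClassifier(random_state=1),
                      param_grid={'max_depth': [2, 4, 6]}, cv=3, scoring='f1')
search.fit(X_tr, y_tr)

# best_estimator_ is refit on the full training split by default (refit=True)
test_f1 = f1_score(y_te, search.best_estimator_.predict(X_te))
print(f"CV F1: {search.best_score_:.3f}, test F1: {test_f1:.3f}")
```

In this notebook the same pattern would apply to each `best_estimator_` with `X_test` and `y_test`, which would put all models on the same footing as the neural network's classification report.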