# ARTIFICIAL NEURAL NETWORKS
#### Classification Using Artificial Neural Networks with Hyperparameter Tuning on Alphabets Data
#### Overview
#### In this assignment, you will develop a classification model using Artificial Neural Networks (ANNs) that classifies data points from the "Alphabets_data.csv" dataset into predefined alphabet categories. The exercise aims to deepen your understanding of ANNs and of the significant role hyperparameter tuning plays in improving model performance.

#### Dataset: "Alphabets_data.csv"
#### The dataset provided, "Alphabets_data.csv", consists of labeled data suitable for a classification task aimed at identifying different alphabets. Before using this data in your model, you'll need to preprocess it to ensure optimal performance.
#### Tasks
### 1. Data Exploration and Preprocessing
#### ●	Begin by loading and exploring the "Alphabets_data.csv" dataset. Summarize its key features such as the number of samples, features, and classes.
#### ●	Carry out the necessary preprocessing steps, including data normalization and handling of missing values.
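The two steps above (summarize, then preprocess) can be sketched as a single helper. This is a minimal illustration, not the notebook's own method: the function name `summarize_and_preprocess`, the choice of `StandardScaler`, dropping rows with missing values, and the tiny `toy` frame are all assumptions made for the example. Only the `letter` label column and the feature names `xbox` and `ybox` come from the dataset.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder, StandardScaler


def summarize_and_preprocess(df, target_col="letter"):
    """Summarize a labeled frame, then scale features and encode labels.

    Hypothetical helper for illustration; assumes every non-target column
    is numeric.
    """
    summary = {
        "n_samples": len(df),
        "n_features": df.shape[1] - 1,
        "n_classes": df[target_col].nunique(),
        "n_missing": int(df.isnull().sum().sum()),
    }
    # Drop rows with missing values (one simple strategy; imputation is another)
    clean = df.dropna()
    X = clean.drop(columns=[target_col]).to_numpy(dtype=float)
    encoder = LabelEncoder()
    y = encoder.fit_transform(clean[target_col])
    # Standardize each feature to zero mean and unit variance.
    # In a train/test workflow, fit the scaler on the training split only.
    X_scaled = StandardScaler().fit_transform(X)
    return summary, X_scaled, y, encoder


# Tiny made-up frame: one label column plus numeric feature columns
toy = pd.DataFrame({
    "letter": ["A", "B", "A", "C", "B", "C"],
    "xbox": [2, 5, 4, 7, 2, 3],
    "ybox": [8, 12, 11, 11, 1, 9],
})
summary, X_scaled, y, encoder = summarize_and_preprocess(toy)
print(summary)
```

On the real data, the same helper would be called as `summarize_and_preprocess(df)` after `pd.read_csv("Alphabets_data.csv")`.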


In [1]:
import pandas as pd

# Load the dataset
df = pd.read_csv("Alphabets_data.csv")

# Display basic information about the dataset
df.info()  # prints the summary directly; it returns None, so there is nothing to store
df_description = df.describe(include='all')
df_head = df.head()
df_shape = df.shape
df_nulls = df.isnull().sum()

df_shape, df_nulls, df_head, df_description

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20000 entries, 0 to 19999
Data columns (total 17 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   letter  20000 non-null  object
 1   xbox    20000 non-null  int64 
 2   ybox    20000 non-null  int64 
 3   width   20000 non-null  int64 
 4   height  20000 non-null  int64 
 5   onpix   20000 non-null  int64 
 6   xbar    20000 non-null  int64 
 7   ybar    20000 non-null  int64 
 8   x2bar   20000 non-null  int64 
 9   y2bar   20000 non-null  int64 
 10  xybar   20000 non-null  int64 
 11  x2ybar  20000 non-null  int64 
 12  xy2bar  20000 non-null  int64 
 13  xedge   20000 non-null  int64 
 14  xedgey  20000 non-null  int64 
 15  yedge   20000 non-null  int64 
 16  yedgex  20000 non-null  int64 
dtypes: int64(16), object(1)
memory usage: 2.6+ MB


((20000, 17),
 letter    0
 xbox      0
 ybox      0
 width     0
 height    0
 onpix     0
 xbar      0
 ybar      0
 x2bar     0
 y2bar     0
 xybar     0
 x2ybar    0
 xy2bar    0
 xedge     0
 xedgey    0
 yedge     0
 yedgex    0
 dtype: int64,
   letter  xbox  ybox  width  height  onpix  xbar  ybar  x2bar  y2bar  xybar  \
 0      T     2     8      3       5      1     8    13      0      6      6   
 1      I     5    12      3       7      2    10     5      5      4     13   
 2      D     4    11      6       8      6    10     6      2      6     10   
 3      N     7    11      6       6      3     5     9      4      6      4   
 4      G     2     1      3       1      1     8     6      6      6      6   
 
    x2ybar  xy2bar  xedge  xedgey  yedge  yedgex  
 0      10       8      0       8      0       8  
 1       3       9      2       8      4      10  
 2       3       7      3       7      3       9  
 3       4      10      6      10      2       8  
 4       5   

### 2. Model Implementation
#### ●	Construct a basic ANN model using your chosen high-level neural network library. Ensure your model includes at least one hidden layer.
#### ●	Divide the dataset into training and test sets.
#### ●	Train your model on the training set and then use it to make predictions on the test set.
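One possible way to carry out these three steps is to wrap feature scaling and the network in a single scikit-learn pipeline, so that the scaler is fitted on the training split only. The sketch below is an assumption-laden illustration: it uses synthetic data from `make_classification` as a stand-in for the 16 letter features, and the variable name `baseline` and the stratified split are choices made for this example, not something the assignment prescribes.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the letter data: 16 numeric features, 4 classes
X, y = make_classification(
    n_samples=600, n_features=16, n_informative=10,
    n_classes=4, random_state=0,
)

# Hold out 20% for testing; stratify keeps class proportions similar in both splits
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Scaling lives inside the pipeline, so it is fitted on the training data only
baseline = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=300, random_state=42),
)
baseline.fit(X_train, y_train)

y_pred = baseline.predict(X_test)
test_accuracy = baseline.score(X_test, y_test)
print(f"Test accuracy: {test_accuracy:.2f}")
```

Fixing `random_state` on the classifier makes repeated runs comparable, which matters when the baseline is later compared with a tuned model.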


In [2]:
!pip install scikit-learn pandas



In [3]:
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
import pandas as pd

# Load data
df = pd.read_csv("Alphabets_data.csv")
print(df.columns)  # Use this to find your target column

# 'letter' is the label column; the remaining 16 columns are the features
X = df.drop('letter', axis=1)
y = df['letter']

# Encode target labels
le = LabelEncoder()
y = le.fit_transform(y)

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train basic ANN
model = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=300)
model.fit(X_train, y_train)

# Accuracy
accuracy = model.score(X_test, y_test)
print(f"Test Accuracy: {accuracy:.2f}")


Index(['letter', 'xbox', 'ybox', 'width', 'height', 'onpix', 'xbar', 'ybar',
       'x2bar', 'y2bar', 'xybar', 'x2ybar', 'xy2bar', 'xedge', 'xedgey',
       'yedge', 'yedgex'],
      dtype='object')
Test Accuracy: 0.94




### 3. Hyperparameter Tuning
#### ●	Modify various hyperparameters, such as the number of hidden layers, neurons per hidden layer, activation functions, and learning rate, to observe their impact on model performance.
#### ●	Adopt a structured approach like grid search or random search for hyperparameter tuning, documenting your methodology thoroughly.
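As a complement to grid search, a random search could look roughly like the sketch below. It is an illustration under stated assumptions: the data is synthetic (`make_classification`), and the search ranges, `n_iter=5`, and the pipeline step names `scale` and `mlp` are arbitrary choices for the example. Note that in scikit-learn's `MLPClassifier` the `learning_rate` schedule is only used when `solver='sgd'`, so this sketch samples `learning_rate_init`, which applies to the default `adam` solver.

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Small synthetic data set so the search finishes quickly
X, y = make_classification(
    n_samples=300, n_features=16, n_informative=10,
    n_classes=3, random_state=0,
)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("mlp", MLPClassifier(max_iter=200, random_state=42)),
])

# Lists are sampled uniformly; scipy distributions are sampled via .rvs()
param_distributions = {
    "mlp__hidden_layer_sizes": [(32,), (64,), (64, 32)],
    "mlp__activation": ["relu", "tanh"],
    "mlp__alpha": loguniform(1e-4, 1e-1),
    "mlp__learning_rate_init": loguniform(1e-4, 1e-2),
}

search = RandomizedSearchCV(
    pipe, param_distributions, n_iter=5, cv=3,
    scoring="accuracy", random_state=42,
)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print(f"Best CV accuracy: {search.best_score_:.3f}")
```

Random search evaluates a fixed number of sampled configurations (`n_iter`) rather than every combination, so its cost stays bounded as more hyperparameters are added.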


In [4]:
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
import pandas as pd

# Load and prepare data
df = pd.read_csv("Alphabets_data.csv")
X = df.drop('letter', axis=1)
y = LabelEncoder().fit_transform(df['letter'])

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Define hyperparameter grid
param_grid = {
    'hidden_layer_sizes': [(32,), (64,), (64, 32), (128, 64)],
    'activation': ['relu', 'tanh'],
    'learning_rate': ['constant', 'adaptive'],  # only takes effect with solver='sgd'; the default solver is 'adam'
    'alpha': [0.001, 0.01],
    'max_iter': [300]
}

# Model
mlp = MLPClassifier(random_state=42)

# Grid Search
grid_search = GridSearchCV(mlp, param_grid, cv=3, scoring='accuracy', verbose=2)
grid_search.fit(X_train, y_train)

# Best parameters and score
print("Best Parameters:", grid_search.best_params_)
print("Best CV Accuracy:", grid_search.best_score_)

# Test accuracy
test_accuracy = grid_search.best_estimator_.score(X_test, y_test)
print(f"Test Accuracy with Best Model: {test_accuracy:.2f}")

Fitting 3 folds for each of 32 candidates, totalling 96 fits
[CV] END activation=relu, alpha=0.001, hidden_layer_sizes=(32,), learning_rate=constant, max_iter=300; total time=   9.5s
[CV] END activation=relu, alpha=0.001, hidden_layer_sizes=(32,), learning_rate=constant, max_iter=300; total time=   9.4s
[CV] END activation=relu, alpha=0.001, hidden_layer_sizes=(32,), learning_rate=constant, max_iter=300; total time=  11.5s
[CV] END activation=relu, alpha=0.001, hidden_layer_sizes=(32,), learning_rate=adaptive, max_iter=300; total time=   9.8s
[CV] END activation=relu, alpha=0.001, hidden_layer_sizes=(32,), learning_rate=adaptive, max_iter=300; total time=  10.1s
[CV] END activation=relu, alpha=0.001, hidden_layer_sizes=(32,), learning_rate=adaptive, max_iter=300; total time=  10.4s
[CV] END activation=relu, alpha=0.001, hidden_layer_sizes=(64,), learning_rate=constant, max_iter=300; total time=  28.8s
[CV] END activation=relu, alpha=0.001, hidden_layer_sizes=(64,), learning_rate=constant, max_iter=300; total time=  31.0s
[CV] END activation=relu, alpha=0.001, hidden_layer_sizes=(64,), learning_rate=constant, max_iter=300; total time=  29.6s
[CV] END activation=relu, alpha=0.001, hidden_layer_sizes=(64,), learning_rate=adaptive, max_iter=300; total time=  29.7s
[CV] END activation=relu, alpha=0.001, hidden_layer_sizes=(64,), learning_rate=adaptive, max_iter=300; total time=  31.0s
[CV] END activation=relu, alpha=0.001, hidden_layer_sizes=(64,), learning_rate=adaptive, max_iter=300; total time=  32.6s
[CV] END activation=relu, alpha=0.001, hidden_layer_sizes=(64, 32), learning_rate=constant, max_iter=300; total time=  40.1s
[CV] END activation=relu, alpha=0.001, hidden_layer_sizes=(64, 32), learning_rate=constant, max_iter=300; total time=  39.9s
[CV] END activation=relu, alpha=0.001, hidden_layer_sizes=(64, 32), learning_rate=constant, max_iter=300; total time=  43.8s
[CV] END activation=relu, alpha=0.001, hidden_layer_sizes=(64, 32), learning_rate=adaptive, max_iter=300; total time=  42.2s
[CV] END activation=relu, alpha=0.001, hidden_layer_sizes=(64, 32), learning_rate=adaptive, max_iter=300; total time=  38.7s
[CV] END activation=relu, alpha=0.001, hidden_layer_sizes=(64, 32), learning_rate=adaptive, max_iter=300; total time=  43.1s
[CV] END activation=relu, alpha=0.001, hidden_layer_sizes=(128, 64), learning_rate=constant, max_iter=300; total time=  57.6s
[CV] END activation=relu, alpha=0.001, hidden_layer_sizes=(128, 64), learning_rate=constant, max_iter=300; total time= 1.2min
[CV] END activation=relu, alpha=0.001, hidden_layer_sizes=(128, 64), learning_rate=constant, max_iter=300; total time= 1.2min
[CV] END activation=relu, alpha=0.001, hidden_layer_sizes=(128, 64), learning_rate=adaptive, max_iter=300; total time= 1.3min
[CV] END activation=relu, alpha=0.001, hidden_layer_sizes=(128, 64), learning_rate=adaptive, max_iter=300; total time= 1.1min
[CV] END activation=relu, alpha=0.001, hidden_layer_sizes=(128, 64), learning_rate=adaptive, max_iter=300; total time= 1.2min
[CV] END activation=relu, alpha=0.01, hidden_layer_sizes=(32,), learning_rate=constant, max_iter=300; total time=   9.6s
[CV] END activation=relu, alpha=0.01, hidden_layer_sizes=(32,), learning_rate=constant, max_iter=300; total time=   9.0s
[CV] END activation=relu, alpha=0.01, hidden_layer_sizes=(32,), learning_rate=constant, max_iter=300; total time=   8.7s
[CV] END activation=relu, alpha=0.01, hidden_layer_sizes=(32,), learning_rate=adaptive, max_iter=300; total time=   9.0s
[CV] END activation=relu, alpha=0.01, hidden_layer_sizes=(32,), learning_rate=adaptive, max_iter=300; total time=   8.6s
[CV] END activation=relu, alpha=0.01, hidden_layer_sizes=(32,), learning_rate=adaptive, max_iter=300; total time=   8.9s
[CV] END activation=relu, alpha=0.01, hidden_layer_sizes=(64,), learning_rate=constant, max_iter=300; total time=  33.1s
[CV] END activation=relu, alpha=0.01, hidden_layer_sizes=(64,), learning_rate=constant, max_iter=300; total time=  32.2s
[CV] END activation=relu, alpha=0.01, hidden_layer_sizes=(64,), learning_rate=constant, max_iter=300; total time=  32.5s
[CV] END activation=relu, alpha=0.01, hidden_layer_sizes=(64,), learning_rate=adaptive, max_iter=300; total time=  32.7s
[CV] END activation=relu, alpha=0.01, hidden_layer_sizes=(64,), learning_rate=adaptive, max_iter=300; total time=  31.6s
[CV] END activation=relu, alpha=0.01, hidden_layer_sizes=(64,), learning_rate=adaptive, max_iter=300; total time=  33.5s
[CV] END activation=relu, alpha=0.01, hidden_layer_sizes=(64, 32), learning_rate=constant, max_iter=300; total time=  44.6s
[CV] END activation=relu, alpha=0.01, hidden_layer_sizes=(64, 32), learning_rate=constant, max_iter=300; total time=  41.6s
[CV] END activation=relu, alpha=0.01, hidden_layer_sizes=(64, 32), learning_rate=constant, max_iter=300; total time=  38.8s
[CV] END activation=relu, alpha=0.01, hidden_layer_sizes=(64, 32), learning_rate=adaptive, max_iter=300; total time=  44.8s
[CV] END activation=relu, alpha=0.01, hidden_layer_sizes=(64, 32), learning_rate=adaptive, max_iter=300; total time=  42.7s
[CV] END activation=relu, alpha=0.01, hidden_layer_sizes=(64, 32), learning_rate=adaptive, max_iter=300; total time=  36.6s
[CV] END activation=relu, alpha=0.01, hidden_layer_sizes=(128, 64), learning_rate=constant, max_iter=300; total time= 1.3min
[CV] END activation=relu, alpha=0.01, hidden_layer_sizes=(128, 64), learning_rate=constant, max_iter=300; total time= 1.3min
[CV] END activation=relu, alpha=0.01, hidden_layer_sizes=(128, 64), learning_rate=constant, max_iter=300; total time=  58.9s
[CV] END activation=relu, alpha=0.01, hidden_layer_sizes=(128, 64), learning_rate=adaptive, max_iter=300; total time= 1.3min
[CV] END activation=relu, alpha=0.01, hidden_layer_sizes=(128, 64), learning_rate=adaptive, max_iter=300; total time= 1.3min
[CV] END activation=relu, alpha=0.01, hidden_layer_sizes=(128, 64), learning_rate=adaptive, max_iter=300; total time= 1.1min
[CV] END activation=tanh, alpha=0.001, hidden_layer_sizes=(32,), learning_rate=constant, max_iter=300; total time=  13.5s
[CV] END activation=tanh, alpha=0.001, hidden_layer_sizes=(32,), learning_rate=constant, max_iter=300; total time=  13.5s
[CV] END activation=tanh, alpha=0.001, hidden_layer_sizes=(32,), learning_rate=constant, max_iter=300; total time=  13.0s
[CV] END activation=tanh, alpha=0.001, hidden_layer_sizes=(32,), learning_rate=adaptive, max_iter=300; total time=  12.5s
[CV] END activation=tanh, alpha=0.001, hidden_layer_sizes=(32,), learning_rate=adaptive, max_iter=300; total time=  14.7s
[CV] END activation=tanh, alpha=0.001, hidden_layer_sizes=(32,), learning_rate=adaptive, max_iter=300; total time=  13.1s
[CV] END activation=tanh, alpha=0.001, hidden_layer_sizes=(64,), learning_rate=constant, max_iter=300; total time=  54.9s
[CV] END activation=tanh, alpha=0.001, hidden_layer_sizes=(64,), learning_rate=constant, max_iter=300; total time=  55.3s
[CV] END activation=tanh, alpha=0.001, hidden_layer_sizes=(64,), learning_rate=constant, max_iter=300; total time=  56.0s
[CV] END activation=tanh, alpha=0.001, hidden_layer_sizes=(64,), learning_rate=adaptive, max_iter=300; total time=  56.8s
[CV] END activation=tanh, alpha=0.001, hidden_layer_sizes=(64,), learning_rate=adaptive, max_iter=300; total time=  56.6s
[CV] END activation=tanh, alpha=0.001, hidden_layer_sizes=(64,), learning_rate=adaptive, max_iter=300; total time=  56.7s
[CV] END activation=tanh, alpha=0.001, hidden_layer_sizes=(64, 32), learning_rate=constant, max_iter=300; total time= 1.2min
[CV] END activation=tanh, alpha=0.001, hidden_layer_sizes=(64, 32), learning_rate=constant, max_iter=300; total time= 1.3min
[CV] END activation=tanh, alpha=0.001, hidden_layer_sizes=(64, 32), learning_rate=constant, max_iter=300; total time= 1.2min
[CV] END activation=tanh, alpha=0.001, hidden_layer_sizes=(64, 32), learning_rate=adaptive, max_iter=300; total time= 1.2min
[CV] END activation=tanh, alpha=0.001, hidden_layer_sizes=(64, 32), learning_rate=adaptive, max_iter=300; total time= 1.3min
[CV] END activation=tanh, alpha=0.001, hidden_layer_sizes=(64, 32), learning_rate=adaptive, max_iter=300; total time= 1.2min
[CV] END activation=tanh, alpha=0.001, hidden_layer_sizes=(128, 64), learning_rate=constant, max_iter=300; total time= 2.0min
[CV] END activation=tanh, alpha=0.001, hidden_layer_sizes=(128, 64), learning_rate=constant, max_iter=300; total time= 2.2min
[CV] END activation=tanh, alpha=0.001, hidden_layer_sizes=(128, 64), learning_rate=constant, max_iter=300; total time= 2.3min
[CV] END activation=tanh, alpha=0.001, hidden_layer_sizes=(128, 64), learning_rate=adaptive, max_iter=300; total time= 9.8min
[CV] END activation=tanh, alpha=0.001, hidden_layer_sizes=(128, 64), learning_rate=adaptive, max_iter=300; total time= 2.2min
[CV] END activation=tanh, alpha=0.001, hidden_layer_sizes=(128, 64), learning_rate=adaptive, max_iter=300; total time= 2
[CV] END activation=tanh, alpha=0.01, hidden_layer_sizes=(32,), learning_rate=constant, max_iter=300; total time=  30.5s
[CV] END activation=tanh, alpha=0.01, hidden_layer_sizes=(32,), learning_rate=constant, max_iter=300; total time=  30.3s
[CV] END activation=tanh, alpha=0.01, hidden_layer_sizes=(32,), learning_rate=constant, max_iter=300; total time=  30.5s
[CV] END activation=tanh, alpha=0.01, hidden_layer_sizes=(32,), learning_rate=adaptive, max_iter=300; total time=  30.4s
[CV] END activation=tanh, alpha=0.01, hidden_layer_sizes=(32,), learning_rate=adaptive, max_iter=300; total time=  30.4s
[CV] END activation=tanh, alpha=0.01, hidden_layer_sizes=(32,), learning_rate=adaptive, max_iter=300; total time=  30.2s
[CV] END activation=tanh, alpha=0.01, hidden_layer_sizes=(64,), learning_rate=constant, max_iter=300; total time=  56.7s
[CV] END activation=tanh, alpha=0.01, hidden_layer_sizes=(64,), learning_rate=constant, max_iter=300; total time=  56.6s
[CV] END activation=tanh, alpha=0.01, hidden_layer_sizes=(64,), learning_rate=constant, max_iter=300; total time=  56.4s
[CV] END activation=tanh, alpha=0.01, hidden_layer_sizes=(64,), learning_rate=adaptive, max_iter=300; total time=  55.5s
[CV] END activation=tanh, alpha=0.01, hidden_layer_sizes=(64,), learning_rate=adaptive, max_iter=300; total time=  57.2s
[CV] END activation=tanh, alpha=0.01, hidden_layer_sizes=(64,), learning_rate=adaptive, max_iter=300; total time=  56.5s
[CV] END activation=tanh, alpha=0.01, hidden_layer_sizes=(64, 32), learning_rate=constant, max_iter=300; total time= 1.2min
[CV] END activation=tanh, alpha=0.01, hidden_layer_sizes=(64, 32), learning_rate=constant, max_iter=300; total time= 1.3min
[CV] END activation=tanh, alpha=0.01, hidden_layer_sizes=(64, 32), learning_rate=constant, max_iter=300; total time= 1.3min
[CV] END activation=tanh, alpha=0.01, hidden_layer_sizes=(64, 32), learning_rate=adaptive, max_iter=300; total time= 1.2min
[CV] END activation=tanh, alpha=0.01, hidden_layer_sizes=(64, 32), learning_rate=adaptive, max_iter=300; total time= 1.3min
[CV] END activation=tanh, alpha=0.01, hidden_layer_sizes=(64, 32), learning_rate=adaptive, max_iter=300; total time= 1.3min
[CV] END activation=tanh, alpha=0.01, hidden_layer_sizes=(128, 64), learning_rate=constant, max_iter=300; total time= 2.0min
[CV] END activation=tanh, alpha=0.01, hidden_layer_sizes=(128, 64), learning_rate=constant, max_iter=300; total time= 2.1min
[CV] END activation=tanh, alpha=0.01, hidden_layer_sizes=(128, 64), learning_rate=constant, max_iter=300; total time= 1.8min
[CV] END activation=tanh, alpha=0.01, hidden_layer_sizes=(128, 64), learning_rate=adaptive, max_iter=300; total time= 2.0min
[CV] END activation=tanh, alpha=0.01, hidden_layer_sizes=(128, 64), learning_rate=adaptive, max_iter=300; total time= 2.1min
[CV] END activation=tanh, alpha=0.01, hidden_layer_sizes=(128, 64), learning_rate=adaptive, max_iter=300; total time= 1.9min
Best Parameters: {'activation': 'tanh', 'alpha': 0.01, 'hidden_layer_sizes': (128, 64), 'learning_rate': 'constant', 'max_iter

### 4. Evaluation
#### ●	Employ suitable metrics such as accuracy, precision, recall, and F1-score to evaluate your model's performance.
#### ●	Discuss the performance differences between the model with default hyperparameters and the tuned model, emphasizing the effects of hyperparameter tuning.
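To make the comparison between two models concrete, the per-class report can be condensed into a few macro-averaged numbers and their differences. The sketch below is only an illustration: the helper name `macro_summary` and the label arrays are made up for the example and are not results from this notebook.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support


def macro_summary(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall, and F1."""
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="macro", zero_division=0
    )
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Made-up labels and predictions for two hypothetical models
y_true = np.array([0, 0, 1, 1, 2, 2])
pred_baseline = np.array([0, 1, 1, 1, 2, 0])
pred_tuned = np.array([0, 0, 1, 1, 2, 0])

baseline_scores = macro_summary(y_true, pred_baseline)
tuned_scores = macro_summary(y_true, pred_tuned)

# Print each metric side by side with the change from baseline to tuned
for name in baseline_scores:
    delta = tuned_scores[name] - baseline_scores[name]
    print(f"{name:<10} baseline={baseline_scores[name]:.3f} "
          f"tuned={tuned_scores[name]:.3f} delta={delta:+.3f}")
```

Macro averaging weights every class equally, which suits a data set like this one where the classes have similar support. With the notebook's variables, the calls would be `macro_summary(y_test, y_pred_default)` and `macro_summary(y_test, y_pred_tuned)`.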


In [5]:
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Predict using the baseline model (the MLPClassifier fitted earlier as `model`)
y_pred_default = model.predict(X_test)

# Evaluation metrics
print("Default Model Performance:")
print("Accuracy:", accuracy_score(y_test, y_pred_default))
print("Classification Report:\n", classification_report(y_test, y_pred_default))
print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred_default))

Default Model Performance:
Accuracy: 0.93775
Classification Report:
               precision    recall  f1-score   support

           0       0.97      0.97      0.97       149
           1       0.93      0.93      0.93       153
           2       0.91      0.91      0.91       137
           3       0.94      0.93      0.93       156
           4       0.90      0.92      0.91       141
           5       0.88      0.91      0.89       140
           6       0.91      0.88      0.90       160
           7       0.86      0.83      0.85       144
           8       0.96      0.94      0.95       146
           9       0.97      0.94      0.95       149
          10       0.94      0.90      0.92       130
          11       0.94      0.97      0.96       155
          12       0.96      0.98      0.97       168
          13       0.94      0.92      0.93       151
          14       0.91      0.96      0.93       145
          15       0.92      0.96      0.94       173
          16

In [6]:
best_model = grid_search.best_estimator_  # best configuration, refit by GridSearchCV on the full training set
y_pred_tuned = best_model.predict(X_test)

print("Tuned Model Performance:")
print("Accuracy:", accuracy_score(y_test, y_pred_tuned))
print("Classification Report:\n", classification_report(y_test, y_pred_tuned))
print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred_tuned))


Tuned Model Performance:
Accuracy: 0.96425
Classification Report:
               precision    recall  f1-score   support

           0       0.97      0.99      0.98       149
           1       0.96      0.96      0.96       153
           2       0.95      0.96      0.95       137
           3       0.96      0.96      0.96       156
           4       0.93      0.93      0.93       141
           5       0.94      0.96      0.95       140
           6       0.97      0.94      0.96       160
           7       0.95      0.90      0.93       144
           8       0.99      0.92      0.95       146
           9       0.94      0.99      0.96       149
          10       0.87      0.96      0.92       130
          11       0.97      0.98      0.97       155
          12       0.99      0.98      0.99       168
          13       0.98      0.97      0.98       151
          14       0.96      0.96      0.96       145
          15       1.00      0.96      0.98       173
          16  