ARTIFICIAL NEURAL NETWORKS

Classification Using Artificial Neural Networks with Hyperparameter Tuning on Alphabets Data

Overview

In this assignment, you will develop a classification model using Artificial Neural Networks (ANNs) to classify data points from the "Alphabets_data.csv" dataset into letters of the alphabet. The exercise aims to deepen your understanding of ANNs and of the role hyperparameter tuning plays in model performance.

Dataset: "Alphabets_data.csv"

The provided dataset, "Alphabets_data.csv", consists of labeled data for a classification task: identifying letters of the alphabet. Before using the data in your model, you will need to preprocess it.

Tasks

1. Data Exploration and Preprocessing

    ●	Begin by loading and exploring the "Alphabets_data.csv" dataset. Summarize its key features such as the number of samples, features, and classes.

    ●	Carry out the necessary preprocessing steps, including normalizing the data and handling missing values.

2. Model Implementation

    ●	Construct a basic ANN model using your chosen high-level neural network library. Ensure your model includes at least one hidden layer.

    ●	Divide the dataset into training and test sets.

    ●	Train your model on the training set and then use it to make predictions on the test set.

3. Hyperparameter Tuning

    ●	Modify various hyperparameters, such as the number of hidden layers, neurons per hidden layer, activation functions, and learning rate, to observe their impact on model performance.

    ●	Adopt a structured approach like grid search or random search for hyperparameter tuning, documenting your methodology thoroughly.

4. Evaluation

    ●	Employ suitable metrics such as accuracy, precision, recall, and F1-score to evaluate your model's performance.

    ●	Discuss the performance differences between the model with default hyperparameters and the tuned model, emphasizing the effects of hyperparameter tuning.

Evaluation Criteria

    ●	Accuracy and completeness of the implementation.
    ●	Proficiency in data preprocessing and model development.
    ●	Systematic approach and thoroughness in hyperparameter tuning.
    ●	Depth of evaluation and discussion.
    ●	Overall quality of the report.




In [3]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, LabelEncoder
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam
from sklearn.metrics import accuracy_score, classification_report
import warnings
warnings.filterwarnings('ignore')

In [4]:

# Load the dataset
df = pd.read_csv('/content/Alphabets_data.csv')

# Basic exploration
print(df.info())  # Summary of the dataset
print(df.head())  # Display the first few rows of the dataset
print(df.describe())  # Get statistics of the data

# Check for missing values
print(df.isnull().sum())


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20000 entries, 0 to 19999
Data columns (total 17 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   letter  20000 non-null  object
 1   xbox    20000 non-null  int64 
 2   ybox    20000 non-null  int64 
 3   width   20000 non-null  int64 
 4   height  20000 non-null  int64 
 5   onpix   20000 non-null  int64 
 6   xbar    20000 non-null  int64 
 7   ybar    20000 non-null  int64 
 8   x2bar   20000 non-null  int64 
 9   y2bar   20000 non-null  int64 
 10  xybar   20000 non-null  int64 
 11  x2ybar  20000 non-null  int64 
 12  xy2bar  20000 non-null  int64 
 13  xedge   20000 non-null  int64 
 14  xedgey  20000 non-null  int64 
 15  yedge   20000 non-null  int64 
 16  yedgex  20000 non-null  int64 
dtypes: int64(16), object(1)
memory usage: 2.6+ MB
None
  letter  xbox  ybox  width  height  onpix  xbar  ybar  x2bar  y2bar  xybar  \
0      T     2     8      3       5      1     8    13      0      6    
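
The exploration task also asks for the number of samples, features, and classes, which `df.info()` does not report directly. A minimal sketch of such a summary follows. `summarize_dataset` is a hypothetical helper, and the small `demo` frame is made-up data standing in for `df` so the snippet runs without the CSV.

```python
import pandas as pd

def summarize_dataset(frame: pd.DataFrame, target: str = "letter") -> dict:
    """Return sample, feature, and class counts plus per-class frequencies."""
    class_counts = frame[target].value_counts().sort_index()
    return {
        "n_samples": len(frame),
        "n_features": frame.shape[1] - 1,  # every column except the target
        "n_classes": int(class_counts.size),
        "n_missing": int(frame.isnull().sum().sum()),
        "class_counts": class_counts.to_dict(),
    }

# Made-up rows with the same layout as the real data (label column + integer features).
demo = pd.DataFrame(
    {"letter": ["T", "I", "T", "D"], "xbox": [2, 5, 4, 7], "ybox": [8, 12, 11, 11]}
)
summary = summarize_dataset(demo)
print(summary)
```

Calling `summarize_dataset(df)` on the real data should report 20,000 samples and 16 features, in line with the `df.info()` output above. The per-class counts also show whether the classes are balanced, which matters when reading accuracy.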

## Data Preprocessing

In [5]:
# Extract features and target
X = df.drop(columns='letter')  # Features
y = df['letter']  # Target (Alphabet classes)

# Encode target labels
le = LabelEncoder()
y_encoded = le.fit_transform(y)

# Normalize the features
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Split the dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y_encoded, test_size=0.2, random_state=42)

print('x_train',X_train.shape)
print('x_test',X_test.shape)
print('y_train',y_train.shape)
print('y_test',y_test.shape)

x_train (16000, 16)
x_test (4000, 16)
y_train (16000,)
y_test (4000,)
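
The cell above fits `StandardScaler` on all rows before splitting, so the test rows influence the scaling statistics. One possible variant, sketched here on synthetic stand-in data, splits first, stratifies on the label, and fits the scaler on the training rows only. The names `X_demo`, `y_demo`, and `scaler_demo` are illustrative and not part of the notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
X_demo = rng.integers(0, 16, size=(200, 16)).astype(float)  # synthetic stand-in for the 16 features
y_demo = np.repeat(np.arange(4), 50)                         # 4 balanced stand-in classes

# Split first; stratify keeps the class proportions equal in both parts.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

# Fit the scaler on training rows only, then reuse its statistics for the test rows.
scaler_demo = StandardScaler()
X_tr_scaled = scaler_demo.fit_transform(X_tr)
X_te_scaled = scaler_demo.transform(X_te)

print(X_tr_scaled.shape, X_te_scaled.shape)
print(np.bincount(y_te))  # class counts in the test split
```

With 20,000 rows the effect on the reported accuracy is probably small, but this ordering keeps the test set untouched until evaluation.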


## Basic ANN Model Implementation

In [6]:
# Build a basic ANN model
model = Sequential()
model.add(Dense(64, input_dim=X_train.shape[1], activation='relu'))  # Hidden layer
model.add(Dense(32, activation='relu'))  # Another hidden layer
model.add(Dense(len(le.classes_), activation='softmax'))  # Output layer for multi-class classification

# Compile the model
model.compile(optimizer=Adam(learning_rate=0.001), loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Train the model
history = model.fit(X_train, y_train, epochs=10, batch_size=32, validation_data=(X_test, y_test))

# Predict on test data
y_pred = np.argmax(model.predict(X_test), axis=-1)


Epoch 1/10
500/500 ━━━━━━━━━━━━━━━━━━━━ 4s 3ms/step - accuracy: 0.3223 - loss: 2.4967 - val_accuracy: 0.7165 - val_loss: 1.0237
Epoch 2/10
500/500 ━━━━━━━━━━━━━━━━━━━━ 3s 2ms/step - accuracy: 0.7401 - loss: 0.9404 - val_accuracy: 0.7955 - val_loss: 0.7484
Epoch 3/10
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7940 - loss: 0.7184 - val_accuracy: 0.8282 - val_loss: 0.6220
Epoch 4/10
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8229 - loss: 0.6122 - val_accuracy: 0.8420 - val_loss: 0.5431
Epoch 5/10
500/500 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8450 - loss: 0.5302 - val_accuracy: 0.8560 - val_loss: 0.4979
Epoch 6/10
500/500 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - accuracy: 0.8643 - loss: 0.4575 - val_accuracy: 0.8717 - val_loss: 0.4455
Epoch 7/10
500/500

In [7]:
# Evaluation metrics
accuracy = accuracy_score(y_test, y_pred)
print(f'Accuracy: {accuracy}')
print(classification_report(y_test, y_pred, target_names=le.classes_))

Accuracy: 0.902
              precision    recall  f1-score   support

           A       0.86      0.94      0.90       149
           B       0.86      0.83      0.85       153
           C       0.94      0.90      0.92       137
           D       0.86      0.95      0.90       156
           E       0.89      0.93      0.91       141
           F       0.82      0.89      0.85       140
           G       0.85      0.82      0.84       160
           H       0.87      0.72      0.79       144
           I       0.96      0.91      0.94       146
           J       0.94      0.91      0.92       149
           K       0.81      0.90      0.85       130
           L       0.95      0.90      0.92       155
           M       0.97      0.92      0.95       168
           N       0.99      0.92      0.96       151
           O       0.90      0.92      0.91       145
           P       0.97      0.83      0.90       173
           Q       0.92      0.95      0.93       166
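
The report shows that some letters are recognized less reliably than others (H has a recall of 0.72), but not which letters they are mistaken for. A confusion matrix answers that. The sketch below uses a hypothetical helper, `top_confusions`, to list the most frequent misclassification pairs. The three-class arrays are made-up demo data.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

def top_confusions(y_true, y_pred, class_names, k=3):
    """Return up to k (true, predicted, count) pairs, most frequent errors first."""
    cm = confusion_matrix(y_true, y_pred, labels=np.arange(len(class_names)))
    np.fill_diagonal(cm, 0)  # drop the correct predictions
    order = np.argsort(cm, axis=None)[::-1][:k]
    rows, cols = np.unravel_index(order, cm.shape)
    return [
        (class_names[r], class_names[c], int(cm[r, c]))
        for r, c in zip(rows, cols)
        if cm[r, c] > 0
    ]

names = ["A", "B", "C"]
y_true_demo = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])
y_pred_demo = np.array([0, 1, 1, 1, 1, 2, 2, 2, 2])
pairs = top_confusions(y_true_demo, y_pred_demo, names)
print(pairs)
```

In the notebook, the equivalent call would be `top_confusions(y_test, y_pred, le.classes_, k=10)`.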
           

## Hyperparameter Tuning (Grid Search)

In [8]:
!pip install tensorflow



In [9]:
!pip install scikeras

Collecting scikeras
  Downloading scikeras-0.13.0-py3-none-any.whl.metadata (3.1 kB)
Downloading scikeras-0.13.0-py3-none-any.whl (26 kB)
Installing collected packages: scikeras
Successfully installed scikeras-0.13.0


In [10]:
from sklearn.model_selection import GridSearchCV
from scikeras.wrappers import KerasClassifier
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam

# Function to create model for GridSearch
def create_model(learning_rate=0.001, activation='relu'):
    model = Sequential()
    model.add(Dense(64, input_dim=X_train.shape[1], activation=activation))
    model.add(Dense(32, activation=activation))
    model.add(Dense(len(le.classes_), activation='softmax'))
    model.compile(optimizer=Adam(learning_rate=learning_rate), loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    return model
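
For a sense of scale, the number of trainable parameters in this architecture can be worked out by hand: each `Dense` layer has `inputs * units` weights plus `units` biases. The sketch below does that arithmetic. `dense_param_count` is a hypothetical helper, and the output size of 26 assumes one class per letter A to Z, which the truncated reports in this notebook do not confirm.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases in a stack of fully connected layers."""
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    )

# 16 input features -> 64 -> 32 -> 26 outputs (26 is an assumption: one class per letter)
sizes = [16, 64, 32, 26]
total = dense_param_count(sizes)
print(total)  # 1088 + 2080 + 858 = 4026
```

The same helper makes it easy to compare candidate architectures before adding layer counts or widths to a search grid.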

In [11]:
# Wrap the model for scikit-learn compatibility
model = KerasClassifier(model=create_model, verbose=0) # Updated to use model instead of build_fn
model

In [12]:

# Define hyperparameters to tune
param_grid = {
    'batch_size': [32, 64],
    'epochs': [10, 20],
    'model__learning_rate': [0.001, 0.01],  # the model__ prefix routes this to create_model(learning_rate=...)
    'model__activation': ['relu', 'tanh']  # routed to create_model(activation=...)
}

# Perform GridSearchCV
grid = GridSearchCV(estimator=model, param_grid=param_grid, cv=3)
grid_result = grid.fit(X_train, y_train)
grid_result
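
The cost of this search follows from the grid size. The sketch below counts the candidates with scikit-learn's `ParameterGrid` and shows how `ParameterSampler` draws a smaller random subset, which is the idea behind random search. `param_grid_demo` repeats the grid from the cell above so the snippet runs on its own. The choice of six samples is arbitrary.

```python
from sklearn.model_selection import ParameterGrid, ParameterSampler

param_grid_demo = {
    "batch_size": [32, 64],
    "epochs": [10, 20],
    "model__learning_rate": [0.001, 0.01],
    "model__activation": ["relu", "tanh"],
}

n_candidates = len(ParameterGrid(param_grid_demo))  # 2 * 2 * 2 * 2
cv_folds = 3
# +1 because GridSearchCV (refit=True by default) retrains the best candidate on all training data
n_fits = n_candidates * cv_folds + 1

# Random search alternative: evaluate only a fixed-size sample of the same grid.
sampled = list(ParameterSampler(param_grid_demo, n_iter=6, random_state=0))

print(n_candidates, n_fits, len(sampled))
```

If the grid grows (for example, by adding layer counts and widths), `RandomizedSearchCV` takes the same estimator with `param_distributions` and `n_iter` and keeps the number of fits fixed.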

In [13]:
# Display best parameters and score
print(f'Best parameters: {grid_result.best_params_}')
print(f'Best score: {grid_result.best_score_}')

Best parameters: {'batch_size': 64, 'epochs': 20, 'model__activation': 'relu', 'model__learning_rate': 0.01}
Best score: 0.9185623548415908
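
`best_params_` reports only the winner. The fitted search also keeps every candidate's score in `cv_results_`, which is useful for documenting the methodology. The sketch below shows one way to turn it into a ranked table. It uses a `LogisticRegression` stand-in on synthetic data so that it runs in seconds. This substitution is an assumption for illustration and is not the Keras model from this notebook.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data with 16 features, like the real dataset.
X_demo, y_demo = make_classification(
    n_samples=300, n_features=16, n_informative=8, n_classes=3, random_state=0
)

search = GridSearchCV(LogisticRegression(max_iter=1000), {"C": [0.01, 0.1, 1.0, 10.0]}, cv=3)
search.fit(X_demo, y_demo)

# One row per candidate, best first.
columns = ["params", "mean_test_score", "std_test_score", "rank_test_score"]
results = (
    pd.DataFrame(search.cv_results_)[columns]
    .sort_values("rank_test_score")
    .reset_index(drop=True)
)
print(results)
```

Applied to the notebook's search, `pd.DataFrame(grid_result.cv_results_)` would give the same columns for all 16 candidates, including the fold-to-fold spread in `std_test_score`.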


## Evaluation

In [19]:

# Evaluate the best model on the test set
best_model = grid_result.best_estimator_
y_pred_best = best_model.predict(X_test)
accuracy_best = accuracy_score(y_test, y_pred_best)
print(f'Accuracy of the best model: {accuracy_best}')
print(classification_report(y_test, y_pred_best, target_names=le.classes_))

# Compare performance with the basic model
print(f'Accuracy of the basic model: {accuracy}')



Accuracy of the best model: 0.922
              precision    recall  f1-score   support

           A       0.98      0.95      0.96       149
           B       0.89      0.95      0.92       153
           C       0.93      0.88      0.91       137
           D       0.90      0.91      0.91       156
           E       0.95      0.88      0.92       141
           F       0.99      0.76      0.86       140
           G       0.85      0.91      0.88       160
           H       0.93      0.76      0.84       144
           I       0.93      0.92      0.92       146
           J       0.96      0.97      0.96       149
           K       0.83      0.92      0.87       130
           L       0.83      0.99      0.90       155
           M       0.97      0.99      0.98       168
           N       0.94      0.90      0.92       151
           O       0.94      0.92      0.93       145
           P       0.92      0.98      0.95       173
           Q       0.92      0.94      0.93    
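
Overall accuracy hides per-class shifts. In the two reports, for instance, recall for F drops from 0.89 to 0.76 while recall for P rises from 0.83 to 0.98. A side-by-side table makes such changes easier to spot. The sketch below uses a hypothetical helper, `recall_change`, on made-up two-class predictions.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import recall_score

def recall_change(y_true, pred_a, pred_b, class_names):
    """Per-class recall for two models and the change from a to b, largest drops first."""
    labels = np.arange(len(class_names))
    rec_a = recall_score(y_true, pred_a, labels=labels, average=None, zero_division=0)
    rec_b = recall_score(y_true, pred_b, labels=labels, average=None, zero_division=0)
    table = pd.DataFrame({"basic": rec_a, "tuned": rec_b}, index=list(class_names))
    table["change"] = table["tuned"] - table["basic"]
    return table.sort_values("change")

y_true_demo = np.array([0, 0, 0, 0, 1, 1, 1, 1])
pred_basic_demo = np.array([0, 0, 0, 1, 1, 1, 0, 0])  # recall: A 0.75, B 0.50
pred_tuned_demo = np.array([0, 0, 1, 1, 1, 1, 1, 1])  # recall: A 0.50, B 1.00
table = recall_change(y_true_demo, pred_basic_demo, pred_tuned_demo, ["A", "B"])
print(table)
```

In the notebook, the equivalent call would be `recall_change(y_test, y_pred, y_pred_best, le.classes_)`.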

## Discussion on Performance Differences
Hyperparameter tuning raised test accuracy from 0.902 to 0.922, a gain of 2 percentage points, or 80 more of the 4,000 test samples classified correctly.

**Basic Model:**
- Accuracy: 0.902

**Tuned Model:**
- Accuracy: 0.922

**Effects of Hyperparameter Tuning:**
- GridSearchCV evaluated all 16 combinations of batch size, epochs, learning rate, and activation function using 3-fold cross-validation on the training set. It selected batch size 64, 20 epochs, ReLU, and a learning rate of 0.01, with a mean cross-validated accuracy of 0.919.
- The cross-validated score (0.919) is close to the test accuracy (0.922), which suggests the selected configuration generalizes to unseen data rather than fitting one particular split.
- The comparison with the basic model is not fully controlled. The basic model trained for 10 epochs with batch size 32 and a learning rate of 0.001, so the gain combines the effects of a larger learning rate, a larger batch size, and twice as many epochs. Training accuracy was not recorded for the tuned model, so these results do not show whether tuning changed the amount of overfitting.
- The learning rate has a strong influence on training: too large a value can overshoot good weights, and too small a value converges slowly. Here the larger value (0.01) won, which is consistent with validation accuracy still rising from epoch to epoch in the basic model's training log.
- Batch size and epoch count trade training time against accuracy. The best configuration used the larger value of each.
- ReLU was selected over tanh. This is the same activation as in the basic model, so the activation function did not contribute to the gain.
- The improvement is uneven across classes. For example, recall for P rose from 0.83 to 0.98, while recall for F fell from 0.89 to 0.76.
- The grid did not vary the number of hidden layers or the neurons per layer, so the effect of network size remains untested.
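
Whether a 2-point accuracy gap is larger than test-set sampling noise can be checked roughly with a normal-approximation confidence interval. The sketch below does this for the two reported accuracies on the 4,000 test samples. `accuracy_interval` is a hypothetical helper. The check ignores run-to-run variation from random weight initialization, and a paired test on the shared test set would be stricter, so treat it as a sanity check only.

```python
import math

def accuracy_interval(acc, n, z=1.96):
    """Normal-approximation 95% interval for an accuracy measured on n samples."""
    half_width = z * math.sqrt(acc * (1.0 - acc) / n)
    return acc - half_width, acc + half_width

n_test = 4000
basic_lo, basic_hi = accuracy_interval(0.902, n_test)
tuned_lo, tuned_hi = accuracy_interval(0.922, n_test)

print(f"basic: [{basic_lo:.3f}, {basic_hi:.3f}]")
print(f"tuned: [{tuned_lo:.3f}, {tuned_hi:.3f}]")
print("intervals overlap:", basic_hi >= tuned_lo)
```

The two intervals do not overlap, so the gap is unlikely to come from the choice of test samples alone.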

**Conclusion:**
Hyperparameter tuning produced a modest but measurable improvement here: test accuracy rose from 0.902 to 0.922. Because the tuned model also trained for more epochs, the result shows the benefit of the selected configuration as a whole rather than of any single hyperparameter. Further experiments could widen the search to the number of hidden layers and neurons per layer, and could use random search to cover larger ranges at similar cost.