# Task 1: Data Exploration and Preprocessing

Q1. Begin by loading and exploring the "Alphabets_data.csv" dataset. Summarize its key features such as the number of samples, features, and classes.

Q2. Execute necessary data preprocessing steps, including data normalization and handling missing values.

In [1]:
import pandas as pd
import tensorflow as tf
import numpy as np
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam
from scikeras.wrappers import KerasClassifier

import warnings
warnings.filterwarnings('ignore')

# Load the dataset
df = pd.read_csv("Alphabets_data.csv")
df

,letter,xbox,ybox,width,height,onpix,xbar,ybar,x2bar,y2bar,xybar,x2ybar,xy2bar,xedge,xedgey,yedge,yedgex
0,T,2,8,3,5,1,8,13,0,6,6,10,8,0,8,0,8
1,I,5,12,3,7,2,10,5,5,4,13,3,9,2,8,4,10
2,D,4,11,6,8,6,10,6,2,6,10,3,7,3,7,3,9
3,N,7,11,6,6,3,5,9,4,6,4,4,10,6,10,2,8
4,G,2,1,3,1,1,8,6,6,6,6,5,9,1,7,5,10
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
19995,D,2,2,3,3,2,7,7,7,6,6,6,4,2,8,3,7
19996,C,7,10,8,8,4,4,8,6,9,12,9,13,2,9,3,7
19997,T,6,9,6,7,5,6,11,3,7,11,9,5,2,12,2,4
19998,S,2,3,4,2,1,8,7,2,6,10,6,8,1,9,5,8


In [2]:
# Check the dimensions of the dataset (number of rows and columns)
print("Dataset dimensions:", df.shape)

Dataset dimensions: (20000, 17)


In [3]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20000 entries, 0 to 19999
Data columns (total 17 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   letter  20000 non-null  object
 1   xbox    20000 non-null  int64 
 2   ybox    20000 non-null  int64 
 3   width   20000 non-null  int64 
 4   height  20000 non-null  int64 
 5   onpix   20000 non-null  int64 
 6   xbar    20000 non-null  int64 
 7   ybar    20000 non-null  int64 
 8   x2bar   20000 non-null  int64 
 9   y2bar   20000 non-null  int64 
 10  xybar   20000 non-null  int64 
 11  x2ybar  20000 non-null  int64 
 12  xy2bar  20000 non-null  int64 
 13  xedge   20000 non-null  int64 
 14  xedgey  20000 non-null  int64 
 15  yedge   20000 non-null  int64 
 16  yedgex  20000 non-null  int64 
dtypes: int64(16), object(1)
memory usage: 2.6+ MB


In [4]:
# Display summary statistics of numerical columns
print("\nSummary statistics:")
print(df.describe())


Summary statistics:
               xbox          ybox         width       height         onpix  \
count  20000.000000  20000.000000  20000.000000  20000.00000  20000.000000   
mean       4.023550      7.035500      5.121850      5.37245      3.505850   
std        1.913212      3.304555      2.014573      2.26139      2.190458   
min        0.000000      0.000000      0.000000      0.00000      0.000000   
25%        3.000000      5.000000      4.000000      4.00000      2.000000   
50%        4.000000      7.000000      5.000000      6.00000      3.000000   
75%        5.000000      9.000000      6.000000      7.00000      5.000000   
max       15.000000     15.000000     15.000000     15.00000     15.000000   

               xbar          ybar         x2bar         y2bar         xybar  \
count  20000.000000  20000.000000  20000.000000  20000.000000  20000.000000   
mean       6.897600      7.500450      4.628600      5.178650      8.282050   
std        2.026035      2.325354      

In [5]:
# Check for any missing values
print("\nMissing values:")
df.isnull().sum()


Missing values:


letter    0
xbox      0
ybox      0
width     0
height    0
onpix     0
xbar      0
ybar      0
x2bar     0
y2bar     0
xybar     0
x2ybar    0
xy2bar    0
xedge     0
xedgey    0
yedge     0
yedgex    0
dtype: int64

Every column reports zero missing values, so the dataset needs no imputation.

In [6]:
# Check the number of unique classes
classes = df['letter'].nunique()
print("\nNumber of classes:", classes)


Number of classes: 26


In [7]:
# Explore unique values of column 'letter'
df['letter'].unique()

array(['T', 'I', 'D', 'N', 'G', 'S', 'B', 'A', 'J', 'M', 'X', 'O', 'R',
       'F', 'C', 'H', 'W', 'L', 'P', 'E', 'V', 'Y', 'Q', 'U', 'K', 'Z'],
      dtype=object)

# Task 2: Model Implementation

Q1. Construct a basic ANN model using your chosen high-level neural network library. Ensure your model includes at least one hidden layer.

Q2. Divide the dataset into training and test sets.

Q3. Train your model on the training set and then use it to make predictions on the test set.

In [8]:
# Separate features (X) and target variable (y)
X = df.drop(columns=['letter'])  # Features
y = df['letter']  # Target variable

In [9]:
# Encode categorical target variable 'letter'
encoder = LabelEncoder()
y = encoder.fit_transform(y)

In [10]:
# Split the dataset into training and testing sets

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [11]:
# Initialize StandardScaler
scaler = StandardScaler()

# Fit scaler on training data and transform both training and testing data
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
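As a rough sketch of what the scaling step does, the following NumPy snippet standardizes a toy matrix by hand. It assumes the usual definition of standardization (subtract the training mean, divide by the training population standard deviation); the arrays `toy_train` and `toy_test` are hypothetical values for illustration only.

```python
import numpy as np

# Hypothetical toy data: 4 training rows and 1 test row with 2 features each.
toy_train = np.array([[2.0, 8.0], [5.0, 12.0], [4.0, 11.0], [7.0, 11.0]])
toy_test = np.array([[2.0, 1.0]])

# Statistics are computed on the training data only.
train_mean = toy_train.mean(axis=0)
train_std = toy_train.std(axis=0)  # population standard deviation (ddof=0)

# The same training statistics are applied to both sets.
toy_train_scaled = (toy_train - train_mean) / train_std
toy_test_scaled = (toy_test - train_mean) / train_std

print(toy_train_scaled.mean(axis=0))  # approximately 0 for each feature
print(toy_train_scaled.std(axis=0))   # approximately 1 for each feature
print(toy_test_scaled)
```

Fitting on the training set alone keeps information about the test distribution out of the preprocessing, which is why only `transform` is applied to the test data.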

In [12]:
# Build ANN Model

# Initialize the Sequential model
model = tf.keras.Sequential()

# Add the first hidden layer; input_dim sets the number of input features
model.add(tf.keras.layers.Dense(units=64, activation='relu', input_dim=X_train.shape[1]))

# Add a second hidden layer
model.add(tf.keras.layers.Dense(units=32, activation='relu'))

# Add the output layer: one unit per class, with softmax for multi-class probabilities
model.add(tf.keras.layers.Dense(units=len(np.unique(y)), activation='softmax'))

# Compile the model (specify loss function, optimizer, and metrics)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Print a summary of the model architecture
model.summary()
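The size of this network can be checked by hand. The sketch below counts the trainable parameters of each fully connected layer as weights plus biases, assuming 16 input features (the 17 columns minus `letter`) and 26 output classes; the helper `dense_params` is a hypothetical name introduced for this illustration.

```python
def dense_params(n_inputs, n_units):
    """Parameters of a fully connected layer: one weight per input-unit pair plus one bias per unit."""
    return n_inputs * n_units + n_units

# Layer widths: 16 input features -> 64 -> 32 -> 26 classes (assumed from the code above).
layer_sizes = [16, 64, 32, 26]

# Pair each layer width with the next one to get per-layer parameter counts.
per_layer = [dense_params(n_in, n_out) for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]
total_params = sum(per_layer)

print(per_layer)     # [1088, 2080, 858]
print(total_params)  # 4026
```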

In [13]:
# Train the model on the training data (this baseline uses the unscaled X_train, not X_train_scaled)
history = model.fit(X_train, y_train, epochs=20, batch_size=32, validation_split=0.1, verbose=1)

Epoch 1/20
450/450 ━━━━━━━━━━━━━━━━━━━━ 4s 3ms/step - accuracy: 0.2298 - loss: 2.8471 - val_accuracy: 0.6069 - val_loss: 1.3713
Epoch 2/20
450/450 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.6460 - loss: 1.2507 - val_accuracy: 0.7194 - val_loss: 1.0188
Epoch 3/20
450/450 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7315 - loss: 0.9761 - val_accuracy: 0.7431 - val_loss: 0.9164
Epoch 4/20
450/450 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7589 - loss: 0.8538 - val_accuracy: 0.7569 - val_loss: 0.8438
Epoch 5/20
450/450 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7685 - loss: 0.7948 - val_accuracy: 0.7750 - val_loss: 0.7615
Epoch 6/20
450/450 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.7933 - loss: 0.7177 - val_accuracy: 0.7969 - val_loss: 0.7166
Epoch 7/20
450/450

In [14]:
# Evaluate the model on the test data (unscaled X_test, matching the training input)
loss, accuracy = model.evaluate(X_test, y_test, verbose=1)
print(f"Accuracy on test data: {accuracy:.2f}")

125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.8725 - loss: 0.4132
Accuracy on test data: 0.88
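The step counts shown in the progress bars follow from the split sizes. The sketch below reproduces them, assuming that the validation share is removed before batching and that the last partial batch counts as a step; `steps_per_epoch` is a hypothetical helper written for this illustration, not a Keras function.

```python
import math

def steps_per_epoch(n_samples, batch_size, validation_split=0.0):
    """Number of batches per pass, after holding out the validation share."""
    n_fit = math.floor(n_samples * (1 - validation_split))
    return math.ceil(n_fit / batch_size)

# 80/20 split of the 20,000 rows.
n_train, n_test = 16000, 4000

train_steps = steps_per_epoch(n_train, batch_size=32, validation_split=0.1)
test_steps = steps_per_epoch(n_test, batch_size=32)
# Approximate training portion of one fold in 3-fold cross-validation (about 2/3 of 16,000).
cv_steps = steps_per_epoch(10667, batch_size=32, validation_split=0.1)

print(train_steps)  # 450
print(test_steps)   # 125
print(cv_steps)     # 300
```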


In [15]:
# Predict classes on test data
y_pred = np.argmax(model.predict(X_test), axis=-1)

# Compute evaluation metrics
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred, average='weighted')
recall = recall_score(y_test, y_pred, average='weighted')
f1 = f1_score(y_test, y_pred, average='weighted')

# Print evaluation metrics
print(f"Accuracy: {accuracy:.4f}")
print(f"Precision: {precision:.4f}")
print(f"Recall: {recall:.4f}")
print(f"F1-score: {f1:.4f}")

125/125 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step
Accuracy: 0.8755
Precision: 0.8794
Recall: 0.8755
F1-score: 0.8761


# Task 3: Hyperparameter Tuning
Q1. Modify various hyperparameters, such as the number of hidden layers, neurons per hidden layer, activation functions, and learning rate, to observe their impact on model performance.

Q2. Adopt a structured approach like grid search or random search for hyperparameter tuning, documenting your methodology thoroughly.

In [16]:
# Define a Function for Building the Model

def build_model(num_hidden_layers=1, num_neurons=32, activation='relu', learning_rate=0.001):
    # Initialize the Sequential model
    model = Sequential()

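    # Add the first hidden layer; input_dim sets the number of input features
    model.add(Dense(units=num_neurons, activation=activation, input_dim=X_train.shape[1]))

    # Add num_hidden_layers further hidden layers (num_hidden_layers + 1 in total)
    for _ in range(num_hidden_layers):
        model.add(Dense(units=num_neurons, activation=activation))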

    # Add the output layer
    model.add(Dense(units=len(np.unique(y)), activation='softmax'))

    # Compile the model with specified learning rate
    optimizer = Adam(learning_rate=learning_rate)
    model.compile(optimizer=optimizer, loss='sparse_categorical_crossentropy', metrics=['accuracy'])

    return model

In [17]:
best_accuracy = 0.0
best_hyperparameters = {}

num_hidden_layers_list = [1, 2, 3]  # Number of hidden layers
num_neurons_list = [32, 64, 128]    # Neurons per hidden layer
activation_list = ['relu', 'sigmoid']  # Activation functions
learning_rate_list = [0.001, 0.01, 0.1]  # Learning rates

# Iterate through different combinations of hyperparameters
for num_hidden_layers in num_hidden_layers_list:
    for num_neurons in num_neurons_list:
        for activation in activation_list:
            for learning_rate in learning_rate_list:
                # Build the model with current hyperparameters
                model = build_model(num_hidden_layers=num_hidden_layers,
                                    num_neurons=num_neurons,
                                    activation=activation,
                                    learning_rate=learning_rate)
                
                # Train the model
                history = model.fit(X_train_scaled, y_train, epochs=10, batch_size=32, validation_split=0.1, verbose=0)
                
                # Evaluate the model
                loss, accuracy = model.evaluate(X_test_scaled, y_test, verbose=0)
                
                # Print results
                print(f"Hidden Layers: {num_hidden_layers}, Neurons: {num_neurons}, "
                      f"Activation: {activation}, Learning Rate: {learning_rate}")
                print(f"Test Accuracy: {accuracy:.4f}")
                print("-------------------------------------------")
                
                # Update best model details if current model is better
                if accuracy > best_accuracy:
                    best_accuracy = accuracy
                    best_hyperparameters = {
                        'num_hidden_layers': num_hidden_layers,
                        'num_neurons': num_neurons,
                        'activation': activation,
                        'learning_rate': learning_rate
                    }

# Print best hyperparameters and accuracy
print("Best Hyperparameters:")
print(best_hyperparameters)
print(f"Best Test Accuracy: {best_accuracy:.4f}")

Hidden Layers: 1, Neurons: 32, Activation: relu, Learning Rate: 0.001
Test Accuracy: 0.8620
-------------------------------------------
Hidden Layers: 1, Neurons: 32, Activation: relu, Learning Rate: 0.01
Test Accuracy: 0.8873
-------------------------------------------
Hidden Layers: 1, Neurons: 32, Activation: relu, Learning Rate: 0.1
Test Accuracy: 0.2045
-------------------------------------------
Hidden Layers: 1, Neurons: 32, Activation: sigmoid, Learning Rate: 0.001
Test Accuracy: 0.7197
-------------------------------------------
Hidden Layers: 1, Neurons: 32, Activation: sigmoid, Learning Rate: 0.01
Test Accuracy: 0.8880
-------------------------------------------
Hidden Layers: 1, Neurons: 32, Activation: sigmoid, Learning Rate: 0.1
Test Accuracy: 0.7462
-------------------------------------------
Hidden Layers: 1, Neurons: 64, Activation: relu, Learning Rate: 0.001
Test Accuracy: 0.9160
-------------------------------------------
Hidden Layers: 1, Neurons: 64, Activation: re

By systematically varying the hyperparameters and evaluating each model, we can identify the configuration with the highest accuracy on this dataset. The best configuration found here is then passed to GridSearchCV, which tunes the training settings (epochs, batch size, and validation split).
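The four nested loops above can also be written as a single iteration over the Cartesian product of the value lists. This is a minimal sketch using `itertools.product` with the same value lists; the `grid` and `combinations` names are introduced here for illustration.

```python
from itertools import product

# Same hyperparameter values as in the nested loops.
grid = {
    "num_hidden_layers": [1, 2, 3],
    "num_neurons": [32, 64, 128],
    "activation": ["relu", "sigmoid"],
    "learning_rate": [0.001, 0.01, 0.1],
}

# One dict per combination; the last key varies fastest, like the innermost loop.
combinations = [dict(zip(grid, values)) for values in product(*grid.values())]

print(len(combinations))  # 54
print(combinations[0])
```

Each dict can be unpacked into the model-building function with `**`, which keeps the search loop flat as more hyperparameters are added.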

#### Why we are using GridSearchCV for Hyperparameter Tuning

1. Systematic Search: GridSearchCV conducts a structured exploration of predefined hyperparameter grids. It evaluates model performance across every possible combination of hyperparameters provided, ensuring thorough testing of each configuration.
2. Optimization: The main objective of hyperparameter tuning is to discover the combination that maximizes the model’s performance on unseen data. GridSearchCV automates this process by assessing numerous hyperparameter combinations and selecting the optimal one based on specified performance metrics such as accuracy or F1-score.
3. Cross-Validation: GridSearchCV integrates cross-validation during the hyperparameter search. This technique partitions the dataset into multiple subsets (folds), using each subset in turn as a validation set. By averaging performance across these folds, it provides a more reliable estimate of model performance and reduces the risk of choosing hyperparameters that overfit a single train/validation split.
4. Efficiency: Although the search is exhaustive, it remains practical for smaller datasets or reasonably sized grids. The number of fits is the product of the grid sizes times the number of folds (here 18 candidates × 3 folds = 54 fits), and the result is a clear overview of the best hyperparameter combination for the model.
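The points above can be seen in a small, fast-running sketch. It is not the notebook's Keras search: it assumes a synthetic dataset from `make_classification` and swaps in a `LogisticRegression` estimator with a one-parameter grid, purely to show how candidates, folds, and the best result are reported.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data (an assumption for this sketch, not the letters dataset).
X_demo, y_demo = make_classification(
    n_samples=300, n_features=16, n_informative=8, n_classes=3, random_state=42
)

# Three candidate values for the regularization strength C.
demo_grid = {"C": [0.1, 1.0, 10.0]}

# 3 candidates x 3 folds = 9 fits, plus one refit of the best candidate.
demo_search = GridSearchCV(
    LogisticRegression(max_iter=1000), demo_grid, cv=3, scoring="accuracy"
)
demo_search.fit(X_demo, y_demo)

n_candidates = len(demo_search.cv_results_["params"])
print(n_candidates)
print(demo_search.best_params_)
print(round(demo_search.best_score_, 4))  # mean cross-validated accuracy of the best candidate
```

`best_score_` is an average over the validation folds, so it is an estimate from the training data only; the held-out test set is still needed for the final evaluation.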

In [24]:
def build_model(num_hidden_layers=1, num_neurons=128, activation='sigmoid', learning_rate=0.01):
    # Initialize the Sequential model
    model = Sequential()

    # Add the first hidden layer; input_dim sets the number of input features
    model.add(Dense(units=num_neurons, activation=activation, input_dim=X_train_scaled.shape[1]))

    # Add num_hidden_layers further hidden layers (num_hidden_layers + 1 in total)
    for _ in range(num_hidden_layers):
        model.add(Dense(units=num_neurons, activation=activation))

    # Add the output layer (assuming it's a multi-class classification problem)
    model.add(Dense(units=len(np.unique(y_train)), activation='softmax'))

    # Compile the model with specified learning rate
    optimizer = Adam(learning_rate=learning_rate)
    model.compile(optimizer=optimizer, loss='sparse_categorical_crossentropy', metrics=['accuracy'])

    return model

# Wrap the model-building function in a KerasClassifier, passing in the best hyperparameters found above
keras_model = KerasClassifier(model=build_model, **best_hyperparameters)

In [25]:
# Define the parameter grid for GridSearchCV
param_grid = {
    'epochs': [10, 20, 30],
    'batch_size': [32, 64, 128],
    'validation_split': [0.1, 0.2]
}

# Initialize GridSearchCV
grid_search = GridSearchCV(estimator=keras_model, param_grid=param_grid, cv=3, scoring='accuracy', verbose=1)

# Fit GridSearchCV
grid_search.fit(X_train_scaled, y_train)

# Retrieve the best estimator found by the search (refit on the full training set by default)
best_model = grid_search.best_estimator_

Fitting 3 folds for each of 18 candidates, totalling 54 fits
Epoch 1/10
300/300 ━━━━━━━━━━━━━━━━━━━━ 3s 4ms/step - accuracy: 0.3347 - loss: 2.2878 - val_accuracy: 0.7273 - val_loss: 0.9376
Epoch 2/10
300/300 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.7575 - loss: 0.7823 - val_accuracy: 0.7666 - val_loss: 0.7183
Epoch 3/10
300/300 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.8311 - loss: 0.5427 - val_accuracy: 0.8416 - val_loss: 0.5121
Epoch 4/10
300/300 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.8699 - loss: 0.4105 - val_accuracy: 0.8725 - val_loss: 0.4165
Epoch 5/10
300/300 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.9034 - loss: 0.3073 - val_accuracy: 0.8875 - val_loss: 0.3534
Epoch 6/10
300/300 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - accuracy: 0.9197 - loss: 0.2571 - val_acc

# Task 4: Evaluation

Q1. Employ suitable metrics such as accuracy, precision, recall, and F1-score to evaluate your model's performance.

Q2. Discuss the performance differences between the model with default hyperparameters and the tuned model, emphasizing the effects of hyperparameter tuning.

In [26]:
# Evaluate best model on test data
y_pred = best_model.predict(X_test_scaled)

# Compute evaluation metrics
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred, average='weighted')
recall = recall_score(y_test, y_pred, average='weighted')
f1 = f1_score(y_test, y_pred, average='weighted')

# Print evaluation metrics
print(f"Accuracy: {accuracy:.4f}")
print(f"Precision: {precision:.4f}")
print(f"Recall: {recall:.4f}")
print(f"F1-score: {f1:.4f}")

32/32 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step
Accuracy: 0.9657
Precision: 0.9660
Recall: 0.9657
F1-score: 0.9657


#### Evaluation Metrics for Model with Default Hyperparameters:

1. Accuracy: 0.8755
2. Precision: 0.8794
3. Recall: 0.8755
4. F1-score: 0.8761

#### Evaluation Metrics for Tuned Model:

1. Accuracy: 0.9657
2. Precision: 0.9660
3. Recall: 0.9657
4. F1-score: 0.9657

#### Analysis:

1. Accuracy: The default model achieved an accuracy of 0.8755, while the tuned model significantly improved to 0.9657. This indicates that the tuned model correctly predicts 96.57% of all instances, compared to 87.55% for the default model. Hyperparameter tuning has effectively boosted the model's ability to correctly classify instances.

2. Precision: Precision is the fraction of samples predicted as a given class that actually belong to it; the score reported here is the support-weighted average over the 26 letter classes. The default model had a precision of 0.8794, while the tuned model increased it to 0.9660, meaning that a larger share of the tuned model's predictions for each letter are correct.

3. Recall: Recall, also known as sensitivity or true positive rate, is the fraction of samples of a given class that the model identifies correctly, again averaged over the classes with support weights. The default model had a recall of 0.8755, while the tuned model increased it to 0.9657. With weighted averaging, recall equals accuracy, which is why the two values coincide for both models.

4. F1-score: The F1-score combines precision and recall into a single metric. The default model had an F1-score of 0.8761, while the tuned model improved it to 0.9657. This shows that the tuned model achieves a better balance between precision and recall, resulting in higher overall model performance.
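The weighted averages used for these metrics can be worked through by hand. The sketch below assumes the usual definition of support-weighted averaging (each class's score weighted by its share of the true labels) and uses made-up toy labels; `weighted_recall` and `weighted_precision` are hypothetical helpers, not scikit-learn functions.

```python
from collections import Counter

def weighted_recall(y_true, y_pred):
    """Per-class recall, weighted by each class's share of the true labels."""
    support = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    n = len(y_true)
    return sum((support[c] / n) * (correct[c] / support[c]) for c in support)

def weighted_precision(y_true, y_pred):
    """Per-class precision, weighted by each class's share of the true labels."""
    support = Counter(y_true)
    predicted = Counter(y_pred)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    n = len(y_true)
    total = 0.0
    for c in support:
        # A class that is never predicted contributes a precision of 0.
        class_precision = correct[c] / predicted[c] if predicted[c] else 0.0
        total += (support[c] / n) * class_precision
    return total

# Toy labels for three classes (hypothetical values).
toy_true = [0, 0, 0, 1, 1, 2]
toy_pred = [0, 0, 1, 1, 1, 2]

toy_accuracy = sum(t == p for t, p in zip(toy_true, toy_pred)) / len(toy_true)
toy_recall = weighted_recall(toy_true, toy_pred)
toy_precision = weighted_precision(toy_true, toy_pred)

print(toy_accuracy, toy_recall, toy_precision)
```

In the recall sum each class's support cancels, leaving the total number of correct predictions divided by the number of samples, so weighted recall always matches accuracy.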

#### Effects of Hyperparameter Tuning:

1. Improved Accuracy: Hyperparameter tuning led to a substantial increase in accuracy, demonstrating the effectiveness of finding optimal combinations of hyperparameters that enhance the model's predictive power.
2. Enhanced Precision: Tuning resulted in higher precision, indicating fewer false positives in predictions. This is crucial in applications where minimizing false positives is critical.
3. Improved Recall: Recall rose from 0.8755 to 0.9657, so the tuned model misses fewer instances of each letter. High sensitivity of this kind is essential for comprehensive detection tasks.
4. Overall Model Effectiveness: The F1-score improvement from 0.8761 to 0.9657 highlights that hyperparameter tuning not only improves individual metrics but also enhances the overall robustness and reliability of the model across different evaluation criteria.

In conclusion, hyperparameter tuning plays a pivotal role in optimizing machine learning models by significantly improving performance metrics such as accuracy, precision, recall, and F1-score. These improvements translate into more reliable and effective models for real-world applications.