<a href="https://colab.research.google.com/github/vaisshnavee1410/Artificial-Neural-Networks.ipynb/blob/main/Artifical_Neural_Networks.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **ARTIFICIAL NEURAL NETWORKS**

### **OVERVIEW:**

In this assignment, you will be tasked with developing a classification model using Artificial Neural Networks (ANNs) to classify data points from the "Alphabets_data.csv" dataset into predefined categories of alphabets. This exercise aims to deepen your understanding of ANNs and the significant role hyperparameter tuning plays in enhancing model performance.

### **DATASET:**

The dataset provided, "Alphabets_data.csv", consists of labeled data suitable for a classification task aimed at identifying different alphabets. Before using this data in your model, you'll need to preprocess it to ensure optimal performance.

### **TASKS:**

### **1. Data Exploration and Preprocessing:**

● **Begin by loading and exploring the "Alphabets_data.csv" dataset. Summarize its key features such as the number of samples, features, and classes.**

In [5]:
import pandas as pd
import numpy as np

# Load the dataset
data = pd.read_csv('Alphabets_data.csv')

# Number of Samples
num_samples = data.shape[0]
print(f"Number of Samples: {num_samples}")

# Number of Features
num_features = data.shape[1]
print(f"Number of Features: {num_features}")

# Number of Classes
num_classes = data.shape[1]
print(f"Number of Classes: {num_classes}")

Number of Samples: 20000
Number of Features: 17
Number of Classes: 17


● **Execute necessary data preprocessing steps including data normalization, managing missing values.**

In [6]:
from sklearn.preprocessing import StandardScaler, LabelEncoder

# Separate features and target
X = data.drop('letter', axis=1)
y = data['letter']

# Normalize features using StandardScaler (mean=0, std=1)
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
print("X_scaled:\n")
print(X_scaled)

# Encode the target labels
label_encoder = LabelEncoder()
y_encoded = label_encoder.fit_transform(y)
print("y_encoded:\n")
print(y_encoded)

# Check for missing values
missing_summary = data.isnull().sum()
print("Missing values per column:\n", missing_summary)

X_scaled:

[[-1.0576983   0.29187713 -1.05327668 ... -0.21908163 -1.4381527
   0.12291107]
 [ 0.51038497  1.5023577  -1.05327668 ... -0.21908163  0.12008142
   1.35944092]
 [-0.01230945  1.19973756  0.43590966 ... -0.8656262  -0.26947711
   0.74117599]
 ...
 [ 1.03307939  0.59449727  0.43590966 ...  2.36709667 -0.65903564
  -2.35014863]
 [-1.0576983  -1.22122359 -0.55688123 ...  0.42746295  0.50963994
   0.12291107]
 [-0.01230945  0.59449727  0.43590966 ... -0.8656262  -0.65903564
   0.12291107]]
y_encoded:

[19  8  3 ... 19 18  0]
Missing values per column:
 letter    0
xbox      0
ybox      0
width     0
height    0
onpix     0
xbar      0
ybar      0
x2bar     0
y2bar     0
xybar     0
x2ybar    0
xy2bar    0
xedge     0
xedgey    0
yedge     0
yedgex    0
dtype: int64


### **2. Model Implementation:**

● **Construct a basic ANN model using your chosen high-level neural network library. Ensure your model includes at least one hidden layer.**

In [7]:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler

# 1. Preprocessing
X = data.drop('letter', axis=1)
y = data['letter']

# Encode labels to integers
encoder = LabelEncoder()
y_encoded = encoder.fit_transform(y)

# Define hyperparameters here before building the model
neurons_per_layer = 64  #
activation_function = 'relu'
hidden_layers = 1

# Build model
model = Sequential()
model.add(Dense(neurons_per_layer, activation=activation_function, input_shape=(X.shape[1],))) # Changed input_shape

# Hidden layers (if hidden_layers > 1)
for _ in range(hidden_layers - 1):
    model.add(Dense(neurons_per_layer, activation=activation_function))

# Normalize features
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y_encoded, test_size=0.2, random_state=42)



# 2. Define ANN model
model = Sequential([
    Dense(64, activation='relu', input_shape=(X_train.shape[1],)),  # Hidden layer
    Dense(26, activation='softmax')  # Output layer (26 classes for letters A-Z)
])

# 3. Compile model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# 4. Train model
model.fit(X_train, y_train, epochs=10, batch_size=32, validation_split=0.1)

# 5. Evaluate model
test_loss, test_accuracy = model.evaluate(X_test, y_test)
print(f"Test Accuracy: {test_accuracy:.2f}")

Epoch 1/10


  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
  super().__init__(activity_regularizer=activity_regularizer, **kwargs)


[1m450/450[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 3ms/step - accuracy: 0.2997 - loss: 2.5953 - val_accuracy: 0.6681 - val_loss: 1.2964
Epoch 2/10
[1m450/450[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 2ms/step - accuracy: 0.7106 - loss: 1.1589 - val_accuracy: 0.7444 - val_loss: 0.9239
Epoch 3/10
[1m450/450[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 2ms/step - accuracy: 0.7636 - loss: 0.8644 - val_accuracy: 0.7806 - val_loss: 0.7701
Epoch 4/10
[1m450/450[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 2ms/step - accuracy: 0.8010 - loss: 0.7282 - val_accuracy: 0.8006 - val_loss: 0.6811
Epoch 5/10
[1m450/450[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 2ms/step - accuracy: 0.8220 - loss: 0.6345 - val_accuracy: 0.8213 - val_loss: 0.6103
Epoch 6/10
[1m450/450[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 2ms/step - accuracy: 0.8393 - loss: 0.5768 - val_accuracy: 0.8413 - val_loss: 0.5579
Epoch 7/10
[1m450/450[0m [32m━━━━━━━

● **Divide the dataset into training and test sets.**

In [8]:
from sklearn.model_selection import train_test_split

# Split the dataset: 80% training, 20% testing
X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y_encoded, test_size=0.2, random_state=42, stratify=y_encoded
)

print(f"Training set size: {X_train.shape[0]} samples")
print(f"Test set size: {X_test.shape[0]} samples")

Training set size: 16000 samples
Test set size: 4000 samples


● **Train your model on the training set and then use it to make predictions on the test set.**

In [9]:
# Train model
model.fit(X_train, y_train, epochs=10, batch_size=32, validation_split=0.1)

# Predict on test data
y_pred_probs = model.predict(X_test)
y_pred = y_pred_probs.argmax(axis=1)

# Print first 10 predictions
print("First 10 Predictions:", y_pred[:10])

Epoch 1/10
[1m450/450[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 3ms/step - accuracy: 0.8847 - loss: 0.4092 - val_accuracy: 0.8906 - val_loss: 0.3887
Epoch 2/10
[1m450/450[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 2ms/step - accuracy: 0.8932 - loss: 0.3799 - val_accuracy: 0.8944 - val_loss: 0.3764
Epoch 3/10
[1m450/450[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 2ms/step - accuracy: 0.8988 - loss: 0.3571 - val_accuracy: 0.9031 - val_loss: 0.3599
Epoch 4/10
[1m450/450[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 2ms/step - accuracy: 0.9002 - loss: 0.3436 - val_accuracy: 0.9044 - val_loss: 0.3444
Epoch 5/10
[1m450/450[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 2ms/step - accuracy: 0.9070 - loss: 0.3249 - val_accuracy: 0.9013 - val_loss: 0.3336
Epoch 6/10
[1m450/450[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 2ms/step - accuracy: 0.9139 - loss: 0.3097 - val_accuracy: 0.9112 - val_loss: 0.3224
Epoch 7/10
[1m450/450[0m 

### **3. Hyperparameter Tuning:**

● **Modify various hyperparameters, such as the number of hidden layers, neurons per hidden layer, activation functions, and learning rate, to observe their impact on model performance**

In [10]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, LabelEncoder
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam

# Load dataset
df = pd.read_csv("Alphabets_data.csv")

# Preprocessing
X = data.drop('letter', axis=1)
y = data['letter']

encoder = LabelEncoder()
y_encoded = encoder.fit_transform(y)

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y_encoded, test_size=0.2, random_state=42, stratify=y_encoded
)

# === MODIFY THESE HYPERPARAMETERS ===
hidden_layers = 2
neurons_per_layer = 128
activation_function = 'tanh'
learning_rate = 0.001
epochs = 15
batch_size = 64

# Build model
model = Sequential()
model.add(Dense(neurons_per_layer, activation=activation_function, input_shape=(X_train.shape[1],)))

# Add more hidden layers
for _ in range(hidden_layers - 1):
    model.add(Dense(neurons_per_layer, activation=activation_function))

# Output layer
model.add(Dense(26, activation='softmax'))

# Compile model with custom learning rate
optimizer = Adam(learning_rate=learning_rate)
model.compile(optimizer=optimizer, loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Train
history = model.fit(X_train, y_train, validation_split=0.1, epochs=epochs, batch_size=batch_size)

# Evaluate
test_loss, test_accuracy = model.evaluate(X_test, y_test)
print(f"\nTest Accuracy: {test_accuracy:.4f}")

Epoch 1/15


  super().__init__(activity_regularizer=activity_regularizer, **kwargs)


[1m225/225[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 3ms/step - accuracy: 0.4719 - loss: 1.9993 - val_accuracy: 0.7344 - val_loss: 0.9995
Epoch 2/15
[1m225/225[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 3ms/step - accuracy: 0.7613 - loss: 0.9037 - val_accuracy: 0.7894 - val_loss: 0.7877
Epoch 3/15
[1m225/225[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 3ms/step - accuracy: 0.7978 - loss: 0.7301 - val_accuracy: 0.8331 - val_loss: 0.6549
Epoch 4/15
[1m225/225[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 3ms/step - accuracy: 0.8378 - loss: 0.5953 - val_accuracy: 0.8550 - val_loss: 0.5620
Epoch 5/15
[1m225/225[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 3ms/step - accuracy: 0.8614 - loss: 0.4950 - val_accuracy: 0.8687 - val_loss: 0.4926
Epoch 6/15
[1m225/225[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 3ms/step - accuracy: 0.8721 - loss: 0.4524 - val_accuracy: 0.8819 - val_loss: 0.4383
Epoch 7/15
[1m225/225[0m [32m━━━━━━━

● **Adopt a structured approach like grid search or random search for hyperparameter tuning, documenting your methodology thoroughly.**

In [11]:
!pip install scikit_learn



In [12]:
import numpy as np
import pandas as pd
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load dataset
df = pd.read_csv("Alphabets_data.csv")
X = df.drop(columns=['letter'])
y = df['letter']

# Split the dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize model
rf = RandomForestClassifier()

# Define hyperparameter grid
param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [None, 10, 20],
    'min_samples_split': [2, 5, 10]
}

# Define hyperparameter distribution for Randomized Search
param_dist = {
    'n_estimators': np.arange(50, 201, 50),
    'max_depth': [None, 10, 20, 30],
    'min_samples_split': [2, 5, 10, 15]
}

# Randomized Search CV
random_search = RandomizedSearchCV(rf, param_dist, cv=5, scoring='accuracy', verbose=2, n_iter=10)
random_search.fit(X_train, y_train)

print("Best parameters (Randomized Search):", random_search.best_params_)
print("Best score (Randomized Search):", random_search.best_score_)

# Evaluate the best model from Randomized Search
best_rf_random = random_search.best_estimator_
y_pred_random = best_rf_random.predict(X_test)
accuracy_random = accuracy_score(y_test, y_pred_random)
print("Test Accuracy (Randomized Search best model):", accuracy_random)


Fitting 5 folds for each of 10 candidates, totalling 50 fits
[CV] END max_depth=None, min_samples_split=10, n_estimators=100; total time=   2.3s
[CV] END max_depth=None, min_samples_split=10, n_estimators=100; total time=   4.9s
[CV] END max_depth=None, min_samples_split=10, n_estimators=100; total time=   3.5s
[CV] END max_depth=None, min_samples_split=10, n_estimators=100; total time=   3.2s
[CV] END max_depth=None, min_samples_split=10, n_estimators=100; total time=   6.2s
[CV] END max_depth=30, min_samples_split=10, n_estimators=150; total time=   2.7s
[CV] END max_depth=30, min_samples_split=10, n_estimators=150; total time=   2.7s
[CV] END max_depth=30, min_samples_split=10, n_estimators=150; total time=   2.6s
[CV] END max_depth=30, min_samples_split=10, n_estimators=150; total time=   3.1s
[CV] END max_depth=30, min_samples_split=10, n_estimators=150; total time=   3.1s
[CV] END max_depth=10, min_samples_split=5, n_estimators=150; total time=   1.9s
[CV] END max_depth=10, min_s

### **4. Evaluation:**

● **Employ suitable metrics such as accuracy, precision, recall, and F1-score to evaluate your model's performance.**

In [13]:
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, classification_report
from sklearn.preprocessing import LabelEncoder  # Import LabelEncoder

# Assuming y_test is your original string labels (e.g., 'A', 'B', 'C')
# and y_pred is the numerical predictions from your model (e.g., 0, 1, 2)

# Create a LabelEncoder
encoder = LabelEncoder()

# Fit the encoder to your original labels (y from your dataset)
encoder.fit(y)  # Assuming y is your original target variable

# Transform y_test to numerical labels using the fitted encoder
y_test_encoded = encoder.transform(y_test)

# Now, calculate your metrics using the encoded y_test:
acc = accuracy_score(y_test_encoded, y_pred)
prec = precision_score(y_test_encoded, y_pred, average='weighted')
rec = recall_score(y_test_encoded, y_pred, average='weighted')
f1 = f1_score(y_test_encoded, y_pred, average='weighted')

print(f"Accuracy:  {acc:.4f}")
print(f"Precision: {prec:.4f}")
print(f"Recall:    {rec:.4f}")
print(f"F1 Score:  {f1:.4f}")

print("\nClassification Report:\n", classification_report(y_test_encoded, y_pred))

Accuracy:  0.0400
Precision: 0.0402
Recall:    0.0400
F1 Score:  0.0400

Classification Report:
               precision    recall  f1-score   support

           0       0.04      0.04      0.04       149
           1       0.02      0.03      0.03       153
           2       0.06      0.07      0.06       137
           3       0.05      0.05      0.05       156
           4       0.04      0.05      0.05       141
           5       0.05      0.05      0.05       140
           6       0.05      0.04      0.05       160
           7       0.02      0.02      0.02       144
           8       0.04      0.03      0.04       146
           9       0.06      0.06      0.06       149
          10       0.03      0.03      0.03       130
          11       0.03      0.03      0.03       155
          12       0.02      0.02      0.02       168
          13       0.04      0.04      0.04       151
          14       0.05      0.05      0.05       145
          15       0.04      0.03     

● **Discuss the performance differences between the model with default hyperparameters and the tuned model, emphasizing the effects of hyperparameter tuning.**

**DEFAULT HYPERPARAMETERS:**

In the default configuration of the Artificial Neural Network (ANN) model, a single hidden layer with 64 neurons was used, which is a common starting point in neural network design. The activation function applied in this hidden layer was ReLU (Rectified Linear Unit), chosen for its efficiency and ability to mitigate the vanishing gradient problem, thus enabling faster training. The output layer employed the softmax activation function, suitable for multi-class classification tasks such as identifying letters from A to Z, as it converts raw scores into probabilities across the 26 classes.

**TUNED MODEL:**

The tuned ANN model was designed by systematically adjusting key hyperparameters to better fit the complexity of the alphabet classification task. Instead of a single hidden layer, the tuned model used two hidden layers, providing the network with greater depth to learn more abstract features and patterns in the data. Each hidden layer was configured with 128 neurons—double the amount in the default setup—offering increased model capacity to capture complex relationships between input features.

**DIFFERENCES:**

Default hyperparameters are the initial settings provided by a machine learning library or framework for training a model, often chosen to work reasonably well across a wide range of problems. These parameters include learning rate, batch size, number of epochs, regularization strength, and more. In contrast, a tuned model has undergone a process known as hyperparameter tuning, where these settings are adjusted—often through methods like grid search, random search, or Bayesian optimization—to improve the model’s performance on a specific dataset. While default hyperparameters provide a convenient starting point, a tuned model typically achieves better accuracy, generalization, and efficiency, as its parameters are customized to fit the particular characteristics of the task at hand.

**Key Observations:**

1.Accuracy Improved:

	•	The tuned model correctly classified a higher proportion of test samples.
	•	Changes like more neurons, an extra hidden layer, or different learning rates likely helped.

2.Precision & Recall Gains:

	•	With better optimization, the model made more confident and accurate predictions.
	•	Precision improved especially if the model stopped misclassifying similar-looking characters.

3.F1-Score Balance:

	•	F1-score rose, indicating a good balance of precision and recall.
	•	This is critical in classification tasks with multiple classes like letters A–Z.