###  Neural Network Tutorial for TMNIST Alphabet Character Classification

This guide demonstrates how to build and train a simple convolutional neural network to classify characters from the TMNIST Alphabet dataset, which covers 94 distinct characters rendered in a variety of fonts.

###  Abstract

The aim of this tutorial is to guide you through the steps needed to create and train a neural network that can accurately recognize and classify characters from the TMNIST Alphabet dataset.

###  Introduction

The TMNIST Alphabet dataset is a comprehensive collection of 281,000 grayscale images, each of 28x28 pixel resolution, representing one of 94 different characters. This dataset is an excellent tool for training and evaluating machine learning models that perform character recognition tasks.

Characters included range from alphanumeric ('0'-'9', 'a'-'z', 'A'-'Z') to special symbols like punctuation marks and other common keyboard characters.

###  Dataset Overview

Here's a closer look at what the TMNIST Alphabet dataset includes:

- **Characters**: Includes a diverse set of 94 characters:
  - **Numbers**: '0' to '9'
  - **Lowercase letters**: 'a' to 'z'
  - **Uppercase letters**: 'A' to 'Z'
  - **Special characters**: `` ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~ ``

- **Data Format**: The dataset is provided in a CSV file format with the following structure:
  - **Header**: The first row provides column headers which include `names`, `labels`, and pixel columns `1` through `784`.
  - **Names Column**: Lists the font names used for each character image, such as `Acme-Regular` or `ZillaSlab-Bold`.
  - **Labels Column**: Identifies the character represented in each image.
  - **Pixel Columns**: Each of the 784 columns represents a pixel value in the 28x28 image.
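
As a rough illustration of this layout, the sketch below builds a tiny in-memory table with the same column structure and turns one row back into a 28x28 array. The pixel values are random and the variable names (`sample_frame`, `first_image`) are made up for the example; only the column names `names`, `labels`, and `1` through `784` come from the description above.

```python
import numpy as np
import pandas as pd

# Build a two-row stand-in for the CSV: 'names', 'labels', then pixel columns '1'..'784'.
# The font names are the examples quoted above; the pixel values are synthetic.
pixel_columns = [str(i) for i in range(1, 785)]
rng = np.random.default_rng(0)
pixels = rng.integers(0, 256, size=(2, 784))

sample_frame = pd.DataFrame(pixels, columns=pixel_columns)
sample_frame.insert(0, 'labels', ['A', '7'])
sample_frame.insert(0, 'names', ['Acme-Regular', 'ZillaSlab-Bold'])

# Recover the image for the first row: keep the pixel columns, then reshape 784 -> 28x28
first_image = sample_frame[pixel_columns].iloc[0].to_numpy(dtype='uint8').reshape(28, 28)

print(sample_frame.shape)  # (2, 786): names + labels + 784 pixels
print(first_image.shape)   # (28, 28)
```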



## Image Classification on TMNIST Dataset Using CNN

### 1. Input Image 
- **Description**: Start with an input image from the TMNIST dataset: a 28x28 pixel grayscale image of a character rendered in a particular font.

### 2. Convolutional Layer 🌀
- **Description**: Apply several filters to the image to create feature maps. These filters detect various features such as edges, textures, and shapes specific to characters.
- **Emoji Key**: 🌀 (represents the action of filtering across the image)
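
To make the filtering step concrete, here is a minimal NumPy sketch of a single filter sliding over a toy image. It assumes no padding and a stride of 1, and the vertical-edge filter is hand-picked for illustration; a trained convolutional layer learns its filter values instead. Like deep learning frameworks, it computes a cross-correlation (the kernel is not flipped).

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kernel_h, kernel_w = kernel.shape
    out_h = image.shape[0] - kernel_h + 1
    out_w = image.shape[1] - kernel_w + 1
    feature_map = np.zeros((out_h, out_w), dtype='float32')
    for row in range(out_h):
        for col in range(out_w):
            patch = image[row:row + kernel_h, col:col + kernel_w]
            feature_map[row, col] = np.sum(patch * kernel)
    return feature_map

# Toy 6x6 "image": dark left half (0), bright right half (1)
toy_image = np.zeros((6, 6), dtype='float32')
toy_image[:, 3:] = 1.0

# Hand-picked filter that responds to dark-to-bright vertical edges
vertical_edge_kernel = np.array([[-1, 0, 1],
                                 [-1, 0, 1],
                                 [-1, 0, 1]], dtype='float32')

edge_map = convolve2d_valid(toy_image, vertical_edge_kernel)
print(edge_map.shape)        # (4, 4)
print(edge_map[0].tolist())  # [0.0, 3.0, 3.0, 0.0]
```

The response is non-zero only where the 3x3 window straddles the boundary between the dark and bright halves.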

### 3. Activation Function ⚡
- **Description**: Implement a ReLU (Rectified Linear Unit) activation function to introduce non-linearity, enabling the network to learn complex patterns.
- **Emoji Key**: ⚡ (indicates activation 'firing' upon detecting significant features)
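
ReLU itself is a one-line operation. The sketch below applies it to a handful of made-up feature values: negative responses are set to zero and positive ones pass through unchanged.

```python
import numpy as np

def relu(values):
    """Return max(0, x) element-wise."""
    return np.maximum(values, 0)

feature_values = np.array([-3.0, -0.5, 0.0, 2.0, 7.5])
activated = relu(feature_values)
print(activated.tolist())  # [0.0, 0.0, 0.0, 2.0, 7.5]
```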

### 4. Pooling Layer 🎱
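- **Description**: Use max pooling to reduce the spatial size of each feature map. Keeping only the strongest response in each small window makes feature detection more tolerant of small shifts in position and lowers the amount of computation in later layers.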
- **Emoji Key**: 🎱 (signifies selecting the strongest feature signals like choosing the best ball in pool)
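
A minimal sketch of 2x2 max pooling with stride 2 on a toy 4x4 feature map follows. The values are made up, and the helper assumes the height and width are even.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2; assumes even height and width."""
    height, width = feature_map.shape
    # Split the map into non-overlapping 2x2 blocks, then take the max of each block
    blocks = feature_map.reshape(height // 2, 2, width // 2, 2)
    return blocks.max(axis=(1, 3))

feature_map = np.array([[1, 3, 2, 0],
                        [4, 2, 1, 1],
                        [0, 0, 5, 6],
                        [1, 2, 7, 8]])
pooled = max_pool_2x2(feature_map)
print(pooled.tolist())  # [[4, 2], [2, 8]]
```

Each output value is the largest entry of one 2x2 block, so the 4x4 map shrinks to 2x2.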

### 5. Flattening Layer 🔨
- **Description**: Flatten the pooled feature maps into a single long vector to prepare them for the fully connected layer.
- **Emoji Key**: 🔨 (depicts the action of flattening the data into a 1D vector)

### 6. Fully Connected Layer (Dense Layer) 🌐
- **Description**: A dense layer that connects every input to every output within its layer, responsible for classifying the image into one of the 94 character categories based on the learned features.
- **Emoji Key**: 🌐 (illustrates a dense network of connections)
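
The flattening and dense steps amount to a reshape followed by a matrix multiplication. The sketch below uses hypothetical sizes (four 3x3 pooled maps) and random weights purely to show the shapes involved; in a trained network the weights are learned.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pooled output: 4 feature maps of size 3x3 (sizes chosen for illustration)
pooled_maps = rng.random((3, 3, 4))

# Flatten: (3, 3, 4) -> (36,)
flat_vector = pooled_maps.reshape(-1)

# Dense layer: each of the 36 inputs connects to each of the 94 outputs
num_classes = 94
weights = rng.normal(scale=0.1, size=(flat_vector.size, num_classes))
biases = np.zeros(num_classes)
class_scores = flat_vector @ weights + biases

print(flat_vector.shape, class_scores.shape)  # (36,) (94,)
```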

### 7. Output Layer 🎯
- **Description**: The final layer uses softmax activation to output a probability distribution over the 94 classes, where each class corresponds to a different character.
- **Emoji Key**: 🎯 (targets the most likely character prediction)

### 8. Prediction 🏆
- **Description**: The character corresponding to the highest probability is selected as the predicted output, completing the classification process.
- **Emoji Key**: 🏆 (celebrates the successful prediction of the character)
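
The last two steps can be sketched in a few lines of NumPy. The example uses a hypothetical four-class problem with made-up scores; the model in this tutorial produces 94 scores instead.

```python
import numpy as np

def softmax(scores):
    """Convert raw scores to probabilities; subtracting the max avoids overflow."""
    shifted = scores - np.max(scores)
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum()

# Hypothetical 4-class example
class_names = ['A', 'B', 'C', 'D']
raw_scores = np.array([1.0, 3.0, 0.5, 2.0])

probabilities = softmax(raw_scores)
predicted_character = class_names[int(np.argmax(probabilities))]

print(round(float(probabilities.sum()), 6))  # 1.0
print(predicted_character)                   # B
```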


In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

## Setting Up Our Environment
Here, we discuss the necessary Python libraries and tools required for this project, including TensorFlow, Keras, NumPy, and Pandas. Installation commands and verification of the environment setup are covered to ensure all necessary tools are ready for use.

In [None]:
!pip install numpy
!pip install tensorflow
!pip install keras
!pip install pandas
!pip install seaborn

In [None]:
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Conv2D, Flatten, MaxPooling2D
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.optimizers import Adam
from sklearn.preprocessing import OneHotEncoder
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelBinarizer
from sklearn.utils import shuffle

import matplotlib.pyplot as plt
%matplotlib inline
import numpy as np
import pandas as pd
import seaborn as sns
import warnings
warnings.filterwarnings("ignore")

# Initialize TensorFlow and Keras components
model = Sequential([
    Conv2D(32, kernel_size=3, activation='relu', input_shape=(28,28,1)),
    MaxPooling2D(pool_size=2),
    Flatten(),
    Dense(128, activation='relu'),
    Dense(94, activation='softmax')
])

# Configure the model's optimizer and metrics
model.compile(optimizer=Adam(), loss='categorical_crossentropy', metrics=['accuracy'])

# Data preparation utilities
# Note: scikit-learn 1.2 renamed the `sparse` argument to `sparse_output`
encoder = OneHotEncoder(sparse_output=False)
label_binarizer = LabelBinarizer()


## Data Loading
This step involves loading the TMNIST dataset from a CSV file. We demonstrate how to read the data using Pandas, display the structure of the dataset, and verify that the data is loaded correctly by viewing the first few rows.

In [None]:
# Import the necessary libraries for data handling
import pandas as pd

# Define the path to the dataset, adjust as necessary for your specific environment
data_file_path = '/kaggle/input/tmnist-alphabet-94-characters/94_character_TMNIST.csv'

# Load the dataset from the specified file path
data_frame = pd.read_csv(data_file_path)

# Display the first five rows of the dataset to verify it's loaded correctly
data_frame.head(5)


## Data Exploration
In this section, we extract and display all unique values in the `labels` column, which are the distinct characters (classes) in the dataset. We also calculate and print the total number of distinct classes, which defines the scope of the classification task.



In [None]:
# Extract and display the distinct characters from the 'labels' column
distinct_labels = data_frame['labels'].unique()
print("Distinct characters in dataset:", distinct_labels)

# Determine and print the count of unique character classes in the dataset
class_count = data_frame['labels'].nunique()
print(f"Total number of distinct character classes: {class_count}")


## Data Cleaning
Although the TMNIST dataset is typically clean, this step involves checking for any missing or inconsistent data. We would handle missing values, incorrect labels, and any anomalies found in the dataset. Here, we also ensure that all feature data is numeric, converting any non-numeric entries and handling missing values appropriately.
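
A minimal sketch of such a check is shown below on a made-up toy table, since the real file is expected to be clean. The frame, its values, and the two-column `pixel_columns` list are assumptions for illustration; filling with zero is the same policy applied to the real data later in this notebook.

```python
import numpy as np
import pandas as pd

# Toy stand-in with the same kinds of columns as the real data (values are made up)
toy_frame = pd.DataFrame({
    'names': ['Acme-Regular', 'ZillaSlab-Bold', 'Acme-Regular'],
    'labels': ['a', 'b', 'a'],
    '1': [0, 255, np.nan],   # one missing pixel value
    '2': [12, 'x', 40],      # one non-numeric entry
})
pixel_columns = ['1', '2']

# Coerce pixel columns to numbers; anything that cannot be parsed becomes NaN
pixels = toy_frame[pixel_columns].apply(pd.to_numeric, errors='coerce')

# Count problem cells per column before deciding how to handle them
missing_per_column = {col: int(count) for col, count in pixels.isna().sum().items()}
print(missing_per_column)  # {'1': 1, '2': 1}

# One simple policy: treat missing pixels as background (0)
clean_pixels = pixels.fillna(0)
```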



In [None]:
# Separate the feature data and target labels from the dataset
features = data_frame.drop(columns=['labels'])  # Remove the 'labels' column to isolate the features
targets = data_frame['labels']                  # Isolate the 'labels' column as the target variable


In [None]:
# The first remaining column is 'names' (the font name), not a pixel value, so drop it
feature_data = features.iloc[:, 1:]  # Keep only the 784 pixel columns

# Setting up the plotting area for displaying images
plot_area, plot_axes = plt.subplots(nrows=8, ncols=8, figsize=(20, 10))
plot_area.suptitle("Sample Character Images from TMNIST", fontsize=20)

# Display each character image using a 'viridis' colormap for better visualization
for index, axis in enumerate(plot_axes.flat):
    # Reshape feature data into 28x28 array for image display
    character_image = axis.imshow(feature_data.values[index].reshape(28, 28), cmap='viridis')
    axis.set_title(str(targets.iloc[index]), fontsize=14)

# Adjust subplot layout to prevent overlap and ensure clarity
plt.subplots_adjust(hspace=0.4, wspace=0.4)

# Include a colorbar to illustrate the color mapping used in images
color_bar = plot_area.colorbar(character_image, ax=plot_axes.ravel().tolist(), orientation='horizontal', fraction=0.05, pad=0.1)
color_bar.ax.tick_params(labelsize=14)

# Render the visualizations
plt.show()


In [None]:
X = data_frame.drop(columns=['labels'])
y = data_frame['labels']

In [None]:
X_train, X_test, y_train, y_test = train_test_split( X, y, test_size=0.2, random_state=32,stratify=y)

In [None]:
!pip install squarify


In [None]:
import squarify  # Treemap plotting library installed in the previous cell

# Calculate counts for each class
train_counts = y_train.value_counts().sort_index()
test_counts = y_test.value_counts().sort_index()

# Normalize counts to fit the area of the treemap (area is typically set to 100 or 1 for simplicity)
train_norm = train_counts / train_counts.sum() * 100
test_norm = test_counts / test_counts.sum() * 100

# Create a DataFrame for better handling
train_df = pd.DataFrame({'label': train_counts.index, 'size': train_norm})
test_df = pd.DataFrame({'label': test_counts.index, 'size': test_norm})

# Plotting
fig, ax = plt.subplots(2, 1, figsize=(16, 12))

# Treemap for training data
squarify.plot(sizes=train_df['size'], label=train_df['label'], alpha=.8, ax=ax[0])
ax[0].set_title('Treemap of Class Distribution in Train Data')
ax[0].axis('off')  # Hide the axes

# Treemap for testing data
squarify.plot(sizes=test_df['size'], label=test_df['label'], alpha=.8, ax=ax[1])
ax[1].set_title('Treemap of Class Distribution in Test Data')
ax[1].axis('off')  # Hide the axes

plt.tight_layout()
plt.show()


## Normalization
Pixel values are scaled from the range 0 to 255 down to the range 0 to 1. Keeping all inputs on the same small scale generally makes training faster and more stable.
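
A minimal sketch of this scaling on four made-up 8-bit pixel values follows. The cast to `float32` is a choice made here to match the default floating-point type used by Keras.

```python
import numpy as np

# Four made-up 8-bit pixel values spanning the possible range
raw_pixels = np.array([0, 64, 128, 255], dtype='uint8')

# Cast to float32, then divide by the maximum 8-bit value
scaled_pixels = raw_pixels.astype('float32') / 255

print(scaled_pixels.min(), scaled_pixels.max())  # 0.0 1.0
```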


In [None]:
# Select numeric columns only from both training and testing data using a broad numeric type filter
X_train_numeric = X_train.select_dtypes(include=[np.number])  # Simplifies the selection to include all numeric types
X_test_numeric = X_test.select_dtypes(include=[np.number])

# Ensure all data in these columns is numeric, converting any non-numeric entries to NaN
X_train_numeric = X_train_numeric.apply(lambda x: pd.to_numeric(x, errors='coerce'))
X_test_numeric = X_test_numeric.apply(lambda x: pd.to_numeric(x, errors='coerce'))

# Replace any NaN values that resulted from the conversion with zero
X_train_numeric.fillna(0, inplace=True)
X_test_numeric.fillna(0, inplace=True)

# Normalize the numeric data to range between 0 and 1 for use in machine learning models
X_train_normalized = X_train_numeric.astype('float32') / 255
X_test_normalized = X_test_numeric.astype('float32') / 255

In [None]:
# Extract the first few pixel features for visualization (numeric columns only, so 'names' is excluded)
first_few_features = X_train.select_dtypes(include=[np.number]).iloc[:, :4]  # Adjust the number of features as needed

# Create one scatter plot per pair of features
plt.figure(figsize=(12, 8))  # Adjust the figure size for better clarity
n_features = len(first_few_features.columns)
plot_number = 1  # Running subplot position; 4 features give 6 pairs, which fill the 2x3 grid
for i in range(n_features):
    for j in range(i + 1, n_features):
        plt.subplot(2, 3, plot_number)
        plt.scatter(first_few_features.iloc[:, i], first_few_features.iloc[:, j], alpha=0.5)
        plt.xlabel(first_few_features.columns[i])
        plt.ylabel(first_few_features.columns[j])
        plot_number += 1
plt.suptitle('Scatter Plots of the First Few Features', size=20, y=1.02)
plt.tight_layout()
plt.show()


In [None]:
# Filter out non-numeric columns to include only numeric data for correlation analysis, considering the first 10 numeric columns
numeric_cols = X_train.select_dtypes(include=[np.number]).iloc[:, :10]

# Compute the correlation matrix for the selected numeric features
correlation_matrix = numeric_cols.corr()

# Visualize the correlations using a heatmap
plt.figure(figsize=(10, 8))  # Adjust the figure size for clarity
sns.heatmap(correlation_matrix, annot=True, fmt=".2f", linewidths=.5, cmap='viridis', vmin=-1, vmax=1)
plt.title('Correlation Matrix of Numeric Features', fontsize=18)
plt.show()


In [None]:
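# One-hot encode the labels (scikit-learn 1.2 renamed the `sparse` argument to `sparse_output`)
enc = OneHotEncoder(sparse_output=False, handle_unknown='ignore')
y_train_encoded = enc.fit_transform(y_train.values.reshape(-1, 1))
y_test_encoded = enc.transform(y_test.values.reshape(-1, 1))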

In [None]:
# Sum up occurrences of each class in the encoded arrays
y_train_sum = y_train_encoded.sum(axis=0)
y_test_sum = y_test_encoded.sum(axis=0)

# Get the class labels from the encoder
class_labels = enc.get_feature_names_out(input_features=['Class'])

def plot_grouped_bar_chart(class_sums_train, class_sums_test, class_labels, title):
    """
    Plot the class distributions as a grouped bar chart.
    
    Parameters:
    - class_sums_train: Array of sums of occurrences for each class in the training set
    - class_sums_test: Array of sums of occurrences for each class in the testing set
    - class_labels: Labels of the classes
    - title: Title for the plot
    """
    fig, ax = plt.subplots(figsize=(14, 8))
    bar_width = 0.35  # Width of the bars
    index = np.arange(len(class_labels))  # The label locations
    
    # Creating bars for training and testing sets
    bars_train = ax.bar(index - bar_width/2, class_sums_train, bar_width, label='Training', color='skyblue')
    bars_test = ax.bar(index + bar_width/2, class_sums_test, bar_width, label='Testing', color='orange')
    
    ax.set_title(title, fontsize=20)
    ax.set_xlabel('Classes', fontsize=16)
    ax.set_ylabel('Occurrences', fontsize=16)
    ax.set_xticks(index)
    ax.set_xticklabels(class_labels, rotation=45, ha="right")
    ax.legend()

    # Adding the text labels on the bars
    for bars in [bars_train, bars_test]:
        for bar in bars:
            height = bar.get_height()
            ax.annotate('{}'.format(height),
                        xy=(bar.get_x() + bar.get_width() / 2, height),
                        xytext=(0, 3),  # 3 points vertical offset
                        textcoords="offset points",
                        ha='center', va='bottom')

    plt.tight_layout()
    plt.show()

# Assuming 'y_train_sum', 'y_test_sum', and 'class_labels' are defined as from previous implementations
plot_grouped_bar_chart(y_train_sum, y_test_sum, class_labels, 'Class Distribution in Training and Testing Set')


## Overview of Configuring and Training a CNN with TensorFlow Keras
This guide provides a comprehensive overview of preparing, configuring, and training a convolutional neural network (CNN) using TensorFlow's Keras API for image classification tasks.

**Model Configuration:**
The CNN model is set up using a sequential architecture, which includes multiple layers:

- **Reshape Layer:** Adjusts input dimensions to suit the model's requirements.
- **Convolutional Layers (Conv2D):** Extract features from the input images using filters, applying ReLU activation.
- **Batch Normalization:** Stabilizes learning by normalizing the activations of the previous layer.
- **Max Pooling Layers:** Reduce the spatial dimensions, which lowers computational cost and helps limit overfitting.
- **Dropout Layers:** Help prevent overfitting by randomly setting a fraction of input units to 0 at each training update.
- **Flatten Layer:** Converts the pooled feature maps into a single vector that is passed to the fully connected layers.
- **Dense Layers:** Fully connected layers that learn non-linear combinations of the high-level features extracted by the convolutional layers.
- **Output Layer:** Uses softmax activation to output a probability distribution over the target classes.
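
To see how these layers change the data's shape, the sketch below traces a 28x28 input through three convolution-plus-pooling stages and counts the convolutional parameters. It assumes 3x3 kernels, 'same' padding, 2x2 pooling, and filter counts of 64, 128, and 256, taken from the model defined later in this notebook; the helper name `conv_params` is made up for the example.

```python
def conv_params(kernel_size, in_channels, filters):
    """Weights plus one bias per filter for a square-kernel Conv2D layer."""
    return (kernel_size * kernel_size * in_channels + 1) * filters

side, channels = 28, 1          # 28x28 grayscale input
total_conv_params = 0
for filters in (64, 128, 256):  # filter counts assumed from the model defined later
    # 'same' padding keeps the side length unchanged through the convolution
    total_conv_params += conv_params(3, channels, filters)
    side //= 2                  # 2x2 max pooling halves the side, rounding down
    channels = filters

flattened_size = side * side * channels
print(side, flattened_size)     # 3 2304
print(total_conv_params)        # 369664
```

The side length goes 28, 14, 7, 3, so the Flatten layer hands a vector of 3 x 3 x 256 = 2304 values to the first dense layer.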

**Compilation of the Model:**
The model is compiled with:

- **Optimizer:** Adam, an adaptive-learning-rate variant of stochastic gradient descent.
- **Loss Function:** Categorical crossentropy, suitable for multi-class classification with one-hot encoded labels.
- **Metrics:** Accuracy, to evaluate performance during training and testing.
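
As a rough illustration of what the loss measures, here is a NumPy sketch of categorical crossentropy on a made-up three-class batch. It is a simplified stand-in rather than the Keras implementation, and the `eps` clipping value is an assumption chosen to avoid taking the log of zero.

```python
import numpy as np

def categorical_crossentropy(one_hot_targets, predicted_probs, eps=1e-7):
    """Mean over the batch of -sum(target * log(prediction))."""
    clipped = np.clip(predicted_probs, eps, 1.0)
    per_sample = -np.sum(one_hot_targets * np.log(clipped), axis=1)
    return float(np.mean(per_sample))

targets = np.array([[1, 0, 0],
                    [0, 1, 0]])
confident = np.array([[0.9, 0.05, 0.05],
                      [0.1, 0.8, 0.1]])
unsure = np.array([[0.4, 0.3, 0.3],
                   [0.3, 0.4, 0.3]])

loss_confident = categorical_crossentropy(targets, confident)
loss_unsure = categorical_crossentropy(targets, unsure)
print(loss_confident < loss_unsure)  # True
```

Only the probability assigned to the correct class enters the sum, so confident correct predictions give a lower loss than hesitant ones.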

**Data Preparation:**

- **Encoding Labels:** Labels are transformed into a one-hot encoded format with a `OneHotEncoder` so that they match the shape of the output layer.
- **Feature Scaling:** Pixel values are divided by 255 so that they range between 0 and 1, which improves training efficiency.
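
To show what one-hot encoding produces, the sketch below encodes four made-up character labels with plain NumPy. It is only a stand-in to illustrate the output format; the notebook itself uses scikit-learn's `OneHotEncoder`.

```python
import numpy as np

labels = np.array(['b', 'a', '!', 'a'])

# np.unique returns the sorted classes and, for each label, the index of its class
classes, class_indices = np.unique(labels, return_inverse=True)

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(len(classes), dtype='float32')[class_indices]

print(classes.tolist())  # ['!', 'a', 'b']
print(one_hot.tolist())  # [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

# Taking the argmax of each row recovers the original labels
decoded = classes[one_hot.argmax(axis=1)]
```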

**Model Training:**

The model is trained on preprocessed images and labels, using a portion of the data for validation to monitor performance and avoid overfitting.
Training involves multiple epochs where the model iteratively adjusts its parameters (weights and biases) to minimize the loss.

**Model Evaluation and Visualization:**

The model's architecture can be visualized with `plot_model`, which saves a diagram of the layers to an image file.
After training, the model is evaluated on the training set as a sanity check that it has fit the data. Overfitting is assessed separately, by comparing the training metrics with the validation metrics recorded during training.
This workflow covers the complete process, from model architecture design through data preprocessing to training and evaluation, and can be adapted to other image classification problems.



In [None]:
# The only non-pixel column left in X_train and X_test is 'names' (the font name); drop it
X_train_pixels = X_train.drop(columns=['names'])
X_test_pixels = X_test.drop(columns=['names'])

# Reshape the 784 pixel columns to (28, 28) images and scale the values to the range 0 to 1
X_train_norm = X_train_pixels.values.reshape(-1, 28, 28).astype('float32') / 255
X_test_norm = X_test_pixels.values.reshape(-1, 28, 28).astype('float32') / 255



In [None]:
# Import necessary layers and functions from Keras
from tensorflow.keras.layers import Reshape, Conv2D, MaxPooling2D, Flatten, Dense, BatchNormalization, Dropout
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam

# Calculate the number of unique classes in the dataset
no_of_classes = data_frame['labels'].nunique()

# Define the model using the Sequential API, which allows adding layers in a stack
model_new = Sequential([
    # First reshape the input to ensure the model gets the right dimension
    Reshape((28,28,1), input_shape=(28,28)),
    
    # First convolutional layer with 64 filters of size 3x3, ReLU activation, and 'same' padding
    Conv2D(64, (3,3), activation='relu', padding='same'),
    
    # Batch normalization to maintain the mean output close to 0 and the output standard deviation close to 1
    BatchNormalization(),
    
    # Max pooling layer to reduce spatial dimensions (width and height)
    MaxPooling2D((2,2)),
    
    # Dropout layer to reduce overfitting by dropping out units randomly during training
    Dropout(0.25),
    
    # Second convolutional layer with 128 filters, ReLU activation, and 'same' padding
    Conv2D(128, (3,3), activation='relu', padding='same'),
    
    # Batch normalization layer
    BatchNormalization(),
    
    # Second max pooling layer
    MaxPooling2D((2,2)),
    
    # Dropout layer
    Dropout(0.25),
    
    # Third convolutional layer with 256 filters
    Conv2D(256, (3,3), activation='relu', padding='same'),
    
    # Batch normalization layer
    BatchNormalization(),
    
    # Third max pooling layer
    MaxPooling2D((2,2)),
    
    # Dropout layer
    Dropout(0.25),
    
    # Flatten the output of the previous layer to a single vector
    Flatten(),
    
    # Fully connected (dense) layer with 512 units and ReLU activation
    Dense(512, activation='relu'),
    
    # Batch normalization layer
    BatchNormalization(),
    
    # Dropout layer
    Dropout(0.5),
    
    # Another dense layer with 256 units
    Dense(256, activation='relu'),
    
    # Batch normalization layer
    BatchNormalization(),
    
    # Dropout layer
    Dropout(0.5),
    
    # Another dense layer with 128 units
    Dense(128, activation='relu'),
    
    # Batch normalization layer
    BatchNormalization(),
    
    # Dropout layer
    Dropout(0.5),
    
    # Dense layer with 32 units
    Dense(32, activation='relu'),
    
    # Output layer with a unit for each class, using softmax to achieve a probability distribution
    Dense(no_of_classes, activation='softmax')
])

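# Define the optimizer, Adam, with a learning rate of 0.001
# (the argument is `learning_rate`; the older `lr` alias is no longer accepted by recent Keras versions)
opt = Adam(learning_rate=0.001)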


In [None]:
from tensorflow.keras.utils import plot_model

model_new.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
model_new.summary()

# For a graphical representation
plot_model(model_new, to_file='model_new.png', show_shapes=True, show_layer_names=True)



In [None]:
# Encode the labels to one-hot format
# (scikit-learn 1.2 renamed the `sparse` argument to `sparse_output`)
encoder = OneHotEncoder(sparse_output=False)
y_encoded = encoder.fit_transform(data_frame['labels'].values.reshape(-1, 1))

# Keep only the 784 pixel columns
X = data_frame.drop(columns=['names', 'labels'])

# Convert to a NumPy array of 28x28 images and scale pixel values to the range 0 to 1.
# The model's Reshape layer adds the channel axis, so the array shape here is (n, 28, 28).
X = X.values.reshape(-1, 28, 28).astype('float32') / 255

# Now, your X_train_norm and X_test_norm should be set as follows:
X_train_norm, X_test_norm, y_train_encoded, y_test_encoded = train_test_split(X, y_encoded, test_size=0.2, random_state=32)

# You can now proceed to fit your model:
model_history = model_new.fit(X_train_norm, y_train_encoded, epochs=10, validation_data=(X_test_norm, y_test_encoded), verbose=2, batch_size=128)


In [None]:
# Evaluate the model on the training data
# 'evaluate' returns the loss value and metric values for the model in test mode
# 'batch_size' sets how many samples are processed per evaluation batch (no weights are updated here)
# 'verbose=0' means silent mode, no output will be printed during the evaluation process
score = model_new.evaluate(X_train_norm, y_train_encoded, batch_size=64, verbose=0)

# 'score[1]' is the accuracy because the model was compiled with metrics=['accuracy']
# This is accuracy on the training data; pass X_test_norm and y_test_encoded to measure held-out accuracy
print(f"Training Accuracy: {score[1] * 100:.2f}%")


In [None]:
# Extracting values from the history object
epochs = range(1, len(model_history.history['accuracy']) + 1)
train_accuracy = model_history.history['accuracy']
val_accuracy = model_history.history['val_accuracy']
train_loss = model_history.history['loss']
val_loss = model_history.history['val_loss']  # Extract validation loss

# Plotting Epoch vs Accuracy
plt.figure(figsize=(14, 8))
plt.subplot(2, 2, 1)  # 2 rows, 2 columns, 1st subplot
plt.plot(epochs, train_accuracy, 'bo-', label='Training Accuracy')
plt.plot(epochs, val_accuracy, 'ro-', label='Validation Accuracy')
plt.title('Epoch vs Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()

# Plotting Loss vs Accuracy
plt.subplot(2, 2, 2)  # 2 rows, 2 columns, 2nd subplot
plt.plot(train_loss, train_accuracy, 'go-', label='Training')
plt.title('Loss vs Accuracy')
plt.xlabel('Loss')
plt.ylabel('Accuracy')
plt.legend()

# Plotting Epoch vs Loss
plt.subplot(2, 2, 3)  # 2 rows, 2 columns, 3rd subplot
plt.plot(epochs, train_loss, 'b^-', label='Training Loss')
plt.plot(epochs, val_loss, 'r^-', label='Validation Loss')
plt.title('Epoch vs Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()

plt.tight_layout()  # Adjust layout to prevent overlap
plt.show()


## Conclusion
**Model Training and Validation Performance**

**Training and Validation Accuracy:** The model shows a steady increase in both training and validation accuracy over 10 epochs. Training accuracy rose from 64.5% in the first epoch to 92.36% by the 10th, while validation accuracy rose from 88.35% to 93.98%. This improvement suggests that the model is learning effectively from the training data and generalizing well to the unseen validation data. Validation accuracy sitting above training accuracy is common when dropout is used, since dropout is active only during training.

**Training and Validation Loss:** Both training and validation loss decreased significantly across epochs, indicating good convergence. The training loss dropped from 1.3018 to 0.2540, while validation loss decreased from 0.4076 to 0.1937. The closeness of the two values by the end of training indicates that the model is not overfitting significantly and maintains a good balance between bias and variance.

**Model Evaluation**

Evaluating on the training data after training gave an **accuracy of approximately 94.7%.** This is close to the final validation accuracy of 93.98%. It is this small gap, rather than the training accuracy by itself, that indicates the model generalizes beyond the training set. The validation split is the only held-out data used in this notebook, so no separate test-set figure is reported.

**Visualization Insights**

The plots of Epoch vs. Accuracy and Epoch vs. Loss provide visual confirmation of the model's steady improvement and stability. The plots do not show any erratic changes in accuracy or loss, which often indicate problems like learning rate issues or data inconsistencies.

The Loss vs. Accuracy plot for the training set, although less common in reporting results, shows that as the model's loss decreases, its ability to correctly classify the training data increases, as expected.

**Conclusions**

The architecture (depth and breadth of layers) and hyperparameters (learning rate, batch size) work well for this task, although no systematic tuning is shown in this notebook. The dropout and batch normalization layers likely helped limit overfitting and keep training stable.

There is room for potential improvement. Experimenting with further tuning of the learning rate, increasing the model complexity, or implementing advanced techniques like data augmentation or different regularization methods might yield even better results.

The CNN has proven to be effective for the classification of complex patterns in image data, as evidenced by its performance on the TMNIST dataset with 94 classes, making it a good choice for similar tasks in image recognition.

In summary, the CNN model has demonstrated strong performance characteristics in this scenario, and with minor adjustments, it could potentially be improved further. This model serves as a robust baseline for further experimentation and refinement.


## Citations and Licensing

Most of the techniques were adapted from the following notebook:

- https://www.kaggle.com/datasets/nikbearbrown/tmnist-alphabet-94-characters/code
- https://github.com/aiskunks/Skunks_Skool
- https://towardsdatascience.com/convolutional-neural-networks-explained-9cc5188c4939
- https://towardsdatascience.com/recurrent-neural-networks-rnns-3f06d7653a85




Copyright (c) 2023 Palak Rajdev

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
Disclaimer: The Software is provided "AS IS", without warranty of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose, and non-infringement. In no event shall the author or copyright holders be liable for any claim, damages, or other liability, whether in an action of contract, tort or otherwise, arising from, out of, or in connection with the Software or the use or other dealings in the Software.

