<a href="https://colab.research.google.com/github/Rohit-Saswadkar/Font-Text-classification-project-using-CNN/blob/main/1%5D_Font_detection_model.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Font detection using CNN

by Rohit Saswadkar
* email - rohit.saswadkar1@gmail.com
* Project Link - https://github.com/Rohit-Saswadkar/Font-Text-classification-project-using-CNN/tree/main
* Dataset download link - https://www.kaggle.com/datasets/muhammadardiputra/font-recognition-data/download?datasetVersionNumber=1
* Dataset website - https://www.kaggle.com/datasets/muhammadardiputra/font-recognition-data

In [None]:
!pip install tensorflow
!pip install opencv-python  # the cv2 module is provided by the opencv-python package

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


1] Import required libraries

In [None]:
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models
from sklearn.model_selection import train_test_split
import os
import cv2

2] Download the dataset from the link above and set the file path below to begin the training process.
* There are two datasets: 1] color and 2] grayscale. I trained on the grayscale one to reduce the computational load on my system. Point the path at the grayscale dataset.

In [None]:
# Load the font dataset
path = r"C:\Users\saswa\OneDrive\Documents\pandas data\data\Deep Learning\3 Font detection\Datase 22t\Font Dataset Large"

3] This function loads the font images and their corresponding labels from the specified directory. It takes the dataset directory as input, reads each image, and stores the images and their labels (the font folder names) in the images and labels lists.

* I used cv2 to convert the color images to grayscale, since only the shape of the text matters for this classification (it also reduces the computational load on the PC). The model can also be trained on color data: remove the line marked 1] below, set the model's input_shape to (32, 32, 3), and skip the expand_dims step. The resize call marked 2] stays the same, because cv2.resize takes only (width, height) and keeps the channels.


In [None]:
# load font images and labels
def load_data(data_directory):
    images = []
    labels = []
    font_names = os.listdir(data_directory)
    for font_name in font_names:
        font_dir = os.path.join(data_directory, font_name)
        if not os.path.isdir(font_dir):  # skip stray files next to the font folders
            continue
        for image_name in os.listdir(font_dir):
            image_path = os.path.join(font_dir, image_name)
            image = cv2.imread(image_path)
            if image is None:  # cv2.imread returns None for unreadable or non-image files
                continue
            image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)  # Convert to grayscale           ----------------1] remove this to train on color data
            image = cv2.resize(image, (32, 32))  # Resize to a uniform 32x32 size              ----------------2] keep (32, 32) for color data too; cv2.resize takes (width, height)
            images.append(image)
            labels.append(font_name)
    return np.array(images), np.array(labels)
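
The loader above assumes one sub-folder per font, with the folder name used as the label. A minimal sketch of that layout, built in a temporary directory with empty placeholder files (the font names `Arial` and `Verdana`, the file names, and the helper `list_labeled_files` are hypothetical and not taken from the dataset):

```python
import os
import tempfile


def list_labeled_files(data_directory):
    """Return (file_path, label) pairs, using each sub-folder name as the label."""
    pairs = []
    for font_name in sorted(os.listdir(data_directory)):
        font_dir = os.path.join(data_directory, font_name)
        if not os.path.isdir(font_dir):
            continue  # ignore stray files at the top level
        for image_name in sorted(os.listdir(font_dir)):
            pairs.append((os.path.join(font_dir, image_name), font_name))
    return pairs


# Build a tiny fake dataset: two font folders with two empty files each
root = tempfile.mkdtemp()
for font in ("Arial", "Verdana"):
    os.makedirs(os.path.join(root, font))
    for i in range(2):
        open(os.path.join(root, font, f"sample_{i}.png"), "w").close()

pairs = list_labeled_files(root)
labels_found = [label for _, label in pairs]
print(labels_found)
```

Every file inherits the name of its parent folder as its label, which is why the real dataset has to be unpacked with one folder per font.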

In [None]:
data_directory = path
images, labels = load_data(data_directory)

I applied one-hot encoding to the labels because there are multiple classes.

In [None]:
# Convert labels to one-hot encoding
label_to_index = {label: i for i, label in enumerate(np.unique(labels))}
index_to_label = {i: label for label, i in label_to_index.items()}
labels_encoded = np.array([label_to_index[label] for label in labels])
labels_one_hot = tf.keras.utils.to_categorical(labels_encoded)
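
To make the encoding step concrete, here is a small NumPy-only sketch of the same idea on four made-up labels. The font names are hypothetical, and `np.eye` stands in for `to_categorical` only for illustration:

```python
import numpy as np

labels = np.array(["Verdana", "Arial", "Times", "Arial"])  # hypothetical font names

classes = np.unique(labels)  # sorted unique labels: Arial, Times, Verdana
label_to_index = {label: i for i, label in enumerate(classes)}
labels_encoded = np.array([label_to_index[label] for label in labels])

# Row i of the identity matrix is the one-hot vector for class i
labels_one_hot = np.eye(len(classes), dtype="float32")[labels_encoded]

print(labels_encoded)
print(labels_one_hot)
```

Each row has a single 1 in the column of its class, so the number of columns equals the number of fonts.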

4] This code snippet splits the dataset into training, validation, and test sets using the train_test_split function.

In [None]:
# Split data into training, validation, and test sets
X_train, X_test, y_train, y_test = train_test_split(images, labels_one_hot, test_size=0.2, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.2, random_state=42)
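
Because the second split is applied to the 80% left after the first one, the final proportions are roughly 64% train, 16% validation, and 20% test. A small sketch of that arithmetic, assuming scikit-learn rounds the test share up and gives the remainder to the training set (the sample count of 1000 and the helper `split_sizes` are hypothetical):

```python
import math


def split_sizes(n_samples, test_size=0.2):
    """Return (n_remaining, n_held_out) for one split with a fractional test_size."""
    n_held_out = math.ceil(test_size * n_samples)  # assumed rounding: ceil for the held-out part
    return n_samples - n_held_out, n_held_out


n_total = 1000  # hypothetical dataset size
n_rest, n_test = split_sizes(n_total)   # first split: train+val vs test
n_train, n_val = split_sizes(n_rest)    # second split: train vs val

print(n_train, n_val, n_test)
```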

5] This code snippet defines the CNN model architecture.

* 1] The first convolution layer has 32 filters of size 3 * 3 that slide across the image to extract meaningful features.

* 2] I used max pooling with a 2 * 2 window to keep the strongest features, which lowers computation and helps reduce overfitting.

*  I used relu as the activation function in the hidden layers to introduce non-linearity into the model; it returns the value if it is positive and 0 if the value is negative.

* 3] I used a Flatten layer to flatten the feature maps into a 1D array.

* 4] I created a dense layer of 64 neurons to keep the computation low, as I trained this model on my laptop; a small layer also helps limit overfitting.

* 5] I used softmax in the output layer as it is the standard choice for multi-class classification. It converts the raw outputs of the network into class probabilities.
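
The three operations described in the list (relu, 2 * 2 max pooling, and softmax) can be sketched in plain NumPy. This is an illustration on toy values, not how Keras implements them internally:

```python
import numpy as np


def relu(x):
    """Keep positive values, replace negative values with 0."""
    return np.maximum(x, 0.0)


def softmax(logits):
    """Turn a vector of raw scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()


activated = relu(np.array([-2.0, 0.0, 3.0]))

# 2x2 max pooling on a 4x4 feature map: split into 2x2 blocks, keep each block's max
feature_map = np.arange(16).reshape(4, 4)
pooled = feature_map.reshape(2, 2, 2, 2).max(axis=(1, 3))

probs = softmax(np.array([2.0, 1.0, 0.1]))

print(activated)
print(pooled)
print(probs, probs.sum())
```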


In [None]:
# Define the CNN model architecture
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 1)), #  - 1]
    layers.MaxPooling2D((2, 2)), #                                            - 2]
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.Flatten(), #                                                       - 3]
    layers.Dense(64, activation='relu'), #                                    - 4]
    layers.Dense(len(label_to_index), activation='softmax')  #                - 5] Output layer with softmax activation
])
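
To see how the 32 * 32 input shrinks through this stack, the spatial sizes can be traced by hand. The sketch below assumes the Keras defaults used above (no padding and stride 1 for the convolutions, stride equal to the pool size for pooling); the helper names are hypothetical:

```python
def conv_out(size, kernel=3):
    """Output width of a 'valid' convolution with stride 1."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Output width of max pooling with stride equal to the pool size."""
    return size // pool


size = 32
trace = [size]
for layer in ("conv", "pool", "conv", "pool", "conv"):
    size = conv_out(size) if layer == "conv" else pool_out(size)
    trace.append(size)

# The last convolution has 64 filters, so Flatten yields size * size * 64 values
flat_features = size * size * 64

print(trace)
print(flat_features)
```

Under these assumptions the Flatten layer hands 1024 values to the 64-neuron dense layer; `model.summary()` can be used to confirm the shapes on the real model.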

6] This code snippet compiles the CNN model with the specified optimizer, loss function, and metrics.

* Used adam as the optimizer to adjust the weights during training, as it is robust to sparse or noisy gradients.

* I used categorical crossentropy because I applied one-hot encoding to the labels; with integer labels we would have to use sparse categorical crossentropy instead.

* Used accuracy as the performance monitoring metric during training, as this is a classification task.
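
The two loss variants mentioned above compute the same value and differ only in the label format they expect. A minimal NumPy sketch on made-up probabilities (a simplified version of the formula, not the Keras implementation):

```python
import numpy as np


def categorical_crossentropy(y_one_hot, probs, eps=1e-7):
    """Loss per sample for one-hot labels: -sum(y * log(p))."""
    probs = np.clip(probs, eps, 1.0)  # avoid log(0)
    return -np.sum(y_one_hot * np.log(probs), axis=-1)


def sparse_categorical_crossentropy(y_index, probs, eps=1e-7):
    """Loss per sample for integer labels: -log(p[true_class])."""
    probs = np.clip(probs, eps, 1.0)
    return -np.log(probs[np.arange(len(y_index)), y_index])


probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8]])   # hypothetical softmax outputs
y_index = np.array([0, 2])            # integer labels
y_one_hot = np.eye(3)[y_index]        # the same labels, one-hot encoded

loss_one_hot = categorical_crossentropy(y_one_hot, probs)
loss_sparse = sparse_categorical_crossentropy(y_index, probs)

print(loss_one_hot, loss_sparse)
```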

In [None]:
# Compile the model
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

7] This code preprocesses the data by normalizing pixel values and reshaping the input data to match the CNN input shape.

* Each value in an image matrix is a pixel intensity from 0 to 255, so I normalized each image by dividing by 255 to bring the values into the range 0 to 1.

In [None]:
# preprocess the data (normalize pixel values)
X_train = X_train.astype('float32') / 255
X_val = X_val.astype('float32') / 255
X_test = X_test.astype('float32') / 255

# reshape data to match the input shape of the CNN
X_train = np.expand_dims(X_train, axis=-1)
X_val = np.expand_dims(X_val, axis=-1)
X_test = np.expand_dims(X_test, axis=-1)
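
The effect of these two steps on array shape and value range can be checked on random stand-in data (the batch of five fake 32 * 32 images below is hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
batch = rng.integers(0, 256, size=(5, 32, 32), dtype=np.uint8)  # fake grayscale images

scaled = batch.astype("float32") / 255           # values now lie in [0, 1]
with_channel = np.expand_dims(scaled, axis=-1)   # add the trailing channel axis

print(batch.shape, "->", with_channel.shape)
print(with_channel.min(), with_channel.max())
```

The trailing axis of size 1 is what matches the `input_shape=(32, 32, 1)` expected by the first convolution layer.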

8] This code snippet trains the CNN model on the training data and validates it on the validation data.

I trained for 10 epochs to reduce the computational load on my laptop.

In [None]:
history = model.fit(X_train, y_train, epochs=10, batch_size=32, validation_data=(X_val, y_val))
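
As a rough guide to the training cost, the number of weight updates follows from the training set size, the batch size, and the epoch count. A small sketch with a hypothetical training set of 650 images:

```python
import math

n_train = 650     # hypothetical number of training images
batch_size = 32
epochs = 10

# The last batch of an epoch may be smaller, so the step count is rounded up
steps_per_epoch = math.ceil(n_train / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, total_updates)
```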

9] This code evaluates the trained model on the validation and test datasets.

In [None]:
# Evaluate the model on the validation data and report the accuracy
val_loss, val_accuracy = model.evaluate(X_val, y_val)
print("Validation Accuracy:", val_accuracy)

In [None]:
test_loss, test_accuracy = model.evaluate(X_test, y_test)
print("Test Accuracy:", test_accuracy)
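
The accuracy reported above is the share of images whose most probable class matches the true class. A NumPy sketch of that computation, which also maps the predicted indices back to font names with a mapping like `index_to_label` (the probabilities, labels, and font names here are made up):

```python
import numpy as np

index_to_label = {0: "Arial", 1: "Times", 2: "Verdana"}  # hypothetical mapping

probs = np.array([[0.8, 0.1, 0.1],
                  [0.2, 0.3, 0.5],
                  [0.6, 0.3, 0.1]])   # stand-in for model.predict output
y_true = np.eye(3)[[0, 2, 1]]         # one-hot true labels

predicted = probs.argmax(axis=1)      # most probable class per image
accuracy = float(np.mean(predicted == y_true.argmax(axis=1)))
predicted_fonts = [index_to_label[int(i)] for i in predicted]

print(predicted_fonts, accuracy)
```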

10] Save the model at specified path

In [None]:
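# Path to save the model (ideally a folder outside the dataset directory,
# so that load_data does not pick up the saved model file)
path2 = r"C:\Users\saswa\OneDrive\Documents\pandas data\data\Deep Learning\3 Font detection\Datase 22t\Font Dataset Large"

# Join the folder and file name with a proper path separator
model.save(os.path.join(path2, 'font_recognition_model_large_model_3.h5'))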