<a id='top'></a>
<h1 style="font-family:Bahnschrift Condensed;text-align:center; font-size:240%;">2024 National Data Science Competition by RIMAI: License Plate Recognition in Mauritania - Data Science Phase</h1>
<div style="text-align:center;">
    <img src="https://www.rim-ai.com/assets/logo.png" style="width:30%; height:auto;" alt="Logo RIM AI">
</div>

- **El Moustapha Cheikh Jiddou** 
- **Pseudo: Jiddou26**
- **Email:** elmoustapha.cheikh.jiddou@gmail.com

<p style="font-family:Bahnschrift Condensed;font-size:120%; text-align:justify;">
    The second phase of the 2024 National Data Science Competition by RIMAI shifts the focus to the heart of the challenge: data science. After gathering diverse and high-quality images in the first phase, this stage is dedicated to developing and fine-tuning algorithms for the automatic recognition of Mauritanian license plates.
</p>

<p style="font-size:120%; text-align:justify;">
    Participants are invited to leverage their data science skills to create robust models capable of accurately identifying and reading license plates under various conditions. This phase is crucial, as it will test the effectiveness of the algorithms in real-world scenarios, pushing the boundaries of AI and machine learning in the context of Mauritanian license plate recognition.
</p>
<p style="font-size:120%; text-align:justify;">
    The competition aims to drive innovation in computer vision, with a particular emphasis on enhancing the accuracy, efficiency, and scalability of the solutions. The challenge is designed not only to assess technical prowess but also to encourage creativity in overcoming the unique challenges presented by this application.
</p>

<div align="center"><span style="font-family:Bahnschrift Condensed;font-size:40px"><b>Section 1: Importing Necessary Libraries</b></span></div>

In [3]:
import os
import cv2
import numpy as np
from shutil import copyfile
from easyocr import Reader
import pytesseract
from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout, BatchNormalization
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.preprocessing.sequence import pad_sequences
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras.utils import to_categorical
import matplotlib.pyplot as plt

<div align="center"><span style="font-family:Bahnschrift Condensed;font-size:40px"><b>Section 2: License Plate Detection and Transformation Functions</b></span></div>

In [4]:
# License plate detection helpers
def order_points(pts):
    rect = np.zeros((4, 2), dtype="float32")
    s = pts.sum(axis=1)
    rect[0] = pts[np.argmin(s)]
    rect[2] = pts[np.argmax(s)]

    diff = np.diff(pts, axis=1)
    rect[1] = pts[np.argmin(diff)]
    rect[3] = pts[np.argmax(diff)]

    return rect

def four_point_transform(image, pts):
    rect = order_points(pts)
    (tl, tr, br, bl) = rect

    widthA = np.sqrt(((br[0] - bl[0]) ** 2) + ((br[1] - bl[1]) ** 2))
    widthB = np.sqrt(((tr[0] - tl[0]) ** 2) + ((tr[1] - tl[1]) ** 2))
    maxWidth = max(int(widthA), int(widthB))

    heightA = np.sqrt(((tr[0] - br[0]) ** 2) + ((tr[1] - br[1]) ** 2))
    heightB = np.sqrt(((tl[0] - bl[0]) ** 2) + ((tl[1] - bl[1]) ** 2))
    maxHeight = max(int(heightA), int(heightB))

    dst = np.array([
        [0, 0],
        [maxWidth - 1, 0],
        [maxWidth - 1, maxHeight - 1],
        [0, maxHeight - 1]], dtype="float32")

    M = cv2.getPerspectiveTransform(rect, dst)
    warped = cv2.warpPerspective(image, M, (maxWidth, maxHeight))

    return warped
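
The corner-ordering rule used by `order_points` can be checked by hand on a small example. The sketch below restates the same sum/difference heuristic in plain Python; the function name `order_corners` and the sample coordinates are made up for illustration and are not part of the notebook.

```python
def order_corners(points):
    """Order four (x, y) corners as top-left, top-right, bottom-right, bottom-left.

    Same heuristic as order_points: x + y is smallest at the top-left and
    largest at the bottom-right; y - x is smallest at the top-right and
    largest at the bottom-left.
    """
    sums = [x + y for x, y in points]
    diffs = [y - x for x, y in points]
    top_left = points[sums.index(min(sums))]
    bottom_right = points[sums.index(max(sums))]
    top_right = points[diffs.index(min(diffs))]
    bottom_left = points[diffs.index(max(diffs))]
    return [top_left, top_right, bottom_right, bottom_left]


# Hypothetical plate corners, deliberately given in a shuffled order
shuffled = [(310, 140), (40, 150), (300, 60), (50, 70)]
ordered = order_corners(shuffled)
print(ordered)
```

The heuristic assumes the plate is roughly upright in the image. For a quadrilateral rotated close to 45 degrees, two corners can tie on the sum or the difference, and the ordering may then be wrong.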

<div align="center"><span style="font-family:Bahnschrift Condensed;font-size:40px"><b>Section 3: Processing and Annotating Images</b></span></div>

In [5]:
def process_single_image(image_path, output_dir, reader=None):
    # Load the image; cv2.imread returns None if the file cannot be read
    car = cv2.imread(image_path)
    if car is None:
        print(f"Unable to load image {image_path}.")
        return None
    # Convert to grayscale
    gray = cv2.cvtColor(car, cv2.COLOR_BGR2GRAY)
    # Apply a Gaussian blur to reduce noise
    blur = cv2.GaussianBlur(gray, (5, 5), 0)
    # Detect edges
    edged = cv2.Canny(blur, 50, 200)
    # Find contours and keep the five largest by area
    cont, _ = cv2.findContours(edged, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    cont = sorted(cont, key=cv2.contourArea, reverse=True)[:5]

    # Keep the first candidate that approximates to a quadrilateral
    plate_cnt = None
    for c in cont:
        arc = cv2.arcLength(c, True)
        approx = cv2.approxPolyDP(c, 0.02 * arc, True)
        if len(approx) == 4:
            plate_cnt = approx
            break

    if plate_cnt is None:
        print(f"No license plate detected in {image_path}.")
        return None

    # Rectify the license plate with a perspective transform
    warped_plate = four_point_transform(gray, plate_cnt.reshape(4, 2))

    # Upscale the plate to improve OCR accuracy
    plate = cv2.resize(warped_plate, None, fx=2, fy=2, interpolation=cv2.INTER_CUBIC)

    # Use EasyOCR to read the plate characters; reuse the caller's Reader
    # when one is passed, since building a Reader reloads the OCR models
    if reader is None:
        reader = Reader(['en'])
    detection = reader.readtext(plate)

    if len(detection) == 0:
        print(f"Unable to read license plate text in {image_path}.")
        return None

    # Extract the recognized text of the first detection
    text = detection[0][1]

    # Keep only letters and digits
    filtered_text = ''.join([c for c in text if c.isalnum()])
    if not filtered_text:
        print(f"Unable to read license plate text in {image_path}.")
        return None

    # Name the output file after the plate number
    output_filename = os.path.join(output_dir, f'{filtered_text}.jpg')

    # Save the original image under the plate-number filename
    cv2.imwrite(output_filename, car)

    return output_filename
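
`process_single_image` takes the first EasyOCR detection as the plate text. EasyOCR's `readtext` returns a list of `(bounding_box, text, confidence)` tuples, so one possible variation is to keep the detection with the highest confidence instead. The sketch below shows that variation on made-up detections; the helper name `best_plate_text`, the 8-character cap, and the sample values are assumptions for illustration, not the notebook's method.

```python
def best_plate_text(detections, max_len=8):
    """Pick the highest-confidence detection and keep only letters and digits.

    `detections` is expected to look like EasyOCR output:
    a list of (bounding_box, text, confidence) tuples.
    Returns (cleaned_text, confidence), or None when nothing was detected.
    """
    if not detections:
        return None
    _, text, confidence = max(detections, key=lambda d: d[2])
    cleaned = ''.join(ch for ch in text if ch.isalnum())
    return cleaned[:max_len], confidence


# Made-up detections in the EasyOCR result layout
box = [[0, 0], [10, 0], [10, 5], [0, 5]]
sample = [(box, 'RIM', 0.41), (box, '1234 AB 00', 0.87)]
result = best_plate_text(sample)
print(result)
```

Whether the highest-confidence detection is actually the plate number depends on the images, so this would need to be checked against labeled examples before replacing the first-detection rule.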

<div align="center"><span style="font-family:Bahnschrift Condensed;font-size:40px"><b>Section 4: Processing Images in Folders</b></span></div>

In [6]:
data_train_dir = "data_train"
data_mat_train_dir = "data_mat_train"
os.makedirs(data_mat_train_dir, exist_ok=True)

for filename in os.listdir(data_train_dir):
    if filename.endswith(".jpg") or filename.endswith(".png"):
        image_path = os.path.join(data_train_dir, filename)
        output_filename = process_single_image(image_path, data_mat_train_dir)
        if output_filename:
            print(f"Processed: {filename} -> Saved as: {output_filename}")
        else:
            print(f"Processed: {filename} -> No license plate detected or unable to read text.")

FileNotFoundError: [WinError 3] The system cannot find the path specified: 'data_train'
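
The error above is raised by `os.listdir` because the `data_train` folder does not exist in the working directory. A minimal sketch of a guard, assuming the same folder layout, is shown below; the helper name `list_images` and the set of accepted extensions are illustrative choices.

```python
from pathlib import Path

# Extensions treated as images; compared case-insensitively
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_images(folder):
    """Return the image files in `folder`, or an empty list if it is missing."""
    folder = Path(folder)
    if not folder.is_dir():
        print(f"Input folder not found: {folder.resolve()}")
        return []
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


# Hypothetical usage: iterate only when the folder exists
for image_path in list_images("data_train"):
    print(image_path.name)
```

Printing the resolved path makes it easier to see which working directory the notebook was started from.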

<div align="center"><span style="font-family:Bahnschrift Condensed;font-size:40px"><b>Section 5: Loading Data and Labels</b></span></div>

In [None]:
def load_data_and_labels(data_dir):
    images = []
    labels = []
    for filename in os.listdir(data_dir):
        img_path = os.path.join(data_dir, filename)
        img = cv2.imread(img_path)
        if img is not None:  # Skip files that are not readable images
            img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
            img = cv2.resize(img, (128, 128))  # Resize to the model's input size
            images.append(img)
            # Derive the label from the filename: drop the extension,
            # then keep the part before the first underscore
            label = os.path.splitext(filename)[0].split('_')[0]
            labels.append(label)
    return np.array(images), np.array(labels)

train_dir = 'data_mat_train'
test_dir = 'data_mat_test'

X_train, y_train = load_data_and_labels(train_dir)
X_test, y_test = load_data_and_labels(test_dir)

# Scale pixel values to the range [0, 1]
X_train = X_train / 255.0
X_test = X_test / 255.0


<div align="center"><span style="font-family:Bahnschrift Condensed;font-size:40px"><b>Section 6: Building the CNN Model</b></span></div>

In [None]:
# One output unit per distinct plate label, matching the one-hot targets used for training
num_classes = len(np.unique(y_train))

model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(128, 128, 3)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Conv2D(128, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dense(num_classes, activation='softmax')
])

# Multi-class classification over plate labels
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()
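
To make the output of `model.summary()` easier to read, the spatial size after each convolution and pooling stage can be worked out by hand. The sketch below does that arithmetic for the three convolution blocks above, assuming the Keras defaults (3x3 kernels with `valid` padding and stride 1, and 2x2 max pooling); the helper names are illustrative.

```python
def conv_valid(size, kernel=3):
    """Output size of a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1


def pool(size, window=2):
    """Output size of non-overlapping max pooling (floor division)."""
    return size // window


size = 128  # input images are 128 x 128
for filters in (32, 64, 128):
    size = pool(conv_valid(size))
    print(f"after Conv2D({filters}) + MaxPooling2D: {size} x {size} x {filters}")

# Number of values that Flatten passes to the first Dense layer
flat_features = size * size * 128
# Weights plus biases of Dense(128) on top of the flattened features
dense_params = flat_features * 128 + 128
print(flat_features, dense_params)
```

Most of the parameters sit in the first dense layer, which is worth keeping in mind when the training set is small.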

<div align="center"><span style="font-family:Bahnschrift Condensed;font-size:40px"><b>Section 7: Training and Evaluating the Model</b></span></div>

In [None]:
# Encode the training labels as integers
label_encoder = LabelEncoder()
y_train_encoded = label_encoder.fit_transform(y_train)

# Build the set of labels seen during training
train_labels_set = set(y_train)

# Keep only the test samples whose label also appears in the training data
filtered_indices = [i for i, label in enumerate(y_test) if label in train_labels_set]
X_test_filtered = X_test[filtered_indices]
y_test_filtered = y_test[filtered_indices]

# Encode the filtered test labels
y_test_encoded = label_encoder.transform(y_test_filtered)

# Convert the labels to one-hot encoding
num_classes = len(np.unique(y_train_encoded))
y_train_one_hot = to_categorical(y_train_encoded, num_classes)
y_test_one_hot = to_categorical(y_test_encoded, num_classes)

# Train the model
history = model.fit(X_train, y_train_one_hot, epochs=10, validation_split=0.2)

# Evaluate the model on the filtered test data
loss, accuracy = model.evaluate(X_test_filtered, y_test_one_hot)
print(f'Loss: {loss}, Accuracy: {accuracy}')
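
Because the test set is filtered down to labels already seen in training, the reported accuracy only covers part of the test data. A small sketch of how that coverage could be measured is given below; the function name `label_coverage` and the plate strings are hypothetical.

```python
def label_coverage(train_labels, test_labels):
    """Return the indices of test samples with a known label and their share."""
    known = set(train_labels)
    kept = [i for i, label in enumerate(test_labels) if label in known]
    ratio = len(kept) / len(test_labels) if test_labels else 0.0
    return kept, ratio


# Hypothetical plate labels
train = ["1234AB00", "5678CD01"]
test = ["5678CD01", "9999ZZ99", "1234AB00", "0000AA00"]
kept, ratio = label_coverage(train, test)
print(kept, ratio)
```

A low ratio would mean that the accuracy printed above describes only a small share of the test images.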

<div align="center"><span style="font-family:Bahnschrift Condensed;font-size:40px"><b>Section 8: Testing the Model on a Single Image</b></span></div>

In [None]:
# Load the image
image_path = 'DataChallenge.jpg'
image = cv2.imread(image_path)
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # Make sure the image is in RGB order
image = cv2.resize(image, (128, 128))  # Resize to the model's input size

# Scale pixel values to the range [0, 1]
image = image / 255.0

# Run the prediction on a batch of one image
prediction = model.predict(np.expand_dims(image, axis=0))

# Map the most probable class index back to its label
predicted_label = label_encoder.inverse_transform([np.argmax(prediction)])

print(f'Predicted label: {predicted_label}')
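
The cell above reports only the single most probable label. When checking a prediction by hand, it can help to look at the few most probable labels together with their scores. The sketch below ranks a probability vector in plain Python; the function name `top_k`, the labels, and the probabilities are made up for illustration.

```python
def top_k(probabilities, class_names, k=3):
    """Return the k (label, probability) pairs with the highest probability."""
    ranked = sorted(
        zip(class_names, probabilities),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:k]


# Hypothetical class labels and model output for one image
labels = ["1234AB00", "5678CD01", "9012EF02"]
probs = [0.1, 0.7, 0.2]
ranking = top_k(probs, labels, k=2)
print(ranking)
```

With the notebook's objects, the inputs would presumably be `prediction[0]` and `label_encoder.classes_`.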

<div align="center"><span style="font-family:Bahnschrift Condensed;font-size:40px"><b>Section 9: Advanced License Plate Detection and Visualization</b></span></div>

In [None]:
def order_points(pts):
    rect = np.zeros((4, 2), dtype="float32")
    s = pts.sum(axis=1)
    rect[0] = pts[np.argmin(s)]
    rect[2] = pts[np.argmax(s)]

    diff = np.diff(pts, axis=1)
    rect[1] = pts[np.argmin(diff)]
    rect[3] = pts[np.argmax(diff)]

    return rect

def four_point_transform(image, pts):
    rect = order_points(pts)
    (tl, tr, br, bl) = rect

    # Compute the widths and heights of the rectangle's sides
    widthA = np.sqrt(((br[0] - bl[0]) ** 2) + ((br[1] - bl[1]) ** 2))
    widthB = np.sqrt(((tr[0] - tl[0]) ** 2) + ((tr[1] - tl[1]) ** 2))
    maxWidth = max(int(widthA), int(widthB))

    heightA = np.sqrt(((tr[0] - br[0]) ** 2) + ((tr[1] - br[1]) ** 2))
    heightB = np.sqrt(((tl[0] - bl[0]) ** 2) + ((tl[1] - bl[1]) ** 2))
    maxHeight = max(int(heightA), int(heightB))

    # Destination points for perspective transform
    dst = np.array([
        [0, 0],
        [maxWidth - 1, 0],
        [maxWidth - 1, maxHeight - 1],
        [0, maxHeight - 1]], dtype="float32")

    # Compute the perspective transform matrix and apply it
    M = cv2.getPerspectiveTransform(rect, dst)
    warped = cv2.warpPerspective(image, M, (maxWidth, maxHeight))

    return warped

def process_single_image(image_path, output_path):
    reader = Reader(['en'], gpu=False, verbose=False)

    # Load the image; cv2.imread returns None if the file cannot be read
    car = cv2.imread(image_path)
    if car is None:
        print(f"Unable to load image {image_path}.")
        return
    car = cv2.resize(car, (800, 600))
    gray = cv2.cvtColor(car, cv2.COLOR_BGR2GRAY)
    blur = cv2.GaussianBlur(gray, (5, 5), 0)
    edged = cv2.Canny(blur, 50, 200)
    cont, _ = cv2.findContours(edged, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    cont = sorted(cont, key=cv2.contourArea, reverse=True)[:5]

    # Keep the first of the five largest contours that approximates to a quadrilateral
    plate_cnt = None
    for c in cont:
        arc = cv2.arcLength(c, True)
        approx = cv2.approxPolyDP(c, 0.02 * arc, True)
        if len(approx) == 4:
            plate_cnt = approx
            break

    if plate_cnt is None:
        print("No license plate detected.")
        return

    # Rectify the license plate with a perspective transform
    warped_plate = four_point_transform(gray, plate_cnt.reshape(4, 2))

    # Upscale the plate to improve OCR accuracy
    plate = cv2.resize(warped_plate, None, fx=2, fy=2, interpolation=cv2.INTER_CUBIC)

    # Apply histogram equalization to improve contrast
    plate = cv2.equalizeHist(plate)

    # Morphological closing to sharpen the character outlines
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
    plate = cv2.morphologyEx(plate, cv2.MORPH_CLOSE, kernel)

    # Use EasyOCR to recognize the characters
    detection = reader.readtext(plate)
    print(detection)

    if len(detection) == 0:
        text = "Unable to read the license plate text"
        cv2.putText(car, text, (20, 40), cv2.FONT_HERSHEY_SIMPLEX, 0.75, (0, 0, 255), 3)
    else:
        # Extract the recognized text of the first detection
        text = detection[0][1]
        # Keep only letters and digits, capped at 8 characters
        filtered_text = ''.join([c for c in text if c.isalnum()])
        if len(filtered_text) > 8:
            filtered_text = filtered_text[:8]
        cv2.drawContours(car, [plate_cnt], -1, (0, 255, 0), 3)
        display_text = f"{filtered_text} ({detection[0][2] * 100:.2f}%)"
        cv2.putText(car, display_text, (plate_cnt[0][0][0], plate_cnt[0][0][1] - 20), cv2.FONT_HERSHEY_SIMPLEX, 0.75, (0, 255, 0), 2)
        print(display_text)

    # Save the annotated image
    cv2.imwrite(output_path, car)
    # Save the rectified plate image
    cv2.imwrite('warped_plate.jpg', plate)

# Test with a single image
process_single_image('hamada.jpg', 'annotated_test.jpg')

# Display the images with matplotlib
annotated_image = cv2.imread('annotated_test.jpg')
warped_plate = cv2.imread('warped_plate.jpg', cv2.IMREAD_GRAYSCALE)

plt.figure(figsize=(10, 8))
plt.subplot(1, 2, 1)
plt.imshow(warped_plate, cmap='gray')
plt.title('License plate')
plt.axis('off')
plt.subplot(1, 2, 2)
plt.imshow(cv2.cvtColor(annotated_image, cv2.COLOR_BGR2RGB))
plt.title('Image with detection')
plt.axis('off')

plt.show()