<a href="https://colab.research.google.com/github/TaraOnGit/CapstonePropertyPricePredictor/blob/master/NumberPlateDetectionCropping.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#**Greetings!**

In [1]:
!pip install keras



#**Importing Libraries**

In [2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

from keras.models import Sequential
from keras.layers import Dropout, Flatten, Conv2D, MaxPooling2D, Dense

import cv2   # OpenCV: image reading, resizing and drawing
import os    # file-system paths and folder creation
import glob  # finds file paths matching a pattern

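1. Dropout - Regularization layer that randomly zeroes a fraction of activations during training to reduce overfitting
2. Flatten - Reshapes the feature maps into a 1D vector
3. Conv2D - Applies learnable 2D convolution filters
4. MaxPooling2D - Downsamples by keeping the maximum of each window, which reduces the spatial dimensions
5. Dense - Fully connected layer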
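
To make the pooling and flattening steps listed above concrete, here is a minimal NumPy sketch. It is not how Keras implements these layers; `max_pool_2x2` is a hypothetical helper written only for illustration, and it assumes a single-channel feature map with even height and width.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 on a (H, W) array; H and W must be even."""
    h, w = feature_map.shape
    # Split the map into non-overlapping 2x2 blocks, then take each block's maximum.
    blocks = feature_map.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

feature_map = np.arange(16).reshape(4, 4)  # toy 4x4 "feature map"
pooled = max_pool_2x2(feature_map)         # shape (2, 2): each dimension is halved
flat = pooled.flatten()                    # shape (4,): what Flatten hands to Dense

print(pooled)
print(flat)
```

Each pooling step halves the height and width, which is why the flattened vector fed to the Dense layer is much smaller than the raw image.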

#**Working with Data (Vehicle Images)**
##**Extracting the Zip File Content**

In [3]:
import zipfile
zip_file = '/content/Licplatesdetection_train.zip'
extract_folder = 'lpd_images'

os.makedirs(extract_folder, exist_ok=True)  # create the target folder if it does not exist yet

with zipfile.ZipFile(zip_file, 'r') as zip_ref:
    zip_ref.extractall(extract_folder)

#**Fetching the Individual Image Files from the Folder**

In [4]:
img_dir = "/content/lpd_images/license_plates_detection_train"
data_path = os.path.join(img_dir, "*g")  # "*g" matches every name ending in "g", e.g. .jpg, .jpeg, .png
files = glob.glob(data_path)
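
The `"*g"` pattern also picks up any non-image file whose name happens to end in "g". A stricter alternative is sketched below, assuming only `.jpg`, `.jpeg` and `.png` files are wanted. `list_images` and the file names in the demo are hypothetical, not part of the dataset.

```python
import glob
import os
import tempfile

def list_images(folder, extensions=(".jpg", ".jpeg", ".png")):
    """Return sorted paths in `folder` whose extension is one of `extensions`."""
    paths = glob.glob(os.path.join(folder, "*"))
    return sorted(p for p in paths if os.path.splitext(p)[1].lower() in extensions)

# Demo on a temporary folder with made-up file names
with tempfile.TemporaryDirectory() as tmp:
    for name in ("1.jpg", "2.png", "notes.log", "backup.jpg.orig"):
        open(os.path.join(tmp, name), "w").close()
    loose = sorted(os.path.basename(p) for p in glob.glob(os.path.join(tmp, "*g")))
    strict = [os.path.basename(p) for p in list_images(tmp)]

print(loose)   # everything ending in "g"
print(strict)  # image extensions only
```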

In [5]:
files.sort()
#files

In [6]:

X = []  # resized input images

for file in files:  # loop through the image files
    img = cv2.imread(file)             # read the image as a BGR array
    img = cv2.resize(img, (244, 244))  # resize every image to 244x244
    X.append(img)

#**Reading the Bounding Box Values**

In [7]:
df = pd.read_csv('/content/Licplatesdetection_train.csv')
df.head(2)

Unnamed: 0,img_id,ymin,xmin,ymax,xmax
0,1.jpg,276,94,326,169
1,10.jpg,311,395,344,444


In [8]:
# Collect [ymin, xmin, ymax, xmax] for every image, in CSV row order
y = df[['ymin', 'xmin', 'ymax', 'xmax']].values.tolist()

#y = np.array(y)
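
The notebook pairs images with labels by position: `files` is sorted by name and `y` follows the CSV row order, so the two must happen to line up. One way to make the pairing explicit is sketched below. `align_files_to_labels` and the example paths are hypothetical; the only assumption is that `img_id` holds the image file name.

```python
import os

def align_files_to_labels(file_paths, img_ids):
    """Return file paths reordered to follow `img_ids`; fail loudly if one is missing."""
    by_name = {os.path.basename(p): p for p in file_paths}
    missing = [img_id for img_id in img_ids if img_id not in by_name]
    if missing:
        raise FileNotFoundError(f"No image file for: {missing}")
    return [by_name[img_id] for img_id in img_ids]

# Made-up paths, deliberately out of order
paths = ["/data/10.jpg", "/data/2.jpg", "/data/1.jpg"]
ids = ["1.jpg", "10.jpg", "2.jpg"]  # order of the CSV rows
aligned = align_files_to_labels(paths, ids)
print(aligned)
```

In the notebook this would be called with `files` and `df['img_id']` before the images are read.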

#**Exploring License Plates with CV2**

In [9]:
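ymin, xmin, ymax, xmax = y[2]

img = cv2.imread(files[2])
old_height, old_width = img.shape[:2]  # original size, needed to rescale the box to 244x244
# Draw on a copy: cv2.rectangle modifies its input in place, and X[2] is used for training later
lic_plate_img = cv2.rectangle(X[2].copy(),
                              (int(xmin * 244 / old_width), int(ymin * 244 / old_height)),  # Top-left corner (xmin, ymin)
                              (int(xmax * 244 / old_width), int(ymax * 244 / old_height)),  # Bottom-right corner (xmax, ymax)
                              (255, 0, 150), 2)
#plt.imshow(cv2.cvtColor(lic_plate_img, cv2.COLOR_BGR2RGB))  # uncomment to display; OpenCV stores BGR, matplotlib expects RGB
plt.show()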

1. X holds the images, each resized to 244x244
2. y holds the bounding-box coordinates [ymin, xmin, ymax, xmax] of the number/license plate, in pixels of the original image
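
Because the images are resized but the box coordinates are not, every box has to be rescaled before it can be drawn on a 244x244 image. The small helper below sketches that rescaling. `scale_box` is a hypothetical name, and the 488x488 original size in the example is an assumption chosen only to keep the arithmetic easy to follow.

```python
def scale_box(box, old_size, new_size=(244, 244)):
    """Rescale (ymin, xmin, ymax, xmax) from an image of old_size to new_size.

    Both sizes are (height, width). Results are truncated to int pixel indices.
    """
    ymin, xmin, ymax, xmax = box
    old_h, old_w = old_size
    new_h, new_w = new_size
    return (int(ymin * new_h / old_h), int(xmin * new_w / old_w),
            int(ymax * new_h / old_h), int(xmax * new_w / old_w))

# First row of the CSV, with an assumed original size of 488x488
scaled = scale_box((276, 94, 326, 169), old_size=(488, 488))
print(scaled)
```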

In [10]:
top_left = (int(xmin * 244 / old_width), int(ymin * 244 / old_height))
bottom_right = (int(xmax * 244 / old_width), int(ymax * 244 / old_height))

cropped_img = lic_plate_img[top_left[1]:bottom_right[1], top_left[0]:bottom_right[0]]

#plt.imshow(cropped_img)
#cropped_img
#plt.imshow(cv2.cvtColor(cropped_img, cv2.COLOR_BGR2RGB))
#uncomment the above two plt.imshow() to see the images
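
The crop relies on NumPy indexing images as `[rows, columns]`, which means y comes first and x second, even though the corner tuples are written as (x, y). A minimal sketch with a blank array standing in for an image; the corner values are illustrative only.

```python
import numpy as np

image = np.zeros((244, 244, 3), dtype=np.uint8)  # stand-in for a resized BGR image

top_left = (47, 138)      # (x, y)
bottom_right = (84, 163)  # (x, y)

# Rows are selected by y, columns by x
crop = image[top_left[1]:bottom_right[1], top_left[0]:bottom_right[0]]
print(crop.shape)  # (height, width, channels)
```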

#**Normalizing the values of X, so that the model treats them equally**

In [11]:
#Converting X and y to numpy arrays
X = np.array(X)
y = np.array(y)

In [12]:
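X = X / 255   # scale pixel values to [0, 1]
y = y // 255  # caution: floor division leaves only small integers (0, 1, ...), not fractions of the image size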
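
The effect of `//` on the box coordinates can be checked with the two rows shown by `df.head(2)`. The sketch also shows one possible alternative: dividing each coordinate by the height or width of its own original image, so that the targets become fractions in [0, 1]. The image sizes used here are assumptions, since the notebook does not record them.

```python
import numpy as np

boxes = np.array([[276, 94, 326, 169],
                  [311, 395, 344, 444]])  # [ymin, xmin, ymax, xmax] from df.head(2)

floored = boxes // 255  # floor division: position information is lost

# Assumed original (height, width) per image, for illustration only
sizes = np.array([[400, 600],
                  [400, 600]])
scale = np.tile(sizes, 2)   # columns become [h, w, h, w] to match the box layout
normalized = boxes / scale  # true division: fractions of the image size

print(floored)
print(normalized)
```

With targets in [0, 1], a sigmoid output layer can at least represent every label.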

In [13]:
X[0]

array([[[0.31764706, 0.3254902 , 0.3254902 ],
        [0.38823529, 0.4       , 0.4       ],
        [0.18823529, 0.19607843, 0.19607843],
        ...,
        [1.        , 1.        , 1.        ],
        [1.        , 1.        , 1.        ],
        [1.        , 1.        , 1.        ]],

       [[0.31764706, 0.32941176, 0.32941176],
        [0.2745098 , 0.28235294, 0.28235294],
        [0.28235294, 0.29019608, 0.29019608],
        ...,
        [1.        , 1.        , 1.        ],
        [1.        , 1.        , 1.        ],
        [1.        , 1.        , 1.        ]],

       [[0.30980392, 0.31764706, 0.31764706],
        [0.22745098, 0.23529412, 0.23529412],
        [0.14117647, 0.14901961, 0.14901961],
        ...,
        [0.99607843, 0.99607843, 0.99607843],
        [0.99607843, 0.99607843, 0.99607843],
        [0.99607843, 0.99607843, 0.99607843]],

       ...,

       [[0.54901961, 0.54117647, 0.49803922],
        [0.55686275, 0.54901961, 0.50588235],
        [0.54509804, 0... (output truncated)

1. Pixel values have been normalized to the range [0, 1]

---



#**Train Test Split**

In [14]:
from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.2,random_state=42)
X_train,X_val,y_train,y_val = train_test_split(X_train,y_train,test_size=0.2,random_state=42)
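
Since the second split is applied to the 80% left after the first, the final proportions are 64% train, 16% validation and 20% test. The arithmetic is sketched below. The total of 900 images is an assumption, chosen because it is consistent with the 18 training steps and 6 test steps at batch size 32 in the logs.

```python
test_frac = 0.2          # first split: 20% of all data
val_frac_of_rest = 0.2   # second split: 20% of the remaining 80%

train_frac = (1 - test_frac) * (1 - val_frac_of_rest)
val_frac = (1 - test_frac) * val_frac_of_rest

n_samples = 900  # assumed dataset size, not stated in the notebook
n_train = round(train_frac * n_samples)
n_val = round(val_frac * n_samples)
n_test = round(test_frac * n_samples)

print(train_frac, val_frac, test_frac)
print(n_train, n_val, n_test)
```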

#**Model Training - CNN**

In [18]:
model = Sequential()

#1. Feature Extractor
#__________________

#First Convolutional Layer
model.add(Conv2D(64, (3,3), input_shape=(244,244,3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.1))

#Second Convolutional Layer
model.add(Conv2D(32, (3,3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.1))

#2. Prediction Head (Fully Connected Layers): outputs the 4 box coordinates
#__________________

model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dense(4, activation='sigmoid'))

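# Note: categorical_crossentropy and accuracy are classification settings, while the targets here are 4 box coordinates
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])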

train = model.fit(X_train,y_train,validation_data=(X_val,y_val),epochs=10,batch_size=32)

Epoch 1/10
18/18 ━━━━━━━━━━━━━━━━━━━━ 6s 129ms/step - accuracy: 0.5734 - loss: 289.3533 - val_accuracy: 0.7431 - val_loss: 3068.1868
Epoch 2/10
18/18 ━━━━━━━━━━━━━━━━━━━━ 1s 73ms/step - accuracy: 0.6466 - loss: 8250.6045 - val_accuracy: 0.7431 - val_loss: 33004.6680
Epoch 3/10
18/18 ━━━━━━━━━━━━━━━━━━━━ 3s 74ms/step - accuracy: 0.6731 - loss: 61491.5938 - val_accuracy: 0.7431 - val_loss: 168194.7188
Epoch 4/10
18/18 ━━━━━━━━━━━━━━━━━━━━ 1s 73ms/step - accuracy: 0.6514 - loss: 247788.7812 - val_accuracy: 0.7431 - val_loss: 472240.4375
Epoch 5/10
18/18 ━━━━━━━━━━━━━━━━━━━━ 1s 72ms/step - accuracy: 0.6568 - loss: 687652.6250 - val_accuracy: 0.7431 - val_loss: 994576.6875
Epoch 6/10
18/18 ━━━━━━━━━━━━━━━━━━━━ 1s 71ms/step - accuracy: 0.6585 - loss: 1262990.6250 - val_accuracy: 0.7431 - val

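The size of the network defined above can be worked out by hand. The sketch below follows the feature-map size through the two convolution and pooling stages, assuming the Keras defaults of `'valid'` padding and stride 1 for `Conv2D` and a stride equal to the pool size for `MaxPooling2D`. `conv_out` and `pool_out` are hypothetical helper names.

```python
def conv_out(size, kernel=3):
    """Output size of a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output size of max pooling with stride equal to the pool size."""
    return size // pool

size = 244
size = conv_out(size)  # first Conv2D:   244 -> 242
size = pool_out(size)  # first pooling:  242 -> 121
size = conv_out(size)  # second Conv2D:  121 -> 119
size = pool_out(size)  # second pooling: 119 -> 59

flat_features = size * size * 32        # 32 filters in the second Conv2D
dense_params = flat_features * 64 + 64  # weights plus biases of Dense(64)

print(size, flat_features, dense_params)
```

Almost all trainable parameters sit in the first Dense layer, so further pooling or a smaller input would shrink the model considerably.
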
#**Model Evaluation**

In [19]:
scores = model.evaluate(X_test,y_test)

6/6 ━━━━━━━━━━━━━━━━━━━━ 0s 67ms/step - accuracy: 0.6662 - loss: 1383806.8750


In [20]:
scores

[1413644.125, 0.6555555462837219]

#**Model Performance**

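1. Training accuracy stays between about 0.57 and 0.67 while the loss grows every epoch (from about 289 to over 1.2 million), so the model is not learning the training data. This indicates high bias: the model is underfitting. We need to optimize the model's performance.
2. Test accuracy is also low (about 0.66). It is close to the training accuracy, so the gap between the two is small; the weak result reflects the poor fit itself rather than high variance. We need to improve on this also.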
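
Since the model predicts four box coordinates rather than a class, one commonly used way to judge a predicted box is intersection over union (IoU) with the true box. The sketch below is a plain-Python illustration, not part of the notebook. `iou` is a hypothetical helper that takes boxes in the same `[ymin, xmin, ymax, xmax]` order used for `y`.

```python
def iou(box_a, box_b):
    """Intersection over union of two (ymin, xmin, ymax, xmax) boxes."""
    ymin_a, xmin_a, ymax_a, xmax_a = box_a
    ymin_b, xmin_b, ymax_b, xmax_b = box_b

    # Overlap along each axis; zero when the boxes do not intersect
    inter_h = max(0.0, min(ymax_a, ymax_b) - max(ymin_a, ymin_b))
    inter_w = max(0.0, min(xmax_a, xmax_b) - max(xmin_a, xmin_b))
    intersection = inter_h * inter_w

    area_a = (ymax_a - ymin_a) * (xmax_a - xmin_a)
    area_b = (ymax_b - ymin_b) * (xmax_b - xmin_b)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0

print(iou((0, 0, 2, 2), (0, 0, 2, 2)))  # identical boxes
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # partial overlap
print(iou((0, 0, 1, 1), (2, 2, 3, 3)))  # no overlap
```

An IoU of 1 means a perfect match and 0 means no overlap, which makes it easier to interpret for this task than classification accuracy.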

#**Thank You**