# CRACK DETECTION USING CNN
**This is an image classification problem solved with a deep convolutional neural network. The
dataset consists of images of concrete bridge decks, pavements, and walls. Each image either
contains cracks or does not. The objective of this challenge is to develop a binary image
classification model that predicts whether a given image contains cracks.**

The dataset consists of two folders: 'train' and 'test'. The 'train' folder is labelled and contains
two sub-folders, 'cracked' and 'uncracked'. The 'cracked' sub-folder holds 7501 images containing
cracks, and the 'uncracked' sub-folder holds 7467 images without cracks. All images are in '.jpg'
format.

The 'test' folder holds 2000 unlabelled images (all '.jpg' files), named 1.jpg, 2.jpg, 3.jpg,
…, 2000.jpg. The task is to predict the class of each test image, either 'cracked' or 'uncracked'.
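
Before training, it can be worth confirming that the folders on disk match the description above. The following is a minimal sketch, assuming the layout `train/cracked` and `train/uncracked` described in the text; the helper name `count_images` is hypothetical and not part of the original notebook.

```python
from pathlib import Path


def count_images(root, classes=("cracked", "uncracked"), ext=".jpg"):
    """Return {class_name: number of files ending in `ext`} under `root`."""
    root = Path(root)
    counts = {}
    for name in classes:
        folder = root / name
        # A missing class folder is reported as zero instead of raising.
        counts[name] = sum(1 for _ in folder.glob(f"*{ext}")) if folder.is_dir() else 0
    return counts


if __name__ == "__main__":
    # Assumes the notebook's working directory contains the 'train' folder.
    print(count_images("train"))
```

If the dataset is as described, the two counts should be 7501 and 7467, which is close enough to balanced that plain accuracy is a reasonable metric.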

In [None]:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import tensorflow as tf
import os
from keras.preprocessing.image import ImageDataGenerator

**Data augmentation and rescaling for the training data**

In [4]:

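# Rescale pixels to [0, 1], apply random shear/zoom/horizontal flip, and reserve 20% for validation
training_data = ImageDataGenerator(rescale=1./255, shear_range=0.3, zoom_range=0.3,
                                   horizontal_flip=True, validation_split=0.2)
train_dataset = training_data.flow_from_directory("train",
                                                  target_size=(32, 32),  # resize every image to 32x32
                                                  batch_size=32,
                                                  shuffle=True,
                                                  seed=42,
                                                  class_mode='binary',  # one 0/1 label per image
                                                  subset='training')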

Found 11975 images belonging to 2 classes.


**Validation dataset (drawn from the same generator, so the same rescaling and augmentation settings apply)**

In [6]:

val_dataset=training_data.flow_from_directory('train', target_size=(32,32), batch_size=32, shuffle= True, seed=42 , class_mode='binary',subset='validation')

Found 2993 images belonging to 2 classes.
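
The two "Found ... images" counts can be reproduced by hand. The sketch below assumes the 20% validation share is taken per class and rounded down, which is consistent with the counts reported above; `split_counts` is a hypothetical helper used only for this check.

```python
def split_counts(n_images, validation_split):
    """Return (n_train, n_val) for one class, rounding the validation share down."""
    n_val = int(n_images * validation_split)
    return n_images - n_val, n_val


# Class sizes taken from the dataset description.
class_sizes = {"cracked": 7501, "uncracked": 7467}

train_total = 0
val_total = 0
for name, n in class_sizes.items():
    n_train, n_val = split_counts(n, 0.2)
    train_total += n_train
    val_total += n_val

print(train_total, val_total)  # 11975 2993
```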


**Creating the CNN model**

In [21]:

my_mod=tf.keras.models.Sequential()

In [22]:
my_mod.add(tf.keras.Input(shape=(32,32, 3)))
my_mod.add(tf.keras.layers.Conv2D(filters=16, kernel_size=(3, 3), activation='relu')) #convolution layer added
my_mod.add(tf.keras.layers.MaxPool2D(pool_size=(2, 2)))  # adding maxpool layer
my_mod.add(tf.keras.layers.Conv2D(filters=32, kernel_size=(3, 3), activation='relu'))
my_mod.add(tf.keras.layers.MaxPool2D(pool_size=(2, 2)))
my_mod.add(tf.keras.layers.Flatten())  # adding the flattening layer
my_mod.add(tf.keras.layers.Dense(32, activation='relu'))
my_mod.add(tf.keras.layers.Dense(1, activation='sigmoid'))



In [25]:
#model summary checked
my_mod.summary()

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d_14 (Conv2D)          (None, 30, 30, 16)        448       
                                                                 
 max_pooling2d_14 (MaxPoolin  (None, 15, 15, 16)       0         
 g2D)                                                            
                                                                 
 conv2d_15 (Conv2D)          (None, 13, 13, 32)        4640      
                                                                 
 max_pooling2d_15 (MaxPoolin  (None, 6, 6, 32)         0         
 g2D)                                                            
                                                                 
 flatten_7 (Flatten)         (None, 1152)              0         
                                                                 
 dense_13 (Dense)            (None, 32)               
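
The summary output above is cut off after the first dense layer. The shapes and parameter counts it does show can be checked with a little arithmetic, and the same formulas give the counts for the two dense layers. This is a sketch derived from the layer definitions (3x3 kernels, default 'valid' padding and stride 1, 2x2 pooling), not values read from the truncated output; the helper names are hypothetical.

```python
def conv2d_out(size, kernel=3):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Spatial size after non-overlapping max pooling (remainder is dropped)."""
    return size // pool


def conv2d_params(in_channels, filters, kernel=3):
    """Weights plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * filters


def dense_params(n_inputs, units):
    """Weights plus one bias per unit."""
    return (n_inputs + 1) * units


size = conv2d_out(32)    # 30
size = pool_out(size)    # 15
size = conv2d_out(size)  # 13
size = pool_out(size)    # 6
flat = size * size * 32  # 1152 features after Flatten

params = {
    "conv1": conv2d_params(3, 16),           # 448
    "conv2": conv2d_params(16, 32),          # 4640
    "dense_hidden": dense_params(flat, 32),  # 36896
    "dense_out": dense_params(32, 1),        # 33
}
total = sum(params.values())  # 42017
print(flat, params, total)
```

Most of the parameters sit in the first dense layer, which is typical for a small network that flattens directly into a fully connected layer.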

In [27]:
#compiling the CNN model
my_mod.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])

In [29]:
#training the model
# batch_size is not passed here: the generators already yield batches of 32
my_mod.fit(train_dataset, validation_data=val_dataset, epochs=30)

Epoch 1/30
...
Epoch 30/30


<keras.callbacks.History at 0x2045996adf0>
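
The `History` object returned by `fit` is displayed here but not kept. If it were assigned, for example `history = my_mod.fit(...)`, its `history.history` attribute would be a dict of per-epoch lists with keys such as `loss`, `accuracy`, `val_loss`, and `val_accuracy`. The sketch below shows one way to pick the best epoch from such a dict. The helper `best_epoch` is hypothetical, and the numbers are illustrative placeholders, not results from this notebook.

```python
def best_epoch(history, metric="val_accuracy"):
    """Return (1-based epoch number, value) where `metric` is highest."""
    values = history[metric]
    best_index = max(range(len(values)), key=lambda i: values[i])
    return best_index + 1, values[best_index]


# Illustrative values only; the real dict would come from history.history.
example_history = {
    "accuracy": [0.75, 0.88, 0.93],
    "val_accuracy": [0.80, 0.91, 0.89],
}

print(best_epoch(example_history))  # (2, 0.91)
```

A gap between training and validation accuracy that widens over the later epochs would suggest stopping earlier than 30 epochs.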

In [30]:
test_path ='test'

In [31]:
test_filenames = os.listdir(test_path)

In [32]:
test_set = np.array([])
predictions = np.array([])

In [34]:
import seaborn as sns
import plotly.express as px
from pathlib import Path
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, classification_report
import cv2
from tensorflow.keras.layers import Input, Lambda, Dense, Flatten
from tensorflow.keras.models import Model
from tensorflow.keras.applications.inception_v3 import InceptionV3
from tensorflow.keras.applications.inception_v3 import preprocess_input
from tensorflow.keras.preprocessing import image
from tensorflow.keras.preprocessing.image import ImageDataGenerator,load_img
from tensorflow.keras.models import Sequential
from glob import glob

In [35]:
#creating class labels
a = train_dataset.class_indices
class_names = list(a.keys())

In [38]:
#creating list to store test predictions
test_predictions = []

for img in test_filenames:  # iterate in the same order as the saved filenames
    image_path = os.path.join(test_path, img)
    image = tf.keras.utils.load_img(image_path, target_size=(32, 32))
    image = tf.keras.utils.img_to_array(image)
    image = np.array([image])  # add a batch dimension: (1, 32, 32, 3)
    image = image / 255  # same rescaling as the training generator
    y_pred = my_mod.predict(image)
    # np.int was deprecated in NumPy 1.20 and later removed; the built-in int is used instead
    y_pred_label = int(np.squeeze(y_pred >= 0.5861))  # 0.5861 is the decision threshold
    test_predictions.append(class_names[y_pred_label])
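
The thresholding step can also be applied to a whole array of sigmoid outputs at once. Because `flow_from_directory` assigns class indices in alphanumeric order of the folder names, 'cracked' maps to 0 and 'uncracked' to 1, so the sigmoid output is the predicted probability of 'uncracked'. The sketch below uses plain NumPy; `probs_to_labels` is a hypothetical helper and the probabilities are made-up inputs.

```python
import numpy as np


def probs_to_labels(probs, class_names, threshold=0.5):
    """Map sigmoid outputs to class names: index 1 when prob >= threshold, else 0."""
    probs = np.asarray(probs, dtype=float).reshape(-1)  # accepts shape (n,) or (n, 1)
    indices = (probs >= threshold).astype(int)
    return [class_names[i] for i in indices]


class_names = ["cracked", "uncracked"]
example_probs = [[0.10], [0.90], [0.58]]  # made-up model outputs
print(probs_to_labels(example_probs, class_names, threshold=0.5861))
# ['cracked', 'uncracked', 'cracked']
```

With this shape of helper, the test images could be stacked into one array and passed to `predict` in a single call instead of once per image.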




**Create a DataFrame with the filename and predicted values**

In [39]:

results_df = pd.DataFrame({
    "filename": test_filenames,
    "class": test_predictions
})

# Save the DataFrame to a CSV file
results_df.to_csv("A2_22103032_Miska.csv", index=False)
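
`os.listdir` returns filenames in arbitrary order, and a plain alphabetical sort would place 10.jpg before 2.jpg. If the CSV rows should follow the numeric order 1.jpg, 2.jpg, …, 2000.jpg, the filenames can be sorted by their integer stem before predicting. This is a minimal sketch, assuming every filename has an integer stem as described for the test folder; `numeric_sort` is a hypothetical helper.

```python
from pathlib import Path


def numeric_sort(filenames):
    """Sort names like '12.jpg' by the integer before the extension."""
    return sorted(filenames, key=lambda name: int(Path(name).stem))


print(numeric_sort(["10.jpg", "2.jpg", "1.jpg"]))  # ['1.jpg', '2.jpg', '10.jpg']
```

The sort would need to happen before the prediction loop, so that filenames and predictions stay paired row by row.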