# Crack detection problem
## Shiva Singh    kaggle id - Shivasingh9927
**This is an image classification problem solved with a deep convolutional neural network. The 
dataset consists of images of concrete bridge decks, pavements, and walls. Each image either 
contains a crack or does not. The objective of this challenge is to develop a binary image 
classification model that predicts whether a given image contains a crack.**

## Data
The dataset consists of two folders: 'train' and 'test'. The 'train' folder is labelled and contains 
two sub-folders: 'Raveling' and 'Non_raveling'. The 'Raveling' sub-folder holds 350 images that show 
raveling distress, while the 'Non_raveling' sub-folder holds 350 images without raveling distress. 
All images are in '.jpg' format.

The 'test' folder contains the test images: 700 '.jpg' files in total. These images are unlabelled, 
i.e., no label says whether an image shows raveling distress. The objective of this challenge is to 
predict the class of each test image, either 'Raveling' or 'Non_raveling'. The test images are named 
1.jpg, 2.jpg, 3.jpg, …, 300.jpg.

In [None]:
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator  # same tf.keras namespace as the import below
from tensorflow.keras.preprocessing import image

Train Data preprocessing

In [15]:
train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)
training_set = train_datagen.flow_from_directory("train/",target_size = (64, 64),batch_size = 32,class_mode = 'binary')

Found 14968 images belonging to 2 classes.


CNN Model formation

In [18]:
cnn = tf.keras.models.Sequential()

In [19]:
cnn.add(tf.keras.layers.Conv2D(filters=32, kernel_size=3, activation='relu', input_shape=[64, 64, 3])) # Applying 2d convolution

In [20]:
cnn.add(tf.keras.layers.MaxPool2D(pool_size=2, strides=2)) # Applying 2x2 max pooling with stride 2

In [21]:
cnn.add(tf.keras.layers.Conv2D(filters=32, kernel_size=3, activation='relu')) # Adding another conv. layer
cnn.add(tf.keras.layers.MaxPool2D(pool_size=2, strides=2)) # Applying maxpooling

In [22]:
cnn.add(tf.keras.layers.Flatten())

In [23]:
cnn.add(tf.keras.layers.Dense(units=128, activation='relu'))   # adding a fully connected neural network layer

In [24]:
cnn.add(tf.keras.layers.Dense(units=1, activation='sigmoid')) # Single sigmoid output unit: one probability for the two-class prediction
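
The layer stack above can be checked by hand. The sketch below traces the spatial size through each layer and counts the trainable parameters. It assumes the Keras defaults that the notebook does not override ('valid' padding and stride 1 for `Conv2D`). The helper names `conv_out`, `pool_out`, `conv_params` and `dense_params` are illustrative and not part of Keras.

```python
def conv_out(size, kernel=3, stride=1):
    """Output width/height of a 'valid' (no padding) convolution."""
    return (size - kernel) // stride + 1

def pool_out(size, pool=2, stride=2):
    """Output width/height of a 'valid' max-pooling layer."""
    return (size - pool) // stride + 1

def conv_params(kernel, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return inputs * units + units

size = 64
size = conv_out(size)    # 62 after the first Conv2D
size = pool_out(size)    # 31 after the first MaxPool2D
size = conv_out(size)    # 29 after the second Conv2D
size = pool_out(size)    # 14 after the second MaxPool2D
flat = size * size * 32  # 6272 values after Flatten

total_params = (conv_params(3, 3, 32) + conv_params(3, 32, 32)
                + dense_params(flat, 128) + dense_params(128, 1))
print(size, flat, total_params)  # 14 6272 813217
```

Almost all of the parameters sit in the first dense layer (6272 x 128 weights), so the input resolution has a large effect on model size. `cnn.summary()` should report the same figures if the defaults assumed here hold.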

Compiling and fitting the model

In [25]:
cnn.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])

In [26]:
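# NOTE: test_set is not defined in any cell shown here; it must be labelled
# validation data (for example a generator built like training_set) created before this call.
cnn.fit(x = training_set, validation_data = test_set, epochs = 35)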

Epoch 1/35
...
Epoch 35/35


<keras.callbacks.History at 0x206644af040>

In [27]:
cnn.save('model.h5')
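
The `cnn.fit` call above passes `validation_data`, which needs labelled images. The 'test' folder described in the Data section is unlabelled, so a validation set would have to come from 'train'. Below is a minimal sketch of a per-class hold-out split over file names, using only the standard library. The helper `split_holdout`, the 20% fraction, and the example file names are illustrative assumptions, not the notebook's method. Keras's `ImageDataGenerator` also accepts a `validation_split` argument, used together with `subset='training'` or `subset='validation'` in `flow_from_directory`, which gives a similar result without moving files.

```python
import random

def split_holdout(files_by_class, fraction=0.2, seed=0):
    """Split each class's file list into (train, validation) parts.

    Splitting per class keeps the class balance similar in both parts.
    """
    rng = random.Random(seed)       # fixed seed gives a reproducible split
    train, val = {}, {}
    for label, files in files_by_class.items():
        shuffled = sorted(files)    # start from a deterministic order
        rng.shuffle(shuffled)
        n_val = int(len(shuffled) * fraction)
        val[label] = shuffled[:n_val]
        train[label] = shuffled[n_val:]
    return train, val

# Hypothetical file names, for illustration only.
files = {
    "cracked": [f"c{i}.jpg" for i in range(10)],
    "uncracked": [f"u{i}.jpg" for i in range(10)],
}
train_files, val_files = split_holdout(files)
print(len(train_files["cracked"]), len(val_files["cracked"]))  # 8 2
```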

In [28]:
import os

Preprocessing the test data

In [29]:
img_names=os.listdir('test/') # creating a list of test image names
img_names

['1.jpg',
 '10.jpg',
 '100.jpg',
 '1000.jpg',
 '1001.jpg',
 '1002.jpg',
 '1003.jpg',
 '1004.jpg',
 '1005.jpg',
 '1006.jpg',
 '1007.jpg',
 '1008.jpg',
 '1009.jpg',
 '101.jpg',
 '1010.jpg',
 '1011.jpg',
 '1012.jpg',
 '1013.jpg',
 '1014.jpg',
 '1015.jpg',
 '1016.jpg',
 '1017.jpg',
 '1018.jpg',
 '1019.jpg',
 '102.jpg',
 '1020.jpg',
 '1021.jpg',
 '1022.jpg',
 '1023.jpg',
 '1024.jpg',
 '1025.jpg',
 '1026.jpg',
 '1027.jpg',
 '1028.jpg',
 '1029.jpg',
 '103.jpg',
 '1030.jpg',
 '1031.jpg',
 '1032.jpg',
 '1033.jpg',
 '1034.jpg',
 '1035.jpg',
 '1036.jpg',
 '1037.jpg',
 '1038.jpg',
 '1039.jpg',
 '104.jpg',
 '1040.jpg',
 '1041.jpg',
 '1042.jpg',
 '1043.jpg',
 '1044.jpg',
 '1045.jpg',
 '1046.jpg',
 '1047.jpg',
 '1048.jpg',
 '1049.jpg',
 '105.jpg',
 '1050.jpg',
 '1051.jpg',
 '1052.jpg',
 '1053.jpg',
 '1054.jpg',
 '1055.jpg',
 '1056.jpg',
 '1057.jpg',
 '1058.jpg',
 '1059.jpg',
 '106.jpg',
 '1060.jpg',
 '1061.jpg',
 '1062.jpg',
 '1063.jpg',
 '1064.jpg',
 '1065.jpg',
 '1066.jpg',
 '1067.jpg',
 '1068.jpg'
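
`os.listdir` makes no promise about ordering, and the listing above comes back in lexicographic order ('10.jpg' before '100.jpg', and both before '101.jpg'). If the predictions should follow numeric order instead, the names can be sorted by their integer stem. This is a small sketch that assumes every file name has the form `<integer>.jpg`, as in the listing; `numeric_stem` is an illustrative helper and the short list is sample data.

```python
import os

def numeric_stem(filename):
    """Return the integer before the extension, e.g. '100.jpg' -> 100."""
    stem, _ext = os.path.splitext(filename)
    return int(stem)

# Sample names in the lexicographic order that os.listdir produced above.
img_names = ['1.jpg', '10.jpg', '100.jpg', '1000.jpg', '1001.jpg', '101.jpg', '2.jpg']
sorted_names = sorted(img_names, key=numeric_stem)
print(sorted_names)
# ['1.jpg', '2.jpg', '10.jpg', '100.jpg', '101.jpg', '1000.jpg', '1001.jpg']
```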

In [30]:
# testing on one image
test_image = image.load_img('test/1.jpg',target_size=(64,64))
test_image = image.img_to_array(test_image)
test_image = test_image/255
test_image = np.expand_dims(test_image,axis=0)
result = cnn.predict(test_image)
result



array([[0.65816575]], dtype=float32)
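
The single-image test above applies the same rescaling as training and then adds a batch dimension, because `cnn.predict` expects input of shape (batch, height, width, channels). The sketch below shows the shapes with a synthetic array standing in for the output of `image.img_to_array`; the all-white image is an assumption made only to keep the example self-contained.

```python
import numpy as np

# Stand-in for image.img_to_array(...): a 64x64 RGB image with values in 0-255.
pixels = np.full((64, 64, 3), 255.0, dtype=np.float32)

scaled = pixels / 255                    # same rescale = 1./255 as in training
batch = np.expand_dims(scaled, axis=0)   # add the leading batch dimension

print(scaled.shape)  # (64, 64, 3)
print(batch.shape)   # (1, 64, 64, 3)
```

Skipping the division would feed values up to 255 into a network trained on values between 0 and 1, which would probably make its predictions unreliable.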

Predicting the image class

In [31]:
# Loop over the test images: predict each one and label it 'cracked' if the output is <= 0.5, else 'uncracked'
pred_class=[]
for i in img_names:
    test_image = image.load_img(f'test/{i}',target_size=(64,64),color_mode='rgb')
    test_image = image.img_to_array(test_image)
    test_image = test_image/255
    test_image = np.expand_dims(test_image,axis=0)
    result = cnn.predict(test_image)
    if result[0,0] <= 0.5: # applying if else condition 
        pred_class.append('cracked')
    else:
        pred_class.append('uncracked')
print(pred_class)

['uncracked', 'cracked', 'cracked', 'cracked', 'cracked', 'uncracked', 'cracked', 'cracked', 'cracked', 'uncracked', 'cracked', 'cracked', 'cracked', 'uncracked', 'uncracked', 'uncracked', 'uncracked', 'cracked', 'uncracked', 'cracked', 'cracked', 'cracked', 'uncracked', 'uncracked', 'uncracked', 'cracked', 'uncracked', 'cracked', 'cracked', 'uncracked', 'cracked', 'cracked', 'uncracked', 'cracked', 'uncracked', 'cracked', 'cracked', 'uncracked', 'cracked', 'uncracked', 'cracked', 'cracked', 'cracked', 'uncracked', 'cracked', 'cracked', 'cracked', 'uncracked', 'cracked', 'cracked', 'cracked', 'cracked', 'uncracked', 'uncracked', 'uncracked', 'uncracked', 'uncracked', 'cracked', 'uncracked', 'uncracked', 'uncracked', 'uncracked', 'cracked', 'cracked', 'uncracked', 'cracked', 'uncracked', 'cracked', 'cracked', 'uncracked', 'uncracked', 'cracked', 'uncracked', 'uncracked', 'uncracked', 'cracked', 'uncracked', 'cracked', 'uncracked', 'cracked', 'cracked', 'cracked', 'uncracked', 'cracked',
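
Which side of 0.5 means 'cracked' depends on how Keras numbered the classes. `flow_from_directory` assigns indices to the class sub-folders in alphanumeric order and exposes the mapping as `training_set.class_indices`; with `class_mode = 'binary'` the sigmoid output is the estimated probability of the class with index 1. The sketch below reproduces that mapping in plain Python and derives the label from it instead of hard-coding the direction. The folder names 'cracked' and 'uncracked' are assumptions taken from the labels used in the loop above, and both helper functions are illustrative.

```python
def class_indices_from_folders(folder_names):
    """Mimic Keras: class index = position in the sorted folder names."""
    return {name: index for index, name in enumerate(sorted(folder_names))}

def label_from_probability(prob, class_indices, threshold=0.5):
    """Map a sigmoid output (probability of class 1) to a class name."""
    index_to_name = {index: name for name, index in class_indices.items()}
    # Same rule as the loop above: values <= threshold go to class 0.
    return index_to_name[0] if prob <= threshold else index_to_name[1]

class_indices = class_indices_from_folders(["uncracked", "cracked"])
print(class_indices)                                      # {'cracked': 0, 'uncracked': 1}
print(label_from_probability(0.65816575, class_indices))  # uncracked
print(label_from_probability(0.2, class_indices))         # cracked
```

Under these assumed folder names, the mapping agrees with the loop's `<= 0.5` rule and with the 'uncracked' label produced for the single-image output 0.65816575.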

Creating the dataframe

In [41]:
result=pd.DataFrame({'filename': img_names,'class':pred_class})
result

| | filename | class |
|---|---|---|
| 0 | 1.jpg | uncracked |
| 1 | 10.jpg | cracked |
| 2 | 100.jpg | cracked |
| 3 | 1000.jpg | cracked |
| 4 | 1001.jpg | cracked |
| ... | ... | ... |
| 1995 | 995.jpg | cracked |
| 1996 | 996.jpg | uncracked |
| 1997 | 997.jpg | uncracked |
| 1998 | 998.jpg | cracked |


In [42]:
result=result.set_index('filename')
result

| filename | class |
|---|---|
| 1.jpg | uncracked |
| 10.jpg | cracked |
| 100.jpg | cracked |
| 1000.jpg | cracked |
| 1001.jpg | cracked |
| ... | ... |
| 995.jpg | cracked |
| 996.jpg | uncracked |
| 997.jpg | uncracked |
| 998.jpg | cracked |


In [43]:
# to_csv returns None when a file path is given, so there is no CSV string to capture or print
result.to_csv('test_pred_2.csv', index = True) # saving to csv file
