# Crack Detection using Convolutional Neural Networks

In this session, we will take part in a [Kaggle competition](https://www.kaggle.com/t/28f427df84ea4d06a395beb6a0436cd3). If you have not heard of it before, Kaggle is an online data science community with many interesting datasets, code examples, open competitions, and more. I set up a competition on Kaggle for this assignment: you can download the data you need from it, view samples, and submit your results to have them graded.

To get started, open [this link](https://www.kaggle.com/t/28f427df84ea4d06a395beb6a0436cd3) and browse through the competition page. The page walks you through the details of the data and how to make a submission.

There are two ways to download the dataset. An easy (but perhaps more time-consuming) way is to click the `Data` tab on the competition page and hit the `Download All` button. This downloads the entire dataset as a zip file, which you can extract to a desired location.

Alternatively, you can use the Kaggle CLI (Command Line Interface), which can be installed by running the following command in a terminal:
```bash
pip install kaggle
```
After installing it, go to the `Account` tab of your user profile on Kaggle and select `Create New Token`. This will trigger the download of `kaggle.json`, a file containing your API credentials. The Kaggle CLI tool will look for this token at `~/.kaggle/kaggle.json` on Linux, OSX, and other UNIX-based operating systems, and at `C:\Users\<username>\.kaggle\kaggle.json` on Windows. If you are using Google Colab, see '[How to use the Kaggle API from Colab](https://colab.research.google.com/github/corrieann/kaggle/blob/master/kaggle_api_in_colab.ipynb).'
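
If you are unsure whether the token ended up in the right place, a small check like the one below may help. It is only a sketch: the helper name `kaggle_token_path` is made up for illustration, and it only covers the default locations described above.

```python
from pathlib import Path

def kaggle_token_path(home=None):
    """Return the default kaggle.json location under the given home directory.

    Hypothetical helper for illustration; it is not part of the Kaggle CLI.
    """
    home = Path(home) if home is not None else Path.home()
    return home / ".kaggle" / "kaggle.json"

token = kaggle_token_path()
if token.exists():
    print(f"Found API token at {token}")
else:
    print(f"No token at {token}; move the downloaded kaggle.json there")
```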

Once the configuration is complete, running the following command in a terminal will download the dataset:
```bash
kaggle competitions download -c <competition-name> -p <download-folder>
```

In our case, executing the following cell will download the competition dataset:

In [None]:
!kaggle competitions download -c padl-assignment-1-concrete-crack-detection -p ./data

Now you can unzip the file using Python's `zipfile` library.

In [1]:
data_dir = 'data/padl-assignment-1-concrete-crack-detection'

In [2]:
import zipfile

with zipfile.ZipFile('data/padl-assignment-1-concrete-crack-detection.zip', 'r') as zip_ref:
    zip_ref.extractall(data_dir)

In [None]:
import os
import platform
if platform.system() == "Windows":
    !dir "{data_dir}"
else:
    !ls "{data_dir}"

### Load Data

Okay, now that we have downloaded all the competition data from Kaggle, let's start loading it. As usual, we first import PyTorch and find an available device.

In [None]:
import torch

if torch.cuda.is_available():
    device = torch.device("cuda") 
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else: 
    device = torch.device("cpu")
print("Device:", device)

In [3]:
from torchvision import datasets, transforms
from torch.utils.data import random_split

image_width = 64     # TODO: Change the image size to optimize the performance
image_height = 64    # TODO: Change the image size to optimize the performance
batch_size = 32      # TODO: Change the batch size to optimize the performance
transform = transforms.Compose([   # TODO: Explore other image transformations to optimize the performance
    transforms.Grayscale(),
    transforms.Resize((image_height, image_width)),
    transforms.ToTensor(),
    transforms.Lambda(torch.flatten)  # TODO: We are flattening image to a 1D vector to make it compatible with MLP. For CNN, remove this line.
])
dataset = datasets.ImageFolder(data_dir + '/train', transform=transform)

dataset_size = len(dataset)
train_size = int(0.8*dataset_size)
val_size = dataset_size - train_size
train_dataset, val_dataset = random_split(dataset, [train_size, val_size])

train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
val_loader = torch.utils.data.DataLoader(val_dataset, batch_size=batch_size)

In [None]:
import matplotlib.pyplot as plt
images, labels = next(iter(train_loader))
plt.figure()
for i, (image, label) in enumerate(zip(images, labels)):
    plt.subplot(4,8,i+1)
    plt.imshow(image.reshape(image_height,image_width),cmap='gray')
    plt.xticks([])
    plt.yticks([])
    plt.title(dataset.classes[label])

### Model Design

Below is a simple baseline model for testing. Design your own model here to make it perform better.

In [None]:
import torch.nn as nn

class SimpleClassifier(nn.Module):

    def __init__(self, num_inputs, num_hidden, num_outputs):
        super().__init__()
        # TODO: Try different architecture design (e.g., your own CNN, transfer a pre-trained CNN, etc.)
        self.linear1 = nn.Linear(num_inputs, num_hidden)
        self.act_fn = nn.Tanh()
        self.linear2 = nn.Linear(num_hidden, num_outputs)

    def forward(self, x):
        # TODO: Try different architecture design
        x = self.linear1(x)
        x = self.act_fn(x)
        x = self.linear2(x)
        return x
    
model = SimpleClassifier(num_inputs=image_height*image_width, num_hidden=4096, num_outputs=1)

# Printing a module shows all its submodules
print(model)
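
As one possible starting point for your own design, here is a minimal sketch of a small CNN. It assumes 1x64x64 grayscale inputs (that is, the flattening step has been removed from the transforms), and the class name `SmallCNN` and all layer sizes are arbitrary choices for illustration, not a recommended architecture.

```python
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    """Hypothetical baseline CNN for 1x64x64 grayscale inputs (illustrative only)."""

    def __init__(self, num_outputs=1):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),   # 1x64x64 -> 16x64x64
            nn.ReLU(),
            nn.MaxPool2d(2),                              # -> 16x32x32
            nn.Conv2d(16, 32, kernel_size=3, padding=1),  # -> 32x32x32
            nn.ReLU(),
            nn.MaxPool2d(2),                              # -> 32x16x16
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),                                 # -> 8192 features
            nn.Linear(32 * 16 * 16, 64),
            nn.ReLU(),
            nn.Linear(64, num_outputs),                   # single logit per image
        )

    def forward(self, x):
        return self.classifier(self.features(x))

cnn_model = SmallCNN()
dummy_batch = torch.zeros(4, 1, 64, 64)  # [batch, channels, height, width]
print(cnn_model(dummy_batch).shape)      # torch.Size([4, 1])
```

Because the output is still one logit per image, a model like this should slot into the same training and evaluation code as the baseline. If you change the image size, the input size of the first `nn.Linear` layer has to change with it.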

### Train

Finally, it's time to train the model!

In [6]:
# TODO: Try different optimizers and parameter combinations
optimizer = torch.optim.SGD(model.parameters(), lr=3e-4)

In [None]:
model.to(device)    # Load the model to the device

loss_module = nn.BCEWithLogitsLoss()

# Training loop
num_epochs = 100     # TODO: Increase the number of epochs (e.g., 1000) to see if it converges better
for epoch in range(num_epochs):
    model.train()       # Set model to train mode
    running_loss = 0
    for i, (data_inputs, data_labels) in enumerate(train_loader):

        ## Step 1: Move input data to device (only strictly necessary if we use GPU)
        data_inputs = data_inputs.to(device)
        data_labels = data_labels.to(device)

        ## Step 2: Run the model on the input data
        preds = model(data_inputs)
        preds = preds.squeeze(dim=1) # Output is [Batch size, 1], but we want [Batch size]

        ## Step 3: Calculate the loss
        loss = loss_module(preds, data_labels.float())

        ## Step 4: Perform backpropagation
        # Before calculating the gradients, we need to ensure that they are all zero.
        # The gradients would not be overwritten, but actually added to the existing ones.
        optimizer.zero_grad()
        # Perform backpropagation
        loss.backward()

        ## Step 5: Update the parameters
        optimizer.step()

        ## Step 6: Print the progress before moving to the next iteration
        running_loss += loss.detach().cpu().numpy()
        print(f"Epoch {epoch} loss: {running_loss/(i+1)}", end='\r')
    avg_train_loss = running_loss / (i + 1)   # i is zero-based, so there were i + 1 batches

    # Repeat the same, but this time on the validation data to evaluate the progress
    model.eval()       # Set model to evaluation mode
    running_val_loss = 0
    with torch.no_grad():   # Gradients are not needed for validation
        for i, (data_inputs, data_labels) in enumerate(val_loader):

            data_inputs = data_inputs.to(device)
            data_labels = data_labels.to(device)

            preds = model(data_inputs)
            preds = preds.squeeze(dim=1)

            loss = loss_module(preds, data_labels.float())

            # For validation data, we skip backpropagation (we are not training!)

            running_val_loss += loss.detach().cpu().numpy()
            print(f"Epoch {epoch} - loss: {avg_train_loss}, val: {running_val_loss/(i+1)}", end='\r')
    print('')
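
Note that the training loop passes raw model outputs (logits) to `nn.BCEWithLogitsLoss`, which applies the sigmoid internally. The pure-Python sketch below shows, for a single logit, a numerically stable way to write that loss and checks it against the naive sigmoid-then-cross-entropy formula. The function names are made up for illustration, and this is not PyTorch's actual implementation.

```python
import math

def bce_with_logits(logit, target):
    """Numerically stable binary cross-entropy on a raw logit.

    Algebraically equal to -[t*log(sigmoid(x)) + (1-t)*log(1-sigmoid(x))].
    """
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

logit, target = 2.0, 1.0
stable = bce_with_logits(logit, target)
naive = -(target * math.log(sigmoid(logit))
          + (1 - target) * math.log(1 - sigmoid(logit)))
print(stable, naive)  # the two values agree up to floating-point error
```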

In [8]:
def eval_model(model, data_loader):
    model.eval() # Set model to eval mode
    true_preds, num_preds = 0., 0.

    with torch.no_grad(): # Deactivate gradients for the following code
        for data_inputs, data_labels in data_loader:

            # Determine prediction of model on dev set
            data_inputs, data_labels = data_inputs.to(device), data_labels.to(device)
            preds = model(data_inputs)
            preds = preds.squeeze(dim=1)
            preds = torch.sigmoid(preds) # Sigmoid to map predictions between 0 and 1
            pred_labels = (preds >= 0.5).long() # Binarize predictions to 0 and 1

            # Keep records of predictions for the accuracy metric (true_preds=TP+TN, num_preds=TP+TN+FP+FN)
            true_preds += (pred_labels == data_labels).sum().item()  # .item() keeps the counter a plain Python number
            num_preds += data_labels.shape[0]

    acc = true_preds / num_preds
    print(f"Accuracy of the model: {100.0*acc:4.2f}%")

In [None]:
eval_model(model, train_loader)

In [None]:
eval_model(model, val_loader)

### Submission

Now that we have a model trained, let's evaluate it on the test data. To do so, we create a separate test data loader.

In [16]:
import os
from PIL import Image

# Custom dataset class to keep track of the filename
class TestDataset(torch.utils.data.Dataset):
    def __init__(self, root_dir, transform=None):
        self.root_dir = root_dir
        self.transform = transform
        self.image_files = [f for f in os.listdir(root_dir) if f.endswith('.jpg')]

    def __len__(self):
        return len(self.image_files)

    def __getitem__(self, idx):
        img_name = self.image_files[idx]
        img_path = os.path.join(self.root_dir, img_name)
        image = Image.open(img_path).convert('RGB')

        if self.transform:
            image = self.transform(image)

        return image, img_name

In [21]:
test_transform = transforms.Compose([   # Be careful: No other data augmentation should be included in test transform
    transforms.Grayscale(),
    transforms.Resize((image_height, image_width)),
    transforms.ToTensor(),
    transforms.Lambda(torch.flatten)  # TODO: We are flattening image to a 1D vector to make it compatible with MLP. For CNN, remove this line.
])
test_dataset = TestDataset(data_dir + '/test', transform=test_transform)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=batch_size)

In [None]:
model.eval() # Set model to eval mode

predictions = {}
with torch.no_grad(): # Deactivate gradients for the following code
    for data_inputs, filenames in test_loader:

        # Infer model on a test batch
        data_inputs = data_inputs.to(device)
        preds = model(data_inputs)
        preds = preds.squeeze(dim=1)
        preds = torch.sigmoid(preds) # Sigmoid to map predictions between 0 and 1
        pred_labels = (preds >= 0.5).long() # Binarize predictions to 0 and 1

        for filename, pred in zip(filenames, pred_labels):
            predictions[filename] = int(pred.detach().cpu().numpy())

print(predictions)

In [None]:
import csv

# Write the result to a CSV file
csv_file = 'submission.csv'
with open(csv_file, mode='w', newline='') as file:
    writer = csv.writer(file)
    # Write header
    writer.writerow(['FILENAME', 'TARGET'])
    # Write content
    for filename, label in predictions.items():
        writer.writerow([filename, label])

print(f"Prediction results have been successfully written to {csv_file}")
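
Before uploading, it can be worth reading the file back and checking its basic structure. The following is a minimal sketch: `check_submission` is a hypothetical helper, the checks assume the `FILENAME`/`TARGET` header written above, and the demo runs on a throwaway file with made-up filenames rather than on real predictions.

```python
import csv
import os
import tempfile

def check_submission(path):
    """Read a submission CSV back and run basic sanity checks on it."""
    with open(path, newline='') as f:
        rows = list(csv.DictReader(f))
    assert rows, "submission is empty"
    assert set(rows[0].keys()) == {'FILENAME', 'TARGET'}, "unexpected header"
    assert all(r['TARGET'] in {'0', '1'} for r in rows), "labels must be 0 or 1"
    assert len({r['FILENAME'] for r in rows}) == len(rows), "duplicate filenames"
    return len(rows)

# Demo on a temporary file with made-up filenames
demo_path = os.path.join(tempfile.mkdtemp(), 'demo_submission.csv')
with open(demo_path, mode='w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['FILENAME', 'TARGET'])
    writer.writerows([['a.jpg', 0], ['b.jpg', 1]])

num_rows = check_submission(demo_path)
print(num_rows)  # 2
```

To check the real file, call `check_submission('submission.csv')` after the cell above has run.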

Now you can submit it through either the [competition page](https://www.kaggle.com/competitions/padl-assignment-1-concrete-crack-detection) or the following command.


In [None]:
!kaggle competitions submit -c padl-assignment-1-concrete-crack-detection -f submission.csv -m "[PROVIDE-A-SHORT-DESCRIPTION]"