# Defect Detection Project Example

This notebook will show the process for building and training a model for multi-class defect classification and localization.  

We will use the **Capsule** data from the [MVTec AD dataset](https://www.mvtec.com/company/research/datasets/mvtec-ad).  
The data is released under the *Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0)*.  
The dataset is used strictly for non-commercial, educational purposes.

The model will detect whether an image of a medicinal capsule shows a defect and, if so, locate the defect in the picture.

## 1. Setup

### 1.1 Imports

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
from pathlib import Path


### 1.2 Global Settings

### 1.3 Dataset Paths & Metadata

In [2]:
root = Path("../data/capsule")
train_dir = root / "train" / "good"
test_dir = root / "test"
gt_dir = root / "ground_truth"

In [3]:
# TRAIN SET
train_df = pd.DataFrame({
    "filepath": [str(p) for p in (train_dir.glob("*.png"))],
    "label_binary": 0,
    "defect_type": "good",
    "mask_path": None
})

In [4]:
# TEST SET
records = []
for defect_folder in test_dir.iterdir():
    label = 0 if defect_folder.name == "good" else 1
    for img_path in defect_folder.glob("*.png"):
        mask_name = img_path.stem + "_mask.png"  # e.g. '001' -> '001_mask.png'
        mask_path = (gt_dir / defect_folder.name / mask_name
                     if label == 1 else None)
        records.append({
            "filepath": str(img_path),
            "label_binary": label,
            "defect_type": defect_folder.name,
            "mask_path": str(mask_path) if mask_path and mask_path.exists() else None
        })

test_df = pd.DataFrame(records)
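
Because `mask_path` falls back to `None` when the expected file is missing, it can be worth checking how many images of each defect type actually resolved to a mask. The following is a minimal sketch and not part of the original notebook: `mask_coverage` is a hypothetical helper, and the toy frame only mimics the columns of `test_df` with made-up values.

```python
import pandas as pd

def mask_coverage(df: pd.DataFrame) -> pd.DataFrame:
    """Per defect type: number of images and number with a resolved mask path."""
    return (
        df.assign(has_mask=df["mask_path"].notna())
          .groupby("defect_type")
          .agg(n_images=("filepath", "size"), n_masks=("has_mask", "sum"))
    )

# Toy frame with the same columns as test_df (values are made up for illustration).
demo = pd.DataFrame({
    "filepath": ["a.png", "b.png", "c.png"],
    "defect_type": ["crack", "crack", "good"],
    "mask_path": ["a_mask.png", None, None],
})
coverage = mask_coverage(demo)
print(coverage)
```

Calling `mask_coverage(test_df)` would give the same summary for the real test set. Any defect type other than `good` where `n_masks` is lower than `n_images` has images without a ground-truth mask.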

In [5]:
class_names = ['good'] + sorted([cls for cls in test_df["defect_type"].unique() if cls != 'good'])
class_to_idx = {name: idx for idx, name in enumerate(class_names)}

# Add class ID column
test_df["label_multiclass"] = test_df["defect_type"].map(class_to_idx)
train_df["label_multiclass"] = class_to_idx["good"]

In [6]:
class_names

['good', 'crack', 'faulty_imprint', 'poke', 'scratch', 'squeeze']
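
When predictions come back as integer class IDs, the inverse of `class_to_idx` is needed to turn them into readable names. A small sketch, assuming the class list printed above; `idx_to_class` and `decode` are hypothetical names that do not appear in the original notebook.

```python
# Class list as printed above; 'good' is pinned to index 0.
class_names = ["good", "crack", "faulty_imprint", "poke", "scratch", "squeeze"]
class_to_idx = {name: idx for idx, name in enumerate(class_names)}

# Inverse mapping: integer class ID -> class name.
idx_to_class = {idx: name for name, idx in class_to_idx.items()}

def decode(indices):
    """Map an iterable of integer class IDs back to class names."""
    return [idx_to_class[int(i)] for i in indices]

decoded = decode([0, 5, 2])
print(decoded)
```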

In [7]:
train_df.head()

| | filepath | label_binary | defect_type | mask_path | label_multiclass |
|---|---|---|---|---|---|
| 0 | ../data/capsule/train/good/095.png | 0 | good | | 0 |
| 1 | ../data/capsule/train/good/036.png | 0 | good | | 0 |
| 2 | ../data/capsule/train/good/028.png | 0 | good | | 0 |
| 3 | ../data/capsule/train/good/039.png | 0 | good | | 0 |
| 4 | ../data/capsule/train/good/149.png | 0 | good | | 0 |


In [9]:
test_df.head()

| | filepath | label_binary | defect_type | mask_path | label_multiclass |
|---|---|---|---|---|---|
| 0 | ../data/capsule/test/squeeze/017.png | 1 | squeeze | ../data/capsule/ground_truth/squeeze/017_mask.png | 5 |
| 1 | ../data/capsule/test/squeeze/002.png | 1 | squeeze | ../data/capsule/ground_truth/squeeze/002_mask.png | 5 |
| 2 | ../data/capsule/test/squeeze/005.png | 1 | squeeze | ../data/capsule/ground_truth/squeeze/005_mask.png | 5 |
| 3 | ../data/capsule/test/squeeze/013.png | 1 | squeeze | ../data/capsule/ground_truth/squeeze/013_mask.png | 5 |
| 4 | ../data/capsule/test/squeeze/010.png | 1 | squeeze | ../data/capsule/ground_truth/squeeze/010_mask.png | 5 |
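
The `mask_path` column points at the ground-truth masks that the localization part of the model can learn from. The notebook does not yet specify the localization target format, so the following is only one possible sketch: it reduces a binary mask to a single bounding box. `mask_to_bbox` is a hypothetical helper, and the synthetic array stands in for a mask loaded from disk.

```python
import numpy as np

def mask_to_bbox(mask: np.ndarray):
    """Return (x_min, y_min, x_max, y_max) of the nonzero pixels, or None if the mask is empty."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

# Synthetic 8x8 mask with a defect region in rows 2..4 and columns 3..6.
demo_mask = np.zeros((8, 8), dtype=np.uint8)
demo_mask[2:5, 3:7] = 255
bbox = mask_to_bbox(demo_mask)
print(bbox)
```

A single box is a coarse summary: a mask with several separate defect regions would be merged into one enclosing box, so a segmentation-style target may be preferable in that case.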


In [10]:

def load_image_and_label(path, label):
    image = tf.io.read_file(path)
    image = tf.image.decode_png(image, channels=3)
    image = tf.image.resize(image, [1000, 1000])  # returns float32; adjust the size as needed
    image = image / 255.0  # scale pixel values to [0, 1]
    return image, label

def df_to_dataset(df, label_column, batch_size=32, shuffle=True):
    paths = df["filepath"].values
    labels = df[label_column].values
    ds = tf.data.Dataset.from_tensor_slices((paths, labels))
    if shuffle:
        # Shuffle the (path, label) pairs before decoding, so the shuffle
        # buffer holds short strings rather than full decoded images.
        ds = ds.shuffle(buffer_size=len(df))
    ds = ds.map(load_image_and_label, num_parallel_calls=tf.data.AUTOTUNE)
    return ds.batch(batch_size).prefetch(tf.data.AUTOTUNE)

train_ds = df_to_dataset(train_df, "label_multiclass")
test_ds = df_to_dataset(test_df, "label_multiclass", shuffle=False)

W0000 00:00:1749734463.677982   29082 gpu_device.cc:2430] TensorFlow was not built with CUDA kernel binaries compatible with compute capability 12.0. CUDA kernels will be jit-compiled from PTX, which could take 30 minutes or longer.
I0000 00:00:1749734463.768814   29082 gpu_device.cc:2019] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 12953 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 5060 Ti, pci bus id: 0000:01:00.0, compute capability: 12.0


## 2. Data Analysis & Visualization

### 2.1 Class/Type Distribution

### 2.2 Sample Visualization

### 2.3 Image Properties

## 3. Preprocessing & Data Pipeline

### 3.1 Data Splits

### 3.2 Data Augmentation

### 3.3 Label Encoding

## 4. Model Architecture

### 4.1 Base (Transfer Learning)

### 4.2 Multi-Head Outputs

### 4.3 Combined Model

## 5. Losses & Metrics

### 5.1 Custom Loss

### 5.2 Callbacks

## 6. Training

### 6.1 Fit Model

### 6.2 Monitor Performance

## 7. Evaluation

### 7.1 Quantitative Evaluation

### 7.2 Qualitative Results

## 8. Exporting for Reuse & Deployment

## 9. Inference Code