# **Project: Anomaly Detection for AITEX Dataset**
#### Track: VAE
## `Notebook 8`: Why the VAE Approach Struggles with AITEX Anomaly Detection
**Author**: Oliver Grau 

**Date**: 27.03.2025  
**Version**: 1.0

## 📚 Table of Contents

- [Why the VAE Approach Struggles with AITEX Anomaly Detection](#why-the-vae-approach-struggles-with-aitex-anomaly-detection)
  - [1. Introduction](#1-introduction)
  - [2. What We Tried: Architectural Experiments](#2-what-we-tried-architectural-experiments)
    - [Original VAE Architecture](#original-vae-architecture)
    - [Introducing Skip Connections](#introducing-skip-connections)
    - [Attention Blocks (SEBlocks)](#attention-blocks-seblocks)
    - [Gated Skip Connections](#gated-skip-connections)
  - [3. Observations from Experiments](#3-observations-from-experiments)
    - [Interpreting Our Observations: Why Did the VAE Struggle?](#interpreting-our-observations-why-did-the-vae-struggle)
    - [Why Thresholding Error Maps Isn’t Enough](#why-thresholding-error-maps-isnt-enough)
    - [How VAEs Respond to Different Anomaly Types](#how-vaes-respond-to-different-anomaly-types)
      - [Behavior Summary](#-behavior-summary)
      - [Why does this happen?](#why-does-this-happen)
      - [Takeaway for Learners](#takeaway-for-learners)
  - [3. What Did We Learn from This?](#️-3-what-did-we-learn-from-this)
    - [Next Steps: Beyond VAE](#-next-steps-beyond-vae)
    - [Conclusion for Learners](#-conclusion-for-learners)
      - [Final thought](#-final-thought)
    - [Final Reflection: Why VAEs Aren’t Enough for AITEX](#final-reflection-why-vaes-arent-enough-for-aitex)

---

## 1. Introduction

In this notebook, we summarize the exploratory process and key findings from using **Variational Autoencoders (VAEs)** for anomaly detection on the **AITEX fabric dataset**. 

You’ll see clearly why a simple **reconstruction-based VAE approach** faces difficulties, despite several attempts to improve it through architectural enhancements like skip connections, attention blocks, and gated skip connections.

---

## 2. What We Tried: Architectural Experiments

Throughout our experiments, we explored multiple ways to make the VAE reconstruction more accurate and more sensitive to anomalies. Here’s what we tested in detail:

### **1. Original VAE Architecture**
- A basic Convolutional VAE trained only on **normal (defect-free)** fabric images.
- Goal: **Detect anomalies by reconstruction error** (high error → anomaly).
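
The scoring idea can be sketched in a few lines of NumPy. This is a minimal illustration, not the notebook's actual pipeline: the "reconstruction" is a hand-made stand-in for a trained VAE's output, and `reconstruction_error_map`, `image_score`, and the `top_fraction` parameter are hypothetical names chosen for this example.

```python
import numpy as np

def reconstruction_error_map(image, reconstruction):
    """Per-pixel squared error between an image and its reconstruction (both HxW)."""
    return (image.astype(np.float64) - reconstruction.astype(np.float64)) ** 2

def image_score(error_map, top_fraction=0.01):
    """Image-level anomaly score: mean of the largest `top_fraction` of pixel errors."""
    flat = np.sort(error_map.ravel())
    k = max(1, int(round(top_fraction * flat.size)))
    return float(flat[-k:].mean())

# Toy data: a flat "fabric" with slight noise, and a copy with a bright 4x4 defect.
rng = np.random.default_rng(0)
normal = 0.5 + 0.02 * rng.standard_normal((64, 64))
defective = normal.copy()
defective[30:34, 30:34] = 1.0

# Assumption: the model reproduces plain normal fabric regardless of the input.
recon = np.full((64, 64), 0.5)

score_normal = image_score(reconstruction_error_map(normal, recon))
score_defect = image_score(reconstruction_error_map(defective, recon))
print(f"normal: {score_normal:.4f}  defective: {score_defect:.4f}")
```

The defective image scores higher only because the stand-in reconstruction ignores the defect. The rest of this notebook is about the cases where a real VAE does not behave that conveniently.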

### **2. Introducing Skip Connections**
We tried two types of skip connections to enhance reconstruction:
- **Additive Skip Connections** (directly adding encoder features to the decoder).
- **Concatenative Skip Connections** (concatenating encoder features to decoder features, followed by convolutional fusion).

**Why we did this:**  
We wanted better reconstruction quality so anomalies would stand out more sharply in the error map.
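
To make the difference between the two variants concrete, here is a small NumPy sketch on single `C x H x W` feature maps. It assumes matching shapes and uses random weights in place of learned ones; the function names are illustrative and not taken from the notebook's model code.

```python
import numpy as np

def additive_skip(decoder_feat, encoder_feat):
    """Add encoder features to decoder features (shapes must match: C x H x W)."""
    return decoder_feat + encoder_feat

def concat_skip(decoder_feat, encoder_feat, fusion_weight):
    """Concatenate along channels, then fuse with a 1x1 convolution.

    fusion_weight has shape (C_out, C_dec + C_enc). A 1x1 convolution is a
    per-pixel linear map over channels, written here as an einsum.
    """
    stacked = np.concatenate([decoder_feat, encoder_feat], axis=0)
    return np.einsum("oc,chw->ohw", fusion_weight, stacked)

rng = np.random.default_rng(0)
dec = rng.standard_normal((8, 16, 16))
enc = rng.standard_normal((8, 16, 16))
w = 0.1 * rng.standard_normal((8, 16))   # would be learned in a real model

added = additive_skip(dec, enc)
fused = concat_skip(dec, enc, w)
print(added.shape, fused.shape)
```

With the fusion weight set to two stacked identity matrices, the concatenative variant reduces to the additive one, so concatenation plus fusion is the more flexible of the two.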

### **3. Attention Blocks (SEBlocks)**
We integrated **Squeeze-and-Excitation (SE) blocks** to dynamically recalibrate feature channels at the latent bottleneck, helping the network to decide which features were important.

**Motivation:**  
To see if attention could help the VAE reconstruct normal textures well, yet struggle to reconstruct anomalies, thus highlighting them clearly.
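
For readers new to SE blocks, the mechanism is: pool each channel to one number, pass the result through a small two-layer network, and rescale the channels with the output. The NumPy sketch below follows that recipe with random weights; the reduction ratio `r = 4` and all names are assumptions for illustration, not the notebook's configuration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_block(feat, w_reduce, w_expand):
    """Squeeze-and-Excitation on a C x H x W feature map.

    w_reduce has shape (C // r, C) and w_expand has shape (C, C // r);
    both would be learned weights in a real model.
    """
    squeezed = feat.mean(axis=(1, 2))               # squeeze: global average pool -> (C,)
    hidden = np.maximum(0.0, w_reduce @ squeezed)   # excitation, first layer with ReLU
    scale = sigmoid(w_expand @ hidden)              # per-channel weight in (0, 1)
    return feat * scale[:, None, None], scale

rng = np.random.default_rng(0)
C, r = 16, 4
feat = rng.standard_normal((C, 8, 8))
out, scale = se_block(
    feat,
    rng.standard_normal((C // r, C)),
    rng.standard_normal((C, C // r)),
)
print(out.shape, float(scale.min()), float(scale.max()))
```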

### **4. Gated Skip Connections**
We finally tested **ChannelGate**, a learned gating mechanism applied to skip connections, giving the model control over how much encoder information should flow into the decoder.

**Reasoning:**  
We hypothesized gating could help suppress the reconstruction of anomalous details carried through skip connections.
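
A gated skip connection can be sketched as an additive skip whose encoder branch is multiplied by a per-channel value between 0 and 1. The version below uses fixed gate logits to show the two extremes; how the notebook's `ChannelGate` actually computes its gate (for example, from the features themselves) is not shown here, so treat this as an assumption-laden illustration.

```python
import numpy as np

def gated_skip(decoder_feat, encoder_feat, gate_logits):
    """Scale each encoder channel by a gate in (0, 1) before adding it to the decoder."""
    gate = 1.0 / (1.0 + np.exp(-gate_logits))   # shape (C,); learned in a real model
    return decoder_feat + gate[:, None, None] * encoder_feat

rng = np.random.default_rng(0)
dec = rng.standard_normal((4, 8, 8))
enc = rng.standard_normal((4, 8, 8))

closed = gated_skip(dec, enc, np.full(4, -20.0))   # gate near 0: skip path suppressed
opened = gated_skip(dec, enc, np.full(4, 20.0))    # gate near 1: plain additive skip

print(np.abs(closed - dec).max(), np.abs(opened - (dec + enc)).max())
```

A closed gate leaves only the decoder path, which is the behavior we hoped the model would learn for anomalous detail.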

---

## 3. Observations from Experiments

During multiple training sessions and thorough evaluations, we observed several consistent behaviors in the VAE approach:

**1. Bright Anomalies (White spots):**  
- Clearly reconstructed incorrectly (turned darker), resulting in high reconstruction error.
- ✅ **Good anomaly detection**.

![Bright anomaly example](images/white_anomaly.png)

---

**2. Dark Anomalies (Black fissures, texture disruptions):**  
- Reconstructed inversely (turned brighter/whitish), thus reducing pixel-wise error, or even causing inversion in the error map.
- ❌ **Detected inversely** (low usefulness of error maps).

![Inverted dark anomaly example](images/inversed_anomaly.png)

---

**3. Brightness-Only Variations:**  
- The VAE reconstructed these too well (only a minor brightness shift), yielding little or no error.
- ❌ **Not detectable by VAE**.

![Undetected Brightness change](images/brightness_only_anomaly.png)

---

**4. Texture Disruptions (lines, folds):**  
- Clearly visible in error heat maps but resulted in noisy or unusable binary masks.
- ❌ **Error map too noisy or ambiguous for reliable detection**.

![Non usable prediction mask](images/line_not_usable.png)

---

**5. Small Dot-like Anomalies (Blobs, specks):**  
- Clearly highlighted in the error heat map (yellow high-intensity region), but thresholding methods like **Otsu** failed to isolate these small anomalies.
- ❌ **Difficult to threshold reliably**.

![Dot-like anomaly](images/dot_not_usable.png)

---

### Interpreting Our Observations: Why Did the VAE Struggle?

The difficulties we observed weren’t due to flaws in our implementation but inherent limitations in how VAE-based anomaly detection operates:

- **VAEs aim to generalize:** They reconstruct patterns broadly similar to what they saw during training, even if these patterns contain subtle anomalies. As a result, anomalies that resemble normal patterns were reconstructed too well.
- **Reconstruction losses (MSE, SSIM) are limited:** MSE compares individual pixels and SSIM compares local windows. Both heavily penalize brightness shifts or minor noise but do not always correlate strongly with structural anomalies.
- **Skip connections and attention:** They improved reconstruction quality globally, but this also made some anomalies easier to reconstruct, which lowered the error scores for real anomalies.
- **Thresholding is a challenging post-processing step:** Even when the anomaly is clearly visible in the error map, simple global thresholding (e.g., Otsu) often fails because it doesn’t consider spatial or structural context.

---

### Why Thresholding Error Maps Isn’t Enough

As an illustrative example, take the following image:

- Error map clearly highlights the anomaly (bright yellow spot).
- Yet, thresholding produces an unusable noisy mask.

![Clear anomaly but noisy threshold](images/dot_not_usable_2.png)

This happens because methods like **Otsu** are global, unsupervised, and lack spatial understanding. They work well with simple or large anomalies, but fail when anomalies are small, subtle, or amidst noisy backgrounds.
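
The effect is easy to reproduce on synthetic data. The sketch below implements Otsu's method directly in NumPy (a library implementation would serve equally well) and applies it to a made-up error map: half-normal background noise plus one small, very bright 3x3 defect. The map, its noise level, and the defect size are assumptions for illustration and are not taken from AITEX.

```python
import numpy as np

def otsu_threshold(values, bins=256):
    """Otsu's method: choose the threshold that maximizes between-class variance."""
    hist, edges = np.histogram(values.ravel(), bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    weight_lo = np.cumsum(hist)                      # pixels at or below each bin
    weight_hi = weight_lo[-1] - weight_lo            # pixels above each bin
    cum_sum = np.cumsum(hist * centers)
    valid = (weight_lo > 0) & (weight_hi > 0)
    mean_lo = cum_sum[valid] / weight_lo[valid]
    mean_hi = (cum_sum[-1] - cum_sum[valid]) / weight_hi[valid]
    between = weight_lo[valid] * weight_hi[valid] * (mean_lo - mean_hi) ** 2
    return edges[1:][valid][np.argmax(between)]      # upper edge of the best split bin

rng = np.random.default_rng(0)
error_map = np.abs(rng.normal(0.0, 0.1, size=(128, 128)))   # noisy background error
error_map[60:63, 60:63] = 1.0                               # small, clearly visible defect

t = otsu_threshold(error_map)
mask = error_map > t
print(f"threshold={t:.3f}, flagged pixels={mask.sum()}, true defect pixels=9")
```

Because nine pixels carry almost no weight in the histogram, the split that maximizes between-class variance falls inside the background noise. The defect ends up in the mask, but so does a large share of ordinary background.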

---

### How VAEs Respond to Different Anomaly Types

After extensive experimentation with the AITEX dataset, we noticed that the VAE responded **very differently** to various types of anomalies. Some were detected well, others were reconstructed too well, and some were completely ignored.

Let’s explore why this happens:

---

#### ✅ Behavior Summary

| **Anomaly Type**            | **Model Response** | **Explanation**                                                                 | **Detection Result** |
|-----------------------------|--------------------|----------------------------------------------------------------------------------|-----------------------|
| **Bright anomalies** (e.g. white dots, specks) | Reconstructed too dark | VAE has never seen such high intensities — it fails to reproduce them correctly | ✔️ Detected well via high error |
| **Dark anomalies** (e.g. fissures, holes) | Reconstructed too bright | VAE fills in plausible normal textures; may invert brightness during recon | 🔄 Sometimes inverted / misleading |
| **Brightness shifts** (global or regional) | Reconstructed smoothly | Brightness changes are treated as normal variation — decoder generalizes well | ❌ Undetected (very low error) |
| **Texture disruptions** (lines, folds) | Reconstructed with noise | Model can’t reconstruct unfamiliar structure, but adds noise instead of structure | ⚠️ Error map visible, mask too noisy |
| **Small dots or blobs** | Localized in heatmap | Clearly visible in error map, but thresholding (e.g. Otsu) fails due to small size | ⚠️ Good map, but mask extraction fails |

---

#### Why does this happen?

VAEs are trained on **normal data only**. They learn to:
- **Compress** input images into a low-dimensional latent space,
- And then **reconstruct** them as well as possible.

When something **unusual** (an anomaly) appears, the model tries to:
- **Explain it away** using what it knows
- Or **fails to reconstruct it** and produces high error

But:
- If the anomaly looks **similar to normal patterns**, the VAE just **reconstructs it as if it's normal**
- If the anomaly causes **small, global changes** (like brightness shifts), the decoder **smooths over it**
- If the anomaly is **very different**, the VAE **fails** → large error → detection works
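
The same pattern shows up in a much simpler model. The sketch below uses PCA with a one-dimensional latent space as a linear stand-in for an autoencoder; it is an analogy, not a VAE, and the toy data (flat patches with varying brightness) is an assumption made for this example. A global brightness shift lies along the direction the model has learned, so it is reconstructed almost perfectly, while a single bright pixel is not.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pixels = 64                                     # flattened 8x8 patches

# "Normal" training patches: flat texture with varying brightness plus slight noise.
brightness = rng.uniform(0.3, 0.7, size=(500, 1))
train = brightness * np.ones((1, n_pixels)) + 0.01 * rng.standard_normal((500, n_pixels))

# Linear autoencoder via PCA: keep only the first principal direction.
mean = train.mean(axis=0)
_, _, vt = np.linalg.svd(train - mean, full_matrices=False)
basis = vt[:1]                                    # shape (1, n_pixels)

def reconstruct(x):
    latent = (x - mean) @ basis.T                 # encode to a 1-D latent code
    return mean + latent @ basis                  # decode back to pixel space

def mse(x):
    return float(np.mean((x - reconstruct(x)) ** 2))

shifted = np.full(n_pixels, 0.9)                  # brighter than any training patch
spike = np.full(n_pixels, 0.5)
spike[27] = 1.0                                   # one bright pixel

print(f"brightness shift MSE: {mse(shifted):.2e}")
print(f"local spike MSE:      {mse(spike):.2e}")
```

The brightness shift is outside the training range yet produces far less error than the spike, which mirrors the undetected brightness-only variations above.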

---

#### Takeaway for Learners

> VAEs don’t "detect" anomalies — they just try to **rebuild** what they’re given.
> Whether an anomaly gets caught depends entirely on how **strange** it is to the model.

This makes VAEs:
- Great for **clear outliers** (e.g. bright spots and specks)
- Weak for **subtle shifts** (brightness, small-scale texture drift)
- Highly dependent on **post-processing** to extract meaningful masks from error maps

---

## 🔚 3. What Did We Learn from This?

This extensive exploration taught us that:

- Pure **VAE-based anomaly detection** relies entirely on reconstruction error. It works best when anomalies are distinctly out-of-distribution.
- With textured images (like fabrics), anomalies often are subtle or structurally similar to normal patterns, causing reconstruction-based methods alone to struggle.
- To move forward, we need models capable of:
  - Learning more **semantic or structural features**.
  - Better interpreting subtle anomalies.
  - Segmenting anomalies from error maps explicitly, rather than relying on global pixel thresholds.

---

### 🚀 **Next Steps: Beyond VAE**

Given our observations, we naturally moved toward methods like:

- **PatchCore:** Leveraging pre-trained features to detect anomalies in feature space.
- **DRAEM:** Combining synthetic anomaly generation and supervised segmentation directly, explicitly learning how to identify anomalies.

These methods tackle the problems encountered in the VAE experiments head-on, and in our following notebooks, we'll explore what PatchCore and DRAEM can do better where our VAE struggled.
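
As a preview of the feature-space idea, the sketch below scores patches by their distance to the nearest entry in a memory bank of normal features. It is a heavily simplified illustration: the features are random vectors rather than outputs of a pre-trained network, and steps of the published PatchCore method such as memory-bank subsampling are left out.

```python
import numpy as np

def patch_scores(features, memory_bank):
    """Distance from each patch feature (N x D) to its nearest neighbour in the bank (M x D)."""
    diffs = features[:, None, :] - memory_bank[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=-1)).min(axis=1)

rng = np.random.default_rng(0)
memory_bank = rng.normal(0.0, 1.0, size=(200, 16))     # stand-in for normal patch features

normal_patches = memory_bank[:5] + 0.05 * rng.standard_normal((5, 16))
odd_patches = rng.normal(6.0, 1.0, size=(5, 16))        # far from anything in the bank

normal_scores = patch_scores(normal_patches, memory_bank)
odd_scores = patch_scores(odd_patches, memory_bank)
print(normal_scores.round(2), odd_scores.round(2))
```

Unlike reconstruction error, this score does not depend on how well a decoder can redraw the input. It only asks whether a similar normal patch has been seen before.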

---

### 🎯 **Conclusion for Learners**

An important take-home message from this notebook series:

- Even experiments that don't yield "success" can give invaluable insights.
- Understanding why a method **doesn't work** is often the key to choosing methods that **do**.
- Being clear about the limitations of reconstruction-based models is crucial when tackling real-world datasets like AITEX.

---

#### 🎓 **Final thought:**

> “In anomaly detection, knowing what doesn’t work (and why) is as valuable as knowing what does. Every 'failure' is just another step toward the right solution.”

---

### Final Reflection: Why VAEs Aren’t Enough for AITEX

Despite multiple architectural improvements like skip connections, attention mechanisms, and gating strategies, the core challenge remained:

> **Pure VAE-based anomaly detection cannot solve the anomaly segmentation task on AITEX.**

This limitation isn’t a flaw in implementation. It’s a **design limitation** of the VAE paradigm:

- VAEs are built to **reconstruct normal data** — but they often **reconstruct anomalies too well**.
- Anomalies that look “plausible” to the model are **recreated without high error**, and thus **undetected**.
- Subtle anomalies (like brightness shifts or minor texture variations) are **smoothed over**, and never trigger the model’s alarm.
- Post-processing with global thresholding (e.g. Otsu) often fails to extract **meaningful binary masks** from the error map.

> In short: **VAEs don't know what an anomaly is. They just try to make sense of what they see.**

To go further, we need to:
- Move beyond raw reconstruction error,
- Introduce **feature-based comparisons**, **learned segmentation**, or **representation learning**,
- And **train models that can explicitly distinguish** normal from abnormal — not just recreate.

---

In the next notebook branches, we’ll explore two approaches that do exactly that:

- **PatchCore**, which compares feature space distances using pre-trained encoders,
- **DRAEM**, which learns to detect anomalies by restoring synthetically corrupted images and training a segmentation head on them.

➡️ These methods go beyond the VAE’s limitations and deliver strong, usable results on the AITEX dataset.
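
To give a feel for the synthetic-anomaly idea, the sketch below corrupts a normal image inside a random square and returns the matching ground-truth mask, which is the kind of training pair a segmentation head can learn from. The square mask and uniform-noise texture are simplifications chosen for this example; the published DRAEM method uses more varied mask shapes and real texture sources. `make_synthetic_anomaly`, `size`, and `blend` are hypothetical names.

```python
import numpy as np

def make_synthetic_anomaly(image, rng, size=12, blend=0.7):
    """Blend a noise texture into a random square region; return (augmented image, mask)."""
    h, w = image.shape
    top = rng.integers(0, h - size + 1)
    left = rng.integers(0, w - size + 1)
    mask = np.zeros((h, w), dtype=bool)
    mask[top:top + size, left:left + size] = True
    texture = rng.uniform(0.0, 1.0, size=(h, w))   # stand-in for an external texture
    augmented = np.where(mask, (1 - blend) * image + blend * texture, image)
    return augmented, mask

rng = np.random.default_rng(0)
normal = np.full((64, 64), 0.5)
augmented, mask = make_synthetic_anomaly(normal, rng)
print(mask.sum(), float(np.abs(augmented - normal)[~mask].max()))
```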

<p style="font-size: 0.8em; text-align: center;">© 2025 Oliver Grau. Educational content for personal use only. See LICENSE.txt for full terms and conditions.</p>