# **Project: Anomaly Detection for AITEX Dataset**
#### Track: MLOps
## `Notebook 1`: Introduction and Motivation: Operationalization
**Author**: Oliver Grau 

**Date**: 27.03.2025  
**Version**: 1.0


## 📚 Table of Contents

- [1. Introduction](#1-introduction)
- [2. Goals of the Notebook](#2-goals-of-the-notebook)
- [3. Notebook Structure](#3-notebook-structure)
- [4. Conclusion & Outlook](#4--conclusion--outlook)

---

## 1. Introduction
Welcome to the **Operationalization Track** of this notebook series on anomaly detection in manufacturing using image data.

In the previous notebook tracks

- `01_VAE`
- `02_PatchCore`
- `03_DRAEM`

we explored and implemented several methods for detecting anomalies in the **AITEX fabric dataset**. We now have a trained **DRAEM-based model** that performs exceptionally well on the test set.

The next step is not about improving accuracy or building new models; it is about **bringing our best model into production.** This is the world of **MLOps**.

### Key Question: What Should the MLOps Pipeline Include?

Before jumping in, we face an important decision:

> **Should our MLOps pipeline include everything from data preparation and training to deployment, or just the inference and monitoring part?**

#### ✅ Short Answer:

It depends on our use case, but for most real-world **industrial applications** like fabric defect detection, the answer is:

> **Focus on the inference and operationalization pipeline first**, and optionally support retraining later.

#### 🏭 In the context of manufacturing:

| Stage                        | Include in MLOps?       | Why / When?                                                   |
|-----------------------------|--------------------------|----------------------------------------------------------------|
| Data preparation            | Optional                 | Only if new data arrives regularly                             |
| Model training              | Optional                 | Only needed for regular updates or concept drift               |
| Evaluation / testing        | Optional (once-off)      | Often done offline and manually                                |
| Inference pipeline          | ✅ Yes                   | The core of production usage                                   |
| Monitoring / drift detection| ✅ Yes (if critical)     | Ensure input images remain within expected distribution        |
| Logging / reproducibility   | ✅ Yes                   | Track predictions and model versions                           |

In our case, training was **intensive and complex**, and it's common to perform this step **offline**. Production environments typically care about:
- Stable inference
- Fast response
- Logging and monitoring
- Reproducibility and traceability
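
To make the monitoring row of the table more concrete, here is a minimal sketch of an input check. It assumes that the mean gray value of an image is a useful proxy for "within expected distribution"; the function names, the reference values, and the z-score threshold are all hypothetical and not taken from the project code.

```python
from statistics import mean, pstdev


def fit_reference(mean_intensities):
    """Summarise per-image mean intensities observed on known-good data."""
    return {"mean": mean(mean_intensities), "std": pstdev(mean_intensities)}


def is_out_of_distribution(image_pixels, reference, z_threshold=3.0):
    """Flag an image whose mean intensity is far from the reference mean."""
    image_mean = mean(image_pixels)
    # Guard against a zero standard deviation in the reference data.
    std = reference["std"] or 1e-8
    z_score = abs(image_mean - reference["mean"]) / std
    return z_score > z_threshold


# Hypothetical per-image mean gray values (0-255) from known-good fabric images.
reference = fit_reference([118.0, 121.5, 119.2, 120.8, 120.1])

normal_image = [119, 121, 120, 118, 122]  # flattened pixels, mean 120
dark_image = [40, 42, 38, 41, 39]         # e.g. a lighting failure, mean 40

print(is_out_of_distribution(normal_image, reference))  # False
print(is_out_of_distribution(dark_image, reference))    # True
```

A real drift monitor would likely track richer statistics than a single mean, but the structure stays the same: fit a reference on trusted data, then compare each incoming image against it.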

---

## 2. Goals of the Notebook

In this operationalization track, we’ll transform our trained DRAEM model into a usable system:

- Serve predictions on new data
- Log and track predictions + metadata
- Manage inference configurations cleanly
- Package everything as a deployable microservice (e.g., FastAPI)
- Discuss model versioning, updates, and optional retraining

### Planned Notebook Structure

| Notebook | Title                                  | Description |
|---------|----------------------------------------|-------------|
| `01_Introduction and Motivation.ipynb` | 👈 You are here | Motivation, scope, and roadmap |
| `02_Inference Pipeline.ipynb`         | Inference Pipeline | Load model, run inference, visualize anomalies |
| `03_Tracking and Config with MLflow and Wandb.ipynb`       | Tracking with MLflow and W&B    | Log predictions, metrics, images, artifacts |
| `04_Model API Deployment.ipynb`       | Serving the Model  | Wrap model in FastAPI and deploy on a service (e.g., Render) |
| `05_(Coming soon) Training Pipeline.ipynb` | Optional Retraining| How to retrain/update the model if required |
| `06_(Coming soon) Monitoring and Drift.ipynb`       | Monitoring & Drift | Optional: track input stats, detect data drift, send alerts |
| `07_(Coming soon) ONNX Export and Inference.ipynb`       | Inference with ONNX | Optional: export the model to ONNX and run inference with it |

---

## 3. Notebook Structure:

---

#### 📓 `02_Inference Pipeline.ipynb`: **Preparing Inference Logic as Reusable Code**

##### ✅ Goal:
Refactor your working inference code into **modular Python files** and add further **post-processing**.

##### 📦 Outputs (Artifacts):
- `/inference/` directory containing working inference code for a full image of size 4096x256

##### This notebook is **still notebook-based**, but helps you:
- Wrap inference into functions and scripts
- Test those functions locally
- Lay the foundation for `04_Model API Deployment.ipynb`
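
One way to handle a full 4096x256 image is to split it into fixed-size tiles before inference. The sketch below assumes that the model consumes square 256x256 tiles, which this notebook does not state; `tile_boxes` and its parameters are hypothetical names used for illustration only.

```python
def tile_boxes(width=4096, height=256, tile=256, stride=256):
    """Return (left, top, right, bottom) boxes that cover the image.

    The boxes follow the (left, upper, right, lower) convention used by
    Pillow's Image.crop. Any remainder that does not fill a whole tile
    is not covered by this simple version.
    """
    boxes = []
    for top in range(0, height - tile + 1, stride):
        for left in range(0, width - tile + 1, stride):
            boxes.append((left, top, left + tile, top + tile))
    return boxes


boxes = tile_boxes()
print(len(boxes))   # 16
print(boxes[0])     # (0, 0, 256, 256)
print(boxes[-1])    # (3840, 0, 4096, 256)
```

Using a stride smaller than the tile size would produce overlapping tiles, which can help when a defect sits on a tile border.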

---

#### 📓 `03_Tracking and Config with MLflow and Wandb.ipynb`: **Offline/Local Logging of Inferences**

##### ✅ Goal:
Use the code from `02` to run local inference on a few samples and log:
- input image
- anomaly map
- anomaly score
- model version
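
As a rough illustration of what one logged inference could contain, the sketch below writes the four items above to a JSON Lines file using only the standard library. This is a stand-in for the MLflow tracking used in notebook `03`, not its API; the file name, field names, paths, and version label are hypothetical.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def build_record(image_path, anomaly_map_path, anomaly_score, model_version):
    """Collect the metadata of a single inference in one dictionary."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "input_image": str(image_path),
        "anomaly_map": str(anomaly_map_path),
        "anomaly_score": float(anomaly_score),
        "model_version": model_version,
    }


def append_record(log_file, record):
    """Append one record as a single JSON line."""
    log_file = Path(log_file)
    log_file.parent.mkdir(parents=True, exist_ok=True)
    with log_file.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")


# Write to a temporary directory so the sketch leaves no files behind.
log_file = Path(tempfile.mkdtemp()) / "inference_log.jsonl"
record = build_record("samples/0001.png", "outputs/0001_map.png", 0.87, "draem-v1")
append_record(log_file, record)

logged = [json.loads(line) for line in log_file.read_text(encoding="utf-8").splitlines()]
print(logged[0]["anomaly_score"], logged[0]["model_version"])
```

Storing the model version next to every score is what later makes predictions traceable to the model that produced them.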

##### 📦 Outputs:
- MLflow experiment logs for your inference tests
- Registered model (optional)
- Setup of a local MLflow tracking URI (e.g. `mlruns/`)

Wrap all paths, parameters, and settings into Hydra `.yaml` files so that you can:

- Choose different test images
- Swap between model versions
- Control thresholding behavior, etc.
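
The kind of settings such a configuration might hold can be sketched with a plain dataclass. Hydra itself is not used here, and every field name, path, and default value below is a hypothetical placeholder rather than the project's actual configuration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class InferenceConfig:
    """Hypothetical inference settings gathered in one place."""
    model_path: str = "model/draem_v1.pth"
    model_version: str = "draem-v1"
    image_dir: str = "data/test"
    threshold: float = 0.5


def is_anomalous(score, cfg):
    """Apply the configured threshold to an anomaly score."""
    return score >= cfg.threshold


default_cfg = InferenceConfig()

# Swap the model version and tighten the threshold without touching code.
strict_cfg = replace(
    default_cfg,
    model_path="model/draem_v2.pth",
    model_version="draem-v2",
    threshold=0.3,
)

print(is_anomalous(0.4, default_cfg))  # False
print(is_anomalous(0.4, strict_cfg))   # True
```

With Hydra, the same fields would typically live in `.yaml` files and be overridden from the command line instead of with `replace`.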

##### 📦 Outputs:
- `/inference` folder: added code for MLflow (project file, `requirements.txt`, etc.)

##### 💡 No REST service yet! Just simulate inferences **as if** they were in production. This is the “tracking-ready” test stage.

---

#### 📓 `04_Model API Deployment.ipynb`: **Deploy Model as a REST API**

##### ✅ Goal:
Build a `FastAPI` REST service that wraps your inference code.

##### 📦 Outputs:
- `/service/` folder with FastAPI routes
- `main.py` server entry point
- Render deployment notes

---

### 📈 Visual: Notebook Flow with Artifacts

```
01 Intro
   ↓
02_Inference Pipeline.ipynb
   ➜ infer.py, preprocess.py, model weights
   ↓
03_Tracking and Config with MLflow and Wandb.ipynb
   ➜ inference logs + MLflow setup (no service yet!)
   ↓
04_Model API Deployment.ipynb
   ➜ FastAPI + REST service wrapping the above
```

---

### ✅ Summary

| Notebook | Purpose | Creates... |
|----------|---------|------------|
| `02` | Modularizes your working inference code into files | `infer.py`, `model/`, `.pth` |
| `03` | Tracks local inferences with MLflow (offline, no REST needed) | `mlruns/` |
| `04` | Turns your pipeline into a REST API | `/service/`, `main.py` |

---

## 4. 🔚 Conclusion & Outlook

In this notebook, we defined how our fourth track, **“MLOps”**, is structured and thereby laid the foundation for the notebooks that follow. In the next notebook, `02_Inference Pipeline.ipynb`, we will design and implement a complete pipeline for detecting anomalies with our trained model.

Let's get started!

<p style="font-size: 0.8em; text-align: center;">© 2025 Oliver Grau. Educational content for personal use only. See LICENSE.txt for full terms and conditions.</p>