# Paint Color Detection with VAE + GAN – MLOps Project Report

**Student:** Thiviru Dilith  
**Project:** Paint-Color-Detection – VAE + GAN color detector with MLOps  
**GitHub repository:** https://github.com/Nightkingcrypto/Paint-Color-Detection.git


## 1. Project overview

This project implements an end-to-end paint color detector that can identify the three closest matching paint shades for any uploaded color patch. The underlying model combines a **Variational Autoencoder (VAE)** with a **Generative Adversarial Network (GAN)**. The VAE learns a compact latent representation of each paint color, while the GAN is trained to make the model more robust to changes in brightness and small variations in the input patch.

The dataset consists of a folder-of-folders structure containing around **264 paint colors**. Each top-level folder corresponds to a single catalog color (for example *0N05 Fawn*), and inside each folder there is a key image plus multiple versions of the same color rendered at different brightness levels. This structure mimics how the same color appears under different lighting conditions, which makes the training data more realistic and the model less sensitive to illumination.

The final system is exposed as a **FastAPI web application** and can also be containerised with Docker. The project is fully version-controlled with **Git & GitHub**, tracked with **MLflow**, and tested automatically on every push and pull request to `main` using **GitHub Actions**. This notebook summarises the architecture, experiments, observations, and the overall MLOps workflow used in the project.


## 2. Dataset and preprocessing

The dataset is stored locally in a directory similar to `F:/Desktop/Colors/Dataset`. Each subfolder is named using the paint catalog code and human-readable color name, such as `0N05 Fawn`. Within each folder there is:

* A **key color patch** representing the canonical shade for that color.
* A series of additional patches generated under different **brightness levels**. These are named using a pattern such as `brightness_1_0.3`, `brightness_10_0.8`, up to about twenty variants per color.
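The folder layout described above can be indexed with a few lines of standard-library Python. This is only a minimal sketch and not the project's `ColorFolderDataset`: the function name `index_color_folders` and the accepted file extensions are assumptions made for illustration.

```python
from pathlib import Path

# Assumed image extensions; the report does not state the actual file type.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def index_color_folders(root: Path) -> dict[str, list[Path]]:
    """Map each color folder name (e.g. '0N05 Fawn') to its sorted image paths."""
    index: dict[str, list[Path]] = {}
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        images = sorted(
            f for f in folder.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
        if images:  # folders without images are skipped
            index[folder.name] = images
    return index
```

The folder name doubles as the label, so the returned dictionary already groups the key patch and its brightness variants under one catalog color.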

For training, images are loaded using a custom `ColorFolderDataset` class. During preprocessing each image is:

1. **Resized** to a square resolution of `64 × 64` pixels to keep the model small and fast while still capturing enough information about the color region.
2. **Converted to RGB** and scaled to the `[0, 1]` range, then normalised to a symmetric range around zero. This is convenient for both VAE and GAN training.
3. **Assigned a label** corresponding to the folder name so that similar images from the same color category can later be grouped together in latent space.
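The scaling in step 2 can be illustrated per channel value. The sketch below assumes the symmetric range is `[-1, 1]`, obtained with a mean and standard deviation of `0.5`; the report only says "symmetric around zero", so these constants and the helper names are assumptions.

```python
def normalise_pixel(value: int, mean: float = 0.5, std: float = 0.5) -> float:
    """Scale an 8-bit channel value to [0, 1], then shift to a zero-centred range."""
    if not 0 <= value <= 255:
        raise ValueError("expected an 8-bit channel value")
    scaled = value / 255.0        # [0, 255] -> [0, 1]
    return (scaled - mean) / std  # [0, 1] -> [-1, 1] when mean = std = 0.5 (assumed)


def normalise_rgb(rgb: tuple[int, int, int]) -> tuple[float, float, float]:
    """Apply the same normalisation to each of the three channels."""
    r, g, b = rgb
    return (normalise_pixel(r), normalise_pixel(g), normalise_pixel(b))
```

With these constants black maps to `-1.0` and white to `1.0`, which matches the output range of a `tanh` decoder or generator if one is used.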

The dataset is split into training and validation subsets. Because the number of images per color is relatively small, the split is done at the **image level** rather than at the color-code level, keeping the overall distribution of colors similar in both sets. No heavy data augmentation is used; instead, the natural brightness variations inside each color folder already provide useful diversity.
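One way to realise an image-level split that keeps every color represented in both subsets is to hold out a fixed share of images per folder. The following is a sketch under assumptions: the 20% validation fraction, the seed, and the name `split_per_color` are illustrative and not taken from the project code.

```python
import random


def split_per_color(
    index: dict[str, list[str]], val_fraction: float = 0.2, seed: int = 0
) -> tuple[list[tuple[str, str]], list[tuple[str, str]]]:
    """Split (image, label) pairs at image level, holding out a share of every color."""
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, val = [], []
    for label in sorted(index):
        images = sorted(index[label])
        rng.shuffle(images)
        # Hold out at least one image per color, unless the color has a single image.
        n_val = max(1, round(len(images) * val_fraction)) if len(images) > 1 else 0
        val += [(img, label) for img in images[:n_val]]
        train += [(img, label) for img in images[n_val:]]
    return train, val
```

Because the split is per image, the validation set contains brightness variants of colors the model has already seen, so validation loss measures reconstruction quality rather than generalisation to unseen colors.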

Overall, the dataset is compact but highly structured. The folder-of-folders layout makes it straightforward to aggregate embeddings per color and to later compute the three closest color matches for a new query patch.


## 3. Model architectures

The color detector is based on two neural components: a **Variational Autoencoder (VAE)** and a **GAN**. Both models operate on `64 × 64` RGB images and are implemented in PyTorch.

### 3.1 Variational Autoencoder (VAE)

The VAE is responsible for learning a smooth, low-dimensional latent representation of each color patch. The architecture includes:

* A **convolutional encoder** that progressively downsamples the input image and maps it into a latent vector of dimension `latent_dim = 32`. The encoder outputs both a mean and a log-variance for the latent distribution.
* A **reparameterisation step** that samples `z` from the learned distribution using the standard VAE trick `z = μ + σ ⊙ ε`.
* A **convolutional decoder** that mirrors the encoder and reconstructs the image back to `64 × 64 × 3` from the latent vector.
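The reparameterisation step can be written out element by element. This is a plain-Python sketch for illustration only; the project itself uses PyTorch tensors, and the function name is an assumption.

```python
import math
import random


def reparameterise(mu: list[float], logvar: list[float], rng: random.Random) -> list[float]:
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), element-wise."""
    z = []
    for m, lv in zip(mu, logvar):
        sigma = math.exp(0.5 * lv)  # log-variance -> standard deviation
        eps = rng.gauss(0.0, 1.0)   # the only source of randomness
        z.append(m + sigma * eps)
    return z
```

Keeping the randomness in `eps` is what lets gradients flow through `mu` and `logvar` during training. At inference time the mean alone is a natural choice for a deterministic embedding, although the report does not state which variant the project uses.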

The training objective combines a **reconstruction loss** (mean squared error between input and output) with a **KL divergence term** that keeps the latent distribution close to a standard normal prior. In code this is logged as `recon_loss`, `kl_div`, and their sum `train_loss`.
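Written out, a common form of this objective looks as follows. It assumes a diagonal Gaussian encoder and an unweighted sum of the two terms; the exact reduction and any weighting used in the project code are not stated in this report.

$$
\mathcal{L}(x) \;=\; \underbrace{\frac{1}{N}\sum_{i=1}^{N}\left(x_i - \hat{x}_i\right)^2}_{\text{reconstruction}} \;+\; \underbrace{\left(-\frac{1}{2}\sum_{j=1}^{d}\left(1 + \log\sigma_j^2 - \mu_j^2 - \sigma_j^2\right)\right)}_{\text{KL divergence}}
$$

Here $N = 64 \cdot 64 \cdot 3$ is the number of pixel values, $d = 32$ is the latent dimension, and $\mu_j$, $\log\sigma_j^2$ are the encoder outputs. The KL term is zero exactly when $\mu_j = 0$ and $\sigma_j = 1$ for every latent dimension.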

Hyperparameters, taken from the project `config.py`, include:

* `image_size = 64`
* `latent_dim = 32`
* `batch_size = 64`
* `vae_epochs = 25`
* `lr = 1e-3`
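A configuration module of this kind is often expressed as a small frozen dataclass. The sketch below is hypothetical: the class name `TrainConfig` and the method `as_params` are not taken from the project's `config.py`; only the field values come from the list above.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container; field values are the ones listed in this report."""
    image_size: int = 64
    latent_dim: int = 32
    batch_size: int = 64
    vae_epochs: int = 25
    lr: float = 1e-3

    def as_params(self) -> dict:
        """Flatten to a plain dict, e.g. for logging as run parameters."""
        return asdict(self)
```

Freezing the dataclass prevents accidental changes during a run, and overriding a single value such as `TrainConfig(latent_dim=16)` is enough to describe a new experiment.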

Once trained, the encoder part of the VAE is used to generate fixed-length embeddings for every color image in the dataset.

### 3.2 GAN for robustness to brightness

To make the system more tolerant to lighting changes and subtle variations, a simple **GAN** is trained on the same dataset:

* The **generator** maps random latent vectors to synthetic color patches that resemble real paint swatches.
* The **discriminator** receives an image and predicts whether it is real or generated.

The GAN is trained for a similar number of epochs with a lower learning rate (`2e-4`) so that the generator gradually learns to mimic the distribution of real patches. The discriminator loss (`d_loss`) and generator loss (`g_loss`) are tracked separately in MLflow to monitor the training dynamics. Although the GAN is not used directly at inference time, its training run and generated examples serve as a sanity check that the distribution of real patches, including the brightness variations, can be learned from this dataset.


## 4. Training experiments and observations

### 4.1 MLflow experiment setup

All experiments are tracked under a single MLflow experiment named **`color_vae_gan`**. Every call to the training scripts `train_vae.py` and `train_gan.py` starts a new MLflow run where hyperparameters, metrics, and model artefacts are logged. MLflow is configured to use a local tracking URI so that the UI can be launched with `mlflow ui` and visited at `http://127.0.0.1:5000`.

For the **VAE**, each run logs parameters such as `batch_size`, `epochs`, `image_size`, `latent_dim`, and `lr`. During training, three main metrics are recorded per epoch: the KL divergence, the reconstruction loss, and the total training loss. The final trained model is saved as an MLflow artefact and also exported to the `models/` directory for use by the inference app.

For the **GAN**, the MLflow runs record hyperparameters like `batch_size`, `epochs`, `latent_dim`, and learning rate. Two metrics are tracked per epoch: the discriminator loss (`d_loss`) and generator loss (`g_loss`). Both the discriminator and generator models are logged as separate artefacts so they can be reloaded later if needed.

### 4.2 VAE training behaviour

From the MLflow **train_vae** run, the reconstruction loss curve shows a sharp decrease in the first few epochs followed by a gradual flattening, indicating that the model quickly learns a good approximation of the color patches and then fine-tunes smaller details. The KL divergence starts close to zero and slowly increases, which is expected: at initialisation the encoder outputs carry almost no information about the input, so the approximate posterior sits close to the standard normal prior. As the encoder learns to separate colors in latent space, the posterior moves away from the prior and the KL term grows.

The combined training loss still decreases smoothly, because the drop in reconstruction loss outweighs the rise in the KL term, and it stabilises after roughly twenty epochs. There is no strong sign of overfitting: because the input patches are simple and the model relatively small, the VAE converges quickly without memorising the training data. The final reconstructions preserve the overall color while being slightly blurred, which is sufficient for embedding and nearest-neighbour search.

### 4.3 GAN training behaviour

The **train_gan** run reveals the typical adversarial training dynamics. The discriminator loss fluctuates around values between roughly 0.5 and 1.0, while the generator loss gradually increases but remains within a reasonable range. Short spikes in `d_loss` correspond to moments when the generator briefly fools the discriminator, while drops in `d_loss` accompanied by increases in `g_loss` indicate the discriminator regaining the upper hand.
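To make the relationship between the two curves concrete, the sketch below computes both losses from the discriminator's output probabilities. It assumes the standard binary cross-entropy formulation with `d_loss` as the sum of the real and fake terms; the report does not state which loss the project uses, and the function names are illustrative.

```python
import math


def bce(prob: float, target: float) -> float:
    """Binary cross-entropy for a single probability."""
    eps = 1e-12  # guards against log(0)
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


def gan_losses(d_real: float, d_fake: float) -> tuple[float, float]:
    """Return (d_loss, g_loss) from D's probability of 'real' on a real and a fake image."""
    d_loss = bce(d_real, 1.0) + bce(d_fake, 0.0)  # D wants real -> 1, fake -> 0
    g_loss = bce(d_fake, 1.0)                     # G wants D to call fakes real
    return d_loss, g_loss
```

Under this formulation a confident discriminator, for example `gan_losses(0.9, 0.1)`, yields a low `d_loss` and a high `g_loss`, whereas a discriminator that cannot tell the two apart, `gan_losses(0.5, 0.5)`, yields a `d_loss` of `2 ln 2` and a lower `g_loss`.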

Because the goal of the GAN in this project is not perfect image generation but rather to verify that the color distribution is learnable, these dynamics are acceptable. Example generated swatches resemble the real color patches in tone and brightness, confirming that the dataset and preprocessing pipeline are coherent.

Overall, MLflow makes it very easy to compare multiple runs with different hyperparameters and to visually inspect the loss curves. For the report and video I can show the VAE and GAN run pages, the metrics tab, the registered models for `vae_model`, `gan_generator`, and `gan_discriminator`, and the logged artefacts directly from the MLflow UI.


## 5. Inference pipeline and color retrieval

Once the VAE is trained, all images in the dataset are passed through the encoder to produce a matrix of latent embeddings. For each catalog color (for example *0N05 Fawn*) the embeddings of all its brightness variants are aggregated. This produces a compact representation for every color folder, stored on disk as `color_embeddings.pkl` along with label metadata.
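The aggregation step can be sketched as a per-label average. Averaging is an assumption here, since the report does not specify how the brightness variants are combined, and the function name is illustrative.

```python
from collections import defaultdict


def aggregate_by_color(
    embeddings: list[list[float]], labels: list[str]
) -> dict[str, list[float]]:
    """Average all embeddings that share a folder label into one vector per color."""
    if len(embeddings) != len(labels):
        raise ValueError("embeddings and labels must have the same length")
    groups: dict[str, list[list[float]]] = defaultdict(list)
    for vector, label in zip(embeddings, labels):
        groups[label].append(vector)
    # zip(*vectors) iterates over latent dimensions, so each column is averaged.
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in groups.items()
    }
```

The resulting dictionary has one entry per catalog color and could be serialised with `pickle` to produce a file such as `color_embeddings.pkl`.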

When a user uploads a new color patch through the FastAPI web app, the following steps occur:

1. The image is preprocessed in exactly the same way as during training: resized to `64 × 64`, converted to RGB, normalised, and converted to a PyTorch tensor.
2. The preprocessed patch is passed through the **VAE encoder** to obtain a latent vector `z_query` of dimension 32.
3. Cosine similarity (or Euclidean distance) is computed between `z_query` and every stored catalog embedding. The colors are ranked from most similar to least similar.
4. The top three closest colors, along with their folder names and example images, are returned to the API and shown on the web interface as the **three best matches**.
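Steps 3 and 4 amount to a nearest-neighbour search over the stored catalog embeddings. The following minimal sketch uses cosine similarity in plain Python; the function names and the dictionary-based catalog are assumptions for illustration, not the app's actual code.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_matches(
    z_query: list[float], catalog: dict[str, list[float]], k: int = 3
) -> list[tuple[str, float]]:
    """Rank catalog colors by cosine similarity to the query, best first."""
    scored = [(name, cosine_similarity(z_query, vec)) for name, vec in catalog.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]
```

With a few hundred catalog colors and 32-dimensional vectors, a linear scan like this is fast enough that no approximate index is needed. If Euclidean distance is used instead, the sort order flips to ascending.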

Because the encoder has been trained on multiple brightness levels of each color, the system is robust to moderate changes in illumination. Even if the uploaded patch is slightly lighter or darker than the stored key patch, its latent code remains close to that color's embedding in the VAE's latent space, and the correct catalog color is usually ranked among the top results. The deployed FastAPI page displays the uploaded file name and the three nearest matches with their distances, which makes it well suited for the video demonstration.


## 6. Version control, CI/CD, and experiment tracking

A central goal of this project was to implement not just a model, but a small **MLOps workflow** that makes experiments reproducible and deployments reliable.

### 6.1 Git and GitHub

All project code, configuration, and documentation live in a public GitHub repository:

> https://github.com/Nightkingcrypto/Paint-Color-Detection.git

The repository is structured with separate folders for the application (`app/`), training and utility code (`src/`), notebooks, Docker files, and GitHub Actions workflows. I followed a feature-branch workflow in which new functionality such as training scripts, MLflow integration, and Docker support was developed on dedicated branches and then merged back into `main` with descriptive commit messages. This history is visible in the GitHub interface and shows how the project evolved from a simple VAE prototype to a full VAE + GAN system with MLOps tooling.

### 6.2 Continuous integration with GitHub Actions

A simple but effective **continuous integration (CI)** pipeline is configured using GitHub Actions. The workflow file `.github/workflows/ci.yml` triggers on pushes and pull requests to the `main` branch. Each CI run performs the following steps on an Ubuntu runner:

1. Check out the latest snapshot of the repository.
2. Set up Python 3.11.
3. Install the project dependencies from `requirements.txt`.
4. Set `PYTHONPATH` so that the `src` package can be imported correctly.
5. Run the automated tests with `pytest`.

Initially the tests failed due to import path issues (for example, `ModuleNotFoundError: No module named 'src.models'`). By turning `src/` into a proper package and adjusting `PYTHONPATH` in the workflow, the tests were fixed and the pipeline now shows green check marks for recent commits. This CI job acts as a **health check** for the repository: any future change that breaks the core VAE forward pass or imports will cause the build to fail, signalling that the code needs attention before deployment.
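The effect of the path fix can be reproduced in isolation. The sketch below builds a throwaway package in a temporary directory and tries the import before and after the project root is placed on `sys.path`, which is what setting `PYTHONPATH` achieves at interpreter start-up. The package name `demo_paint_src` is a stand-in chosen to avoid clashing with any real `src` package.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_before_and_after_path_fix(package: str = "demo_paint_src") -> tuple[bool, bool]:
    """Return whether `<package>.models` imports before and after the path fix."""

    def can_import() -> bool:
        try:
            importlib.import_module(f"{package}.models")
            return True
        except ModuleNotFoundError:
            return False

    with tempfile.TemporaryDirectory() as project_root:
        models_dir = Path(project_root) / package / "models"
        models_dir.mkdir(parents=True)
        # __init__.py files turn the folders into regular packages.
        (models_dir.parent / "__init__.py").write_text("")
        (models_dir / "__init__.py").write_text("")

        before = can_import()             # project root is not on sys.path yet
        sys.path.insert(0, project_root)  # equivalent to adding it to PYTHONPATH
        importlib.invalidate_caches()
        try:
            after = can_import()
        finally:
            # Undo the changes so the demo leaves the interpreter state clean.
            sys.path.remove(project_root)
            for name in [m for m in sys.modules if m.split(".")[0] == package]:
                del sys.modules[name]
    return before, after
```

Calling the function returns `(False, True)`: the first attempt fails with the same `ModuleNotFoundError` seen in CI, and the second succeeds once the root directory is importable.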

### 6.3 Experiment tracking with MLflow

As shown in the MLflow screenshots, all experiments for both the VAE and GAN are logged under the `color_vae_gan` experiment. MLflow captures:

* Hyperparameters (batch size, number of epochs, image size, learning rate, latent dimension).
* Training metrics per epoch (VAE: KL divergence, reconstruction loss, total loss; GAN: generator and discriminator losses).
* Trained models saved as artefacts and visible under the **Models** tab (for example `vae_model`, `gan_generator`, and `gan_discriminator`).

This tracking allows me to compare different configurations (for example changing the latent dimension or learning rate) and to reason about which settings produce the most stable training curves. It also serves as the basis for **model registry** style workflows, because each run produces a self-contained snapshot that can later be promoted to production.


## 7. Data and model versioning

The repository includes both traditional Git-based version control and additional mechanisms tailored to machine-learning assets.

For code and configuration files, **Git** is sufficient: every change to the training scripts, FastAPI app, or configuration module is tracked through commits and visible in the GitHub history. The trained models and embeddings are stored in a dedicated `models/` directory and tracked via MLflow inside the local `mlruns` folder. Together, a commit hash and an MLflow run ID identify the code, hyperparameters, and model artefact of a training run, so the trained model can always be retrieved and its setup inspected. Retraining to the exact same weights would additionally require a fixed random seed and a versioned copy of the dataset.

For large datasets it is often better to rely on tools such as **DVC (Data Version Control)**. In this project the color dataset is relatively small and stored locally, so a full DVC pipeline was not strictly necessary. However, DVC could easily be added to:

* Track the entire `Dataset/` directory as a versioned artefact, with pointers stored in the Git repository.
* Define pipeline stages for data preparation, VAE training, embedding generation, and GAN training.
* Reproduce any stage with a single `dvc repro` command, guaranteeing that the model and embeddings are always consistent with the underlying data version.

Even without full DVC integration, the current setup already separates raw data, models, logs, and code, which is an important MLOps principle for keeping experiments manageable.


## 8. Dockerised deployment with FastAPI

To make the application easy to run on different machines, a **Dockerfile** is provided in the `docker/` folder. It uses a lightweight `python:3.11-slim` base image, installs CPU-only PyTorch via the official wheel index, and then installs the rest of the project dependencies from a Docker-specific requirements file.

During the image build the following steps occur:

1. System libraries required by Pillow and PyTorch are installed (`build-essential`, `libglib2.0-0`, and related packages).
2. `pip` is upgraded and then used to install `torch` and `torchvision` for CPU only.
3. All remaining Python dependencies such as FastAPI, Uvicorn, MLflow client libraries, and data-science utilities are installed.
4. The project source code and trained model artefacts (`models/vae.pt` and `models/color_embeddings.pkl`) are copied into the image.
5. The default command launches the FastAPI app with Uvicorn on port 8000.

After building the image, the container can be run with a simple command such as:

```bash
docker run -p 9000:8000 color-vae-gan
```

In this configuration the container listens on port 8000 internally and Docker publishes it on port 9000 on the host. The color-detection web interface is then available at `http://127.0.0.1:9000/predict` from a browser on the host machine, as shown in the screenshot. This container-based deployment completes the MLOps story by providing a reproducible, portable runtime for the trained model.


## 9. Overall MLOps workflow

Putting everything together, the overall workflow for this project looks like a compact end-to-end MLOps pipeline:

1. **Data management** – A structured folder-of-folders dataset of paint colors is prepared and inspected. Simple but consistent preprocessing ensures that every image is converted to a common resolution and scale.
2. **Model development** – A VAE and GAN are designed and implemented in PyTorch. Early experiments are run locally, and hyperparameters are tuned based on the MLflow loss curves.
3. **Experiment tracking** – All training runs are logged to MLflow, with metrics, parameters, and model artefacts. The MLflow UI serves as the main dashboard for comparing experiments and understanding model behaviour.
4. **Version control and CI** – Git and GitHub manage the evolution of the codebase, while GitHub Actions provide continuous integration via automated tests. Import problems and package-layout issues are caught and fixed early thanks to CI.
5. **Model packaging and deployment** – The final models and embeddings are persisted, a FastAPI web app wraps the inference logic, and a Docker image encapsulates the complete runtime environment for easy deployment and demonstration.

Even though this is a relatively small academic project, the same principles apply to real-world systems: by treating data, code, and models as first-class versioned artefacts and by automating testing and packaging, the reliability and reproducibility of machine-learning solutions improve dramatically.


## 10. Repository link and project video

The full source code, configuration, and this Jupyter notebook are hosted in the public GitHub repository:

> **GitHub:** https://github.com/Nightkingcrypto/Paint-Color-Detection.git

For the coursework submission, the project documentation notebook and the demonstration video from Task 5 will both be uploaded to this repository. The video walks through the training scripts, MLflow UI, GitHub Actions CI pipeline, Docker build and container run, and the FastAPI web interface, showing the model making predictions for different uploaded color patches.

Together, the repository, this notebook, and the video provide a complete picture of how the paint color detection system was designed, implemented, and operationalised with basic MLOps tooling.


## Appendix: Screenshots

The following embedded screenshots document key parts of the project, including MLflow experiments and runs, the GitHub repository and CI pipeline, the FastAPI prediction page, and detailed loss curves for both the VAE and GAN.


![alt text](<Screenshot 2025-11-15 184852.png>)

![alt text](<Screenshot 2025-11-15 184915.png>)

![alt text](<Screenshot 2025-11-15 184950.png>)

![alt text](<Screenshot 2025-11-15 185027.png>)

![alt text](<Screenshot 2025-11-15 185205.png>)

![alt text](<Screenshot 2025-11-15 185237.png>)

![alt text](<Screenshot 2025-11-15 185252.png>)

![alt text](<Screenshot 2025-11-15 185347.png>)

![alt text](<Screenshot 2025-11-15 185404.png>)

![alt text](<Screenshot 2025-11-15 185424.png>)

![alt text](<Screenshot 2025-11-15 185456.png>)

![alt text](<Screenshot 2025-11-15 185521.png>)

![alt text](<Screenshot 2025-11-15 185631.png>)