# MSAI 495 | Image Generator | T-Shirt Design Generator

## Business Goal / Case Statement

Accelerate and Innovate T-shirt Graphic Design Through AI.

## Assignment Context

**Relevant Industry and/or Business Function:** Fashion / Apparel Design / E-commerce

**Description:**

Our fashion startup aims to disrupt the custom apparel market by leveraging AI-generated designs. Current challenges in the industry include long design cycles, high costs for professional designers, and difficulty creating truly unique graphics at scale. By implementing a diffusion model trained on tattoo imagery, we will:

1. Reduce design-to-market time from weeks to days

2. Lower production costs by 40% through automated design generation

3. Increase customer engagement by offering unique, on-demand graphic options

4. Establish a competitive advantage through AI-powered design capabilities

This T-shirt Graphic Design Generator will serve as both a creative assistant for our in-house designers and a foundation for a customer-facing design customization tool, driving revenue growth while positioning our brand as an innovator in the fashion-tech space.

## AI/ML Task(s)

Image Generation with Denoising Diffusion Probabilistic Models (DDPM)

## Algorithmic Technique(s)

* Denoising Diffusion Probabilistic Model (DDPM) for generative modeling

* U-Net neural network architecture for image synthesis

* Sinusoidal embeddings for encoding the noise variance passed to the U-Net

## The Data

**Dataset Name:** [tattoo_v3](https://huggingface.co/datasets/Drozdik/tattoo_v3)<br>
**Data Location:** <code>https://huggingface.co/datasets/Drozdik/tattoo_v3</code>

## Step 1: Setup and Imports

In [7]:
%load_ext autoreload
%autoreload 2
import numpy as np
import matplotlib.pyplot as plt

plt.style.use("seaborn-v0_8-colorblind")

import math

import tensorflow as tf
from tensorflow.keras import (
    layers,
    models,
    optimizers,
    utils,
    callbacks,
    losses,
)
from datasets import load_dataset

from util.util_functions import display, sample_batch

## Step 2: Set Parameters

In [None]:
IMAGE_SIZE = 64
BATCH_SIZE = 64
DATASET_REPETITIONS = 5
LOAD_MODEL = False

PLOT_DIFFUSION_STEPS = 20

# optimization
LEARNING_RATE = 1e-3
WEIGHT_DECAY = 1e-4
EPOCHS = 50

## Step 3: Prepare the Data

Load the data:

In [None]:
dataset = load_dataset("Drozdik/tattoo_v3")

Check the available splits and features:

In [None]:
print(dataset)
print(dataset["train"].features)

**Extract and Preprocess Images:**

The images are likely stored as PIL images or file paths under a key like **`"image"`**. Resize them to **`IMAGE_SIZE`** and convert them to float32 arrays scaled to the range [0, 1].

Define a preprocessing function:

In [None]:
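def preprocess(example):
    # Force three RGB channels so every image matches the (IMAGE_SIZE, IMAGE_SIZE, 3)
    # shape expected downstream, then resize.
    image = example["image"].convert("RGB").resize((IMAGE_SIZE, IMAGE_SIZE))
    # Convert to a float32 array scaled to [0, 1].
    image = np.array(image).astype("float32") / 255.0
    return {"image": image}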


# Map preprocessing over the train split
dataset["train"] = dataset["train"].map(preprocess)

**Convert to tf.data.Dataset:**

To use the data with TensorFlow/Keras, convert the Hugging Face dataset to a `tf.data.Dataset`:

In [None]:
def gen():
    for example in dataset["train"]:
        yield example["image"]

train_data = tf.data.Dataset.from_generator(
    gen,
    output_signature=tf.TensorSpec(shape=(IMAGE_SIZE, IMAGE_SIZE, 3), dtype=tf.float32)
)

**Shuffle, Batch, and Repeat**

Shuffle, repeat, and batch the dataset, then prefetch so that data loading overlaps with training:

In [None]:
train = train_data.shuffle(1000).repeat(DATASET_REPETITIONS).batch(BATCH_SIZE, drop_remainder=True)
train = train.prefetch(tf.data.AUTOTUNE)
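
As a rough check on how much data one epoch covers, the arithmetic behind `repeat` followed by `batch(..., drop_remainder=True)` can be sketched in plain Python. The dataset size of 4,000 images is a hypothetical placeholder, not the actual size of `tattoo_v3`, and the helper name `batches_per_epoch` is illustrative only.

```python
def batches_per_epoch(num_images, repetitions, batch_size):
    """Number of full batches after repeating the dataset and dropping the remainder."""
    total_images = num_images * repetitions
    # drop_remainder=True discards the final partial batch.
    return total_images // batch_size


# Hypothetical dataset size, for illustration only.
steps = batches_per_epoch(num_images=4000, repetitions=5, batch_size=64)
print(steps)
```

Because `repeat` is applied before `batch`, batches may span the boundary between two passes over the data, so only one partial batch is dropped per epoch.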

Show some tattoo images from the training set:

In [None]:
train_sample = sample_batch(train)
display(train_sample)

### 3.1 Diffusion Schedules

In [None]:
from util.diffusion_schedules import (
    linear_diffusion_schedule,
    cosine_diffusion_schedule,
    offset_cosine_diffusion_schedule,
)

T = 1000
diffusion_times = tf.convert_to_tensor([x / T for x in range(T)])

linear_noise_rates, linear_signal_rates = linear_diffusion_schedule(
    diffusion_times
)

cosine_noise_rates, cosine_signal_rates = cosine_diffusion_schedule(
    diffusion_times
)

(
    offset_cosine_noise_rates,
    offset_cosine_signal_rates,
) = offset_cosine_diffusion_schedule(diffusion_times)
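
The schedule functions live in `util.diffusion_schedules`, which is not shown in this notebook. The following is a minimal plain-Python sketch of what the cosine and offset cosine schedules plausibly compute, assuming the common formulation in which the signal and noise rates are the cosine and sine of a diffusion angle. The bounds `min_signal_rate=0.02` and `max_signal_rate=0.95` are assumed values, not taken from the project code.

```python
import math


def cosine_diffusion_schedule(diffusion_times):
    """Signal and noise rates trace a quarter circle, so signal^2 + noise^2 = 1."""
    signal_rates = [math.cos(t * math.pi / 2) for t in diffusion_times]
    noise_rates = [math.sin(t * math.pi / 2) for t in diffusion_times]
    return noise_rates, signal_rates


def offset_cosine_diffusion_schedule(
    diffusion_times, min_signal_rate=0.02, max_signal_rate=0.95
):
    """Cosine schedule restricted to angles where the signal rate stays within
    [min_signal_rate, max_signal_rate]. The default bounds are assumed values."""
    start_angle = math.acos(max_signal_rate)
    end_angle = math.acos(min_signal_rate)
    angles = [start_angle + t * (end_angle - start_angle) for t in diffusion_times]
    signal_rates = [math.cos(a) for a in angles]
    noise_rates = [math.sin(a) for a in angles]
    return noise_rates, signal_rates


T = 1000
diffusion_times = [x / T for x in range(T)]
noise_rates, signal_rates = offset_cosine_diffusion_schedule(diffusion_times)
print(signal_rates[0], noise_rates[0])
```

The return order (noise rates first, then signal rates) matches how the notebook unpacks the results. Keeping the signal rate away from exactly 0 and 1 avoids dividing by zero when the clean image is later recovered from a noisy one.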

## Step 4: Build the Model

Build the U-Net:

In [None]:
from util.residual_block import DownBlock, UpBlock, ResidualBlock
from util.sinusoidal_embedding import sinusoidal_embedding

# The first input to the U-Net is the image that we wish to denoise.
noisy_images = layers.Input(shape=(IMAGE_SIZE, IMAGE_SIZE, 3))

# This image is passed through a Conv2D layer to increase the number of channels.
x = layers.Conv2D(32, kernel_size=1)(noisy_images)

# The second input to the U-Net is the noise variance (a scalar).
noise_variances = layers.Input(shape=(1, 1, 1))
# This is encoded using a sinusoidal embedding.
noise_embedding = layers.Lambda(sinusoidal_embedding)(noise_variances)
# This embedding is copied across spatial dimensions to match the size of the input
# image.
noise_embedding = layers.UpSampling2D(size=IMAGE_SIZE, interpolation="nearest")(
    noise_embedding
)

# The two input streams are concatenated across channels.
x = layers.Concatenate()([x, noise_embedding])

# The skips list will hold the output from the DownBlock layers that we wish to
# connect to UpBlock layers downstream.
skips = []

# The tensor is passed through a series of DownBlock layers that reduce the size of
# the image, while increasing the number of channels.
x = DownBlock(32, block_depth=2)([x, skips])
x = DownBlock(64, block_depth=2)([x, skips])
x = DownBlock(96, block_depth=2)([x, skips])

# The tensor is then passed through two ResidualBlock layers that hold the image
# size and number of channels constant.
x = ResidualBlock(128)(x)
x = ResidualBlock(128)(x)

# Next, the tensor is passed through a series of UpBlock layers that increase the size
# of the image, while decreasing the number of channels. The skip connections
# incorporate output from the earlier DownBlock layers.
x = UpBlock(96, block_depth=2)([x, skips])
x = UpBlock(64, block_depth=2)([x, skips])
x = UpBlock(32, block_depth=2)([x, skips])

# The final Conv2D layer reduces the number of channels to three (RGB).
x = layers.Conv2D(3, kernel_size=1, kernel_initializer="zeros")(x)

# The U-Net is a Keras Model that takes the noisy images and noise variances as
# input and outputs a predicted noise map.
unet = models.Model([noisy_images, noise_variances], x, name="unet")
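
The `sinusoidal_embedding` helper is imported from `util.sinusoidal_embedding` and is not shown here. As a hedged illustration, the sketch below maps a single scalar noise variance to a 32-dimensional vector of sines and cosines at geometrically spaced frequencies. The choice of 16 frequencies and a maximum frequency of 1000 are assumptions, and this version works on a plain Python float rather than a tensor.

```python
import math


def sinusoidal_embedding(noise_variance, num_frequencies=16, max_frequency=1000.0):
    """Map a scalar noise variance to a vector of 2 * num_frequencies features."""
    # Frequencies are spaced geometrically between 1 and max_frequency.
    frequencies = [
        math.exp(i * math.log(max_frequency) / (num_frequencies - 1))
        for i in range(num_frequencies)
    ]
    angular_speeds = [2.0 * math.pi * f for f in frequencies]
    sines = [math.sin(w * noise_variance) for w in angular_speeds]
    cosines = [math.cos(w * noise_variance) for w in angular_speeds]
    return sines + cosines


embedding = sinusoidal_embedding(0.25)
print(len(embedding))  # 32
```

Spreading the scalar across many frequencies gives the network a richer signal for telling nearby noise levels apart than a single raw number would.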

In [None]:
from util.diffusion_model import DiffusionModel

# Instantiate the model.
ddm = DiffusionModel(unet)

# Calculate the normalization statistics using the training set.
ddm.normalizer.adapt(train)
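
The call to `adapt` suggests that `ddm.normalizer` behaves like a Keras `Normalization` layer, which learns a mean and variance from the data; that is an assumption, since `DiffusionModel` is defined in `util.diffusion_model`. The sketch below shows the idea on one channel of made-up pixel values, including the inverse step needed to turn generated outputs back into displayable intensities.

```python
import math
import statistics


def fit_normalizer(values):
    """Return the mean and population variance of a list of pixel values."""
    return statistics.fmean(values), statistics.pvariance(values)


def normalize(values, mean, variance):
    """Shift and scale values to zero mean and unit variance."""
    return [(v - mean) / math.sqrt(variance) for v in values]


def denormalize(values, mean, variance):
    """Invert normalize() to recover the original intensity scale."""
    return [mean + v * math.sqrt(variance) for v in values]


# Hypothetical pixel intensities in [0, 1] for a single channel.
pixels = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
mean, variance = fit_normalizer(pixels)
normalized = normalize(pixels, mean, variance)
restored = denormalize(normalized, mean, variance)
```

Normalizing to unit variance puts the images on the same scale as the standard Gaussian noise that is mixed in during diffusion.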

In [None]:
if LOAD_MODEL:
    ddm.built = True
    ddm.load_weights("./checkpoint/checkpoint.ckpt")

## Step 5: Train the Model

Compile the model, using the AdamW optimizer (similar to Adam but with weight decay, which helps stabilize the training process) and mean absolute error loss function.

In [None]:
ddm.compile(
    optimizer=optimizers.experimental.AdamW(
        learning_rate=LEARNING_RATE, weight_decay=WEIGHT_DECAY
    ),
    loss=losses.mean_absolute_error,
)
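
To make the training objective concrete, here is a minimal sketch of the quantity the loss compares: the true noise that was mixed into an image versus the noise the network predicts. It assumes the standard forward-diffusion blend `signal_rate * image + noise_rate * noise`. The pixel values are made up, and the zero-predicting "network" is a stand-in for the U-Net.

```python
import math
import random


def add_noise(pixels, noise, signal_rate, noise_rate):
    """Forward diffusion: blend the clean pixels with Gaussian noise."""
    return [signal_rate * p + noise_rate * n for p, n in zip(pixels, noise)]


def mean_absolute_error(targets, predictions):
    """Average absolute difference between two equal-length lists."""
    return sum(abs(t - p) for t, p in zip(targets, predictions)) / len(targets)


random.seed(0)
pixels = [0.1, -0.4, 0.8, 0.3]  # hypothetical normalized pixel values
noise = [random.gauss(0.0, 1.0) for _ in pixels]

# Rates taken from a cosine schedule at diffusion time t = 0.5.
signal_rate = math.cos(0.5 * math.pi / 2)
noise_rate = math.sin(0.5 * math.pi / 2)
noisy_pixels = add_noise(pixels, noise, signal_rate, noise_rate)

# A stand-in "network" that always predicts zero noise; a trained U-Net would
# be called here instead.
predicted_noise = [0.0 for _ in noisy_pixels]
loss = mean_absolute_error(noise, predicted_noise)
print(loss)
```

A perfect noise prediction would drive this loss to zero and allow the clean pixels to be recovered as `(noisy - noise_rate * predicted_noise) / signal_rate`.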

In [None]:
# run training and plot generated images periodically
model_checkpoint_callback = callbacks.ModelCheckpoint(
    filepath="./checkpoint/checkpoint.ckpt",
    save_weights_only=True,
    save_freq="epoch",
    verbose=0,
)

tensorboard_callback = callbacks.TensorBoard(log_dir="./logs")


class ImageGenerator(callbacks.Callback):
    def __init__(self, num_img):
        # Initialize the base Callback so Keras can attach the model correctly.
        super().__init__()
        self.num_img = num_img

    def on_epoch_end(self, epoch, logs=None):
        generated_images = self.model.generate(
            num_images=self.num_img,
            diffusion_steps=PLOT_DIFFUSION_STEPS,
        ).numpy()
        display(
            generated_images,
            save_to="./output/generated_img_%03d.png" % (epoch),
        )


image_generator_callback = ImageGenerator(num_img=10)

# Fit the model for EPOCHS epochs (50 by default).
ddm.fit(
    train,
    epochs=EPOCHS,
    callbacks=[
        model_checkpoint_callback,
        tensorboard_callback,
        image_generator_callback,
    ],
)

## Step 6: Inference

Generate some novel tattoo-style designs:

In [None]:
generated_images = ddm.generate(num_images=10, diffusion_steps=20).numpy()
display(generated_images)

View the improvement as the number of diffusion steps increases:

In [None]:
for diffusion_steps in list(np.arange(1, 6, 1)) + [20] + [100]:
    tf.random.set_seed(42)
    generated_images = ddm.generate(
        num_images=10,
        diffusion_steps=diffusion_steps,
    ).numpy()
    display(generated_images)
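
The role of `diffusion_steps` is easier to see with the sampling loop written out. The sketch below is an assumption about how `DiffusionModel.generate` works internally, since that code is not shown here. It follows the usual deterministic scheme: predict the noise, estimate the clean image, then re-noise that estimate to the next, lower noise level. It denoises a single number instead of an image, and `oracle_denoiser` is a hypothetical perfect denoiser for a toy dataset containing only the value 0.5.

```python
import math


def offset_cosine_rates(t, min_signal_rate=0.02, max_signal_rate=0.95):
    """Return (noise_rate, signal_rate) at diffusion time t in [0, 1]."""
    start_angle = math.acos(max_signal_rate)
    end_angle = math.acos(min_signal_rate)
    angle = start_angle + t * (end_angle - start_angle)
    return math.sin(angle), math.cos(angle)


def reverse_diffusion(initial_noise, diffusion_steps, predict_noise):
    """Denoise a single value by walking from t = 1 to t = 0 in equal steps."""
    step_size = 1.0 / diffusion_steps
    current = initial_noise
    pred_image = current
    for step in range(diffusion_steps):
        t = 1.0 - step * step_size
        noise_rate, signal_rate = offset_cosine_rates(t)
        pred_noise = predict_noise(current, noise_rate, signal_rate)
        # Estimate the clean value implied by the predicted noise.
        pred_image = (current - noise_rate * pred_noise) / signal_rate
        # Re-noise that estimate to the lower noise level of the next step.
        next_noise_rate, next_signal_rate = offset_cosine_rates(t - step_size)
        current = next_signal_rate * pred_image + next_noise_rate * pred_noise
    return pred_image


TRUE_VALUE = 0.5  # toy "dataset" containing a single value


def oracle_denoiser(noisy, noise_rate, signal_rate):
    """Hypothetical perfect denoiser for the one-value toy dataset."""
    return (noisy - signal_rate * TRUE_VALUE) / noise_rate


result = reverse_diffusion(
    initial_noise=1.3, diffusion_steps=20, predict_noise=oracle_denoiser
)
print(result)
```

With a perfect denoiser the loop recovers the target for any step count. A trained network only approximates the noise, which is why taking more, smaller steps tends to give cleaner samples in the comparison above.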

Interpolation between two points in the latent space:

In [None]:
# Interpolation between two points in the latent space
tf.random.set_seed(100)


def spherical_interpolation(a, b, t):
    return np.sin(t * math.pi / 2) * a + np.cos(t * math.pi / 2) * b


for i in range(5):
    a = tf.random.normal(shape=(IMAGE_SIZE, IMAGE_SIZE, 3))
    b = tf.random.normal(shape=(IMAGE_SIZE, IMAGE_SIZE, 3))
    # 11 evenly spaced interpolation positions from 0 to 1 (inclusive).
    initial_noise = np.array(
        [spherical_interpolation(a, b, t) for t in np.linspace(0, 1, 11)]
    )
    # Request one image per interpolated noise sample.
    generated_images = ddm.generate(
        num_images=len(initial_noise),
        diffusion_steps=20,
        initial_noise=initial_noise,
    ).numpy()
    display(generated_images, n=11)
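
One reason to blend with sine and cosine weights instead of a straight linear mix is that the weights satisfy sin² + cos² = 1, so a blend of two independent unit-variance Gaussian samples keeps unit variance at every interpolation position. The short check below illustrates this using only the weights; the helper names are illustrative.

```python
import math


def spherical_weights(t):
    """Weights applied to the two noise samples at interpolation position t."""
    return math.sin(t * math.pi / 2), math.cos(t * math.pi / 2)


def blended_variance(weight_a, weight_b):
    """Variance of weight_a * a + weight_b * b for independent unit-variance a and b."""
    return weight_a**2 + weight_b**2


steps = [i / 10 for i in range(11)]
spherical = [blended_variance(*spherical_weights(t)) for t in steps]
linear = [blended_variance(t, 1 - t) for t in steps]

print(min(spherical), max(spherical))  # both approximately 1.0
print(min(linear))  # 0.5, reached at the midpoint t = 0.5
```

A linear mix would halve the variance at the midpoint, handing the model initial noise that is unlike anything it saw during training.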