### General Description
#### 1. Task Statement
**Company:** Artisan Archives

**Issue:** Artisan Archives, a company specializing in the digitization and preservation of historical visual media, possesses an enormous collection of black-and-white photographs. Manually colorizing these images is time-consuming and costly, and it requires highly skilled digital artists. This manual approach severely limits the company's ability to restore and monetize its archive in a timely manner.

**ML/DS Solution:** To address this, we can leverage a deep learning technique called Image-to-Image Translation using a Generative Adversarial Network (GAN). Specifically, a pix2pix model can be trained on pairs of color and grayscale images. The model learns the mapping from the grayscale input to the corresponding color output, enabling automated colorization of new images.

**Feasibility:** A manual solution is not feasible due to the sheer volume of the archive (millions of images). The cost per image and the time required for manual colorization make it commercially unviable to process the entire collection.

**Task:** Artisan Archives has hired you to develop a proof-of-concept machine learning model that can automatically colorize grayscale landscape photographs.

**Data:** The company provides the 'Landscape Image Colorization' dataset, which contains pairs of color and grayscale landscape images.

**Definition of Done:** The primary goal is to produce visually plausible colorizations. The trained model, after 10 epochs, should generate color images from grayscale inputs that are realistic and artifact-free. Success will be evaluated qualitatively through visual inspection of the output images against their ground-truth counterparts.
#### 2. Rewards
- Understanding and implementing Generative Adversarial Networks (GANs).
- Practical experience with Image-to-Image Translation (pix2pix architecture).
- Building and training models in TensorFlow and Keras.
- Implementing custom training loops for complex models.
- Data preprocessing and augmentation for computer vision tasks.
#### 3. Difficulty Level
Challenging
#### 4. Task Type
Image Generation, Computer Vision, Generative Adversarial Networks
#### 5. Tools
TensorFlow, Keras, NumPy, Matplotlib, OpenCV, Scikit-learn

In [None]:
import tensorflow as tf
import keras
import numpy as np
import matplotlib.pyplot as plt
import cv2
import os
from keras import layers  # `layers.*` is used by the model-building cells below
import time
from tqdm import tqdm
from typing import List, Tuple, Any

```json
{
  "issue": "The model requires paired color and grayscale images for training, which must be loaded from disk, preprocessed, and structured into an efficient data pipeline.",
  "action": "Implement functions to load image files from specified directories, sort them alphanumerically to ensure correct pairing, resize them to a uniform dimension (256x256), normalize pixel values to the [-1, 1] range, and then batch them into `tf.data.Dataset` objects for both training and testing sets.",
  "state": "The image data is loaded, preprocessed, and organized into training and testing `tf.data.Dataset` pipelines, ready for consumption by the model."
}
```

In [None]:
SIZE = 256
def load_images_incorrectly(path: str, file_limit: int) -> np.ndarray:
    """Loads images but fails to sort them and normalizes them into the wrong range."""
    images = []
    # Error 1: Fails to sort the files alphanumerically, leading to mismatched color/gray pairs.
    files = os.listdir(path)
    # The correct implementation should be: files = sorted_alphanumeric(os.listdir(path))

    for i in tqdm(files):
        if len(images) >= file_limit: break
        img = cv2.imread(os.path.join(path, i), 1)
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        img = cv2.resize(img, (SIZE, SIZE))
        # Error 2: Normalizing to [0, 1] instead of [-1, 1]. This creates a mismatch with the generator's tanh activation.
        img = img.astype('float32') / 255.0
        images.append(keras.preprocessing.image.img_to_array(img))
    return np.array(images)

def create_datasets_unshuffled(color_images, gray_images, train_size, batch_size):
    """Creates datasets but fails to shuffle the training data."""
    # Error 3: The training dataset is not shuffled. This can lead to poor model performance if the data has an inherent order.
    train_ds = tf.data.Dataset.from_tensor_slices((gray_images[:train_size], color_images[:train_size])).batch(batch_size)
    test_ds = tf.data.Dataset.from_tensor_slices((gray_images[train_size:], color_images[train_size:])).batch(batch_size)
    return train_ds, test_ds

color_img_array = load_images_incorrectly('../input/landscape-image-colorization/landscape Images/color', 2200)
gray_img_array = load_images_incorrectly('../input/landscape-image-colorization/landscape Images/gray', 2200)
train_dataset, test_dataset = create_datasets_unshuffled(color_img_array, gray_img_array, 2000, 64)

```json
{
    "required_ml_terms": ["data pipeline", "shuffling", "normalization", "data integrity"],
    "problems_to_detect": [
        "The image directory is not sorted alphanumerically before loading, which will cause a mismatch between grayscale and color image pairs.",
        "Images are normalized to the [0, 1] range, but the generator's `tanh` activation outputs values in the [-1, 1] range, creating a mismatch that hinders training.",
        "The training dataset is not shuffled (`.shuffle()`), which can lead to poor model generalization if the data has a sequential order."
    ]
}
```
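
The three pipeline problems listed above can be addressed along the following lines. This is a minimal sketch, not the notebook's reference solution: `sorted_alphanumeric` is named in the cell's comment but never defined in this document, so the implementation here is an assumption, and `normalize_to_tanh_range` and `make_datasets` are hypothetical helper names introduced for illustration.

```python
import re

import numpy as np
import tensorflow as tf


def sorted_alphanumeric(names):
    """Sort file names in natural order, so '2.jpg' comes before '10.jpg'."""
    def key(name):
        # Split into alternating text / digit runs and compare digits as integers.
        return [int(tok) if tok.isdigit() else tok.lower()
                for tok in re.split(r'(\d+)', name)]
    return sorted(names, key=key)


def normalize_to_tanh_range(img_uint8):
    """Map uint8 pixels from [0, 255] to [-1, 1], the output range of tanh."""
    return img_uint8.astype('float32') / 127.5 - 1.0


def make_datasets(gray, color, train_size, batch_size, seed=0):
    """Build paired datasets; only the training split is shuffled."""
    train = tf.data.Dataset.from_tensor_slices((gray[:train_size], color[:train_size]))
    # Shuffling happens on (gray, color) tuples, so pairs stay aligned.
    train = train.shuffle(buffer_size=train_size, seed=seed).batch(batch_size)
    test = tf.data.Dataset.from_tensor_slices((gray[train_size:], color[train_size:]))
    return train, test.batch(batch_size)


print(sorted_alphanumeric(['10.jpg', '2.jpg', '1.jpg']))  # ['1.jpg', '2.jpg', '10.jpg']
```

Because both directories are sorted with the same key, the i-th grayscale file lines up with the i-th color file, provided the two folders use matching file names.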

```json
{
  "issue": "Before training, it's crucial to verify that the data has been loaded and paired correctly.",
  "action": "Create a utility function that fetches a few sample pairs from the dataset and displays the grayscale input and its corresponding color ground truth side-by-side.",
  "state": "Visual confirmation is obtained, showing that the input data and target data are correctly aligned and formatted."
}
```

In [None]:
def visualize_sample_data(dataset: tf.data.Dataset, num_samples: int = 3):
    """Visualizes data but plots the same image repeatedly."""
    # Error: `dataset.take(1)` is called inside the loop, so it always shows the first batch.
    for _ in range(num_samples):
        for gray_batch, color_batch in dataset.take(1):
            plt.figure(figsize=(10, 5))
            plt.subplot(1, 2, 1)
            plt.title('Grayscale Input')
            plt.imshow(gray_batch[0])
            plt.axis('off')
            plt.subplot(1, 2, 2)
            plt.title('Color Ground Truth')
            plt.imshow(color_batch[0])
            plt.axis('off')
            plt.show()

visualize_sample_data(train_dataset)
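
One way to show distinct samples is to advance through the batches once instead of restarting the iteration on every pass. The sketch below isolates that selection logic so it can be checked without plotting; `iter_sample_pairs` and `to_display_range` are hypothetical helpers, and the toy batches stand in for `train_dataset` (with a `tf.data.Dataset`, `dataset.take(num_samples)` plays the role of `islice`).

```python
from itertools import islice

import numpy as np


def iter_sample_pairs(batches, num_samples=3):
    """Yield the first (gray, color) pair from each of the first `num_samples` batches."""
    for gray_batch, color_batch in islice(batches, num_samples):
        yield gray_batch[0], color_batch[0]


def to_display_range(img):
    """Rescale an image from [-1, 1] to [0, 1] so plt.imshow renders it correctly."""
    return np.clip((np.asarray(img) + 1.0) / 2.0, 0.0, 1.0)


# Toy stand-in for a batched dataset: batch i is filled with the value i.
toy_batches = [
    (np.full((2, 4, 4, 3), i, dtype='float32'), np.full((2, 4, 4, 3), -i, dtype='float32'))
    for i in range(5)
]
pairs = list(iter_sample_pairs(toy_batches, num_samples=3))
print([float(gray[0, 0, 0]) for gray, _ in pairs])  # [0.0, 1.0, 2.0]
```

Each yielded pair can then be passed to the same `plt.subplot` / `plt.imshow` calls used in the cell, with `to_display_range` applied only if the images were normalized to [-1, 1].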

```json
{
    "required_ml_terms": ["data visualization", "data integrity", "data pipeline"],
    "problems_to_detect": [
        "The visualization function contains a logical error where `dataset.take(1)` is used inside the loop, causing it to display the same first sample multiple times instead of different samples."
    ]
}
```


```json
{
  "issue": "The core of the pix2pix model requires a Generator and a Discriminator. The Generator must learn to create realistic color images, while the Discriminator must learn to distinguish real color images from fake ones.",
  "action": "Define two model builders and two helper functions. `build_generator` creates a U-Net architecture, which suits image-to-image tasks because its skip connections preserve spatial information. `build_discriminator` creates a PatchGAN classifier, which evaluates realism on patches of the image rather than on the whole image, promoting sharper outputs. The `downsample` and `upsample` utility functions keep both model definitions compact.",
  "state": "The architectural blueprints for the Generator and Discriminator are complete, and instances of these models are created and ready for training."
}
```
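
To make the "patches rather than the whole image" idea concrete, the arithmetic below traces spatial sizes through the layer settings used in this notebook's `build_discriminator` and `build_generator` (kernel size 4 throughout). It is a plain-Python sketch with hypothetical helper names; it does not build any model.

```python
def conv_out(size, kernel, stride, padding):
    """Output length of a convolution along one spatial axis (Keras conventions)."""
    if padding == 'same':
        return -(-size // stride)  # ceil(size / stride)
    return (size - kernel) // stride + 1  # 'valid'


def patchgan_output_size(size=256):
    """Side length of the discriminator's output grid of per-patch logits."""
    for _ in range(3):  # three stride-2 'same' downsample blocks
        size = conv_out(size, 4, 2, 'same')
    for _ in range(2):  # ZeroPadding2D adds 1 pixel per side, then a 'valid' conv
        size = conv_out(size + 2, 4, 1, 'valid')
    return size


def unet_bottleneck_size(size=256, num_down=8):
    """Side length at the bottom of the U-Net after all downsample blocks."""
    for _ in range(num_down):
        size = conv_out(size, 4, 2, 'same')
    return size


def receptive_field(layer_specs):
    """Input pixels seen by one output unit, given (kernel, stride) per layer."""
    rf, jump = 1, 1
    for kernel, stride in layer_specs:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf


PATCHGAN_LAYERS = [(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]
print(patchgan_output_size())            # 30
print(receptive_field(PATCHGAN_LAYERS))  # 70
print(unet_bottleneck_size())            # 1
```

So for a 256x256 input the discriminator emits a 30x30 grid of logits, each judging a 70x70 region of the input, while the generator compresses the image down to a 1x1 bottleneck before decoding.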

In [None]:

def build_generator_no_skips() -> keras.Model:
    """Builds a U-Net but omits the crucial skip connections."""
    inputs = keras.layers.Input(shape=[256, 256, 3])
    
    down_stack = [
        downsample(64, 4, apply_batchnorm=False), downsample(128, 4), downsample(256, 4), downsample(512, 4),
        downsample(512, 4), downsample(512, 4), downsample(512, 4), downsample(512, 4)
    ]
    up_stack = [
        upsample(512, 4, apply_dropout=True), upsample(512, 4, apply_dropout=True), upsample(512, 4, apply_dropout=True),
        upsample(512, 4), upsample(256, 4), upsample(128, 4), upsample(64, 4)
    ]
    last = keras.layers.Conv2DTranspose(3, 4, strides=2, padding='same', kernel_initializer=tf.random_normal_initializer(0., 0.02), activation='tanh')

    x = inputs
    skips = []
    for down in down_stack:
        x = down(x)
        skips.append(x)
    
    skips = reversed(skips[:-1])
    
    # Error: The Concatenate layer is missing, so skip connections are not formed.
    for up, skip in zip(up_stack, skips):
        x = up(x)
        
    x = last(x)
    return keras.Model(inputs=inputs, outputs=x)

def downsample(filters: int, size: int, apply_batchnorm: bool = True) -> keras.Sequential:
    result = keras.Sequential()
    result.add(layers.Conv2D(filters, size, strides=2, padding='same', kernel_initializer='he_normal', use_bias=False))
    if apply_batchnorm:
        result.add(layers.BatchNormalization())
    result.add(layers.LeakyReLU())
    return result

def upsample(filters: int, size: int, apply_dropout: bool = False) -> keras.Sequential:
    result = keras.Sequential()
    result.add(layers.Conv2DTranspose(filters, size, strides=2, padding='same', kernel_initializer='he_normal', use_bias=False))
    result.add(layers.BatchNormalization())
    if apply_dropout:
        result.add(layers.Dropout(0.5))
    result.add(layers.ReLU())
    return result

def build_generator() -> keras.Model:
    inputs = layers.Input(shape=[256, 256, 3])
    down_stack = [
        downsample(64, 4, apply_batchnorm=False), # (bs, 128, 128, 64)
        downsample(128, 4), # (bs, 64, 64, 128)
        downsample(256, 4), # (bs, 32, 32, 256)
        downsample(512, 4), # (bs, 16, 16, 512)
        downsample(512, 4), # (bs, 8, 8, 512)
        downsample(512, 4), # (bs, 4, 4, 512)
        downsample(512, 4), # (bs, 2, 2, 512)
        downsample(512, 4), # (bs, 1, 1, 512)
    ]
    up_stack = [
        upsample(512, 4, apply_dropout=True), # (bs, 2, 2, 1024)
        upsample(512, 4, apply_dropout=True), # (bs, 4, 4, 1024)
        upsample(512, 4, apply_dropout=True), # (bs, 8, 8, 1024)
        upsample(512, 4), # (bs, 16, 16, 1024)
        upsample(256, 4), # (bs, 32, 32, 512)
        upsample(128, 4), # (bs, 64, 64, 256)
        upsample(64, 4), # (bs, 128, 128, 128)
    ]
    initializer = tf.random_normal_initializer(0., 0.02)
    last = layers.Conv2DTranspose(3, 4, strides=2, padding='same', kernel_initializer=initializer, activation='tanh')
    x = inputs
    skips = []
    for down in down_stack:
        x = down(x)
        skips.append(x)
    skips = reversed(skips[:-1])
    for up, skip in zip(up_stack, skips):
        x = up(x)
        x = layers.Concatenate()([x, skip])
    x = last(x)
    return keras.Model(inputs=inputs, outputs=x)

def build_discriminator() -> keras.Model:
    initializer = tf.random_normal_initializer(0., 0.02)
    inp = layers.Input(shape=[256, 256, 3], name='input_image')
    tar = layers.Input(shape=[256, 256, 3], name='target_image')
    x = layers.concatenate([inp, tar])
    down1 = downsample(64, 4, False)(x)
    down2 = downsample(128, 4)(down1)
    down3 = downsample(256, 4)(down2)
    zero_pad1 = layers.ZeroPadding2D()(down3)
    conv = layers.Conv2D(512, 4, strides=1, kernel_initializer=initializer, use_bias=False)(zero_pad1)
    batchnorm1 = layers.BatchNormalization()(conv)
    leaky_relu = layers.LeakyReLU()(batchnorm1)
    zero_pad2 = layers.ZeroPadding2D()(leaky_relu)
    last = layers.Conv2D(1, 4, strides=1, kernel_initializer=initializer)(zero_pad2)
    return keras.Model(inputs=[inp, tar], outputs=last)

generator = build_generator()
discriminator = build_discriminator()
generator.summary()
discriminator.summary()
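
The effect of a skip connection on tensor shapes is easiest to see on a toy model. The sketch below is a deliberately tiny two-level encoder-decoder, not the notebook's generator; the layer sizes are arbitrary assumptions chosen so it builds quickly.

```python
import tensorflow as tf

layers = tf.keras.layers

inputs = layers.Input(shape=[16, 16, 3])
d1 = layers.Conv2D(8, 4, strides=2, padding='same')(inputs)        # (8, 8, 8)
d2 = layers.Conv2D(16, 4, strides=2, padding='same')(d1)           # (4, 4, 16)
u1 = layers.Conv2DTranspose(8, 4, strides=2, padding='same')(d2)   # (8, 8, 8)

# Skip connection: encoder features at the same resolution are stacked
# onto the decoder features along the channel axis.
with_skip = layers.Concatenate()([u1, d1])                         # (8, 8, 16)

outputs = layers.Conv2DTranspose(3, 4, strides=2, padding='same',
                                 activation='tanh')(with_skip)     # (16, 16, 3)
model = tf.keras.Model(inputs, outputs)
print(tuple(with_skip.shape), model.output_shape)
```

The concatenation doubles the channel count at that level, which is why the shape comments in `build_generator` list, for example, 1024 channels after an `upsample(512, 4)` block.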

```json
{
    "required_ml_terms": ["U-Net", "skip connections", "image-to-image translation"],
    "problems_to_detect": [
        "The U-Net generator was built without skip connections (the `Concatenate` layer is missing). These connections are critical for image-to-image tasks as they allow the decoder to reuse low-level feature maps from the encoder, which is necessary to preserve spatial detail and produce sharp images."
    ]
}
```


```json
{
  "issue": "Training a GAN requires a carefully defined set of loss functions and optimizers, as well as a single-step training function that correctly updates both the generator and discriminator.",
  "action": "Define separate loss functions for the generator and discriminator. The discriminator loss penalizes misclassifying real and fake images. The generator loss has two components: a GAN loss to fool the discriminator and an L1 loss to ensure the generated image is structurally similar to the ground truth. An Adam optimizer is created for each model. Finally, the `@tf.function`-decorated `train_step` function is created to perform one step of training: it calculates losses, computes gradients, and applies them to update the model weights.",
  "state": "The complete logic for a single training step, including loss calculations and model updates, is encapsulated and optimized, ready to be called in a loop."
}
```
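
Written out, the two losses described in this step take the following form, where $x$ is the grayscale input, $y$ the color ground truth, $G$ the generator, $D$ the discriminator (which outputs logits), and $\lambda$ the L1 weight (`LAMBDA = 100` in the code). The notation is introduced here for illustration; $\mathrm{BCE}(t, z)$ denotes binary cross-entropy between target labels $t$ and logits $z$, averaged over the patch grid.

```latex
\begin{aligned}
\mathcal{L}_D &= \mathrm{BCE}\big(\mathbf{1},\, D(x, y)\big)
               + \mathrm{BCE}\big(\mathbf{0},\, D(x, G(x))\big) \\
\mathcal{L}_G &= \mathrm{BCE}\big(\mathbf{1},\, D(x, G(x))\big)
               + \lambda \cdot \operatorname{mean}\big(\lvert y - G(x) \rvert\big)
\end{aligned}
```

The discriminator is rewarded for labeling real pairs as 1 and generated pairs as 0, while the generator is rewarded both for getting its outputs labeled as 1 and for staying close to the ground truth in the L1 sense.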

In [None]:
LAMBDA = 100
loss_object = tf.keras.losses.BinaryCrossentropy(from_logits=True)

def discriminator_loss_incorrect(disc_real_output, disc_generated_output):
    """Discriminator loss with incorrect labels."""
    # Error: The labels are swapped. It encourages the discriminator to label real images as fake and vice versa.
    real_loss = loss_object(tf.zeros_like(disc_real_output), disc_real_output) # Should be tf.ones_like
    generated_loss = loss_object(tf.ones_like(disc_generated_output), disc_generated_output) # Should be tf.zeros_like
    return real_loss + generated_loss

def generator_loss_incorrect(disc_generated_output, gen_output, target):
    """Generator loss uses L2 (MSE) loss instead of the paper's recommended L1 (MAE) loss."""
    # Error: Using L2 loss can lead to blurrier images compared to L1 loss, which is generally preferred for image-to-image tasks.
    gan_loss = loss_object(tf.ones_like(disc_generated_output), disc_generated_output)
    l2_loss = tf.reduce_mean(tf.square(target - gen_output))
    total_gen_loss = gan_loss + (LAMBDA * l2_loss)
    return total_gen_loss, gan_loss, l2_loss

generator_optimizer = tf.keras.optimizers.Adam(2e-4, beta_1=0.5)
discriminator_optimizer = tf.keras.optimizers.Adam(2e-4, beta_1=0.5)

```json
{
    "required_ml_terms": ["GAN loss", "discriminator", "generator", "L1 loss", "L2 loss"],
    "problems_to_detect": [
        "In the discriminator loss, the labels for real and generated images are swapped. The model is trained to predict 0 for real images and 1 for fakes, which is the opposite of the desired behavior.",
        "The generator's reconstruction loss uses L2 (Mean Squared Error) instead of the L1 (Mean Absolute Error) loss. L1 loss is often preferred for image generation as it encourages less blurring."
    ]
}
```
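
For comparison, here is a sketch of the two losses with the labels the right way round and an L1 reconstruction term. The function names `discriminator_loss` and `generator_loss` are assumptions (the cell only defines the `_incorrect` variants), and `bce` is a local stand-in for the cell's `loss_object`.

```python
import tensorflow as tf

LAMBDA = 100
bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)


def discriminator_loss(disc_real_output, disc_generated_output):
    """Real pairs should be classified as 1, generated pairs as 0."""
    real_loss = bce(tf.ones_like(disc_real_output), disc_real_output)
    generated_loss = bce(tf.zeros_like(disc_generated_output), disc_generated_output)
    return real_loss + generated_loss


def generator_loss(disc_generated_output, gen_output, target):
    """Adversarial term plus LAMBDA-weighted L1 (mean absolute error) term."""
    gan_loss = bce(tf.ones_like(disc_generated_output), disc_generated_output)
    l1_loss = tf.reduce_mean(tf.abs(target - gen_output))
    total_gen_loss = gan_loss + LAMBDA * l1_loss
    return total_gen_loss, gan_loss, l1_loss


# Sanity check: a discriminator that is confidently right incurs almost no loss.
real_logits = tf.fill([1, 30, 30, 1], 10.0)
fake_logits = tf.fill([1, 30, 30, 1], -10.0)
print(float(discriminator_loss(real_logits, fake_logits)))
```

Feeding the same logits with the arguments swapped produces a large loss, which is a quick way to confirm the label convention before starting a long training run.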

```json
{
  "issue": "The model must be trained for a set number of epochs over the entire training dataset.",
  "action": "A `fit` function is defined to manage the training loop. It iterates for a specified number of epochs, and in each epoch, it iterates through every batch in the training dataset, calling the `train_step` function for each. It also times each epoch to monitor training speed.",
  "state": "The GAN model is trained on the landscape dataset for 10 epochs. The weights of the generator and discriminator are updated, and the model learns to perform the colorization task."
}
```

In [None]:
@tf.function
def train_step(input_image, target, epoch):
    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
      gen_output = generator(input_image, training=True)
      
      # Error 1: The discriminator is only shown the generated (fake) images, not the real ones.
      # It needs to see both to learn to differentiate them.
      disc_real_output = None # This is missing
      disc_generated_output = discriminator([input_image, gen_output], training=True)

      gen_total_loss, _, _ = generator_loss_incorrect(disc_generated_output, gen_output, target)
      # The discriminator loss function call will fail without the real output.
      # disc_loss = discriminator_loss_incorrect(disc_real_output, disc_generated_output) 

      # Error 2: The generator's gradients are applied to the discriminator's weights, and vice-versa.
      generator_gradients = gen_tape.gradient(gen_total_loss, generator.trainable_variables)
      # disc_gradients = disc_tape.gradient(disc_loss, discriminator.trainable_variables)
      
      # This will cause the models to train on the wrong objectives
      discriminator_optimizer.apply_gradients(zip(generator_gradients, discriminator.trainable_variables))
      # generator_optimizer.apply_gradients(zip(disc_gradients, generator.trainable_variables))

def fit(train_ds: tf.data.Dataset, epochs: int):
    for epoch in range(epochs):
        start = time.time()
        print(f"Epoch: {epoch + 1}/{epochs}")
        
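        # len(train_ds) is the number of batches; there is no need to materialize the dataset just to count it.
        for input_image, target in tqdm(train_ds, total=len(train_ds)):
            # train_step expects the epoch; passing it as a tensor avoids retracing @tf.function every epoch.
            train_step(input_image, target, tf.constant(epoch, dtype=tf.int64))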
        
        print(f'Time taken for epoch {epoch + 1} is {time.time()-start:.2f} sec\n')

EPOCHS = 10
fit(train_dataset, epochs=EPOCHS)

```json
{
    "required_ml_terms": ["training loop", "discriminator", "gradient descent", "optimizers"],
    "problems_to_detect": [
        "The training step is fundamentally flawed because the discriminator is never shown real images (`disc_real_output` is missing), so it cannot learn to distinguish real from fake.",
        "The calculated gradients are applied to the wrong models: the generator's gradients are used to update the discriminator, and vice-versa, which will prevent the GAN from training correctly."
    ]
}
```
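
A training step that avoids both problems might look like the sketch below. To keep it self-contained and fast, it uses tiny stand-in models (`tiny_generator`, `tiny_discriminator`, both hypothetical) instead of the notebook's U-Net and PatchGAN, inlines the loss terms, and drops the unused `epoch` argument; only the structure of the step is the point.

```python
import tensorflow as tf

layers = tf.keras.layers
LAMBDA = 100
bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)


def tiny_generator():
    inp = layers.Input(shape=[8, 8, 3])
    x = layers.Conv2D(4, 3, padding='same', activation='relu')(inp)
    out = layers.Conv2D(3, 3, padding='same', activation='tanh')(x)
    return tf.keras.Model(inp, out)


def tiny_discriminator():
    inp = layers.Input(shape=[8, 8, 3])
    tar = layers.Input(shape=[8, 8, 3])
    x = layers.Concatenate()([inp, tar])
    out = layers.Conv2D(1, 3, padding='same')(x)  # per-location logits
    return tf.keras.Model([inp, tar], out)


generator = tiny_generator()
discriminator = tiny_discriminator()
generator_optimizer = tf.keras.optimizers.Adam(2e-4, beta_1=0.5)
discriminator_optimizer = tf.keras.optimizers.Adam(2e-4, beta_1=0.5)


@tf.function
def train_step(input_image, target):
    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
        gen_output = generator(input_image, training=True)
        # The discriminator sees both the real pair and the generated pair.
        disc_real_output = discriminator([input_image, target], training=True)
        disc_generated_output = discriminator([input_image, gen_output], training=True)

        gan_loss = bce(tf.ones_like(disc_generated_output), disc_generated_output)
        l1_loss = tf.reduce_mean(tf.abs(target - gen_output))
        gen_total_loss = gan_loss + LAMBDA * l1_loss
        disc_loss = (bce(tf.ones_like(disc_real_output), disc_real_output)
                     + bce(tf.zeros_like(disc_generated_output), disc_generated_output))

    # Each model's gradients update that same model's weights.
    gen_grads = gen_tape.gradient(gen_total_loss, generator.trainable_variables)
    disc_grads = disc_tape.gradient(disc_loss, discriminator.trainable_variables)
    generator_optimizer.apply_gradients(zip(gen_grads, generator.trainable_variables))
    discriminator_optimizer.apply_gradients(zip(disc_grads, discriminator.trainable_variables))
    return gen_total_loss, disc_loss
```

Returning the two losses from the step makes it straightforward to log them per batch and spot a discriminator that overwhelms the generator early in training.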

```json
{
  "issue": "After training, the model's performance must be visually assessed to determine the quality of the colorization.",
  "action": "An evaluation function is created that takes the trained generator and a sample from the test set. It generates a colorized image from the grayscale input and then plots the input, the ground truth, and the model's prediction side-by-side for easy comparison.",
  "state": "The qualitative performance of the model is demonstrated through several plotted examples, showing its ability to generate plausible color images from grayscale inputs."
}
```

In [None]:
def generate_images(model, test_input, tar):
    """Generates images, but uses the model in training mode."""
    # Error: `training=True` is used for prediction. This is incorrect as it can lead to non-deterministic
    # outputs if layers like BatchNormalization or Dropout behave differently during inference.
    prediction = model(test_input, training=True)

    plt.figure(figsize=(15, 15))
    display_list = [test_input[0], tar[0], prediction[0]]
    title = ['Input Image', 'Ground Truth', 'Predicted Image']
    for i in range(3):
        plt.subplot(1, 3, i+1)
        plt.title(title[i])
        plt.imshow(display_list[i] * 0.5 + 0.5)
        plt.axis('off')
    plt.show()
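
The difference between the two modes can be demonstrated on a small model with a `Dropout` layer. This is an illustrative sketch with an arbitrary toy network, not the notebook's generator: with `training=False` dropout is disabled and repeated calls on the same input agree, whereas with `training=True` a fresh random mask is drawn on every call.

```python
import tensorflow as tf

layers = tf.keras.layers

model = tf.keras.Sequential([
    layers.Input(shape=[4]),
    layers.Dense(8),
    layers.Dropout(0.5),
    layers.Dense(2),
])
x = tf.ones([1, 4])

# Inference mode: dropout is a no-op, so the output is reproducible.
eval_a = model(x, training=False).numpy()
eval_b = model(x, training=False).numpy()

# Training mode: units are randomly zeroed, so outputs usually differ between calls.
train_a = model(x, training=True).numpy()
train_b = model(x, training=True).numpy()

print((eval_a == eval_b).all())
```

Applied to `generate_images`, the corresponding change would be to call `model(test_input, training=False)` when producing images for evaluation.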


```json
{
    "required_ml_terms": ["inference", "training mode", "batch normalization", "dropout"],
    "problems_to_detect": [
        "The model was called with `training=True` during evaluation. This is incorrect because layers like Dropout and BatchNormalization behave differently in training and inference modes, potentially leading to inconsistent or non-deterministic results when generating images."
    ]
}
```
