## Introduction
An autoencoder is a special type of deep learning architecture that consists of two networks: an encoder and a decoder.
The encoder, through a series of convolutional layers and downsampling steps, learns a reduced-dimensional representation of the input data, while the decoder, through convolutional layers and upsampling, attempts to regenerate the data from these representations. A well-trained decoder is able to regenerate data that is identical, or as close as possible, to the original input.
Autoencoders are generally used for anomaly detection, image denoising, and image colorization. Here, I am going to colorize landscape images using an autoencoder.
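
To make the downsample-then-upsample idea concrete, here is a minimal NumPy sketch. It is not a trained autoencoder: `downsample` and `upsample` are hypothetical stand-ins (2x2 average pooling and pixel repetition) for the learned encoder and decoder steps, and the input is random data. It only illustrates how the spatial size shrinks to a compact code and grows back.

```python
import numpy as np

def downsample(x):
    """Halve height and width by averaging each 2x2 block (stand-in for an encoder step)."""
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def upsample(x):
    """Double height and width by repeating each pixel (stand-in for a decoder step)."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

rng = np.random.default_rng(0)
image = rng.random((160, 160, 3)).astype("float32")  # dummy image in [0, 1]

code = downsample(downsample(image))       # reduced representation
reconstruction = upsample(upsample(code))  # back to the input size

print("input:", image.shape, "code:", code.shape, "output:", reconstruction.shape)
print("reconstruction MSE:", float(np.mean((image - reconstruction) ** 2)))
```

A real autoencoder replaces these fixed operations with learned convolutions, so that the reconstruction error becomes small on the kind of images it was trained on.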

<img src = 'https://miro.medium.com/max/600/1*nqzWupxC60iAH2dYrFT78Q.png' >

## Image Colorization
Colorizing images by hand with image-editing software requires a large amount of human effort, time, and skill, but autoencoders have made this task quite easy. Automatic image colorization often relies on this class of convolutional neural networks (CNNs). These networks are able to distill the salient features of an image and then regenerate the image based on these learned features.

<img src = "https://tinyclouds.org/colorize/best/6.jpg">

## Import necessary libraries

In [None]:
import numpy as np
import tensorflow as tf
import keras
import cv2
from keras.layers import MaxPool2D,Conv2D,UpSampling2D,Input,Dropout
from keras.models import Sequential
from keras.preprocessing.image import img_to_array
import os
from tqdm import tqdm
import re
import matplotlib.pyplot as plt

### Getting landscape image data, resizing them and appending to arrays
To get the images in sorted order I have defined the function sorted_alphanumeric. Here, I have used the OpenCV library to read and resize the images. Finally, the images are normalized, converted to arrays, and appended to initially empty lists.
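
To see why a natural (alphanumeric) sort matters for numbered files, here is a small sketch. The helper `natural_key` and the file names are made up for illustration; the idea is the same as in sorted_alphanumeric.

```python
import re

def natural_key(name):
    """Split a name into text and integer chunks so that 'img10' sorts after 'img2'."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]

names = ["img10.jpg", "img2.jpg", "img1.jpg"]  # hypothetical file names

plain = sorted(names)                     # character-by-character comparison
natural = sorted(names, key=natural_key)  # numbers compared as integers

print(plain)
print(natural)
```

With a plain string sort, `img10.jpg` lands before `img2.jpg`, which would break any pairing that relies on file order.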

In [None]:
import os
import re
import cv2
import numpy as np
from tqdm import tqdm
from tensorflow.keras.preprocessing.image import img_to_array

# Helper function for alphanumeric sorting
def sorted_alphanumeric(data):
    convert = lambda text: int(text) if text.isdigit() else text.lower()
    alphanum_key = lambda key: [convert(c) for c in re.split('([0-9]+)', key)]
    return sorted(data, key=alphanum_key)

# Defining the size of the image
SIZE = 160
color_img = []
gray_img = []

# Path to color images
color_path = '/kaggle/input/imagenetsubsub'
color_files = os.listdir(color_path)
color_files = sorted_alphanumeric(color_files)

# Processing color and grayscale images
for file_name in tqdm(color_files):
    if file_name == 'ILSVRC2012_test_00080098':
        break
    else:
        # Read color image
        color_image = cv2.imread(os.path.join(color_path, file_name), 1)
        
        if color_image is None:
            continue  # Skip if the file is not a valid image

        # Convert to RGB
        color_image = cv2.cvtColor(color_image, cv2.COLOR_BGR2RGB)
        # Resize image
        color_image = cv2.resize(color_image, (SIZE, SIZE))
        # Normalize and append to color images
        color_image = color_image.astype('float32') / 255.0
        color_img.append(img_to_array(color_image))

        # Convert to grayscale
        gray_image = cv2.cvtColor(color_image, cv2.COLOR_RGB2GRAY)
        # Expand dimensions to match the expected shape (H, W, 1)
        gray_image = np.expand_dims(gray_image, axis=-1)
        # Append to grayscale images
        gray_img.append(img_to_array(gray_image))

In [None]:
# Print counts to confirm processing
print("Number of color images:", len(color_img))
print("Number of grayscale images:", len(gray_img))
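
For intuition about the grayscale step, here is a NumPy-only sketch of an RGB-to-gray conversion. It uses the standard luma weights 0.299, 0.587 and 0.114, which are the ones OpenCV documents for its RGB-to-gray conversion. The function name `rgb_to_gray` and the 2x2 test image are my own, for illustration only.

```python
import numpy as np

def rgb_to_gray(rgb):
    """Weighted sum of R, G, B using the luma weights 0.299, 0.587, 0.114."""
    weights = np.array([0.299, 0.587, 0.114], dtype="float32")
    gray = rgb @ weights              # (H, W)
    return gray[..., np.newaxis]      # (H, W, 1), the shape the notebook stores

# A tiny 2x2 image: red, green, blue and white pixels, normalized to [0, 1]
pixels = np.array([[[255, 0, 0], [0, 255, 0]],
                   [[0, 0, 255], [255, 255, 255]]], dtype="float32") / 255.0

gray = rgb_to_gray(pixels)
print(gray.shape)
print(gray.squeeze())
```

Green contributes the most to perceived brightness and blue the least, which is why the three pure colors map to different gray levels.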


### Plotting color image and its corresponding grayscale image

In [None]:
def plot_images(color, grayscale):
    plt.figure(figsize=(15, 15))
    plt.subplot(1, 2, 1)
    plt.title('Color Image', color='green', fontsize=20)
    plt.imshow(color)
    plt.subplot(1, 2, 2)
    plt.title('Grayscale Image', color='black', fontsize=20)
    plt.imshow(grayscale.squeeze(), cmap='gray')  # Squeeze to remove extra dimension
    plt.show()


**Plotting image pair**

In [None]:
for i in range(3, 10):
    plot_images(color_img[i], gray_img[i])

### Slicing and reshaping
The image list is sliced into two parts: the first 3000 images are used for training and the remaining images for testing.
After slicing, I convert the lists to NumPy arrays so that the images can be fed directly into the network.

In [None]:
train_gray_image = gray_img[:3000]
train_color_image = color_img[:3000]

test_gray_image = gray_img[3000:]
test_color_image = color_img[3000:]

# Ensuring the shapes match without reshaping unnecessarily
train_g = np.array(train_gray_image)  # Grayscale: (3000, SIZE, SIZE, 1)
train_c = np.array(train_color_image)  # Color: (3000, SIZE, SIZE, 3)

print('Train gray image shape:', train_g.shape)
print('Train color image shape:', train_c.shape)

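test_g = np.array(test_gray_image)  # Grayscale: (remaining, SIZE, SIZE, 1)
test_c = np.array(test_color_image)  # Color: (remaining, SIZE, SIZE, 3)

print('Test gray image shape:', test_g.shape)
print('Test color image shape:', test_c.shape)

# The network below takes 3-channel input, so repeat the gray channel three times.
# These names are used by the training and evaluation cells later on.
train_g_rgb = np.repeat(train_g, 3, axis=-1)  # (3000, SIZE, SIZE, 3)
test_g_rgb = np.repeat(test_g, 3, axis=-1)    # (remaining, SIZE, SIZE, 3)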


## Defining our model
The encoder part of our model consists of blocks of convolution layers with different numbers of filters. Here, strided convolution is used for downsampling.
Similarly, the decoder part of our model consists of transpose convolution layers. The decoder upsamples the feature maps downsampled by the encoder.
Since features are lost between the encoder and decoder, I have concatenated the corresponding encoder and decoder layers to preserve them. See the U-Net architecture for a better understanding.
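
The shape bookkeeping behind this design can be sketched in plain Python. This is only arithmetic, not a model: it assumes a 160x160 input, stride-2 layers with `padding='same'`, and the filter counts used by build_generator in this notebook. The helper `same_conv_out` is a hypothetical name.

```python
import math

def same_conv_out(size, stride=2):
    """Spatial output size of a strided convolution with padding='same'."""
    return math.ceil(size / stride)

size = 160
encoder_filters = [128, 128, 256, 512, 512]
encoder_shapes = []
for filters in encoder_filters:
    size = same_conv_out(size)
    encoder_shapes.append((size, size, filters))
print("encoder:", encoder_shapes)

# Decoder: each transpose convolution doubles the size, then the result is
# concatenated with the encoder output of the same resolution (skip connection).
decoder_filters = [512, 256, 128, 128]
skips = encoder_shapes[-2::-1]  # encoder outputs from second deepest to shallowest
decoder_shapes = []
for filters, skip in zip(decoder_filters, skips):
    size *= 2
    assert size == skip[0], "skip connection needs matching spatial size"
    decoder_shapes.append((size, size, filters + skip[2]))
print("decoder:", decoder_shapes)
```

The concatenation only works because every decoder stage lands on exactly the same height and width as its encoder counterpart, which is why an input size divisible by 32 is convenient here.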

In [None]:
from tensorflow.keras import layers  # use the same Keras namespace as Model and VGG16 below
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.applications import VGG16
import tensorflow.keras.backend as K



In [None]:
def down(filters, kernel_size, apply_batch_normalization=True):
    downsample = tf.keras.models.Sequential()
    downsample.add(layers.Conv2D(filters, kernel_size, padding='same', strides=2))
    if apply_batch_normalization:
        downsample.add(layers.BatchNormalization())
    downsample.add(layers.LeakyReLU())
    return downsample

def up(filters, kernel_size, dropout=False):
    upsample = tf.keras.models.Sequential()
    upsample.add(layers.Conv2DTranspose(filters, kernel_size, padding='same', strides=2))
    if dropout:
        upsample.add(layers.Dropout(0.2))
    upsample.add(layers.LeakyReLU())
    return upsample

def build_generator():
    inputs = layers.Input(shape=[160, 160, 3])
    
    # Downsampling
    d1 = down(128, (3, 3), False)(inputs)
    d2 = down(128, (3, 3), False)(d1)
    d3 = down(256, (3, 3), True)(d2)
    d4 = down(512, (3, 3), True)(d3)
    d5 = down(512, (3, 3), True)(d4)
    
    # Upsampling
    u1 = up(512, (3, 3), False)(d5)
    u1 = layers.concatenate([u1, d4])
    u2 = up(256, (3, 3), False)(u1)
    u2 = layers.concatenate([u2, d3])
    u3 = up(128, (3, 3), False)(u2)
    u3 = layers.concatenate([u3, d2])
    u4 = up(128, (3, 3), False)(u3)
    u4 = layers.concatenate([u4, d1])
    u5 = up(3, (3, 3), False)(u4)
    u5 = layers.concatenate([u5, inputs])
    output = layers.Conv2D(3, (2, 2), strides=1, padding='same', activation='tanh')(u5)
    
    return Model(inputs, output)


In [None]:
def build_discriminator():
    inputs = layers.Input(shape=(160, 160, 6))  # Grayscale + Color image
    x = layers.Conv2D(64, (4, 4), strides=2, padding='same')(inputs)
    x = layers.LeakyReLU()(x)
    
    x = layers.Conv2D(128, (4, 4), strides=2, padding='same')(x)
    x = layers.BatchNormalization()(x)
    x = layers.LeakyReLU()(x)
    
    x = layers.Conv2D(256, (4, 4), strides=2, padding='same')(x)
    x = layers.BatchNormalization()(x)
    x = layers.LeakyReLU()(x)
    
    x = layers.Conv2D(512, (4, 4), strides=2, padding='same')(x)
    x = layers.BatchNormalization()(x)
    x = layers.LeakyReLU()(x)
    
    x = layers.Flatten()(x)
    x = layers.Dense(1, activation='sigmoid')(x)
    
    return Model(inputs, x)


In [None]:
# Load VGG16 pre-trained model
vgg = VGG16(weights='imagenet', include_top=False, input_shape=(160, 160, 3))
feature_extractor = Model(inputs=vgg.input, outputs=vgg.get_layer('block3_conv3').output)
feature_extractor.trainable = False  # Freeze VGG16 layers

def perceptual_loss(y_true, y_pred):
    # Ensure inputs are 4D tensors
    y_true = tf.reshape(y_true, (-1, 160, 160, 3))
    y_pred = tf.reshape(y_pred, (-1, 160, 160, 3))
    
    # Scale from [0, 1] to the [0, 255] pixel range VGG16 was trained on
    # (no mean subtraction or channel reordering is applied here)
    y_true = y_true * 255.0
    y_pred = y_pred * 255.0

    # Extract VGG features
    y_true_features = feature_extractor(y_true)
    y_pred_features = feature_extractor(y_pred)

    # Compute perceptual loss as MSE in feature space
    return K.mean(K.square(y_true_features - y_pred_features))
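
Written out, the loss in the cell above is, as far as I can tell, a mean squared error in VGG16 feature space. The symbols $\phi$ (the frozen `feature_extractor`, i.e. VGG16 up to `block3_conv3`) and $N$ (the total number of feature values in the batch) are my own notation, not something the code defines:

$$
\mathcal{L}_{\text{perc}}(y, \hat{y}) = \frac{1}{N} \sum_{k=1}^{N} \bigl( \phi(255\,y)_k - \phi(255\,\hat{y})_k \bigr)^2
$$

Here $y$ is the ground-truth color image and $\hat{y}$ the generated one, both expected in the $[0, 1]$ range before scaling.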




In [None]:
generator = build_generator()
discriminator = build_discriminator()

# Compile the discriminator
discriminator.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.0002, beta_1=0.5),
                      loss='binary_crossentropy',
                      metrics=['accuracy'])

# Combined model: Grayscale input -> Generator -> Discriminator
# Freeze the discriminator inside the combined model so that the generator step
# does not update it. This is the usual tf.keras GAN idiom; how `trainable`
# interacts with an already compiled model can differ between Keras versions.
discriminator.trainable = False

grayscale_input = layers.Input(shape=(160, 160, 3))
color_output = generator(grayscale_input)

# Combine grayscale + generated color image
combined_input = layers.Concatenate()([grayscale_input, color_output])
validity = discriminator(combined_input)

# Two outputs: the discriminator score (adversarial loss) and the generated
# image (perceptual loss against the ground-truth color image)
combined = Model(grayscale_input, [validity, color_output])
combined.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.0002, beta_1=0.5),
                 loss=['binary_crossentropy', perceptual_loss])


In [None]:
import numpy as np

epochs = 50
batch_size = 50

real = np.ones((batch_size, 1))  # Real labels
fake = np.zeros((batch_size, 1))  # Fake labels

In [None]:
for epoch in range(epochs):
    for batch in range(0, len(train_g_rgb), batch_size):
        # Get a batch of grayscale and color images
        gray_batch = train_g_rgb[batch:batch + batch_size]  # (batch_size, 160, 160, 3)
        color_batch = train_c[batch:batch + batch_size]     # (batch_size, 160, 160, 3)

        # Skip incomplete batches
        if gray_batch.shape[0] < batch_size:
            continue

        # Generate fake color images
        fake_color_batch = generator.predict(gray_batch, verbose=0)

        # Train the discriminator
        real_combined = np.concatenate([gray_batch, color_batch], axis=-1)  # Shape: (batch_size, 160, 160, 6)
        fake_combined = np.concatenate([gray_batch, fake_color_batch], axis=-1)  # Shape: (batch_size, 160, 160, 6)

        d_loss_real = discriminator.train_on_batch(real_combined, real)
        d_loss_fake = discriminator.train_on_batch(fake_combined, fake)
        d_loss = 0.5 * np.add(d_loss_real, d_loss_fake)

        # Train the generator: fool the discriminator (target label "real")
        # and match the ground-truth colors (perceptual loss)
        g_loss = combined.train_on_batch(gray_batch, [real, color_batch])

    # d_loss is [loss, accuracy]; g_loss is [total, adversarial, perceptual]
    print(f"Epoch {epoch + 1}/{epochs}, Discriminator Loss: {d_loss[0]}, Generator Loss: {g_loss[0]}")


In [None]:
print(f"gray_batch shape: {gray_batch.shape}")
print(f"color_batch shape: {color_batch.shape}")
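
The discriminator loss used in the training loop is a binary cross-entropy averaged over the real and the fake half of each step. Here is a NumPy sketch of that calculation. The scores are invented numbers standing in for discriminator outputs, and `binary_crossentropy` is my own helper, not the Keras implementation.

```python
import numpy as np

def binary_crossentropy(labels, probs, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    probs = np.clip(probs, eps, 1.0 - eps)  # avoid log(0)
    return float(np.mean(-(labels * np.log(probs) + (1 - labels) * np.log(1 - probs))))

# Hypothetical discriminator outputs for 4 real pairs and 4 generated pairs
real_scores = np.array([0.9, 0.8, 0.7, 0.95])
fake_scores = np.array([0.2, 0.1, 0.4, 0.05])

d_loss_real = binary_crossentropy(np.ones(4), real_scores)   # real pairs should score near 1
d_loss_fake = binary_crossentropy(np.zeros(4), fake_scores)  # fake pairs should score near 0
d_loss = 0.5 * (d_loss_real + d_loss_fake)

print(d_loss_real, d_loss_fake, d_loss)
```

A discriminator that outputs 0.5 for everything gets a loss of about 0.693 (the natural log of 2), which is a handy reference point when reading the training log.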


### Fitting our model

In [None]:

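# Assumption: `model` is not defined anywhere above, so the generator is
# reused here as the colorization model
model = generator

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss='mean_absolute_error',
              metrics=['acc'])

model.fit(train_g_rgb, train_c, epochs=50, batch_size=50, verbose=1)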


In [None]:
import numpy as np

# Convert to NumPy arrays
test_color_image = np.array(test_color_image)
test_g_rgb = np.array(test_g_rgb)

# Check shapes
print("Shape of test_g_rgb:", test_g_rgb.shape)
print("Shape of test_color_image:", test_color_image.shape)

# Ensure equal number of samples
min_samples = min(test_g_rgb.shape[0], test_color_image.shape[0])
test_g_rgb = test_g_rgb[:min_samples]
test_color_image = test_color_image[:min_samples]

# Valid


In [None]:
loss, acc = model.evaluate(test_g_rgb, test_color_image)
print(f"Loss: {loss}, Accuracy: {acc}")


### Plotting colorized image along with grayscale and color image

In [None]:
print(f"Shape of test_gray_image[{i}]:", test_gray_image[i].shape)


In [None]:
from skimage.metrics import structural_similarity as ssim
import numpy as np

# Define accuracy calculation function
def calculate_metrics(color, predicted):
    mse = np.mean((color - predicted) ** 2)
    psnr = 20 * np.log10(1.0 / np.sqrt(mse)) if mse > 0 else float('inf')
    # channel_axis replaces the deprecated multichannel argument (scikit-image >= 0.19)
    ssim_value = ssim(color, predicted, channel_axis=-1, data_range=color.max() - color.min())
    return mse, psnr, ssim_value

# Modified plotting function with accuracy metrics
def plot_images_with_metrics(color, grayscale, predicted, mse, psnr, ssim_value):
    plt.figure(figsize=(15,15))
    plt.subplot(1,3,1)
    plt.title('Color Image', color='green', fontsize=20)
    plt.imshow(color)
    plt.subplot(1,3,2)
    plt.title('Grayscale Image', color='black', fontsize=20)
    plt.imshow(grayscale, cmap='gray')
    plt.subplot(1,3,3)
    plt.title(f'Predicted Image\nMSE: {mse:.4f}\nPSNR: {psnr:.2f} dB\nSSIM: {ssim_value:.4f}', color='red', fontsize=16)
    plt.imshow(predicted)
    plt.show()

results = []
# Loop to display images and their metrics
for i in range(50, 58):
    # Fix shape: convert (160, 160, 1) -> (160, 160, 3)
    gray_image = np.repeat(test_gray_image[i], 3, axis=-1)  # Duplicate channel to make it RGB
    gray_image = gray_image.reshape(1, SIZE, SIZE, 3)  # Add batch dimension

    # Predict
    predicted = model.predict(gray_image)

    # Ensure predicted output has the correct shape
    if predicted.shape[-1] == 1:  # If the output is single-channel
        predicted = np.repeat(predicted, 3, axis=-1)  # Convert to 3 channels

    predicted = predicted.squeeze()  # Remove batch dimension
    predicted = np.clip(predicted, 0.0, 1.0)  # Clip values to [0, 1]

    # Calculate metrics
    mse, psnr, ssim_value = calculate_metrics(test_color_image[i], predicted)
    results.append([mse, psnr, ssim_value])

    # Plot results
    plot_images_with_metrics(test_color_image[i], test_gray_image[i].squeeze(), predicted, mse, psnr, ssim_value)


In [None]:
results = np.array(results)

# Calculate average metrics
average_metrics = results.mean(axis=0)

# Output the average metrics
average_metrics
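
As a sanity check on the PSNR values, here is a small worked example in NumPy. It assumes images scaled to [0, 1], as in this notebook; the function name `psnr` and the constant test images are mine. The formula 10 * log10(max^2 / MSE) is equivalent to the 20 * log10(1 / sqrt(MSE)) form used in calculate_metrics when max is 1.

```python
import numpy as np

def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB for images with values in [0, max_value]."""
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))

reference = np.full((4, 4, 3), 0.5)
estimate = reference + 0.1  # constant error of 0.1, so the MSE is 0.01

print(psnr(reference, estimate))   # about 20 dB
print(psnr(reference, reference))  # inf
```

Every tenfold reduction in MSE adds 10 dB, so an average PSNR in the high 20s already corresponds to a per-pixel error of only a few percent.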

# Thanks for your visit.
## Any suggestions to improve this model are highly appreciated.
# Feel free to comment