___

# <font color= #f6c049> **Soda Pop Project: Data Processing** </font>
#### <font color= #2E9AFE> `Deep Learning`</font>
<Strong> Sofía Maldonado, Óscar Josué Rocha & Viviana Toledo </Strong>

_27/02/2026._

___

In [1]:
# General
import numpy as np
import pandas as pd

# Visualization
import matplotlib.pyplot as plt

# Models
import tensorflow as tf
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, UpSampling2D, Concatenate, Dropout
from keras.optimizers import Adam

# Loss Function
from tensorflow.keras.applications import VGG19
from tensorflow.keras import Model, Input

2026-02-26 20:01:57.059206: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.


In [2]:
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))

Num GPUs Available:  1


# <font color= #f6c049> **Modeling** </font>

The model selected for the image generation task is a **Convolutional Autoencoder (CAE)** trained with a **perceptual loss function based on VGG-19**.

## <font color= #dba226> &ensp; • **Perceptual Loss with VGG Model** </font>

The output of the final convolutional layer of the VGG19 model (`block5_conv4`) is extracted and used as the feature space for the loss function. This layer encodes the high-level patterns the network learned during its ImageNet training. The VGG19 weights are frozen so that they are not updated while the autoencoder trains; the network is used only to compare the features of the input images with those of the generated ones.

In [3]:
def get_vgg_model():
    # Get VGG model trained with imagenet, with no fixed input shape
    vgg = VGG19(weights='imagenet', include_top=False, input_shape=(None, None, 3))
    vgg.trainable = False           # Freeze weights
    loss_model = Model(inputs=vgg.input, outputs=vgg.get_layer('block5_conv4').output)
    loss_model.trainable = False
    return loss_model

vgg_model = get_vgg_model()

I0000 00:00:1772157718.665952   29194 gpu_device.cc:2020] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 4143 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 4050 Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.9


With the feature extractor defined above, a custom loss function computes the mean squared difference between the VGG19 feature maps of the true and the predicted images:

In [4]:
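def perceptual_loss(y_true, y_pred):
    # Map both image batches to the VGG19 feature space (frozen weights)
    y_true_features = vgg_model(y_true)
    y_pred_features = vgg_model(y_pred)
    # Mean squared error between the two sets of feature maps
    return tf.reduce_mean(tf.square(y_true_features - y_pred_features))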
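
To make the computation concrete, here is a minimal NumPy sketch of the same formula. It does not use VGG19: `toy_features` is a hypothetical stand-in (2×2 average pooling) for the feature extractor, and `perceptual_loss_np` is an illustrative name, not part of the project code.

```python
import numpy as np

def toy_features(images):
    """Hypothetical stand-in for vgg_model: 2x2 average pooling per channel."""
    b, h, w, c = images.shape
    return images.reshape(b, h // 2, 2, w // 2, 2, c).mean(axis=(2, 4))

def perceptual_loss_np(y_true, y_pred, feature_fn=toy_features):
    """Mean squared difference between the feature maps of two image batches."""
    diff = feature_fn(y_true) - feature_fn(y_pred)
    return float(np.mean(np.square(diff)))

rng = np.random.default_rng(0)
y_true = rng.random((2, 8, 8, 3))       # batch of 2 random 8x8 RGB images
y_pred = y_true + 0.1                   # every pixel shifted by 0.1

loss_same = perceptual_loss_np(y_true, y_true)   # identical images
loss_shift = perceptual_loss_np(y_true, y_pred)  # features differ by 0.1 everywhere
print(loss_same, loss_shift)
```

Identical batches give a loss of zero, and the loss grows with the distance between feature maps rather than between raw pixels. One point worth checking in the real pipeline: the ImageNet weights of VGG19 were trained on inputs passed through `tf.keras.applications.vgg19.preprocess_input`, which expects pixel values in [0, 255], so whether an extra scaling step is needed depends on how the images were normalized.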

## <font color= #dba226> &ensp; • **Convolutional Autoencoder** </font>

Before building the CAE, we load the previously normalized and resized images:

In [5]:
train_ds = tf.data.Dataset.load('../data/processed/train_ds')
val_ds = tf.data.Dataset.load('../data/processed/val_ds')
test_ds = tf.data.Dataset.load('../data/processed/test_ds')

The images are 256x256 pixels. Each 2x2 max-pooling layer halves the height and the width, and each upsampling layer doubles them again, so the number of spatial positions per channel changes through the autoencoder as follows:

$$
65{,}536_{input} \rightarrow 16{,}384_{h_1} \rightarrow 4{,}096_{h_2} \rightarrow 1{,}024_{h_3} \rightarrow 256_z \rightarrow 1{,}024_{h_3} \rightarrow 4{,}096_{h_2} \rightarrow 16{,}384_{h_1} \rightarrow 65{,}536_{output}
$$
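
The sizes in this chain can be reproduced with a few lines of arithmetic. The sketch assumes four 2x2 pooling steps followed by four 2x2 upsampling steps, matching the encoder and decoder defined in the following cells; `spatial_sizes` is a hypothetical helper, not part of the project code.

```python
def spatial_sizes(side=256, n_pool=4, factor=2):
    """Side length after each pooling step, then after each upsampling step."""
    down = [side // factor**i for i in range(n_pool + 1)]   # 256, 128, 64, 32, 16
    up = down[-2::-1]                                       # 32, 64, 128, 256
    return down + up

sides = spatial_sizes()
positions = [s * s for s in sides]   # spatial positions per channel
print(positions)
```

These counts are per channel. With the 64 filters of the last encoder layer, the bottleneck holds 16 x 16 x 64 = 16,384 values per image.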

The training hyperparameters of the Convolutional Autoencoder are:

- Optimizer: Adam
- Loss: Perceptual Loss
- Epochs: 5

In [6]:
# Parameters
input_shape = (256, 256, 3)       # 256 x 256 = 65,536 pixels per channel

# Input 
input_layer = Input(shape=input_shape)

# Encoder
enc_layer_1 = Conv2D(128, kernel_size=3, activation='relu', padding='same')(input_layer)             
enc_max_pool_1 = MaxPooling2D(pool_size=(2,2))(enc_layer_1)

enc_layer_2 = Conv2D(128, kernel_size=3, activation='relu', padding='same')(enc_max_pool_1)             
enc_max_pool_2 = MaxPooling2D(pool_size=(2,2))(enc_layer_2)

enc_layer_3 = Conv2D(64, kernel_size=3, activation='relu', padding='same')(enc_max_pool_2)             
enc_max_pool_3 = MaxPooling2D(pool_size=(2,2))(enc_layer_3)

enc_layer_4 = Conv2D(64, kernel_size=3, activation='relu', padding='same')(enc_max_pool_3)             
enc_max_pool_4 = MaxPooling2D(pool_size=(2,2))(enc_layer_4)

In [7]:
# Decoder
dec_layer_1 = Conv2D(64, kernel_size=3, activation='relu', padding='same')(enc_max_pool_4)
dec_up_sampling_1 = UpSampling2D(size=(2,2))(dec_layer_1)

dec_layer_2 = Conv2D(64, kernel_size=3, activation='relu', padding='same')(dec_up_sampling_1)
dec_up_sampling_2 = UpSampling2D(size=(2,2))(dec_layer_2)

dec_layer_3 = Conv2D(128, kernel_size=3, activation='relu', padding='same')(dec_up_sampling_2)
dec_up_sampling_3 = UpSampling2D(size=(2,2))(dec_layer_3)

dec_layer_4 = Conv2D(128, kernel_size=3, activation='relu', padding='same')(dec_up_sampling_3)
dec_up_sampling_4 = UpSampling2D(size=(2,2))(dec_layer_4)

# Skip connection: concatenate the decoder output with the first encoder feature map
skip = Concatenate()([dec_up_sampling_4, enc_layer_1])
concat = Dropout(0.3)(skip)

decoded = Conv2D(3, kernel_size=3, activation='sigmoid', padding='same')(concat)

In [8]:
autoencoder = Model(input_layer, decoded)
autoencoder.compile(optimizer='adam', loss=perceptual_loss)
autoencoder.summary()
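
The output of `summary()` is not shown here, so the following sketch estimates the number of trainable parameters by hand. It assumes standard `Conv2D` layers with 3x3 kernels and a bias term, as defined above; pooling, upsampling, concatenation and dropout layers have no parameters, and the frozen VGG19 belongs to the loss rather than to the model. `conv2d_params` is a hypothetical helper.

```python
def conv2d_params(in_channels, out_channels, kernel_size=3):
    """Kernel weights plus one bias per filter for a standard Conv2D layer."""
    return (kernel_size * kernel_size * in_channels + 1) * out_channels

# (input channels, filters) of each Conv2D, in the order the layers are defined
conv_layers = [
    (3, 128), (128, 128), (128, 64), (64, 64),    # encoder
    (64, 64), (64, 64), (64, 128), (128, 128),    # decoder
    (128 + 128, 3),                               # output conv after the skip concatenation
]
per_layer = [conv2d_params(i, o) for i, o in conv_layers]
total_params = sum(per_layer)
print(per_layer, total_params)
```

Under these assumptions the autoencoder has 564,099 trainable parameters, which is small; the memory pressure during training comes mainly from the activations at full 256x256 resolution, not from the weights.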

In [9]:
# The batch size comes from the tf.data pipeline, so batch_size is not passed to fit()
autoencoder.fit(train_ds, epochs=5, validation_data=val_ds)

Epoch 1/5


2026-02-26 20:02:01.797938: I external/local_xla/xla/service/service.cc:163] XLA service 0x7a7d681197a0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2026-02-26 20:02:01.797962: I external/local_xla/xla/service/service.cc:171]   StreamExecutor device (0): NVIDIA GeForce RTX 4050 Laptop GPU, Compute Capability 8.9
2026-02-26 20:02:01.864323: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:269] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
2026-02-26 20:02:02.492768: I external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:473] Loaded cuDNN version 91900
2026-02-26 20:02:07.324392: W external/local_xla/xla/tsl/framework/bfc_allocator.cc:310] Allocator (GPU_0_bfc) ran out of memory trying to allocate 5.05GiB with freed_by_count=0. The caller indicates that this is not a failure, but this may mean that there could be performance gains if more memory were available.
2026-02-26 20:02:09.08084

UnknownError: Graph execution error:

Detected at node StatefulPartitionedCall defined at (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main

  File "<frozen runpy>", line 88, in _run_code

  File "/home/vivienne/apps/deep-learning/soda_pop_proy/.venv/lib/python3.11/site-packages/ipykernel_launcher.py", line 18, in <module>

  File "/home/vivienne/apps/deep-learning/soda_pop_proy/.venv/lib/python3.11/site-packages/traitlets/config/application.py", line 1075, in launch_instance

  File "/home/vivienne/apps/deep-learning/soda_pop_proy/.venv/lib/python3.11/site-packages/ipykernel/kernelapp.py", line 758, in start

  File "/home/vivienne/apps/deep-learning/soda_pop_proy/.venv/lib/python3.11/site-packages/tornado/platform/asyncio.py", line 211, in start

  File "/home/vivienne/.local/share/uv/python/cpython-3.11.14-linux-x86_64-gnu/lib/python3.11/asyncio/base_events.py", line 608, in run_forever

  File "/home/vivienne/.local/share/uv/python/cpython-3.11.14-linux-x86_64-gnu/lib/python3.11/asyncio/base_events.py", line 1936, in _run_once

  File "/home/vivienne/.local/share/uv/python/cpython-3.11.14-linux-x86_64-gnu/lib/python3.11/asyncio/events.py", line 84, in _run

  File "/home/vivienne/apps/deep-learning/soda_pop_proy/.venv/lib/python3.11/site-packages/ipykernel/kernelbase.py", line 621, in shell_main

  File "/home/vivienne/apps/deep-learning/soda_pop_proy/.venv/lib/python3.11/site-packages/ipykernel/kernelbase.py", line 478, in dispatch_shell

  File "/home/vivienne/apps/deep-learning/soda_pop_proy/.venv/lib/python3.11/site-packages/ipykernel/ipkernel.py", line 372, in execute_request

  File "/home/vivienne/apps/deep-learning/soda_pop_proy/.venv/lib/python3.11/site-packages/ipykernel/kernelbase.py", line 834, in execute_request

  File "/home/vivienne/apps/deep-learning/soda_pop_proy/.venv/lib/python3.11/site-packages/ipykernel/ipkernel.py", line 464, in do_execute

  File "/home/vivienne/apps/deep-learning/soda_pop_proy/.venv/lib/python3.11/site-packages/ipykernel/zmqshell.py", line 663, in run_cell

  File "/home/vivienne/apps/deep-learning/soda_pop_proy/.venv/lib/python3.11/site-packages/IPython/core/interactiveshell.py", line 3123, in run_cell

  File "/home/vivienne/apps/deep-learning/soda_pop_proy/.venv/lib/python3.11/site-packages/IPython/core/interactiveshell.py", line 3178, in _run_cell

  File "/home/vivienne/apps/deep-learning/soda_pop_proy/.venv/lib/python3.11/site-packages/IPython/core/async_helpers.py", line 128, in _pseudo_sync_runner

  File "/home/vivienne/apps/deep-learning/soda_pop_proy/.venv/lib/python3.11/site-packages/IPython/core/interactiveshell.py", line 3400, in run_cell_async

  File "/home/vivienne/apps/deep-learning/soda_pop_proy/.venv/lib/python3.11/site-packages/IPython/core/interactiveshell.py", line 3641, in run_ast_nodes

  File "/home/vivienne/apps/deep-learning/soda_pop_proy/.venv/lib/python3.11/site-packages/IPython/core/interactiveshell.py", line 3701, in run_code

  File "/tmp/ipykernel_29194/835743967.py", line 1, in <module>

  File "/home/vivienne/apps/deep-learning/soda_pop_proy/.venv/lib/python3.11/site-packages/keras/src/utils/traceback_utils.py", line 117, in error_handler

  File "/home/vivienne/apps/deep-learning/soda_pop_proy/.venv/lib/python3.11/site-packages/keras/src/backend/tensorflow/trainer.py", line 399, in fit

  File "/home/vivienne/apps/deep-learning/soda_pop_proy/.venv/lib/python3.11/site-packages/keras/src/backend/tensorflow/trainer.py", line 241, in function

  File "/home/vivienne/apps/deep-learning/soda_pop_proy/.venv/lib/python3.11/site-packages/keras/src/backend/tensorflow/trainer.py", line 154, in multi_step_on_iterator

  File "/home/vivienne/apps/deep-learning/soda_pop_proy/.venv/lib/python3.11/site-packages/keras/src/backend/tensorflow/trainer.py", line 125, in wrapper

Failed to determine best cudnn convolution algorithm for:
%cudnn-conv-bw-input.40 = (f32[32,256,256,256]{3,2,1,0}, u8[0]{0}) custom-call(%multiply.50, %bitcast.1936), window={size=3x3 pad=1_1x1_1}, dim_labels=bf01_oi01->bf01, custom_call_target="__cudnn$convBackwardInput", metadata={op_type="Conv2DBackpropInput" op_name="gradient_tape/functional_1_1/conv2d_8_1/convolution/Conv2DBackpropInput" source_file="/home/vivienne/apps/deep-learning/soda_pop_proy/.venv/lib/python3.11/site-packages/tensorflow/python/framework/ops.py" source_line=1221}, backend_config={"operation_queue_id":"0","wait_on_operation_queues":[],"cudnn_conv_backend_config":{"activation_mode":"kNone","conv_result_scale":1,"side_input_scale":0,"leakyrelu_alpha":0},"force_earliest_schedule":false,"reification_cost":[]}

Original error: RESOURCE_EXHAUSTED: Out of memory while trying to allocate 2164260864 bytes. [tf-allocator-allocation-error='']

To ignore this failure and try to use a fallback algorithm (which may have suboptimal performance), use XLA_FLAGS=--xla_gpu_strict_conv_algorithm_picker=false.  Please also file a bug for the root cause of failing autotuning.
	 [[{{node StatefulPartitionedCall}}]] [Op:__inference_multi_step_on_iterator_5745]
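
The tensor named in the error, `f32[32,256,256,256]`, can be sized by hand to see why the allocation fails on a GPU that reports 4143 MB. The sketch below only does the arithmetic; `tensor_bytes` is a hypothetical helper and the smaller batch sizes are illustrative assumptions, not tested settings.

```python
def tensor_bytes(shape, bytes_per_value=4):
    """Size in bytes of a dense tensor; float32 uses 4 bytes per value."""
    n = 1
    for dim in shape:
        n *= dim
    return n * bytes_per_value

# Shape reported in the error: batch 32, 256 channels, 256 x 256 pixels
failing = tensor_bytes((32, 256, 256, 256))
print(failing / 2**30, "GiB")

# Same tensor for smaller, hypothetical batch sizes
smaller = {batch: tensor_bytes((batch, 256, 256, 256)) for batch in (16, 8, 4)}
for batch, size in smaller.items():
    print(batch, size / 2**20, "MiB")
```

A single gradient tensor of this shape takes 2 GiB, about half of the available GPU memory, before counting the other activations and the VGG19 features used by the loss. If the saved datasets are batched, a smaller batch could probably be obtained with `train_ds.unbatch().batch(8)`; whether that is enough to fit in memory would need to be checked on the actual hardware.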