<br>
<br>

![](https://upload.wikimedia.org/wikipedia/en/5/5f/Western_Institute_of_Technology_and_Higher_Education_logo.png)

**Instituto Tecnológico y de Estudios Superiores de Occidente**

**Master's in Data Science**

**Deep Learning**

# Activity 6: Building a model to localize airplanes #

<br>
<br>

* * *

Student: Daniel Nuño <br>
Professor: Dr. Francisco Cervantes <br>
Due date: March 12, 2023 <br>

* * *

<br>
<br>

## 1) Implement a generator for the datasets

Training, validation, and test. If necessary, apply data augmentation to generate more images. (20 pts)
Before starting the implementation, determine what the input shape and the expected output shape should be for your model.


> For this example solution, we will implement a **data generator that takes as input a list of strings** with the following information:
>
> **"images_path/image_name.jpg,x_min,y_min,x_max,y_max"**
>
> Initially, the **airplanes.csv** file contains the annotations in the following format:
>
> **"image_name.jpg,x_min,y_min,x_max,y_max"**
>
> So the path where the images are stored has to be prepended to each annotation.
>
> Optionally, to reduce the data generator's work during training, we can scale the *bounding box (x_min, y_min, x_max, y_max)* of each image according to its dimensions (height, width, 3). That is:
>
> *  x_min = x_min/width
> *  y_min = y_min/height
> *  x_max = x_max/width
> *  y_max = y_max/height
>
> To implement the scaling, we need to open each image to read its width and height.
>
> Note: pay attention to how data types are handled.
>
> The **output of the generator** is a **dataset** in which each element is a pair (x, y), such that
>
> *  x:  image of dimensions (h, w, 3)
> *  y:  bounding box (x_min, y_min, x_max, y_max)

In [1]:
import cv2 as cv
import random
import tensorflow as tf
from tensorflow.data import AUTOTUNE

In [3]:
img_path= "C:/Users/nuno/Desktop/deep-learning-data/activity6/dataset/images"
annotations_path = "C:/Users/nuno/Desktop/deep-learning-data/activity6/dataset/airplanes.csv"

In [4]:
def preprocess_annotation(img_src, annotation):
  filename, x_min, y_min, x_max, y_max = annotation.split(",")
  img_path = img_src + "/" + filename
  img = cv.imread(img_path)
  h, w, _ = img.shape
  annotation = "".join([img_path, ",", str(float(x_min)/w), ",",
                        str(float(y_min)/h), ",",
                        str(float(x_max)/w), ",",
                        str(float(y_max)/h)])
  return annotation

In [5]:
txt = preprocess_annotation(img_path, "image_0002.jpg,59,35,342,153")
print(txt)

C:/Users/nuno/Desktop/deep-learning-data/activity6/dataset/images/image_0002.jpg,0.14713216957605985,0.19021739130434784,0.8528678304239401,0.8315217391304348
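
To draw a predicted box on the original image, the normalized coordinates have to be mapped back to pixels. The following is a minimal sketch of that inverse step. `normalize_bbox` and `denormalize_bbox` are hypothetical helper names, and the 401×184 image size is an assumption inferred from the printed ratios (59/401 ≈ 0.1471, 35/184 ≈ 0.1902), not a value read from the file.

```python
def normalize_bbox(bbox, width, height):
    """Scale pixel coordinates (x_min, y_min, x_max, y_max) to the [0, 1] range."""
    x_min, y_min, x_max, y_max = bbox
    return (x_min / width, y_min / height, x_max / width, y_max / height)


def denormalize_bbox(bbox, width, height):
    """Map normalized coordinates back to integer pixel coordinates."""
    x_min, y_min, x_max, y_max = bbox
    return (round(x_min * width), round(y_min * height),
            round(x_max * width), round(y_max * height))


# Assumed image size (width=401, height=184), inferred from the printed output.
pixel_box = (59, 35, 342, 153)
norm_box = normalize_bbox(pixel_box, width=401, height=184)
restored = denormalize_bbox(norm_box, width=401, height=184)
print(norm_box)
print(restored)
```

The same `denormalize_bbox` call could be applied to a model prediction, provided the width and height of the original image (not the 128×128 resized input) are used.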


In [6]:
def build_datasets(images_path, annotation_path, train_split=0.95, val_split=0.1):
  
  # Read the annotation file and close it once it has been loaded
  with open(annotation_path) as f:
    annotations = f.read().splitlines()
  examples = [ preprocess_annotation(images_path, item) for item in annotations ]
  
  random.shuffle(examples)

  # Apply the split to the dataset
  s = int(len(examples)*train_split)
  train_examples = examples[:s]
  test_examples = examples[s:]

  s = int(len(train_examples)*val_split)
  val_examples = train_examples[:s]
  train_examples = train_examples[s:]

  return train_examples, val_examples, test_examples

In [7]:
train_data, val_data, test_data = build_datasets(img_path, annotations_path, 0.95, 0.1)

In [8]:
print(train_data[0])

C:/Users/nuno/Desktop/deep-learning-data/activity6/dataset/images/image_0116.jpg,0.1319796954314721,0.18235294117647058,0.8527918781725888,0.8176470588235294


In [9]:
print(len(val_data))

76
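
The two-stage split can be checked by hand. The sketch below repeats the slicing arithmetic of `build_datasets` on sizes alone. The total of 800 examples is an assumption (it is consistent with the 76 validation examples printed above, but the notebook never prints the dataset size), and `split_sizes` is a hypothetical helper.

```python
def split_sizes(n_examples, train_split=0.95, val_split=0.1):
    """Return (train, val, test) sizes using the same slicing as build_datasets."""
    n_train_val = int(n_examples * train_split)   # first cut: train+val vs. test
    n_test = n_examples - n_train_val
    n_val = int(n_train_val * val_split)          # second cut: val is taken from the train part
    n_train = n_train_val - n_val
    return n_train, n_val, n_test


# Assumed dataset size of 800 annotations.
sizes = split_sizes(800)
print(sizes)
```

Note that the validation fraction is applied to the training portion, not to the whole dataset, so `val_split=0.1` yields 9.5% of all examples here.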


In [10]:
def loadExample(example):
  # Extract the image path and the bbox from the string
  str_tensors = tf.strings.split(example, sep=",")

  # Load the image
  img = tf.io.read_file(str_tensors[0])
  img = tf.image.decode_jpeg(img, channels=3)
  img = tf.image.convert_image_dtype(img, dtype=tf.float16)
  img = tf.image.resize(img, (128, 128))

  x_min = tf.strings.to_number(str_tensors[1])
  y_min = tf.strings.to_number(str_tensors[2])
  x_max = tf.strings.to_number(str_tensors[3])
  y_max = tf.strings.to_number(str_tensors[4])

  bbox = [x_min, y_min, x_max, y_max]
          
  return img, bbox

In [82]:
img, bbox = loadExample(train_data[0])
print(bbox)

[<tf.Tensor: shape=(), dtype=float32, numpy=0.13197969>, <tf.Tensor: shape=(), dtype=float32, numpy=0.18235295>, <tf.Tensor: shape=(), dtype=float32, numpy=0.8527919>, <tf.Tensor: shape=(), dtype=float32, numpy=0.81764704>]


In [72]:
# Pipelines

batch_size = 32
train_dataset = tf.data.Dataset.from_tensor_slices(train_data)
train_dataset = (train_dataset
                 .map(loadExample, num_parallel_calls=AUTOTUNE)
                 .cache()                      # cache decoded images after the first pass
                 .shuffle(len(train_data))     # shuffle after cache so each epoch gets a new order
                 .batch(batch_size)
                 .prefetch(AUTOTUNE)
                 )

val_dataset = tf.data.Dataset.from_tensor_slices(val_data)
val_dataset = (val_dataset
                 .map(loadExample, num_parallel_calls=AUTOTUNE)
                 .cache()
                 .batch(batch_size)            # validation does not need shuffling
                 .prefetch(AUTOTUNE)
                 )

In [80]:
type(train_data)

list

In [81]:
type(train_dataset)

tensorflow.python.data.ops.dataset_ops.PrefetchDataset

## 2) Choose one of the pretrained models as a base

For your model's architecture, it is suggested that you choose as a base one of the pretrained models from [tensorflow.keras.applications](https://www.tensorflow.org/api_docs/python/tf/keras/applications), for example: resnet, xception, vgg, etc.


I will first try **ResNet**, which looks like a good choice of pretrained model because the paper reports a 28% relative improvement on the COCO object detection dataset.


> ... we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation. 

## 3) Add input, hidden, and output layers

If necessary, add input, hidden, and output layers so that your model performs the localization task properly. To determine the configuration of the output layer, remember what the expected output of your model is.

In [83]:
from tensorflow.keras.applications import resnet
from tensorflow.keras.layers import Dense, Flatten, Input
from tensorflow.keras import Model
import tensorflow.keras as keras

In [87]:
input_tensor = Input(shape=(128, 128, 3))

# Build ResNet101 on top of input_tensor so the graph is connected end to end
model_resnet = resnet.ResNet101(weights="imagenet", include_top=False, input_tensor=input_tensor)

model_resnet.trainable = False            # We do not want to keep training the ResNet101 weights
output_resnet = model_resnet.output        # Reference to the ResNet101 output tensor

# Now add a few layers and finish with the regression head

# Regression (4 real values)
x_tensor = Flatten()(output_resnet)
x_tensor = Dense(25, activation="relu")(x_tensor)
output_tensor = Dense(4, activation="relu", name = "output")(x_tensor)

my_model = Model(inputs=input_tensor, outputs=output_tensor)


In [94]:
base_model = resnet.ResNet101(
                            weights='imagenet',  # Load weights pre-trained on ImageNet.
                            input_shape=(128, 128, 3),
                            include_top=False)  # Do not include the ImageNet classifier at the top.

base_model.trainable = False

inputs = Input(shape=(128, 128, 3))
# Run the base model in inference mode by passing `training=False`,
# so its batch-normalization layers keep using their stored statistics.
x = base_model(inputs, training=False)
# Convert features of shape `base_model.output_shape[1:]` to vectors
x =  Flatten()(x)
# A Dense regression head with four units, one per bounding-box coordinate
outputs = Dense(4, activation="relu", name = "output")(x)
my_model = Model(inputs, outputs)

In [95]:
print("trainable_weights:", len(my_model.trainable_weights))
print("non_trainable_weights:", len(my_model.non_trainable_weights))

trainable_weights: 2
non_trainable_weights: 624


In [97]:
my_model.summary()

Model: "model_3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_7 (InputLayer)        [(None, 128, 128, 3)]     0         
                                                                 
 resnet101 (Functional)      (None, 4, 4, 2048)        42658176  
                                                                 
 flatten_3 (Flatten)         (None, 32768)             0         
                                                                 
 output (Dense)              (None, 4)                 131076    
                                                                 
Total params: 42,789,252
Trainable params: 131,076
Non-trainable params: 42,658,176
_________________________________________________________________
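
The parameter counts in the summary can be reproduced by hand, which is a quick check that the regression head has the intended shape. This is a worked calculation using only the shapes printed above; `dense_params` is a hypothetical helper.

```python
def dense_params(n_inputs, n_units):
    """Parameters of a fully connected layer: one weight per input-unit pair plus one bias per unit."""
    return n_inputs * n_units + n_units


flatten_size = 4 * 4 * 2048            # ResNet101 feature map (4, 4, 2048), flattened
head_params = dense_params(flatten_size, 4)
backbone_params = 42_658_176           # frozen ResNet101 parameters, copied from the summary

print(flatten_size)                    # 32768
print(head_params)                     # 131076
print(backbone_params + head_params)   # 42789252
```

The two entries reported under `trainable_weights` are the kernel and the bias of this single `Dense` layer.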


## 4) Train and validate your model.

- Using mean squared error as the loss function and accuracy as the metric.
- Using IoU as the metric.
- Using GIoU as both the loss function and the metric.

### Using mean squared error as the loss function and accuracy as the metric.


In [98]:
my_model.compile(optimizer=keras.optimizers.Adam(),
              loss=['mse'],
              metrics=['accuracy'])

my_model.fit(train_dataset, epochs=2, validation_data=val_dataset)

Epoch 1/2
Epoch 2/2


<keras.callbacks.History at 0x1f462b94640>

### Using IoU as the metric.

There is a library called tensorflow_addons that provides various metrics, losses, and optimizers, including IoU. However, the Python version has to be lower than 3.10.

So it has to be implemented by hand.

In [110]:
def bb_intersection_over_union(boxA, boxB):
    
    # determine the (x, y)-coordinates of the intersection rectangle
    xA = max(boxA[0], boxB[0])
    yA = max(boxA[1], boxB[1])
    xB = min(boxA[2], boxB[2])
    yB = min(boxA[3], boxB[3])
    # compute the area of intersection rectangle
    interArea = max(0, xB - xA + 1) * max(0, yB - yA + 1)
    # compute the area of both the prediction and ground-truth
    # rectangles
    boxAArea = (boxA[2] - boxA[0] + 1) * (boxA[3] - boxA[1] + 1)
    boxBArea = (boxB[2] - boxB[0] + 1) * (boxB[3] - boxB[1] + 1)
    # compute the intersection over union by taking the intersection
    # area and dividing it by the sum of prediction + ground-truth
    # areas - the interesection area
    iou = interArea / float(boxAArea + boxBArea - interArea)
    # return the intersection over union value
    return iou

def IoU_Own(y_true, y_pred):
    
    # Note: the type float32 is very important. It must be the same type as the output from
    # the python function above or you too may spend many late night hours 
    # trying to debug and almost give up.

    iou = tf.py_function(bb_intersection_over_union, [y_true, y_pred], tf.float32)

    return iou

In [115]:
def bb_intersection_over_union(boxA, boxB):
    
    # determine the (x, y)-coordinates of the intersection rectangle
    xA = tf.maximum(boxA[0], boxB[0])
    yA = tf.maximum(boxA[1], boxB[1])
    xB = tf.minimum(boxA[2], boxB[2])
    yB = tf.minimum(boxA[3], boxB[3])
    # compute the area of intersection rectangle
    interArea = tf.maximum(0, xB - xA + 1) * tf.maximum(0, yB - yA + 1)
    # compute the area of both the prediction and ground-truth
    # rectangles
    boxAArea = (boxA[2] - boxA[0] + 1) * (boxA[3] - boxA[1] + 1)
    boxBArea = (boxB[2] - boxB[0] + 1) * (boxB[3] - boxB[1] + 1)
    # compute the intersection over union by taking the intersection
    # area and dividing it by the sum of prediction + ground-truth
    # areas - the interesection area
    iou = interArea / tf.float32(boxAArea + boxBArea - interArea)
    # return the intersection over union value
    return iou

def IoU_Own(y_true, y_pred):
    
    # Note: the type float32 is very important. It must be the same type as the output from
    # the python function above or you too may spend many late night hours 
    # trying to debug and almost give up.

    iou = tf.py_function(bb_intersection_over_union, [y_true, y_pred], tf.float32)

    return iou

In [116]:
my_model.compile(optimizer=keras.optimizers.Adam(),
              loss='mse',
              metrics=IoU_Own)

my_model.fit(train_dataset, epochs=2, validation_data=val_dataset)

Epoch 1/2


InvalidArgumentError: Graph execution error:

Detected at node 'EagerPyFunc' defined at (most recent call last):
    File "c:\Users\nuno\Miniconda3\envs\ml\lib\runpy.py", line 196, in _run_module_as_main
      return _run_code(code, main_globals, None,
    File "c:\Users\nuno\Miniconda3\envs\ml\lib\runpy.py", line 86, in _run_code
      exec(code, run_globals)
    File "c:\Users\nuno\Miniconda3\envs\ml\lib\site-packages\ipykernel_launcher.py", line 17, in <module>
      app.launch_new_instance()
    File "c:\Users\nuno\Miniconda3\envs\ml\lib\site-packages\traitlets\config\application.py", line 846, in launch_instance
      app.start()
    File "c:\Users\nuno\Miniconda3\envs\ml\lib\site-packages\ipykernel\kernelapp.py", line 712, in start
      self.io_loop.start()
    File "c:\Users\nuno\Miniconda3\envs\ml\lib\site-packages\tornado\platform\asyncio.py", line 215, in start
      self.asyncio_loop.run_forever()
    File "c:\Users\nuno\Miniconda3\envs\ml\lib\asyncio\base_events.py", line 600, in run_forever
      self._run_once()
    File "c:\Users\nuno\Miniconda3\envs\ml\lib\asyncio\base_events.py", line 1896, in _run_once
      handle._run()
    File "c:\Users\nuno\Miniconda3\envs\ml\lib\asyncio\events.py", line 80, in _run
      self._context.run(self._callback, *self._args)
    File "c:\Users\nuno\Miniconda3\envs\ml\lib\site-packages\ipykernel\kernelbase.py", line 510, in dispatch_queue
      await self.process_one()
    File "c:\Users\nuno\Miniconda3\envs\ml\lib\site-packages\ipykernel\kernelbase.py", line 499, in process_one
      await dispatch(*args)
    File "c:\Users\nuno\Miniconda3\envs\ml\lib\site-packages\ipykernel\kernelbase.py", line 406, in dispatch_shell
      await result
    File "c:\Users\nuno\Miniconda3\envs\ml\lib\site-packages\ipykernel\kernelbase.py", line 730, in execute_request
      reply_content = await reply_content
    File "c:\Users\nuno\Miniconda3\envs\ml\lib\site-packages\ipykernel\ipkernel.py", line 383, in do_execute
      res = shell.run_cell(
    File "c:\Users\nuno\Miniconda3\envs\ml\lib\site-packages\ipykernel\zmqshell.py", line 528, in run_cell
      return super().run_cell(*args, **kwargs)
    File "c:\Users\nuno\Miniconda3\envs\ml\lib\site-packages\IPython\core\interactiveshell.py", line 2881, in run_cell
      result = self._run_cell(
    File "c:\Users\nuno\Miniconda3\envs\ml\lib\site-packages\IPython\core\interactiveshell.py", line 2936, in _run_cell
      return runner(coro)
    File "c:\Users\nuno\Miniconda3\envs\ml\lib\site-packages\IPython\core\async_helpers.py", line 129, in _pseudo_sync_runner
      coro.send(None)
    File "c:\Users\nuno\Miniconda3\envs\ml\lib\site-packages\IPython\core\interactiveshell.py", line 3135, in run_cell_async
      has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
    File "c:\Users\nuno\Miniconda3\envs\ml\lib\site-packages\IPython\core\interactiveshell.py", line 3338, in run_ast_nodes
      if await self.run_code(code, result, async_=asy):
    File "c:\Users\nuno\Miniconda3\envs\ml\lib\site-packages\IPython\core\interactiveshell.py", line 3398, in run_code
      exec(code_obj, self.user_global_ns, self.user_ns)
    File "C:\Users\nuno\AppData\Local\Temp\ipykernel_6484\3160373843.py", line 5, in <cell line: 5>
      my_model.fit(train_dataset, epochs=2, validation_data=val_dataset)
    File "c:\Users\nuno\Miniconda3\envs\ml\lib\site-packages\keras\utils\traceback_utils.py", line 64, in error_handler
      return fn(*args, **kwargs)
    File "c:\Users\nuno\Miniconda3\envs\ml\lib\site-packages\keras\engine\training.py", line 1409, in fit
      tmp_logs = self.train_function(iterator)
    File "c:\Users\nuno\Miniconda3\envs\ml\lib\site-packages\keras\engine\training.py", line 1051, in train_function
      return step_function(self, iterator)
    File "c:\Users\nuno\Miniconda3\envs\ml\lib\site-packages\keras\engine\training.py", line 1040, in step_function
      outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "c:\Users\nuno\Miniconda3\envs\ml\lib\site-packages\keras\engine\training.py", line 1030, in run_step
      outputs = model.train_step(data)
    File "c:\Users\nuno\Miniconda3\envs\ml\lib\site-packages\keras\engine\training.py", line 894, in train_step
      return self.compute_metrics(x, y, y_pred, sample_weight)
    File "c:\Users\nuno\Miniconda3\envs\ml\lib\site-packages\keras\engine\training.py", line 987, in compute_metrics
      self.compiled_metrics.update_state(y, y_pred, sample_weight)
    File "c:\Users\nuno\Miniconda3\envs\ml\lib\site-packages\keras\engine\compile_utils.py", line 501, in update_state
      metric_obj.update_state(y_t, y_p, sample_weight=mask)
    File "c:\Users\nuno\Miniconda3\envs\ml\lib\site-packages\keras\utils\metrics_utils.py", line 70, in decorated
      update_op = update_state_fn(*args, **kwargs)
    File "c:\Users\nuno\Miniconda3\envs\ml\lib\site-packages\keras\metrics\base_metric.py", line 140, in update_state_fn
      return ag_update_state(*args, **kwargs)
    File "c:\Users\nuno\Miniconda3\envs\ml\lib\site-packages\keras\metrics\base_metric.py", line 646, in update_state
      matches = ag_fn(y_true, y_pred, **self._fn_kwargs)
    File "C:\Users\nuno\AppData\Local\Temp\ipykernel_6484\3427449114.py", line 27, in IoU_Own
      iou = tf.py_function(bb_intersection_over_union, [y_true, y_pred], tf.float32)
Node: 'EagerPyFunc'
TypeError: 'DType' object is not callable
Traceback (most recent call last):

  File "c:\Users\nuno\Miniconda3\envs\ml\lib\site-packages\tensorflow\python\ops\script_ops.py", line 268, in __call__
    return func(device, token, args)

  File "c:\Users\nuno\Miniconda3\envs\ml\lib\site-packages\tensorflow\python\ops\script_ops.py", line 146, in __call__
    outputs = self._call(device, args)

  File "c:\Users\nuno\Miniconda3\envs\ml\lib\site-packages\tensorflow\python\ops\script_ops.py", line 153, in _call
    ret = self._func(*args)

  File "c:\Users\nuno\Miniconda3\envs\ml\lib\site-packages\tensorflow\python\autograph\impl\api.py", line 642, in wrapper
    return func(*args, **kwargs)

  File "C:\Users\nuno\AppData\Local\Temp\ipykernel_6484\3507677061.py", line 17, in bb_intersection_over_union
    iou = interArea / tf.float32(boxAArea + boxBArea - interArea)

TypeError: 'DType' object is not callable


	 [[{{node EagerPyFunc}}]] [Op:__inference_train_function_118534]
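
The `TypeError` comes from the expression `tf.float32(...)`: `tf.float32` is a `DType` object, not a function, and casting in TensorFlow is done with `tf.cast(value, tf.float32)`. Two further details are worth checking before retrying. A Keras metric receives `y_true` and `y_pred` in batches of shape `(batch, 4)`, so `boxA[0]` selects the first box of the batch rather than `x_min`. The `+ 1` terms belong to the integer pixel convention and inflate the areas when the coordinates are normalized to `[0, 1]`. The plain-Python sketch below is a reference for the arithmetic only; it is not the notebook's metric and not a drop-in Keras function. `box_iou`, `box_giou`, and `mean_metric` are hypothetical names, and the boxes are assumed to be in normalized `(x_min, y_min, x_max, y_max)` form.

```python
def _area(box):
    """Area of a box (x_min, y_min, x_max, y_max); zero if the box is empty or inverted."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def _inter_union(box_a, box_b):
    """Intersection and union areas of two boxes (no +1: coordinates are normalized)."""
    inter = _area((max(box_a[0], box_b[0]), max(box_a[1], box_b[1]),
                   min(box_a[2], box_b[2]), min(box_a[3], box_b[3])))
    union = _area(box_a) + _area(box_b) - inter
    return inter, union


def box_iou(box_a, box_b):
    """Intersection over union, in [0, 1]."""
    inter, union = _inter_union(box_a, box_b)
    return inter / union if union > 0 else 0.0


def box_giou(box_a, box_b):
    """Generalized IoU: IoU minus the share of the enclosing box not covered by the union."""
    inter, union = _inter_union(box_a, box_b)
    enclosing = _area((min(box_a[0], box_b[0]), min(box_a[1], box_b[1]),
                       max(box_a[2], box_b[2]), max(box_a[3], box_b[3])))
    if union <= 0 or enclosing <= 0:
        return 0.0
    return inter / union - (enclosing - union) / enclosing


def mean_metric(metric, true_boxes, pred_boxes):
    """Average a per-box metric over a batch of (true, predicted) pairs."""
    values = [metric(t, p) for t, p in zip(true_boxes, pred_boxes)]
    return sum(values) / len(values)


# Illustrative batch: one overlapping pair and one disjoint pair.
y_true = [(0.0, 0.0, 0.5, 0.5), (0.0, 0.0, 0.25, 0.25)]
y_pred = [(0.25, 0.25, 0.75, 0.75), (0.75, 0.75, 1.0, 1.0)]
print(mean_metric(box_iou, y_true, y_pred))
print(mean_metric(box_giou, y_true, y_pred))
```

In a TensorFlow version, the same arithmetic would index columns (for example `boxes[:, 0]`) and rely on `tf.maximum`, `tf.minimum`, and `tf.cast`, which keeps the computation inside the graph without `tf.py_function`. A GIoU loss is commonly defined as `1 - GIoU`, so that identical boxes give a loss of zero.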

## References

https://arxiv.org/abs/1512.03385

https://www.tensorflow.org/addons