<br>
<br>

![](https://upload.wikimedia.org/wikipedia/en/5/5f/Western_Institute_of_Technology_and_Higher_Education_logo.png)

**InstitutoTecnológico y de Estudios Superiores de Occidente**

**Maestría Ciencia de Datos**

**Aprendizaje Profundo**

# Proyecto: clasificación y localización de objetos #

<br>
<br>

* * *

Estudiante: Daniel Nuño <br>
Profesor: Dr. Francisco Cervantes <br>
Fecha entrega: Marzo 26, 2023 <br>

* * *

<br>
<br>

## Libraries

In [41]:
import os
import random
import cv2 as cv

import tensorflow as tf
#from tensorflow import keras
#from tensorflow.data import AUTOTUNE

#from tensorflow.keras.layers import Conv2D, Flatten, Dense, Input
#from tensorflow.keras.applications import VGG16
#from tensorflow.keras.models import Model
#from tensorflow.keras.utils import plot_model

## Pre processing data and functions

### Processing training data

Set paths for training and ids-categories

In [16]:
img_path= "C:/Users/nuno/Desktop/deep-learning-data/proyecto1/tiny-imagenet-200/train"

project_id_path = "C:/Users/nuno/Desktop/deep-learning-data/proyecto1/tiny-imagenet-200/wnids.txt"
all_id_cat_path = "C:/Users/nuno/Desktop/deep-learning-data/proyecto1/tiny-imagenet-200/words.txt"

Read project ids as list

In [17]:
project_id_list = []
with open(project_id_path) as f:
    for line in f:
        project_id_list.append(line.strip())

Read categories as dictionary

In [18]:
id_cat_dict = dict()
with open(all_id_cat_path, 'r') as f:
    for line in f:
        resulting_line = line.strip().split('\t')
        id_cat_dict[resulting_line[0]] = resulting_line[1]

Get all list of files. Separate bounding box files frome images.

In [19]:
training_files_img = []
training_files_bb = []
for dirpath, dirnames, filenames in os.walk(img_path):
    for filename in filenames:
        path = os.path.join(dirpath, filename)
        if path.endswith('txt'):
            training_files_bb.append(path)
        else:
            training_files_img.append(path)

In [20]:
len(training_files_bb), len(training_files_img)

(200, 100000)

Process bounding boxes

In [21]:
bb_dict = dict()
for file in training_files_bb:
    with open(file, 'r') as f:
        for line in f:
            img_name, xmin, ymin, xmax, ymax = line.strip().split('\t')
            bb_dict[img_name] = [xmin, ymin, xmax, ymax]

Check elements of dictionary of bounding boxes

In [40]:
len(set(bb_dict.keys()))

100000

Create data set list that returns img full path, category, bounding box.

In [23]:
training_list = []
for file in training_files_img:
    #get category and file name
    _, category, _, image_name = training_files_img[0].split('\\')
    #open image
    img = cv.imread(file)
    #get dimensions
    h, w, _ = img.shape
    #get correct size bounding box
    original_bb = bb_dict[image_name]
    rs_bb = [float(original_bb[0])/w,
               float(original_bb[1])/h,
               float(original_bb[2])/w,
               float(original_bb[3])/h,
                ]
    # treat it as list
    example = (file, category, rs_bb)
    #appended to final list
    training_list.append(example)

In [24]:
random.shuffle(training_list)

In [25]:
len(training_list)

100000

### Processing validation data

Path of images and annotations

In [26]:
img_path_val = "C:/Users/nuno/Desktop/deep-learning-data/proyecto1/tiny-imagenet-200/val/images"
val_annotations_txt = "C:/Users/nuno/Desktop/deep-learning-data/proyecto1/tiny-imagenet-200/val/val_annotations.txt"

Process annotations

In [27]:
annotations_dict_val = dict()
validation_list = list()
with open(val_annotations_txt, 'r') as f:
    for line in f:
        img_name, category, xmin, ymin, xmax, ymax = line.strip().split('\t')
        full_path = img_path_val + '/' + img_name
        img = cv.imread(full_path)
        h, w, _ = img.shape
        rs_xmin = float(xmin)/w
        rs_xmax = float(xmax)/w
        rs_ymin = float(ymin)/h
        rs_ymax = float(ymax)/h
        annotations_dict_val[full_path] = (category, [rs_xmin, rs_ymin, rs_xmax, rs_ymax])
        validation_list.append((full_path, category, [rs_xmin, rs_ymin, rs_xmax, rs_ymax]))

Check training and validation have the same format.

In [34]:
training_list[0]

['C:/Users/nuno/Desktop/deep-learning-data/proyecto1/tiny-imagenet-200/train\\n04417672\\images\\n04417672_436.JPEG',
 'n01443537',
 [0.0, 0.15625, 0.984375, 0.90625]]

In [35]:
validation_list[0]

('C:/Users/nuno/Desktop/deep-learning-data/proyecto1/tiny-imagenet-200/val/images/val_0.JPEG',
 'n03444034',
 [0.0, 0.5, 0.6875, 0.96875])

In [67]:
type(validation_list[0][1])

str

In [69]:
tf.constant(validation_list[0][1])

<tf.Tensor: shape=(), dtype=string, numpy=b'n03444034'>

### Load function

Because we want to use TensorFlow batches, it is important to use TensorFlow classes. TensorFlow batches allow you to load data in small groups, instead of loading all into memory at once.

The following function load and treat the image element by element. Each element is a list of three: [full image name, category, bounding box]

bounding box is a list of 4: x_min, y_min, x_max, y_max.

In [70]:
def load_element(element):
    #load image
    img = tf.io.read_file(element[0])
    #make sure is 3 channels
    img = tf.image.decode_jpeg(img, channels=3)
    #conver to float
    img = tf.image.convert_image_dtype(img, dtype=tf.float16)
    #resize
    img = tf.image.resize(img, (128, 128))
    #category
    category = tf.constant(element[1])
    #bounding box
    x_min = tf.abs(element[2][0])
    y_min = tf.abs(element[2][1])
    x_max = tf.abs(element[2][2])
    y_max = tf.abs(element[2][3])
    bb = [x_min, y_min, x_max, y_max]

    return (img, category, bb)

In [71]:
img, category, bb = load_element(training_list[0])

In [72]:
img

<tf.Tensor: shape=(128, 128, 3), dtype=float32, numpy=
array([[[0.94921875, 0.94921875, 0.94921875],
        [0.9511719 , 0.9511719 , 0.9511719 ],
        [0.9550781 , 0.9550781 , 0.9550781 ],
        ...,
        [0.30096436, 0.30096436, 0.30096436],
        [0.40093994, 0.40093994, 0.40093994],
        [0.45092773, 0.45092773, 0.45092773]],

       [[0.9511719 , 0.9511719 , 0.9511719 ],
        [0.953125  , 0.953125  , 0.953125  ],
        [0.95703125, 0.95703125, 0.95703125],
        ...,
        [0.3066101 , 0.3066101 , 0.3066101 ],
        [0.35708618, 0.35708618, 0.35708618],
        [0.38232422, 0.38232422, 0.38232422]],

       [[0.9550781 , 0.9550781 , 0.9550781 ],
        [0.95703125, 0.95703125, 0.95703125],
        [0.9609375 , 0.9609375 , 0.9609375 ],
        ...,
        [0.3179016 , 0.3179016 , 0.3179016 ],
        [0.26937866, 0.26937866, 0.26937866],
        [0.24511719, 0.24511719, 0.24511719]],

       ...,

       [[0.15002441, 0.15002441, 0.15002441],
        [0.16

In [73]:
category

<tf.Tensor: shape=(), dtype=string, numpy=b'n01443537'>

In [74]:
bb

[<tf.Tensor: shape=(), dtype=float32, numpy=0.0>,
 <tf.Tensor: shape=(), dtype=float32, numpy=0.15625>,
 <tf.Tensor: shape=(), dtype=float32, numpy=0.984375>,
 <tf.Tensor: shape=(), dtype=float32, numpy=0.90625>]

## Define error (loss/metrics) function

## Pipeline for test and new data

1. define folder path for images
2. get images names
3. recursively:
    - load image
    - resize
    - get category and bounding box from nn
    - save results to a list containing the image name, category and bounding box
4. save list of results to csv.