# Computer Vision

This is a fairly complex topic, so I'll write more declaratively and keep background notes to a minimum, linking to resources where needed.

We'll start small and scale up. Be sure to use a GPU for this project; otherwise training will be very slow.

In [1]:
# define imports
import json 

import matplotlib.pyplot as plt
import tensorflow as tf



## Check GPU 

In [5]:
gpus = tf.config.list_physical_devices('GPU')
print("Num GPUs Available: ", len(gpus))

Num GPUs Available:  1


In [3]:
data_dir = "./food-101/images/"
conf_dir = "./food-101/meta/"

class_names = ["pizza", "steak"]

data_sets = {}
data_sets_with_paths = {}

# Load the official train/test splits (class name -> list of image entries)
with open(f"{conf_dir}test.json", "r") as file:
    test_data_set = json.load(file)

with open(f"{conf_dir}train.json", "r") as file:
    train_data_set = json.load(file)

# Collect the split entries, and their paths under data_dir, for each selected class
for class_name in class_names:
    data_sets[class_name] = {
        'test_set': test_data_set[class_name],
        'train_set': train_data_set[class_name],
    }
    data_sets_with_paths[class_name] = {
        'test_set': [data_dir + image for image in test_data_set[class_name]],
        'train_set': [data_dir + image for image in train_data_set[class_name]],
    }

## Input the data

In [4]:
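# TensorFlow places ops on the first visible GPU by default, so no explicit
# device call is needed; a bare tf.device(...) outside a `with` block has no effect.
tf.random.set_seed(42)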

batch_size = 32
img_height = 180
img_width = 180

# Note: this scans every class folder under data_dir, not just the two names set earlier
train_ds = tf.keras.utils.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="training",
    seed=42,
    image_size=(img_height, img_width),
    batch_size=batch_size,
)
val_ds = tf.keras.utils.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="validation",
    seed=42,
    image_size=(img_height, img_width),
    batch_size=batch_size,
)

# Class names are inferred from the subdirectory names, replacing the earlier list
class_names = train_ds.class_names

KeyboardInterrupt: 
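
`image_dataset_from_directory` infers one class per subdirectory of the path it is given, so pointing it at `data_dir` picks up every class folder in Food-101 rather than only `pizza` and `steak`. One possible way to start small, sketched here, is to copy just the chosen class folders into a separate directory and load from that instead. The helper name `copy_class_subset` and the `./food-101/subset/` destination are hypothetical and not part of the original notebook; the demo runs on a throwaway directory tree rather than the real images.

```python
import shutil
import tempfile
from pathlib import Path


def copy_class_subset(src_root, dst_root, class_names):
    """Copy only the selected class folders from src_root into dst_root."""
    src_root, dst_root = Path(src_root), Path(dst_root)
    dst_root.mkdir(parents=True, exist_ok=True)
    for name in class_names:
        # dirs_exist_ok lets the helper be re-run without failing (Python 3.8+)
        shutil.copytree(src_root / name, dst_root / name, dirs_exist_ok=True)
    return dst_root


# Demo on a temporary tree that mimics images/<class>/<file>.jpg
demo_root = Path(tempfile.mkdtemp())
for name in ["pizza", "steak", "sushi"]:
    (demo_root / "images" / name).mkdir(parents=True)
    (demo_root / "images" / name / "0001.jpg").write_bytes(b"")

subset_dir = copy_class_subset(
    demo_root / "images", demo_root / "subset", ["pizza", "steak"]
)
subset_classes = sorted(p.name for p in subset_dir.iterdir())
print(subset_classes)

# With the real data this might look like (destination path is an assumption):
#   subset_dir = copy_class_subset("./food-101/images/", "./food-101/subset/", ["pizza", "steak"])
#   train_ds = tf.keras.utils.image_dataset_from_directory(str(subset_dir), ...)
```

Copying duplicates the selected images on disk, which is a modest cost for two classes and keeps the original dataset untouched.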

## Visualize the data

In [None]:
plt.figure(figsize=(10, 10))
for images, labels in train_ds.take(1):
  for i in range(9):
    ax = plt.subplot(3, 3, i + 1)
    plt.imshow(images[i].numpy().astype("uint8"))
    plt.title(class_names[labels[i]])
    plt.axis("off")

## Configure the dataset for performance
Make sure to use buffered prefetching, so you can yield data from disk without having I/O become blocking. These are two important methods you should use when loading data:

* `Dataset.cache` keeps the images in memory after they're loaded off disk during the first epoch. This will ensure the dataset does not become a bottleneck while training your model. If your dataset is too large to fit into memory, you can also use this method to create a performant on-disk cache.
* `Dataset.prefetch` overlaps data preprocessing and model execution while training.

Interested readers can learn more about both methods, as well as how to cache data to disk in the Prefetching section of the [Better performance with the tf.data API](https://www.tensorflow.org/guide/data_performance) guide.
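
As a rough illustration of what `Dataset.cache` buys you, the sketch below counts how often a stand-in loading function runs across two passes over a toy dataset. `fake_load` and the `calls` counter are made up for this example; in the real pipeline, image reading and decoding take their place.

```python
import tensorflow as tf

calls = {"n": 0}


def fake_load(i):
    # Stand-in for an expensive read/decode step; counts its invocations
    calls["n"] += 1
    return i * 2


def load(i):
    # py_function runs the Python code eagerly, so the counter is updated
    return tf.py_function(fake_load, [i], tf.int64)


ds = (
    tf.data.Dataset.range(5)
    .map(load)
    .cache()                     # keep results in memory after the first full pass
    .prefetch(tf.data.AUTOTUNE)  # prepare upcoming elements while the consumer works
)

first_pass = [int(x) for x in ds]
second_pass = [int(x) for x in ds]  # served from the cache; fake_load is not re-run

print(first_pass, second_pass, calls["n"])
```

The counter should stop growing after the first pass, which is the same effect that lets later epochs skip disk reads once the images have been cached.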

In [None]:
AUTOTUNE = tf.data.AUTOTUNE

train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)

## Data augmentation
Overfitting generally occurs when there are a small number of training examples. Data augmentation takes the approach of generating additional training data from your existing examples by augmenting them using random transformations that yield believable-looking images. This helps expose the model to more aspects of the data and generalize better.

You will implement data augmentation using the following Keras preprocessing layers: `tf.keras.layers.RandomFlip`, `tf.keras.layers.RandomRotation`, and `tf.keras.layers.RandomZoom`. These can be included inside your model like other layers, and run on the GPU.
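
A small sketch of how these layers behave, assuming a batch of random tensors in place of real images: the random transforms are applied only when the layers are called in training mode, and the input passes through unchanged at inference time. The name `augment` and the fake batch are illustrative only.

```python
import tensorflow as tf

augment = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomZoom(0.1),
])

# A random batch of 4 fake "images" of size 180x180 with 3 channels
images = tf.random.uniform((4, 180, 180, 3), maxval=255.0)

augmented = augment(images, training=True)     # random transforms applied
passthrough = augment(images, training=False)  # layers return the input as-is

same_at_inference = bool(tf.reduce_all(passthrough == images))
print(augmented.shape, passthrough.shape, same_at_inference)
```

Because the layers switch themselves off outside training, they can sit inside the model without affecting validation or prediction.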

In [None]:
num_classes = len(class_names)

data_augmentation = tf.keras.Sequential(
  [
    tf.keras.layers.RandomFlip(
        "horizontal",
        input_shape=(
            img_height,
            img_width,
            3
          )),
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomZoom(0.1),
  ]
)

# Build the model; augmentation runs first and is only active during training
model = tf.keras.Sequential([
  data_augmentation,
  tf.keras.layers.Rescaling(1./255),
  tf.keras.layers.Conv2D(16, 3, padding='same', activation='relu'),
  tf.keras.layers.MaxPooling2D(),
  tf.keras.layers.Conv2D(32, 3, padding='same', activation='relu'),
  tf.keras.layers.MaxPooling2D(),
  tf.keras.layers.Conv2D(64, 3, padding='same', activation='relu'),
  tf.keras.layers.MaxPooling2D(),
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dense(num_classes)  # raw logits, one per class
])

# Compile the model
model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy']
    )

model.summary()
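
To sanity-check the numbers that `model.summary()` reports, the feature-map sizes and trainable parameter counts of the convolutional stack can be worked out by hand. The sketch below does that arithmetic in plain Python. It assumes `num_classes = 2` (pizza and steak); the helper names are made up for this example.

```python
def conv2d_params(kernel, in_ch, out_ch):
    # kernel x kernel weights per (input channel, filter) pair, plus one bias per filter
    return kernel * kernel * in_ch * out_ch + out_ch


def dense_params(in_units, out_units):
    # One weight per (input, output) pair, plus one bias per output unit
    return in_units * out_units + out_units


img_size, num_classes = 180, 2  # two classes is an assumption for this walk-through

size, channels, total = img_size, 3, 0
for filters in (16, 32, 64):
    total += conv2d_params(3, channels, filters)  # padding='same' keeps height/width
    size //= 2                                    # MaxPooling2D() halves them, rounding down
    channels = filters

flat = size * size * channels  # length of the vector produced by Flatten
total += dense_params(flat, 128) + dense_params(128, num_classes)

print(size, flat, total)
```

Almost all of the parameters sit in the first `Dense` layer, because `Flatten` hands it a 22 x 22 x 64 feature map. The rescaling and pooling layers contribute no trainable parameters.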

In [None]:
# Fit the model
epochs = 10
history = model.fit(
  train_ds,
  validation_data=val_ds,
  epochs=epochs
)
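
Because the final `Dense` layer has no activation and the loss is built with `from_logits=True`, the trained model outputs raw logits rather than probabilities. A minimal sketch of turning one row of logits into class probabilities with a softmax follows; the logit values and the class order are made up for illustration and are not model output. In TensorFlow itself, `tf.nn.softmax` does the same job on the result of `model.predict`.

```python
import math


def softmax(logits):
    # Subtract the max for numerical stability before exponentiating
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class_names = ["pizza", "steak"]  # assumed class order
logits = [2.0, 0.0]               # made-up logits, not real model output

probs = softmax(logits)
predicted = class_names[probs.index(max(probs))]
print(predicted, [round(p, 3) for p in probs])
```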