## TensorFlow Keras

### Preface

**Summary**
- Use TensorFlow Framework Library, Keras Module, Flowers Dataset
- Load Data using Keras Utils API
- Load Data using TF Datasets API

**Acknowledgements**
- Dataset: https://www.tensorflow.org/datasets/catalog/tf_flowers
- Tutorial: https://www.tensorflow.org/tutorials/load_data/images

### Initialization

**Packages**

In [10]:
import numpy as pkg_num
import os as pkg_os
import matplotlib as pkg_matplotlib
import matplotlib.pyplot as pkg_plot
import warnings as pkg_warnings
import PIL as pkg_pil
import PIL.Image as pkg_pil_image
import tensorflow as pkg_tf
import tensorflow_datasets as pkg_tfds
import pathlib as pkg_pathlib

**Common**

In [11]:
# Miscellaneous
%matplotlib inline
pkg_warnings.filterwarnings(action="ignore")

In [23]:
# Image Dataset Directory
image_dataset_dirpath = pkg_pathlib.Path("/tmp/AIML/data/tensorflow/datasets/flower_photos")

# Performance related settings
AUTOTUNE = pkg_tf.data.AUTOTUNE

In [13]:
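# KMP_DUPLICATE_LIB_OK=TRUE lets the process continue when more than one copy of
# the OpenMP runtime is loaded, instead of aborting with a "duplicate library"
# error (OMP: Error #15). That is presumably the error this setting silences here.
# It is a workaround, not a fix.
# TODO: Find out which packages load the duplicate OpenMP runtimes.
pkg_os.environ['KMP_DUPLICATE_LIB_OK'] = 'TRUE'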

**Load Data**

In [14]:
image_count = len(list(image_dataset_dirpath.glob('*/*.jpg')))
print(image_count)

3670


In [15]:
# Image Size (Target)
image_height = 180
image_width = 180
image_size = (image_height, image_width)

# Batch Size
batch_size = 32

# Seed for Train-Test Data Split
split_seed = 123

In [16]:
train_ds = pkg_tf.keras.utils.image_dataset_from_directory(
  image_dataset_dirpath, validation_split=0.2, subset="training",
  seed=split_seed, image_size=image_size, batch_size=batch_size
)

Found 3670 files belonging to 5 classes.
Using 2936 files for training.


2022-08-18 23:44:06.620835: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2022-08-18 23:44:06.620908: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2022-08-18 23:44:06.620964: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (raooruga-WX-1): /proc/driver/nvidia/version does not exist
2022-08-18 23:44:06.622662: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.


In [17]:
test_ds = pkg_tf.keras.utils.image_dataset_from_directory(
  image_dataset_dirpath, validation_split=0.2, subset="validation",
  seed=split_seed, image_size=image_size, batch_size=batch_size
)

Found 3670 files belonging to 5 classes.
Using 734 files for validation.
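
The file counts reported above follow from `validation_split=0.2` applied to 3670 files. The sketch below reproduces that arithmetic and the resulting number of batches per epoch. How Keras rounds the split internally is an assumption here (truncation toward zero); the helper names are illustrative only and are not part of any library.

```python
import math

def split_counts(total_files, validation_split):
    """Return (training, validation) file counts for a fractional split.

    Assumption: the validation count is truncated to an integer.
    """
    num_val = int(validation_split * total_files)
    return total_files - num_val, num_val

def num_batches(num_files, batch_size):
    """Batches per epoch when the last, partial batch is kept."""
    return math.ceil(num_files / batch_size)

train_files, val_files = split_counts(3670, 0.2)
print(train_files, val_files)                                    # 2936 734
print(num_batches(train_files, 32), num_batches(val_files, 32))  # 92 23
```

With `batch_size = 32`, one epoch therefore covers 92 training batches and 23 validation batches, the last batch of each being smaller than 32.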


In [18]:
train_ds.class_names, test_ds.class_names

(['daisy', 'dandelion', 'roses', 'sunflowers', 'tulips'],
 ['daisy', 'dandelion', 'roses', 'sunflowers', 'tulips'])

### Process

**Model: Using Keras**

In [66]:
# Performance settings
train_ds = train_ds.cache().prefetch(buffer_size=AUTOTUNE)
test_ds = test_ds.cache().prefetch(buffer_size=AUTOTUNE)

In [67]:
num_classes = 5      # one output per class name listed above
num_iterations = 3   # number of training epochs

In [68]:
model = pkg_tf.keras.Sequential([
    # RGB value range is [0, 255], scale them to [0, 1]
    pkg_tf.keras.layers.Rescaling(1./255),

    # Three convolution blocks. Each Conv2D learns 32 filters of size 3x3 that
    # span all input channels; each MaxPooling2D halves the height and width.
    pkg_tf.keras.layers.Conv2D(32, 3, activation='relu'),
    pkg_tf.keras.layers.MaxPooling2D(),
    pkg_tf.keras.layers.Conv2D(32, 3, activation='relu'),
    pkg_tf.keras.layers.MaxPooling2D(),
    pkg_tf.keras.layers.Conv2D(32, 3, activation='relu'),
    pkg_tf.keras.layers.MaxPooling2D(),

    # Flatten the feature maps into one vector, then classify with dense layers.
    # The last layer outputs one logit per class (no softmax).
    pkg_tf.keras.layers.Flatten(),
    pkg_tf.keras.layers.Dense(128, activation='relu'),
    pkg_tf.keras.layers.Dense(num_classes)
])

model

<keras.engine.sequential.Sequential at 0x7fa0d018b220>
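
To make the effect of the three convolution blocks concrete, here is a small sketch that traces the tensor shape through the network for a 180x180 input. It assumes the Keras defaults used in the model above: `Conv2D` with stride 1 and `'valid'` padding, and `MaxPooling2D` with a 2x2 window. The helper functions are hypothetical and written in plain Python, so no TensorFlow is needed to run them.

```python
def conv_out(size, kernel=3):
    """Output size of a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output size of non-overlapping max pooling with 'valid' padding."""
    return size // pool

def trace_shapes(height=180, width=180, filters=32, blocks=3):
    """Return (height, width, channels) after each Conv2D + MaxPooling2D block."""
    shapes = []
    for _ in range(blocks):
        height, width = conv_out(height), conv_out(width)
        height, width = pool_out(height), pool_out(width)
        shapes.append((height, width, filters))
    return shapes

shapes = trace_shapes()
last_height, last_width, last_channels = shapes[-1]
flat_features = last_height * last_width * last_channels

# Dense(128) on the flattened vector: one weight per input per unit, plus biases
dense_params = flat_features * 128 + 128

print(shapes)         # [(89, 89, 32), (43, 43, 32), (20, 20, 32)]
print(flat_features)  # 12800
print(dense_params)   # 1638528
```

Under these assumptions most of the model's parameters sit in the first dense layer rather than in the convolutions. Once the model has been built (for example after `model.fit`), `model.summary()` reports the actual shapes.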

In [69]:
loss_function = pkg_tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
model.compile(optimizer='adam', loss=loss_function, metrics=['accuracy'])
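
Because the last `Dense` layer outputs raw logits, the loss is created with `from_logits=True`. As a rough illustration of what that loss computes for a single example, the sketch below evaluates `-log(softmax(logits)[label])` in plain Python. The function name and the sample logits are hypothetical; this is not the Keras implementation, only the same formula written out.

```python
import math

def sparse_crossentropy_from_logits(logits, label):
    """Cross-entropy for one example: log-sum-exp of the logits minus the true logit."""
    peak = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in logits))
    return log_sum_exp - logits[label]

# An untrained model that scores all 5 classes equally has loss ln(5)
uniform_loss = sparse_crossentropy_from_logits([0.0] * 5, label=2)

# Hypothetical logits that favor the correct class (index 1)
confident_loss = sparse_crossentropy_from_logits([0.1, 4.0, -1.0, 0.2, 0.0], label=1)

print(round(uniform_loss, 4))  # 1.6094
print(confident_loss < uniform_loss)  # True
```

A loss near 1.61 at the start of training is therefore what a 5-class model that has not learned anything yet would show.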

In [70]:
model.fit(train_ds, validation_data=test_ds, epochs=num_iterations)

Epoch 1/3
Epoch 2/3
Epoch 3/3


<keras.callbacks.History at 0x7fa0d018b940>

In [None]:
#model.predict()
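
The prediction step is left as a placeholder above. Since the model outputs logits, a prediction needs one more step to become a class name and a confidence. The sketch below shows that post-processing in plain Python. The logits are made-up values standing in for one row of what `model.predict(...)` would return, and `softmax` and `top_class` are hypothetical helpers.

```python
import math

# Class names as reported by train_ds.class_names earlier in the notebook
CLASS_NAMES = ['daisy', 'dandelion', 'roses', 'sunflowers', 'tulips']

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]

def top_class(logits, class_names=CLASS_NAMES):
    """Return the most likely class name and its probability."""
    probabilities = softmax(logits)
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return class_names[best], probabilities[best]

# Assumption: hypothetical logits for a single image, not real model output
example_logits = [0.5, 2.0, -1.0, 0.1, 0.3]
label, confidence = top_class(example_logits)
print(label)  # dandelion
```

Inside the notebook, the same conversion can be done on a whole batch with `pkg_tf.nn.softmax` applied to the output of `model.predict`.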

**Model: Using TF Datasets**

In [24]:
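def configure_for_performance(ds):
  # tf_flowers images vary in size; resize them first so they can be batched
  ds = ds.map(
      lambda image, label: (pkg_tf.image.resize(image, image_size), label),
      num_parallel_calls=AUTOTUNE)
  ds = ds.cache()
  ds = ds.shuffle(buffer_size=1000)
  ds = ds.batch(batch_size)
  ds = ds.prefetch(buffer_size=AUTOTUNE)
  return ds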

In [19]:
(train_ds, val_ds, test_ds), metadata = pkg_tfds.load(
    name='tf_flowers', data_dir=image_dataset_dirpath,
    split=['train[:80%]', 'train[80%:90%]', 'train[90%:]'],
    with_info=True, as_supervised=True,
)

2022-08-18 23:44:19.697452: W tensorflow/core/platform/cloud/google_auth_provider.cc:184] All attempts to get a Google authentication bearer token failed, returning an empty token. Retrieving token from files failed with "NOT_FOUND: Could not locate the credentials file.". Retrieving token from GCE failed with "FAILED_PRECONDITION: Error executing an HTTP request: libcurl code 6 meaning 'Couldn't resolve host name', error details: Could not resolve host: metadata".


Downloading and preparing dataset 218.21 MiB (download: 218.21 MiB, generated: 221.83 MiB, total: 440.05 MiB) to /tmp/AIML/data/tensorflow/datasets/flower_photos/tf_flowers/3.0.1...

Dataset tf_flowers downloaded and prepared to /tmp/AIML/data/tensorflow/datasets/flower_photos/tf_flowers/3.0.1. Subsequent calls will reuse this data.


In [21]:
num_classes = metadata.features['label'].num_classes
print(num_classes)

5
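
The split specification passed to `pkg_tfds.load` above carves the single `train` split into three parts at the 80% and 90% marks. The sketch below works out how many examples each part should receive, assuming the 3670 images counted earlier and rounding each boundary to the nearest example. The rounding rule is an assumption, but for 3670 examples both boundaries are whole numbers, so it does not affect the result. The helper names are illustrative only.

```python
def percent_boundary(num_examples, percent):
    """Index at `percent` of the dataset, rounded to the nearest example."""
    return (num_examples * percent + 50) // 100

def split_sizes(num_examples, boundaries=(80, 90)):
    """Sizes of the slices [0:80%], [80%:90%], [90%:] of a dataset."""
    edges = [0] + [percent_boundary(num_examples, p) for p in boundaries] + [num_examples]
    return [end - start for start, end in zip(edges, edges[1:])]

sizes = split_sizes(3670)
print(sizes)  # [2936, 367, 367]
```

Compared with the Keras utility above, this leaves the same 2936 images for training and divides the remaining 734 evenly between validation and test.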


In [25]:
train_ds = configure_for_performance(train_ds)
val_ds = configure_for_performance(val_ds)
test_ds = configure_for_performance(test_ds)