### TFData Loader Installation

Welcome. This is a short guide to installing and using my module for loading image data for image classification problems.

Run the cell below to install the module:

In [None]:
!pip install git+https://github.com/sebastian-sz/tfdata-image-loader@main

Obtaining tfdata-image-loader from git+git://github.com/sebastian-sz/tfdata-image-loader.git#egg=tfdata-image-loader
  Cloning git://github.com/sebastian-sz/tfdata-image-loader.git to ./src/tfdata-image-loader
  Running command git clone -q git://github.com/sebastian-sz/tfdata-image-loader.git /content/src/tfdata-image-loader
Installing collected packages: tfdata-image-loader
  Running setup.py develop for tfdata-image-loader
Successfully installed tfdata-image-loader


Proceed with the standard Python imports:

In [None]:
%tensorflow_version 2.x

import os

import matplotlib.pyplot as plt
import tensorflow as tf

from tfdata_image_loader import TFDataImageLoader

print(tf.__version__)

2.4.1


### Download example dataset

In this section we download an example dataset (the TensorFlow flower photos archive) and unpack it into the current directory.

In [None]:
!curl https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz | tar xz

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  218M  100  218M    0     0  46.6M      0  0:00:04  0:00:04 --:--:-- 59.8M


Remove the license file so that the dataset directory contains only the class subdirectories:

In [None]:
!rm flower_photos/LICENSE.txt

Preview the class names:

In [None]:
!ls flower_photos

daisy  dandelion  roses  sunflowers  tulips
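
If you want to check the directory layout from Python rather than with `ls`, a minimal sketch like the one below can help. It assumes one subdirectory per class, as in the listing above. The helper name `count_images_per_class` and the set of accepted file extensions are my own assumptions for illustration and are not part of `tfdata-image-loader`.

```python
from pathlib import Path

# Assumption: which file extensions count as images. Adjust for your data.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images_per_class(data_path):
    """Return {class_name: image_count} for each subdirectory of data_path."""
    counts = {}
    for class_dir in sorted(Path(data_path).iterdir()):
        if not class_dir.is_dir():
            continue  # skip stray files such as LICENSE.txt
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts
```

Calling `count_images_per_class("./flower_photos")` should return one entry per class directory, which makes an unexpectedly empty or misnamed folder easy to spot.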


### Load the data using our loader

In [None]:
DATA_PATH = "./flower_photos"
BATCH_SIZE = 32
TARGET_SIZE = (224, 224)


def preprocess_data(image, label):
    return (image / 127.5) - 1, label


def augment_data(image, label):
    return tf.image.random_flip_left_right(image), label
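
To see what `preprocess_data` does numerically, here is the same arithmetic applied to single pixel values in plain Python. It assumes the loader yields pixel values in the range 0 to 255, which I have not verified here. Under that assumption the images end up in the range -1 to 1. The helper name `scale_pixel` is hypothetical.

```python
def scale_pixel(value):
    """Same arithmetic as preprocess_data, applied to one pixel value."""
    return (value / 127.5) - 1


# Darkest, middle, and brightest values of an 8-bit image.
for pixel in (0, 127.5, 255):
    print(pixel, "->", scale_pixel(pixel))
# 0 -> -1.0
# 127.5 -> 0.0
# 255 -> 1.0
```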

In [None]:
data_loader = TFDataImageLoader(
    data_path=DATA_PATH,
    target_size=TARGET_SIZE,
    batch_size=BATCH_SIZE,
    pre_process_function=preprocess_data,
    augmentation_function=augment_data,
)

Found 3670 images, belonging to 5 classes

Class names mapping: 
{'daisy': array([1, 0, 0, 0, 0], dtype=int32), 'dandelion': array([0, 1, 0, 0, 0], dtype=int32), 'roses': array([0, 0, 1, 0, 0], dtype=int32), 'sunflowers': array([0, 0, 0, 1, 0], dtype=int32), 'tulips': array([0, 0, 0, 0, 1], dtype=int32)}
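
The mapping above pairs each class name with a one-hot vector, in alphabetical order. The sketch below reproduces that kind of mapping in plain Python. It is only an illustration of the encoding, not necessarily how the loader builds it internally, and `one_hot_mapping` is a hypothetical helper name.

```python
def one_hot_mapping(class_names):
    """Map each class name to a one-hot list, using alphabetical order."""
    names = sorted(class_names)
    return {
        name: [1 if i == j else 0 for j in range(len(names))]
        for i, name in enumerate(names)
    }


mapping = one_hot_mapping(["tulips", "daisy", "roses", "sunflowers", "dandelion"])
print(mapping["roses"])  # [0, 0, 1, 0, 0]
```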



In [None]:
dataset = data_loader.load_dataset()

In [None]:
for image_batch, label_batch in dataset.take(1):
    print(image_batch.shape)
    print(label_batch.shape)

(32, 224, 224, 3)
(32, 5)
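
With 3670 images and a batch size of 32, you can estimate how many batches one pass over the data produces. The sketch below does the arithmetic. Whether the loader keeps or drops the final partial batch is an assumption I have not checked, so both cases are shown. `batches_per_epoch` is a hypothetical helper.

```python
import math


def batches_per_epoch(num_images, batch_size, drop_remainder=False):
    """Number of batches in one pass over num_images samples."""
    if drop_remainder:
        return num_images // batch_size  # partial batch is discarded
    return math.ceil(num_images / batch_size)  # partial batch is kept


print(batches_per_epoch(3670, 32))                       # 115
print(batches_per_epoch(3670, 32, drop_remainder=True))  # 114
print(3670 % 32)                                         # 22 images in the partial batch
```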


### Train custom model
We can use the loaded data to train a model:

In [None]:
def make_model(num_classes):
    base_model = tf.keras.applications.MobileNetV2(
        input_shape=(TARGET_SIZE[0], TARGET_SIZE[1], 3),
        include_top=False,
        pooling="avg",
    )

    # Freeze the pretrained backbone so that only the new classification head is trained.
    base_model.trainable = False

    return tf.keras.Sequential([
        base_model,
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(num_classes, activation="softmax")
    ])

In [None]:
# One subdirectory per class; this relies on LICENSE.txt having been removed above.
num_classes = len(os.listdir(DATA_PATH))

model = make_model(num_classes=num_classes)
model.compile(
    optimizer=tf.keras.optimizers.RMSprop(),
    loss=tf.keras.losses.CategoricalCrossentropy(from_logits=False),
    metrics=['accuracy']
)

model.summary()

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/mobilenet_v2/mobilenet_v2_weights_tf_dim_ordering_tf_kernels_1.0_224_no_top.h5
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
mobilenetv2_1.00_224 (Functi (None, 1280)              2257984   
_________________________________________________________________
dropout (Dropout)            (None, 1280)              0         
_________________________________________________________________
dense (Dense)                (None, 5)                 6405      
Total params: 2,264,389
Trainable params: 6,405
Non-trainable params: 2,257,984
_________________________________________________________________
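
The parameter counts in the summary can be checked by hand. A dense layer has one weight per input and output pair plus one bias per output, and the frozen backbone accounts for all non-trainable parameters. The helper name `dense_param_count` is my own.

```python
def dense_param_count(num_inputs, num_units):
    """Weights (inputs x units) plus one bias per unit."""
    return num_inputs * num_units + num_units


backbone_params = 2_257_984                # frozen MobileNetV2, from the summary
head_params = dense_param_count(1280, 5)   # pooled features -> 5 classes

print(head_params)                    # 6405
print(backbone_params + head_params)  # 2264389
```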


In [None]:
model.fit(
    dataset,
    epochs=1,
)



<tensorflow.python.keras.callbacks.History at 0x7f77d5c09250>

### Using your own data

To use your own data, you can either:
1. Install `tfdata-image-loader` locally and point it at a directory on your machine.
2. Mount your Google Drive in the Colab notebook and pass a path inside the mounted drive as `data_path`. For example:
```
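from google.colab import drive
from tfdata_image_loader import TFDataImageLoader

# Mount Google Drive at the usual Colab mount point (it must be an empty or new directory).
drive.mount("/content/drive")
# Relative to /content, the default Colab working directory.
data_path = "drive/My Drive/data/train/..."
train_loader = TFDataImageLoader(
    data_path=data_path,
    # (...) remaining arguments
)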
```
You can also temporarily copy the data from Google Drive to the Colab runtime's local disk and point the loader at the local copy.
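
A minimal sketch of copying a dataset to local disk is shown below. Both the helper name `copy_dataset_to_local` and the example paths in the usage note are hypothetical, so substitute your own. It needs Python 3.8 or newer because of `dirs_exist_ok`.

```python
import shutil
from pathlib import Path


def copy_dataset_to_local(source_dir, local_dir):
    """Copy a dataset directory tree and return the path of the local copy."""
    src = Path(source_dir)
    dst = Path(local_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"Dataset directory not found: {src}")
    # dirs_exist_ok lets the copy be re-run without failing on an existing target.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst
```

For example, `copy_dataset_to_local("/content/drive/My Drive/data/train", "/content/train")` (hypothetical paths) would return a local path that can be passed to the loader as `data_path`.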