# Training with ImageDataGenerator

I will build and train a model on the Horses or Humans dataset. It contains over a thousand images of horses and humans in varying poses and file sizes.

In [1]:
#download the compressed dataset
!wget https://storage.googleapis.com/tensorflow-1-public/course2/week3/horse-or-human.zip

--2024-06-06 20:04:49--  https://storage.googleapis.com/tensorflow-1-public/course2/week3/horse-or-human.zip
Resolving storage.googleapis.com (storage.googleapis.com)... 142.251.2.207, 74.125.137.207, 2607:f8b0:4023:c0b::cf, ...
Connecting to storage.googleapis.com (storage.googleapis.com)|142.251.2.207|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 149574867 (143M) [application/zip]
Saving to: ‘horse-or-human.zip’


2024-06-06 20:04:50 (155 MB/s) - ‘horse-or-human.zip’ saved [149574867/149574867]



In [2]:
#Unzip the data
import zipfile

#extract the archive; the context manager closes the file automatically
local_zip = './horse-or-human.zip'
with zipfile.ZipFile(local_zip, 'r') as zip_ref:
    zip_ref.extractall('./horse-or-human')

The contents of the .zip are extracted to the base directory ./horse-or-human, which in turn contains horses and humans subdirectories.

The images are not explicitly labeled as horses or humans. You will use the ImageDataGenerator API instead, which labels images automatically according to the directory names and structure. So, for example, a training directory containing a 'horses' subdirectory and a 'humans' subdirectory yields two classes. ImageDataGenerator labels the images accordingly, which saves a coding step.
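The directory-to-label mapping can be illustrated without TensorFlow. The sketch below is not Keras code: `infer_class_indices` is a hypothetical helper that mimics the documented behavior of treating each subdirectory as one class and assigning label indices in alphanumeric order of the subdirectory names. It uses a temporary directory as a stand-in for the dataset folder.

```python
import os
import tempfile


def infer_class_indices(root_dir):
    """Map each subdirectory name to an integer label, in sorted order.

    Hypothetical helper that imitates how directory names become labels.
    """
    class_names = sorted(
        name for name in os.listdir(root_dir)
        if os.path.isdir(os.path.join(root_dir, name))
    )
    return {name: index for index, name in enumerate(class_names)}


# Build a throwaway directory tree that mirrors the dataset layout.
with tempfile.TemporaryDirectory() as root:
    for class_name in ("humans", "horses"):
        os.makedirs(os.path.join(root, class_name))
    class_indices = infer_class_indices(root)

print(class_indices)  # {'horses': 0, 'humans': 1}
```

Under this ordering, horses get label 0 and humans get label 1, which matters later when interpreting the single sigmoid output of the model.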

In [3]:
import os
#Directory with our training horse pictures
train_horse_dir = os.path.join('./horse-or-human', 'horses')

#Directory with our training human pictures
train_human_dir = os.path.join('./horse-or-human', 'humans')

In [4]:
train_horse_names = os.listdir(train_horse_dir)
print(train_horse_names[:10])

train_human_names = os.listdir(train_human_dir)
print(train_human_names[:10])

['horse36-2.png', 'horse36-4.png', 'horse14-0.png', 'horse01-2.png', 'horse08-7.png', 'horse04-8.png', 'horse10-3.png', 'horse38-2.png', 'horse32-8.png', 'horse21-2.png']
['human15-02.png', 'human16-20.png', 'human07-17.png', 'human01-14.png', 'human02-11.png', 'human16-29.png', 'human07-21.png', 'human12-00.png', 'human09-27.png', 'human08-15.png']


In [5]:
print('total training horse images:',len(os.listdir(train_horse_dir)))
print('total training human images:',len(os.listdir(train_human_dir)))

total training horse images: 500
total training human images: 527


Now take a look at a few pictures to get a better sense of what they look like.

In [8]:
%matplotlib inline
import matplotlib.pyplot as plt
import matplotlib.image as mp
#parameters for our graph; we will output images in a 4x4 configuration
nrows = 4
ncols = 4

#index for iterating over images
pic_index = 0

Now, display a batch of 8 horse and 8 human pictures, which fills the 4x4 grid.

In [11]:
#set up matplotlib fig, and size it to fit 4x4 pics
fig = plt.gcf()
fig.set_size_inches(ncols * 4, nrows * 4)

#advance the index by 8 so each run shows the next 8 images of each class
pic_index += 8
next_horse_pix = [os.path.join(train_horse_dir, fname)
                  for fname in train_horse_names[pic_index-8:pic_index]]
next_human_pix = [os.path.join(train_human_dir, fname)
                  for fname in train_human_names[pic_index-8:pic_index]]

for i, img_path in enumerate(next_horse_pix + next_human_pix):
  #set up subplot; subplot indices start at 1
  sp = plt.subplot(nrows, ncols, i + 1)
  sp.axis('off')  #do not show the axes

  img = mp.imread(img_path)
  plt.imshow(img)

plt.show()



# Building a Small Model from Scratch

In [12]:
import tensorflow as tf
from tensorflow import keras

In [13]:
model = tf.keras.models.Sequential([
    #first convolution layer
    tf.keras.layers.Conv2D(16, (3,3), activation='relu', input_shape=(300,300,3)),
    tf.keras.layers.MaxPooling2D(2,2),
    #Second convolution layer
    tf.keras.layers.Conv2D(32, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    #Third convolutional layer
    tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    #Fourth convolutional layer
    tf.keras.layers.Conv2D(128, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    #Fifth convolutional layer
    tf.keras.layers.Conv2D(128, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    #Flatten the results to feed into a DNN
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    #only 1 output neuron, since it is a binary classification
    tf.keras.layers.Dense(1, activation='sigmoid')
])

In [14]:
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d (Conv2D)             (None, 298, 298, 16)      448       
                                                                 
 max_pooling2d (MaxPooling2  (None, 149, 149, 16)      0         
 D)                                                              
                                                                 
 conv2d_1 (Conv2D)           (None, 147, 147, 32)      4640      
                                                                 
 max_pooling2d_1 (MaxPoolin  (None, 73, 73, 32)        0         
 g2D)                                                            
                                                                 
 conv2d_2 (Conv2D)           (None, 71, 71, 64)        18496     
                                                                 
 max_pooling2d_2 (MaxPoolin  (None, 35, 35, 64)        0
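The output shapes and parameter counts in the summary follow from simple arithmetic. The sketch below assumes the Keras defaults used in the model definition: 3x3 convolutions with 'valid' padding and stride 1, and 2x2 max pooling that floors odd sizes. The helper names are hypothetical, and the values for layers not listed in the summary above are computed here rather than copied from the notebook output.

```python
def conv_output_size(size, kernel=3):
    # 'valid' padding with stride 1 trims (kernel - 1) pixels per dimension.
    return size - kernel + 1


def pool_output_size(size, pool=2):
    # 2x2 max pooling halves the size, dropping any leftover row or column.
    return size // pool


def conv_param_count(in_channels, filters, kernel=3):
    # Each filter has kernel*kernel*in_channels weights plus one bias.
    return (kernel * kernel * in_channels + 1) * filters


size = 300        # input images are 300x300
in_channels = 3   # RGB
for filters in (16, 32, 64, 128, 128):
    size = conv_output_size(size)
    params = conv_param_count(in_channels, filters)
    print(f"conv: {size}x{size}x{filters}, params={params}")
    size = pool_output_size(size)
    print(f"pool: {size}x{size}x{filters}")
    in_channels = filters

flattened = size * size * in_channels
print("flattened features:", flattened)
```

The first three convolution layers reproduce the 448, 4640, and 18496 parameters reported by `model.summary()`, and the spatial size shrinks from 300 to 7 before the Flatten layer.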

In [15]:
from tensorflow.keras.optimizers import RMSprop
model.compile(loss='binary_crossentropy',
              optimizer = RMSprop(learning_rate=0.001),
              metrics=['accuracy'])

# Data Preprocessing

The next step is to set up a data generator that reads pictures from the source folders, converts them to float32 tensors, and feeds them (with their labels) to the model. Typically you would have one generator for the training images and one for the validation images; this notebook sets up only the training generator. It yields batches of 300x300 images together with their binary labels.

In [16]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

#All images will be rescaled by 1./255
train_datagen = ImageDataGenerator(rescale=1/255)

#Flow training images in batches of 128 using train_datagen generator
train_generator = train_datagen.flow_from_directory(
        './horse-or-human/', #This is the source directory for training images
        target_size=(300,300), #All images will be resized to 300x300
        batch_size=128,
        #Since we use binary_crossentropy loss, we need binary labels
        class_mode='binary')

Found 1027 images belonging to 2 classes.
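The `rescale=1/255` argument multiplies every pixel value by that factor, so 8-bit intensities in the range 0 to 255 end up roughly in the range 0 to 1. The plain-Python sketch below shows only that arithmetic; `rescale_pixels` is a hypothetical helper, not part of Keras, and the three sample values are made up for illustration.

```python
RESCALE = 1 / 255  # same factor passed to ImageDataGenerator above


def rescale_pixels(pixels, factor=RESCALE):
    """Scale raw 8-bit pixel intensities by a constant factor."""
    return [value * factor for value in pixels]


raw_pixels = [0, 128, 255]  # darkest, mid-gray, brightest (illustrative)
scaled = rescale_pixels(raw_pixels)
print(scaled)
```

The same scaling has to be applied to any image passed to the model at prediction time, since the model only ever saw rescaled inputs during training.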


In [None]:
#Train the model
history = model.fit(
    train_generator,
    steps_per_epoch=8,
    epochs=15,
    verbose=1
)
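The value `steps_per_epoch=8` relates to the dataset size and batch size reported above. The arithmetic below is a sketch only: exactly how the leftover images are served across epochs depends on the Keras version, so treat the numbers as a sanity check rather than a description of the generator's internals.

```python
import math

num_images = 1027   # "Found 1027 images belonging to 2 classes."
batch_size = 128    # batch_size passed to flow_from_directory

full_batches = num_images // batch_size                   # complete batches
leftover = num_images - full_batches * batch_size         # images beyond them
steps_to_cover_all = math.ceil(num_images / batch_size)   # includes partial batch

print("full batches:", full_batches)              # 8
print("leftover images:", leftover)               # 3
print("steps to cover all:", steps_to_cover_all)  # 9
```

With 8 steps, one epoch draws 1024 images; a ninth step would be needed to include the remaining 3 in the same epoch.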

# Model Prediction

Now take a look at running a prediction with the model. This code relies on the google.colab module, so it runs in Google Colab: it lets you choose one or more files from your file system, uploads them, and runs them through the model, indicating whether each image shows a horse or a human.

In [None]:
import numpy as np
from google.colab import files
from tensorflow.keras.utils import load_img, img_to_array

uploaded = files.upload()

for fn in uploaded.keys():
  #predicting images
  path = '/content/' + fn
  img = load_img(path, target_size=(300,300))
  x = img_to_array(img)
  x /= 255
  x = np.expand_dims(x, axis=0)

  images = np.vstack([x])
  classes = model.predict(images, batch_size=10)
  print(classes[0])
  #sigmoid output above 0.5 means class 1 (humans); otherwise class 0 (horses)
  if classes[0] > 0.5:
    print(fn + ' is a human')
  else:
    print(fn + ' is a horse')
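The thresholding step can be isolated from the model. The sketch below assumes the class indices follow the alphabetical order of the directory names (horses = 0, humans = 1). `label_from_probability` is a hypothetical helper, and the probabilities are made-up values rather than real model outputs.

```python
CLASS_INDICES = {"horses": 0, "humans": 1}  # assumed alphabetical mapping


def label_from_probability(probability, threshold=0.5):
    """Turn a sigmoid output into a class name using a fixed threshold."""
    index_to_name = {index: name for name, index in CLASS_INDICES.items()}
    predicted_index = 1 if probability > threshold else 0
    return index_to_name[predicted_index]


for probability in (0.03, 0.5, 0.97):
    print(probability, "->", label_from_probability(probability))
```

A value exactly at the threshold falls into class 0 here because the comparison is strict, matching the `> 0.5` check in the cell above.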