# Using Real-world Images
The limitation of the classifier from a previous week was that it used a dataset of very uniform images: clothing pictures framed in a 28x28 size.
What happens if we use larger images in which the features can appear in different locations?

# Understanding ImageDataGenerator
You can point the generator at a directory, and the names of its subdirectories become the labels for you: Images -> Training/Validation -> Horses/Humans
To load images using TensorFlow:
```jupyterpython
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Setting up a training generator
train_datagen = ImageDataGenerator(rescale=1./255) # instantiate and normalize the data
# always point to a directory that contains subdirectories that contain images;
# the names of the subdirectories will be the labels of the images
train_generator = train_datagen.flow_from_directory(
    directory='train_dir',
    target_size=(300, 300), # input data has to be the same size; images are resized as they are loaded -> no need to pre-process images
    batch_size=128, # images are loaded in batches to optimise performance
    class_mode='binary' # just a binary classifier e.g. horse vs. human
)

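# Setting up a validation generator (named `validation_generator` because that is the name used when training)
validation_datagen = ImageDataGenerator(rescale=1./255)
validation_generator = validation_datagen.flow_from_directory(
    directory='validation_dir', # points to a directory with validation images, one subdirectory per class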
    target_size=(300, 300),
    batch_size=32,
    class_mode='binary'
)
```
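
To make the labelling rule concrete, here is a minimal plain-Python sketch of how subdirectory names can be turned into integer labels. It is not the Keras implementation: `class_indices_from_directory` and `make_demo_tree` are hypothetical helpers, and the sketch assumes classes are indexed in alphanumeric order, which is how the Keras documentation describes `flow_from_directory`.

```python
import os
import tempfile

def class_indices_from_directory(root):
    """Map each subdirectory name under `root` to an integer label, in sorted order."""
    class_names = sorted(
        name for name in os.listdir(root)
        if os.path.isdir(os.path.join(root, name))
    )
    return {name: index for index, name in enumerate(class_names)}

def make_demo_tree():
    """Create a throwaway directory with one (empty) subdirectory per class."""
    root = tempfile.mkdtemp()
    for class_name in ("humans", "horses"):
        os.makedirs(os.path.join(root, class_name))
    return root

demo_root = make_demo_tree()
print(class_indices_from_directory(demo_root))  # {'horses': 0, 'humans': 1}
```

With a real generator, the mapping that was actually used can be read from `train_generator.class_indices`.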

# Defining a ConvNet to use complex images
```jupyterpython
import tensorflow as tf

model = tf.keras.models.Sequential([ # the layers are passed as a list
    # 3 sets of convolution and pooling layers because images are more complex: 300x300x3 (RGB) instead of 28x28x1 (greyscale).
    # As a result we end up with 35x35 feature maps
    tf.keras.layers.Conv2D(16, (3,3), activation='relu', input_shape=(300, 300, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(32, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid'), # only 1 neuron for 2 classes; the sigmoid function is great for binary classification
    # we could still use a softmax function with 2 neurons as before, but sigmoid is more efficient for binary classification
])
```
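
Why one sigmoid neuron is enough for two classes can be checked numerically. The sketch below uses only the standard library and a made-up logit `z = 1.7`. It shows that a 2-way softmax over the logits `[0, z]` gives the same probability for the second class as `sigmoid(z)`, so the second output neuron carries no extra information.

```python
import math

def sigmoid(z):
    """Logistic function: squashes a single logit into a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(logits):
    """Softmax over a list of logits, shifted by the max for numerical stability."""
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

z = 1.7  # hypothetical logit, chosen only for illustration
p_sigmoid = sigmoid(z)
p_softmax = softmax([0.0, z])[1]  # probability of the second class

print(abs(p_sigmoid - p_softmax) < 1e-12)  # True
```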

The journey of an image through the network (output shape after each layer):
```
(300, 300, 3) -> (298, 298, 16) -> (149, 149, 16) -> (147, 147, 32) -> (73, 73, 32) -> (71, 71, 64) -> (35, 35, 64) -> (78400) -> (512) -> (1)
```
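
The shapes above can be reproduced with a little arithmetic. This is a sketch that assumes the Keras defaults used in the model: 3x3 convolutions with no padding and stride 1 (each side shrinks by 2) and 2x2 max pooling (each side is halved, rounding down). The helper names are hypothetical.

```python
def conv_valid(size, kernel=3):
    """Side length after an unpadded, stride-1 convolution."""
    return size - kernel + 1

def max_pool(size, pool=2):
    """Side length after non-overlapping max pooling (rounds down)."""
    return size // pool

def trace_shapes(height, width, channels, conv_filters):
    """Return the shape after the input and after every conv and pooling layer."""
    shapes = [(height, width, channels)]
    for filters in conv_filters:
        height, width = conv_valid(height), conv_valid(width)
        shapes.append((height, width, filters))
        height, width = max_pool(height), max_pool(width)
        shapes.append((height, width, filters))
    return shapes

shapes = trace_shapes(300, 300, 3, conv_filters=[16, 32, 64])
last_height, last_width, last_channels = shapes[-1]
flattened = last_height * last_width * last_channels

print(shapes[-1])  # (35, 35, 64)
print(flattened)   # 78400
```

Calling `model.summary()` on the real model prints the same shapes along with the parameter counts.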

# Training the ConvNet with fit_generator
```jupyterpython
from tensorflow.keras.optimizers import RMSprop

# `model` is the ConvNet defined in the previous section; do not re-create it here
# use `binary_crossentropy` instead of `sparse_categorical_crossentropy` to support a binary choice
# use RMSprop instead of the Adam optimizer
model.compile(loss='binary_crossentropy', optimizer=RMSprop(learning_rate=0.001), metrics=['acc'])

# Training the model:
history = model.fit_generator( # note we use `fit_generator` instead of `fit` because we use a generator instead of datasets
                               # (`fit_generator` is deprecated since TensorFlow 2.1, where `fit` accepts generators directly)
    train_generator, # set up earlier, streams images from the training directory with a batch size of 128
    steps_per_epoch=8, # 1024 images in the training directory, loaded 128 at a time, so 1024/128=8 batches cover them all
    epochs=15,
    validation_data=validation_generator, # a validation set with 256 images in batches of 32
    validation_steps=8, # 256/32=8
    verbose=2 # how much to display while training; with a value of 2 we get one line per epoch without the animated progress bar
)
```
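
The values of `steps_per_epoch` and `validation_steps` follow from the dataset size and the batch size. A small sketch of that arithmetic, with a hypothetical helper `steps_to_cover` that rounds up so a final partial batch is still counted:

```python
import math

def steps_to_cover(num_images, batch_size):
    """Number of batches needed to see every image once."""
    return math.ceil(num_images / batch_size)

print(steps_to_cover(1024, 128))  # 8, the training value used above
print(steps_to_cover(256, 32))    # 8, the validation value used above
```

If the image count is not a multiple of the batch size, the last batch is simply smaller, so rounding up keeps every image in the epoch.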