In [1]:
try:
    # %tensorflow_version only exists in Colab.
    %tensorflow_version 2.x
    IS_COLAB = True
except Exception:
    IS_COLAB = False

if IS_COLAB:
    !git clone https://github.com/raoulg/ml-21.git
    %cd ml-21/5-vision
    %pip install loguru

In [2]:
%load_ext autoreload
%autoreload 2
import tensorflow as tf
from pathlib import Path
import numpy as np
import sys
sys.path.insert(0, "..")

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '1' 

## Baseline model
Let's try a simple model, with just one `Conv2D`, one `MaxPooling2D`, and one hidden `Dense` layer.

With `augment=False` and without `BatchNormalization` or `Dropout`, the model will start overfitting. Adding those two layers helps to reduce the overfitting, and setting `augment=True` on the data generator helps us squeeze more juice out of our data.

In [3]:
from src.data import make_dataset

data_dir = Path("../data/raw")
make_dataset.get_raw_data(data_dir)

2021-12-21 13:13:18.611 | INFO     | src.data.make_dataset:get_raw_data:20 - Data not present in ../data/raw, downloading from url


Downloading data from https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz


In [4]:
targetsize = (150, 150)
datagen_kwargs = dict(rescale=1./255, validation_split=0.2)
dataflow_kwargs = dict(target_size=targetsize, batch_size=32,
                    interpolation="bilinear")

train, valid = make_dataset.create_generators(datagen_kwargs, dataflow_kwargs, 
                                              datadir = data_dir / "flower_photos",
                                              augment=False)


2021-12-21 13:14:17.280 | INFO     | src.data.make_dataset:create_generators:57 - Creating validation set data generator
2021-12-21 13:14:17.362 | INFO     | src.data.make_dataset:create_generators:67 - Creating train set data generator


Found 731 images belonging to 5 classes.
Found 2939 images belonging to 5 classes.
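
The two counts above follow from `validation_split=0.2`. A quick sanity check of the split, and of how many batches one pass over each set takes at `batch_size=32`, assuming the last, smaller batch is also yielded (which is how Keras directory iterators report their length):

```python
import math

n_valid, n_train = 731, 2939  # counts reported by the generators above
batch_size = 32

total = n_train + n_valid
valid_fraction = n_valid / total

# Number of batches needed to see every image once (the last batch is smaller).
train_steps = math.ceil(n_train / batch_size)
valid_steps = math.ceil(n_valid / batch_size)

print(f"total images: {total}")
print(f"validation fraction: {valid_fraction:.3f}")
print(f"train batches per epoch: {train_steps}")
print(f"valid batches per epoch: {valid_steps}")
```

Whether `train_model.train` actually uses these step counts is not shown in this notebook, so treat them as a rough guide to how long an epoch takes.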


In [11]:
from src.models import base
shape = targetsize + (3,)
model = base.base_imagemodel(shape, classes = train.num_classes)

Note how we **don't** use a softmax activation in the output layer. We could do that, but this way the outputs (the logits) can range over $(-\infty, \infty)$ instead of $[0, 1]$. If we omit the softmax activation, we need to specify `from_logits=True` in the loss.

The loss then applies the softmax itself, which tends to be numerically more stable, while the result is equivalent. If we do add a softmax to the model, we have to set `from_logits=False`.
Also note how we don't hardcode the number of units for the last layer. We obtain it from the `train` generator via `train.num_classes`.
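
The equivalence between the two settings can be checked by hand. For a multi-class loss such as `CategoricalCrossentropy`, the activation that `from_logits=True` folds into the loss is a softmax. The sketch below is a plain-Python illustration of the idea, not TensorFlow's implementation; the logits and the label are made-up values.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability; the result is unchanged.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def crossentropy_from_probs(y_true, probs):
    # Categorical cross-entropy on probabilities (the from_logits=False case).
    return -sum(t * math.log(p) for t, p in zip(y_true, probs))

def crossentropy_from_logits(y_true, logits):
    # Same loss computed directly on logits via log-sum-exp (the from_logits=True case).
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return -sum(t * (z - log_sum_exp) for t, z in zip(y_true, logits))

logits = [2.0, -1.0, 0.5, 0.0, -3.0]  # hypothetical raw outputs for 5 classes
y_true = [1, 0, 0, 0, 0]              # one-hot label

loss_a = crossentropy_from_probs(y_true, softmax(logits))
loss_b = crossentropy_from_logits(y_true, logits)
print(loss_a, loss_b)
```

Both routes give the same loss up to floating-point error. The logits version never forms very small probabilities before taking the logarithm, which is where the extra stability comes from.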

In [12]:
model.compile(
  optimizer='adam',
  loss=tf.losses.CategoricalCrossentropy(from_logits=True),
  metrics=['accuracy'])

In [13]:
model.summary()

Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d_2 (Conv2D)           (None, 148, 148, 32)      896       
                                                                 
 batch_normalization_2 (Batc  (None, 148, 148, 32)     128       
 hNormalization)                                                 
                                                                 
 max_pooling2d_2 (MaxPooling  (None, 74, 74, 32)       0         
 2D)                                                             
                                                                 
 flatten_2 (Flatten)         (None, 175232)            0         
                                                                 
 dense_4 (Dense)             (None, 128)               22429824  
                                                                 
 dropout_2 (Dropout)         (None, 128)              
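
The shapes and parameter counts in the summary can be checked by hand. The sketch below assumes a 3x3 kernel with `valid` padding and a 2x2 max pool, which is what the 150 → 148 → 74 shapes and the 896 convolution parameters imply; the actual settings live in `src.models.base` and are not shown here.

```python
height = 150
channels = 3
filters = 32
kernel = 3       # assumed: inferred from 150 -> 148 and the 896 parameters
pool = 2         # assumed: inferred from 148 -> 74
dense_units = 128

# A 'valid' convolution shrinks each spatial dimension by kernel - 1.
conv_out = height - (kernel - 1)
# Each filter has kernel * kernel * channels weights plus one bias.
conv_params = (kernel * kernel * channels + 1) * filters

# BatchNormalization keeps gamma, beta, moving mean and moving variance per channel.
bn_params = 4 * filters

pool_out = conv_out // pool
flat = pool_out * pool_out * filters
# Each dense unit has one weight per flattened input plus one bias.
dense_params = (flat + 1) * dense_units

print(conv_out, conv_params)
print(bn_params)
print(pool_out, flat)
print(dense_params)
```

Almost all of the parameters sit in the first `Dense` layer, because the flattened feature map is so large.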

So, let's try the model for 10 epochs.

In [14]:
logbase = Path("../logs")
logbase.mkdir(parents=True, exist_ok=True)
log_dir = logbase / "basemodel"
check_dir = Path("../models/base")

In [15]:
from src.models import train_model

epochs = 10
history = train_model.train(log_dir, check_dir, 
                            model=model, 
                            traingen=train,
                            validgen=valid, 
                            totalepochs=epochs,
                            verbose=1)


Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


While the model is learning initially, we see it starts overfitting at some point, with a train accuracy above 90% while the validation accuracy gets stuck around 50%. It's time to set the `augment=True` argument.

Pay attention to how I manage to continue training from the epoch count where I stopped. I trained 10 epochs, and will continue with those weights for another 5 epochs.
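
How `train_model.train` resumes is not shown in this notebook. A reasonable guess is that it forwards `initial_epochs` to the `initial_epoch` argument of Keras `model.fit`, where `epochs` is the index of the final epoch rather than the number of epochs to run. The plain-Python sketch below mimics that bookkeeping; `epoch_labels` is a hypothetical helper, not part of the repository.

```python
def epoch_labels(total_epochs, initial_epoch=0):
    """Return the progress labels Keras would print for this run."""
    # Keras iterates from initial_epoch up to (not including) total_epochs
    # and prints 1-based epoch numbers.
    return [f"Epoch {e + 1}/{total_epochs}" for e in range(initial_epoch, total_epochs)]

epochs = 10
first_run = epoch_labels(total_epochs=epochs)
second_run = epoch_labels(total_epochs=epochs + 5, initial_epoch=epochs)

print(first_run[0], "...", first_run[-1])
print(second_run)
```

Under that assumption, `totalepochs=epochs+5` with `initial_epochs=epochs` runs five more epochs, labelled 11 to 15, on top of the weights from the first run.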

In [None]:
train, valid = make_dataset.create_generators(datagen_kwargs, dataflow_kwargs, 
                                              datadir = data_dir / "flower_photos",
                                              augment=True)
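
What `augment=True` does inside `create_generators` is not shown in this notebook. Typical choices are random flips, shifts, rotations and zooms applied on the fly. As a toy sketch of the idea, using a tiny nested list as a stand-in for an image and two hypothetical helpers (`random_flip`, `random_shift`):

```python
import random

def random_flip(image, rng):
    """Mirror the image left-right with probability 0.5."""
    if rng.random() < 0.5:
        return [row[::-1] for row in image]
    return image

def random_shift(image, rng, max_shift=1):
    """Shift the image horizontally by up to max_shift pixels, padding with 0."""
    shift = rng.randint(-max_shift, max_shift)
    width = len(image[0])
    shifted = []
    for row in image:
        if shift > 0:
            shifted.append([0] * shift + row[: width - shift])
        elif shift < 0:
            shifted.append(row[-shift:] + [0] * (-shift))
        else:
            shifted.append(list(row))
    return shifted

def augment(image, rng):
    # Apply the random transformations one after another.
    return random_shift(random_flip(image, rng), rng)

rng = random.Random(0)  # seeded so the run is repeatable
image = [[1, 2, 3],
         [4, 5, 6]]     # toy 2x3 "image"

variants = [augment(image, rng) for _ in range(5)]
for v in variants:
    print(v)
```

Every epoch the network sees a slightly different version of each training image, so memorizing exact pixels no longer pays off. The validation images are normally left untouched.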

In [11]:
history = train_model.train(log_dir, check_dir, 
                            model=model, 
                            traingen=train,
                            validgen=valid, 
                            totalepochs=epochs+5,
                            initial_epochs=epochs
                            )

Epoch 11/15
INFO:tensorflow:Assets written to: ../models/base/assets


Epoch 12/15
Epoch 13/15
INFO:tensorflow:Assets written to: ../models/base/assets


Epoch 14/15
Epoch 15/15
INFO:tensorflow:Assets written to: ../models/base/assets


The train accuracy drops (do you understand why?) and the model keeps learning without overfitting. It seems to get stuck at a plateau, but when I ran it for over 40 epochs it reached just above 60% accuracy, which is pretty good for such a basic model on a dataset like this.

In [None]:
from src.visualization import visualize
visualize.evaluate_image_classifier(model, valid, grid=16)

It is hard to say why the model makes a wrong prediction on some images. Sometimes you can see how it is sidetracked by the background or by some odd detail, but sometimes the example seems really obvious and it is hard to understand why the model doesn't pick it up.


We see the model is a slow learner. The loss is fluctuating, but the model keeps learning for quite some time and doesn't overfit quickly when `augment=True`.

On the other hand, the image set is pretty hard, with lots and lots of different situations, so all in all a simple model like this still does pretty well.

In [None]:
%load_ext tensorboard
# Braces make IPython substitute the Python variable; a bare log_dir is read as a literal path.
%tensorboard --logdir {log_dir}