In [2]:
from trainer import CatDogTrainer
import tensorflow as tf

%load_ext autoreload
%autoreload 2

Note that even in the original setup, validation accuracy is **higher** than training accuracy (and validation loss is lower).
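One commonly cited contributor to this gap is how Keras reports metrics. The training number shown for an epoch is a running average over all batches in that epoch, while validation is computed once at the end of the epoch with the final weights. Dropout and batch normalization also behave differently in training and inference mode. The sketch below uses made-up per-batch accuracies (not values from this run) to illustrate the averaging effect alone.

```python
# Hypothetical per-batch training accuracies within one epoch:
# the model improves steadily as it sees more batches.
batch_accuracies = [0.50, 0.60, 0.70, 0.80, 0.90]

# Keras-style epoch metric: mean over all batches seen during the epoch.
reported_train_accuracy = sum(batch_accuracies) / len(batch_accuracies)

# Validation runs after the last batch, so it reflects the final weights.
# For illustration only, assume it matches the last batch's accuracy.
end_of_epoch_accuracy = batch_accuracies[-1]

print(f"reported train accuracy: {reported_train_accuracy:.2f}")
print(f"end-of-epoch accuracy:   {end_of_epoch_accuracy:.2f}")
```

If this effect matters here, evaluating the trained model on the training set after the epoch should give a number closer to the validation one.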

# 1 - original training

Here we train the original model from [the TensorFlow transfer learning tutorial](https://www.tensorflow.org/tutorials/images/transfer_learning). Accuracy exceeds 0.9 on both the training and validation sets after just 2-3 epochs.

In [3]:
trainer = CatDogTrainer(initial_epochs=3, model_type='mobile_net')

In [4]:
trainer.train()

Epoch 1/3
Epoch 2/3
Epoch 3/3


# 2 - resnet

Let's now swap `MobileNetV2` for `resnet50` without changing anything else. Results on the validation set are not bad, but for some reason training accuracy is lower than validation accuracy.

In [5]:
trainer = CatDogTrainer(initial_epochs=3, model_type='resnet50')

Downloading data from https://github.com/keras-team/keras-applications/releases/download/resnet/resnet50v2_weights_tf_dim_ordering_tf_kernels_notop.h5


In [6]:
trainer.train()

Epoch 1/3
Epoch 2/3
Epoch 3/3


# 3 - changing `mobile_net` version

## image resolution

Let's first change the image resolution from 160 to 224. Everything still works, and the result is even better.
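Changing the input resolution also changes the size of the feature map that the frozen backbone hands to the pooling layer. A rough sketch, assuming MobileNetV2's overall downsampling factor of 32 and its 1280 output channels (`feature_map_shape` is a hypothetical helper, and the integer division is only exact for sizes divisible by 32):

```python
def feature_map_shape(img_size: int, total_stride: int = 32, channels: int = 1280):
    """Approximate output shape of a conv backbone without its classifier head.

    The defaults are assumptions matching MobileNetV2 at width 1.0;
    other backbones use different strides and channel counts.
    """
    side = img_size // total_stride
    return (side, side, channels)


for size in (160, 224):
    print(size, "->", feature_map_shape(size))
```

The 224 case agrees with the `(None, 7, 7, 1280)` output shape reported by `trainer.model.summary()` later in this notebook.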

In [7]:
trainer = CatDogTrainer(initial_epochs=3, model_type='mobile_net')

In [8]:
trainer.train()

Epoch 1/3
Epoch 2/3
Epoch 3/3


## normalization

Now let's replace the strange `[-1, 1]` normalization with a plain `[0, 1]` scaling and change the output layer from 1 unit to 2 (one per class):

```python
# Excerpts from the trainer code (hence the references to `self`).
def format_example(image, label):
    image = tf.cast(image, tf.float32)
    # Original scaling to [-1, 1]; let's change this to image /= 255.0.
    image = (image / 127.5) - 1
    image = tf.image.resize(image, (self.IMG_SIZE, self.IMG_SIZE))
    return image, label

# Original head with a single output unit; change this to 2.
self.prediction_layer = tf.keras.layers.Dense(1)
```
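To make the difference between the two scalings concrete, here is a small plain-Python sketch with hypothetical function names. The original `image / 127.5 - 1` maps pixels to `[-1, 1]`, which is the range the TensorFlow tutorial says the pretrained MobileNetV2 expects. The replacement maps them to `[0, 1]`.

```python
def scale_to_minus_one_one(pixel: float) -> float:
    """Original scaling from the tutorial: [0, 255] -> [-1, 1]."""
    return pixel / 127.5 - 1.0


def scale_to_zero_one(pixel: float) -> float:
    """Replacement scaling tried here: [0, 255] -> [0, 1]."""
    return pixel / 255.0


# Darkest, middle and brightest possible pixel values.
for pixel in (0.0, 127.5, 255.0):
    print(pixel, scale_to_minus_one_one(pixel), scale_to_zero_one(pixel))
```

With `[0, 1]` inputs the frozen backbone only sees the upper half of the range it was pretrained on, so any change in accuracy may come from this mismatch rather than from the classifier head.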

In [10]:
trainer = CatDogTrainer(initial_epochs=3, model_type='mobile_net')

First let's check our changes. We have 2 classes and pixel values are in the range `[0, 1]`.

In [11]:
trainer.model.summary()

Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
mobilenetv2_1.00_224 (Model) (None, 7, 7, 1280)        2257984   
_________________________________________________________________
global_average_pooling2d_3 ( (None, 1280)              0         
_________________________________________________________________
dense_3 (Dense)              (None, 2)                 2562      
Total params: 2,260,546
Trainable params: 2,562
Non-trainable params: 2,257,984
_________________________________________________________________
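The parameter counts in the summary can be checked by hand: a dense layer has one weight per (input feature, unit) pair plus one bias per unit. A minimal check, using a hypothetical helper `dense_param_count`:

```python
def dense_param_count(input_features: int, units: int) -> int:
    """Weights (input_features x units) plus one bias per unit."""
    return input_features * units + units


frozen_backbone_params = 2_257_984  # non-trainable params from the summary above
head_params = dense_param_count(1280, 2)

print(head_params)                           # 2562
print(frozen_backbone_params + head_params)  # 2260546
```

Only the 2,562 head parameters are trainable, which confirms that the backbone is frozen and that the head now has two output units.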


In [12]:
img_batch, label_batch = next(iter(trainer.train_batches))

In [13]:
img_batch.shape

TensorShape([32, 224, 224, 3])

In [14]:
img_batch[0, 0, 0, :]

<tf.Tensor: id=76276, shape=(3,), dtype=float32, numpy=array([0.68132335, 0.6166525 , 0.5587918 ], dtype=float32)>

Let's now train the model. Looks like everything still works.

In [15]:
trainer.train()

Epoch 1/3
Epoch 2/3
Epoch 3/3


## loss and metrics

It's not clear what all of these mean (`accuracy` etc.). Let's first look at our labels. They are **not** one-hot.

In [16]:
label_batch.shape

TensorShape([32])

In [17]:
label_batch[:5]

<tf.Tensor: id=87757, shape=(5,), dtype=int64, numpy=array([1, 0, 0, 0, 1])>
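For comparison, this is what the same five labels would look like in one-hot form. The sketch is plain Python with a hypothetical `one_hot` helper. Integer labels of shape `(32,)` are the format the "sparse" losses and metrics expect, whereas one-hot labels would have shape `(32, 2)`.

```python
def one_hot(labels, num_classes):
    """Turn integer class ids into one-hot rows."""
    return [[1 if cls == label else 0 for cls in range(num_classes)]
            for label in labels]


sparse_labels = [1, 0, 0, 0, 1]  # the first five labels from the batch above
print(one_hot(sparse_labels, 2))
# [[0, 1], [1, 0], [1, 0], [1, 0], [0, 1]]
```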

Can we use `SparseCategoricalCrossentropy`? It's not quite clear: results are mixed at best. Why is that? The results should be the same. Let's also try `BinaryCrossentropy`, which is a perfect fit for our case. It works.
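On paper the two losses do agree: a softmax over two logits gives the same class-1 probability as a sigmoid of their difference. The sketch below checks this in plain Python. It is not the TensorFlow implementation, and it assumes both losses receive raw logits, which in Keras corresponds to `from_logits=True`. Whether the trainer sets that flag is not shown in this notebook, so a mismatch there is only one possible reason for the mixed results.

```python
import math


def sparse_categorical_crossentropy(logits, label):
    """Cross-entropy from raw logits via a numerically stable log-softmax."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return -(logits[label] - log_sum_exp)


def binary_crossentropy(logit, label):
    """Sigmoid cross-entropy from a single raw logit (stable form)."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


two_logits = [0.3, 1.7]                       # hypothetical outputs of Dense(2)
single_logit = two_logits[1] - two_logits[0]  # equivalent output of Dense(1)

for label in (0, 1):
    sparse_loss = sparse_categorical_crossentropy(two_logits, label)
    binary_loss = binary_crossentropy(single_logit, label)
    print(label, round(sparse_loss, 6), round(binary_loss, 6))
```

Equal loss values for equivalent outputs do not guarantee identical training curves, since the two heads have different shapes and initial weights.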

In [18]:
trainer = CatDogTrainer(initial_epochs=3, model_type='mobile_net')

In [19]:
trainer.loss

<tensorflow.python.keras.losses.SparseCategoricalCrossentropy at 0x7f304cd66f28>

In [20]:
trainer.train()

Epoch 1/3
Epoch 2/3
Epoch 3/3


In [21]:
trainer = CatDogTrainer(initial_epochs=3, model_type='mobile_net')

In [22]:
trainer.loss

<tensorflow.python.keras.losses.BinaryCrossentropy at 0x7f3040c0a550>

In [23]:
trainer.train()

Epoch 1/3
Epoch 2/3
Epoch 3/3


Let's try setting the seed. It looks like it works!
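`tf.random.set_seed` only covers TensorFlow's own random number generation. If other sources of randomness are involved (Python's `random`, NumPy), they need seeding too. A minimal sketch follows; the helper `seed_everything` and its `extra_seeders` parameter are hypothetical names rather than part of any library.

```python
import random


def seed_everything(seed, extra_seeders=()):
    """Seed Python's RNG plus any additional seeding functions passed in."""
    random.seed(seed)
    for set_seed in extra_seeders:
        set_seed(seed)


# In this notebook one could call, for example:
#   seed_everything(42, extra_seeders=[tf.random.set_seed])
seed_everything(42)
first = random.random()
seed_everything(42)
second = random.random()
print(first == second)  # True
```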

In [26]:
tf.random.set_seed(42)

In [27]:
trainer = CatDogTrainer(initial_epochs=1, model_type='mobile_net')

In [28]:
trainer.train()



In [29]:
tf.random.set_seed(42)
trainer = CatDogTrainer(initial_epochs=1, model_type='mobile_net')
trainer.train()



Let's now change `accuracy` to `tf.keras.metrics.Accuracy`. It doesn't work! So we need to be very careful with the `tf.keras` API.

In [39]:
tf.random.set_seed(42)
trainer = CatDogTrainer(initial_epochs=1, model_type='mobile_net')

In [41]:
trainer.metrics

[<tensorflow.python.keras.metrics.BinaryAccuracy at 0x7f2fcc7f72e8>]

In [42]:
trainer.train()



In [14]:
tf.random.set_seed(42)
trainer = CatDogTrainer(initial_epochs=1, model_type='mobile_net')

In [15]:
trainer.metrics

['accuracy']

In [16]:
trainer.train()



In [3]:
tf.random.set_seed(42)
trainer = CatDogTrainer(initial_epochs=1, model_type='mobile_net')

In [4]:
trainer.metrics

[<tensorflow.python.keras.metrics.Accuracy at 0x7f0dfd4f8be0>]

In [5]:
trainer.train()

      1/Unknown - 1s 1s/step

ValueError: Shapes (None, 2) and (None, 1) are incompatible
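The error is consistent with how these metrics differ. Keras resolves the string `'accuracy'` to binary, categorical or sparse categorical accuracy depending on the loss and output shape. `tf.keras.metrics.Accuracy`, by contrast, checks predictions for exact equality with the labels, so it expects already-discrete predictions of a compatible shape. The plain-Python sketch below mirrors the three definitions with hypothetical function names and made-up scores. It is not the Keras code.

```python
def exact_match_accuracy(y_true, y_pred):
    """What `Accuracy` measures: fraction of predictions equal to the labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("labels and predictions must have the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def binary_accuracy(y_true, probabilities, threshold=0.5):
    """Like `BinaryAccuracy`: threshold one probability per example first."""
    predictions = [1 if p > threshold else 0 for p in probabilities]
    return exact_match_accuracy(y_true, predictions)


def sparse_categorical_accuracy(y_true, scores):
    """Like `SparseCategoricalAccuracy`: take the argmax of each score row."""
    predictions = [max(range(len(row)), key=row.__getitem__) for row in scores]
    return exact_match_accuracy(y_true, predictions)


labels = [1, 0, 0, 0, 1]
two_class_scores = [[0.2, 0.8], [0.9, 0.1], [0.4, 0.6], [0.7, 0.3], [0.1, 0.9]]
class_one_probabilities = [row[1] for row in two_class_scores]

print(sparse_categorical_accuracy(labels, two_class_scores))  # 0.8
print(binary_accuracy(labels, class_one_probabilities))       # 0.8
```

For integer labels and a two-unit output, the explicit metric that matches what `'accuracy'` resolves to would be the sparse categorical variant, not `Accuracy`.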

## optimizer and learning rates

# 4 - `resnet50`

Let's now try changing the model to `resnet50`. It looks like everything works even with a different network. But what if we change our optimizer?

In [9]:
tf.random.set_seed(42)
trainer = CatDogTrainer(initial_epochs=3, model_type='resnet50')

In [8]:
trainer.base_model.name

'resnet50v2'

In [10]:
trainer.train()

Epoch 1/3
Epoch 2/3
Epoch 3/3


In [11]:
tf.random.set_seed(42)
trainer = CatDogTrainer(initial_epochs=3, model_type='resnet50')

In [12]:
trainer.metrics

['accuracy']

In [13]:
trainer.train()

Epoch 1/3
Epoch 2/3
Epoch 3/3


In [17]:
tf.random.set_seed(42)
trainer = CatDogTrainer(initial_epochs=10, model_type='resnet50')

In [18]:
trainer.train()

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
