### For practice, I'll take the Simpsons characters dataset and build a classification model.

- There are 42 characters.
- The classes are heavily imbalanced.
- Since the classes have to be balanced anyway, accuracy will be the metric (it is only meaningful once the classes are balanced).

I'll base the model on the MobileNet V2 architecture, since it is fairly fast yet also fairly deep. There is a risk of overfitting it, though, partly because it may be too complex for this small amount of data (each of the 42 classes has roughly 2,000 images). We'll start with the pretrained model, then try unfreezing layers and look at training speed and model quality. We will probably have to unfreeze fairly deep, since the model was trained on ImageNet photos, whereas these are vector-style cartoon images.

In [1]:
import matplotlib.pyplot as plt
import numpy as np
import os
import tensorflow as tf
import time
from random import randint
import pandas as pd
from tqdm import tqdm, tqdm_notebook
from torchvision import transforms
from PIL import Image

## Data preparation

#### First of all, let's balance the data. The algorithm is as follows:
- Go through all the folders and find the maximum number of images in a class.
- Then go through each class in turn and add augmented copies of its existing images until its count equals that of the largest class.

*Each augmented image is physically saved to disk in its class folder.
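
As a rough sketch of the balancing arithmetic described above: the number of augmented copies a class needs is simply its gap to the largest class. The function name and the per-class counts below are made up for illustration and are not taken from the dataset.

```python
def augmentation_plan(counts):
    """Return how many augmented images each class needs to match the largest class."""
    target = max(counts.values())
    return {name: target - n for name, n in counts.items()}


# Hypothetical per-class image counts, for illustration only
counts = {"homer_simpson": 2246, "lisa_simpson": 1354, "lionel_hutz": 3}
plan = augmentation_plan(counts)
print(plan)
```

Very small classes end up consisting almost entirely of augmented copies of a handful of originals, which is worth keeping in mind when reading the validation metrics later.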

In [2]:
class Regularisation:
    def __init__(self,
                 directory='simpsons_dataset/train/'):

        self.dir_path = directory
        self.directory = os.listdir(path=os.path.join(self.dir_path))

        self.min_files = float('inf')
        self.max_files = float('-inf')

    @staticmethod
    def load_image(_file):
        image = Image.open(_file)
        image.load()

        return image, image.size

    def transform_image(self, file_):
        x, size = self.load_image(file_)

        transforms_train = transforms.Compose([
            transforms.RandomRotation(degrees=(-15, 15), expand=True),
            transforms.RandomHorizontalFlip(p=0.5),
            transforms.RandomResizedCrop(size=size, scale=(0.5, 0.95)),
        ])

        x = transforms_train(x)

        x.save(str(file_)[:-4] + str(randint(1770, 7000)) + ".jpg", "JPEG")

    def find_max_min(self):

        df = pd.DataFrame(columns=['character', 'total pics'])

        for index, folder in enumerate(self.directory):

            files_num = len(os.listdir(path=os.path.join(self.dir_path + folder)))

            df.loc[index] = {'character': folder, 'total pics': files_num}

            if files_num > self.max_files:
                self.max_files = files_num

            if files_num < self.min_files:
                self.min_files = files_num

        print(df)

        print('--------'
              '\nmax is {0}'
              '\nmin is {1}'.format(self.max_files, self.min_files))

        time.sleep(5)

    def regularise(self):
        self.find_max_min()

        if self.max_files != self.min_files:

            for folder in tqdm(self.directory):

                files_num = len(os.listdir(path=os.path.join(self.dir_path + folder)))
                files = os.listdir(path=os.path.join(self.dir_path + folder))

                while files_num < self.max_files:

                    for file in files:
                        files_num = len(os.listdir(path=os.path.join(self.dir_path + folder)))

                        if files_num >= self.max_files:
                            break

                        else:
                            path = os.path.join(self.dir_path + folder + '/' + file)
                            self.transform_image(path)

In [None]:
regularisation = Regularisation()
regularisation.regularise()

### Loading the data

To load the data we will use the `tf.keras.utils.image_dataset_from_directory` utility, since we are loading image files from disk.

In [3]:
PATH = 'simpsons_dataset/'
img_dir = os.path.join(PATH, 'train')

BATCH_SIZE = 64
IMG_SIZE = (256, 256)

In [4]:
train_dataset = tf.keras.utils.image_dataset_from_directory(img_dir,
                                                            shuffle=True,
                                                            batch_size=BATCH_SIZE,
                                                            image_size=IMG_SIZE,
                                                            label_mode='categorical',
                                                            validation_split = 0.2,
                                                            subset='training',
                                                             seed=123)

Found 94332 files belonging to 42 classes.
Using 75466 files for training.


In [5]:
validation_dataset = tf.keras.utils.image_dataset_from_directory(img_dir,
                                                                 shuffle=True,
                                                                 batch_size=BATCH_SIZE,
                                                                 image_size=IMG_SIZE,
                                                                 label_mode='categorical',
                                                                 validation_split = 0.2,
                                                                 subset='validation',
                                                                 seed=123)

Found 94332 files belonging to 42 classes.
Using 18866 files for validation.
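
The split sizes in the logs above can be reproduced by hand. This is a minimal sketch that assumes the validation share is truncated to an integer; that is an assumption about the utility's internals, although it does match the logged numbers. `split_sizes` is a hypothetical helper, not part of TensorFlow.

```python
def split_sizes(total_files, validation_split):
    """Return (training, validation) file counts; the validation share is truncated."""
    n_val = int(validation_split * total_files)
    return total_files - n_val, n_val


n_train, n_val = split_sizes(94332, 0.2)
print(n_train, n_val)
```

Both calls use the same `seed` and `validation_split`, which is what keeps the two subsets from overlapping.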


Let's check the images and labels:

In [None]:
class_names = train_dataset.class_names

plt.figure(figsize=(10, 10))
for images, labels in train_dataset.take(1):
    for i in range(9):
        ax = plt.subplot(3, 3, i + 1)
        plt.imshow(images[i].numpy().astype("uint8"))
        plt.title(class_names[np.argmax(labels[i])])
        plt.axis("off")

We can also pinch off a portion of the validation data for the final test of the model. We use `tf.data.experimental.cardinality` to count the batches:

In [None]:
val_batches = tf.data.experimental.cardinality(validation_dataset)
test_dataset = validation_dataset.take(val_batches // 5)
validation_dataset = validation_dataset.skip(val_batches // 5)

In [None]:
print('Number of training batches: %d' % tf.data.experimental.cardinality(train_dataset))
print('Number of validation batches: %d' % tf.data.experimental.cardinality(validation_dataset))
print('Number of test batches: %d' % tf.data.experimental.cardinality(test_dataset))
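
For reference, the batch counts these prints should report can be estimated from the file counts logged earlier. This sketch assumes the last partial batch is kept (so the batch count is rounded up); `batch_counts` is a hypothetical helper.

```python
import math


def batch_counts(n_train, n_val, batch_size, test_share=5):
    """Return (train, validation, test) batch counts after the take/skip split."""
    train_batches = math.ceil(n_train / batch_size)
    val_batches = math.ceil(n_val / batch_size)
    test_batches = val_batches // test_share  # mirrors val_batches // 5
    return train_batches, val_batches - test_batches, test_batches


train_b, val_b, test_b = batch_counts(75466, 18866, batch_size=64)
print(train_b, val_b, test_b)
```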

### A bit of magic to speed up loading

Prefetching optimizes how data is fed to the model during training: the next batches are prepared while the current one is being processed.

In [None]:
AUTOTUNE = tf.data.AUTOTUNE

train_dataset = train_dataset.prefetch(buffer_size=AUTOTUNE)
validation_dataset = validation_dataset.prefetch(buffer_size=AUTOTUNE)
test_dataset = test_dataset.prefetch(buffer_size=AUTOTUNE)

### Standardization

Since the model was originally trained on inputs in the range [-1, 1], our images must be brought to the same range. In general, when using pretrained models, always check first what input range they expect (`[-1, 1]` or `[0, 1]`).

In [None]:
preprocess_input = tf.keras.applications.mobilenet_v2.preprocess_input

A simple `tf.keras.layers.Rescaling` layer can also be used for this.

In [None]:
rescale = tf.keras.layers.Rescaling(1./127.5, offset=-1)
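
Both options apply the same per-pixel mapping from [0, 255] to [-1, 1]. A plain-Python sketch of that formula (the function name `rescale_pixel` is only illustrative):

```python
def rescale_pixel(pixel):
    """Map a pixel value from [0, 255] to [-1, 1]: pixel / 127.5 - 1."""
    return pixel / 127.5 - 1.0


values = [rescale_pixel(p) for p in (0, 127.5, 255)]
print(values)
```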

## Loading the pretrained model

If I understand correctly, the usual approach is as follows: load a pretrained model, replace the final classification layer,
and then fine-tune some portion of the layers as you see fit. That is what we'll do: load the model without the classification layer (`include_top=False`) and specify that the weights should come from training on ImageNet (`weights='imagenet'`).

This way, we use the pretrained model to extract features from our images and classify those features ourselves.

In [None]:
# Load the pretrained MobileNet V2 base model; don't forget to specify the input image size.
IMG_SHAPE = IMG_SIZE + (3,)
base_model = tf.keras.applications.MobileNetV2(input_shape=IMG_SHAPE,
                                               include_top=False,
                                               weights='imagenet')

###### That is, a `256x256x3` input image becomes an `8x8x1280` feature tensor.

In [None]:
# Let's look at the shape to make sure
image_batch, label_batch = next(iter(train_dataset))
feature_batch = base_model(image_batch)
print(feature_batch.shape)
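
The `8x8x1280` shape follows from MobileNetV2 downsampling its input by a total factor of 32 and ending with 1280 channels. A small sketch of that arithmetic; the helper name is hypothetical, and rounding up for sizes that are not multiples of 32 is an assumption.

```python
import math


def mobilenet_v2_feature_shape(height, width, total_stride=32, channels=1280):
    """Estimate the feature-map shape produced by the headless MobileNetV2 base."""
    return (math.ceil(height / total_stride), math.ceil(width / total_stride), channels)


shape_256 = mobilenet_v2_feature_shape(256, 256)
print(shape_256)
```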

## Baseline

So, the baseline: we simply extract features and classify them with our own classifier.

### Freezing the pretrained weights

It is important to do this before compiling. To freeze specific layers, iterate over the layers and set `layer.trainable = False`. But since the baseline freezes all layers, we simply write `base_model.trainable = False`.


In [None]:
base_model.trainable = False

### Important note about BatchNormalization layers

Many models contain `tf.keras.layers.BatchNormalization` layers. This layer is a special case and precautions should be taken in the context of fine-tuning, as shown later in this tutorial. 

When you set `layer.trainable = False`, the `BatchNormalization` layer will run in inference mode, and will not update its mean and variance statistics. 

When you unfreeze a model that contains BatchNormalization layers in order to do fine-tuning, you should keep the BatchNormalization layers in inference mode by passing `training = False` when calling the base model. Otherwise, the updates applied to the non-trainable weights will destroy what the model has learned.

For more details, see the [Transfer learning guide](https://www.tensorflow.org/guide/keras/transfer_learning).
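
To make the difference between the two modes concrete, here is a toy NumPy sketch of batch normalization. It is not the Keras implementation: `TinyBatchNorm` is a hypothetical class without the learned scale and shift, and the moving-average update rule is the conventional one.

```python
import numpy as np


class TinyBatchNorm:
    """Toy batch normalization over the feature axis, for illustration only."""

    def __init__(self, num_features, momentum=0.99, eps=1e-3):
        self.moving_mean = np.zeros(num_features)
        self.moving_var = np.ones(num_features)
        self.momentum = momentum
        self.eps = eps

    def __call__(self, x, training):
        if training:
            # Training mode: normalize with batch statistics and update the moving averages
            mean, var = x.mean(axis=0), x.var(axis=0)
            self.moving_mean = self.momentum * self.moving_mean + (1 - self.momentum) * mean
            self.moving_var = self.momentum * self.moving_var + (1 - self.momentum) * var
        else:
            # Inference mode: use the stored statistics and leave them untouched
            mean, var = self.moving_mean, self.moving_var
        return (x - mean) / np.sqrt(var + self.eps)


bn = TinyBatchNorm(num_features=2)
batch = np.array([[10.0, 0.0], [14.0, 2.0]])

bn(batch, training=False)
mean_after_inference = bn.moving_mean.copy()  # still the initial zeros

bn(batch, training=True)
mean_after_training = bn.moving_mean.copy()   # nudged towards the batch mean

print(mean_after_inference, mean_after_training)
```

In inference mode the stored statistics stay exactly as the pretrained model left them, which is the behaviour `training=False` preserves during fine-tuning.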

In [None]:
# Let's take a look at the base model architecture
base_model.summary()

### Add a classification head

To generate predictions from the block of features, average over the `8x8` spatial locations, using a `tf.keras.layers.GlobalAveragePooling2D` layer to convert the features to a single 1280-element vector per image.

In [None]:
global_average_layer = tf.keras.layers.GlobalAveragePooling2D()
feature_batch_average = global_average_layer(feature_batch)
print(feature_batch_average.shape)
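
Global average pooling is just a mean over the two spatial axes. A NumPy sketch on random stand-in features (the batch of 4 and the random values are assumptions for illustration):

```python
import numpy as np

# Stand-in for a batch of 4 feature maps of shape 8x8x1280
features = np.random.default_rng(0).normal(size=(4, 8, 8, 1280))

# Average over height and width, keeping the batch and channel axes
pooled = features.mean(axis=(1, 2))
print(pooled.shape)
```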

Apply a `tf.keras.layers.Dense` layer to convert these features into 42 values per image, one per class. You don't need an activation function here because these values will be treated as logits, or raw prediction values. The class with the largest logit is the predicted class.

In [None]:
prediction_layer = tf.keras.layers.Dense(42)
prediction_batch = prediction_layer(feature_batch_average)
print(prediction_batch.shape)

Build a model by chaining together the preprocessing, `base_model`, pooling, dropout and prediction layers using the [Keras Functional API](https://www.tensorflow.org/guide/keras/functional). The data augmentation layer is left commented out. As previously mentioned, use `training=False` when calling the base model, as it contains `BatchNormalization` layers.

In [None]:
inputs = tf.keras.Input(shape=(256, 256, 3))
#x = data_augmentation(inputs)
x = preprocess_input(inputs)
x = base_model(x, training=False)
x = global_average_layer(x)
x = tf.keras.layers.Dropout(0.2)(x)
outputs = prediction_layer(x)
model = tf.keras.Model(inputs, outputs)

### Compile the model

Compile the model before training it. Since there are 42 classes with one-hot labels (`label_mode='categorical'`), use the `tf.keras.losses.CategoricalCrossentropy` loss with `from_logits=True`, since the model provides a linear output.

In [None]:
base_learning_rate = 0.0001
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=base_learning_rate),
              loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
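
As a sanity check on what this loss computes, here is a plain-Python sketch of categorical cross-entropy from logits for a single example (the function name is illustrative, not a TensorFlow API). A head that scores all 42 classes equally gives a loss of ln(42), roughly 3.74, a useful reference point for the initial loss; the actual initial value depends on the random initialization.

```python
import math


def categorical_crossentropy_from_logits(logits, one_hot):
    """Cross-entropy between a one-hot target and softmax(logits), computed stably."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    # log softmax(z_i) = z_i - log_sum_exp
    return -sum(t * (z - log_sum_exp) for z, t in zip(logits, one_hot))


num_classes = 42
uniform_logits = [0.0] * num_classes
target = [0.0] * num_classes
target[7] = 1.0  # arbitrary true class

loss_uniform = categorical_crossentropy_from_logits(uniform_logits, target)
print(loss_uniform)
```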

In [None]:
model.summary()

The roughly 2.3 million parameters in MobileNetV2 are frozen, but there are about 53.8 thousand _trainable_ parameters in the Dense layer (1280 × 42 weights plus 42 biases). These are divided between two `tf.Variable` objects, the weights and biases.
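
The size of the classification head can be checked by hand: a dense layer has one weight per input-output pair plus one bias per output. A short sketch (the helper name is illustrative):

```python
def dense_param_count(inputs, units):
    """Number of parameters in a fully connected layer: weights plus biases."""
    return inputs * units + units


head_params = dense_param_count(1280, 42)
print(head_params)
```

The pooling and dropout layers add no parameters, so this is the whole trainable part of the baseline model.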

In [None]:
len(model.trainable_variables)

### Train the model

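Evaluate the untrained model first to get a baseline, then train the classification head for 10 epochs. The ~94% validation accuracy quoted for this step comes from a two-class cats-vs-dogs setup, so treat it as a rough reference rather than an expected result for 42 classes.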


In [None]:
initial_epochs = 10

loss0, accuracy0 = model.evaluate(validation_dataset)

In [None]:
print("initial loss: {:.2f}".format(loss0))
print("initial accuracy: {:.2f}".format(accuracy0))


In [None]:
history = model.fit(train_dataset,
                    epochs=initial_epochs,
                    validation_data=validation_dataset)

### Learning curves

Let's take a look at the learning curves of the training and validation accuracy/loss when using the MobileNetV2 base model as a fixed feature extractor.

In [None]:
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']


loss = history.history['loss']
val_loss = history.history['val_loss']

plt.figure(figsize=(8, 8))
plt.subplot(2, 1, 1)
plt.plot(acc, label='Training Accuracy')
plt.plot(val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.ylabel('Accuracy')
plt.ylim([min(plt.ylim()),1])
plt.title('Training and Validation Accuracy')

plt.subplot(2, 1, 2)
plt.plot(loss, label='Training Loss')
plt.plot(val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.ylabel('Cross Entropy')
plt.ylim([0,1.0])
plt.title('Training and Validation Loss')
plt.xlabel('epoch')
plt.show()



Note: If you are wondering why the validation metrics are clearly better than the training metrics, the main factor is because layers like `tf.keras.layers.BatchNormalization` and `tf.keras.layers.Dropout` affect accuracy during training. They are turned off when calculating validation loss.

To a lesser extent, it is also because training metrics report the average for an epoch, while validation metrics are evaluated after the epoch, so validation metrics see a model that has trained slightly longer.

## Fine tuning
In the feature extraction experiment, you were only training a few layers on top of a MobileNetV2 base model. The weights of the pre-trained network were **not** updated during training.

One way to increase performance even further is to train (or "fine-tune") the weights of the top layers of the pre-trained model alongside the training of the classifier you added. The training process will force the weights to be tuned from generic feature maps to features associated specifically with the dataset.

Note: This should only be attempted after you have trained the top-level classifier with the pre-trained model set to non-trainable. If you add a randomly initialized classifier on top of a pre-trained model and attempt to train all layers jointly, the magnitude of the gradient updates will be too large (due to the random weights from the classifier) and your pre-trained model will forget what it has learned.

Also, you should try to fine-tune a small number of top layers rather than the whole MobileNet model. In most convolutional networks, the higher up a layer is, the more specialized it is. The first few layers learn very simple and generic features that generalize to almost all types of images. As you go higher up, the features are increasingly more specific to the dataset on which the model was trained. The goal of fine-tuning is to adapt these specialized features to work with the new dataset, rather than overwrite the generic learning.

### Un-freeze the top layers of the model


All you need to do is unfreeze the `base_model` and set the bottom layers to be un-trainable. Then, you should recompile the model (necessary for these changes to take effect), and resume training.

In [None]:
base_model.trainable = True

In [None]:
# Let's take a look to see how many layers are in the base model
print("Number of layers in the base model: ", len(base_model.layers))

# Fine-tune from this layer onwards
fine_tune_at = 70

# Freeze all the layers before the `fine_tune_at` layer
for layer in base_model.layers[:fine_tune_at]:
    layer.trainable = False

### Compile the model

As you are training a much larger model and want to readapt the pretrained weights, it is important to use a lower learning rate at this stage. Otherwise, your model could overfit very quickly.

In [None]:
model.compile(loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
              optimizer = tf.keras.optimizers.RMSprop(learning_rate=base_learning_rate/10),
              metrics=['accuracy'])

In [None]:
model.summary()

In [None]:
len(model.trainable_variables)

### Continue training the model

If you trained to convergence earlier, this step will improve your accuracy by a few percentage points.

In [None]:
fine_tune_epochs = 30
total_epochs =  initial_epochs + fine_tune_epochs

history_fine = model.fit(train_dataset,
                         epochs=total_epochs,
                         initial_epoch=history.epoch[-1],
                         validation_data=validation_dataset)
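
A note on the epoch bookkeeping in this call. The sketch below assumes `fit` iterates over zero-based epoch indices from `initial_epoch` up to, but not including, `epochs`, and that the first phase ran all 10 epochs. `history_epoch` is a stand-in for `history.epoch`.

```python
initial_epochs = 10
fine_tune_epochs = 30
total_epochs = initial_epochs + fine_tune_epochs

history_epoch = list(range(initial_epochs))  # stand-in for history.epoch: [0, ..., 9]
initial_epoch = history_epoch[-1]            # index of the last completed epoch

epochs_run = list(range(initial_epoch, total_epochs))
print(len(epochs_run), epochs_run[0], epochs_run[-1])
```

Under these assumptions the fine-tuning phase runs 31 epochs rather than 30. Passing `len(history.epoch)` as `initial_epoch` would give exactly `fine_tune_epochs`.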

Let's take a look at the learning curves of the training and validation accuracy/loss when fine-tuning the last few layers of the MobileNetV2 base model and training the classifier on top of it. The validation loss is much higher than the training loss, so you may get some overfitting.

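You may also get some overfitting as the training set is relatively small for a network of this size. Note that these cartoon images are not very similar to the ImageNet photos MobileNetV2 was trained on.


The figure of nearly 98% validation accuracy after fine-tuning comes from a two-class cats-vs-dogs setup; check the curves below for the actual result on this dataset.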

In [None]:
acc += history_fine.history['accuracy']
val_acc += history_fine.history['val_accuracy']

loss += history_fine.history['loss']
val_loss += history_fine.history['val_loss']

In [None]:
plt.figure(figsize=(8, 8))
plt.subplot(2, 1, 1)
plt.plot(acc, label='Training Accuracy')
plt.plot(val_acc, label='Validation Accuracy')
plt.ylim([0.8, 1])
plt.plot([initial_epochs-1,initial_epochs-1],
          plt.ylim(), label='Start Fine Tuning')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')

plt.subplot(2, 1, 2)
plt.plot(loss, label='Training Loss')
plt.plot(val_loss, label='Validation Loss')
plt.ylim([0, 1.0])
plt.plot([initial_epochs-1,initial_epochs-1],
         plt.ylim(), label='Start Fine Tuning')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.xlabel('epoch')
plt.show()

### Evaluation and prediction

Finally, you can verify the performance of the model on new data using the test set.

In [None]:
loss, accuracy = model.evaluate(test_dataset)
print('Test accuracy :', accuracy)

And now you are all set to use this model to predict which Simpsons character appears in an image.

In [None]:
# Retrieve a batch of images from the test set
image_batch, label_batch = test_dataset.as_numpy_iterator().next()
logits = model.predict_on_batch(image_batch)

# The model returns logits: softmax gives class probabilities, argmax the predicted class
probabilities = tf.nn.softmax(logits, axis=-1).numpy()
predictions = np.argmax(probabilities, axis=-1)
true_labels = np.argmax(label_batch, axis=-1)  # labels are one-hot encoded

print('Predictions:\n', predictions)
print('Labels:\n', true_labels)

plt.figure(figsize=(10, 10))
for i in range(30):
    ax = plt.subplot(3, 10, i + 1)
    plt.imshow(image_batch[i].astype("uint8"))
    plt.title(class_names[predictions[i]])
    plt.axis("off")

In [None]:
label_batch

In [None]:
from tensorflow.keras.preprocessing import image

In [None]:
t = image.load_img('test.jpg', target_size = IMG_SIZE)
t = image.img_to_array(t)
tt = []
tt.append(t)
tt = np.array(tt)

In [None]:
tt.shape

In [None]:
p = model.predict(tt)

In [None]:
# The model outputs logits, so pick the class with the largest value
class_names[np.argmax(p[0])]
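
For a single image it is often more informative to look at the few most likely classes than at a threshold on the raw outputs. A NumPy sketch, with a hypothetical helper `top_k_classes` and made-up logits and class names:

```python
import numpy as np


def top_k_classes(logits, class_names, k=3):
    """Return the k most probable (class name, probability) pairs for one image."""
    logits = np.asarray(logits, dtype=float)
    exp = np.exp(logits - logits.max())  # subtract the max for numerical stability
    probs = exp / exp.sum()
    order = np.argsort(probs)[::-1][:k]
    return [(class_names[i], float(probs[i])) for i in order]


# Made-up logits and names, for illustration only
names = ["bart_simpson", "homer_simpson", "lisa_simpson", "marge_simpson"]
top = top_k_classes([0.5, 3.0, -1.0, 1.5], names, k=2)
print(top)
```

With the notebook's objects this would be called as `top_k_classes(p[0], class_names)`.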

In [None]:
os.listdir(img_dir)

## Summary

* **Using a pre-trained model for feature extraction**:  When working with a small dataset, it is a common practice to take advantage of features learned by a model trained on a larger dataset in the same domain. This is done by instantiating the pre-trained model and adding a fully-connected classifier on top. The pre-trained model is "frozen" and only the weights of the classifier get updated during training.
In this case, the convolutional base extracted all the features associated with each image and you just trained a classifier that determines the image class given that set of extracted features.

* **Fine-tuning a pre-trained model**: To further improve performance, one might want to repurpose the top-level layers of the pre-trained models to the new dataset via fine-tuning.
In this case, you tuned your weights such that your model learned high-level features specific to the dataset. This technique is usually recommended when the training dataset is large and very similar to the original dataset that the pre-trained model was trained on.

To learn more, visit the [Transfer learning guide](https://www.tensorflow.org/guide/keras/transfer_learning).
