# Welcome
### Welcome to a fun adventure with TPUs and flower classification 🌻🌸🌹.
In this fun notebook we will go step by step and create a deep learning model to perform flower classification on 104 different species!

<div class="alert alert-block alert-info"> 📌 If you find this notebook interesting, please upvote it. It means a lot to me and keeps me motivated to improve it, as shown in my long list of items below 😀</div><br>  

This notebook builds heavily on Ryan's awesome notebook as part of the Deep Learning Kaggle course on Computer Vision at: https://www.kaggle.com/ryanholbrook/create-your-first-submission

I explore 25 combinations of transfer learning and hyperparameter tuning and compare them by validation loss and accuracy (and where each was at its best), F1 score, precision, and recall.
Look out for my personal notes, labeled as Note ... 😀, as we take this journey together...

My Personal Plan of Action:
- Understand how TPUs work and how to use them ✅
- Explore transfer learning with 10+ models pretrained on either imagenet or noisy-student and evaluate their performance ✅
- Explore training large CNN models from scratch and evaluate their performance ✅
- Explore 10+ hyperparameter tuning methods and evaluate their performance ✅
- Explore 25+ combinations of models and tuning methods above and evaluate their performance ✅
- Ensemble models with loaded weights and evaluate their performance ✅
- Build a great looking visualization that captures and highlights model + tuning performance ✅
- Be generous with comments, either as markdown or in code so anyone can follow along ✅
- Respond to each comment on this notebook and learn from each other ✅
- Meet and interact with Kagglers online ✅
- Reach out and thank other Kagglers for their amazing notebooks, share feedback and learn from each other ✅
- Promote and share our cool online community [Deep Learning Adventures](https://www.meetup.com/Deep-Learning-Adventures) ✅
- Have fun while building cool models ✅
- Climb up the leaderboard with new explorations ✅
- Plot model performance report in 3D 😎 ✅
- Start and maintain [a discussion thread around this notebook](https://www.kaggle.com/c/tpu-getting-started/discussion/209865) ✅
- Visualize different data augmentation methods ✅
- Visualize incorrect predictions ✅
- Explore Test Time Augmentation (TTA) ✅
- Find and use more training data, increasing the variety our models are exposed to from 12K training images to 68K ✅
- Convert a few models from .h5 to TensorFlow Lite using the [TensorFlow Lite converter](https://www.tensorflow.org/lite/convert) ✅
- Deploy a few models on mobile and IoT devices
- Explore other image dimensions, such as 224x224
- Explore models from TF Hub or Model Garden
- Explore mixed precision
- Explore ELI5 and model explainability
- Read a paper or two on different data augmentation methods for computer vision
- Earn 5+ votes/bronze medal for this notebook 🥉 ✅
- Earn 20+ votes/silver medal for this notebook 🥈 ✅
- Earn 50+ votes/gold medal for this notebook - your help is needed here 😀⬅️⬅️⬅️

In [None]:
from IPython.display import Image
Image(filename="../input/images/Petals to the Metal 31.png", width=1200, height=1000)

In [None]:
Image(filename="../input/images/Petals to the Metal 32.png", width=1200, height=1000)

In [None]:
Image(filename="../input/images/Petals to the Metal 28.png", width=1200, height=1000)

In [None]:
Image(filename="../input/images/3d_plot3.png", width=1200, height=1000)

In [None]:
Image(filename="../input/images/3d_plot4.png", width=1200, height=1000)

In [None]:
Image(filename="../input/images/3d_plot5.png", width=1200, height=1000)

In [None]:
Image(filename="../input/images/Petals to the Metal 10.png", width=1200, height=1000) 

In [None]:
#Weights for 12K images
Image(filename="../input/images/Petals to the Metal 25.png", width=1200, height=1000) 

In [None]:
#Weights for 70K images
Image(filename="../input/images/Petals to the Metal 26.png", width=1200, height=1000) 

In [None]:
Image(filename="../input/images/Petals to the Metal 22.png", width=1200, height=1000) 

If you would like to learn more as well as join a larger data science community 🎉, feel free to join us at: https://www.meetup.com/Deep-Learning-Adventures/events/275438349
All our sessions are recorded 😃 and available on YouTube at: http://bit.ly/dla-kaggle-courses

In [None]:
Image(filename="../input/images/Deep Learning Adventures.png", width=1200, height=1000) 


# Introduction #

Welcome to the [**Petals to the Metal**](https://www.kaggle.com/c/tpu-getting-started) competition! In this competition, you're challenged to build a machine learning model to classify 104 types of flowers based on their images.

In this tutorial notebook, you'll learn how to build an image classifier in Keras and train it on a [Tensor Processing Unit (TPU)](https://www.kaggle.com/docs/tpu). At the end, you'll have a complete project you can build off of with ideas of your own.

<blockquote style="margin-right:auto; margin-left:auto; background-color: #ebf9ff; padding: 1em; margin:24px;">
    <strong>Fork This Notebook!</strong><br>
Create your own editable copy of this notebook by clicking on the <strong>Copy and Edit</strong> button in the top right corner.
</blockquote>

## Note 0 😀
### Upgrade TensorFlow for improved performance and the ability to transfer learn from more models in tf.keras.applications
### ⚠️ Issue: Code will need to be refactored due to this upgrade ⚠️

In [None]:
#!pip install -U tensorflow==2.4.0

# Step 1: Imports #

We begin by importing several Python packages.

In [None]:
!pip install seaborn --upgrade
import seaborn as sns

import matplotlib.pyplot as plt
from matplotlib import cm
import math, re, os
import pandas as pd
import numpy as np
import random
import plotly.express as px

import tensorflow as tf
print("TensorFlow version " + tf.__version__)

# Step 2: Distribution Strategy #

A TPU has eight different *cores* and each of these cores acts as its own accelerator. (A TPU is sort of like having eight GPUs in one machine.) We tell TensorFlow how to make use of all these cores at once through a **distribution strategy**. Run the following cell to create the distribution strategy that we'll later apply to our model.

## Note 1 😀
### 1. TPUs are network-connected accelerators and you must first locate them on the network. This is what TPUClusterResolver() does.

### 2. Two additional lines of boilerplate and you can define a TPUStrategy. This object contains the necessary distributed training code that will work on TPUs with their 8 compute cores (see the hardware section of the source below).

### 3. Finally, you use the TPUStrategy by instantiating your model in the scope of the strategy. This creates the model on the TPU. Model size is constrained by the TPU RAM only, not by the amount of memory available on the VM running your Python code. Model creation and model training use the usual Keras APIs.

### Source: https://www.kaggle.com/docs/tpu

In [None]:
# Detect TPU, return appropriate distribution strategy
try:
    tpu = tf.distribute.cluster_resolver.TPUClusterResolver() #See Note 1.1 above 😀
    print('Running on TPU ', tpu.master())
except ValueError:
    tpu = None

if tpu:
    tf.config.experimental_connect_to_cluster(tpu) #See Note 1.2 above 😀
    tf.tpu.experimental.initialize_tpu_system(tpu) #See Note 1.2 above 😀
    strategy = tf.distribute.experimental.TPUStrategy(tpu) #See Note 1.3 above 😀
else:
    strategy = tf.distribute.get_strategy() 

print("REPLICAS: ", strategy.num_replicas_in_sync)

We'll use the distribution strategy when we create our neural network model. Then, TensorFlow will distribute the training among the eight TPU cores by creating eight different *replicas* of the model, one for each core.
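
To make the idea of replicas concrete, here is a minimal pure-Python sketch of synchronous data parallelism. It is a conceptual illustration only, not TensorFlow's implementation: the toy model `y = w * x`, the helper `grad_mse`, and the data are all made up for this example. Each of the 8 "replicas" computes a gradient on its own equal-sized slice of the batch, and averaging those gradients gives the same result as computing the gradient on the whole batch at once.

```python
# Conceptual sketch: 8 replicas, each working on an equal shard of one global batch.
# The model (y = w * x) and all names here are hypothetical, chosen for illustration.

def grad_mse(w, xs, ys):
    """Gradient of mean((w*x - y)^2) with respect to w."""
    return sum(2 * x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)

NUM_REPLICAS = 8
xs = [float(i) for i in range(16)]   # global batch of 16 examples
ys = [3.0 * x + 1.0 for x in xs]
w = 0.5

shard = len(xs) // NUM_REPLICAS      # 2 examples per replica
replica_grads = [
    grad_mse(w, xs[i * shard:(i + 1) * shard], ys[i * shard:(i + 1) * shard])
    for i in range(NUM_REPLICAS)
]

# Synchronous training combines the per-replica gradients before updating the weights.
synced_grad = sum(replica_grads) / NUM_REPLICAS
full_batch_grad = grad_mse(w, xs, ys)

print("averaged replica gradient:", synced_grad)
print("full-batch gradient:      ", full_batch_grad)
```

Because the shards are the same size, the two numbers agree up to floating-point rounding, which is why all replicas can stay in sync with a single copy of the weights.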

# Step 3: Loading the Competition Data #

## Get GCS Path ##

When used with TPUs, datasets need to be stored in a [Google Cloud Storage bucket](https://cloud.google.com/storage/). You can use data from any public GCS bucket by giving its path just like you would data from `'/kaggle/input'`. The following will retrieve the GCS path for this competition's dataset.

In [None]:
from kaggle_datasets import KaggleDatasets

GCS_DS_PATH = KaggleDatasets().get_gcs_path('tpu-getting-started')
print(GCS_DS_PATH) # what do gcs paths look like?

You can use data from any public dataset here on Kaggle in just the same way. If you'd like to use data from one of your private datasets, see [here](https://www.kaggle.com/docs/tpu#tpu3pt5).

## Load Data ##

When used with TPUs, datasets are often serialized into [TFRecords](https://www.kaggle.com/ryanholbrook/tfrecords-basics). This is a format convenient for distributing data to each of the TPU's cores. We've hidden the cell that reads the TFRecords for our dataset since the process is a bit long. You could come back to it later for some guidance on using your own datasets with TPUs.

## Note 2 😀
### 1. TPUs are equipped with 128GB of high-speed memory allowing larger batches, larger models and also larger training inputs. In the code below, we can use 512x512 px input images, also provided in the dataset, and see the TPU v3-8 handle them easily.
### 2. num_parallel_reads=AUTO instructs the API to read from multiple files if available. It figures out how many automatically.
### 3. experimental_deterministic = False disables data order enforcement. We will be shuffling the data anyway so order is not important. With this setting the API can use any TFRecord as soon as it is streamed in.
### Source: https://www.kaggle.com/docs/tpu

In [None]:
IMAGE_SIZE = [512, 512] #See Note 2.1 above 😀

GCS_PATH = GCS_DS_PATH + '/tfrecords-jpeg-512x512'
AUTO = tf.data.experimental.AUTOTUNE #See Note 2.2 above 😀

TRAINING_FILENAMES = tf.io.gfile.glob(GCS_PATH + '/train/*.tfrec')
VALIDATION_FILENAMES = tf.io.gfile.glob(GCS_PATH + '/val/*.tfrec')
TEST_FILENAMES = tf.io.gfile.glob(GCS_PATH + '/test/*.tfrec') 

CLASSES = ['pink primrose',    'hard-leaved pocket orchid', 'canterbury bells', 'sweet pea',     'wild geranium',     'tiger lily',           'moon orchid',              'bird of paradise', 'monkshood',        'globe thistle',         # 00 - 09
           'snapdragon',       "colt's foot",               'king protea',      'spear thistle', 'yellow iris',       'globe-flower',         'purple coneflower',        'peruvian lily',    'balloon flower',   'giant white arum lily', # 10 - 19
           'fire lily',        'pincushion flower',         'fritillary',       'red ginger',    'grape hyacinth',    'corn poppy',           'prince of wales feathers', 'stemless gentian', 'artichoke',        'sweet william',         # 20 - 29
           'carnation',        'garden phlox',              'love in the mist', 'cosmos',        'alpine sea holly',  'ruby-lipped cattleya', 'cape flower',              'great masterwort', 'siam tulip',       'lenten rose',           # 30 - 39
           'barberton daisy',  'daffodil',                  'sword lily',       'poinsettia',    'bolero deep blue',  'wallflower',           'marigold',                 'buttercup',        'daisy',            'common dandelion',      # 40 - 49
           'petunia',          'wild pansy',                'primula',          'sunflower',     'lilac hibiscus',    'bishop of llandaff',   'gaura',                    'geranium',         'orange dahlia',    'pink-yellow dahlia',    # 50 - 59
           'cautleya spicata', 'japanese anemone',          'black-eyed susan', 'silverbush',    'californian poppy', 'osteospermum',         'spring crocus',            'iris',             'windflower',       'tree poppy',            # 60 - 69
           'gazania',          'azalea',                    'water lily',       'rose',          'thorn apple',       'morning glory',        'passion flower',           'lotus',            'toad lily',        'anthurium',             # 70 - 79
           'frangipani',       'clematis',                  'hibiscus',         'columbine',     'desert-rose',       'tree mallow',          'magnolia',                 'cyclamen ',        'watercress',       'canna lily',            # 80 - 89
           'hippeastrum ',     'bee balm',                  'pink quill',       'foxglove',      'bougainvillea',     'camellia',             'mallow',                   'mexican petunia',  'bromelia',         'blanket flower',        # 90 - 99
           'trumpet creeper',  'blackberry lily',           'common tulip',     'wild rose']                                                                                                                                               # 100 - 103


def decode_image(image_data):
    image = tf.image.decode_jpeg(image_data, channels=3)
    image = tf.cast(image, tf.float32) / 255.0  # convert image to floats in [0, 1] range
    image = tf.reshape(image, [*IMAGE_SIZE, 3]) # explicit size needed for TPU
    return image

def read_labeled_tfrecord(example):
    LABELED_TFREC_FORMAT = {
        "image": tf.io.FixedLenFeature([], tf.string), # tf.string means bytestring
        "class": tf.io.FixedLenFeature([], tf.int64),  # shape [] means single element
    }
    example = tf.io.parse_single_example(example, LABELED_TFREC_FORMAT)
    image = decode_image(example['image'])
    label = tf.cast(example['class'], tf.int32)
    return image, label # returns a dataset of (image, label) pairs

def read_unlabeled_tfrecord(example):
    UNLABELED_TFREC_FORMAT = {
        "image": tf.io.FixedLenFeature([], tf.string), # tf.string means bytestring
        "id": tf.io.FixedLenFeature([], tf.string),  # shape [] means single element
        # class is missing, this competition's challenge is to predict flower classes for the test dataset
    }
    example = tf.io.parse_single_example(example, UNLABELED_TFREC_FORMAT)
    image = decode_image(example['image'])
    idnum = example['id']
    return image, idnum # returns a dataset of image(s)

def load_dataset(filenames, labeled=True, ordered=False):
    # Read from TFRecords. For optimal performance, reading from multiple files at once and
    # disregarding data order. Order does not matter since we will be shuffling the data anyway.

    ignore_order = tf.data.Options()
    if not ordered:
        ignore_order.experimental_deterministic = False # disable order, increase speed. #See Note 2.3 above 😀

    dataset = tf.data.TFRecordDataset(filenames, num_parallel_reads=AUTO) # automatically interleaves reads from multiple files. #See Note 2.2 above 😀
    dataset = dataset.with_options(ignore_order) # uses data as soon as it streams in, rather than in its original order
    dataset = dataset.map(read_labeled_tfrecord if labeled else read_unlabeled_tfrecord, num_parallel_calls=AUTO)
    # returns a dataset of (image, label) pairs if labeled=True or (image, id) pairs if labeled=False
    return dataset

## Note 9 😀
## Use additional data, tuning6, private dataset
## Inspired by Dmitry's notebook [here](https://www.kaggle.com/dmitrynokhrin/densenet201-aug-additional-data) and Araik's notebook [here](https://www.kaggle.com/atamazian/fc-ensemble-external-data-effnet-densenet)
## See also [external data and how to use them](https://www.kaggle.com/c/flower-classification-with-tpus/discussion/140866) and [Kirill's tf_flower_photo_tfrec dataset](https://www.kaggle.com/kirillblinov/tf-flower-photo-tfrec)

In [None]:
GCS_DS_PATH_EXT = KaggleDatasets().get_gcs_path('tf-flower-photo-tfrec')

# External data
GCS_PATH_SELECT_EXT = {
    192: '/tfrecords-jpeg-192x192',
    224: '/tfrecords-jpeg-224x224',
    331: '/tfrecords-jpeg-331x331',
    512: '/tfrecords-jpeg-512x512'
}
GCS_PATH_EXT = GCS_PATH_SELECT_EXT[IMAGE_SIZE[0]]

IMAGENET_FILES = tf.io.gfile.glob(GCS_DS_PATH_EXT + '/imagenet' + GCS_PATH_EXT + '/*.tfrec')
INATURELIST_FILES = tf.io.gfile.glob(GCS_DS_PATH_EXT + '/inaturalist' + GCS_PATH_EXT + '/*.tfrec')
OPENIMAGE_FILES = tf.io.gfile.glob(GCS_DS_PATH_EXT + '/openimage' + GCS_PATH_EXT + '/*.tfrec')
OXFORD_FILES = tf.io.gfile.glob(GCS_DS_PATH_EXT + '/oxford_102' + GCS_PATH_EXT + '/*.tfrec')
TENSORFLOW_FILES = tf.io.gfile.glob(GCS_DS_PATH_EXT + '/tf_flowers' + GCS_PATH_EXT + '/*.tfrec')

ADDITIONAL_TRAINING_FILENAMES = IMAGENET_FILES + INATURELIST_FILES + OPENIMAGE_FILES + OXFORD_FILES + TENSORFLOW_FILES  

TRAINING_FILENAMES = TRAINING_FILENAMES + ADDITIONAL_TRAINING_FILENAMES

## Note 8 😀
## Perform data augmentation, tuning4
## Inspired by Dmitry's notebook [here](https://www.kaggle.com/dmitrynokhrin/densenet201-aug-additional-data)

In [None]:
#tuning4
SEED = 2020

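def random_blockout(img, sl=0.1, sh=0.2, rl=0.4):
    # Caution: random.random() is plain Python, so inside dataset.map() it runs once,
    # when the function is traced, rather than once per image.
    p=random.random()
    if p>=0.25: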
        w, h, c = IMAGE_SIZE[0], IMAGE_SIZE[1], 3
        origin_area = tf.cast(h*w, tf.float32)

        e_size_l = tf.cast(tf.round(tf.sqrt(origin_area * sl * rl)), tf.int32)
        e_size_h = tf.cast(tf.round(tf.sqrt(origin_area * sh / rl)), tf.int32)

        e_height_h = tf.minimum(e_size_h, h)
        e_width_h = tf.minimum(e_size_h, w)

        erase_height = tf.random.uniform(shape=[], minval=e_size_l, maxval=e_height_h, dtype=tf.int32)
        erase_width = tf.random.uniform(shape=[], minval=e_size_l, maxval=e_width_h, dtype=tf.int32)

        erase_area = tf.zeros(shape=[erase_height, erase_width, c])
        erase_area = tf.cast(erase_area, tf.uint8)

        pad_h = h - erase_height
        pad_top = tf.random.uniform(shape=[], minval=0, maxval=pad_h, dtype=tf.int32)
        pad_bottom = pad_h - pad_top

        pad_w = w - erase_width
        pad_left = tf.random.uniform(shape=[], minval=0, maxval=pad_w, dtype=tf.int32)
        pad_right = pad_w - pad_left

        erase_mask = tf.pad([erase_area], [[0,0],[pad_top, pad_bottom], [pad_left, pad_right], [0,0]], constant_values=1)
        erase_mask = tf.squeeze(erase_mask, axis=0)
        erased_img = tf.multiply(tf.cast(img,tf.float32), tf.cast(erase_mask, tf.float32))

        return tf.cast(erased_img, img.dtype)
    else:
        return tf.cast(img, img.dtype)

    
def data_augment_v2(image, label):
    # Thanks to the dataset.prefetch(AUTO) statement in the next function (below), this happens essentially for free on TPU. 
    # Data pipeline code is executed on the "CPU" part of the TPU while the TPU itself is computing gradients.
    
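    # Caution: these Python-level draws are evaluated once, when dataset.map() traces
    # this function, so every image gets the same flag and crop coefficients.
    flag = random.randint(1,3)
    coef_1 = random.randint(70, 90) * 0.01
    coef_2 = random.randint(70, 90) * 0.01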
    
    if flag == 1:
        image = tf.image.random_flip_left_right(image, seed=SEED)
    elif flag == 2:
        image = tf.image.random_flip_up_down(image, seed=SEED)
    else:
        image = tf.image.random_crop(image, [int(IMAGE_SIZE[0]*coef_1), int(IMAGE_SIZE[0]*coef_2), 3],seed=SEED)
        
    image = random_blockout(image)
    
    return image, label 
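
To get a feel for how big the blocked-out patch can be, here is the size arithmetic from `random_blockout` redone in plain Python for the default arguments and our 512x512 images. This is only a sketch of the arithmetic: it uses `math` and `round` in place of the TensorFlow ops, and the names `smallest_side`, `largest_side`, `min_fraction` and `max_fraction` are mine, not part of the notebook.

```python
import math

IMAGE_SIZE = [512, 512]        # same value the notebook uses
sl, sh, rl = 0.1, 0.2, 0.4     # default arguments of random_blockout

h, w = IMAGE_SIZE
origin_area = h * w

# Lower and upper bounds for the side length of the erased rectangle
e_size_l = round(math.sqrt(origin_area * sl * rl))
e_size_h = min(round(math.sqrt(origin_area * sh / rl)), h, w)

# tf.random.uniform with an integer dtype excludes maxval,
# so the largest side that can actually be drawn is e_size_h - 1.
smallest_side = e_size_l
largest_side = e_size_h - 1

min_fraction = smallest_side ** 2 / origin_area
max_fraction = largest_side ** 2 / origin_area

print("side length range:", smallest_side, "to", largest_side, "pixels")
print("erased area: {:.1%} to {:.1%} of the image".format(min_fraction, max_fraction))
```

Height and width are drawn independently, so the erased patch is usually a rectangle somewhere between these two extremes.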

## Perform data augmentation, tuning7
## Inspired by Xuanzhi Huang and Rahul Paul's notebook [here](https://www.kaggle.com/xuanzhihuang/flower-classification-densenet-201)

In [None]:
import tensorflow_addons as tfa

# Randomly make some changes to the images and return the new images and labels
def data_augment_v3(image, label):
        
    # Set seed for data augmentation
    seed = 100
    
    # Resize to 720x720, then take a random 512x512 crop
    image = tf.image.resize(image, [720, 720])
    image = tf.image.random_crop(image, [512, 512, 3], seed = seed)

    # Randomly reset brightness of images
    image = tf.image.random_brightness(image, 0.6, seed = seed)
    
    # Randomly reset saturation of images
    image = tf.image.random_saturation(image, 3, 5, seed = seed)
        
    # Randomly reset contrast of images
    image = tf.image.random_contrast(image, 0.3, 0.5, seed = seed)

    # Randomly reset hue of images, but this will make the colors really weird, which we think will not happen
    # in common photography
    # image = tf.image.random_hue(image, 0.5, seed = seed)
    
    # Blur images
    image = tfa.image.mean_filter2d(image, filter_shape = 10)
    
    # Randomly flip images
    image = tf.image.random_flip_left_right(image, seed = seed)
    image = tf.image.random_flip_up_down(image, seed = seed)
    
    # Rotation and transform are disabled: they failed, apparently due to a bug in TensorFlow
    # angle = random.randint(0, 180)
    # image = tfa.image.rotate(image, tf.constant(np.pi * angle / 180))
    # image = tfa.image.transform(image, [1.0, 1.0, -250, 0.0, 1.0, 0.0, 0.0, 0.0])
    
    return image, label

## Create Data Pipelines ##

In this final step we'll use the `tf.data` API to define an efficient data pipeline for each of the training, validation, and test splits.

In [None]:
def data_augment(image, label):
    # Thanks to the dataset.prefetch(AUTO) statement in the next function (below), this happens essentially for free on TPU. 
    # Data pipeline code is executed on the "CPU" part of the TPU while the TPU itself is computing gradients.
    image = tf.image.random_flip_left_right(image)
    #image = tf.image.random_saturation(image, 0, 2)
    return image, label   

def get_training_dataset():
    dataset = load_dataset(TRAINING_FILENAMES, labeled=True)
    dataset = dataset.map(data_augment, num_parallel_calls=AUTO) #tuning4
    dataset = dataset.repeat() # the training dataset must repeat for several epochs
    dataset = dataset.shuffle(2048)
    dataset = dataset.batch(BATCH_SIZE)
    dataset = dataset.prefetch(AUTO) # prefetch next batch while training (autotune prefetch buffer size)
    return dataset

def get_validation_dataset(ordered=False):
    dataset = load_dataset(VALIDATION_FILENAMES, labeled=True, ordered=ordered)
    dataset = dataset.batch(BATCH_SIZE)
    dataset = dataset.cache()
    dataset = dataset.prefetch(AUTO)
    return dataset

def get_test_dataset(ordered=False):
    dataset = load_dataset(TEST_FILENAMES, labeled=False, ordered=ordered)
    dataset = dataset.batch(BATCH_SIZE)
    dataset = dataset.prefetch(AUTO)
    return dataset

def count_data_items(filenames):
    # the number of data items is written in the name of the .tfrec files, i.e. flowers00-230.tfrec = 230 data items
    n = [int(re.compile(r"-([0-9]*)\.").search(filename).group(1)) for filename in filenames]
    return np.sum(n)

NUM_TRAINING_IMAGES = count_data_items(TRAINING_FILENAMES)
NUM_VALIDATION_IMAGES = count_data_items(VALIDATION_FILENAMES)
NUM_TEST_IMAGES = count_data_items(TEST_FILENAMES)
print('Dataset: {} training images, {} validation images, {} unlabeled test images'.format(NUM_TRAINING_IMAGES, NUM_VALIDATION_IMAGES, NUM_TEST_IMAGES))


### Original Data: Dataset: 12753 training images, 3712 validation images, 7382 unlabeled test images
### Additional Data: Dataset: 68094 training images, 3712 validation images, 7382 unlabeled test images
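
Here is a small standalone sketch of the filename trick used by `count_data_items`, so you can try it without a TPU or GCS access. The bucket path and the list of files are hypothetical; only the `flowers00-230.tfrec` naming pattern comes from the comment in the code above.

```python
import re

def count_items(filenames):
    # The item count sits between the last "-" and the ".tfrec" extension.
    pattern = re.compile(r"-([0-9]*)\.")
    return sum(int(pattern.search(name).group(1)) for name in filenames)

# Hypothetical file list, made up for this example
example_files = [
    "gs://example_bucket/train/flowers00-230.tfrec",
    "gs://example_bucket/train/flowers01-230.tfrec",
    "gs://example_bucket/train/flowers02-225.tfrec",
]

total = count_items(example_files)
print(total, "items across", len(example_files), "files")
```

Counting this way is cheap because it never opens the TFRecord files, but it does rely on the files being named consistently.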

This next cell will create the datasets that we'll use with Keras during training and inference. Notice how we scale the size of the batches to the number of TPU cores.

## Note 3 😀
### 1. To go fast on a TPU, increase the batch size. The rule of thumb is to use batches of 128 elements per core (e.g., a batch size of 128*8=1024 for a TPU with 8 cores). At this size, the 128x128 hardware matrix multipliers of the TPU (see the hardware section of the source below) are most likely to be kept busy. You start seeing interesting speedups from a batch size of 8 per core, though. In the cell below, the batch size is scaled with the core count through this line of code:
### BATCH_SIZE = 16 * strategy.num_replicas_in_sync
### Source: https://www.kaggle.com/docs/tpu
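
The batch size arithmetic is easy to check by hand. The sketch below is plain Python with a hypothetical helper `global_batch_size`; the image counts are the ones printed by this notebook, and dividing them by the batch size to get steps per epoch is a common convention that I am assuming here, not something this section defines.

```python
def global_batch_size(per_replica, num_replicas):
    """Total batch size when every replica processes `per_replica` images."""
    return per_replica * num_replicas

notebook_batch = global_batch_size(16, 8)    # what this notebook uses on a TPU
cpu_batch = global_batch_size(16, 1)         # same code with the TPU switched off
rule_of_thumb = global_batch_size(128, 8)    # the 128-per-core rule of thumb

print("notebook batch size on TPU:", notebook_batch)
print("batch size without TPU:    ", cpu_batch)
print("rule-of-thumb batch size:  ", rule_of_thumb)

# Assumed convention: one epoch is roughly one pass over the training images.
for num_images in (12753, 68094):
    steps = num_images // notebook_batch
    print(num_images, "images ->", steps, "steps per epoch")
```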

In [None]:
strategy.num_replicas_in_sync

In [None]:
16 * strategy.num_replicas_in_sync

In [None]:
# Define the batch size. This will be 16 with TPU off and 128 (=16*8) with TPU on
BATCH_SIZE = 16 * strategy.num_replicas_in_sync #See Note 3.1 above 😀

ds_train = get_training_dataset()
ds_valid = get_validation_dataset()
ds_test = get_test_dataset()

print("Training:", ds_train)
print("Validation:", ds_valid)
print("Test:", ds_test)

These datasets are `tf.data.Dataset` objects. You can think about a dataset in TensorFlow as a *stream* of data records. The training and validation sets are streams of `(image, label)` pairs.

In [None]:
np.set_printoptions(threshold=15, linewidth=80)

print("Training data shapes:")
for image, label in ds_train.take(3):
    print(image.numpy().shape, label.numpy().shape) #See Note 3.1 above 😀
print("Training data label examples:", label.numpy())

The test set is a stream of `(image, idnum)` pairs; `idnum` here is the unique identifier given to the image that we'll use later when we make our submission as a `csv` file.

In [None]:
print("Test data shapes:")
for image, idnum in ds_test.take(3):
    print(image.numpy().shape, idnum.numpy().shape) #See Note 3.1 above 😀
print("Test data IDs:", idnum.numpy().astype('U')) # U=unicode string

# Step 4: Explore Data #

Let's take a moment to look at some of the images in the dataset.

In [None]:
from matplotlib import pyplot as plt

def batch_to_numpy_images_and_labels(data):
    images, labels = data
    numpy_images = images.numpy()
    numpy_labels = labels.numpy()
    if numpy_labels.dtype == object: # binary string in this case,these are image ID strings
        numpy_labels = [None for _ in enumerate(numpy_images)]
        # If no labels, only image IDs, return None for labels (this is the case for test data)
    return numpy_images, numpy_labels

def title_from_label_and_target(label, correct_label):
    if correct_label is None:
        return CLASSES[label], True
    correct = (label == correct_label)
    return "{} [{}{}{}]".format(CLASSES[label], 
                                'OK' if correct else 'NO', 
                                u"\u2192" if not correct else '',
                                CLASSES[correct_label] if not correct else ''), correct

def display_one_flower(image, title, subplot, red=False, titlesize=16):
    plt.subplot(*subplot)
    plt.axis('off')
    plt.imshow(image)
    if len(title) > 0:
        plt.title(title, fontsize=int(titlesize) if not red else int(titlesize/1.2), color='red' if red else 'black', fontdict={'verticalalignment':'center'}, pad=int(titlesize/1.5))
    return (subplot[0], subplot[1], subplot[2]+1)
    
def display_batch_of_images(databatch, predictions=None, display_mismatches_only=False):
    """This will work with:
    display_batch_of_images(images)
    display_batch_of_images(images, predictions)
    display_batch_of_images((images, labels))
    display_batch_of_images((images, labels), predictions)
    """
    # data
    images, labels = batch_to_numpy_images_and_labels(databatch)
    if labels is None:
        labels = [None for _ in enumerate(images)]
        
    # auto-squaring: this will drop data that does not fit into square or square-ish rectangle
    rows = int(math.sqrt(len(images)))
    cols = len(images)//rows
        
    # size and spacing
    FIGSIZE = 13.0
    SPACING = 0.1
    subplot=(rows,cols,1)
    if rows < cols:
        plt.figure(figsize=(FIGSIZE,FIGSIZE/cols*rows))
    else:
        plt.figure(figsize=(FIGSIZE/rows*cols,FIGSIZE))
    
    # display
    for i, (image, label) in enumerate(zip(images[:rows*cols], labels[:rows*cols])):
        title = '' if label is None else CLASSES[label]
        correct = True
        if predictions is not None:
            title, correct = title_from_label_and_target(predictions[i], label)
        dynamic_titlesize = FIGSIZE*SPACING/max(rows,cols)*40+3 # magic formula tested to work from 1x1 to 10x10 images
        if display_mismatches_only:
            if predictions[i] != label:
                subplot = display_one_flower(image, title, subplot, not correct, titlesize=dynamic_titlesize)
        else:        
            subplot = display_one_flower(image, title, subplot, not correct, titlesize=dynamic_titlesize)
    
    #layout
    plt.tight_layout()
    if label is None and predictions is None:
        plt.subplots_adjust(wspace=0, hspace=0)
    else:
        plt.subplots_adjust(wspace=SPACING, hspace=SPACING)
    plt.show()


def display_training_curves(training, validation, title, subplot):
    if subplot%10==1: # set up the subplots on the first call
        plt.subplots(figsize=(10,10), facecolor='#F0F0F0')
        plt.tight_layout()
    ax = plt.subplot(subplot)
    ax.set_facecolor('#F8F8F8')
    ax.plot(training)
    ax.plot(validation)
    ax.set_title('model '+ title)
    ax.set_ylabel(title)
    #ax.set_ylim(0.28,1.05)
    ax.set_xlabel('epoch')
    ax.legend(['train', 'valid.'])

def display_training_curves_v2(training, validation, learning_rate_list, title, subplot):
    if subplot%10==1: # set up the subplots on the first call
        plt.subplots(figsize=(10,10), facecolor='#F0F0F0')
        plt.tight_layout()
    ax = plt.subplot(subplot)
    ax.set_facecolor('#F8F8F8')
    ax.plot(training)
    ax.plot(validation)
    ax.set_title('model '+ title)
    ax.set_ylabel(title, color='b')
    #ax.set_ylim(0.28,1.05)
    ax.set_xlabel('epoch')
    ax.legend(['train', 'valid.', 'learning rate'])        
    
    ax2 = ax.twinx()
    ax2.plot(learning_rate_list, 'g-')
    ax2.set_ylabel('learning rate', color='g')
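
The "auto-squaring" step in `display_batch_of_images` can silently drop a few images, so here is the same arithmetic on its own. The helper name `grid_shape` is mine; the two lines inside it mirror the `rows` and `cols` computation above.

```python
import math

def grid_shape(num_images):
    # Same arithmetic as display_batch_of_images: a square-ish grid
    rows = int(math.sqrt(num_images))
    cols = num_images // rows
    return rows, cols

for n in (12, 20, 128):
    rows, cols = grid_shape(n)
    shown = rows * cols
    print("{} images -> {}x{} grid, {} shown, {} dropped".format(n, rows, cols, shown, n - shown))
```

A batch of 20 fits a 4x5 grid exactly, while a full batch of 128 is trimmed to an 11x11 grid.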

You can display a single batch of images from a dataset with another of our helper functions. The next cell will turn the dataset into an iterator of batches of 20 images.

In [None]:
ds_iter = iter(ds_train.unbatch().batch(20))

Use the Python `next` function to pop out the next batch in the stream and display it with the helper function.

In [None]:
one_batch = next(ds_iter)
display_batch_of_images(one_batch)

By defining `ds_iter` and `one_batch` in separate cells, you only need to rerun the cell above to see a new batch of images.
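
If `unbatch().batch(20)` followed by `next` feels a bit abstract, this pure-Python analogy shows the same idea with lists of numbers instead of images. It is only an analogy, not how `tf.data` is implemented, and `rebatch` is a hypothetical helper written for this example.

```python
from itertools import chain, islice

def rebatch(batches, size):
    """Flatten an iterable of batches, then regroup the items into batches of `size`."""
    flat = chain.from_iterable(batches)       # the "unbatch" step
    while True:
        chunk = list(islice(flat, size))      # the "batch(size)" step
        if not chunk:
            return
        yield chunk

# Three batches of 8 items each, standing in for batches of images
original = [list(range(0, 8)), list(range(8, 16)), list(range(16, 24))]

batch_iter = iter(rebatch(original, 5))
first = next(batch_iter)       # each call to next() pops one new batch
second = next(batch_iter)
remaining = list(batch_iter)

print(first)
print(second)
print(remaining)
```

The toy stream here runs out after 24 items, leaving a short final batch. Our training dataset uses `repeat()`, so its iterator keeps producing batches for as long as you call `next`.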

## tuning7, show a sample of augmented data

In [None]:
row = 3
col = 4
all_elements = get_training_dataset().unbatch()
one_element = tf.data.Dataset.from_tensors(next(iter(all_elements)))
# Map the images to the data augmentation function for image processing
augmented_element = one_element.repeat().map(data_augment).batch(row * col)

for (img, label) in augmented_element:
    plt.figure(figsize = (15, int(15 * row / col)))
    for j in range(row * col):
        plt.subplot(row, col, j + 1)
        plt.axis('off')
        plt.imshow(img[j, ])
    plt.show()
    break

## tuning7, show a sample of augmented data (v2)

In [None]:
# Map the images to the data augmentation function for image processing
augmented_element = one_element.repeat().map(data_augment_v2).batch(row * col)

for (img, label) in augmented_element:
    plt.figure(figsize = (15, int(15 * row / col)))
    for j in range(row * col):
        plt.subplot(row, col, j + 1)
        plt.axis('off')
        plt.imshow(img[j, ])
    plt.show()
    break

## tuning7, show a sample of augmented data (v3)

In [None]:
# Map the images to the data augmentation function for image processing
augmented_element = one_element.repeat().map(data_augment_v3).batch(row * col)

for (img, label) in augmented_element:
    plt.figure(figsize = (15, int(15 * row / col)))
    for j in range(row * col):
        plt.subplot(row, col, j + 1)
        plt.axis('off')
        plt.imshow(img[j, ])
    plt.show()
    break

# Step 5: Define Model #

Now we're ready to create a neural network for classifying images! We'll use what's known as **transfer learning**. With transfer learning, you reuse part of a pretrained model to get a head-start on a new dataset.

For this tutorial, we'll use a model called **VGG16** pretrained on [ImageNet](http://image-net.org/). Later, you might want to experiment with [other models](https://www.tensorflow.org/api_docs/python/tf/keras/applications) included with Keras. ([Xception](https://www.tensorflow.org/api_docs/python/tf/keras/applications/Xception) wouldn't be a bad choice.)

The distribution strategy we created earlier contains a [context manager](https://docs.python.org/3/reference/compound_stmts.html#with), `strategy.scope`. This context manager tells TensorFlow how to divide the work of training among the eight TPU cores. When using TensorFlow with a TPU, it's important to define your model in a `strategy.scope()` context.

In [None]:
[*IMAGE_SIZE, 3]

## Note 4 😀
## Let's transfer learn from different neural network architectures from tf.keras.applications and keep track of their performance
## Source: https://www.tensorflow.org/versions/r2.2/api_docs/python/tf/keras/applications
## Note that TF 2.4 has many more models available for transfer learning
## Source: https://www.tensorflow.org/api_docs/python/tf/keras/applications

In [None]:
', '.join(tf.keras.applications.__dir__())

## Note 5 😀
## Use ModelCheckpoint to keep track of the "best" model during training, according to ```monitor='val_loss'```

In [None]:
# Model weights are saved at the end of every epoch, if it's the best seen so far during model.fit
checkpoint_filepath = "Petals_to_the_Metal-70K_images-trainable_True-MobileNetV2.h5" #"Petals_to_the_Metal-70K_images-trainable_True-DenseNet201.h5"

checkpoint = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_filepath,
    save_weights_only=True,
    monitor='val_loss',
    mode='min',
    save_best_only=True
)
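
To make the `save_best_only=True`, `mode='min'` behaviour concrete, here is a minimal pure-Python sketch of the bookkeeping involved. It is not Keras's implementation; the function name `best_epochs` and the loss values are made up for illustration.

```python
def best_epochs(val_losses):
    """Return the epoch indices at which a 'save best only' checkpoint
    monitoring a loss (lower is better) would overwrite the saved weights."""
    best = float("inf")
    saved_at = []
    for epoch, loss in enumerate(val_losses):
        if loss < best:  # strictly better than anything seen so far
            best = loss
            saved_at.append(epoch)
    return saved_at


# Hypothetical validation losses, one per epoch
val_losses = [3.2, 2.1, 2.4, 1.8, 1.9]
print(best_epochs(val_losses))
```

With these made-up losses the weights would be written after epochs 0, 1 and 3, so the file on disk at the end holds the epoch 3 weights.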

## Note 6 😀
## Use EarlyStopping to stop the training when there is no improvement in ```monitor='val_loss'``` for ```patience=3``` consecutive epochs

In [None]:
# This callback will stop the training when there is no improvement in the validation loss for three consecutive epochs. 
early_stopping = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3)
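
The patience rule can be sketched in a few lines of plain Python. This is only an illustration of the idea, assuming a loss that should decrease and no minimum improvement threshold; `early_stop_epoch` and the loss values are hypothetical.

```python
def early_stop_epoch(val_losses, patience=3):
    """Return the epoch index at which training would stop, or None if the
    loss keeps improving often enough to finish all epochs."""
    best = float("inf")
    wait = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Hypothetical losses: the best value is at epoch 1, then three epochs without improvement
print(early_stop_epoch([1.0, 0.9, 0.95, 0.92, 0.91, 0.5]))
```

Here training stops at epoch 4, so the improvement at epoch 5 is never seen. That is the trade-off of a small `patience`.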

## Note 10 😀
## [Track learning rate during training](https://stackoverflow.com/questions/49127214/keras-how-to-output-learning-rate-onto-tensorboard)
This approach raised the error below, so the callback in the next cell is wrapped in a string and never runs:

    NotFoundError: Container worker does not exist. (Could not find resource: worker/_AnonymousVar8064)
    Encountered when executing an operation using EagerExecutor. This error cancels all future operations and poisons their output tensors.

In [None]:
NotFoundError = """
class LRTensorBoard(TensorBoard):
    def __init__(self, log_dir, **kwargs):  # add other arguments to __init__ if you need
        super().__init__(log_dir=log_dir, **kwargs)

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        logs.update({'lr': K.eval(self.model.optimizer.lr)})
        super().on_epoch_end(epoch, logs)

lr_tracking = LRTensorBoard(log_dir="./lr_tracking")
"""

## [Writing your own callbacks](https://www.tensorflow.org/guide/keras/custom_callback)
## Not needed

In [None]:
class LearningRateTracking(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        keys = list(logs.keys())
        print("End epoch {} of training; got log keys: {}".format(epoch, keys))
        
        #logs = logs or {}
        #logs.update({'lr': K.eval(self.model.optimizer.lr)}) #optimizer._decayed_lr('float32').numpy()
        #return 

#lr_tracking = LearningRateTracking()

In [None]:
use_efficientnet = False #tuning9
if use_efficientnet:
    !pip install -q efficientnet
    from efficientnet.tfkeras import EfficientNetB7

### Calculate weight for each class #tuning11
### Inspired by [Flower Classification DenseNet 201](https://www.kaggle.com/xuanzhihuang/flower-classification-densenet-201)

In [None]:
weight_per_class = True

if weight_per_class:
    from collections import Counter
    import gc

    gc.enable()

    def get_training_dataset_raw():
        dataset = load_dataset(TRAINING_FILENAMES, labeled = True, ordered = False)
        return dataset

    raw_training_dataset = get_training_dataset_raw()

    label_counter = Counter()
    for images, labels in raw_training_dataset:
        label_counter.update([labels.numpy()])

    del raw_training_dataset    

    TARGET_NUM_PER_CLASS = 122 #?

    def get_weight_for_class(class_id):
        counting = label_counter[class_id]
        weight = TARGET_NUM_PER_CLASS / counting
        return weight

    weight_per_class = {class_id: get_weight_for_class(class_id) for class_id in range(104)}
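
The weighting rule above (target count divided by observed count) can be tried on a toy label list. This is a small sketch with made-up labels and a made-up target; unlike the cell above, it only iterates over classes that actually occur, which avoids a division by zero for an empty class.

```python
from collections import Counter


def class_weights(labels, target_per_class):
    """Weight each class by target_per_class / observed count,
    so that rare classes get weights above 1."""
    counts = Counter(labels)
    return {cls: target_per_class / n for cls, n in sorted(counts.items())}


# Hypothetical labels: class 0 is common, class 2 is rare
labels = [0, 0, 0, 0, 1, 1, 2]
weights = class_weights(labels, target_per_class=2)
print(weights)
```

Class 0 (4 samples) gets weight 0.5, class 1 (2 samples) gets 1.0, and class 2 (1 sample) gets 2.0.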

In [None]:
if weight_per_class:
    data = pd.DataFrame.from_dict(weight_per_class, orient='index', columns=['class_weight'])
    plt.figure(figsize=(30, 9))

    #barplot color based on value
    bplot = sns.barplot(x=data.index, y='class_weight', data=data, palette= cm.Blues(data['class_weight']*0.15));
    for p in bplot.patches:
        bplot.annotate(format(p.get_height(), '.1f'), 
                       (p.get_x() + p.get_width() / 2., p.get_height()), 
                       ha = 'center', va = 'center', 
                       xytext = (0, 9), 
                       textcoords = 'offset points')
    plt.xlabel("Class", size=14)
    plt.ylabel("Class weight (inverse of %)", size=14)

## Skip to Model Ensemble/Note 11 😀 or look at previous versions of this notebook for transfer learning or end to end training

In [None]:
using_ensemble_models = False

In [None]:
if not using_ensemble_models:
    with strategy.scope():
        #pretrained_model = tf.keras.applications.VGG16
        #pretrained_model = tf.keras.applications.DenseNet201
        #pretrained_model = tf.keras.applications.InceptionResNetV2
        #pretrained_model = tf.keras.applications.InceptionV3
        #pretrained_model = tf.keras.applications.MobileNet
        #pretrained_model = tf.keras.applications.MobileNetV2
        #pretrained_model = tf.keras.applications.NASNetMobile
        #pretrained_model = tf.keras.applications.ResNet50
        #pretrained_model = tf.keras.applications.ResNet101V2
        #pretrained_model = tf.keras.applications.VGG19
        #pretrained_model = tf.keras.applications.Xception
        #pretrained_model = tf.keras.applications.DenseNet201 
        #pretrained_model = EfficientNetB7

        pretrained_model = tf.keras.applications.MobileNetV2(
            include_top=False ,
            weights='imagenet', #tuning10 weights='noisy-student' instead of 'imagenet'
                                #Self-training with Noisy Student improves ImageNet classification https://arxiv.org/abs/1911.04252) 
            #pooling='avg', #tuning1
            input_shape=[*IMAGE_SIZE, 3]
        )

        pretrained_model.trainable = True #tuning8 pretrained_model.trainable = True

        model = tf.keras.Sequential([
            pretrained_model, #Base pretrained on ImageNet to extract features from images

            tf.keras.layers.GlobalAveragePooling2D(), ##Attach a new head to act as a classifier
            #tf.keras.layers.Dropout(0.3), #tuning3
            tf.keras.layers.Dense(len(CLASSES), activation='softmax')
        ])

The `'sparse_categorical'` versions of the loss and metrics are appropriate for a classification task with more than two labels, like this one, where each label is stored as an integer class index rather than a one-hot vector.
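
As a rough illustration of what these two quantities measure, here is a pure-Python sketch with integer labels and softmax-style probability rows. The helper names mirror the Keras strings but are written from scratch here, the numbers are made up, and the sketch skips the numerical clipping a real implementation applies.

```python
import math


def sparse_categorical_crossentropy(label, probs):
    """Loss for one sample: negative log of the probability assigned
    to the true class, where `label` is an integer index."""
    return -math.log(probs[label])


def sparse_categorical_accuracy(labels, batch_probs):
    """Fraction of samples whose highest-probability class equals the label."""
    hits = sum(1 for y, p in zip(labels, batch_probs) if p.index(max(p)) == y)
    return hits / len(labels)


# Hypothetical predictions for two images over three classes
batch_probs = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]
labels = [0, 1]  # integer class ids, not one-hot vectors

losses = [sparse_categorical_crossentropy(y, p) for y, p in zip(labels, batch_probs)]
print(losses, sparse_categorical_accuracy(labels, batch_probs))
```

The first image is classified correctly with a small loss; the second puts only 0.3 on the true class, so it counts as a miss and contributes a larger loss.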

In [None]:
if not using_ensemble_models:
    model.compile(
        optimizer='nadam', #tuning2 optimizer='nadam',
        loss = 'sparse_categorical_crossentropy',
        metrics=['sparse_categorical_accuracy'],
    )

In [None]:
if not using_ensemble_models:
    model.summary()

In [None]:
# `model` only exists when a single model was built above
if not using_ensemble_models:
    tf.keras.utils.plot_model(model, show_shapes=True)

# Step 6: Training #

## Learning Rate Schedule ##

We'll train this network with a learning rate schedule that ramps up linearly for the first 5 epochs and then decays exponentially toward a minimum learning rate.

In [None]:
if not using_ensemble_models:
    # Define training epochs
    EPOCHS = 30

    # Define the batch size. This will be 16 with TPU off and 128 (=16*8) with TPU on
    BATCH_SIZE = 16 * strategy.num_replicas_in_sync #See Note 3.1 above 😀

    STEPS_PER_EPOCH = NUM_TRAINING_IMAGES // BATCH_SIZE
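
The arithmetic behind these two values can be checked by hand. The sketch below assumes a hypothetical round figure of 70,000 training images (the real `NUM_TRAINING_IMAGES` is computed earlier in the notebook) and compares 1 replica with 8.

```python
PER_REPLICA_BATCH = 16


def global_batch_size(num_replicas):
    """Each replica processes PER_REPLICA_BATCH images per step."""
    return PER_REPLICA_BATCH * num_replicas


def steps_per_epoch(num_images, batch_size):
    """Whole batches per epoch; the remainder is dropped by integer division."""
    return num_images // batch_size


num_images = 70_000  # hypothetical, for illustration only
for replicas in (1, 8):
    batch = global_batch_size(replicas)
    print(replicas, batch, steps_per_epoch(num_images, batch))
```

Going from 1 to 8 replicas raises the batch from 16 to 128 and cuts the steps per epoch from 4375 to 546 for this image count.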

In [None]:
if not using_ensemble_models:
    # Learning Rate Schedule for Fine Tuning #
    def exponential_lr(epoch,
                       start_lr = 0.00001, min_lr = 0.00001, max_lr = 0.00005 * strategy.num_replicas_in_sync, #tuning1
                       rampup_epochs = 5, sustain_epochs = 0,
                       exp_decay = 0.75): #tuning1

        def lr(epoch, start_lr, min_lr, max_lr, rampup_epochs, sustain_epochs, exp_decay):
            # linear increase from start to rampup_epochs
            if epoch < rampup_epochs:
                lr = ((max_lr - start_lr) /
                      rampup_epochs * epoch + start_lr)
            # constant max_lr during sustain_epochs
            elif epoch < rampup_epochs + sustain_epochs:
                lr = max_lr
            # exponential decay towards min_lr
            else:
                lr = ((max_lr - min_lr) *
                      exp_decay**(epoch - rampup_epochs - sustain_epochs) +
                      min_lr)
            return lr
        return lr(epoch,
                  start_lr,
                  min_lr,
                  max_lr,
                  rampup_epochs,
                  sustain_epochs,
                  exp_decay)

    lr_callback = tf.keras.callbacks.LearningRateScheduler(exponential_lr, verbose=True)

    rng = [i for i in range(EPOCHS)]
    y = [exponential_lr(x) for x in rng]
    plt.plot(rng, y)
    print("Learning rate schedule: {:.3g} to {:.3g} to {:.3g}".format(y[0], max(y), y[-1]))
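
To see the shape of the schedule without a TPU, the same three-phase formula can be evaluated in plain Python. This sketch assumes 8 replicas, so `max_lr` becomes 0.00005 * 8 = 0.0004; the function name `lr_at_epoch` is made up here.

```python
def lr_at_epoch(epoch, start_lr=1e-5, min_lr=1e-5, max_lr=5e-5 * 8,
                rampup_epochs=5, sustain_epochs=0, exp_decay=0.75):
    """Linear ramp-up, optional plateau, then exponential decay toward min_lr."""
    if epoch < rampup_epochs:
        return (max_lr - start_lr) / rampup_epochs * epoch + start_lr
    if epoch < rampup_epochs + sustain_epochs:
        return max_lr
    decay_steps = epoch - rampup_epochs - sustain_epochs
    return (max_lr - min_lr) * exp_decay ** decay_steps + min_lr


for epoch in (0, 1, 5, 6, 29):
    print(epoch, "{:.3g}".format(lr_at_epoch(epoch)))
```

The rate starts at `start_lr`, reaches `max_lr` at epoch 5, and each later epoch keeps 75% of the remaining gap above `min_lr`.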

## Fit Model ##

And now we're ready to train the model. After defining a few parameters, we're good to go!

In [None]:
if not using_ensemble_models:
    history = model.fit(
        ds_train,
        validation_data=ds_valid,
        epochs=EPOCHS,
        steps_per_epoch=STEPS_PER_EPOCH,
        callbacks=[lr_callback, checkpoint], # Model weights are saved at the end of every epoch, if it's the best seen so far
        #workers = 3 #tuning5 https://www.tensorflow.org/tutorials/distribute/multi_worker_with_keras
        class_weight = weight_per_class #tuning11
    )

This next cell shows how the loss and metrics progressed during training. Thankfully, it converges!

## Model performance

In [None]:
if not using_ensemble_models:
    display_training_curves_v2( 
        history.history['loss'],
        history.history['val_loss'],
        history.history['lr'],
        'loss',
        211,
    )

    display_training_curves_v2(
        history.history['sparse_categorical_accuracy'],
        history.history['val_sparse_categorical_accuracy'],
        history.history['lr'],
        'accuracy',
        212,
    )

In [None]:
zoom_after = 20
if not using_ensemble_models:
    display_training_curves(
        history.history['loss'][zoom_after:],
        history.history['val_loss'][zoom_after:],
        'loss',
        211,
    )

    display_training_curves(
        history.history['sparse_categorical_accuracy'][zoom_after:],
        history.history['val_sparse_categorical_accuracy'][zoom_after:],
        'accuracy',
        212,
    )

## Note 5 continued 😀
## The model weights (that are considered the best) are loaded into the model.

In [None]:
checkpoint_filepath

In [None]:
if not using_ensemble_models:
    model.load_weights(checkpoint_filepath)

## Note 12 😀
## Convert a few models from .h5 to TensorFlow Lite using [TensorFlow Lite converter](https://www.tensorflow.org/lite/convert)
## Deploy a few models on mobile and IoT devices

In [None]:
model.summary()

In [None]:
print(checkpoint_filepath)
tflite_model_name = checkpoint_filepath.replace('.h5', '.tflite')
tflite_model_name

In [None]:
# Convert the model
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# Save the model
with open(tflite_model_name, 'wb') as f:
    f.write(tflite_model)
    
print('TFLiteConversion completed successfully \U0001F680')  

## Note 11 😀
## Use an ensemble of top 2 performers EfficientNetB7 and DenseNet201
## See [my model dataset](https://www.kaggle.com/georgezoto/models) as well as [Dmitry's notebook](https://www.kaggle.com/dmitrynokhrin/start-with-ensemble-v2) and [Rosa's original notebook](https://www.kaggle.com/wrrosa/tpu-enet-b7-densenet)

In [None]:
def get_pretrained_model(model_name, image_dataset_weights, trainable=True):
    pretrained_model= model_name(
        include_top=False ,
        weights=image_dataset_weights, #tuning10 weights='noisy-student' instead of 'imagenet'
                                       #Self-training with Noisy Student improves ImageNet classification https://arxiv.org/abs/1911.04252) 
        input_shape=[*IMAGE_SIZE, 3]
    )

    pretrained_model.trainable = trainable #tuning8 pretrained_model.trainable = True
    
    model = tf.keras.Sequential([
        pretrained_model, 
        tf.keras.layers.GlobalAveragePooling2D(), 
        tf.keras.layers.Dense(len(CLASSES), activation='softmax')
    ])
    
    return model

In [None]:
if using_ensemble_models:
    with strategy.scope():
        model_EB7 = get_pretrained_model(EfficientNetB7, 'noisy-student', trainable=True)

    model_EB7.load_weights('../input/models/Petals_to_the_Metal-70K_images-trainable_True-EfficientNetB7.h5')    

In [None]:
if using_ensemble_models:
    model_EB7.summary()

In [None]:
if using_ensemble_models:
    with strategy.scope():
        model_D201 = get_pretrained_model(tf.keras.applications.DenseNet201, 'imagenet', trainable=True)

    model_D201.load_weights('../input/models/Petals_to_the_Metal-70K_images-trainable_True-DenseNet201.h5')  

In [None]:
if using_ensemble_models:
    model_D201.summary()

## Ensemble both models

In [None]:
from sklearn.metrics import f1_score, precision_score, recall_score, confusion_matrix

In [None]:
if using_ensemble_models:
    cmdataset = get_validation_dataset(ordered=True) # since we are splitting the dataset and iterating separately on images and labels, order matters.
    images_ds = cmdataset.map(lambda image, label: image)
    labels_ds = cmdataset.map(lambda image, label: label).unbatch()
    cm_correct_labels = next(iter(labels_ds.batch(NUM_VALIDATION_IMAGES))).numpy() # get everything as one batch

    m1 = model_EB7.predict(images_ds)
    m2 = model_D201.predict(images_ds)

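    # Try blend weights between 0 and 1: alpha weights EfficientNetB7, (1 - alpha) weights DenseNet201
    alphas = np.linspace(0, 1, 100)
    scores = []
    for alpha in alphas:
        cm_probabilities = alpha*m1+(1-alpha)*m2
        cm_predictions = np.argmax(cm_probabilities, axis=-1)
        scores.append(f1_score(cm_correct_labels, cm_predictions, labels=range(len(CLASSES)), average='macro'))

    plt.plot(alphas, scores)

    # Look up the alpha that gave the best macro F1 (index / 100 is slightly off for a 100-point grid)
    best_alpha = alphas[np.argmax(scores)]
    cm_probabilities = best_alpha*m1+(1-best_alpha)*m2
    cm_predictions = np.argmax(cm_probabilities, axis=-1)

    # Print after re-blending so the predictions shown belong to best_alpha
    print("Correct   labels: ", cm_correct_labels.shape, cm_correct_labels)
    print("Predicted labels: ", cm_predictions.shape, cm_predictions)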

    #best_alpha = 0.35
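
The blend-weight search can be tried on toy numbers. The sketch below is pure Python with made-up probabilities for two hypothetical models, and it scores each weight with plain accuracy instead of the macro F1 used above, purely to keep the example short.

```python
def blend(p1, p2, alpha):
    """Weighted average of two probability tables, row by row."""
    return [[alpha * a + (1 - alpha) * b for a, b in zip(r1, r2)]
            for r1, r2 in zip(p1, p2)]


def accuracy(labels, probs):
    preds = [row.index(max(row)) for row in probs]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def best_blend_weight(labels, p1, p2, steps=11):
    """Grid-search alpha in [0, 1]; return the first best alpha and its score."""
    alphas = [i / (steps - 1) for i in range(steps)]
    scores = [accuracy(labels, blend(p1, p2, a)) for a in alphas]
    best = max(range(steps), key=scores.__getitem__)
    return alphas[best], scores[best]


# Hypothetical validation labels and predictions from two models
labels = [0, 1, 1]
model_a = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]      # wrong on the second image
model_b = [[0.45, 0.55], [0.1, 0.9], [0.3, 0.7]]    # wrong on the first image

alpha, score = best_blend_weight(labels, model_a, model_b)
print(alpha, score)
```

Each toy model gets two of three images right on its own, while a blend with a small weight on `model_a` gets all three. That is the effect the ensemble is after.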

In [None]:
if using_ensemble_models:
    print(best_alpha, max(scores))

## Run predictions on the test dataset

In [None]:
if using_ensemble_models:
    test_ds = get_test_dataset(ordered=True)
    #best_alpha = 0.35

    print('Computing predictions...')
    test_images_ds = test_ds.map(lambda image, idnum: image)
    probabilities1 = model_EB7.predict(test_images_ds)
    probabilities2 = model_D201.predict(test_images_ds)

    probabilities = best_alpha * probabilities1 + (1 - best_alpha) * probabilities2

    predictions = np.argmax(probabilities, axis=-1)
    print(predictions)

    print('Generating submission.csv file...')
    # Get image ids from test set and convert to unicode
    test_ids_ds = test_ds.map(lambda image, idnum: idnum).unbatch()
    test_ids = next(iter(test_ids_ds.batch(NUM_TEST_IMAGES))).numpy().astype('U')

    # Write the submission file
    np.savetxt(
        'submission.csv',
        np.rec.fromarrays([test_ids, predictions]),
        fmt=['%s', '%d'],
        delimiter=',',
        header='id,label',
        comments='',
    )

    # Look at the first few predictions
    !head submission.csv
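
For readers who want to check the file layout without running the models, here is a small sketch that writes the same two-column format with Python's `csv` module. The ids and labels are invented, and `write_submission` is a hypothetical helper, not part of the notebook.

```python
import csv
import io


def write_submission(ids, predictions, fileobj):
    """Write an 'id,label' header followed by one row per test image."""
    writer = csv.writer(fileobj, lineterminator="\n")
    writer.writerow(["id", "label"])
    writer.writerows(zip(ids, predictions))


# Hypothetical image ids and predicted class indices
buffer = io.StringIO()
write_submission(["a1b2", "c3d4"], [67, 3], buffer)
print(buffer.getvalue())
```

Writing to an in-memory buffer keeps the example side-effect free; passing an open file handle instead would produce a `submission.csv` on disk.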

# Step 7: Evaluate Predictions #

Before making your final predictions on the test set, it's a good idea to evaluate your model's predictions on the validation set. This can help you diagnose problems in training or suggest ways your model could be improved. We'll look at two common ways of validation: plotting the **confusion matrix** and **visual validation**.

In [None]:
def display_confusion_matrix(cmat, score, precision, recall):
    plt.figure(figsize=(25,25))
    ax = plt.gca()
    ax.matshow(cmat, cmap='Reds')
    ax.set_xticks(range(len(CLASSES)))
    ax.set_xticklabels(CLASSES, fontdict={'fontsize': 7})
    plt.setp(ax.get_xticklabels(), rotation=45, ha="left", rotation_mode="anchor")
    ax.set_yticks(range(len(CLASSES)))
    ax.set_yticklabels(CLASSES, fontdict={'fontsize': 7})
    plt.setp(ax.get_yticklabels(), rotation=45, ha="right", rotation_mode="anchor")
    titlestring = ""
    if score is not None:
        titlestring += 'f1 = {:.3f} '.format(score)
    if precision is not None:
        titlestring += '\nprecision = {:.3f} '.format(precision)
    if recall is not None:
        titlestring += '\nrecall = {:.3f} '.format(recall)
    if len(titlestring) > 0:
        ax.text(101, 1, titlestring, fontdict={'fontsize': 18, 'horizontalalignment':'right', 'verticalalignment':'top', 'color':'#804040'})
    
    if not using_ensemble_models:
        print('Epoch with min loss and max accuracy:', np.argmin(history.history['val_loss']), np.argmax(history.history['val_sparse_categorical_accuracy']))
        print('min loss and max accuracy:', round(min(history.history['val_loss']),2), round(max(history.history['val_sparse_categorical_accuracy']),2))

    print(titlestring.replace('\n', ''))
    plt.show()
    
def display_training_curves(training, validation, title, subplot):
    if subplot%10==1: # set up the subplots on the first call
        plt.subplots(figsize=(10,10), facecolor='#F0F0F0')
        plt.tight_layout()
    ax = plt.subplot(subplot)
    ax.set_facecolor('#F8F8F8')
    ax.plot(training)
    ax.plot(validation)
    ax.set_title('model '+ title)
    ax.set_ylabel(title)
    #ax.set_ylim(0.28,1.05)
    ax.set_xlabel('epoch')
    ax.legend(['train', 'valid.'])

## Confusion Matrix ##

A [confusion matrix](https://en.wikipedia.org/wiki/Confusion_matrix) shows the actual class of an image tabulated against its predicted class. It is one of the best tools you have for evaluating the performance of a classifier.

The following cell does some processing on the validation data and then creates the matrix with the `confusion_matrix` function included in [`scikit-learn`](https://scikit-learn.org/stable/index.html).

In [None]:
cmdataset = get_validation_dataset(ordered=True)
images_ds = cmdataset.map(lambda image, label: image)
labels_ds = cmdataset.map(lambda image, label: label).unbatch()

cm_correct_labels = next(iter(labels_ds.batch(NUM_VALIDATION_IMAGES))).numpy()

if using_ensemble_models:
    print('using_ensemble_models')
    probabilities1 = model_EB7.predict(images_ds)
    probabilities2 = model_D201.predict(images_ds)
    cm_probabilities = best_alpha * probabilities1 + (1 - best_alpha) * probabilities2
else:
    cm_probabilities = model.predict(images_ds)
    
cm_predictions = np.argmax(cm_probabilities, axis=-1)

labels = range(len(CLASSES))
cmat = confusion_matrix(
    cm_correct_labels,
    cm_predictions,
    labels=labels,
)
cmat = (cmat.T / cmat.sum(axis=1)).T # normalize
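
The counting and row normalization can be followed on a three-class toy example. This is a pure-Python sketch with made-up labels, not the scikit-learn implementation; it also returns zeros for a class with no true samples, a guard the normalization line above does not have.

```python
def confusion_counts(true_labels, predicted_labels, num_classes):
    """cmat[t][p] counts the samples of true class t predicted as class p."""
    cmat = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, predicted_labels):
        cmat[t][p] += 1
    return cmat


def normalize_rows(cmat):
    """Divide each row by its total so a row shows where one true class went."""
    normalized = []
    for row in cmat:
        total = sum(row)
        normalized.append([c / total if total else 0.0 for c in row])
    return normalized


# Hypothetical labels for six validation images
true_labels = [0, 0, 1, 1, 1, 2]
predicted_labels = [0, 1, 1, 1, 0, 2]

counts = confusion_counts(true_labels, predicted_labels, num_classes=3)
print(counts)
print(normalize_rows(counts))
```

After normalization the diagonal entry of each row is the recall of that class, which is why a strong diagonal in the plot indicates a good classifier.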

In [None]:
cmat

You might be familiar with metrics like [F1-score](https://en.wikipedia.org/wiki/F1_score) or [precision and recall](https://en.wikipedia.org/wiki/Precision_and_recall). This cell will compute these metrics and display them with a plot of the confusion matrix. (These metrics are defined in the Scikit-learn module `sklearn.metrics`, which was imported in the ensemble section above.)

In [None]:
score = f1_score(
    cm_correct_labels,
    cm_predictions,
    labels=labels,
    average='macro',
)

precision = precision_score(
    cm_correct_labels,
    cm_predictions,
    labels=labels,
    average='macro',
)

recall = recall_score(
    cm_correct_labels,
    cm_predictions,
    labels=labels,
    average='macro',
)

display_confusion_matrix(cmat, score, precision, recall)
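
To show what `average='macro'` means, the sketch below computes precision, recall and F1 per class and then takes the unweighted mean over classes. It is a from-scratch illustration with made-up labels; the function name `macro_scores` is hypothetical, and a class with no predictions or no samples simply scores 0.

```python
def macro_scores(true_labels, predicted_labels, num_classes):
    """Return (f1, precision, recall), each averaged equally over classes."""
    precisions, recalls, f1s = [], [], []
    pairs = list(zip(true_labels, predicted_labels))
    for c in range(num_classes):
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = num_classes
    return sum(f1s) / n, sum(precisions) / n, sum(recalls) / n


# Hypothetical labels for six validation images over three classes
true_labels = [0, 0, 1, 1, 1, 2]
predicted_labels = [0, 1, 1, 1, 0, 2]
f1, precision, recall = macro_scores(true_labels, predicted_labels, num_classes=3)
print(round(f1, 3), round(precision, 3), round(recall, 3))
```

Because every class counts equally, a rare flower species that is often misclassified pulls the macro scores down as much as a common one would.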

## Note 7 😀

## Model comparison
## [TF version: 2.2.0](https://www.tensorflow.org/versions/r2.2/api_docs/python/tf/keras/applications)

## [VGG16](https://www.tensorflow.org/versions/r2.2/api_docs/python/tf/keras/applications/vgg16) 
    12 epochs, no data augmentation
    Epoch with min loss and max accuracy: 11 11  
    min loss and max accuracy: 3.47 0.23  
    f1 = 0.123 precision = 0.146 recall = 0.226   

## [DenseNet201](https://www.tensorflow.org/versions/r2.2/api_docs/python/tf/keras/applications/DenseNet201)  
    12 epochs, no data augmentation, model checkpoint, early stopping
    Epoch with min loss and max accuracy: 11 10  
    min loss and max accuracy: 1.31 0.74  
    f1 = 0.643 precision = 0.761 recall = 0.599  

## [InceptionResNetV2](https://www.tensorflow.org/versions/r2.2/api_docs/python/tf/keras/applications/InceptionResNetV2) 
    12 epochs, no data augmentation, model checkpoint, early stopping
    Epoch with min loss and max accuracy: 11 11
    min loss and max accuracy: 1.57 0.66
    f1 = 0.513 precision = 0.640 recall = 0.480      
    

## [InceptionV3](https://www.tensorflow.org/versions/r2.2/api_docs/python/tf/keras/applications/InceptionV3)
    12 epochs, no data augmentation, model checkpoint, early stopping
    Epoch with min loss and max accuracy: 11 11
    min loss and max accuracy: 1.48 0.69
    f1 = 0.581 precision = 0.728 recall = 0.538  
    
## [MobileNet](https://www.tensorflow.org/versions/r2.2/api_docs/python/tf/keras/applications/MobileNet)
    Trains fast compared to other models: 17s 167ms/step 
    12 epochs, no data augmentation, model checkpoint, early stopping
    Epoch with min loss and max accuracy: 11 10
    min loss and max accuracy: 1.11 0.76
    f1 = 0.717 precision = 0.798 recall = 0.679   

## [MobileNetV2](https://www.tensorflow.org/versions/r2.2/api_docs/python/tf/keras/applications/MobileNetV2)
    Trains fast compared to other models: 17s 174ms/step
    Epoch with min loss and max accuracy: 11 11
    min loss and max accuracy: 1.26 0.72
    f1 = 0.650 precision = 0.763 recall = 0.606 

## [NASNetMobile](https://www.tensorflow.org/versions/r2.2/api_docs/python/tf/keras/applications/NASNetMobile) 
    input_shape: Optional shape tuple, only to be specified if include_top is False,
                 otherwise the input shape has to be (224, 224, 3) 
    12 epochs, no data augmentation, model checkpoint, early stopping
    Epoch with min loss and max accuracy: 11 11
    min loss and max accuracy: 2.69 0.38
    f1 = 0.224 precision = 0.401 recall = 0.203  
    
## [ResNet50](https://www.tensorflow.org/versions/r2.2/api_docs/python/tf/keras/applications/ResNet50) 
    12 epochs, no data augmentation, model checkpoint, early stopping
    Epoch with min loss and max accuracy: 11 11
    min loss and max accuracy: 3.85 0.12
    f1 = 0.017 precision = 0.035 recall = 0.025  

## [ResNet101V2](https://www.tensorflow.org/versions/r2.2/api_docs/python/tf/keras/applications/ResNet101V2) 
    12 epochs, no data augmentation, model checkpoint, early stopping
    Epoch with min loss and max accuracy: 11 9
    min loss and max accuracy: 0.87 0.83
    f1 = 0.775 precision = 0.842 recall = 0.741      
    
## [VGG19](https://www.tensorflow.org/versions/r2.2/api_docs/python/tf/keras/applications/VGG19) 
    12 epochs, no data augmentation, model checkpoint, early stopping
    Epoch with min loss and max accuracy: 11 11
    min loss and max accuracy: 3.58 0.21
    f1 = 0.031 precision = 0.036 recall = 0.048 
    
## [Xception](https://www.tensorflow.org/versions/r2.2/api_docs/python/tf/keras/applications/Xception)
    12 epochs, no data augmentation, model checkpoint, early stopping
    Epoch with min loss and max accuracy: 11 11
    min loss and max accuracy: 1.43 0.71
    f1 = 0.575 precision = 0.712 recall = 0.536    
    
## [ResNet101V2](https://www.tensorflow.org/versions/r2.2/api_docs/python/tf/keras/applications/ResNet101V2) 
    30 epochs, no data augmentation, model checkpoint, early stopping
    Epoch with min loss and max accuracy: 29 28
    min loss and max accuracy: 0.83 0.83
    f1 = 0.788 precision = 0.863 recall = 0.753 

## Tuning Legend
## tuning1: pooling='avg', exponential_lr()
## tuning2: optimizer='nadam'
## tuning3: Dropout(0.3)
## tuning4: data_augment_v2 with random_blockout
## tuning5: workers = 3 [Multi-worker training with Keras](https://www.tensorflow.org/tutorials/distribute/multi_worker_with_keras)
## tuning6: additional data
## tuning7: data_augment_v3
## tuning8: pretrained_model.trainable=True
## tuning9: EfficientNetB7
## tuning10: noisy-student
## tuning11: class_weight=weight_per_class
## tuning12: Test Time Augmentation TTA


## [ResNet101V2](https://www.tensorflow.org/versions/r2.2/api_docs/python/tf/keras/applications/ResNet101V2)  
## + tuning1 (0.84 val accuracy at 10 epochs)
## + tuning2 (0.87 val accuracy at 10 epochs, overfit)  
## + tuning3 (0.84 val accuracy at 10 epochs, no overfit)
## Overfitting
## Inspired by Dmitry's notebook [here](https://www.kaggle.com/dmitrynokhrin/densenet201-aug-additional-data)
    30 epochs, no data augmentation, model checkpoint, early stopping
    Epoch with min loss and max accuracy: 26 27
    min loss and max accuracy: 0.52 0.88
    f1 = 0.864 precision = 0.916 recall = 0.842     
      
## [DenseNet201](https://www.tensorflow.org/versions/r2.2/api_docs/python/tf/keras/applications/DenseNet201)  
## + tuning1, tuning2
    30 epochs, no data augmentation, model checkpoint, early stopping
    Epoch with min loss and max accuracy: 29 29
    min loss and max accuracy: 0.92 0.81
    f1 = 0.767 precision = 0.833 recall = 0.732

## [DenseNet201](https://www.tensorflow.org/versions/r2.2/api_docs/python/tf/keras/applications/DenseNet201)  
## + tuning1, tuning2, tuning4
    30 epochs, data augmentation, model checkpoint, early stopping
    Epoch with min loss and max accuracy: 29 27
    min loss and max accuracy: 0.92 0.82
    f1 = 0.772 precision = 0.846 recall = 0.734 

## [ResNet101V2](https://www.tensorflow.org/versions/r2.2/api_docs/python/tf/keras/applications/ResNet101V2) 
## + tuning1, tuning2, tuning4
    30 epochs, data augmentation, model checkpoint, early stopping
    Epoch with min loss and max accuracy: 29 28
    min loss and max accuracy: 0.66 0.85
    f1 = 0.829 precision = 0.870 recall = 0.802 


## [ResNet101V2](https://www.tensorflow.org/versions/r2.2/api_docs/python/tf/keras/applications/ResNet101V2) 
## + tuning1, tuning2, tuning4, tuning5
    30 epochs, workers=3, data augmentation, model checkpoint, early stopping
    Epoch with min loss and max accuracy: 29 28
    min loss and max accuracy: 0.66 0.85
    f1 = 0.829 precision = 0.870 recall = 0.802   
    
## [DenseNet201](https://www.tensorflow.org/versions/r2.2/api_docs/python/tf/keras/applications/DenseNet201)  
## + tuning1, tuning8
    30 epochs, no data augmentation, model checkpoint, early stopping, training from scratch
    Epoch with min loss and max accuracy: 26 28
    min loss and max accuracy: 0.23 0.95
    f1 = 0.945 precision = 0.950 recall = 0.946     
    
## [ResNet101V2](https://www.tensorflow.org/versions/r2.2/api_docs/python/tf/keras/applications/ResNet101V2) 
## + tuning1, tuning8
    30 epochs, no data augmentation, model checkpoint, early stopping, training from scratch
    Epoch with min loss and max accuracy: 10 16
    min loss and max accuracy: 0.36 0.92
    f1 = 0.909 precision = 0.913 recall = 0.911  

## [DenseNet201](https://www.tensorflow.org/versions/r2.2/api_docs/python/tf/keras/applications/DenseNet201)  
## + tuning1, tuning2, tuning8
    30 epochs, no data augmentation, model checkpoint, early stopping, training from scratch
    Epoch with min loss and max accuracy: 10 11
    min loss and max accuracy: 0.21 0.95
    f1 = 0.953 precision = 0.960 recall = 0.950
    
## [EfficientNetB7](https://github.com/qubvel/efficientnet) #tuning9
## with noisy-student #tuning10, see [Self-training with Noisy Student improves ImageNet classification](https://arxiv.org/abs/1911.04252)
## tuning1, tuning9, tuning10
    30 epochs, no data augmentation, model checkpoint, early stopping
    Epoch with min loss and max accuracy: 29 27
    min loss and max accuracy: 0.73 0.84
    f1 = 0.779 precision = 0.839 recall = 0.755
    
## [EfficientNetB7](https://github.com/qubvel/efficientnet) 
## with noisy-student, class_weight
## tuning1, tuning9, tuning10, tuning11
    30 epochs, no data augmentation, model checkpoint, early stopping
    Epoch with min loss and max accuracy: 29 28
    min loss and max accuracy: 1.0 0.81
    f1 = 0.775 precision = 0.769 recall = 0.821 
    
## [EfficientNetB7](https://github.com/qubvel/efficientnet) 
## with noisy-student, with class_weight, training from scratch
## tuning1, tuning2, tuning8, tuning9, tuning10, tuning11
    30 epochs, no data augmentation, model checkpoint, early stopping, training from scratch
    Epoch with min loss and max accuracy: 15 18
    min loss and max accuracy: 0.25 0.96
    f1 = 0.955 precision = 0.950 recall = 0.964
    
## [DenseNet201](https://www.tensorflow.org/versions/r2.2/api_docs/python/tf/keras/applications/DenseNet201)  
## + tuning1, tuning2, tuning8, tuning11
    30 epochs, no data augmentation, model checkpoint, early stopping, training from scratch
    Epoch with min loss and max accuracy: 24 23
    min loss and max accuracy: 0.22 0.95
    f1 = 0.956 precision = 0.957 recall = 0.958 
    
## Ensemble of [EfficientNetB7](https://github.com/qubvel/efficientnet) 
## with noisy-student, with class_weight, training from scratch
## tuning1, tuning2, tuning8, tuning9, tuning10, tuning11
## +
## [DenseNet201](https://www.tensorflow.org/versions/r2.2/api_docs/python/tf/keras/applications/DenseNet201)  
## + tuning1, tuning2, tuning8, tuning11
    30 epochs, no data augmentation, model checkpoint, early stopping, training from scratch
    Epoch with min loss and max accuracy: 24 23
    min loss and max accuracy: 0.22 0.95
    f1 = 0.962 precision = 0.960 recall = 0.966     

## Create model performance report 😀

In [None]:
model_performance_report = pd.DataFrame(columns=['model-family', 'model', 'epochs', 'arg min loss', 'arg max accuracy', 
                                                 'min loss', 'max accuracy', 'f1', 'precision', 'recall'])

model_performance_report.loc[len(model_performance_report)]={ 'model-family': 'VGG',
                                                              'model':'VGG16', 
                                                              'epochs':12, 
                                                              'arg min loss':11, 
                                                              'arg max accuracy':11,
                                                              'min loss':3.47,
                                                              'max accuracy':0.23,
                                                              'f1':0.123,
                                                              'precision':0.146,
                                                              'recall':0.226}

model_performance_report.loc[len(model_performance_report)]={ 'model-family': 'DenseNet',
                                                              'model':'DenseNet201', 
                                                              'epochs':12, 
                                                              'arg min loss':11, 
                                                              'arg max accuracy':10,
                                                              'min loss':1.31,
                                                              'max accuracy':0.74,
                                                              'f1':0.643,
                                                              'precision':0.761,
                                                              'recall':0.599}

model_performance_report.loc[len(model_performance_report)]={ 'model-family':'InceptionResNet',
                                                              'model':'InceptionResNetV2', 
                                                              'epochs':12, 
                                                              'arg min loss':11, 
                                                              'arg max accuracy':11,
                                                              'min loss':1.57,
                                                              'max accuracy':0.66,
                                                              'f1':0.513,
                                                              'precision':0.640,
                                                              'recall':0.480}

model_performance_report.loc[len(model_performance_report)]={ 'model-family':'Inception', 
                                                              'model':'InceptionV3', 
                                                              'epochs':12, 
                                                              'arg min loss':11, 
                                                              'arg max accuracy':11,
                                                              'min loss':1.48,
                                                              'max accuracy':0.69,
                                                              'f1':0.581,
                                                              'precision':0.728,
                                                              'recall':0.538}

model_performance_report.loc[len(model_performance_report)]={ 'model-family':'MobileNet', 
                                                              'model':'MobileNet', 
                                                              'epochs':12, 
                                                              'arg min loss':11, 
                                                              'arg max accuracy':10,
                                                              'min loss':1.11,
                                                              'max accuracy':0.76,
                                                              'f1':0.717,
                                                              'precision':0.798,
                                                              'recall':0.679}

model_performance_report.loc[len(model_performance_report)]={ 'model-family':'MobileNet',
                                                              'model':'MobileNetV2', 
                                                              'epochs':12, 
                                                              'arg min loss':11, 
                                                              'arg max accuracy':11,
                                                              'min loss':1.26,
                                                              'max accuracy':0.72,
                                                              'f1':0.650,
                                                              'precision':0.763,
                                                              'recall':0.606}

model_performance_report.loc[len(model_performance_report)]={ 'model-family':'NASNetMobile',
                                                              'model':'NASNetMobile', 
                                                              'epochs':12, 
                                                              'arg min loss':11, 
                                                              'arg max accuracy':11,
                                                              'min loss':2.69,
                                                              'max accuracy':0.38,
                                                              'f1':0.224,
                                                              'precision':0.401,
                                                              'recall':0.203}

model_performance_report.loc[len(model_performance_report)]={ 'model-family':'ResNet',
                                                              'model':'ResNet50', 
                                                              'epochs':12, 
                                                              'arg min loss':11, 
                                                              'arg max accuracy':11,
                                                              'min loss':3.85,
                                                              'max accuracy':0.12,
                                                              'f1':0.017,
                                                              'precision':0.035,
                                                              'recall':0.025}

model_performance_report.loc[len(model_performance_report)]={ 'model-family':'ResNet',
                                                              'model':'R101V2', 
                                                              'epochs':12, 
                                                              'arg min loss':11, 
                                                              'arg max accuracy':9,
                                                              'min loss':0.87,
                                                              'max accuracy':0.83,
                                                              'f1':0.775,
                                                              'precision':0.842,
                                                              'recall':0.741}

model_performance_report.loc[len(model_performance_report)]={ 'model-family':'VGG',
                                                              'model':'VGG19', 
                                                              'epochs':12, 
                                                              'arg min loss':11, 
                                                              'arg max accuracy':11,
                                                              'min loss':3.58,
                                                              'max accuracy':0.21,
                                                              'f1':0.031,
                                                              'precision':0.036,
                                                              'recall':0.048}

model_performance_report.loc[len(model_performance_report)]={ 'model-family':'Xception',
                                                              'model':'Xception', 
                                                              'epochs':12, 
                                                              'arg min loss':11, 
                                                              'arg max accuracy':11,
                                                              'min loss':1.43,
                                                              'max accuracy':0.71,
                                                              'f1':0.575,
                                                              'precision':0.712,
                                                              'recall':0.536}

model_performance_report.loc[len(model_performance_report)]={ 'model-family':'ResNet',
                                                              'model':'R2 30e', 
                                                              'epochs':30, 
                                                              'arg min loss':29, 
                                                              'arg max accuracy':28,
                                                              'min loss':0.83,
                                                              'max accuracy':0.83,
                                                              'f1':0.788,
                                                              'precision':0.863,
                                                              'recall':0.753}

model_performance_report.loc[len(model_performance_report)]={ 'model-family':'ResNet',
                                                              'model':'R101V2 1,2,3+OF', 
                                                              'epochs':30, 
                                                              'arg min loss':26, 
                                                              'arg max accuracy':27,
                                                              'min loss':0.52,
                                                              'max accuracy':0.88,
                                                              'f1':0.864,
                                                              'precision':0.916,
                                                              'recall':0.842}

model_performance_report.loc[len(model_performance_report)]={ 'model-family':'DenseNet',
                                                              'model':'D 1,2', 
                                                              'epochs':30, 
                                                              'arg min loss':29, 
                                                              'arg max accuracy':29,
                                                              'min loss':0.92,
                                                              'max accuracy':0.81,
                                                              'f1':0.767,
                                                              'precision':0.833,
                                                              'recall':0.732}

model_performance_report.loc[len(model_performance_report)]={ 'model-family':'DenseNet',
                                                              'model':'D201 1,2,4', 
                                                              'epochs':30, 
                                                              'arg min loss':29, 
                                                              'arg max accuracy':27,
                                                              'min loss':0.92,
                                                              'max accuracy':0.82,
                                                              'f1':0.772,
                                                              'precision':0.846,
                                                              'recall':0.734}

model_performance_report.loc[len(model_performance_report)]={ 'model-family':'ResNet',
                                                              'model':'R101V2 1,2,4', 
                                                              'epochs':30, 
                                                              'arg min loss':29, 
                                                              'arg max accuracy':28,
                                                              'min loss':0.66,
                                                              'max accuracy':0.85,
                                                              'f1':0.829,
                                                              'precision':0.870,
                                                              'recall':0.802}

model_performance_report.loc[len(model_performance_report)]={ 'model-family':'ResNet',
                                                              'model':'R101V2 1,2,4,5', 
                                                              'epochs':30, 
                                                              'arg min loss':29, 
                                                              'arg max accuracy':23,
                                                              'min loss':0.66,
                                                              'max accuracy':0.86,
                                                              'f1':0.829,
                                                              'precision':0.883,
                                                              'recall':0.802}

model_performance_report.loc[len(model_performance_report)]={ 'model-family':'DenseNet',
                                                              'model':'D 1,8', 
                                                              'epochs':30, 
                                                              'arg min loss':26, 
                                                              'arg max accuracy':28,
                                                              'min loss':0.23,
                                                              'max accuracy':0.95,
                                                              'f1':0.945,
                                                              'precision':0.950,
                                                              'recall':0.946}

model_performance_report.loc[len(model_performance_report)]={ 'model-family':'ResNet',
                                                              'model':'R101V2 1,8', 
                                                              'epochs':30, 
                                                              'arg min loss':10, 
                                                              'arg max accuracy':16,
                                                              'min loss':0.36,
                                                              'max accuracy':0.92,
                                                              'f1':0.909,
                                                              'precision':0.913,
                                                              'recall':0.911}

model_performance_report.loc[len(model_performance_report)]={ 'model-family':'DenseNet',
                                                              'model':'D 1,2,8', 
                                                              'epochs':30, 
                                                              'arg min loss':10, 
                                                              'arg max accuracy':11,
                                                              'min loss':0.21,
                                                              'max accuracy':0.95,
                                                              'f1':0.953,
                                                              'precision':0.960,
                                                              'recall':0.950}

model_performance_report.loc[len(model_performance_report)]={ 'model-family':'EfficientNet',
                                                              'model':'EB7 1,2,9,10', 
                                                              'epochs':30, 
                                                              'arg min loss':29, 
                                                              'arg max accuracy':27,
                                                              'min loss':0.73,
                                                              'max accuracy':0.84,
                                                              'f1':0.779,
                                                              'precision':0.839,
                                                              'recall':0.755}

model_performance_report.loc[len(model_performance_report)]={ 'model-family':'EfficientNet',
                                                              'model':'EB7 +11', 
                                                              'epochs':30, 
                                                              'arg min loss':29, 
                                                              'arg max accuracy':28,
                                                              'min loss':1.0,
                                                              'max accuracy':0.81,
                                                              'f1':0.775,
                                                              'precision':0.769,
                                                              'recall':0.821}

model_performance_report.loc[len(model_performance_report)]={ 'model-family':'EfficientNet',
                                                              'model':'EB7 1,2,8,9,10,11', 
                                                              'epochs':30, 
                                                              'arg min loss':15, 
                                                              'arg max accuracy':18,
                                                              'min loss':0.25,
                                                              'max accuracy':0.96,
                                                              'f1':0.955,
                                                              'precision':0.950,
                                                              'recall':0.964}

model_performance_report.loc[len(model_performance_report)]={ 'model-family':'DenseNet',
                                                              'model':'D 1,2,8,11', 
                                                              'epochs':30, 
                                                              'arg min loss':24, 
                                                              'arg max accuracy':23,
                                                              'min loss':0.22,
                                                              'max accuracy':0.95,
                                                              'f1':0.956,
                                                              'precision':0.957,
                                                              'recall':0.958}

model_performance_report.loc[len(model_performance_report)]={ 'model-family':'Ensemble',
                                                              'model':'Ensemble EB7+D201', 
                                                              'epochs':30, 
                                                              'arg min loss':24, 
                                                              'arg max accuracy':23,
                                                              'min loss':0.22,
                                                              'max accuracy':0.95,
                                                              'f1':0.962,
                                                              'precision':0.960,
                                                              'recall':0.966}

extra_columns = ['total params', 'trainable params', 'non-trainable params','training time per epoch (sec)']
# Add the extra columns, filled with NaN for every model recorded so far
for column in extra_columns:
    model_performance_report[column] = np.nan

model_performance_report.loc[len(model_performance_report)]={ 'model-family':'DenseNet',
                                                              'model':'D 1,2,6',
                                                              'total params':18_521_768,
                                                              'trainable params':199_784,
                                                              'non-trainable params':18_321_984,
                                                              'training time per epoch (sec)':114,
                                                              'epochs':30, 
                                                              'arg min loss':29, 
                                                              'arg max accuracy':29,
                                                              'min loss':0.71,
                                                              'max accuracy':0.85,
                                                              'f1':0.826,
                                                              'precision':0.791,
                                                              'recall':0.890}

model_performance_report.loc[len(model_performance_report)]={ 'model-family':'DenseNet',
                                                              'model':'D 1,2,6,12',
                                                              'total params':18_521_768,
                                                              'trainable params':199_784,
                                                              'non-trainable params':18_321_984,
                                                              'training time per epoch (sec)':114,
                                                              'epochs':30, 
                                                              'arg min loss':29, 
                                                              'arg max accuracy':29,
                                                              'min loss':0.71,
                                                              'max accuracy':0.85,
                                                              'f1':0.826,
                                                              'precision':0.791,
                                                              'recall':0.890}

model_performance_report.loc[len(model_performance_report)]={ 'model-family':'DenseNet',
                                                              'model':'D 1,2,6,8',
                                                              'total params':18_521_768,
                                                              'trainable params':18_292_712,
                                                              'non-trainable params':229_056,
                                                              'training time per epoch (sec)':274,
                                                              'epochs':30, 
                                                              'arg min loss':26, 
                                                              'arg max accuracy':28,
                                                              'min loss':0.22,
                                                              'max accuracy':0.96,
                                                              'f1':0.948,
                                                              'precision':0.942,
                                                              'recall':0.957}

model_performance_report.loc[len(model_performance_report)]={ 'model-family':'DenseNet',
                                                              'model':'D 1,2,6,8,12',
                                                              'total params':18_521_768,
                                                              'trainable params':18_292_712,
                                                              'non-trainable params':229_056,
                                                              'training time per epoch (sec)':274,
                                                              'epochs':30, 
                                                              'arg min loss':26, 
                                                              'arg max accuracy':28,
                                                              'min loss':0.22,
                                                              'max accuracy':0.96,
                                                              'f1':0.948,
                                                              'precision':0.942,
                                                              'recall':0.957}

model_performance_report.loc[len(model_performance_report)]={ 'model-family':'EfficientNet',
                                                              'model':'EB7 1,2,6,8,9,10,11', 
                                                              'total params':64_364_024,
                                                              'trainable params':64_053_304,
                                                              'non-trainable params':310_720,
                                                              'training time per epoch (sec)':511,                                                             
                                                              'epochs':30, 
                                                              'arg min loss':20, 
                                                              'arg max accuracy':28,
                                                              'min loss':0.24,
                                                              'max accuracy':0.96,
                                                              'f1':0.956,
                                                              'precision':0.949,
                                                              'recall':0.967}

model_performance_report.loc[len(model_performance_report)]={ 'model-family':'Ensemble',
                                                              'model':'Ensemble 6,12 EB7+D201', 
                                                              'total params':82_885_792,
                                                              'trainable params':82_346_016,
                                                              'non-trainable params':539_776,
                                                              'training time per epoch (sec)':785,                                                             
                                                              'epochs':30, 
                                                              'arg min loss':20, 
                                                              'arg max accuracy':28,
                                                              'min loss':0.24,
                                                              'max accuracy':0.96,
                                                              'f1':0.962,
                                                              'precision':0.956,
                                                              'recall':0.971}

model_performance_report.loc[len(model_performance_report)]={ 'model-family':'MobileNet',
                                                              'model':'MobileNetV2 1,2,6', 
                                                              'total params':2_391_208,
                                                              'trainable params':133_224,
                                                              'non-trainable params':2_257_984,
                                                              'training time per epoch (sec)':79,                                                              
                                                              'epochs':30, 
                                                              'arg min loss':29, 
                                                              'arg max accuracy':26,
                                                              'min loss':0.83,
                                                              'max accuracy':0.8,
                                                              'f1':0.781,
                                                              'precision':0.752,
                                                              'recall':0.850} 

model_performance_report.loc[len(model_performance_report)]={ 'model-family':'MobileNet',
                                                              'model':'MobileNetV2 1,2,6,8', 
                                                              'total params':2_391_208,
                                                              'trainable params':2_357_096,
                                                              'non-trainable params':34_112,
                                                              'training time per epoch (sec)':102,                                                              
                                                              'epochs':30, 
                                                              'arg min loss':24, 
                                                              'arg max accuracy':27,
                                                              'min loss':0.27,
                                                              'max accuracy':0.95,
                                                              'f1':0.936,
                                                              'precision':0.929,
                                                              'recall':0.951}
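
The `arg min loss` and `arg max accuracy` columns appear to hold the zero-based epoch at which validation loss was lowest and validation accuracy was highest (a 12-epoch run never reports an index above 11). Below is a minimal sketch of how such a summary could be derived from per-epoch validation curves. The function name `summarize_history` and the sample curves are hypothetical and are not taken from the runs above.

```python
def summarize_history(val_loss, val_accuracy):
    """Summarize per-epoch validation curves into report-style fields.

    Epoch positions are zero-based, which is an assumption about how the
    'arg min loss' / 'arg max accuracy' columns were filled in.
    """
    if not val_loss or len(val_loss) != len(val_accuracy):
        raise ValueError("expected two non-empty curves of equal length")
    # Index of the first epoch with the lowest loss / highest accuracy
    best_loss_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)
    best_acc_epoch = max(range(len(val_accuracy)), key=val_accuracy.__getitem__)
    return {
        'epochs': len(val_loss),
        'arg min loss': best_loss_epoch,
        'arg max accuracy': best_acc_epoch,
        'min loss': round(val_loss[best_loss_epoch], 2),
        'max accuracy': round(val_accuracy[best_acc_epoch], 2),
    }


# Made-up curves for illustration only (not results from this notebook)
toy_loss = [2.10, 1.40, 0.95, 0.99]
toy_accuracy = [0.35, 0.58, 0.71, 0.73]
summary = summarize_history(toy_loss, toy_accuracy)
print(summary)
```

With Keras, the curves would typically come from `history.history['val_loss']` and the matching validation-accuracy key, whose exact name depends on the metric passed to `compile`.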

In [None]:
model_performance_report
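
Appending with `.loc[len(...)]` works, but every call repeats the same boilerplate and nothing checks the values. One alternative, sketched here in plain Python with hypothetical names (`make_row`, `REQUIRED_FIELDS`), is to validate each row as a dict and collect the rows in a list. `pd.DataFrame(rows)` could then build the whole table in one call, leaving columns that a row does not provide as NaN. The two rows below reuse values recorded above.

```python
REQUIRED_FIELDS = ('model-family', 'model', 'epochs', 'arg min loss',
                   'arg max accuracy', 'min loss', 'max accuracy',
                   'f1', 'precision', 'recall')
SCORE_FIELDS = ('max accuracy', 'f1', 'precision', 'recall')
EPOCH_INDEX_FIELDS = ('arg min loss', 'arg max accuracy')


def make_row(fields):
    """Return a copy of a report row after basic sanity checks."""
    missing = [name for name in REQUIRED_FIELDS if name not in fields]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    # Scores are fractions, so they must lie between 0 and 1
    for name in SCORE_FIELDS:
        if not 0.0 <= fields[name] <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {fields[name]}")
    # Epoch positions are assumed to be zero-based indices
    for name in EPOCH_INDEX_FIELDS:
        if not 0 <= fields[name] < fields['epochs']:
            raise ValueError(f"{name} must be a valid epoch index")
    return dict(fields)


rows = [
    make_row({'model-family': 'VGG', 'model': 'VGG16', 'epochs': 12,
              'arg min loss': 11, 'arg max accuracy': 11,
              'min loss': 3.47, 'max accuracy': 0.23,
              'f1': 0.123, 'precision': 0.146, 'recall': 0.226}),
    make_row({'model-family': 'DenseNet', 'model': 'DenseNet201', 'epochs': 12,
              'arg min loss': 11, 'arg max accuracy': 10,
              'min loss': 1.31, 'max accuracy': 0.74,
              'f1': 0.643, 'precision': 0.761, 'recall': 0.599}),
]
print(len(rows), [row['model'] for row in rows])
```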

In [None]:
#sns.set_theme(style="white")

# Plot validation accuracy for each model; marker size encodes the f1 score
with sns.axes_style("whitegrid", {'grid.linestyle': '--'}):
    myplot = sns.relplot(x="model", y="max accuracy", hue="model", size="f1",
                sizes=(100, 1000), alpha=1, palette="pastel", legend="brief", # legend can be "brief", "full", or False
                height=15, data=model_performance_report)

#myplot.fig.set_size_inches(25,15)

#Slightly rotate the x-axis labels so model names do not overlap
myplot.set_xticklabels(rotation=45)

#Add yaxis gridlines
myplot.axes[0][0].set_yticks(np.arange(0,1.05,0.05), minor=False)

#For each model, add model name, val accuracy and f1 score
df = model_performance_report.copy()
for line in range(0,df.shape[0]):
    if df['model'][line] in ['D 1,8', 'D 1,2,8', 'D 1,2,6', 'D 1,2,6,12', 'D 1,2,6,8', 'D 1,2,6,8,12', 'D 1,2,8,11', 'R101V2', 'EB7 1,2,6,8,9,10,11']:
        #Crowded points are labeled with only the first letter of the model name to limit overlap
        mytext = str(df['model'][line][0]) #+' '+str(df['max accuracy'][line])+' '+str(df['f1'][line])
    else:
        mytext = str(df['model'][line])+' - acc:'+str(df['max accuracy'][line])+' - f1:'+str(df['f1'][line])
        
    myplot.axes[0,0].text(model_performance_report['model'][line], 
                           df['max accuracy'][line], 
                           mytext, 
                           horizontalalignment='left', 
                           size='medium', 
                           color='black', 
                           weight='normal')

#Add title and rename axes        
myplot.set(title='Petals to the Metal - Model Performance - y axis:val acc, size:f1 score - Milestones: 12 epochs; 30 epochs; Hyperparameter tuning; End to end training; Ensemble models; 5x data; Ensemble models of 5x models', xlabel='Model', ylabel='Validation Accuracy')

#Add annotation for end-to-end training models (tuning8: pretrained layers set to trainable)
x_location, y_location = 10, 0.97
myplot.axes[0][0].annotate('End to end training (tuning8)', xy=(x_location+6, y_location), xytext=(x_location, y_location+0.01),
             arrowprops=dict(facecolor='#C4F0EF', shrink=0.05, headwidth=20, width=5))

#Add annotation for Transfer Learning with tuning models 
x_location, y_location = 1, 0.85
myplot.axes[0][0].annotate('Transfer Learning with tuning for 30 epochs', xy=(x_location+9, y_location), xytext=(x_location, y_location+0.01),
             arrowprops=dict(facecolor='#F5B78A', shrink=0.05, headwidth=20, width=5))

#Add annotation for transfer learning models
x_location, y_location = 5, 0.23
myplot.axes[0][0].annotate('Transfer Learning for 12 epochs', xy=(x_location-2, y_location-0.03), xytext=(x_location, y_location-0.05),
             arrowprops=dict(facecolor='lightgrey', shrink=0.05, headwidth=20, width=5))

#Add annotation for transfer learning models
x_location, y_location = -1.5, 0.8
myplot.axes[0][0].annotate('Transfer Learning for 12 epochs', xy=(x_location+6, y_location), xytext=(x_location, y_location+0.01),
             arrowprops=dict(facecolor='#FFFDAE', shrink=0.05, headwidth=20, width=5))

#Add annotation for Ensemble EB7+D201
x_location, y_location = 19, 0.97
myplot.axes[0][0].annotate('Ensemble EB7+D201', xy=(x_location+5, y_location-0.0005), xytext=(x_location, y_location+0.01),
             arrowprops=dict(facecolor='#CCBDFA', shrink=0.05, headwidth=20, width=5))

#Add annotation for 70K (5x) additional data
x_location, y_location = 22, 0.98
myplot.axes[0][0].annotate('Additional data 70K (5x)', xy=(x_location+5, y_location-0.01), xytext=(x_location, y_location+0.01),
             arrowprops=dict(facecolor='lightgrey', shrink=0.05, headwidth=20, width=5))

#Add annotation pointing to the legend
x_location, y_location = 15, 0.1
myplot.axes[0][0].annotate('Models sorted chronologically, size:f1 score', xy=(x_location+12, y_location), xytext=(x_location, y_location),
             arrowprops=dict(facecolor='black', shrink=0.05, headwidth=20, width=5))

#Add Tuning Legend
x_location, y_location, y_delta = 15.1, 0.655, 0.03
myplot.axes[0][0].annotate('Author: George Zoto', xy=(x_location, y_location+y_delta), size='x-large')
myplot.axes[0][0].annotate('Tuning Legend, models sorted chronologically', xy=(x_location, y_location), size='x-large')
myplot.axes[0][0].annotate('tuning1: pooling=avg, exponential_lr()', xy=(x_location, y_location-y_delta), size='large')
myplot.axes[0][0].annotate('tuning2: optimizer=nadam', xy=(x_location, y_location-2*y_delta), size='large')
myplot.axes[0][0].annotate('tuning3: Dropout(0.3)', xy=(x_location, y_location-3*y_delta), size='large')
myplot.axes[0][0].annotate('tuning4: data_augment_v2 with random_blockout', xy=(x_location, y_location-4*y_delta), size='large')
myplot.axes[0][0].annotate('tuning5: workers = 3 Multi-worker training with Keras', xy=(x_location, y_location-5*y_delta), size='large')
myplot.axes[0][0].annotate('tuning6: additional data', xy=(x_location, y_location-6*y_delta), size='large')
myplot.axes[0][0].annotate('tuning7: data_augment_v3', xy=(x_location, y_location-7*y_delta), size='large')
myplot.axes[0][0].annotate('tuning8: pretrained_model.trainable=True', xy=(x_location, y_location-8*y_delta), size='large')
myplot.axes[0][0].annotate('tuning9: EfficientNetB7', xy=(x_location, y_location-9*y_delta), size='large')
myplot.axes[0][0].annotate('tuning10: noisy-student', xy=(x_location, y_location-10*y_delta), size='large')
myplot.axes[0][0].annotate('tuning11: weight_per_class', xy=(x_location, y_location-11*y_delta), size='large');
myplot.axes[0][0].annotate('tuning12: Test Time Augmentation TTA', xy=(x_location, y_location-12*y_delta), size='large');

In [None]:
model_performance_report = model_performance_report.sort_values(by='max accuracy')
model_performance_report

## Plot model performance report 😀

In [None]:
#sns.set_theme(style="white")

# Plot max validation accuracy per model; marker size encodes the f1 score
with sns.axes_style("whitegrid", {'grid.linestyle': '--'}):
    myplot = sns.relplot(x="model", y="max accuracy", hue="model", size="f1",
                sizes=(100, 1000), alpha=1, palette="pastel", legend="brief",  # legend: "brief", "full", or False
                height=15, data=model_performance_report)

#myplot.fig.set_size_inches(25,15)

#Slightly rotate the x-axis labels so model names do not overlap
myplot.set_xticklabels(rotation=70)

#Add yaxis gridlines
myplot.axes[0][0].set_yticks(np.arange(0,1.05,0.05), minor=False)

#For each model, add model name, val accuracy and f1 score
df = model_performance_report.copy()
for line in range(0,df.shape[0]):
    if df['model'][line] in ['D 1,2', 'D 1,8', 'D 1,2,6', 'D 1,2,6,12', 'D 1,2,6,8', 'D 1,2,6,8,12', 'D 1,2,8', 'D 1,2,8,11', 'R101V2', 'EB7 1,2,8,9,10,11', 'EB7 1,2,6,8,9,10,11', 'Ensemble EB7+D201', 'MobileNetV2 1,2,6,8']:
        #print(df['model'][line])
        mytext = str(df['model'][line][0])#+' '+str(df['max accuracy'][line])+' '+str(df['f1'][line])
    else:
        mytext = str(df['model'][line])+' - acc:'+str(df['max accuracy'][line])+' - f1:'+str(df['f1'][line])
        
    myplot.axes[0,0].text(model_performance_report['model'][line], 
                           df['max accuracy'][line], 
                           mytext, 
                           horizontalalignment='left', 
                           size='medium', 
                           color='black', 
                           weight='normal')

#Add title and rename axes        
myplot.set(title='Petals to the Metal - Model Performance - y axis:val acc, size:f1 score - Milestones: 12 epochs; 30 epochs; Hyperparameter tuning; End to end training; Ensemble models; 5x data; Ensemble models of 5x models', xlabel='Model', ylabel='Validation Accuracy')

#Add annotation for training from scratch models
x_location, y_location = 14, 0.93
myplot.axes[0][0].annotate('End to end training (tuning8)', xy=(x_location+7, y_location), xytext=(x_location, y_location+0.01),
             arrowprops=dict(facecolor='#a0e2a7', shrink=0.05, headwidth=20, width=5))

#Add annotation for Transfer Learning with tuning models 
x_location, y_location = 4, 0.85
myplot.axes[0][0].annotate('Transfer Learning with tuning for 30 epochs', xy=(x_location+9, y_location), xytext=(x_location, y_location+0.01),
             arrowprops=dict(facecolor='#CCBDFA', shrink=0.05, headwidth=20, width=5))

#Add annotation for transfer learning models
x_location, y_location = 5, 0.23
myplot.axes[0][0].annotate('Transfer Learning for 12 epochs', xy=(x_location-2, y_location-0.03), xytext=(x_location, y_location-0.05),
             arrowprops=dict(facecolor='lightgrey', shrink=0.05, headwidth=20, width=5))

#Add annotation for transfer learning models
x_location, y_location = 0, 0.75
myplot.axes[0][0].annotate('Transfer Learning for 12 epochs', xy=(x_location+7, y_location), xytext=(x_location, y_location+0.01),
             arrowprops=dict(facecolor='lightgrey', shrink=0.05, headwidth=20, width=5))

#Add annotation for Ensemble EB7+D201
x_location, y_location = 19, 0.96
myplot.axes[0][0].annotate('Ensemble EB7+D201', xy=(x_location+5, y_location-0.001), xytext=(x_location, y_location+0.01),
             arrowprops=dict(facecolor='#ccbdfa', shrink=0.05, headwidth=20, width=5))

#Add annotation for 70K (5x) additional data
x_location, y_location = 21.5, 0.975
myplot.axes[0][0].annotate('Additional data 70K (5x)', xy=(x_location+5.5, y_location-0.005), xytext=(x_location, y_location+0.01),
             arrowprops=dict(facecolor='lightgrey', shrink=0.05, headwidth=20, width=5))

#Add annotation for Ensemble 6,12 EB7+D201
x_location, y_location = 26.5, 0.985
myplot.axes[0][0].annotate('Ensemble 6,12 EB7+D201', xy=(x_location+5.5, y_location-0.01), xytext=(x_location, y_location+0.01),
             arrowprops=dict(facecolor='#a0e2a7', shrink=0.05, headwidth=20, width=5))

#Add annotation describing the sort order and marker size
x_location, y_location = 15, 0.1
myplot.axes[0][0].annotate('Models sorted by val accuracy, size:f1 score', xy=(x_location+12, y_location), xytext=(x_location, y_location),
             arrowprops=dict(facecolor='black', shrink=0.05, headwidth=20, width=5))

#Add Tuning Legend
x_location, y_location, y_delta = 15.1, 0.655, 0.03
myplot.axes[0][0].annotate('Author: George Zoto', xy=(x_location, y_location+y_delta), size='x-large')
myplot.axes[0][0].annotate('Tuning Legend, models sorted by performance', xy=(x_location, y_location), size='x-large')
myplot.axes[0][0].annotate('tuning1: pooling=avg, exponential_lr()', xy=(x_location, y_location-y_delta), size='large')
myplot.axes[0][0].annotate('tuning2: optimizer=nadam', xy=(x_location, y_location-2*y_delta), size='large')
myplot.axes[0][0].annotate('tuning3: Dropout(0.3)', xy=(x_location, y_location-3*y_delta), size='large')
myplot.axes[0][0].annotate('tuning4: data_augment_v2 with random_blockout', xy=(x_location, y_location-4*y_delta), size='large')
myplot.axes[0][0].annotate('tuning5: workers = 3 Multi-worker training with Keras', xy=(x_location, y_location-5*y_delta), size='large')
myplot.axes[0][0].annotate('tuning6: additional data', xy=(x_location, y_location-6*y_delta), size='large')
myplot.axes[0][0].annotate('tuning7: data_augment_v3', xy=(x_location, y_location-7*y_delta), size='large')
myplot.axes[0][0].annotate('tuning8: pretrained_model.trainable=True', xy=(x_location, y_location-8*y_delta), size='large')
myplot.axes[0][0].annotate('tuning9: EfficientNetB7', xy=(x_location, y_location-9*y_delta), size='large')
myplot.axes[0][0].annotate('tuning10: noisy-student', xy=(x_location, y_location-10*y_delta), size='large')
myplot.axes[0][0].annotate('tuning11: weight_per_class', xy=(x_location, y_location-11*y_delta), size='large');
myplot.axes[0][0].annotate('tuning12: Test Time Augmentation TTA', xy=(x_location, y_location-12*y_delta), size='large');

## Plot model performance report in 3D 😎
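
The 3D plots in this section use columns such as `arg min loss` and `arg max accuracy`, which the first plot title describes as how early a model performed best. As a rough sketch of how such values could be derived, the example below assumes a Keras-style history dictionary with per-epoch validation metrics. The metric names and all numbers are made up for illustration, and the indices are zero-based.

```python
import numpy as np

# Hypothetical per-epoch validation metrics (not taken from the notebook's runs)
history = {
    'val_loss': [1.90, 1.10, 0.80, 0.75, 0.78, 0.82],
    'val_sparse_categorical_accuracy': [0.45, 0.70, 0.81, 0.84, 0.85, 0.84],
}

val_loss = np.array(history['val_loss'])
val_acc = np.array(history['val_sparse_categorical_accuracy'])

# One summary row per model: best value and the (zero-based) epoch where it occurred
summary = {
    'epochs': len(val_loss),
    'min loss': float(val_loss.min()),
    'arg min loss': int(val_loss.argmin()),
    'max accuracy': float(val_acc.max()),
    'arg max accuracy': int(val_acc.argmax()),
}
print(summary)
```

A small `arg min loss` relative to `epochs` suggests the model stopped improving early, which is one way to read the first 3D plot.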

In [None]:
model_performance_report.head(3)

In [None]:
fig = px.scatter_3d(model_performance_report, 
                    title='How early (arg min/max) did a model perform best', symbol='model-family', color='model', 
                    x='epochs', y='arg min loss', z='arg max accuracy',
                    size_max=12, opacity=0.7,
                    width=1200, height=700,
                   )

fig.update_layout(margin=dict(l=0, r=0, b=0, t=30))

fig.show()

In [None]:
model_performance_report.query('epochs == 12')

In [None]:
#Filter only for 12 epoch models
model_performance_report_filtered = model_performance_report.query('epochs == 12').copy()

fig = px.scatter_3d(model_performance_report_filtered, 
                    title='12 epoch model performance - loss and accuracy - by model-family', symbol='model-family', color='model', 
                    x='model-family', y='min loss', z='max accuracy', text='model',
                    size_max=12, opacity=0.7,
                    width=1200, height=700,
                   )

fig.update_layout(margin=dict(l=0, r=0, b=0, t=30))

fig.show()

In [None]:
model_performance_report.query('epochs == 30')

In [None]:
#Filter only for 30 epoch models
model_performance_report_filtered = model_performance_report.query('epochs == 30').copy()

fig = px.scatter_3d(model_performance_report_filtered, 
                    title='30 epoch model performance - loss and accuracy - by model-family', symbol='model-family', color='model', 
                    x='model-family', y='min loss', z='max accuracy', text='model',
                    size_max=12, opacity=0.7,
                    width=1200, height=700,
                   )

fig.update_layout(margin=dict(l=0, r=0, b=0, t=30))

fig.show()

In [None]:
#Filter only for 30 epoch models
model_performance_report_filtered = model_performance_report.query('epochs == 30').copy()

fig = px.scatter_3d(model_performance_report_filtered, 
                    title='30 epoch model performance - f1, precision, recall (color) - by model-family', symbol='model-family', color='recall', 
                    x='model-family', y='f1', z='precision', text='model',
                    size_max=12, opacity=0.7,
                    width=1200, height=700,
                   )

fig.update_layout(margin=dict(l=0, r=0, b=0, t=30))
fig.show()

In [None]:
#Filter only for 30 epoch models
model_performance_report_filtered = model_performance_report.query('epochs == 30').copy()

fig = px.scatter_3d(model_performance_report_filtered, 
                    title='30 epoch model performance - f1, precision, recall - by model-family', symbol='model-family', color='model', 
                    x='f1', y='precision', z='recall', text='model',
                    size_max=12, opacity=0.7,
                    width=1200, height=700,
                   )

fig.update_layout(margin=dict(l=0, r=0, b=0, t=30))
fig.show()

## Visual Validation ##

It can also be helpful to look at some examples from the validation set and see what class your model predicted. This can help reveal patterns in the kinds of images your model has trouble with.

This cell will set up the validation set to display 20 images at a time -- you can change this to display more or fewer, if you like.

In [None]:
dataset = get_validation_dataset()
dataset = dataset.unbatch().batch(20)
batch = iter(dataset)

And here is a set of flowers with their predicted species. Run the cell again to see another set.
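
When `using_ensemble_models` is set, the predicted species comes from a weighted average of two models' class probabilities, controlled by `best_alpha`. Here is a minimal NumPy sketch of that blending step. The helper name `blend_probabilities` and the probabilities are invented for illustration and are not part of the notebook.

```python
import numpy as np

def blend_probabilities(probs_a, probs_b, alpha):
    """Weighted average of two (n_samples, n_classes) probability arrays."""
    probs_a = np.asarray(probs_a, dtype=float)
    probs_b = np.asarray(probs_b, dtype=float)
    return alpha * probs_a + (1.0 - alpha) * probs_b

# Made-up probabilities for 2 images and 3 classes (illustrative only)
probs_model_a = np.array([[0.6, 0.3, 0.1],
                          [0.2, 0.5, 0.3]])
probs_model_b = np.array([[0.2, 0.7, 0.1],
                          [0.1, 0.2, 0.7]])

blended = blend_probabilities(probs_model_a, probs_model_b, alpha=0.5)
blended_predictions = np.argmax(blended, axis=-1)
print(blended_predictions)
```

With `alpha=1.0` the blend reduces to the first model alone, and with `alpha=0.0` to the second, so the weight can be tuned on validation data between those two extremes.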

In [None]:
images, labels = next(batch)

In [None]:
if using_ensemble_models:
    probabilities1 = model_EB7.predict(images)
    probabilities2 = model_D201.predict(images)
    probabilities = best_alpha * probabilities1 + (1 - best_alpha) * probabilities2
else:
    probabilities = model.predict(images)

In [None]:
predictions = np.argmax(probabilities, axis=-1)
display_batch_of_images((images, labels), predictions)

## Mismatches on validation data
## Inspired by Rosa's notebook [here](https://www.kaggle.com/wrrosa/tpu-enet-b7-densenet)
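
As a small illustration of the counting done in the next cell, the sketch below compares two hypothetical label arrays with NumPy. The arrays are invented for the example; in the notebook the real values live in `cm_predictions` and `cm_correct_labels`.

```python
import numpy as np

# Hypothetical predicted and true class ids for six validation images
example_predictions = np.array([3, 7, 7, 1, 0, 4])
example_labels = np.array([3, 7, 2, 1, 5, 4])

# Boolean mask of positions where prediction and label disagree
mismatch_mask = example_predictions != example_labels
mismatch_indices = np.flatnonzero(mismatch_mask)
mismatch_count = int(mismatch_mask.sum())
mismatch_rate = mismatch_count / len(example_labels)

print('Mismatches: {} out of {} ({:.2%})'.format(
    mismatch_count, len(example_labels), mismatch_rate))
for i in mismatch_indices:
    print('index {}: predicted {}, correct {}'.format(
        i, example_predictions[i], example_labels[i]))
```

Keeping the indices around, rather than only the count, makes it easy to pull out the misclassified images afterwards.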

In [None]:
mismatches = sum(cm_predictions!=cm_correct_labels)
print('Number of mismatches on validation data: {} out of {} or ({:.2%})'.format(mismatches, NUM_VALIDATION_IMAGES, mismatches/NUM_VALIDATION_IMAGES))

In [None]:
cmdataset = get_validation_dataset(ordered=True) # since we are splitting the dataset and iterating separately on images and labels, order matters.
images_ds = cmdataset.map(lambda image, label: image)
labels_ds = cmdataset.map(lambda image, label: label).unbatch()
cm_correct_labels = next(iter(labels_ds.batch(NUM_VALIDATION_IMAGES))).numpy() # get everything as one batch

mismatches_images, mismatches_predictions, mismatches_labels = [], [], []
mismatches_dataset = tf.data.Dataset.from_tensors([])
val_batch = iter(cmdataset.unbatch().batch(1))

for image_index in range(NUM_VALIDATION_IMAGES):
    batch = next(val_batch)
    if cm_predictions[image_index] != cm_correct_labels[image_index]:
        print('Predicted vs Correct labels: {}, {}'.format(cm_predictions[image_index], cm_correct_labels[image_index]))
        #display_batch_of_images(batch, np.array([cm_predictions[image_index]]))
        #mismatches_dataset = tf.data.Dataset.from_tensors(batch)
        #mismatches_images.append(tf.data.Dataset.from_tensors(batch))
        #mismatches_predictions.append(cm_predictions[image_index])
        #mismatches_labels.append(cm_correct_labels[image_index])

In [None]:
dataset = get_validation_dataset()
dataset = dataset.unbatch().batch(20)
batch = iter(dataset)
images, labels = next(batch)

In [None]:
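for i in range(3):
    # Recompute predictions for the batch being shown (single-model case; blend as above for ensembles)
    probabilities = model.predict(images)
    predictions = np.argmax(probabilities, axis=-1)
    display_batch_of_images((images, labels), predictions, display_mismatches_only=True)
    images, labels = next(batch)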

In [None]:
#mismatches_predictions[0]

Let's take a look again at some training images

In [None]:
one_batch = next(ds_iter)
display_batch_of_images(one_batch)

# Step 8: Make Test Predictions #

Once you're satisfied with everything, you're ready to make predictions on the test set.

## tuning12: Test Time Augmentation TTA inspired by Araik's notebook [here](https://www.kaggle.com/atamazian/fc-ensemble-external-data-effnet-densenet), Andrew's notebook [here](https://www.kaggle.com/andrewkh/test-time-augmentation-tta-worth-it) and Nathan's article [here](https://towardsdatascience.com/test-time-augmentation-tta-and-how-to-perform-it-with-keras-4ac19b67fb4d)
## [Learning Loss for Test-Time Augmentation Paper](https://arxiv.org/pdf/2010.11422.pdf)
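
The idea behind TTA is to predict several times on randomly augmented copies of the same images and average the resulting probabilities before taking the argmax. The sketch below shows that pattern with NumPy only. The `random_flip` augmentation and the `toy_predict` stand-in for `model.predict` are hypothetical and exist purely to make the example runnable.

```python
import numpy as np

def random_flip(images, rng):
    """Hypothetical augmentation: flip each image left-right with probability 0.5."""
    flip = rng.random(len(images)) < 0.5
    out = images.copy()
    out[flip] = out[flip][:, :, ::-1]
    return out

def toy_predict(images):
    """Stand-in for model.predict: softmax over left-half and right-half brightness."""
    half = images.shape[2] // 2
    logits = np.stack([images[:, :, :half].mean(axis=(1, 2)),
                       images[:, :, half:].mean(axis=(1, 2))], axis=1)
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)

def predict_with_tta(predict_fn, images, iterations, rng):
    """Average class probabilities over several augmented passes."""
    probs = [predict_fn(random_flip(images, rng)) for _ in range(iterations)]
    return np.mean(probs, axis=0)

rng = np.random.default_rng(0)
images = rng.random((4, 8, 8))  # 4 grayscale 8x8 "images"
tta_probabilities = predict_with_tta(toy_predict, images, iterations=3, rng=rng)
tta_predictions = np.argmax(tta_probabilities, axis=-1)
print(tta_predictions)
```

Because every pass sees the images in the same order, the per-pass probability arrays line up row by row and can be averaged directly, which is why the notebook requests the test dataset with `ordered=True`.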

In [None]:
using_tta = False #tuning12
tta_iterations = 3

In [None]:
if using_tta:
    def get_test_dataset(ordered=False):
        dataset = load_dataset(TEST_FILENAMES, labeled=False, ordered=ordered)
        dataset = dataset.map(data_augment, num_parallel_calls=AUTO) #tuning4
        #dataset = dataset.map(data_augment_v2, num_parallel_calls=AUTO) #tuning4 #error in shapes
        #dataset = dataset.map(data_augment_v3, num_parallel_calls=AUTO) #tuning4 0.44 performance :(
        dataset = dataset.batch(BATCH_SIZE)
        dataset = dataset.prefetch(AUTO)
        return dataset

In [None]:
def predict_tta(model, tta_iterations):
    probs  = []
    for i in range(tta_iterations):
        print('TTA iteration ', i)
        test_ds = get_test_dataset(ordered=True) # since we are splitting the dataset and iterating separately on images and ids, order matters.
        test_images_ds = test_ds.map(lambda image, idnum: image)
        
        if using_ensemble_models:
            print('using_ensemble_models')
            probabilities1 = model_EB7.predict(test_images_ds)
            probabilities2 = model_D201.predict(test_images_ds)
            probabilities = best_alpha * probabilities1 + (1 - best_alpha) * probabilities2
            probs.append(probabilities)
        else:
            probs.append(model.predict(test_images_ds,verbose=0))
        
    return probs

In [None]:
test_ds = get_test_dataset(ordered=True)
test_images_ds = test_ds.map(lambda image, idnum: image)

if using_tta:
    print('Computing predictions using TTA...')
    probabilities = np.mean(predict_tta(model, tta_iterations), axis=0)
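else:
    print('Computing predictions...')
    if using_ensemble_models:
        # Blend the two models the same way as in the validation and TTA cells
        probabilities = best_alpha * model_EB7.predict(test_images_ds) + (1 - best_alpha) * model_D201.predict(test_images_ds)
    else:
        probabilities = model.predict(test_images_ds)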
predictions = np.argmax(probabilities, axis=-1)
print(predictions)

We'll generate a file `submission.csv`. This file is what you'll submit to get your score on the leaderboard.
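
Before submitting, it can help to read the file back and confirm that it has the `id,label` header and one row per test image. The sketch below uses only the standard library and a temporary file. The helper names, ids and labels are invented for illustration, and the notebook itself writes the real file with `np.savetxt`.

```python
import csv
import os
import tempfile

def write_submission(path, ids, labels):
    """Write an id,label CSV in the same layout as submission.csv."""
    with open(path, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(['id', 'label'])
        writer.writerows(zip(ids, labels))

def check_submission(path, expected_rows):
    """Read the CSV back and verify header, row count and integer labels."""
    with open(path, newline='') as f:
        rows = list(csv.reader(f))
    header, body = rows[0], rows[1:]
    assert header == ['id', 'label'], 'unexpected header: {}'.format(header)
    assert len(body) == expected_rows, 'expected {} rows, found {}'.format(expected_rows, len(body))
    assert all(label.isdigit() for _, label in body), 'labels must be non-negative integers'
    return body

# Hypothetical ids and labels, written to a temporary location
example_ids = ['a1b2c3', 'd4e5f6', 'g7h8i9']
example_labels = [67, 103, 4]
example_path = os.path.join(tempfile.mkdtemp(), 'submission_example.csv')

write_submission(example_path, example_ids, example_labels)
checked_rows = check_submission(example_path, expected_rows=len(example_ids))
print(checked_rows)
```

For the real file, a call along the lines of `check_submission('submission.csv', NUM_TEST_IMAGES)` would apply the same checks.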

In [None]:
print('using_ensemble_models:', using_ensemble_models)
print('Generating submission.csv file...')

# Get image ids from test set and convert to unicode
test_ids_ds = test_ds.map(lambda image, idnum: idnum).unbatch()
test_ids = next(iter(test_ids_ds.batch(NUM_TEST_IMAGES))).numpy().astype('U')

# Write the submission file
np.savetxt(
    'submission.csv',
    np.rec.fromarrays([test_ids, predictions]),
    fmt=['%s', '%d'],
    delimiter=',',
    header='id,label',
    comments='',
)

# Look at the first few predictions
!head submission.csv

# Step 9: Make a submission #

If you haven't already, create your own editable copy of this notebook by clicking on the **Copy and Edit** button in the top right corner. Then, submit to the competition by following these steps:

1. Begin by clicking on the blue **Save Version** button in the top right corner of the window.  This will generate a pop-up window.  
2. Ensure that the **Save and Run All** option is selected, and then click on the blue **Save** button.
3. This generates a window in the bottom left corner of the notebook.  After it has finished running, click on the number to the right of the **Save Version** button.  This pulls up a list of versions on the right of the screen.  Click on the ellipsis **(...)** to the right of the most recent version, and select **Open in Viewer**.  This brings you into view mode of the same page. You will need to scroll down to get back to these instructions.
4. Click on the **Output** tab on the right of the screen.  Then, click on the file you would like to submit, and click on the blue **Submit** button to submit your results to the leaderboard.

You have now successfully submitted to the competition!

If you want to keep working to improve your performance, select the blue **Edit** button in the top right of the screen. Then you can change your code and repeat the process. There's a lot of room to improve, and you will climb up the leaderboard as you work.


---




*Have questions or comments? Visit the [Learn Discussion forum](https://www.kaggle.com/learn-forum/161321) to chat with other Learners.*