# Step 1: Imports #

This step imports TensorFlow along with a few supporting libraries, then prints the installed TensorFlow version to the console so you can confirm which release the notebook is running against.

**Libraries Imported:**

* math: The math library provides mathematical functions and constants.

* re: The re library allows for regular expression pattern matching.

* os: The os library provides a way to interact with the operating system, including file and directory operations.

* numpy (as np): The numpy library is used for numerical operations and handling arrays.

* tensorflow (as tf): The tensorflow library is the deep learning framework used throughout this notebook. In this step we only print its version.

In [None]:
import math, re, os
import numpy as np
import tensorflow as tf

print("Tensorflow version " + tf.__version__)

# Step 2: Distribution Strategy #

A TPU has eight different *cores* and each of these cores acts as its own accelerator. (A TPU is sort of like having eight GPUs in one machine.) We tell TensorFlow how to make use of all these cores at once through a **distribution strategy**. Run the following cell to create the distribution strategy that we'll later apply to our model.

In [None]:
# Detect TPU, return appropriate distribution strategy
try:
    tpu = tf.distribute.cluster_resolver.TPUClusterResolver() 
    print('Running on TPU ', tpu.master())
except ValueError:
    tpu = None

if tpu:
    tf.config.experimental_connect_to_cluster(tpu)
    tf.tpu.experimental.initialize_tpu_system(tpu)
    strategy = tf.distribute.experimental.TPUStrategy(tpu)
else:
    strategy = tf.distribute.get_strategy() 

print("REPLICAS: ", strategy.num_replicas_in_sync)

We'll use the distribution strategy when we create our neural network model. Then, TensorFlow will distribute the training among the eight TPU cores by creating eight different *replicas* of the model, one for each core.
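
As a rough illustration of what "one replica per core" means for batching, the sketch below splits a global batch evenly across replicas. The helper `per_replica_batch_size` and the constant `NUM_REPLICAS` are hypothetical names introduced here, not part of TensorFlow or this notebook, and the sketch insists on an even split only to keep the arithmetic simple.

```python
# Hypothetical helper (not a TensorFlow API): shows how a global batch
# is divided when every replica processes an equal share of it.
def per_replica_batch_size(global_batch_size, num_replicas):
    if global_batch_size % num_replicas != 0:
        raise ValueError("global batch size must divide evenly across replicas")
    return global_batch_size // num_replicas

NUM_REPLICAS = 8                  # assumed: one replica per TPU core
GLOBAL_BATCH = 16 * NUM_REPLICAS  # mirrors the 16-per-replica choice made later

print(GLOBAL_BATCH, per_replica_batch_size(GLOBAL_BATCH, NUM_REPLICAS))
```

On a single CPU or GPU the default strategy reports one replica, so the same arithmetic gives a global batch of 16.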

# Step 3: Loading the Competition Data #

## Get GCS Path ##

When used with TPUs, datasets need to be stored in a [Google Cloud Storage bucket](https://cloud.google.com/storage/). You can use data from any public GCS bucket by giving its path just like you would data from `'/kaggle/input'`. The following will retrieve the GCS path for this competition's dataset.

In [None]:
from kaggle_datasets import KaggleDatasets

GCS_DS_PATH = KaggleDatasets().get_gcs_path('tpu-getting-started')
print(GCS_DS_PATH) # what do gcs paths look like?

You can use data from any public dataset here on Kaggle in just the same way. If you'd like to use data from one of your private datasets, see [here](https://www.kaggle.com/docs/tpu#tpu3pt5).

## Load Data ##

When used with TPUs, datasets are often serialized into [TFRecords](https://www.kaggle.com/ryanholbrook/tfrecords-basics). This is a format convenient for distributing data to each of the TPU's cores. We've hidden the cell that reads the TFRecords for our dataset since the process is a bit long. You can come back to it later for guidance on using your own datasets with TPUs.

In [None]:

IMAGE_SIZE = [512, 512]
GCS_PATH = GCS_DS_PATH + '/tfrecords-jpeg-512x512'
AUTO = tf.data.experimental.AUTOTUNE

TRAINING_FILENAMES = tf.io.gfile.glob(GCS_PATH + '/train/*.tfrec')
VALIDATION_FILENAMES = tf.io.gfile.glob(GCS_PATH + '/val/*.tfrec')
TEST_FILENAMES = tf.io.gfile.glob(GCS_PATH + '/test/*.tfrec') 

CLASSES = ['pink primrose',    'hard-leaved pocket orchid', 'canterbury bells', 'sweet pea',     'wild geranium',     'tiger lily',           'moon orchid',              'bird of paradise', 'monkshood',        'globe thistle',         # 00 - 09
           'snapdragon',       "colt's foot",               'king protea',      'spear thistle', 'yellow iris',       'globe-flower',         'purple coneflower',        'peruvian lily',    'balloon flower',   'giant white arum lily', # 10 - 19
           'fire lily',        'pincushion flower',         'fritillary',       'red ginger',    'grape hyacinth',    'corn poppy',           'prince of wales feathers', 'stemless gentian', 'artichoke',        'sweet william',         # 20 - 29
           'carnation',        'garden phlox',              'love in the mist', 'cosmos',        'alpine sea holly',  'ruby-lipped cattleya', 'cape flower',              'great masterwort', 'siam tulip',       'lenten rose',           # 30 - 39
           'barberton daisy',  'daffodil',                  'sword lily',       'poinsettia',    'bolero deep blue',  'wallflower',           'marigold',                 'buttercup',        'daisy',            'common dandelion',      # 40 - 49
           'petunia',          'wild pansy',                'primula',          'sunflower',     'lilac hibiscus',    'bishop of llandaff',   'gaura',                    'geranium',         'orange dahlia',    'pink-yellow dahlia',    # 50 - 59
           'cautleya spicata', 'japanese anemone',          'black-eyed susan', 'silverbush',    'californian poppy', 'osteospermum',         'spring crocus',            'iris',             'windflower',       'tree poppy',            # 60 - 69
           'gazania',          'azalea',                    'water lily',       'rose',          'thorn apple',       'morning glory',        'passion flower',           'lotus',            'toad lily',        'anthurium',             # 70 - 79
           'frangipani',       'clematis',                  'hibiscus',         'columbine',     'desert-rose',       'tree mallow',          'magnolia',                 'cyclamen ',        'watercress',       'canna lily',            # 80 - 89
           'hippeastrum ',     'bee balm',                  'pink quill',       'foxglove',      'bougainvillea',     'camellia',             'mallow',                   'mexican petunia',  'bromelia',         'blanket flower',        # 90 - 99
           'trumpet creeper',  'blackberry lily',           'common tulip',     'wild rose']                                                                                                                                               # 100 - 102


def decode_image(image_data):
    image = tf.image.decode_jpeg(image_data, channels=3)
    image = tf.cast(image, tf.float32) / 255.0  # convert image to floats in [0, 1] range
    image = tf.reshape(image, [*IMAGE_SIZE, 3]) # explicit size needed for TPU
    return image

def read_labeled_tfrecord(example):
    LABELED_TFREC_FORMAT = {
        "image": tf.io.FixedLenFeature([], tf.string), # tf.string means bytestring
        "class": tf.io.FixedLenFeature([], tf.int64),  # shape [] means single element
    }
    example = tf.io.parse_single_example(example, LABELED_TFREC_FORMAT)
    image = decode_image(example['image'])
    label = tf.cast(example['class'], tf.int32)
    return image, label # one (image, label) pair; mapping this over a dataset yields a dataset of such pairs

def read_unlabeled_tfrecord(example):
    UNLABELED_TFREC_FORMAT = {
        "image": tf.io.FixedLenFeature([], tf.string), # tf.string means bytestring
        "id": tf.io.FixedLenFeature([], tf.string),  # shape [] means single element
        # class is missing: this competition's challenge is to predict flower classes for the test dataset
    }
    example = tf.io.parse_single_example(example, UNLABELED_TFREC_FORMAT)
    image = decode_image(example['image'])
    idnum = example['id']
    return image, idnum # one (image, id) pair; mapping this over a dataset yields a dataset of such pairs

def load_dataset(filenames, labeled=True, ordered=False):
    # Read from TFRecords. For optimal performance, reading from multiple files at once and
    # disregarding data order. Order does not matter since we will be shuffling the data anyway.

    ignore_order = tf.data.Options()
    if not ordered:
        ignore_order.experimental_deterministic = False # disable order, increase speed

    dataset = tf.data.TFRecordDataset(filenames, num_parallel_reads=AUTO) # automatically interleaves reads from multiple files
    dataset = dataset.with_options(ignore_order) # uses data as soon as it streams in, rather than in its original order
    dataset = dataset.map(read_labeled_tfrecord if labeled else read_unlabeled_tfrecord, num_parallel_calls=AUTO)
    # returns a dataset of (image, label) pairs if labeled=True or (image, id) pairs if labeled=False
    return dataset

## Create Data Pipelines ##

### Overview:
The following code defines the utility functions that build the training, validation, and test input pipelines with `tf.data`, plus a helper that counts the images in each split. Each function is described below.

### Code Functions:

#### 1. `data_augment(image, label)`
- Purpose: This function performs data augmentation on an image.
- Input:
  - `image`: The input image (e.g., a training image).
  - `label`: The corresponding label for the image.
- Functionality:
  - Randomly flips the input image horizontally (left to right).
  - Can be extended to include other data augmentation techniques (e.g., random saturation adjustment).
- Output:
  - Returns the augmented image and the original label.

#### 2. `get_training_dataset()`
- Purpose: This function prepares a training dataset.
- Functionality:
  - Loads the training dataset from file(s) specified in `TRAINING_FILENAMES`.
  - Applies data augmentation using the `data_augment` function.
  - Repeats the dataset indefinitely to ensure it covers multiple epochs.
  - Shuffles the dataset to introduce randomness.
  - Batches the data to the desired batch size (`BATCH_SIZE`).
  - Prefetches data for improved performance.
- Output:
  - Returns the prepared training dataset.

#### 3. `get_validation_dataset(ordered=False)`
- Purpose: This function prepares a validation dataset.
- Input:
  - `ordered` (Optional): If `True`, maintains the order of data elements (for validation).
- Functionality:
  - Loads the validation dataset from file(s) specified in `VALIDATION_FILENAMES`.
  - Batches the data to the desired batch size (`BATCH_SIZE`).
  - Caches the dataset in memory for faster access.
  - Prefetches data for improved performance.
- Output:
  - Returns the prepared validation dataset.

#### 4. `get_test_dataset(ordered=False)`
- Purpose: This function prepares a test dataset.
- Input:
  - `ordered` (Optional): If `True`, maintains the order of data elements (for testing).
- Functionality:
  - Loads the test dataset from file(s) specified in `TEST_FILENAMES`.
  - Batches the data to the desired batch size (`BATCH_SIZE`).
  - Prefetches data for improved performance.
- Output:
  - Returns the prepared test dataset.

#### 5. `count_data_items(filenames)`
- Purpose: This function counts the number of data items in a list of TFRecord filenames.
- Input:
  - `filenames`: A list of TFRecord filenames.
- Functionality:
  - Extracts and sums the number of data items from the filenames (indicated by numeric values in the filenames).
- Output:
  - Returns the total count of data items in the provided filenames.

#### Additional Information:
- `NUM_TRAINING_IMAGES`, `NUM_VALIDATION_IMAGES`, and `NUM_TEST_IMAGES` are calculated using `count_data_items` for reporting purposes, indicating the number of data items in the respective datasets.

### Important Notes:
- The functions read the constants `TRAINING_FILENAMES`, `VALIDATION_FILENAMES`, `TEST_FILENAMES`, and `BATCH_SIZE` when they are called, not when they are defined. `BATCH_SIZE` is set in a later cell, so it must be defined before any of the `get_*_dataset` functions run.
- Data augmentation can be adjusted or extended in the `data_augment` function, for example by uncommenting the random saturation line.
- Parallel reads, parallel mapping, and prefetching keep the accelerator supplied with data during training and evaluation.

Feel free to use and adapt these functions in your machine learning projects as needed.
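
To make the file-name convention behind `count_data_items` concrete, here is a small stand-alone sketch using the same regular expression. The file names and the bucket path are invented for illustration; only the `-<count>.tfrec` suffix pattern is taken from the notebook's own comment.

```python
import re

# Hypothetical file names that follow the "<name>-<count>.tfrec" convention.
example_filenames = [
    "gs://example-bucket/train/00-512x512-798.tfrec",
    "gs://example-bucket/train/01-512x512-798.tfrec",
    "gs://example-bucket/train/15-512x512-783.tfrec",
]

# Same pattern as in the notebook: digits between a hyphen and a dot.
COUNT_PATTERN = re.compile(r"-([0-9]*)\.")

def count_items(filenames):
    # The first hyphen-digits-dot match in each name holds the item count.
    return sum(int(COUNT_PATTERN.search(name).group(1)) for name in filenames)

print(count_items(example_filenames))
```

Note that the pattern skips the `-512x512` part of each name because those digits are not immediately followed by a dot.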

In [None]:

def data_augment(image, label):
    image = tf.image.random_flip_left_right(image)
    #image = tf.image.random_saturation(image, 0, 2)
    return image, label   

def get_training_dataset():
    dataset = load_dataset(TRAINING_FILENAMES, labeled=True)
    dataset = dataset.map(data_augment, num_parallel_calls=AUTO)
    dataset = dataset.repeat() # the training dataset must repeat for several epochs
    dataset = dataset.shuffle(2048)
    dataset = dataset.batch(BATCH_SIZE)
    dataset = dataset.prefetch(AUTO) # prefetch next batch while training (autotune prefetch buffer size)
    return dataset

def get_validation_dataset(ordered=False):
    dataset = load_dataset(VALIDATION_FILENAMES, labeled=True, ordered=ordered)
    dataset = dataset.batch(BATCH_SIZE)
    dataset = dataset.cache()
    dataset = dataset.prefetch(AUTO)
    return dataset

def get_test_dataset(ordered=False):
    dataset = load_dataset(TEST_FILENAMES, labeled=False, ordered=ordered)
    dataset = dataset.batch(BATCH_SIZE)
    dataset = dataset.prefetch(AUTO)
    return dataset

def count_data_items(filenames):
    # the number of data items is written in the name of the .tfrec
    # files, i.e. flowers00-230.tfrec = 230 data items
    n = [int(re.compile(r"-([0-9]*)\.").search(filename).group(1)) for filename in filenames]
    return np.sum(n)

NUM_TRAINING_IMAGES = count_data_items(TRAINING_FILENAMES)
NUM_VALIDATION_IMAGES = count_data_items(VALIDATION_FILENAMES)
NUM_TEST_IMAGES = count_data_items(TEST_FILENAMES)
print('Dataset: {} training images, {} validation images, {} unlabeled test images'.format(NUM_TRAINING_IMAGES, NUM_VALIDATION_IMAGES, NUM_TEST_IMAGES))


The next cell sets the global batch size and builds the training, validation, and test datasets. The batch size is 16 multiplied by `strategy.num_replicas_in_sync`, so on a TPU with eight replicas each step processes 128 images (16 per core), while on a single device it stays at 16. Printing each dataset shows the shapes and types of its elements.

In [None]:
BATCH_SIZE = 16 * strategy.num_replicas_in_sync

ds_train = get_training_dataset()
ds_valid = get_validation_dataset()
ds_test = get_test_dataset()

print("Training:", ds_train)
print("Validation:", ds_valid)
print("Test:", ds_test)

The next cell sets NumPy print options to keep the output short, then prints the image and label shapes of the first three training batches, followed by the labels of the last batch inspected.

In [None]:
np.set_printoptions(threshold=15, linewidth=80)

print("Training data shapes:")
for image, label in ds_train.take(3):
    print(image.numpy().shape, label.numpy().shape)
print("Training data label examples:", label.numpy())

The next cell does the same for the test data: it prints the image and ID shapes of the first three test batches, followed by the IDs of the last batch inspected.

In [None]:
print("Test data shapes:")
for image, idnum in ds_test.take(3):
    print(image.numpy().shape, idnum.numpy().shape)
print("Test data IDs:", idnum.numpy().astype('U')) # U=unicode string

# Step 4: Explore Data #

**Overview**:
The following code defines helper functions for displaying batches of images and for plotting training curves. Each function is described below.

**Functions**:

1. `batch_to_numpy_images_and_labels(data)`
    - **Purpose**: Converts a batch of images and labels from TensorFlow tensors to NumPy arrays.
    - **Input**:
        - `data`: A batch of images and labels.
    - **Functionality**:
        - Extracts and converts images and labels to NumPy arrays.
        - Handles special cases where labels may be binary strings (e.g., image IDs).
    - **Output**:
        - Returns NumPy arrays for images and labels.

2. `title_from_label_and_target(label, correct_label)`
    - **Purpose**: Generates a title for an image from its predicted label and, when available, its true label.
    - **Input**:
        - `label`: The predicted label of the image.
        - `correct_label`: The true label of the image (can be None for test data).
    - **Functionality**:
        - Constructs a title that names the predicted class, marks it `OK` or `NO`, and for a wrong prediction appends an arrow followed by the correct class.
        - If `correct_label` is None, returns only the class name and treats the prediction as correct.
    - **Output**:
        - Returns a formatted title and a Boolean indicating correctness.

3. `display_one_flower(image, title, subplot, red=False, titlesize=16)`
    - **Purpose**: Displays a single image with an optional title.
    - **Input**:
        - `image`: The image to be displayed.
        - `title`: The title to be displayed above the image.
        - `subplot`: A `(rows, cols, index)` tuple identifying the grid and the position to draw in.
        - `red` (Optional): If True, the title is displayed in red.
        - `titlesize` (Optional): The font size for the title.
    - **Functionality**:
        - Displays the image with the specified title.
        - Allows customization of title appearance.
    - **Output**:
        - Returns the subplot tuple with its index advanced by one, ready for the next image.

4. `display_batch_of_images(databatch, predictions=None)`
    - **Purpose**: Displays a batch of images along with their labels or predictions.
    - **Input**:
        - `databatch`: A batch of images and labels.
        - `predictions` (Optional): Predictions for the images (can be None for training).
    - **Functionality**:
        - Chooses a square or near-square grid from the number of images; images that do not fit the grid are dropped.
        - Displays images with labels or predicted labels.
    - **Output**:
        - Displays the batch of images with titles and optionally highlights incorrect predictions.

5. `display_training_curves(training, validation, title, subplot)`
    - **Purpose**: Plots training and validation curves for model performance.
    - **Input**:
        - `training`: Training data (e.g., accuracy or loss) to be plotted.
        - `validation`: Validation data (e.g., accuracy or loss) to be plotted.
        - `title`: Title of the plot.
        - `subplot`: Subplot configuration for plotting.
    - **Functionality**:
        - Sets up subplots if necessary and plots training and validation curves.
        - Adjusts subplot appearance and legend.
    - **Output**:
        - Displays the training and validation curves.

**Important Notes**:
- These functions are designed for visualizing data and monitoring the training process in machine learning projects.
- The provided functions are versatile and can be adapted to various visualization requirements.
- Ensure that you have the necessary libraries (e.g., Matplotlib) installed to use these functions effectively.
- The code assumes that certain constants (e.g., CLASSES) and data structures are defined elsewhere in your project.
- Use these functions as needed to gain insights into your data and model training progress.
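
The "auto-squaring" rule in `display_batch_of_images` can be checked in isolation. The sketch below reproduces only that arithmetic; `grid_shape` is a hypothetical helper name introduced here, and it assumes at least one image.

```python
import math

def grid_shape(num_images):
    """Return (rows, cols, shown) using the notebook's auto-squaring rule."""
    rows = int(math.sqrt(num_images))  # largest square side that fits
    cols = num_images // rows          # widen the grid if images remain
    return rows, cols, rows * cols

for n in (20, 7, 16):
    rows, cols, shown = grid_shape(n)
    print(f"{n} images -> {rows}x{cols} grid, {n - shown} dropped")
```

A batch of 20 fills a 4x5 grid exactly, which is one reason the exploration cell later regroups the data into batches of 20.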

In [None]:

from matplotlib import pyplot as plt

def batch_to_numpy_images_and_labels(data):
    images, labels = data
    numpy_images = images.numpy()
    numpy_labels = labels.numpy()
    if numpy_labels.dtype == object: # binary string in this case,
                                     # these are image ID strings
        numpy_labels = [None for _ in enumerate(numpy_images)]
    # If no labels, only image IDs, return None for labels (this is
    # the case for test data)
    return numpy_images, numpy_labels

def title_from_label_and_target(label, correct_label):
    if correct_label is None:
        return CLASSES[label], True
    correct = (label == correct_label)
    return "{} [{}{}{}]".format(CLASSES[label], 'OK' if correct else 'NO', u"\u2192" if not correct else '',
                                CLASSES[correct_label] if not correct else ''), correct

def display_one_flower(image, title, subplot, red=False, titlesize=16):
    plt.subplot(*subplot)
    plt.axis('off')
    plt.imshow(image)
    if len(title) > 0:
        plt.title(title, fontsize=int(titlesize) if not red else int(titlesize/1.2), color='red' if red else 'black', fontdict={'verticalalignment':'center'}, pad=int(titlesize/1.5))
    return (subplot[0], subplot[1], subplot[2]+1)
    
def display_batch_of_images(databatch, predictions=None):
    """This will work with:
    display_batch_of_images(images)
    display_batch_of_images(images, predictions)
    display_batch_of_images((images, labels))
    display_batch_of_images((images, labels), predictions)
    """
    # data
    images, labels = batch_to_numpy_images_and_labels(databatch)
    if labels is None:
        labels = [None for _ in enumerate(images)]
        
    # auto-squaring: this will drop data that does not fit into square
    # or square-ish rectangle
    rows = int(math.sqrt(len(images)))
    cols = len(images)//rows
        
    # size and spacing
    FIGSIZE = 13.0
    SPACING = 0.1
    subplot=(rows,cols,1)
    if rows < cols:
        plt.figure(figsize=(FIGSIZE,FIGSIZE/cols*rows))
    else:
        plt.figure(figsize=(FIGSIZE/rows*cols,FIGSIZE))
    
    # display
    for i, (image, label) in enumerate(zip(images[:rows*cols], labels[:rows*cols])):
        title = '' if label is None else CLASSES[label]
        correct = True
        if predictions is not None:
            title, correct = title_from_label_and_target(predictions[i], label)
        dynamic_titlesize = FIGSIZE*SPACING/max(rows,cols)*40+3 # magic formula tested to work from 1x1 to 10x10 images
        subplot = display_one_flower(image, title, subplot, not correct, titlesize=dynamic_titlesize)
    
    #layout
    plt.tight_layout()
    if label is None and predictions is None:
        plt.subplots_adjust(wspace=0, hspace=0)
    else:
        plt.subplots_adjust(wspace=SPACING, hspace=SPACING)
    plt.show()


def display_training_curves(training, validation, title, subplot):
    if subplot%10==1: # set up the subplots on the first call
        plt.subplots(figsize=(10,10), facecolor='#F0F0F0')
        plt.tight_layout()
    ax = plt.subplot(subplot)
    ax.set_facecolor('#F8F8F8')
    ax.plot(training)
    ax.plot(validation)
    ax.set_title('model '+ title)
    ax.set_ylabel(title)
    #ax.set_ylim(0.28,1.05)
    ax.set_xlabel('epoch')
    ax.legend(['train', 'valid.'])

The next cell takes the batched training dataset, unbatches it into individual samples, and regroups those samples into batches of 20. The resulting iterator, `ds_iter`, yields one 20-image batch each time `next` is called on it.
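
The regrouping idea can be sketched in plain Python, without `tf.data`. This is only an analogy: `rebatch` is a hypothetical helper, lists stand in for tensors, and the comments mark which step plays the role of `unbatch()` and which plays the role of `batch(20)`.

```python
from itertools import chain, islice

def rebatch(batches, new_size):
    """Flatten an iterable of batches, then regroup into lists of new_size."""
    samples = chain.from_iterable(batches)       # plays the role of unbatch()
    while True:
        group = list(islice(samples, new_size))  # plays the role of batch(new_size)
        if not group:
            return
        yield group

original = [list(range(i, i + 8)) for i in range(0, 48, 8)]  # six batches of 8
regrouped = list(rebatch(original, 20))
print([len(b) for b in regrouped])
```

In this finite example the last group is smaller than the rest. The training dataset in the notebook repeats indefinitely, so its iterator keeps producing full batches of 20.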

In [None]:
ds_iter = iter(ds_train.unbatch().batch(20))

The cell below fetches the next batch from the iterator and displays it with `display_batch_of_images`. Rerun the cell to view another batch while inspecting the training data.

In [None]:
one_batch = next(ds_iter)
display_batch_of_images(one_batch)

# Step 5: Define Model #



The provided code installs the efficientnet package using pip and then imports the EfficientNet model implementation from the efficientnet.tfkeras module. 

In [None]:
! pip install -q efficientnet
import efficientnet.tfkeras as efn

The provided code defines a deep learning model for a classification task using TensorFlow and the EfficientNetB7 architecture. Here's an explanation of what each part of the code does:

1. `EPOCHS = 30`: This line sets the number of training epochs to 30. An epoch represents one complete iteration through the entire training dataset during training.

2. `with strategy.scope():`: This line indicates that the subsequent code within the block will be executed within the context of a distribution strategy scope. Distribution strategies are used to train models on multiple GPUs or TPUs in a distributed manner.

3. `pretrained_model = efn.EfficientNetB7(...)`: Inside the strategy scope, this line creates an instance of the EfficientNetB7 model from the `efficientnet.tfkeras` module. It is initialized with the following parameters:
   - `weights='noisy-student'`: The model is initialized with weights produced by Noisy Student training, a semi-supervised self-training method that uses ImageNet together with additional unlabeled images. These pre-trained weights serve as the starting point for transfer learning.
   - `include_top=False`: The top classification layer of the pre-trained model is not included, allowing you to add your own classification layer.
   - `input_shape=[*IMAGE_SIZE, 3]`: Specifies the expected input shape for images, where `IMAGE_SIZE` is the two-element list `[512, 512]` defined earlier, and `3` is the number of color channels (RGB).

4. `pretrained_model.trainable = True`: This line sets the layers of the pre-trained model to be trainable, which means their weights can be updated during the fine-tuning process.

5. `model = tf.keras.Sequential([...])`: This code defines the overall model architecture by creating a sequential model. It consists of:
   - The `pretrained_model`: The EfficientNetB7 model as the base, which extracts features from input images.
   - `tf.keras.layers.GlobalAveragePooling2D()`: A global average pooling layer, which reduces the spatial dimensions of the feature maps to a single value per feature map, effectively flattening the features.
   - `tf.keras.layers.Dense(len(CLASSES), activation='softmax')`: The final classification layer with as many units as there are classes (defined by `len(CLASSES)`). The activation function is softmax, which is commonly used for multi-class classification tasks.

In summary, this code sets up a deep learning model using the EfficientNetB7 architecture, initializes it with pre-trained weights, and configures it for transfer learning. It is suitable for image classification tasks where you can fine-tune the pre-trained model's weights on your specific dataset.
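
To show what the pooling step in the head does, here is a toy sketch in plain Python. The feature values are made up, the helper names are introduced here for illustration, and the softmax is applied directly to the pooled values for brevity; in the actual model a `Dense` layer sits between the pooling and the softmax.

```python
import math

def global_average_pool(feature_map):
    """Average an H x W x C nested list over H and W, giving C values."""
    height, width = len(feature_map), len(feature_map[0])
    channels = len(feature_map[0][0])
    return [
        sum(feature_map[y][x][c] for y in range(height) for x in range(width))
        / (height * width)
        for c in range(channels)
    ]

def softmax(logits):
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]

# Toy 2 x 2 feature map with 3 channels (values are invented).
toy_features = [
    [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]],
    [[1.0, 4.0, 2.0], [3.0, 0.0, 2.0]],
]
pooled = global_average_pool(toy_features)
print(pooled)
print(softmax(pooled))
```

Each channel collapses to a single number regardless of the spatial size, which is why the head works unchanged for different image resolutions.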

In [None]:
EPOCHS = 30

with strategy.scope():
    pretrained_model = efn.EfficientNetB7(
        weights='noisy-student',
        include_top=False ,
        input_shape=[*IMAGE_SIZE, 3]
    )
    pretrained_model.trainable = True
    
    model = tf.keras.Sequential([
        # To a base pretrained on ImageNet to extract features from images...
        pretrained_model,
        # ... attach a new head to act as a classifier.
        tf.keras.layers.GlobalAveragePooling2D(),
#         tf.keras.layers.Dense(64),
        tf.keras.layers.Dense(len(CLASSES), activation='softmax')
    ])

The provided code compiles the deep learning model, specifying the optimizer, loss function, and evaluation metric. It also prints a summary of the model's architecture. Here's an explanation of each part of the code:

1. `model.compile(...)`: This line compiles the model, configuring its training process. The `compile` method takes the following arguments:
   - `optimizer='adam'`: Specifies the optimizer to be used during training. In this case, it uses the Adam optimizer, a popular choice for gradient-based optimization.
   - `loss='sparse_categorical_crossentropy'`: Defines the loss function to be used for training. For multi-class classification tasks with integer labels (as opposed to one-hot encoded labels), 'sparse_categorical_crossentropy' is a common choice.
   - `metrics=['sparse_categorical_accuracy']`: Specifies the evaluation metric(s) to be monitored during training. In this case, it uses 'sparse_categorical_accuracy' to measure the model's accuracy during training.

2. `model.summary()`: This line prints a summary of the model's architecture, including the layers, output shape of each layer, and the number of trainable parameters. It provides a concise overview of the model's structure.

In summary, this code configures the model for training by specifying the optimizer, loss function, and evaluation metric. Additionally, it provides a summary of the model's architecture to help you understand its structure and parameter count.
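
The loss and metric named above can be computed by hand on a tiny example. This is a simplified sketch rather than the Keras implementation (it skips details such as numerical clipping), and the probabilities and labels are invented.

```python
import math

def sparse_categorical_crossentropy(label, probs):
    # With integer labels, the loss is the negative log of the
    # probability the model assigned to the true class.
    return -math.log(probs[label])

def sparse_categorical_accuracy(labels, batch_probs):
    # Fraction of samples whose highest-probability class equals the label.
    hits = sum(
        1 for label, probs in zip(labels, batch_probs)
        if max(range(len(probs)), key=probs.__getitem__) == label
    )
    return hits / len(labels)

labels = [1, 0, 2]
batch_probs = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.6, 0.3, 0.1],
]
losses = [sparse_categorical_crossentropy(l, p) for l, p in zip(labels, batch_probs)]
print([round(v, 4) for v in losses])
print(sparse_categorical_accuracy(labels, batch_probs))
```

The third sample is misclassified and also has the largest loss, since the model gave its true class a probability of only 0.1.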

In [None]:
model.compile(
    optimizer='adam',
    loss = 'sparse_categorical_crossentropy',
    metrics=['sparse_categorical_accuracy'],
)

model.summary()

# Step 6: Training #

## Learning Rate Schedule ##

**Overview**:
This code defines a learning rate schedule for fine-tuning: a linear ramp-up, an optional plateau at the maximum rate, and then an exponential decay toward a minimum rate. The learning rate controls the step size of gradient descent, so the schedule influences how quickly and how stably the model converges. Starting with a small rate is commonly used in fine-tuning to avoid disrupting the pre-trained weights early on.

**Functions**:

1. `exponential_lr(epoch, start_lr=0.00001, min_lr=0.00001, max_lr=0.00005, rampup_epochs=5, sustain_epochs=0, exp_decay=0.8)`:
    - **Purpose**: Generates a learning rate for a given epoch using a schedule that ramps up linearly and then decays exponentially.
    - **Input**:
        - `epoch`: The current training epoch.
        - `start_lr`: The initial learning rate.
        - `min_lr`: The minimum learning rate.
        - `max_lr`: The maximum learning rate.
        - `rampup_epochs`: The number of epochs for a linear learning rate increase.
        - `sustain_epochs`: The number of epochs with a sustained maximum learning rate.
        - `exp_decay`: The exponential decay factor.
    - **Functionality**:
        - Computes the learning rate based on the given parameters and the current epoch.
        - The learning rate schedule includes linear increase, sustained maximum, and exponential decay phases.
    - **Output**:
        - Returns the computed learning rate for the current epoch.

2. `lr_callback`: This line creates a TensorFlow callback, `LearningRateScheduler`, that adjusts the learning rate during training according to the `exponential_lr` schedule. The callback is verbose, meaning it will print the learning rate adjustments during training.

3. `rng = [i for i in range(EPOCHS)]`: Generates a list of epochs from 0 to `EPOCHS - 1`.

4. `y = [exponential_lr(x) for x in rng]`: Computes the learning rates for each epoch in the range using the `exponential_lr` function.

5. `plt.plot(rng, y)`: Plots the learning rate schedule to visualize how the learning rate changes over epochs.

6. `print("Learning rate schedule: {:.3g} to {:.3g} to {:.3g}".format(y[0], max(y), y[-1]))`: Prints a summary of the learning rate schedule, showing the initial, maximum, and final learning rates.

**Important Notes**:
- The learning rate schedule is a critical hyperparameter in training deep learning models, affecting training convergence and performance.
- The provided `exponential_lr` function allows you to customize the learning rate schedule based on your specific requirements.
- Visualizing the learning rate schedule can help you understand how the learning rate evolves during training.
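
Written as a formula, the three phases described above look as follows. The symbols are shorthand introduced here and do not appear in the code: $e$ is the epoch, $\eta_0$ is `start_lr`, $\eta_{\min}$ is `min_lr`, $\eta_{\max}$ is `max_lr`, $r$ is `rampup_epochs`, $s$ is `sustain_epochs`, and $\gamma$ is `exp_decay`.

$$
\mathrm{lr}(e) =
\begin{cases}
\dfrac{\eta_{\max} - \eta_0}{r}\, e + \eta_0, & e < r \\[6pt]
\eta_{\max}, & r \le e < r + s \\[6pt]
(\eta_{\max} - \eta_{\min})\, \gamma^{\,e - r - s} + \eta_{\min}, & e \ge r + s
\end{cases}
$$

With the default arguments ($\eta_0 = \eta_{\min} = 10^{-5}$, $\eta_{\max} = 5 \times 10^{-5}$, $r = 5$, $s = 0$, $\gamma = 0.8$), the rate starts at $10^{-5}$, reaches its peak of $5 \times 10^{-5}$ at epoch 5, and has decayed to roughly $1.02 \times 10^{-5}$ by epoch 29.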

In [None]:

# Learning Rate Schedule for Fine Tuning #
def exponential_lr(epoch,
                   start_lr = 0.00001, min_lr = 0.00001, max_lr = 0.00005,
                   rampup_epochs = 5, sustain_epochs = 0,
                   exp_decay = 0.8):

    def lr(epoch, start_lr, min_lr, max_lr, rampup_epochs, sustain_epochs, exp_decay):
        # linear increase from start to rampup_epochs
        if epoch < rampup_epochs:
            lr = ((max_lr - start_lr) /
                  rampup_epochs * epoch + start_lr)
        # constant max_lr during sustain_epochs
        elif epoch < rampup_epochs + sustain_epochs:
            lr = max_lr
        # exponential decay towards min_lr
        else:
            lr = ((max_lr - min_lr) *
                  exp_decay**(epoch - rampup_epochs - sustain_epochs) +
                  min_lr)
        return lr
    return lr(epoch,
              start_lr,
              min_lr,
              max_lr,
              rampup_epochs,
              sustain_epochs,
              exp_decay)

lr_callback = tf.keras.callbacks.LearningRateScheduler(exponential_lr, verbose=True)

rng = [i for i in range(EPOCHS)]
y = [exponential_lr(x) for x in rng]
plt.plot(rng, y)
print("Learning rate schedule: {:.3g} to {:.3g} to {:.3g}".format(y[0], max(y), y[-1]))
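
As a quick check of the schedule's shape without a plot, the sketch below restates the same three-phase formula in one condensed function and evaluates it at a few epochs. The name `schedule` and the choice of 30 epochs are assumptions made for illustration; the default parameter values are copied from the cell above.

```python
def schedule(epoch, start_lr=0.00001, min_lr=0.00001, max_lr=0.00005,
             rampup_epochs=5, sustain_epochs=0, exp_decay=0.8):
    """Condensed restatement of exponential_lr (illustrative only)."""
    if epoch < rampup_epochs:
        # linear ramp from start_lr towards max_lr
        return (max_lr - start_lr) / rampup_epochs * epoch + start_lr
    if epoch < rampup_epochs + sustain_epochs:
        # hold at max_lr
        return max_lr
    # exponential decay towards min_lr
    decay_steps = epoch - rampup_epochs - sustain_epochs
    return (max_lr - min_lr) * exp_decay ** decay_steps + min_lr

epochs = 30  # assumed value for this sketch
rates = [schedule(e) for e in range(epochs)]
for e in (0, 1, 5, 6, 29):
    print("epoch {:2d}: lr = {:.3g}".format(e, rates[e]))
```

With these defaults the rate ramps linearly for five epochs, peaks at `max_lr` on epoch 5, and then decays geometrically towards `min_lr`.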

The provided code defines and executes the training process for a deep learning model using TensorFlow. Here's an explanation of each part of the code:

1. `EPOCHS = 30`: This line sets the number of training epochs to 30, which specifies how many times the entire training dataset will be used to update the model's weights during training.

2. `STEPS_PER_EPOCH = NUM_TRAINING_IMAGES // BATCH_SIZE`: This line calculates the number of steps (batches) that make up one epoch by integer-dividing the total number of training images (`NUM_TRAINING_IMAGES`) by the batch size (`BATCH_SIZE`). Each step processes one batch, and any remainder smaller than a full batch is not counted.

3. `history = model.fit(...)`: This line initiates the model training process using the `fit` method, which trains the model on the training data. It takes the following arguments:
   - `ds_train`: The training dataset.
   - `validation_data=ds_valid`: The validation dataset used to monitor the model's performance during training.
   - `epochs=EPOCHS`: The number of training epochs.
   - `steps_per_epoch=STEPS_PER_EPOCH`: The number of steps (batches) to complete one epoch.
   - `callbacks=[lr_callback]`: A list of callbacks to be applied during training. In this case, it includes the learning rate scheduler (`lr_callback`) defined earlier.

This code will train the model for the specified number of epochs, updating its weights using gradient descent and monitoring its performance on the validation dataset. The training history, including loss and metrics, will be stored in the `history` variable for later analysis and visualization.

In [None]:
# Define training epochs
EPOCHS = 30
STEPS_PER_EPOCH = NUM_TRAINING_IMAGES // BATCH_SIZE

history = model.fit(
    ds_train,
    validation_data=ds_valid,
    epochs=EPOCHS,
    steps_per_epoch=STEPS_PER_EPOCH,
    callbacks=[lr_callback],
)
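
The arithmetic behind `STEPS_PER_EPOCH` can be checked with a small worked example. The image count and batch size below are hypothetical values chosen only to show the floor division; they are not taken from this notebook.

```python
# Hypothetical sizes, chosen only to illustrate the arithmetic.
num_training_images = 12753
batch_size = 128
epochs = 30

steps_per_epoch = num_training_images // batch_size   # floor division
leftover = num_training_images % batch_size           # images beyond the last full batch
total_steps = steps_per_epoch * epochs                # weight updates over the whole run

print(steps_per_epoch, leftover, total_steps)
```

Whether the leftover images are skipped or carried into the next epoch depends on how `ds_train` was built (for example, whether it repeats), which is not shown in this section.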

The provided code calls the `display_training_curves` function to visualize the training and validation curves for loss and accuracy during the training of a deep learning model. Here's an explanation of each part of the code:

1. `display_training_curves(...)`: This line calls the `display_training_curves` function twice to display two sets of training curves:
   - The first call displays the training and validation loss curves.
   - The second call displays the training and validation accuracy curves.

2. `history.history['loss']`: This retrieves the training loss values from the training history. `history` is the `History` object returned by `model.fit`; its `history` attribute is a dictionary that maps each metric name to a list with one value per epoch, and `'loss'` is the key for the training loss.

3. `history.history['val_loss']`: This retrieves the validation loss values from the training history. `'val_loss'` refers to the validation loss.

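4. `'loss'` and `'accuracy'`: These strings are passed as the `title` argument and label each plot, indicating whether it shows loss or accuracy. The accuracy values themselves are read from the `'sparse_categorical_accuracy'` and `'val_sparse_categorical_accuracy'` keys of `history.history`.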

5. `211` and `212`: These are Matplotlib subplot codes in rows-columns-index form. `211` means a grid of 2 rows and 1 column, position 1, so the loss curves go in the top subplot; `212` selects position 2 of the same grid, so the accuracy curves go in the bottom subplot.

The `display_training_curves` function is responsible for plotting the curves and formatting the subplots accordingly. It helps you visualize the model's training progress by showing how the loss and accuracy change over the training epochs.

In [None]:
display_training_curves(
    history.history['loss'],
    history.history['val_loss'],
    'loss',
    211,
)
display_training_curves(
    history.history['sparse_categorical_accuracy'],
    history.history['val_sparse_categorical_accuracy'],
    'accuracy',
    212,
)
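
Because `history.history` is a plain dictionary of per-epoch lists, it can also be inspected numerically. The sketch below uses a hand-written dictionary with made-up values as a stand-in for a real training run, and finds the epoch with the lowest validation loss.

```python
# Hypothetical stand-in for history.history after a 5-epoch run.
history_dict = {
    'loss': [2.10, 1.40, 0.95, 0.70, 0.55],
    'val_loss': [1.90, 1.30, 1.00, 0.92, 0.97],
    'sparse_categorical_accuracy': [0.45, 0.62, 0.74, 0.81, 0.85],
    'val_sparse_categorical_accuracy': [0.50, 0.65, 0.72, 0.75, 0.74],
}

val_loss = history_dict['val_loss']
# zero-based index of the epoch with the lowest validation loss
best_epoch = min(range(len(val_loss)), key=lambda i: val_loss[i])
best_val_acc = history_dict['val_sparse_categorical_accuracy'][best_epoch]
# gap between validation and training loss at the final epoch
final_gap = val_loss[-1] - history_dict['loss'][-1]

print("best epoch:", best_epoch)
print("validation accuracy at best epoch:", best_val_acc)
print("final loss gap: {:.2f}".format(final_gap))
```

A validation loss that starts rising while the training loss keeps falling, as in the last epoch of this made-up run, is a common sign of overfitting.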

# Step 7: Evaluate Predictions #



The provided code defines two functions for displaying a confusion matrix and training curves using Matplotlib. Here's an explanation of each function:

**1. `display_confusion_matrix(cmat, score, precision, recall)`**:
   - **Purpose**: This function displays a confusion matrix along with additional evaluation metrics such as F1 score, precision, and recall.
   - **Inputs**:
     - `cmat`: The confusion matrix.
     - `score`: The F1 score.
     - `precision`: The precision score.
     - `recall`: The recall score.
   - **Functionality**:
     - Creates a Matplotlib figure and displays the confusion matrix as a heatmap.
     - Labels the x and y axes with class names.
     - If evaluation scores (F1, precision, recall) are provided, they are displayed as text on the plot.
   - **Output**: The confusion matrix plot with evaluation metrics.

**2. `display_training_curves(training, validation, title, subplot)`**:
   - **Purpose**: This function displays training and validation curves (e.g., loss or accuracy) during the model training process.
   - **Inputs**:
     - `training`: The training data (e.g., loss or accuracy) to be plotted.
     - `validation`: The validation data (e.g., loss or accuracy) to be plotted.
     - `title`: The title of the plot.
     - `subplot`: The subplot configuration for plotting.
   - **Functionality**:
     - Creates the figure when the subplot code ends in 1 (`subplot % 10 == 1`), which here is the first call.
     - Creates a plot with training and validation curves.
     - Configures the appearance of the plot, including labels and legend.
   - **Output**: The training and validation curves plot.

These functions are useful for visualizing model performance and understanding how the model evolves during training. The `display_confusion_matrix` function is particularly helpful for analyzing classification results, while `display_training_curves` helps monitor the training process by showing how metrics change over epochs.

In [None]:

import matplotlib.pyplot as plt
from sklearn.metrics import f1_score, precision_score, recall_score, confusion_matrix

def display_confusion_matrix(cmat, score, precision, recall):
    plt.figure(figsize=(15,15))
    ax = plt.gca()
    ax.matshow(cmat, cmap='Reds')
    ax.set_xticks(range(len(CLASSES)))
    ax.set_xticklabels(CLASSES, fontdict={'fontsize': 7})
    plt.setp(ax.get_xticklabels(), rotation=45, ha="left", rotation_mode="anchor")
    ax.set_yticks(range(len(CLASSES)))
    ax.set_yticklabels(CLASSES, fontdict={'fontsize': 7})
    plt.setp(ax.get_yticklabels(), rotation=45, ha="right", rotation_mode="anchor")
    titlestring = ""
    if score is not None:
        titlestring += 'f1 = {:.3f} '.format(score)
    if precision is not None:
        titlestring += '\nprecision = {:.3f} '.format(precision)
    if recall is not None:
        titlestring += '\nrecall = {:.3f} '.format(recall)
    if len(titlestring) > 0:
        ax.text(101, 1, titlestring, fontdict={'fontsize': 18, 'horizontalalignment':'right', 'verticalalignment':'top', 'color':'#804040'})
    plt.show()
    
def display_training_curves(training, validation, title, subplot):
    if subplot%10==1: # set up the subplots on the first call
        plt.subplots(figsize=(10,10), facecolor='#F0F0F0')
        plt.tight_layout()
    ax = plt.subplot(subplot)
    ax.set_facecolor('#F8F8F8')
    ax.plot(training)
    ax.plot(validation)
    ax.set_title('model '+ title)
    ax.set_ylabel(title)
    #ax.set_ylim(0.28,1.05)
    ax.set_xlabel('epoch')
    ax.legend(['train', 'valid.'])
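
The three-digit subplot codes passed to `display_training_curves` pack a grid shape and a position into one integer. The helper below, `decode_subplot_code`, is a hypothetical function written for this illustration (it is not part of Matplotlib); it splits a code into its parts and applies the same `% 10 == 1` test the plotting function uses to decide when to create the figure.

```python
def decode_subplot_code(code):
    """Split a three-digit subplot code into (rows, columns, index)."""
    rows = code // 100
    cols = (code // 10) % 10
    index = code % 10
    return rows, cols, index

for code in (211, 212):
    rows, cols, index = decode_subplot_code(code)
    creates_figure = (code % 10 == 1)  # same test used in display_training_curves
    print(code, "->", rows, "rows,", cols, "column(s), position", index,
          "| creates figure:", creates_figure)
```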

## Confusion Matrix ##



The provided code computes a confusion matrix for a validation dataset and normalizes the matrix for better interpretation. Here's an explanation of each part of the code:

1. `cmdataset = get_validation_dataset(ordered=True)`: This line loads the validation dataset using the `get_validation_dataset` function and sets the `ordered` parameter to `True`. An ordered dataset means that the order of elements is preserved, making it suitable for creating a confusion matrix.

2. `images_ds = cmdataset.map(lambda image, label: image)`: This line extracts only the images from the validation dataset, creating a new dataset that contains only the images.

3. `labels_ds = cmdataset.map(lambda image, label: label).unbatch()`: This line extracts the labels from the validation dataset and then unbatches them. Unbatching converts a batched dataset into one whose elements are individual examples.

4. `cm_correct_labels = next(iter(labels_ds.batch(NUM_VALIDATION_IMAGES))).numpy()`: This line collects all the correct labels into a single array. It rebatches the unbatched labels into one batch of size `NUM_VALIDATION_IMAGES`, uses `next(iter(...))` to fetch that batch, and converts it to a NumPy array.

5. `cm_probabilities = model.predict(images_ds)`: This line uses the trained model (`model`) to predict class probabilities for the images in the validation dataset (`images_ds`).

6. `cm_predictions = np.argmax(cm_probabilities, axis=-1)`: This line computes the predicted class labels by taking the argmax of the predicted probabilities along the last axis. It determines the class with the highest probability for each image.

7. `labels = range(len(CLASSES))`: This line creates a `range` of class indices, from 0 to `len(CLASSES) - 1`.

8. `cmat = confusion_matrix(...)`: This line computes the confusion matrix using scikit-learn's `confusion_matrix` function. It takes the correct labels (`cm_correct_labels`) and predicted labels (`cm_predictions`) as inputs.

9. `cmat = (cmat.T / cmat.sum(axis=1)).T`: This line normalizes the confusion matrix by row. Each row corresponds to a true class, so dividing it by its sum turns counts into the fraction of that class's examples assigned to each predicted class. The diagonal entries then give the fraction of each class that was classified correctly.

The resulting `cmat` matrix represents the normalized confusion matrix, showing how well the model performs in classifying different classes in the validation dataset. It helps you understand where the model is making correct predictions and where it might be confused.

In [None]:
cmdataset = get_validation_dataset(ordered=True)
images_ds = cmdataset.map(lambda image, label: image)
labels_ds = cmdataset.map(lambda image, label: label).unbatch()

cm_correct_labels = next(iter(labels_ds.batch(NUM_VALIDATION_IMAGES))).numpy()
cm_probabilities = model.predict(images_ds)
cm_predictions = np.argmax(cm_probabilities, axis=-1)

labels = range(len(CLASSES))
cmat = confusion_matrix(
    cm_correct_labels,
    cm_predictions,
    labels=labels,
)
cmat = (cmat.T / cmat.sum(axis=1)).T # normalize
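
To make the row normalization concrete, here is a small sketch on a made-up 3-class problem. The labels are invented for illustration, and the count matrix is built directly with NumPy rather than with scikit-learn so that the example has no extra dependencies.

```python
import numpy as np

# Toy 3-class example; the labels are made up for illustration.
y_true = np.array([0, 0, 0, 0, 1, 1, 2, 2, 2, 2])
y_pred = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2, 0])

num_classes = 3
counts = np.zeros((num_classes, num_classes), dtype=int)
np.add.at(counts, (y_true, y_pred), 1)   # rows = true class, columns = predicted class

row_normalized = (counts.T / counts.sum(axis=1)).T          # form used in the notebook
same_result = counts / counts.sum(axis=1, keepdims=True)    # equivalent broadcasting form

print(counts)
print(row_normalized)
```

Each row of `row_normalized` sums to 1, and its diagonal holds the per-class fraction of correct predictions. Note that a class with no examples has a row sum of zero, so its row would become NaN under this division.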

In the provided code, several evaluation metrics (F1 score, precision, recall) are computed from the correct and predicted labels, and then the `display_confusion_matrix` function is called to display the confusion matrix with these metrics. Here's an explanation of each part of the code:

1. `score = f1_score(...)`: This line computes the F1 score, the harmonic mean of precision and recall, from the correct labels (`cm_correct_labels`) and predicted labels (`cm_predictions`). The `labels` parameter specifies the class labels to include, and `average='macro'` indicates that the F1 score is computed for each class and then averaged, with equal weight per class, to obtain a single score.

2. `precision = precision_score(...)`: This line computes the precision score, the fraction of predictions for a class that are actually correct, from the correct labels (`cm_correct_labels`) and predicted labels (`cm_predictions`). Like the F1 score, it uses `labels` and `average='macro'`.

3. `recall = recall_score(...)`: This line computes the recall score, the fraction of a class's examples that the model identifies, from the correct labels (`cm_correct_labels`) and predicted labels (`cm_predictions`). It also uses `labels` and `average='macro'`.

4. `display_confusion_matrix(cmat, score, precision, recall)`: Finally, this line calls the `display_confusion_matrix` function to visualize the confusion matrix along with the computed F1 score, precision, and recall. These metrics provide a comprehensive view of the model's performance in classifying different classes in the validation dataset.

The `display_confusion_matrix` function was explained earlier and is used to create a visual representation of the confusion matrix with additional evaluation metrics displayed as text on the plot. This helps you assess the model's performance and identify areas where it may need improvement.

In [None]:
score = f1_score(
    cm_correct_labels,
    cm_predictions,
    labels=labels,
    average='macro',
)
precision = precision_score(
    cm_correct_labels,
    cm_predictions,
    labels=labels,
    average='macro',
)
recall = recall_score(
    cm_correct_labels,
    cm_predictions,
    labels=labels,
    average='macro',
)
display_confusion_matrix(cmat, score, precision, recall)
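
As a sketch of what macro averaging means, the example below computes per-class precision, recall, and F1 by hand on a made-up set of labels and then takes the unweighted mean over classes. It is a plain-Python illustration, not scikit-learn's implementation; the helper name `per_class_scores` and the choice to return 0 when a denominator is zero are assumptions of this sketch.

```python
y_true = [0, 0, 0, 1, 1, 2]   # made-up true labels
y_pred = [0, 0, 1, 1, 1, 2]   # made-up predictions
classes = [0, 1, 2]

def per_class_scores(cls):
    """Return (precision, recall, f1) for one class, treating it as 'positive'."""
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

scores = [per_class_scores(c) for c in classes]
macro_precision = sum(s[0] for s in scores) / len(classes)
macro_recall = sum(s[1] for s in scores) / len(classes)
macro_f1 = sum(s[2] for s in scores) / len(classes)

print("macro precision = {:.3f}".format(macro_precision))
print("macro recall = {:.3f}".format(macro_recall))
print("macro f1 = {:.3f}".format(macro_f1))
```

In this example the macro F1 is the mean of the per-class F1 values, which is not the same as the harmonic mean of the macro precision and macro recall.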

The provided code processes a validation dataset by unbatching it, creating new batches with a batch size of 20, and then creating an iterator to iterate through these batches. Here's an explanation of each part of the code:

1. `dataset = get_validation_dataset()`: This line obtains the validation dataset using the `get_validation_dataset` function. The dataset contains image-label pairs for validation.

2. `dataset = dataset.unbatch().batch(20)`: Here, the validation dataset is first unbatched using the `unbatch` method. This step converts the dataset from batched form to individual examples. After unbatching, it is rebatched with a batch size of 20 using the `batch` method. This means that each batch will contain 20 examples.

3. `batch = iter(dataset)`: This line creates an iterator (`batch`) to iterate through the batches of the validation dataset. The iterator allows you to access one batch at a time, making it convenient for processing and visualization.

With this setup, you can use the `batch` iterator to loop through batches of validation data, making it easy to perform operations or visualization tasks on each batch of examples in your validation dataset.

In [None]:
dataset = get_validation_dataset()
dataset = dataset.unbatch().batch(20)
batch = iter(dataset)
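
The effect of `unbatch().batch(20)` on batch boundaries can be pictured with ordinary Python lists. The sketch below is an analogy only: it does not use `tf.data`, the `rebatch` helper is hypothetical, and the data are 50 made-up example ids stored in batches of 16.

```python
from itertools import chain, islice

def rebatch(batches, new_size):
    """Flatten an iterable of batches, then regroup into batches of new_size."""
    examples = chain.from_iterable(batches)        # analogue of unbatch()
    while True:
        group = list(islice(examples, new_size))   # analogue of batch(new_size)
        if not group:
            return
        yield group

# Hypothetical data: 50 example ids in batches of 16 (the last batch is short).
original = [list(range(i, min(i + 16, 50))) for i in range(0, 50, 16)]
batch_iter = iter(rebatch(original, 20))

first = next(batch_iter)                           # like next(batch) in the notebook
sizes = [len(first)] + [len(b) for b in batch_iter]
print(sizes)
```

The old boundaries of 16 disappear and the examples are regrouped in their original order; the last group is shorter because 50 is not a multiple of 20.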

The provided code extracts a batch of images and labels from the validation dataset, makes predictions using a trained model, and then displays the batch of images along with the predicted labels. Here's an explanation of each part of the code:

1. `images, labels = next(batch)`: This line uses the `next` function to retrieve the next batch of data from the `batch` iterator. Specifically, it gets a batch of images and their corresponding labels from the validation dataset.

2. `probabilities = model.predict(images)`: Here, the code uses the trained model (`model`) to predict class probabilities for the batch of images (`images`). The result is stored in the `probabilities` variable.

3. `predictions = np.argmax(probabilities, axis=-1)`: This line calculates the predicted class labels by taking the argmax of the predicted probabilities along the last axis. It determines the class with the highest probability for each image in the batch and stores the predicted labels in the `predictions` variable.

4. `display_batch_of_images((images, labels), predictions)`: Finally, this line calls the `display_batch_of_images` function to display the batch of images along with their true labels (`labels`) and predicted labels (`predictions`). This function was previously explained and is used for visualizing batches of images with labels and optionally highlighting incorrect predictions.

With these steps, you can visually inspect how the model is performing on a batch of validation data and compare the true labels with the model's predictions. This is a helpful way to gain insights into the model's performance and identify any potential issues or misclassifications.

In [None]:
images, labels = next(batch)
probabilities = model.predict(images)
predictions = np.argmax(probabilities, axis=-1)
display_batch_of_images((images, labels), predictions)
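
The step from probabilities to predicted labels can be tried on a small array. The probabilities and labels below are made up for illustration, with 4 images and 3 classes instead of a real model output.

```python
import numpy as np

# Made-up probabilities for 4 images over 3 classes (each row sums to 1).
probabilities = np.array([
    [0.10, 0.70, 0.20],
    [0.80, 0.15, 0.05],
    [0.30, 0.30, 0.40],
    [0.05, 0.05, 0.90],
])
labels = np.array([1, 0, 1, 2])   # made-up true labels

predictions = np.argmax(probabilities, axis=-1)   # index of the largest value in each row
wrong = np.flatnonzero(predictions != labels)     # positions of misclassified images
accuracy = float(np.mean(predictions == labels))

print(predictions)
print(wrong)
print(accuracy)
```

Using `axis=-1` takes the argmax across the class dimension, so the result has one predicted class index per image.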

# Step 8: Make Test Predictions #

Once you're satisfied with everything, you're ready to make predictions on the test set.

In the provided code, predictions are computed for a test dataset using a trained model. Here's an explanation of each part of the code:

1. `test_ds = get_test_dataset(ordered=True)`: This line obtains the test dataset using the `get_test_dataset` function with the `ordered` parameter set to `True`. An ordered test dataset ensures that the order of elements is preserved, which is important for correctly matching predictions with test data.

2. `print('Computing predictions...')`: This line prints a message indicating that predictions are about to be computed.

3. `test_images_ds = test_ds.map(lambda image, idnum: image)`: Here, the code extracts only the images from the test dataset (`test_ds`) and creates a new dataset (`test_images_ds`) containing only the images. This step is necessary to make predictions on the test images.

4. `probabilities = model.predict(test_images_ds)`: Using the trained model (`model`), this line predicts class probabilities for the test images contained in `test_images_ds`. The `probabilities` variable stores the predicted probabilities for each class for each test image.

5. `predictions = np.argmax(probabilities, axis=-1)`: This line calculates the predicted class labels by taking the argmax of the predicted probabilities along the last axis. It determines the class with the highest probability for each test image and stores the predicted labels in the `predictions` variable.

6. `print(predictions)`: Finally, the code prints the predicted labels for the test images. These predicted labels represent the model's classification for each test image.

This code uses the trained model to generate class predictions for the test dataset. The test examples carry image IDs rather than labels, so these predictions are not scored here; they are the values that go into the submission file.

In [None]:
test_ds = get_test_dataset(ordered=True)

print('Computing predictions...')
test_images_ds = test_ds.map(lambda image, idnum: image)
probabilities = model.predict(test_images_ds)
predictions = np.argmax(probabilities, axis=-1)
print(predictions)

The provided code generates a submission file in CSV format. Here's an explanation of each part of the code:

1. `print('Generating submission.csv file...')`: This line prints a message to indicate that the code is about to generate the submission CSV file.

2. `test_ids_ds = test_ds.map(lambda image, idnum: idnum).unbatch()`: This line extracts the image IDs from the test dataset (`test_ds`) by mapping a lambda function to extract the `idnum` field for each image. The result is unbatched to obtain individual image IDs.

3. `test_ids = next(iter(test_ids_ds.batch(NUM_TEST_IMAGES))).numpy().astype('U')`: Here, the code retrieves the image IDs as a batch, converting them to NumPy arrays and then to Unicode strings. The image IDs represent the unique identifiers for the test images.

4. Writing the Submission File:
   - `np.rec.fromarrays([test_ids, predictions])`: This line creates a NumPy record array by combining the test image IDs and the corresponding model predictions.
   - `fmt=['%s', '%d']`: Specifies the format for each column in the record array. `%s` is for strings (image IDs), and `%d` is for integers (class labels).
   - `delimiter=','`: Specifies the delimiter (comma) to separate values in the CSV file.
   - `header='id,label'`: Sets the header for the CSV file, indicating the columns (image ID and label).
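   - `comments=''`: Sets the prefix written before the header line to an empty string. By default, `np.savetxt` prefixes the header with `# `, which would turn the first line into `# id,label`.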

5. `!head submission.csv`: Finally, this line uses the `head` command to display the first few lines of the generated submission CSV file in the console. This is a way to quickly inspect the contents of the file.

The resulting `submission.csv` file contains image IDs and corresponding model predictions, which can be submitted to a competition for evaluation.



In [None]:
print('Generating submission.csv file...')

# Get image ids from test set and convert to unicode
test_ids_ds = test_ds.map(lambda image, idnum: idnum).unbatch()
test_ids = next(iter(test_ids_ds.batch(NUM_TEST_IMAGES))).numpy().astype('U')

# Write the submission file
np.savetxt(
    'submission.csv',
    np.rec.fromarrays([test_ids, predictions]),
    fmt=['%s', '%d'],
    delimiter=',',
    header='id,label',
    comments='',
)

# Look at the first few predictions
!head submission.csv
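
The role of `comments=''` is easiest to see by writing the same records twice, once with it and once without. The sketch below uses made-up ids and labels in place of `test_ids` and `predictions`, and writes to a temporary directory; the file names are assumptions of this example.

```python
import os
import tempfile
import numpy as np

# Made-up ids and labels standing in for test_ids and predictions.
ids = np.array(['a1b2', 'c3d4', 'e5f6'])
preds = np.array([3, 0, 7])
records = np.rec.fromarrays([ids, preds])   # one (id, label) record per row

tmp_dir = tempfile.mkdtemp()
plain_path = os.path.join(tmp_dir, 'plain_header.csv')
default_path = os.path.join(tmp_dir, 'default_header.csv')

np.savetxt(plain_path, records, fmt=['%s', '%d'], delimiter=',',
           header='id,label', comments='')
np.savetxt(default_path, records, fmt=['%s', '%d'], delimiter=',',
           header='id,label')               # default comments='# '

with open(plain_path) as f:
    plain_lines = f.read().splitlines()
with open(default_path) as f:
    default_lines = f.read().splitlines()

print(plain_lines)
print(default_lines[0])
```

Only the first file has a clean `id,label` header line; the second starts with `# id,label`, which is not a plain CSV header.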

# Step 9: Make a submission #

If you haven't already, create your own editable copy of this notebook by clicking on the **Copy and Edit** button in the top right corner. Then, submit to the competition by following these steps:

1. Begin by clicking on the blue **Save Version** button in the top right corner of the window.  This will generate a pop-up window.  
2. Ensure that the **Save and Run All** option is selected, and then click on the blue **Save** button.
3. This generates a window in the bottom left corner of the notebook.  After it has finished running, click on the number to the right of the **Save Version** button.  This pulls up a list of versions on the right of the screen.  Click on the ellipsis **(...)** to the right of the most recent version, and select **Open in Viewer**.  This brings you into view mode of the same page. You will need to scroll down to get back to these instructions.
4. Click on the **Output** tab on the right of the screen.  Then, click on the file you would like to submit, and click on the blue **Submit** button to submit your results to the leaderboard.

You have now successfully submitted to the competition!

If you want to keep working to improve your performance, select the blue **Edit** button in the top right of the screen. Then you can change your code and repeat the process. There's a lot of room to improve, and you will climb up the leaderboard as you work.
