Base code: https://www.kaggle.com/jessemostipak/getting-started-tpus-cassava-leaf-disease 

# Tensor Processing Units (TPUs)

Tensor Processing Units (TPUs) are hardware accelerators that are specialized for deep learning tasks. All Kagglers have 30 hours of free TPU time each week, and can use up to 3 hours in a single session (although if you'd like to increase your TPU quota consider submitting an exemplary TPU notebook to our **[TPU Star program](https://www.kaggle.com/tpu-prize)**!)   

You can read through the Kaggle documentation on TPUs **[here](https://www.kaggle.com/docs/tpu)**, and check out the TPU Star notebooks **[here](https://www.kaggle.com/tpu-stars)**.

# Set up environment

In [1]:
import math, re, os
import pickle
import tensorflow as tf
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from kaggle_datasets import KaggleDatasets
from tensorflow import keras
from functools import partial
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Model
from sklearn.ensemble import VotingClassifier
from sklearn.metrics import accuracy_score
from tensorflow.keras.models import load_model
print("Tensorflow version " + tf.__version__)

Tensorflow version 2.4.1


# Detect TPU
What we're doing with our code here is making sure that we'll be sending our data across a TPU. What you're looking for is a printout of `Number of replicas: 8`, corresponding to the 8 cores of a TPU. If your printout instead says `Number of replicas: 1` you likely do not have TPUs enabled in your notebook.   

To enable TPUs navigate to the panel on the right and click on `Accelerator`. Choose TPU from the dropdown.  

If you'd like more TPU troubleshooting and optimization guidelines check out our **[Learn With Me: Troubleshooting and Optimizing TPUs video](https://youtu.be/BSeWHzjMHMU)**.  

In [2]:
# try:
#     tpu = tf.distribute.cluster_resolver.TPUClusterResolver()
#     print('Device:', tpu.master())
#     tf.config.experimental_connect_to_cluster(tpu)
#     tf.tpu.experimental.initialize_tpu_system(tpu)
#     strategy = tf.distribute.experimental.TPUStrategy(tpu)
# except:
#     strategy = tf.distribute.get_strategy()
# print('Number of replicas:', strategy.num_replicas_in_sync)

# Set up variables
We'll set up some of our variables for our notebook here. 

If by chance you're using a private dataset, you'll also want to make sure that you have the **Google Cloud Software Development Kit (SDK)** attached to your notebook. You can find the Google Cloud SDK under the `Add-ons` dropdown menu at the top of your notebook. Documentation for the **Google Cloud Software Development Kit (SDK)** can be found **[here](https://www.kaggle.com/product-feedback/163416)**.

In [3]:
AUTOTUNE = tf.data.experimental.AUTOTUNE
#GCS_PATH = KaggleDatasets().get_gcs_path('cassava-leaf-disease-classification')
GCS_PATH = '../input/cassava-leaf-disease-classification'
#BATCH_SIZE = 16 * strategy.num_replicas_in_sync
BATCH_SIZE = 128
IMAGE_SIZE = [512, 512]
CLASSES = ['0', '1', '2', '3', '4']
EPOCHS = 150
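
As a quick sanity check on the hard-coded batch size, here is a small arithmetic sketch of the commented-out formula above. It assumes the 8 replicas described in the "Detect TPU" section; `per_replica_batch` and `num_replicas` are illustrative names, not variables used elsewhere in this notebook.

```python
# Illustrative names; 8 is the replica count expected from the "Detect TPU" printout.
per_replica_batch = 16
num_replicas = 8

# On a TPU the global batch is divided evenly across the replicas.
global_batch = per_replica_batch * num_replicas
print(global_batch)

# With a single replica (no TPU), the same global batch is handled by one device.
images_per_device_without_tpu = global_batch // 1
print(images_per_device_without_tpu)
```

So `BATCH_SIZE = 128` matches what the TPU formula would have produced, but on a single device all 128 images are processed together in each step.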

# Load the data
If you've primarily worked with notebooks in Learn, you may have noticed that data import and formatting are taken care of for you. But because we're working with competition data, we'll have to handle this part of the pipeline ourselves.

The data we're working with have been formatted into `TFRecords`, which are a format for storing a sequence of binary records. `TFRecords` work _really_ well with TPUs, and allow us to send a small number of large files across the TPU for processing.   

If you'd like to learn more about `TFRecords` and maybe even try creating them yourself, check out this **[TFRecords Basics notebook](https://www.kaggle.com/ryanholbrook/tfrecords-basics)** and **[corresponding video](https://youtu.be/KgjaC9VeOi8)** from Kaggle Data Scientist Ryan Holbrook.  

Because our data consists of `training` and `test` images only, we're going to split our `training` data into `training` and `validation` data using the `train_test_split()` function. 

## Decode the data
In the code chunk below we'll set up a series of functions that convert our images into tensors so that we can use them in our model. We'll also normalize our data. Our images use a "Red, Green, Blue (RGB)" scale with values in the range [0, 255], and by normalizing we'll set each pixel's value to a number in the range [0, 1].

In [4]:
def decode_image(image):
    image = tf.image.decode_jpeg(image, channels=3)
    image = tf.cast(image, tf.float32) / 255.0
    image = tf.reshape(image, [*IMAGE_SIZE, 3])
    return image

If you think back to **[Intro to Machine Learning](https://www.kaggle.com/learn/intro-to-machine-learning)** you might remember how we set up variables like `X` and `y`, representing our `features`, `X`, and `prediction target`, `y`. This code is accomplishing something similar, although instead of using the labels `X` and `y`, our `features` are represented by the term `image` and our `prediction target` by the term `target`.  

You might also notice that this function accounts for unlabeled images. This is because our test images don't have any labels.

In [5]:
def read_tfrecord(example, labeled):
    tfrecord_format = {
        "image": tf.io.FixedLenFeature([], tf.string),
        "target": tf.io.FixedLenFeature([], tf.int64)
    } if labeled else {
        "image": tf.io.FixedLenFeature([], tf.string),
        "image_name": tf.io.FixedLenFeature([], tf.string)
    }
    example = tf.io.parse_single_example(example, tfrecord_format)
    image = decode_image(example['image'])
    if labeled:
        label = tf.cast(example['target'], tf.int32)
        return image, label
    idnum = example['image_name']
    return image, idnum

We'll use the following function to load our dataset. One of the advantages of a TPU is that we can run multiple files across the TPU at once, and this accounts for the speed advantages of using a TPU. To capitalize on that, we want to make sure that we're using data as soon as it streams in, rather than creating a data streaming bottleneck.

In [6]:
def load_dataset(filenames, labeled=True, ordered=False):
    ignore_order = tf.data.Options()
    if not ordered:
        ignore_order.experimental_deterministic = False # disable order, increase speed
    dataset = tf.data.TFRecordDataset(filenames, num_parallel_reads=AUTOTUNE) # automatically interleaves reads from multiple files
    dataset = dataset.with_options(ignore_order) # uses data as soon as it streams in, rather than in its original order
    dataset = dataset.map(partial(read_tfrecord, labeled=labeled), num_parallel_calls=AUTOTUNE)
    return dataset

## A note on using train_test_split()
While I used `train_test_split()` to create both a `training` and `validation` dataset, consider exploring **[cross validation instead](https://www.kaggle.com/dansbecker/cross-validation)**.

In [7]:
TRAINING_FILENAMES, VALID_FILENAMES = train_test_split(
    tf.io.gfile.glob(GCS_PATH + '/train_tfrecords/ld_train*.tfrec'),
    test_size=0.9, random_state=42  # note: test_size=0.9 holds out 90% of the TFRecord files for validation
)

TEST_FILENAMES = tf.io.gfile.glob(GCS_PATH + '/test_tfrecords/ld_test*.tfrec')
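
If you want to try the cross-validation idea mentioned above, one possible approach is to split at the level of TFRecord files. The following is only a minimal sketch in plain Python: `kfold_file_splits` and the example file names are hypothetical, and it assumes the files hold roughly equal numbers of images.

```python
def kfold_file_splits(filenames, n_splits=4):
    """Yield (train_files, valid_files) pairs; each file is held out exactly once."""
    files = sorted(filenames)
    for fold in range(n_splits):
        valid = files[fold::n_splits]  # every n_splits-th file, offset by the fold index
        train = [f for f in files if f not in valid]
        yield train, valid

# Hypothetical file names standing in for the tf.io.gfile.glob(...) result.
example_files = [f"ld_train{i:02d}.tfrec" for i in range(8)]

splits = list(kfold_file_splits(example_files, n_splits=4))
for train, valid in splits:
    print(len(train), "training files /", len(valid), "validation files")
```

Each fold's file lists could then be passed to `load_dataset()` in place of `TRAINING_FILENAMES` and `VALID_FILENAMES`.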

## Adding in augmentations 
You learned about augmentations in the **[Computer Vision: Data Augmentation](https://www.kaggle.com/ryanholbrook/data-augmentation)** lesson on Kaggle Learn, and here I've applied an augmentation available to us through TensorFlow. You can read more about these augmentations (as well as all of the other augmentations available to you!) in the **[TensorFlow tf.image documentation](https://www.tensorflow.org/api_docs/python/tf/image)**.  

If you're interested in learning how to create and use custom augmentations, check out the **[Rotation Augmentation GPU/TPU](https://www.kaggle.com/cdeotte/rotation-augmentation-gpu-tpu-0-96)** and **[CutMix and MixUp on GPU/TPU](https://www.kaggle.com/cdeotte/cutmix-and-mixup-on-gpu-tpu)** notebooks from Kaggle Grandmaster Chris Deotte.

In [8]:
def center_crop(image, label):
    image = tf.image.central_crop(image, central_fraction=400/512)
    return image, label

def data_augment(image, label):
    # Thanks to the dataset.prefetch(AUTOTUNE) call in get_training_dataset(), this happens essentially for free on TPU.
    # Data pipeline code is executed on the "CPU" part of the TPU while the TPU itself is computing gradients.
    # Stateless ops return the same result for the same seed, so a constant seed would apply
    # one identical transformation to every image. Draw a fresh seed per op and per image instead.
    seeds = tf.random.uniform([7, 2], maxval=2**31 - 1, dtype=tf.int32)
    #image = tf.image.central_crop(image, central_fraction=0.6)
    image = tf.image.stateless_random_flip_left_right(image, seeds[0])
    image = tf.image.stateless_random_flip_up_down(image, seeds[1])
    image = tf.image.stateless_random_brightness(image, 0.2, seeds[2])
    image = tf.image.stateless_random_contrast(image, 0.2, 0.7, seeds[3])
    image = tf.image.stateless_random_crop(image, size=(400, 400, 3), seed=seeds[4])
    image = tf.image.stateless_random_hue(image, 0.2, seeds[5])
    image = tf.image.stateless_random_saturation(image, 0.2, 0.7, seeds[6])

    return image, label

IMAGESIZE2 = [400,400]
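
The `central_fraction=400/512` used in `center_crop` is chosen so that validation and test images end up the same size as the 400x400 random crops used for training. The sketch below is a simplified model of a centered crop in plain Python, not TensorFlow's own code; `central_crop_bounds` is a hypothetical helper.

```python
def central_crop_bounds(size, central_fraction):
    # Remove an equal margin from both sides so that the kept region stays centered.
    start = int((size - size * central_fraction) / 2)
    kept = size - 2 * start
    return start, kept

start, kept = central_crop_bounds(512, 400 / 512)
print(start, kept)  # 56 400
```

In other words, a 56-pixel border is dropped on each side of a 512-pixel edge, leaving the 400 pixels that `IMAGESIZE2` describes.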

## Define data loading methods
The following functions will be used to load our `training`, `validation`, and `test` datasets, as well as print out the number of images in each dataset.

In [9]:
def get_training_dataset():
    dataset = load_dataset(TRAINING_FILENAMES, labeled=True)  
    dataset = dataset.map(data_augment, num_parallel_calls=AUTOTUNE)  # apply image aug
    dataset = dataset.repeat()
    dataset = dataset.shuffle(2048)
    dataset = dataset.batch(BATCH_SIZE)
    dataset = dataset.prefetch(AUTOTUNE)
    return dataset

In [10]:
def get_validation_dataset(ordered=False):
    dataset = load_dataset(VALID_FILENAMES, labeled=True, ordered=ordered) 
    dataset = dataset.map(center_crop, num_parallel_calls=AUTOTUNE)  # center crop
    dataset = dataset.batch(BATCH_SIZE)
    dataset = dataset.cache()
    dataset = dataset.prefetch(AUTOTUNE)
    return dataset

In [11]:
def get_test_dataset(ordered=False):
    dataset = load_dataset(TEST_FILENAMES, labeled=False, ordered=ordered)
    dataset = dataset.map(center_crop, num_parallel_calls=AUTOTUNE)  # center crop
    dataset = dataset.batch(BATCH_SIZE)
    dataset = dataset.prefetch(AUTOTUNE)
    return dataset

In [12]:
def count_data_items(filenames):
    n = [int(re.compile(r"-([0-9]*)\.").search(filename).group(1)) for filename in filenames]
    return np.sum(n)
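
To see what `count_data_items` relies on, here is a small standalone sketch using only the standard library. The file names are hypothetical examples of a `<prefix>-<count>.tfrec` naming convention, where the number before the extension is taken to be the number of records in the file.

```python
import re

# Same pattern as above: the digits between a "-" and the following ".".
COUNT_PATTERN = re.compile(r"-([0-9]*)\.")

def count_items(filenames):
    return sum(int(COUNT_PATTERN.search(name).group(1)) for name in filenames)

# Hypothetical file names; the counts are made up for illustration.
example_names = [
    "ld_train00-1338.tfrec",
    "ld_train01-1338.tfrec",
    "ld_test00-1.tfrec",
]
total = count_items(example_names)
print(total)  # 2677
```

Note that the count comes from the file name only, so it is correct only as long as the files follow this naming convention.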

In [13]:
# NUM_TRAINING_IMAGES = count_data_items(TRAINING_FILENAMES)
# NUM_VALIDATION_IMAGES = count_data_items(VALID_FILENAMES)
# NUM_TEST_IMAGES = count_data_items(TEST_FILENAMES)

# print('Dataset: {} training images, {} validation images, {} (unlabeled) test images'.format(
#     NUM_TRAINING_IMAGES, NUM_VALIDATION_IMAGES, NUM_TEST_IMAGES))

# Building the model
## Callbacks:

In [14]:
# initial_learning_rate = 1e-4
# lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
#       initial_learning_rate,
#       decay_steps=1000,
#       decay_rate=0.96)
# # Callback
# callback = tf.keras.callbacks.EarlyStopping(monitor='val_sparse_categorical_accuracy', patience=15, restore_best_weights=True)
# #reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5, min_lr=1e-7)


# save_dir = os.path.join(os.getcwd(), 'saved_models')
# if not os.path.isdir(save_dir):
#       os.makedirs(save_dir)
# model_checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
#     filepath=save_dir,
#     monitor='val_sparse_categorical_accuracy',
#     mode='max',
#     save_best_only=True)

# # Model weights are saved at the end of every epoch, if it's the best seen
# # so far.
# model.fit(epochs=EPOCHS, callbacks=[model_checkpoint_callback])

# The model weights (that are considered the best) are loaded into the model.


## Building our model
In order to ensure that our model is trained on the TPU, we build it using `with strategy.scope()`.    

This model was built using transfer learning, meaning that we have a _pre-trained model_ (ResNet50) as our base model and then a customizable model built on top of it using `tf.keras.Sequential`. If you're new to transfer learning I recommend setting `base_model.trainable` to **False**, but _do_ encourage you to change which base model you're using (more options are available in the **[`tf.keras.applications` Module](https://www.tensorflow.org/api_docs/python/tf/keras/applications)** documentation) as well as iterate on the custom model.

Note that we're using `sparse_categorical_crossentropy` as our loss function, because we did _not_ one-hot encode our labels.
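
To illustrate why integer labels pair with the sparse version of the loss, here is a minimal sketch in plain Python. It is not Keras's implementation, and the probabilities are made up. For a single example, both forms reduce to the negative log of the probability assigned to the true class.

```python
import math

def sparse_categorical_crossentropy(label, probs):
    # Integer label: pick out the predicted probability of the true class.
    return -math.log(probs[label])

def categorical_crossentropy(one_hot, probs):
    # One-hot label: every term except the true class is multiplied by zero.
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))

probs = [0.05, 0.10, 0.05, 0.70, 0.10]  # made-up softmax output for the 5 classes
loss_sparse = sparse_categorical_crossentropy(3, probs)
loss_one_hot = categorical_crossentropy([0, 0, 0, 1, 0], probs)
print(loss_sparse, loss_one_hot)
```

The two values agree, so the sparse loss simply saves us the one-hot encoding step.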

## ResNet50

In [15]:
#tf.keras.backend.clear_session()

In [16]:
# def f1_m(y_true, y_pred):
#     true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
#     possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
#     predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
#     recall = true_positives / (possible_positives + K.epsilon())
#     precision = true_positives / (predicted_positives + K.epsilon())
    
#     return 2*((precision*recall)/(precision+recall+K.epsilon()))

In [17]:
# def build_res(lr,alpha,drop_rate):
   
#     img_adjust_layer = tf.keras.layers.Lambda(tf.keras.applications.resnet50.preprocess_input, input_shape=[*IMAGESIZE2, 3])

#     base_model = tf.keras.applications.ResNet50(weights='imagenet', include_top=False)
#     base_model.trainable = False
# #     for x in base_model.layers[1:30]:  # Unfreeze some layers: no i cant
# #         x.trainable = True

#     model_res = tf.keras.Sequential([
#         #tf.keras.layers.BatchNormalization(renorm=True),
#         img_adjust_layer,
#         base_model,
#         tf.keras.layers.GlobalAveragePooling2D(),
#         tf.keras.layers.Flatten(),
#         tf.keras.layers.Dense(512, activation = "relu", kernel_regularizer=tf.keras.regularizers.l2(alpha)),
#         tf.keras.layers.BatchNormalization(momentum=0.97),
#         tf.keras.layers.Dropout(rate=drop_rate),
#         tf.keras.layers.Dense(256, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(alpha)),
#         tf.keras.layers.BatchNormalization(momentum=0.97),
#         tf.keras.layers.Dropout(rate=drop_rate),
#         tf.keras.layers.Dense(5, activation = 'softmax')
#     ])
#     model_res.compile(
#         optimizer=tf.keras.optimizers.Adam(learning_rate=lr, epsilon=0.001),
#         loss='sparse_categorical_crossentropy',
#         metrics=['sparse_categorical_accuracy'])
    
#     return model_res

In [18]:
# from tensorflow.keras.models import load_model

# with strategy.scope():  
#     model_res = load_model("../input/resmodel/resmodel_2.h5")
#     model_res.compile(
#         optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4, epsilon=0.001),
#         loss='sparse_categorical_crossentropy',
#         metrics=['sparse_categorical_accuracy'])


In [19]:
# lr = 1e-4
# alpha = 0.015
# drop_rate = 0.2



# with strategy.scope():   
#     model_res = build_res(lr_schedule, alpha, drop_rate)

In [20]:
# train_dataset = get_training_dataset()
# valid_dataset = get_validation_dataset()

# STEPS_PER_EPOCH = NUM_TRAINING_IMAGES // BATCH_SIZE
# VALID_STEPS = NUM_VALIDATION_IMAGES // BATCH_SIZE
# with strategy.scope(): 
#     h_res = model_res.fit(train_dataset, 
#                           steps_per_epoch=STEPS_PER_EPOCH, 
#                           epochs=EPOCHS,
#                           validation_data=valid_dataset,
#                           validation_steps=VALID_STEPS,
#                           callbacks=[callback])

## LR schedule callback to determine lr: 1e-4

In [21]:
# tf.keras.backend.clear_session()

# def build_res(lr,alpha,drop_rate):
   
#     img_adjust_layer = tf.keras.layers.Lambda(tf.keras.applications.resnet50.preprocess_input, input_shape=[*IMAGESIZE2, 3])

#     base_model = tf.keras.applications.ResNet50(weights='imagenet', include_top=False)
#     base_model.trainable = False
# #     for x in base_model.layers[-31:-1]:  # Unfreeze some layers
# #         x.trainable = True

#     model_res = tf.keras.Sequential([
#         tf.keras.layers.BatchNormalization(renorm=True),
#         img_adjust_layer,
#         base_model,
#         tf.keras.layers.GlobalAveragePooling2D(),
#         tf.keras.layers.Flatten(),
#         tf.keras.layers.Dense(512, activation = "relu", kernel_regularizer=tf.keras.regularizers.l2(alpha)),
#         tf.keras.layers.BatchNormalization(momentum=0.97),
#         tf.keras.layers.Dropout(rate=drop_rate),
#         tf.keras.layers.Dense(256, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(alpha)),
#         tf.keras.layers.BatchNormalization(momentum=0.97),
#         tf.keras.layers.Dropout(rate=drop_rate),
#         tf.keras.layers.Dense(5, activation = 'softmax')
#     ])
#     model_res.compile(
#         optimizer=tf.keras.optimizers.Adam(learning_rate=lr, epsilon=0.001),
#         loss='sparse_categorical_crossentropy',
#         metrics=['sparse_categorical_accuracy'])
    
#     return model_res



# with strategy.scope():   
#     model_res = build_res(1e-8, 0.015, 0.2)

#     # Callback
#     #callback = tf.keras.callbacks.EarlyStopping(monitor='val_sparse_categorical_accuracy', patience=20, restore_best_weights=True)
#     lr_schedule = tf.keras.callbacks.LearningRateScheduler(lambda epoch: 1e-8*10**(epoch/20))
#     history = model_res.fit(train_dataset, 
#                           steps_per_epoch=STEPS_PER_EPOCH, 
#                           epochs=100,
#                           validation_data=valid_dataset,
#                           validation_steps=VALID_STEPS,
#                           callbacks=[lr_schedule])
#     lrs = 1e-8*10**(np.arange(100)/20)
#     plt.semilogx(lrs, history.history["loss"], label='train')
#     plt.semilogx(lrs, history.history["val_loss"], label='validation')
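
The commented-out cell above sweeps the learning rate upward during training so that loss can be plotted against learning rate. As a rough sketch of what that schedule produces (plain Python; `sweep_lr` is a hypothetical helper that mirrors the lambda passed to `LearningRateScheduler`):

```python
def sweep_lr(epoch, start=1e-8, epochs_per_decade=20):
    # Same formula as the lambda above: start * 10 ** (epoch / 20).
    return start * 10 ** (epoch / epochs_per_decade)

for epoch in (0, 20, 40, 80, 99):
    print(epoch, f"{sweep_lr(epoch):.2e}")
```

The rate grows tenfold every 20 epochs, so the value of 1e-4 named in the heading corresponds to epoch 80 of the 100-epoch sweep.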

In [22]:
# tf.keras.backend.clear_session()

# def build_res(alpha,drop_rate):
   
#     img_adjust_layer = tf.keras.layers.Lambda(tf.keras.applications.resnet50.preprocess_input, input_shape=[*IMAGESIZE2, 3])

#     base_model = tf.keras.applications.ResNet50(weights='imagenet', include_top=False)
#     base_model.trainable = False
# #     for x in base_model.layers[-31:-1]:  # Unfreeze some layers
# #         x.trainable = True

#     model_res = tf.keras.Sequential([
#         tf.keras.layers.BatchNormalization(renorm=True),
#         img_adjust_layer,
#         base_model,
#         tf.keras.layers.GlobalAveragePooling2D(),
#         tf.keras.layers.Flatten(),
#         tf.keras.layers.Dense(512, activation = "relu", kernel_regularizer=tf.keras.regularizers.l2(alpha)),
#         tf.keras.layers.BatchNormalization(momentum=0.97),
#         tf.keras.layers.Dropout(rate=drop_rate),
#         tf.keras.layers.Dense(256, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(alpha)),
#         tf.keras.layers.BatchNormalization(momentum=0.97),
#         tf.keras.layers.Dropout(rate=drop_rate),
#         tf.keras.layers.Dense(5, activation = 'softmax')
#     ])
    
#     return model_res

# def fit_model(model, train_dataset, valid_dataset, decay_step):
#     lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
#           initial_learning_rate = 1e-4,
#           decay_steps= decay_step,
#           decay_rate=0.96)
#     # compile model
#     model.compile(
#         optimizer=tf.keras.optimizers.Adam(learning_rate=lr_schedule, epsilon=0.001),
#         loss='sparse_categorical_crossentropy',
#         metrics=['sparse_categorical_accuracy'])
#     # fit model
#     STEPS_PER_EPOCH = NUM_TRAINING_IMAGES // BATCH_SIZE
#     VALID_STEPS = NUM_VALIDATION_IMAGES // BATCH_SIZE
#     h_res = model.fit(train_dataset, 
#                       steps_per_epoch=STEPS_PER_EPOCH, 
#                       epochs=15,
#                       validation_data=valid_dataset,
#                       validation_steps=VALID_STEPS)
#     # plot learning curves
#     plt.plot(h_res.history['sparse_categorical_accuracy'], label='train')
#     plt.plot(h_res.history['val_sparse_categorical_accuracy'], label='validation')
#     plt.title('decay_steps='+str(decay_step))

# save_dir = os.path.join(os.getcwd(), 'saved_models')
# with strategy.scope():   
#     # create learning curves for different decay_steps values
#     decay_steps = [1, 10, 1e2, 1e3, 1e4, 1e5]
#     for i in range(len(decay_steps)):
#         model = build_res(alpha,drop_rate)
#         # determine the plot number
#         plot_no = 420 + (i+1)
#         plt.subplot(plot_no)
#         # fit model and plot learning curves for one decay_steps value
#         fit_model(model, train_dataset, valid_dataset, decay_steps[i])
#     # show learning curves
#     plt.show()
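
To get a feel for what the different `decay_steps` values do, here is a small sketch of the exponential decay formula in plain Python. It is based on the documented behaviour of `tf.keras.optimizers.schedules.ExponentialDecay`, not on its source code; `exponential_decay` and the step count of 1000 are illustrative assumptions.

```python
def exponential_decay(step, initial_lr=1e-4, decay_steps=1000,
                      decay_rate=0.96, staircase=False):
    # lr = initial_lr * decay_rate ** (step / decay_steps)
    # With staircase=True the exponent only changes once every decay_steps steps.
    exponent = step // decay_steps if staircase else step / decay_steps
    return initial_lr * decay_rate ** exponent

# Learning rate after 1000 optimizer steps for each decay_steps value tried above.
for decay_steps in (1, 10, 1e2, 1e3, 1e4, 1e5):
    print(decay_steps, f"{exponential_decay(1000, decay_steps=decay_steps):.3e}")
```

Small `decay_steps` values shrink the learning rate to almost nothing very quickly, while large values leave it close to the initial 1e-4 for the whole run.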

In [23]:
# tf.keras.backend.clear_session()

# def build_res(alpha,drop_rate):
   
#     img_adjust_layer = tf.keras.layers.Lambda(tf.keras.applications.resnet50.preprocess_input, input_shape=[*IMAGESIZE2, 3])

#     base_model = tf.keras.applications.ResNet50(weights='imagenet', include_top=False)
#     base_model.trainable = False
# #     for x in base_model.layers[-31:-1]:  # Unfreeze some layers
# #         x.trainable = True

#     model_res = tf.keras.Sequential([
#         tf.keras.layers.BatchNormalization(renorm=True),
#         img_adjust_layer,
#         base_model,
#         tf.keras.layers.GlobalAveragePooling2D(),
#         tf.keras.layers.Flatten(),
#         tf.keras.layers.Dense(512, activation = "relu", kernel_regularizer=tf.keras.regularizers.l2(alpha)),
#         tf.keras.layers.BatchNormalization(momentum=0.97),
#         tf.keras.layers.Dropout(rate=drop_rate),
#         tf.keras.layers.Dense(256, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(alpha)),
#         tf.keras.layers.BatchNormalization(momentum=0.97),
#         tf.keras.layers.Dropout(rate=drop_rate),
#         tf.keras.layers.Dense(5, activation = 'softmax')
#     ])
    
#     return model_res

# def fit_model(model, train_dataset, valid_dataset, opt, callback):
    
#     # compile model
#     model.compile(
#         optimizer=opt,
#         loss='sparse_categorical_crossentropy',
#         metrics=['sparse_categorical_accuracy'])
#     # fit model
#     STEPS_PER_EPOCH = NUM_TRAINING_IMAGES // BATCH_SIZE
#     VALID_STEPS = NUM_VALIDATION_IMAGES // BATCH_SIZE
#     h_res = model.fit(train_dataset, 
#                       steps_per_epoch=STEPS_PER_EPOCH, 
#                       epochs=15,
#                       validation_data=valid_dataset,
#                       validation_steps=VALID_STEPS,
#                       callbacks=[callback])
#     # plot learning curves
#     plt.plot(h_res.history['sparse_categorical_accuracy'], label='train')
#     plt.plot(h_res.history['val_sparse_categorical_accuracy'], label='validation')
#     plt.title('Optimizer='+str(opt))

# initial_learning_rate = 1e-4
# lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
#           initial_learning_rate,
#           decay_steps=1000,
#           decay_rate=0.96,
#           staircase=True) 
# opt_adagrad = tf.keras.optimizers.Adagrad(learning_rate=lr_schedule, epsilon=0.001)
# opt_adam = tf.keras.optimizers.Adam(learning_rate=lr, epsilon=0.001)
# opt_adadelta = tf.keras.optimizers.Adadelta(learning_rate=lr_schedule, epsilon=0.001)
# opt_rmsprop = tf.keras.optimizers.RMSprop(learning_rate=lr_schedule, epsilon=0.001)

# with strategy.scope():   
#     # create learning curves for different optimizers
#     opts = [opt_adagrad,opt_adam, opt_adadelta,opt_rmsprop]
#     for i in range(len(opts)):
#         model = build_res(alpha,drop_rate)
#         # determine the plot number
#         plot_no = 420 + (i+1)
#         plt.subplot(plot_no)
#         # fit model and plot learning curves for one optimizer
#         fit_model(model, train_dataset, valid_dataset, opts[i], callback)
#     # show learning curves
#     plt.show()

With `model.summary()` we'll see a printout of each of our layers, their corresponding shape, and the associated number of parameters. At the bottom of the printout we'll see the total, trainable, and non-trainable parameter counts. Because we're using a pre-trained model with `base_model.trainable = False`, we expect a large number of non-trainable parameters (the weights were already learned in the pre-trained model and stay frozen).
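
As a rough cross-check of the summary, the parameter counts of the custom layers can be worked out by hand. The sketch below is plain Python with hypothetical helper names. It assumes the head from the first `build_res` cell (Dense 512, Dense 256, Dense 5, with batch normalization after the first two) sitting on top of ResNet50's 2048 pooled features. The frozen base model adds its own, much larger, non-trainable count on top of this.

```python
def dense_params(n_in, n_out):
    return n_in * n_out + n_out  # weights + biases

def batchnorm_params(n_features):
    trainable = 2 * n_features      # gamma and beta
    non_trainable = 2 * n_features  # moving mean and moving variance
    return trainable, non_trainable

features = 2048  # channels produced by ResNet50 before global average pooling
trainable = 0
non_trainable = 0
for units in (512, 256):
    trainable += dense_params(features, units)
    bn_trainable, bn_frozen = batchnorm_params(units)
    trainable += bn_trainable
    non_trainable += bn_frozen
    features = units
trainable += dense_params(features, 5)  # final softmax layer

print(trainable, non_trainable)  # 1183237 1536
```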

# Evaluating our model
The first chunk of code is provided to show you where the variables in the second chunk of code came from. As you can see, there's a lot of room for improvement in this model, but because we're using TPUs and have a relatively short training time, we're able to iterate on our model fairly rapidly.

In [24]:
# # print out variables available to us
# print(h_res.history.keys())

In [25]:
# # create learning curves to evaluate model performance
# history_frame = pd.DataFrame(h_res.history)
# history_frame.loc[:, ['loss', 'val_loss']].plot()
# history_frame.loc[:, ['sparse_categorical_accuracy', 'val_sparse_categorical_accuracy']].plot();

# Ensemble Models

In [26]:
def train_set(ordered=False):
    dataset = load_dataset(TRAINING_FILENAMES, labeled=True, ordered=ordered) 
    dataset = dataset.map(center_crop, num_parallel_calls=AUTOTUNE)  # center crop
    dataset = dataset.batch(BATCH_SIZE)
    dataset = dataset.cache()
    dataset = dataset.prefetch(AUTOTUNE)
    return dataset
trainset = train_set()

In [27]:
train_dataset = get_training_dataset()
# valid_dataset = get_validation_dataset()

# Take only the first batch (up to BATCH_SIZE images) without materializing the whole dataset in memory.
X_train, y_train = next(iter(trainset))
# valid = list(valid_dataset)[0]
# X_test, y_test = valid

In [28]:
from sklearn.ensemble import VotingClassifier
from sklearn.metrics import accuracy_score
from tensorflow.keras.models import load_model

MODEL_PATHS = [
    "../input/resmodel/resmodel.h5",
    "../input/resmodel/resmodel_2.h5",
    "../input/resmodel/resmodel_3.h5",
]

def get_model(mod):
    # Load the saved ResNet50-based model with index `mod` (0, 1, or 2).
    return load_model(MODEL_PATHS[mod])

# def get_model():
#     model = load_model("../input/resmodel/resmodel_3.h5")
#     return model

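# epochs=0 leaves the loaded weights untouched: fit() only builds each wrapper
# around its already-trained model instead of training it further.
res1_clf = tf.keras.wrappers.scikit_learn.KerasClassifier(
                            lambda: get_model(0),
                            epochs=0,
                            verbose=False)
res2_clf = tf.keras.wrappers.scikit_learn.KerasClassifier(
                            lambda: get_model(1),
                            epochs=0,
                            verbose=False)
res3_clf = tf.keras.wrappers.scikit_learn.KerasClassifier(
                            lambda: get_model(2),
                            epochs=0,
                            verbose=False)

# Tag each wrapper as a classifier for scikit-learn.
for x in [res1_clf, res2_clf, res3_clf]:
    x._estimator_type = "classifier"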

voting = VotingClassifier(
             estimators=[('res1', res1_clf),
                         ('res2', res2_clf),
                         ('res3', res3_clf)], 
             voting='soft',
             flatten_transform=True)


# for clf in (res1_clf, res2_clf, res3_clf, voting):
#     clf.fit(X_train, y_train)
#     y_pred = clf.predict(X_test)
#     print(clf.__class__.__name__, accuracy_score(y_test, y_pred))
voting.fit(X_train, y_train)

VotingClassifier(estimators=[('res1',
                              <tensorflow.python.keras.wrappers.scikit_learn.KerasClassifier object at 0x7fea9cc08610>),
                             ('res2',
                              <tensorflow.python.keras.wrappers.scikit_learn.KerasClassifier object at 0x7fea9c0d05d0>),
                             ('res3',
                              <tensorflow.python.keras.wrappers.scikit_learn.KerasClassifier object at 0x7fea9c0d0bd0>)],
                 voting='soft')
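
With `voting='soft'`, the ensemble averages the class probabilities predicted by each model and picks the class with the highest average, rather than counting one vote per model. Here is a minimal sketch of that idea in plain Python. It is not scikit-learn's implementation; the probabilities and helper names are made up for illustration.

```python
from collections import Counter

def argmax(values):
    return max(range(len(values)), key=values.__getitem__)

def soft_vote(prob_lists):
    """Average the class probabilities across models, then take the argmax."""
    averaged = [sum(column) / len(prob_lists) for column in zip(*prob_lists)]
    return argmax(averaged), averaged

def hard_vote(prob_lists):
    """Each model votes for its own argmax; the most common vote wins."""
    return Counter(argmax(p) for p in prob_lists).most_common(1)[0][0]

# Made-up softmax outputs of three models for one image (5 classes).
model_probs = [
    [0.10, 0.10, 0.10, 0.60, 0.10],
    [0.05, 0.05, 0.50, 0.35, 0.05],
    [0.10, 0.10, 0.45, 0.30, 0.05],
]
soft_label, averaged = soft_vote(model_probs)
hard_label = hard_vote(model_probs)
print(soft_label, hard_label)  # 3 2
```

In this example two models narrowly prefer class 2, but the first model's confident prediction for class 3 wins once probabilities are averaged. That is the behaviour soft voting is chosen for.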

In [29]:
# import pickle
# # save the model to disk
# filename = 'votemodel.pkl'
# with open(filename, 'wb') as file:  
#     pickle.dump(voting, file)

# Making predictions
Now that we've trained our model we can use it to make predictions! 

In [30]:
# # this code will convert our test image data to a float32 
# def to_float32(image, label):
#     return tf.cast(image, tf.float32), label

# test_ds = get_test_dataset(ordered=True) 
# test_ds = test_ds.map(to_float32)

# print('Computing predictions...')
# #test_images_ds = test_dataset
# test_images_ds = test_ds.map(lambda image, idnum: image)
# probabilities = model_res.predict(test_images_ds)
# predictions = np.argmax(probabilities, axis=-1)
# print(predictions)

# Save Model

In [31]:
# # save model

# def save_model(model, name):
#   model_name = '{}.h5'.format(name)
#   save_dir = os.path.join(os.getcwd(), 'saved_models')
  
#   # Save model and weights
#   if not os.path.isdir(save_dir):
#       os.makedirs(save_dir)
#   model_path = os.path.join(save_dir, model_name)
#   model.save(model_path)
#   print('Saved trained model at %s ' % model_path)

# save_model(model_res, 'resmodel_3')

In [32]:
# from tensorflow import keras
# from keras.models import load_model

# #model = load_model("../input/resmodel1/resmodel.h5")

In [33]:
# model = pickle.load(open(filename, 'rb'))
# model

# Creating a submission file
Now that we've trained a model and made predictions, we're ready to submit to the competition! Run the code below to create your submission file.

In [34]:
preds = []
sample_sub = pd.read_csv('/kaggle/input/cassava-leaf-disease-classification/sample_submission.csv')
TEST_IMAGE_DIR = '/kaggle/input/cassava-leaf-disease-classification/test_images/'

for image_id in sample_sub.image_id:
    img = keras.preprocessing.image.load_img(TEST_IMAGE_DIR + image_id)
    # Mirror the validation preprocessing: scale to [0, 1], resize to IMAGE_SIZE,
    # center crop to 400x400, and add a batch dimension.
    img = tf.cast(np.array(img), tf.float32) / 255.0
    img = tf.image.resize(img, IMAGE_SIZE, antialias=True)
    img = tf.image.central_crop(img, central_fraction=400/512)
    img = tf.reshape(img, (-1, 400, 400, 3))
    # voting.predict() already returns the class label, so no argmax is needed.
    prediction = voting.predict(img)[0]
    preds.append(prediction)

my_submission = pd.DataFrame({'image_id': sample_sub.image_id, 'label': preds})
my_submission.to_csv('/kaggle/working/submission.csv', index=False)



In [35]:
#pd.read_csv('./submission.csv')

In [36]:
# #from skimage.transform import resize

# img = keras.preprocessing.image.load_img('/kaggle/input/cassava-leaf-disease-classification/test_images/' + sample_sub.image_id[0])
# img = np.array(img)
# # img = resize(image, (512, 512),anti_aliasing=True)
# # img = tf.reshape(img, (-1, 512, 512, 3))
# image = tf.cast(img, tf.float32) / 255.0
# img = tf.image.resize(image, IMAGE_SIZE, antialias=True)
# img = tf.image.central_crop(img, central_fraction=400/512)
# img = tf.reshape(img, (-1, 400, 400, 3))
# img

In [37]:
# voting.predict(img)[0]

In [38]:
# print('Generating submission.csv file...')
# test_ids_ds = test_ds.map(lambda image, idnum: idnum).unbatch()
# test_ids = next(iter(test_ids_ds.batch(NUM_TEST_IMAGES))).numpy().astype('U') # all in one batch
# np.savetxt('submission.csv', np.rec.fromarrays([test_ids, predictions]), fmt=['%s', '%d'], delimiter=',', header='id,label', comments='')
# !head submission.csv

Be aware that because this is a code competition with a hidden test set, internet and TPUs cannot be enabled on your submission notebook. Therefore TPUs will only be available for training models. For a walk-through on how to train on TPUs and run inference/submit on GPUs, see our [TPU Docs](https://www.kaggle.com/docs/tpu#tpu6).

# OR

In [39]:
# from PIL import Image

# test_path = '../input/cassava-leaf-disease-classification/test_images'

# test_images = os.listdir(test_path)
# predictions = []

# for image_id in test_images:
    
#     image = Image.open(os.path.join(test_path, image_id))
#     image = np.array(image)
#     image = np.expand_dims(image, axis=0)
# #     print(image.shape)
#     predictions.append(np.argmax(model_res.predict(image)))

# sub = pd.DataFrame({'image_id': test_images, 'label': predictions})
# sub.to_csv(os.path.join(RESULTSPATH, 'submission.csv'), index = False)