# Welcome to Computer Vision! #

Congratulations! You've just been hired as KaggleKars' first data scientist! Are you ready?

<!-- TODO: HEADER ILLUSTRATION -->

In this micro-course, you'll:
- Design a state-of-the-art **image classifier** with Keras!
- Master the art of **transfer learning** to boost your models!
- See inside a **convolutional layer** <!-- visualize what the model learns -->
- Use the powerful **TPU accelerator** to speed up your training!
- Utilize **data augmentation** to get more data--for free!

If you've taken the *Introduction to Deep Learning* micro-course, you'll know everything you need to be successful.

Now let's get started!

# Classifying Cars #

KaggleKars, a new company, has an idea for an app. If someone sees a car they like, they can take a picture, and the app will suggest cars of the same kind for sale nearby. Your job is to design the **image classifier**. If a user uploads a picture like this:

<!-- TODO: picture of car -->

your classifier should output the make, model, and year like this: 

<!-- TODO: class label -->

The company has decided for now to focus on a list of 196 different vehicles. They have collected a dataset consisting of 16,185 images of these vehicles, divided into a **training set** of 8,144 images and a **validation set** of 8,041 images. The company says that to be successful, they need the classifier to achieve **95% accuracy** on the validation set.
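A quick check of these numbers shows why the task is demanding. The sketch below uses only the figures quoted above; the per-class average assumes the images are spread roughly evenly across classes, which the company has not actually told us.

```python
# Figures quoted in the text above
NUM_CLASSES = 196
NUM_TRAIN = 8_144
NUM_VALID = 8_041

total_images = NUM_TRAIN + NUM_VALID
# Average only; assumes the classes are roughly balanced
train_per_class = NUM_TRAIN / NUM_CLASSES
# Accuracy of guessing uniformly at random among the classes
chance_accuracy = 1 / NUM_CLASSES

print(f"total images:           {total_images}")
print(f"train images per class: {train_per_class:.1f}")
print(f"chance accuracy:        {chance_accuracy:.2%}")
```

With only a few dozen training images per class on average, reaching 95% from a starting point of about half a percent leaves little room for error.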

# Tensor Processing Units #

In this course, you'll be working with large models and large datasets. Training these models on an ordinary CPU would be very time-consuming, and because of memory limitations, the accuracy of your model could suffer as well.

Instead, you'll train your models on a **Tensor Processing Unit** or **TPU**. A TPU is an accelerator much like the GPU you used in the introductory course. The TPUs that Kaggle provides, however, are able to handle much larger workloads than the GPUs.

<!-- TODO: picture of tpu -->

TPUs are powerful, but they require a bit of special handling. In fact, they are so powerful that the hardest part of using them is keeping them busy!

The result is worth the effort though. Models that could take weeks to train on a home computer can be trained in minutes on a cloud TPU. (And Kaggle gives you 30 hours each week for free!)

What's more, the methods that you'll learn in this course will scale to workloads many times as large. Essentially the same methods could be used to train deep learning models like BERT and BigGAN, which have hundreds of millions of parameters. After completing this micro-course, you'll have the tools to solve demanding real-world problems.

This code will initialize the TPU in preparation for training.

In [None]:
import tensorflow as tf

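# Locate the TPU attached to this notebook
tpu = tf.distribute.cluster_resolver.TPUClusterResolver()
print('Running on TPU ', tpu.master())
# Connect to the TPU and initialize it
tf.config.experimental_connect_to_cluster(tpu)
tf.tpu.experimental.initialize_tpu_system(tpu)
# The strategy replicates the model across the TPU cores
# (TensorFlow 2.3+ also exposes this class as tf.distribute.TPUStrategy)
strategy = tf.distribute.experimental.TPUStrategy(tpu)
print("REPLICAS: ", strategy.num_replicas_in_sync)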

It's not important to understand the details. Just copy and paste this cell whenever you want to use TPU acceleration.
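One detail is worth a second look: the replica count. Under a distribution strategy, Keras treats the batch size of the dataset as a *global* batch and divides it among the replicas. Here is a small sketch of that arithmetic, assuming 8 replicas (the core count of a TPU v3-8; check the value printed by the cell above). The helper name is made up for illustration, and it insists on an even split only to keep the arithmetic simple.

```python
def per_replica_batch_size(global_batch_size: int, num_replicas: int) -> int:
    """Examples each replica processes per step (hypothetical helper)."""
    if global_batch_size % num_replicas != 0:
        raise ValueError("pick a global batch size that is a multiple of the replica count")
    return global_batch_size // num_replicas

NUM_REPLICAS = 8  # assumption: TPU v3-8; use strategy.num_replicas_in_sync in practice

for global_batch in (16, 64, 128):
    share = per_replica_batch_size(global_batch, NUM_REPLICAS)
    print(f"global batch {global_batch:>3} -> {share} examples per replica")
```

Small global batches leave each core with very little work per step, which is one reason TPU batch sizes are often chosen as a multiple of the replica count.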

# Loading Data #

The first step in building a machine learning model is preparing the data. To run efficiently, TPUs require data in a special format called a TFRecord. We'll use the [TensorFlow Datasets](https://www.tensorflow.org/datasets) (TFDS) library to help us load these files, instead of the Keras data loader you used in *Intro to Deep Learning*.
<!-- TODO: ask Alexis what she's using -->

In [12]:
import tensorflow_datasets as tfds

The following cell loads a dataset prepared for supervised training together with its metadata.

In [None]:
from kaggle_datasets import KaggleDatasets
from visiontools import StanfordCars

DATA_DIR = KaggleDatasets.get_gcs_path()

(ds_train, ds_valid), ds_info = tfds.load('stanford_cars/simple',
                                          data_dir=DATA_DIR,
                                          split=['train', 'test'],
                                          with_info=True,
                                          as_supervised=True,
                                          shuffle_files=True)

You'll notice that we specified the data directory as a Google Cloud bucket. Locating the dataset there helps the TPU to retrieve the data more quickly. You'll also notice that we've loaded the `stanford_cars/simple` version of the dataset. It has only three classes: `Convertible`, `SUV`, and `Wagon`. In the exercises, you'll use the complete dataset with all 196 classes.

Now let's take a look at a few examples from the dataset.

In [14]:
tfds.visualization.show_examples(ds_info, ds_train)

<!-- TODO: needs tfds v3.0.0 (nightly) for supervised dataset -->

The `ds_train` and `ds_valid` objects are both `tf.data.Dataset` objects that yield pairs `(image, label)`.
The following code optimizes these datasets for use with the TPU.

In [None]:
BATCH_SIZE = 16
AUTO = tf.data.experimental.AUTOTUNE

ds_train = (
    ds_train.batch(BATCH_SIZE) # train images in batches, instead of one at a time
    .cache() # save the batches in memory, instead of reloading each step
    .prefetch(AUTO) # use the CPU to fetch a batch while the TPU is busy
)

ds_valid = (
    ds_valid.batch(BATCH_SIZE)
    .cache()
    .prefetch(AUTO)
)
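Batch size also determines how many steps make up an epoch and whether the last batch comes up short. The following sketch works that out for the full dataset sizes quoted earlier (8,144 training and 8,041 validation images). The three-class `simple` subset loaded here is smaller, and its exact size isn't given in this lesson. The helper name is made up for illustration.

```python
import math

def batch_plan(num_examples: int, batch_size: int):
    """Return (number of batches, size of the final batch)."""
    num_batches = math.ceil(num_examples / batch_size)
    remainder = num_examples % batch_size
    last_batch = remainder if remainder else batch_size
    return num_batches, last_batch

BATCH_SIZE = 16

print("train:", batch_plan(8_144, BATCH_SIZE))
print("valid:", batch_plan(8_041, BATCH_SIZE))
```

If a short final batch causes trouble, `batch(BATCH_SIZE, drop_remainder=True)` discards it so that every batch has the same shape.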

<!-- TODO: make sure the batch size is compatible with the size of the dataset -->

In Lesson 5, you'll see how to add **data augmentation** to this pipeline to give a boost to your model.

Now that our data is loaded and optimized, we're ready to build the network!

# Transfer Learning #

To be accurate, an image classifier requires a large amount of data. This is especially true when there are many labels to learn and the classes look alike. (Such problems are sometimes called *fine-grained*.) To achieve your goal of 95% accuracy over 196 classes, you would need much more data than you've been provided with.

The best solution would be to get more data. High-quality data, however, is expensive in both time and money -- neither of which (your company informs you) you happen to have. Fortunately, there is another solution that is highly effective and also virtually free. It is called **transfer learning**.

This is the insight behind transfer learning: a neural network trained on one set of images will have learned a lot about any similar set of images, too. So why not make use of that learning? In fact, all we need to do is replace the layers closest to the outputs and then retrain the model.
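To see the idea in miniature, here is a toy NumPy sketch, not the Keras implementation. A fixed layer with random weights stands in for a pretrained base, the data is synthetic, and only a new softmax head is trained with plain gradient descent. All names and sizes are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pretrained base: a fixed layer whose weights stay frozen.
# Random weights play the role of learned ones in this toy example.
base_weights = rng.normal(size=(8, 16))
frozen_copy = base_weights.copy()

def base(x):
    """Frozen feature extractor: linear layer followed by ReLU."""
    return np.maximum(x @ base_weights, 0.0)

def softmax(logits):
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# Synthetic data for the "new" task: the label is whichever of the
# first three input features is largest.
x = rng.normal(size=(300, 8))
y = x[:, :3].argmax(axis=1)
one_hot = np.eye(3)[y]

features = base(x)                # the base is only ever used for inference
head_weights = np.zeros((16, 3))  # new head, trained from scratch
head_bias = np.zeros(3)

def loss():
    probs = softmax(features @ head_weights + head_bias)
    return -np.log(probs[np.arange(len(y)), y]).mean()

initial_loss = loss()
for _ in range(500):              # plain gradient descent on the head only
    probs = softmax(features @ head_weights + head_bias)
    grad_logits = (probs - one_hot) / len(y)
    head_weights -= 0.01 * (features.T @ grad_logits)
    head_bias -= 0.01 * grad_logits.sum(axis=0)
final_loss = loss()

print(f"loss before: {initial_loss:.3f}, after: {final_loss:.3f}")
print("base unchanged:", np.array_equal(base_weights, frozen_copy))
```

The loss falls even though the base never changes, because the head learns to combine features the base already produces. A real pretrained base supplies far better features than random weights do.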

<!-- TODO: picture of replacing head of network for transfer learning -->

Here is the Keras code that implements transfer learning with a pretrained model named **VGG16**. By building the model inside the `strategy.scope()` context, we make it aware of the TPU we initialized earlier.

In [21]:
IMAGE_SIZE = [128, 128] # [height, width]
NUM_LABELS = 3 # Convertible, SUV, Wagon

with strategy.scope():
    # Load the pretrained VGG16 model
    pretrained_model = tf.keras.applications.VGG16(
        weights='imagenet',
        include_top=False, # replace layers closest to outputs
        input_shape=[*IMAGE_SIZE, 3],
    )
    # pretrained_model.trainable = False # TODO: which?
    # Construct the model
    model = tf.keras.Sequential([
        pretrained_model,
        tf.keras.layers.GlobalAveragePooling2D(), # TODO: replace this
        tf.keras.layers.Dense(NUM_LABELS,
                              activation='softmax',
                              dtype=tf.float32)
    ])

    # Compile inside the scope too, so the optimizer and metrics are created for the TPU
    model.compile(
        optimizer='adam',
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy'],
    )
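To make the shapes in this model concrete, here is a NumPy sketch of what the two layers after the base compute. For a 128×128 input, VGG16 without its top ends in a 4×4 grid of 512 feature channels, because its five 2×2 poolings shrink 128 down to 4. The feature values and weights below are random stand-ins, not the trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

BATCH, GRID, CHANNELS, NUM_LABELS = 2, 4, 512, 3

# Stand-in for the base's output: (batch, height, width, channels)
feature_maps = rng.random((BATCH, GRID, GRID, CHANNELS))

# GlobalAveragePooling2D: average each channel over its spatial grid
pooled = feature_maps.mean(axis=(1, 2))           # shape (2, 512)

# Dense(NUM_LABELS, activation='softmax') with random stand-in weights
kernel = rng.normal(scale=0.01, size=(CHANNELS, NUM_LABELS))
bias = np.zeros(NUM_LABELS)
logits = pooled @ kernel + bias
exp = np.exp(logits - logits.max(axis=1, keepdims=True))
probs = exp / exp.sum(axis=1, keepdims=True)      # shape (2, 3), rows sum to 1

# Trainable parameters in the new head: one weight per (channel, label) plus biases
head_params = kernel.size + bias.size

print(pooled.shape, probs.shape, head_params)
```

The new head is tiny compared with the base, which is why it can be trained on a small dataset.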

VGG16 is shallower than many modern convnets, with only 16 weight layers, but it is large enough to be effective on real problems. It is appropriate for our smaller example dataset.

It has been pretrained on **ImageNet**, a collection of about 14 million images in over 20,000 classes of everyday things. (The weights that ship with Keras come from the standard 1,000-class subset of roughly 1.3 million images.) It is a good dataset to use for transfer learning whenever you are training on natural images. <!-- TODO: check ImageNet numbers -->

# Train the Model #

The hard part is over. Now all we need to do is set it going!

In [None]:
EPOCHS = 10

history = model.fit(ds_train,
                    validation_data=ds_valid,
                    epochs=EPOCHS)
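While it trains, `fit` reports the loss and the accuracy we asked for in `compile`. As a rough guide to what those two numbers mean, here is a NumPy sketch that computes them for a single batch. The probabilities are made up for illustration and are not output from the model.

```python
import numpy as np

# Made-up softmax outputs for four images over three classes
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
])
labels = np.array([0, 1, 2, 1])  # integer class ids ("sparse" labels)

# Sparse categorical cross-entropy: mean of -log(probability of the true class)
true_class_probs = probs[np.arange(len(labels)), labels]
loss = -np.log(true_class_probs).mean()

# Accuracy: fraction of images whose most probable class is the true one
accuracy = (probs.argmax(axis=1) == labels).mean()

print(f"loss = {loss:.4f}, accuracy = {accuracy:.2f}")
```

Notice that the loss penalizes the last image heavily for giving its true class a low probability, while accuracy only records it as one miss.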

<!-- TODO: put callbacks in visiontools? how successful without? -->

When training a neural network, it's always a good idea to examine the loss and metric plots. The `history` object contains this information in a dictionary `history.history`. We can use Pandas to convert this dictionary to a dataframe and plot it with a built-in method.

In [None]:
import pandas as pd

pd.DataFrame(history.history).plot();

<!-- discuss convergence, over/underfitting -->

# Evaluate #

Let's get a closer look at how our classifier performed.

<!-- TODO: classification report -->
```python .noeval
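import numpy as np
from sklearn.metrics import classification_report

# Assumes the label feature is named 'label' in the dataset metadata
label_names = ds_info.features['label'].names
# ds_valid is cached, so both passes below see the batches in the same order
true_labels = np.concatenate([labels.numpy() for _, labels in ds_valid])
predicted_labels = model.predict(ds_valid).argmax(axis=1)
print(classification_report(true_labels, predicted_labels, target_names=label_names))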
```

<!-- TODO: plot_confusion_matrix in visiontools -->
```python .noeval
from sklearn.metrics import confusion_matrix
import seaborn as sns

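# Rows are true classes, columns are predicted classes
cmat = confusion_matrix(true_labels, predicted_labels)
sns.heatmap(cmat, cmap='Blues', annot=True, fmt='d',
            xticklabels=label_names, yticklabels=label_names);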
```
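To read these reports, it helps to know how the numbers are derived. The sketch below builds a confusion matrix by hand with NumPy and computes per-class precision and recall from it. The labels are toy values chosen for illustration, not predictions from the model above.

```python
import numpy as np

class_names = ["Convertible", "SUV", "Wagon"]
# Toy labels for illustration only
toy_true = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2, 2])
toy_pred = np.array([0, 0, 1, 1, 1, 2, 2, 2, 2, 0])

num_classes = len(class_names)
toy_cmat = np.zeros((num_classes, num_classes), dtype=int)
for t, p in zip(toy_true, toy_pred):
    toy_cmat[t, p] += 1  # rows: true class, columns: predicted class

correct = np.diag(toy_cmat)
precision = correct / toy_cmat.sum(axis=0)  # correct / everything predicted as the class
recall = correct / toy_cmat.sum(axis=1)     # correct / everything truly in the class

print(toy_cmat)
for name, p, r in zip(class_names, precision, recall):
    print(f"{name:<12} precision={p:.2f} recall={r:.2f}")
```

Off-diagonal entries show which classes get confused with each other. A class that is never predicted would produce a division by zero in the precision line, so real code should guard against that case.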

# Conclusion #

In this tutorial, you learned how to use a TPU to greatly accelerate your training. TPUs can handle the large datasets and large models that today's image applications demand.

You also learned how to reuse a pretrained model with **transfer learning**. Transfer learning is one of the best and easiest ways to give your neural network a boost. And best of all -- it's free!

# Your Turn #

Now you have what you need to build a powerful image classifier of your own. Move on to the first exercise and try it out!

<!-- The insight of transfer learning is that if we remove those topmost layers, we can reuse the bottom part, the part that's most likely to generalize. <\!-- last two paragraphs are awkward -\-> -->

<!-- The kinds of deep neural networks you'll be using in this course are called **convolutional neural networks**. They have two parts: a **dense head** and a **convolutional base**. -->

<!-- <\!-- DIAGRAM: base/head -\-> -->

<!-- The base learns the features, and the head learns the classes. So to reuse a model trained on different classes, we want to keep the base, but train a new head. Additionally, we can "unfreeze" the topmost layers of the base so that they too will adapt to the new dataset. -->

<!-- <\!-- DIAGRAM: show swapping:imagenet to cars -\-> -->

<!-- move most of this discussion to next lesson
- training an image classifier is hard, especially when you have a lot of labels
  - for instance, /this problem/ needed /this many/ examples to get to /X%/ accuracy
  - but you only have around /this many/ images per car! if you start from scratch, it won't be nearly enough
- fortunately, there's a way around this problem. It's called **transfer learning**.
- as complicated as the visual world is... visual data has a lot in common...
  - for instance, all pictures will have lines and curves, transitions from one color to another
  - (artists have known this for a long time... some artists say there are only x basic shapes! /link/)
  - so an image classifier trained on one dataset will already know most of what it needs to be successful on any other dataset, so why not reuse that information? that's the idea behind transfer learning!
- ImageNet contains over 14 million images in over 22,000 categories. Keras contains a number of powerful models that have been trained on ImageNet. And you can download them yourself!  -->

<!-- To classify a large number of cars, classification -->
