# Welcome to Computer Vision! #

<!-- TODO: lede -->

<!-- TODO: HEADER ILLUSTRATION -->

In this micro-course, you'll:
- Use modern deep-learning networks to build an **image classifier** with Keras!
- Design your own **custom convnet** with reusable blocks!
- Master the art of **transfer learning** to boost your models!
- Utilize **data augmentation** to extend a dataset--for free!
- Learn the fundamentals of **convolution** and **pooling** so you can go even further!

If you've taken the *Introduction to Deep Learning* micro-course, you'll know everything you need to be successful.

Now let's get started!

# Introduction #

This course will introduce you to the fundamental ideas of computer vision. Our goal is to learn how a neural network can "understand" a natural image well enough to solve the same kinds of problems the human visual system can solve.

The neural networks that are best at this task are called **convolutional neural networks**. (Sometimes we say **convnet** or **CNN** instead.) Convolution is the mathematical operation that gives the layers of these networks their characteristic structure, which differs from that of the dense layers you learned about in the introductory course. In future lessons, you'll learn why this structure is so effective at solving computer vision problems.

The ideas in this course apply to any kind of computer vision problem. We will apply them to the problem of **image classification**, but by the end you'll also be prepared for other topics in computer vision, like image segmentation and GANs.

# The Convolutional Classifier #

A convnet used for image classification consists of two parts: a **convolutional base** and a **dense head**.

<!-- TODO: parts of a convnet -->

The base is used to **extract the features** from an image. It is formed primarily of layers performing the convolution operation, but often includes other kinds of layers as well. (You'll learn about these in the next lesson.)

The head is used to **determine the class** of the image. It is formed primarily of dense layers, but might include other layers like dropout. 

What do we mean by a visual feature? A feature could be a line, a color, a texture, a shape, a pattern -- or some complicated combination.

The whole process goes something like this:

<!-- TODO: extract -> classify -->

The features actually extracted aren't quite like this, but it gives the idea.
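To make "extracting a feature" a little more concrete, here is a minimal NumPy sketch of the sliding-window operation a convolutional layer performs. The toy image and the hand-written kernel are assumptions for illustration only; in a real convnet the kernel values are learned during training, and the helper name `convolve2d_valid` is hypothetical.

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products.

    As in deep-learning libraries, the kernel is not flipped, so strictly
    speaking this is a cross-correlation.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy 5x5 grayscale "image": dark on the left, bright on the right.
image = np.array([[0, 0, 1, 1, 1]] * 5, dtype=float)

# Hypothetical kernel that responds to dark-to-bright vertical edges.
kernel = np.array([
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
], dtype=float)

# ReLU keeps only the positive responses, as an activation would in a convnet.
feature_map = np.maximum(convolve2d_valid(image, kernel), 0)
print(feature_map)
```

The resulting feature map is large where the window straddles the edge and zero over the flat bright region, which is the sense in which the kernel "detects" a vertical line.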

# Training the Classifier #

The goal of the network during training is to learn two things:
1. which features to extract from an image (base),
2. which class goes with which features (head).

These days, convnets are rarely trained from scratch. More often, we **reuse the base of a pretrained model**, that is, a model already trained on some similar dataset.

To this pretrained base we then **attach an untrained head**. Because the base has already learned to extract useful features, we then only need to train the head to classify the images in the new dataset.

<!-- TODO: attach head to base -->

Because the head usually consists of only a few dense layers, very accurate classifiers can be created from relatively little data.
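The idea of training only the head can be sketched in Keras by marking the base as non-trainable before attaching new layers. This is a rough illustration, not the lesson's model: the tiny `base` below is a hypothetical, randomly initialized stand-in for a pretrained network (so nothing is downloaded), and the layer sizes are arbitrary.

```python
from tensorflow.keras import Sequential, layers

# Hypothetical stand-in for a pretrained base (assumption: a real base
# would come with weights already learned on another dataset).
base = Sequential([
    layers.Input(shape=(32, 32, 3)),
    layers.Conv2D(8, kernel_size=3, activation='relu'),
    layers.MaxPool2D(),
], name='base')

# Freeze the base: its weights will not be updated during training.
base.trainable = False

# Attach an untrained head made of dense layers.
model = Sequential([
    layers.Input(shape=(32, 32, 3)),
    base,
    layers.Flatten(),
    layers.Dense(16, activation='relu'),
    layers.Dense(4, activation='softmax'),
])

# Only the head's kernels and biases remain trainable.
print(len(model.trainable_weights))      # weights of the two Dense layers
print(len(model.non_trainable_weights))  # weights of the frozen Conv2D layer
```

With the base frozen, calling `fit` would adjust only the dense layers, which is why comparatively little data can be enough.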

# Example #

Let's walk through an example. Our goal is to create a classifier for the `Stanford Cars` dataset, which consists of about 16,000 images in 196 classes. The steps are basically the same as the ones you learned about in the introductory course.

## Step 1 - Load Data ##

In [1]:
import tensorflow as tf
# Allocate GPU memory as needed instead of reserving it all up front
config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.compat.v1.InteractiveSession(config=config)

In [15]:
#$HIDE$
import os
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.applications.vgg16 import preprocess_input

# DATA_DIR = '/kaggle/input/stanford-car-dataset-by-classes-folder/car_data/car_data'
DATA_DIR = '/home/jovyan/work/kaggle/datasets/stanford-cars-keras/car_data/car_data'
TRAIN_DIR = os.path.join(DATA_DIR, 'train')
VALID_DIR = os.path.join(DATA_DIR, 'test')

ds_gen = ImageDataGenerator(preprocessing_function=preprocess_input)

BATCH_SIZE = 16
SIZE = (150, 150)

ds_train = ds_gen.flow_from_directory(directory=TRAIN_DIR,
                                      batch_size=BATCH_SIZE,
                                      shuffle=True,
                                      target_size=SIZE,
                                      class_mode='sparse')

ds_valid = ds_gen.flow_from_directory(directory=VALID_DIR,
                                      batch_size=BATCH_SIZE,
                                      shuffle=True,
                                      target_size=SIZE,
                                      class_mode='sparse')

Found 8144 images belonging to 196 classes.
Found 8041 images belonging to 196 classes.


The first step is to prepare your dataset. We'll skip the details of loading for now, but let's look at a few examples of images and their classes.

## Step 2 - Define Pretrained Base ##

In [16]:
from tensorflow.keras.applications import VGG16

pretrained_base = VGG16(include_top=False,
                        weights='imagenet',
                        input_shape=[*SIZE, 3])
# Freeze the base so that only the new head is trained
pretrained_base.trainable = False



## Step 3 - Attach Head ##

In [17]:
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Flatten, Dense

model = Sequential([
    pretrained_base,
    Flatten(),
    Dense(512, activation='relu'),
    Dense(196, activation='softmax'),
])
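As a rough check on the size of this head, the arithmetic below estimates how many parameters the two dense layers add. It assumes the standard VGG16 layout (five 2x2 max-pooling stages and 512 channels in the final block), which comes from the published architecture rather than from this lesson; the helper name `pooled_size` is hypothetical.

```python
def pooled_size(size, n_pools=5):
    """Spatial size after `n_pools` 2x2 max-pooling stages (each floors size / 2)."""
    for _ in range(n_pools):
        size //= 2
    return size

side = pooled_size(150)            # 150 -> 75 -> 37 -> 18 -> 9 -> 4
flat_features = side * side * 512  # length of the vector produced by Flatten()

# A dense layer has (inputs * units) weights plus one bias per unit.
dense_512 = flat_features * 512 + 512
dense_196 = 512 * 196 + 196
head_params = dense_512 + dense_196

print(flat_features, dense_512, dense_196, head_params)
```

Under these assumptions the head has about 4.3 million parameters, almost all of them in the first dense layer.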

## Step 4 - Train ##

In [18]:
model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy'],
)

history = model.fit(ds_train,
                    validation_data=ds_valid,
                    epochs=15)

Train for 509 steps, validate for 503 steps
Epoch 1/15
Epoch 2/15
Epoch 3/15
Epoch 4/15
Epoch 5/15
Epoch 6/15
Epoch 7/15
Epoch 8/15
Epoch 9/15
Epoch 10/15
Epoch 11/15
Epoch 12/15
Epoch 13/15
Epoch 14/15
Epoch 15/15


## Step 5 - Evaluate ##

When training a neural network, it's always a good idea to examine the loss and metric plots. The `history` object contains this information in a dictionary `history.history`. We can use Pandas to convert this dictionary to a dataframe and plot it with a built-in method.

In [None]:
import pandas as pd

pd.DataFrame(history.history).plot();

<!-- discuss convergence, over/underfitting -->
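Beyond plotting, the same dataframe can be inspected numerically. The sketch below uses made-up loss values as a stand-in for `history.history` (a real training run will produce different numbers) to show one way of finding the epoch with the lowest validation loss and of tracking the gap between validation and training loss.

```python
import pandas as pd

# Hypothetical values standing in for `history.history`; these are NOT
# results from the model in this lesson.
fake_history = {
    'loss':     [2.0, 1.2, 0.7, 0.4, 0.2],
    'val_loss': [2.1, 1.5, 1.3, 1.4, 1.6],
}
history_df = pd.DataFrame(fake_history)

# Keras reports epochs starting at 1, while the dataframe index starts at 0.
best_epoch = int(history_df['val_loss'].idxmin()) + 1

# Gap between validation and training loss for each epoch.
history_df['gap'] = history_df['val_loss'] - history_df['loss']

print("Lowest validation loss at epoch", best_epoch)
print(history_df)
```

In this made-up run the training loss keeps falling while the validation loss turns upward after the third epoch. A steadily widening gap of that kind is a common sign of overfitting.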

# Conclusion #

In this lesson, we learned about the structure of a convnet classifier: a **head** that acts as a classifier atop a **base** that performs the feature extraction.

The head is essentially an ordinary classifier like the ones you learned about in the introductory course. As its inputs, it uses the features extracted by the base. This is the basic idea behind CNN image classifiers: we can attach a unit that performs feature engineering to the classifier itself.

This is one of the big advantages deep neural networks have over traditional machine learning models: given the right network structure, the deep neural net can learn how to engineer the features it needs to solve its problem.

In the remainder of this micro-course, we're going to explore this convolutional base.