<a href="https://colab.research.google.com/github/alangkim/fchollet/blob/main/%EB%94%A5%EB%9F%AC%EB%8B%9D_%EA%B8%B0%EB%A7%90%EA%B3%A0%EC%82%AC.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Ch8. Introduction to deep learning for computer vision

1. Introduction to convnets

2. Training a convnet from scratch on a small dataset

3. Leveraging a pretrained model

## 1. Introduction to convnets

Stack of Conv2D and MaxPooling2D layers

In [None]:
# Instantiating a small convnet

from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(28, 28, 1))                                     # 28x28 input to match the MNIST images
x = layers.Conv2D(filters=32, kernel_size=3, activation="relu")(inputs)     # Conv2D
x = layers.MaxPooling2D(pool_size=2)(x)                                     # MaxPooling2D
x = layers.Conv2D(filters=64, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=128, kernel_size=3, activation="relu")(x)
x = layers.Flatten()(x)                                                     # Flatten all the information
outputs = layers.Dense(10, activation="softmax")(x)                         # connect Dense layer

model = keras.Model(inputs=inputs, outputs=outputs)                         # making model by functional API

In [None]:
model.summary()

In [None]:
# Training the convnet on MNIST images

from tensorflow.keras.datasets import mnist

(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
train_images = train_images.reshape((60000, 28, 28, 1)) # A channel dimension is required to use a CNN.
# The convnet operates on the images in their original 2D shape.
train_images = train_images.astype("float32") / 255
test_images = test_images.reshape((10000, 28, 28, 1))
test_images = test_images.astype("float32") / 255

model.compile(optimizer="rmsprop",
    loss="sparse_categorical_crossentropy", # multi class classification
    metrics=["accuracy"])

model.fit(train_images, train_labels, epochs=5, batch_size=64)

In [None]:
# Evaluating the convnet

test_loss, test_acc = model.evaluate(test_images, test_labels)
print(f"Test accuracy: {test_acc:.3f}")

### The convolution operation

* Dense layers learn global patterns in their input feature space, whereas convolution layers learn local patterns.

* The patterns they learn are translation invariant.

* They can learn spatial hierarchies of patterns.

* Convolution preserves the spatial relationship between pixels by learning image features from small squares of input data (their size is set by the filter size).

* Convolution: multiply the input patch elementwise by the filter and sum the products.

* Ex) A 3x3 kernel (a 3x3x1 filter) applied to a 5x6 input image with stride 1 outputs a 3x4 feature map.

* A fully connected layer would need 30 (= 5x6) x 12 (= 3x4) = 360 unshared weights (input size x output size).

* 9 vs. 360 weights, so the convolution filter is far more efficient.

Convolving an MxNx3 image with a 3x3x3 filter produces one feature map by taking dot products between the filter and 3x3x3 pieces of the image.

The depth of a filter is determined by the depth of the input feature map.
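The multi-channel case can be sketched in NumPy as follows. This is only an illustration under stated assumptions: the function name `conv2d_single_filter` and the random 5x6x3 input are made up for the example, and the loop-based implementation is for readability, not how Keras computes convolutions internally.

```python
import numpy as np

def conv2d_single_filter(image, kernel):
    """Stride-1 convolution without padding of an HxWxC image with one kh x kw x C filter.

    Hypothetical helper written for this note; it returns a single 2D feature map.
    """
    height, width, channels = image.shape
    kh, kw, kc = kernel.shape
    assert kc == channels, "filter depth must match input depth"
    out = np.zeros((height - kh + 1, width - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            patch = image[r:r + kh, c:c + kw, :]   # one kh x kw x C piece of the image
            out[r, c] = np.sum(patch * kernel)     # elementwise product, then sum
    return out

rng = np.random.default_rng(0)
image = rng.normal(size=(5, 6, 3))    # made-up 5x6 RGB input
kernel = rng.normal(size=(3, 3, 3))   # one 3x3x3 filter
feature_map = conv2d_single_filter(image, kernel)
print(feature_map.shape)              # (3, 4)

# Weight counts for the single-channel 5x6 example in the notes
shared_weights = 3 * 3                # one 3x3 kernel reused at every position
dense_weights = (5 * 6) * (3 * 4)     # every input connected to every output
print(shared_weights, dense_weights)  # 9 360
```

The filter spans the full depth of the input, which is why a 3-channel image needs a 3x3x3 filter and still yields a single 2D feature map.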

### Why convolution?

* Fully Connected -> 1000x1000 images, 10000 hidden nodes, 10^10 parameters
* Convolution     -> 1000x1000 images, 10x10 filter size, 100 filters, 10^4 parameters

* If you are working with an image dataset, using convolution layers in the model is highly recommended.
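The two parameter counts above can be checked with a few lines of arithmetic. This sketch assumes a single-channel input and ignores bias terms, which is the simplification the 10^10 and 10^4 figures appear to rely on.

```python
# Fully connected: every pixel connects to every hidden node
image_pixels = 1000 * 1000
hidden_nodes = 10_000
dense_params = image_pixels * hidden_nodes

# Convolution: each filter has 10x10 weights, shared across all positions
filter_weights = 10 * 10
num_filters = 100
conv_params = filter_weights * num_filters

print(f"fully connected: {dense_params:.0e} parameters")
print(f"convolution:     {conv_params:.0e} parameters")
print(f"ratio:           {dense_params // conv_params:,}x")
```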



### How does a convolution filter work?

Different values of the filter matrix produce different feature maps for the same input image.

A CNN learns the values of its filters during training.

The more filters a layer has, the more features it extracts.
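As a small illustration of how different filter values give different feature maps, the sketch below applies two hand-written 3x3 filters to the same toy image. The image, the filter values, and the helper name `conv2d` are assumptions chosen for the example; in a real CNN the filter values are learned rather than hand-written.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv2d(image, kernel):
    """Stride-1 convolution without padding of a 2D image with a 2D kernel (hypothetical helper)."""
    windows = sliding_window_view(image, kernel.shape)  # (H-kh+1, W-kw+1, kh, kw)
    return np.einsum("ijkl,kl->ij", windows, kernel)

# Toy 6x6 image: dark left half, bright right half
image = np.zeros((6, 6))
image[:, 3:] = 1.0

vertical_edge = np.array([[-1.0, 0.0, 1.0],
                          [-1.0, 0.0, 1.0],
                          [-1.0, 0.0, 1.0]])   # responds to left-to-right brightness changes
box_blur = np.full((3, 3), 1.0 / 9.0)          # averages each 3x3 neighborhood

edge_map = conv2d(image, vertical_edge)
blur_map = conv2d(image, box_blur)

print(edge_map)   # large values only where the window straddles the edge
print(blur_map)   # a smooth ramp from dark to bright
```

The edge filter fires only around the boundary between the two halves, while the blur filter produces a gradual ramp: same input, two different features.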

### Feature map


Four parameters determine the shape of a feature map:

1. filter size
2. depth
3. stride
4. zero-padding
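How these four parameters combine can be sketched with the usual output-size formula, floor((n + 2p - f) / s) + 1 per spatial dimension, with the depth equal to the number of filters. The function names below are hypothetical helpers written for this note, not part of Keras.

```python
def conv_output_size(input_size, filter_size, stride=1, padding=0):
    """Output length along one spatial dimension: floor((n + 2p - f) / s) + 1."""
    return (input_size + 2 * padding - filter_size) // stride + 1

def feature_map_shape(height, width, filter_size, depth, stride=1, padding=0):
    """(height, width, depth) of the feature map; depth is the number of filters."""
    return (conv_output_size(height, filter_size, stride, padding),
            conv_output_size(width, filter_size, stride, padding),
            depth)

# First Conv2D of the MNIST model: 28x28 input, 3x3 filters, 32 filters
print(feature_map_shape(28, 28, filter_size=3, depth=32))             # (26, 26, 32)
# Same layer with stride 2
print(feature_map_shape(28, 28, filter_size=3, depth=32, stride=2))   # (13, 13, 32)
# One pixel of zero-padding on each side keeps the spatial size
print(feature_map_shape(28, 28, filter_size=3, depth=32, padding=1))  # (28, 28, 32)
```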

### The max pooling operation


Role of max pooling: to aggressively downsample feature maps.

Feature maps are transformed via a hardcoded max tensor operation.

We need the features from the last convolution layer to contain information about the totality of the input.

Without max pooling, the final feature map has 22 × 22 × 128 = 61,952 total coefficients per sample.

This is far too large for such a small model and would result in intense overfitting.
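A minimal NumPy sketch of 2x2 max pooling on a single-channel feature map is shown below, followed by the coefficient counts for the MNIST model with and without pooling. The helper name `max_pool_2x2` and the toy 4x4 values are assumptions for illustration; Keras' `MaxPooling2D` also handles batches and channels.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 on a 2D array (hypothetical helper).

    Odd trailing rows or columns are dropped, like pooling without padding.
    """
    height, width = feature_map.shape
    h, w = height // 2, width // 2
    trimmed = feature_map[:h * 2, :w * 2]
    # Split into (h, 2, w, 2) blocks and take the max inside each 2x2 block
    return trimmed.reshape(h, 2, w, 2).max(axis=(1, 3))

feature_map = np.array([[1, 3, 2, 0],
                        [4, 2, 1, 1],
                        [0, 0, 5, 6],
                        [1, 2, 7, 8]])
pooled = max_pool_2x2(feature_map)
print(pooled)   # [[4 2]
                #  [2 8]]

# Final feature-map size of the small MNIST convnet
without_pooling = 22 * 22 * 128   # 28 -> 26 -> 24 -> 22
with_pooling = 3 * 3 * 128        # 28 -> 26 -> 13 -> 11 -> 5 -> 3
print(without_pooling, with_pooling)
```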

In [None]:
# Without max pooling
inputs = keras.Input(shape=(28, 28, 1))
x = layers.Conv2D(filters=32, kernel_size=3, activation="relu")(inputs)
x = layers.Conv2D(filters=64, kernel_size=3, activation="relu")(x)
x = layers.Conv2D(filters=128, kernel_size=3, activation="relu")(x)
x = layers.Flatten()(x)
outputs = layers.Dense(10, activation="softmax")(x)
model_no_max_pool = keras.Model(inputs=inputs, outputs=outputs)

In [None]:
model_no_max_pool.summary()
# Far too many parameters for the size of the model.

In [None]:
# No max pooling, but with a stride of 2
inputs = keras.Input(shape=(28, 28, 1))
x = layers.Conv2D(filters=32, kernel_size=3, strides=2, activation="relu")(inputs) # stride set to 2
x = layers.Conv2D(filters=64, kernel_size=3, activation="relu")(x)
x = layers.Conv2D(filters=128, kernel_size=3, activation="relu")(x)
x = layers.Flatten()(x)
outputs = layers.Dense(10, activation="softmax")(x)
model_no_max_pool = keras.Model(inputs=inputs, outputs=outputs)

In [None]:
model_no_max_pool.summary()
# The parameter count drops considerably, but max pooling gives better results.
# For classification, max pooling is generally used more often than strides.
# Empirically, max pooling usually works better than average pooling.

## 2. Training a convnet from scratch on a small dataset

Downloading a Kaggle dataset in Google Colaboratory

Access to the API is restricted to Kaggle users, so you need to authenticate yourself.

The kaggle package will look for your login credentials in a JSON file located at ~/.kaggle/kaggle.json.

First, you need to create a Kaggle API key and download it to your local machine:
Login --> My Account --> Account settings --> API, then click the Create New API Token button.

Second, go to your Colab notebook and upload the API key’s JSON file to your Colab session by running the following code in a notebook cell:

### Loading the data

In [None]:
from google.colab import files
files.upload()

In [None]:
!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle/
!chmod 600 ~/.kaggle/kaggle.json

In [None]:
!kaggle competitions download -c dogs-vs-cats

In [None]:
import os
os.listdir()

In [None]:
!unzip -qq dogs-vs-cats.zip

In [None]:
os.listdir()

In [None]:
!unzip -qq train.zip

In [None]:
os.listdir()

In [None]:
os.listdir('train')

### Copying images to training, validation, and test directories

Preprocessing step: split the unorganized images into train, validation, and test subsets containing 1,000, 500, and 1,000 images per class, respectively.

In [None]:
import os, shutil, pathlib

original_dir = pathlib.Path("train")
# Directory where the original dataset was uncompressed
new_base_dir = pathlib.Path("cats_vs_dogs_small")
# Directory where the smaller dataset will be stored

def make_subset(subset_name, start_index, end_index):
    for category in ("cat", "dog"):
        subset_dir = new_base_dir / subset_name / category
        os.makedirs(subset_dir)
        # Create the new directory, e.g. cats_vs_dogs_small/train/dog
        fnames = [f"{category}.{i}.jpg" for i in range(start_index, end_index)]
        # Build the file names
        for fname in fnames:
            shutil.copyfile(src=original_dir / fname,
                            dst=subset_dir / fname)
            # src: source, dst: destination

make_subset("train", start_index=0, end_index=1000)
# The first 1,000 images of each class form the training set
make_subset("validation", start_index=1000, end_index=1500)
# The next 500 images of each class form the validation set
make_subset("test", start_index=1500, end_index=2500)
# The next 1,000 images of each class form the test set

In [None]:
os.listdir(new_base_dir)

In [None]:
# Same as the cell above
os.listdir('cats_vs_dogs_small')

In [None]:
os.listdir('cats_vs_dogs_small/test')

In [None]:
os.listdir('cats_vs_dogs_small/test/dog')
# Contains the dog files with indices 1500 through 2499

In [None]:
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(180, 180, 3))
# RGB images of size 180x180
x = layers.Rescaling(1./255)(inputs)
# Rescale pixel values to the [0, 1] range
x = layers.Conv2D(filters=32, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=64, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=128, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=256, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=256, kernel_size=3, activation="relu")(x)
x = layers.Flatten()(x)
outputs = layers.Dense(1, activation="sigmoid")(x)
# Binary classification, so the output activation is sigmoid
model = keras.Model(inputs=inputs, outputs=outputs)

In [None]:
model.summary()

# Height and width keep shrinking while the depth keeps growing.

In [None]:
model.compile(loss="binary_crossentropy",
              optimizer="rmsprop",
              metrics=["accuracy"])

### Data preprocessing

1. Read the picture files.
2. Decode the JPEG content to RGB grids of pixels.
3. Convert these into floating-point tensors.
4. Resize them to a shared size (we’ll use 180 × 180).
5. Pack them into batches (we’ll use batches of 32 images).
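As a quick sanity check on step 5, the arithmetic below works out what batching should produce for the training subset. It assumes the training directory holds 1,000 cat and 1,000 dog images, as created by `make_subset` earlier; the variable names are made up for the example.

```python
import math

num_images = 2 * 1000      # assumed: 1,000 cats + 1,000 dogs in the training subset
batch_size = 32
image_size = (180, 180)

num_batches = math.ceil(num_images / batch_size)
last_batch_size = num_images - (num_batches - 1) * batch_size

full_batch_shape = (batch_size,) + image_size + (3,)   # RGB images
print(num_batches, last_batch_size, full_batch_shape)
```

The last batch is smaller than the others because 2,000 is not a multiple of 32.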

In [None]:
# Using image_dataset_from_directory to read images

from tensorflow.keras.utils import image_dataset_from_directory

train_dataset = image_dataset_from_directory(
    new_base_dir / "train",
    image_size=(180, 180),
    batch_size=32)
validation_dataset = image_dataset_from_directory(
    new_base_dir / "validation",
    image_size=(180, 180),
    batch_size=32)
test_dataset = image_dataset_from_directory(
    new_base_dir / "test",
    image_size=(180, 180),
    batch_size=32)

### Example

#### Understanding TensorFlow Dataset objects



TensorFlow makes available the tf.data API to create efficient input pipelines.

The Dataset class handles many key features that would otherwise be cumbersome to implement yourself, in particular asynchronous data prefetching.

The Dataset class also exposes a functional-style API for modifying datasets.

In [None]:
import numpy as np
import tensorflow as tf
random_numbers = np.random.normal(size=(1000, 16))
dataset = tf.data.Dataset.from_tensor_slices(random_numbers)
# The from_tensor_slices() class method can be used to create a Dataset from a NumPy array

In [None]:
# Yielding single samples

for i, element in enumerate(dataset):
    print(element.shape)
    if i >= 2:
        break

In [None]:
# We can use .batch() method to batch the data

batched_dataset = dataset.batch(32)
for i, element in enumerate(batched_dataset):
    print(element.shape)
    if i >= 2:
        break

#### Range of useful dataset methods

* .shuffle(buffer_size) : Shuffles elements within a buffer
* .prefetch(buffer_size) : Prefetches a buffer of elements in GPU memory to achieve better device utilization
* .map(callable) : Applies an arbitrary transformation to each element of the dataset
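What "shuffles elements within a buffer" means can be sketched in plain Python. This is a conceptual model only, not TensorFlow's implementation: `buffered_shuffle` is a hypothetical generator that keeps `buffer_size` elements in memory and emits a random one each time a new element arrives.

```python
import random

def buffered_shuffle(items, buffer_size, seed=0):
    """Yield items in a partially shuffled order using a fixed-size buffer.

    Hypothetical helper for illustration; buffer_size must be at least 1.
    """
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            buffer.append(item)          # still filling the buffer
            continue
        idx = rng.randrange(buffer_size)
        yield buffer[idx]                # emit a random buffered element
        buffer[idx] = item               # and replace it with the new one
    rng.shuffle(buffer)                  # drain what is left in random order
    yield from buffer

shuffled = list(buffered_shuffle(range(10), buffer_size=3))
print(shuffled)
```

With a small buffer, an element can only move a limited distance from its original position, so a buffer at least as large as the dataset is needed for a full shuffle.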

In [None]:
reshaped_dataset = dataset.map(lambda x: tf.reshape(x, (4, 4)))
for i, element in enumerate(reshaped_dataset):
    print(element.shape)
    if i >= 2:
        break

In [None]:
reshaped_dataset = dataset.map(lambda x: tf.reshape(x, (4, 4))).batch(32)
for i, element in enumerate(reshaped_dataset):
    print(element.shape)
    if i >= 2:
        break

### Back to the original problem

In [None]:
# Displaying the shapes of the data and labels yielded by the Dataset

for data_batch, labels_batch in train_dataset:
    print("data batch shape:", data_batch.shape)
    print("labels batch shape:", labels_batch.shape)
    break

In [None]:
# Fitting the model using a Dataset

callbacks = [
    keras.callbacks.ModelCheckpoint(
        filepath="convnet_from_scratch.keras",
        save_best_only=True,
        monitor="val_loss")
]
history = model.fit(
    train_dataset,
    epochs=30,
    validation_data=validation_dataset,
    callbacks=callbacks)

In [None]:
# Displaying curves of loss and accuracy during training

import matplotlib.pyplot as plt

accuracy = history.history["accuracy"]
val_accuracy = history.history["val_accuracy"]
loss = history.history["loss"]
val_loss = history.history["val_loss"]
epochs = range(1, len(accuracy) + 1)
plt.plot(epochs, accuracy, "bo", label="Training accuracy")
plt.plot(epochs, val_accuracy, "b", label="Validation accuracy")
plt.title("Training and validation accuracy")
plt.legend()
plt.figure()
plt.plot(epochs, loss, "bo", label="Training loss")
plt.plot(epochs, val_loss, "b", label="Validation loss")
plt.title("Training and validation loss")
plt.legend()
plt.show()

In [None]:
# Evaluating the model on the test set
# With only 2,000 training samples, overfitting is to be expected.

test_model = keras.models.load_model("convnet_from_scratch.keras")
test_loss, test_acc = test_model.evaluate(test_dataset)
print(f"Test accuracy: {test_acc:.3f}")

### Using data augmentation to prevent overfitting

* **Data augmentation** takes the approach of generating more training data from existing training samples by **augmenting the samples via a number of random transformations** that yield believable-looking images.

* In Keras, this can be done by adding a number of data augmentation layers at the start of your model.

In [None]:
# A data_augmentation stage like the following can be inserted into the model.

data_augmentation = keras.Sequential(
    [
        layers.RandomFlip("horizontal"),
        layers.RandomRotation(0.1),
        layers.RandomZoom(0.2),
    ]
)

**RandomFlip**("horizontal")
applies horizontal flipping to a random 50% of the images that go through it.

**RandomRotation**(0.1)
rotates the input images by a random value in the range [-10%, +10%]. These are fractions of a full circle, so in degrees the range is [-36°, +36°].

**RandomZoom**(0.2)
zooms in or out of the image by a random factor in the range [-20%, +20%].
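Because the `RandomRotation` factor is expressed as a fraction of a full circle, it can help to convert it to degrees or radians. The helper below is a hypothetical convenience function written for this note, not a Keras API.

```python
import math

def rotation_range(factor):
    """Convert a RandomRotation-style factor (fraction of a full circle) to angle ranges."""
    degrees = factor * 360.0
    radians = factor * 2.0 * math.pi
    return {"degrees": (-degrees, degrees), "radians": (-radians, radians)}

ranges = rotation_range(0.1)
print(ranges["degrees"])   # roughly (-36, 36)
print(ranges["radians"])   # roughly (-0.63, 0.63)
```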

In [None]:
plt.figure(figsize=(10, 10))
for images, _ in train_dataset.take(1):
# We can use .take(N) to only sample N batches from the dataset. This is equivalent to inserting a break in the loop after the Nth batch
    for i in range(9):
        augmented_images = data_augmentation(images)
        # apply the augmentation
        ax = plt.subplot(3, 3, i + 1)
        plt.imshow(augmented_images[0].numpy().astype("uint8"))
        # Display the first image in the output batch.
        # For each of the 9 iteration, this is a different augmentation of the same image
        plt.axis("off")

# Enlarging the effective dataset through augmentation helps prevent overfitting.

### Defining a new convnet

In [None]:
# New convnet includes Image augmentation and dropout

inputs = keras.Input(shape=(180, 180, 3))
x = data_augmentation(inputs) # augmentation
x = layers.Rescaling(1./255)(x)
x = layers.Conv2D(filters=32, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=64, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=128, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=256, kernel_size=3, activation="relu")(x)
x = layers.MaxPooling2D(pool_size=2)(x)
x = layers.Conv2D(filters=256, kernel_size=3, activation="relu")(x)
x = layers.Flatten()(x)
x = layers.Dropout(0.5)(x) # dropout
# Applying dropout to convolution layers is not recommended.
# Standard Dropout is generally not used on convolution layers.
outputs = layers.Dense(1, activation="sigmoid")(x)
model = keras.Model(inputs=inputs, outputs=outputs)

model.compile(loss="binary_crossentropy",
              optimizer="rmsprop",
              metrics=["accuracy"])

In [None]:
# Training the regularized convnet

callbacks = [
    keras.callbacks.ModelCheckpoint(
        filepath="convnet_from_scratch_with_augmentation.keras",
        save_best_only=True,
        monitor="val_loss")
]

history = model.fit(
    train_dataset,
    epochs=100,
    validation_data=validation_dataset,
    callbacks=callbacks)

In [None]:
# Evaluating the model on the test set

test_model = keras.models.load_model(
    "convnet_from_scratch_with_augmentation.keras")
test_loss, test_acc = test_model.evaluate(test_dataset)
print(f"Test accuracy: {test_acc:.3f}")
# The result is much better than without dropout and augmentation.

## 3. Leveraging a pretrained model

* A common and highly effective approach to deep learning on small image datasets is to use a pretrained model.

* A **pretrained network** is a saved network that was previously trained on a large dataset.

* Motivations:

    Training and tuning a neural network from scratch requires lots of data, time, and resources.

    Adapting a pretrained network is a cheaper and faster alternative that exploits its generalization properties.

1. Take a top-performing pretrained network (the convolutional base).
2. If we have a small amount of data

    Freeze the whole network + new softmax layer for cats and dogs

    Only the new softmax layer for cats and dogs is trained.

3. If we have more data

    Freeze part of the network + new softmax layer for cats and dogs

    Part of the top-performing pretrained network is trained as well.

* List of image classification models (all pretrained on the ImageNet dataset) that are available as part of Keras: Xception, Inception V3, ResNet50, VGG16, VGG19, MobileNet

* More are available from TensorFlow Hub

In [None]:
# Instantiating the VGG16 convolutional base

conv_base = keras.applications.vgg16.VGG16(
    weights="imagenet",
    include_top=False, # drop the classifier part and keep only the convolutional base
    input_shape=(180, 180, 3))

In [None]:
conv_base.summary()

### Fast feature extraction without data augmentation

We’ll start by extracting features as NumPy arrays by calling the predict() method of the conv_base model on our training, validation, and test datasets.

In [None]:
# Extracting the VGG16 features and corresponding labels

def get_features_and_labels(dataset):
    all_features = []
    all_labels = []
    for images, labels in dataset:
        preprocessed_images = keras.applications.vgg16.preprocess_input(images)
        # vgg16 pretrained network
        features = conv_base.predict(preprocessed_images)
        all_features.append(features)
        all_labels.append(labels)
    return np.concatenate(all_features), np.concatenate(all_labels)

train_features, train_labels =  get_features_and_labels(train_dataset)
val_features, val_labels =  get_features_and_labels(validation_dataset)
test_features, test_labels =  get_features_and_labels(test_dataset)

In [None]:
train_features.shape

In [None]:
# Defining and training the densely connected classifier
# add last layer
# training is very fast because we only have to deal with two dense layers

inputs = keras.Input(shape=(5, 5, 512))
x = layers.Flatten()(inputs)
x = layers.Dense(256)(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(1, activation="sigmoid")(x)
model = keras.Model(inputs, outputs)
model.compile(loss="binary_crossentropy",
              optimizer="rmsprop",
              metrics=["accuracy"])

callbacks = [
    keras.callbacks.ModelCheckpoint(
      filepath="feature_extraction.keras",
      save_best_only=True,
      monitor="val_loss")
]
history = model.fit(
    train_features, train_labels,
    epochs=20,
    validation_data=(val_features, val_labels),
    callbacks=callbacks)

In [None]:
import matplotlib.pyplot as plt
acc = history.history["accuracy"]
val_acc = history.history["val_accuracy"]
loss = history.history["loss"]
val_loss = history.history["val_loss"]
epochs = range(1, len(acc) + 1)
plt.plot(epochs, acc, "bo", label="Training accuracy")
plt.plot(epochs, val_acc, "b", label="Validation accuracy")
plt.title("Training and validation accuracy")
plt.legend()
plt.figure()
plt.plot(epochs, loss, "bo", label="Training loss")
plt.plot(epochs, val_loss, "b", label="Validation loss")
plt.title("Training and validation loss")
plt.legend()
plt.show()

# The results are good even though only two dense layers were used.

### Fast feature extraction with data augmentation

Create a new model that chains together: 

1) data augmentation

2) a frozen convolutional base

3) a dense classifier

In [None]:
# Instantiating and freezing the VGG16 convolutional base

conv_base  = keras.applications.vgg16.VGG16(
    weights="imagenet",
    include_top=False) # only get convolutional base part
conv_base.trainable = False # conv_base is already well trained, so we do not train it

Printing the list of trainable weights before and after freezing

In [None]:
conv_base.trainable = True
print("This is the number of trainable weights "
      "before freezing the conv base:", len(conv_base.trainable_weights))

In [None]:
conv_base.trainable = False
print("This is the number of trainable weights "
      "after freezing the conv base:", len(conv_base.trainable_weights))

In [None]:
# Adding a data augmentation stage and a classifier to the convolutional base

data_augmentation = keras.Sequential(
    [
        layers.RandomFlip("horizontal"),
        layers.RandomRotation(0.1),
        layers.RandomZoom(0.2),
    ]
)

inputs = keras.Input(shape=(180, 180, 3))
x = data_augmentation(inputs) # apply data augmentation
x = keras.applications.vgg16.preprocess_input(x) # apply input value scaling
x = conv_base(x)
x = layers.Flatten()(x)
x = layers.Dense(256)(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(1, activation="sigmoid")(x)
model = keras.Model(inputs, outputs)
model.compile(loss="binary_crossentropy",
              optimizer="rmsprop",
              metrics=["accuracy"])

In [None]:
callbacks = [
    keras.callbacks.ModelCheckpoint(
        filepath="feature_extraction_with_data_augmentation.keras",
        save_best_only=True,
        monitor="val_loss")
]

history = model.fit(
    train_dataset,
    epochs=50,
    validation_data=validation_dataset,
    callbacks=callbacks)

In [None]:
# Evaluating the model on the test set

test_model = keras.models.load_model(
    "feature_extraction_with_data_augmentation.keras")
test_loss, test_acc = test_model.evaluate(test_dataset)
print(f"Test accuracy: {test_acc:.3f}")

# The result is very slightly better than before.

### Fine tuning a pretrained model

Fine-tuning consists of unfreezing a few of the top layers of a frozen model base used for feature extraction, and jointly training both the newly added part of the model and these unfrozen top layers.

Here, we unfreeze the last convolution block and train it together with the classifier.

#### Steps

1. Add your custom network on top of an already trained base network
2. Freeze the base network
3. Train the part you added
4. Unfreeze some layers in the base network
5. Jointly train both these layers and the part you added

In [None]:
# Freezing all layers until the fourth from the last

conv_base.trainable = True
for layer in conv_base.layers[:-4]:
    layer.trainable = False

In [None]:
model.compile(loss="binary_crossentropy",
              optimizer=keras.optimizers.RMSprop(learning_rate=1e-5),
              # we use smaller lr
              metrics=["accuracy"])

callbacks = [
    keras.callbacks.ModelCheckpoint(
        filepath="fine_tuning.keras",
        save_best_only=True,
        monitor="val_loss")
]

history = model.fit(
    train_dataset,
    epochs=30,
    validation_data=validation_dataset,
    callbacks=callbacks)

In [None]:
model = keras.models.load_model("fine_tuning.keras")
test_loss, test_acc = model.evaluate(test_dataset)
print(f"Test accuracy: {test_acc:.3f}")

# Many times it will improve the results

1. Convnets are the best type of machine learning models for computer vision
2. On a small dataset, overfitting will be the main issue. Data augmentation is a powerful way to fight it
3. It’s easy to reuse an existing convnet on a new dataset via transfer learning
4. As a complement to feature extraction, you can use fine-tuning

# Ch9. Advanced deep learning for computer vision

1. Three essential computer vision tasks
2. An image segmentation example
3. Modern convnet architecture patterns
4. Interpreting what convnets learn

## 9.1. Three essential computer vision tasks

1. **Image classification**: assign one or more labels to an image
2. **Image segmentation**: the goal is to “segment” or “partition” an image into different areas, with each area usually representing a category
3. **Object detection**: the goal is to draw rectangles (called bounding boxes) around objects of interest in an image, and associate each rectangle with a class

## 9.2. Image segmentation example

Image segmentation with deep learning is about using a model to assign a class to each pixel in an image (such as “background” and “foreground,” or “road,” “car,” and “sidewalk”).

* **Semantic segmentation**, where each pixel is independently classified into a
semantic category

* **Instance segmentation**, which seeks not only to classify image pixels by
category, but also to parse out individual object instances

## Oxford IIIT Pets dataset

Contains 7,390 pictures of various breeds of cats and dogs, together with foreground-background segmentation masks.

A **segmentation mask** is the image-segmentation equivalent of a label: it’s an image the same size as the input image, with a single color channel where each integer value corresponds to a class: 1 (foreground), 2 (background), and 3 (contour).
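To make the mask format concrete, here is a toy 4x4 trimap built by hand in NumPy. The pixel values are invented for illustration and do not come from the dataset. Subtracting 1 turns the labels 1, 2, 3 into the class indices 0, 1, 2 that a sparse categorical loss expects.

```python
import numpy as np

# Made-up trimap: 1 = foreground, 2 = background, 3 = contour
trimap = np.array([[2, 2, 2, 2],
                   [2, 3, 3, 2],
                   [2, 3, 1, 3],
                   [2, 3, 3, 2]], dtype="uint8")

class_ids = trimap - 1                 # 0 = foreground, 1 = background, 2 = contour
mask = class_ids[..., np.newaxis]      # add the single color channel: (4, 4, 1)

labels, counts = np.unique(class_ids, return_counts=True)
print(mask.shape)
print(dict(zip(labels.tolist(), counts.tolist())))   # pixels per class
```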

In [None]:
# download data

!wget http://www.robots.ox.ac.uk/~vgg/data/pets/data/images.tar.gz
!wget http://www.robots.ox.ac.uk/~vgg/data/pets/data/annotations.tar.gz
!tar -xf images.tar.gz
!tar -xf annotations.tar.gz

# !wget : download a file from a URL
# !tar : extract the archive

In [None]:
# List the files in the current directory

!ls

In [None]:
# List the files in the current directory

import os
os.listdir()

In [None]:
os.listdir('images')

In [None]:
fnms1 = os.listdir('images')
len(fnms1)

In [None]:
os.listdir('annotations')
# annotations: the labels that accompany the images

In [None]:
!cat annotations/README

In [None]:
os.listdir('annotations/trimaps/')

In [None]:
fnms2 = os.listdir('annotations/trimaps/')
len(fnms2)
# Larger than len(fnms1): the directory contains extra duplicate files

In [None]:
import os

input_dir = "images/"
target_dir = "annotations/trimaps/"

input_img_paths = sorted(
    [os.path.join(input_dir, fname)     # join the directory and the file name
     for fname in os.listdir(input_dir) # for every fname in input_dir
     if fname.endswith(".jpg")])        # keep only names ending in .jpg

target_paths = sorted(
    [os.path.join(target_dir, fname)
     for fname in os.listdir(target_dir)
     if fname.endswith(".png") and not fname.startswith(".")]) # skip the duplicate hidden files

In [None]:
input_img_paths[:5]

In [None]:
target_paths[:5]

In [None]:
len(input_img_paths)

In [None]:
len(target_paths)
# The duplicate files were filtered out successfully

In [None]:
# The 10th image

import matplotlib.pyplot as plt
from tensorflow.keras.utils import load_img, img_to_array

plt.axis("off")
plt.imshow(load_img(input_img_paths[9]))

In [None]:
# annotation

def display_target(target_array):
    normalized_array = (target_array.astype("uint8") - 1) * 127
    plt.axis("off")
    plt.imshow(normalized_array[:, :, 0])

img = img_to_array(load_img(target_paths[9], color_mode="grayscale"))
display_target(img)

In [None]:
# Load our inputs and targets into two NumPy arrays

import numpy as np
import random

img_size = (200, 200)
# resize everything
num_imgs = len(input_img_paths)
# total number of samples in the data

random.Random(1337).shuffle(input_img_paths)
random.Random(1337).shuffle(target_paths)
# Using the same seed (1337) for both shuffles keeps the inputs and targets in the same order.

def path_to_input_image(path):
    return img_to_array(load_img(path, target_size=img_size))

def path_to_target(path):
    img = img_to_array(
        load_img(path, target_size=img_size, color_mode="grayscale"))
    img = img.astype("uint8") - 1
    return img

input_imgs = np.zeros((num_imgs,) + img_size + (3,), dtype="float32")
# (num_imgs,) is (7390,), img_size is (200, 200) as set above, and (3,) is for RGB,
# so the final shape is (7390, 200, 200, 3)
targets = np.zeros((num_imgs,) + img_size + (1,), dtype="uint8")
# (7390, 200, 200, 1)
# The single channel holds one of 0, 1, or 2 (the original labels 1, 2, 3 minus 1)
for i in range(num_imgs):
    input_imgs[i] = path_to_input_image(input_img_paths[i])
    targets[i] = path_to_target(target_paths[i])

# Reserve 1,000 samples for validation
num_val_samples = 1000

# split the data into training and validation
train_input_imgs = input_imgs[:-num_val_samples]
train_targets = targets[:-num_val_samples]
val_input_imgs = input_imgs[-num_val_samples:]
val_targets = targets[-num_val_samples:]

In [None]:
input_imgs.shape

In [None]:
targets.shape

In [None]:
# modeling

from tensorflow import keras
from tensorflow.keras import layers

def get_model(img_size, num_classes):
    inputs = keras.Input(shape=img_size + (3,)) # (200, 200, 3)
    x = layers.Rescaling(1./255)(inputs) # rescale

    x = layers.Conv2D(64, 3, strides=2, activation="relu", padding="same")(x)
    x = layers.Conv2D(64, 3, activation="relu", padding="same")(x)
    x = layers.Conv2D(128, 3, strides=2, activation="relu", padding="same")(x)
    x = layers.Conv2D(128, 3, activation="relu", padding="same")(x)
    x = layers.Conv2D(256, 3, strides=2, padding="same", activation="relu")(x)
    x = layers.Conv2D(256, 3, activation="relu", padding="same")(x)
    # Downsample with strides instead of max pooling

    x = layers.Conv2DTranspose(256, 3, activation="relu", padding="same")(x)
    x = layers.Conv2DTranspose(256, 3, activation="relu", padding="same", strides=2)(x)
    x = layers.Conv2DTranspose(128, 3, activation="relu", padding="same")(x)
    x = layers.Conv2DTranspose(128, 3, activation="relu", padding="same", strides=2)(x)
    x = layers.Conv2DTranspose(64, 3, activation="relu", padding="same")(x)
    x = layers.Conv2DTranspose(64, 3, activation="relu", padding="same", strides=2)(x)

    outputs = layers.Conv2D(num_classes, 3, activation="softmax", padding="same")(x)

    model = keras.Model(inputs, outputs)
    return model

In [None]:
model = get_model(img_size=img_size, num_classes=3)
model.summary()

#### The first half of the model

The first half of the model closely resembles the kind of convnet you’d use for image classification.

It encodes the images into smaller feature maps that contain spatial information about the original image.

We downsample by adding strides rather than using max pooling because we care a lot about the spatial location of information: **max pooling destroys location information**, whereas strided convolutions retain it.

#### The second half of the model

The second half of the model is a stack of Conv2DTranspose layers, which apply a kind of inverse of the transformations applied so far.

They are a transformation going in the opposite direction of convolutions.
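The symmetry between the two halves can be traced with simple size arithmetic. The sketch assumes `padding="same"` throughout, as in `get_model`: a strided convolution then outputs ceil(size / stride) and a transposed convolution outputs size × stride. The helper names are hypothetical.

```python
import math

def conv_same(size, stride):
    """Spatial size after a Conv2D with padding='same'."""
    return math.ceil(size / stride)

def conv_transpose_same(size, stride):
    """Spatial size after a Conv2DTranspose with padding='same'."""
    return size * stride

size = 200
encoder_sizes = [size]
for stride in (2, 1, 2, 1, 2, 1):          # strides of the six Conv2D layers
    size = conv_same(size, stride)
    encoder_sizes.append(size)

decoder_sizes = [size]
for stride in (1, 2, 1, 2, 1, 2):          # strides of the six Conv2DTranspose layers
    size = conv_transpose_same(size, stride)
    decoder_sizes.append(size)

print(encoder_sizes)   # [200, 100, 100, 50, 50, 25, 25]
print(decoder_sizes)   # [25, 25, 50, 50, 100, 100, 200]
```

The decoder brings the 25 × 25 feature maps back to 200 × 200, so the final softmax layer can assign a class to every pixel of the input.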

### Up sampling

Motivation : Need a transformation going in the opposite direction of convolutions

* Generating images involving up sampling from low resolution to high resolution

* Decoding layer of a convolutional auto encoder

Neural network upsampling methods: transposed convolution, fractionally strided convolution

### Transposed convolution

* Going backward through a convolution operation such that it keeps a similar positional connectivity and forms a one-to-many relationship

* We can express a convolution operation using a convolution matrix, which is nothing but the kernel weights rearranged into a matrix

* We similarly express a transposed convolution using a transposed convolution matrix, whose layout is that of the transposed convolution matrix but whose actual weight values do not have to come from the original convolution matrix
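The convolution-matrix view can be sketched in NumPy for a 4x4 input and a 3x3 kernel. The helper `convolution_matrix` and the example values are assumptions made for illustration. For simplicity the transposed step reuses the same weights, whereas a `Conv2DTranspose` layer learns its own.

```python
import numpy as np

def convolution_matrix(kernel, input_shape):
    """Rearrange a kernel into a matrix C so that C @ image.ravel() is the
    stride-1 convolution without padding, flattened (hypothetical helper)."""
    kh, kw = kernel.shape
    ih, iw = input_shape
    oh, ow = ih - kh + 1, iw - kw + 1
    matrix = np.zeros((oh * ow, ih * iw))
    for r in range(oh):
        for c in range(ow):
            placed = np.zeros((ih, iw))
            placed[r:r + kh, c:c + kw] = kernel    # kernel at output position (r, c)
            matrix[r * ow + c] = placed.ravel()
    return matrix

image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.arange(1, 10, dtype=float).reshape(3, 3)

conv_matrix = convolution_matrix(kernel, image.shape)          # shape (4, 16)
feature_map = (conv_matrix @ image.ravel()).reshape(2, 2)      # 4x4 -> 2x2

# Transposed convolution: the transposed layout maps 4 values back to 16
upsampled = (conv_matrix.T @ feature_map.ravel()).reshape(4, 4)

print(conv_matrix.shape, feature_map.shape, upsampled.shape)
```

Each row of `conv_matrix` has nine nonzero entries, so in the transposed direction every value of the small feature map spreads over nine positions of the larger output. This is the one-to-many relationship mentioned above. The upsampled result has the original shape but not the original values.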

In [None]:
# compile and fit

model.compile(optimizer="rmsprop", loss="sparse_categorical_crossentropy")
# With one-hot encoded targets, categorical_crossentropy could be used as the loss instead
# Here the targets take the integer values 0, 1, 2, so we use sparse_categorical_crossentropy

callbacks = [
    keras.callbacks.ModelCheckpoint("oxford_segmentation.keras",
                                    save_best_only=True)
]

history = model.fit(train_input_imgs, train_targets,
                    epochs=50,
                    callbacks=callbacks,
                    batch_size=64,
                    validation_data=(val_input_imgs, val_targets))

In [None]:
epochs = range(1, len(history.history["loss"]) + 1)
loss = history.history["loss"]
val_loss = history.history["val_loss"]
plt.figure()
plt.plot(epochs, loss, "bo", label="Training loss")
plt.plot(epochs, val_loss, "b", label="Validation loss")
plt.title("Training and validation loss")
plt.legend()

Reload our best performing model according to the validation loss,
and demonstrate how to use it to predict a segmentation mask

In [None]:
from tensorflow.keras.utils import array_to_img

model = keras.models.load_model("oxford_segmentation.keras")

i = 4
test_image = val_input_imgs[i]
plt.axis("off")
plt.imshow(array_to_img(test_image))

mask = model.predict(np.expand_dims(test_image, 0))[0]

def display_mask(pred):
    mask = np.argmax(pred, axis=-1)
    mask *= 127
    plt.axis("off")
    plt.imshow(mask)

display_mask(mask)

## 9.3. Modern convnet architecture patterns

A good model architecture is one that reduces the size of the search space or
otherwise makes it easier to converge to a good point of the search space.

Model architecture is more an art than a science. Experienced machine learning
engineers are able to intuitively cobble together high-performing models on
their first try, while beginners often struggle to create a model that trains at all.

You’ll develop your own intuition throughout this book.

In the following sections, we’ll review a few essential convnet architecture best
practices: **residual connections, batch normalization, and separable convolutions**.

We will apply them to our cat vs. dog classification problem.

### Residual connections

Stacking too many layers can prevent training from converging.

Residual connections make it possible to stack many layers without running into this problem.

The residual connection acts as an information shortcut around destructive or noisy blocks.

In [None]:
# Residual block where the number of filters changes

from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(32, 32, 3))
x = layers.Conv2D(32, 3, activation="relu")(inputs)
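# x: (30, 30, 32), since the default "valid" padding trims the 32x32 input
residual = x
x = layers.Conv2D(64, 3, activation="relu", padding="same")(x)
# x: (30, 30, 64)
residual = layers.Conv2D(64, 1)(residual)
# The channel counts differ (32 vs. 64), so a 1x1 Conv2D projects the residual to the matching shape.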
x = layers.add([x, residual])

In [None]:
# If the block includes maxpooling layer

inputs = keras.Input(shape=(32, 32, 3))
x = layers.Conv2D(32, 3, activation="relu")(inputs)
# x: (30, 30, 32)
residual = x
x = layers.Conv2D(64, 3, activation="relu", padding="same")(x)
x = layers.MaxPooling2D(2, padding="same")(x)
# x: (15, 15, 64)
residual = layers.Conv2D(64, 1, strides=2)(residual)  # 1x1 Conv2D with strides=2 to match the downsampling
# residual: (15, 15, 64)
x = layers.add([x, residual])
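
The tensor shapes in the two residual blocks can be checked by hand with the usual output-size formulas. The sketch below is a plain-Python approximation of those rules (the helper `conv_output_size` is hypothetical, not a Keras function). Because the first `Conv2D` uses the default `"valid"` padding, the 32x32 input shrinks to 30x30 before the residual branch starts.

```python
import math

def conv_output_size(size, kernel_size, strides=1, padding="valid"):
    """Spatial output size of a Keras-style convolution or pooling layer."""
    if padding == "same":
        return math.ceil(size / strides)
    return (size - kernel_size) // strides + 1

after_first_conv = conv_output_size(32, kernel_size=3)                      # "valid" padding
after_second_conv = conv_output_size(after_first_conv, 3, padding="same")
after_pool = conv_output_size(after_second_conv, 2, strides=2, padding="same")
residual_size = conv_output_size(after_first_conv, 1, strides=2)            # 1x1 conv, stride 2

print(after_first_conv, after_second_conv, after_pool, residual_size)
# prints: 30 30 15 15
```

The main branch and the strided 1x1 projection both end at 15x15, which is what allows `layers.add` to combine them.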


### Batch normalization

Internal covariate shift: the change in the distribution of each layer’s inputs during
training as the parameters of the previous layers change.

* Inputs to each layer are affected by the parameters of all preceding layers, so small changes to the network parameters amplify as the network becomes deeper

* This requires a lower learning rate and careful parameter initialization, which slows down training and makes it notoriously hard to train models with saturating nonlinearities.

The BN transform can be freely added to any subset of activations to be normalized.

The author generally recommends placing the previous layer’s activation after
the batch normalization layer (although this is still a subject of debate).
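
As a rough illustration of what the BN transform computes during training, here is a NumPy sketch that normalizes each channel over the batch and spatial axes. The function name `batch_norm_train` and the value of `eps` are assumptions for this sketch; the moving averages that the Keras layer keeps for inference are left out.

```python
import numpy as np

def batch_norm_train(x, gamma, beta, eps=1e-3):
    """Training-mode batch normalization over all axes except the last (channels)."""
    axes = tuple(range(x.ndim - 1))
    mean = x.mean(axis=axes, keepdims=True)
    var = x.var(axis=axes, keepdims=True)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(8, 4, 4, 2))   # (batch, height, width, channels)
gamma = np.ones(2)    # learned scale, shown at its initial value
beta = np.zeros(2)    # learned offset, shown at its initial value

y = batch_norm_train(x, gamma, beta)

# A constant per-channel bias added before normalization is cancelled by the
# mean subtraction, so the layer feeding into BN does not need its own bias.
bias = np.array([10.0, -4.0])
y_with_bias = batch_norm_train(x + bias, gamma, beta)
```

The last two lines show why the convolution placed before `BatchNormalization` in this chapter is created with `use_bias=False`: `y` and `y_with_bias` are numerically identical.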

In [None]:
x = layers.Conv2D(32, 3, use_bias=False)(x)
x = layers.BatchNormalization()(x)
x = layers.Activation("relu")(x)  # recommended: apply the activation after batch normalization

In [None]:
# This ordering also works, but the one above is recommended

x = layers.Conv2D(32, 3, activation="relu")(x)
x = layers.BatchNormalization()(x)

### Depthwise separable convolutions

Depthwise separable convolution (depthwise conv + pointwise conv) is used to
build lightweight CNNs (fewer parameters and multiply-adds) for efficient on-device
intelligence.
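
To see where the savings come from, the weight counts can be compared directly. This sketch assumes a 3x3 kernel, 64 input and 64 output channels, no bias, and a depth multiplier of 1; both helper names are hypothetical.

```python
def conv2d_params(kernel_size, in_channels, out_channels):
    """Weights in a regular convolution (bias omitted)."""
    return kernel_size * kernel_size * in_channels * out_channels

def separable_conv2d_params(kernel_size, in_channels, out_channels):
    """Depthwise weights plus 1x1 pointwise weights (bias omitted)."""
    depthwise = kernel_size * kernel_size * in_channels   # one spatial filter per input channel
    pointwise = in_channels * out_channels                # 1x1 conv that mixes channels
    return depthwise + pointwise

regular = conv2d_params(3, 64, 64)
separable = separable_conv2d_params(3, 64, 64)
print(regular, separable, round(regular / separable, 1))
# prints: 36864 4672 7.9
```

With these sizes the separable layer uses roughly one eighth of the weights, and the gap widens as the kernel size or the number of output channels grows.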

In [None]:
# A mini Xception-like model

inputs = keras.Input(shape=(180, 180, 3))
x = data_augmentation(inputs)

x = layers.Rescaling(1./255)(x)
x = layers.Conv2D(filters=32, kernel_size=5, use_bias=False)(x)

for size in [32, 64, 128, 256, 512]:
    residual = x
    x = layers.BatchNormalization()(x)
    x = layers.Activation("relu")(x)
    x = layers.SeparableConv2D(size, 3, padding="same", use_bias=False)(x)
    # Batch normalization is applied before each activation
    # Conv2D gives slightly better results, but SeparableConv2D is faster.
    x = layers.BatchNormalization()(x)
    x = layers.Activation("relu")(x)
    x = layers.SeparableConv2D(size, 3, padding="same", use_bias=False)(x)
    x = layers.MaxPooling2D(3, strides=2, padding="same")(x)
    residual = layers.Conv2D(
        size, 1, strides=2, padding="same", use_bias=False)(residual)
    x = layers.add([x, residual])

x = layers.GlobalAveragePooling2D()(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(1, activation="sigmoid")(x)

model = keras.Model(inputs=inputs, outputs=outputs)

## 9.4. Interpreting what convnets learn

### Visualizing intermediate activations

The representations learned by convnets are highly amenable to visualization:

* Visualizing intermediate convnet outputs
* Visualizing convnet filters
* Visualizing heatmaps of class activation in an image

In [None]:
# You can use this to load the file "convnet_from_scratch_with_augmentation.keras"
# you obtained in the last chapter.
from google.colab import files
files.upload()

In [None]:
import os

os.listdir()

In [None]:
from tensorflow import keras

model = keras.models.load_model("convnet_from_scratch_with_augmentation.keras")

model.summary()

**Preprocessing a single image**

In [None]:
from tensorflow import keras
import numpy as np

img_path = keras.utils.get_file(
    fname="cat.jpg",
    origin="https://img-datasets.s3.amazonaws.com/cat.jpg")

# convert image to array
def get_img_array(img_path, target_size):
    img = keras.utils.load_img(
        img_path, target_size=target_size)
    array = keras.utils.img_to_array(img)
    array = np.expand_dims(array, axis=0)
    return array

img_tensor = get_img_array(img_path, target_size=(180, 180))

**Displaying the test picture**

In [None]:
import matplotlib.pyplot as plt
plt.axis("off")
plt.imshow(img_tensor[0].astype("uint8"))
plt.show()

**Instantiating a model that returns layer activations**

In [None]:
from tensorflow.keras import layers

layer_outputs = []
layer_names = []
for layer in model.layers:
    if isinstance(layer, (layers.Conv2D, layers.MaxPooling2D)):
        layer_outputs.append(layer.output)
        layer_names.append(layer.name)
activation_model = keras.Model(inputs=model.input, outputs=layer_outputs)

In [None]:
model.layers

In [None]:
layer_names

**Using the model to compute layer activations**

In [None]:
# feed images to activation model

activations = activation_model.predict(img_tensor)

In [None]:
len(activations)

# 9, because the model has 9 Conv2D and MaxPooling2D layers

In [None]:
first_layer_activation = activations[0]
print(first_layer_activation.shape)

**Visualizing the channel at index 5**

In [None]:
# Image after the first convolution layer

import matplotlib.pyplot as plt
plt.matshow(first_layer_activation[0, :, :, 5], cmap="viridis")

In [None]:
# Image after max pooling
plt.matshow(activations[1][0, :, :, 5], cmap="viridis")

In [None]:
# Image after the second convolution layer
# deeper convnet activations are more abstract
plt.matshow(activations[2][0, :, :, 5], cmap="viridis")

In [None]:
# Last activation
plt.matshow(activations[8][0, :, :, 5], cmap="viridis")

**Visualizing every channel in every intermediate activation**

In [None]:
# Code that visualizes every activation output (explanation omitted)

images_per_row = 16
for layer_name, layer_activation in zip(layer_names, activations):
    n_features = layer_activation.shape[-1]
    size = layer_activation.shape[1]
    n_cols = n_features // images_per_row
    display_grid = np.zeros(((size + 1) * n_cols - 1,
                             images_per_row * (size + 1) - 1))
    for col in range(n_cols):
        for row in range(images_per_row):
            channel_index = col * images_per_row + row
            channel_image = layer_activation[0, :, :, channel_index].copy()
            if channel_image.sum() != 0:
                channel_image -= channel_image.mean()
                channel_image /= channel_image.std()
                channel_image *= 64
                channel_image += 128
            channel_image = np.clip(channel_image, 0, 255).astype("uint8")
            display_grid[
                col * (size + 1): (col + 1) * size + col,
                row * (size + 1) : (row + 1) * size + row] = channel_image
    scale = 1. / size
    plt.figure(figsize=(scale * display_grid.shape[1],
                        scale * display_grid.shape[0]))
    plt.title(layer_name)
    plt.grid(False)
    plt.axis("off")
    plt.imshow(display_grid, aspect="auto", cmap="viridis")

    # Because of ReLU, more and more channels go blank in deeper layers.
    # The earlier the layer, the more of the cat's appearance is preserved.

#### Things to note:

* The first layer acts as a collection of various edge detectors; its activations retain almost all of the information present in the initial picture

* As you go deeper, the activations become increasingly abstract and less visually interpretable.

* The sparsity of the activations increases with the depth of the layer
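
The sparsity claim can be checked numerically rather than by eye. The sketch below defines two hypothetical helpers and runs them on a synthetic stand-in for a post-ReLU activation; it assumes activations are non-negative and shaped `(1, height, width, channels)`.

```python
import numpy as np

def activation_sparsity(layer_activation):
    """Fraction of entries that are exactly zero in one activation tensor."""
    return float(np.mean(layer_activation == 0))

def blank_channel_count(layer_activation):
    """Number of channels whose feature map is zero everywhere.

    Assumes non-negative activations, so a channel is blank when its maximum is 0.
    """
    return int(np.sum(layer_activation.max(axis=(0, 1, 2)) == 0))

# Synthetic stand-in for a post-ReLU activation: 4 channels, one of them blank
rng = np.random.default_rng(0)
fake_activation = np.maximum(rng.normal(size=(1, 6, 6, 4)), 0.0)
fake_activation[..., 2] = 0.0

print(activation_sparsity(fake_activation), blank_channel_count(fake_activation))
```

Applied to the real data, one would loop over `zip(layer_names, activations)` and print `activation_sparsity` for each layer to see whether the fraction of zeros grows with depth.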

### Visualizing convnet filters

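Display the visual pattern that each filter is meant to respond to.

This is done with gradient ascent in input space: the values of an input image are adjusted so as to maximize the response of a specific filter.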

**Instantiating the Xception convolutional base**

In [None]:
model = keras.applications.xception.Xception(
    weights="imagenet",
    include_top=False)

**Printing the names of all convolutional layers in Xception**

In [None]:
for layer in model.layers:
    if isinstance(layer, (keras.layers.Conv2D, keras.layers.SeparableConv2D)):
        print(layer.name)

**Creating a feature extractor model**

In [None]:
layer_name = "block3_sepconv1"
layer = model.get_layer(name=layer_name)
feature_extractor = keras.Model(inputs=model.input, outputs=layer.output)

**Using the feature extractor**

In [None]:
activation = feature_extractor(
    keras.applications.xception.preprocess_input(img_tensor)
)

In [None]:
img_tensor.shape

In [None]:
activation.shape

In [None]:
import tensorflow as tf

def compute_loss(image, filter_index):
    activation = feature_extractor(image)
    filter_activation = activation[:, 2:-2, 2:-2, filter_index]  # remove the border
    return tf.reduce_mean(filter_activation)

**Loss maximization via stochastic gradient ascent**

In [None]:
@tf.function
def gradient_ascent_step(image, filter_index, learning_rate):
    with tf.GradientTape() as tape:
        tape.watch(image)
        loss = compute_loss(image, filter_index)
    grads = tape.gradient(loss, image)
    grads = tf.math.l2_normalize(grads)
    image += learning_rate * grads
    return image
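
The same update rule can be followed on a toy problem where the gradient is known in closed form. In this NumPy sketch the "filter" is a fixed random weight matrix `w` and its response is the mean of `w * image`; these are stand-ins chosen for illustration, not the Xception feature extractor.

```python
import numpy as np

def l2_normalize(v, eps=1e-12):
    """Scale v to unit L2 norm (eps guards against division by zero)."""
    return v / np.sqrt(max(np.sum(v ** 2), eps))

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 8))            # hypothetical fixed filter weights

def response(image):
    """Toy 'filter activation': mean of the elementwise product."""
    return float(np.mean(w * image))

def response_grad(image):
    """Analytic gradient of response() with respect to the image."""
    return w / w.size

image = rng.uniform(0.4, 0.6, size=(8, 8))   # gray-ish noise, as in the notebook
learning_rate = 1.0
history = [response(image)]
for _ in range(5):
    # Gradient ascent: move the image in the direction that raises the response
    image = image + learning_rate * l2_normalize(response_grad(image))
    history.append(response(image))
```

Every step adds a unit-norm copy of the gradient, so the response in `history` rises monotonically and the image drifts toward the pattern stored in `w`. Normalizing the gradient keeps the step size the same regardless of the gradient's magnitude.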

**Function to generate filter visualizations**

In [None]:
img_width = 200
img_height = 200

def generate_filter_pattern(filter_index):
    iterations = 30
    learning_rate = 10.
    image = tf.random.uniform(
        minval=0.4,
        maxval=0.6,
        shape=(1, img_width, img_height, 3))
    for i in range(iterations):
        image = gradient_ascent_step(image, filter_index, learning_rate)
    return image[0].numpy()

**Utility function to convert a tensor into a valid image**

In [None]:
def deprocess_image(image):
    image -= image.mean()
    image /= image.std()
    image *= 64
    image += 128
    image = np.clip(image, 0, 255).astype("uint8")
    image = image[25:-25, 25:-25, :]
    return image

In [None]:
plt.axis("off")
plt.imshow(deprocess_image(generate_filter_pattern(filter_index=2)))

# Patterns get more complex in later layers.

**Generating a grid of all filter response patterns in a layer**

In [None]:
all_images = []
for filter_index in range(64):
    print(f"Processing filter {filter_index}")
    image = deprocess_image(
        generate_filter_pattern(filter_index)
    )
    all_images.append(image)

margin = 5
n = 8
cropped_width = img_width - 25 * 2
cropped_height = img_height - 25 * 2
width = n * cropped_width + (n - 1) * margin
height = n * cropped_height + (n - 1) * margin
stitched_filters = np.zeros((width, height, 3))

for i in range(n):
    for j in range(n):
        image = all_images[i * n + j]
        stitched_filters[
            (cropped_width + margin) * i : (cropped_width + margin) * i + cropped_width,
            (cropped_height + margin) * j : (cropped_height + margin) * j
            + cropped_height,
            :,
        ] = image

keras.utils.save_img(
    f"filters_for_layer_{layer_name}.png", stitched_filters)

### Visualizing heatmaps of class activation

Which parts of a given image led a convnet to its final classification decision?

This technique produces heatmaps of class activation over input images.

**Loading the Xception network with pretrained weights**

In [None]:
model = keras.applications.xception.Xception(weights="imagenet")

**Preprocessing an input image for Xception**

In [None]:
# download the elephant image
img_path = keras.utils.get_file(
    fname="elephant.jpg",
    origin="https://img-datasets.s3.amazonaws.com/elephant.jpg")

# convert image to array
def get_img_array(img_path, target_size):
    img = keras.utils.load_img(img_path, target_size=target_size)
    array = keras.utils.img_to_array(img)
    array = np.expand_dims(array, axis=0)
    array = keras.applications.xception.preprocess_input(array)
    return array

img_array = get_img_array(img_path, target_size=(299, 299))

In [None]:
img_array.shape

In [None]:
plt.imshow(keras.utils.load_img(img_path, target_size=(299, 299)))

In [None]:
# prediction

preds = model.predict(img_array)
print(keras.applications.xception.decode_predictions(preds, top=3)[0])

# African elephant is the top prediction, with a probability of about 87%.

In [None]:
np.argmax(preds[0])

To visualize which parts of the image are the most African-elephant-like, let’s set up the Grad-CAM process.

**Setting up a model that returns the last convolutional output**

In [None]:
# Create a model that maps the input image to the activations of the last convolutional layer.

last_conv_layer_name = "block14_sepconv2_act"
classifier_layer_names = [
    "avg_pool",
    "predictions",
]
last_conv_layer = model.get_layer(last_conv_layer_name)
last_conv_layer_model = keras.Model(model.inputs, last_conv_layer.output)

**Reapplying the classifier on top of the last convolutional output**

In [None]:
# Create a model that maps the activations of the last convolutional layer to the final class predictions.

classifier_input = keras.Input(shape=last_conv_layer.output.shape[1:])
x = classifier_input
for layer_name in classifier_layer_names:
    x = model.get_layer(layer_name)(x)
classifier_model = keras.Model(classifier_input, x)

**Retrieving the gradients of the top predicted class**

In [None]:
# Compute the gradient of the top predicted class for our input image with respect to the activations of the last convolution layer

import tensorflow as tf

with tf.GradientTape() as tape:
    last_conv_layer_output = last_conv_layer_model(img_array)
    tape.watch(last_conv_layer_output)
    preds = classifier_model(last_conv_layer_output)
    top_pred_index = tf.argmax(preds[0])
    top_class_channel = preds[:, top_pred_index]

grads = tape.gradient(top_class_channel, last_conv_layer_output)

**Gradient pooling and channel-importance weighting**

In [None]:
# Apply pooling and importance weighting to the gradient tensor to obtain our heatmap of class activation

pooled_grads = tf.reduce_mean(grads, axis=(0, 1, 2)).numpy()
last_conv_layer_output = last_conv_layer_output.numpy()[0]
for i in range(pooled_grads.shape[-1]):
    last_conv_layer_output[:, :, i] *= pooled_grads[i]
heatmap = np.mean(last_conv_layer_output, axis=-1)
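
The channel-by-channel loop can also be written with NumPy broadcasting. The sketch below compares both forms on small synthetic arrays (the `fake_` names are placeholders, not the Xception tensors).

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-ins with small shapes
fake_grads = rng.normal(size=(1, 5, 5, 8))        # (batch, height, width, channels)
fake_conv_output = rng.normal(size=(5, 5, 8))     # (height, width, channels)

pooled = fake_grads.mean(axis=(0, 1, 2))          # one importance weight per channel

# Loop form, as in the cell above (run on a copy, since it modifies in place)
looped = fake_conv_output.copy()
for i in range(pooled.shape[-1]):
    looped[:, :, i] *= pooled[i]
heatmap_loop = looped.mean(axis=-1)

# Broadcasting form: same result, and the activations are left untouched
heatmap_vec = (fake_conv_output * pooled).mean(axis=-1)
```

Both forms weight each channel by its average gradient and then average across channels. The broadcasting version avoids overwriting the activation array, which matters if the cell is re-run.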

**Heatmap post-processing**

In [None]:
# For visualization purposes, we’ll also normalize the heatmap between 0 and 1.

heatmap = np.maximum(heatmap, 0)
heatmap /= np.max(heatmap)
plt.matshow(heatmap)

**Superimposing the heatmap on the original picture**

In [None]:
import matplotlib

img = keras.utils.load_img(img_path)
img = keras.utils.img_to_array(img)

heatmap = np.uint8(255 * heatmap)

# Colormap registry lookup; cm.get_cmap is deprecated and absent from recent Matplotlib releases
jet = matplotlib.colormaps["jet"]
jet_colors = jet(np.arange(256))[:, :3]
jet_heatmap = jet_colors[heatmap]

jet_heatmap = keras.utils.array_to_img(jet_heatmap)
jet_heatmap = jet_heatmap.resize((img.shape[1], img.shape[0]))
jet_heatmap = keras.utils.img_to_array(jet_heatmap)

superimposed_img = jet_heatmap * 0.4 + img
superimposed_img = keras.utils.array_to_img(superimposed_img)

save_path = "elephant_cam.jpg"
superimposed_img.save(save_path)

In [None]:
plt.imshow(superimposed_img)

# Ch10. Deep learning for timeseries

* Different kinds of timeseries tasks

* A temperature-forecasting example

* Understanding recurrent neural networks

* Advanced use of recurrent neural networks

## Different kinds of timeseries tasks

* Forecasting: predicting what will happen next in a series
* Classification: assigning one or more categorical labels to a timeseries
* Event detection: identifying the occurrence of a specific expected event within a continuous data stream
* Anomaly detection: detecting anything unusual happening within a continuous data stream

## A temperature-forecasting example

Weather timeseries dataset recorded at the weather station
at the Max Planck Institute for Biogeochemistry in Jena, Germany.

Features: 14 different quantities (such as temperature,
pressure, humidity, wind direction, and so on) were recorded
every 10 minutes from 2009 to 2016.

In [None]:
!wget https://s3.amazonaws.com/keras-datasets/jena_climate_2009_2016.csv.zip
!unzip jena_climate_2009_2016.csv.zip

**Inspecting the data of the Jena weather dataset**

In [None]:
import os
fname = os.path.join("jena_climate_2009_2016.csv")

with open(fname) as f:
    data = f.read()

lines = data.split("\n")
header = lines[0].split(",")
lines = lines[1:]
print(header)
print(len(lines))

**Parsing the data**

In [None]:
import numpy as np
temperature = np.zeros((len(lines),))
raw_data = np.zeros((len(lines), len(header) - 1))
for i, line in enumerate(lines):
    values = [float(x) for x in line.split(",")[1:]]
    temperature[i] = values[1]
    raw_data[i, :] = values[:]

**Plotting the temperature timeseries**

In [None]:
from matplotlib import pyplot as plt
plt.plot(range(len(temperature)), temperature)

**Plotting the first 10 days of the temperature timeseries**

In [None]:
plt.plot(range(1440), temperature[:1440])

**Computing the number of samples we'll use for each data split**

In [None]:
num_train_samples = int(0.5 * len(raw_data))
num_val_samples = int(0.25 * len(raw_data))
num_test_samples = len(raw_data) - num_train_samples - num_val_samples
print("num_train_samples:", num_train_samples)
print("num_val_samples:", num_val_samples)
print("num_test_samples:", num_test_samples)

### Preparing the data

**Normalizing the data**

In [None]:
mean = raw_data[:num_train_samples].mean(axis=0)
raw_data -= mean
std = raw_data[:num_train_samples].std(axis=0)
raw_data /= std
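
The normalization statistics come from the training portion only, so nothing about the validation or test data leaks into preprocessing. The sketch below repeats the idea on synthetic data without modifying the array in place, and shows how a feature is mapped back to its original units (the same `* std + mean` step the common-sense baseline uses). All names here are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
# Two synthetic features with different scales
fake_data = rng.normal(loc=[10.0, -3.0], scale=[2.0, 0.5], size=(1000, 2))
n_train = 500

# Statistics are computed on the training slice only
train_mean = fake_data[:n_train].mean(axis=0)
train_std = fake_data[:n_train].std(axis=0)
normalized = (fake_data - train_mean) / train_std   # fake_data itself is unchanged

# Undo the scaling for one feature to get back the original units
restored = normalized[:, 1] * train_std[1] + train_mean[1]
```

After this step the training slice has zero mean and unit standard deviation per feature, while the remaining rows are only approximately standardized.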

In [None]:
import numpy as np
from tensorflow import keras
int_sequence = np.arange(10)
dummy_dataset = keras.utils.timeseries_dataset_from_array(
    data=int_sequence[:-3],
    targets=int_sequence[3:],
    sequence_length=3,
    batch_size=2,
)

for inputs, targets in dummy_dataset:
    for i in range(inputs.shape[0]):
        print([int(x) for x in inputs[i]], int(targets[i]))
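
To make the windowing explicit, here is a plain NumPy sketch of what the dummy dataset yields, ignoring batching and shuffling. The helper `sliding_windows` is hypothetical; it pairs each window with the target at the window's starting index, which is how `timeseries_dataset_from_array` aligns `data` and `targets`.

```python
import numpy as np

def sliding_windows(data, targets, sequence_length):
    """Pair each length-`sequence_length` window of `data` with the target
    stored at the window's first index (stride 1, no shuffling)."""
    n_windows = len(data) - sequence_length + 1
    return [(data[i:i + sequence_length].tolist(), int(targets[i]))
            for i in range(n_windows)]

int_sequence = np.arange(10)
pairs = sliding_windows(int_sequence[:-3], int_sequence[3:], sequence_length=3)
for window, target in pairs:
    print(window, target)
# prints:
# [0, 1, 2] 3
# [1, 2, 3] 4
# [2, 3, 4] 5
# [3, 4, 5] 6
# [4, 5, 6] 7
```

Shifting the targets by three positions (`int_sequence[3:]`) is what turns "target at the window start" into "the value right after the window".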

**Instantiating datasets for training, validation, and testing**

In [None]:
sampling_rate = 6
sequence_length = 120
delay = sampling_rate * (sequence_length + 24 - 1)
batch_size = 256

train_dataset = keras.utils.timeseries_dataset_from_array(
    raw_data[:-delay],
    targets=temperature[delay:],
    sampling_rate=sampling_rate,
    sequence_length=sequence_length,
    shuffle=True,
    batch_size=batch_size,
    start_index=0,
    end_index=num_train_samples)

val_dataset = keras.utils.timeseries_dataset_from_array(
    raw_data[:-delay],
    targets=temperature[delay:],
    sampling_rate=sampling_rate,
    sequence_length=sequence_length,
    shuffle=True,
    batch_size=batch_size,
    start_index=num_train_samples,
    end_index=num_train_samples + num_val_samples)

test_dataset = keras.utils.timeseries_dataset_from_array(
    raw_data[:-delay],
    targets=temperature[delay:],
    sampling_rate=sampling_rate,
    sequence_length=sequence_length,
    shuffle=True,
    batch_size=batch_size,
    start_index=num_train_samples + num_val_samples)
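
The `delay` formula is easier to trust after working through the index arithmetic once. This sketch uses the same three settings as the cell above and checks how far the target lies beyond the last input reading; `start` is an arbitrary window start chosen for illustration.

```python
sampling_rate = 6        # keep one reading per hour (readings arrive every 10 minutes)
sequence_length = 120    # 120 hourly readings = 5 days of history
delay = sampling_rate * (sequence_length + 24 - 1)

start = 0                                                   # any window start index
last_input_index = start + sampling_rate * (sequence_length - 1)
target_index = start + delay

steps_ahead = target_index - last_input_index               # in raw 10-minute steps
hours_ahead = steps_ahead / sampling_rate
print(delay, last_input_index, steps_ahead, hours_ahead)
# prints: 858 714 144 24.0
```

So each sample covers five days of hourly readings, and its target is the temperature exactly 24 hours after the last reading in the window.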

**Inspecting the output of one of our datasets**

In [None]:
for samples, targets in train_dataset:
    print("samples shape:", samples.shape)
    print("targets shape:", targets.shape)
    break

### A common-sense, non-machine-learning baseline

**Computing the common-sense baseline MAE**

In [None]:
def evaluate_naive_method(dataset):
    total_abs_err = 0.
    samples_seen = 0
    for samples, targets in dataset:
        preds = samples[:, -1, 1] * std[1] + mean[1]
        total_abs_err += np.sum(np.abs(preds - targets))
        samples_seen += samples.shape[0]
    return total_abs_err / samples_seen

print(f"Validation MAE: {evaluate_naive_method(val_dataset):.2f}")
print(f"Test MAE: {evaluate_naive_method(test_dataset):.2f}")

### Let's try a basic machine-learning model

**Training and evaluating a densely connected model**

In [None]:
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(sequence_length, raw_data.shape[-1]))
x = layers.Flatten()(inputs)
x = layers.Dense(16, activation="relu")(x)
outputs = layers.Dense(1)(x)
model = keras.Model(inputs, outputs)

callbacks = [
    keras.callbacks.ModelCheckpoint("jena_dense.keras",
                                    save_best_only=True)
]
model.compile(optimizer="rmsprop", loss="mse", metrics=["mae"])
history = model.fit(train_dataset,
                    epochs=10,
                    validation_data=val_dataset,
                    callbacks=callbacks)

model = keras.models.load_model("jena_dense.keras")
print(f"Test MAE: {model.evaluate(test_dataset)[1]:.2f}")

**Plotting results**

In [None]:
import matplotlib.pyplot as plt
loss = history.history["mae"]
val_loss = history.history["val_mae"]
epochs = range(1, len(loss) + 1)
plt.figure()
plt.plot(epochs, loss, "bo", label="Training MAE")
plt.plot(epochs, val_loss, "b", label="Validation MAE")
plt.title("Training and validation MAE")
plt.legend()
plt.show()

### Let's try a 1D convolutional model

In [None]:
inputs = keras.Input(shape=(sequence_length, raw_data.shape[-1]))
x = layers.Conv1D(8, 24, activation="relu")(inputs)
x = layers.MaxPooling1D(2)(x)
x = layers.Conv1D(8, 12, activation="relu")(x)
x = layers.MaxPooling1D(2)(x)
x = layers.Conv1D(8, 6, activation="relu")(x)
x = layers.GlobalAveragePooling1D()(x)
outputs = layers.Dense(1)(x)
model = keras.Model(inputs, outputs)

callbacks = [
    keras.callbacks.ModelCheckpoint("jena_conv.keras",
                                    save_best_only=True)
]
model.compile(optimizer="rmsprop", loss="mse", metrics=["mae"])
history = model.fit(train_dataset,
                    epochs=10,
                    validation_data=val_dataset,
                    callbacks=callbacks)

model = keras.models.load_model("jena_conv.keras")
print(f"Test MAE: {model.evaluate(test_dataset)[1]:.2f}")

### A first recurrent baseline

**A simple LSTM-based model**

In [None]:
inputs = keras.Input(shape=(sequence_length, raw_data.shape[-1]))
x = layers.LSTM(16)(inputs)
outputs = layers.Dense(1)(x)
model = keras.Model(inputs, outputs)

callbacks = [
    keras.callbacks.ModelCheckpoint("jena_lstm.keras",
                                    save_best_only=True)
]
model.compile(optimizer="rmsprop", loss="mse", metrics=["mae"])
history = model.fit(train_dataset,
                    epochs=10,
                    validation_data=val_dataset,
                    callbacks=callbacks)

model = keras.models.load_model("jena_lstm.keras")
print(f"Test MAE: {model.evaluate(test_dataset)[1]:.2f}")

## Understanding recurrent neural networks

**NumPy implementation of a simple RNN**

In [None]:
import numpy as np
timesteps = 100
input_features = 32
output_features = 64
inputs = np.random.random((timesteps, input_features))
state_t = np.zeros((output_features,))
W = np.random.random((output_features, input_features))
U = np.random.random((output_features, output_features))
b = np.random.random((output_features,))
successive_outputs = []
for input_t in inputs:
    output_t = np.tanh(np.dot(W, input_t) + np.dot(U, state_t) + b)
    successive_outputs.append(output_t)
    state_t = output_t
final_output_sequence = np.stack(successive_outputs, axis=0)
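
The loop above can be wrapped in a function to show the two output modes a recurrent layer offers: the full sequence of outputs or only the last one. This is a sketch under assumptions: the function name `simple_rnn` is hypothetical, and the weights are drawn small so that `tanh` does not saturate.

```python
import numpy as np

def simple_rnn(inputs, W, U, b, return_sequences=False):
    """Run a tanh RNN over `inputs` of shape (timesteps, input_features)."""
    state_t = np.zeros((U.shape[0],))
    outputs = []
    for input_t in inputs:
        # The new output depends on the current input and the previous output
        state_t = np.tanh(W @ input_t + U @ state_t + b)
        outputs.append(state_t)
    return np.stack(outputs, axis=0) if return_sequences else outputs[-1]

rng = np.random.default_rng(0)
timesteps, input_features, output_features = 100, 32, 64
inputs = rng.random((timesteps, input_features))
# Small weights (an assumption for this sketch) keep tanh out of saturation
W = rng.normal(scale=0.1, size=(output_features, input_features))
U = rng.normal(scale=0.1, size=(output_features, output_features))
b = np.zeros((output_features,))

full_sequence = simple_rnn(inputs, W, U, b, return_sequences=True)
last_step = simple_rnn(inputs, W, U, b)
print(full_sequence.shape, last_step.shape)
# prints: (100, 64) (64,)
```

The last row of `full_sequence` equals `last_step`, which is the relationship between `return_sequences=True` and `return_sequences=False` in the Keras layers.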

### A recurrent layer in Keras

**An RNN layer that can process sequences of any length**

In [None]:
num_features = 14
inputs = keras.Input(shape=(None, num_features))
outputs = layers.SimpleRNN(16)(inputs)

**An RNN layer that returns only its last output step**

In [None]:
num_features = 14
steps = 120
inputs = keras.Input(shape=(steps, num_features))
outputs = layers.SimpleRNN(16, return_sequences=False)(inputs)
print(outputs.shape)

**An RNN layer that returns its full output sequence**

In [None]:
num_features = 14
steps = 120
inputs = keras.Input(shape=(steps, num_features))
outputs = layers.SimpleRNN(16, return_sequences=True)(inputs)
print(outputs.shape)

**Stacking RNN layers**

In [None]:
inputs = keras.Input(shape=(steps, num_features))
x = layers.SimpleRNN(16, return_sequences=True)(inputs)
x = layers.SimpleRNN(16, return_sequences=True)(x)
outputs = layers.SimpleRNN(16)(x)

## Advanced use of recurrent neural networks

### Using recurrent dropout to fight overfitting

**Training and evaluating a dropout-regularized LSTM**

In [None]:
inputs = keras.Input(shape=(sequence_length, raw_data.shape[-1]))
x = layers.LSTM(32, recurrent_dropout=0.25)(inputs)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(1)(x)
model = keras.Model(inputs, outputs)

callbacks = [
    keras.callbacks.ModelCheckpoint("jena_lstm_dropout.keras",
                                    save_best_only=True)
]
model.compile(optimizer="rmsprop", loss="mse", metrics=["mae"])
history = model.fit(train_dataset,
                    epochs=50,
                    validation_data=val_dataset,
                    callbacks=callbacks)

In [None]:
inputs = keras.Input(shape=(sequence_length, num_features))
x = layers.LSTM(32, recurrent_dropout=0.2, unroll=True)(inputs)

### Stacking recurrent layers

**Training and evaluating a dropout-regularized, stacked GRU model**

In [None]:
inputs = keras.Input(shape=(sequence_length, raw_data.shape[-1]))
x = layers.GRU(32, recurrent_dropout=0.5, return_sequences=True)(inputs)
x = layers.GRU(32, recurrent_dropout=0.5)(x)
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(1)(x)
model = keras.Model(inputs, outputs)

callbacks = [
    keras.callbacks.ModelCheckpoint("jena_stacked_gru_dropout.keras",
                                    save_best_only=True)
]
model.compile(optimizer="rmsprop", loss="mse", metrics=["mae"])
history = model.fit(train_dataset,
                    epochs=50,
                    validation_data=val_dataset,
                    callbacks=callbacks)
model = keras.models.load_model("jena_stacked_gru_dropout.keras")
print(f"Test MAE: {model.evaluate(test_dataset)[1]:.2f}")

### Using bidirectional RNNs

**Training and evaluating a bidirectional LSTM**

In [None]:
inputs = keras.Input(shape=(sequence_length, raw_data.shape[-1]))
x = layers.Bidirectional(layers.LSTM(16))(inputs)
outputs = layers.Dense(1)(x)
model = keras.Model(inputs, outputs)

model.compile(optimizer="rmsprop", loss="mse", metrics=["mae"])
history = model.fit(train_dataset,
                    epochs=10,
                    validation_data=val_dataset)
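
Conceptually, a bidirectional layer runs one recurrent layer over the sequence in chronological order and a second, separately weighted one over the reversed sequence, then merges the two results. The NumPy sketch below uses a plain tanh RNN instead of an LSTM to keep it short; the helper names and the small random weights are assumptions for illustration.

```python
import numpy as np

def run_rnn(inputs, W, U, b):
    """Return the last output of a tanh RNN over (timesteps, features) inputs."""
    state_t = np.zeros((U.shape[0],))
    for input_t in inputs:
        state_t = np.tanh(W @ input_t + U @ state_t + b)
    return state_t

def make_weights(rng, units, features):
    """Small random weights for one direction (illustrative only)."""
    return (rng.normal(scale=0.1, size=(units, features)),
            rng.normal(scale=0.1, size=(units, units)),
            np.zeros((units,)))

rng = np.random.default_rng(0)
units, features, timesteps = 16, 14, 120
sequence = rng.normal(size=(timesteps, features))

forward = run_rnn(sequence, *make_weights(rng, units, features))          # chronological order
backward = run_rnn(sequence[::-1], *make_weights(rng, units, features))   # reversed order
merged = np.concatenate([forward, backward])
print(merged.shape)
# prints: (32,)
```

With concatenation as the merge step, 16 units per direction give a 32-dimensional representation, which is what the `Dense(1)` layer receives in the model above.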

### Going even further

## Summary