References:
* https://www.tensorflow.org/tutorials

# <font color=blue>Convolutional Neural Network</font>

## A simple CNN

```python
import tensorflow as tf
from tensorflow.keras import datasets, layers, models
import matplotlib.pyplot as plt
```

__Dataset__:

```python
(train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()
train_images, test_images = train_images / 255.0, test_images / 255.0
```

* train_images.shape is (50000, 32, 32, 3).
* train_labels.shape is (50000, 1).
* Entries of train_labels range from 0 to 9 (10 labels).

__Model__:

```python
input_shape = train_images[0].shape

model = models.Sequential()
model.add(layers.Conv2D(32, 3, activation='relu', input_shape=input_shape))
model.add(layers.MaxPooling2D((2,2)))
model.add(layers.Conv2D(64, 3, activation='relu'))
model.add(layers.MaxPooling2D((2,2)))
model.add(layers.Conv2D(64, 3, activation='relu'))

model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10))
```

* Since there are 10 class labels, the final output size is 10.
* Shapes:
    * input: (batch_size, 32, 32, 3)
    * after Conv2D: (batch_size, 30, 30, 32)
    * after MaxPooling2D: (batch_size, 15, 15, 32)
    * after Conv2D: (batch_size, 13, 13, 64)
    * after MaxPooling2D: (batch_size, 6, 6, 64)
    * after Conv2D: (batch_size, 4, 4, 64)
    * after Flatten: (batch_size, 1024)       # 1024 = 4*4*64
    * after Dense: (batch_size, 64)
    * after Dense: (batch_size, 10)
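
The shape list above follows from two rules: a 3x3 convolution with the Keras default `padding='valid'` and stride 1 shrinks each spatial dimension by 2, and a 2x2 max-pooling halves it with floor division. A minimal pure-Python sketch of that arithmetic (the helpers `conv_valid` and `max_pool` are illustrative names, not Keras APIs):

```python
def conv_valid(shape, filters, kernel=3, stride=1):
    """Output (H, W, C) of a convolution with 'valid' padding."""
    h, w, _ = shape
    return ((h - kernel) // stride + 1, (w - kernel) // stride + 1, filters)

def max_pool(shape, pool=2):
    """Output (H, W, C) of max-pooling with pool size == stride."""
    h, w, c = shape
    return (h // pool, w // pool, c)

# Layer sequence of the model above: ("conv", filters) or ("pool", None).
layer_spec = [("conv", 32), ("pool", None), ("conv", 64), ("pool", None), ("conv", 64)]

shape = (32, 32, 3)          # one CIFAR-10 image, without the batch dimension
trace = [shape]
for kind, filters in layer_spec:
    shape = conv_valid(shape, filters) if kind == "conv" else max_pool(shape)
    trace.append(shape)

flat = shape[0] * shape[1] * shape[2]   # size after Flatten
print(trace, flat)
```

The batch dimension is left out because none of these layers changes it.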
    
```python
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

history = model.fit(train_images, train_labels, epochs=10, validation_data=(test_images, test_labels))

test_loss, test_acc = model.evaluate(test_images, test_labels, verbose=2)
```

* We use the `SparseCategoricalCrossentropy()` loss because each label is a single integer class index (`train_labels.shape[1]` is 1).
* If the labels were one-hot encoded (`train_labels.shape[1]` is 10), we would use the `CategoricalCrossentropy()` loss instead.
* We pass `from_logits=True` to `SparseCategoricalCrossentropy()` because the last dense layer has no activation such as `softmax`, so the model outputs raw logits rather than probabilities.
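
To make the relation between the two losses concrete, here is a small pure-Python sketch of the per-example cross-entropy computed from logits. It is not the Keras implementation (Keras also averages over the batch); `sparse_ce` and `categorical_ce` are hypothetical helper names and the logits are made up:

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax of a list of logits."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum for z in logits]

def sparse_ce(label, logits):
    """Cross-entropy when the label is an integer class index."""
    return -log_softmax(logits)[label]

def categorical_ce(one_hot, logits):
    """Cross-entropy when the label is a one-hot vector."""
    return -sum(t * lp for t, lp in zip(one_hot, log_softmax(logits)))

logits = [2.0, 0.5, -1.0]      # raw model outputs for one example (no softmax applied)
label = 0                      # sparse label
one_hot = [1.0, 0.0, 0.0]      # the same label, one-hot encoded

loss_sparse = sparse_ce(label, logits)
loss_onehot = categorical_ce(one_hot, logits)
print(loss_sparse, loss_onehot)
```

Both calls give the same value; the two losses differ only in the label encoding they expect.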

## Using ImageDataGenerator

```python
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Conv2D, Flatten, Dropout, MaxPooling2D
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import numpy as np
import matplotlib.pyplot as plt
```

__Dataset__:

* `train_dir`, `validation_dir`: directories that each contain two subdirectories, 'cats' and 'dogs'
* `train_cats_dir`: directory with our training cat pictures (the subdirectory 'cats' under 'train_dir')
* `train_dogs_dir`: directory with our training dog pictures (the subdirectory 'dogs' under 'train_dir')
* `validation_cats_dir`: directory with our validation cat pictures (the subdirectory 'cats' under 'validation_dir')
* `validation_dogs_dir`: directory with our validation dog pictures (the subdirectory 'dogs' under 'validation_dir')

Load images from disk, rescale and resize them, and apply random augmentation to the training images only:
```python
batch_size = 128
IMG_HEIGHT, IMG_WIDTH = 150, 150

train_data_gen = ImageDataGenerator(rescale=1./255,
                    rotation_range=45,
                    width_shift_range=.15,
                    height_shift_range=.15,
                    horizontal_flip=True,
                    zoom_range=0.5
                    )\
    .flow_from_directory(batch_size=batch_size, 
                         directory=train_dir, 
                         shuffle=True, 
                         target_size=(IMG_HEIGHT, IMG_WIDTH), 
                         class_mode='binary')

val_data_gen = ImageDataGenerator(rescale=1./255)\
    .flow_from_directory(batch_size=batch_size,
                         directory=validation_dir,
                         target_size=(IMG_HEIGHT, IMG_WIDTH),
                         class_mode='binary')
```

* len(train_data_gen) is 16, since the number of training images is 2000 and batch_size is 128.
* For i in range(16),
    * train_data_gen\[i\], the _i_-th batch of the training dataset, is a tuple of length 2.
    * train_data_gen\[i\]\[0\] is a numpy array of shape (128, 150, 150, 3).
    * train_data_gen\[i\]\[1\] is a numpy array of shape (128, ).
    * The last batch (i = 15) holds only the remaining 80 images, so its first dimension is 80 instead of 128.
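
The batch count follows from ceiling division, and the remainder determines the size of the last batch. A short sketch of the arithmetic, using the 2000 training images and batch size 128 stated above:

```python
import math

total_train = 2000   # number of training images stated above
batch_size = 128

num_batches = math.ceil(total_train / batch_size)               # what len(generator) reports
last_batch_size = total_train - (num_batches - 1) * batch_size  # images left for the final batch
full_steps = total_train // batch_size                          # floor division counts full batches only

batch_sizes = [batch_size] * (num_batches - 1) + [last_batch_size]
print(num_batches, last_batch_size, full_steps)
```

Floor division (`total_train // batch_size`) gives one step fewer than `num_batches`, which means the partial batch is not counted in an epoch when it is used as `steps_per_epoch`.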


__Model__ (binary classifier):

```python
model = Sequential([
    Conv2D(16, 3, padding='same', activation='relu', input_shape=(IMG_HEIGHT, IMG_WIDTH,3)),
    MaxPooling2D(),
    Dropout(0.2),
    Conv2D(32, 3, padding='same', activation='relu'),
    MaxPooling2D(),
    Conv2D(64, 3, padding='same', activation='relu'),
    MaxPooling2D(),
    Dropout(0.2),
    Flatten(),
    Dense(512, activation='relu'),
    Dense(1)
])
```

* MaxPooling2D: pool_size=(2, 2) by default.
* Shapes:
    * input batch: (128, 150, 150, 3)
    * after Conv2D: (128, 150, 150, 16)
    * after MaxPooling2D: (128, 75, 75, 16)
    * after Conv2D: (128, 75, 75, 32)
    * after MaxPooling2D: (128, 37, 37, 32)
    * after Conv2D: (128, 37, 37, 64)
    * after MaxPooling2D: (128, 18, 18, 64)
    * after Flatten: (128, 20736)
    * after Dense: (128, 512)
    * after Dense: (128, 1)
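
With `padding='same'` and stride 1 a convolution keeps the height and width, so only the pooling layers shrink the image, using floor division (75 becomes 37, 37 becomes 18). A minimal sketch of that bookkeeping, with illustrative helper names that are not Keras APIs:

```python
def conv_same(shape, filters):
    """'same' padding with stride 1 keeps H and W; only the channel count changes."""
    h, w, _ = shape
    return (h, w, filters)

def max_pool(shape, pool=2):
    """MaxPooling2D with the default pool_size=(2, 2) floors odd sizes."""
    h, w, c = shape
    return (h // pool, w // pool, c)

shape = (150, 150, 3)        # one input image, without the batch dimension
trace = []
for filters in (16, 32, 64): # each Conv2D is followed by a MaxPooling2D
    shape = conv_same(shape, filters)
    trace.append(shape)
    shape = max_pool(shape)
    trace.append(shape)

flat = shape[0] * shape[1] * shape[2]   # size after Flatten
print(trace, flat)
```

Dropout layers are omitted from the sketch because they do not change the shape.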
    
```python
model.compile(optimizer='adam', loss=tf.keras.losses.BinaryCrossentropy(from_logits=True), metrics=['accuracy'])
```

Use `fit_generator()` to train the network (it is deprecated in later TensorFlow 2.x releases, where `model.fit()` accepts the generator directly with the same arguments):

```python
# total_train is the number of training examples.
# total_val is the number of validation examples.
history = model.fit_generator(train_data_gen,
                              steps_per_epoch=total_train // batch_size,
                              epochs=15,
                              validation_data=val_data_gen,
                              validation_steps=total_val // batch_size)
```

## Transfer learning with TF Hub

```python
import numpy as np
import matplotlib.pylab as plt
import tensorflow as tf
import tensorflow_hub as hub
from tensorflow.keras import layers
```

__Dataset__:

```python
data_root = tf.keras.utils.get_file('flower_photos',
'https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz', untar=True)

IMAGE_SHAPE = (224, 224)

image_data = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1/255)\
    .flow_from_directory(data_root, target_size=IMAGE_SHAPE)
```

* len(image_data) is 115 (=the number of batches)
* For i in range(115),
    * image_data\[i\] (a batch) is a tuple of length 2.
    * image_data\[i\]\[0\] (image_batch) has shape (32, 224, 224, 3) for a full batch; the last batch may hold fewer images.
    * image_data\[i\]\[1\] (label_batch) has shape (32, 5) for a full batch.
    * Each row of image_data\[i\]\[1\] is one-hot encoded, since `class_mode` defaults to `'categorical'`.
    

__Model__:

```python
classifier_url ="https://tfhub.dev/google/tf2-preview/mobilenet_v2/classification/2" #@param {type:"string"}

classifier = tf.keras.Sequential([
    hub.KerasLayer(classifier_url, input_shape=IMAGE_SHAPE+(3,))
])
```

* The output shape of the ImageNet classifier is (batch_size, 1001).
* Each row is a vector of 1001 logits, one per class. A higher logit means the classifier rates that class as more likely for the given image.
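
Logits are unnormalized scores, not probabilities. As a rough sketch, a softmax turns them into probabilities and an argmax picks the predicted class. The four-element vector below is a made-up stand-in for the 1001 logits of one image:

```python
import math

def softmax(logits):
    """Numerically stable softmax of a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [0.1, 2.3, -1.2, 0.7]   # toy values, not real model output
probs = softmax(logits)

# Softmax is monotonic, so the argmax of the logits equals the argmax of the probabilities.
predicted_class = max(range(len(logits)), key=lambda i: logits[i])
print(predicted_class, probs)
```

Because the ordering is preserved, the predicted class can be read directly from the logits without applying softmax.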

```python
image_batch, label_batch = image_data[0]

result_batch = classifier.predict(image_batch)         # shape: (32, 1001)
```

__Model (using a headless model)__:

```python
feature_extractor_url = "https://tfhub.dev/google/tf2-preview/mobilenet_v2/feature_vector/2" #@param {type:"string"}

feature_extractor_layer = hub.KerasLayer(feature_extractor_url, input_shape=IMAGE_SHAPE+(3,))
feature_extractor_layer.trainable = False

model = tf.keras.Sequential([
    feature_extractor_layer, 
    layers.Dense(image_data.num_classes)])
```

* The output of feature_extractor_layer has shape (batch_size, 1280).
* image_data.num_classes is 5.
* Shapes:
    * input: (batch_size, 224, 224, 3)
    * after feature_extractor_layer: (batch_size, 1280)
    * after Dense: (batch_size, 5)
    

```python
model.compile(
    optimizer=tf.keras.optimizers.Adam(),
    loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
    metrics=['acc'])
```

* Class labels are one-hot encoded, so we use `CategoricalCrossentropy()` rather than `SparseCategoricalCrossentropy()`.

```python
steps_per_epoch = np.ceil(image_data.samples/image_data.batch_size)

history = model.fit_generator(image_data, epochs=15,
                              steps_per_epoch=steps_per_epoch)
```

## Transfer learning with a ConvNet

```python
import os
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
import tensorflow_datasets as tfds
```

__Dataset__:

```python
(raw_train, raw_validation, raw_test), metadata = tfds.load(
    'cats_vs_dogs',
    split=['train[:80%]', 'train[80%:90%]', 'train[90%:]'],
    with_info=True,
    as_supervised=True,
)

for image, label in raw_train.take(2):
    print((image.shape, image.dtype), (label.numpy(), label.dtype))
# Outputs:
# (TensorShape([262, 350, 3]), tf.uint8) (1, tf.int64)
# (TensorShape([409, 336, 3]), tf.uint8) (1, tf.int64)

IMG_SIZE = 160 # All images will be resized to (160, 160).

def format_example(image, label):
    image = tf.cast(image, tf.float32)/127.5 - 1        # Entries range from -1 to 1.
    image = tf.image.resize(image, (IMG_SIZE, IMG_SIZE))
    return image, label

train = raw_train.map(format_example)
validation = raw_validation.map(format_example)
test = raw_test.map(format_example)

BATCH_SIZE = 32

train_batches = train.shuffle(1000).batch(BATCH_SIZE)
validation_batches = validation.batch(BATCH_SIZE)
test_batches = test.batch(BATCH_SIZE)
```

* Each full batch of images has shape (32, 160, 160, 3); the last batch of a split may be smaller.


__Model (Feature extraction)__:

```python
IMG_SHAPE = (IMG_SIZE, IMG_SIZE, 3)

base_model = tf.keras.applications.MobileNetV2(input_shape=IMG_SHAPE, include_top=False, weights='imagenet')
base_model.trainable = False

model = tf.keras.Sequential([
    base_model,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1)
])
```

* Shapes
    * input: (batch_size, 160, 160, 3)
    * after base_model: (batch_size, 5, 5, 1280)
    * after GlobalAveragePooling2D: (batch_size, 1280)
    * after Dense: (batch_size, 1)
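
`GlobalAveragePooling2D` averages each channel over all spatial positions, which is why (5, 5, 1280) collapses to (1280). A minimal pure-Python sketch on a tiny made-up feature map of shape (2, 2, 2); `global_average_pool` is an illustrative helper, not the Keras layer:

```python
def global_average_pool(feature_map):
    """feature_map: nested list of shape (H, W, C); returns a list of length C."""
    h = len(feature_map)
    w = len(feature_map[0])
    c = len(feature_map[0][0])
    return [
        sum(feature_map[i][j][k] for i in range(h) for j in range(w)) / (h * w)
        for k in range(c)
    ]

# Toy feature map with H=2, W=2, C=2 (values chosen for easy mental checking).
feature_map = [
    [[1.0, 10.0], [2.0, 20.0]],
    [[3.0, 30.0], [4.0, 40.0]],
]

pooled = global_average_pool(feature_map)
print(pooled)
```

The output has one value per channel, regardless of the spatial size of the input.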
    
```python
learning_rate = 0.0001
initial_epochs = 10

model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate),
              loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
              metrics=['accuracy'])

history = model.fit(train_batches,
                    epochs=initial_epochs,
                    validation_data=validation_batches)

acc = history.history['accuracy']
val_acc = history.history['val_accuracy']

loss = history.history['loss']
val_loss = history.history['val_loss']
```

__Model (Fine tuning)__:

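* `len(base_model.layers)` is 155. We freeze the first 100 layers (those closest to the input) and fine-tune the rest.
* We recompile the model with a lower learning rate (one tenth of the original).
* `initial_epoch` in `model.fit()` is the epoch index at which to start training (useful for resuming a previous training run), and `epochs` is the index at which training stops.
* Since `history.epoch` is `[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]`, `history.epoch[-1]` is 9, so `initial_epoch` is set to 9 and fine-tuning covers epoch indices 9 through 19.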
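
The effect of `initial_epoch` can be checked with plain arithmetic, assuming (as the Keras documentation describes) that `fit()` iterates over epoch indices from `initial_epoch` up to, but not including, `epochs`. The helper `epochs_run` is hypothetical and only models that range:

```python
initial_epochs = 10
fine_tune_epochs = 10
total_epochs = initial_epochs + fine_tune_epochs

# Epoch indices recorded by the first fit() call: [0, 1, ..., 9].
history_epoch = list(range(initial_epochs))

def epochs_run(initial_epoch, epochs):
    """Epoch indices covered by fit(initial_epoch=initial_epoch, epochs=epochs)."""
    return list(range(initial_epoch, epochs))

resumed_from_last = epochs_run(history_epoch[-1], total_epochs)  # starts at index 9
resumed_from_next = epochs_run(initial_epochs, total_epochs)     # starts at index 10

print(len(resumed_from_last), len(resumed_from_next))
```

Starting from `history.epoch[-1]` therefore runs one more fine-tuning epoch than `fine_tune_epochs`; passing `initial_epochs` instead would run exactly `fine_tune_epochs` epochs.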

```python
base_model.trainable = True

for layer in base_model.layers[:100]:
    layer.trainable = False

model.compile(loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
              optimizer=tf.keras.optimizers.RMSprop(learning_rate=learning_rate/10),
              metrics=['accuracy'])

fine_tune_epochs = 10
total_epochs =  initial_epochs + fine_tune_epochs

history_fine = model.fit(train_batches,
                         epochs=total_epochs,
                         initial_epoch=history.epoch[-1],
                         validation_data=validation_batches)

acc += history_fine.history['accuracy']
val_acc += history_fine.history['val_accuracy']

loss += history_fine.history['loss']
val_loss += history_fine.history['val_loss']
```

# <font color=blue>Image Segmentation</font>

References: https://www.tensorflow.org/tutorials/images/segmentation

```python
import tensorflow as tf
from tensorflow_examples.models.pix2pix import pix2pix
import tensorflow_datasets as tfds
```

__Dataset__: 

* Each pixel of an image is given a label (0, 1, or 2 in this example).
* Think of this as multi-class classification in which every pixel is assigned to one of three classes.

```python
for sample_image_batch, sample_mask_batch in train_dataset.take(1):
    break
```

* `sample_image_batch` is a tf.Tensor with shape=(64,128,128,3).
* `sample_mask_batch` is a tf.Tensor with shape=(64,128,128,1).
* Each entry of `sample_mask_batch` is 0, 1, or 2.

__Model__:

* The model being used here is a modified U-Net. A U-Net consists of an encoder (downsampler) and decoder (upsampler). 
* The encoder uses five intermediate outputs of a pretrained MobileNetV2 model. It will be nontrainable and used for feature extraction.
* The decoder uses four upsample blocks implemented in TensorFlow Examples in the Pix2pix tutorial. It will be trainable.

```python
base_model = tf.keras.applications.MobileNetV2(input_shape=[128,128,3], include_top=False)

layer_names = [
    'block_1_expand_relu',   # (64,64,96)
    'block_3_expand_relu',   # (32,32,144)
    'block_6_expand_relu',   # (16,16,192)
    'block_13_expand_relu',  # (8,8,576)
    'block_16_project',      # (4,4,320)
]
layers = [base_model.get_layer(name).output for name in layer_names]

down_stack = tf.keras.Model(inputs=base_model.input, outputs=layers)
down_stack.trainable = False

up_stack = [
    pix2pix.upsample(512,3),  # 4x4 -> (8,8,512)
    pix2pix.upsample(256,3),  # 8x8 -> (16,16,256)
    pix2pix.upsample(128,3),  # 16x16 -> (32,32,128)
    pix2pix.upsample(64,3),   # 32x32 -> (64,64,64)
]


inputs = tf.keras.layers.Input(shape=[128,128,3])
downs = down_stack(inputs)    # list of five tensors

concat = tf.keras.layers.Concatenate()
x = downs[-1]
for i in range(len(up_stack)):
    x = concat([up_stack[i](x), downs[-2-i]])
    
outputs = tf.keras.layers.Conv2DTranspose(3, 3, strides=2, padding='same')(x)
model = tf.keras.Model(inputs=inputs, outputs=outputs)
```

* Shapes of x:
    * before iteration: (4,4,320)
    * after i=0: (8,8,1088), the concatenation of (8,8,512) and (8,8,576)
    * after i=1: (16,16,448), the concatenation of (16,16,256) and (16,16,192)
    * after i=2: (32,32,272), the concatenation of (32,32,128) and (32,32,144)
    * after i=3: (64,64,160), the concatenation of (64,64,64) and (64,64,96)
    
* outputs.shape is (128,128,3), since there are three possible labels for each pixel.

* If `x` is a tensor of shape (a,b,c), then `Conv2DTranspose(n, 3, strides=2, padding='same')(x)` returns a tensor of shape (2a,2b,n).
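
The shapes listed above can be reproduced with simple bookkeeping: each upsample block doubles the height and width, and the concatenation adds the channel count of the matching encoder output. A sketch using only the numbers from the code comments above:

```python
# Channel counts of the five encoder outputs, in the order of layer_names.
skip_channels = [96, 144, 192, 576, 320]
# Filter counts of the four upsample blocks, in the order of up_stack.
up_filters = [512, 256, 128, 64]

size = 4                          # spatial size of the last encoder output
channels = skip_channels[-1]      # 320
trace = [(size, size, channels)]

for i, filters in enumerate(up_filters):
    size *= 2                                     # upsample doubles H and W
    channels = filters + skip_channels[-2 - i]    # concatenate with the skip connection
    trace.append((size, size, channels))

# Final Conv2DTranspose(3, 3, strides=2, padding='same') doubles the size once more.
output_shape = (size * 2, size * 2, 3)
print(trace, output_shape)
```

The index `-2 - i` mirrors `downs[-2-i]` in the model code, pairing each decoder stage with the encoder output of the same spatial size.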


```python
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

model_history = model.fit(train_dataset, epochs=20,
                          steps_per_epoch=57,
                          validation_steps=11,
                          validation_data=test_dataset)
```

See the result of a test image:

```python
for image_batch, mask_batch in test_dataset.take(1):
    image, mask = image_batch[0], mask_batch[0]
    pred_mask = model.predict(image[tf.newaxis, ...])[0]      # shape: (128,128,3)
    pred_mask = tf.argmax(pred_mask, axis=-1)                 # shape: (128,128)
    pred_mask = pred_mask[...,tf.newaxis]                     # shape: (128,128,1)
```
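
The `tf.argmax(..., axis=-1)` step above picks, for every pixel, the class with the largest logit. A pure-Python sketch of the same idea on a made-up 2x2 image with three class logits per pixel (`argmax_last_axis` is an illustrative helper, not a TensorFlow function):

```python
def argmax_last_axis(logits):
    """logits: nested list of shape (H, W, C) -> nested list of shape (H, W, 1)."""
    return [
        [[max(range(len(pixel)), key=lambda k: pixel[k])] for pixel in row]
        for row in logits
    ]

# Toy per-pixel logits for a 2x2 image and 3 classes (values are invented).
logits = [
    [[0.1, 2.0, -1.0], [3.0, 0.0, 0.5]],
    [[-0.2, -0.1, 0.4], [1.0, 1.5, 0.2]],
]

mask = argmax_last_axis(logits)
print(mask)
```

The trailing axis of length 1 matches the shape of the ground-truth masks, which makes the predicted and true masks directly comparable when plotting.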

Next, plot `image`, `mask`, and `pred_mask`.