<a href="https://colab.research.google.com/github/mraskj/css_fall2023/blob/main/code/class13/class13-tutorial_tensorflow.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Class 13 - Tutorial - Tensorflow

This tutorial shows how you can use a convolutional neural network (CNN) to classify images. It is based entirely on the notebook from Michelle Torres: https://colab.research.google.com/drive/1KFHwz8wjDdcFfsTmXfo-gwkKc-itN3MS?usp=%20sharing#scrollTo=4a8Q5WqivZY_ which was used for: https://ds3.ai/courses/imageasdata

Note that it uses TensorFlow.

In [None]:
import tensorflow as tf
from tensorflow.keras import backend as K
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import MaxPooling2D
from tensorflow.keras.layers import Activation
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Dropout
from tensorflow.keras.layers import ZeroPadding2D
from tensorflow.keras.layers import Input
from tensorflow.keras.layers import BatchNormalization
from tensorflow.keras.layers import Lambda
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.resnet50 import preprocess_input, decode_predictions
from tensorflow.keras.preprocessing.image import ImageDataGenerator, load_img, img_to_array, array_to_img
from tensorflow.keras.utils import get_file
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import confusion_matrix
from PIL import Image

from tensorflow.keras.utils import to_categorical
from tensorflow.keras.datasets import mnist

import pandas as pd
import os
from sklearn.preprocessing import LabelBinarizer
from sklearn.metrics import classification_report
import matplotlib.pyplot as plt
import numpy as np
import cv2
import requests
from io import BytesIO


device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
  raise SystemError('GPU device not found')
print('Found GPU at: {}'.format(device_name))

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Now, let's import our training and testing datasets to build a model that will allow us to recognize handwritten digits. We will "massage" the data a bit so that the process works properly. This means we will reshape each image in the training and testing sets (depending on what the setup from Keras is), and we will normalize the pixel intensity so it lies between 0 and 1 (instead of 0 to 255).

For the outcome labels, we will convert them from integers to one-hot vectors:

In [None]:
# load data
(X_train, y_train), (X_test, y_test) = mnist.load_data()

X_train = X_train.reshape((X_train.shape[0], 28, 28, 1))
X_test = X_test.reshape((X_test.shape[0], 28, 28, 1))

# scale data to the range of [0, 1]
X_train = X_train.astype("float32") / 255.0
X_test = X_test.astype("float32") / 255.0

# "Binarify" the labels (from categorical to vectors)
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
num_classes = y_test.shape[1]

print(num_classes)
print(y_test)
print(X_test.shape)
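
To make the two preprocessing steps concrete, here is a minimal NumPy sketch of what the cell above does conceptually: rescaling 8-bit pixel values to [0, 1] and turning integer labels into one-hot vectors. The helper `one_hot` and the toy arrays are illustrative assumptions, not part of the original notebook; in the tutorial itself `to_categorical` does the encoding.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Hypothetical helper: turn integer labels into one-hot row vectors."""
    labels = np.asarray(labels, dtype="int64")
    encoded = np.zeros((labels.shape[0], num_classes), dtype="float32")
    # Put a 1 in the column that matches each label
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded

# Made-up toy batch: two 2 x 2 "images" with 8-bit pixel values
toy_images = np.array([[[0, 255], [128, 64]],
                       [[255, 255], [0, 0]]], dtype="uint8")
toy_scaled = toy_images.astype("float32") / 255.0  # values now lie in [0, 1]

toy_labels = one_hot([3, 0], num_classes=10)
print(toy_scaled.min(), toy_scaled.max())
print(toy_labels)
```

Each row of `toy_labels` has a single 1 in the position of the digit it encodes, which is the format that `categorical_crossentropy` expects.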

It's time to build our model. What layers can you identify?

In [None]:
def large_model():
    # create model
    model = Sequential()
    model.add(ZeroPadding2D(padding=(2, 2), input_shape=(28, 28, 1), data_format=None))
    model.add(Conv2D(32, (5, 5), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Conv2D(32, (3, 3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.2))
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dense(50, activation='relu'))
    model.add(Dense(num_classes, activation='softmax'))
    # Compile model
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
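
One way to read the architecture is to trace how the image size changes from layer to layer. The sketch below does that arithmetic by hand, assuming the Keras defaults used above (unpadded "valid" convolutions with stride 1, and pooling windows that do not overlap). The helpers `conv_out` and `pool_out` are hypothetical names introduced only for this illustration.

```python
def conv_out(size, kernel, stride=1):
    """Output side length of a 'valid' (unpadded) convolution."""
    return (size - kernel) // stride + 1

def pool_out(size, pool):
    """Output side length of non-overlapping max pooling."""
    return size // pool

side = 28                      # MNIST images are 28 x 28
side = side + 2 * 2            # ZeroPadding2D(padding=(2, 2)) adds 2 pixels on every edge
side = conv_out(side, 5)       # Conv2D(32, (5, 5))
side = pool_out(side, 2)       # MaxPooling2D(pool_size=(2, 2))
side = conv_out(side, 3)       # Conv2D(32, (3, 3))
side = pool_out(side, 2)       # second MaxPooling2D
flat_units = side * side * 32  # Flatten: 32 feature maps of side x side

print(side, flat_units)
```

Under these assumptions the flattened vector that feeds `Dense(128)` has 6 x 6 x 32 = 1152 entries; `model.summary()` is the authoritative check.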

And now come the training and testing:

In [None]:
model = large_model()
# Fit and train the model using only MNIST data for both training and evaluation
H = model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=15, batch_size=200)
# Final evaluation of the model using MNIST testing data
scores = model.evaluate(X_test, y_test, verbose=0)
print(scores)
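
For intuition about what `batch_size` and `epochs` mean in the call to `fit`, the short calculation below counts how many weight updates the training run performs. It relies on the MNIST training set having 60,000 images; the variable names are illustrative.

```python
import math

n_train = 60000    # number of images in the MNIST training set
batch_size = 200   # images per gradient update, as in model.fit above
epochs = 15        # full passes over the training set

steps_per_epoch = math.ceil(n_train / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, total_updates)
```

A smaller batch size means more (and noisier) updates per epoch; a larger one means fewer updates and more memory per step.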

Let's check some diagnostics:

In [None]:
predictions = model.predict(X_test, batch_size=128)
print(predictions[0])
print(len(predictions[0]))

In [None]:
print(classification_report(
    y_test.argmax(axis=1),
    predictions.argmax(axis=1),
    target_names=[str(x) for x in range(num_classes)]))
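
The report above is built from simple counts. As a rough sketch of where the numbers come from, the example below takes a made-up batch of predicted probabilities, converts them to class labels with `argmax`, and computes precision and recall for one class by hand. All arrays here are toy values, not output from the model.

```python
import numpy as np

# Made-up softmax outputs for 4 observations and 3 classes
toy_probs = np.array([[0.1, 0.7, 0.2],
                      [0.8, 0.1, 0.1],
                      [0.2, 0.3, 0.5],
                      [0.3, 0.6, 0.1]])
# Made-up one-hot true labels
toy_true_onehot = np.array([[0, 1, 0],
                            [1, 0, 0],
                            [0, 1, 0],
                            [0, 1, 0]])

toy_pred = toy_probs.argmax(axis=1)        # most likely class per row
toy_true = toy_true_onehot.argmax(axis=1)  # back from one-hot to integers

cls = 1  # class to evaluate
tp = int(np.sum((toy_pred == cls) & (toy_true == cls)))  # true positives
fp = int(np.sum((toy_pred == cls) & (toy_true != cls)))  # false positives
fn = int(np.sum((toy_pred != cls) & (toy_true == cls)))  # false negatives

precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(precision, recall)
```

Precision asks how many of the predicted members of a class were right; recall asks how many true members of the class were found.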

And the history of loss and accuracy to track overfitting:

In [None]:
# A nice function to plot loss and accuracy histories
def plot_training(H, N, type="loss"):
    # construct a plot of the training history
    plt.style.use("ggplot")
    plt.figure()
    if type=="loss":
        plt.plot(np.arange(0, N), H.history["loss"], label="train_loss")
        plt.plot(np.arange(0, N), H.history["val_loss"], label="val_loss")
        plt.title("Training vs. Validation Loss")
        plt.ylabel("Loss")
        plt.legend(loc="upper right")

    else:
        plt.plot(np.arange(0, N), H.history["accuracy"], label="train_acc")
        plt.plot(np.arange(0, N), H.history["val_accuracy"], label="val_acc")
        plt.title("Training vs. Validation Accuracy")
        plt.ylabel("Accuracy")
        plt.legend(loc="lower right")
    plt.xlabel("Epoch #")


plot_training(H, 15)
plot_training(H, 15, "accuracy")
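
Besides reading the plots, you can look for overfitting numerically: the typical sign is a validation loss that stops improving while the training loss keeps falling. The sketch below uses a made-up history dictionary with the same keys that `H.history` contains; the numbers are invented for illustration.

```python
# Made-up loss curves, shaped like H.history from model.fit
toy_history = {
    "loss":     [0.90, 0.50, 0.30, 0.20, 0.10],
    "val_loss": [1.00, 0.60, 0.50, 0.55, 0.70],
}

val_loss = toy_history["val_loss"]
train_loss = toy_history["loss"]

# Epoch (0-indexed) with the lowest validation loss
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)

# Gap between validation and training loss at each epoch
gaps = [round(v - t, 2) for v, t in zip(val_loss, train_loss)]

print(best_epoch, gaps)
```

In this toy example the validation loss is lowest at the third epoch and the gap widens afterwards, which is the pattern to watch for in the real curves.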

Now let's jump to a slightly more complicated but applicable question.

We want to classify actual images from protests based on the level of conflict they depict: 1 = No conflict, 2 = Low conflict, 3 = High conflict. The training data contains 400 images that were labeled by 5 different human coders. The testing set has 200 images. All of them come from Getty Images, and are from the BLM protests in Ferguson in 2014.

For this classification task, we will use transfer learning. The GPU becomes more necessary here than in the other case because of the size of the data. Let's import our training and testing data for the re-training/transfer learning process:

In [None]:
train_full = pd.read_csv("https://raw.githubusercontent.com/smtorres/Start_Images/main/train_meta.csv")
test_full = pd.read_csv("https://raw.githubusercontent.com/smtorres/Start_Images/main/test_meta.csv")

print(train_full)

Almost always you will have to clean up your data and make sure you have the correct labels in the correct format. In this case, we will simply make sure that the labels are of type "string" given the specifications of the CNN model we will train. Note that our outcome of interest is "ConfCode" (level of conflict). We also load all of our images into an array that will feed the model.

In [None]:
to_res = (256, 256)

files_train = train_full['ImgFile']
files_test = test_full['ImgFile']
core_url = "https://github.com/smtorres/Start_Images/blob/main/"

train_imgs = []
for i in files_train:
  turl = core_url+"train/"+i+"?raw=true"
  image_url = get_file(origin=turl)
  img = load_img(image_url, target_size=to_res)
  img = np.array(img)
  #img = img/255
  train_imgs.append(img)
  del turl,image_url,img

train_imgs_mat = np.array(train_imgs)
train_labels = train_full['ConfCode'].to_list()
train_labels= [str(x) for x in train_labels]

test_imgs = []
for i in files_test:
  turl = core_url+"test/"+i+"?raw=true"
  image_url = get_file(origin=turl)
  img = load_img(image_url, target_size=to_res)
  img = np.array(img)
  #img = img/255
  test_imgs.append(img)
  del turl,image_url,img
  print(str(i))

test_imgs_mat = np.array(test_imgs)
test_labels = test_full['ConfCode'].to_list()
test_labels = [str(x) for x in test_labels]

print('Train dataset shape:', train_imgs_mat.shape,
 '\tValidation dataset shape:', test_imgs_mat.shape)

We will also have to clean and manipulate the images, in this case by normalizing them:

In [None]:
train_imgs_scaled = train_imgs_mat / 255
test_imgs_scaled = test_imgs_mat / 255

train_labels_enc = to_categorical(train_labels)

test_labels_enc = to_categorical(test_labels)

# visualize a sample image
print(train_imgs_mat[0].shape)
print(test_labels[:5])
array_to_img(train_imgs_mat[0])
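
One detail worth checking before one-hot encoding is how the labels are coded. `to_categorical` sizes its output from the largest label value, so labels that start at 1 would produce an extra, always-empty first column. The sketch below shows one way to map arbitrary codes to contiguous indices with NumPy. The label values are made up, and whether `ConfCode` actually needs this remapping is an assumption you should verify against the CSV.

```python
import numpy as np

# Hypothetical string labels coded 1 to 3, for illustration only
raw_labels = ["1", "3", "2", "1"]

codes = np.array(raw_labels, dtype="int64")
# classes: sorted unique codes; idx: position of each label within classes
classes, idx = np.unique(codes, return_inverse=True)

# One-hot encode using the contiguous indices 0..K-1
labels_enc = np.eye(len(classes), dtype="float32")[idx]

print(classes, idx)
print(labels_enc.shape)
```

With this mapping the encoded matrix has exactly one column per class, which is what a final `Dense` layer with that many units expects.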

Now, let's begin with the fun! Our workflow includes the following:

1) Import a pre-trained model from Keras (in this case, a lovely ResNet50).

In [None]:
input_t = Input(shape=(256, 256, 3)) # The size of the images that we want the model to take. In this case 256 x 256 pixels (*)
res_model = ResNet50(weights="imagenet", input_tensor = input_t) # Import the canned model

Let's explore the architecture and what's inside our model...

In [None]:
for i,layer in enumerate(res_model.layers):
	print(i, layer.name, "-", layer.trainable)

Notice how all layers are trainable. We are also using the FULL architecture, including the original ImageNet classification head (and its labels), so this model could already be used to classify some pictures.

Great! It seems like it works... let's import the model again but now without the head (i.e., the pre-canned labels)

2) And then, let's "freeze" some layers, meaning their weights stay fixed during training, so they keep what they learned when the CNN was trained with the ImageNet dataset.

In [None]:
model = ResNet50(include_top=False, weights="imagenet", input_tensor=input_t) # Import the canned model without its head

# Freeze the first 143 layers of the headless model
for layer in model.layers[:143]:
    layer.trainable = False

# Sanity check to see if it worked!
for i, layer in enumerate(model.layers):
    print(i, layer.name, "-", layer.trainable)
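
Conceptually, a frozen layer is one whose weights the optimizer skips, so it keeps its pre-trained values while the remaining layers adapt. The toy example below mimics a single gradient-descent step in plain NumPy. The parameter names, gradients, and the `sgd_step` helper are all made up for illustration and are not how Keras implements this internally.

```python
import numpy as np

# Made-up parameters for a two-part model
params = {"backbone": np.array([1.0, -2.0]), "head": np.array([0.5, 0.5])}
trainable = {"backbone": False, "head": True}  # backbone is frozen
grads = {"backbone": np.array([0.3, 0.3]), "head": np.array([0.2, -0.4])}

def sgd_step(params, grads, trainable, lr=0.1):
    """Hypothetical helper: update only the parameters flagged as trainable."""
    updated = {}
    for name, value in params.items():
        if trainable[name]:
            updated[name] = value - lr * grads[name]
        else:
            updated[name] = value.copy()  # frozen: value is left untouched
    return updated

new_params = sgd_step(params, grads, trainable)
print(new_params)
```

Only the `head` values move; the `backbone` values are identical before and after the step, even though a gradient was available for them.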


3) Modify the architecture of the pre-trained model according to your needs. Note that a LOT happens in this step. Let's go through some of the components:

In [None]:
model = Sequential()

model.add(ResNet50(include_top = False, pooling = 'avg', weights = 'imagenet', input_tensor=input_t))

model.add(Dense(3, activation = 'softmax'))

# Do not train the first layer (the ResNet50 base), since it is already trained
model.layers[0].trainable = False

model.summary()
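
To see what the two pieces of this model do, the NumPy sketch below imitates the head on random numbers: `pooling='avg'` averages each feature map over its spatial positions, and `Dense(3, activation='softmax')` maps the pooled vector to three class probabilities. The feature values and weights are random placeholders; the shapes assume that ResNet50 shrinks a 256 x 256 input by a factor of 32 and outputs 2048 channels.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder backbone output for 2 images: 8 x 8 positions, 2048 channels
features = rng.normal(size=(2, 8, 8, 2048))

# pooling='avg': average over the two spatial axes, one 2048-vector per image
pooled = features.mean(axis=(1, 2))

# Dense(3, activation='softmax') with made-up weights
W = rng.normal(scale=0.01, size=(2048, 3))
b = np.zeros(3)
logits = pooled @ W + b
exp = np.exp(logits - logits.max(axis=1, keepdims=True))  # subtract max for numerical stability
probs = exp / exp.sum(axis=1, keepdims=True)

head_params = W.size + b.size  # weights plus biases of the new layer
print(pooled.shape, probs.shape, head_params)
```

With the ResNet50 base frozen, these 2048 x 3 + 3 = 6147 parameters are the only ones that training updates, which you can compare against the trainable count in `model.summary()`.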

In [None]:
model.compile(optimizer = Adam(learning_rate=0.001), loss = 'categorical_crossentropy', metrics = ['accuracy'])

4) Get our training data ready! This implies resizing (to fit the input of the original model we are using), normalizing, and "augmenting" the images in our set. This augmentation is a cool trick: it flips, turns, and modifies the images in our dataset so the CNN "learns" different variations of the same image. It is the cheapest way of enlarging your training data!

In [None]:
train_datagen = ImageDataGenerator(zoom_range=0.3, rotation_range=50,
 width_shift_range=0.2, height_shift_range=0.2, shear_range=0.2,
 horizontal_flip=True, fill_mode='nearest')

test_datagen = ImageDataGenerator()

train_generator = train_datagen.flow(train_imgs_mat, train_labels_enc,batch_size=30)
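# shuffle=False keeps the test images in their original order, so predictions line up with test_labels
test_generator = test_datagen.flow(test_imgs_mat, test_labels_enc, batch_size=30, shuffle=False)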
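
As a minimal illustration of what one augmentation does, the sketch below applies a horizontal flip to a tiny made-up image with NumPy slicing. This is only the idea behind `horizontal_flip=True`; the generator applies such transformations at random and combines them with zooms, rotations, and shifts.

```python
import numpy as np

# Made-up image: 2 rows, 3 columns, 3 color channels
toy_img = np.arange(2 * 3 * 3, dtype="uint8").reshape(2, 3, 3)

# Horizontal flip: reverse the column (width) axis, keep rows and channels
flipped = toy_img[:, ::-1, :]

print(toy_img[0, :, 0])   # first channel of the first row, before
print(flipped[0, :, 0])   # same row after the flip
```

The flipped image has the same shape and the same label as the original, so the model sees a new variation of a picture it already has a label for.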

5) Train our model with our images of interest... be patient.

In [None]:
history = model.fit(train_generator,epochs=10)

6) ...and assess accuracy, mistakes, loss values, diagnostic plots, etc. until we are comfortable with the results (this does not mean until we get the result we want, but until we have strong evidence that the model is correctly doing what we envisioned it should do!)



In [None]:
preds = model.predict(test_generator)
y_int_pred = np.argmax(preds,axis=1)
print(preds[0:10])
print(y_int_pred[0:10])

In [None]:
model.evaluate(test_generator, verbose=0)

In [None]:
y_true=[int(x) for x in test_labels]
print(confusion_matrix(y_true, y_int_pred))
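
To read the confusion matrix, it helps to build a small one by hand. The sketch below uses made-up labels for three classes and follows the scikit-learn convention: rows are true classes and columns are predicted classes. It only makes sense when the true labels and the predictions are listed in the same order.

```python
import numpy as np

# Made-up true and predicted class indices for 7 observations
toy_true = np.array([0, 0, 1, 1, 2, 2, 2])
toy_pred = np.array([0, 1, 1, 1, 2, 0, 2])

n_classes = 3
cm = np.zeros((n_classes, n_classes), dtype=int)
# Add 1 to cell (true class, predicted class) for every observation
np.add.at(cm, (toy_true, toy_pred), 1)

accuracy = np.trace(cm) / cm.sum()  # correct predictions sit on the diagonal
print(cm)
print(accuracy)
```

Off-diagonal cells show which classes get confused with each other, which is often more informative than the overall accuracy.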

7) Save the coefficients (errh... weights) of your model so you can predict out-of-sample labels --> Our ultimate objective!



In [None]:
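# NOTE: base_direc is not defined anywhere in this notebook; set it to your output folder (ending in "/") before running this cell
model.save_weights(base_direc+"Model/tuned_weights_trust")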