<a href="https://colab.research.google.com/github/dyjdlopez/icpep-ai-workshop-2021/blob/main/day3/ICpEP_AI_D3_01.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Fundamentals of TensorFlow
Copyright D.Lopez 2021 | All Rights reserved <br><br>

[TensorFlow](https://www.tensorflow.org/) is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries and community resources that lets researchers push the state-of-the-art in ML and developers easily build and deploy ML powered applications.<br>
TensorFlow provides several APIs that let developers build a range of AI applications, from data estimation and computer vision to natural language processing and even reinforcement learning.
![image](https://camo.githubusercontent.com/c04e16c05de80dadbdc990884672fc941fdcbbfbb02b31dd48c248d010861426/68747470733a2f2f7777772e74656e736f72666c6f772e6f72672f696d616765732f74665f6c6f676f5f736f6369616c2e706e67)<br>





In [None]:
# !pip install tensorflow
# !pip install tensorflow-gpu
import tensorflow as tf
import numpy as np
import cv2
import matplotlib.pyplot as plt


## Part 1 Tensor Operations
TensorFlow mainly operates on tensors (as its name suggests), so let's apply what we already know about tensors on this platform.

### 1.1 NumPy and TensorFlow
If you have enjoyed using matrices and tensors in NumPy, then performing tensor algebra in TensorFlow will just be a breeze.

In [None]:
np_tensor = np.array(3)
tf_tensor = tf.constant(3)

print(np_tensor)
print(tf_tensor)

In [None]:
np_mat = np.array([
                   [1,2],
                   [3,1]
], dtype=float)
tf_mat = tf.constant([
                      [1,2],
                      [3,1]
], dtype=float)
print(np_mat)
print(tf_mat)

In [None]:
type(tf_mat.numpy())

In [None]:
A = tf_mat
B = tf.transpose(tf_mat)
print(f"Matrix A: \n{A}")
print(f"Matrix B: \n{B}")
print(f"Sum of Tensors: \n{A+B}")
print(f"Difference of Tensors: \n{A-B}")
print(f"Element-wise Product of Tensors: \n{A*B}")

In [None]:
print(f"Matrix Product of Tensors: \n{A@B}")

In [None]:
C = tf.reshape(A, [4,1])
C

## Part 2: Machine Learning Revisited
As we recall, machine learning takes in data and a program to produce a rule or determine a pattern, as opposed to traditional programming, which requires a pattern or rule together with the data to create a working system.

Machine learning can be further classified into several cognitive paradigms:

<b>Supervised learning</b>: a type of machine learning in which every input sample has features and a label (the typical X data and y label format). Supervised learning requires its dataset to be:
* Large (Volume)
* Various
* Valid

<b>Unsupervised learning</b>: unlike the input data for supervised learning, unsupervised learning data has no labels. Unsupervised learning aims to find patterns in unexplored data. Typical applications include dimensionality reduction and clustering.

<b>Reinforcement learning</b>: a reinforcement learning algorithm needs little to no data (in the form of a dataset) to learn. Instead, it aims to learn a rule, policy, or “way to do stuff” by observing whether its actions in a given environment are rewarded or punished. Common uses of reinforcement learning include optimization.

In the succeeding topics, we will be focusing on supervised learning using Deep Neural Networks.
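
As a small illustration of the "data in, rule out" idea for supervised learning, here is a minimal sketch that assumes the hidden rule is a straight line. The function name `rule` and the use of `np.polyfit` are choices made for this example only. A neural network would learn the same mapping iteratively rather than in closed form.

```python
import numpy as np

# Traditional programming: the rule is written by hand.
def rule(x):
    return 2 * x - 1

# Machine learning: only examples (features X, labels y) are given.
X = np.arange(-1, 5, dtype=float)
y = rule(X)  # labels are generated here only to keep the example self-contained

# A line y = w*x + b is fitted to the examples by least squares,
# so the rule is recovered from the data instead of being coded.
w, b = np.polyfit(X, y, deg=1)
print(f"learned slope: {w:.2f}, learned intercept: {b:.2f}")
```

The fitted slope and intercept should land very close to the hand-written rule, which is the sense in which the program "produces" the rule from data.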

### 2.1 The Neuron (Again)
![image](https://svitla.com/uploads/ckeditor/ArtificialNeuronModel_english.jpg)<br>

Recalling our last discussion of the neuron, we found that it is the basic unit of a neural network. Its learning process has two parts. In feed-forward propagation, the neuron takes in several inputs, multiplies them by weights, passes the result through a transfer function, and then applies an activation function. In backward propagation, it computes the loss and cost, then uses that error value to update the weights. These two steps repeat over the training period until the weights converge (or, in some cases, diverge).
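
The feed-forward and backward steps can be sketched by hand for a single neuron. This is a minimal NumPy sketch, not what Keras does internally: it assumes an identity activation, a mean squared error cost, plain gradient descent, and illustrative values for the starting weights, learning rate, and number of epochs.

```python
import numpy as np

X = np.arange(-1, 5, dtype=float)  # features
y = 2 * X - 1                      # targets

w, b = 0.0, 0.0                    # initial weight and bias (arbitrary choice)
lr = 0.01                          # learning rate (assumed value)

for epoch in range(2000):
    # Feed-forward: weighted input plus bias, identity activation
    y_hat = w * X + b
    # Cost: mean squared error over all samples
    error = y_hat - y
    cost = np.mean(error ** 2)
    # Backward: gradients of the cost with respect to w and b
    grad_w = 2 * np.mean(error * X)
    grad_b = 2 * np.mean(error)
    # Update: step against the gradient
    w -= lr * grad_w
    b -= lr * grad_b

print(f"w={w:.3f}, b={b:.3f}, final cost={cost:.6f}")
print("prediction for x=10:", w * 10.0 + b)
```

With these settings the weight and bias should approach the slope and intercept of the line that generated the targets.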

In [None]:
#Features
X = np.arange(-1,5,dtype=float)
def fx(x): return 2*x-1
#Targets/Labels
y = np.array(list(map(fx,X)))

In [None]:
print(X)
print(y)

In [None]:
from tensorflow.keras.optimizers import Adam, SGD, RMSprop
from tensorflow.keras.losses import MSE, MAE

In [None]:
### Dense Layer
model = tf.keras.Sequential([
                             tf.keras.layers.Dense(units=1, input_shape=[1])
])
lr=0.01
model.compile(optimizer=SGD(learning_rate=lr),
              loss=MSE)
model.summary()

In [None]:
history1 = model.fit(X,y,epochs=200)

In [None]:
plt.title('Loss Curve')
plt.plot(history1.history['loss'])
plt.ylabel('loss')
plt.xlabel('epoch')

plt.show()

In [None]:
# Pass a 2-D NumPy array with shape (samples, features)
model.predict(np.array([[10.0]]))

## Part 3: Neural Networks

### 3.1 Multilayer Perceptron
![image](https://www.researchgate.net/profile/Facundo_Bre/publication/321259051/figure/fig1/AS:614329250496529@1523478915726/Artificial-neural-network-architecture-ANN-i-h-1-h-2-h-n-o.png)

As the name suggests, a multilayer perceptron (MLP) is a network of neurons, or perceptrons, arranged in layers and connected to one another. In this setup, each layer passes its activated values to the next layer, which gives a sense of "deep" learning. The concept of the MLP gave rise to a field of machine learning called Deep Learning, where we study Artificial Neural Networks (ANN).

An ANN consists of three parts:
* Input layer
* Hidden layer(s)
* Output layer

However, when counting the number of layers of a neural network, we exclude the input layer, since no learning happens at the input layer, or Layer 0 ($L0$).
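
To make the layer-counting convention concrete, here is a minimal NumPy sketch of a forward pass through a 1-16-1 network. The names `sizes`, `params`, and `forward` are hypothetical, the weights are random rather than trained, and the layers are kept linear (no activation) purely for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Layer sizes: 1 input feature -> 16 hidden units -> 1 output unit
sizes = [1, 16, 1]

# One (weights, biases) pair per learnable layer; the input layer has none.
params = [(rng.normal(size=(n_in, n_out)), np.zeros(n_out))
          for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward(x, params):
    """Pass a batch of inputs through every learnable layer in turn."""
    a = x
    for W, b in params:
        a = a @ W + b  # linear layer, activation omitted in this sketch
    return a

x = np.array([[10.0]])  # one sample with one feature
out = forward(x, params)

n_layers = len(params)
n_params = sum(W.size + b.size for W, b in params)
print("output shape:", out.shape)
print("learnable layers:", n_layers, "| learnable parameters:", n_params)
```

Only the hidden and output layers hold weights and biases, which is why the input layer is left out of the count.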

In [None]:
### Multilayer Perceptron
model = tf.keras.Sequential([
  tf.keras.layers.Dense(units=16,input_shape=[1]), #Hidden Layer
  tf.keras.layers.Dense(units=1) #Output layer
])

lr=0.01
model.compile(optimizer=SGD(learning_rate=lr),
              loss=MSE)
model.summary()
history2=model.fit(X,y, epochs=200)

In [None]:
plt.title('Loss Curve')
plt.plot(history1.history['loss'], label='Single Neuron')
plt.plot(history2.history['loss'], label='MLP')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend()
plt.show()

In [None]:
# Pass a 2-D NumPy array with shape (samples, features)
model.predict(np.array([[10.0]]))

### 3.2 Activation Functions

![image](https://www.researchgate.net/profile/Junxi_Feng/publication/335845675/figure/fig3/AS:804124836765699@1568729709680/Commonly-used-activation-functions-a-Sigmoid-b-Tanh-c-ReLU-and-d-LReLU.ppm)

Back in our discussion about the neuron, we learned that an activation function is crucial in getting the right values. Different activation functions are used for different learning objectives. One factor to consider in choosing an activation function is the behavior of the outputs per layer, or the expected output of the machine learning task. Simply put, identifying whether you are classifying data or predicting values helps determine which activation function to use.

For a deeper discussion and implementation check out:
* [Activation functions in TensorFlow](https://www.tensorflow.org/api_docs/python/tf/keras/activations)
* [Activation functions in Keras](https://keras.io/api/layers/activations/)
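
For intuition, the common activation functions can also be written directly in NumPy. This is a sketch for illustration and not the TensorFlow implementation. The `alpha` slope in `leaky_relu` is an assumed value chosen for the example, and the input values simply reuse the ones from this section.

```python
import numpy as np

def sigmoid(x):
    # Squashes any real value into (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # Keeps positive values, zeroes out negative ones
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.2):
    # Like ReLU, but negative values are scaled by alpha instead of zeroed
    return np.where(x > 0, x, alpha * x)

def softmax(x):
    # Subtracting the max keeps the exponentials numerically stable
    e = np.exp(x - np.max(x, axis=-1, keepdims=True))
    return e / np.sum(e, axis=-1, keepdims=True)

x = np.array([0.0, -1.2, 2.4, 32.0, -20.1])

for name, fn in [("sigmoid", sigmoid), ("tanh", np.tanh), ("relu", relu),
                 ("leaky_relu", leaky_relu), ("softmax", softmax)]:
    print(f"{name:>10}: {np.round(fn(x), 4)}")
```

Sigmoid and softmax produce values that can be read as probabilities, which is why they tend to appear in output layers for classification, while ReLU variants are more common in hidden layers.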


In [None]:
from tensorflow.keras.layers import Activation
from tensorflow.nn import sigmoid, tanh, softmax, relu, leaky_relu

In [None]:
inputs = tf.constant([
                      [0.0,-1.2,2.4,32.0,-20.1]
                      ])
print(inputs)

In [None]:
### Sigmoid
sigmoid_layer = Activation(sigmoid)
sigmoid_layer(inputs).numpy()

In [None]:
### Tanh
tanh_layer = Activation(tanh)
tanh_layer(inputs).numpy()

In [None]:
### Softmax
softmax_layer = Activation(softmax)
softmax_layer(inputs).numpy()

In [None]:
### ReLU
relu_layer = Activation(relu)
relu_layer(inputs).numpy()

In [None]:
### Leaky ReLU
lrelu_layer = Activation(leaky_relu)
lrelu_layer(inputs).numpy()

### 3.3 Computer Vision
Using deep neural networks to solve image processing problems leads to computer vision solutions. The goal of computer vision is to mimic the visual-cognitive functions of the brain. Its main activities include image and video recognition. Over roughly the past decade, considerable effort has gone into improving image systems, which introduced the following tasks:
![image](https://3.bp.blogspot.com/-e-V_TvNbMSc/XJ7uRvmc4CI/AAAAAAAADPo/47Cg4DqI-g45qQEDRYuPwgaEiqqYDq2wACLcBGAs/s1600/cnn-extensions.png)

In [None]:
from tensorflow.keras.losses import sparse_categorical_crossentropy
from tensorflow.keras.optimizers import Adam, RMSprop, SGD

In [None]:
mnist = tf.keras.datasets.mnist
(training_images, training_labels), (test_images, test_labels) = mnist.load_data()

In [None]:
plt.imshow(training_images[0])
training_images[0].shape

In [None]:
training_images  = training_images / 255.0
test_images = test_images / 255.0


In [None]:
model = tf.keras.models.Sequential([
                                    tf.keras.layers.Flatten(), 
                                    tf.keras.layers.Dense(64, activation=relu), 
                                    tf.keras.layers.Dense(10, activation=softmax)
])

In [None]:
model.compile(optimizer=Adam(),
              loss=sparse_categorical_crossentropy,
              metrics=['accuracy'])
history3 = model.fit(training_images, training_labels, epochs=5)

In [None]:
plt.figure(figsize=(12,4))

plt.subplot(121)
plt.title('Loss Curve')
plt.plot(history3.history['loss'])
plt.ylabel('loss')
plt.xlabel('epoch')

plt.subplot(122)
plt.title('Accuracy Curve')
plt.plot(history3.history['accuracy'])
plt.ylabel('accuracy')
plt.xlabel('epoch')

plt.show()

In [None]:
model.evaluate(test_images, test_labels)

What if we add more layers?

In [None]:
model = tf.keras.models.Sequential([
                                    tf.keras.layers.Flatten(), 
                                    tf.keras.layers.Dense(128, activation=relu), 
                                    tf.keras.layers.Dense(64, activation=relu),
                                    tf.keras.layers.Dense(10, activation=softmax)
])

In [None]:
model.compile(optimizer=Adam(),
              loss=sparse_categorical_crossentropy,
              metrics=['accuracy'])
history4 = model.fit(training_images, training_labels, epochs=5)

In [None]:
plt.figure(figsize=(12,4))

plt.subplot(121)
plt.title('Loss Curve')
plt.plot(history4.history['loss'], label='2 Layers')
plt.plot(history3.history['loss'], label='1 Layer')
plt.legend()
plt.ylabel('loss')
plt.xlabel('epoch')

plt.subplot(122)
plt.title('Accuracy Curve')
plt.plot(history4.history['accuracy'], label='2 Layers')
plt.plot(history3.history['accuracy'], label='1 Layer')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend()

plt.show()

### 3.4 Convolutional Neural Networks
Last week, we saw what convolutions do to images to get or localize some features. In deep learning, convolutions have also proven very useful in extracting information from images and learning from them.

In 1998, [LeCun et al.](http://yann.lecun.com/exdb/publis/pdf/lecun-98.pdf) introduced an implementation of convolutional neural networks for computer vision. The idea of using convolutions for visual cognition, however, was [first conceptualized](https://medium.com/@gopalkalpande/biological-inspiration-of-convolutional-neural-network-cnn-9419668898ac) by Hubel and Wiesel in 1962, taking into consideration the processes of the brain's visual cortex.

![image](https://miro.medium.com/max/4308/1*1TI1aGBZ4dybR6__DI9dzA.png)

You may notice some new terminology in this section, so let's break down the technicalities of CNNs.

<b>Feature Maps</b>: feature maps are the outputs produced when learned filters (kernels) are convolved with an image or with a previous feature map. Recall that we need kernels to perform convolutions. In CNNs, we try to learn the kernel values that extract the features we want from the image.

<b>Subsampling</b>: subsampling is a method of reducing the dimensions of an image or feature map while retaining vital information. To achieve this, we tend to use pooling techniques such as max pooling or average pooling.

<b>Full connections</b>: in the later layers of a CNN, the feature maps are flattened into a vector so we can run it through an MLP to determine the defining features of the image.
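
The three terms above can be traced on a single image with plain NumPy. This is a rough sketch under several assumptions: one random 3x3 kernel stands in for a learned filter, the "convolution" is the unflipped sliding-window product that deep learning libraries typically use, and the helper names `conv2d_valid` and `max_pool` are made up for this example.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding ("valid" mode)."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size=2):
    """Keep the maximum of each non-overlapping size x size window."""
    h, w = fmap.shape[0] // size, fmap.shape[1] // size
    trimmed = fmap[:h * size, :w * size]
    return trimmed.reshape(h, size, w, size).max(axis=(1, 3))

rng = np.random.default_rng(0)
image = rng.random((28, 28))      # stand-in for one grayscale digit
kernel = rng.normal(size=(3, 3))  # in a CNN these values are learned

feature_map = np.maximum(conv2d_valid(image, kernel), 0)  # convolution + ReLU
pooled = max_pool(feature_map)                            # subsampling
flat = pooled.reshape(-1)                                 # input to the dense layers

print(feature_map.shape, pooled.shape, flat.shape)
```

A real convolutional layer repeats this for many kernels at once, so the flattened vector is correspondingly longer.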



In [None]:
mnist = tf.keras.datasets.mnist
(training_images, training_labels), (test_images, test_labels) = mnist.load_data()
training_images=training_images.reshape(60000, 28, 28, 1)
training_images=training_images / 255.0
test_images = test_images.reshape(10000, 28, 28, 1)
test_images=test_images/255.0

In [None]:
model = tf.keras.models.Sequential([
          tf.keras.layers.Conv2D(32, (3,3), 
                                 activation='relu', input_shape=(28,28,1)),
          tf.keras.layers.MaxPooling2D((2,2)),
          tf.keras.layers.Flatten(),
          tf.keras.layers.Dense(128, activation='relu'),
          tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(
    optimizer='adam',
    loss=sparse_categorical_crossentropy,
    metrics=['accuracy']
)
history5= model.fit(training_images, training_labels, epochs=5)

In [None]:
plt.figure(figsize=(12,4))

plt.subplot(121)
plt.title('Loss Curve')
plt.plot(history5.history['loss'], label='CNN')
plt.plot(history4.history['loss'], label='2 Layers')
plt.plot(history3.history['loss'], label='1 Layer')
plt.legend()
plt.ylabel('loss')
plt.xlabel('epoch')

plt.subplot(122)
plt.title('Accuracy Curve')
plt.plot(history5.history['accuracy'], label='CNN')
plt.plot(history4.history['accuracy'], label='2 Layers')
plt.plot(history3.history['accuracy'], label='1 Layer')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend()

plt.show()

## Part 4: Custom Training

### 4.1 Importing Images

In [None]:
from google.colab import drive
import os
drive.mount('/content/drive')

!ls drive/MyDrive/ICpEP\ AI\ Workshop\ Files/ds/
data_dir = "drive/MyDrive/ICpEP AI Workshop Files/ds/"
dir = os.listdir(data_dir)
for folder in dir:
  print(os.listdir(data_dir+folder))

In [None]:
dog_img = cv2.cvtColor(cv2.imread(data_dir+dir[0]+'/dog1.jpg'), 
                       cv2.COLOR_BGR2RGB)
plt.imshow(dog_img)

In [None]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

In [None]:
TRAIN_DIR = data_dir+'train'
train_datagen = ImageDataGenerator(rescale=1.0/255.)
train_generator = train_datagen.flow_from_directory(TRAIN_DIR,
                                                     class_mode='binary',
                                                     target_size=(100,100))


In [None]:
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(16, (3, 3), activation='relu', input_shape=(100, 100, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

model.compile(optimizer=Adam(learning_rate=0.001), loss='binary_crossentropy', metrics=['accuracy'])

In [None]:
history6 = model.fit(train_generator,
                              epochs=20,
                              verbose=1)

In [None]:
plt.figure(figsize=(12,4))

plt.subplot(121)
plt.title('Loss Curve')
plt.plot(history6.history['loss'], label='Training')
plt.legend()
plt.ylabel('loss')
plt.xlabel('epoch')

plt.subplot(122)
plt.title('Accuracy Curve')
plt.plot(history6.history['accuracy'])
plt.ylabel('accuracy')
plt.xlabel('epoch')

plt.show()

In [None]:
import numpy as np
from google.colab import files
from tensorflow.keras.preprocessing import image

def upload_and_predict():
  uploaded = files.upload()

  for fn in uploaded.keys():
    # Load each uploaded image at the size the model was trained on
    path = '/content/' + fn
    img = image.load_img(path, target_size=(100, 100))
    # Rescale to [0, 1] to match the generator's rescale=1.0/255.
    x = image.img_to_array(img) / 255.0
    # Add a batch dimension: (100, 100, 3) -> (1, 100, 100, 3)
    x = np.expand_dims(x, axis=0)

    classes = model.predict(x)
    print(classes[0])
    # Assumes the dog folder maps to class index 1 (check train_generator.class_indices)
    if classes[0][0] > 0.5:
      print(fn + " is a dog")
    else:
      print(fn + " is not a dog")
upload_and_predict()

#### <i>Bias-Variance Tradeoff</i>
![image](https://www.googleapis.com/download/storage/v1/b/kaggle-forum-message-attachments/o/inbox%2F3335785%2F728627205a9dc976248c9f05a2baf464%2F1.png?generation=1600257419469907&alt=media)

The bias-variance tradeoff is an essential topic in debugging machine learning models. Overfitting and underfitting are easier to understand once you know about bias and variance.

<b>Bias</b>: the bias refers to the difference between the average prediction of the model and the ground truth. Models with high bias over-generalize the training data and fail to "learn". This is referred to as <b>underfitting</b>.

<b>Variance</b>: the variance of a model refers to how much the estimate of the target function would change if different training data were used. Models with high variance are too "specialized" or overtrained on the dataset and fail to be flexible with data outside the training set. This is referred to as <b>overfitting</b>.
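
A quick way to see both failure modes is to fit polynomials of increasing degree to a small noisy dataset. This is a minimal sketch with made-up data: the sine target, the noise level, the sample sizes, and the degrees are all assumptions chosen for illustration, and the exact numbers depend on the random seed.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    # Noisy samples of a smooth curve (hypothetical target function)
    x = rng.uniform(-1, 1, n)
    y = np.sin(3 * x) + rng.normal(scale=0.2, size=n)
    return x, y

x_train, y_train = make_data(15)   # small training set
x_val, y_val = make_data(200)      # held-out data from the same source

def mse(coeffs, x, y):
    return np.mean((np.polyval(coeffs, x) - y) ** 2)

results = {}
for degree in (1, 3, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    results[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_val, y_val))
    print(f"degree {degree}: train MSE={results[degree][0]:.4f}, "
          f"held-out MSE={results[degree][1]:.4f}")
```

The training error can only go down as the degree grows. A model that is too simple tends to do poorly on both sets (high bias), while a model that is too flexible tends to show a widening gap between training and held-out error (high variance).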

<b><i>Remedies</i></b>

To solve <b>underfitting</b> you might:
* Train longer,
* Add features,
* Choose a more appropriate model,
* Add more layers (complexity) to the model, or
* Choose a more appropriate loss or optimizer

To solve <b>overfitting</b> you might:
* Train shorter,
* Add more varieties or samples (generalizing), or
* Apply regularization techniques such as batch normalization, data augmentation, dropout, etc.

### 4.2 Validation Sets
To see the bias-variance tradeoff better in training, we need a validation set during training. Test and validation sets are similar in that both come from the same dataset, are not part of the training data, and can be used to assess bias and variance. The difference is that validation sets are evaluated with the model during training, while test sets are checked with the model post-training.

There are also deployment sets that are considered another post-training dataset but are sampled from an actual area of deployment.
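
The separation between training, validation, and test data can be sketched as a shuffled index split. The function name `split_indices` and the 60/20/20 proportions are assumptions for this example. The image examples in this notebook rely on separate folders instead.

```python
import numpy as np

def split_indices(n_samples, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle sample indices and cut them into three disjoint groups."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_frac))
    n_val = int(round(n_samples * val_frac))
    test_idx = idx[:n_test]                  # checked once, after training
    val_idx = idx[n_test:n_test + n_val]     # monitored during training
    train_idx = idx[n_test + n_val:]         # used to update the weights
    return train_idx, val_idx, test_idx

train_idx, val_idx, test_idx = split_indices(100)
print(len(train_idx), len(val_idx), len(test_idx))
```

Because the three index groups never overlap, the validation and test scores are measured on samples the model has not been trained on.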

For our example, we will be using [Image Data Generators](https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image/ImageDataGenerator). Image data generators come from the Keras image preprocessing API in TensorFlow. They allow us to pre-transform our dataset in memory.

In [None]:
TRAIN_DIR = data_dir+'train'
train_datagen = ImageDataGenerator(rescale=1.0/255.)
train_generator = train_datagen.flow_from_directory(TRAIN_DIR,
                                                     class_mode='binary',
                                                     target_size=(100,100))


In [None]:
VALIDATION_DIR = data_dir+'test'
validation_datagen = ImageDataGenerator(rescale=1.0/255.)
# Use the validation generator object (not train_datagen) for the validation folder
validaiton_generator = validation_datagen.flow_from_directory(VALIDATION_DIR,
                                                              class_mode='binary',
                                                              target_size=(100,100))

In [None]:
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(16, (3, 3), activation='relu', input_shape=(100, 100, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

model.compile(optimizer=Adam(learning_rate=0.001), loss='binary_crossentropy', metrics=['accuracy'])

In [None]:
history7 = model.fit(train_generator,
                              epochs=15,
                              verbose=1,
                     validation_data=validaiton_generator)

In [None]:
plt.figure(figsize=(12,4))

plt.subplot(121)
plt.title('Loss Curve')
plt.plot(history7.history['loss'], label='Training')
plt.plot(history7.history['val_loss'], label='Validation')
plt.legend()
plt.ylabel('loss')
plt.xlabel('epoch')

plt.subplot(122)
plt.title('Accuracy Curve')
plt.plot(history7.history['accuracy'], label='Training')
plt.plot(history7.history['val_accuracy'], label='Validation')
plt.legend()
plt.ylabel('accuracy')
plt.xlabel('epoch')

plt.show()

In [None]:
upload_and_predict()

### 4.3 Data Augmentation
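
Data augmentation creates randomly transformed copies of the training images so the model sees more variety than the folder actually contains. As a rough NumPy sketch of two such transformations (a horizontal flip and a shift that fills the gap with the nearest edge pixels), with hypothetical helper names and not the way `ImageDataGenerator` is implemented:

```python
import numpy as np

def horizontal_flip(img):
    """Mirror the image left to right."""
    return img[:, ::-1]

def shift_right(img, pixels):
    """Shift the image right, filling vacated columns with the nearest edge column."""
    if pixels == 0:
        return img.copy()
    shifted = np.empty_like(img)
    shifted[:, pixels:] = img[:, :-pixels]
    shifted[:, :pixels] = img[:, :1]  # repeat the original left edge
    return shifted

img = np.arange(12).reshape(3, 4)  # tiny stand-in for an image
print(horizontal_flip(img))
print(shift_right(img, 1))
```

The label of an image does not change under these transformations, which is what makes them usable as extra training samples.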

In [None]:
TRAIN_DIR = data_dir+'train'
train_datagen = ImageDataGenerator(
    rescale=1.0/255.,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')
train_generator = train_datagen.flow_from_directory(TRAIN_DIR,
                                                    class_mode='binary',
                                                    batch_size=4,
                                                    target_size=(100,100))

VALIDATION_DIR = data_dir+'test'
# Validation images are only rescaled, not augmented, so the validation
# metrics reflect unmodified data.
validation_datagen = ImageDataGenerator(rescale=1.0/255.)
validaiton_generator = validation_datagen.flow_from_directory(VALIDATION_DIR,
                                                              class_mode='binary',
                                                              batch_size=4,
                                                              target_size=(100,100))

In [None]:
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(100, 100, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(16, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

model.compile(optimizer=Adam(learning_rate=0.001), loss='binary_crossentropy', metrics=['accuracy'])

In [None]:
history8 = model.fit(train_generator,
                              epochs=15,
                              verbose=1,
                     validation_data=validaiton_generator)

In [None]:
plt.figure(figsize=(12,4))

plt.subplot(121)
plt.title('Loss Curve')
plt.plot(history8.history['loss'], label='Training')
plt.plot(history8.history['val_loss'], label='Validation')
plt.legend()
plt.ylabel('loss')
plt.xlabel('epoch')

plt.subplot(122)
plt.title('Accuracy Curve')
plt.plot(history8.history['accuracy'], label='Training')
plt.plot(history8.history['val_accuracy'], label='Validation')
plt.legend()
plt.ylabel('accuracy')
plt.xlabel('epoch')

plt.show()

In [None]:
upload_and_predict()