In [0]:
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
import tensorflow.keras as keras
from sklearn.model_selection import train_test_split

# General Concepts

## Artificial Intelligence
AI is the theory and development of computer systems able to perform tasks that normally require human intelligence, such as visual perception, speech recognition, decision-making, and translation between languages. It is a branch of computer science dealing with the simulation of intelligent behavior in computers; the capability of a machine to imitate intelligent human behavior.

## Machine Learning
Machine learning is an application of AI that uses statistics to find and recognize patterns in large sets of data. There are three common types of machine learning:

### Supervised Learning
Supervised learning is a type of ML where the model is provided with labeled training data. The model is then meant to process unlabeled data and predict the label for that new data.

### Unsupervised Learning
Unsupervised learning, however, is a type of ML where the model does not have labeled training data and learns as it processes the data. It looks for patterns among the data and makes informed decisions that way.

### Reinforcement Learning
Reinforcement learning does not learn from a labeled data set. Instead, there is an agent and an environment. A common example is a video game: the agent is tasked with "do not get the game over screen". It then learns from its interactions with the environment and becomes better at reaching that goal.

## Deep Learning
Deep learning is a subset of ML in which data is passed through a neural network with many layers.

# Basic Concepts
Here are the basic concepts that we learned in this class. They set up the core foundation of what you need to know to perform deep learning.

## Linear Regression
Linear regression is a type of predictive analysis. It is a learning algorithm that looks at the independent variables and predicts the dependent variable. Its objective is to find the equation of the line that best fits the input values. If there are multiple input variables, the equation is:
$ y = b + w_1 x_1 + w_2 x_2 + \dots + w_n x_n $
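
As a minimal sketch of fitting this equation, the example below recovers $b$, $w_1$, and $w_2$ from a small made-up data set using NumPy's least-squares solver. The data and variable names are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical toy data generated from y = 1 + 2*x1 + 3*x2 (no noise)
X = np.array([[0.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [2.0, 1.0]])
y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1]

# Prepend a column of ones so the bias b is solved for alongside the weights
X_aug = np.column_stack([np.ones(len(X)), X])

# Least-squares solution for [b, w_1, w_2]
params, *_ = np.linalg.lstsq(X_aug, y, rcond=None)
b, w = params[0], params[1:]
print("b =", b, "w =", w)
```

Because the toy data has no noise, the fitted values should match the ones used to generate it.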

## Logistic Regression
Logistic regression is an algorithm that uses a sigmoid curve to produce a value between 0 and 1. That value is used to make a binary choice, typically a win/lose decision.
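
A small sketch of that idea: the weighted sum of the inputs is passed through the sigmoid, and the result is thresholded at 0.5 to make the binary choice. The weights, bias, and input here are made-up values, and the 0.5 threshold is a common convention rather than something fixed by the algorithm.

```python
import numpy as np

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical weights, bias, and input (illustration only)
w = np.array([2.0, -1.0])
b = 0.5
x = np.array([1.0, 0.5])

# Weighted sum, then sigmoid to get a probability-like value
p = sigmoid(np.dot(w, x) + b)

# Threshold at 0.5 to turn the value into a binary decision
label = int(p >= 0.5)
print(p, label)
```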

In [0]:
def logistic_regression(data, labels):
    # Train for 200 epochs (full passes over the data)
    epochs = 200
    # Learning rate of 0.001
    lr = 0.001

    # One weight per input feature (two features) and a single bias
    w = np.random.rand(2)
    b = np.zeros(1)

    for _ in range(epochs):
        for x, y in zip(data, labels):
            # Weighted sum of the two features plus the bias
            z = np.dot(w, x) + b
            # Sigmoid function
            a = 1 / (1 + np.exp(-z))

            # Gradient of the cross-entropy loss
            grad_w = (a - y) * np.asarray(x)
            grad_b = a - y

            # Step the parameters against the gradient
            w = w - (lr * grad_w)
            b = b - (lr * grad_b)
    return w, b

## Convolutions
Convolutions are used in convolutional neural networks, which progressively extract higher- and higher-level representations of the image content. The network takes the raw pixel data as input and learns how to extract these features. A convolution slides a kernel matrix across the input and, at each position, sums the element-wise products of the kernel and the input values beneath it.

In [0]:

def conv2d(input_mat, kernal_mat):
    in_size = input_mat.shape[0]
    k_size = kernal_mat.shape[0]

    # Check if matrices are valid.
    if input_mat.shape[0] != input_mat.shape[1]:
        print("Error: input matrix is not an n x n matrix.")
        return []
    elif kernal_mat.shape[0] != kernal_mat.shape[1]:
        print("Error: kernel matrix is not an n x n matrix.")
        return []
    elif k_size > in_size:
        print("Error: kernel matrix cannot be larger than input matrix.")
        return []
    
    n = in_size - k_size + 1
    if in_size < 1 or k_size < 1 or n < 1:
        print("Error: Negative dimensions are not allowed.")
        return []
    
    # Create output matrix
    output_mat = np.zeros((n, n))

    # Convolution
    for i in range(n):
        for j in range(n):
            output_mat[i, j] = np.sum(input_mat[i:i+k_size, j:j+k_size] * kernal_mat)

    return output_mat
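
The same sliding-window computation can also be sketched without explicit loops. The version below is an alternative, not the function above: it assumes NumPy 1.20 or newer for `sliding_window_view`, and the function name, image, and vertical-edge kernel are made up for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv2d_windows(input_mat, kernel_mat):
    # Shape (n, n, k, k): one k x k window per output position
    windows = sliding_window_view(input_mat, kernel_mat.shape)
    # Multiply each window element-wise by the kernel and sum the products
    return np.einsum("ijkl,kl->ij", windows, kernel_mat)

# Hypothetical 4 x 4 image: dark on the left, bright on the right
image = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]], dtype=float)

# Hypothetical kernel that responds to a dark-to-bright vertical edge
kernel = np.array([[-1.0, 1.0],
                   [-1.0, 1.0]])

result = conv2d_windows(image, kernel)
print(result)
```

The output is nonzero only in the middle column, where the window straddles the edge between the dark and bright halves.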

## Max Pooling
Max pooling is an operation that shrinks its input using a window of a specified size. The window moves across the matrix over non-overlapping regions, and the largest value in each region becomes the corresponding entry of the output.

In [0]:
def maxpooling2d(input_mat, s):
    in_size = input_mat.shape[0]

    # Check if matrices are valid.
    if input_mat.shape[0] != input_mat.shape[1]:
        print("Error: input matrix is not an n x n matrix.")
        return []
    
    # Create dimensions of output matrix
    x = int(input_mat.shape[0] / s)
    y = int(input_mat.shape[1] / s)
    
    # Create output matrix
    output_mat = np.zeros((x, y))

    # Maxpooling
    for i in range(x):
        for j in range(y):
            in_slice = input_mat[i*s:(i+1)*s, j*s:(j+1)*s]
            output_mat[i, j] = np.amax(in_slice)

    return output_mat
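
As a quick worked example of the idea, the sketch below pools a made-up 4 x 4 matrix with a window size of 2. It uses a reshape instead of loops, which is an alternative formulation and not the function above.

```python
import numpy as np

# Hypothetical 4 x 4 input
mat = np.array([[1, 3, 2, 0],
                [4, 2, 1, 5],
                [7, 0, 3, 2],
                [1, 8, 4, 6]], dtype=float)
s = 2
n = mat.shape[0] // s

# Split rows and columns into blocks of size s, then take the max of each block
pooled = mat[:n * s, :n * s].reshape(n, s, n, s).max(axis=(1, 3))
print(pooled)
```

Each entry of `pooled` is the largest value in one 2 x 2 block: 4 and 5 for the top blocks, 8 and 6 for the bottom ones.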

## Gradient Descent
Gradient descent uses the slope (gradient) of the loss function to minimize the loss. Repeatedly stepping the model's parameters in the direction opposite the gradient lets the model get closer and closer to being correct.
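
A minimal sketch of the update rule, assuming a made-up one-parameter loss $(w - 3)^2$ whose minimum is at $w = 3$. The function name, learning rate, and step count are illustrative choices.

```python
def gradient_descent(grad, w0, lr=0.1, steps=100):
    """Repeatedly step against the gradient, starting from w0."""
    w = w0
    for _ in range(steps):
        w = w - lr * grad(w)
    return w

# Hypothetical loss: loss(w) = (w - 3)^2, so its gradient is 2 * (w - 3)
w_min = gradient_descent(lambda w: 2.0 * (w - 3.0), w0=0.0)
print(w_min)
```

Each step shrinks the distance to the minimum, so `w_min` ends up very close to 3.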

# Building a Model
A convolutional neural network (CNN), as stated above, is a neural network with additional convolution layers. These types of networks are extremely useful for the classification of images. The features they extract range from edge detection all the way to species classification.

## Convolution layers
These layers take the size of the tiles being extracted, as well as the number of filters to apply. For each position in the picture, the layer extracts a tile and takes the dot product of that tile with each filter.

## Rectified Linear Unit
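The rectified linear unit, or ReLU, is an activation function that outputs its input when the input is positive and 0 otherwise. Because it is not a linear function, it is applied to introduce nonlinearity into the model. This allows the model to learn more complex patterns.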
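
The function itself is short enough to sketch directly in NumPy. The input values are made up for illustration.

```python
import numpy as np

def relu(x):
    """Keep positive values and replace negative values with 0."""
    return np.maximum(0.0, x)

values = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
activated = relu(values)
print(activated)
```

The negative entries become 0 and the positive entries pass through unchanged.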

## Example of the beginning of CNN
The code below shows the beginning stages of a CNN construction. This is using the Xception convolutional base.

In [0]:
# Colab magic: select TensorFlow 2.x before importing it
%tensorflow_version 2.x
import tensorflow as tf
from tensorflow.keras.applications import Xception

# Pretrained Xception base, without its original classifier on top
conv_base = Xception(
    weights='imagenet',
    include_top=False,
    input_shape=(150, 150, 3))

# convolutional base followed by classifier
model = tf.keras.models.Sequential()
model.add(conv_base)

# flatten the base's feature maps into a vector for the dense layers
model.add(tf.keras.layers.Flatten())

# relu introduces nonlinearity
# softmax produces classification probabilities
model.add(tf.keras.layers.Dense(512, activation='relu'))
model.add(tf.keras.layers.Dense(10, activation='softmax'))

# Compiling a Model
When a model is compiled, it is given an optimizer and a learning rate, which it uses to minimize the loss during training.

## Optimizers 
Optimizers are algorithms that carry out gradient descent. They change the weights in order to obtain faster and more accurate training results.
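
In Keras, the optimizer is chosen when the model is compiled. The sketch below assumes a small made-up classifier (4 input features, 3 classes); the layer sizes, the loss, and the learning rate of 0.01 are illustrative choices, not values from this class.

```python
import tensorflow as tf

# Hypothetical small classifier: 4 input features, 3 output classes
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(3, activation="softmax"),
])

# Compiling attaches the optimizer, the loss to minimize, and the metrics to report
model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
```

A model has to be compiled like this before `model.fit` can be called on it.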

### Stochastic Gradient Descent
Stochastic gradient descent (SGD) estimates the gradient from a single example or a small batch of the training data instead of the whole data set, which makes each update fast. It takes the derivatives of the loss function and multiplies them by the learning rate to step the parameters toward the desired values.

### Learning Rate
The learning rate is the number that the gradient is multiplied by to find the next parameters. It sets the size of the steps taken toward the minimum loss. A learning rate has to be tuned for each model before it is optimal, and is typically a value between 0.0001 and 0.01.
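
To make the role of the batch and the learning rate concrete, here is a sketch of mini-batch SGD fitting a line to made-up data. The data, batch size, epoch count, and learning rate of 0.1 are assumptions picked so that this small problem converges quickly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data generated from y = 2x + 1 (no noise)
x = np.linspace(-1.0, 1.0, 64)
y = 2.0 * x + 1.0

w, b = 0.0, 0.0
lr = 0.1          # learning rate: size of each step
batch_size = 8

for epoch in range(200):
    # Visit the data in a new random order each epoch
    order = rng.permutation(len(x))
    for start in range(0, len(x), batch_size):
        idx = order[start:start + batch_size]
        err = (w * x[idx] + b) - y[idx]
        # Gradients of the mean squared error on this batch only
        grad_w = 2.0 * np.mean(err * x[idx])
        grad_b = 2.0 * np.mean(err)
        # Step against the gradient, scaled by the learning rate
        w -= lr * grad_w
        b -= lr * grad_b

print(w, b)
```

After training, `w` and `b` should be close to the 2 and 1 used to generate the data.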

# Training a Model
## Overfitting
Overfitting is a situation where the model becomes so familiar with the training data that it has trouble processing new data.
This results in low loss for the training data and high loss for new test data. It happens when a model is overcomplicated and does not generalize.

## Underfitting
Underfitting is when a model can't capture the patterns in the training data, which produces low accuracy and high loss. Underfitting is typically caused by the opposite of overfitting: the model isn't complex enough. Adding extra layers and neurons is a possible solution to this problem.
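
The contrast between the two can be sketched with polynomial fits instead of neural networks, which is a stand-in technique and not something from this class. A straight line is too simple for the made-up sine data below, while a degree-9 polynomial has enough freedom to pass through every noisy training point.

```python
import numpy as np
from numpy.polynomial import Polynomial

def true_fn(x):
    return np.sin(2 * np.pi * x)

# Hypothetical training set: 10 points with alternating +/-0.2 "noise"
x_train = np.linspace(0.0, 1.0, 10)
y_train = true_fn(x_train) + 0.2 * np.array([1, -1] * 5)

# Held-out points halfway between the training points, without noise
x_test = (x_train[:-1] + x_train[1:]) / 2
y_test = true_fn(x_test)

def mse(model, x, y):
    return float(np.mean((model(x) - y) ** 2))

underfit = Polynomial.fit(x_train, y_train, deg=1)   # too simple
overfit = Polynomial.fit(x_train, y_train, deg=9)    # fits the noise exactly

for name, model in [("degree 1", underfit), ("degree 9", overfit)]:
    print(name,
          "train loss:", mse(model, x_train, y_train),
          "test loss:", mse(model, x_test, y_test))
```

The degree-1 fit has a high loss even on the training data, while the degree-9 fit has almost no training loss but a larger loss on the held-out points.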



In [0]:
history = model.fit(trainImages, 
                      trainLabels, 
                      epochs=100, 
                      batch_size=128, 
                      validation_data=(testImages, testLabels))

# Finetuning a Pretrained Model

Finetuning is the process of taking a model that has already been trained for one task and training it further to work with a different data set. This is much easier than building a model from scratch. It can be implemented by adding layers to, and removing layers from, the old model.

## Convolutional Bases
One way to finetune a model is to load in a pretrained convolutional base. This allows us to take a massive model and use it for the particular case that we want to study. For example, we could take a model that Google has trained on animals and use it on cats with hats.

## Freezing Layers
Freezing layers keeps their weights from being updated while we train. This is useful if a model has achieved a nice accuracy, such as 97%, and we do not want to accidentally change what it has learned.
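
In Keras, a layer or a whole nested model is frozen by setting its `trainable` attribute to `False`. The sketch below uses a tiny made-up `base` as a stand-in for a pretrained convolutional base, so the layer sizes are assumptions; a real base would be loaded with pretrained weights.

```python
import tensorflow as tf

# Hypothetical stand-in for a pretrained base (two small dense layers)
base = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(16, activation="relu"),
])

# Freeze the base so training will not update its weights
base.trainable = False

# New classifier stacked on top of the frozen base
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    base,
    tf.keras.layers.Dense(3, activation="softmax"),
])

n_trainable = len(model.trainable_weights)
n_frozen = len(model.non_trainable_weights)
print("trainable:", n_trainable, "frozen:", n_frozen)
```

Only the kernel and bias of the new classifier layer remain trainable; the four weight arrays in the base are left untouched during training.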

