In [1]:
# Paul-Jason Mello
# Professor Shim
# CMPE 257
# May 5th, 2022

# Convolutional Neural Networks

In [2]:
# !pip install tensorflow-datasets

In [3]:
# importing required libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
import tensorflow.keras
import tensorflow_datasets as tfds

from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.preprocessing import image
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D

In [4]:
print(tensorflow.__version__)

2.3.0


## 1. What are Convolutional Neural Networks

In [5]:
# Convolutional Neural Networks (CNNs) are neural networks designed for grid-structured inputs such as images. A CNN
# learns small filters (kernels) that slide over local pixel neighborhoods, moving by a fixed stride. Each filter
# produces a feature map that condenses the image and highlights meaningful patterns. CNNs are usually trained as
# supervised learning models, and their versatility and accuracy make them commonly used in the field.
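The sliding-filter idea described above can be sketched in a few lines of plain Python. This is only an illustration, not how Keras implements `Conv2D`: the helper name `convolve2d`, the toy image, and the hand-written edge filter are all assumptions made for the example (a real CNN learns its filter values). Like most deep learning libraries, the sketch computes a cross-correlation, meaning the filter is not flipped.

```python
def convolve2d(image, kernel, stride=1):
    """Slide `kernel` over `image` (both lists of rows) without padding."""
    n_rows, n_cols = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    feature_map = []
    for top in range(0, n_rows - k_rows + 1, stride):
        out_row = []
        for left in range(0, n_cols - k_cols + 1, stride):
            # Multiply the window by the kernel element-wise and sum the products.
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[top + i][left + j] * kernel[i][j]
            out_row.append(total)
        feature_map.append(out_row)
    return feature_map


# Hypothetical 4x4 image with a vertical edge between the 2nd and 3rd columns.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-picked 2x2 filter that responds to a dark-to-bright step from left to right.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = convolve2d(image, kernel, stride=1)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The 4x4 input is condensed to a 3x3 feature map whose middle column marks where the edge sits. Calling the same function with `stride=2` visits only every other position and returns a 2x2 map.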

## 2. Why CNNs were introduced when Fully connected ANNs were already there

In [6]:
# Fully connected ANNs work well on flat feature vectors, while CNNs are suited to image data. A fully connected layer
# flattens the image, so it loses the spatial arrangement of the pixels and needs a separate weight for every
# pixel-to-neuron connection. A CNN instead reuses the same small filter at every position, which preserves the spatial
# relation between neighboring pixels and requires far fewer parameters. These spatial features can be highly complex,
# and the convolutional architecture is built to extract them.
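One way to make the comparison concrete is to count trainable parameters. The sketch below assumes a 256 x 256 RGB input and 128 output units or filters, matching the first layer of the model built later in this notebook. The helper names `dense_params` and `conv2d_params` are hypothetical and only apply the standard weight-plus-bias counting rule.

```python
def dense_params(n_inputs, n_units):
    """One weight per input-to-unit connection, plus one bias per unit."""
    return n_inputs * n_units + n_units


def conv2d_params(kernel_h, kernel_w, in_channels, n_filters):
    """One small kernel (plus one bias) per filter, shared across all positions."""
    return (kernel_h * kernel_w * in_channels + 1) * n_filters


# Fully connected layer applied to a flattened 256 x 256 x 3 image.
dense = dense_params(256 * 256 * 3, 128)
# 3x3 convolution with 128 filters applied to the same image.
conv = conv2d_params(3, 3, 3, 128)

print(dense)  # 25165952
print(conv)   # 3584
```

The convolutional count agrees with the 3584 parameters that `model.summary()` reports for the first `Conv2D` layer in section 7.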

## 3. What is meant by the following terms: convolutional layer, pooling layer, padding, stride

In [7]:
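# Convolutional Layer
# 
# A convolutional layer slides a set of learned N x M filters over the image. At each position the filter is multiplied
# element-wise with the pixels under it and the products are summed, producing one value of the output feature map.
# Without padding the feature map is slightly smaller than the input. Stacking these layers helps the network extract
# complex data relationships.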

# Pooling Layer
# 
# A pooling layer slides a window over a feature map to compress it while keeping the most useful information. We
# select the type of pooling we want to apply, such as max or average pooling, and then calculate the max or average of
# the values at each window position. This downsamples the feature map, directly reducing the dimensions of the data,
# while keeping important general properties.

# Padding
# 
# Padding adds a border of extra values around the edges of an image. Zero-padding surrounds the image with a frame of
# 0's so that the output does not shrink each time a small convolutional filter is applied. The frame also acts as a
# buffer that reduces the loss of information at the image borders.

# Stride
# 
# The stride is the number of pixels, horizontally and vertically, that the filter is offset by between positions as it
# traverses the image. It may move one pixel to the right at a time, or it may skip ahead by the entire filter size. The
# right choice depends on the purpose of the pass through the image. In max pooling we usually want a stride equal to
# the window size, so that a 2x2 window reduces each block of 4 values to 1.
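The pooling and padding terms above can be illustrated with a small pure-Python sketch. The helpers `zero_pad` and `max_pool` and the 4x4 example values are made up for this illustration; they are not Keras functions. The pooling stride defaults to the window size, which is also the default behavior of Keras `MaxPooling2D`.

```python
def zero_pad(image, pad):
    """Surround `image` (a list of rows) with a frame of zeros `pad` pixels wide."""
    width = len(image[0]) + 2 * pad
    padded = [[0] * width for _ in range(pad)]
    for row in image:
        padded.append([0] * pad + list(row) + [0] * pad)
    padded.extend([0] * width for _ in range(pad))
    return padded


def max_pool(image, size, stride=None):
    """Keep the largest value in each size x size window."""
    stride = size if stride is None else stride  # default: windows do not overlap
    n_rows, n_cols = len(image), len(image[0])
    return [
        [
            max(image[top + i][left + j] for i in range(size) for j in range(size))
            for left in range(0, n_cols - size + 1, stride)
        ]
        for top in range(0, n_rows - size + 1, stride)
    ]


image = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [3, 6, 2, 4],
]

pooled = max_pool(image, size=2)
print(pooled)  # [[4, 5], [6, 7]]

padded = zero_pad(image, pad=1)
print(len(padded), len(padded[0]))  # 6 6
```

Each 2x2 block of four values collapses to its maximum, so the 4x4 input becomes 2x2. Padding by one pixel grows the same input to 6x6, which is what lets a 3x3 filter with stride 1 return a 4x4 output.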

## 4. What would be the size of the output if input is n^2, filter is f^2 and stride is of s 

In [8]:
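# Input:  N x N matrix (N^2 values)
# Filter: F x F matrix (F^2 values)
# Stride: s, with no padding
# 
# Output size per side = floor((N - F) / s) + 1
#
# Example with N = 32, F = 3, s = 1:
# floor((32 - 3) / 1) + 1 = 30, so each filter produces a 30 x 30 output.
# The depth of the output equals the number of filters in the layer.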
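The formula is easy to check in code. The sketch below wraps it in a hypothetical helper, `valid_output_size`, using floor division for the case where the stride does not divide evenly (an assumption added here; the formula above does not say how to round). A second helper, `same_output_size`, covers `padding = 'same'`, where Keras produces ceil(N / s) outputs per side. Together they reproduce the shapes that `model.summary()` reports for the network in section 7.

```python
import math


def valid_output_size(n, f, s=1):
    """Output width for an n-wide input, f-wide filter, stride s, no padding."""
    return (n - f) // s + 1


def same_output_size(n, s):
    """Output width with 'same' padding: every stride position yields one output."""
    return math.ceil(n / s)


print(valid_output_size(32, 3, 1))  # 30

# Trace the spatial size through the section 7 model, starting from 256 x 256.
size = 256
size = valid_output_size(size, 3)   # Conv2D 3x3, stride 1          -> 254
size = same_output_size(size, 4)    # MaxPooling2D 4x4, 'same'      -> 64
size = valid_output_size(size, 3)   # Conv2D 3x3, stride 1          -> 62
size = same_output_size(size, 5)    # MaxPooling2D 5x5, 'same'      -> 13
size = valid_output_size(size, 3)   # Conv2D 3x3, stride 1          -> 11
print(size, size * size * 32)       # 11 3872
```

The final 11 x 11 x 32 volume flattens to 3872 values, matching the `flatten` row of the summary.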

## 5. What are pre-trained models and what do you mean by transfer learning

In [9]:
# Pre-trained models are models that have already been trained, usually on a large dataset, and have learned general
# features for a defined goal. They can be used as a starting point to build on instead of training from scratch.
# Using another model as the starting point in this way is known as transfer learning: we take the patterns and
# generalizations learned by one model and apply them to a new model. Transfer learning is desirable because pre-trained
# models are often trained on large datasets and provide a very strong foundation to build on.

## 6. Discuss CPU vs GPU vs TPU

In [10]:
# CPU
# The central processing unit (CPU) is the general-purpose logic center of the computer and is comprised of the control
# unit and the arithmetic logic unit. It has a small number of fast cores suited to sequential work.

# GPU
# The graphics processing unit (GPU) was built to render graphical interfaces efficiently and has many simpler cores
# that work in parallel. The same parallelism makes it efficient for the matrix operations used in neural networks.

# TPU
# The tensor processing unit (TPU) is specifically designed for deep learning using tensors. This is a very specific
# device for a very specific task. However, the benefits are immense, as the architecture is very efficient for
# TensorFlow work.

In [11]:
# Each of these devices plays a specific role to complete a specific task. The TPU is notable because of its tensor
# capabilities, which are exceptionally useful for neural networks. While the GPU and CPU are fast, being able to
# compute linear algebra directly in dedicated hardware vastly speeds up the process of learning.

## 7. Perform CNN classification on citrus leaves dataset from tensorflow 
##     (try to achieve minimum 90% accuracy and above on the test set)
##     Can be found using the link: https://www.tensorflow.org/datasets/catalog/citrus_leaves

In [12]:
# Rescale pixel values to [0, 1] and hold out 20% of the images for validation
train = ImageDataGenerator(validation_split = 0.2, rescale = 1/255.0)
# Both generators read the same directory; `subset` selects the training or validation split
trainingData = train.flow_from_directory("/Users/GIGA/CMPE 257/HW 11/citrus", 
                                         target_size = (256, 256), batch_size = 32, 
                                         class_mode = "binary", subset = "training")
validationData = train.flow_from_directory("/Users/GIGA/CMPE 257/HW 11/citrus", 
                                           target_size = (256, 256), batch_size = 32,
                                           class_mode = "binary", subset = "validation")

Found 608 images belonging to 2 classes.
Found 151 images belonging to 2 classes.


In [13]:
model = tf.keras.Sequential([
    layers.Conv2D(128, (3, 3), activation = "linear", input_shape = (256, 256, 3)),
    layers.MaxPooling2D(pool_size = (4, 4), padding = 'same'),
    layers.Conv2D(32, (3, 3), activation = "relu"),
    layers.MaxPooling2D(pool_size = (5, 5), padding = 'same'),
    layers.Conv2D(32, (3, 3), activation = "relu"),
    layers.Flatten(),
    layers.Dense(5, activation = "softmax")
])
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              (None, 254, 254, 128)     3584      
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 64, 64, 128)       0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 62, 62, 32)        36896     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 13, 13, 32)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 11, 11, 32)        9248      
_________________________________________________________________
flatten (Flatten)            (None, 3872)              0         
_________________________________________________________________
dense (Dense)                (None, 5)                 1

In [14]:
model.compile(optimizer = "Adamax", loss = "sparse_categorical_crossentropy", metrics = ["accuracy"])

In [15]:
model.fit(trainingData, steps_per_epoch = trainingData.samples // 32, validation_data = validationData,  
          validation_steps = validationData.samples // 32, epochs = 5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<tensorflow.python.keras.callbacks.History at 0x1ae6fc24c10>
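The `steps_per_epoch` and `validation_steps` arguments in the `fit` call are plain integer arithmetic on the image counts reported by the two generators (608 training and 151 validation images, batch size 32). A minimal sketch of that arithmetic, with variable names chosen here for illustration:

```python
import math

batch_size = 32
train_samples, val_samples = 608, 151  # counts printed by flow_from_directory

# Floor division counts only the full batches.
steps_per_epoch = train_samples // batch_size
validation_steps = val_samples // batch_size

# Number of batches needed to see every image, counting a final partial batch.
train_batches_needed = math.ceil(train_samples / batch_size)
val_batches_needed = math.ceil(val_samples / batch_size)

print(steps_per_epoch, train_batches_needed)  # 19 19
print(validation_steps, val_batches_needed)   # 4 5
```

The training set divides evenly into 19 batches. The validation set does not: 4 steps draw one batch fewer than the 5 needed to cover all 151 images, so using the ceiling for `validation_steps` is one option if every validation image should be evaluated on each pass.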

## 8. Plot the model architecture and explain how did you decide number of layers, filter size and other hyper parameters

In [16]:
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              (None, 254, 254, 128)     3584      
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 64, 64, 128)       0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 62, 62, 32)        36896     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 13, 13, 32)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 11, 11, 32)        9248      
_________________________________________________________________
flatten (Flatten)            (None, 3872)              0         
_________________________________________________________________
dense (Dense)                (None, 5)                 1

In [17]:
# In this model, my aim was to continually condense the image through a rapid series of max pooling layers. I kept the
# layer count low because initial testing demonstrated I didn't need many layers. The filter sizes were arbitrarily
# chosen to be 3x3. Finally, I decided to use ReLU activations in the later convolutional layers because ReLU is a
# common default for image models. Overall I was able to achieve 100% accuracy in only two epochs.

## 9. Increase the accuracy of the model in the demo file.

In [18]:
# Done, accuracy increased from 91.9% to 92.3%.