<a href="https://colab.research.google.com/github/DavidSenseman/BIO1173/blob/master/Class_06_2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

---------------------------
**COPYRIGHT NOTICE:** This Jupyterlab Notebook is a Derivative work of [Jeff Heaton](https://github.com/jeffheaton) licensed under the Apache License, Version 2.0 (the "License"); You may not use this file except in compliance with the License. You may obtain a copy of the License at

> [http://www.apache.org/licenses/LICENSE-2.0](http://www.apache.org/licenses/LICENSE-2.0)

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

------------------------

# **BIO 1173: Intro Computational Biology**

**Module 6: Convolutional Neural Networks (CNN) for Computer Vision**

* Instructor: [David Senseman](mailto:David.Senseman@utsa.edu), [Department of Integrative Biology](https://sciences.utsa.edu/integrative-biology/), [UTSA](https://www.utsa.edu/)

### Module 6 Material

* Part 6.1: Image Processing in Python
* **Part 6.2: Using Convolutional Neural Networks** 
* Part 6.3: Using Pretrained Neural Networks with Keras 
* Part 6.4: Looking at Keras Generators and Image Au


### Google CoLab Instructions

The following code ensures that Google CoLab is running the correct version of TensorFlow.
  Running the following code will map your GDrive to ```/content/drive```.

In [1]:
try:
    from google.colab import drive
    drive.mount('/content/drive', force_remount=True)
    COLAB = True
    print("Note: using Google CoLab")
    %tensorflow_version 2.x
except:
    print("Note: not using Google CoLab")
    COLAB = False

Note: not using Google CoLab


### Lesson Setup

Run the next code cell to load necessary packages

In [2]:
# You MUST run this code cell first
import tensorflow as tf
import pandas as pd
import os
import numpy as np
import pandas as pd

import os
import shutil
path = '/'
memory = shutil.disk_usage(path)
LESSON_DIRECTORY = os.getcwd()
print("Your LESSON_DIRECTORY is: " + LESSON_DIRECTORY)
print("Disk", memory)
print("Tensorflow version =", (tf.__version__))
print("Available GPU acceleration =", tf.test.gpu_device_name())

2024-04-09 12:04:59.707434: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.1


Your LESSON_DIRECTORY is: /home/david/BIO1173/Classes/Class_06_2
Disk usage(total=982820896768, used=63912402944, free=868908408832)
Tensorflow version = 2.4.1
Available GPU acceleration = /device:GPU:0


2024-04-09 12:05:00.704750: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-04-09 12:05:00.709207: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2024-04-09 12:05:00.710039: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2024-04-09 12:05:00.891263: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2024-04-09 12:05:00.891552: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: 
pciBusID: 0000:01:00.0 na

In [3]:
# Detect Windows machine

WINDOWS = False
if os.name == "nt":
    WINDOWS = True
    !pip install patool
    print("\nNote: Jupyterlab is running on a WINDOWS computer")
else:
    print("\n Note: Jupyterlab is not running on a WINDOWS computer")


 Note: Jupyterlab is not running on a WINDOWS computer


### Create a ./temp folder in BIO1173

Neural network requires a lot of file storage space. If you are working on Google COLAB, this shouldn't be too much of an issue. On the other hand, if you are working on your laptop computer, and your hard drive is nearly full, this could be a problem. As part of this lesson, a new folder will be created in your course folder, `BIO1173`, called `./temp`. This folder will be used to store large image datasets for the remainder of this course. Temporary folders, like `./temp` are ofter used in computer programming to store **_temporary files_**. In general, you should be able to delete **_all_** the folder(s) and files(s) in folder like `./temp` without causing any problems. 

As you can see below, how WINDOWS handles files, folders and directories, is different from MacOS and Linux. Therefore, you will see different code depending upon what kind of computer you are running Jupyterlab or if you are using Google's COLAB.

In [4]:
# System commands to create a temporary folder called /temp

# Change to LESSON_DIRECTORY
os.chdir(LESSON_DIRECTORY)

if COLAB:
    print("Note: Using COLAB, no \\temp folder is needed")
elif WINDOWS:
    os.chdir("../")
    BASE_DIR = os.getcwd()
    #new_dirpath = os.getcwd()
    print("Your BASE_DIR directory is : " + BASE_DIR)
    try:
        os.mkdir("./temp")
        print("Note: making ./temp folder")
    except:
        print("Note: ./temp folder is already present")
    # Change back to LESSON_DIRECTORY
    os.chdir(LESSON_DIRECTORY)
else:
    os.chdir("../")
    BASE_DIR = os.getcwd()
    #new_dirpath = os.getcwd()
    print("Your BASE_DIR directory is : " + BASE_DIR)
    try:
        os.mkdir("./temp")
        print("Note: making ./temp folder")
    except:
        print("Note: ./temp folder is already present")
    # Change back to LESSON_DIRECTORY
    os.chdir(LESSON_DIRECTORY)
    
    
print("Your current working directory is : " + os.getcwd())

Your BASE_DIR directory is : /home/david/BIO1173/Classes
Note: ./temp folder is already present
Your current working directory is : /home/david/BIO1173/Classes/Class_06_2


In [5]:
# Simple function to print out elasped time
def hms_string(sec_elapsed):
    h = int(sec_elapsed / (60 * 60))
    m = int((sec_elapsed % (60 * 60)) / 60)
    s = sec_elapsed % 60
    return "{}:{:>02}:{:>05.2f}".format(h, m, s)

# Part 6.2: Keras Neural Networks for Digits and Fashion MNIST

This module will focus on computer vision. There are some important differences and similarities with previous neural networks.

* We will usually use classification, though regression is still an option.
* The input to the neural network is now 3D (height, width, color)
* Data are not transformed; no z-scores or dummy variables.
* Processing time is much longer.
* We now have different layer times: dense layers (just like before), convolution layers, and max-pooling layers.
* Data will no longer arrive as CSV files. TensorFlow provides some utilities for going directly from the image to the input for a neural network.


## Common Computer Vision Data Sets

There are many data sets for computer vision. Two of the most popular classic datasets are the MNIST digits data set and the CIFAR image data sets. We will not use either of these datasets in this course, but it is important to be familiar with them since neural network texts often refer to them.

The [MNIST Digits Data Set](http://yann.lecun.com/exdb/mnist/) is very popular in the neural network research community. You can see a sample of it in Figure 6.MNIST.

**Figure 6.MNIST: MNIST Data Set**
![MNIST Data Set](https://biologicslab.co/BIO1173/images/class_8_mnist.png "MNIST Data Set")

[Fashion-MNIST](https://www.kaggle.com/zalando-research/fashionmnist) is a dataset of [Zalando](https://jobs.zalando.com/tech/) 's article images—consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image associated with a label from 10 classes. Fashion-MNIST is a direct drop-in replacement for the original [MNIST dataset](http://yann.lecun.com/exdb/mnist/) for benchmarking machine learning algorithms. It shares the same image size and structure of training and testing splits. You can see this data in Figure 6.MNIST-FASHION.

**Figure 6.MNIST-FASHION: MNIST Fashon Data Set**
![mnist-fashion](https://biologicslab.co/BIO1173/images/mnist-fashion.png "mnist-fashion")

The [CIFAR-10 and CIFAR-100](https://www.cs.toronto.edu/~kriz/cifar.html) datasets are also frequently used by the neural network research community.

**Figure 6.CIFAR: CIFAR Data Set**
![CIFAR Data Set](https://biologicslab.co/BIO1173/images/class_8_cifar.png "CIFAR Data Set")

The CIFAR-10 data set contains low-rez images that are divided into 10 classes. The CIFAR-100 data set contains 100 classes in a hierarchy. 

## Convolutional Neural Networks (CNNs)

The convolutional neural network (CNN) is a neural network technology that has profoundly impacted the area of computer vision (CV). Fukushima  (1980) [[Cite:fukushima1980neocognitron]](https://www.rctn.org/bruno/public/papers/Fukushima1980.pdf) introduced the original concept of a convolutional neural network, and   LeCun, Bottou, Bengio & Haffner (1998) [[Cite:lecun1995convolutional]](http://yann.lecun.com/exdb/publis/pdf/lecun-bengio-95a.pdf) greatly improved this work. From this research, Yan LeCun introduced the famous LeNet-5 neural network architecture. This chapter follows the LeNet-5 style of convolutional neural network.  
Although computer vision primarily uses CNNs, this technology has some applications outside of the field. You need to realize that if you want to utilize CNNs on non-visual data, you must find a way to encode your data to mimic the properties of visual data.  

The order of the input array elements is crucial to the training. In contrast, most neural networks that are not CNNs treat their input data as a long vector of values, and the order in which you arrange the incoming features in this vector is irrelevant. You cannot change the order for these types of neural networks after you have trained the network. 

The CNN network arranges the inputs into a grid. This arrangement worked well with images because the pixels in closer proximity to each other are important to each other. The order of pixels in an image is significant. The human body is a relevant example of this type of order. For the design of the face, we are accustomed to eyes being near to each other. 

This advance in CNNs is due to years of research on biological eyes. In other words, CNNs utilize overlapping fields of input to simulate features of biological eyes. Until this breakthrough, AI had been unable to reproduce the capabilities of biological vision.
Scale, rotation, and noise have presented challenges for AI computer vision research. You can observe the complexity of biological eyes in the example that follows. A friend raises a sheet of paper with a large number written on it. As your friend moves nearer to you, the number is still identifiable. In the same way, you can still identify the number when your friend rotates the paper. Lastly, your friend creates noise by drawing lines on the page, but you can still identify the number. As you can see, these examples demonstrate the high function of the biological eye and allow you to understand better the research breakthrough of CNNs. That is, this neural network can process scale, rotation, and noise in the field of computer vision. You can see this network structure in Figure 6.LENET.

**Figure 6.LENET: A LeNET-5 Network (LeCun, 1998)**
![A LeNET-5 Network](https://biologicslab.co/BIO1173/images/class_8_lenet5.png "A LeNET-5 Network")

So far, we have only seen one layer type (dense layers). By the end of this book we will have seen:

* **Dense Layers** - Fully connected layers.  
* **Convolution Layers** - Used to scan across images. 
* **Max Pooling Layers** - Used to downsample images. 
* **Dropout Layers** - Used to add regularization. 
* **LSTM and Transformer Layers** - Used for time series data.

## Convolution Layers

The first layer that we will examine is the convolutional layer. We will begin by looking at the hyper-parameters that you must specify for a convolutional layer in most neural network frameworks that support the CNN:

* Number of filters
* Filter Size
* Stride
* Padding
* Activation Function/Non-Linearity

The primary purpose of a convolutional layer is to detect features such as edges, lines, blobs of color, and other visual elements. The filters can detect these features. The more filters we give to a convolutional layer, the more features it can see.

A filter is a square-shaped object that scans over the image. A grid can represent the individual pixels of a grid. You can think of the convolutional layer as a smaller grid that sweeps left to right over each image row. There is also a hyperparameter that specifies both the width and height of the square-shaped filter. The following figure shows this configuration in which you see the six convolutional filters sweeping over the image grid:

A convolutional layer has weights between it and the previous layer or image grid. Each pixel on each convolutional layer is a weight. Therefore, the number of weights between a convolutional layer and its predecessor layer or image field is the following:

```
[FilterSize] * [FilterSize] * [# of Filters]
```

For example, if the filter size were 5 (5x5) for 10 filters, there would be 250 weights.

You need to understand how the convolutional filters sweep across the previous layer's output or image grid. Figure 6.CNN illustrates the sweep:

**Figure 6.CNN: Convolutional Neural Network**
![Convolutional Neural Network](https://biologicslab.co/BIO1173/images/class_8_cnn_grid.png "Convolutional Neural Network")

The above figure shows a convolutional filter with 4 and a padding size of 1. The padding size is responsible for the border of zeros in the area that the filter sweeps. Even though the image is 8x7, the extra padding provides a virtual image size of 9x8 for the filter to sweep across. The stride specifies the number of positions the convolutional filters will stop. The convolutional filters move to the right, advancing by the number of cells specified in the stride. Once you reach the far right, the convolutional filter moves back to the far left; then, it moves down by the stride amount and
continues to the right again.

Some constraints exist concerning the size of the stride. The stride cannot be 0. The convolutional filter would never move if you set the stride. Furthermore, neither the stride nor the convolutional filter size can be larger than the previous grid. There are additional constraints on the stride (*s*), padding (*p*), and the filter width (*f*) for an image of width (*w*). Specifically, the convolutional filter must be able to start at the far left or top border, move a certain number of strides, and land on the far right or bottom border. The following equation shows the number of steps a convolutional operator
must take to cross the image:

$$ steps = \frac{w - f + 2p}{s}+1 $$

The number of steps must be an integer. In other words, it cannot have decimal places. The purpose of the padding (*p*) is to be adjusted to make this equation become an integer value.

## Max Pooling Layers

Max-pool layers downsample a 3D box to a new one with smaller dimensions. Typically, you can always place a max-pool layer immediately following the convolutional layer. The LENET shows the max-pool layer immediately after layers C1 and C3. These max-pool layers progressively decrease the size of the dimensions of the 3D boxes passing through them. This technique can avoid overfitting (Krizhevsky, Sutskever & Hinton, 2012).

A pooling layer has the following hyper-parameters:

* Spatial Extent (*f*)
* Stride (*s*)

Unlike convolutional layers, max-pool layers do not use padding. Additionally, max-pool layers have no weights, so training does not affect them. These layers downsample their 3D box input. The 3D box output by a max-pool layer will have a width equal to this equation:

$$ w_2 = \frac{w_1 - f}{s} + 1 $$

The height of the 3D box produced by the max-pool layer is calculated similarly with this equation:

$$ h_2 = \frac{h_1 - f}{s} + 1 $$

The depth of the 3D box produced by the max-pool layer is equal to the depth the 3D box received as input. The most common setting for the hyper-parameters of a max-pool layer is f=2 and s=2. The spatial extent (f) specifies that boxes of 2x2 will be scaled down to single pixels. Of these four pixels, the pixel with the maximum value will represent the 2x2 pixel in the new grid. Because squares of size 4 are replaced with size 1, 75% of the pixel information is lost. The following figure shows this transformation as a 6x6 grid becomes a 3x3:

**Figure 6.MAXPOOL: Max Pooling Layer**
![Max Pooling Layer](https://biologicslab.co/BIO1173/images/class_8_conv_maxpool.png "Max Pooling Layer")

Of course, the above diagram shows each pixel as a single number. A grayscale image would have this characteristic. We usually take the average of the three numbers for an RGB image to determine which pixel has the maximum value.

## Regression Convolutional Neural Networks

We will now look at two examples, one for regression and another for classification. For supervised computer vision, your dataset will need some labels. For classification, this label usually specifies what the image is a picture of. For regression, this "label" is some numeric quantity the image should produce, such as a count. We will look at two different means of providing this label.

The first example will show how to handle regression with convolution neural networks. We will provide an image and expect the neural network to count items in that image. We will use a [dataset](https://www.kaggle.com/jeffheaton/count-the-paperclips) that I created that contains a random number of paperclips. The following code will download this dataset for you.



![____](https://biologicslab.co/BIO1173/images/clips-49999.jpg)

In [6]:
# 

PATH=True

URL = "https://biologicslab.co/BIO1173/data/"
DOWNLOAD_SOURCE = URL+"paperclips.zip"
DOWNLOAD_FILE = DOWNLOAD_SOURCE[DOWNLOAD_SOURCE.rfind('/')+1:]
print("DOWNLOAD_SOURCE=",DOWNLOAD_SOURCE)
print("DOWNLOAD_FILE=",DOWNLOAD_FILE)

if COLAB:
    PATH = "/content"
    EXTRACT_TARGET = os.path.join(PATH,"clips")
    SOURCE = os.path.join(EXTRACT_TARGET, "paperclips")
    print("Note: Using COLAB")
elif WINDOWS:
    PATH=LESSON_DIRECTORY
    print("PATH=",PATH)
    EXTRACT_FOLDER_IN=BASE_DIR+"\\temp\\"
    print("EXTRACT_FOLDER_IN=",EXTRACT_FOLDER_IN)
    EXTRACT_FOLDER_OUT=EXTRACT_FOLDER_IN+"clips\\"
    print("EXTRACT_FOLDER_OUT=",EXTRACT_FOLDER_OUT)
    SOURCE = os.path.join(EXTRACT_FOLDER_IN, "paperclips.zip")
    print("Note: Using WINDOWS")
    print("SOURCE=",SOURCE)
    print("DOWNLOAD_FILE=",DOWNLOAD_FILE)
else:
    PATH=LESSON_DIRECTORY
    print("PATH=",PATH)
    EXTRACT_FOLDER_IN=BASE_DIR+"/temp/"
    print("EXTRACT_FOLDER_IN=",EXTRACT_FOLDER_IN)
    EXTRACT_FOLDER_OUT=EXTRACT_FOLDER_IN+"clips/"
    print("EXTRACT_FOLDER_OUT=",EXTRACT_FOLDER_OUT)
    SOURCE = os.path.join(EXTRACT_FOLDER_IN, "paperclips.zip")
    print("Note: Using OTHER (MacOS?)")
    print("SOURCE=",SOURCE)
    print("DOWNLOAD_FILE=",DOWNLOAD_FILE)

DOWNLOAD_SOURCE= https://biologicslab.co/BIO1173/data/paperclips.zip
DOWNLOAD_FILE= paperclips.zip
PATH= /home/david/BIO1173/Classes/Class_06_2
EXTRACT_FOLDER_IN= /home/david/BIO1173/Classes/temp/
EXTRACT_FOLDER_OUT= /home/david/BIO1173/Classes/temp/clips/
Note: Using OTHER (MacOS?)
SOURCE= /home/david/BIO1173/Classes/temp/paperclips.zip
DOWNLOAD_FILE= paperclips.zip


Next, we download the images. This part depends on the origin of your images. The following code downloads images from a URL, where a ZIP file contains the images. The code unzips the ZIP file.

In [7]:
!pip install patool



In [8]:
!pip install wget



In [12]:
# Download and extract the image data

if COLAB:
    print("PATH=",PATH, "DOWNLOAD_FILE=",DOWNLOAD_FILE, "DOWNLOAD_SOURCE=",DOWNLOAD_SOURCE)
    !wget -O {os.path.join(PATH,DOWNLOAD_FILE)} {DOWNLOAD_SOURCE}
    !mkdir -p {SOURCE}
    !mkdir -p {TARGET}
    !mkdir -p {EXTRACT_TARGET}
    !unzip -o -j -d {SOURCE} {os.path.join(PATH, DOWNLOAD_NAME)} >/dev/null

elif WINDOWS:
    import patoolib
    import wget
    if os.path.isfile(SOURCE):
        print("SOURCE already exists.", SOURCE)
    else:
        #!wget.download({DOWNLOAD_SOURCE} - {SOURCE})
        !wget -v {DOWNLOAD_SOURCE} --output-document={SOURCE}
        print("In WINDOWS")
    try:
        patoolib.extract_archive(SOURCE,outdir=EXTRACT_FOLDER_OUT)
    except:
        print("Note: File already extracted")
        
    DATA_FOLDER=EXTRACT_FOLDER_OUT+"\\paperclips\\"

else:
    import patoolib
    import wget
    DATA_FOLDER=EXTRACT_FOLDER_OUT+"/paperclips/"
    if os.path.isfile(SOURCE):
        print("SOURCE already exists.", SOURCE)
    else:
        print("EXTRACT_FOLDER_IN=",EXTRACT_FOLDER_IN,"DOWNLOAD_SOURCE=",DOWNLOAD_SOURCE)
        !wget -v {DOWNLOAD_SOURCE} --output-document={SOURCE}
        #!wget -v -O {EXTRACT_FOLDER_IN} {DOWNLOAD_SOURCE}
        
        try:
            patoolib.extract_archive(SOURCE,outdir=EXTRACT_FOLDER_OUT)
        except:
            print("Note: File already extracted")
    
    

#print("SOURCE=",SOURCE)

EXTRACT_FOLDER_IN= /home/david/BIO1173/Classes/temp/ DOWNLOAD_SOURCE= https://biologicslab.co/BIO1173/data/paperclips.zip
--2024-04-09 12:09:58--  https://biologicslab.co/BIO1173/data/paperclips.zip
Resolving biologicslab.co (biologicslab.co)... 194.163.45.209
Connecting to biologicslab.co (biologicslab.co)|194.163.45.209|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 163590691 (156M) [application/zip]
Saving to: ‘/home/david/BIO1173/Classes/temp/paperclips.zip’


2024-04-09 12:10:06 (18.4 MB/s) - ‘/home/david/BIO1173/Classes/temp/paperclips.zip’ saved [163590691/163590691]

patool: Extracting /home/david/BIO1173/Classes/temp/paperclips.zip ...
patool: ... creating output directory `/home/david/BIO1173/Classes/temp/clips/'.
patool: ... /home/david/BIO1173/Classes/temp/paperclips.zip extracted to `/home/david/BIO1173/Classes/temp/clips/'.


The labels are contained in a CSV file named **train.csv** for regression. This file has just two labels, **id** and **clip_count**. The ID specifies the filename; for example, row id 1 corresponds to the file **clips-1.jpg**. The following code loads the labels for the training set and creates a new column, named **filename**, that contains the filename of each image, based on the **id** column.

In [13]:
import pandas as pd

if COLAB:
    df = pd.read_csv(
    os.path.join(SOURCE,"train.csv"), 
    na_values=['NA', '?'])
    df['filename']="clips-"+df["id"].astype(str)+".jpg"
elif WINDOWS:
    df = pd.read_csv(
    os.path.join(DATA_FOLDER,"train.csv"), 
    na_values=['NA', '?'])
    df['filename']="clips-"+df["id"].astype(str)+".jpg"
else:
    df = pd.read_csv(
    os.path.join(DATA_FOLDER,"train.csv"), 
    na_values=['NA', '?'])
    df['filename']="clips-"+df["id"].astype(str)+".jpg"

This results in the following dataframe.

In [14]:
df

Unnamed: 0,id,clip_count,filename
0,30001,11,clips-30001.jpg
1,30002,2,clips-30002.jpg
2,30003,26,clips-30003.jpg
3,30004,41,clips-30004.jpg
4,30005,49,clips-30005.jpg
...,...,...,...
19995,49996,35,clips-49996.jpg
19996,49997,54,clips-49997.jpg
19997,49998,72,clips-49998.jpg
19998,49999,24,clips-49999.jpg


Separate into a training and validation (for early stopping)

In [15]:
TRAIN_PCT = 0.9
TRAIN_CUT = int(len(df) * TRAIN_PCT)

df_train = df[0:TRAIN_CUT]
df_validate = df[TRAIN_CUT:]

print(f"Training size: {len(df_train)}")
print(f"Validate size: {len(df_validate)}")

Training size: 18000
Validate size: 2000


If your code is correct you should see the following output:
~~~text
Training size: 18000
Validate size: 2000
~~~

We are now ready to create two ImageDataGenerator objects. We currently use a generator, which creates additional training data by manipulating the source material. This technique can produce considerably stronger neural networks. The generator below flips the images both vertically and horizontally. Keras will train the neuron network both on the original images and the flipped images. This augmentation increases the size of the training data considerably. Module 6.4 goes deeper into the transformations you can perform. You can also specify a target size to resize the images automatically.

The function **flow_from_dataframe** loads the labels from a Pandas dataframe connected to our **train.csv** file. When we demonstrate classification, we will use the **flow_from_directory**; which loads the labels from the directory structure rather than a CSV.

In [16]:
if COLAB:
    !pip install keras_preprocessing 

In [17]:
import tensorflow as tf
import keras_preprocessing
from keras_preprocessing import image
from keras_preprocessing.image import ImageDataGenerator

training_datagen = ImageDataGenerator(
  rescale = 1./255,
  horizontal_flip=True,
  vertical_flip=True,
  fill_mode='nearest')

train_generator = training_datagen.flow_from_dataframe(
        dataframe=df_train,
        directory=DATA_FOLDER,
        x_col="filename",
        y_col="clip_count",
        target_size=(256, 256),
        batch_size=32,
        class_mode='other')

validation_datagen = ImageDataGenerator(rescale = 1./255)

val_generator = validation_datagen.flow_from_dataframe(
        dataframe=df_validate,
        directory=DATA_FOLDER,
        x_col="filename",
        y_col="clip_count",
        target_size=(256, 256),
        class_mode='other')

Found 18000 validated image filenames.
Found 2000 validated image filenames.


If your code is correct you should see the following output:
~~~text
Found 18000 validated image filenames.
Found 2000 validated image filenames.
~~~

We can now train the neural network. The code to build and train the neural network is not that different than in the previous modules. We will use the Keras Sequential class to provide layers to the neural network. We now have several new layer types that we did not previously see.

* **Conv2D** - The convolution layers.
* **MaxPooling2D** - The max-pooling layers.
* **Flatten** - Flatten the 2D (and higher) tensors to allow a Dense layer to process.
* **Dense** - Dense layers, the same as demonstrated previously. Dense layers often form the final output layers of the neural network.

The training code is very similar to previously. This code is for regression, so a final linear activation is used, along with mean_squared_error for the loss function. The generator provides both the *x* and *y* matrixes we previously supplied.

In [19]:
from tensorflow.keras.callbacks import EarlyStopping
import time

# Set variables
EPOCHS=5  # Use 25 if possible
MODEL_STEPS= 250

# Build model
model = tf.keras.models.Sequential([
    # Note the input shape is the desired size of the image 150x150 
    # with 3 bytes color.
    
    # This is the first convolution
    tf.keras.layers.Conv2D(64, (3,3), activation='relu', 
        input_shape=(256, 256, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    
    # The second convolution
    tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    tf.keras.layers.Flatten(),
    
    # 512 neuron hidden layer
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(1, activation='linear')
])

# Print model summary
model.summary()

# Set model steps
epoch_steps = MODEL_STEPS # needed for 2.2
validation_steps = len(df_validate)
model.compile(loss = 'mean_squared_error', optimizer='adam')
monitor = EarlyStopping(monitor='val_loss', min_delta=1e-3, 
        patience=5, verbose=1, mode='auto',
        restore_best_weights=True)

# Record start
start_time = time.time()

# Train model
print(f"Starting training for {EPOCHS} epochs...")
history = model.fit(train_generator,  
  verbose = 1, 
  validation_data=val_generator, callbacks=[monitor], epochs=EPOCHS)

# Print elapsed time
elapsed_time = time.time() - start_time
print("Elapsed time: {}".format(hms_string(elapsed_time)))

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_2 (Conv2D)            (None, 254, 254, 64)      1792      
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 127, 127, 64)      0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 125, 125, 64)      36928     
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 62, 62, 64)        0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 246016)            0         
_________________________________________________________________
dense_2 (Dense)              (None, 512)               125960704 
_________________________________________________________________
dense_3 (Dense)              (None, 1)                

This code will run very slowly if you do not use a GPU. The above code takes approximately 13 minutes with a GPU.

## Score Regression Image Data

Scoring/predicting from a generator is a bit different than training. We do not want augmented images, and we do not wish to have the dataset shuffled. For scoring, we want a prediction for each input. We construct the generator as follows:

* shuffle=False
* batch_size=1
* class_mode=None

We use a **batch_size** of 1 to guarantee that we do not run out of GPU memory if our prediction set is large. You can increase this value for better performance. The **class_mode** is None because there is no *y*, or label. After all, we are predicting.

In [None]:
#

df_test = pd.read_csv(
    os.path.join(DATA_FOLDER,"test.csv"), 
    na_values=['NA', '?'])

df_test['filename']="clips-"+df_test["id"].astype(str)+".jpg"

test_datagen = ImageDataGenerator(rescale = 1./255)

test_generator = validation_datagen.flow_from_dataframe(
        dataframe=df_test,
        directory=DATA_FOLDER,
        x_col="filename",
        batch_size=1,
        shuffle=False,
        target_size=(256, 256),
        class_mode=None)

If your code is correct you should see the following output:
~~~text
Found 5000 validated image filenames.
~~~

We need to reset the generator to ensure we are always at the beginning.

In [None]:
test_generator.reset()
pred = model.predict(test_generator,steps=len(df_test))

We can now generate a CSV file to hold the predictions.

In [None]:
df_submit = pd.DataFrame({'id':df_test['id'],'clip_count':pred.flatten()})
df_submit.to_csv(os.path.join(PATH,"submit.csv"),index=False)

In [None]:
df_submit

## Classification Neural Networks

Just like earlier in this module, we will load data. However, this time we will use a dataset of images of three different types of the iris flower. This zip file contains three different directories that specify each image's label. The directories are named the same as the labels:

* iris-setosa
* iris-versicolour
* iris-virginica


In [21]:
# 
import os
PATH=True

URL = "https://biologicslab.co/BIO1173/data/"
DOWNLOAD_SOURCE = URL+"/iris-image.zip"
DOWNLOAD_FILE = DOWNLOAD_SOURCE[DOWNLOAD_SOURCE.rfind('/')+1:]
print("DOWNLOAD_SOURCE=",DOWNLOAD_SOURCE)
print("DOWNLOAD_FILE=",DOWNLOAD_FILE)

if COLAB:
    PATH = "/content"
    EXTRACT_TARGET = os.path.join(PATH,"iris-images")
    SOURCE = os.path.join(EXTRACT_TARGET, "iris-images")
    print("Note: Using COLAB")
elif WINDOWS:
    PATH=LESSON_DIRECTORY
    print("PATH=",PATH)
    EXTRACT_FOLDER_IN=BASE_DIR+"\\temp\\"
    print("EXTRACT_FOLDER_IN=",EXTRACT_FOLDER_IN)
    EXTRACT_FOLDER_OUT=EXTRACT_FOLDER_IN+"iris-images\\"
    print("EXTRACT_FOLDER_OUT=",EXTRACT_FOLDER_OUT)
    SOURCE = os.path.join(EXTRACT_FOLDER_IN, "iris-images.zip")
    print("Note: Using WINDOWS")
    print("SOURCE=",SOURCE)
    print("DOWNLOAD_FILE=",DOWNLOAD_FILE)
else:
    PATH=LESSON_DIRECTORY
    print("PATH=",PATH)
    EXTRACT_FOLDER_IN=BASE_DIR+"/temp/"
    print("EXTRACT_FOLDER_IN=",EXTRACT_FOLDER_IN)
    EXTRACT_FOLDER_OUT=EXTRACT_FOLDER_IN+"iris-images/"
    print("EXTRACT_FOLDER_OUT=",EXTRACT_FOLDER_OUT)
    SOURCE = os.path.join(EXTRACT_FOLDER_IN, "iris-images.zip")
    print("Note: Using OTHER (MacOS?)")
    print("SOURCE=",SOURCE)
    print("DOWNLOAD_FILE=",DOWNLOAD_FILE)

DOWNLOAD_SOURCE= https://biologicslab.co/BIO1173/data//iris-image.zip
DOWNLOAD_FILE= iris-image.zip
PATH= /home/david/BIO1173/Classes/Class_06_2
EXTRACT_FOLDER_IN= /home/david/BIO1173/Classes/temp/
EXTRACT_FOLDER_OUT= /home/david/BIO1173/Classes/temp/iris-images/
Note: Using OTHER (MacOS?)
SOURCE= /home/david/BIO1173/Classes/temp/iris-images.zip
DOWNLOAD_FILE= iris-image.zip


Just as before, we unzip the images.

In [22]:
# Download and extract the image data

if COLAB:
    print("PATH=",PATH, "DOWNLOAD_FILE=",DOWNLOAD_FILE, "DOWNLOAD_SOURCE=",DOWNLOAD_SOURCE)
    !wget -O {os.path.join(PATH,DOWNLOAD_FILE)} {DOWNLOAD_SOURCE}
    !mkdir -p {SOURCE}
    !mkdir -p {TARGET}
    !mkdir -p {EXTRACT_TARGET}
    !unzip -o -j -d {SOURCE} {os.path.join(PATH, DOWNLOAD_NAME)} >/dev/null

elif WINDOWS:
    import patoolib
    import wget
    if os.path.isfile(SOURCE):
        print("SOURCE already exists.", SOURCE)
    else:
        #wget.download(DOWNLOAD_SOURCE, -o {SOURCE})
        print("In WINDOWS")
    try:
        patoolib.extract_archive(SOURCE,outdir=EXTRACT_FOLDER_OUT)
    except:
        print("Note: File already extracted")
        
    DATA_FOLDER=EXTRACT_FOLDER_OUT          # +"\\iris\\"

else:
    import patoolib
    import wget
    DATA_FOLDER=EXTRACT_FOLDER_OUT          #+"/iris/"
    if os.path.isfile(SOURCE):
        print("SOURCE already exists.", SOURCE)
    else:
        print("EXTRACT_FOLDER_IN=",EXTRACT_FOLDER_IN,"DOWNLOAD_SOURCE=",DOWNLOAD_SOURCE)
        !wget -v {DOWNLOAD_SOURCE} --output-document={SOURCE}
        try:
            patoolib.extract_archive(SOURCE,outdir=EXTRACT_FOLDER_OUT)
        except:
            print("Note: File already extracted")
    

EXTRACT_FOLDER_IN= /home/david/BIO1173/Classes/temp/ DOWNLOAD_SOURCE= https://biologicslab.co/BIO1173/data//iris-image.zip
--2024-04-09 12:20:11--  https://biologicslab.co/BIO1173/data//iris-image.zip
Resolving biologicslab.co (biologicslab.co)... 194.163.45.209
Connecting to biologicslab.co (biologicslab.co)|194.163.45.209|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 5587253 (5.3M) [application/zip]
Saving to: ‘/home/david/BIO1173/Classes/temp/iris-images.zip’


2024-04-09 12:20:12 (14.7 MB/s) - ‘/home/david/BIO1173/Classes/temp/iris-images.zip’ saved [5587253/5587253]

patool: Extracting /home/david/BIO1173/Classes/temp/iris-images.zip ...
patool: ... creating output directory `/home/david/BIO1173/Classes/temp/iris-images/'.
patool: ... /home/david/BIO1173/Classes/temp/iris-images.zip extracted to `/home/david/BIO1173/Classes/temp/iris-images/'.


You can see these folders with the following command.

We set up the generator, similar to before.  This time we use flow_from_directory to get the labels from the directory structure.

In [23]:
import tensorflow as tf
import keras_preprocessing
from keras_preprocessing import image
from keras_preprocessing.image import ImageDataGenerator

training_datagen = ImageDataGenerator(
  rescale = 1./255,
  horizontal_flip=True,
  vertical_flip=True,
  width_shift_range=[-200,200],
  rotation_range=360,

  fill_mode='nearest')

train_generator = training_datagen.flow_from_directory(
    directory=DATA_FOLDER, target_size=(256, 256), 
    class_mode='categorical', batch_size=32, shuffle=True)

validation_datagen = ImageDataGenerator(rescale = 1./255)

validation_generator = validation_datagen.flow_from_directory(
    directory=DATA_FOLDER, target_size=(256, 256), 
    class_mode='categorical', batch_size=32, shuffle=True)


Found 421 images belonging to 3 classes.
Found 421 images belonging to 3 classes.


If your code is correct you should see the following output:
~~~text
Found 421 images belonging to 3 classes.
Found 421 images belonging to 3 classes.
~~~

Training the neural network with classification is similar to regression. 

In [24]:
from tensorflow.keras.callbacks import EarlyStopping
import time

# Set variables
EPOCHS=5  # Use 50 if possible
STEPS_PER_EPOCH =10

class_count = len(train_generator.class_indices)

# Record start time
start_time = time.time()

# Build the model
model = tf.keras.models.Sequential([
    
    # Note the input shape is the desired size of the image 
    # 300x300 with 3 bytes color
    
    # This is the first convolution
    tf.keras.layers.Conv2D(16, (3,3), activation='relu', 
        input_shape=(256, 256, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    
    # The second convolution
    tf.keras.layers.Conv2D(32, (3,3), activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.MaxPooling2D(2,2),
    
    # The third convolution
    tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.MaxPooling2D(2,2),
    
    # The fourth convolution
    tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    
    # The fifth convolution
    tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    
    # Flatten the results to feed into a DNN
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dropout(0.5),
    
    # 512 neuron hidden layer
    tf.keras.layers.Dense(512, activation='relu'),
    
    # Only 1 output neuron. It will contain a value from 0-1 
    tf.keras.layers.Dense(class_count, activation='softmax')
])

# Print model summary
model.summary()

# Compile model
model.compile(loss = 'categorical_crossentropy', optimizer='adam')

# Train model
print(f"Starting training for {EPOCHS} epochs...") 
model.fit(train_generator, epochs=EPOCHS, steps_per_epoch=STEPS_PER_EPOCH, 
                    verbose = 1)

# Print elapsed time
elapsed_time = time.time() - start_time
print("Elapsed time: {}".format(hms_string(elapsed_time)))

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_4 (Conv2D)            (None, 254, 254, 16)      448       
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 127, 127, 16)      0         
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 125, 125, 32)      4640      
_________________________________________________________________
dropout (Dropout)            (None, 125, 125, 32)      0         
_________________________________________________________________
max_pooling2d_5 (MaxPooling2 (None, 62, 62, 32)        0         
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 60, 60, 64)        18496     
_________________________________________________________________
dropout_1 (Dropout)          (None, 60, 60, 64)       

The iris image dataset is not easy to predict; it turns out that a tabular dataset of measurements is more manageable.  However, we can achieve a 63%. 

In [26]:
from sklearn.metrics import accuracy_score
import numpy as np

validation_generator.reset()
pred = model.predict(validation_generator)

predict_classes = np.argmax(pred,axis=1)
expected_classes = validation_generator.classes

correct = accuracy_score(expected_classes,predict_classes)
print(f"Accuracy: {correct}")

Accuracy: 0.6389548693586699



# Other Resources

* [Imagenet:Large Scale Visual Recognition Challenge 2014](http://image-net.org/challenges/LSVRC/2014/index)
* [Andrej Karpathy](http://cs.stanford.edu/people/karpathy/) - PhD student/instructor at Stanford.
* [CS231n Convolutional Neural Networks for Visual Recognition](http://cs231n.stanford.edu/) - Stanford course on computer vision/CNN's.
* [CS231n - GitHub](http://cs231n.github.io/)
* [ConvNetJS](http://cs.stanford.edu/people/karpathy/convnetjs/) - JavaScript library for deep learning.

Now we can zip the preprocessed files and store them somewhere.