<a href="https://colab.research.google.com/github/DavidSenseman/BIO1173/blob/master/Class_06_3.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

---------------------------
**COPYRIGHT NOTICE:** This Jupyterlab Notebook is a Derivative work of [Jeff Heaton](https://github.com/jeffheaton) licensed under the Apache License, Version 2.0 (the "License"); You may not use this file except in compliance with the License. You may obtain a copy of the License at

> [http://www.apache.org/licenses/LICENSE-2.0](http://www.apache.org/licenses/LICENSE-2.0)

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

------------------------

# **BIO 1173: Intro Computational Biology**

**Module 6: Convolutional Neural Networks (CNN) for Computer Vision**

* Instructor: [David Senseman](mailto:David.Senseman@utsa.edu), [Department of Integrative Biology](https://sciences.utsa.edu/integrative-biology/), [UTSA](https://www.utsa.edu/)

### Module 6 Material

* Part 6.1: Using Convolutional Neural Networks 
* Part 6.2: Using Pretrained Neural Networks with Keras
* **Part 6.3: Looking at Keras Generators and Image Augmentation**


### Lesson Setup

Run the next code cell to load the necessary packages.

In [None]:
# You MUST run this code cell first
import os
import shutil

import numpy as np
import pandas as pd

# Report the current working directory and the disk usage of the root path
path = '/'
memory = shutil.disk_usage(path)
dirpath = os.getcwd()
print("Your current working directory is : " + dirpath)
print("Disk", memory)

### Google CoLab Instructions

The following code mounts your Google Drive, authenticates your Google account, and ensures that Google CoLab is running TensorFlow 2.x.

In [None]:
# You must run this cell second
try:
    from google.colab import drive
    drive.mount('/content/drive', force_remount=True)
    from google.colab import auth
    auth.authenticate_user()
    COLAB = True
    print("Note: using Google CoLab")
    %tensorflow_version 2.x
    import requests
    gcloud_token = !gcloud auth print-access-token
    gcloud_tokeninfo = requests.get('https://www.googleapis.com/oauth2/v3/tokeninfo?access_token=' + gcloud_token[0]).json()
    print(gcloud_tokeninfo['email'])
except:
    print("Note: not using Google CoLab")
    COLAB = False

# Part 6.3: Using Image Augmentation

The [ImageDataGenerator](https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image/ImageDataGenerator) class provides many options for image augmentation.  

**_Image augmentation_** is a technique used in machine learning to artificially increase the size of a training dataset by applying various transformations to the existing images. In Keras, the `ImageDataGenerator` class is commonly used for image augmentation. This class allows for easily applying transformations such as rotation, flipping, zooming, shifting, and more to the images in the dataset during training, helping to improve the model's performance and generalization capabilities.

Deciding which augmentations to use can impact the effectiveness of your model. In this lesson we will visualize some of these augmentations that you might use to train your neural network. Depending upon the composition of your images, some augmentations will be more useful than other ones.

### Example 1: Load Image Data

The code in the cell below uses the function `urllib.request.urlopen()` to read an image file stored on an HTTPS server. This function opens a URL and retrieves its contents, which makes it useful for tasks such as downloading files, web scraping, data collection, and accessing APIs.

In Example 1, `urllib.request.urlopen()` opens the specified URL on the course server (https://biologicslab.co) and returns a response object. The response behaves like a file opened for reading, so its data can be copied directly to a local file.

The variable `URL` must contain both the full web address, including the image name, and the additional query parameter `raw=true`, as shown in this code chunk:
~~~text
URL = "https://biologicslab.co/BIO1173/images/cambrian.jpg?raw=true"
~~~

The code assigns the image name to the variable `LOCAL_IMG_FILE1` and then reads the image data from the URL inside a `with` statement, which manages resources and ensures that they are released when they are no longer needed.

Here is the code chunk that actually reads and downloads the image data: 
~~~text
with urllib.request.urlopen(URL) as response, \
  open(LOCAL_IMG_FILE1, 'wb') as out_file:
    shutil.copyfileobj(response, out_file)
~~~
In this example, `shutil.copyfileobj` is used to copy the contents of the `response` to `out_file` by passing the file objects as arguments to the function. This function efficiently copies data in chunks without loading the entire file into memory, making it suitable for working with large files.


In this example, the `with` statement opens two resources at once: the HTTP response and the local file `LOCAL_IMG_FILE1`, which is opened for writing in binary mode (`'wb'`). When the block of code inside the `with` statement completes execution, Python automatically closes both, ensuring that the connection and the file handle are properly released. This helps in managing resources efficiently and avoids leaked file handles.
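The interplay between `with` and `shutil.copyfileobj` can be tried without a network connection. The following is a minimal sketch, assuming an in-memory byte stream as a stand-in for the HTTP response; the payload bytes and the 1 KiB chunk size are made up for illustration.

```python
import io
import shutil

# Stand-in for the downloaded data (assumption: made-up bytes, not a real image)
fake_payload = b"\xff\xd8" + b"pixel-data" * 1000

# Both streams are closed automatically when the with block ends
with io.BytesIO(fake_payload) as response, io.BytesIO() as out_file:
    # Copy in 1 KiB chunks; the third argument is the optional buffer size
    shutil.copyfileobj(response, out_file, 1024)
    copied = out_file.getvalue()

print("Bytes copied:", len(copied))
print("Streams closed:", response.closed and out_file.closed)
```

Because the copy proceeds chunk by chunk, the same pattern works for files that are much larger than the available memory.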

Finally, the function `Image()` is used to display the downloaded image. In Example 1, the image is an artist's reconstruction of the [Cambrian Explosion](https://biologicslab.co/BIO1173/data/CambrianExplosion.pdf), an "evolutionary burst" that occurred 540 million years ago. 

In [None]:
# Example 1: Load Image Data

import urllib.request
import shutil
from IPython.display import Image

# Specify the URL for the image
URL = "https://biologicslab.co/BIO1173/images/cambrian.jpg?raw=true"

# Create variable
LOCAL_IMG_FILE1 = "cambrian.jpg"

# Download image
with urllib.request.urlopen(URL) as response, \
  open(LOCAL_IMG_FILE1, 'wb') as out_file:
    shutil.copyfileobj(response, out_file)

# Display image
Image(filename=LOCAL_IMG_FILE1)

If your code is correct you should see the following image:

![__](https://biologicslab.co/BIO1173/images/cambrian.jpg).

In case you were wondering, the large creature in this image is called [Anomalocaris](https://en.wikipedia.org/wiki/Anomalocaris).

### **Exercise 1: Load Image Data**

In the cell below, write the code to download and display the image `roadrunner.jpg` from the course HTTPS server. Create a variable called `LOCAL_IMG_FILE2` to refer to this image. 

In [None]:
# Insert your code for Exercise 1 here



If your code is correct you should see the following image of the [Greater Roadrunner](https://en.wikipedia.org/wiki/Roadrunner) (genus _Geococcyx_).


![__](https://biologicslab.co/BIO1173/images/roadrunner.jpg).

Roadrunners are opportunistic feeders with a diverse diet that consists of insects, small reptiles, rodents, birds, and fruits. Insects such as grasshoppers and beetles, along with arachnids such as scorpions, are among their preferred food sources. Roadrunners are also known to consume snakes, lizards, small mammals, and occasionally seeds.


### Define Useful Functions

Next, we introduce a simple utility function called `visualize_generator()` to visualize four images sampled from any generator. Using this utility function, we can see four different results of the same image augmentation at the same time. 
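The function defined in the next cell tiles its four images into a 2x2 grid using a `reshape`/`swapaxes` idiom. The following is a minimal NumPy-only sketch of that idiom, assuming four tiny single-channel "images" whose sizes and pixel values are made up so that each tile is easy to identify.

```python
import numpy as np

# Four tiny 2x2 single-channel "images", each filled with its own index
# (assumption: illustrative shapes and values, not real image data)
height, width, channels = 2, 2, 1
images = np.stack([np.full((height, width, channels), i) for i in range(4)])

# Two images per grid row
nrows = images.shape[0] // 2

# Split the batch into (grid row, grid column), move the pixel-row axis next
# to the grid-row axis, then merge the axes into one large image
grid = (images.reshape(nrows, 2, height, width, channels)
              .swapaxes(1, 2)
              .reshape(height * nrows, width * 2, channels))

print(grid[:, :, 0])
```

Image 0 lands in the top-left tile, image 1 in the top-right, image 2 in the bottom-left, and image 3 in the bottom-right.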

In [None]:
import matplotlib.pyplot as plt
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.utils import img_to_array, load_img

def visualize_generator(img_file, gen):
    # Load the requested image and add a batch dimension
    img = load_img(img_file)
    data = img_to_array(img)
    samples = np.expand_dims(data, 0)

    # Generate four augmentations from the generator
    it = gen.flow(samples, batch_size=1)
    images = []
    for i in range(4):
        batch = next(it)
        image = batch[0].astype('uint8')
        images.append(image)

    images = np.array(images)

    # Create a 2x2 grid of the four images from the generator
    index, height, width, channels = images.shape
    nrows = index // 2

    grid = (images.reshape(nrows, 2, height, width, channels)
                  .swapaxes(1, 2)
                  .reshape(height * nrows, width * 2, channels))

    # Display the grid without axes
    plt.figure(figsize=(15., 15.))
    plt.axis('off')
    plt.imshow(grid)

### Example 2: Flip Image 

We begin by flipping the image. Some images may not make sense to flip, such as this landscape.  However, if you expect "noise" in your data where some images may be flipped, then this augmentation may be useful, even if it violates physical reality.

Note that you can control the flipping axis, either horizontal or vertical. By design the process has a random component, so if you run the next code cell several times you will get different outputs. This randomness is usually desirable when your objective is to increase the effective size of your training image dataset without having to add additional images.
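The effect of a random flip can be sketched with plain NumPy slicing. This is an illustration of the idea under stated assumptions, not the Keras implementation: the helper `random_flip` is hypothetical, the tiny array stands in for an image, and each axis is assumed to be flipped with probability 0.5.

```python
import numpy as np

def random_flip(image, rng):
    """Flip an (H, W, C) array on each axis with probability 0.5 (assumed)."""
    if rng.random() < 0.5:
        image = image[:, ::-1, :]   # horizontal flip: reverse the columns
    if rng.random() < 0.5:
        image = image[::-1, :, :]   # vertical flip: reverse the rows
    return image

# A tiny 2x2 RGB "image" with distinct values (assumption: made-up data)
image = np.arange(12).reshape(2, 2, 3)
rng = np.random.default_rng(seed=0)

# Four samples, mirroring the four panels drawn by visualize_generator()
variants = [random_flip(image, rng) for _ in range(4)]
for variant in variants:
    print(variant[:, :, 0])
```

Every sample is one of only four possible orientations, so flipping alone adds limited variety compared with the continuous augmentations in the later examples.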

In [None]:
# Example 2: Flip image

visualize_generator(
  LOCAL_IMG_FILE1,
  ImageDataGenerator(horizontal_flip=True, vertical_flip=True))

### **Exercise 2: Flip Image** 

In the cell below, write the code to flip your Roadrunner image both vertically and horizontally. 

In [None]:
# Insert your code for Exercise 2 here



### Example 3: Move image

Next, we will try moving the image horizontally with `width_shift_range`. Notice that after the shift, part of the image is missing. There are various ways to fill in the missing data, as controlled by `fill_mode`. In this case, we simply use the nearest pixel to fill.
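To see what filling with the nearest pixel means, here is a minimal NumPy sketch. It assumes a shift to the right by a whole number of pixels, and the helper `shift_right_nearest` is hypothetical; it imitates the visual effect of `fill_mode='nearest'` rather than reproducing the Keras code.

```python
import numpy as np

def shift_right_nearest(image, pixels):
    """Shift an (H, W, C) array right, filling the gap with the nearest edge column."""
    # Repeat the left edge column `pixels` times, then crop back to the original width
    padded = np.pad(image, ((0, 0), (pixels, 0), (0, 0)), mode="edge")
    return padded[:, :image.shape[1], :]

# One row of four single-channel pixels (assumption: made-up values)
row = np.array([[[10], [20], [30], [40]]])
shifted = shift_right_nearest(row, 2)

print(shifted[0, :, 0])  # [10 10 10 20]
```

The two rightmost pixels fall off the edge, and the two vacated positions on the left repeat the nearest surviving pixel value, which produces the streaked border visible in shifted images.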

In [None]:
# Example 3: Move image

visualize_generator(
    LOCAL_IMG_FILE1,
    ImageDataGenerator(width_shift_range=[-200,200], 
        fill_mode='nearest'))

### **Exercise 3: Move image**

In the cell below write the code to move your Roadrunner image.

In [None]:
# Insert your code for Exercise 3 here



### Example 4: Adjust Brightness

We can also adjust brightness. Training a Convolutional Neural Network (CNN) on the same image at different levels of brightness is a form of data augmentation. By presenting the network with variations of the same image (e.g., brighter and darker versions), the CNN learns to be invariant to changes in lighting. This improves its ability to recognize objects under the varied lighting conditions found in real-world scenarios and helps it perform better on new, unseen data during inference.
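A brightness change can be approximated as multiplying every pixel by a factor and clipping the result to the valid range. The sketch below makes that assumption explicit; the helper `scale_brightness` is hypothetical and is not the routine Keras uses internally. Under this model, a factor between 0 and 1 darkens the image and a factor above 1 brightens it.

```python
import numpy as np

def scale_brightness(image, factor):
    """Multiply pixel values by `factor` and clip to the 0-255 range."""
    scaled = image.astype(np.float32) * factor
    return np.clip(scaled, 0, 255).astype(np.uint8)

# A single RGB pixel (assumption: made-up values)
pixel = np.array([[[100, 150, 200]]], dtype=np.uint8)

darker = scale_brightness(pixel, 0.5)
brighter = scale_brightness(pixel, 1.5)

print(darker[0, 0])    # [ 50  75 100]
print(brighter[0, 0])  # [150 225 255]
```

Note the clipping in the brighter version: the third channel would be 300, so it saturates at 255 and some detail is lost.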

In [None]:
# Example 4: Adjust brightness

visualize_generator(
  LOCAL_IMG_FILE1,
  ImageDataGenerator(brightness_range=[0,1]))

# Defaults, for reference: brightness_range=None, shear_range=0.0

### **Exercise 4: Adjust Brightness**

In the cell below, write the code to display your Roadrunner image with different levels of brightness.

In [None]:
# Insert your code for Exercise 4 here



### Example 5: Shear Image

To **_shear_** an image means to apply a geometric transformation that shifts the position of pixels in the image along a specified direction, resulting in a distorted or skewed version of the original image. 

Shearing is a linear transformation that tilts the image along one axis while keeping the other axis unchanged. This transformation can be used for various purposes in image processing and computer vision tasks, such as correcting perspective distortion, creating artistic effects, or augmenting a dataset for training machine learning models.

Shearing may not be appropriate for all image types. 
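The geometry of a shear can be shown on the corners of a unit square. This is a minimal sketch of a textbook horizontal shear, assuming the shear is specified as an angle in degrees; the helper `shear_points` is hypothetical, and the exact matrix used by a given library may be parameterized differently.

```python
import numpy as np

def shear_points(points, angle_degrees):
    """Apply a horizontal shear to an (N, 2) array of (x, y) points."""
    k = np.tan(np.radians(angle_degrees))
    # x' = x + k * y, y' = y: points slide sideways in proportion to their height
    shear_matrix = np.array([[1.0, k],
                             [0.0, 1.0]])
    return points @ shear_matrix.T

# Corners of a unit square, listed counter-clockwise
corners = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
sheared = shear_points(corners, 45)

print(np.round(sheared, 3))
```

The bottom edge stays in place while the top edge slides one unit to the right, turning the square into a parallelogram with unchanged height.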

In [None]:
# Example 5: Shear Image

visualize_generator(
  LOCAL_IMG_FILE1,
  ImageDataGenerator(shear_range=30))

### **Exercise 5: Shear Image**

In the cell below, write the code to shear your Roadrunner image.

In [None]:
# Insert your code for Exercise 5 here



### Example 6: Rotate Image

It is also possible to rotate images. As with the other examples shown above, when rotated images are added to the training set, the CNN learns to tolerate changes in orientation, thereby improving its ability to recognize objects in real-world scenarios. This practice helps the CNN become more robust and perform better when presented with new, unseen data during inference.
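A rotation moves every point around the origin by the same angle. The sketch below applies the standard 2D rotation matrix to a few points, with the angle drawn at random from -30 to 30 degrees to mirror a rotation range of 30. The helper `rotate_points` and the sample points are assumptions made for illustration.

```python
import numpy as np

def rotate_points(points, angle_degrees):
    """Rotate an (N, 2) array of (x, y) points counter-clockwise about the origin."""
    theta = np.radians(angle_degrees)
    rotation = np.array([[np.cos(theta), -np.sin(theta)],
                         [np.sin(theta),  np.cos(theta)]])
    return points @ rotation.T

# Draw a random angle in degrees (assumption: uniform over the range)
rng = np.random.default_rng(seed=1)
angle = rng.uniform(-30, 30)

points = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
rotated = rotate_points(points, angle)

print("Angle in degrees:", round(float(angle), 2))
print(np.round(rotated, 3))
```

Unlike a shear, a rotation preserves every point's distance from the origin, so shapes are turned but not distorted.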

In [None]:
# Example 6: Rotate Image

visualize_generator(
  LOCAL_IMG_FILE1,
  ImageDataGenerator(rotation_range=30))

### **Exercise 6: Rotate Image**

In the cell below, write the code to rotate your Roadrunner image.

In [None]:
# Insert your code for Exercise 6 here



If we wanted to, we could zip up the preprocessed files and store them somewhere for later use.
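One way to do this is with `shutil.make_archive` from the standard library. The following is a minimal sketch, assuming the files live in a single folder; the folder name, file names, and placeholder contents are made up, and a temporary directory is used so that nothing is left behind.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Create a folder of placeholder files (assumption: stand-ins for images)
    image_dir = Path(tmp) / "augmented"
    image_dir.mkdir()
    for i in range(3):
        (image_dir / f"image_{i}.bin").write_bytes(b"placeholder")

    # Write augmented_images.zip next to the folder, not inside it
    archive_path = shutil.make_archive(
        str(Path(tmp) / "augmented_images"), "zip", root_dir=str(image_dir))

    # List the archive contents to confirm what was stored
    with zipfile.ZipFile(archive_path) as archive:
        archived_names = sorted(archive.namelist())

print(archived_names)
```

In a real workflow the archive would be written to a persistent location, such as a mounted Google Drive folder, instead of a temporary directory.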

## **Lesson Turn-in**

When you have completed all of the code cells and run them in sequential order (the last code cell should be number 15), use **File --> Print.. --> Save to PDF** to generate a PDF of your JupyterLab notebook. Save your PDF as `Class_06_3.lastname.pdf`, where _lastname_ is your last name, and upload the file to Canvas.