# Exploring Images

Python includes multiple libraries that you can use to work with images. In this notebook, you'll use a few of them to explore some properties of images and techniques for working with them.

## Image Files
Let's start by creating an image of a simple geometric shape, and saving it as a JPG file. The **PIL** library in Python includes functions for working with images, so we'll use that to create and save the image.

Run the following code cell to create the image file:

In [None]:
# Function to create a random image (of a square, circle, or triangle)
def create_image (size, shape):
    from random import randint
    from PIL import Image, ImageDraw
    
    # Set random position and color
    xy1 = randint(10,40)
    xy2 = randint(60,100)
    col = (randint(0,200), randint(0,200), randint(0,200))

    # Create a new blank image
    img = Image.new("RGB", size, (255, 255, 255))
    
    # Create a canvas so we can draw on the image
    draw = ImageDraw.Draw(img)
    
    # Draw the specified shape on the image
    if shape == "circle":
        draw.ellipse([(xy1,xy1), (xy2,xy2)], fill=col)
    elif shape == "square":
        draw.rectangle([(xy1,xy1), (xy2,xy2)], fill=col)
    else: # triangle
        draw.polygon([(xy1,xy1), (xy2,xy2), (xy2,xy1)], fill=col)
    del draw
    
    # Return the shape image
    return img

# Call the function to create a new image (a 128x128px image of a circle)
new_img = create_image((128, 128), "circle")

# Save the image as a file
img_file_name = "new_img.jpg"
new_img.save(img_file_name)
print("Saved image as", img_file_name)


After you've run the cell above, switch back to the browser tab containing the folder where this notebook is saved and verify that a new file named **new_img.jpg** has been created. You can click this file to view it.

When you are done, come back to this notebook.

To load a saved image, the PIL library provides an **Image** class with an **open** method, as shown here:

In [None]:
from PIL import Image

# Load the image
loaded_img = Image.open(img_file_name)

# Check the filename and format
print(loaded_img.filename, "is a", loaded_img.format)
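
Besides the filename and format, a loaded image also exposes its dimensions and color mode. The following is a minimal sketch of inspecting them. It builds its own small test image in a temporary folder rather than relying on the file created earlier, and the `demo.jpg` name and the `demo_*` variables are hypothetical.

```python
import os
import tempfile
from PIL import Image

# Create a small white RGB image and save it in a temporary folder
# (the file name "demo.jpg" is an assumption made for this sketch)
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "demo.jpg")
Image.new("RGB", (128, 128), (255, 255, 255)).save(demo_path)

# Open the file and read some basic properties
with Image.open(demo_path) as demo_img:
    demo_format = demo_img.format  # file format detected by PIL
    demo_size = demo_img.size      # (width, height) in pixels
    demo_mode = demo_img.mode      # color mode, such as "RGB"

print(demo_format, demo_size, demo_mode)
```

Opening the file in a `with` block closes the underlying file handle once the properties have been read.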

## Plotting an Image
Now that you have an image, you can use the **Matplotlib** library to plot it as a visualization. 

In [None]:
from matplotlib import pyplot as plt

# The following 'magic' command must be run before plotting in order to display images inline in a Jupyter notebook
%matplotlib inline

# Create a new figure
fig = plt.figure()

# Plot the image
imgplot = plt.imshow(loaded_img)

## Image Data Types
We've used the **PIL** library to create, save, and load our image, and the **Matplotlib** library to visualize it.

Let's look at the data type of our image variable.

In [None]:
type(loaded_img)

The image is a PIL-specific type. However, we may want to work with our image using libraries other than PIL. The secret to this is to understand that images are actually just arrays of numeric values that represent pixel intensities, and the Python **NumPy** library provides a useful neutral format for image data.

For example, you can easily convert a PIL image to a NumPy multidimensional array like this:

In [None]:
import numpy as np

# Convert the PIL image to a NumPy array
img_array = np.array(loaded_img)

type(img_array)

You can also convert a NumPy array containing image data to a generic PIL image type like this:

In [None]:
# Convert to PIL format from array
pil_img = Image.fromarray(img_array)

type(pil_img)
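
The two conversions above should be lossless in both directions for 8-bit RGB data. As a small sketch of that round trip, assuming a tiny solid-color image built in memory (all names here are hypothetical), you can compare the arrays before and after:

```python
import numpy as np
from PIL import Image

# Build a tiny RGB image in memory: 4 pixels wide, 2 pixels high
original = Image.new("RGB", (4, 2), (10, 20, 30))

# PIL image -> NumPy array -> PIL image -> NumPy array
arr = np.array(original)
restored = Image.fromarray(arr)
round_trip = np.array(restored)

# PIL reports size as (width, height); NumPy reports (rows, columns, channels)
print(original.size, arr.shape)

# The pixel values should be unchanged by the round trip
print(np.array_equal(arr, round_trip))
```

Note that the two libraries order the dimensions differently: PIL gives width first, while the NumPy array gives the number of rows (the height) first.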

Let's explore the idea that an image is a multidimensional array of pixel values in a little more detail.

Run the following cell to view the *shape* of the image array:

In [None]:
img_array.shape

In this case, the image array has three dimensions: 128 rows (height), 128 columns (width), and 3 color channels for red, green, and blue (*RGB*). Each channel is therefore an array of 128 x 128 pixel values.
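
Because the array is indexed as (row, column, channel), each color channel can be pulled out with ordinary NumPy slicing. The sketch below uses a hypothetical solid-color array in place of the notebook's image so that it runs on its own:

```python
import numpy as np

# A 128x128 RGB array in which every pixel is (200, 100, 50)
# (a stand-in for an image array; the values are chosen for illustration)
demo = np.zeros((128, 128, 3), dtype=np.uint8)
demo[:, :] = (200, 100, 50)

# Slice out each channel: all rows, all columns, one channel index
red = demo[:, :, 0]
green = demo[:, :, 1]
blue = demo[:, :, 2]

print(demo.shape)  # full array: rows, columns, channels
print(red.shape)   # a single channel is a 2D array
print(demo[0, 0])  # the RGB values of the top-left pixel
```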

Let's look at the data type of these pixel values.

In [None]:
img_array.dtype

The pixel values are 8-bit unsigned integers (**uint8**), so each one is a whole number between 0 and 255. These represent the pixel intensities for each of the RGB channels.
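
Some image-processing and machine learning workflows expect floating-point intensities between 0 and 1 rather than integers between 0 and 255. This notebook does not require that step, but as a hedged sketch of how such a conversion is commonly done (the `pixels` and `scaled` names are hypothetical):

```python
import numpy as np

# Three sample intensities stored as 8-bit unsigned integers
pixels = np.array([[0, 128, 255]], dtype=np.uint8)

# Convert to float first, then divide by the maximum uint8 value
scaled = pixels.astype(np.float32) / 255.0

print(pixels.dtype, scaled.dtype)
print(scaled.min(), scaled.max())
```

Converting to a floating-point type before dividing keeps the result as fractional values instead of integers.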

## Working with Multiple Image Files
Now that we've explored some basic principles, let's generate a complete set of images to work with in this lab.

We'll create 1,200 images in total (400 each of circles, squares, and triangles), saved in appropriately named folders.

In [None]:
# function to create a dataset of images
def generate_image_data (classes, size, cases, img_dir):
    import os, shutil
    from PIL import Image
    
    # Check for existing folder and give option to replace it
    if os.path.exists(img_dir):
        replace_folder = input("Image folder already exists. Enter Y to replace it. \n")
        if replace_folder == "Y":
            print("Deleting old images...")
            shutil.rmtree(img_dir)
        else:
            return # Quit - no need to replace existing images
        
    # Create folder
    os.makedirs(img_dir)
    
    # Generate the specified number of images
    print("Generating new images...")
    i = 0
    while(i < (cases - 1) / len(classes)):
        if (i%25 == 0):
            print("Progress:{:.0%}".format((i*len(classes))/cases))
        i += 1
        for classname in classes:
            img = create_image(size, classname)
            # Save the image in an appropriately named folder based on the class (shape)
            saveFolder = os.path.join(img_dir,classname)
            if not os.path.exists(saveFolder):
                os.makedirs(saveFolder)
            imgFileName = os.path.join(saveFolder, classname + str(i) + '.jpg')
            try:
                img.save(imgFileName)
            except Exception:
                try:
                    # Retry (resource constraints can cause occasional disk access errors)
                    img.save(imgFileName)
                except Exception:
                    # The retry failed too, so report the error and move on
                    print("Error saving image", imgFileName)
            
# Our classes will be circles, squares, and triangles
classnames = ['circle', 'square', 'triangle']

# All images will be 128x128 pixels
img_size = (128,128)

# We'll store the images in a folder named 'shapes'
folder_name = 'shapes'

# Generate 1200 random images.
generate_image_data(classnames, img_size, 1200, folder_name)

print("Image files ready in '%s' folder!" % folder_name)
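
Since a failed save is only reported and then skipped, it can be worth checking how many files actually ended up in each class folder. The helper below is a sketch, not part of the original lab: `count_images` is a hypothetical function, and it is demonstrated on a throwaway folder of empty placeholder files so that it runs on its own.

```python
import os
import tempfile

def count_images(img_dir):
    """Return a dict mapping each subfolder name to the number of files it contains."""
    counts = {}
    for entry in sorted(os.listdir(img_dir)):
        sub_path = os.path.join(img_dir, entry)
        if os.path.isdir(sub_path):
            counts[entry] = len(os.listdir(sub_path))
    return counts

# Demonstrate on a temporary folder with two class subfolders of 3 empty files each
demo_dir = tempfile.mkdtemp()
for name in ["circle", "square"]:
    os.makedirs(os.path.join(demo_dir, name))
    for n in range(3):
        open(os.path.join(demo_dir, name, name + str(n) + ".jpg"), "w").close()

demo_counts = count_images(demo_dir)
print(demo_counts)
```

Calling `count_images(folder_name)` after the cell above would be expected to report 400 files per class, assuming every save succeeded.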

Now that we have our images, let's plot the first one in each folder:

In [None]:
import os

# Set up a figure of an appropriate size
fig = plt.figure(figsize=(12, 16))

# loop through the subfolders
dir_num = 0
for root, folders, filenames in os.walk(folder_name):
    for folder in folders:
        # Load the first image file (by sorted name) using the PIL library
        file = sorted(os.listdir(os.path.join(root, folder)))[0]
        imgFile = os.path.join(root, folder, file)
        img = Image.open(imgFile)
        # Add the image to the figure (which will have 1 row, and enough columns to show a file from each folder)
        a = fig.add_subplot(1, len(folders), dir_num + 1)
        imgplot = a.imshow(img)
        # Add a caption with the folder name
        a.set_title(folder)
        dir_num = dir_num + 1


Now we have a set of images representing three different classes of shape. In the next exercise, we'll explore how to use these images to train a machine learning model that can classify an image based on the shape it contains.