# Practical 6 Part 1 - Preparing Your Data

In this practical we will learn how to train basic GANs and Deep Convolutional GANs as an introduction to understanding how GANs work. 

Before we start, download and unzip the dataset for this practical (UT Zappos 50k) from Polymall.

NOTE: This dataset is also available at http://vision.cs.utexas.edu/projects/finegrained/utzap50k/



## Section 6.1.1 - Initialize Settings

Change the folder settings below if your dataset or output location differs.

Then, run the following cell to set the folder paths and the output image width and height.

In [4]:
import os 

# TODO:
# Set data_folder to the folder containing the unzipped UT Zappos 50k images,
# and output_folder to the folder where the processed arrays will be saved.
#
data_folder = os.path.expanduser("~") + '/Downloads/ut-zap50k-images-square/'
output_folder = os.path.expanduser("~") + '/data/p6/'


width = 28
height = 28

In [None]:
# Create the folder for containing your output data.
#
os.makedirs(output_folder, exist_ok=True)
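
Before moving on, it can help to confirm that `data_folder` really contains the unzipped images. The following is a minimal sketch and not part of the original practical: `count_jpgs` is a hypothetical helper name, and it assumes the images are `.jpg` files nested in subfolders, which is the same pattern the loading loop in Section 6.1.3 searches for.

```python
import glob
import os

def count_jpgs(folder):
    """Count .jpg files under `folder`, searching subfolders recursively."""
    pattern = os.path.join(folder, '**', '*.jpg')
    return sum(1 for _ in glob.iglob(pattern, recursive=True))

# Example usage (assumes data_folder was set in the settings cell):
# n = count_jpgs(data_folder)
# print("Found %d images" % n)   # 0 usually means the path is wrong
```

A count of zero most likely means the path is wrong or the archive was unzipped somewhere else.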

## Section 6.1.2 - Declaring Image Processing Functions

Run the following cell as is to declare the image processing functions.


In [5]:
import numpy as np
import scipy.io
import pandas
from matplotlib import pyplot as plt
from matplotlib import cm
import cv2
import glob

# Loads an image using OpenCV and returns the result
# in a numpy array in R, G, B order.
#
def loadimage(filename):
    img = cv2.imread(filename)
    img = img[..., ::-1]    # reverse the channel order: BGR (OpenCV default) -> RGB
    return img

# Resizes an image using OpenCV and returns the result
# in a numpy array in R, G, B order.
#
def resizeimage(img, width, height, nearest):
    if nearest:
        return cv2.resize(img, (width, height), interpolation=cv2.INTER_NEAREST)
    else:
        return cv2.resize(img, (width, height), interpolation=cv2.INTER_CUBIC)


# Loads an image and resizes it down to the size that
# will be fed into our GAN. Returns None if the file is missing.
#
def load_and_process_image(filepath, width, height):
    if not (os.path.exists(filepath)):
        return None
    
    img = loadimage(filepath)
    img = resizeimage(img, width, height, False)                   # Bicubic interpolation    
    return img
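
To see what the channel reversal in `loadimage` does, here is a small NumPy-only sketch. The two-pixel array is made-up illustration data, not taken from the dataset; it stands in for what `cv2.imread` returns, which is in B, G, R order.

```python
import numpy as np

# A 1x2 "image" in OpenCV's default B, G, R channel order:
# pixel 0 is pure blue, pixel 1 is pure red.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis swaps the first and third channels (BGR -> RGB).
rgb = bgr[..., ::-1]

print(rgb[0, 0])  # pure blue, now expressed in R, G, B order
print(rgb[0, 1])  # pure red, now expressed in R, G, B order
```

Note that the slice returns a view of the same data rather than a copy, and the image shape is unchanged.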

## Section 6.1.3 - Load Up and Process Images

Run the following code as is to load and process all the images. There are about 50,000 images, all of which will be resized to 28x28.

We will save two sets of data. One set contains these images in colour, the other in grayscale.

This should take about two minutes to complete.
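
As background for the grayscale set: OpenCV documents its RGB-to-gray conversion as the weighted sum 0.299 R + 0.587 G + 0.114 B. The sketch below reimplements that formula in NumPy purely for illustration. `rgb_to_gray` is a hypothetical helper, and its rounding may differ slightly from `cv2.cvtColor`.

```python
import numpy as np

def rgb_to_gray(img_rgb):
    """Luminance-weighted grayscale of an (H, W, 3) uint8 image in R, G, B order."""
    weights = np.array([0.299, 0.587, 0.114])
    gray = img_rgb.astype(np.float64) @ weights
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)

# One pure-red pixel and one pure-blue pixel, in R, G, B order.
img = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)
gray = rgb_to_gray(img)
print(gray.shape)   # (1, 2): the channel axis is gone
print(gray)         # red maps to a brighter gray than blue
```

Because red and blue carry different weights, the conversion has to be told the correct channel order. Treating an RGB image as BGR would swap those two weights.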


In [None]:
x_gray = []
x = []
count = 0

print ("Image processing start...")

for filepath in glob.iglob(data_folder + '**/*.jpg', recursive=True):
    #print (filepath)
    try:
        img = load_and_process_image(filepath, width, height)
    except Exception:
        # Skip files that OpenCV cannot read or resize.
        continue
    
    # Append the full RGB-coloured image into x
    #
    x.append(img)
    
    # Convert the image into grayscale.
    # img is already in R, G, B order (see loadimage), so use RGB2GRAY.
    #
    imggray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
    
    # Append the grayscaled images into x_gray
    #
    x_gray.append(imggray)
    
    count = count + 1
    
    if count % 10 == 0:
        print ("%d images..." % (count))

print ("Image processing complete: %d images." % (len(x)))

In [7]:
# Save the RGB coloured images into x.npy
#
x = np.array(x)
print (x.shape)
np.save(output_folder + "x.npy", x)

# Save the grayscale images into x_gray.npy
#
x_gray = np.array(x_gray)
print (x_gray.shape)
np.save(output_folder + "x_gray.npy", x_gray)

(50062, 28, 28, 3)
(50062, 28, 28)
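
Before uploading, it may be worth reloading the saved files to confirm that they round-trip correctly. The sketch below is an assumption-laden illustration rather than part of the practical: `check_saved` is a hypothetical helper, and it uses small placeholder arrays in a temporary folder so that it runs on its own. In the notebook you would call it with `output_folder` instead.

```python
import os
import tempfile
import numpy as np

def check_saved(folder, width=28, height=28):
    """Reload x.npy and x_gray.npy from `folder` and check that their shapes agree."""
    x = np.load(os.path.join(folder, "x.npy"))
    x_gray = np.load(os.path.join(folder, "x_gray.npy"))
    assert x.shape[1:] == (height, width, 3), x.shape
    assert x_gray.shape[1:] == (height, width), x_gray.shape
    assert x.shape[0] == x_gray.shape[0], "image counts differ"
    return x.shape, x_gray.shape

# Stand-alone demonstration with placeholder data (not the real dataset).
demo_folder = tempfile.mkdtemp()
np.save(os.path.join(demo_folder, "x.npy"), np.zeros((5, 28, 28, 3), dtype=np.uint8))
np.save(os.path.join(demo_folder, "x_gray.npy"), np.zeros((5, 28, 28), dtype=np.uint8))
print(check_saved(demo_folder))
```

With the real data, the two shapes should match the output printed above, with the same number of images in both arrays.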


## Section 6.1.4 - Upload Data to Google Drive

Once processing is complete, upload both output files (x.npy and x_gray.npy) to the Data/P6 folder in your Google Drive.

Then, proceed to Part 2 of the practical.