# Data Acquisition and Cleaning

## Eye Cropping Function for Full Face Images

This notebook takes full-face images and extracts the left and right eye from each one. The cropped eyes are saved in training and testing folders for use in training the models for this project.
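
Pillow's `Image.save` writes to an existing folder and does not create missing ones, so the output folders need to exist before the cropping cells run. A minimal sketch of that setup step follows. The helper name `ensure_output_dirs` is hypothetical, and the `train/open_eyes` and `train/closed_eyes` layout is taken from the save paths used later in this notebook.

```python
import os


def ensure_output_dirs(base_dir, class_names=("open_eyes", "closed_eyes")):
    """Create <base_dir>/train/<class_name> for each class and return the paths.

    Hypothetical helper: the folder layout mirrors the save paths used by
    the cropping cells in this notebook.
    """
    created = []
    for name in class_names:
        path = os.path.join(base_dir, "train", name)
        # exist_ok=True makes the call safe to repeat on later runs
        os.makedirs(path, exist_ok=True)
        created.append(path)
    return created
```

Calling `ensure_output_dirs('../data')` once before the cropping cells should be enough for a local run.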

Closed eye data citation: https://parnec.nuaa.edu.cn/_upload/tpl/02/db/731/template731/pages/xtan/ClosedEyeDatabases.html

Open eye data citation: http://vis-www.cs.umass.edu/lfw/

After running this code, the cropped images went through a rigorous review process to confirm that they were, indeed, closed or open eyes. The reviewed images used for model training are in 'new_closed_eyes.zip' and 'new_open_eyes.zip'.

### Imports

In [None]:
# The 'face_recognition' module is required to run this notebook; install it if it is not already installed.
# https://pypi.org/project/face-recognition/
!pip install face_recognition


In [1]:
import os
from PIL import Image, ImageDraw
import face_recognition
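
If it is unclear whether `face_recognition` is already installed, the check can be done from Python before running the install cell. This is a small sketch using the standard library's `importlib.util.find_spec`; the helper name `module_available` is hypothetical.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module called `name` can be imported."""
    return importlib.util.find_spec(name) is not None


if not module_available("face_recognition"):
    print("face_recognition is missing; run: pip install face_recognition")
```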

### Mount Drive for Drive File Access

In [None]:
# # Uncomment if working on Google Colab
# from google.colab import drive
# drive.mount('/content/drive/')

Mounted at /content/drive/
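
The cells below hard-code a local path and keep the Google Colab path in a comment. One way to avoid editing the cells by hand is to pick the data root at run time. The sketch below assumes that Colab can be detected by the presence of the `google.colab` module in `sys.modules`; the function names are hypothetical, and the two paths are the ones used elsewhere in this notebook.

```python
import sys

LOCAL_DATA_ROOT = "../data"
COLAB_DATA_ROOT = "/content/drive/MyDrive/DSIR_2.8_Group_Project/Data"


def running_in_colab():
    """Assumption: a Colab runtime has the google.colab module loaded."""
    return "google.colab" in sys.modules


def resolve_data_root(in_colab=None):
    """Return the data folder for the current environment.

    Pass in_colab explicitly to override the automatic detection.
    """
    if in_colab is None:
        in_colab = running_in_colab()
    return COLAB_DATA_ROOT if in_colab else LOCAL_DATA_ROOT
```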


## The Function

The two functions below take full-face images and crop the eyes from each image. The crops are then stored in the 'train' data folder for use in model training and tuning.
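
The cropping geometry is easier to follow in isolation. The sketch below reproduces the square-box calculation from the cells that follow as a standalone function: the longer axis of the eye's bounding box is padded by 50% of its range on each side, and the shorter axis is padded equally on both sides until the box is square. The name `square_crop_box` and the `cushion` parameter are hypothetical, and the landmark points in the example are made up for illustration.

```python
def square_crop_box(eye_points, cushion=0.5):
    """Return (left, top, right, bottom) for a square crop around one eye.

    eye_points is a list of (x, y) landmark tuples for a single eye.
    """
    xs = [x for x, _ in eye_points]
    ys = [y for _, y in eye_points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    x_range = x_max - x_min
    y_range = y_max - y_min

    if x_range > y_range:
        # pad the wider axis, then grow the height to match the new width
        pad = round(cushion * x_range)
        left, right = x_min - pad, x_max + pad
        extra = ((right - left) - y_range) / 2
        top, bottom = y_min - extra, y_max + extra
    else:
        # pad the taller axis, then grow the width to match the new height
        pad = round(cushion * y_range)
        top, bottom = y_min - pad, y_max + pad
        extra = ((bottom - top) - x_range) / 2
        left, right = x_min - extra, x_max + extra

    return (left, top, right, bottom)


# Made-up landmarks for an eye that is 30 px wide and 8 px tall
example_eye = [(100, 50), (110, 46), (120, 46), (130, 50), (120, 54), (110, 54)]
print(square_crop_box(example_eye))  # (85, 20.0, 145, 80.0)
```

In this example the 30 px wide eye becomes a 60 by 60 px box, which the notebook then resizes to 80 by 80.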

In [6]:
# OPEN EYE IMAGE CROPS:

def eye_cropper(folders):
    
    # Establish count for iterative file saving
    count = 0

    # For loop going through each image file
    for folder in os.listdir(folders):
        for file in os.listdir(folders + '/' + folder):
          
            # Using Facial Recognition Library on Image
            image = face_recognition.load_image_file(folders + '/' + folder + '/' + file)
          
            # create a variable for the facial feature coordinates
            face_landmarks_list = face_recognition.face_landmarks(image)
          
            # create a placeholder list for the eye coordinates
            # and append coordinates for eyes to list unless eyes 
            # weren't found by facial recognition
            eyes = []
            try:
                eyes.append(face_landmarks_list[0]['left_eye'])
                eyes.append(face_landmarks_list[0]['right_eye'])
            except (IndexError, KeyError):
                # no face or no eye landmarks detected, so skip this image
                continue

            # find the min and max x and y coordinates of each eye
            for eye in eyes:
                x_max = max([coordinate[0] for coordinate in eye])
                x_min = min([coordinate[0] for coordinate in eye])
                y_max = max([coordinate[1] for coordinate in eye])
                y_min = min([coordinate[1] for coordinate in eye])

                # establish the range of x and y coordinates
                x_range = x_max - x_min
                y_range = y_max - y_min
              
              
                # to make sure the full eye is captured, build a square crop:
                # pad the axis with the larger range by 50% of that range on
                # each side, then pad the smaller axis equally on both sides
                # until it matches the cushioned larger axis
                if x_range > y_range:
                    right = round(.5*x_range) + x_max
                    left = x_min - round(.5*x_range)
                    bottom = round(((right-left) - y_range))/2 + y_max
                    top = y_min - round(((right-left) - y_range))/2
                else:
                    bottom = round(.5*y_range) + y_max
                    top = y_min - round(.5*y_range)
                    right = round(((bottom-top) - x_range))/2 + x_max
                    left = x_min - round(((bottom-top) - x_range))/2
              
                # save the original image as a variable
                im = Image.open(folders + '/' + folder + '/' + file)
              
                # crop original image using the cushioned coordinates
                im = im.crop((left, top, right, bottom))
              
                # resize image for input into our model
                im = im.resize((80,80))
              
                # save file to output folder
                # Update to folder where cropped open eye images will be stored
                im.save('../data/train/open_eyes/eye_crop_open' + str(count) + '.jpg')
#                 im.save('/content/drive/MyDrive/DSIR_2.8_Group_Project/Data/train/Open_Eyes/eye_crop_open' + str(count) + '.jpg')
              
                # increase count for iterative file saving
                count += 1
              
                # print count every 100 photos saved to monitor progress
                if count % 100 == 0:
                    print(count)
    
# Calling function to crop full-face, open eye images
eye_cropper('../data/open_eyes_face_data')     

# # If using Google Colab, change directory to where full face images are stored
# eye_cropper('/content/drive/MyDrive/DSIR_2.8_Group_Project/Data/open_eyes_face_data') 

100
200
300
...
18300
18400
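
The open-eye cell passes every entry of every subfolder to the face library, so a stray non-image file (for example a hidden system file) would likely raise an error partway through a long run. One possible guard is to collect the image paths first and filter by extension. This is a sketch only: the helper name `list_image_files` and the extension list are assumptions, not part of the original notebook.

```python
import os

# Assumed set of image extensions; adjust to match the dataset
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def list_image_files(root, extensions=IMAGE_EXTENSIONS):
    """Return a sorted list of image paths found anywhere under root."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            # compare in lower case so 'photo.JPG' is also accepted
            if filename.lower().endswith(extensions):
                paths.append(os.path.join(dirpath, filename))
    return sorted(paths)
```

Because `os.walk` descends into subfolders, the same helper would work for both the nested open-eye folders and the flat closed-eye folder.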

In [7]:
# CLOSED EYE IMAGE CROPS:

def eye_cropper(folder):
    
    # Establish count for iterative file saving
    count = 0

    # For loop going through each image file
    for file in os.listdir(folder):
        
        # Using Facial Recognition Library on Image
        image = face_recognition.load_image_file(folder + '/' + file)
        
        # create a variable for the facial feature coordinates
        face_landmarks_list = face_recognition.face_landmarks(image)
        
        # create a placeholder list for the eye coordinates
        # and append coordinates for eyes to list unless eyes 
        # weren't found by facial recognition
        eyes = []
        try:
            eyes.append(face_landmarks_list[0]['left_eye'])
            eyes.append(face_landmarks_list[0]['right_eye'])
        except (IndexError, KeyError):
            # no face or no eye landmarks detected, so skip this image
            continue

        # find the min and max x and y coordinates of each eye
        for eye in eyes:
            x_max = max([coordinate[0] for coordinate in eye])
            x_min = min([coordinate[0] for coordinate in eye])
            y_max = max([coordinate[1] for coordinate in eye])
            y_min = min([coordinate[1] for coordinate in eye])

            # establish the range of x and y coordinates
            x_range = x_max - x_min
            y_range = y_max - y_min
            
            
            # to make sure the full eye is captured, build a square crop:
            # pad the axis with the larger range by 50% of that range on
            # each side, then pad the smaller axis equally on both sides
            # until it matches the cushioned larger axis
            if x_range > y_range:
                right = round(.5*x_range) + x_max
                left = x_min - round(.5*x_range)
                bottom = round(((right-left) - y_range))/2 + y_max
                top = y_min - round(((right-left) - y_range))/2
            else:
                bottom = round(.5*y_range) + y_max
                top = y_min - round(.5*y_range)
                right = round(((bottom-top) - x_range))/2 + x_max
                left = x_min - round(((bottom-top) - x_range))/2
            
            # save the original image as a variable
            im = Image.open(folder + '/' + file)
            
            # crop original image using the cushioned coordinates
            im = im.crop((left, top, right, bottom))
            
            # resize image for input into our model
            im = im.resize((80,80))
            
            # save file to output folder
            # Update to folder where cropped closed eye images will be stored
            im.save('../data/train/closed_eyes/eye_crop_closed' + str(count) + '.jpg')

#             im.save('/content/drive/MyDrive/DSIR_2.8_Group_Project/Data/train/Closed_Eyes/eye_crop_closed' + str(count) + '.jpg')
            
            # increase count for iterative file saving
            count += 1
            
            # print count every 10 photos saved to monitor progress
            if count % 10 == 0:
                print(count)

# Calling function to crop full-face, closed eye images
eye_cropper('../data/closed_eyes_face_data')  

# # If using Google Colab, change directory to where full face images are stored
# eye_cropper('/content/drive/MyDrive/DSIR_2.8_Group_Project/Data/dataset_B_FacialImages_highResolution') 

10
20
30
...
1990
2000
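
Both cropping cells use only the first detected face and skip an image when no eye landmarks come back. That selection rule can also be written as a small helper that does not rely on exception handling. The sketch below assumes the landmark list has the shape used in the cells above: a list with one dictionary per face, containing `'left_eye'` and `'right_eye'` keys. The name `first_face_eyes` is hypothetical.

```python
def first_face_eyes(face_landmarks_list):
    """Return [left_eye, right_eye] for the first face, or [] to skip the image."""
    if not face_landmarks_list:
        # no face was detected in this image
        return []
    first_face = face_landmarks_list[0]
    if "left_eye" not in first_face or "right_eye" not in first_face:
        # landmarks are incomplete, so treat the image as unusable
        return []
    return [first_face["left_eye"], first_face["right_eye"]]
```

An empty return value plays the role of the `continue` in the loops above: the caller moves on to the next file without saving a crop.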
