# Data Preprocessing

The data must be preprocessed before it can be used with any machine learning algorithm. Some of the datetimes are incorrect, every image needs a unique name, and the information bars need to be cropped out.

In [3]:
from image_preprocessing import *
import os
import os.path
import datetime
import exiftool
import re

## Datetime Correction
Images from the Browning camera initially had incorrect datetimes. To calculate the offset, I took the difference between the incorrect and the corrected datetime of one trigger image series.

In [2]:
time_delta = datetime.datetime(2019, 6, 30, 10, 9, 0) - datetime.datetime(2018, 1, 28, 13, 23, 0)
time_delta

datetime.timedelta(517, 74760)
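As a quick check of how such an offset is applied, the sketch below shifts a single timestamp by the Browning offset. It assumes the timestamp is stored as an EXIF-style string (`YYYY:MM:DD HH:MM:SS`); `shift_exif_datetime` is a hypothetical helper for illustration and is not part of `image_preprocessing`.

```python
import datetime

# EXIF stores datetimes with colons in the date part (assumed format).
EXIF_FORMAT = "%Y:%m:%d %H:%M:%S"


def shift_exif_datetime(stamp, delta):
    """Parse an EXIF-style datetime string, add delta, and format it back."""
    parsed = datetime.datetime.strptime(stamp, EXIF_FORMAT)
    return (parsed + delta).strftime(EXIF_FORMAT)


# Same value as the Browning offset computed above.
browning_delta = datetime.timedelta(days=517, seconds=74760)
print(shift_exif_datetime("2018:01:28 13:23:00", browning_delta))
# prints 2019:06:30 10:09:00
```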

Images from the Reconyx camera also initially had incorrect datetimes. I only have date information for the correction, so the times of day may still be off. Judging by the light in the images, though, the times match up reasonably well. The following is the offset in days used to correct the Reconyx images.

In [3]:
time_delta = datetime.datetime(2019, 4, 11, 3, 38, 0) - datetime.datetime(2018, 1, 2, 3, 38, 0)
time_delta

datetime.timedelta(464)

Using the change_datetimes function, I corrected the image datetimes by moving images in and out of the data directory. This function will also be useful for adjusting datetimes for daylight saving time.
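The body of change_datetimes is not shown in this notebook. One possible approach, sketched here purely as an assumption, is to let ExifTool shift the dates with its `-AllDates` shortcut and a `Y:M:D H:M:S` shift string. The helper `exiftool_shift_args` is hypothetical; it only builds the command and does not run it.

```python
import datetime


def exiftool_shift_args(image_path, delta):
    """Build an exiftool command that shifts the common EXIF dates by delta.

    Hypothetical helper: the real change_datetimes may work differently.
    """
    # ExifTool uses "+=" to shift forward and "-=" to shift backward.
    sign = "+" if delta >= datetime.timedelta(0) else "-"
    total = int(abs(delta).total_seconds())
    days, rem = divmod(total, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    # Shift string layout: years:months:days hours:minutes:seconds.
    shift = f"0:0:{days} {hours}:{minutes}:{seconds}"
    return ["exiftool", f"-AllDates{sign}={shift}", image_path]


# Browning offset, then a one-hour step back as a daylight saving adjustment.
print(exiftool_shift_args("./data/IMG_0001.JPG", datetime.timedelta(517, 74760)))
print(exiftool_shift_args("./data/IMG_0001.JPG", datetime.timedelta(hours=-1)))
```

The returned list could be passed to `subprocess.run` once the command has been checked on a copy of an image.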

In [18]:
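# Set active_dir to the directory holding the images to shift.
# Leaving it as None skips the loop, so the cell cannot shift images twice by accident.
active_dir = None
if active_dir is not None:
    for image in os.listdir(active_dir):
        change_datetimes(os.path.join(active_dir, image), time_delta)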

## Unique Naming
I decided to name images by site (s1), camera (c1 = Reconyx, c2 = Browning), datetime, and a unique number. The unique number is needed because image datetimes only resolve to seconds and some images share the same datetime. I use the image sequence number as the unique number for Reconyx images and an iteration counter for Browning images. <br>
TO-DO:
* Extract image sequence number from Browning images.
* Rename by image trigger date and sequence number for all images. 
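The naming scheme described above can be sketched as follows. This is a minimal illustration of the iteration-counter variant used for Browning images, not the actual rename_image implementation; `unique_names` and its parameters are hypothetical.

```python
import datetime
from collections import Counter


def unique_names(datetimes, site=1, camera=2):
    """Return names like s1c2_20191020141453_0.jpg for a list of datetimes.

    Images that share a datetime (to the second) get increasing counters.
    """
    seen = Counter()  # how many images have used each timestamp so far
    names = []
    for taken in datetimes:
        stamp = taken.strftime("%Y%m%d%H%M%S")
        names.append(f"s{site}c{camera}_{stamp}_{seen[stamp]}.jpg")
        seen[stamp] += 1
    return names


shots = [
    datetime.datetime(2019, 10, 20, 14, 14, 53),
    datetime.datetime(2019, 10, 20, 14, 14, 53),
    datetime.datetime(2019, 10, 20, 14, 14, 54),
]
print(unique_names(shots))
```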

In [49]:
active_dir = "./data/"
for image in os.listdir(active_dir):
    rename_image(active_dir + image)

IMG_0001.JPG renamed to ./data/s1c2_20191020141453_0.jpg!
IMG_0002.JPG renamed to ./data/s1c2_20191020141453_1.jpg!
IMG_0003.JPG renamed to ./data/s1c2_20191020141453_2.jpg!
IMG_0004.JPG renamed to ./data/s1c2_20191020141454_0.jpg!
IMG_0005.JPG renamed to ./data/s1c2_20191020141454_1.jpg!
IMG_0006.JPG renamed to ./data/s1c2_20191020141454_2.jpg!
IMG_0007.JPG renamed to ./data/s1c2_20191020141454_3.jpg!
IMG_0008.JPG renamed to ./data/s1c2_20191020141455_0.jpg!
IMG_0009.JPG renamed to ./data/s1c2_20191021073803_0.jpg!
IMG_0010.JPG renamed to ./data/s1c2_20191021073803_1.jpg!
IMG_0011.JPG renamed to ./data/s1c2_20191021073804_0.jpg!
IMG_0012.JPG renamed to ./data/s1c2_20191021073804_1.jpg!
IMG_0013.JPG renamed to ./data/s1c2_20191021073804_2.jpg!
IMG_0014.JPG renamed to ./data/s1c2_20191021073805_0.jpg!
IMG_0015.JPG renamed to ./data/s1c2_20191021073805_1.jpg!
IMG_0016.JPG renamed to ./data/s1c2_20191021073805_2.jpg!
IMG_0017.JPG renamed to ./data/s1c2_20191022103519_0.jpg!
IMG_0018.JPG r

<br><br>
Trying to get the pyexif wrapper to return all metadata.

In [10]:
image_path = "./data/IMG_0073.JPG"

In [13]:
import subprocess
print(subprocess.check_output(['exiftool', image_path], encoding='utf-8'))

ExifTool Version Number         : 11.74
File Name                       : IMG_0073.JPG
Directory                       : ./data
File Size                       : 962 kB
File Modification Date/Time     : 2019:11:04 21:57:01-08:00
File Access Date/Time           : 2019:11:07 20:19:23-08:00
File Creation Date/Time         : 2019:11:07 20:19:23-08:00
File Permissions                : rw-rw-rw-
File Type                       : JPEG
File Type Extension             : jpg
MIME Type                       : image/jpeg
Exif Byte Order                 : Little-endian (Intel, II)
Make                            : Prometheus
Camera Model Name               : BTC-6PXD
Orientation                     : Horizontal (normal)
X Resolution                    : 72
Y Resolution                    : 72
Resolution Unit                 : inches
Software                        : 6PXD-V1804270
Modify Date                     : 2019:06:03 04:45:33
Y Cb Cr Positioning             : Co-sited
Exposure Time          

In [5]:
with exiftool.ExifTool() as et:
    metadata = et.get_metadata(image_path)

In [6]:
metadata

{'SourceFile': './data/IMG_0073.JPG',
 'ExifTool:ExifToolVersion': 11.74,
 'File:FileName': 'IMG_0073.JPG',
 'File:Directory': './data',
 'File:FileSize': 985534,
 'File:FileModifyDate': '2019:11:04 21:57:01-08:00',
 'File:FileAccessDate': '2019:11:07 20:19:23-08:00',
 'File:FileCreateDate': '2019:11:07 20:19:23-08:00',
 'File:FilePermissions': 666,
 'File:FileType': 'JPEG',
 'File:FileTypeExtension': 'JPG',
 'File:MIMEType': 'image/jpeg',
 'File:ExifByteOrder': 'II',
 'File:ImageWidth': 5376,
 'File:ImageHeight': 3024,
 'File:EncodingProcess': 0,
 'File:BitsPerSample': 8,
 'File:ColorComponents': 3,
 'File:YCbCrSubSampling': '2 1',
 'EXIF:Make': 'Prometheus',
 'EXIF:Model': 'BTC-6PXD',
 'EXIF:Orientation': 1,
 'EXIF:XResolution': 72,
 'EXIF:YResolution': 72,
 'EXIF:ResolutionUnit': 2,
 'EXIF:Software': '6PXD-V1804270',
 'EXIF:ModifyDate': '2019:06:03 04:45:33',
 'EXIF:YCbCrPositioning': 2,
 'EXIF:ExposureTime': 0.04166666667,
 'EXIF:FNumber': 2.4,
 'EXIF:ExposureProgram': 2,
 'EXIF:IS
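Once the metadata is available as a dictionary like the one above, the capture datetime can be pulled out for renaming. The sketch below assumes group-prefixed keys of the form shown in that output; only `EXIF:ModifyDate` is visible there, so the other two keys are assumptions, and `capture_datetime` is a hypothetical helper.

```python
import datetime

# Candidate keys in order of preference (the first two are assumed to exist).
DATETIME_KEYS = ("EXIF:DateTimeOriginal", "EXIF:CreateDate", "EXIF:ModifyDate")


def capture_datetime(metadata):
    """Return the first available EXIF datetime as a datetime object, or None."""
    for key in DATETIME_KEYS:
        value = metadata.get(key)
        if value:
            return datetime.datetime.strptime(value, "%Y:%m:%d %H:%M:%S")
    return None


sample = {
    "SourceFile": "./data/IMG_0073.JPG",
    "EXIF:ModifyDate": "2019:06:03 04:45:33",
}
print(capture_datetime(sample))
```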

In [1]:
import pyexif

In [2]:
from pyexif import pyexif

In [6]:
path = "./data/IMG_0073.JPG"

In [3]:
dir(pyexif)

['ExifEditor',
 '_EXIFTOOL_INSTALLED',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__spec__',
 '_install_exiftool_info',
 '_runproc',
 'datetime',
 'json',
 'os',
 'out',
 're',
 'six',
 'subprocess',
 'sys',
 'usage']

In [8]:
e = pyexif.ExifEditor(path)
e.getTags()

[('Aperture', 2.4),
 ('ApertureValue', 2.4),
 ('BitsPerSample', 8),
 ('CircleOfConfusion', '0.009 mm'),
 ('ColorComponents', 3),
 ('ColorSpace', 'sRGB'),
 ('ComponentsConfiguration', 'Y, Cb, Cr, -'),
 ('CompressedBitsPerPixel', 'undef'),
 ('Compression', 'JPEG (old-style)'),
 ('CreateDate', '2019:06:03 04:45:33'),
 ('DateTimeOriginal', '2019:06:03 04:45:33'),
 ('DependentImage1EntryNumber', 0),
 ('DependentImage2EntryNumber', 0),
 ('DigitalZoomRatio', 1000),
 ('Directory', './data'),
 ('EncodingProcess', 'Baseline DCT, Huffman coding'),
 ('ExifByteOrder', 'Little-endian (Intel, II)'),
 ('ExifImageHeight', 3024),
 ('ExifImageWidth', 5376),
 ('ExifToolVersion', 11.74),
 ('ExifVersion', '0220'),
 ('ExposureCompensation', 0),
 ('ExposureMode', 'Unknown (136)'),
 ('ExposureProgram', 'Program AE'),
 ('ExposureTime', '1/24'),
 ('FNumber', 2.4),
 ('FOV', '15.1 deg'),
 ('FileAccessDate', '2019:11:07 20:19:23'),
 ('FileCreateDate', '2019:11:07 20:19:23'),
 ('FileModifyDate', '2019:11:04 21:57:01

## Image Cropping
To decrease noise in the image data, I need to crop out the information bars and logos. From experimentation, I need to crop out the ... for the Reconyx and ... for the Browning images.

In [13]:
1440 - 1404

36

In [16]:
3024 - 2830

194

Crop amounts by image height (in pixels):

Reconyx:
* 1152: top 32, bottom 70
* 1440: top 36, bottom 70

Browning:
* 3024: bottom 194
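The crop amounts listed above can be turned into crop boxes with a small lookup. This is a sketch under the assumption that the first number in each entry is the image height in pixels. `CROP_MARGINS` and `crop_box` are hypothetical names, and the returned tuple follows the `(left, upper, right, lower)` convention used by Pillow's `Image.crop`.

```python
# (camera, image height) -> (pixels to remove from top, pixels to remove from bottom)
CROP_MARGINS = {
    ("reconyx", 1152): (32, 70),
    ("reconyx", 1440): (36, 70),
    ("browning", 3024): (0, 194),
}


def crop_box(camera, width, height):
    """Return (left, upper, right, lower) with the information bars removed."""
    top, bottom = CROP_MARGINS[(camera, height)]
    return (0, top, width, height - bottom)


# The Browning image inspected earlier is 5376 x 3024 pixels.
print(crop_box("browning", 5376, 3024))
# prints (0, 0, 5376, 2830)
```

Keying the table on image height keeps the two Reconyx resolutions separate, since their top bars differ in size.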