# Data Preprocessing

The data must be preprocessed before it can be used with any machine learning algorithm: some of the datetimes are incorrect, every image needs a unique name, and the information bars need to be cropped out.

In [156]:
from image_preprocessing import *
import os
import re
import datetime
from PIL import Image
import piexif

## Datetime Correction
Images from the Browning camera initially had incorrect datetimes. To calculate the offset, I took the difference between an incorrect datetime and the corresponding corrected datetime from a trigger image series.

In [120]:
time_delta = datetime.datetime(2019, 6, 30, 10, 9, 0) - datetime.datetime(2018, 1, 28, 13, 23, 0)
time_delta

datetime.timedelta(517, 74760)
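
To illustrate how this offset would be applied, here is a minimal sketch that shifts a single EXIF-style datetime string by `time_delta`. The helper name `shift_exif_datetime` is hypothetical and is not part of `image_preprocessing`; the format string follows the `%Y:%m:%d %H:%M:%S` layout used for EXIF datetimes elsewhere in this notebook.

```python
import datetime

EXIF_FORMAT = "%Y:%m:%d %H:%M:%S"  # EXIF datetime layout used in this notebook

# Offset between the corrected and the incorrect Browning datetimes.
time_delta = (datetime.datetime(2019, 6, 30, 10, 9, 0)
              - datetime.datetime(2018, 1, 28, 13, 23, 0))


def shift_exif_datetime(dt_str, delta):
    """Return dt_str shifted by delta, still in EXIF format (hypothetical helper)."""
    shifted = datetime.datetime.strptime(dt_str, EXIF_FORMAT) + delta
    return shifted.strftime(EXIF_FORMAT)


# Shifting the incorrect reference datetime reproduces the corrected one.
print(shift_exif_datetime("2018:01:28 13:23:00", time_delta))
# → 2019:06:30 10:09:00
```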

Images from the Reconyx camera also initially had incorrect datetimes. I only have correction information at the level of days, so the times of day may still be inaccurate. Judging by the light in the images, however, the recorded times seem to match up reasonably well. The offset in days used to correct the Reconyx images is computed below.

In [121]:
time_delta = datetime.datetime(2019, 4, 11, 3, 38, 0) - datetime.datetime(2018, 1, 2, 3, 38, 0)
time_delta

datetime.timedelta(464)

Using the `change_datetimes` function, I corrected the image datetimes by moving images in and out of the data directory. This function will also be useful for adjusting datetimes for daylight saving time.

In [18]:
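active_dir = None  # set to the directory of images to correct, e.g. "./data/"
if active_dir is not None:
    for image in os.listdir(active_dir):
        change_datetimes(os.path.join(active_dir, image), time_delta)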
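
As a sketch of how a batch correction such as a daylight saving adjustment might be run, the helper below applies a function to every JPEG in a directory. `apply_to_images` is a hypothetical helper, and `change_datetimes` is assumed to take a path and a `timedelta`, as in the call in this notebook. The one-hour offset is only an example value.

```python
import datetime
import os


def apply_to_images(directory, func, delta, extensions=(".jpg", ".jpeg")):
    """Call func(path, delta) for each image file in directory (hypothetical helper).

    Returns the list of paths that were processed, in sorted order.
    """
    processed = []
    for file_name in sorted(os.listdir(directory)):
        # Skip anything that is not a JPEG, such as notebooks or subfolders.
        if not file_name.lower().endswith(extensions):
            continue
        path = os.path.join(directory, file_name)
        func(path, delta)
        processed.append(path)
    return processed


# Shift forward by one hour, e.g. when moving from standard to daylight time.
dst_delta = datetime.timedelta(hours=1)
# apply_to_images("./data/", change_datetimes, dst_delta)
```

Building paths with `os.path.join` avoids depending on a trailing separator in the directory string, and the extension filter keeps the function from being called on unrelated files.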

## Unique Naming
I decided to name each image by site (s1), camera (c1 = Reconyx, c2 = Browning), datetime, and a unique number. The unique number is needed because the smallest datetime unit recorded in the images is the second, and some images share the same datetime.
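
The naming scheme can be sketched as a pure function. The exact layout below, in particular the zero-padded counter suffix and the names `camera_code` and `build_image_name`, is an assumption for illustration rather than the format produced by the code in this notebook. The camera is inferred from the original file prefix: `IMG` for Browning, anything else (such as `RCNX`) for Reconyx.

```python
import datetime
import os


def camera_code(image_path):
    """Map an original file name to a camera code: IMG* -> c2 (Browning), else c1 (Reconyx)."""
    file_name = os.path.basename(image_path)
    return "c2" if file_name.upper().startswith("IMG") else "c1"


def build_image_name(image_path, dt, number, site="s1"):
    """Build '<site><camera>_<YYYYmmddHHMMSS>_<number>.jpg' (assumed layout)."""
    stamp = dt.strftime("%Y%m%d%H%M%S")
    return f"{site}{camera_code(image_path)}_{stamp}_{number:03d}.jpg"


dt = datetime.datetime(2019, 5, 27, 3, 33, 3)
print(build_image_name("./data/RCNX0224.JPG", dt, 1))  # → s1c1_20190527033303_001.jpg
print(build_image_name("./data/IMG_0073.JPG", dt, 2))  # → s1c2_20190527033303_002.jpg
```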

In [23]:
image = "./data/RCNX0037.jpg"

In [117]:
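def rename_image(image_path):
    """Take in image path and rename image by site, camera, and datetime."""
    # Get image datetime. `Image` here is assumed to be exif.Image (from the
    # `exif` package); PIL's Image module is not callable and has no .datetime.
    with open(image_path, 'rb') as image_file:
        image = Image(image_file)
    dt_str = image.datetime
    dt = datetime.datetime.strptime(dt_str, '%Y:%m:%d %H:%M:%S')
    new_dt_str = dt.strftime("%Y%m%d%H%M%S")
    # Browning files are named IMG_*; check the file name, not the full path,
    # because the path starts with the directory (e.g. "./data/").
    file_name = os.path.basename(image_path)
    camera = 'c2' if file_name.startswith('IMG') else 'c1'
    name = "./data/s1" + camera + "_" + new_dt_str + ".jpg"
    try:
        os.rename(image_path, name)
    except FileExistsError:
        # Another image already has this datetime: retry one second later.
        # This can still collide if that name is taken as well.
        new_dt_str = (dt + datetime.timedelta(0, 1)).strftime("%Y%m%d%H%M%S")
        name = "./data/s1" + camera + "_" + new_dt_str + ".jpg"
        os.rename(image_path, name)
    print(f"{file_name} renamed!")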

In [118]:
active_dir = "./data/"
for image in os.listdir(active_dir):
    rename_image(active_dir + image)

RCNX0212.JPG renamed!
RCNX0213.JPG renamed!
RCNX0214.JPG renamed!
RCNX0215.JPG renamed!
RCNX0216.JPG renamed!
RCNX0217.JPG renamed!
RCNX0218.JPG renamed!
RCNX0219.JPG renamed!
RCNX0220.JPG renamed!
RCNX0221.JPG renamed!
RCNX0222.JPG renamed!
RCNX0223.JPG renamed!


FileExistsError: [WinError 183] Cannot create a file when that file already exists: './data/RCNX0224.JPG' -> './data/s1c1_20190527033304.jpg'
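
The `FileExistsError` above indicates that the target name was already taken by another image. Rather than shifting the datetime, one possible alternative, sketched here under the assumption that a numeric suffix is acceptable in the file name, is to try increasing suffixes until an unused name is found. `rename_unique` is a hypothetical helper, not part of `image_preprocessing`.

```python
import os


def rename_unique(src_path, dest_dir, stem, ext=".jpg"):
    """Rename src_path to '<stem>_<n><ext>' in dest_dir, using the first free n.

    Hypothetical helper: returns the new path. Checking os.path.exists first
    also matters on POSIX systems, where os.rename silently overwrites files.
    """
    number = 1
    while True:
        candidate = os.path.join(dest_dir, f"{stem}_{number}{ext}")
        if not os.path.exists(candidate):
            os.rename(src_path, candidate)
            return candidate
        number += 1
```

For example, `rename_unique("./data/RCNX0224.JPG", "./data/", "s1c1_20190527033304")` would produce `s1c1_20190527033304_1.jpg`, or `_2`, `_3`, and so on if earlier suffixes are already in use.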

In [124]:
dt_str = "2019:01:01 2:59:59"
dt = datetime.datetime.strptime(dt_str, '%Y:%m:%d %H:%M:%S')
# The parsed datetime has no sub-second part, so %f always yields 000000.
new_dt_str = dt.strftime("%Y%m%d%H%M%S%f")
new_dt_str

'20190101025959000000'

In [135]:
image_path = "./data/RCNX0224.JPG"
# `Image` is assumed to be exif.Image here; `time` is a timedelta defined in a cell not shown.
with open(image_path, 'rb') as image_file:
    image = Image(image_file)
dt_str = image.datetime
dt = datetime.datetime.strptime(dt_str, '%Y:%m:%d %H:%M:%S')
new_dt_str = (dt + time).strftime("%Y%m%d%H%M%S%f")
new_dt_str

'20190527033303000000'

In [183]:
# Open the image with PIL.
img = Image.open(image_path)
# Load the EXIF data into a dictionary.
exif_dict = piexif.load(img.info['exif'])
# Read the current datetime (tag 306 is DateTime in the 0th IFD).
old_dt_str = exif_dict["0th"][306]
print(old_dt_str)

In [184]:
exif_dict

{'0th': {271: b'RECONYX',
  272: b'HYPERFIRE 2 COVERT\x00',
  282: (72, 1),
  283: (72, 1),
  296: 2,
  306: b'2019:01:01 1:20:00',
  531: 2,
  34665: 173},
 'Exif': {33434: (25, 9600),
  34855: 100,
  36864: b'0220',
  36867: b'2019:01:01 1:20:00',
  36868: b'2019:01:01 1:20:00',
  37121: b'\x01\x02\x03\x00',
  37385: 25,
  37500: b'RECONYXH2\x00\x01\xce\x00\x00\x00\x00\xe0\x00d\x00\xbaN\x91\xa4\x12\x01\x1a\x98\xd2|\x13G23AYAP\x01\x00\x01\xc1\x01\x00\x04\x00a\x00\x18 $\x08M\x00\x07\x00\n\x00\x00\x00\x06\x00\x03\x00!\x00\x03\x00\x05\x00\x1b\x00\xe3\x07\x02\x00\x06\x005\x00\x0c\x00\x80\x00\x00\x00 \x00\x00\x00\x01\x00\x00\x00\x00\x00Z\x00\x0f\x10F\x1e\x01\x00HYPERFIRE 2 COVERT\x00\x00\x00\x00H\x002\x00R\x00X\x00E\x00L\x001\x002\x000\x002\x007\x007\x003\x006\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\xc2\x07\x00\n\x01\x01\x00\n\x00\x00\x00\n\x00\x14\x00<\x00\x04\x00\x01\x00\x00\x0

In [182]:
new_dt_str = "2019:01:01 1:20:00"
# Replace the EXIF datetimes with the updated value:
# 306 = DateTime, 36867 = DateTimeOriginal, 36868 = DateTimeDigitized.
exif_dict["0th"][306] = new_dt_str.encode("utf-8")
print(exif_dict['0th'][306])
exif_dict["Exif"][36868] = new_dt_str.encode("utf-8")
exif_dict["Exif"][36867] = new_dt_str.encode("utf-8")
# Convert the dictionary to bytes and write it into the image file.
exif_bytes = piexif.dump(exif_dict)
piexif.insert(exif_bytes, image_path)
print("Updated exif datetimes!")

b'2019:05:22 3:44:06'
b'2019:01:01 1:20:00'
Updated exif datetimes!
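
The three numeric keys used in this cell are the standard EXIF tag IDs for DateTime (306), DateTimeOriginal (36867), and DateTimeDigitized (36868). As a small sketch, the update can be wrapped in a helper that works on any dictionary shaped like the one returned by `piexif.load`. The helper name `set_exif_datetimes` is hypothetical, and the example runs on a plain dictionary so it does not touch any image file.

```python
# Standard EXIF tag IDs, grouped by the IFD that holds them.
DATETIME_TAGS = {
    "0th": (306,),            # DateTime
    "Exif": (36867, 36868),   # DateTimeOriginal, DateTimeDigitized
}


def set_exif_datetimes(exif_dict, new_dt_str):
    """Set every datetime tag in exif_dict to new_dt_str (hypothetical helper).

    exif_dict is assumed to have the layout returned by piexif.load.
    Returns the previous values so the change can be checked or undone.
    """
    encoded = new_dt_str.encode("utf-8")
    previous = {}
    for ifd_name, tags in DATETIME_TAGS.items():
        ifd = exif_dict.setdefault(ifd_name, {})
        for tag in tags:
            previous[(ifd_name, tag)] = ifd.get(tag)
            ifd[tag] = encoded
    return previous


sample = {"0th": {306: b"2019:05:22 3:44:06"}, "Exif": {}}
old = set_exif_datetimes(sample, "2019:01:01 1:20:00")
print(old[("0th", 306)], sample["0th"][306])
```

Returning the old values makes it easy to log what was overwritten before the dictionary is dumped back into the file.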


In [193]:
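for ifd_name in exif_dict:
    print("\n{0} IFD:".format(ifd_name))
    # exif_dict["thumbnail"] is None when there is no embedded thumbnail,
    # so this inner loop raises the TypeError shown at the end of the output.
    for key in exif_dict[ifd_name]:
        try:
            # Truncate long values (e.g. the maker note) to 30 items.
            print(key, exif_dict[ifd_name][key][:30])
        except TypeError:
            # Integers cannot be sliced; print them whole.
            print(key, exif_dict[ifd_name][key])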


0th IFD:
271 b'RECONYX'
272 b'HYPERFIRE 2 COVERT\x00'
282 (72, 1)
283 (72, 1)
296 2
306 b'2019:01:01 1:20:00'
531 2
34665 173

Exif IFD:
33434 (25, 9600)
34855 100
36864 b'0220'
36867 b'2019:01:01 1:20:00'
36868 b'2019:01:01 1:20:00'
37121 b'\x01\x02\x03\x00'
37385 25
37500 b'RECONYXH2\x00\x01\xce\x00\x00\x00\x00\xe0\x00d\x00\xbaN\x91\xa4\x12\x01\x1a\x98\xd2|'
40960 b'0100'
40961 1
40962 2048
40963 1440
41986 0
41987 1
41990 3

GPS IFD:

Interop IFD:

1st IFD:

thumbnail IFD:


TypeError: 'NoneType' object is not iterable
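
The `TypeError` above arises because the `thumbnail` entry is `None` rather than a dictionary when the image has no embedded thumbnail. A sketch of a loop that tolerates this, assuming the dictionary layout shown in the output, could look like the following. `summarize_exif` is a hypothetical helper that returns the lines instead of printing them.

```python
def summarize_exif(exif_dict, max_len=30):
    """Return printable lines for each IFD, skipping entries that are not dicts."""
    lines = []
    for ifd_name, ifd in exif_dict.items():
        lines.append(f"{ifd_name} IFD:")
        if not isinstance(ifd, dict):
            # e.g. "thumbnail" is None (no thumbnail) or raw bytes (thumbnail data)
            continue
        for key, value in ifd.items():
            # Only bytes values are truncated; numbers and tuples are shown whole.
            if isinstance(value, bytes):
                value = value[:max_len]
            lines.append(f"{key} {value!r}")
    return lines


sample = {"0th": {271: b"RECONYX", 296: 2}, "GPS": {}, "thumbnail": None}
print("\n".join(summarize_exif(sample)))
```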

In [1]:
import exiftool

In [9]:
os.system("./data/RCNX0224.JPG")

1

In [11]:
print(os.popen("exiftool -b -imagesize RCNX0224.JPG").read())




In [8]:
!pip install PyExifTool



In [1]:
import exiftool
import os

executed exiftool.


In [18]:
with exiftool.ExifTool() as et:
    metadata = et.get_metadata("./data/RCNX0224.JPG")
    
for k, v in metadata.items():
    print(k, v)

SourceFile ./data/RCNX0224.JPG
ExifTool:ExifToolVersion 11.74
File:FileName RCNX0224.JPG
File:Directory ./data
File:FileSize 143600
File:FileModifyDate 2019:11:06 08:57:57-08:00
File:FileAccessDate 2019:11:06 08:57:57-08:00
File:FileCreateDate 2019:05:27 04:33:02-07:00
File:FilePermissions 666
File:FileType JPEG
File:FileTypeExtension JPG
File:MIMEType image/jpeg
File:ExifByteOrder MM
File:ImageWidth 2048
File:ImageHeight 1440
File:EncodingProcess 0
File:BitsPerSample 8
File:ColorComponents 3
File:YCbCrSubSampling 2 1
EXIF:Make RECONYX
EXIF:Model HYPERFIRE 2 COVERT
EXIF:XResolution 72
EXIF:YResolution 72
EXIF:ResolutionUnit 2
EXIF:ModifyDate 2019:01:01 1:20:00
EXIF:YCbCrPositioning 2
EXIF:ExposureTime 0.002604166667
EXIF:ISO 100
EXIF:ExifVersion 0220
EXIF:DateTimeOriginal 2019:01:01 1:20:00
EXIF:CreateDate 2019:01:01 1:20:00
EXIF:ComponentsConfiguration 1 2 3 0
EXIF:Flash 25
EXIF:FlashpixVersion 0100
EXIF:ColorSpace 1
EXIF:ExifImageWidth 2048
EXIF:ExifImageHeight 1440
EXIF:ExposureMode
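
Since the metadata comes back as a flat dictionary keyed by `Group:Tag`, the datetime fields can be pulled out with a simple filter. The sketch below runs on a hand-copied subset of the output above rather than calling ExifTool, and `exif_datetimes` is a hypothetical helper.

```python
import datetime

EXIF_DATE_TAGS = ("EXIF:ModifyDate", "EXIF:DateTimeOriginal", "EXIF:CreateDate")


def exif_datetimes(metadata):
    """Parse the EXIF datetime tags present in an ExifTool-style metadata dict."""
    parsed = {}
    for tag in EXIF_DATE_TAGS:
        if tag in metadata:
            parsed[tag] = datetime.datetime.strptime(
                str(metadata[tag]), "%Y:%m:%d %H:%M:%S"
            )
    return parsed


# Subset of the RCNX0224.JPG metadata printed above.
metadata = {
    "SourceFile": "./data/RCNX0224.JPG",
    "EXIF:Make": "RECONYX",
    "EXIF:ModifyDate": "2019:01:01 1:20:00",
    "EXIF:DateTimeOriginal": "2019:01:01 1:20:00",
    "EXIF:CreateDate": "2019:01:01 1:20:00",
}
for tag, value in exif_datetimes(metadata).items():
    print(tag, value.isoformat())
```

All three tags parse to the same value here, which is a quick way to confirm that an EXIF update touched every datetime field.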

In [14]:
with exiftool.ExifTool() as et:
    metadata = et.get_metadata("./data/IMG_0073.JPG")
    
for k, v in metadata.items():
    print(k, v)

SourceFile ./data/IMG_0073.JPG
ExifTool:ExifToolVersion 11.74
File:FileName IMG_0073.JPG
File:Directory ./data
File:FileSize 985534
File:FileModifyDate 2019:11:04 21:57:01-08:00
File:FileAccessDate 2019:11:05 14:44:11-08:00
File:FileCreateDate 2019:11:05 14:44:11-08:00
File:FilePermissions 666
File:FileType JPEG
File:FileTypeExtension JPG
File:MIMEType image/jpeg
File:ExifByteOrder II
File:ImageWidth 5376
File:ImageHeight 3024
File:EncodingProcess 0
File:BitsPerSample 8
File:ColorComponents 3
File:YCbCrSubSampling 2 1
EXIF:Make Prometheus
EXIF:Model BTC-6PXD
EXIF:Orientation 1
EXIF:XResolution 72
EXIF:YResolution 72
EXIF:ResolutionUnit 2
EXIF:Software 6PXD-V1804270
EXIF:ModifyDate 2019:06:03 04:45:33
EXIF:YCbCrPositioning 2
EXIF:ExposureTime 0.04166666667
EXIF:FNumber 2.4
EXIF:ExposureProgram 2
EXIF:ISO 100
EXIF:ExifVersion 0220
EXIF:DateTimeOriginal 2019:06:03 04:45:33
EXIF:CreateDate 2019:06:03 04:45:33
EXIF:ComponentsConfiguration 1 2 3 0
EXIF:CompressedBitsPerPixel undef
EXIF:Shutt