<h1><center>Austin Animal Center Shelter Outcomes Data</center></h1>
<h3><center>Sam Loyd</center></h3>
<h3><center>DSC 540</center></h3>
<h3><center>January 2020</center></h3>


This data can be found at 

https://www.kaggle.com/aaronschlegel/austin-animal-center-shelter-outcomes-and

Context quoted from site above:

##### The Austin Animal Center is the largest no-kill animal shelter in the United States that provides care and shelter to over 18,000 animals each year and is involved in a range of county, city, and state-wide initiatives for the protection and care of abandoned, at-risk, and surrendered animals.

##### As part of the City of Austin Open Data Initiative, the Austin Animal Center makes available its collected dataset that contains statistics and outcomes of animals entering the Austin Animal Services system (Austin Animal Shelter, 2017).

The following list of variables was taken from the sparsely populated codebook at the site above.

##### 37 Variables:  
age_upon_outcome Age of the animal at the time at which it left the shelter.  
animal_id  
animal_type Cat, dog, or other (including at least one bat!).  
breed Animal breed. Many animals are generic mixed-breeds, e.g. "Long-haired mix".  
color Color of the animal's fur, if it has fur.  
date_of_birth  
datetime Timestamp of outcome  
monthyear  
name  
outcome_subtype  
outcome_type Ultimate outcome for this animal. Possible entries include transferred, [mercy] euthanized, adopted.  
sex_upon_outcome  
count  
sex  
Spay/Neuter  
Periods  
Period Range  
outcome_age_(days)  
outcome_age_(years)  
Cat/Kitten (outcome)  
sex_age_outcome  
age_group  
dob_year  
dob_month  
dob_monthyear  
outcome_month  
outcome_year  
outcome_weekday  
outcome_hour  
breed1  
breed2  
cfa_breed  
domestic_breed  
coat_pattern  
color1  
color2  
coat  

### Replace headers (Data Wrangling with Python pg. 154 – 163)

In [1]:
# Required modules
import pprint
import numpy as np
import datetime
from fuzzywuzzy import fuzz
from fuzzywuzzy import process
import re
from termcolor import colored

In [2]:
# The data set that I chose came from Kaggle at the above address.
# The data set did not contain a matching header file so I manually created one from the 
# information that I had.

from csv import DictReader

# Use context managers so both files are closed once they have been read.
with open('aac_shelter_cat_outcome_eng.csv', 'rt') as data_file:
    data_rows = list(DictReader(data_file))

with open('aac_replacement_headers.csv', 'rt') as header_file:
    header_rows = list(DictReader(header_file))

##### Basic look at the data

In [3]:
# Count rows
print(len(data_rows))

29421


In [4]:
# Count headers from generated file.
print(len(header_rows))

37


In [5]:
# View data_rows
data_rows

[OrderedDict([('age_upon_outcome', '2 weeks'),
              ('animal_id', 'A684346'),
              ('animal_type', 'Cat'),
              ('breed', 'domestic shorthair'),
              ('color', 'orange'),
              ('date_of_birth', '2014-07-07 00:00:00'),
              ('datetime', '2014-07-22 16:04:00'),
              ('monthyear', '2014-07-22T16:04:00'),
              ('name', ''),
              ('outcome_subtype', 'Partner'),
              ('outcome_type', 'Transfer'),
              ('sex_upon_outcome', 'Intact Male'),
              ('count', '1'),
              ('sex', 'Male'),
              ('Spay/Neuter', 'No'),
              ('Periods', '2'),
              ('Period Range', '7'),
              ('outcome_age_(days)', '14'),
              ('outcome_age_(years)', '0.038356164383561646'),
              ('Cat/Kitten (outcome)', 'Kitten'),
              ('sex_age_outcome', 'Intact Male Kitten'),
              ('age_group', '(-0.022, 2.2]'),
              ('dob_year', '2014'),
  

##### Data that I generated in a csv file to create new headers from.

In [6]:
# View header_rows
header_rows

[OrderedDict([('Column', 'age_upon_outcome'),
              ('Title', 'Outcome Age'),
              ('Explanation', 'Age upon leaving shelter.')]),
 OrderedDict([('Column', 'animal_id'),
              ('Title', 'Animal ID'),
              ('Explanation', '')]),
 OrderedDict([('Column', 'animal_type'),
              ('Title', 'Animal Type'),
              ('Explanation', 'Animal breed.  Many are generic.')]),
 OrderedDict([('Column', 'breed'), ('Title', 'Breed'), ('Explanation', '')]),
 OrderedDict([('Column', 'color'),
              ('Title', 'Color'),
              ('Explanation', "Color of Animal's fur if applicable.")]),
 OrderedDict([('Column', 'date_of_birth'),
              ('Title', 'Date Of Birth'),
              ('Explanation', '')]),
 OrderedDict([('Column', 'datetime'),
              ('Title', 'Outcome Age Timestamp'),
              ('Explanation', 'TimeStamp of outcome.')]),
 OrderedDict([('Column', 'monthyear'),
              ('Title', 'Outcome Month-Year'),
              

##### Header Replacement in a list of dictionaries.

In [7]:
# Run through the list of data rows.  Match the Column value in header_rows to the
# actual column headers in the data rows.  If matched, replace the column header
# from the data rows with the matching Title value and create a new list of dictionaries
# for further analysis.

new_list = []
for dict_data_rows in data_rows:
    new_row = {}
    for dictkey, dictval in dict_data_rows.items():
        for dict_header_rows in header_rows:

            # Caution: `in` is a substring test, not an exact comparison, so a short key
            # (e.g. 'sex') can also match a longer Column value (e.g. 'sex_upon_outcome').
            check_val = dict_header_rows.get('Column')
            replace_val = dict_header_rows.get('Title')          
            if dictkey in check_val:
                new_row[replace_val] = dictval
                
    new_list.append(new_row)
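
The header replacement can also be written with an exact-match lookup. This is a minimal sketch, assuming each header row carries the `Column` and `Title` keys shown above; the function name `replace_headers` and the sample rows are hypothetical. Building the lookup dictionary once removes the inner loop and keeps a short column name from matching a longer one.

```python
def replace_headers(data_rows, header_rows):
    """Return new rows whose keys are the Title mapped from each exact Column name."""
    # Map each original column name to its replacement title.
    title_for = {h['Column']: h['Title'] for h in header_rows}
    renamed = []
    for row in data_rows:
        # Columns without a replacement title are skipped.
        renamed.append({title_for[k]: v for k, v in row.items() if k in title_for})
    return renamed


# Hypothetical sample rows for illustration only.
sample_headers = [
    {'Column': 'sex', 'Title': 'Sex'},
    {'Column': 'sex_upon_outcome', 'Title': 'Outcome Sex'},
]
sample_data = [{'sex': 'Male', 'sex_upon_outcome': 'Intact Male'}]

renamed_sample = replace_headers(sample_data, sample_headers)
print(renamed_sample)
```

With exact matching, each source column contributes only to its own title, so one column's value cannot overwrite another's.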

In [8]:
# Show first value to illustrate new titles.
new_list[0]

{'Outcome Age': '2 weeks',
 'Animal ID': 'A684346',
 'Animal Type': 'Cat',
 'Breed': 'domestic shorthair',
 'Primary Breed': 'domestic shorthair',
 'Secondary Breed ': '',
 'CFA Breed': 'False',
 'Domestic Breed': 'True',
 'Color': 'orange',
 'Primary Color': 'orange',
 'Secondary Color': '',
 'Date Of Birth': '2014-07-07 00:00:00',
 'Outcome Age Timestamp': '2014-07-22 16:04:00',
 'Outcome Month-Year': '2014-07-22T16:04:00',
 'DOB Month-Year': '2014-07',
 'Name': '',
 'Outcome Subtype': 'Partner',
 'Outcome Type': 'Transfer',
 'Outcome Sex': 'Male',
 'Count': '1',
 'Sex': 'Male',
 'Outcome Sex Age': 'Intact Male Kitten',
 'Spay-Neuter': 'No',
 'Periods': '2',
 'Period Range': '7',
 'Outcome Age in Days': '14',
 'Outcome Age in Years': '0.038356164383561646',
 'Outcome Cat-Kitten': 'Kitten',
 'Age Group': '(-0.022, 2.2]',
 'DOB Year': '2014',
 'DOB Month': '7',
 'Exit Month': '7',
 'Outcome Year': '2014',
 'Outcome Weekday': 'Tuesday',
 'Outcome Hour': '16',
 'Coat Pattern': 'orange',


### Format Data to a Readable Format (Data Wrangling with Python pg. 164 – 168)

In [9]:
# Place in a human readable format showing first row 
print("Query from the first entry (row 0) in the data set:")
for dictkey, dictval in new_list[0].items():
    # Clean up a few values and illustrate date formatting 
    # Get rid of the T in the output
    if dictkey == "Outcome Month-Year":
        # DEBUG print(type(dictval))
        # Use substitution to remove the T
        fixdictval = re.sub("T", " ", dictval)
        # DEBUG print(fixdictval)
        dictval = datetime.datetime.strptime(fixdictval, '%Y-%m-%d %H:%M:%S')
    # Decimal is unnecessarily long
    elif dictkey == "Outcome Age in Years":
        # DEBUG print(type(dictval))
        floatdict = round(float(dictval),2)
        dictval = floatdict
    # The remaining timestamps use '-' as the date separator, matching the format string below
    elif dictval.find('-') > 0 and dictval.find(':') > 0:
        dictval = datetime.datetime.strptime(dictval, '%Y-%m-%d %H:%M:%S')
    dictkey = dictkey + "?"
    # Change color to highlight answers.
    print('\n  Question: What is the','{}'.format(dictkey),'\n    Answer:' \
          ,colored('{}','red').format(dictval))

Query from the first entry (row 0) in the data set:

  Question: What is the Outcome Age?
    Answer: 2 weeks

  Question: What is the Animal ID?
    Answer: A684346

  Question: What is the Animal Type?
    Answer: Cat

  Question: What is the Breed?
    Answer: domestic shorthair

  Question: What is the Primary Breed?
    Answer: domestic shorthair

  Question: What is the Secondary Breed ?
    Answer:

  Question: What is the CFA Breed?
    Answer: False

  Question: What is the Domestic Breed?
    Answer: True

  Question: What is the Color?
    Answer: orange

  Question: What is the Primary Color?
    Answer: orange

  Question: What is the Secondary Color?
    Answer:

  Question: What is the Date Of Birth?
    Answer: 2014-07-07 00:00:00

  Question: What is the Outcome Age Timestamp?
    Answer: 2014-07-22 16:04:00

  Question: What is the Outcom
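
As an alternative to substituting the `T` before parsing, here is a small sketch that assumes Python 3.7 or later, where `datetime.datetime.fromisoformat` accepts the `T` separator directly. The helper name `parse_outcome_timestamp` is hypothetical; the two sample values are taken from row 0 above.

```python
import datetime


def parse_outcome_timestamp(text):
    """Parse a timestamp such as '2014-07-22T16:04:00' into a datetime object."""
    return datetime.datetime.fromisoformat(text)


parsed = parse_outcome_timestamp('2014-07-22T16:04:00')
# Shorten the long decimal to two places for display.
rounded_years = round(float('0.038356164383561646'), 2)

print(parsed)
print(rounded_years)
```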

### Identify outliers and bad data (Data Wrangling with Python pg. 169 – 174)

 ##### Find Missing Values - Missingness

In [10]:
# I chose to create a dictionary totalling the columns with missing values and counting them
index = 0
mcount = 0
misscol = {}
while index < len(new_list):
    for dictkey, dictval in new_list[index].items():
        if not dictval:
            mcount += 1
            # Using get() 
            # Increment value in dictionary 
            misscol[dictkey] = misscol.get(dictkey, 0) + 1
            # DEBUG print("Missing value for {} {}:".format(dictkey,dictval))
    index += 1
print("\nColumns with missing values showing totals:\n")
pprint.pprint(misscol)
print("\nTotal missing values: {}".format(mcount))


Columns with missing values showing totals:

{'Color': 3626,
 'Name': 12774,
 'Outcome Subtype': 10780,
 'Outcome Type': 3,
 'Secondary Breed ': 29369,
 'Secondary Color': 19067}

Total missing values: 75619
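
The same tally can be sketched with `collections.Counter`, which removes the manual index and the `get()` bookkeeping. The function name `count_missing` and the sample rows are hypothetical. Note one assumption that differs from the cell above: this version also counts whitespace-only strings as missing.

```python
from collections import Counter


def count_missing(rows):
    """Count empty or whitespace-only values per column in a list of dictionaries."""
    missing = Counter()
    for row in rows:
        for column, value in row.items():
            # strip() turns a whitespace-only string into '', which is falsy.
            if not value.strip():
                missing[column] += 1
    return missing


# Hypothetical sample rows for illustration only.
sample_rows = [
    {'Name': '', 'Color': 'orange'},
    {'Name': 'Bob', 'Color': ' '},
    {'Name': '', 'Color': 'white'},
]

missing_counts = count_missing(sample_rows)
total_missing = sum(missing_counts.values())
print(dict(missing_counts))
print(total_missing)
```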


 ##### Find Missing Values - NA, na ...

In [11]:
# I chose to create a dictionary totalling the columns with NA values and counting them
# Similar to code above - none were found in this data set.

nalist = ["NA", "Na", "na"]
nindex = 0
ncount = 0
nacol = {}
while nindex < len(new_list):
    # print(nindex)
    for dictkey, dictval in new_list[nindex].items():
         # print(dictval)
        if dictval in nalist:
            ncount += 1
            # Using get() 
            # Increment value in dictionary 
            nacol[dictkey] = nacol.get(dictkey, 0) + 1
    nindex += 1
print("\nColumns with NA values showing totals:\n")
pprint.pprint(nacol)
print("\nTotal NA values: {}".format(ncount))


Columns with NA values showing totals:

{}

Total NA values: 0
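
A case-insensitive variant of the NA check can be sketched as follows. The token set `NA_TOKENS` is an assumption for illustration (the cell above tests only `"NA"`, `"Na"`, and `"na"`), and the helper name `is_na` is hypothetical.

```python
# Assumed placeholder spellings; adjust to whatever the data set actually uses.
NA_TOKENS = {'na', 'n/a', 'null', 'none'}


def is_na(value):
    """Return True when a string is a recognised NA placeholder, ignoring case and padding."""
    return value.strip().lower() in NA_TOKENS


# Hypothetical sample values for illustration only.
sample_values = ['NA', ' n/a ', 'Banana', '', 'None']
flags = [is_na(v) for v in sample_values]
print(flags)
```

Normalizing with `strip()` and `lower()` avoids listing every capitalization by hand, while an exact set lookup keeps words such as "Banana" from matching.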


#### Evaluate data types of each column

In [12]:
# Initialize values
datatypes = {} 

start_dict = {'digit': 0, 'boolean': 0,
              'empty': 0, 'time_related': 0,
              'text': 0, 'alphanumeric': 0, 'unknown': 0
              } 
xindex = 0
xcount = 0
xcol = {}

# Loop through total count of list
while xindex < len(new_list):
    # print(nindex)
    # loop through every dictionary element in list of dictionaries
    for dictkey,dictval in new_list[xindex].items():
        # print(dictval)
        question = dictkey
        answer = dictval
        key = 'unknown' 
        # Is it a digit?
        if answer.isdigit(): 
            key = 'digit'
        # Is it a boolean?
        elif answer in ['Yes', 'No', 'True', 'False']: 
            key = 'boolean'
        # Empty?  An empty string is not whitespace, so test for it explicitly.
        elif not answer or answer.isspace(): 
            key = 'empty'
        # Does it indicate date type?
        elif answer.find('/') > 0 or answer.find(':') > 0: 
            key = 'time_related'
        # Is it text?
        elif answer.isalpha(): 
            key = 'text'
        # Is it alphanumeric?
        elif answer.isalnum(): 
            key = 'alphanumeric'
        # Anything not matched above keeps the default key, 'unknown'.
        # Initialize the counts the first time a column is seen.
        if question not in datatypes: 
            datatypes[question] = start_dict.copy() 
        datatypes[question][key] += 1 
    xindex += 1
print("\nDictionary of dictionaries containing columns and analysis of data type with totals:\n")
pprint.pprint(datatypes)


Dictionary of dictionaries containing columns and analysis of data type with totals:

{'Age Group': {'alphanumeric': 0,
               'boolean': 0,
               'digit': 0,
               'empty': 0,
               'text': 0,
               'time_related': 0,
               'unknown': 29421},
 'Animal ID': {'alphanumeric': 29421,
               'boolean': 0,
               'digit': 0,
               'empty': 0,
               'text': 0,
               'time_related': 0,
               'unknown': 0},
 'Animal Type': {'alphanumeric': 0,
                 'boolean': 0,
                 'digit': 0,
                 'empty': 0,
                 'text': 29421,
                 'time_related': 0,
                 'unknown': 0},
 'Breed': {'alphanumeric': 0,
           'boolean': 0,
           'digit': 0,
           'empty': 0,
           'text': 1459,
           'time_related': 52,
           'unknown': 27910},
 'CFA Breed': {'alphanumeric': 0,
               'boolean': 29421,
            

### Find Duplicates

##### First discover what makes the data unique

In [13]:
# Variable initialization
dictlist = []
pairlist = []
colorlist = []
dindex = 0
# dcount = 0
# dupcol = {}

# The code example from the text would allow me to do this with shorter code, but given 
# that I was testing two different possible unique values from the start, this method allowed me to 
# run through the data once while addressing both possibilities.  
while dindex < len(new_list):
    for dictkey, dictval in new_list[dindex].items():
        if dictkey == "Animal ID":
            # Just looking at animal id
            dictlist.append(dictval)
            # Storing for later pairing with outcome date
            animal = dictval
            # Create pair for further analysis
        if dictkey == "Outcome Age Timestamp":
            # DEBUG print(dictval)
            outcomedt = datetime.datetime.strptime(dictval, '%Y-%m-%d %H:%M:%S')
            # print(outcomedt)
            # Build my own key from the animal id and the formatted outcome timestamp
            pairlist.append(animal + "-" + outcomedt.strftime("%Y-%m-%d %H:%M:%S"))
    dindex += 1

##### Animal ID alone is not enough to express uniqueness as this proves.

In [14]:
# Is animal id unique - NO
unq, count = np.unique(dictlist, return_counts=True)    

print("\nSummary of duplicates only looking at animal id:\n")
print(unq[count>1])

# Count the non-unique values
dcount = len(unq[count>1])

print("\nTotal animal id values with duplicates: {}".format(dcount))


Summary of duplicates only looking at animal id:

['A282897' 'A304036' 'A318574' ... 'A764345' 'A764565' 'A765350']

Total animal id values with duplicates: 1107
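
The duplicate check can also be sketched without NumPy by counting values with `collections.Counter`. The function name `duplicated_values` and the sample IDs are hypothetical; the column name `Animal ID` comes from the renamed headers above.

```python
from collections import Counter


def duplicated_values(rows, column):
    """Return the sorted values of `column` that appear in more than one row."""
    counts = Counter(row[column] for row in rows)
    return sorted(value for value, n in counts.items() if n > 1)


# Hypothetical sample rows for illustration only.
sample_rows = [
    {'Animal ID': 'A1'},
    {'Animal ID': 'A2'},
    {'Animal ID': 'A1'},
    {'Animal ID': 'A3'},
    {'Animal ID': 'A3'},
]

repeated_ids = duplicated_values(sample_rows, 'Animal ID')
print(repeated_ids)
print(len(repeated_ids))
```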


##### A new key is required.  I selected animal id and outcome date.

In [15]:
# Is animal id plus outcome date unique - YES, but legitimate duplicates remain and need to be dealt with. 
unq, count = np.unique(pairlist, return_counts=True) 

# Count the non-unique values
dcount = len(unq[count>1])

print("\nSummary of duplicates looking at the combination of animal id and outcome date after removing special characters:\n")
duplist = unq[count>1].tolist()
pprint.pprint(duplist)


print("\nTotal animal id plus outcome date values (using new key) with duplicates: {}".format(dcount))


Summary of duplicates looking at the combination of animal id and outcome date after removing special characters:

['A660948-2013-11-03 18:16:00',
 'A682781-2014-07-03 09:00:00',
 'A683782-2014-07-16 09:00:00',
 'A686827-2014-08-28 09:00:00',
 'A695798-2015-01-23 12:34:00',
 'A696688-2015-03-26 19:05:00',
 'A748204-2017-05-04 18:04:00']

Total animal id plus outcome date values (using new key) with duplicates: 7


##### Validate remaining duplicates are actually duplicates.

In [16]:
# Run through data again showing duplicate entries.
# Since there were only 7 rows duplicated a visual verification should suffice.

dictlist = []
pairlist = []
colorlist = []
dindex = 0

while dindex < len(new_list):
    for dictkey, dictval in new_list[dindex].items():
        if dictkey == "Animal ID":
            # Just looking at animal id
            dictlist.append(dictval)
            # Storing for later pairing with outcome date
            animal = dictval
            # Create pair for further analysis
        if dictkey == "Outcome Age Timestamp":
            # DEBUG print(dictval)
            outcomedt = datetime.datetime.strptime(dictval, '%Y-%m-%d %H:%M:%S')
            # print(outcomedt)
            # Rebuild the same key and print the row if it is one of the duplicates
            if (animal + "-" + outcomedt.strftime("%Y-%m-%d %H:%M:%S")) in duplist:
                print(new_list[dindex])
    dindex += 1

{'Outcome Age': '2 years', 'Animal ID': 'A686827', 'Animal Type': 'Cat', 'Breed': 'domestic shorthair', 'Primary Breed': 'domestic shorthair', 'Secondary Breed ': '', 'CFA Breed': 'False', 'Domestic Breed': 'True', 'Color': '', 'Primary Color': 'Breed Specific', 'Secondary Color': '', 'Date Of Birth': '2012-02-27 00:00:00', 'Outcome Age Timestamp': '2014-08-28 09:00:00', 'Outcome Month-Year': '2014-08-28T09:00:00', 'DOB Month-Year': '2014-08', 'Name': '', 'Outcome Subtype': 'SCRP', 'Outcome Type': 'Transfer', 'Outcome Sex': 'Female', 'Count': '1', 'Sex': 'Female', 'Outcome Sex Age': 'Spayed Female Cat', 'Spay-Neuter': 'Yes', 'Periods': '2', 'Period Range': '365', 'Outcome Age in Days': '730', 'Outcome Age in Years': '2.0', 'Outcome Cat-Kitten': 'Cat', 'Age Group': '(-0.022, 2.2]', 'DOB Year': '2012', 'DOB Month': '2', 'Exit Month': '8', 'Outcome Year': '2014', 'Outcome Weekday': 'Thursday', 'Outcome Hour': '9', 'Coat Pattern': 'calico', 'Coat': 'calico'}
{'Outcome Age': '2 years', 'Ani

{'Outcome Age': '2 months', 'Animal ID': 'A748204', 'Animal Type': 'Cat', 'Breed': 'domestic shorthair', 'Primary Breed': 'domestic shorthair', 'Secondary Breed ': '', 'CFA Breed': 'False', 'Domestic Breed': 'True', 'Color': 'white/blue', 'Primary Color': 'white', 'Secondary Color': 'blue', 'Date Of Birth': '2017-02-04 00:00:00', 'Outcome Age Timestamp': '2017-05-04 18:04:00', 'Outcome Month-Year': '2017-05-04T18:04:00', 'DOB Month-Year': '2017-05', 'Name': 'Bob', 'Outcome Subtype': '', 'Outcome Type': 'Adoption', 'Outcome Sex': 'Male', 'Count': '1', 'Sex': 'Male', 'Outcome Sex Age': 'Neutered Male Kitten', 'Spay-Neuter': 'Yes', 'Periods': '2', 'Period Range': '30', 'Outcome Age in Days': '60', 'Outcome Age in Years': '0.1643835616438356', 'Outcome Cat-Kitten': 'Kitten', 'Age Group': '(-0.022, 2.2]', 'DOB Year': '2017', 'DOB Month': '2', 'Exit Month': '5', 'Outcome Year': '2017', 'Outcome Weekday': 'Thursday', 'Outcome Hour': '18', 'Coat Pattern': 'white', 'Coat': 'white'}
{'Outcome Ag

In [17]:
# New key that can be used after duplicates are handled

print("\nNew unique key that can be used after duplicates are removed from data set with")

# set() converts an iterable into an unordered collection of distinct elements
# (see https://www.geeksforgeeks.org/python-set-method/)

keylist = set(unq)
lenlist = len(keylist)
print("  a total unique entry count of: {}\n".format(lenlist))


New unique key that can be used after duplicates are removed from data set with
  a total unique entry count of: 29414
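
The notebook does not show the duplicate removal itself. One way it might be done is sketched below, assuming that rows sharing the same animal id and outcome timestamp are true duplicates (as the visual check above suggests) and that keeping the first occurrence is acceptable. The function name `drop_duplicates` and the sample rows are hypothetical.

```python
def drop_duplicates(rows, key_columns):
    """Keep the first row for each combination of values in `key_columns`."""
    seen = set()
    unique_rows = []
    for row in rows:
        # A tuple of the key column values serves as the composite key.
        key = tuple(row[column] for column in key_columns)
        if key not in seen:
            seen.add(key)
            unique_rows.append(row)
    return unique_rows


# Hypothetical sample rows for illustration only.
sample_rows = [
    {'Animal ID': 'A1', 'Outcome Age Timestamp': '2014-07-03 09:00:00'},
    {'Animal ID': 'A1', 'Outcome Age Timestamp': '2014-07-03 09:00:00'},
    {'Animal ID': 'A1', 'Outcome Age Timestamp': '2015-01-23 12:34:00'},
    {'Animal ID': 'A2', 'Outcome Age Timestamp': '2014-07-03 09:00:00'},
]

deduplicated = drop_duplicates(sample_rows, ['Animal ID', 'Outcome Age Timestamp'])
print(len(deduplicated))
```

Unlike building a set of key strings, this keeps the full rows and their original order, so the result can be used directly for further analysis.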



In [18]:
# Display new valid unique key
pprint.pprint(keylist)

{'A191351-2015-11-17 13:29:00',
 'A197810-2014-12-22 15:23:00',
 'A214991-2013-12-14 13:28:00',
 'A234161-2013-12-27 17:02:00',
 'A243903-2014-01-05 11:45:00',
 'A251268-2018-01-29 16:07:00',
 'A258441-2014-11-28 14:27:00',
 'A260631-2014-10-20 13:50:00',
 'A261770-2014-11-07 15:51:00',
 'A275975-2013-10-12 11:27:00',
 'A276644-2017-01-02 11:03:00',
 'A279045-2015-02-13 11:15:00',
 'A282897-2013-12-28 17:05:00',
 'A282897-2015-07-11 16:49:00',
 'A287217-2014-10-31 13:06:00',
 'A290917-2014-07-08 14:46:00',
 'A292482-2014-07-07 11:06:00',
 'A293138-2017-04-26 12:54:00',
 'A295822-2015-05-09 18:47:00',
 'A301824-2016-10-10 19:09:00',
 'A304036-2015-06-14 16:09:00',
 'A304036-2016-04-19 13:52:00',
 'A307592-2014-06-14 15:15:00',
 'A310938-2015-04-19 12:06:00',
 'A312005-2014-10-18 16:32:00',
 'A318574-2015-07-31 17:02:00',
 'A318574-2016-05-23 12:54:00',
 'A321341-2014-03-30 16:38:00',
 'A323090-2016-05-22 11:22:00',
 'A326898-2017-05-28 13:34:00',
 'A341599-2014-01-11 11:05:00',
 'A34659

 'A666214-2013-10-30 14:34:00',
 'A666227-2013-11-05 18:12:00',
 'A666242-2013-11-04 18:30:00',
 'A666248-2013-11-22 16:04:00',
 'A666249-2013-11-22 16:04:00',
 'A666250-2013-11-22 16:04:00',
 'A666255-2014-05-13 12:37:00',
 'A666286-2013-12-09 19:51:00',
 'A666303-2013-12-21 15:42:00',
 'A666308-2013-11-04 14:05:00',
 'A666309-2013-12-04 12:45:00',
 'A666310-2013-12-04 12:45:00',
 'A666311-2013-11-09 17:04:00',
 'A666314-2013-10-30 11:40:00',
 'A666319-2013-11-09 18:44:00',
 'A666327-2013-11-09 14:25:00',
 'A666328-2013-11-10 18:33:00',
 'A666329-2013-11-09 14:43:00',
 'A666330-2013-11-10 15:18:00',
 'A666332-2013-11-09 12:12:00',
 'A666343-2013-12-19 14:12:00',
 'A666353-2013-10-30 17:16:00',
 'A666355-2013-11-02 14:00:00',
 'A666357-2013-11-24 15:00:00',
 'A666358-2013-11-10 16:12:00',
 'A666360-2013-11-02 14:00:00',
 'A666361-2013-11-02 14:00:00',
 'A666377-2013-10-31 12:50:00',
 'A666381-2013-12-16 12:52:00',
 'A666382-2013-11-02 14:00:00',
 'A666385-2013-11-12 15:10:00',
 'A66638

 'A668389-2013-12-09 12:53:00',
 'A668393-2013-12-07 17:08:00',
 'A668396-2013-12-03 18:11:00',
 'A668402-2013-12-05 15:50:00',
 'A668405-2013-12-04 14:32:00',
 'A668406-2013-12-05 15:53:00',
 'A668408-2013-12-16 14:38:00',
 'A668413-2013-12-03 15:34:00',
 'A668443-2013-12-22 11:21:00',
 'A668471-2013-12-06 17:26:00',
 'A668471-2015-03-03 16:43:00',
 'A668472-2013-12-06 17:26:00',
 'A668474-2013-12-04 14:38:00',
 'A668477-2013-12-04 14:38:00',
 'A668478-2013-12-04 14:38:00',
 'A668479-2013-12-04 14:39:00',
 'A668480-2013-12-04 14:39:00',
 'A668481-2013-12-04 14:39:00',
 'A668488-2013-12-06 17:25:00',
 'A668489-2013-12-06 17:25:00',
 'A668490-2013-12-06 17:25:00',
 'A668491-2013-12-06 17:25:00',
 'A668494-2014-01-23 11:03:00',
 'A668500-2013-12-06 17:24:00',
 'A668503-2013-12-06 17:28:00',
 'A668505-2014-01-01 15:34:00',
 'A668505-2014-01-26 16:25:00',
 'A668507-2013-12-06 17:25:00',
 'A668508-2013-12-06 17:26:00',
 'A668539-2013-12-20 12:46:00',
 'A668540-2013-12-17 12:13:00',
 'A66854

 'A676131-2014-04-05 12:21:00',
 'A676136-2014-04-05 11:59:00',
 'A676139-2017-03-08 07:51:00',
 'A676141-2014-05-21 14:52:00',
 'A676147-2014-04-10 11:40:00',
 'A676148-2014-04-06 15:57:00',
 'A676149-2014-04-10 14:10:00',
 'A676150-2014-04-10 14:10:00',
 'A676151-2014-04-29 17:16:00',
 'A676152-2014-04-29 17:15:00',
 'A676155-2014-04-14 16:45:00',
 'A676156-2014-04-14 16:45:00',
 'A676157-2014-04-13 18:18:00',
 'A676166-2014-04-18 14:18:00',
 'A676167-2014-04-20 14:51:00',
 'A676170-2014-04-09 18:49:00',
 'A676170-2014-04-14 16:23:00',
 'A676177-2014-04-10 12:13:00',
 'A676193-2014-04-06 11:55:00',
 'A676194-2014-04-06 11:55:00',
 'A676195-2014-05-03 14:00:00',
 'A676211-2014-04-06 15:23:00',
 'A676212-2014-04-06 15:23:00',
 'A676213-2014-04-11 16:00:00',
 'A676219-2014-04-07 11:01:00',
 'A676223-2014-04-11 12:44:00',
 'A676224-2014-04-29 16:34:00',
 'A676225-2014-05-05 12:32:00',
 'A676226-2014-05-05 12:32:00',
 'A676227-2014-04-07 11:00:00',
 'A676228-2014-04-08 13:26:00',
 'A67623

 'A679854-2014-06-01 16:26:00',
 'A679857-2014-05-28 19:21:00',
 'A679861-2014-06-05 17:45:00',
 'A679862-2014-06-19 14:34:00',
 'A679864-2014-06-14 13:00:00',
 'A679871-2014-06-05 18:57:00',
 'A679895-2014-05-30 08:00:00',
 'A679896-2014-06-07 14:16:00',
 'A679907-2014-05-28 17:31:00',
 'A679908-2014-07-13 18:58:00',
 'A679909-2014-07-13 18:57:00',
 'A679910-2014-07-13 17:43:00',
 'A679910-2017-08-15 09:29:00',
 'A679911-2014-07-13 17:42:00',
 'A679915-2014-06-01 17:48:00',
 'A679920-2014-05-28 17:09:00',
 'A679921-2014-07-12 20:12:00',
 'A679921-2014-07-13 19:05:00',
 'A679921-2014-08-28 17:19:00',
 'A679921-2016-08-05 14:15:00',
 'A679923-2014-07-14 18:09:00',
 'A679924-2014-07-01 14:30:00',
 'A679925-2014-07-01 14:30:00',
 'A679926-2014-07-21 19:13:00',
 'A679927-2014-07-01 14:32:00',
 'A679928-2014-07-17 17:22:00',
 'A679928-2016-05-16 15:49:00',
 'A679929-2014-06-02 08:00:00',
 'A679929-2014-08-12 13:41:00',
 'A679930-2014-06-01 13:19:00',
 'A679931-2014-07-12 20:48:00',
 'A67993

 'A681238-2014-06-15 12:36:00',
 'A681240-2014-06-20 17:19:00',
 'A681241-2014-06-30 18:30:00',
 'A681242-2014-06-14 12:45:00',
 'A681243-2014-06-14 12:46:00',
 'A681244-2014-06-14 12:46:00',
 'A681245-2014-06-14 12:46:00',
 'A681246-2014-07-03 14:38:00',
 'A681247-2014-07-10 19:00:00',
 'A681248-2014-06-20 13:18:00',
 'A681257-2014-06-14 12:47:00',
 'A681258-2014-06-14 12:47:00',
 'A681259-2014-06-14 12:45:00',
 'A681260-2014-07-02 16:57:00',
 'A681261-2014-06-15 19:12:00',
 'A681262-2014-06-15 19:12:00',
 'A681263-2014-06-14 12:48:00',
 'A681264-2014-06-14 12:48:00',
 'A681265-2014-06-14 12:49:00',
 'A681267-2014-08-05 13:03:00',
 'A681268-2015-03-14 17:58:00',
 'A681271-2014-06-14 16:16:00',
 'A681274-2014-06-14 16:16:00',
 'A681275-2014-06-14 16:16:00',
 'A681281-2014-07-11 16:56:00',
 'A681290-2014-06-15 19:11:00',
 'A681302-2014-06-21 19:05:00',
 'A681306-2014-06-28 14:41:00',
 'A681307-2014-07-26 13:40:00',
 'A681307-2014-08-03 08:34:00',
 'A681325-2014-06-14 19:22:00',
 'A68132

 'A685368-2014-08-06 15:42:00',
 'A685372-2014-08-06 15:42:00',
 'A685373-2014-08-06 19:23:00',
 'A685374-2014-08-06 19:23:00',
 'A685375-2014-08-06 19:24:00',
 'A685384-2014-08-06 15:38:00',
 'A685390-2014-08-07 15:59:00',
 'A685414-2014-08-07 10:47:00',
 'A685417-2014-08-09 11:02:00',
 'A685421-2014-08-07 12:49:00',
 'A685423-2014-08-07 15:38:00',
 'A685424-2014-08-07 13:24:00',
 'A685425-2014-08-07 13:25:00',
 'A685426-2014-08-07 13:25:00',
 'A685427-2014-08-07 13:25:00',
 'A685435-2014-08-08 09:00:00',
 'A685440-2014-08-07 16:13:00',
 'A685441-2014-08-07 16:13:00',
 'A685449-2014-08-07 17:27:00',
 'A685453-2014-08-07 18:00:00',
 'A685454-2014-08-07 17:38:00',
 'A685455-2014-08-08 09:00:00',
 'A685460-2014-08-19 18:25:00',
 'A685468-2014-11-19 13:45:00',
 'A685476-2014-08-21 14:30:00',
 'A685477-2014-08-23 13:25:00',
 'A685478-2014-08-25 15:08:00',
 'A685479-2014-08-26 15:09:00',
 'A685480-2014-08-16 13:51:00',
 'A685481-2014-08-17 17:11:00',
 'A685482-2014-08-18 12:24:00',
 'A68548

 'A688654-2014-11-04 11:37:00',
 'A688655-2014-11-04 11:32:00',
 'A688658-2014-09-23 17:39:00',
 'A688665-2014-10-19 15:26:00',
 'A688674-2014-09-23 09:00:00',
 'A688675-2014-09-23 09:00:00',
 'A688676-2014-09-22 13:06:00',
 'A688677-2014-10-06 16:30:00',
 'A688677-2014-11-19 15:07:00',
 'A688683-2014-09-23 09:00:00',
 'A688684-2014-09-23 09:00:00',
 'A688685-2014-09-23 09:00:00',
 'A688686-2014-09-23 09:00:00',
 'A688687-2014-09-23 09:00:00',
 'A688699-2014-10-03 18:22:00',
 'A688700-2014-10-02 13:18:00',
 'A688701-2014-09-28 12:32:00',
 'A688702-2014-09-23 14:37:00',
 'A688703-2014-10-16 17:07:00',
 'A688704-2014-09-29 16:23:00',
 'A688705-2014-10-04 18:02:00',
 'A688706-2014-09-27 17:58:00',
 'A688707-2014-09-22 17:42:00',
 'A688713-2014-09-29 17:03:00',
 'A688721-2014-10-01 17:26:00',
 'A688736-2015-01-11 14:23:00',
 'A688737-2014-10-05 15:09:00',
 'A688740-2014-10-05 15:10:00',
 'A688741-2014-10-05 15:10:00',
 'A688746-2014-09-23 13:23:00',
 'A688747-2014-09-23 13:23:00',
 'A68874

 'A690822-2016-09-27 15:14:00',
 'A690832-2014-10-30 09:00:00',
 'A690833-2014-10-28 15:58:00',
 'A690835-2015-01-02 14:05:00',
 'A690836-2014-12-06 16:55:00',
 'A690837-2014-12-09 15:20:00',
 'A690838-2014-12-09 15:20:00',
 'A690839-2014-12-06 18:28:00',
 'A690840-2014-10-27 11:49:00',
 'A690843-2014-10-27 10:24:00',
 'A690844-2014-10-27 16:35:00',
 'A690862-2014-10-28 13:32:00',
 'A690869-2014-12-27 18:29:00',
 'A690873-2014-11-07 13:34:00',
 'A690876-2014-11-17 12:37:00',
 'A690892-2014-10-28 09:00:00',
 'A690894-2014-10-27 18:02:00',
 'A690898-2014-12-02 14:12:00',
 'A690899-2014-11-14 14:19:00',
 'A690900-2014-11-22 13:23:00',
 'A690901-2014-11-20 19:23:00',
 'A690902-2014-11-18 16:05:00',
 'A690903-2014-11-18 16:04:00',
 'A690904-2014-10-31 17:32:00',
 'A690905-2014-12-17 12:07:00',
 'A690906-2014-12-19 09:12:00',
 'A690909-2014-10-30 12:21:00',
 'A690919-2014-10-28 12:37:00',
 'A690920-2014-10-28 12:37:00',
 'A690921-2014-10-28 12:38:00',
 'A690927-2014-10-28 11:30:00',
 'A69093

 'A699731-2015-07-28 09:49:00',
 'A699731-2015-10-10 14:16:00',
 'A699731-2016-01-23 16:19:00',
 'A699734-2015-04-08 13:13:00',
 'A699738-2015-04-27 18:31:00',
 'A699739-2015-04-02 09:00:00',
 'A699741-2015-04-02 09:00:00',
 'A699746-2015-04-11 14:55:00',
 'A699749-2015-04-02 11:25:00',
 'A699751-2015-04-11 11:20:00',
 'A699757-2015-04-03 09:00:00',
 'A699784-2015-05-22 17:45:00',
 'A699799-2015-04-18 08:52:00',
 'A699800-2015-04-18 12:44:00',
 'A699801-2015-04-18 08:52:00',
 'A699802-2015-04-19 15:50:00',
 'A699803-2015-04-22 07:33:00',
 'A699806-2015-04-03 09:00:00',
 'A699807-2015-04-03 09:00:00',
 'A699811-2015-04-03 09:00:00',
 'A699814-2015-04-03 09:00:00',
 'A699822-2015-04-07 15:21:00',
 'A699828-2015-04-03 11:59:00',
 'A699838-2015-04-07 15:14:00',
 'A699839-2015-04-05 13:35:00',
 'A699872-2015-04-03 16:46:00',
 'A699873-2015-04-03 16:46:00',
 'A699874-2015-04-03 16:46:00',
 'A699884-2015-04-03 16:16:00',
 'A699887-2015-04-04 09:00:00',
 'A699897-2015-04-15 18:15:00',
 'A69989

 'A702764-2015-05-17 16:02:00',
 'A702765-2015-05-17 16:02:00',
 'A702766-2015-05-17 16:02:00',
 'A702767-2015-05-17 18:48:00',
 'A702768-2015-05-17 18:48:00',
 'A702769-2015-05-17 18:48:00',
 'A702770-2015-05-26 12:20:00',
 'A702771-2015-05-19 13:37:00',
 'A702772-2015-05-26 12:20:00',
 'A702773-2015-05-26 12:20:00',
 'A702775-2015-05-17 16:21:00',
 'A702779-2015-05-17 17:53:00',
 'A702785-2015-06-30 14:18:00',
 'A702794-2015-07-19 15:26:00',
 'A702796-2015-09-12 17:49:00',
 'A702797-2015-09-12 17:50:00',
 'A702798-2015-09-10 13:29:00',
 'A702799-2015-09-10 13:29:00',
 'A702800-2015-09-10 13:22:00',
 'A702800-2016-02-13 16:40:00',
 'A702801-2015-07-18 19:59:00',
 'A702802-2015-07-18 13:42:00',
 'A702804-2015-09-19 17:33:00',
 'A702814-2015-05-18 13:38:00',
 'A702817-2015-05-18 13:07:00',
 'A702818-2015-05-18 13:07:00',
 'A702819-2015-05-18 13:07:00',
 'A702820-2015-05-18 13:06:00',
 'A702822-2015-05-31 13:21:00',
 'A702822-2016-04-08 14:27:00',
 'A702824-2015-05-22 18:56:00',
 'A70282

 'A703932-2015-06-10 12:47:00',
 'A703939-2015-07-18 16:42:00',
 'A703946-2015-07-19 19:40:00',
 'A703947-2015-07-19 19:21:00',
 'A703948-2015-06-07 19:56:00',
 'A703949-2015-06-30 15:32:00',
 'A703952-2015-05-30 18:56:00',
 'A703958-2015-05-30 18:42:00',
 'A703959-2015-05-30 18:42:00',
 'A703960-2015-05-30 18:42:00',
 'A703961-2015-05-30 18:42:00',
 'A703962-2015-05-30 18:42:00',
 'A703963-2015-05-30 18:42:00',
 'A703968-2015-06-08 19:18:00',
 'A703979-2015-05-31 13:46:00',
 'A703980-2015-05-31 13:46:00',
 'A703987-2015-06-07 13:30:00',
 'A703988-2015-06-07 13:30:00',
 'A703989-2015-06-01 09:00:00',
 'A703990-2015-07-19 17:45:00',
 'A703993-2015-05-31 14:49:00',
 'A703994-2015-06-26 13:28:00',
 'A703997-2015-05-31 14:44:00',
 'A703998-2015-09-19 11:08:00',
 'A704005-2015-07-18 15:09:00',
 'A704006-2015-06-09 13:40:00',
 'A704016-2015-05-31 17:11:00',
 'A704017-2015-05-31 17:12:00',
 'A704018-2015-05-31 17:12:00',
 'A704019-2015-05-31 17:12:00',
 'A704020-2015-05-31 17:12:00',
 'A70402

 'A708976-2015-08-19 14:27:00',
 'A708977-2015-08-19 14:27:00',
 'A708978-2015-08-17 16:43:00',
 'A708979-2015-08-19 18:13:00',
 'A708979-2016-02-24 13:51:00',
 'A708980-2015-08-24 17:00:00',
 'A708981-2015-08-24 18:13:00',
 'A708982-2015-08-31 17:32:00',
 'A708983-2015-09-13 00:00:00',
 'A708984-2015-08-17 18:08:00',
 'A708985-2015-08-09 11:46:00',
 'A708986-2015-08-09 11:46:00',
 'A708989-2015-08-04 09:00:00',
 'A708990-2015-10-22 10:58:00',
 'A708996-2015-08-05 09:00:00',
 'A708996-2015-10-03 17:57:00',
 'A708998-2015-08-16 17:46:00',
 'A709007-2015-08-10 11:19:00',
 'A709035-2015-08-16 12:05:00',
 'A709036-2015-08-12 16:38:00',
 'A709042-2015-08-05 09:00:00',
 'A709045-2015-08-04 17:55:00',
 'A709046-2015-08-04 17:55:00',
 'A709047-2015-08-04 17:55:00',
 'A709056-2015-08-04 14:29:00',
 'A709058-2015-08-30 15:47:00',
 'A709064-2015-11-25 16:51:00',
 'A709065-2015-11-25 16:51:00',
 'A709066-2015-08-24 19:24:00',
 'A709069-2015-08-16 18:15:00',
 'A709070-2015-08-24 19:41:00',
 'A70907

 'A711036-2015-12-31 15:12:00',
 'A711037-2015-09-01 12:34:00',
 'A711038-2015-09-18 15:17:00',
 'A711039-2015-09-01 12:35:00',
 'A711040-2015-09-13 18:29:00',
 'A711041-2015-11-23 16:06:00',
 'A711042-2015-09-18 15:18:00',
 'A711043-2015-09-08 12:01:00',
 'A711044-2015-09-02 09:00:00',
 'A711045-2015-09-13 12:27:00',
 'A711046-2015-09-12 09:00:00',
 'A711048-2015-09-19 12:15:00',
 'A711049-2015-09-13 12:28:00',
 'A711050-2015-09-01 12:49:00',
 'A711051-2015-09-01 12:48:00',
 'A711057-2015-09-01 14:05:00',
 'A711058-2015-09-01 14:05:00',
 'A711059-2015-09-01 14:05:00',
 'A711061-2015-09-07 13:16:00',
 'A711062-2015-09-06 18:01:00',
 'A711064-2015-12-26 13:11:00',
 'A711070-2015-09-07 17:32:00',
 'A711071-2015-09-05 17:27:00',
 'A711072-2015-09-08 18:37:00',
 'A711072-2015-09-25 19:16:00',
 'A711073-2015-09-08 18:36:00',
 'A711073-2015-09-12 18:39:00',
 'A711074-2015-09-03 14:24:00',
 'A711075-2015-09-03 14:24:00',
 'A711076-2015-09-03 14:24:00',
 'A711077-2015-09-03 14:24:00',
 'A71108

 'A712481-2015-09-23 17:20:00',
 'A712493-2015-12-05 15:45:00',
 'A712495-2015-09-28 11:42:00',
 'A712501-2015-09-25 09:00:00',
 'A712502-2015-09-25 09:00:00',
 'A712508-2015-09-25 09:00:00',
 'A712508-2015-09-29 12:33:00',
 'A712512-2015-09-28 15:46:00',
 'A712513-2015-09-28 15:46:00',
 'A712516-2016-01-09 17:42:00',
 'A712517-2015-11-06 17:28:00',
 'A712517-2016-03-17 15:49:00',
 'A712517-2016-04-13 00:00:00',
 'A712518-2015-11-10 11:36:00',
 'A712518-2016-03-17 15:49:00',
 'A712518-2016-04-18 00:00:00',
 'A712523-2015-09-24 18:24:00',
 'A712524-2015-09-24 18:24:00',
 'A712525-2015-09-24 18:24:00',
 'A712526-2015-09-24 18:23:00',
 'A712527-2015-10-01 19:13:00',
 'A712528-2015-10-06 10:47:00',
 'A712529-2015-11-18 16:22:00',
 'A712530-2015-11-27 13:20:00',
 'A712532-2015-10-05 14:58:00',
 'A712541-2015-10-12 16:17:00',
 'A712542-2015-10-05 18:24:00',
 'A712543-2015-10-16 17:08:00',
 'A712544-2015-10-11 17:40:00',
 'A712554-2015-09-26 11:42:00',
 'A712556-2015-09-26 09:00:00',
 'A71255

 'A721300-2016-03-05 14:00:00',
 'A721303-2016-02-29 13:37:00',
 'A721304-2016-02-29 13:37:00',
 'A721305-2016-03-17 14:56:00',
 'A721307-2016-02-24 18:26:00',
 'A721308-2016-02-24 18:26:00',
 'A721309-2016-02-24 18:27:00',
 'A721310-2016-02-24 18:27:00',
 'A721313-2016-03-03 00:00:00',
 'A721314-2016-03-03 00:00:00',
 'A721315-2016-02-29 15:58:00',
 'A721316-2016-03-09 18:24:00',
 'A721319-2016-03-04 15:02:00',
 'A721320-2016-03-04 15:01:00',
 'A721321-2016-02-27 00:00:00',
 'A721322-2016-03-20 12:38:00',
 'A721349-2016-06-29 09:48:00',
 'A721355-2016-02-29 19:05:00',
 'A721357-2016-02-26 17:58:00',
 'A721361-2016-03-05 15:13:00',
 'A721361-2016-03-27 15:33:00',
 'A721361-2016-09-26 14:12:00',
 'A721365-2016-02-25 19:07:00',
 'A721369-2016-03-06 19:08:00',
 'A721388-2016-02-26 19:08:00',
 'A721432-2016-03-13 19:04:00',
 'A721436-2016-03-06 19:09:00',
 'A721448-2016-03-02 17:31:00',
 'A721455-2016-03-22 16:48:00',
 'A721461-2016-03-09 18:44:00',
 'A721466-2016-04-24 14:33:00',
 'A72146

 'A724935-2016-04-23 12:29:00',
 'A724936-2016-04-23 12:30:00',
 'A724937-2016-04-23 12:30:00',
 'A724938-2016-04-23 12:30:00',
 'A724942-2016-05-04 11:23:00',
 'A724944-2016-04-23 16:48:00',
 'A724948-2016-04-24 17:40:00',
 'A724949-2016-04-25 18:02:00',
 'A724956-2016-04-28 13:32:00',
 'A724959-2016-04-23 17:07:00',
 'A724960-2016-04-23 17:09:00',
 'A724970-2016-04-28 13:32:00',
 'A724971-2016-04-28 13:32:00',
 'A724978-2016-04-23 19:08:00',
 'A724979-2016-04-23 19:08:00',
 'A724980-2016-04-23 19:09:00',
 'A724981-2016-04-23 19:09:00',
 'A724985-2016-07-02 19:33:00',
 'A724986-2016-07-09 17:05:00',
 'A724987-2016-06-24 18:23:00',
 'A724988-2016-06-28 16:21:00',
 'A724989-2016-07-09 16:25:00',
 'A724994-2016-04-24 11:41:00',
 'A724995-2016-04-24 11:41:00',
 'A725000-2016-04-29 16:13:00',
 'A725002-2016-04-26 13:55:00',
 'A725003-2016-04-29 14:37:00',
 'A725003-2017-08-12 17:45:00',
 'A725023-2016-04-24 16:32:00',
 'A725026-2016-07-21 12:54:00',
 'A725027-2016-06-30 18:50:00',
 'A72502

 'A727091-2016-07-07 12:15:00',
 'A727092-2016-05-21 16:51:00',
 'A727093-2016-05-21 16:51:00',
 'A727094-2016-05-21 16:51:00',
 'A727099-2016-07-01 14:34:00',
 'A727100-2016-06-22 11:11:00',
 'A727101-2016-07-09 13:28:00',
 'A727102-2016-07-01 14:42:00',
 'A727105-2016-06-07 13:52:00',
 'A727106-2016-05-22 18:18:00',
 'A727109-2016-06-26 14:25:00',
 'A727110-2016-06-28 17:04:00',
 'A727111-2016-06-21 14:52:00',
 'A727112-2016-06-28 17:03:00',
 'A727116-2016-06-25 13:14:00',
 'A727117-2016-06-25 13:15:00',
 'A727117-2017-02-24 13:44:00',
 'A727118-2016-06-19 17:52:00',
 'A727119-2016-07-01 00:00:00',
 'A727122-2016-05-25 13:37:00',
 'A727124-2016-07-09 16:48:00',
 'A727125-2016-07-09 16:31:00',
 'A727126-2016-06-28 14:44:00',
 'A727130-2016-06-13 15:24:00',
 'A727131-2016-05-24 12:56:00',
 'A727133-2016-05-25 16:46:00',
 'A727136-2016-06-22 10:41:00',
 'A727146-2016-05-26 14:10:00',
 'A727147-2016-05-26 14:10:00',
 'A727149-2016-05-22 18:18:00',
 'A727150-2016-05-22 18:18:00',
 'A72715

 'A732820-2016-09-20 13:30:00',
 'A732821-2016-09-23 14:36:00',
 'A732822-2016-09-22 16:00:00',
 'A732823-2016-09-21 15:23:00',
 'A732824-2016-09-23 14:38:00',
 'A732825-2016-09-22 16:00:00',
 'A732826-2016-09-10 16:31:00',
 'A732826-2016-09-24 16:01:00',
 'A732827-2016-09-14 15:26:00',
 'A732831-2016-08-13 16:14:00',
 'A732837-2016-09-20 13:30:00',
 'A732838-2016-08-13 16:16:00',
 'A732844-2016-08-17 17:30:00',
 'A732845-2016-08-19 14:27:00',
 'A732846-2016-08-27 16:35:00',
 'A732846-2016-08-28 16:06:00',
 'A732849-2016-08-24 15:09:00',
 'A732850-2016-08-28 17:11:00',
 'A732850-2017-12-20 11:29:00',
 'A732851-2016-08-17 17:20:00',
 'A732852-2016-08-21 13:27:00',
 'A732853-2016-08-21 18:55:00',
 'A732854-2016-08-21 18:55:00',
 'A732855-2016-08-17 17:18:00',
 'A732857-2016-09-10 16:54:00',
 'A732860-2016-08-14 18:58:00',
 'A732865-2016-10-02 14:59:00',
 'A732866-2016-08-19 13:30:00',
 'A732866-2016-12-29 13:15:00',
 'A732876-2016-08-15 09:00:00',
 'A732882-2016-08-14 13:43:00',
 'A73288

 'A734137-2016-09-01 19:11:00',
 'A734138-2016-09-01 19:12:00',
 'A734139-2016-09-01 19:12:00',
 'A734140-2016-09-01 19:12:00',
 'A734141-2016-09-01 19:12:00',
 'A734147-2016-09-13 19:25:00',
 'A734149-2016-09-14 00:00:00',
 'A734150-2016-09-14 00:00:00',
 'A734151-2016-09-14 00:00:00',
 'A734152-2016-09-14 00:00:00',
 'A734153-2016-09-14 00:00:00',
 'A734156-2016-09-17 16:06:00',
 'A734160-2016-09-03 09:00:00',
 'A734161-2016-09-30 11:03:00',
 'A734162-2016-10-11 09:16:00',
 'A734163-2016-09-28 08:37:00',
 'A734189-2016-09-10 14:54:00',
 'A734195-2016-10-12 12:30:00',
 'A734212-2016-09-20 13:33:00',
 'A734231-2016-09-03 14:57:00',
 'A734234-2016-09-12 09:12:00',
 'A734246-2016-10-08 17:55:00',
 'A734261-2017-10-26 12:03:00',
 'A734261-2017-11-14 11:47:00',
 'A734263-2016-09-04 14:31:00',
 'A734263-2016-09-05 14:30:00',
 'A734267-2016-09-03 19:22:00',
 'A734277-2016-09-04 09:00:00',
 'A734278-2016-09-04 09:00:00',
 'A734283-2016-09-04 16:04:00',
 'A734284-2016-10-21 08:08:00',
 'A73428

 'A737097-2016-10-23 09:00:00',
 'A737098-2016-10-26 17:33:00',
 'A737104-2016-11-06 18:50:00',
 'A737105-2016-10-22 15:08:00',
 'A737107-2016-10-22 15:08:00',
 'A737108-2016-10-22 15:07:00',
 'A737109-2016-10-22 16:02:00',
 'A737111-2016-10-24 16:18:00',
 'A737112-2016-10-23 09:00:00',
 'A737113-2016-10-22 18:24:00',
 'A737114-2016-10-22 18:24:00',
 'A737115-2016-10-22 18:24:00',
 'A737118-2016-10-22 15:07:00',
 'A737142-2016-10-23 09:00:00',
 'A737143-2016-10-23 09:00:00',
 'A737144-2016-10-23 09:00:00',
 'A737145-2016-11-05 00:00:00',
 'A737146-2016-10-31 17:01:00',
 'A737147-2016-10-23 09:00:00',
 'A737148-2016-10-23 09:00:00',
 'A737149-2016-10-23 09:00:00',
 'A737150-2016-10-23 09:00:00',
 'A737151-2016-10-22 18:48:00',
 'A737152-2016-10-22 18:48:00',
 'A737158-2016-11-29 11:57:00',
 'A737161-2016-10-23 12:10:00',
 'A737162-2016-10-24 12:49:00',
 'A737165-2016-10-24 16:47:00',
 'A737165-2016-11-25 13:48:00',
 'A737165-2016-11-27 16:23:00',
 'A737168-2016-11-08 15:56:00',
 'A73717

 'A746763-2017-04-09 09:00:00',
 'A746777-2017-04-09 09:42:00',
 'A746783-2017-04-09 16:40:00',
 'A746784-2017-04-09 16:40:00',
 'A746785-2017-04-09 16:41:00',
 'A746786-2017-04-09 16:41:00',
 'A746787-2017-04-09 16:41:00',
 'A746788-2017-04-09 16:42:00',
 'A746801-2017-04-09 15:45:00',
 'A746802-2017-04-09 15:45:00',
 'A746803-2017-04-09 15:45:00',
 'A746807-2017-04-09 13:15:00',
 'A746811-2017-05-07 13:53:00',
 'A746820-2017-04-10 09:00:00',
 'A746821-2017-05-30 09:57:00',
 'A746822-2017-04-10 09:00:00',
 'A746829-2017-05-01 09:55:00',
 'A746830-2017-04-10 09:00:00',
 'A746837-2017-04-24 16:29:00',
 'A746850-2017-04-10 12:47:00',
 'A746856-2017-05-09 12:23:00',
 'A746857-2017-08-15 17:19:00',
 'A746859-2017-04-24 16:34:00',
 'A746860-2017-04-24 16:34:00',
 'A746861-2017-04-24 16:35:00',
 'A746864-2017-04-10 12:48:00',
 'A746867-2017-04-28 13:34:00',
 'A746868-2017-04-27 18:24:00',
 'A746871-2017-04-11 18:47:00',
 'A746872-2017-04-10 16:25:00',
 'A746874-2017-04-11 09:00:00',
 'A74687

 'A748602-2017-05-06 17:07:00',
 'A748603-2017-05-07 13:55:00',
 'A748604-2017-05-07 19:33:00',
 'A748608-2017-06-17 13:52:00',
 'A748609-2017-05-07 13:08:00',
 'A748610-2017-05-04 14:31:00',
 'A748628-2017-05-04 17:14:00',
 'A748629-2017-07-30 18:50:00',
 'A748635-2017-05-10 00:00:00',
 'A748636-2017-05-10 00:00:00',
 'A748637-2017-05-06 13:58:00',
 'A748662-2017-05-24 16:38:00',
 'A748663-2017-05-11 16:38:00',
 'A748664-2017-05-13 15:24:00',
 'A748667-2017-06-03 13:59:00',
 'A748668-2017-06-03 18:46:00',
 'A748669-2017-06-14 17:11:00',
 'A748670-2017-06-06 06:59:00',
 'A748671-2017-06-03 21:05:00',
 'A748675-2017-05-10 00:00:00',
 'A748678-2017-06-17 14:56:00',
 'A748678-2017-06-27 15:00:00',
 'A748679-2017-06-20 08:44:00',
 'A748680-2017-06-17 14:54:00',
 'A748681-2017-06-21 11:20:00',
 'A748681-2017-07-01 17:49:00',
 'A748683-2017-05-05 15:03:00',
 'A748684-2017-05-05 15:03:00',
 'A748685-2017-05-05 15:04:00',
 'A748686-2017-05-05 15:04:00',
 'A748687-2017-05-05 16:01:00',
 'A74868

 'A751140-2017-08-15 17:47:00',
 'A751141-2017-07-29 13:33:00',
 'A751142-2017-08-15 17:47:00',
 'A751145-2017-06-17 17:14:00',
 'A751148-2017-07-03 12:51:00',
 'A751149-2017-07-01 17:18:00',
 'A751150-2017-07-01 18:56:00',
 'A751151-2017-07-01 18:57:00',
 'A751167-2017-07-01 18:19:00',
 'A751170-2017-07-27 10:17:00',
 'A751170-2017-08-16 10:16:00',
 'A751172-2017-06-14 13:42:00',
 'A751174-2017-06-04 18:44:00',
 'A751178-2017-07-01 14:07:00',
 'A751179-2017-07-01 14:11:00',
 'A751180-2017-07-28 18:56:00',
 'A751188-2017-08-19 13:41:00',
 'A751189-2017-08-19 16:11:00',
 'A751191-2017-06-05 15:54:00',
 'A751192-2017-06-07 12:59:00',
 'A751193-2017-06-09 18:28:00',
 'A751199-2017-06-15 15:58:00',
 'A751204-2017-06-17 17:46:00',
 'A751206-2017-08-31 17:34:00',
 'A751207-2017-06-05 12:34:00',
 'A751209-2017-06-05 15:00:00',
 'A751212-2017-08-12 15:15:00',
 'A751213-2017-06-06 14:33:00',
 'A751231-2017-06-09 18:28:00',
 'A751232-2017-06-09 18:29:00',
 'A751233-2017-06-16 19:41:00',
 'A75123

 'A757583-2017-09-13 09:23:00',
 'A757587-2017-11-03 16:48:00',
 'A757588-2017-11-02 18:11:00',
 'A757589-2017-11-02 18:11:00',
 'A757592-2017-09-04 13:24:00',
 'A757594-2017-09-09 17:12:00',
 'A757599-2017-09-04 13:35:00',
 'A757600-2017-10-01 17:37:00',
 'A757602-2017-09-29 16:24:00',
 'A757603-2017-09-09 12:02:00',
 'A757604-2017-09-09 15:07:00',
 'A757606-2017-09-09 15:07:00',
 'A757613-2017-09-04 08:50:00',
 'A757614-2017-09-04 08:50:00',
 'A757615-2017-09-04 08:51:00',
 'A757616-2017-09-04 00:00:00',
 'A757618-2017-09-04 08:51:00',
 'A757666-2017-09-08 18:34:00',
 'A757679-2018-01-27 15:20:00',
 'A757684-2017-09-08 10:24:00',
 'A757687-2017-09-07 17:00:00',
 'A757697-2017-09-12 18:45:00',
 'A757711-2017-09-10 12:07:00',
 'A757714-2017-09-12 15:10:00',
 'A757716-2017-11-01 17:55:00',
 'A757719-2017-09-05 18:39:00',
 'A757720-2017-09-05 18:39:00',
 'A757721-2017-09-05 18:39:00',
 'A757722-2017-09-05 18:39:00',
 'A757723-2017-09-05 18:40:00',
 'A757729-2017-09-28 00:00:00',
 'A75773

 'A759254-2017-11-03 16:54:00',
 'A759264-2017-10-15 15:12:00',
 'A759269-2017-09-29 00:00:00',
 'A759270-2017-09-29 00:00:00',
 'A759292-2017-11-07 08:13:00',
 'A759296-2017-11-05 17:05:00',
 'A759297-2017-11-16 12:33:00',
 'A759298-2018-01-07 17:56:00',
 'A759299-2017-11-25 15:59:00',
 'A759300-2017-10-19 18:33:00',
 'A759301-2017-10-23 13:40:00',
 'A759302-2017-10-23 16:49:00',
 'A759303-2017-10-23 16:50:00',
 'A759304-2017-10-23 16:51:00',
 'A759305-2017-10-23 16:51:00',
 'A759306-2017-10-03 19:02:00',
 'A759315-2017-10-05 17:43:00',
 'A759316-2017-11-29 08:47:00',
 'A759329-2017-11-02 14:16:00',
 'A759330-2017-11-04 19:03:00',
 'A759331-2017-12-12 14:05:00',
 'A759332-2017-12-12 14:04:00',
 'A759336-2017-10-19 21:40:00',
 'A759336-2018-01-15 00:00:00',
 'A759343-2017-10-05 17:44:00',
 'A759344-2017-10-18 13:08:00',
 'A759345-2017-10-18 13:26:00',
 'A759346-2017-10-21 10:54:00',
 'A759348-2017-10-21 10:55:00',
 'A759349-2017-10-22 16:26:00',
 'A759350-2017-10-20 19:02:00',
 'A75935

 'A763472-2017-12-16 13:42:00',
 'A763475-2017-12-16 13:43:00',
 'A763496-2017-12-18 18:53:00',
 'A763539-2018-01-09 11:31:00',
 'A763544-2017-12-10 15:12:00',
 'A763545-2017-12-15 00:00:00',
 'A763546-2017-12-15 00:00:00',
 'A763556-2017-12-10 19:58:00',
 'A763560-2017-12-17 10:20:00',
 'A763574-2017-12-12 16:48:00',
 'A763576-2017-12-16 13:41:00',
 'A763578-2017-12-13 12:37:00',
 'A763579-2017-12-15 12:11:00',
 'A763581-2017-12-15 19:24:00',
 'A763582-2017-12-23 17:39:00',
 'A763588-2017-12-16 17:24:00',
 'A763589-2017-12-15 16:26:00',
 'A763600-2017-12-12 14:03:00',
 'A763613-2017-12-17 16:35:00',
 'A763617-2017-12-21 13:20:00',
 'A763622-2017-12-15 19:24:00',
 'A763623-2017-12-15 19:25:00',
 'A763624-2017-12-16 18:47:00',
 'A763628-2018-01-03 18:56:00',
 'A763632-2017-12-27 16:05:00',
 'A763633-2017-12-26 12:30:00',
 'A763636-2017-12-22 16:23:00',
 'A763645-2017-12-22 11:36:00',
 'A763646-2018-01-10 10:53:00',
 'A763666-2018-01-21 15:26:00',
 'A763668-2018-01-20 16:59:00',
 'A76368

### Justification for removing duplicates:

Because every outcome is listed, animal ID alone is not unique, as the analysis above shows. In a few cases, the same animal was returned and adopted out again. Adding the outcome date to the key helped with 1100 rows, but 7 rows were still duplicated. These 7 duplicated entries are likely data entry errors (actual duplicate rows) and would have to be corrected before loading the data into a database table that uses this combination as its unique primary key. Given this, I will remove them by creating a new list that keeps only one copy of each.
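
The keep-the-first idea can also be sketched with a set of keys that have already been seen. This is a minimal illustration on made-up rows rather than the cell used in this notebook: the `rows` data and the `dedupe_first` helper are hypothetical, while the `Animal ID` and `Outcome Age Timestamp` keys follow the headers used in the code below.

```python
def dedupe_first(rows, key_fields):
    """Return rows with only the first occurrence of each composite key kept."""
    seen = set()
    kept = []
    for row in rows:
        # Build the composite key from the chosen fields
        key = tuple(row[field] for field in key_fields)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept

# Hypothetical sample rows: the third row repeats the first one exactly
rows = [
    {"Animal ID": "A000001", "Outcome Age Timestamp": "2015-05-18 13:07:00"},
    {"Animal ID": "A000001", "Outcome Age Timestamp": "2016-04-08 14:27:00"},
    {"Animal ID": "A000001", "Outcome Age Timestamp": "2015-05-18 13:07:00"},
]
unique_rows = dedupe_first(rows, ["Animal ID", "Outcome Age Timestamp"])
print(len(rows), len(unique_rows))
```

A set lookup avoids scanning a list for every row, which matters little at this size but keeps the intent easy to read.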



##### Create final list by only keeping 1 of each duplicated row.

In [19]:
# This runs through the old list of dictionaries and 
# only appends to the new one if there is no duplicated value
# or if it is the first entry found to have a duplicate value
dindex = 0
keptfirstlist = []
final_list = [] 
# Run through list 
while dindex < len(new_list):
    # Gather variables to check for duplicates
    for dictkey, dictval in new_list[dindex].items():
        if dictkey == "Animal ID":
            animal = dictval            
        if dictkey == "Outcome Age Timestamp":
            outcomedt = datetime.datetime.strptime(dictval, '%Y-%m-%d %H:%M:%S')
    newkey = animal + "-" + outcomedt.strftime("%Y-%m-%d %H:%M:%S")
    # Is it duplicated?
    if newkey in duplist:
        # Keep only the first row seen for a duplicated key.
        if newkey not in keptfirstlist:
            keptfirstlist.append(newkey)
            final_list.append(new_list[dindex])
    # Not a duplicated key, so keep the row.
    else:
        final_list.append(new_list[dindex])
    dindex += 1

##### Validate counts after duplicates are excluded.

In [20]:
# Compare list counts with and without duplicates
# Old list count - with duplicates
print("With duplicates:",len(new_list))
# Final list count - duplicates removed
print("Without duplicates:",len(final_list))


With duplicates: 29421
Without duplicates: 29414
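
Comparing counts shows how many rows were dropped, but not that the remaining keys are unique. One way to check that directly is to count the composite keys and look for any that occur more than once. The sketch below uses hypothetical rows, and `composite_key` is an illustrative helper, not a function from this notebook.

```python
from collections import Counter

def composite_key(row):
    """Join animal ID and outcome timestamp into one key string."""
    return row["Animal ID"] + "-" + row["Outcome Age Timestamp"]

# Hypothetical sample rows with one repeated key left in on purpose
sample_rows = [
    {"Animal ID": "A000001", "Outcome Age Timestamp": "2015-05-18 13:07:00"},
    {"Animal ID": "A000002", "Outcome Age Timestamp": "2015-05-18 13:07:00"},
    {"Animal ID": "A000001", "Outcome Age Timestamp": "2015-05-18 13:07:00"},
]
key_counts = Counter(composite_key(row) for row in sample_rows)
# Any key counted more than once is still duplicated
repeated = {key: n for key, n in key_counts.items() if n > 1}
print(repeated)
```

An empty dictionary from the same check on the final list would confirm that every remaining key is unique.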


### Conduct Fuzzy Matching 
##### (if you don’t have an obvious example to do this with in your data, create categories and use Fuzzy Matching to lump data together)
### (Data Wrangling with Python pg. 179 – 188)

##### IDA of colors 

In [21]:
color_list = []

In [22]:
# Create a color list from the existing list of dictionaries
# Uses a list comprehension
# Concept taken from https://www.geeksforgeeks.org/python-get-values-of-particular-key-in-list-of-dictionaries/
color_list = [sub['Color'] for sub in final_list]

In [23]:
# count of existing colors.
print(len(color_list))

29414


In [24]:
# Gather unique colors
unqcolors, colorcount = np.unique(color_list, return_counts=True) 
# DEBUG - print(unqcolors)
# Convert to list for easier manipulation
unique_colors = unqcolors.tolist()
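
The per-value counts are not used further here, but they are a quick way to see which spellings dominate. As a standard-library alternative to `np.unique`, the sketch below uses `collections.Counter` on a small hypothetical list; `sample_colors` is made up for illustration.

```python
from collections import Counter

# Hypothetical sample of raw color values, including an empty entry
sample_colors = ["black", "black/", "white", "black", "", "white", "black"]

color_counts = Counter(sample_colors)
# Sorted unique values, comparable to the first array returned by np.unique
unique_sorted = sorted(color_counts)
print(unique_sorted)
# The two most frequent values with their counts
print(color_counts.most_common(2))
```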

In [25]:
# Remove empty and separator-only placeholder values
unique_colors.remove("")
unique_colors.remove("/")
# Sort values
unique_colors.sort(key=len)


In [26]:
# Show all existing unique values
print(unique_colors)

['tan', 'blue', 'buff', 'fawn', 'gray', 'lynx', 'pink', 'seal', '/blue', '/gray', '/lynx', 'black', 'blue/', 'brown', 'cream', 'flame', 'gray/', 'lilac', 'sable', 'white', '/black', '/brown', '/white', 'black/', 'lynx /', 'orange', 'silver', 'white/', 'yellow', '/orange', '/silver', 'apricot', 'brown /', 'gray/tan', 'orange /', 'blue /tan', 'blue/blue', 'blue/gray', 'chocolate', 'gray/gray', 'lynx /tan', 'tan/brown', 'tan/white', 'white/red', 'white/tan', 'black/blue', 'black/gray', 'black/seal', 'blue /buff', 'blue /gray', 'blue cream', 'blue/brown', 'blue/cream', 'blue/white', 'brown/buff', 'buff/white', 'cream/blue', 'cream/seal', 'gray/black', 'gray/white', 'lynx /blue', 'lynx /gray', 'seal /buff', 'seal /gray', 'white/blue', 'white/gray', 'white/lynx', 'white/seal', '/blue cream', 'black /gray', 'black/black', 'black/brown', 'black/white', 'blue /black', 'blue /cream', 'blue /white', 'blue cream/', 'blue/orange', 'brown /blue', 'brown /gray', 'brown merle', 'brown tiger', 'brown/b

In [27]:
# Number of unique colors
print("Unique color count before fuzzy logic:",len(unique_colors))

Unique color count before fuzzy logic: 153


##### Fuzzy Example 1 - Match based on score of at least 96 to remove specific data entry concerns around color duplication.
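
For intuition about what a similarity score measures, the sketch below scores a few color pairs with the standard library's `difflib.SequenceMatcher`. It is a stand-in for illustration only, not fuzzywuzzy's scorer: as far as I know, `process.extract` defaults to a weighted scorer and cleans punctuation before comparing, so its scores will differ from these. The `similarity` helper and the sample pairs are hypothetical.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Return a 0-100 similarity score based on difflib's matching ratio."""
    return round(100 * SequenceMatcher(None, a, b).ratio())

# Hypothetical pairs: two entry variants and one genuinely different color
pairs = [
    ("black", "black/"),
    ("black/white", "black /white"),
    ("black", "brown"),
]
for left, right in pairs:
    print(left, "|", right, "->", similarity(left, right))
```

Entry variants that differ only by a stray space or slash score far higher than different color names, which is what makes a high cutoff useful for catching data entry issues.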

In [28]:
# In this example I am focused on finding data entry issues that result in duplicates.
# Each group of such values could be transformed into one value.

# colored() is used below to highlight groups; it comes from the termcolor package
# (imported here in case it was not imported in an earlier cell).
from termcolor import colored

firstrun_color = []
list_of_groupings = []
used_so_far = []
for colorquery in unique_colors:
    firstrun_color = []
    # DEBUG print (colorquery)
    for found, score in process.extract(colorquery, unique_colors, limit = 5):  
        # if "black" in colorquery:
            # print(colorquery,found,score)
        # Using the debug print above, 96 was determined to be the most effective cutoff.
        if score > 95:
            if found not in used_so_far:
                firstrun_color.append(found)
                # Needed a way to take out of the list and avoid repeats
                used_so_far.append(found)
    firstrun_color.sort(key=len)        
    if firstrun_color not in list_of_groupings:
        if firstrun_color != []:
            list_of_groupings.append(firstrun_color)
list_of_groupings.sort()            
# pprint.pprint(list_of_groupings)       
# Colors that can be combined into one value due to data entry errors
print("\nThe values in red can be combined to correct data entry errors.\n")
for printlist in list_of_groupings:
    if len(printlist) > 1:
        print(colored(printlist,'red'))
    else:
        print(printlist)


The values in red can be combined to correct data entry errors.

['apricot']
['black', '/black', 'black/']
['black /gray']
['black tiger/white']
['black/black', 'black /black']
['black/blue']
['black/brown', 'black /brown']
['black/chocolate']
['black/gray']
['black/orange', 'black /orange']
['black/seal']
['black/silver']
['black/white', 'black /white']
['blue', '/blue', 'blue/']
['blue /black']
['blue /blue cream']
['blue /buff']
['blue /cream']
['blue /gray']
['blue /tan']
['blue /white']
['blue cream', 'blue/cream', '/blue cream', 'blue cream/']
['blue cream/blue']
['blue cream/buff']
['blue cream/white']
['blue/blue']
['blue/brown']
['blue/gray']
['blue/orange', 'blue /orange']
['blue/white']
['brown', '/brown', 'brown /']
['brown /blue']
['brown /cream']
['brown /gray']
['brown /orange']
['brown merle']
['brown merle/brown']
['brown tiger']
['brown tiger/white']
['brown/black', 'brown /black']

In [29]:
print("Potential count of groupings after removing duplicates due to data entry errors:",len(list_of_groupings))

Potential count of groupings after removing duplicates due to data entry errors: 122
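
One way to act on these groupings is to map every variant to a single representative and rewrite the `Color` field with it. The sketch below runs on hypothetical groupings and rows. Choosing the first member of each group as the canonical value is my assumption; it relies on each group being sorted by length, so the shortest spelling comes first. `build_canonical_map` is an illustrative helper.

```python
def build_canonical_map(groupings):
    """Map each color variant to the first member of its group."""
    canonical = {}
    for group in groupings:
        representative = group[0]
        for variant in group:
            canonical[variant] = representative
    return canonical

# Hypothetical groupings and rows in the same shape as the notebook's data
sample_groupings = [
    ["black", "/black", "black/"],
    ["black/white", "black /white"],
    ["apricot"],
]
sample_rows = [
    {"Color": "black/"},
    {"Color": "black /white"},
    {"Color": "apricot"},
    {"Color": "tan"},
]

color_map = build_canonical_map(sample_groupings)
# Colors missing from the map are left unchanged
cleaned = [dict(row, Color=color_map.get(row["Color"], row["Color"])) for row in sample_rows]
print([row["Color"] for row in cleaned])
```

Building new dictionaries with `dict(row, Color=...)` leaves the original rows untouched, so the mapping can be reviewed before it replaces anything.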


##### Fuzzy Example 2 - Fuzzy matching for color reduction by grouping like colors

In [30]:
#  In this example fuzzy matching is used to drastically reduce the number of
#  color categories by grouping similar values that could possibly be merged.
#  It is not perfect, as the result partially depends on the order of the list, but given
#  the messiness of the data it could be used to reduce a subjective set of values much more cleanly.

#  The goal in this cell is to reduce the color choices as much as possible to limit subjectivity
#  while keeping some integrity from the values provided, thus showing the potential of fuzzy matching.

firstrun_color = []
list_of_groupings = []
used_so_far = []
for colorquery in unique_colors:
    firstrun_color = []
    # Gather top 35 matches
    for found, score in process.extract(colorquery, unique_colors, scorer=fuzz.token_set_ratio, limit = 35):      
        # Only consider scores over 75
        if score > 75:
            if found not in used_so_far:
                firstrun_color.append(found)
                # Needed a way to take out of the list and avoid repeats
                used_so_far.append(found)
    firstrun_color.sort()        
    if firstrun_color not in list_of_groupings:
        if firstrun_color != []:
            list_of_groupings.append(firstrun_color)
list_of_groupings.sort()
counter=0
print ("\nPotential new color groupings:")
for collist in list_of_groupings:
    # Add new line for easier reading of distinct new list of combined colors
    print("\n", collist)
    counter += 1        


Potential new color groupings:

 ['/black', 'black', 'black /black', 'black /brown', 'black /orange', 'black /white', 'black tiger/white', 'black/', 'black/black', 'black/brown', 'black/chocolate', 'black/orange', 'black/silver', 'black/white', 'brown /black', 'brown/black', 'chocolate/black', 'orange /black', 'silver /black', 'white/black']

 ['/blue', '/blue cream', 'black/blue', 'blue', 'blue /black', 'blue /blue cream', 'blue /buff', 'blue /cream', 'blue /gray', 'blue /orange', 'blue /white', 'blue cream', 'blue cream/', 'blue cream/blue', 'blue cream/buff', 'blue cream/white', 'blue/', 'blue/blue', 'blue/brown', 'blue/cream', 'blue/gray', 'blue/orange', 'blue/white', 'brown /blue', 'cream/blue', 'lynx /blue', 'white/blue']

 ['/brown', 'brown', 'brown /', 'brown /brown', 'brown /cream', 'brown /orange', 'brown /white', 'brown merle', 'brown merle/brown', 'brown tiger', 'brown tiger/white', 'brown/brown', 'brown/white', 'chocolate/brown', 'cream/brown', 'orange /brown', 'white/bro

In [31]:
print("New count of groupings after using fuzzy logic to combine colors:",len(list_of_groupings))

New count of groupings after using fuzzy logic to combine colors: 20
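
The order dependence noted in the comments for this example comes from the greedy assignment: each color joins the first group that claims it and is never reconsidered. The sketch below reproduces that effect with a simplified token-overlap test standing in for `fuzz.token_set_ratio`; it is not the library's algorithm. The helpers `tokens`, `related`, and `greedy_groups` and the three sample colors are hypothetical.

```python
import re

def tokens(color):
    """Split a color string into a set of lowercase word tokens."""
    return set(re.findall(r"[a-z]+", color.lower()))

def related(a, b):
    """Simplified stand-in: two colors are related if they share any token."""
    return bool(tokens(a) & tokens(b))

def greedy_groups(colors):
    """Group colors greedily; each color is assigned to the first group that claims it."""
    used = set()
    groups = []
    for query in colors:
        group = [c for c in colors if c not in used and related(query, c)]
        used.update(group)
        if group:
            groups.append(sorted(group))
    return groups

colors = ["black", "black/white", "white"]
forward = greedy_groups(colors)
backward = greedy_groups(list(reversed(colors)))
print(forward)
print(backward)
```

In this sketch `black/white` is grouped with `black` in one ordering and with `white` in the other, so any merge based on such groups deserves a manual review.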
