# Animal Shelter Outcomes

Every year, approximately 7.6 million companion animals end up in US shelters. Some are surrendered by their owners as unwanted, while others are picked up after getting lost or are removed from cruelty situations. Many of these animals find forever families to take them home, but just as many are not so lucky: 2.7 million dogs and cats are euthanized in the US every year.

### Data

Kaggle is hosting this competition for the machine learning community to use for data science practice and social good. The dataset is brought to you by Austin Animal Center. Shelter animal statistics were taken from the ASPCA.

### Predictors and Outcomes

The predictors are age, animal type, and sex. The outcome identifies whether the animal was adopted, euthanized, returned to its owner, or transferred.

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

In [2]:
# First import data file and look at data fields

animals = pd.read_csv("train.csv")
animals.head(5)

Unnamed: 0,AnimalID,Name,DateTime,OutcomeType,OutcomeSubtype,AnimalType,SexuponOutcome,AgeuponOutcome,Breed,Color
0,A671945,Hambone,2014-02-12 18:22:00,Return_to_owner,,Dog,Neutered Male,1 year,Shetland Sheepdog Mix,Brown/White
1,A656520,Emily,2013-10-13 12:44:00,Euthanasia,Suffering,Cat,Spayed Female,1 year,Domestic Shorthair Mix,Cream Tabby
2,A686464,Pearce,2015-01-31 12:28:00,Adoption,Foster,Dog,Neutered Male,2 years,Pit Bull Mix,Blue/White
3,A683430,,2014-07-11 19:09:00,Transfer,Partner,Cat,Intact Male,3 weeks,Domestic Shorthair Mix,Blue Cream
4,A667013,,2013-11-15 12:52:00,Transfer,Partner,Dog,Neutered Male,2 years,Lhasa Apso/Miniature Poodle,Tan
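
Before modelling, it can help to see how the outcome classes are distributed and how they differ by animal type. The following is a minimal sketch, not part of the original analysis: it rebuilds only the five rows previewed above as a toy frame (the name `sample` is hypothetical), so the proportions it prints say nothing about the full dataset. In the notebook, the same calls could be applied to `animals` directly.

```python
import pandas as pd

# Toy frame copied from the five preview rows above; swap in `animals`
# to run the same checks on the full training data.
sample = pd.DataFrame({
    "OutcomeType": ["Return_to_owner", "Euthanasia", "Adoption", "Transfer", "Transfer"],
    "AnimalType": ["Dog", "Cat", "Dog", "Cat", "Dog"],
})

# Share of each outcome class (fractions sum to 1)
outcome_share = sample["OutcomeType"].value_counts(normalize=True)
print(outcome_share)

# Outcome counts broken down by animal type
outcome_by_type = pd.crosstab(sample["AnimalType"], sample["OutcomeType"])
print(outcome_by_type)
```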


In [3]:
# How many records are there?

print(len(animals.index))

26729


In [4]:
# Check for missing data: count the null values in each column of interest

missing_name = animals['Name'].isnull().sum()
print("Missing Names: %d" % missing_name)

missing_datetime = animals['DateTime'].isnull().sum()
print("Missing Dates: %d" % missing_datetime)

missing_type = animals['AnimalType'].isnull().sum()
print("Missing Animal Types: %d" % missing_type)

missing_sex = animals['SexuponOutcome'].isnull().sum()
print("Missing Sexes:  %d" % missing_sex)

missing_age = animals['AgeuponOutcome'].isnull().sum()
print("Missing Ages: %d" % missing_age)

missing_breed = animals['Breed'].isnull().sum()
print("Missing Breeds: %d" % missing_breed)

missing_color = animals['Color'].isnull().sum()
print("Missing Colors: %d" % missing_color)

Missing Names: 7691
Missing Dates: 0
Missing Animal Types: 0
Missing Sexes:  1
Missing Ages: 18
Missing Breeds: 0
Missing Colors: 0


Names are unlikely to affect a shelter animal's outcome, so the 7,691 missing names can be ignored. The 18 missing ages and the one missing sex, however, will need to be filled in, because age and sex are expected to be important predictors of whether an animal gets adopted.
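
One way the missing values could be filled is sketched below. This is an illustration under stated assumptions, not the approach this notebook settles on: the toy frame and its values are made up, the missing sex is labelled `Unknown` (a category that already appears in `SexuponOutcome`), and the missing age is replaced with the median of a numeric age column, here called `age_in_years` after the column built later in this notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; column names follow the notebook, values are illustrative.
toy = pd.DataFrame({
    "SexuponOutcome": ["Neutered Male", None, "Spayed Female", "Intact Male"],
    "age_in_years": [1.0, 2.0, np.nan, 3.0],
})

# Assumption: treat a missing sex as the existing "Unknown" category
toy["SexuponOutcome"] = toy["SexuponOutcome"].fillna("Unknown")

# Assumption: replace a missing age with the median of the known ages
median_age = toy["age_in_years"].median()
toy["age_in_years"] = toy["age_in_years"].fillna(median_age)

print(toy)
```

The median is used rather than the mean because shelter ages tend to be skewed toward young animals; whether that holds for this dataset would need to be checked.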

In [5]:
# Partition ages into values that can be converted to integers
new = animals['AgeuponOutcome'].str.partition(" ")
new.head(5)


Unnamed: 0,0,1,2
0,1,,year
1,1,,year
2,2,,years
3,3,,weeks
4,2,,years


In [6]:
new.columns = ['how_old','space','age_unit']

In [7]:
del new['space']

In [8]:
# Join dataframe with partition
new_animals = pd.concat([animals, new],axis = 1)
new_animals.head(5)

Unnamed: 0,AnimalID,Name,DateTime,OutcomeType,OutcomeSubtype,AnimalType,SexuponOutcome,AgeuponOutcome,Breed,Color,how_old,age_unit
0,A671945,Hambone,2014-02-12 18:22:00,Return_to_owner,,Dog,Neutered Male,1 year,Shetland Sheepdog Mix,Brown/White,1,year
1,A656520,Emily,2013-10-13 12:44:00,Euthanasia,Suffering,Cat,Spayed Female,1 year,Domestic Shorthair Mix,Cream Tabby,1,year
2,A686464,Pearce,2015-01-31 12:28:00,Adoption,Foster,Dog,Neutered Male,2 years,Pit Bull Mix,Blue/White,2,years
3,A683430,,2014-07-11 19:09:00,Transfer,Partner,Cat,Intact Male,3 weeks,Domestic Shorthair Mix,Blue Cream,3,weeks
4,A667013,,2013-11-15 12:52:00,Transfer,Partner,Dog,Neutered Male,2 years,Lhasa Apso/Miniature Poodle,Tan,2,years


In [9]:
# Convert column to float
new_animals['num_how_old'] = new_animals['how_old'].astype(float)

In [11]:
# Look at how many different age units are used so they can be converted to a common unit
pd.unique(new_animals["age_unit"])

array(['year', 'years', 'weeks', 'month', 'months', 'days', 'week', 'day',
       None], dtype=object)

In [12]:
# Create a new column that expresses every age in a common unit (years)

## Default: rows whose unit is 'year' or 'years' already hold the age in years
new_animals['age_in_years'] = new_animals['num_how_old']
## Rows whose unit is 'month' or 'months': 12 months per year
in_months = new_animals['age_unit'].isin(['month', 'months'])
new_animals.loc[in_months, 'age_in_years'] = new_animals['num_how_old'] / 12
## Rows whose unit is 'week' or 'weeks': 52 weeks per year
in_weeks = new_animals['age_unit'].isin(['week', 'weeks'])
new_animals.loc[in_weeks, 'age_in_years'] = new_animals['num_how_old'] / 52
## Rows whose unit is 'day' or 'days': 365 days per year
in_days = new_animals['age_unit'].isin(['day', 'days'])
new_animals.loc[in_days, 'age_in_years'] = new_animals['num_how_old'] / 365

new_animals.head(10)

Unnamed: 0,AnimalID,Name,DateTime,OutcomeType,OutcomeSubtype,AnimalType,SexuponOutcome,AgeuponOutcome,Breed,Color,how_old,age_unit,num_how_old,age_in_years
0,A671945,Hambone,2014-02-12 18:22:00,Return_to_owner,,Dog,Neutered Male,1 year,Shetland Sheepdog Mix,Brown/White,1,year,1.0,1.000000
1,A656520,Emily,2013-10-13 12:44:00,Euthanasia,Suffering,Cat,Spayed Female,1 year,Domestic Shorthair Mix,Cream Tabby,1,year,1.0,1.000000
2,A686464,Pearce,2015-01-31 12:28:00,Adoption,Foster,Dog,Neutered Male,2 years,Pit Bull Mix,Blue/White,2,years,2.0,2.000000
3,A683430,,2014-07-11 19:09:00,Transfer,Partner,Cat,Intact Male,3 weeks,Domestic Shorthair Mix,Blue Cream,3,weeks,3.0,0.057692
4,A667013,,2013-11-15 12:52:00,Transfer,Partner,Dog,Neutered Male,2 years,Lhasa Apso/Miniature Poodle,Tan,2,years,2.0,2.000000
5,A677334,Elsa,2014-04-25 13:04:00,Transfer,Partner,Dog,Intact Female,1 month,Cairn Terrier/Chihuahua Shorthair,Black/Tan,1,month,1.0,0.083333
6,A699218,Jimmy,2015-03-28 13:11:00,Transfer,Partner,Cat,Intact Male,3 weeks,Domestic Shorthair Mix,Blue Tabby,3,weeks,3.0,0.057692
7,A701489,,2015-04-30 17:02:00,Transfer,Partner,Cat,Unknown,3 weeks,Domestic Shorthair Mix,Brown Tabby,3,weeks,3.0,0.057692
8,A671784,Lucy,2014-02-04 17:17:00,Adoption,,Dog,Spayed Female,5 months,American Pit Bull Terrier Mix,Red/White,5,months,5.0,0.416667
9,A677747,,2014-05-03 07:48:00,Adoption,Offsite,Dog,Spayed Female,1 year,Cairn Terrier,White,1,year,1.0,1.000000
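
The partition, cast, and per-unit steps above can also be folded into one helper driven by a lookup table. The sketch below is an alternative formulation, not the notebook's own code: the names `YEARS_PER_UNIT` and `age_to_years` are hypothetical, and the conversion factors (12 months, 52 weeks, 365 days per year) are the same approximations used above.

```python
import pandas as pd

# Years represented by one of each unit (same approximations as the cells above)
YEARS_PER_UNIT = {"year": 1.0, "month": 1.0 / 12, "week": 1.0 / 52, "day": 1.0 / 365}


def age_to_years(age_text):
    """Convert a Series of strings such as '3 weeks' to ages in years.

    Missing entries stay missing (NaN) in the result.
    """
    parts = age_text.str.split(" ", expand=True)
    count = pd.to_numeric(parts[0])   # '3' -> 3.0
    unit = parts[1].str.rstrip("s")   # 'weeks' -> 'week'
    return count * unit.map(YEARS_PER_UNIT)


ages = pd.Series(["1 year", "2 years", "3 weeks", "5 months", None])
converted = age_to_years(ages)
print(converted)
```

Keeping the factors in a single dictionary means a new unit only needs one extra entry rather than another pair of `.loc` assignments.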


In [14]:
# Get dummies for Animal type

animal_dummies = pd.get_dummies(new_animals['AnimalType'], prefix='type')

animal_dummies.head(5)

Unnamed: 0,type_Cat,type_Dog
0,0.0,1.0
1,1.0,0.0
2,0.0,1.0
3,1.0,0.0
4,0.0,1.0
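
With only two animal types, `type_Cat` and `type_Dog` carry the same information (one is always 1 minus the other), so a model needs just one of them. The sketch below shows one possible way to build a feature frame: it drops the redundant column with `drop_first=True` and encodes `SexuponOutcome` the same way. The toy data and the names `toy`, `sex_dummies`, and `features` are illustrative assumptions, not part of the original notebook.

```python
import pandas as pd

# Hypothetical toy data using the notebook's column names
toy = pd.DataFrame({
    "AnimalType": ["Dog", "Cat", "Dog", "Cat", "Dog"],
    "SexuponOutcome": ["Neutered Male", "Spayed Female", "Neutered Male",
                       "Intact Male", "Neutered Male"],
})

# drop_first=True removes the first category (Cat), leaving only type_Dog
type_dummies = pd.get_dummies(toy["AnimalType"], prefix="type", drop_first=True)

# One indicator column per sex category
sex_dummies = pd.get_dummies(toy["SexuponOutcome"], prefix="sex")

# Combine the indicators into a single numeric feature frame
features = pd.concat([type_dummies, sex_dummies], axis=1).astype(float)
print(features)
```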


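In [None]:
# Drop the string version of the age count; num_how_old holds the numeric copy
del new_animals['how_old']

In [None]:
# List the distinct sex categories
pd.unique(animals["SexuponOutcome"])

In [None]:
# List the distinct animal types
pd.unique(animals['AnimalType'])

In [None]:
# Count the distinct breeds
breeds = pd.unique(animals['Breed'])
len(breeds)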
