## Shelter Animal Analysis

### This is an analysis of Kaggle's [animal shelter](https://www.kaggle.com/c/shelter-animal-outcomes/data) dataset

In [1]:
import pandas as pd

### Importing the training dataset 

In [2]:
data = pd.read_csv("dataset/train.csv")

### Exploring dataset's shape and columns

In [3]:
print(data.shape)
print(data.columns)

(26729, 10)
Index(['AnimalID', 'Name', 'DateTime', 'OutcomeType', 'OutcomeSubtype',
       'AnimalType', 'SexuponOutcome', 'AgeuponOutcome', 'Breed', 'Color'],
      dtype='object')


### Exploring the first 10 rows 

In [4]:
data.head(10)

| | AnimalID | Name | DateTime | OutcomeType | OutcomeSubtype | AnimalType | SexuponOutcome | AgeuponOutcome | Breed | Color |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | A671945 | Hambone | 2014-02-12 18:22:00 | Return_to_owner | NaN | Dog | Neutered Male | 1 year | Shetland Sheepdog Mix | Brown/White |
| 1 | A656520 | Emily | 2013-10-13 12:44:00 | Euthanasia | Suffering | Cat | Spayed Female | 1 year | Domestic Shorthair Mix | Cream Tabby |
| 2 | A686464 | Pearce | 2015-01-31 12:28:00 | Adoption | Foster | Dog | Neutered Male | 2 years | Pit Bull Mix | Blue/White |
| 3 | A683430 | NaN | 2014-07-11 19:09:00 | Transfer | Partner | Cat | Intact Male | 3 weeks | Domestic Shorthair Mix | Blue Cream |
| 4 | A667013 | NaN | 2013-11-15 12:52:00 | Transfer | Partner | Dog | Neutered Male | 2 years | Lhasa Apso/Miniature Poodle | Tan |
| 5 | A677334 | Elsa | 2014-04-25 13:04:00 | Transfer | Partner | Dog | Intact Female | 1 month | Cairn Terrier/Chihuahua Shorthair | Black/Tan |
| 6 | A699218 | Jimmy | 2015-03-28 13:11:00 | Transfer | Partner | Cat | Intact Male | 3 weeks | Domestic Shorthair Mix | Blue Tabby |
| 7 | A701489 | NaN | 2015-04-30 17:02:00 | Transfer | Partner | Cat | Unknown | 3 weeks | Domestic Shorthair Mix | Brown Tabby |
| 8 | A671784 | Lucy | 2014-02-04 17:17:00 | Adoption | NaN | Dog | Spayed Female | 5 months | American Pit Bull Terrier Mix | Red/White |
| 9 | A677747 | NaN | 2014-05-03 07:48:00 | Adoption | Offsite | Dog | Spayed Female | 1 year | Cairn Terrier | White |


### Check for NaN Values
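Many values are missing in ```Name``` (7,691 of 26,729 rows, about 29%) and ```OutcomeSubtype``` (13,612 rows, about 51%). ```SexuponOutcome``` and ```AgeuponOutcome``` are missing 1 and 18 values respectively.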

In [5]:
print(data.info())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 26729 entries, 0 to 26728
Data columns (total 10 columns):
AnimalID          26729 non-null object
Name              19038 non-null object
DateTime          26729 non-null object
OutcomeType       26729 non-null object
OutcomeSubtype    13117 non-null object
AnimalType        26729 non-null object
SexuponOutcome    26728 non-null object
AgeuponOutcome    26711 non-null object
Breed             26729 non-null object
Color             26729 non-null object
dtypes: object(10)
memory usage: 2.0+ MB
None
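The non-null counts above can also be turned into missing-value counts directly. The following is a minimal sketch, assuming a small stand-in frame named `sample` (a hypothetical name; its values are copied from the first rows shown earlier) rather than the full `data` frame.

```python
import pandas as pd

# Tiny stand-in frame (hypothetical name); values copied from the first rows shown earlier.
sample = pd.DataFrame({
    "AnimalID": ["A671945", "A656520", "A683430", "A667013"],
    "Name": ["Hambone", "Emily", None, None],
    "OutcomeSubtype": [None, "Suffering", "Partner", "Partner"],
})

# Number of missing values per column.
missing_count = sample.isna().sum()

# Fraction of rows that are missing per column.
missing_share = sample.isna().mean()

print(missing_count)
print(missing_share)
```

Applied to the full frame, the same two calls would be `data.isna().sum()` and `data.isna().mean()`.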


### What is the meaning of every column in the data

- AnimalID: unique ID 
- Name: name
- DateTime: not documented here; presumably the date and time of the outcome
- OutcomeType: Adoption, Died, Euthanasia, Return_to_owner, Transfer
- OutcomeSubtype: Suffering, Partner, SCRP, Offsite, Foster, among others; missing for about half of the rows
- AnimalType: Cat or Dog
- SexuponOutcome: Neutered Male, Intact Male, Spayed Female, Intact Female, Unknown
- AgeuponOutcome: age as text with a unit, for example "3 weeks", "5 months" or "2 years"
- Breed: breed
- Color: color 
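Because `AgeuponOutcome` is stored as text, it has to be converted to a number before it can be compared across animals. The following is a rough sketch of one way to do that. The helper `age_to_days`, the `DAYS_PER_UNIT` table, the 30-day month, the 365-day year, and the `day` unit are all assumptions made for illustration; only weeks, months and years appear in the rows shown above.

```python
import pandas as pd

# Approximate unit lengths in days. The month and year lengths are assumptions,
# and the "day" unit is included only as a guess (it is not in the rows shown above).
DAYS_PER_UNIT = {"day": 1, "week": 7, "month": 30, "year": 365}

def age_to_days(age):
    """Convert strings like '2 years' or '3 weeks' to an approximate day count."""
    if pd.isna(age):
        return float("nan")
    count, unit = age.split()
    unit = unit.rstrip("s")  # '2 years' -> 'year'
    return int(count) * DAYS_PER_UNIT[unit]

# Example values taken from the first rows of the dataset, plus one missing value.
ages = pd.Series(["1 year", "2 years", "3 weeks", "1 month", "5 months", None])
age_days = ages.map(age_to_days)
print(age_days)
```

A unit that is not in `DAYS_PER_UNIT` raises a `KeyError`, which makes unexpected formats visible instead of silently dropping them.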

### What fraction of animals ends up with each outcome

In [6]:
data.groupby(['OutcomeType']).size().transform(lambda x: x/sum(x))

OutcomeType
Adoption           0.402896
Died               0.007370
Euthanasia         0.058177
Return_to_owner    0.179056
Transfer           0.352501
dtype: float64
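The values above are fractions between 0 and 1 rather than percentages. As an alternative sketch, pandas' `value_counts(normalize=True)` yields the same kind of fractions, which can then be scaled to percentages. The `outcomes` series below is made-up illustration data, not taken from the dataset.

```python
import pandas as pd

# Made-up illustration data, not taken from the dataset.
outcomes = pd.Series(
    ["Adoption", "Adoption", "Transfer", "Return_to_owner", "Adoption"],
    name="OutcomeType",
)

# Fraction of rows per outcome, sorted from most to least frequent.
shares = outcomes.value_counts(normalize=True)

# The same fractions expressed as percentages with one decimal place.
percent = (shares * 100).round(1)
print(percent)
```

Unlike the `groupby` version, `value_counts` orders the result by frequency instead of alphabetically.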

### What is the percentage of cats and dogs in the shelter

Dogs are the majority: about 58% of the records are dogs and about 42% are cats.

In [7]:
data.groupby(['AnimalType']).size().transform(lambda x: x/sum(x))

AnimalType
Cat    0.416551
Dog    0.583449
dtype: float64

### Outcome Type corresponding to Animal Type 

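The values below are fractions of the whole dataset, not fractions within each animal type, so each group has to be compared against its own total (about 0.417 for cats and 0.583 for dogs). For dogs, Adoption is the most common outcome (about 42% of dogs), followed by Return to Owner (about 27%) and Transfer (about 25%). For cats, Transfer comes first (about 49% of cats), followed by Adoption (about 38%), while Return to Owner is rare (about 4%). The outcome therefore does seem to depend on the Animal Type, most visibly for Return to Owner and Transfer.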

In [8]:
data.groupby(['AnimalType','OutcomeType']).size().transform(lambda x: x/sum(x))

AnimalType  OutcomeType    
Cat         Adoption           0.159826
            Died               0.005500
            Euthanasia         0.026563
            Return_to_owner    0.018706
            Transfer           0.205956
Dog         Adoption           0.243069
            Died               0.001871
            Euthanasia         0.031614
            Return_to_owner    0.160350
            Transfer           0.146545
dtype: float64
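To compare cats and dogs directly, it helps to normalize within each animal type so that every row sums to 1. A minimal sketch with `pd.crosstab` follows. The `toy` frame is invented for illustration and does not reproduce the dataset's numbers.

```python
import pandas as pd

# Invented toy data for illustration; it does not reproduce the dataset's numbers.
toy = pd.DataFrame({
    "AnimalType": ["Cat", "Cat", "Cat", "Cat", "Dog", "Dog", "Dog", "Dog"],
    "OutcomeType": ["Transfer", "Transfer", "Adoption", "Euthanasia",
                    "Adoption", "Adoption", "Return_to_owner", "Transfer"],
})

# normalize="index" divides each row by its own total,
# giving the outcome distribution within each animal type.
within_type = pd.crosstab(toy["AnimalType"], toy["OutcomeType"], normalize="index")
print(within_type)
```

On the real data the equivalent call would be `pd.crosstab(data["AnimalType"], data["OutcomeType"], normalize="index")`.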

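### Does the Outcome Type depend on whether the animal is a mixed breed?

A quick look at the Breed column shows that mixed-breed animals are identified in two ways: 1) the breed ends with "Mix", or 2) two or more breeds are separated by "/". We create a new column "Mix", where Mix = 1 when an animal is mixed and 0 otherwise. Note that the code below only checks for the "Mix" suffix, so cross-breeds written with "/" (for example "Lhasa Apso/Miniature Poodle") are still flagged as 0.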

In [9]:
# Flag breeds whose name ends with "Mix" (vectorized, same result as a row-by-row loop)
data["Mix"] = data["Breed"].str.endswith("Mix").astype(int)
data.head(10)

| | AnimalID | Name | DateTime | OutcomeType | OutcomeSubtype | AnimalType | SexuponOutcome | AgeuponOutcome | Breed | Color | Mix |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | A671945 | Hambone | 2014-02-12 18:22:00 | Return_to_owner | NaN | Dog | Neutered Male | 1 year | Shetland Sheepdog Mix | Brown/White | 1 |
| 1 | A656520 | Emily | 2013-10-13 12:44:00 | Euthanasia | Suffering | Cat | Spayed Female | 1 year | Domestic Shorthair Mix | Cream Tabby | 1 |
| 2 | A686464 | Pearce | 2015-01-31 12:28:00 | Adoption | Foster | Dog | Neutered Male | 2 years | Pit Bull Mix | Blue/White | 1 |
| 3 | A683430 | NaN | 2014-07-11 19:09:00 | Transfer | Partner | Cat | Intact Male | 3 weeks | Domestic Shorthair Mix | Blue Cream | 1 |
| 4 | A667013 | NaN | 2013-11-15 12:52:00 | Transfer | Partner | Dog | Neutered Male | 2 years | Lhasa Apso/Miniature Poodle | Tan | 0 |
| 5 | A677334 | Elsa | 2014-04-25 13:04:00 | Transfer | Partner | Dog | Intact Female | 1 month | Cairn Terrier/Chihuahua Shorthair | Black/Tan | 0 |
| 6 | A699218 | Jimmy | 2015-03-28 13:11:00 | Transfer | Partner | Cat | Intact Male | 3 weeks | Domestic Shorthair Mix | Blue Tabby | 1 |
| 7 | A701489 | NaN | 2015-04-30 17:02:00 | Transfer | Partner | Cat | Unknown | 3 weeks | Domestic Shorthair Mix | Brown Tabby | 1 |
| 8 | A671784 | Lucy | 2014-02-04 17:17:00 | Adoption | NaN | Dog | Spayed Female | 5 months | American Pit Bull Terrier Mix | Red/White | 1 |
| 9 | A677747 | NaN | 2014-05-03 07:48:00 | Adoption | Offsite | Dog | Spayed Female | 1 year | Cairn Terrier | White | 0 |


In [10]:
data.groupby(['Mix']).size().transform(lambda x: x/sum(x))

Mix
0    0.165775
1    0.834225
dtype: float64

About 83% of the animals are flagged as mixed breed, so the two groups are heavily imbalanced. For that reason, the effect of mixed/non-mixed on the outcome is not explored further here.
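The Mix column computed in In [9] checks only the "Mix" suffix, while the description also counts breeds joined by "/" as mixed. A possible sketch that covers both rules is shown below. The `breeds` series reuses breed names from the rows above, and the variable names are hypothetical. The proportions in Out [10] were not recomputed with this rule.

```python
import pandas as pd

# Breed names copied from the first rows of the dataset.
breeds = pd.Series([
    "Shetland Sheepdog Mix",
    "Lhasa Apso/Miniature Poodle",
    "Cairn Terrier/Chihuahua Shorthair",
    "Cairn Terrier",
], name="Breed")

# Rule 1: the breed ends with "Mix".
# Rule 2: the breed lists several breeds separated by "/".
is_mix = breeds.str.endswith("Mix") | breeds.str.contains("/", regex=False)

# Store the flag as 1/0 to match the existing Mix column.
mix_flag = is_mix.astype(int)
print(mix_flag.tolist())
```

With this rule, rows 4 and 5 of the table above would be flagged as mixed, so the share of mixed-breed animals would be at least as high as the 83% reported.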

### Does animal age affect the adoption rate?

To be continued.