I wanted to look at the average age for each outcome type. I assumed that younger animals would be more likely to be adopted.

In [1]:
import numpy as np
import pandas as pd
from pandas import DataFrame,Series
import seaborn as sns

In [2]:
shelterdframe = pd.read_csv('aac_intakes_outcomes.csv')

In [3]:
# Convert age at outcome from days to years (vectorized, no explicit loop)
shelterdframe['number_age_years'] = shelterdframe['age_upon_outcome_(days)'] / 365

In [4]:
avg_age_by_outcome = shelterdframe['number_age_years'].groupby(shelterdframe['outcome_type']).mean()

In [5]:
avg_age_by_outcome

outcome_type
Adoption           1.678934
Died               1.490026
Disposal           1.169656
Euthanasia         2.505810
Missing            1.624955
Relocate           0.941553
Return to Owner    3.949087
Rto-Adopt          3.361904
Transfer           1.603294
Name: number_age_years, dtype: float64

From these averages, you can see the highest average age is in animals that were returned to their owner (about 3.9 years), followed by Rto-Adopt (about 3.4 years) and euthanasia (about 2.5 years). The relatively high average for euthanized animals makes sense, as animals are more prone to health complications with age. The lowest average age is seen in animals that were relocated (about 0.9 years). It is interesting that the average age for adoption (about 1.7 years) is similar to or slightly higher than that of several other categories, such as Transfer, Missing, and Died. Several things could lead to this type of result. For example, if many more adult dogs than puppies are taken into the shelter, the average age would be pulled upward. To really look into this question, I will need to separate the animals by age category (e.g. puppy, adult) and then look at the distribution of outcome types within each group.
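
One way to check how much a few older animals pull a group's average is to look at the median and the group size next to the mean. The following is only a minimal sketch on invented toy values (not the shelter data); the names `toy` and `age_summary` are made up for illustration, while the column names mirror the ones used above.

```python
import pandas as pd

# Toy stand-in for the shelter data; the values are invented for illustration.
toy = pd.DataFrame({
    'outcome_type': ['Adoption', 'Adoption', 'Adoption', 'Transfer', 'Transfer'],
    'number_age_years': [0.2, 0.5, 8.0, 1.0, 2.0],
})

# Mean, median, and group size side by side for each outcome type.
age_summary = (
    toy.groupby('outcome_type')['number_age_years']
       .agg(['mean', 'median', 'count'])
)
print(age_summary)
```

In the toy Adoption group, a single 8-year-old animal lifts the mean well above the median, which is the kind of effect that can hide a preference for young animals in an average.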

To begin, I am separating out the dogs. It will be best to look at this type of question for each species.

In [6]:
dogs_only = shelterdframe.loc[shelterdframe['animal_type'] == 'Dog'].copy()

Using a list comprehension, I added a column denoting whether each dog is a puppy (under one year old) or an adult. Taking an explicit copy of the filtered rows above keeps pandas from raising a SettingWithCopyWarning when the new column is assigned.

In [7]:
dogs_only['age_category'] = ["Puppy" if i < 1.0 else "Adult" for i in dogs_only['number_age_years']]


In [8]:
dogs_only.head()

,age_upon_outcome,animal_id_outcome,date_of_birth,outcome_subtype,outcome_type,sex_upon_outcome,age_upon_outcome_(days),age_upon_outcome_(years),age_upon_outcome_age_group,outcome_datetime,...,intake_month,intake_year,intake_monthyear,intake_weekday,intake_hour,intake_number,time_in_shelter,time_in_shelter_days,number_age_years,age_category
0,10 years,A006100,2007-07-09 00:00:00,,Return to Owner,Neutered Male,3650,10.0,"(7.5, 10.0]",2017-12-07 14:07:00,...,12,2017,2017-12,Thursday,14,1.0,0 days 14:07:00.000000000,0.588194,10.0,Adult
1,7 years,A006100,2007-07-09 00:00:00,,Return to Owner,Neutered Male,2555,7.0,"(5.0, 7.5]",2014-12-20 16:35:00,...,12,2014,2014-12,Friday,10,2.0,1 days 06:14:00.000000000,1.259722,7.0,Adult
2,6 years,A006100,2007-07-09 00:00:00,,Return to Owner,Neutered Male,2190,6.0,"(5.0, 7.5]",2014-03-08 17:10:00,...,3,2014,2014-03,Friday,14,3.0,1 days 02:44:00.000000000,1.113889,6.0,Adult
3,10 years,A047759,2004-04-02 00:00:00,Partner,Transfer,Neutered Male,3650,10.0,"(7.5, 10.0]",2014-04-07 15:12:00,...,4,2014,2014-04,Wednesday,15,1.0,4 days 23:17:00.000000000,4.970139,10.0,Adult
4,16 years,A134067,1997-10-16 00:00:00,,Return to Owner,Neutered Male,5840,16.0,"(15.0, 17.5]",2013-11-16 11:54:00,...,11,2013,2013-11,Saturday,9,1.0,0 days 02:52:00.000000000,0.119444,16.0,Adult
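
The puppy/adult labelling can also be written without a Python-level loop. This is a sketch only, on an invented toy frame (`toy` and `toy_dogs` are hypothetical names, and the rows are made up); it uses `numpy.where` as a vectorized stand-in for the list comprehension and works on an explicit copy of the filtered rows.

```python
import numpy as np
import pandas as pd

# Invented rows for illustration; column names mirror the notebook's.
toy = pd.DataFrame({
    'animal_type': ['Dog', 'Cat', 'Dog', 'Dog'],
    'number_age_years': [0.5, 3.0, 1.0, 4.2],
})

# Explicit copy so the new column is added to an independent DataFrame.
toy_dogs = toy.loc[toy['animal_type'] == 'Dog'].copy()

# Vectorized equivalent of the list comprehension: under one year is a puppy.
toy_dogs['age_category'] = np.where(
    toy_dogs['number_age_years'] < 1.0, 'Puppy', 'Adult'
)
print(toy_dogs)
```

As in the list comprehension, a dog that is exactly one year old is labelled an adult.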


For convenience, I am creating an adult and puppy DataFrame

In [10]:
adult = dogs_only.loc[dogs_only['age_category'] == 'Adult']
puppy = dogs_only.loc[dogs_only['age_category'] == 'Puppy']

In [11]:
adult.head()

,age_upon_outcome,animal_id_outcome,date_of_birth,outcome_subtype,outcome_type,sex_upon_outcome,age_upon_outcome_(days),age_upon_outcome_(years),age_upon_outcome_age_group,outcome_datetime,...,intake_month,intake_year,intake_monthyear,intake_weekday,intake_hour,intake_number,time_in_shelter,time_in_shelter_days,number_age_years,age_category
0,10 years,A006100,2007-07-09 00:00:00,,Return to Owner,Neutered Male,3650,10.0,"(7.5, 10.0]",2017-12-07 14:07:00,...,12,2017,2017-12,Thursday,14,1.0,0 days 14:07:00.000000000,0.588194,10.0,Adult
1,7 years,A006100,2007-07-09 00:00:00,,Return to Owner,Neutered Male,2555,7.0,"(5.0, 7.5]",2014-12-20 16:35:00,...,12,2014,2014-12,Friday,10,2.0,1 days 06:14:00.000000000,1.259722,7.0,Adult
2,6 years,A006100,2007-07-09 00:00:00,,Return to Owner,Neutered Male,2190,6.0,"(5.0, 7.5]",2014-03-08 17:10:00,...,3,2014,2014-03,Friday,14,3.0,1 days 02:44:00.000000000,1.113889,6.0,Adult
3,10 years,A047759,2004-04-02 00:00:00,Partner,Transfer,Neutered Male,3650,10.0,"(7.5, 10.0]",2014-04-07 15:12:00,...,4,2014,2014-04,Wednesday,15,1.0,4 days 23:17:00.000000000,4.970139,10.0,Adult
4,16 years,A134067,1997-10-16 00:00:00,,Return to Owner,Neutered Male,5840,16.0,"(15.0, 17.5]",2013-11-16 11:54:00,...,11,2013,2013-11,Saturday,9,1.0,0 days 02:52:00.000000000,0.119444,16.0,Adult


In [12]:
puppy.head()

,age_upon_outcome,animal_id_outcome,date_of_birth,outcome_subtype,outcome_type,sex_upon_outcome,age_upon_outcome_(days),age_upon_outcome_(years),age_upon_outcome_age_group,outcome_datetime,...,intake_month,intake_year,intake_monthyear,intake_weekday,intake_hour,intake_number,time_in_shelter,time_in_shelter_days,number_age_years,age_category
1217,11 months,A566868,2013-03-24 00:00:00,,Return to Owner,Intact Female,330,0.90411,"(-0.025, 2.5]",2014-02-24 11:42:00,...,2,2014,2014-02,Sunday,17,3.0,0 days 17:45:00.000000000,0.739583,0.90411,Puppy
1218,10 months,A566868,2013-03-24 00:00:00,,Return to Owner,Intact Female,300,0.821918,"(-0.025, 2.5]",2014-01-23 13:19:00,...,1,2014,2014-01,Monday,17,4.0,2 days 19:21:00.000000000,2.80625,0.821918,Puppy
2283,9 months,A619084,2014-09-09 00:00:00,,Return to Owner,Neutered Male,270,0.739726,"(-0.025, 2.5]",2015-06-10 13:30:00,...,6,2015,2015-06,Tuesday,14,3.0,0 days 22:36:00.000000000,0.941667,0.739726,Puppy
3141,11 months,A644592,2012-10-14 00:00:00,,Return to Owner,Spayed Female,330,0.90411,"(-0.025, 2.5]",2013-10-10 18:14:00,...,10,2013,2013-10,Thursday,11,1.0,0 days 06:59:00.000000000,0.290972,0.90411,Puppy
3189,10 months,A646163,2012-11-18 00:00:00,Partner,Transfer,Spayed Female,300,0.821918,"(-0.025, 2.5]",2013-10-10 11:59:00,...,10,2013,2013-10,Monday,12,1.0,2 days 23:05:00.000000000,2.961806,0.821918,Puppy


For each of the adult and puppy DataFrames, I computed the percentage of dogs that fall into each outcome category.

In [13]:
adult_outcomes = adult['outcome_type'].groupby(adult['outcome_type']).count()
total_outcomes = adult['outcome_type'].count()

In [34]:
adult_outcomes


outcome_type
Adoption           13105
Died                  74
Disposal               9
Euthanasia          1324
Missing               13
Return to Owner    11833
Rto-Adopt            129
Transfer            6120
Name: outcome_type, dtype: int64

In [15]:
total_outcomes

32607

In [35]:
list_of_outcomes = adult_outcomes.tolist()


In [23]:
percentages = []
for outcome in list_of_outcomes:
    proportion = (outcome / total_outcomes) * 100
    percentages.append(proportion)
    

In [42]:
adult_dframe = DataFrame(adult_outcomes)

In [43]:
adult_dframe['adult_percentages'] = percentages

In [56]:
adults_complete = adult_dframe.rename(index=str, columns={'outcome_type':'adult_counts'})

In [31]:
puppy_outcomes = puppy['outcome_type'].groupby(puppy['outcome_type']).count()
totals = puppy['outcome_type'].count()


In [32]:
outcome_list = puppy_outcomes.tolist()

In [36]:
pup_percent = []
for outcome in outcome_list:
    pup_prop = (outcome / totals) * 100
    pup_percent.append(pup_prop)

In [37]:
pup_dframe = DataFrame(puppy_outcomes)

In [38]:
pup_dframe['puppy_percentages'] = pup_percent

In [55]:
puppies_complete = pup_dframe.rename(index=str, columns={'outcome_type':'puppy_counts'})
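
The count-then-percentage steps above can be condensed with `value_counts`, whose `normalize=True` option returns proportions directly. A minimal sketch on an invented toy Series follows; `toy_outcomes`, `counts`, `toy_percentages`, and `summary` are names made up for this example.

```python
import pandas as pd

# Invented outcomes for illustration only.
toy_outcomes = pd.Series(
    ['Adoption', 'Adoption', 'Transfer', 'Adoption'],
    name='outcome_type',
)

# Raw counts per outcome, and the same counts as percentages of the total.
counts = toy_outcomes.value_counts()
toy_percentages = toy_outcomes.value_counts(normalize=True) * 100

# Both Series share the outcome labels as their index, so they align by label.
summary = pd.DataFrame(
    {'counts': counts, 'percentages': toy_percentages}
).sort_index()
print(summary)
```

Because the two Series are aligned on their index, there is no need to convert to a list and rely on the order of the values matching.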

With the adult and puppy results in two DataFrames, I merged them on their shared index (outcome_type) to get one final DataFrame comparing the percentage of each age group that has each outcome type.

In [60]:
final_df = adults_complete.merge(puppies_complete,on= 'outcome_type')


In [61]:
final_df

outcome_type,adult_counts,adult_percentages,puppy_counts,puppy_percentages
Adoption,13105,40.190757,7476,58.60312
Died,74,0.226945,68,0.533041
Disposal,9,0.027601,1,0.007839
Euthanasia,1324,4.060478,178,1.395312
Missing,13,0.039869,3,0.023517
Return to Owner,11833,36.289754,1440,11.28792
Rto-Adopt,129,0.395621,14,0.109744
Transfer,6120,18.768976,3577,28.039508
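
A comparison table of this shape can also be produced in one step with `pandas.crosstab`. This is a sketch under stated assumptions rather than the method used above: the toy rows are invented, and `toy_dogs` and `outcome_pct` are hypothetical names. Setting `normalize='columns'` makes each age-category column sum to 1, so multiplying by 100 gives the percentage of each age group with each outcome.

```python
import pandas as pd

# Invented rows for illustration; column names mirror the notebook's.
toy_dogs = pd.DataFrame({
    'age_category': ['Adult', 'Adult', 'Adult', 'Adult', 'Puppy', 'Puppy'],
    'outcome_type': ['Adoption', 'Return to Owner', 'Return to Owner',
                     'Transfer', 'Adoption', 'Transfer'],
})

# Rows are outcomes, columns are age groups; each column sums to 100.
outcome_pct = pd.crosstab(
    toy_dogs['outcome_type'],
    toy_dogs['age_category'],
    normalize='columns',
) * 100
print(outcome_pct.round(1))
```

An outcome that occurs in only one age group still gets a row here, with 0 for the other group, whereas the default inner merge would drop it.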


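By separating the dogs into age groups, we can see that puppies are adopted more often than adult dogs: 58.6% of puppy outcomes are adoptions, compared with 40.2% of adult outcomes. Adult dogs, by contrast, are far more likely to be returned to their owner (36.3% versus 11.3%). We were unable to see this in the combined averages, since there are more adult dogs in the shelter than puppies.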