# Data Analysis Project 5: Pet Shelter Analysis

Import pandas and CSVs

In [131]:
import pandas as pd

intakes_outcomes = pd.read_csv('archive/aac_intakes_outcomes.csv')
intakes = pd.read_csv('archive/aac_intakes.csv')
outcomes = pd.read_csv('archive/aac_outcomes.csv')

## 1. Is there an area where more pets are found?
   - Find the top 5 places where animals are found so the shelter can coordinate with local volunteers and animal control to monitor these areas.

Strategy
- Take the "found_location" series, count how many times each value appears, and print the top five.

In [172]:
top_areas = intakes['found_location'].value_counts().head()

print("The top 5 places where animals were found are:")
for label, value in top_areas.items():
    print(f"- In {label}, {value} animals were found.")

The top 5 places where animals were found are:
- In Austin (TX), 14443 animals were found.
- In Outside Jurisdiction, 948 animals were found.
- In Travis (TX), 921 animals were found.
- In 7201 Levander Loop in Austin (TX), 517 animals were found.
- In Del Valle (TX), 411 animals were found.


## 2. What is the average number of pets found in a month in the year 2015?
   - Are there months where there is a higher number of animals found?
   - Knowing the number of pets the shelter might see in a month can help them gather enough resources and donations to care for the animals they receive.

Strategy
- Convert the datetime column to type datetime.
- Filter rows for 2015.
- Group the intakes by month and count them.
- Find the mean of the monthly intakes.
- Select the months with a higher intake count than the mean.


In [191]:
intakes['datetime'] = pd.to_datetime(intakes['datetime'])

intakes_2015 = intakes[intakes['datetime'].dt.year == 2015]

monthly_intakes = intakes_2015.groupby(intakes_2015['datetime'].dt.month).size()

mean_intakes = monthly_intakes.mean()

higher_months = monthly_intakes[monthly_intakes > mean_intakes]

print(f"The 2015 monthly mean was {mean_intakes:.2f} intakes.")
print("Months with intakes above the mean:")
for label, value in higher_months.items():
    print(f"- Month {label} had {value} intakes.")

The 2015 monthly mean was 1559.33 intakes.
Months with intakes above the mean:
- Month 5 had 2094 intakes.
- Month 6 had 2189 intakes.
- Month 7 had 1635 intakes.
- Month 8 had 1718 intakes.
- Month 9 had 1591 intakes.
- Month 10 had 1740 intakes.
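
Month numbers are harder to scan than month names. As a minimal sketch, the snippet below relabels the month index with names from the standard library's `calendar` module. It assumes the six above-mean counts printed above are re-entered by hand as a stand-in for `higher_months`; `named_months` is a name introduced here for illustration.

```python
import calendar

import pandas as pd

# Stand-in for `higher_months`, re-entered from the printed output above.
higher_months = pd.Series({5: 2094, 6: 2189, 7: 1635, 8: 1718, 9: 1591, 10: 1740})

# Relabel the integer month index (1-12) with month names.
named_months = higher_months.rename(index=lambda m: calendar.month_name[m])

for label, value in named_months.items():
    print(f"- {label} had {value} intakes.")
```

In the notebook itself the same `rename` call could be applied directly to `higher_months` before the print loop.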


## 3. What is the ratio of incoming pets vs. adopted pets?
   - This key metric helps the shelter gauge how well it is placing the animals it takes in.

Strategy
- Find the total number of intakes.
- Find the total number of adoptions.
- Divide adoptions by intakes to get the share of incoming pets that were adopted.

In [201]:
total_intakes = len(intakes)

total_adoptions = outcomes[outcomes['outcome_type'] == 'Adoption'].shape[0]

ratio = total_adoptions / total_intakes

print(f"The ratio of adopted pets to incoming pets is: {ratio:.2f}")

The ratio of adopted pets to incoming pets is: 0.43


## 4. What is the distribution of the types of animals in the shelter?
   - Find the count of each type of animal in the shelter.

Strategy
- Count the number of occurrences of each type in the animal type column.

In [209]:
animal_types = intakes['animal_type'].value_counts()

print("The count of each type of animal:")
for label, value in animal_types.items():
    print(f"- {label}: {value}")

The count of each type of animal:
- Dog: 45743
- Cat: 29659
- Other: 4434
- Bird: 342
- Livestock: 9
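
A distribution is often easier to read as percentage shares than as raw counts. The sketch below assumes the five counts printed above are re-entered by hand as a stand-in for `animal_types`; `shares` is a name introduced here for illustration.

```python
import pandas as pd

# Counts re-entered from the printed output above (stand-in for `animal_types`).
animal_types = pd.Series(
    {"Dog": 45743, "Cat": 29659, "Other": 4434, "Bird": 342, "Livestock": 9}
)

# Divide each count by the grand total to get a percentage share.
shares = animal_types / animal_types.sum() * 100

for label, share in shares.items():
    print(f"- {label}: {share:.1f}%")
```

On the real column, `intakes['animal_type'].value_counts(normalize=True)` should give the same shares as fractions between 0 and 1.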


## 5. What are the adoption rates for specific breeds?
   - Find the top 5 breeds in the shelter (based on count, across all animal types).
   - Find the adoption percentage of each breed.

Strategy
- Count how many times each breed appears and keep the top 5.
- Filter for adoptions and count the number of adoptions per breed.
- Filter 'adoptions_by_breed' by 'top_five_breeds'.
- Divide adoptions by intakes for each breed and multiply by 100.

In [None]:
top_five_breeds = intakes['breed'].value_counts().head()

adoptions_by_breed = outcomes[outcomes['outcome_type'] == 'Adoption']['breed'].value_counts()

breed_filtered_adoptions = adoptions_by_breed[adoptions_by_breed.index.isin(top_five_breeds.index)]

breed_adoption_percentage = (breed_filtered_adoptions / top_five_breeds) * 100

print("Adoption percentages for the top 5 breeds:")
for label, value in breed_adoption_percentage.items():
    print(f"- {label}: {value:.2f}%")

Adoption percentages for the top 5 breeds:
- Chihuahua Shorthair Mix: 47.18%
- Domestic Medium Hair Mix: 45.66%
- Domestic Shorthair Mix: 43.07%
- Labrador Retriever Mix: 49.66%
- Pit Bull Mix: 37.32%
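
The division in the cell above works because pandas lines up two Series by index label rather than by position. The sketch below illustrates that with hypothetical toy counts (the breed names and numbers are made up, not taken from the shelter data).

```python
import pandas as pd

# Hypothetical toy counts, not taken from the shelter data.
intake_counts = pd.Series({"Breed B": 10, "Breed A": 4})
adoption_counts = pd.Series({"Breed A": 1, "Breed B": 5, "Breed C": 7})

# Keep only the breeds present in intake_counts, then divide.
# pandas matches rows by label, so the differing order does not matter.
filtered = adoption_counts[adoption_counts.index.isin(intake_counts.index)]
percentage = filtered / intake_counts * 100

for label, value in percentage.items():
    print(f"- {label}: {value:.2f}%")
```

Because the result follows the aligned index rather than the count order, the breeds in the notebook output are not listed from most to least common; adding `.sort_values(ascending=False)` would order them by rate.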


## 6. What are the adoption rates for different colorings?
- Find the top 5 colorings in the shelter (based on count)
- Find the adoption percentage of each color

Strategy
- Count how many times each color appears and keep the top 5.
- Filter for adoptions and count the number of adoptions per color.
- Filter adoptions_by_color for top_five_colors.
- Divide adoptions by intakes for each color and multiply by 100.
- Sort the values in descending order.

In [211]:
top_five_colors = intakes['color'].value_counts().head()

adoptions_by_color = outcomes[outcomes['outcome_type'] == 'Adoption']['color'].value_counts()

color_filtered_adoptions = adoptions_by_color[adoptions_by_color.index.isin(top_five_colors.index)]

color_adoption_percentage = ((color_filtered_adoptions / top_five_colors) * 100).sort_values(ascending=False)

print("Adoption percentages for the top 5 colors:")
for label, value in color_adoption_percentage.items():
    print(f"- {label}: {value:.2f}%")

Adoption percentages for the top 5 colors:
- Black/White: 45.73%
- Brown Tabby: 42.66%
- Black: 41.09%
- White: 37.98%
- Brown: 22.19%
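
Questions 5 and 6 follow the same steps with a different column, so the logic could be wrapped in one function. This is only a sketch: `adoption_rate_by` and the toy frames are hypothetical names and data introduced here, and it assumes both frames share the column being analysed plus an `outcome_type` column on the outcomes side.

```python
import pandas as pd


def adoption_rate_by(intakes, outcomes, column, top_n=5):
    """Adoption percentage for the top_n most common values of `column`."""
    top_counts = intakes[column].value_counts().head(top_n)
    adopted = outcomes.loc[outcomes["outcome_type"] == "Adoption", column]
    # reindex keeps only the top values and fills missing ones with 0 adoptions.
    adopted_counts = adopted.value_counts().reindex(top_counts.index, fill_value=0)
    return (adopted_counts / top_counts * 100).sort_values(ascending=False)


# Hypothetical toy frames standing in for the real CSVs.
toy_intakes = pd.DataFrame(
    {"color": ["Black", "Black", "White", "White", "White", "Tan"]}
)
toy_outcomes = pd.DataFrame(
    {
        "color": ["Black", "White", "White", "Tan"],
        "outcome_type": ["Adoption", "Adoption", "Transfer", "Adoption"],
    }
)

rates = adoption_rate_by(toy_intakes, toy_outcomes, "color", top_n=2)
print(rates)
```

With the real data, `adoption_rate_by(intakes, outcomes, 'breed')` and `adoption_rate_by(intakes, outcomes, 'color')` would be the corresponding calls.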


## 7. About how many animals are spayed/neutered each month?
   - This will help the shelter allocate resources and staff. Assume that all intact males and females will be spayed/neutered.

Strategy
- Convert 'datetime' column to datetime
- Create a 'month' column as a period object for monthly grouping
- Filter for Spayed Females, Intact Females, Neutered Males and Intact Males
- Concatenate all filtered outcomes into one DataFrame
- Group by month, count occurrences, and take the mean of the monthly counts

In [216]:
outcomes['datetime'] = pd.to_datetime(outcomes['datetime'])

outcomes['month'] = outcomes['datetime'].dt.to_period('M')
 
spayed_female = outcomes[outcomes['sex_upon_outcome'] == 'Spayed Female']
intact_female = outcomes[outcomes['sex_upon_outcome'] == 'Intact Female']
neutered_male = outcomes[outcomes['sex_upon_outcome'] == 'Neutered Male']
intact_male = outcomes[outcomes['sex_upon_outcome'] == 'Intact Male']

monthly_spayed_neutered = pd.concat([spayed_female, intact_female, neutered_male, intact_male])

monthly_mean = monthly_spayed_neutered.groupby('month').size().mean()

print(f"Average number of animals spayed/neutered each month: {monthly_mean}")

Average number of animals spayed/neutered each month: 1343.0
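
The four separate filters followed by `pd.concat` can also be written as a single `isin` mask. The sketch below uses hypothetical toy rows in place of `outcomes`; `statuses`, `known_sex` and `per_month` are names introduced here for illustration.

```python
import pandas as pd

# Hypothetical toy rows standing in for `outcomes`.
toy = pd.DataFrame(
    {
        "datetime": pd.to_datetime(
            ["2015-01-03", "2015-01-20", "2015-02-11", "2015-02-15", "2015-02-28"]
        ),
        "sex_upon_outcome": [
            "Spayed Female",
            "Intact Male",
            "Unknown",
            "Neutered Male",
            "Intact Female",
        ],
    }
)
toy["month"] = toy["datetime"].dt.to_period("M")

statuses = ["Spayed Female", "Intact Female", "Neutered Male", "Intact Male"]

# One boolean mask replaces four filters plus pd.concat.
known_sex = toy[toy["sex_upon_outcome"].isin(statuses)]
per_month = known_sex.groupby("month").size()
monthly_mean = per_month.mean()

print(per_month)
print(f"Mean per month: {monthly_mean}")
```

Keeping `per_month` as its own variable also makes it easy to inspect individual months before averaging.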


## Extra Credit

## 1. How many animals in the shelter are repeats? Which animal was returned to the shelter the most?
   - This means the animal has been brought in more than once.

Strategy
- Find how many animals are repeats
- Find the top returned animal's id
- Search for the top animal id's name

In [223]:
repeat_animals = sum(intakes['animal_id'].value_counts() > 1)

most_returned_animal_id = intakes['animal_id'].value_counts().index[0]

animal_name = intakes[intakes['animal_id'] == most_returned_animal_id]['name'].iloc[0]

print(f"Number of repeat animals: {repeat_animals} \nMost returned animal's name: {animal_name}")

Number of repeat animals: 6154 
Most returned animal's name: Lil Bit
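
To see how the repeat count and the most-returned ID fall out of a single `value_counts` call, here is a small sketch on hypothetical toy IDs. It also reports how many intakes the top animal had, which the cell above does not print.

```python
import pandas as pd

# Hypothetical toy IDs standing in for intakes['animal_id'].
animal_ids = pd.Series(["A1", "A2", "A1", "A3", "A2", "A1"])

# Number of intake rows per animal, most frequent first.
visits = animal_ids.value_counts()

repeat_animals = int((visits > 1).sum())   # animals seen more than once
most_returned_id = visits.idxmax()         # ID with the most intakes
most_returned_visits = int(visits.max())   # how many intakes that was

print(f"Repeat animals: {repeat_animals}")
print(f"Most returned: {most_returned_id} ({most_returned_visits} intakes)")
```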


## 2. What are the adoption rates for the following age groups?
- baby: 4 months and less
- young: 5 months - 2 years
- adult: 3 years - 10 years
- senior: 11+ years

Strategy
- Save a series of 'all_outcomes' and 'adoption_outcomes'
- Create lists to hold the totals and adoptions for each group
- Sort the totals and adoptions into groups by age
- Calculate the adoption rate

In [None]:
all_outcomes = outcomes['age_upon_outcome']

adoption_outcomes = outcomes[outcomes['outcome_type'] == 'Adoption']['age_upon_outcome']


total_babies, total_young, total_adult, total_senior = [], [], [], []

adopted_babies, adopted_young, adopted_adult, adopted_senior = [], [], [], []

rates_baby, rates_young, rates_adult, rates_senior = [], [], [], []


def adoption_percentage(total, adopted):
    return len(adopted) / len(total) * 100 


def categorize_outcomes(outcome):
    """Return the age group for one 'age_upon_outcome' value, or None."""
    # Any value without 'month' or 'year' (for example an age in weeks or
    # days, or a missing value) falls through to None and is not counted.
    text = str(outcome)
    if 'month' in text:
        age = int(text.split(" ")[0])
        return 'babies' if age <= 4 else 'young'
    elif 'year' in text:
        age = int(text.split(" ")[0])
        if age <= 2:
            return 'young'
        elif age <= 10:
            return 'adult'
        else:
            return 'senior'
    return None


for outcome in all_outcomes:
    category = categorize_outcomes(outcome)
    if category == 'babies':
        total_babies.append(outcome)
    elif category == 'young':
        total_young.append(outcome)
    elif category == 'adult':
        total_adult.append(outcome)
    elif category == 'senior':
        total_senior.append(outcome)


for outcome in adoption_outcomes:
    category = categorize_outcomes(outcome)
    if category == 'babies':
        adopted_babies.append(outcome)
    elif category == 'young':
        adopted_young.append(outcome)
    elif category == 'adult':
        adopted_adult.append(outcome)
    elif category == 'senior':
        adopted_senior.append(outcome)


rates_baby = adoption_percentage(total_babies, adopted_babies)
rates_young = adoption_percentage(total_young, adopted_young)
rates_adult = adoption_percentage(total_adult, adopted_adult)
rates_senior = adoption_percentage(total_senior, adopted_senior)

print(f"The adoption rates are: \nBaby:{rates_baby}\nYoung:{rates_young}\nAdult:{rates_adult}\nSenior:{rates_senior}")

The adoption rates are: 
Baby:68.27538395904436
Young:41.83722576079264
Adult:33.67319121094815
Senior:21.234454168585906
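
The list-based bookkeeping above could also be expressed with a mapped column and a `groupby`. The sketch below is one possible variant, not the method used above: it assumes ages look like "3 weeks" or "2 years", treats a month as 30 days, and counts ages given in days or weeks as babies, so its numbers would differ from the output above. `age_in_months`, `age_group` and the toy rows are hypothetical.

```python
import pandas as pd

# Approximate number of months in each unit (assumption: 30-day months).
MONTHS_PER_UNIT = {"day": 1 / 30, "week": 7 / 30, "month": 1, "year": 12}


def age_in_months(age_text):
    """Convert strings such as '3 weeks' or '2 years' to months, or None."""
    parts = str(age_text).split()
    if len(parts) != 2 or not parts[0].isdigit():
        return None
    unit = parts[1].rstrip("s")  # 'years' -> 'year'
    if unit not in MONTHS_PER_UNIT:
        return None
    return int(parts[0]) * MONTHS_PER_UNIT[unit]


def age_group(age_text):
    """Map an age string to baby / young / adult / senior, or None."""
    months = age_in_months(age_text)
    if months is None:
        return None
    if months <= 4:
        return "baby"
    if months < 36:   # up to and including '2 years'
        return "young"
    if months < 132:  # '3 years' through '10 years'
        return "adult"
    return "senior"


# Hypothetical toy rows standing in for `outcomes`.
toy = pd.DataFrame(
    {
        "age_upon_outcome": ["3 weeks", "5 months", "2 years", "4 years", "12 years", None],
        "outcome_type": ["Adoption", "Adoption", "Transfer", "Adoption", "Euthanasia", "Adoption"],
    }
)
toy["age_group"] = toy["age_upon_outcome"].map(age_group)
toy["adopted"] = toy["outcome_type"] == "Adoption"

# The mean of a boolean column is the share of True values.
rates = toy.groupby("age_group")["adopted"].mean() * 100
print(rates)
```

Rows whose age cannot be parsed get no group and are dropped by `groupby`, which mirrors how the original function returns `None` for them.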


## 3. If spay/neuter for a dog costs \$100 and a spay/neuter for a cat costs \$50, how much did the shelter spend in 2015 on these procedures?

Strategy
- Convert intakes_outcomes['outcome_datetime'] to datetime
- Remove any NaN rows from ['sex_upon_intake', 'sex_upon_outcome', 'outcome_datetime', 'animal_type']
- Remove any Unknown rows
- Filter intakes_outcomes['outcome_datetime'] for year 2015
- Filter for animals that were intact upon intake
- Filter for the animals that were then neutered or spayed upon outcome
- Split the spayed and neutered animals by type (dogs vs cats)
- Print the final answer

In [225]:
# Datetime
intakes_outcomes['outcome_datetime'] = pd.to_datetime(intakes_outcomes['outcome_datetime'])

# Filter
intakes_outcomes = intakes_outcomes[['sex_upon_intake', 'sex_upon_outcome', 'outcome_datetime', 'animal_type']].dropna()

intakes_outcomes_filter = intakes_outcomes[
    (intakes_outcomes['sex_upon_intake'] != 'Unknown') & 
    (intakes_outcomes['sex_upon_outcome'] != 'Unknown')
]

intakes_outcomes_2015 = intakes_outcomes_filter[intakes_outcomes_filter['outcome_datetime'].dt.year == 2015]

intact_animals_2015 = intakes_outcomes_2015[intakes_outcomes_2015['sex_upon_intake'].str.contains('Intact')]

neutered_spayed_animals = intact_animals_2015[intact_animals_2015['sex_upon_outcome'].str.contains('Neutered') | 
                                        intact_animals_2015['sex_upon_outcome'].str.contains('Spayed')]

# Sort
spayed_females_dogs_2015 = neutered_spayed_animals[
    (neutered_spayed_animals['sex_upon_outcome'].str.contains('Spayed')) & 
    (neutered_spayed_animals['animal_type'] == 'Dog')
]

neutered_males_dogs_2015 = neutered_spayed_animals[
    (neutered_spayed_animals['sex_upon_outcome'].str.contains('Neutered')) & 
    (neutered_spayed_animals['animal_type'] == 'Dog')
]

spayed_females_cats_2015 = neutered_spayed_animals[
    (neutered_spayed_animals['sex_upon_outcome'].str.contains('Spayed')) & 
    (neutered_spayed_animals['animal_type'] == 'Cat')
]

neutered_males_cats_2015 = neutered_spayed_animals[
    (neutered_spayed_animals['sex_upon_outcome'].str.contains('Neutered')) & 
    (neutered_spayed_animals['animal_type'] == 'Cat')
]

# Answer
print("2015 Totals")
print(f"Spayed Female Dogs: {len(spayed_females_dogs_2015)} at $100 each totals ${len(spayed_females_dogs_2015) * 100:,.2f}")
print(f"Neutered Male Dogs: {len(neutered_males_dogs_2015)} at $100 each totals ${len(neutered_males_dogs_2015) * 100:,.2f}")
print(f"Spayed Female Cats: {len(spayed_females_cats_2015)} at $50 each totals ${len(spayed_females_cats_2015) * 50:,.2f}")
print(f"Neutered Male Cats: {len(neutered_males_cats_2015)} at $50 each totals ${len(neutered_males_cats_2015) * 50:,.2f}")

2015 Totals
Spayed Female Dogs: 1992 at $100 each totals $199,200.00
Neutered Male Dogs: 2296 at $100 each totals $229,600.00
Spayed Female Cats: 1308 at $50 each totals $65,400.00
Neutered Male Cats: 1323 at $50 each totals $66,150.00
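
The question asks for a single amount, so the four subtotals still need to be added together. The sketch below re-enters the counts from the "2015 Totals" output by hand; `procedures` and `cost_per_procedure` are names introduced here for illustration.

```python
# Counts copied from the "2015 Totals" output above.
procedures = {
    ("Dog", "Spayed Female"): 1992,
    ("Dog", "Neutered Male"): 2296,
    ("Cat", "Spayed Female"): 1308,
    ("Cat", "Neutered Male"): 1323,
}

# Prices given in the question.
cost_per_procedure = {"Dog": 100, "Cat": 50}

# Multiply each count by the price for its animal type and add them up.
total_cost = sum(
    count * cost_per_procedure[animal] for (animal, _), count in procedures.items()
)

print(f"Total 2015 spay/neuter cost: ${total_cost:,.2f}")
# prints: Total 2015 spay/neuter cost: $560,350.00
```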
