# COGS 108 - EDA Checkpoint

# Names

- Yu Zhang
- Zhiying Guan
- Kaiwen Che
- Zhiwei Wang
- Ariane Yu

<a id='research_question'></a>
# Research Question

How did social distancing during the COVID-19 outbreak affect pet adoption preferences, specifically with respect to pets' type, age, and sex, as measured by the change in adoption trends between 2019 and 2020?

# Setup

In [1]:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

In [2]:
a_sonoma = pd.read_csv('datasets/Sonoma_Adoption.csv')
a_longbeach = pd.read_csv('datasets/Longbeach_Adoption.csv')
a_austin = pd.read_csv('datasets/Austin_Adoption.csv')
a_bloom = pd.read_csv('datasets/Bloomington_Adoption.csv')
a_dallas = pd.read_csv('datasets/Dallas_Adoption')
a_norfolk = pd.read_csv('datasets/Norfolk_Adoption.csv')
a_sacramento = pd.read_csv('datasets/Sacramento_Adoption.csv')
 
ROUND_DECIMALS = 2

def sonoma_standardize_age(timedelta_in):
    output = float(timedelta_in.days)
    output = round(output / 365, ROUND_DECIMALS)
    return output

def standardize_age(agetime_in):
    try:
        str_in = str(agetime_in)
        str_in = str_in.lower()
        str_in = str_in.strip()
        
        if 'day' in str_in or 'days' in str_in:
            str_in = str_in.replace('days', '')
            str_in = str_in.replace('day', '')
            output = float(str_in)
            output = round(output / 365, ROUND_DECIMALS)
        elif 'week' in str_in or 'weeks' in str_in:
            str_in = str_in.replace('weeks', '')
            str_in = str_in.replace('week', '')
            output = float(str_in)
            output = round(output / 52, ROUND_DECIMALS)
        elif 'month' in str_in or 'months' in str_in:
            str_in = str_in.replace('months', '')
            str_in = str_in.replace('month', '')
            output = float(str_in)
            output = round(output / 12, ROUND_DECIMALS)
        elif 'year' in str_in or 'years' in str_in:
            str_in = str_in.replace('years', '')
            str_in = str_in.replace('year', '')
            output = round(float(str_in), ROUND_DECIMALS)
        else:
            output = np.nan
    except ValueError:  # float() could not parse what was left of the string
        output = np.nan
        
    return output

# Overall change of adoption number:
def overall_adoption_change(df_in):
    df_2019 = df_in.loc[df_in['year']==2019]
    df_2020 = df_in.loc[df_in['year']==2020]
    animal_types = df_in['Type'].unique()
    
    overall_change = round(((df_2020.shape[0] - df_2019.shape[0]) / df_2019.shape[0]) * 100, ROUND_DECIMALS)
    print("Overall Change of Total Adoption Number From 2019 to 2020:", overall_change, "%")
    
    print("Change of Adoption Number by Type:")
    for animal in animal_types:
        if animal in df_2019['Type'].value_counts():
            animal_count_2019 = df_2019['Type'].value_counts()[animal]
        else:
            animal_count_2019 = 0
            
        if animal in df_2020['Type'].value_counts():
            animal_count_2020 = df_2020['Type'].value_counts()[animal]
        else:
            animal_count_2020 = 0 
            
        if animal_count_2019 == 0 and animal_count_2020 == 0:
            continue
        elif animal_count_2019 == 0:
            print("Adoption number of", animal.upper(), "from 2019 to 2020 increased by a number of", animal_count_2020)
        else:    
            count_change = round(((animal_count_2020 - animal_count_2019) / animal_count_2019) * 100, ROUND_DECIMALS)
            print("Adoption number of", animal.upper(), "from 2019 to 2020 changed by", count_change, "%")
    #print('=======================================================================================')
    

# Data Cleaning

To compare adoption trends before and after the start of COVID-19, we used Sonoma_Adoption, Longbeach_Adoption, and Austin_Adoption, which were read from CSV files we found online. From Sonoma_Adoption, we extracted each animal's Outcome Date, Type, Breed, Sex, Color, and Age in Sonoma County. From Longbeach_Adoption, we extracted each animal's Outcome Date, Animal Type, Sex, Primary Color, and Age in Long Beach. From Austin_Adoption, we extracted each animal's DateTime, Animal Type, Breed, Sex upon Outcome, Color, and Age upon Outcome in Austin. By googling specific animal breeds, we generalized adopted animals to species-level categories (e.g. rabbit, cat, dog, rodent). 
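
The breed-to-species generalization described above can be sketched roughly as follows. This is a minimal illustration on made-up rows, not the project data: `breed_to_species` is a hypothetical stand-in for a small slice of the full lookup, and `generalize_type` is a hypothetical helper name. It also lists breeds that are missing from the lookup, so they can be checked by hand instead of silently ending up as their own "species".

```python
import pandas as pd

# Hypothetical slice of a breed-to-species lookup (illustration only).
breed_to_species = {'Lionhead': 'Rabbit', 'Hamster': 'Rodent', 'Cockatiel': 'Bird'}

def generalize_type(df, mapping):
    """Replace 'OTHER' types with a species looked up from the breed.

    Returns the updated frame and the sorted list of breeds that were
    not found in the mapping.
    """
    out = df.copy()
    is_other = out['Type'].str.upper() == 'OTHER'
    looked_up = out.loc[is_other, 'Breed'].map(mapping)
    unmapped = sorted(out.loc[is_other, 'Breed'][looked_up.isna()].unique())
    # Keep the raw breed where no species is known, so no row is lost.
    out.loc[is_other, 'Type'] = looked_up.fillna(out.loc[is_other, 'Breed'])
    return out, unmapped

toy = pd.DataFrame({
    'Type': ['DOG', 'OTHER', 'OTHER', 'OTHER'],
    'Breed': ['Labrador', 'Lionhead', 'Hamster', 'Axolotl'],
})
toy_clean, unmapped_breeds = generalize_type(toy, breed_to_species)
print(toy_clean['Type'].tolist())
print(unmapped_breeds)
```

Printing the unmapped breeds after each cleaning step is one way to notice lookup gaps before they show up as unexpected categories in the plots.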

Outcome type: adoption <br> 
Outcome date: 01/01/2019 through 12/31/2020 <br> 
Columns kept: Date, Type, Sex, Age <br> 
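
As a rough sketch of how these filters could be applied in one place, the hypothetical helper below (`filter_adoptions` is not part of the notebook) keeps adoption rows inside a half-open date window. It assumes the window covers the full calendar years 2019 and 2020 and that the outcome column is named `Outcome Type`; the rows are made up for illustration.

```python
import pandas as pd

def filter_adoptions(df, date_col, start='2019-01-01', end='2021-01-01'):
    """Keep adoption rows whose date falls in [start, end)."""
    out = df.copy()
    out[date_col] = pd.to_datetime(out[date_col])
    # Compare case-insensitively, since the datasets spell the label differently.
    is_adoption = out['Outcome Type'].str.upper() == 'ADOPTION'
    in_window = (out[date_col] >= pd.Timestamp(start)) & (out[date_col] < pd.Timestamp(end))
    return out.loc[is_adoption & in_window].reset_index(drop=True)

toy = pd.DataFrame({
    'Outcome Date': ['2018-12-31', '2019-01-01', '2020-12-31', '2020-06-15', '2021-01-01'],
    'Outcome Type': ['ADOPTION', 'Adoption', 'ADOPTION', 'TRANSFER', 'ADOPTION'],
    'Type': ['DOG', 'CAT', 'DOG', 'CAT', 'DOG'],
})
adopted = filter_adoptions(toy, 'Outcome Date')
print(adopted)
```

Using a half-open window (`>= start`, `< end`) keeps January 1 and December 31 in the sample even when the date column also carries a time of day.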

In [3]:
# Lookup from breed / raw label to a species-level category, built by looking up each breed by hand.
animals = {'Rabbit Sh': 'Rabbit', 'Guinea Pig': 'Rodent', 'Californian':'Rabbit', 
           'English Spot Mix': 'Rabbit', 'Hamster': 'Rodent', 'Lionhead':'Rabbit', 'Hotot':'Rabbit', 'Mouse': 'Rodent',
          'Lop-Holland': 'Rabbit', 'Chinchilla-Stnd': 'Rodent', 'Rex':'Rabbit', 'Angora-French Mix':'Rabbit',
          'Rabbit Lh': 'Rabbit', 'Lionhead Mix': 'Rabbit', 'Rabbit Sh Mix': 'Rabbit', 'Rex Mix': 'Rabbit', 
           'Havana':'Rabbit', 'Lop-Mini/Hotot':'Rabbit', 'New Zealand Wht': 'Rabbit', 'Jersey Wooly':'Rabbit',
          'Dwarf Hotot':'Rabbit', 'Netherlnd Dwarf':'Rabbit', 'Rex-Mini': 'Rabbit', 'English Spot':'Rabbit',
          'Lop-Holland Mix':'Rabbit', 'Potbelly Pig': 'Livestock', 'Lop-Mini': 'Rabbit', 'Snake':'Reptile', 'Lizard':'Reptile',
          'Turtle': 'Reptile', 'HAVANA/MIX': 'Rabbit', 'RAT':'Rodent', 'Lop-Mini Mix': 'Rabbit', 'GUINEA PIG' : 'Rodent', 
          'CALIFORNIAN/MIX':'Rabbit', 'HOTOT':'Rabbit', 'PIG':'Livestock', 'RABBIT SH':'Rabbit', 
           'HAVANA':'Rabbit', 'HOTOT/MIX':'Rabbit','ANGORA-SATIN':'Rabbit', 'REX':'Rabbit', 'AMERICAN/MIX':'Rabbit', 
          'LOP-AMER FUZZY': 'Rabbit', 'LOP-MINI' : 'Rabbit', 'SILVER/MIX' : 'Rabbit', 'DUTCH/MIX' : 'Rabbit', 
           'ENGLISH SPOT/MIX':'Rabbit', 'COCKATIEL':'Bird', 'RABBIT LH':'Rabbit', 'CALIFORNIAN':'Rabbit', 'DONKEY': 'Livestock',
          'SHEEP':'Livestock', 'Mouse Mix':'Rodent', 'Ferret':'Rodent','New Zealand Wht Mix':'Rabbit',
          'Rat':'Rodent', 'Havana Mix':'Rabbit','Ferret Mix': 'Rodent', 'Californian Mix':'Rabbit', 'Netherlnd Dwarf Mix':'Rabbit',
          'American':'Rabbit', 'American Mix':'Rabbit','Lop-Amer Fuzzy':'Rabbit', 'Flemish Giant Mix':'Rabbit', 
           'Chinchilla-Amer':'Rabbit', 'Pig Mix':'Livestock', 'Dutch':'Rabbit','Goat':'Livestock', 'Sugar Glider':'Rabbit',
          'Rex/Lop-English': 'Rabbit', 'Rex-Mini Mix': 'Rabbit', 'American Sable': 'Rabbit', 'Rabbit Lh Mix': 'Rabbit',
          'Sheep Mix': 'Livestock', 'Pig':'Livestock', 'Beveren Mix':'Rabbit','Polish':'Rabbit'}


In [4]:
#Sonoma County_adoption
a_sonoma['Outcome Date'] = pd.to_datetime(a_sonoma['Outcome Date'])
a_sonoma['Date Of Birth'] = pd.to_datetime(a_sonoma['Date Of Birth'])

a_sonoma_clean = a_sonoma[a_sonoma['Outcome Type'] == 'ADOPTION']
# Keep outcomes dated within calendar years 2019 and 2020 (first and last day included).
a_sonoma_clean = a_sonoma_clean.loc[((
    a_sonoma_clean['Outcome Date'] >= pd.to_datetime("2019/01/01"))&(
    a_sonoma_clean['Outcome Date'] < pd.to_datetime("2021/01/01")))]

a_sonoma_clean['Age'] = a_sonoma_clean['Outcome Date']-a_sonoma_clean['Date Of Birth']

a_sonoma_clean = a_sonoma_clean[["Outcome Date","Type","Breed", "Sex" ,"Age"]]
a_sonoma_clean = a_sonoma_clean.dropna()

a_sonoma_clean['Age'] = a_sonoma_clean['Age'].apply(sonoma_standardize_age)

a_sonoma_clean['Type'] = np.where(a_sonoma_clean['Type']== 'OTHER', a_sonoma_clean['Breed'], a_sonoma_clean['Type'])
a_sonoma_clean['Type'] = a_sonoma_clean['Type'].replace(animals)

a_sonoma_clean = a_sonoma_clean[["Outcome Date","Type", "Sex","Age"]]
a_sonoma_clean = a_sonoma_clean.rename(columns = {'Outcome Date':'Date'})

a_sonoma_clean = a_sonoma_clean.reset_index(drop=True)
a_sonoma_clean.head()
a_sonoma_clean['Type'].unique()

array(['DOG', 'CAT', 'Livestock', 'Rabbit', 'Rodent', 'Bird'],
      dtype=object)

In [5]:
#Long Beach_adoption
a_longbeach['Outcome Date'] = pd.to_datetime(a_longbeach['Outcome Date'])

a_longbeach_clean = a_longbeach[a_longbeach['Outcome Type'] == 'ADOPTION']
# Keep outcomes dated within calendar years 2019 and 2020 (first and last day included).
a_longbeach_clean = a_longbeach_clean.loc[((
    a_longbeach_clean['Outcome Date'] >= pd.to_datetime("2019/01/01"))&(
    a_longbeach_clean['Outcome Date'] < pd.to_datetime("2021/01/01")))]

a_longbeach_clean = a_longbeach_clean[["Outcome Date","Animal Type", "Sex", "Age"]]
a_longbeach_clean = a_longbeach_clean.dropna()
a_longbeach_clean = a_longbeach_clean[a_longbeach_clean['Animal Type'] != 'OTHER']

a_longbeach_clean = a_longbeach_clean.rename(columns = {'Outcome Date' :'Date' , 
                                                  'Animal Type' : 'Type'})

a_longbeach_clean = a_longbeach_clean.reset_index(drop=True)
a_longbeach_clean.head()
a_longbeach_clean['Type'].unique()

array(['CAT', 'DOG', 'RABBIT', 'REPTILE', 'GUINEA PIG', 'LIVESTOCK',
       'BIRD'], dtype=object)

In [6]:
#Austin_adoption
a_austin['DateTime'] = pd.to_datetime(a_austin['DateTime'])

a_austin_clean = a_austin[a_austin['Outcome Type'] == 'Adoption']
# Keep outcomes dated within calendar years 2019 and 2020 (first and last day included).
a_austin_clean = a_austin_clean.loc[((
    a_austin_clean['DateTime'] >= pd.to_datetime("2019/01/01"))&(
    a_austin_clean['DateTime'] < pd.to_datetime("2021/01/01")))]
a_austin_clean = a_austin_clean[["DateTime","Animal Type","Breed", "Sex upon Outcome","Age upon Outcome"]]
a_austin_clean = a_austin_clean.dropna()
a_austin_clean['Age upon Outcome'] = a_austin_clean['Age upon Outcome'].apply(standardize_age)

a_austin_clean['Animal Type'] = np.where(a_austin_clean['Animal Type'] == 'Other', a_austin_clean['Breed'], a_austin_clean['Animal Type'])
a_austin_clean['Animal Type'] = np.where(a_austin_clean['Animal Type'] == 'Livestock', a_austin_clean['Breed'], a_austin_clean['Animal Type'])
a_austin_clean['Animal Type'] = a_austin_clean['Animal Type'].replace(animals)

# Map Austin's sex labels to shorter ones. Any value outside this mapping becomes NaN
# and is dropped, instead of being counted as 'Male' by a catch-all else branch.
sex_map = {'Spayed Female': 'Spayed', 'Neutered Male': 'Neutered',
           'Intact Female': 'Female', 'Intact Male': 'Male'}
a_austin_clean['Sex upon Outcome'] = a_austin_clean['Sex upon Outcome'].map(sex_map)
a_austin_clean = a_austin_clean.dropna(subset=['Sex upon Outcome'])
a_austin_clean = a_austin_clean[["DateTime","Animal Type","Sex upon Outcome","Age upon Outcome"]]
a_austin_clean = a_austin_clean.rename(columns = {'Sex upon Outcome':'Sex', 
                                                  'Age upon Outcome':'Age' , 
                                                  'Animal Type':'Type',
                                                 'DateTime' : 'Date'})

# Drop the time of day, keeping only the calendar date.
a_austin_clean['Date'] = a_austin_clean['Date'].dt.date


a_austin_clean = a_austin_clean.reset_index(drop=True)
a_austin_clean.head()

a_austin_clean['Type'].unique()

array(['Dog', 'Cat', 'Rabbit', 'Bird', 'Rodent', 'Livestock', 'Hedgehog'],
      dtype=object)

In [7]:
#Bloomington

a_bloom['movementdate'] = pd.to_datetime(a_bloom['movementdate'])
a_bloom_clean = a_bloom[a_bloom['movementtype'] == 'Adoption']
# Keep adoptions dated within calendar years 2019 and 2020 (first and last day included).
a_bloom_clean = a_bloom_clean.loc[((
    a_bloom_clean['movementdate'] >= pd.to_datetime("2019/01/01"))&(
    a_bloom_clean['movementdate'] < pd.to_datetime("2021/01/01")))]

a_bloom_clean = a_bloom_clean[["movementdate","speciesname",'sexname','animalage']]



a_bloom_clean['speciesname'] = a_bloom_clean['speciesname'].replace(animals)
a_bloom_clean = a_bloom_clean.drop(a_bloom_clean[a_bloom_clean['sexname']=='Unknown'].index)



a_bloom_clean = a_bloom_clean.rename(columns = {'movementdate':'Date', 
                                                'speciesname':'Type',
                                                'sexname':'Sex' })


def standardize_age2(agetime_in):
    # Convert Bloomington age strings such as '12 years 6 months.' to age in years.
    try:
        str_in = str(agetime_in).lower().strip().replace('.', '')
        parts = str_in.split()
        # Expect (number, unit) pairs, e.g. ['12', 'years', '6', 'months'].
        if len(parts) < 2 or len(parts) % 2 != 0:
            return np.nan
        per_year = {'year': 1, 'month': 12, 'week': 52, 'day': 365}
        total = 0.0
        for value, unit in zip(parts[0::2], parts[1::2]):
            total += float(value) / per_year[unit.rstrip('s')]
        output = round(total, ROUND_DECIMALS)
    except (ValueError, KeyError):
        # Non-numeric value or unrecognized unit
        output = np.nan

    return output

a_bloom_clean['animalage'] = a_bloom_clean['animalage'].apply(standardize_age2)
a_bloom_clean = a_bloom_clean.rename(columns = {'animalage':'Age'})

a_bloom_clean.head()
a_bloom['animalage'].unique()

array(['12 years 6 months.', '12 years 5 months.', '9 years 10 months.',
       '12 years 3 months.', '9 years 3 months.', '7 years 10 months.',
       '6 years 6 months.', '7 years 5 months.', '8 years 2 months.',
       '7 years 2 months.', '6 years 7 months.', '5 years 2 months.',
       '13 years 3 months.', '6 years 8 months.', '6 years 1 month.',
       '16 years 3 months.', '17 years 1 month.', '8 years 1 month.',
       '10 years 2 months.', '15 years 1 month.', '7 years 1 month.',
       '6 years 0 months.', '5 years 11 months.', '7 years 6 months.',
       '5 years 3 months.', '8 years 11 months.', '10 years 0 months.',
       '6 years 2 months.', '10 years 1 month.', '12 years 0 months.',
       '7 years 3 months.', '9 years 0 months.', '5 years 1 month.',
       '5 years 0 months.', '8 years 0 months.', '5 years 9 months.',
       '5 years 8 months.', '7 years 0 months.', '10 years 3 months.',
       '8 years 8 months.', '6 years 5 months.', '20 years 0 months.',
       '21

In [11]:
#Norfolk County_adoption
a_norfolk['Outcome Date'] = pd.to_datetime(a_norfolk['Outcome Date'])
a_norfolk_clean = a_norfolk[a_norfolk['Outcome Type'] == 'Adoption']
# Filter on the outcome (adoption) date, as for the other datasets, rather than the intake date.
a_norfolk_clean = a_norfolk_clean.loc[((
    a_norfolk_clean['Outcome Date'] >= pd.to_datetime("2019/01/01"))&(
    a_norfolk_clean['Outcome Date'] < pd.to_datetime("2021/01/01")))]

a_norfolk_clean['Age'] = a_norfolk_clean['Years Old']

a_norfolk_clean = a_norfolk_clean[["Outcome Date","Outcome Type","Primary Breed", "Sex" ,"Age"]]
a_norfolk_clean = a_norfolk_clean.dropna()

a_norfolk_clean['Age'] = a_norfolk_clean['Age'].apply(standardize_age)

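# FIXME: 'Outcome Type' records the outcome, not the species, so 'Type' ends up as
# 'Adoption' for every row (see the output below). The np.where result is also
# overwritten by the line after it. The dataset's species column should be used here instead.
a_norfolk_clean['Type'] = np.where(a_norfolk_clean['Outcome Type']== 'OTHER', a_norfolk_clean['Primary Breed'], a_norfolk_clean['Outcome Type'])
a_norfolk_clean['Type'] = a_norfolk_clean['Outcome Type'].replace(animals)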

a_norfolk_clean = a_norfolk_clean[["Outcome Date","Type", "Sex","Age"]]
a_norfolk_clean = a_norfolk_clean.rename(columns = {'Outcome Date':'Date'})

a_norfolk_clean = a_norfolk_clean.reset_index(drop=True)
a_norfolk_clean.head()
a_norfolk_clean['Type'].unique()

array(['Adoption'], dtype=object)

In [None]:
a_sacramento

# Data Analysis & Results (EDA)


For each of the three regional pet adoption datasets, we split the data into 2019 and 2020 subsets. Then, we took a closer look at the pet adoption trends in 2019 and 2020 side by side. 
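
A minimal sketch of the year split and the overall percent change, using made-up rows rather than the project data. The column names `Date`, `Type`, and `year` follow the cleaned frames; the numbers are purely illustrative.

```python
import pandas as pd

toy = pd.DataFrame({
    'Date': pd.to_datetime(['2019-02-01', '2019-07-04', '2019-11-20', '2019-12-25',
                            '2020-03-15', '2020-08-09', '2020-10-31']),
    'Type': ['CAT', 'DOG', 'CAT', 'DOG', 'CAT', 'CAT', 'DOG'],
})
toy['year'] = toy['Date'].dt.year

# Number of adoptions per year, then the relative change from 2019 to 2020.
counts = toy.groupby('year').size()
pct_change = round((counts[2020] - counts[2019]) / counts[2019] * 100, 2)
print(counts.to_dict())
print(pct_change)
```

Counting with `groupby('year').size()` avoids keeping separate per-year frames when only the totals are needed.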

In [None]:
# Long beach 
a_longbeach_clean = a_longbeach_clean.sort_values(by='Date')
a_longbeach_clean['year'] = pd.DatetimeIndex(a_longbeach_clean['Date']).year
a_longbeach_clean = a_longbeach_clean.drop(a_longbeach_clean[a_longbeach_clean['Sex']=='Unknown'].index)
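# Long Beach lists guinea pigs as their own type; fold them into 'RODENT' so they match
# the categories used for the other regions and are not left out by `order` below.
a_longbeach_clean['Type'] = a_longbeach_clean['Type'].replace({'GUINEA PIG': 'RODENT'})
order=['DOG', 'CAT','RABBIT','REPTILE', 'BIRD', 'RODENT','LIVESTOCK']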
a_longbeach_clean

In [None]:
longbeach_2019 = a_longbeach_clean.loc[a_longbeach_clean['year']==2019]
longbeach_2020 = a_longbeach_clean.loc[a_longbeach_clean['year']==2020]

Shown below is a bar graph comparing total adoptions of each pet type in Long Beach in 2019 and 2020. Except for birds, adoptions of every pet type decreased observably after COVID-19 began in 2020. In both years, more cats were adopted than any other pet type, and dogs were the second most preferred. Adoptions of all other types were far fewer. 

In [None]:
sns.countplot(data = a_longbeach_clean, x = 'Type', hue = 'year',order=order)

The graphs below show the sex of adopted pets, grouped by pet type. In general, fewer intact cats and dogs were adopted in Long Beach after the start of COVID-19, and all birds adopted in 2020 appeared to be intact. 
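
Because total adoptions dropped in 2020, raw counts of intact animals fall even if preferences did not change. One way to separate the two effects is to compare shares instead of counts. The sketch below uses made-up rows and assumes, as a labeling convention, that `Female` and `Male` denote intact animals while `Spayed` and `Neutered` denote altered ones.

```python
import pandas as pd

toy = pd.DataFrame({
    'year': [2019, 2019, 2019, 2019, 2020, 2020],
    'Type': ['CAT', 'CAT', 'CAT', 'CAT', 'CAT', 'CAT'],
    'Sex':  ['Spayed', 'Neutered', 'Female', 'Male', 'Spayed', 'Neutered'],
})

# Share of each sex label within every (year, type) group; each row sums to 1.
shares = pd.crosstab([toy['year'], toy['Type']], toy['Sex'], normalize='index')
# Assumed convention: 'Female' and 'Male' mark intact animals.
intact_share = shares[['Female', 'Male']].sum(axis=1)
print(intact_share)
```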

In [None]:
fig, axes = plt.subplots(1, 2, sharex=True, figsize=(20,10))
ax1 = sns.countplot(data = longbeach_2019, x = 'Type',hue = 'Sex', ax = axes[0], order=order)
ax2 = sns.countplot(data = longbeach_2020, x = 'Type',hue = 'Sex', ax = axes[1], order=order)
ax1.title.set_text('Sex related to Types in 2019 at Long Beach')
ax2.title.set_text('Sex related to Types in 2020 at Long Beach')

Ages of adopted pets in Long Beach were right-skewed in both 2019 and 2020. People seemed to consistently prefer young cats, while the preferred age for dogs seemed to be about 1-2 years older than for cats. 
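
The visual impression of right skew can be backed by a quick numeric check. The sketch below uses made-up ages, not the project data, and summarizes each year with the median, mean, and sample skewness. A positive skewness together with a mean above the median is consistent with a right-skewed distribution.

```python
import pandas as pd

toy = pd.DataFrame({
    'year': [2019] * 6 + [2020] * 6,
    'Age':  [0.2, 0.3, 0.5, 1.0, 2.0, 9.0,
             0.2, 0.2, 0.4, 1.0, 3.0, 10.0],
})

# One row per year with three summary statistics of age.
summary = toy.groupby('year')['Age'].agg(['median', 'mean', 'skew'])
print(summary)
```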

In [None]:
fig, axes = plt.subplots(2, 1, sharex=True, figsize=(20,10))
ax1 = sns.countplot(data = longbeach_2019, x = 'Age',hue = 'Type', ax = axes[0],hue_order=order)
ax2 = sns.countplot(data = longbeach_2020, x = 'Age',hue = 'Type', ax = axes[1],hue_order=order)
ax1.title.set_text('Age related to Types in 2019 at Long Beach')
ax2.title.set_text('Age related to Types in 2020 at Long Beach')

In [None]:
# Basic Statistical analysis
overall_adoption_change(a_longbeach_clean)

The same data processing steps were used to make 2019 and 2020 subsets of pet adoptions for Sonoma.

In [None]:
#Sonoma
a_sonoma_clean = a_sonoma_clean.sort_values(by='Date')
a_sonoma_clean['year'] = pd.DatetimeIndex(a_sonoma_clean['Date']).year
a_sonoma_clean

In [None]:
sonoma_2019 = a_sonoma_clean.loc[a_sonoma_clean['year']==2019]
sonoma_2020 = a_sonoma_clean.loc[a_sonoma_clean['year']==2020]

A dramatic decline in pet adoption was observed in Sonoma after COVID-19 began in 2020, but the preference for cats and dogs persisted. 

In [None]:
sns.countplot(data = a_sonoma_clean, x = 'Type', hue = 'year')

Most pets adopted in Sonoma seemed to be already neutered or spayed. There seemed to be a preference for spayed cats and neutered dogs in 2020, whereas people adopted more neutered cats and spayed dogs in 2019. 

In [None]:
fig, axes = plt.subplots(1, 2, sharex=True, figsize=(20,10))
ax1 = sns.countplot(data = sonoma_2019, x = 'Type',hue = 'Sex', ax = axes[0])
ax2 = sns.countplot(data = sonoma_2020, x = 'Type',hue = 'Sex', ax = axes[1])
ax1.title.set_text('Sex related to Types in 2019 at Sonoma')
ax2.title.set_text('Sex related to Types in 2020 at Sonoma')

The age preferences observed in Long Beach seemed to hold in Sonoma as well. 

In [None]:
fig, axes = plt.subplots(2, 1, sharex=True, figsize=(20,10))
ax1 = sns.countplot(data = sonoma_2019, x = 'Age',hue = 'Type', ax = axes[0])
ax2 = sns.countplot(data = sonoma_2020, x = 'Age',hue = 'Type', ax = axes[1])
ax1.title.set_text('Age related to Types in 2019 at Sonoma')
ax2.title.set_text('Age related to Types in 2020 at Sonoma')

In [None]:
# Basic Statistical analysis
overall_adoption_change(a_sonoma_clean)

Subsets of Austin were also generated by year. 

In [None]:
# Austin
a_austin_clean['year'] = pd.DatetimeIndex(a_austin_clean['Date']).year

In addition to the decrease in adoption counts, people seemed to prefer cats in 2019, while their preference shifted to dogs in 2020. 

In [None]:
# Type 2019 vs 2020
austin_2019 = a_austin_clean[a_austin_clean['year'] == 2019]
austin_2020 = a_austin_clean[a_austin_clean['year'] == 2020]

fig, axes = plt.subplots(1, 2, sharex=True, figsize=(20,10))
ax1 = sns.countplot(data = austin_2019, x = 'Type', hue = 'year', ax = axes[0])
ax2 = sns.countplot(data = austin_2020, x = 'Type', hue = 'year',ax = axes[1])
ax1.title.set_text('Type count in 2019 at Austin')
ax2.title.set_text('Type count in 2020 at Austin')

As in the other regions, people in Austin preferred very young cats, and the preferred age of adopted dogs seemed to be about 1 year older than that of cats. 

In [None]:
# Age by type in 2019 vs 2020
austin_two_year = a_austin_clean[(a_austin_clean['year'] == 2019) | (a_austin_clean['year'] == 2020)]

fig, axes = plt.subplots(figsize=(40,10))
Age_compare = sns.barplot(x="Type", y="Age", hue="year", data=austin_two_year)

There seemed to be an increased preference towards neutered dogs in 2020. In Austin, people seemed to consistently prefer neutered cats. 

In [None]:
# Sex by type in 2019 vs 2020
fig, axes = plt.subplots(1, 2, sharex=True, figsize=(20,10))
ax1 = sns.countplot(data = austin_2019, x = 'Type',hue = 'Sex', ax = axes[0])
ax2 = sns.countplot(data = austin_2020, x = 'Type',hue = 'Sex',ax = axes[1])
ax1.title.set_text('Sex by types in 2019 at Austin')
ax2.title.set_text('Sex by types in 2020 at Austin')

In [None]:
# Basic Statistical analysis
overall_adoption_change(a_austin_clean)

After looking at differences in pet adoption trends in different regions, we combined them into a larger dataset. 
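
When regional tables are stacked, category labels need to agree first, and it can help to remember where each row came from. The sketch below shows one possible approach on made-up frames; the `Region` column and the guinea pig relabeling are assumptions added for illustration, not steps confirmed by the notebook.

```python
import pandas as pd

longbeach = pd.DataFrame({'Type': ['CAT', 'GUINEA PIG'], 'year': [2019, 2020]})
austin = pd.DataFrame({'Type': ['Dog', 'Rodent'], 'year': [2019, 2020]})

# Tag each row with its (hypothetical) region label before stacking.
combined = pd.concat(
    [longbeach.assign(Region='Long Beach'), austin.assign(Region='Austin')],
    ignore_index=True,
)
# Harmonize labels: one letter case, and guinea pigs counted as rodents.
combined['Type'] = combined['Type'].str.upper().replace({'GUINEA PIG': 'RODENT'})
print(combined)
```

Keeping a region column makes it possible to check later whether a combined trend is driven by a single large dataset.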

In [None]:
# Combining dataset from three regions
combined_2019 = pd.concat([longbeach_2019, sonoma_2019, austin_2019], ignore_index=True)
combined_2020 = pd.concat([longbeach_2020, sonoma_2020, austin_2020], ignore_index=True)
combined_2019['Type'] = combined_2019['Type'].str.upper()
combined_2020['Type'] = combined_2020['Type'].str.upper()
combined_total = pd.concat([combined_2019, combined_2020], ignore_index=True)

The decrease in pet adoption is well illustrated by this combined bar graph. Dog and cat adoption counts each decreased by almost 50%.
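
A percent change per pet type can be read off a type-by-year table. The sketch below is a minimal illustration with made-up counts, chosen so that dogs and cats each drop by half; it is not the project's result. Types with no 2019 adoptions are reported as NaN rather than dividing by zero.

```python
import pandas as pd

toy = pd.DataFrame({
    'Type': ['DOG'] * 4 + ['CAT'] * 6 + ['DOG'] * 2 + ['CAT'] * 3 + ['BIRD'],
    'year': [2019] * 10 + [2020] * 6,
})

# Rows are pet types, columns are years, cells are adoption counts.
by_type = pd.crosstab(toy['Type'], toy['year'])
# Mask zero baselines so the division yields NaN instead of infinity.
base = by_type[2019].where(by_type[2019] > 0)
by_type['pct_change'] = ((by_type[2020] - by_type[2019]) / base * 100).round(2)
print(by_type)
```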

In [None]:
sns.countplot(data = combined_total, x = 'Type', hue = 'year')
plt.legend(loc = 'upper right')

People's preference towards sex of adoptable pets seemed to be consistent across pet types. 

In [None]:
# Sex by type in 2019 vs 2020
fig, axes = plt.subplots(1, 2, sharex=True, figsize=(20,10))
ax1 = sns.countplot(data = combined_2019, x = 'Type',hue = 'Sex', ax = axes[0], order=order)
ax2 = sns.countplot(data = combined_2020, x = 'Type',hue = 'Sex', ax = axes[1], order=order)
ax1.title.set_text('Sex related to Types in 2019 Combined')
ax2.title.set_text('Sex related to Types in 2020 Combined')


In [None]:
combined_total['Age'].max()

In [None]:
fig, axes = plt.subplots(2, 1, sharex=True, figsize=(20,10))
ax1 = sns.countplot(data = combined_2019, x = 'Age',hue = 'Type', ax = axes[0],hue_order=order)
ax2 = sns.countplot(data = combined_2020, x = 'Age',hue = 'Type', ax = axes[1],hue_order=order)
ax1.title.set_text('Age related to Types in 2019 Combined')
ax2.title.set_text('Age related to Types in 2020 Combined')
plt.xticks(range(6))

In [None]:
fig, axes = plt.subplots(figsize=(40,10))
Age_compare = sns.barplot(x="Type", y="Age", hue="year", data=combined_total)

In [None]:
# Basic Statistical analysis
overall_adoption_change(combined_total)

Based on the individual datasets from Long Beach, Sonoma, and Austin, as well as the combined dataset, there was a clear decrease in pet adoption from 2019 to 2020. Besides the overall decrease, we also observed that the ages of adopted pets were right-skewed. In general, most adopted pets were cats or dogs, and most of those were neutered or spayed. 