## EDA: Examining a Dataset Drill

For this drill, we will use a modified dataset from the Colorado animal shelter statistics collected under the Pet Animal Care Facilities Act (PACFA), which can be found here: https://ag.colorado.gov/ics/pet-animal-care-facilities-act-pacfa/animal-shelter-and-rescue-individual-statistics

The data consists of two separate files - one for the 2019 statistics and one for the 2020 statistics.  

Your boss wants you to answer a few questions from the dataset:
- Overall numbers for 2019 and 2020 in these categories:
  - Intake for cats & dogs
  - Outcomes for cats & dogs
  - Comparison of 2020 vs 2019 both in numbers and a percent
  - Make this a table so you can present it to your boss & other stakeholders
- Did data quality improve from 2019 to 2020?
- Anecdotally, during 2020 it seemed like everyone was adopting new pets - does this show in the data?
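
As a rough sketch of what the requested comparison table could look like, the snippet below computes the year-over-year change as both a count and a percent. The numbers are placeholders chosen for illustration only; they are not taken from the PACFA files, and the row and column labels are assumptions.

```python
import pandas as pd

# Placeholder totals for illustration only, NOT taken from the PACFA files.
totals = pd.DataFrame(
    {"2019": [100, 80], "2020": [90, 100]},
    index=["Intake", "Outcome"],
)

# Absolute difference between the two years.
totals["Change"] = totals["2020"] - totals["2019"]

# Percent change relative to the 2019 baseline.
totals["Percent change"] = (totals["Change"] / totals["2019"] * 100).round(1)

print(totals)
```

Using 2019 as the denominator keeps the percent readable as "change since last year", which is usually what stakeholders expect.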


Some considerations about the dataset:
- In 2020, the collection requirements and process changed in an attempt to improve data quality. This also caused the column names to be slightly different between years. 
- You can combine the files into one but this isn't necessary; both approaches have different problems to solve!

### Examine the dataset

Use as many cells below as you'd like to examine the two files:

- 2019 Individual Shelter and Rescue Report.csv
- 2020 Animal Shelter and Rescue Individual Report.csv

Some areas to check (feel free to check more items also!):

- Columns and column names 
- Datatypes of each of the columns/variables
- Min/Max/Mean statistics of each of those columns/variables
- Any NA values?

In [1]:
import pandas as pd
import numpy as np
pd.set_option('display.max_columns', None)

In [2]:
# First, let's go through the 2019 data
as19 = pd.read_csv('2019 Individual Shelter and Rescue Report.csv')

In [3]:
print(len(as19))
as19.head()

355


Unnamed: 0,Facility Name,Starting Animal Statistics - In Shelter Count as of 1/1/2019 - Dogs-Adult,Starting Animal Statistics - In Shelter Count as of 1/1/2019 - Dogs-Juvenile,Starting Animal Statistics - In Shelter Count as of 1/1/2019 - Cats-Adult,Starting Animal Statistics - In Shelter Count as of 1/1/2019 - Cats-Juvenile,Starting Animal Statistics - In Shelter Count as of 1/1/2019 - Birds,Starting Animal Statistics - In Foster Care Count as 1/1/2019 - Dogs-Adult,Starting Animal Statistics - In Foster Care Count as 1/1/2019 - Dogs-Juvenile,Starting Animal Statistics - In Foster Care Count as 1/1/2019 - Cats-Adult,Starting Animal Statistics - In Foster Care Count as 1/1/2019 - Cats-Juvenile,Animal Intake Statistics - Stray - Dogs-Adult,Animal Intake Statistics - Stray - Dogs-Juvenile,Animal Intake Statistics - Stray - Cats-Adult,Animal Intake Statistics - Stray - Cats-Juvenile,Animal Intake Statistics - Owner Relinquished - Dogs-Adult,Animal Intake Statistics - Owner Relinquished - Dogs-Juvenile,Animal Intake Statistics - Owner Relinquished - Cats-Adult,Animal Intake Statistics - Owner Relinquished - Cats-Juvenile,Animal Intake Statistics - Transfer In from a Colorado Organization - Dogs-Adult,Animal Intake Statistics - Transfer In from a Colorado Organization - Dogs-Juvenile,Animal Intake Statistics - Transfer In from a Colorado Organization - Cats-Adult,Animal Intake Statistics - Transfer In from a Colorado Organization - Cats-Juvenile,Animal Intake Statistics - Transfer In from an Out of State Organization - Dogs-Adult,Animal Intake Statistics - Transfer In from an Out of State Organization - Dogs-Juvenile,Animal Intake Statistics - Transfer In from an Out of State Organization - Cats-Adult,Animal Intake Statistics - Transfer In from an Out of State Organization - Cats-Juvenile,"Animal Intake Statistics - Other; TNR/Protective Custody/Returns/Disaster Relief, etc. 
- Dogs-Adult","Animal Intake Statistics - Other; TNR/Protective Custody/Returns/Disaster Relief, etc. - Dogs-Juvenile","Animal Intake Statistics - Other; TNR/Protective Custody/Returns/Disaster Relief, etc. - Cats-Adult","Animal Intake Statistics - Other; TNR/Protective Custody/Returns/Disaster Relief, etc. - Cats-Juvenile",Animal Outcome Statistics - Adoption - Dogs-Adult,Animal Outcome Statistics - Adoption - Dogs-Juvenile,Animal Outcome Statistics - Adoption - Cats-Adult,Animal Outcome Statistics - Adoption - Cats-Juvenile,Animal Outcome Statistics - Return to Owner - Dogs-Adult,Animal Outcome Statistics - Return to Owner - Dogs-Juvenile,Animal Outcome Statistics - Return to Owner - Cats-Adult,Animal Outcome Statistics - Return to Owner - Cats-Juvenile,Animal Outcome Statistics - Transfer Out to a Colorado Organization - Dogs-Adult,Animal Outcome Statistics - Transfer Out to a Colorado Organization - Dogs-Juvenile,Animal Outcome Statistics - Transfer Out to a Colorado Organization - Cats-Adult,Animal Outcome Statistics - Transfer Out to a Colorado Organization - Cats-Juvenile,Animal Outcome Statistics - Transfer Out to an Out of State Organization - Dogs-Adult,Animal Outcome Statistics - Transfer Out to an Out of State Organization - Dogs-Juvenile,Animal Outcome Statistics - Transfer Out to an Out of State Organization - Cats-Adult,Animal Outcome Statistics - Transfer Out to an Out of State Organization - Cats-Juvenile,Animal Outcome Statistics - Other Live Outcomes - Dogs-Adult,Animal Outcome Statistics - Other Live Outcomes - Dogs-Juvenile,Animal Outcome Statistics - Other Live Outcomes - Cats-Adult,Animal Outcome Statistics - Other Live Outcomes - Cats-Juvenile,Ending Animal Statistics - In Shelter Count as of 12/31/2019 - Dogs-Adult,Ending Animal Statistics - In Shelter Count as of 12/31/2019 - Dogs-Juvenile,Ending Animal Statistics - In Shelter Count as of 12/31/2019 - Cats-Adult,Ending Animal Statistics - In Shelter Count as of 12/31/2019 - 
Cats-Juvenile,Unnamed: 54
0,2 Blondes All Breed Rescue,59.0,30.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,526.0,1228.0,0.0,0.0,54.0,81.0,0.0,0.0,448.0,1262.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,78.0,0.0,0.0,0.0,
1,"2nd Chance Vizsla Rescue, Inc.",0.0,0.0,0.0,0.0,0.0,8.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,5.0,0.0,0.0,0.0,2.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,7.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
2,4 Paws 4 Life Rescue,26.0,16.0,6.0,3.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,23.0,7.0,3.0,4.0,4.0,0.0,0.0,0.0,274.0,498.0,28.0,38.0,0.0,0.0,0.0,0.0,311.0,491.0,32.0,41.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,6.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,5.0,18.0,5.0,4.0,
3,9 Lives Rescue,0.0,0.0,0.0,0.0,0.0,0.0,0.0,35.0,0.0,0.0,0.0,4.0,1.0,0.0,0.0,12.0,0.0,0.0,0.0,11.0,9.0,0.0,0.0,13.0,39.0,0.0,0.0,7.0,0.0,0.0,0.0,82.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,4.0,0.0,0.0,0.0,0.0,0.0,
4,Acadiana Animal Aid,17.0,44.0,76.0,9.0,0.0,32.0,41.0,4.0,11.0,89.0,71.0,31.0,119.0,68.0,55.0,54.0,12.0,0.0,0.0,0.0,0.0,466.0,1037.0,245.0,325.0,9.0,32.0,10.0,4.0,207.0,276.0,146.0,228.0,10.0,1.0,0.0,0.0,101.0,324.0,74.0,20.0,374.0,560.0,160.0,156.0,0.0,0.0,0.0,0.0,17.0,26.0,32.0,37.0,


In [4]:
as19.columns

Index(['Facility Name',
       'Starting Animal Statistics - In Shelter Count as of 1/1/2019 - Dogs-Adult',
       'Starting Animal Statistics - In Shelter Count as of 1/1/2019 - Dogs-Juvenile',
       'Starting Animal Statistics - In Shelter Count as of 1/1/2019 - Cats-Adult',
       'Starting Animal Statistics - In Shelter Count as of 1/1/2019 - Cats-Juvenile',
       'Starting Animal Statistics - In Shelter Count as of 1/1/2019 - Birds',
       'Starting Animal Statistics - In Foster Care Count as 1/1/2019 - Dogs-Adult',
       'Starting Animal Statistics - In Foster Care Count as 1/1/2019 - Dogs-Juvenile',
       'Starting Animal Statistics - In Foster Care Count as 1/1/2019 - Cats-Adult',
       'Starting Animal Statistics - In Foster Care Count as 1/1/2019 - Cats-Juvenile',
       'Animal Intake Statistics - Stray - Dogs-Adult',
       'Animal Intake Statistics - Stray - Dogs-Juvenile',
       'Animal Intake Statistics - Stray - Cats-Adult',
       'Animal Intake Statistics - Str

It looks like there is an extra column for some reason, so I want to look at what's happening there. Most of the column names are very long, so we might need to rename the ones we use.

In [5]:
as19['Unnamed: 54'].value_counts(dropna=False)

NaN    355
Name: Unnamed: 54, dtype: int64

This column holds no information: all 355 values are NaN. It is probably an artifact of how the file was stored (for example, a trailing delimiter on each row), so we can drop it.
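
One way to drop such a column without hard-coding its name is to remove every column that is entirely NaN. This is a minimal sketch on a toy frame standing in for `as19`; the toy column names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for as19: real columns plus one empty trailing column.
toy = pd.DataFrame({
    "Facility Name": ["A", "B", "C"],
    "Dogs": [1.0, 2.0, 3.0],
    "Unnamed: 2": [np.nan, np.nan, np.nan],
})

# List the columns in which every value is missing.
empty_cols = [col for col in toy.columns if toy[col].isna().all()]
print(empty_cols)

# dropna(axis=1, how="all") removes only columns that are entirely NaN.
toy_clean = toy.dropna(axis=1, how="all")
print(toy_clean.columns.tolist())
```

Because `how="all"` only touches fully empty columns, the same call should be safe to run on both years even though the empty column has a different name in each file.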

In [6]:
as19.dtypes

Facility Name                                                                                              object
Starting Animal Statistics - In Shelter Count as of 1/1/2019 - Dogs-Adult                                 float64
Starting Animal Statistics - In Shelter Count as of 1/1/2019 - Dogs-Juvenile                              float64
Starting Animal Statistics - In Shelter Count as of 1/1/2019 - Cats-Adult                                 float64
Starting Animal Statistics - In Shelter Count as of 1/1/2019 - Cats-Juvenile                              float64
Starting Animal Statistics - In Shelter Count as of 1/1/2019 - Birds                                      float64
Starting Animal Statistics - In Foster Care Count as 1/1/2019 - Dogs-Adult                                float64
Starting Animal Statistics - In Foster Care Count as 1/1/2019 - Dogs-Juvenile                             float64
Starting Animal Statistics - In Foster Care Count as 1/1/2019 - Cats-Adult              

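Every column except Facility Name is float64. The counts should be whole numbers, so the float type could come from fractional values or from NaNs (pandas stores a numeric column that contains NaN as float). For now, float is OK.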
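
To tell those two explanations apart, a small check like the following may help. It is a sketch on toy Series, not on the shelter data: it shows that a single NaN is enough to turn an integer column into float64, and how to test whether the non-missing values are all whole numbers.

```python
import numpy as np
import pandas as pd

# A column of whole-number counts is stored as an integer dtype.
counts = pd.Series([3, 0, 12])
print(counts.dtype)

# The same counts with one missing value are stored as float64.
with_gap = pd.Series([3, np.nan, 12])
print(with_gap.dtype)

# Check whether every non-missing value is a whole number.
non_missing = with_gap.dropna()
all_whole = bool((non_missing % 1 == 0).all())
print(all_whole)

# If so, the nullable "Int64" dtype can hold the counts plus missing values.
as_int = with_gap.astype("Int64")
print(as_int.dtype)
```

Converting to `Int64` is optional; sums and means work the same way on float columns.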

In [7]:
as19.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 355 entries, 0 to 354
Data columns (total 55 columns):
 #   Column                                                                                                  Non-Null Count  Dtype  
---  ------                                                                                                  --------------  -----  
 0   Facility Name                                                                                           349 non-null    object 
 1   Starting Animal Statistics - In Shelter Count as of 1/1/2019 - Dogs-Adult                               349 non-null    float64
 2   Starting Animal Statistics - In Shelter Count as of 1/1/2019 - Dogs-Juvenile                            349 non-null    float64
 3   Starting Animal Statistics - In Shelter Count as of 1/1/2019 - Cats-Adult                               349 non-null    float64
 4   Starting Animal Statistics - In Shelter Count as of 1/1/2019 - Cats-Juvenile                

This dataset looks fairly clean, but it is not free of nulls: every column has 349 non-null values out of 355 rows, so 6 rows are missing data, and the last column is entirely null.

In [8]:
as19.describe()

Unnamed: 0,Starting Animal Statistics - In Shelter Count as of 1/1/2019 - Dogs-Adult,Starting Animal Statistics - In Shelter Count as of 1/1/2019 - Dogs-Juvenile,Starting Animal Statistics - In Shelter Count as of 1/1/2019 - Cats-Adult,Starting Animal Statistics - In Shelter Count as of 1/1/2019 - Cats-Juvenile,Starting Animal Statistics - In Shelter Count as of 1/1/2019 - Birds,Starting Animal Statistics - In Foster Care Count as 1/1/2019 - Dogs-Adult,Starting Animal Statistics - In Foster Care Count as 1/1/2019 - Dogs-Juvenile,Starting Animal Statistics - In Foster Care Count as 1/1/2019 - Cats-Adult,Starting Animal Statistics - In Foster Care Count as 1/1/2019 - Cats-Juvenile,Animal Intake Statistics - Stray - Dogs-Adult,Animal Intake Statistics - Stray - Dogs-Juvenile,Animal Intake Statistics - Stray - Cats-Adult,Animal Intake Statistics - Stray - Cats-Juvenile,Animal Intake Statistics - Owner Relinquished - Dogs-Adult,Animal Intake Statistics - Owner Relinquished - Dogs-Juvenile,Animal Intake Statistics - Owner Relinquished - Cats-Adult,Animal Intake Statistics - Owner Relinquished - Cats-Juvenile,Animal Intake Statistics - Transfer In from a Colorado Organization - Dogs-Adult,Animal Intake Statistics - Transfer In from a Colorado Organization - Dogs-Juvenile,Animal Intake Statistics - Transfer In from a Colorado Organization - Cats-Adult,Animal Intake Statistics - Transfer In from a Colorado Organization - Cats-Juvenile,Animal Intake Statistics - Transfer In from an Out of State Organization - Dogs-Adult,Animal Intake Statistics - Transfer In from an Out of State Organization - Dogs-Juvenile,Animal Intake Statistics - Transfer In from an Out of State Organization - Cats-Adult,Animal Intake Statistics - Transfer In from an Out of State Organization - Cats-Juvenile,"Animal Intake Statistics - Other; TNR/Protective Custody/Returns/Disaster Relief, etc. - Dogs-Adult","Animal Intake Statistics - Other; TNR/Protective Custody/Returns/Disaster Relief, etc. 
- Dogs-Juvenile","Animal Intake Statistics - Other; TNR/Protective Custody/Returns/Disaster Relief, etc. - Cats-Adult","Animal Intake Statistics - Other; TNR/Protective Custody/Returns/Disaster Relief, etc. - Cats-Juvenile",Animal Outcome Statistics - Adoption - Dogs-Adult,Animal Outcome Statistics - Adoption - Dogs-Juvenile,Animal Outcome Statistics - Adoption - Cats-Adult,Animal Outcome Statistics - Adoption - Cats-Juvenile,Animal Outcome Statistics - Return to Owner - Dogs-Adult,Animal Outcome Statistics - Return to Owner - Dogs-Juvenile,Animal Outcome Statistics - Return to Owner - Cats-Adult,Animal Outcome Statistics - Return to Owner - Cats-Juvenile,Animal Outcome Statistics - Transfer Out to a Colorado Organization - Dogs-Adult,Animal Outcome Statistics - Transfer Out to a Colorado Organization - Dogs-Juvenile,Animal Outcome Statistics - Transfer Out to a Colorado Organization - Cats-Adult,Animal Outcome Statistics - Transfer Out to a Colorado Organization - Cats-Juvenile,Animal Outcome Statistics - Transfer Out to an Out of State Organization - Dogs-Adult,Animal Outcome Statistics - Transfer Out to an Out of State Organization - Dogs-Juvenile,Animal Outcome Statistics - Transfer Out to an Out of State Organization - Cats-Adult,Animal Outcome Statistics - Transfer Out to an Out of State Organization - Cats-Juvenile,Animal Outcome Statistics - Other Live Outcomes - Dogs-Adult,Animal Outcome Statistics - Other Live Outcomes - Dogs-Juvenile,Animal Outcome Statistics - Other Live Outcomes - Cats-Adult,Animal Outcome Statistics - Other Live Outcomes - Cats-Juvenile,Ending Animal Statistics - In Shelter Count as of 12/31/2019 - Dogs-Adult,Ending Animal Statistics - In Shelter Count as of 12/31/2019 - Dogs-Juvenile,Ending Animal Statistics - In Shelter Count as of 12/31/2019 - Cats-Adult,Ending Animal Statistics - In Shelter Count as of 12/31/2019 - Cats-Juvenile,Unnamed: 54
count,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,349.0,0.0
mean,11.338109,2.971347,7.352436,1.398281,1.868195,5.22063,2.902579,1.911175,2.555874,85.094556,7.498567,37.492837,33.014327,46.759312,9.713467,32.836676,18.303725,13.206304,5.444126,6.793696,8.77937,47.180516,54.538682,9.647564,19.137536,22.191977,5.765043,16.936963,7.759312,117.842407,69.842407,68.527221,64.524355,65.045845,2.011461,7.432665,0.567335,12.65616,4.309456,5.530086,7.985673,3.573066,2.710602,0.590258,0.767908,2.111748,0.510029,10.802292,2.389685,9.243553,1.638968,7.633238,2.584527,
std,39.311755,26.253172,24.655538,5.55889,32.208549,12.069722,11.353349,8.227765,14.221378,357.741494,26.197704,184.072381,123.812137,215.604908,35.286811,176.773929,74.723663,43.7553,23.84156,28.64814,38.552249,133.871074,194.934875,44.134384,135.278336,119.349002,36.524544,93.61594,42.094372,365.278552,211.922659,283.907267,241.036299,269.435773,7.510138,36.157449,3.422234,51.224902,23.840501,22.680751,40.328426,28.798916,32.838939,8.670933,9.478075,16.076026,7.300058,81.531169,18.948888,21.899608,10.08876,27.526427,10.88851,
min,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,-10.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,-16.0,0.0,-20.0,0.0,0.0,0.0,0.0,0.0,
25%,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
50%,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,3.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,15.0,2.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
75%,7.0,0.0,0.0,0.0,0.0,5.0,0.0,0.0,0.0,3.0,0.0,1.0,0.0,16.0,5.0,3.0,2.0,7.0,0.0,0.0,0.0,28.0,13.0,0.0,0.0,0.0,0.0,0.0,0.0,92.0,40.0,14.0,14.0,1.0,0.0,0.0,0.0,2.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,7.0,0.0,0.0,0.0,
max,608.0,476.0,242.0,50.0,601.0,97.0,145.0,90.0,178.0,4245.0,228.0,2434.0,1337.0,3160.0,318.0,2597.0,770.0,567.0,313.0,380.0,414.0,1340.0,1914.0,519.0,2176.0,1292.0,614.0,1362.0,547.0,4321.0,1991.0,3635.0,2465.0,3004.0,81.0,402.0,48.0,649.0,324.0,244.0,540.0,374.0,560.0,160.0,156.0,216.0,135.0,1316.0,310.0,136.0,170.0,264.0,143.0,
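
The `min` row above contains a few negative values (-10, -16 and -20), which is surprising for animal counts and may be worth flagging when judging data quality. A minimal sketch for locating such values is shown here on a toy frame; the column names are shortened stand-ins, not the real headers.

```python
import pandas as pd

# Toy frame standing in for as19, with one negative count.
toy = pd.DataFrame({
    "Facility Name": ["A", "B", "C"],
    "Adoption": [5.0, -10.0, 7.0],
    "Stray": [1.0, 2.0, 3.0],
})

# Restrict the check to numeric columns.
numeric = toy.select_dtypes(include="number")

# Columns whose minimum is below zero.
col_mins = numeric.min()
negative_cols = col_mins[col_mins < 0].index.tolist()
print(negative_cols)

# Rows that contain at least one negative value.
bad_rows = toy[(numeric < 0).any(axis=1)]
print(bad_rows)
```

Whether negative entries are corrections, typos, or something else cannot be decided from the numbers alone, so the sketch only surfaces them.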


The list of items that need to be done for the 2019 data:
- drop the column with no information
- figure out what columns we need in the analysis; drop the rest 
- rename the columns we will use for ease of access

Next let's look at the 2020 data!

In [9]:
as20 = pd.read_csv('2020 Animal Shelter and Rescue Individual Report.csv')
print(len(as20))
as20.head()

450


Unnamed: 0,Facility Name,1/1/2020\n Adult Dogs\n In Shelter,1/1/2020\n Adult Dogs\n In Foster Care,2020\n Adult Dogs\n Stray,2020\n Adult Dogs\n Owner Relinquished,2020\n Adult Dogs Transfer from another Colorado Organization,2020\n Adult Dogs\n Transfer from Out of State,2020\n Adult Dogs\n Other Intake,2020\n Adult Dogs\n Adoption,2020\n Adult Dogs\n Returned to Owner (RTO),2020\n Adult Dogs\n Transfer to another Colorado Organization,2020\n Adult Dogs\n Transfer to Out of State,2020\n Adult Dogs\n Other Transfer,1/1/2020\n Juvenile Dogs\n In Shelter,1/1/2020\n Juvenile Dogs\n In Foster Care,2020\n Juvenile Dogs\n Stray,2020\n Juvenile Dogs\n Owner Relinquished,2020\n Juvenile Dogs\n Transfer from another Colorado Organization,2020\n Juvenile Dogs\n Transfer from Out of State,2020\n Juvenile Dogs\n Other Intake,2020\n Juvenile Dogs\n Adoption,2020\n Juvenile Dogs\n Returned to Owner (RTO),2020\n Juvenile Dogs\n Transfer to another Colorado Organization,2020\n Juvenile Dogs\n Transfer to Out of State,2020\n Juvenile Dogs\n Other Transfer,1/1/2020\n Adult Cats\n In Shelter,1/1/2020\n Adult Cats\n In Foster Care,2020\n Adult Cats\n Stray,2020\n Adult Cats\n Owner Relinquished,2020\n Adult Cats\n Transfer from another Colorado Organization,2020\n Adult Cats\n Transfer from Out of State,2020\n Adult Cats\n Other Intake,2020\n Adult Cats\n Adoption,2020\n Adult Cats\n Returned to Owner (RTO),2020\n Adult Cats\n Transfer to another Colorado Organization,2020\n Adult Cats\n Transfer to Out of State,2020\n Adult Cats\n Other Transfer,1/1/2020\n Juvenile Cats\n In Shelter,1/1/2020\n Juvenile Cats\n In Foster Care,2020\n Juvenile Cats\n Stray,2020\n Juvenile Cats\n Owner Relinquished,2020\n Juvenile Cats\n Transfer from another Colorado Organization,2020\n Juvenile Cats\n Transfer from Out of State,2020\n Juvenile Cats\n Other Intake,2020\n Juvenile Cats\n Adoption,2020\n Juvenile Cats\n Returned to Owner (RTO),2020\n Juvenile Cats\n Transfer to another Colorado 
Organization,2020\n Juvenile Cats\n Transfer to Out of State,2020\n Juvenile Cats\n Other Transfer,Unnamed: 49
0,"2 Blondes All Breed Rescue, Inc.",78.0,47.0,0.0,0.0,0.0,379.0,40.0,498.0,0.0,0.0,1.0,0.0,0.0,56.0,0.0,0.0,0.0,1179.0,0.0,1157.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
1,"2nd Chance Vizsla Rescue, Inc.",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
2,4 Paws 4 Life Rescue,5.0,0.0,0.0,11.0,0.0,454.0,0.0,459.0,0.0,0.0,6.0,0.0,18.0,0.0,0.0,9.0,0.0,859.0,19.0,882.0,0.0,0.0,0.0,0.0,5.0,0.0,0.0,2.0,0.0,46.0,0.0,45.0,0.0,0.0,0.0,0.0,4.0,0.0,7.0,4.0,0.0,157.0,12.0,174.0,0.0,0.0,0.0,0.0,
3,9 Lives Rescue,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,42.0,17.0,4.0,6.0,0.0,7.0,47.0,1.0,0.0,0.0,3.0,0.0,0.0,12.0,8.0,0.0,0.0,8.0,11.0,0.0,0.0,0.0,0.0,
4,A Friend of Jack Rescue,0.0,0.0,0.0,1.0,0.0,181.0,0.0,175.0,0.0,2.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,101.0,32.0,108.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,


At first glance, the columns are completely different and there are more rows in this dataset (450 vs. 355). Maybe more shelters were added in 2020, but that seems like a big increase.

In [10]:
as20.columns

Index(['Facility Name', '1/1/2020\n Adult Dogs\n In Shelter',
       '1/1/2020\n Adult Dogs\n In Foster Care', '2020\n Adult Dogs\n Stray',
       '2020\n Adult Dogs\n Owner Relinquished',
       '2020\n Adult Dogs Transfer from another Colorado Organization',
       '2020\n Adult Dogs\n Transfer from Out of State',
       '2020\n Adult Dogs\n Other Intake', '2020\n Adult Dogs\n Adoption',
       '2020\n Adult Dogs\n Returned to Owner (RTO)',
       '2020\n Adult Dogs\n Transfer to another Colorado Organization',
       '2020\n Adult Dogs\n Transfer to Out of State',
       '2020\n Adult Dogs\n Other Transfer',
       '1/1/2020\n Juvenile Dogs\n In Shelter',
       '1/1/2020\n Juvenile Dogs\n In Foster Care',
       '2020\n Juvenile Dogs\n Stray',
       '2020\n Juvenile Dogs\n Owner Relinquished',
       '2020\n Juvenile Dogs\n Transfer from another Colorado Organization',
       '2020\n Juvenile Dogs\n Transfer from Out of State',
       '2020\n Juvenile Dogs\n Other Intake',
       

Since the column names are different, we will have to find the matching columns in 2019 to compare them to 2020. We also have an extra column here. Let's check that out first.
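
For the column matching, one possible approach is to tidy the 2020 names first and then pair them with 2019 names by hand. The sketch below assumes that pairing; the helper `tidy` and the dictionary `match_2019_to_2020` are hypothetical names, and the two pairs shown are my reading of the headers rather than a documented mapping.

```python
import re


def tidy(name: str) -> str:
    """Collapse runs of whitespace (including newlines) into single spaces."""
    return re.sub(r"\s+", " ", name).strip()


# Two 2020 headers copied from the output above. Note that the second one
# has no newline between the animal group and the category.
cols_2020 = [
    "2020\n Adult Dogs\n Stray",
    "2020\n Adult Dogs Transfer from another Colorado Organization",
]
tidied = [tidy(c) for c in cols_2020]
print(tidied)

# Hand-written pairing of 2019 names to tidied 2020 names (an assumption).
match_2019_to_2020 = {
    "Animal Intake Statistics - Stray - Dogs-Adult":
        "2020 Adult Dogs Stray",
    "Animal Intake Statistics - Transfer In from a Colorado Organization - Dogs-Adult":
        "2020 Adult Dogs Transfer from another Colorado Organization",
}
```

Tidying first means the inconsistent newline in the 2020 headers no longer matters when looking names up.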

In [11]:
as20['Unnamed: 49'].value_counts(dropna=False)

NaN    450
Name: Unnamed: 49, dtype: int64

Again, it's all NaNs, so we can drop it.

In [12]:
as20.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 450 entries, 0 to 449
Data columns (total 50 columns):
 #   Column                                                            Non-Null Count  Dtype  
---  ------                                                            --------------  -----  
 0   Facility Name                                                     356 non-null    object 
 1   1/1/2020
 Adult Dogs
 In Shelter                                  356 non-null    float64
 2   1/1/2020
 Adult Dogs
 In Foster Care                              356 non-null    float64
 3   2020
 Adult Dogs
 Stray                                           356 non-null    float64
 4   2020
 Adult Dogs
 Owner Relinquished                              356 non-null    float64
 5   2020
 Adult Dogs Transfer from another Colorado Organization      356 non-null    float64
 6   2020
 Adult Dogs
 Transfer from Out of State                      356 non-null    float64
 7   2020
 Adult Dogs
 Other Intake     

Looking at the info, the numeric columns are float again, which matches the first dataset. But there are 450 entries and most columns have only 356 non-null values. It seemed slightly off that 95 new shelters would be added year over year (about a 27% increase), so I wonder what is happening here. Let's take a look at the bottom of the dataset.

In [13]:
as20.tail()

Unnamed: 0,Facility Name,1/1/2020\n Adult Dogs\n In Shelter,1/1/2020\n Adult Dogs\n In Foster Care,2020\n Adult Dogs\n Stray,2020\n Adult Dogs\n Owner Relinquished,2020\n Adult Dogs Transfer from another Colorado Organization,2020\n Adult Dogs\n Transfer from Out of State,2020\n Adult Dogs\n Other Intake,2020\n Adult Dogs\n Adoption,2020\n Adult Dogs\n Returned to Owner (RTO),2020\n Adult Dogs\n Transfer to another Colorado Organization,2020\n Adult Dogs\n Transfer to Out of State,2020\n Adult Dogs\n Other Transfer,1/1/2020\n Juvenile Dogs\n In Shelter,1/1/2020\n Juvenile Dogs\n In Foster Care,2020\n Juvenile Dogs\n Stray,2020\n Juvenile Dogs\n Owner Relinquished,2020\n Juvenile Dogs\n Transfer from another Colorado Organization,2020\n Juvenile Dogs\n Transfer from Out of State,2020\n Juvenile Dogs\n Other Intake,2020\n Juvenile Dogs\n Adoption,2020\n Juvenile Dogs\n Returned to Owner (RTO),2020\n Juvenile Dogs\n Transfer to another Colorado Organization,2020\n Juvenile Dogs\n Transfer to Out of State,2020\n Juvenile Dogs\n Other Transfer,1/1/2020\n Adult Cats\n In Shelter,1/1/2020\n Adult Cats\n In Foster Care,2020\n Adult Cats\n Stray,2020\n Adult Cats\n Owner Relinquished,2020\n Adult Cats\n Transfer from another Colorado Organization,2020\n Adult Cats\n Transfer from Out of State,2020\n Adult Cats\n Other Intake,2020\n Adult Cats\n Adoption,2020\n Adult Cats\n Returned to Owner (RTO),2020\n Adult Cats\n Transfer to another Colorado Organization,2020\n Adult Cats\n Transfer to Out of State,2020\n Adult Cats\n Other Transfer,1/1/2020\n Juvenile Cats\n In Shelter,1/1/2020\n Juvenile Cats\n In Foster Care,2020\n Juvenile Cats\n Stray,2020\n Juvenile Cats\n Owner Relinquished,2020\n Juvenile Cats\n Transfer from another Colorado Organization,2020\n Juvenile Cats\n Transfer from Out of State,2020\n Juvenile Cats\n Other Intake,2020\n Juvenile Cats\n Adoption,2020\n Juvenile Cats\n Returned to Owner (RTO),2020\n Juvenile Cats\n Transfer to another Colorado 
Organization,2020\n Juvenile Cats\n Transfer to Out of State,2020\n Juvenile Cats\n Other Transfer,Unnamed: 49
445,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
446,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
447,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
448,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
449,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,


So that explains the reason this dataset is longer. It looks like there are extra blank rows at the bottom of the dataset that need to be removed. We will have to filter these out.
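
A minimal sketch of that filter, again on a toy frame rather than the real `as20`, counts the fully blank rows and then drops them:

```python
import numpy as np
import pandas as pd

# Toy frame standing in for as20: two real rows followed by two blank rows.
toy = pd.DataFrame({
    "Facility Name": ["A", "B", np.nan, np.nan],
    "Dogs": [1.0, 0.0, np.nan, np.nan],
})

# A row is blank when every column in it is NaN.
blank_mask = toy.isna().all(axis=1)
n_blank = int(blank_mask.sum())
print(n_blank)

# how="all" drops only rows that are entirely NaN; rows with some data stay.
toy_clean = toy.dropna(how="all").reset_index(drop=True)
print(len(toy_clean))
```

Counting the blank rows before dropping them gives a number to compare against the gap between the row count and the non-null count reported by `info()`.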

In [14]:
as20.describe()

Unnamed: 0,1/1/2020\n Adult Dogs\n In Shelter,1/1/2020\n Adult Dogs\n In Foster Care,2020\n Adult Dogs\n Stray,2020\n Adult Dogs\n Owner Relinquished,2020\n Adult Dogs Transfer from another Colorado Organization,2020\n Adult Dogs\n Transfer from Out of State,2020\n Adult Dogs\n Other Intake,2020\n Adult Dogs\n Adoption,2020\n Adult Dogs\n Returned to Owner (RTO),2020\n Adult Dogs\n Transfer to another Colorado Organization,2020\n Adult Dogs\n Transfer to Out of State,2020\n Adult Dogs\n Other Transfer,1/1/2020\n Juvenile Dogs\n In Shelter,1/1/2020\n Juvenile Dogs\n In Foster Care,2020\n Juvenile Dogs\n Stray,2020\n Juvenile Dogs\n Owner Relinquished,2020\n Juvenile Dogs\n Transfer from another Colorado Organization,2020\n Juvenile Dogs\n Transfer from Out of State,2020\n Juvenile Dogs\n Other Intake,2020\n Juvenile Dogs\n Adoption,2020\n Juvenile Dogs\n Returned to Owner (RTO),2020\n Juvenile Dogs\n Transfer to another Colorado Organization,2020\n Juvenile Dogs\n Transfer to Out of State,2020\n Juvenile Dogs\n Other Transfer,1/1/2020\n Adult Cats\n In Shelter,1/1/2020\n Adult Cats\n In Foster Care,2020\n Adult Cats\n Stray,2020\n Adult Cats\n Owner Relinquished,2020\n Adult Cats\n Transfer from another Colorado Organization,2020\n Adult Cats\n Transfer from Out of State,2020\n Adult Cats\n Other Intake,2020\n Adult Cats\n Adoption,2020\n Adult Cats\n Returned to Owner (RTO),2020\n Adult Cats\n Transfer to another Colorado Organization,2020\n Adult Cats\n Transfer to Out of State,2020\n Adult Cats\n Other Transfer,1/1/2020\n Juvenile Cats\n In Shelter,1/1/2020\n Juvenile Cats\n In Foster Care,2020\n Juvenile Cats\n Stray,2020\n Juvenile Cats\n Owner Relinquished,2020\n Juvenile Cats\n Transfer from another Colorado Organization,2020\n Juvenile Cats\n Transfer from Out of State,2020\n Juvenile Cats\n Other Intake,2020\n Juvenile Cats\n Adoption,2020\n Juvenile Cats\n Returned to Owner (RTO),2020\n Juvenile Cats\n Transfer to another Colorado Organization,2020\n 
Juvenile Cats\n Transfer to Out of State,2020\n Juvenile Cats\n Other Transfer,Unnamed: 49
count,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,356.0,0.0
mean,8.898876,5.620787,66.991573,37.772472,11.893258,48.803371,12.398876,106.356742,49.087079,10.727528,2.036517,1.741573,1.463483,3.764045,13.519663,9.901685,5.794944,53.365169,5.002809,76.719101,2.134831,3.699438,0.679775,0.452247,8.157303,2.337079,31.16573,26.581461,6.660112,14.120787,15.452247,62.098315,6.879213,5.825843,0.053371,12.356742,1.988764,3.042135,26.311798,16.480337,8.455056,25.039326,5.542135,67.570225,0.485955,7.081461,0.053371,1.367978,
std,20.634332,13.07743,268.876541,157.994389,38.020526,125.148253,56.195668,270.175997,198.61708,44.261087,20.798058,17.655085,6.869513,12.807148,82.846623,41.679108,38.009785,202.127129,20.136226,232.551033,9.983018,25.365877,7.500891,5.832446,27.766085,9.468934,147.733441,120.672793,25.326592,66.557564,92.59593,218.23179,32.929023,22.737647,0.404154,90.060695,7.940617,15.340857,88.007656,60.546126,40.574546,113.219685,24.345076,198.866657,2.247333,38.343027,0.95532,10.762061,
min,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
25%,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
50%,0.0,0.0,0.0,3.0,0.0,1.0,0.0,19.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
75%,7.0,6.0,4.25,17.0,7.0,26.75,0.0,94.5,1.0,1.0,0.0,0.0,0.0,1.0,0.0,4.0,0.0,15.0,0.0,42.5,0.0,0.0,0.0,0.0,0.0,0.0,1.25,3.25,0.0,0.0,0.0,18.75,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2.0,0.0,0.0,0.0,25.75,0.0,0.0,0.0,0.0,
max,143.0,129.0,3000.0,2301.0,409.0,975.0,590.0,2846.0,2140.0,396.0,275.0,298.0,97.0,118.0,1146.0,592.0,670.0,2444.0,240.0,2578.0,141.0,387.0,116.0,108.0,307.0,118.0,1808.0,1574.0,333.0,815.0,1097.0,2280.0,363.0,279.0,5.0,1185.0,100.0,234.0,796.0,674.0,538.0,1155.0,270.0,1810.0,23.0,536.0,18.0,151.0,


These values seem reasonable: every minimum is 0 and each column shows a count of 356, which reinforces that the blank rows (94 of the 450) need to be filtered out.

The list of items that need to be done for the 2020 data:
- drop the column with no information
- figure out what columns we need in the analysis and how they match to the 2019 data columns; drop the rest 
- rename the columns we will use for ease of access
- drop the rows with all NaNs

Total list of items to fix for the 2019/2020 data:
- Drop the columns with no information
- Figure out what columns we need in the analysis; drop the rest
- Work out which 2019 columns correspond to which 2020 columns
- Rename the columns we will use for ease of access
- Drop the 2020 rows that are all NaN
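
The "keep what we need and rename it" steps can be handled together with a single rename dictionary. This is a sketch on a toy frame; the short names such as `intake_stray_dogs_adult` are hypothetical, and which columns to keep is a choice the analysis still has to make.

```python
import pandas as pd

# Toy frame with two long 2019-style headers, one of which we do not need.
toy = pd.DataFrame({
    "Facility Name": ["A", "B"],
    "Animal Intake Statistics - Stray - Dogs-Adult": [4.0, 1.0],
    "Starting Animal Statistics - In Shelter Count as of 1/1/2019 - Birds": [0.0, 2.0],
})

# The keys select the columns to keep; the values are the new short names.
rename_map = {
    "Facility Name": "facility",
    "Animal Intake Statistics - Stray - Dogs-Adult": "intake_stray_dogs_adult",
}

# Select only the mapped columns, then rename them in one step.
slim = toy[list(rename_map)].rename(columns=rename_map)
print(slim.columns.tolist())
```

Using the same short names for both years would make it straightforward to combine or compare the two files later.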