In [25]:
import pandas as pd
# Austin Animal Center Shelter Intakes and Outcomes 2013-2018
aac_in_out = pd.read_csv('./data/aac_intakes_outcomes.csv')
aac_in = pd.read_csv('./data/aac_intakes.csv')
aac_out = pd.read_csv('./data/aac_outcomes.csv')

<style>
    * {
      margin: 0;
      padding: 0;
      box-sizing: border-box;
    }
    small {
      font-size: 10px;
    }
    h3 {
      margin-bottom: 5px;
    }
    p {
      font-size: 14px;
    }
</style>

<small>Found Pets - Meets 1/1</small>
### Is there an area where more pets are found?


I filtered the `aac_in` **DataFrame** to keep only rows where the `found_location` column contains the substring `' in '`, which selects entries written as a street address followed by a city. I then counted how often each unique `found_location` value occurs in the filtered **DataFrame** and returned the five most frequent values.

In [26]:
aac_in[aac_in['found_location'].str.contains(' in ')].value_counts('found_location').head()

found_location
7201 Levander Loop in Austin (TX)     517
4434 Frontier Trl in Austin (TX)      163
124 W Anderson Ln in Austin (TX)      153
12034 Research Blvd in Austin (TX)     98
1156 W Cesar Chavez in Austin (TX)     98
Name: count, dtype: int64
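
One caveat with this kind of filter: when the column holds missing values, `str.contains` can return missing values for those rows, and a mask with missing entries typically cannot be used for Boolean indexing. Below is a minimal sketch of a more defensive version. The `sample` DataFrame is made up for illustration and is not the real `aac_in` data; `na=False` and `regex=False` are standard `str.contains` parameters.

```python
import pandas as pd

# Hypothetical stand-in for aac_in; the real file has many more rows and columns.
sample = pd.DataFrame({
    "found_location": [
        "7201 Levander Loop in Austin (TX)",
        "7201 Levander Loop in Austin (TX)",
        "Austin (TX)",
        None,
        "124 W Anderson Ln in Austin (TX)",
    ]
})

# na=False treats missing locations as non-matches;
# regex=False searches for the literal substring ' in '.
has_address = sample["found_location"].str.contains(" in ", na=False, regex=False)

# Count each remaining location and keep the most frequent ones.
top_locations = sample.loc[has_address, "found_location"].value_counts().head()
print(top_locations)
```

With `na=False`, rows without a location simply drop out of the count instead of interrupting the filter.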


<small>Found Pets - Exceeds 1/2</small>
### How many animals in the shelter are repeats?

Using the `animal_id` column, I counted how many times each animal appears in the `aac_in` **DataFrame**, flagged the animals that appear more than once, and assigned the resulting Boolean **Series** to the `repeat_animals` variable. Summing that **Series** counts its `True` values, which gives the number of animals with more than one intake.



In [27]:
repeat_animals = aac_in.animal_id.value_counts() > 1
repeat_animals.sum()

6154
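
To make the counting logic concrete, here is a small sketch on made-up IDs (the `intakes` DataFrame is hypothetical). It also shows an alternative route through `duplicated()`, which should give the same number of repeat animals.

```python
import pandas as pd

# Hypothetical intake log: A1 appears three times, A2 twice, A3 once.
intakes = pd.DataFrame({"animal_id": ["A1", "A2", "A1", "A3", "A1", "A2"]})

# Intakes per animal, then a Boolean flag for animals seen more than once.
visits_per_animal = intakes["animal_id"].value_counts()
repeat_animals = visits_per_animal > 1

# True counts as 1 when summed, so this is the number of repeat animals.
n_repeat_animals = int(repeat_animals.sum())

# Alternative: collect IDs on rows that repeat an earlier ID, then count distinct IDs.
n_repeat_alt = intakes.loc[intakes["animal_id"].duplicated(), "animal_id"].nunique()

print(n_repeat_animals, n_repeat_alt)
```

Both values count animals, not rows: an animal with three intakes still contributes one to the total.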


<small>Found Pets - Exceeds 2/2</small>
### Which animal was returned to the shelter the most?

The `repeat_animals` Boolean **Series** from the previous code block is indexed by animal ID and keeps the descending order produced by `value_counts()`, so its first index label is the animal ID that appears most frequently in the `aac_in` **DataFrame**. I retrieved that label with the `.index[0]` accessor and assigned it to the `most_repeat_animal` variable. I then used that ID string to filter the `aac_in` **DataFrame**, kept only the `animal_id`, `name`, and `breed` columns, and displayed the first row for a cleaner presentation.

In [28]:
most_repeat_animal = repeat_animals.head(1).index[0]
aac_in[aac_in['animal_id'] == most_repeat_animal][['animal_id','name', 'breed']].set_index('animal_id').head(1)

| animal_id | name | breed |
|---|---|---|
| A721033 | Lil Bit | Rat Terrier Mix |


<style>
    img {
      margin: 0;
      max-width: 15%;
    }
</style>

![Lil Bit](img/lil_bit.png)
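
The lookup above relies on `value_counts()` sorting by frequency. A sketch of a variant that does not depend on row order uses `idxmax()` on the counts. The `intakes` DataFrame, its names, and its breeds are invented for illustration.

```python
import pandas as pd

# Hypothetical intake records; A1 has the most intakes.
intakes = pd.DataFrame({
    "animal_id": ["A1", "A2", "A1", "A3", "A1", "A2"],
    "name": ["Rex", "Milo", "Rex", "Luna", "Rex", "Milo"],
    "breed": ["Beagle Mix", "Pug", "Beagle Mix", "Domestic Shorthair",
              "Beagle Mix", "Pug"],
})

# idxmax() returns the index label (the animal ID) with the highest count.
visits = intakes["animal_id"].value_counts()
most_frequent_id = visits.idxmax()

# Show that animal once, keyed by its ID.
top_animal = (
    intakes.loc[intakes["animal_id"] == most_frequent_id, ["animal_id", "name", "breed"]]
    .drop_duplicates()
    .set_index("animal_id")
)
print(top_animal)
```

Here `drop_duplicates()` plays the role of `head(1)`, assuming the animal's name and breed are recorded identically on every intake.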


<small>Average Found Pets - Meets 1/2</small>
### What is the average number of pets found in a month in the year 2015?

First, I converted the `datetime` column in the `aac_in` **DataFrame** to the pandas `datetime64` dtype. Then I created a new **DataFrame** named `pets_2015` containing a copy of only the 2015 intake records from `aac_in`, and added an `intake_month` column that labels each record with its abbreviated month and year (for example, `Jun 2015`). Next, I counted the number of pets found in each month. Finally, I calculated the average number of pets found per month in 2015 and rounded it to the nearest whole number.

In [29]:
aac_in['datetime'] = pd.to_datetime(aac_in['datetime'])
pets_2015 = aac_in[aac_in['datetime'].dt.year == 2015].copy()
pets_2015['intake_month'] = pets_2015['datetime'].dt.strftime('%b %Y')
pets_2015_monthly =  pets_2015['intake_month'].value_counts()
pets_2015_monthly_avg = pets_2015_monthly.mean()
round(pets_2015_monthly_avg)

1559
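
One detail worth keeping in mind: `value_counts()` only lists months that have at least one record, so the mean is taken over observed months. That makes no difference when every month has intakes, but the sketch below shows how the two averages can diverge. The dates are made up, and reindexing to months 1 through 12 is one possible way to include empty months.

```python
import pandas as pd

# Hypothetical intake timestamps: five records in 2015 (Jan and Mar), one in 2014.
intakes = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2015-01-05", "2015-01-20", "2015-03-02",
        "2014-12-31", "2015-03-15", "2015-03-30",
    ])
})

in_2015 = intakes[intakes["datetime"].dt.year == 2015]

# Count per calendar month, then add the months with no records as zeros.
monthly = (
    in_2015["datetime"].dt.month
    .value_counts()
    .reindex(range(1, 13), fill_value=0)
)

avg_all_months = monthly.mean()                    # averages over all 12 months
avg_observed_months = monthly[monthly > 0].mean()  # averages over months with data

print(avg_all_months, avg_observed_months)
```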


<small>Average Found Pets - Meets 2/2</small>
### Are there months where there is a higher number of animals found?

I filtered the `pets_2015_monthly` **Series** from the code block above to only include months where the count of intakes is greater than the average number of monthly intakes for 2015.

In [30]:
pets_2015_monthly[pets_2015_monthly > pets_2015_monthly_avg]

intake_month
Jun 2015    2189
May 2015    2094
Oct 2015    1740
Aug 2015    1718
Jul 2015    1635
Sep 2015    1591
Name: count, dtype: int64
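
The result above is ordered by count. If a calendar ordering is preferred, one option is to index the monthly counts by period rather than by formatted string, since strings such as `Jun 2015` sort alphabetically. This is a sketch with toy counts (not the shelter's figures).

```python
import pandas as pd

# Toy monthly counts keyed by monthly periods, deliberately out of order.
monthly = pd.Series(
    [10, 30, 5, 25],
    index=pd.PeriodIndex(["2015-07", "2015-06", "2015-01", "2015-05"], freq="M"),
)

# Keep months above the mean, then sort chronologically by the period index.
above_avg = monthly[monthly > monthly.mean()].sort_index()
print(above_avg)
```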


<small>Ratio of Incoming vs. Outgoing - Meets 1/1</small>
### What is the ratio of incoming pets vs. adopted pets?

First, I counted the total number of intake records in the `aac_in` **DataFrame** (every non-null `animal_id`, so an animal that came in more than once is counted each time). Then, I counted the records in the `aac_out` **DataFrame** whose `outcome_type` is `Adoption`. Finally, I divided the number of intakes by the number of adoptions to get the ratio of incoming animals to adopted animals, rounded to two decimal places.

In [31]:
# Total intake records (non-null animal IDs; repeat visits count each time)
animal_intakes = aac_in['animal_id'].count()
# Outcome records whose type is 'Adoption'
animals_adopted = (aac_out['outcome_type'] == 'Adoption').sum()
# Intakes per adoption
round(animal_intakes / animals_adopted, 2)

2.34
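
Because the direction of a ratio is easy to mix up, here is a small sketch that computes both directions on invented data: intakes per adoption (the quantity reported above) and the share of intakes that end in adoption. The `intakes` and `outcomes` DataFrames are hypothetical.

```python
import pandas as pd

# Hypothetical records: six intakes, three of the outcomes are adoptions.
intakes = pd.DataFrame({"animal_id": ["A1", "A2", "A1", "A3", "A4", "A5"]})
outcomes = pd.DataFrame({
    "outcome_type": ["Adoption", "Transfer", "Adoption",
                     "Return to Owner", None, "Adoption"]
})

n_intakes = int(intakes["animal_id"].count())
n_adopted = int((outcomes["outcome_type"] == "Adoption").sum())

intakes_per_adoption = round(n_intakes / n_adopted, 2)  # incoming : adopted
adoption_share = round(n_adopted / n_intakes, 2)        # adopted : incoming

print(intakes_per_adoption, adoption_share)
```

The two numbers are reciprocals of each other, so stating which one is reported avoids ambiguity.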


<small>Animal Distribution - Meets 1/1</small>
### What is the distribution of the types of animals in the shelter?

I counted the number of intake records for each animal type in the `animal_type` column of the `aac_in` **DataFrame**.

In [32]:
aac_in.animal_type.value_counts()

animal_type
Dog          45743
Cat          29659
Other         4434
Bird           342
Livestock        9
Name: count, dtype: int64
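
Raw counts can be easier to compare as proportions. The sketch below rebuilds a Series from the counts printed above and converts it to shares of the total; on the full DataFrame, `value_counts(normalize=True)` should produce the same proportions directly.

```python
import pandas as pd

# Counts copied from the output above.
counts = pd.Series(
    {"Dog": 45743, "Cat": 29659, "Other": 4434, "Bird": 342, "Livestock": 9},
    name="count",
)

# Share of all intake records per animal type, as percentages.
shares = counts / counts.sum()
print((shares * 100).round(2))
```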