***Sam Cressman Capstone Project: Shelter Animal Outcomes***

# Summary Overview:

[Capstone Overview](#Capstone_Overview) <br>
[My Mission Statement](#Mission_Statement) <br>
[Capstone Data](#Capstone_Data) <br>
[Data Cleaning](#Data_Cleaning) <br>
[Exploratory Data Analysis (my main takeaways)](#Exploratory_Data_Analysis) <br>
[Modeling](#Modeling) <br>
[Tableau Story](#Tableau_Story) <br>
[Limitations and Next Steps](#Limitations_and_Next_Steps) <br>

<a id = "Capstone_Overview"></a>
# Capstone Overview

***Capstone inspiration:*** [Kaggle](https://www.kaggle.com/c/shelter-animal-outcomes)

"Every year, approximately 7.6 million companion animals end up in US shelters. Many animals are given up as unwanted by their owners, while others are picked up after getting lost or taken out of cruelty situations. Many of these animals find forever families to take them home, but just as many are not so lucky. 2.7 million dogs and cats are euthanized in the US every year. <br>

Using a dataset of intake information including breed, color, sex, and age from the Austin Animal Center, we're asking Kagglers to predict the outcome for each animal. <br>

We also believe this dataset can help us understand trends in animal outcomes. These insights could help shelters focus their energy on specific animals who need a little extra help finding a new home. We encourage you to publish your insights on Scripts so they are publicly accessible." <br>

"Annually over 90% of animals entering the center, are adopted, transferred to rescue or returned to their owners. The Outcomes data set reflects that Austin, TX. is the largest "No Kill" city in the country." ([Austin Animal Center](https://data.austintexas.gov/Health-and-Community-Services/Austin-Animal-Center-Outcomes/9t4d-g238))

<a id = "Mission_Statement"></a>
# My Mission Statement:

Utilize machine learning and Tableau to help improve outcomes for shelter animals

<a id = "Capstone_Data"></a>
# Capstone Data

[Intake data (pulled 6/25/18)](https://data.austintexas.gov/Health-and-Community-Services/Austin-Animal-Center-Intakes/wter-evkm) <br>
[Outcome data (pulled 6/25/18)](https://data.austintexas.gov/Health-and-Community-Services/Austin-Animal-Center-Outcomes/9t4d-g238) <br>

<a id = "Data_Cleaning"></a>
# Data Cleaning

For more details: [Data Cleaning Notebook](Data_Cleaning_Animal_Shelter_Outcomes.ipynb)

***Main Takeaways***:

- Nearly every column required cleaning: it is critical to examine each column. <br>
<br>
- I merged animal intake data with animal outcome data on each animal's unique Animal ID to ensure that each animal had one intake and one outcome. <br>
<br>
- I broke out the DateTime columns (Intake Time, Outcome Time) by day, day of week, day of month, and year. <br>
<br>
- I created columns for Age at Intake, Age at Outcome, and Length of Time in Shelter (Days). Also, half of the name values were null: I created a new column "has_name" which held a 1 if the animal left the animal shelter with a name or 0 otherwise. <br> 
<br>
- Regarding Outcome Type, I combined "Died", "Disposal", "Missing", and "Relocate" into an "Other" bucket since these outcomes occurred very infrequently. I consider "Adoption", "Return to Owner", and "Transfer" to be positive outcomes, and "Euthanasia" and "Other" to be negative outcomes. Austin relies on [numerous "Transfer" partners](http://www.austintexas.gov/department/approved-partners) (usually animal specific) to reach its goal of being 90+% no-kill. <br>
<br>
- With over 2212 "unique" Breed combinations, I used a Count Vectorizer to find the most important breeds. Then I created dummy columns to reflect those breeds numerically. This process was very similar for Color (539 "unique" colors).
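
The feature-engineering steps above can be sketched in plain Python. This is a minimal illustration, not the notebook's code: the `clean_record` helper, the `%m/%d/%Y %H:%M` timestamp format, the derived column names other than `has_name`, and the two example rows are all assumptions made for the sketch.

```python
from datetime import datetime

# Outcome types treated as positive in this README; all others are negative.
POSITIVE_OUTCOMES = {"Adoption", "Return to Owner", "Transfer"}
# Infrequent outcome types folded into a single "Other" bucket.
RARE_OUTCOMES = {"Died", "Disposal", "Missing", "Relocate"}
# Assumed timestamp format; the real export may differ.
TIME_FORMAT = "%m/%d/%Y %H:%M"


def clean_record(record):
    """Derive engineered columns for one merged intake/outcome record."""
    intake = datetime.strptime(record["Intake Time"], TIME_FORMAT)
    outcome = datetime.strptime(record["Outcome Time"], TIME_FORMAT)

    outcome_type = record["Outcome Type"]
    if outcome_type in RARE_OUTCOMES:
        outcome_type = "Other"

    name = record.get("Name")
    return {
        "Animal ID": record["Animal ID"],
        "Outcome Type": outcome_type,
        # 1 for Adoption / Return to Owner / Transfer, 0 otherwise.
        "positive_outcome": int(outcome_type in POSITIVE_OUTCOMES),
        # 1 if a non-empty name is present, 0 if it is null or blank.
        "has_name": int(bool(name and name.strip())),
        "intake_day_of_week": intake.weekday(),  # Monday = 0
        "intake_day_of_month": intake.day,
        "intake_year": intake.year,
        # Whole days between intake and outcome.
        "days_in_shelter": (outcome - intake).days,
    }


# Hypothetical rows for illustration only (not taken from the Austin data).
rows = [
    {"Animal ID": "A000001", "Name": "Rex", "Outcome Type": "Adoption",
     "Intake Time": "06/01/2018 10:00", "Outcome Time": "06/11/2018 15:30"},
    {"Animal ID": "A000002", "Name": None, "Outcome Type": "Disposal",
     "Intake Time": "06/02/2018 09:00", "Outcome Time": "06/02/2018 17:00"},
]
cleaned = [clean_record(row) for row in rows]
```

In this sketch the second row ends up in the "Other" bucket with `has_name` set to 0, which mirrors how the rare outcome types and null names are described above.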

<a id = "Exploratory_Data_Analysis"></a>
# Exploratory Data Analysis

For more details: [Exploratory Data Analysis / Quick Visualizations Notebook](Graphing_EDA_Animal_Shelter_Outcomes.ipynb)

***Main Takeaways***:

- Dogs represent most Austin animal shelter intakes. <br>
<br>
- Most intakes are stray animals. <br>
<br>
- Overall, Austin animal shelter animals fare extremely well (see the 90+% no-kill goal noted above). However, "other" animals (e.g., bats, raccoons) do not fare well. <br>
<br>
- Focus on improving adoptions for pit bulls and domestic shorthair cats. <br>
<br>
- Color does not seem to be a critical determining factor in adoption rates. <br>
<br>
- Do not request euthanasia for your animal unless necessary. <br>
<br>
- Promote older animals as much as possible (especially cats). <br>
<br>
- Give every animal a name. <br>
<br>
- Expect an influx of animals during warmer months. <br>
<br>
- Host as many events as possible. <br>

<a id = "Modeling"></a>
# Modeling

For more details: [Modeling Notebook](Modeling_Animal_Shelter_Outcomes.ipynb)

***Main Takeaways:***

- Due to the 90+% no-kill goal of the Austin Animal Shelter System, baseline accuracy was 90%, and my Logistic Regression, Random Forest, and Neural Network all performed in the 95% range. This will be addressed further below in Limitations / Next Steps. <br>
<br>
- My model metric was accuracy (binary classification: positive outcome or negative outcome). <br>
<br>
- ***Top 10 "positive" Logistic Regression features (descending):***

    has_name <br>
    Sex upon Outcome_Spayed Female <br>
    Sex upon Outcome_Neutered Male <br>
    Intake Condition_Normal <br>
    Sex upon Intake_Intact Female <br>
    Intake Type_Stray <br>
    Sex upon Intake_Intact Male <br>
    Intake Condition_Nursing <br>
    Intake Condition_Pregnant <br>
    retriever <br>
<br>
- ***Bottom 10 "negative" Logistic Regression features (ascending)***:
 
    bat <br>
    Sex upon Outcome_Intact Male <br>
    raccoon <br>
    Sex upon Outcome_Intact Female <br>
    Intake Condition_Injured <br> 
    Intake Type_Wildlife <br>
    skunk <br>
    pit <br>
    Sex upon Intake_Neutered Male <br>
    Sex upon Intake_Spayed Female <br>
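
The comparison between baseline and model accuracy in the first takeaway can be sketched as follows. This is a minimal illustration under stated assumptions: the `baseline_accuracy` and `accuracy` helpers are hypothetical, and the labels and predictions are toy values chosen to mirror a 90% positive class, not results from the notebook.

```python
from collections import Counter


def baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most common class."""
    majority_count = Counter(labels).most_common(1)[0][1]
    return majority_count / len(labels)


def accuracy(labels, predictions):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


# Toy data: 1 = positive outcome, 0 = negative outcome (illustrative only).
labels = [1] * 18 + [0] * 2
# A hypothetical model that catches one of the two negative outcomes.
predictions = [1] * 18 + [0, 1]

baseline = baseline_accuracy(labels)
model = accuracy(labels, predictions)
print(f"baseline={baseline:.2f} model={model:.2f} lift={model - baseline:.2f}")
```

With a class balance this skewed, always predicting a positive outcome already scores 90%, so a model's accuracy is best read relative to that baseline.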

<a id = "Tableau_Story"></a>
# Tableau Story

[Tableau Story Link](https://public.tableau.com/profile/sam.cressman#!/vizhome/SamCressmanAustinAnimalShelterOutcomesCapstoneStory/AnimalShelterOutcomesVisualizations)

***Main Takeaways:***

- High-level visualization is critical to telling a story with data, regardless of the complexity of the problem you are attempting to solve. <br>
<br>
- Initially creating and viewing basic visualizations using Python can help to determine the stories the data can tell. <br>
<br>
- Clean and manipulate data using Python before bringing it into Tableau. <br>

<a id = "Limitations_and_Next_Steps"></a>
# Limitations and Next Steps

***Main Takeaways:***

- Austin has gold-standard animal shelter and partner systems due to its lofty no-kill goals: it would be great to compare Austin with a city that has fewer positive outcomes or less ambitious goals (open data for animal shelters is limited). <br>
<br>
- I would be curious to see the impact of animal size on Outcome Type, which could be explored further. <br>
<br>
- I would also be curious to examine shelter intakes and outcomes from a colder climate than Austin, Texas. <br>
<br>
- I noticed that the Austin Animal Center has a strong social media presence [(example)](https://www.instagram.com/austinanimalcenter/): it would be fascinating to understand social media's impact on driving animal adoptions.