## Introduction
Since 2017 I have been hearing more and more about pets being abandoned or lost, and I wondered whether abandonment is actually increasing or whether it is simply getting more coverage and being talked about more. Using the data I have gathered, I will look at whether any specific type of animal is being abandoned more often, and whether I can predict if there will be more or fewer abandonments this year than there were last year. I will use count plots and line plots to track how many pets were taken in or abandoned and whether certain parts of the year see more abandonments. With this I hope to identify the animals that see the worst of human behavior and find ways to stop it. Those predictions could then be used to create ads or to educate the public about how to stop this grave problem.

## Modules
I have imported numpy, seaborn, matplotlib, pandas, and re (Python's regular expression module) so that I can start exploring the data. I will use pandas for the table and any table methods. Seaborn and matplotlib will be used to graph different points and explore the data. The re module and numpy will mostly be used for any data cleaning that I find necessary.

In [4]:
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
import re

This dataset comes from a CSV file called animal-shelter-intakes-and-outcomes. It originally had 27650 rows and 23 columns. My first order of business will be to split the intake date into month, day, and year columns. The original dataset was used to track the different animals a shelter was taking in. One peculiar thing I found is that there are apparently pink wild animals with a secondary color of white.

In [5]:
df = pd.read_csv('animal-shelter-intakes-and-outcomes.csv')
df

| | Animal ID | Animal Name | Animal Type | Primary Color | Secondary Color | Sex | DOB | Intake Date | Intake Condition | Intake Type | ... | Crossing | Jurisdiction | Outcome Type | Outcome Subtype | latitude | longitude | intake_is_dead | outcome_is_dead | was_outcome_alive | geopoint |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | A627149 | *NICOLE | CAT | BLACK | | Female | 2019-03-29 | 2019-05-26 | NORMAL | STRAY | ... | 17000 BLK LANRELBROOK PL, CERRITOS, CA 90703 | CERRITOS | ADOPTION | PFE/PAWSHP | 33.868583 | -118.040415 | Alive on Intake | False | 1 | 33.8685835, -118.0404148 |
| 1 | A642976 | CHARLIE | DOG | CREAM | | Neutered | 2019-12-28 | 2022-04-20 | NORMAL | OWNER SURRENDER | ... | 17000 BLK LAURELBROOK PL, CERRITOS, CA 90703 | CERRITOS | TRANSFER | | 33.868583 | -118.040415 | Alive on Intake | False | 1 | 33.8685835, -118.0404148 |
| 2 | A698770 | *MISHA | CAT | GRAY | | Female | 2023-04-27 | 2023-05-13 | UNDER AGE/WEIGHT | STRAY | ... | 17000 BLK MAURICE AVE CER 90703 | CERRITOS | HOMEFIRST | | 33.876770 | -118.055554 | Alive on Intake | False | 1 | 33.87677, -118.0555541 |
| 3 | A672338 | | WILD | PINK | WHITE | Unknown | | 2022-03-26 | UNDER AGE/WEIGHT | WILDLIFE | ... | 17000 BLK STARK AVE, CERRITOS, CA 90703 | CERRITOS | EUTHANASIA | UNDRAGE/WT | 33.875791 | -118.068383 | Alive on Intake | True | 0 | 33.8757908, -118.0683835 |
| 4 | A627949 | JASPER | DOG | TAN | WHITE | Neutered | 2015-06-08 | 2019-06-08 | NORMAL | STRAY | ... | 17000 BLK STORK AVE, CERRITOS, CA 90703 | CERRITOS | ADOPTION | WALKIN | 33.872455 | -118.059591 | Alive on Intake | False | 1 | 33.8724552, -118.0595907 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 27645 | A702835 | *NESTLE | DOG | BLACK | WHITE | Neutered | 2023-02-14 | 2023-07-13 | NORMAL | OWNER SURRENDER | ... | WILLOW/SANTA FE, LONG BEACH, CA 90810 | LONG BEACH | ADOPTION | | 33.804288 | -118.215114 | Alive on Intake | False | 1 | 33.804288, -118.215114 |
| 27646 | A699371 | *GIZMO | DOG | WHITE | TAN | Male | 2020-05-25 | 2023-05-25 | NORMAL | STRAY | ... | WOODRUFF | LONG BEACH | RESCUE | VARGASRANC | 33.841446 | -118.116203 | Alive on Intake | False | 1 | 33.841446, -118.1162032 |
| 27647 | A630046 | | DOG | TAN | | Female | 2014-07-17 | 2019-07-17 | NORMAL | STRAY | ... | WOODRUFF AVE / E WARDLOW RD, LONG BEACH, CA 90808 | LONG BEACH | TRANSFER | SPCALA | 33.818753 | -118.116312 | Alive on Intake | False | 1 | 33.8187528, -118.1163116 |
| 27648 | A479602 | SPIKE#2 | DOG | TAN | | Male | 2006-05-07 | 2019-07-13 | NORMAL | STRAY | ... | WOODRUFF AVE / E WARDLOW RD, LONG BEACH, CA 90808 | LONG BEACH | RETURN TO OWNER | | 33.818753 | -118.116312 | Alive on Intake | False | 1 | 33.8187528, -118.1163116 |
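To look more closely at the pink wild animals mentioned above, a boolean filter on `Animal Type` and `Primary Color` is one option. This is only a sketch: the three-row `sample` frame is a hypothetical stand-in built from rows shown in the output, and in the notebook the same mask would be applied to `df`.

```python
import pandas as pd

# Hypothetical three-row sample copied from the displayed output;
# in the notebook the same filter would run on `df` itself.
sample = pd.DataFrame({
    "Animal ID": ["A627149", "A672338", "A627949"],
    "Animal Type": ["CAT", "WILD", "DOG"],
    "Primary Color": ["BLACK", "PINK", "TAN"],
    "Secondary Color": [None, "WHITE", "WHITE"],
    "Intake Condition": ["NORMAL", "UNDER AGE/WEIGHT", "NORMAL"],
})

# Boolean mask: wild animals whose primary color is recorded as pink.
is_pink_wild = (sample["Animal Type"] == "WILD") & (sample["Primary Color"] == "PINK")
pink_wild = sample[is_pink_wild]
print(pink_wild)
```

Checking columns such as `Intake Condition` on the matching rows may help explain how these records came to be labeled pink.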


## Cleaning/Processing

In [11]:
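# Pull the year, month, and day out of each 'YYYY-MM-DD' intake date string.
# Raw strings (r'...') keep escapes like \d from raising invalid-escape warnings.
df['Year'] = [int(re.findall(r'[0-9]{4}', i)[0]) for i in df['Intake Date']]
df['Month'] = [int(re.findall(r'\d{4}-(\d{2})-\d{2}', i)[0]) for i in df['Intake Date']]
df['Day'] = [int(re.findall(r'\d{4}-\d{2}-(\d{2})', i)[0]) for i in df['Intake Date']]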
df

| | Animal ID | Animal Name | Animal Type | Primary Color | Secondary Color | Sex | DOB | Intake Date | Intake Condition | Intake Type | ... | Outcome Subtype | latitude | longitude | intake_is_dead | outcome_is_dead | was_outcome_alive | geopoint | Year | Month | Day |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | A627149 | *NICOLE | CAT | BLACK | | Female | 2019-03-29 | 2019-05-26 | NORMAL | STRAY | ... | PFE/PAWSHP | 33.868583 | -118.040415 | Alive on Intake | False | 1 | 33.8685835, -118.0404148 | 2019 | 5 | 26 |
| 1 | A642976 | CHARLIE | DOG | CREAM | | Neutered | 2019-12-28 | 2022-04-20 | NORMAL | OWNER SURRENDER | ... | | 33.868583 | -118.040415 | Alive on Intake | False | 1 | 33.8685835, -118.0404148 | 2022 | 4 | 20 |
| 2 | A698770 | *MISHA | CAT | GRAY | | Female | 2023-04-27 | 2023-05-13 | UNDER AGE/WEIGHT | STRAY | ... | | 33.876770 | -118.055554 | Alive on Intake | False | 1 | 33.87677, -118.0555541 | 2023 | 5 | 13 |
| 3 | A672338 | | WILD | PINK | WHITE | Unknown | | 2022-03-26 | UNDER AGE/WEIGHT | WILDLIFE | ... | UNDRAGE/WT | 33.875791 | -118.068383 | Alive on Intake | True | 0 | 33.8757908, -118.0683835 | 2022 | 3 | 26 |
| 4 | A627949 | JASPER | DOG | TAN | WHITE | Neutered | 2015-06-08 | 2019-06-08 | NORMAL | STRAY | ... | WALKIN | 33.872455 | -118.059591 | Alive on Intake | False | 1 | 33.8724552, -118.0595907 | 2019 | 6 | 8 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 27645 | A702835 | *NESTLE | DOG | BLACK | WHITE | Neutered | 2023-02-14 | 2023-07-13 | NORMAL | OWNER SURRENDER | ... | | 33.804288 | -118.215114 | Alive on Intake | False | 1 | 33.804288, -118.215114 | 2023 | 7 | 13 |
| 27646 | A699371 | *GIZMO | DOG | WHITE | TAN | Male | 2020-05-25 | 2023-05-25 | NORMAL | STRAY | ... | VARGASRANC | 33.841446 | -118.116203 | Alive on Intake | False | 1 | 33.841446, -118.1162032 | 2023 | 5 | 25 |
| 27647 | A630046 | | DOG | TAN | | Female | 2014-07-17 | 2019-07-17 | NORMAL | STRAY | ... | SPCALA | 33.818753 | -118.116312 | Alive on Intake | False | 1 | 33.8187528, -118.1163116 | 2019 | 7 | 17 |
| 27648 | A479602 | SPIKE#2 | DOG | TAN | | Male | 2006-05-07 | 2019-07-13 | NORMAL | STRAY | ... | | 33.818753 | -118.116312 | Alive on Intake | False | 1 | 33.8187528, -118.1163116 | 2019 | 7 | 13 |
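As an alternative to the regex approach used in this cell (not the method the notebook actually runs), the same three columns could be derived by parsing the dates with `pd.to_datetime` and reading the parts through the `.dt` accessor. The sketch below assumes every intake date follows the `YYYY-MM-DD` pattern seen in the output; `sample` is a hypothetical stand-in for `df`.

```python
import pandas as pd

# Intake dates copied from the first three displayed rows (stand-in for df).
sample = pd.DataFrame({"Intake Date": ["2019-05-26", "2022-04-20", "2023-05-13"]})

# Parse the strings once, then read each component from the datetime values.
intake = pd.to_datetime(sample["Intake Date"], format="%Y-%m-%d")
sample["Year"] = intake.dt.year
sample["Month"] = intake.dt.month
sample["Day"] = intake.dt.day
print(sample)
```

One practical difference is that `pd.to_datetime` raises an error on a malformed date by default, whereas the regex version would fail with an `IndexError` only when no match is found at all.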


In [15]:
df.Year.value_counts()

Year
2017    4884
2023    4614
2019    4301
2018    4134
2022    3667
2020    2975
2021    2874
2024     201
Name: count, dtype: int64
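Because `value_counts()` sorts by frequency, the years above are not in chronological order. The sketch below re-sorts the same counts by year and computes the year-over-year change. It assumes 2024 is a partial year (only 201 rows) and drops it before comparing; the numbers are copied from the output above.

```python
import pandas as pd

# Counts copied from the value_counts() output above.
year_counts = pd.Series(
    {2017: 4884, 2023: 4614, 2019: 4301, 2018: 4134,
     2022: 3667, 2020: 2975, 2021: 2874, 2024: 201},
    name="count",
)

# Sort by year rather than by count so the trend reads in time order.
by_year = year_counts.sort_index()

# Assumption: 2024 is incomplete, so exclude it from the comparison.
full_years = by_year.drop(2024)

# Difference between each year's intakes and the previous year's.
change = full_years.diff().dropna().astype(int)
print(change)
```

In the notebook, `df.Year.value_counts().sort_index()` would give the same chronological view directly.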

In [16]:
df.isna().sum()

Animal ID                0
Animal Name          11631
Animal Type              0
Primary Color            0
Secondary Color      14478
Sex                      0
DOB                   3507
Intake Date              0
Intake Condition         0
Intake Type              0
Intake Subtype         317
Reason for Intake    25758
Outcome Date           140
Crossing                 0
Jurisdiction             1
Outcome Type           142
Outcome Subtype       3384
latitude                 0
longitude                0
intake_is_dead           0
outcome_is_dead          0
was_outcome_alive        0
geopoint                 0
Year                     0
Month                    0
Day                      0
dtype: int64

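For now, the columns with null values can stay as they are, so I will not be removing any rows. The columns this analysis relies on (Intake Date, Animal Type, and the new Year, Month, and Day columns) have no missing values.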
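To put the null counts in proportion, they can be expressed as a percentage of all rows. This is a rough sketch using the counts printed above and the 27650-row total stated earlier; on the real table, `df.isna().mean() * 100` should give the same percentages directly.

```python
import pandas as pd

# Null counts copied from the isna().sum() output above (nonzero columns only).
null_counts = pd.Series({
    "Animal Name": 11631,
    "Secondary Color": 14478,
    "DOB": 3507,
    "Intake Subtype": 317,
    "Reason for Intake": 25758,
    "Outcome Date": 140,
    "Jurisdiction": 1,
    "Outcome Type": 142,
    "Outcome Subtype": 3384,
})
total_rows = 27650  # row count stated earlier in this notebook

# Percentage of rows missing each column, largest first.
missing_pct = (null_counts / total_rows * 100).round(1).sort_values(ascending=False)
print(missing_pct)
```

Columns that are mostly empty, such as `Reason for Intake`, are probably better left out of any later plots than filled in.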

In [17]:
# The smallest and largest intake years give the span of the data.
lowest_year = int(df['Year'].min())
highest_year = int(df['Year'].max())
lowest_year, highest_year

(2017, 2024)
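With the `Year` and `Month` columns in place, one possible next step toward the seasonal question from the introduction is a year-by-month table of intake counts. The sketch below uses `pd.crosstab` on a hypothetical nine-row sample whose values are copied from the rows displayed earlier; it is not the notebook's own analysis.

```python
import pandas as pd

# Year/Month values copied from the nine rows displayed earlier (stand-in for df).
sample = pd.DataFrame({
    "Year":  [2019, 2022, 2023, 2022, 2019, 2023, 2023, 2019, 2019],
    "Month": [5, 4, 5, 3, 6, 7, 5, 7, 7],
})

# Rows are years, columns are months, cells are the number of intakes.
monthly = pd.crosstab(sample["Year"], sample["Month"])
print(monthly)
```

On the full table, the result could be passed to a line plot (for example `monthly.T.plot()`, which draws one line per year) to compare seasons across years.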

### More to Come...

The GitHub repo is here: 