## Data Collection

Requesting data from the Austin Animal shelter API using requests.get

Both datasets contain around 160,000 rows. To retrieve all of them in a single request, append ```?$limit=160000``` to both API endpoints.
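
A hard-coded `$limit` has to be raised by hand as the datasets grow. As a hedged alternative, the sketch below pages through an endpoint with the Socrata `$limit`, `$offset` and `$order` query parameters until a short page comes back. The function name `fetch_all_rows`, the `get_page` hook and the page size are assumptions made for illustration and are not part of the original notebook.

```python
from typing import Callable, Optional


def fetch_all_rows(
    url: str,
    page_size: int = 50_000,
    get_page: Optional[Callable[[str, dict], list]] = None,
) -> list:
    """Collect every row from an endpoint by paging with $limit and $offset."""
    if get_page is None:
        # Default fetcher: a real HTTP call through requests.
        import requests

        def get_page(page_url: str, params: dict) -> list:
            response = requests.get(page_url, params=params, timeout=60)
            # Raise on 4xx/5xx instead of silently parsing an error body.
            response.raise_for_status()
            return response.json()

    rows: list = []
    offset = 0
    while True:
        # $order keeps the row order stable from one page to the next.
        params = {"$limit": page_size, "$offset": offset, "$order": ":id"}
        page = get_page(url, params)
        rows.extend(page)
        if len(page) < page_size:  # a short page means the end was reached
            return rows
        offset += page_size
```

Under these assumptions, `data_set1 = fetch_all_rows('https://data.austintexas.gov/resource/wter-evkm.json')` would stand in for the single large request, and `len(data_set1)` gives a quick check that the expected number of rows arrived.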

Requesting data from the intake API and storing the parsed JSON response (a list of dictionaries, one per row)

In [45]:
import requests 
import pandas as pd

response = requests.get('https://data.austintexas.gov/resource/wter-evkm.json?$limit=160000')
data_set1 = response.json()
data_set1

[{'animal_id': 'A786884',
  'name': '*Brock',
  'datetime': '2019-01-03T16:19:00.000',
  'datetime2': '2019-01-03T16:19:00.000',
  'found_location': '2501 Magin Meadow Dr in Austin (TX)',
  'intake_type': 'Stray',
  'intake_condition': 'Normal',
  'animal_type': 'Dog',
  'sex_upon_intake': 'Neutered Male',
  'age_upon_intake': '2 years',
  'breed': 'Beagle Mix',
  'color': 'Tricolor'},
 {'animal_id': 'A706918',
  'name': 'Belle',
  'datetime': '2015-07-05T12:59:00.000',
  'datetime2': '2015-07-05T12:59:00.000',
  'found_location': '9409 Bluegrass Dr in Austin (TX)',
  'intake_type': 'Stray',
  'intake_condition': 'Normal',
  'animal_type': 'Dog',
  'sex_upon_intake': 'Spayed Female',
  'age_upon_intake': '8 years',
  'breed': 'English Springer Spaniel',
  'color': 'White/Liver'},
 {'animal_id': 'A724273',
  'name': 'Runster',
  'datetime': '2016-04-14T18:43:00.000',
  'datetime2': '2016-04-14T18:43:00.000',
  'found_location': '2818 Palomino Trail in Austin (TX)',
  'intake_type': 'Str

Requesting data from the outcome API and storing the parsed JSON response

In [15]:
response2 = requests.get('https://data.austintexas.gov/resource/9t4d-g238.json?$limit=160000')
data_set2 = response2.json()
data_set2

[{'animal_id': 'A794011',
  'name': 'Chunk',
  'datetime': '2019-05-08T18:20:00.000',
  'monthyear': '2019-05-08T18:20:00.000',
  'date_of_birth': '2017-05-02T00:00:00.000',
  'outcome_type': 'Rto-Adopt',
  'animal_type': 'Cat',
  'sex_upon_outcome': 'Neutered Male',
  'age_upon_outcome': '2 years',
  'breed': 'Domestic Shorthair Mix',
  'color': 'Brown Tabby/White'},
 {'animal_id': 'A776359',
  'name': 'Gizmo',
  'datetime': '2018-07-18T16:02:00.000',
  'monthyear': '2018-07-18T16:02:00.000',
  'date_of_birth': '2017-07-12T00:00:00.000',
  'outcome_type': 'Adoption',
  'animal_type': 'Dog',
  'sex_upon_outcome': 'Neutered Male',
  'age_upon_outcome': '1 year',
  'breed': 'Chihuahua Shorthair Mix',
  'color': 'White/Brown'},
 {'animal_id': 'A821648',
  'datetime': '2020-08-16T11:38:00.000',
  'monthyear': '2020-08-16T11:38:00.000',
  'date_of_birth': '2019-08-16T00:00:00.000',
  'outcome_type': 'Euthanasia',
  'animal_type': 'Other',
  'sex_upon_outcome': 'Unknown',
  'age_upon_outcome

### Converting JSON data to a dataframe using pandas

In [44]:
# dataframe(df) for intake data
animal_intake_df = pd.DataFrame(data_set1) 

# summarising the columns, non-null counts and dtypes - use .head() instead to preview the first rows
animal_intake_df.info()



<class 'pandas.core.frame.DataFrame'>
RangeIndex: 160 entries, 0 to 159
Data columns (total 12 columns):
 #   Column            Non-Null Count  Dtype 
---  ------            --------------  ----- 
 0   animal_id         160 non-null    object
 1   name              103 non-null    object
 2   datetime          160 non-null    object
 3   datetime2         160 non-null    object
 4   found_location    160 non-null    object
 5   intake_type       160 non-null    object
 6   intake_condition  160 non-null    object
 7   animal_type       160 non-null    object
 8   sex_upon_intake   160 non-null    object
 9   age_upon_intake   160 non-null    object
 10  breed             160 non-null    object
 11  color             160 non-null    object
dtypes: object(12)
memory usage: 15.1+ KB
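
The `name` column above has fewer non-null values than the other columns because some records in the JSON have no `name` key at all. The sketch below, built on two made-up records (not rows from the dataset), shows how `pd.DataFrame` handles a list of dictionaries with differing keys: the columns are the union of the keys and the gaps become `NaN`.

```python
import pandas as pd

# Two made-up records shaped like the API output; the second has no 'name' key.
records = [
    {"animal_id": "A000001", "name": "Example", "animal_type": "Dog"},
    {"animal_id": "A000002", "animal_type": "Other"},
]

sample_df = pd.DataFrame(records)

# The columns are the union of all keys; the missing 'name' is filled with NaN.
print(sample_df)

# Count the rows without a name, and compare the row count with the input length.
print(sample_df["name"].isna().sum())
print(len(sample_df) == len(records))
```

Comparing `len(animal_intake_df)` with the number of rows requested from the API is a similar sanity check on the real data.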


In [26]:
# dataframe (df) for outcome data

animal_outcome_df = pd.DataFrame(data_set2) 

animal_outcome_df.head(5)


Unnamed: 0,animal_id,name,datetime,monthyear,date_of_birth,outcome_type,animal_type,sex_upon_outcome,age_upon_outcome,breed,color,outcome_subtype
0,A794011,Chunk,2019-05-08T18:20:00.000,2019-05-08T18:20:00.000,2017-05-02T00:00:00.000,Rto-Adopt,Cat,Neutered Male,2 years,Domestic Shorthair Mix,Brown Tabby/White,
1,A776359,Gizmo,2018-07-18T16:02:00.000,2018-07-18T16:02:00.000,2017-07-12T00:00:00.000,Adoption,Dog,Neutered Male,1 year,Chihuahua Shorthair Mix,White/Brown,
2,A821648,,2020-08-16T11:38:00.000,2020-08-16T11:38:00.000,2019-08-16T00:00:00.000,Euthanasia,Other,Unknown,1 year,Raccoon,Gray,
3,A720371,Moose,2016-02-13T17:59:00.000,2016-02-13T17:59:00.000,2015-10-08T00:00:00.000,Adoption,Dog,Neutered Male,4 months,Anatol Shepherd/Labrador Retriever,Buff,
4,A674754,,2014-03-18T11:47:00.000,2014-03-18T11:47:00.000,2014-03-12T00:00:00.000,Transfer,Cat,Intact Male,6 days,Domestic Shorthair Mix,Orange Tabby,Partner


### Merging both datasets to retrieve the rows that contain both intake and outcome information

### Cleaning columns

Each dataset contains two columns that store the same datetime value, so we can drop both and keep a single column holding only the date.
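
There are two common ways to reduce a timestamp to a date in pandas, and they leave different dtypes behind. The sketch below uses two made-up timestamps in the API's format to compare them. It illustrates the trade-off and does not change the approach used in this notebook.

```python
import pandas as pd

# Made-up timestamps in the same ISO format the API returns.
sample = pd.DataFrame(
    {"datetime": ["2019-01-03T16:19:00.000", "2015-07-05T12:59:00.000"]}
)

parsed = pd.to_datetime(sample["datetime"])

# .dt.date returns Python date objects, so the resulting column has dtype object.
as_date = parsed.dt.date

# .dt.normalize() sets the time to midnight but keeps a datetime64 dtype,
# so date arithmetic and the .dt accessor remain available afterwards.
as_midnight = parsed.dt.normalize()

print(as_date.dtype)
print(as_midnight.dtype)
```

Keeping a datetime64 column can be convenient if later steps need to subtract dates, for example to compute the length of a shelter stay.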

#### Cleaning the intake dataset

In [47]:
# checking the datatypes of the columns
animal_intake_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 160 entries, 0 to 159
Data columns (total 12 columns):
 #   Column            Non-Null Count  Dtype 
---  ------            --------------  ----- 
 0   animal_id         160 non-null    object
 1   name              103 non-null    object
 2   datetime          160 non-null    object
 3   datetime2         160 non-null    object
 4   found_location    160 non-null    object
 5   intake_type       160 non-null    object
 6   intake_condition  160 non-null    object
 7   animal_type       160 non-null    object
 8   sex_upon_intake   160 non-null    object
 9   age_upon_intake   160 non-null    object
 10  breed             160 non-null    object
 11  color             160 non-null    object
dtypes: object(12)
memory usage: 15.1+ KB


In [50]:
# Removing the time from the datetime column, so we are left with just the date

# the datetime column has dtype object (strings), so we use the pandas to_datetime function to parse it into datetime values
# then use the .dt.date accessor to keep only the date part
intake_date = pd.to_datetime(animal_intake_df['datetime']).dt.date

# print(intake_date) - this will show the output which is just the date

# using the pandas drop function to drop the 'datetime' and 'datetime2' columns from the intake dataset
intake_df = animal_intake_df.drop(['datetime', 'datetime2'], axis='columns')

# inserting the new column created above (intake_date) at index 2 in the dataset
intake_df.insert(2, 'intake_date', intake_date)

# returning first 5 rows, just to show how the dataframe now looks
intake_df.head(5)


Unnamed: 0,animal_id,name,intake_date,found_location,intake_type,intake_condition,animal_type,sex_upon_intake,age_upon_intake,breed,color
0,A786884,*Brock,2019-01-03,2501 Magin Meadow Dr in Austin (TX),Stray,Normal,Dog,Neutered Male,2 years,Beagle Mix,Tricolor
1,A706918,Belle,2015-07-05,9409 Bluegrass Dr in Austin (TX),Stray,Normal,Dog,Spayed Female,8 years,English Springer Spaniel,White/Liver
2,A724273,Runster,2016-04-14,2818 Palomino Trail in Austin (TX),Stray,Normal,Dog,Intact Male,11 months,Basenji Mix,Sable/White
3,A857105,Johnny Ringo,2022-05-12,4404 Sarasota Drive in Austin (TX),Public Assist,Normal,Cat,Neutered Male,2 years,Domestic Shorthair,Orange Tabby
4,A682524,Rio,2014-06-29,800 Grove Blvd in Austin (TX),Stray,Normal,Dog,Neutered Male,4 years,Doberman Pinsch/Australian Cattle Dog,Tan/Gray


#### Cleaning the outcome dataset

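Applying the same steps as above to the outcome dataset: convert `datetime` to a date, drop the redundant columns, and insert the new date column.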

In [51]:
outcome_date = pd.to_datetime(animal_outcome_df['datetime']).dt.date

# this dataset contains the 'datetime' and 'monthyear' column which both can be removed
outcome_df = animal_outcome_df.drop(['datetime', 'monthyear'], axis='columns')

outcome_df.insert(2, 'outcome_date', outcome_date)

outcome_df.head(5)

Unnamed: 0,animal_id,name,outcome_date,date_of_birth,outcome_type,animal_type,sex_upon_outcome,age_upon_outcome,breed,color,outcome_subtype
0,A794011,Chunk,2019-05-08,2017-05-02T00:00:00.000,Rto-Adopt,Cat,Neutered Male,2 years,Domestic Shorthair Mix,Brown Tabby/White,
1,A776359,Gizmo,2018-07-18,2017-07-12T00:00:00.000,Adoption,Dog,Neutered Male,1 year,Chihuahua Shorthair Mix,White/Brown,
2,A821648,,2020-08-16,2019-08-16T00:00:00.000,Euthanasia,Other,Unknown,1 year,Raccoon,Gray,
3,A720371,Moose,2016-02-13,2015-10-08T00:00:00.000,Adoption,Dog,Neutered Male,4 months,Anatol Shepherd/Labrador Retriever,Buff,
4,A674754,,2014-03-18,2014-03-12T00:00:00.000,Transfer,Cat,Intact Male,6 days,Domestic Shorthair Mix,Orange Tabby,Partner


In [52]:
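# merging the cleaned datasets
# with no 'on' argument, merge() performs an inner join on every column name the two dataframes share
# (here: animal_id, name, animal_type, breed and color)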

merged_df = intake_df.merge(outcome_df)
merged_df

Unnamed: 0,animal_id,name,intake_date,found_location,intake_type,intake_condition,animal_type,sex_upon_intake,age_upon_intake,breed,color,outcome_date,date_of_birth,outcome_type,sex_upon_outcome,age_upon_outcome,outcome_subtype
0,A786884,*Brock,2019-01-03,2501 Magin Meadow Dr in Austin (TX),Stray,Normal,Dog,Neutered Male,2 years,Beagle Mix,Tricolor,2019-01-08,2017-01-03T00:00:00.000,Transfer,Neutered Male,2 years,Partner
1,A706918,Belle,2015-07-05,9409 Bluegrass Dr in Austin (TX),Stray,Normal,Dog,Spayed Female,8 years,English Springer Spaniel,White/Liver,2015-07-05,2007-07-05T00:00:00.000,Return to Owner,Spayed Female,8 years,
2,A724273,Runster,2016-04-14,2818 Palomino Trail in Austin (TX),Stray,Normal,Dog,Intact Male,11 months,Basenji Mix,Sable/White,2016-04-21,2015-04-17T00:00:00.000,Return to Owner,Neutered Male,1 year,
3,A857105,Johnny Ringo,2022-05-12,4404 Sarasota Drive in Austin (TX),Public Assist,Normal,Cat,Neutered Male,2 years,Domestic Shorthair,Orange Tabby,2022-05-12,2020-05-12T00:00:00.000,Transfer,Neutered Male,2 years,Partner
4,A682524,Rio,2014-06-29,800 Grove Blvd in Austin (TX),Stray,Normal,Dog,Neutered Male,4 years,Doberman Pinsch/Australian Cattle Dog,Tan/Gray,2014-07-02,2010-06-29T00:00:00.000,Return to Owner,Neutered Male,4 years,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
201,A787448,Watermelon,2019-04-01,12900 Dessau in Travis (TX),Stray,Normal,Dog,Intact Female,3 years,Labrador Retriever Mix,Brown Tiger/White,2019-11-14,2016-01-13T00:00:00.000,Adoption,Spayed Female,3 years,
202,A787448,Watermelon,2019-04-01,12900 Dessau in Travis (TX),Stray,Normal,Dog,Intact Female,3 years,Labrador Retriever Mix,Brown Tiger/White,2019-01-15,2016-01-13T00:00:00.000,Return to Owner,Intact Female,3 years,
203,A787448,Watermelon,2019-04-01,12900 Dessau in Travis (TX),Stray,Normal,Dog,Intact Female,3 years,Labrador Retriever Mix,Brown Tiger/White,2019-04-04,2016-01-13T00:00:00.000,Return to Owner,Intact Female,3 years,
204,A787448,Watermelon,2019-04-01,12900 Dessau in Travis (TX),Stray,Normal,Dog,Intact Female,3 years,Labrador Retriever Mix,Brown Tiger/White,2019-07-08,2016-01-13T00:00:00.000,Transfer,Spayed Female,3 years,Partner
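
In the merged output, animal `A787448` appears several times with the same intake date and different outcome dates. An animal with more than one shelter stay has several intake rows and several outcome rows, and the merge pairs each of them with every other. The sketch below uses made-up rows to show one possible way to pair each intake with the earliest outcome on or after it. Defining a stay this way is an assumption for illustration, not the method used in this notebook.

```python
import pandas as pd

# Made-up rows: one animal with two shelter stays.
intake = pd.DataFrame({
    "animal_id": ["A1", "A1"],
    "intake_date": pd.to_datetime(["2019-01-10", "2019-04-01"]),
    "intake_type": ["Stray", "Stray"],
})
outcome = pd.DataFrame({
    "animal_id": ["A1", "A1"],
    "outcome_date": pd.to_datetime(["2019-01-15", "2019-04-04"]),
    "outcome_type": ["Return to Owner", "Return to Owner"],
})

# Name the join key explicitly. Every intake row is paired with every
# outcome row of the same animal: 2 intakes x 2 outcomes = 4 rows.
pairs = intake.merge(outcome, on="animal_id", how="inner")

# Keep only outcomes on or after the intake, then the earliest one per intake.
valid = pairs[pairs["outcome_date"] >= pairs["intake_date"]]
stays = (
    valid.sort_values("outcome_date")
         .drop_duplicates(subset=["animal_id", "intake_date"], keep="first")
         .sort_values("intake_date")
         .reset_index(drop=True)
)

print(len(pairs), len(stays))
print(stays)
```

The comparison relies on both date columns having a datetime64 dtype, which is why the sample data is built with `pd.to_datetime`.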
