* Type - Type of animal (1 = Dog, 2 = Cat)
* Name - Name of pet (Empty if not named)
* Age - Age of pet when listed, in months
* Breed1 - Primary breed of pet (Refer to breed_labels.csv)
* Breed2 - Secondary breed of pet, if pet is of mixed breed (Refer to breed_labels.csv)
* Gender - Gender of pet (1 = Male, 2 = Female, 3 = Mixed, if profile represents group of pets)
* Color1 - Color 1 of pet (Refer to color_labels.csv)
* Color2 - Color 2 of pet (Refer to color_labels.csv)
* Color3 - Color 3 of pet (Refer to color_labels.csv)
* MaturitySize - Size at maturity (1 = Small, 2 = Medium, 3 = Large, 4 = Extra Large, 0 = Not Specified)
* FurLength - Fur length (1 = Short, 2 = Medium, 3 = Long, 0 = Not Specified)
* Vaccinated - Pet has been vaccinated (1 = Yes, 2 = No, 3 = Not Sure)
* Dewormed - Pet has been dewormed (1 = Yes, 2 = No, 3 = Not Sure)
* Sterilized - Pet has been spayed / neutered (1 = Yes, 2 = No, 3 = Not Sure)
* Health - Health Condition (1 = Healthy, 2 = Minor Injury, 3 = Serious Injury, 0 = Not Specified)
* Quantity - Number of pets represented in profile
* Fee - Adoption fee (0 = Free)
* State - State location in Malaysia (Refer to state_labels.csv)
* RescuerID - Unique hash ID of rescuer
* VideoAmt - Total uploaded videos for this pet
* Description - Profile write-up for this pet. The primary language used is English, with some in Malay or Chinese.
* PetID - Unique hash ID of pet profile
* PhotoAmt - Total uploaded photos for this pet
* AdoptionSpeed - Categorical speed of adoption:
    - 0 - Pet was adopted on the same day as it was listed.
    - 1 - Pet was adopted between 1 and 7 days (1st week) after being listed.
    - 2 - Pet was adopted between 8 and 30 days (1st month) after being listed.
    - 3 - Pet was adopted between 31 and 90 days (2nd & 3rd month) after being listed.
    - 4 - No adoption after 100 days of being listed. (There are no pets in this dataset that waited between 90 and 100 days).
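
The `AdoptionSpeed` bins above can be written out as a small function. This is only an illustrative sketch: `adoption_speed` and its `days_to_adoption` argument are hypothetical names, not part of the dataset or of this notebook, and `None` is assumed to stand for a pet that was still not adopted after 100 days.

```python
from typing import Optional


def adoption_speed(days_to_adoption: Optional[int]) -> int:
    """Map days between listing and adoption to the 0-4 category.

    Hypothetical helper: None is assumed to mean "not adopted
    after 100 days of being listed" (category 4).
    """
    if days_to_adoption is None:
        return 4
    if days_to_adoption < 0:
        raise ValueError("days_to_adoption cannot be negative")
    if days_to_adoption == 0:
        return 0  # adopted on the same day as listed
    if days_to_adoption <= 7:
        return 1  # 1st week
    if days_to_adoption <= 30:
        return 2  # 1st month
    if days_to_adoption <= 90:
        return 3  # 2nd and 3rd month
    # The data dictionary does not define a category for adoptions
    # after day 90, so this sketch refuses to guess.
    raise ValueError("no category defined for adoptions after day 90")


print([adoption_speed(d) for d in (0, 5, 20, 60, None)])
```

Raising on values above 90 keeps the sketch honest about the gap in the definition instead of silently assigning a category.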

In [67]:
import warnings
warnings.filterwarnings("ignore")

import pandas as pd
pd.set_option("display.max_columns", None)
import numpy as np
import pandas_profiling
import itertools

%matplotlib inline
import seaborn as sns
import matplotlib.pyplot as plt
sns.set_style("dark")
sns.set_context("talk")

import data_cleaning as clean
%reload_ext autoreload
%autoreload 2

In [68]:
!pwd

/Users/valmadrid/DataScienceBootcamp/Projects/Final Project/Predicting-PetFinder-Adoption-Rate


In [72]:
pets = pd.read_csv("dataset/petfinder-adoption-prediction/train/train.csv")
pets.columns = pets.columns.map(lambda x: x.lower())
pets.head()

Unnamed: 0,type,name,age,breed1,breed2,gender,color1,color2,color3,maturitysize,furlength,vaccinated,dewormed,sterilized,health,quantity,fee,state,rescuerid,videoamt,description,petid,photoamt,adoptionspeed
0,2,Nibble,3,299,0,1,1,7,0,1,1,2,2,2,1,1,100,41326,8480853f516546f6cf33aa88cd76c379,0,Nibble is a 3+ month old ball of cuteness. He ...,86e1089a3,1.0,2
1,2,No Name Yet,1,265,0,1,1,2,0,2,2,3,3,3,1,1,0,41401,3082c7125d8fb66f7dd4bff4192c8b14,0,I just found it alone yesterday near my apartm...,6296e909a,2.0,0
2,1,Brisco,1,307,0,1,2,7,0,2,2,1,1,2,1,1,0,41326,fa90fa5b1ee11c86938398b60abc32cb,0,Their pregnant mother was dumped by her irresp...,3422e4906,7.0,3
3,1,Miko,4,307,0,2,1,2,0,2,1,1,1,2,1,1,150,41401,9238e4f44c71a75282e62f7136c6b240,0,"Good guard dog, very alert, active, obedience ...",5842f1ff5,8.0,2
4,1,Hunter,1,307,0,1,1,0,0,2,1,2,2,2,1,1,0,41326,95481e953f8aed9ec3d16fc4509537e8,0,This handsome yet cute boy is up for adoption....,850a43f90,3.0,2


In [73]:
# Paths are relative to the project root (the working directory shown by
# `!pwd` above), matching how train.csv is loaded.
breeds = pd.read_csv(
    "dataset/petfinder-adoption-prediction/breed_labels.csv"
)
colors = pd.read_csv(
    "dataset/petfinder-adoption-prediction/color_labels.csv"
)
states = pd.read_csv(
    "dataset/petfinder-adoption-prediction/state_labels.csv"
)

pets = clean.get_breed(pets, breeds, "breed1")
pets = clean.get_breed(pets, breeds, "breed2")
pets = clean.get_color(pets, colors, "color1")
pets = clean.get_color(pets, colors, "color2")
pets = clean.get_color(pets, colors, "color3")
pets = clean.get_state(pets, states, "state")
pets.head()

Unnamed: 0,type,name,age,breed1,breed2,gender,color1,color2,color3,maturitysize,furlength,vaccinated,dewormed,sterilized,health,quantity,fee,state,rescuerid,videoamt,description,petid,photoamt,adoptionspeed,breed1_desc,breed2_desc,color1_desc,color2_desc,color3_desc,state_desc
0,2,Golden Tabby Girl,1,299,266,2,2,3,5,2,1,2,1,2,1,1,50,41326,438a9bdce8ef4d5948fc40e422d34d0d,0,A cute tabby kitten looking for new home. She ...,dae13a47e,7.0,1,Tabby,Domestic Short Hair,Brown,Golden,Cream,Selangor
1,2,Ogen & Oyen,2,265,266,3,2,3,5,1,2,2,2,2,1,2,0,41326,15316b9044ea4f6a57f6cb4b45fc67aa,0,Ogen (male) & Oyen (female) are about 2 months...,718b14a08,2.0,1,Domestic Medium Hair,Domestic Short Hair,Brown,Golden,Cream,Selangor
2,2,Money,1,266,265,2,2,3,5,1,1,3,3,3,1,1,0,41326,5201c3e05aa6ff174b006c070e9a06b5,0,Please adopt this cute little kitten... I eant...,19982272a,4.0,4,Domestic Short Hair,Domestic Medium Hair,Brown,Golden,Cream,Selangor
3,2,Noah And Nellie,2,299,299,3,2,3,5,1,1,1,1,2,1,2,0,41326,2e4363f80f02bda5f2f8115b6ce6aef6,0,"Once again, thanks to petfinder, Noah and Nell...",4a590a1cc,5.0,1,Tabby,Tabby,Brown,Golden,Cream,Selangor
4,1,Karlo,2,307,307,1,2,3,5,2,2,2,2,2,1,1,0,41326,95481e953f8aed9ec3d16fc4509537e8,0,"Meet Karlo, brother of little Karla, the cutes...",f61c4cead,1.0,3,Mixed Breed,Mixed Breed,Brown,Golden,Cream,Selangor
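
The `data_cleaning` module is not shown in this notebook, so the following is only a guess at what a helper such as `clean.get_breed` might do: look up each ID in a label table and add a `<column>_desc` column. The function name `add_label_column` and the label column names `BreedID` and `BreedName` are assumptions made for illustration; the toy rows reuse breed IDs that appear in the output above.

```python
import pandas as pd


def add_label_column(df, labels, column, id_col, name_col):
    """Return a copy of df with a `<column>_desc` column.

    Hypothetical stand-in for the helpers in data_cleaning;
    IDs missing from `labels` (e.g. 0 for "no second breed")
    become NaN.
    """
    lookup = labels.set_index(id_col)[name_col]  # Series: ID -> name
    out = df.copy()
    out[f"{column}_desc"] = out[column].map(lookup)
    return out


# Toy data only; the real tables come from train.csv and breed_labels.csv.
toy_pets = pd.DataFrame({"breed1": [299, 307], "breed2": [0, 307]})
toy_breeds = pd.DataFrame({
    "BreedID": [299, 307],
    "BreedName": ["Tabby", "Mixed Breed"],
})

toy_pets = add_label_column(toy_pets, toy_breeds, "breed1", "BreedID", "BreedName")
toy_pets = add_label_column(toy_pets, toy_breeds, "breed2", "BreedID", "BreedName")
print(toy_pets)
```

Using `Series.map` keeps the original row order. The real helper appears to reorder rows (compare the two `head()` outputs above), so it may use a merge instead.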


In [74]:
pets = pets[[
    "petid", "type", "name", "age", "breed1", "breed1_desc", "breed2",
    "breed2_desc", "gender", "color1", "color1_desc", "color2", "color2_desc",
    "color3", "color3_desc", "maturitysize", "furlength", "vaccinated",
    "dewormed", "sterilized", "health", "quantity", "fee", "state",
    "state_desc", "rescuerid", "videoamt", "photoamt", "description",
    "adoptionspeed"
]]

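# Reassign rather than rename in place: `pets` is a column selection of the
# earlier frame, and in-place edits on it can trigger SettingWithCopyWarning.
pets = pets.rename(columns={
    "maturitysize": "maturity_size",
    "furlength": "fur_length",
    "rescuerid": "rescuer_id",
    "videoamt": "video_amt",
    "petid": "pet_id",
    "adoptionspeed": "adoption_speed"
})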

pets.head()

Unnamed: 0,pet_id,type,name,age,breed1,breed1_desc,breed2,breed2_desc,gender,color1,color1_desc,color2,color2_desc,color3,color3_desc,maturity_size,fur_length,vaccinated,dewormed,sterilized,health,quantity,fee,state,state_desc,rescuer_id,video_amt,photoamt,description,adoption_speed
0,dae13a47e,2,Golden Tabby Girl,1,299,Tabby,266,Domestic Short Hair,2,2,Brown,3,Golden,5,Cream,2,1,2,1,2,1,1,50,41326,Selangor,438a9bdce8ef4d5948fc40e422d34d0d,0,7.0,A cute tabby kitten looking for new home. She ...,1
1,718b14a08,2,Ogen & Oyen,2,265,Domestic Medium Hair,266,Domestic Short Hair,3,2,Brown,3,Golden,5,Cream,1,2,2,2,2,1,2,0,41326,Selangor,15316b9044ea4f6a57f6cb4b45fc67aa,0,2.0,Ogen (male) & Oyen (female) are about 2 months...,1
2,19982272a,2,Money,1,266,Domestic Short Hair,265,Domestic Medium Hair,2,2,Brown,3,Golden,5,Cream,1,1,3,3,3,1,1,0,41326,Selangor,5201c3e05aa6ff174b006c070e9a06b5,0,4.0,Please adopt this cute little kitten... I eant...,4
3,4a590a1cc,2,Noah And Nellie,2,299,Tabby,299,Tabby,3,2,Brown,3,Golden,5,Cream,1,1,1,1,2,1,2,0,41326,Selangor,2e4363f80f02bda5f2f8115b6ce6aef6,0,5.0,"Once again, thanks to petfinder, Noah and Nell...",1
4,f61c4cead,1,Karlo,2,307,Mixed Breed,307,Mixed Breed,1,2,Brown,3,Golden,5,Cream,2,2,2,2,2,1,1,0,41326,Selangor,95481e953f8aed9ec3d16fc4509537e8,0,1.0,"Meet Karlo, brother of little Karla, the cutes...",3


In [76]:
pets.to_csv("pets.csv", index = False)

In [75]:
petfinder_report = pets.profile_report(
    title="Petfinder Detailed Profiling Report",
    correlation_threshold_pearson=.9,
    sort="None")
petfinder_report.to_file(output_file="pets.html")
petfinder_report
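
When the full HTML report is more than is needed, a much lighter summary can be built with plain pandas. The sketch below is not a replacement for the profiling report; `quick_profile` is a hypothetical helper name, and the toy frame stands in for `pets`.

```python
import pandas as pd


def quick_profile(df):
    """Per-column dtype, missing-value count, and distinct-value count."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing": df.isna().sum(),
        "unique": df.nunique(),  # NaN is not counted as a value
    })


# Toy frame for illustration; in the notebook this would be `pets`.
toy = pd.DataFrame({"name": ["Nibble", None, "Brisco"], "age": [3, 1, 1]})
summary = quick_profile(toy)
print(summary)
```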

