### Data to the Rescue
FetchMaker’s mission is to match prospective dog owners with their perfect pet. They have been collecting data on their adoptable dogs, and our goal is to analyze some of that data to get valuable insights.


FetchMaker has provided us with data for a sample of dogs from their app, including the following attributes:
- `weight`, an integer representing how heavy a dog is in pounds
- `tail_length`, a float representing tail length in inches
- `age`, in years
- `color`, a string such as "brown" or "grey"
- `is_rescue`, a binary flag (1 if the dog is a rescue, 0 otherwise)

In [13]:
# Import libraries
import pandas as pd
import numpy as np

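# `binom_test` was removed in SciPy 1.12; fall back to `binomtest` there
try:
    from scipy.stats import binom_test
except ImportError:
    from scipy.stats import binomtest

    def binom_test(x, n, p):
        """Two-sided binomial test p-value, matching the old `binom_test` call."""
        return binomtest(x, n, p).pvalue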

# Load the data and print a sample
dogs = pd.read_csv("dog_data.csv")
print(dogs.head())

   is_rescue  weight  tail_length  age  color  likes_children  \
0          0       6         2.25    2  black               1   
1          0       4         5.36    4  black               0   
2          0       7         3.63    3  black               0   
3          0       5         0.19    2  black               0   
4          0       5         0.37    1  black               1   

   is_hypoallergenic      name      breed  
0                  0      Huey  chihuahua  
1                  0   Cherish  chihuahua  
2                  1     Becka  chihuahua  
3                  0     Addie  chihuahua  
4                  1  Beverlee  chihuahua  


FetchMaker estimates (based on historical data for all dogs) that 8% of dogs in their system are rescues. They would like to know if whippets are significantly more or less likely than other dogs to be a rescue.

In [14]:
# Save 'is_rescue' for whippets
whippet_rescue = dogs.is_rescue[dogs.breed == "whippet"]

# Count the number of 'rescues' and print the result
num_whippet_rescues = np.sum(whippet_rescue == 1)
print(num_whippet_rescues)

# Count the number of whippets in total
num_whippets = len(whippet_rescue)
# Print the result
print(num_whippets)

6
100
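
Before running a formal test, it can help to see how far 6 rescues out of 100 is from what the 8% estimate would predict. The sketch below is only a back-of-the-envelope check, assuming the number of rescues follows a binomial distribution with `n = 100` and `p = 0.08`; the variable names are illustrative and not part of the original notebook.

```python
import math

n = 100        # whippets in the sample (printed above)
observed = 6   # rescues among them (printed above)
p_null = 0.08  # FetchMaker's historical rescue rate

# Mean and standard deviation of a Binomial(n, p_null) count
expected = n * p_null
spread = math.sqrt(n * p_null * (1 - p_null))

# How many standard deviations the observed count sits from the expected one
z_like = (observed - expected) / spread

print(f"expected rescues under 8%: {expected:.1f}")
print(f"standard deviation: {spread:.2f}")
print(f"observed minus expected, in standard deviations: {z_like:.2f}")
```

The observed count falls less than one standard deviation below the expected count, which already hints that the difference may be within ordinary sampling variation.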


We now run a hypothesis test with the following null and alternative hypotheses:
- null: 8% of whippets are rescues
- alternative: the proportion of whippets that are rescues is different from 8%

For this test, we are focused on a single binary categorical variable, which indicates whether or not each whippet is a rescue. We want to compare the number of rescues in our sample to a hypothetical population-level proportion of 0.08. Therefore, we should use a binomial test.

In [15]:
# Run a binomial test 
pval = binom_test(num_whippet_rescues, num_whippets, .08)
# Print the result
print(pval)

0.5811780106238107
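
To make the p-value less of a black box, it can be reproduced by hand. Under the null hypothesis, the number of rescues among 100 whippets follows a Binomial(100, 0.08) distribution. The sketch below assumes the two-sided p-value is defined as the total probability of every outcome that is no more likely than the observed count of 6. The variable names are illustrative and not part of the original notebook.

```python
import numpy as np
from scipy.stats import binom

n, k, p_null = 100, 6, 0.08  # sample size, observed rescues, null proportion

# Probability of each possible rescue count (0..n) if the null is true
counts = np.arange(n + 1)
pmf = binom.pmf(counts, n, p_null)

# Two-sided p-value: add up every outcome that is at most as likely as the
# observed one (the small relative tolerance guards against float ties)
at_most_as_likely = pmf <= pmf[k] * (1 + 1e-7)
manual_pval = pmf[at_most_as_likely].sum()

print(manual_pval)
print(manual_pval < 0.05)  # True would mean "significant at the 0.05 level"
```

If the assumption about the definition holds, `manual_pval` should agree with the value printed above up to floating-point rounding.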


The p-value of about 0.58 is well above the significance threshold of 0.05, so we fail to reject the null hypothesis: the proportion of whippets that are rescues is not significantly different from 8%.