# Bread Hypothetical A/B Test Analysis

### Table of Contents
1. Import and explore data
2. Data cleaning and feature engineering

## 1. Import and explore data

1. Combine intellicron data with all prequals
    - Identify intellicron and non-intellicron records
2. Complete descriptive statistics to analyze the data
    - Confirm record count to ensure successful merge of data
    - Look at spread of prequal_dates
3. Feature engineering
    - checkout: yes or no
4. What was the impact from implementing Intellicron?
    - Compare the raw approval rates between the two groups
    - Compare the checkout rates between the two groups
5. Perform a hypothesis test to measure the A/B test results. What was the result of the hypothesis test? What is your confidence interval?
    - Follow and replicate the Udacity A/B test project to respond to this
6. Was test and control group assignment done correctly? How can you tell? Is there anything you would do differently next time?
7. Should we implement Intellicron? What other data, if any, would you need to make this determination?
8. Convert entire code to interactive
    - Design as main
    - Make interactive to show responses for all 4 questions
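
The hypothesis-test item in the plan above asks for a test result and a confidence interval. The notebook does not say which test will be used; the sketch below assumes a two-sided two-proportion z-test, a common choice for comparing approval or checkout rates between two groups. The function name `two_proportion_ztest` and the counts in the usage line are hypothetical placeholders, not results from this dataset.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ztest(success_a, n_a, success_b, n_b, alpha=0.05):
    """Two-sided z-test for the difference in proportions (group b minus group a)."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    diff = p_b - p_a

    # Pooled standard error under the null hypothesis p_a == p_b
    p_pool = (success_a + success_b) / (n_a + n_b)
    se_pool = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = diff / se_pool
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # Unpooled standard error for the confidence interval on the difference
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    ci = (diff - z_crit * se, diff + z_crit * se)
    return diff, z, p_value, ci


# Placeholder counts for illustration only (not taken from the data)
diff, z, p_value, ci = two_proportion_ztest(400, 1000, 450, 1000)
print(f"diff={diff:.3f}, z={z:.2f}, p={p_value:.4f}, CI=({ci[0]:.3f}, {ci[1]:.3f})")
```

With real data, the success counts would be the number of approved (or checked-out) prequals in each group and the `n` values the group sizes.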

In [7]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

prequals = pd.read_csv('prequals.csv')
intellicron_prequals = pd.read_csv('intellicron_prequals.csv')

print(f"There are {len(prequals)} records in prequals")
print(f"There are {len(intellicron_prequals)} records in intellicron_prequals")

There are 190976 records in prequals
There are 8609 records in intellicron_prequals


### Preview table data

In [8]:
prequals.head()

|   | Unnamed: 0 | prequal_id | checkout_id | prequal_date | completed_prequal | approved |
|---|---|---|---|---|---|---|
| 0 | 1 | 00081cb5-27bb-428a-bc53-076bacc7ad02 |  | 2019-06-22 | 1 | 0 |
| 1 | 2 | 00120f05-bf9d-40db-99d1-05a8cbd8aa0e |  | 2019-04-16 | 0 | 0 |
| 2 | 3 | 00139f6d-0af4-49c5-b26f-f9c999a06bcb | 6e01514b-6b6b-4510-8f6c-d994b871273c | 2019-05-09 | 1 | 1 |
| 3 | 4 | 0019854e-e4c2-42df-be79-59cf1a13ac89 |  | 2019-01-17 | 0 | 0 |
| 4 | 5 | 0019cb64-a44c-4320-b149-9c0167c714e9 |  | 2019-04-16 | 1 | 1 |


In [9]:
intellicron_prequals.head()

|   | Unnamed: 0 | prequal_id | assignment_date |
|---|---|---|---|
| 0 | 1 | 00081cb5-27bb-428a-bc53-076bacc7ad02 | 2019-06-22 |
| 1 | 2 | 0136e545-347d-4a3b-b964-0a4561d32567 | 2019-06-22 |
| 2 | 3 | 0144b69a-f364-4746-8600-f6d7334d8f3f | 2019-06-27 |
| 3 | 4 | 020412e9-c946-4252-94cf-85a4aaa75e2e | 2019-06-29 |
| 4 | 5 | 02f54404-995d-4eb0-b91b-40f40f68f506 | 2019-06-19 |


### Date range

In [10]:
# Date range for intellicron dataset
min(intellicron_prequals.assignment_date), max(intellicron_prequals.assignment_date)

('2019-06-16', '2019-06-30')

In [11]:
# Date range for prequals dataset
min(prequals.prequal_date), max(prequals.prequal_date)

('2019-01-01', '2019-06-30')
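
The two ranges differ: assignments in `intellicron_prequals` run from 2019-06-16 to 2019-06-30, while `prequals` covers 2019-01-01 to 2019-06-30. One option, which is an assumption here rather than something the notebook does, is to restrict the comparison to prequals dated inside the assignment window so both groups cover the same period. The sketch uses a toy frame (`prequals_demo`, hypothetical rows) in place of the real data.

```python
import pandas as pd

# Toy stand-in for the prequals table (hypothetical rows)
prequals_demo = pd.DataFrame({
    "prequal_id": ["a", "b", "c", "d"],
    "prequal_date": ["2019-01-17", "2019-06-16", "2019-06-22", "2019-06-30"],
})
prequals_demo["prequal_date"] = pd.to_datetime(prequals_demo["prequal_date"])

# Assignment window taken from the intellicron date range shown above
window_start = pd.Timestamp("2019-06-16")
window_end = pd.Timestamp("2019-06-30")

# between() includes both endpoints by default
in_window = prequals_demo["prequal_date"].between(window_start, window_end)
prequals_window = prequals_demo[in_window]
print(len(prequals_window))
```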

### Data types

In [15]:
intellicron_prequals.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8609 entries, 0 to 8608
Data columns (total 3 columns):
 #   Column           Non-Null Count  Dtype 
---  ------           --------------  ----- 
 0   Unnamed: 0       8609 non-null   int64 
 1   prequal_id       8609 non-null   object
 2   assignment_date  8609 non-null   object
dtypes: int64(1), object(2)
memory usage: 201.9+ KB


In [14]:
prequals.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 190976 entries, 0 to 190975
Data columns (total 6 columns):
 #   Column             Non-Null Count   Dtype 
---  ------             --------------   ----- 
 0   Unnamed: 0         190976 non-null  int64 
 1   prequal_id         190976 non-null  object
 2   checkout_id        49323 non-null   object
 3   prequal_date       190976 non-null  object
 4   completed_prequal  190976 non-null  int64 
 5   approved           190976 non-null  int64 
dtypes: int64(3), object(3)
memory usage: 8.7+ MB


### Null data

In [17]:
prequals.isnull().sum()

Unnamed: 0                0
prequal_id                0
checkout_id          141653
prequal_date              0
completed_prequal         0
approved                  0
dtype: int64

In [18]:
intellicron_prequals.isnull().sum()

Unnamed: 0         0
prequal_id         0
assignment_date    0
dtype: int64
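
`checkout_id` is the only column with nulls: 141653 of the 190976 prequals have none, which leaves 49323 with a checkout. That suggests deriving the yes/no `checkout` feature from whether `checkout_id` is present. A minimal sketch on toy rows (`demo` is hypothetical), assuming a missing `checkout_id` means no checkout took place:

```python
import numpy as np
import pandas as pd

# Toy stand-in for prequals (hypothetical rows)
demo = pd.DataFrame({
    "prequal_id": ["p1", "p2", "p3"],
    "checkout_id": [np.nan, "c-001", np.nan],
})

# 1 if the prequal has a checkout_id, 0 otherwise
demo["checkout"] = demo["checkout_id"].notna().astype(int)
print(demo)
```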

## 2. Data cleaning and feature engineering

In [16]:
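# Convert date columns to datetime
prequals['prequal_date'] = pd.to_datetime(prequals['prequal_date'])
intellicron_prequals['assignment_date'] = pd.to_datetime(intellicron_prequals['assignment_date'])
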

In [None]:
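# Assign a new column to identify intellicron prequals
intellicron_prequals['intellicron'] = 'intellicron'

# Add intellicron identifier to prequals (left merge keeps every prequal row)
prequals_new = prequals.merge(
    intellicron_prequals[['prequal_id', 'assignment_date', 'intellicron']],
    on='prequal_id', how='left')
# Prequals with no match were not assigned to intellicron
prequals_new['intellicron'] = prequals_new['intellicron'].fillna('non-intellicron')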
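
To confirm that the merge kept every prequal exactly once (the record-count check in the plan), one approach is a left merge with pandas' `indicator` and `validate` options. This is a sketch on toy frames; `prequals_demo`, `intellicron_demo`, and the `group` column are hypothetical names, and it assumes `prequal_id` is unique in both tables.

```python
import numpy as np
import pandas as pd

# Toy stand-ins for the two tables (hypothetical rows)
prequals_demo = pd.DataFrame({
    "prequal_id": ["a", "b", "c", "d"],
    "approved": [1, 0, 1, 1],
})
intellicron_demo = pd.DataFrame({"prequal_id": ["b", "d"]})

# validate raises MergeError if prequal_id is duplicated on either side;
# indicator adds a _merge column recording where each row matched
merged = prequals_demo.merge(
    intellicron_demo, on="prequal_id", how="left",
    indicator=True, validate="one_to_one",
)
merged["group"] = np.where(merged["_merge"] == "both",
                           "intellicron", "non-intellicron")

# Record counts: same number of rows as prequals, one match per intellicron row
assert len(merged) == len(prequals_demo)
assert (merged["group"] == "intellicron").sum() == len(intellicron_demo)

# Raw approval rate per group
approval_rates = merged.groupby("group")["approved"].mean()
print(approval_rates)
```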