# **Waze Project - An example of A/B Testing**  

**The purpose** of this project is to demonstrate knowledge of how to conduct a two-sample hypothesis test.

**The goal** is to apply descriptive statistics and hypothesis testing in Python.
<br/>

*This project has three parts:*

**Part 1:** Imports and data loading
* What data packages will be necessary for hypothesis testing?

**Part 2:** Conduct hypothesis testing
* How did computing descriptive statistics help us analyze our data?

* How did we formulate our null hypothesis and alternative hypothesis?

**Part 3:** Communicate insights with stakeholders

* What key business insight(s) emerged from the hypothesis test?

* What business recommendations can we propose based on our results?

<br/>

# **Data exploration and hypothesis testing**

### **Part 1. Imports and data loading**




Import packages and libraries needed to compute descriptive statistics and conduct a hypothesis test.

In [1]:
# Import any relevant packages or libraries
import pandas as pd
import scipy.stats as stats

Import the dataset.


In [2]:
# Load dataset into dataframe
df = pd.read_csv('waze_dataset.csv')

### **Part 2. Data exploration**

Use descriptive statistics to conduct exploratory data analysis (EDA).

**Note:** In the dataset, `device` is a categorical variable with the labels `iPhone` and `Android`.

For this analysis, we convert each label to an integer. The following code assigns `1` to `iPhone` users and `2` to `Android` users, and stores the result in a new column, `device_type`.

**Note:** Creating a new variable is ideal so that we don't overwrite original data.



1. Create a dictionary called `map_dictionary` that contains the class labels (`'Android'` and `'iPhone'`) for keys and the values we want to convert them to (`2` and `1`) as values.

2. Create a new column called `device_type` that is a copy of the `device` column.

3. Use the [`map()`](https://pandas.pydata.org/docs/reference/api/pandas.Series.map.html#pandas-series-map) method on the `device_type` series. Pass `map_dictionary` as its argument. Reassign the result back to the `device_type` series.
<br/><br/>
When we pass a dictionary to the `Series.map()` method, it replaces each value in the series that matches one of the dictionary's keys with the corresponding dictionary value. Values with no matching key become `NaN`.
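
As a minimal sketch of this behavior, the example below applies the same dictionary to a small made-up series. The series `devices` and the extra label `'Windows'` are assumptions for illustration only; they do not come from the Waze dataset.

```python
import pandas as pd

# Toy stand-in for the `device` column (values are made up for illustration)
devices = pd.Series(['Android', 'iPhone', 'Android', 'Windows'])

map_dictionary = {'iPhone': 1, 'Android': 2}

# Each value is looked up in the dictionary; 'Windows' has no matching key
device_codes = devices.map(map_dictionary)

# Count the entries that did not match any key and therefore became NaN
n_unmapped = int(device_codes.isna().sum())

print(device_codes.tolist())
print(n_unmapped)
```

Checking for `NaN` after mapping is a cheap way to confirm that every label in the column was covered by the dictionary.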

In [3]:
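# Create `map_dictionary`: device label -> integer code
map_dictionary = {'iPhone': 1, 'Android': 2}
# Map the labels into a new `device_type` column, leaving `device` unchanged
df['device_type'] = df['device'].map(map_dictionary)
# Display the new column
df['device_type']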


0        2
1        1
2        2
3        1
4        2
        ..
14994    1
14995    2
14996    1
14997    1
14998    1
Name: device_type, Length: 14999, dtype: int64

We are interested in the relationship between device type and the number of drives. One approach is to look at the average number of drives for each device type.

In [5]:
df

Unnamed: 0,ID,label,sessions,drives,total_sessions,n_days_after_onboarding,total_navigations_fav1,total_navigations_fav2,driven_km_drives,duration_minutes_drives,activity_days,driving_days,device,device_type
0,0,retained,283,226,296.748273,2276,208,0,2628.845068,1985.775061,28,19,Android,2
1,1,retained,133,107,326.896596,1225,19,64,13715.920550,3160.472914,13,11,iPhone,1
2,2,retained,114,95,135.522926,2651,0,0,3059.148818,1610.735904,14,8,Android,2
3,3,retained,49,40,67.589221,15,322,7,913.591123,587.196542,7,3,iPhone,1
4,4,retained,84,68,168.247020,1562,166,5,3950.202008,1219.555924,27,18,Android,2
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
14994,14994,retained,60,55,207.875622,140,317,0,2890.496901,2186.155708,25,17,iPhone,1
14995,14995,retained,42,35,187.670313,2505,15,10,4062.575194,1208.583193,25,20,Android,2
14996,14996,retained,273,219,422.017241,1873,17,0,3097.825028,1031.278706,18,17,iPhone,1
14997,14997,churned,149,120,180.524184,3150,45,0,4051.758549,254.187763,6,6,iPhone,1


In [10]:
# Select `drives` before aggregating so non-numeric columns are never averaged
df.groupby('device')['drives'].mean()

device
Android    66.231838
iPhone     67.859078
Name: drives, dtype: float64

In [4]:
67.859078-66.231838

1.6272400000000005

Based on the averages shown, drivers who use an iPhone to interact with the application have about 1.6 more drives on average than Android users. However, this difference might arise from random sampling rather than from a true difference in the number of drives. To assess whether the difference is statistically significant, we can conduct a hypothesis test.
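
Instead of retyping the two means by hand, the difference can be read directly from the grouped result. The sketch below uses a small made-up DataFrame named `toy` (an assumption for illustration, not the Waze data) with the same `device` and `drives` column names.

```python
import pandas as pd

# Toy data standing in for the `device` and `drives` columns (values are made up)
toy = pd.DataFrame({
    'device': ['iPhone', 'iPhone', 'Android', 'Android'],
    'drives': [70, 66, 64, 68],
})

# Mean number of drives per device type
mean_drives = toy.groupby('device')['drives'].mean()

# Difference in means, indexed by label rather than typed in by hand
mean_diff = mean_drives['iPhone'] - mean_drives['Android']
print(mean_diff)
```

Indexing by label keeps the subtraction correct even if the data, and therefore the means, change.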


### **Part 3. Hypothesis testing**

1.   State the null hypothesis and the alternative hypothesis
2.   Choose a significance level
3.   Find the p-value
4.   Reject or fail to reject the null hypothesis

This is a t-test for two independent samples. This is the appropriate test since the two groups are independent (Android users vs. iPhone users).

In a two-sample t-test, the null hypothesis states that there is no difference between the means of our two groups. The alternative hypothesis states the contrary claim: there is a difference between the means of our two groups.

We use $H_0$ to denote the null hypothesis, and $H_A$ to denote the alternative hypothesis.

*   $H_0$: There is no difference in the mean number of drives between iPhone and Android users
*   $H_A$: There is a difference in the mean number of drives between iPhone and Android users


Next, we choose 5% as the significance level ($\alpha = 0.05$) and proceed with a two-sample t-test.
A 5% significance level is a widely used convention for determining statistical significance.
It means we accept a 5% risk of concluding that a difference exists when there is no actual difference.
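
One way to see what this 5% risk means is a small simulation, sketched below under stated assumptions: both groups are drawn from the same normal distribution, so the null hypothesis is true by construction. The sample sizes, mean, spread, and random seed are arbitrary choices for illustration and are not taken from the Waze data.

```python
import numpy as np
import scipy.stats as stats

rng = np.random.default_rng(0)
alpha = 0.05
n_sims, n_per_group = 2000, 50

# Both groups come from the same distribution, so the null hypothesis is true
group_a = rng.normal(loc=67, scale=65, size=(n_sims, n_per_group))
group_b = rng.normal(loc=67, scale=65, size=(n_sims, n_per_group))

# One Welch's t-test per simulated experiment (each row is one experiment)
p_values = stats.ttest_ind(group_a, group_b, axis=1, equal_var=False).pvalue

# Share of experiments that reject the null even though it is true
false_positive_rate = float(np.mean(p_values < alpha))
print(false_positive_rate)
```

The printed rate should land close to `alpha`, since a test at the 5% level rejects a true null hypothesis in roughly 5% of repeated experiments.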

**Technical note**: The default for the argument `equal_var` in `stats.ttest_ind()` is `True`, which assumes population variances are equal. This equal variance assumption might not hold in practice (that is, there is no strong reason to assume that the two groups have the same variance); we can relax this assumption by setting `equal_var` to `False`, and `stats.ttest_ind()` will perform the unequal variances $t$-test (known as Welch's $t$-test). Refer to the [scipy t-test documentation](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_ind.html) for more information.
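
To make the unequal-variances test less of a black box, the sketch below computes Welch's statistic and its two-sided p-value by hand and compares them with `stats.ttest_ind(..., equal_var=False)`. The two samples `a` and `b` are synthetic and their parameters are arbitrary assumptions. The degrees of freedom use the Welch-Satterthwaite approximation.

```python
import numpy as np
import scipy.stats as stats

rng = np.random.default_rng(42)
# Synthetic samples with different sizes and spreads (made up for illustration)
a = rng.normal(loc=68, scale=60, size=200)
b = rng.normal(loc=66, scale=70, size=300)

# Estimated variance of each sample mean, using the unbiased sample variance
se2_a = a.var(ddof=1) / a.size
se2_b = b.var(ddof=1) / b.size

# Welch's t statistic: difference in means over its estimated standard error
t_manual = (a.mean() - b.mean()) / np.sqrt(se2_a + se2_b)

# Welch-Satterthwaite approximation to the degrees of freedom
dof = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (a.size - 1) + se2_b ** 2 / (b.size - 1))

# Two-sided p-value from the t distribution
p_manual = 2 * stats.t.sf(abs(t_manual), dof)

result = stats.ttest_ind(a, b, equal_var=False)
print(t_manual, result.statistic)
print(p_manual, result.pvalue)
```

The manual values should agree with SciPy's up to floating-point error.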


1. Isolate the `drives` column for iPhone users.
2. Isolate the `drives` column for Android users.
3. Perform the t-test

The p-value is the probability of observing results at least as extreme as those observed, assuming the null hypothesis is true.

Based on our sample data, the difference between the mean number of drives for iPhone and Android users is about 1.6 drives. Our null hypothesis claims that this difference is due to chance. Our p-value is the probability of observing an absolute difference in sample means of 1.6 or greater if the null hypothesis is true. If this probability is very low (a p-value less than our significance level of 5%), we reject the null hypothesis.
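
The definition above can be illustrated with a permutation test. This is a different technique from the t-test used in this notebook, shown here only because it estimates the same kind of probability directly. In the sketch below, the two groups are made-up Poisson samples and the number of shuffles is an arbitrary choice. Shuffling the pooled values mimics a world in which group membership has no effect on the outcome.

```python
import numpy as np

rng = np.random.default_rng(1)
# Made-up samples standing in for the two groups (not the Waze data)
group_a = rng.poisson(lam=68, size=150)
group_b = rng.poisson(lam=66, size=150)

# Observed absolute difference in sample means
observed = abs(group_a.mean() - group_b.mean())

pooled = np.concatenate([group_a, group_b])
n_perm = 2000
count = 0
for _ in range(n_perm):
    # Reassign group membership at random, as if the null hypothesis were true
    shuffled = rng.permutation(pooled)
    diff = abs(shuffled[:group_a.size].mean() - shuffled[group_a.size:].mean())
    if diff >= observed:
        count += 1

# Share of shuffles with a difference at least as extreme as the observed one
p_perm = count / n_perm
print(p_perm)
```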


In [5]:
# 1. Isolate the `drives` column for iPhone users.
drives_iphone = df[df['device'] == 'iPhone']['drives']
# 2. Isolate the `drives` column for Android users.
drives_android = df[df['device'] == 'Android']['drives']
# 3. Perform the t-test
stats.ttest_ind(a=drives_iphone, b=drives_android, equal_var=False)

Ttest_indResult(statistic=1.4635232068852353, pvalue=0.1433519726802059)

In [9]:
drives_iphone

1        107
3         40
5        103
6          2
7         35
        ... 
14993     57
14994     55
14996    219
14997    120
14998     58
Name: drives, Length: 9672, dtype: int64

Do we reject or fail to reject the null hypothesis?


The p-value is about 0.143, or 14.3%.

This means there is about a 14% probability that the absolute difference between the two sample means would be 1.6 drives or greater if the null hypothesis were true. Because this is greater than our 5% significance level, we fail to reject the null hypothesis.
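
The decision rule can be written out as a short sketch. The p-value is copied from the t-test output above; the variable names `alpha`, `p_value`, and `decision` are chosen here for illustration.

```python
alpha = 0.05                   # significance level chosen before running the test
p_value = 0.1433519726802059   # p-value reported by the t-test output above

# Reject the null hypothesis only when the p-value falls below alpha
decision = 'reject' if p_value < alpha else 'fail to reject'
print(f'p-value = {p_value:.3f}; we {decision} the null hypothesis at alpha = {alpha}')
```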

### **Part 4. Communicate insights with stakeholders**


* What business insight(s) can we draw from the result of our hypothesis test?

There is no statistically significant difference in the mean number of drives between iPhone and Android users. Had there been a significant difference, it might have justified directing more resources toward the iPhone application.