# **Waze Project: User Churn**

**STAGE 3: Hypothesis Test**

So far, the team has completed a project proposal (stage 1), and used Python to explore and analyze Waze’s user data and create some initial visualisations of the data set (stage 2).

Task for **Stage 3**: statistical analysis (two-sample t-test) of the relationship between the mean number of drives and device type (iPhone® users and Android™ users).

*This stage has three parts:*

**Part 1:** Imports and data loading

**Part 2:** Conduct hypothesis testing

**Part 3:** Highlight key insights for communication with stakeholders

## **Part 1: Imports and data loading**

Import packages and libraries needed to compute descriptive statistics and conduct a hypothesis test.


In [1]:
# Import functional packages and libraries
import pandas as pd
import numpy as np
from scipy import stats
import statsmodels.api as sm

In [2]:
# Load dataset into dataframe
df = pd.read_csv('waze_dataset.csv')
df.head()

Unnamed: 0,ID,label,sessions,drives,total_sessions,n_days_after_onboarding,total_navigations_fav1,total_navigations_fav2,driven_km_drives,duration_minutes_drives,activity_days,driving_days,device
0,0,retained,283,226,296.748273,2276,208,0,2628.845068,1985.775061,28,19,Android
1,1,retained,133,107,326.896596,1225,19,64,13715.92055,3160.472914,13,11,iPhone
2,2,retained,114,95,135.522926,2651,0,0,3059.148818,1610.735904,14,8,Android
3,3,retained,49,40,67.589221,15,322,7,913.591123,587.196542,7,3,iPhone
4,4,retained,84,68,168.24702,1562,166,5,3950.202008,1219.555924,27,18,Android


## **Part 2: Conduct hypothesis testing**

### **Data exploration**

Use descriptive statistics to conduct exploratory data analysis (EDA).

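The DataFrame is assigned to `df_copy`. Note that plain assignment creates an alias rather than an independent copy, so the changes below also apply to `df`, which later cells rely on. Create a new column `device_type` to encode device type as an integer.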


In [3]:
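# Bind a second name to the DataFrame (this is an alias, not a copy;
# use df.copy() instead if the original must stay unmodified)
df_copy = df

# Convert 'device' to the category dtype and encode it as integer codes
# (categories are sorted alphabetically: Android = 0, iPhone = 1)
df_copy["device"] = df_copy["device"].astype("category")
df_copy["device_type"] = df_copy["device"].cat.codes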

# Inspect head of the new DataFrame
df_copy.head()

Unnamed: 0,ID,label,sessions,drives,total_sessions,n_days_after_onboarding,total_navigations_fav1,total_navigations_fav2,driven_km_drives,duration_minutes_drives,activity_days,driving_days,device,device_type
0,0,retained,283,226,296.748273,2276,208,0,2628.845068,1985.775061,28,19,Android,0
1,1,retained,133,107,326.896596,1225,19,64,13715.92055,3160.472914,13,11,iPhone,1
2,2,retained,114,95,135.522926,2651,0,0,3059.148818,1610.735904,14,8,Android,0
3,3,retained,49,40,67.589221,15,322,7,913.591123,587.196542,7,3,iPhone,1
4,4,retained,84,68,168.24702,1562,166,5,3950.202008,1219.555924,27,18,Android,0
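
To confirm which integer code corresponds to which device, the category labels can be enumerated. The sketch below uses a small made-up DataFrame (`toy` is hypothetical and is not the Waze dataset) to show how `cat.codes` relates to `cat.categories`:

```python
import pandas as pd

# Hypothetical toy data, used only to illustrate the encoding
toy = pd.DataFrame({"device": ["iPhone", "Android", "iPhone", "Android"]})
toy["device"] = toy["device"].astype("category")
toy["device_type"] = toy["device"].cat.codes

# Position in cat.categories is the integer code assigned by cat.codes
code_to_label = dict(enumerate(toy["device"].cat.categories))
print(code_to_label)
```

For string columns, pandas sorts the categories, so `Android` receives code 0 and `iPhone` receives code 1, which matches the `device_type` values in the output above.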


We are interested in the relationship between device type and the number of drives. A first step is to compare the average number of drives for each device type.

In [4]:
# Calculate average number of drives per device type
android_mean = df[df["device_type"] == 0]["drives"].mean()
iphone_mean = df[df["device_type"] == 1]["drives"].mean()

print(f"Mean drives Android: {android_mean}\nMean drives iPhone:  {iphone_mean}")

Mean drives Android: 66.23183780739629
Mean drives iPhone:  67.85907775020678
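
The same per-group comparison can also be obtained with `groupby`, which additionally reports the group sizes and standard deviations that matter when choosing a t-test variant. This is a minimal sketch on made-up numbers (`toy` and its values are hypothetical, not taken from the Waze data):

```python
import pandas as pd

# Hypothetical toy data, used only to illustrate the aggregation
toy = pd.DataFrame({
    "device": ["Android", "Android", "iPhone", "iPhone"],
    "drives": [10, 30, 20, 50],
})

# One row per device: sample size, mean, and sample standard deviation
summary = toy.groupby("device")["drives"].agg(["count", "mean", "std"])
print(summary)
```

Applied to the real data, the same call on the `device` and `drives` columns would give the two means shown above alongside each group's spread.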


Based on the averages shown, it appears that drivers who use an iPhone device to interact with the application have a slightly higher number of drives on average. However, this difference might be due to sampling variability rather than a true difference in the population means. To assess whether the difference is statistically significant, a hypothesis test will be conducted.


### **Hypothesis testing**

The goal is to conduct a two-sample t-test following the steps of a hypothesis test:

1. State the null hypothesis and the alternative hypothesis
2. Choose a significance level
3. Find the p-value (run t-test)
4. Reject or fail to reject the null hypothesis

This is a t-test for two independent samples, which is the appropriate test since the two groups are independent (Android users vs. iPhone users).

The default for the argument `equal_var` in `stats.ttest_ind()` is `True`, which assumes that the population variances are equal. There is no strong reason to assume that the two groups have the same variance, so we relax this assumption by setting `equal_var=False`. With this setting, `stats.ttest_ind()` performs the unequal-variances t-test, also known as Welch's t-test. (See [scipy t-test documentation](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_ind.html).)
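
To make the unequal-variances test more concrete, the sketch below computes Welch's t statistic and the Welch-Satterthwaite degrees of freedom by hand and compares them with `stats.ttest_ind(..., equal_var=False)`. The two samples are synthetic (the means, spreads, and sizes are arbitrary assumptions chosen for illustration, not estimates from the Waze data):

```python
import numpy as np
from scipy import stats

# Synthetic samples with different variances and sizes (illustrative only)
rng = np.random.default_rng(0)
a = rng.normal(loc=67, scale=65, size=500)
b = rng.normal(loc=66, scale=50, size=300)

# Per-group variance of the sample mean: s^2 / n
va = a.var(ddof=1) / a.size
vb = b.var(ddof=1) / b.size

# Welch's t statistic: difference in means over the unpooled standard error
t_manual = (a.mean() - b.mean()) / np.sqrt(va + vb)

# Welch-Satterthwaite approximation of the degrees of freedom
dof = (va + vb) ** 2 / (va ** 2 / (a.size - 1) + vb ** 2 / (b.size - 1))

# Two-sided p-value from the t distribution
p_manual = 2 * stats.t.sf(abs(t_manual), dof)

t_scipy, p_scipy = stats.ttest_ind(a, b, equal_var=False)
print(np.isclose(t_manual, t_scipy), np.isclose(p_manual, p_scipy))
```

Unlike the pooled-variance test, the standard error here keeps each group's variance separate, which is why the test remains reasonable when the groups differ in spread or size.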

**1. State hypotheses**

($H_0$): There is no difference in the population mean number of drives per user by device type.

($H_A$): There is a difference in the population mean number of drives per user by device type. 

**2. Choose significance level**

A standard significance level of **5%** will be used.

**3. Find the p-value**

1. Isolate the `drives` column for iPhone users.
2. Isolate the `drives` column for Android users.
3. Perform the t-test

In [5]:
# 1. Isolate the `drives` column for iPhone users.
iphone_drives = df[df["device_type"] == 1]["drives"]

# 2. Isolate the `drives` column for Android users.
android_drives = df[df["device_type"] == 0]["drives"]

# 3. Perform the t-test
tstat, pvalue = stats.ttest_ind(a=iphone_drives, b=android_drives, equal_var=False)
print(f"T-statistic: {tstat}\nP-value: {pvalue}")

T-statistic: 1.463523206885235
P-value: 0.14335197268020597


**4. Reject or fail to reject the null hypothesis**

The null hypothesis is rejected only if the p-value is less than the significance level. In our test, the p-value of about 0.143 (14.3%) is greater than the significance level of 0.05 (5%). Thus, the difference is not statistically significant and we fail to reject the null hypothesis.
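
The decision rule can be written as a short check. This sketch hard-codes the p-value printed by the t-test above; the variable names `alpha` and `decision` are introduced here only for illustration:

```python
alpha = 0.05                    # chosen significance level
pvalue = 0.14335197268020597    # p-value reported by the t-test above

# Reject H0 only when the p-value falls below the significance level
decision = "reject" if pvalue < alpha else "fail to reject"
print(f"p-value = {pvalue:.4f}, alpha = {alpha}: {decision} the null hypothesis")
```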

We conclude that, pending further data or additional tests, there is insufficient evidence of a difference in the mean number of drives per user between device types.

## **Part 3. Communicate key insights to stakeholders**

The next step is to share our findings with the Waze leadership team. The key insights for the executive summary are:

* We could not establish a statistically significant difference in the average number of drives between Android and iPhone users.
    * This means that the observed difference in the mean number of drives by device type is consistent with chance variation. It therefore seems unlikely that changes in user experience or satisfaction that target users by device type would have an effect on user churn.
<br/><br/>
* To proceed with this project, we should look for other factors that influence user churn by conducting further hypothesis tests.
