<h1 id="basics" style="font-family:verdana;"> 
    <center> A/B Testing for the Mobile Games Dataset
    </center>
</h1>
<div style="width:100%;text-align: center;"> <img align=middle src="https://appradar.com/wp-content/uploads/2021/06/mobile_app_ab_testing-754x503.png" alt="A/B Testing" style="height:500px;margin-top:1rem;"> </div>



This dataset includes A/B test results of Cookie Cats to examine what happens when the first gate in the game was moved from level 30 to level 40. When a player installed the game, he or she was randomly assigned to either gate_30 or gate_40.

The data we have is from 90,189 players that installed the game while the AB-test was running. The variables are:

1. userid: A unique number that identifies each player.
2. version: Whether the player was put in the control group (gate_30 - a gate at level 30) or the group with the moved gate (gate_40 - a gate at level 40).
3. sum_gamerounds: The number of game rounds played by the player during the first 14 days after install.
4. retention_1: Did the player come back and play 1 day after installing?
5. retention_7: Did the player come back and play 7 days after installing?


## Main topics of the study can be seen below:

* [Aim of the study](#section-one)
* [Understanding the data](#section-two)
* [Preparation of data](#section-three)
* [What is A/B Testing](#section-four)
* [A/B Testing Model Process](#section-five)
* [Hypothesis](#section-six)
* [Assumption Control](#section-seven)
    * [Normal Distribution](#section-eight)
    * [Variance Homogeneity Assumption](#section-nine)
* [Applying the Hypothesis Test](#section-ten)
* [Conclusion and Recommendation](#section-eleven)



<a id="section-one"></a>
## 1. Aim of the Study

This dataset includes A/B test results of Cookie Cats to examine what happens when the first gate in the game was moved from level 30 to level 40. When a player installed the game, he or she was randomly assigned to either gate_30 or gate_40. In this study, we try to find out whether moving the gate from level 30 to level 40 affected how much players played, measured by the number of game rounds (sum_gamerounds).
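The random split described above can be sketched as follows. This is only an illustration of random assignment, not the game's actual assignment logic: the function name `assign_versions`, the seed, and the user ids are assumptions made for the example.

```python
import random
from collections import Counter


def assign_versions(user_ids, seed=0):
    """Randomly assign each user id to gate_30 or gate_40 (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    return {uid: rng.choice(["gate_30", "gate_40"]) for uid in user_ids}


# 10,000 made-up user ids, not taken from the dataset.
assignments = assign_versions(range(10_000))
group_sizes = Counter(assignments.values())
print(group_sizes)
```

With a fair random split, the two groups end up close to equal in size, which is what makes them comparable in the later tests.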

<div style="width:100%;text-align: center;"> <img align=middle src="https://devopedia.org/images/article/32/6055.1530296772.jpg" alt="A/B Testing" style="height:500px;margin-top:1rem;"> </div>

<a id="section-two"></a>
## 2. Understanding the Data

First, we import the libraries that will be used during the analysis.

In [1]:
# Let's import the libraries

import itertools
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.stats.api as sms
from scipy.stats import ttest_1samp, shapiro, levene, ttest_ind, mannwhitneyu, pearsonr, spearmanr, \
    kendalltau, f_oneway, kruskal

In [2]:
# Let's load the dataset

df = pd.read_csv(r"/kaggle/input/mobile-games-ab-testing-cookie-cats/cookie_cats.csv", encoding= 'unicode_escape')

In [3]:
# The check_df function prints a quick overview of the data so we can decide what preparation it needs.

def check_df(dataframe, head=5):
    print("########## Info #############")
    print(dataframe.info())
    print("########## Shape #############")
    print(dataframe.shape)
    print("########## Data Types #############")
    print(dataframe.dtypes)
    print("########## Head of Data #############")
    print(dataframe.head(head))
    print("########## Tail of Data #############")
    print(dataframe.tail(head))
    print("########## Null Values of Data #############")
    print(dataframe.isnull().sum())
    print("########## Describe of the Numerical Datas #############")
    print(dataframe.describe([0, 0.05, 0.50, 0.95, 0.99, 1]).T)

check_df(df)

########## Info #############
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 90189 entries, 0 to 90188
Data columns (total 5 columns):
 #   Column          Non-Null Count  Dtype 
---  ------          --------------  ----- 
 0   userid          90189 non-null  int64 
 1   version         90189 non-null  object
 2   sum_gamerounds  90189 non-null  int64 
 3   retention_1     90189 non-null  bool  
 4   retention_7     90189 non-null  bool  
dtypes: bool(2), int64(2), object(1)
memory usage: 2.2+ MB
None
########## Shape #############
(90189, 5)
########## Data Types #############
userid             int64
version           object
sum_gamerounds     int64
retention_1         bool
retention_7         bool
dtype: object
########## Head of Data #############
   userid  version  sum_gamerounds  retention_1  retention_7
0     116  gate_30               3        False        False
1     337  gate_30              38         True        False
2     377  gate_40             165         True     

Before starting the analysis, note that according to the summary above the dataset has 5 variables. Let's review them:

1. userid: A unique number that identifies each player.
2. version: Whether the player was put in the control group (gate_30 - a gate at level 30) or the group with the moved gate (gate_40 - a gate at level 40).
3. sum_gamerounds: The number of game rounds played by the player during the first 14 days after install.
4. retention_1: Did the player come back and play 1 day after installing?
5. retention_7: Did the player come back and play 7 days after installing?

A quick look at the data shows that the dataset has no null values and that each userid appears only once.
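The checks behind that statement can be sketched without any third-party dependency. The rows below are made up for illustration and only mimic the column layout of the dataset; the variable names are assumptions of this sketch.

```python
import csv
import io
from collections import Counter

# Made-up rows in the same column layout as the dataset (illustrative only).
SAMPLE_CSV = """userid,version,sum_gamerounds,retention_1,retention_7
1,gate_30,3,False,False
2,gate_30,38,True,False
3,gate_40,165,True,False
4,gate_40,1,False,False
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# Count empty cells, which would correspond to null values.
missing = sum(1 for row in rows for value in row.values() if value in ("", None))

# Find user ids that occur more than once.
user_counts = Counter(row["userid"] for row in rows)
duplicated_ids = [uid for uid, count in user_counts.items() if count > 1]

# Count players per version to see how balanced the groups are.
group_sizes = Counter(row["version"] for row in rows)

print(missing, duplicated_ids, dict(group_sizes))
```

On the pandas DataFrame used in this notebook, the equivalent checks would be `df.isnull().sum()`, `df["userid"].is_unique` and `df["version"].value_counts()`.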


<a id="section-three"></a>
## 3. Preparation of the Data

At this stage, any rows with null values are dropped from the dataset.

In [4]:
# dropna() drops the rows that contain null values.
df.shape
df.isnull().sum()
df.dropna(inplace = True)

# Lets check the data

df.describe().T


| | count | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|
| userid | 90189.0 | 4998412.0 | 2883286.0 | 116.0 | 2512230.0 | 4995815.0 | 7496452.0 | 9999861.0 |
| sum_gamerounds | 90189.0 | 51.87246 | 195.0509 | 0.0 | 5.0 | 16.0 | 51.0 | 49854.0 |


<a id="section-four"></a>
## 4. What is A/B Testing

<div style="width:100%;text-align: center;"> <img align=middle src="https://sp-ao.shortpixel.ai/client/to_auto,q_glossy,ret_img,w_900/https://www.brillmark.com/wp-content/uploads/2021/03/What-is-AB-Testing.png" alt="A/B Testing" style="height:300px;margin-top:1rem;"> </div>

> A/B testing (also known as split testing or bucket testing) is a methodology for comparing two versions of a webpage or app against each other to determine which one performs better. A/B testing is essentially an experiment where two or more variants of a page are shown to users at random, and statistical analysis is used to determine which variation performs better for a given conversion goal.

> Running an A/B test that directly compares a variation against a current experience lets you ask focused questions about changes to your website or app and then collect data about the impact of that change.

> Testing takes the guesswork out of website optimization and enables data-informed decisions that shift business conversations from "we think" to "we know." By measuring the impact that changes have on your metrics, you can ensure that every change produces positive results.


<a id="section-five"></a>
## 5. A/B Testing Model Process

To apply A/B testing to the dataset, we follow the fundamental steps described in the A/B testing literature:

1. Hypothesis
2. Assumption Control
3. Applying the Hypothesis Test

<a id="section-six"></a>
## 6. Hypothesis

The first step of A/B testing is to define the hypothesis: which condition or situation will be tested in the dataset, and what the boundaries of that test are.

According to the literature, A/B testing works with two hypotheses, H0 and H1. Let's explain them.

> A statistical hypothesis is an assertion or conjecture concerning one or more populations. To prove that a hypothesis is true, or false, with absolute certainty, we would need absolute knowledge. That is, we would have to examine the entire population. Instead, hypothesis testing concerns on how to use a random sample to judge if it is evidence that supports or not the hypothesis.

Hypothesis testing is formulated in terms of two hypotheses:
- H0: the null hypothesis;
- H1: the alternate hypothesis.
    
    
The hypothesis we want to test is if H1 is “likely” true. So, there are two possible outcomes:
- Reject H0 and accept H1 because of sufficient evidence in the sample in favor of H1;
- Do not reject H0 because of insufficient evidence to support H1.

Note that failure to reject H0 does not mean the null hypothesis is true. There is no formal outcome that says “accept H0.” It only means that we do not have sufficient evidence to support H1.
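The two possible outcomes can be written as a small decision rule. This is a minimal sketch: the function name `decide` is hypothetical, and the significance level of 0.05 is the threshold used in the code cells of this notebook.

```python
def decide(p_value, alpha=0.05):
    """Return the outcome of a hypothesis test for a given p-value.

    There is deliberately no "accept H0" branch: a large p-value only
    means the sample does not give enough evidence for H1.
    """
    if p_value < alpha:
        return "reject H0"
    return "fail to reject H0"


print(decide(0.03))  # reject H0
print(decide(0.20))  # fail to reject H0
```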

<a id="section-seven"></a>
## 7. Assumption Control

<a id="section-eight"></a>
### 7.1 Normal Distribution

To understand whether the mean of a sample is significantly different from the population mean (μ), we need to perform a Z-test. Here we are interested in a two-tailed test, which is formulated as:

- H0: m = m0 (null hypothesis): the mean of our sample (m or X-bar) is not different from the value m0.
- H1: m ≠ m0 (alternative hypothesis): the mean of our sample (m) is different from the value m0.

The Z-score is calculated based on the formula below:

<div style="width:100%;text-align: center;"> <img align=middle src="https://analyticsmayhem.com/wp-content/uploads/2021/03/z-score.gif" alt="A/B Testing" style="height:30px;margin-top:1rem;"> </div>

- X-bar: sample mean
- μ: population mean
- σ: population standard deviation
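As a worked example of the quantities listed above, the sketch below assumes the usual one-sample form z = (X-bar − μ) / (σ / √n). The sample size n does not appear in the legend above, so treating σ / √n as the standard error is an assumption of this sketch, and all numbers are made up for illustration.

```python
import math


def z_score(sample_mean, pop_mean, pop_sd, n):
    """One-sample Z statistic, assuming standard error = pop_sd / sqrt(n)."""
    standard_error = pop_sd / math.sqrt(n)
    return (sample_mean - pop_mean) / standard_error


def two_tailed_p(z):
    """P(|Z| >= |z|) for a standard normal variable Z."""
    return math.erfc(abs(z) / math.sqrt(2))


# Illustrative numbers, not taken from the dataset.
z = z_score(sample_mean=52.0, pop_mean=50.0, pop_sd=10.0, n=100)
p = two_tailed_p(z)
print(f"z = {z:.2f}, p = {p:.4f}")  # z = 2.00, p = 0.0455
```

Because the p-value is below 0.05 in this made-up case, H0 would be rejected at the 5% level.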

<div style="width:100%;text-align: center;"> <img align=middle src="https://analyticsmayhem.com/wp-content/uploads/2021/03/snd.png" alt="A/B Testing" style="height:600px;margin-top:1rem;"> </div>


In [5]:
####################################
# Normal Distribution
####################################

# H0: The data follow a normal distribution.
# H1: The data do not follow a normal distribution.

# If p-value < 0.05, reject H0.
# If p-value >= 0.05, H0 cannot be rejected.

# Normal Distribution control for gate_30

test_stat, pvalue = shapiro(df.loc[df["version"] == "gate_30", "sum_gamerounds"])
print("Test Stat = %.4f, p-value = %.4f" % (test_stat, pvalue))

# Normal Distribution control for gate_40

test_stat, pvalue = shapiro(df.loc[df["version"] == "gate_40", "sum_gamerounds"])
print("Test Stat = %.4f, p-value = %.4f" % (test_stat, pvalue))

# According to the p-values, H0 is rejected for both groups.

Test Stat = 0.0881, p-value = 0.0000
Test Stat = 0.4826, p-value = 0.0000




<a id="section-nine"></a>
### 7.2 Variance Homogeneity Assumption

The assumption of homogeneity of variance means that the level of variance for a particular variable is constant across the sample. If you’ve collected groups of data then this means that the variance of your outcome variable(s) should be the same in each of these groups (i.e. across schools, years, testing groups or predicted values).
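To make the idea concrete, the statistic behind Levene's test can be sketched in plain Python. This is an illustrative sketch, not SciPy's implementation: it uses absolute deviations from each group's median (the default centering in `scipy.stats.levene`) and returns only the W statistic, not the p-value. The function name and the sample numbers are assumptions of the example.

```python
from statistics import mean, median


def levene_statistic(*groups):
    """Levene W statistic based on absolute deviations from each group's median."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Spread of every observation around its own group's median.
    deviations = [[abs(x - median(g)) for x in g] for g in groups]
    group_means = [mean(d) for d in deviations]
    grand_mean = sum(sum(d) for d in deviations) / n_total
    # Variation of the mean deviations between groups vs. within groups.
    between = sum(len(d) * (m - grand_mean) ** 2
                  for d, m in zip(deviations, group_means))
    within = sum((z - m) ** 2
                 for d, m in zip(deviations, group_means) for z in d)
    return (n_total - k) / (k - 1) * between / within


# Made-up groups: the second one is twice as spread out as the first.
w = levene_statistic([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(round(w, 4))
```

A W close to zero means the groups have similar spread; the larger W is, the stronger the evidence against homogeneous variances.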

In [6]:
####################################
# Variance Homogeneity Assumption:
####################################

# H0: Variance is homogeneous.
# H1: Variance is not homogeneous.

# If p-value < 0.05, reject H0.
# If p-value >= 0.05, H0 cannot be rejected.

test_stat, pvalue = levene(df.loc[df["version"] == "gate_30", "sum_gamerounds"],
                            df.loc[df["version"] == "gate_40", "sum_gamerounds"])
print("Test Stat = %.4f, p-value = %.4f" % (test_stat, pvalue))

# According to the p-value, H0 cannot be rejected.

Test Stat = 0.5292, p-value = 0.4669


<a id="section-ten"></a>
## 8. Applying the Hypothesis Test

The "Normal Distribution" and "Variance Homogeneity Assumption" controls show that H0 is rejected for the first control and cannot be rejected for the second one. Because the normality assumption does not hold, we should use the non-parametric mannwhitneyu() function to test the hypothesis.
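The selection logic described above can be sketched as a small helper. The function name `choose_test` and the returned labels are hypothetical; the branches for normally distributed data (Student's or Welch's t-test via `ttest_ind`) follow the common textbook convention and are an assumption of this sketch, since this study only reaches the non-parametric branch.

```python
def choose_test(p_normal_a, p_normal_b, p_levene, alpha=0.05):
    """Pick a two-sample test from the assumption-check p-values."""
    both_normal = p_normal_a >= alpha and p_normal_b >= alpha
    if not both_normal:
        # Normality rejected for at least one group: use a rank-based test.
        return "mannwhitneyu"
    if p_levene >= alpha:
        # Normal data with homogeneous variances: Student's t-test.
        return "ttest_ind (equal_var=True)"
    # Normal data with unequal variances: Welch's t-test.
    return "ttest_ind (equal_var=False)"


# p-values reported by the Shapiro and Levene checks in this notebook.
print(choose_test(0.0000, 0.0000, 0.4669))  # mannwhitneyu
```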

The Mann-Whitney test is based on a comparison of every observation xi in the first sample with every observation yj in the other sample. The total number of pairwise comparisons that can be made is nx × ny, where nx and ny are the two sample sizes.
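The pairwise comparison can be written out directly. The sketch below is a naive illustration of how the U statistic is defined, with ties counted as half a win; it is not how library implementations compute it (they work with ranks, which is far cheaper for large samples). The function name and the sample values are made up for the example.

```python
def mann_whitney_u(xs, ys):
    """U statistic of xs versus ys, computed by comparing every pair (x, y)."""
    u = 0.0
    for x in xs:
        for y in ys:
            if x > y:
                u += 1.0   # x "wins" this comparison
            elif x == y:
                u += 0.5   # a tie counts as half a win
    return u


xs = [1, 4, 6]
ys = [2, 3, 5]
u_x = mann_whitney_u(xs, ys)
u_y = mann_whitney_u(ys, xs)
print(u_x, u_y, len(xs) * len(ys))  # 5.0 4.0 9
```

The two U values always add up to the total number of pairs, so a U close to half of nx × ny indicates that neither group tends to have larger values than the other.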

In [7]:
####################################
# Applying the Hypothesis Test:
####################################

test_stat, pvalue = mannwhitneyu(df.loc[df["version"] == "gate_30", "sum_gamerounds"],
                            df.loc[df["version"] == "gate_40", "sum_gamerounds"])
print("Test Stat = %.4f, p-value = %.4f" % (test_stat, pvalue))

# If p-value < 0.05, reject H0.
# If p-value >= 0.05, H0 cannot be rejected.

Test Stat = 1024331250.5000, p-value = 0.0502


<a id="section-eleven"></a>
## 9. Conclusion and Recommendation

The goal of this A/B test was to find out whether the gate position ("gate_30" versus "gate_40") affects the total number of game rounds played.

At the start of the testing process, we asked whether the average number of game rounds is the same in both groups. To answer this, we first checked the normal distribution and variance homogeneity assumptions. Because the normality assumption was rejected, the non-parametric Mann-Whitney U test was used to compare "gate_30" and "gate_40".

The Mann-Whitney U test gave a p-value of 0.0502, which is slightly above the 0.05 threshold, so H0 cannot be rejected. For this A/B test, this means there is no statistically significant difference between the two gates in the number of game rounds played; the observed difference could have occurred by chance.


## Keep in Touch!

You can follow my other social media accounts to see more work like this!

1. [GitHub](https://github.com/KeskinHakan)
2. [LinkedIn](https://www.linkedin.com/in/hakan-keskin-/)
3. [Medium](https://medium.com/@hakan-keskin)
