# Mobile Games A/B Testing - Cookie Cats
by Jessica Syafaq Muthmaina 

## Assignment Outcomes:

- You should be able to understand the concept of A/B testing and how to analyze its result.

## Assessment requirement:

- Problem definition
- Hypotheses formulation
- Hypotheses testing
- Clear result interpretation

## **Overview**

The Cookie Cats dataset comes from a Kaggle challenge. Cookie Cats is a mobile puzzle game in which the player connects tiles of the same colour to clear the board and win the level. As players progress through the levels, they encounter gates, which usually show the user ads or act as a paywall.

## **Problem**

We now consider where to place the gate. Initially, the gate was placed at level 30; what happens if we move it to level 40? Players might be willing to play longer, so retention might increase and bring more traffic to the game. To be confident in that conclusion, however, we need to back up the observed retention rates with statistical analysis.

We perform an A/B test to check whether placing the gate at a different level leads to higher retention. First, though, we start with exploratory data analysis (EDA) to get a picture of what kind of data we have and what it holds.

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

In [None]:
cc = pd.read_csv('../input/mobile-games-ab-testing-cookie-cats/cookie_cats.csv')

In [None]:
cc

In [None]:
cc.head()

The dataset contains 5 variables. Let’s see what each variable holds:

- userid: A unique number that identifies each player.
- version: Whether the player was put in the control group (gate_30 — a gate at level 30) or the group with the moved gate (gate_40 — a gate at level 40).
- sum_gamerounds: The number of game rounds played by the player during the first 14 days after installation.
- retention_1: Whether the player came back and played 1 day after installing.
- retention_7: Whether the player came back and played 7 days after installing.

During EDA, check for null values, duplicates, incorrect data, and the data type of each column.

In [None]:
cc.info()

From the output above, it can be seen that:
- The dataset does not contain any null values.
- The userid column should contain only unique ids; `cc.info()` does not show this directly, but `cc['userid'].is_unique` can confirm it.
- sum_gamerounds is an integer variable, while retention_1 and retention_7 are boolean variables (True or False).
- version is a categorical variable.

In [None]:
cc.isnull().sum()

In [None]:
cc.isna().sum()
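
The checks listed above (null values, duplicates, unique ids) can be gathered into one small helper. The following is only a sketch: the `toy` DataFrame and the `quality_report` function are hypothetical names with made-up values, standing in for `cc` so that the example runs on its own.

```python
import pandas as pd

# Hypothetical toy data with the same columns as cookie_cats.csv
# (made-up values, used only to keep the sketch self-contained).
toy = pd.DataFrame({
    "userid": [101, 102, 103, 104, 104],
    "version": ["gate_30", "gate_40", "gate_30", "gate_40", "gate_40"],
    "sum_gamerounds": [3, 38, 165, 1, 1],
    "retention_1": [False, True, True, False, False],
    "retention_7": [False, False, True, False, False],
})


def quality_report(df: pd.DataFrame) -> dict:
    """Summarise basic data-quality checks for the A/B test table."""
    return {
        # Total number of missing cells across all columns.
        "null_values": int(df.isnull().sum().sum()),
        # Rows that are exact copies of an earlier row.
        "duplicate_rows": int(df.duplicated().sum()),
        # True only if no userid appears more than once.
        "userid_is_unique": bool(df["userid"].is_unique),
        # Users that appear in more than one version (should be 0).
        "users_in_both_groups": int(
            (df.groupby("userid")["version"].nunique() > 1).sum()
        ),
    }


report = quality_report(toy)
print(report)
```

On the real data the same call would be `quality_report(cc)`; the toy frame deliberately contains one duplicated row so that the checks have something to flag.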

In [None]:
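# Let's check for outliers in the numerical variable sum_gamerounds
import matplotlib.pyplot as plt
%matplotlib inline

# Create the figure first, then draw on it (calling plt.figure() last would open an empty figure)
plt.figure()
plt.scatter(cc['userid'], cc['sum_gamerounds'], c="blue", marker="s")
plt.xlabel('userid')
plt.ylabel('sum_gamerounds')
plt.show()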

In [None]:
# The figure above shows that there is an outlier in the data.
# An outlier like this can distort the analysis, so it is better to remove it.

cc = cc[cc['sum_gamerounds'] < 40000]
plt.figure()
plt.scatter(cc['userid'], cc['sum_gamerounds'], c="blue", marker="s")
plt.show()

In [None]:
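# Total number of game rounds played in each group ('sum' avoids the pandas warning raised for np.sum)
cc.groupby("version").agg(total_games_played=('sum_gamerounds', 'sum')).reset_index()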

In [None]:
cc.groupby(["version"])["userid"].count().reset_index()

In [None]:
# This graph shows the number of True and False values in column 'retention_1' for both the control and experiment groups
cc.groupby('version')['retention_1'].value_counts().plot(kind = 'pie', figsize = (5,5))

Similarly, here is the same plot for retention on day 7.

In [None]:
# This graph shows the number of True and False values in column 'retention_7' for both the control and experiment groups
cc.groupby('version')['retention_7'].value_counts().plot(kind = 'pie', figsize = (5,5))

The pie charts above show that retention is higher on day 1 than on day 7 for both the control and experiment groups. Now let's quantify this by computing the conversion rates for day 1 and day 7 retention.

## Retention Rates for each variant
Before checking the conversion rates, let's understand what a variant is in A/B testing.

A variant is a changed version that we test against the default (control) version; whichever turns out to be better is the one we launch.

In this case, we test at which game level the gate should be introduced so that the retention rate is higher, i.e. more users come back to play the game again.

## Day 1 Retention for Control(Gate_30) and Experiment Group(Gate_40)

In [None]:
# Let's look at the conversion rate for day 1 retention for gate_30, i.e. the control group
retention_day_1=cc.groupby('version')['retention_1'].sum()
user_table_day_1=cc.groupby('version')['userid'].count()
retention_gate_30_day_1=round((retention_day_1['gate_30']/user_table_day_1['gate_30']),4)
perc_retention_gate_30=round(retention_gate_30_day_1*100,2)
print('The proportion of retention after day 1 for Control group is %s' %retention_gate_30_day_1)
print('The percentage of retention after day 1 for Control group is %s%%' %perc_retention_gate_30)

In [None]:
# Let's look at the conversion rate for the day 1 retention for gate_40 i.e. Experiment Group
retention_gate_40_day_1=round((retention_day_1['gate_40']/user_table_day_1['gate_40']),4)
perc_retention_gate_40=round(retention_gate_40_day_1*100,2)

#perc_retention_gate_40
print('The proportion of retention after day 1 for Experiment group is %s' %retention_gate_40_day_1)
print('The percentage of retention after day 1 for Experiment group is %s%%' %perc_retention_gate_40)

## Day 7 Retention for Control(Gate_30) and Experiment Group(Gate_40)

In [None]:
# Let's look at the conversion rate for day 7 retention for gate_30, i.e. the control group
retention_day_7=cc.groupby('version')['retention_7'].sum()
user_table_day_7=cc.groupby('version')['userid'].count()
retention_gate_30_day_7=round((retention_day_7['gate_30']/user_table_day_7['gate_30']),4)
# Use the day 7 proportion here (the day 1 value was used by mistake)
perc_retention_gate_30=round(retention_gate_30_day_7*100,2)
print('The proportion of retention after day 7 for Control group is %s' %retention_gate_30_day_7)
print('The percentage of retention after day 7 for Control group is %s%%' %perc_retention_gate_30)

In [None]:
# Let's look at the conversion rate for the day 7 retention for gate_40 i.e. Experiment Group
retention_gate_40_day_7=round((retention_day_7['gate_40']/user_table_day_7['gate_40']),4)
perc_retention_gate_40=round(retention_gate_40_day_7*100,2)

#perc_retention_gate_40
print('The proportion of retention after day 7 for Experiment group is %s' %retention_gate_40_day_7)
print('The percentage of retention after day 7 for Experiment group is %s%%' %perc_retention_gate_40)

From the retention rates above, it can be seen that retention is higher in the control group. Also, the retention rate is much higher on day 1 than on day 7.

Because we are looking at a sample rather than the whole population, the observed difference could have arisen by chance. To check whether it did, we perform hypothesis testing.
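
The four cells above repeat the same calculation for each group and day. As a compact alternative, the mean of a boolean column is the share of True values, so one `groupby` gives every rate at once. This is a sketch under assumptions: `toy` and `retention_summary` are hypothetical names, and the values are made up so the example runs without the Kaggle file.

```python
import pandas as pd

# Hypothetical toy data (made-up values), mirroring the columns used above.
toy = pd.DataFrame({
    "userid": [1, 2, 3, 4, 5, 6, 7, 8],
    "version": ["gate_30"] * 4 + ["gate_40"] * 4,
    "retention_1": [True, True, False, False, True, False, False, False],
    "retention_7": [True, False, False, False, False, False, False, False],
})


def retention_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Return the day 1 and day 7 retention rates (in percent) per version."""
    # The mean of a True/False column equals the proportion of True values.
    rates = df.groupby("version")[["retention_1", "retention_7"]].mean()
    return (rates * 100).round(2)


summary = retention_summary(toy)
print(summary)
```

With the real data, `retention_summary(cc)` would produce the same four percentages in a single table, which also avoids mixing up day 1 and day 7 variables.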

## Metrics
The unit of diversion is the user id: users (the experimental units) are randomly split into two groups, control and experiment. Each user must be randomly assigned to one and only one group.

The evaluation metric, or response variable, chosen here is player retention. This metric is used to measure the impact of our change. Retention simply means people coming back to the product to use it again.

Our interest lies in comparing the means of the control and experiment groups. Because retention is a True/False outcome, each group's mean is its retention rate. For this, an independent-samples t-test would be an appropriate choice. I will talk about it more in a later section.

## Formulating Hypothesis
Null hypothesis: there is no difference between the control and experiment groups, which means:
- The retention rates are the same in both groups.
- Any difference observed in the sample is attributed to chance.

Alternative hypothesis: there is a difference between the control and experiment groups, which means:
- The retention rates are different in the two groups.
- The difference observed in the sample is statistically significant.
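
To make the two hypotheses concrete, here is a minimal sketch of how they could be tested. Note that it uses a two-proportion z-test rather than the t-test mentioned earlier; for True/False outcomes and large samples the two give nearly identical results, and the z-test needs only the standard library. The function name `two_proportion_ztest` and the counts are hypothetical and are not taken from the Cookie Cats data.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ztest(success_a, n_a, success_b, n_b):
    """Two-sided z-test for the difference between two proportions."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Pooled proportion, valid under the null hypothesis of equal rates.
    p_pool = (success_a + success_b) / (n_a + n_b)
    # Standard error of the difference under the null hypothesis.
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: retained users and group sizes for each variant.
z, p_value = two_proportion_ztest(success_a=200, n_a=1000,
                                  success_b=170, n_b=1000)
alpha = 0.05  # significance level
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```

For the real experiment, the inputs would be the retained-user counts and group sizes per version, for example the values in `retention_day_1` and `user_table_day_1`.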

## Sample Size

Power analysis and significance level: once the metrics and hypotheses have been decided, we should check whether we have enough data to run the A/B test. When calculating the sample size, two kinds of error need to be controlled:
- To limit type I errors (false positives), the significance level must be specified when calculating the sample size.
- To limit type II errors (false negatives), the sample size should be large enough; to achieve this, set the power to 0.8 or 0.9, if possible, when calculating the sample size.

For this problem, we choose a 95% confidence level, which gives a significance level (alpha) of 0.05, and a power (1 - beta) of 80%.
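
As a rough illustration of how alpha and power translate into a required sample size, the sketch below uses the standard normal-approximation formula for comparing two proportions, n = (z_{1-alpha/2} + z_{power})^2 * (p1(1-p1) + p2(1-p2)) / (p1 - p2)^2 per group. The function name and the baseline and target rates are assumptions chosen for illustration, not values from the dataset.

```python
from math import ceil
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Approximate users needed per group to detect p1 vs p2 (two-sided test)."""
    # Critical value for the chosen significance level (two-sided).
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    # Quantile corresponding to the desired power (1 - beta).
    z_beta = NormalDist().inv_cdf(power)
    # Sum of the Bernoulli variances of the two groups.
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2
    return ceil(n)


# Hypothetical rates: a 20% baseline retention and an 18% rate to detect.
n_per_group = sample_size_two_proportions(p1=0.20, p2=0.18)
print(f"Users needed per group: {n_per_group}")
```

Raising the power to 0.9 or shrinking the difference between `p1` and `p2` increases the required sample size, which is why the minimum detectable effect should be fixed before the experiment starts.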

## Conclusion

Descriptive analysis plays an important role in exploring the data.

When selecting the sample, the experimental units should be chosen at random; otherwise, the results may be inconsistent.

The duration of the experiment also plays a key role when computing the sample size: if the test is stopped early, we won't have enough data to get significant results.
