### About Dataset

#### Context
This dataset includes A/B test results of Cookie Cats to examine what happens when the first gate in the game was moved from level 30 to level 40. When a player installed the game, he or she was randomly assigned to either gate_30 or gate_40.

#### Content
The data come from 90,189 players who installed the game while the A/B test was running. The variables are:

- userid: A unique number that identifies each player.
- version: Whether the player was put in the control group (gate_30 - a gate at level 30) or the group with the moved gate (gate_40 - a gate at level 40).
- sum_gamerounds: The number of game rounds played by the player during the first 14 days after install.
- retention_1: Did the player come back and play 1 day after installing?
- retention_7: Did the player come back and play 7 days after installing?


In [1]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from scipy import stats as sts
import statsmodels.stats.api as sms
from scipy.stats import shapiro, levene, mannwhitneyu, ttest_ind, pearsonr, fisher_exact

import warnings

warnings.simplefilter(action='ignore', category=FutureWarning)
pd.options.mode.chained_assignment = None

pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
pd.set_option('display.width', None)
pd.set_option('display.float_format', lambda x: '%.3f' % x)


In [2]:
df = pd.read_csv('/kaggle/input/mobile-games-ab-testing-cookie-cats/cookie_cats.csv')
df.head()

Unnamed: 0,userid,version,sum_gamerounds,retention_1,retention_7
0,116,gate_30,3,False,False
1,337,gate_30,38,True,False
2,377,gate_40,165,True,False
3,483,gate_40,1,False,False
4,488,gate_40,179,True,True


In [3]:
df.isnull().sum()

userid            0
version           0
sum_gamerounds    0
retention_1       0
retention_7       0
dtype: int64

In [4]:
df.describe().T

Unnamed: 0,count,mean,std,min,25%,50%,75%,max
userid,90189.0,4998412.234,2883285.608,116.0,2512230.0,4995815.0,7496452.0,9999861.0
sum_gamerounds,90189.0,51.872,195.051,0.0,5.0,16.0,51.0,49854.0


In [5]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 90189 entries, 0 to 90188
Data columns (total 5 columns):
 #   Column          Non-Null Count  Dtype 
---  ------          --------------  ----- 
 0   userid          90189 non-null  int64 
 1   version         90189 non-null  object
 2   sum_gamerounds  90189 non-null  int64 
 3   retention_1     90189 non-null  bool  
 4   retention_7     90189 non-null  bool  
dtypes: bool(2), int64(2), object(1)
memory usage: 2.2+ MB


In [6]:
df['version'].value_counts()

version
gate_40    45489
gate_30    44700
Name: count, dtype: int64

In [7]:
value_counts = df['version'].value_counts(normalize=True) * 100
print(value_counts)

version
gate_40   50.437
gate_30   49.563
Name: proportion, dtype: float64


In [8]:
df.groupby('version')['sum_gamerounds'].describe().T

version,gate_30,gate_40
count,44700.0,45489.0
mean,52.456,51.299
std,256.716,103.294
min,0.0,0.0
25%,5.0,5.0
50%,17.0,16.0
75%,50.0,52.0
max,49854.0,2640.0


In [9]:
df.groupby('version')['sum_gamerounds'].var()

version
gate_30   65903.322
gate_40   10669.736
Name: sum_gamerounds, dtype: float64

##### Number of Games Played (sum_gamerounds):
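In the gate_30 version, the average number of game rounds played (sum_gamerounds) is slightly higher (mean 52.456) than in the gate_40 version (mean 51.299).

The variance differs far more: 65903.322 for gate_30 versus 10669.736 for gate_40. The gate_30 group contains a single extreme value (maximum 49854), which inflates its variance.

The gap in means is therefore small and may be driven by that outlier. On its own it does not show that the gate_30 version causes players to play more rounds.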

##### Other Basic Statistics:
The other statistics (count, min, 25%, 50%, 75%, max) show mostly small differences between the two versions.

The exception is the maximum: 49854 in the gate_30 version versus 2640 in the gate_40 version.

The medians (50%) differ only slightly (17 for gate_30, 16 for gate_40), which suggests that the central tendency of the two distributions is similar.
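To see how a single extreme value can move the mean and variance while leaving the median almost untouched, here is a minimal sketch using only the standard library. The `rounds` list and the cutoff of 10000 are made up for illustration; they are not taken from cookie_cats.csv.

```python
import statistics

# Hypothetical round counts for illustration only (not the real dataset).
rounds = [0, 3, 5, 16, 17, 38, 51, 165, 179, 49854]

# Assumed cutoff, chosen only to drop the single extreme value above.
trimmed = [r for r in rounds if r < 10000]

for label, values in (("with outlier", rounds), ("without outlier", trimmed)):
    print(
        label,
        "mean:", round(statistics.mean(values), 3),
        "median:", statistics.median(values),
        "variance:", round(statistics.variance(values), 3),
    )
```

The mean and variance change by orders of magnitude once the extreme value is dropped, while the median barely moves. This is one reason to compare medians or use rank-based tests when a column such as sum_gamerounds is heavily skewed.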

In [10]:
# Copy the slice so the assignments below modify an independent DataFrame
control = df[df['version'] == 'gate_30'].copy()
control['retention_1'] = control['retention_1'].astype(int)
control['retention_7'] = control['retention_7'].astype(int)

control.head()

Unnamed: 0,userid,version,sum_gamerounds,retention_1,retention_7
0,116,gate_30,3,0,0
1,337,gate_30,38,1,0
6,1066,gate_30,0,0,0
11,2101,gate_30,0,0,0
13,2179,gate_30,39,1,0


In [11]:
# Copy the slice so the assignments below modify an independent DataFrame
test = df[df['version'] == 'gate_40'].copy()
test['retention_1'] = test['retention_1'].astype(int)
test['retention_7'] = test['retention_7'].astype(int)

test.head()

Unnamed: 0,userid,version,sum_gamerounds,retention_1,retention_7
2,377,gate_40,165,1,0
3,483,gate_40,1,0,0
4,488,gate_40,179,1,1
5,540,gate_40,187,1,1
7,1444,gate_40,2,0,0


### A/B Testing Analysis
First, let us analyse how moving the gate from level 30 to level 40 affects player retention 1 day and 7 days after install.

In [12]:
def calculate_returning_players_stats(df):
    returning_players_1_day = df['retention_1'].sum()
    returning_players_7_day = df['retention_7'].sum()
    total_users = len(df)
    
    percentage_1_day = (returning_players_1_day / total_users)*100
    percentage_7_day = (returning_players_7_day / total_users)*100
    
    return percentage_1_day, percentage_7_day

groups = {'control': control, 'test' : test}

for group_name, group_data in groups.items():
    percentage_1_day, percentage_7_day = calculate_returning_players_stats(group_data)
    print(f'Percentage of returning players of {group_name} group after 1 day: {round(percentage_1_day, 1)} %')
    print(f'Percentage of returning players of {group_name} group after 7 day: {round(percentage_7_day, 1)} %')

Percentage of returning players of control group after 1 day: 44.8 %
Percentage of returning players of control group after 7 day: 19.0 %
Percentage of returning players of test group after 1 day: 44.2 %
Percentage of returning players of test group after 7 day: 18.2 %


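The retention percentages are close: the control group is ahead by 0.6 percentage points after 1 day and by 0.8 percentage points after 7 days. Whether these gaps are statistically significant cannot be judged from the percentages alone.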
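One common way to check whether a gap between two retention rates is larger than chance would explain is a two-proportion z-test. The sketch below is not part of the original analysis; `two_proportion_ztest` is a hypothetical helper written with the standard library only, and the counts in the usage line are invented.

```python
import math

def two_proportion_ztest(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for the difference between two proportions.

    Uses the pooled proportion for the standard error, which is the
    usual choice under H0: both groups share the same true rate.
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Invented counts: 450 of 1000 returning players vs 440 of 1000.
z_stat, p_val = two_proportion_ztest(450, 1000, 440, 1000)
print(f"z = {z_stat:.3f}, p-value = {p_val:.3f}")
```

With the real data, the inputs would be `control['retention_7'].sum()`, `len(control)`, `test['retention_7'].sum()` and `len(test)`, assuming the retention columns have been mapped to 0/1 as above.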

- Significance level (alpha): 0.05
- p-value < 0.05: H0 is rejected.
- p-value >= 0.05: H0 is not rejected.

- H0: The data are normally distributed.
- H1: The data are not normally distributed.
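The decision rule above can be written as a small function. This is only an illustrative sketch; `decide` is a hypothetical helper, and the two p-values passed to it are the ones printed in this notebook's outputs.

```python
def decide(p_value, alpha=0.05):
    """Return the decision on H0 for a p-value at significance level alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    return "reject H0" if p_value < alpha else "fail to reject H0"

print(decide(0.0))     # Shapiro-Wilk p-values reported in this notebook
print(decide(0.0502))  # Mann-Whitney U p-value reported in this notebook
```

Note the wording "fail to reject": a p-value at or above alpha means the data do not provide enough evidence against H0, not that H0 has been shown to be true.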

In [13]:
# Shapiro-Wilk test for the control group
stat_control_1, p_control_1 = shapiro(control['retention_1'])
stat_control_7, p_control_7 = shapiro(control['retention_7'])

# Shapiro-Wilk test for the test group
stat_test_1, p_test_1 = shapiro(test['retention_1'])
stat_test_7, p_test_7 = shapiro(test['retention_7'])

print("Control 1-day Shapiro-Wilk test p-value:", p_control_1)
print("Control 7-day Shapiro-Wilk test p-value:", p_control_7)
print("Test 1-day Shapiro-Wilk test p-value:", p_test_1)
print("Test 7-day Shapiro-Wilk test p-value:", p_test_7)

Control 1-day Shapiro-Wilk test p-value: 0.0
Control 7-day Shapiro-Wilk test p-value: 0.0
Test 1-day Shapiro-Wilk test p-value: 0.0
Test 7-day Shapiro-Wilk test p-value: 0.0




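All four p-values are below 0.05, so H0 is rejected: the data are not normally distributed. (retention_1 and retention_7 are binary, so they cannot be normal by construction.) We therefore use the non-parametric Mann-Whitney U test (mannwhitneyu) to compare sum_gamerounds between the two groups.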

In [14]:
test_stat, p_value = sts.mannwhitneyu(control["sum_gamerounds"], test["sum_gamerounds"])
print('Test Stat = %.4f, p-value = %.4f' % (test_stat, p_value))

Test Stat = 1024331250.5000, p-value = 0.0502


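Since the p-value (0.0502) is slightly above 0.05, H0 is not rejected at the 5% level: the test does not show a statistically significant difference in sum_gamerounds between the two groups, although the result is borderline.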
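A borderline p-value says little about how large the difference is. One way to express the size is the share of (control, test) pairs in which the control player has the higher value, which is U divided by the number of pairs. The sketch below computes U by hand with average ranks for ties; `mann_whitney_u` is a hypothetical helper and the sample lists are invented, so the numbers do not describe the Cookie Cats data.

```python
def mann_whitney_u(x, y):
    """Return the Mann-Whitney U statistic for sample x (ties get average ranks)."""
    # Tag each value with its group: 0 for x, 1 for y.
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n = len(combined)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the end of the block of tied values starting at position i.
        j = i
        while j + 1 < n and combined[j + 1][0] == combined[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie block
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    n_x = len(x)
    return rank_sum_x - n_x * (n_x + 1) / 2

# Invented samples for illustration only.
sample_a = [3, 38, 0, 0, 39]
sample_b = [165, 1, 179, 187, 2]

u_a = mann_whitney_u(sample_a, sample_b)
effect = u_a / (len(sample_a) * len(sample_b))
print(f"U = {u_a}, share of pairs where sample_a ranks higher = {effect:.2f}")
```

A share near 0.5 means neither group tends to rank higher than the other; values near 0 or 1 indicate a strong shift. With very large samples, such as the roughly 45,000 players per group here, even a share very close to 0.5 can produce a small p-value.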