## Analysis of Mobile Game AB test results

### Table of contents
1. [Description of dataset](#Description)
2. [Main parameters](#parameters)
3. [Analysis of numeric data in column named 'sum_gamerounds'](#numeric)
  1. [Conclusion about sum_gamerounds](#gameround)
4. [Analysis of categorical data in column named 'retention_1'](#retention1)
  1. [Conclusion about data of retention_1](#retention_1) 
5. [Analysis of categorical data in column named 'retention_7'](#retention7)
  1. [Conclusion about data of retention_7](#retention_7)
6. [Conclusion and recommendations](#recommendations)

### Description of dataset <a name="Description"></a>
The data covers 90,189 players who installed the game while the AB test was running. The variables are:<br>
<br>
userid - a unique number that identifies each player.<br>
version - whether the player was put in the control group (gate_30 - a gate at level 30) or the group with the moved gate (gate_40 - a gate at level 40).<br>
sum_gamerounds - the number of game rounds played by the player during the first 14 days after install.<br>
retention_1 - did the player come back and play 1 day after installing?<br>
retention_7 - did the player come back and play 7 days after installing?<br>
When a player installed the game, he or she was randomly assigned to either gate_30 or gate_40.

In [17]:
import scipy.stats as stats
import numpy as np
import pandas as pd

import warnings
# Silence library warnings to keep the notebook output readable
warnings.filterwarnings('ignore')

In [2]:
data = pd.read_csv('./game_ab_test_results.csv')
data.head(10)

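| | userid | version | sum_gamerounds | retention_1 | retention_7 |
|---|---|---|---|---|---|
| 0 | 116 | gate_30 | 3 | 0 | 0 |
| 1 | 337 | gate_30 | 38 | 1 | 0 |
| 2 | 377 | gate_40 | 165 | 1 | 0 |
| 3 | 483 | gate_40 | 1 | 0 | 0 |
| 4 | 488 | gate_40 | 179 | 1 | 1 |
| 5 | 540 | gate_40 | 187 | 1 | 1 |
| 6 | 1066 | gate_30 | 0 | 0 | 0 |
| 7 | 1444 | gate_40 | 2 | 0 | 0 |
| 8 | 1574 | gate_40 | 108 | 1 | 1 |
| 9 | 1587 | gate_40 | 153 | 1 | 0 |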


Let's check the **main parameters** <a name ="parameters"></a>

In [3]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 90189 entries, 0 to 90188
Data columns (total 5 columns):
 #   Column          Non-Null Count  Dtype 
---  ------          --------------  ----- 
 0   userid          90189 non-null  int64 
 1   version         90189 non-null  object
 2   sum_gamerounds  90189 non-null  int64 
 3   retention_1     90189 non-null  int64 
 4   retention_7     90189 non-null  int64 
dtypes: int64(4), object(1)
memory usage: 3.4+ MB


In [4]:
data.describe()

| | userid | sum_gamerounds | retention_1 | retention_7 |
|---|---|---|---|---|
| count | 90189.0 | 90189.0 | 90189.0 | 90189.0 |
| mean | 4998412.0 | 51.872457 | 0.44521 | 0.186065 |
| std | 2883286.0 | 195.050858 | 0.496992 | 0.389161 |
| min | 116.0 | 0.0 | 0.0 | 0.0 |
| 25% | 2512230.0 | 5.0 | 0.0 | 0.0 |
| 50% | 4995815.0 | 16.0 | 0.0 | 0.0 |
| 75% | 7496452.0 | 51.0 | 1.0 | 0.0 |
| max | 9999861.0 | 49854.0 | 1.0 | 1.0 |


In [5]:
data.userid.nunique()

90189

In [9]:
data.version.value_counts()

gate_40    45489
gate_30    44700
Name: version, dtype: int64

All user IDs in the dataset are unique.<br>
The data types of all columns are correct.<br>
The two groups are nearly equal in size: 44,700 players (49.6%) in gate_30 and 45,489 players (50.4%) in gate_40.<br>
The dataset contains numeric data in the 'sum_gamerounds' column and categorical (binary) data in the 'retention_1' and 'retention_7' columns.<br>
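
The "nearly equal" group sizes can also be checked formally. The sketch below is a chi-square goodness-of-fit test on the two group counts printed above, under the assumption (not stated in the dataset description) that the intended split was exactly 50/50. It uses only the standard library; the variable names are illustrative.

```python
from math import erfc, sqrt

# Group sizes copied from the value_counts() output above
observed = {'gate_30': 44700, 'gate_40': 45489}

# Assumption: players were meant to be split 50/50 between the groups
total = sum(observed.values())
expected = total / len(observed)

# Chi-square goodness-of-fit statistic with 1 degree of freedom
chi2 = sum((n - expected) ** 2 / expected for n in observed.values())

# For 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2))
p_value = erfc(sqrt(chi2 / 2))

print(f'chi2 = {chi2:.3f}, p-value = {p_value:.4f}')
```

`scipy.stats.chisquare([44700, 45489])` performs the same test. With samples this large even a small deviation from 50/50 can be flagged, so the result is best read as a prompt to double-check the assignment procedure rather than as a verdict.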

Let's start with the analysis of the numeric data.

### Analysis of numeric data in column named 'sum_gamerounds' <a name = "numeric"></a>
Let's check for normality first.

In [10]:
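alpha = 0.05

# Shapiro-Wilk test; H0: the data are normally distributed.
# Note: SciPy documents that the p-value may be inaccurate for N > 5000.
st = stats.shapiro(data.sum_gamerounds)
print('Distribution is {}normal\n'.format('not ' if st[1] < alpha else ''))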

Distribution is not normal



Because the distribution is not normal, let's use the Mann-Whitney U test, which does not assume normality.

In [11]:
gate_30 = data.query('version == "gate_30"')
gate_40 = data.query('version == "gate_40"')

In [12]:
# Two-sided test; H0: both groups come from the same distribution
stats.mannwhitneyu(x=gate_30['sum_gamerounds'].values, y=gate_40['sum_gamerounds'].values, alternative='two-sided')

MannwhitneyuResult(statistic=1024331250.5, pvalue=0.05020880772044255)

In [13]:
gate_30.sum_gamerounds.mean()

52.45626398210291

In [14]:
gate_40.sum_gamerounds.mean()

51.29877552814966
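
The U statistic reported above can be converted into an effect size, which helps judge how large the difference between the groups is regardless of the p-value. The following is a minimal sketch that uses only the numbers already printed in this notebook. It assumes the reported statistic is the U of the first sample (gate_30), as recent SciPy versions return; the variable names are illustrative.

```python
# Values copied from the notebook output above
u_stat = 1024331250.5                  # Mann-Whitney U statistic
n_gate_30, n_gate_40 = 44700, 45489    # group sizes

# Assumption: u_stat is the U statistic of the first sample (gate_30).
# U / (n1 * n2) estimates P(X > Y) + 0.5 * P(X == Y), where X is a random
# gate_30 player's round count and Y is a random gate_40 player's.
prob_superiority = u_stat / (n_gate_30 * n_gate_40)

# Rank-biserial correlation: 0 means neither group tends to rank higher
rank_biserial = 2 * prob_superiority - 1

print(f'P(gate_30 > gate_40) ~ {prob_superiority:.4f}')
print(f'rank-biserial r = {rank_biserial:.4f}')
```

A probability of about 0.504 is very close to 0.5, so any tendency for gate_30 players to play more rounds is small.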

#### Conclusion about sum_gamerounds <a name = "gameround"></a>
According to the Mann-Whitney test, the p-value (0.0502) is marginally above alpha (0.05).<br>
At this significance level we cannot reject the null hypothesis that the gate_30 and gate_40 distributions of sum_gamerounds are equal.<br>
Let's check categorical data.

### Analysis of categorical data in column named 'retention_1' <a name = "retention1"></a>

In [15]:
n_gate_30_ret_1 = data[data['version'] == 'gate_30'].shape[0]
n_gate_40_ret_1 = data[data['version'] == 'gate_40'].shape[0]
suc_gate_30_ret_1 = data[data['version'] == 'gate_30'].retention_1.sum()
suc_gate_40_ret_1 = data[data['version'] == 'gate_40'].retention_1.sum()

print (n_gate_30_ret_1, n_gate_40_ret_1)
print (suc_gate_30_ret_1, suc_gate_40_ret_1)

44700 45489
20034 20119


In [18]:
from statsmodels.stats import proportion
# Chi-square test for equality of two proportions (retained players / group size)
chisq, pvalue, table = proportion.proportions_chisquare(
    np.array([suc_gate_30_ret_1, suc_gate_40_ret_1]),
    np.array([n_gate_30_ret_1, n_gate_40_ret_1]))

print(f'Results are chisq = {round(chisq, 3)}, pvalue = {round(pvalue, 3)}')

Results are chisq = 3.183, pvalue = 0.074


#### Conclusion about data of retention_1 <a name = "retention_1"></a>
According to the chi-square test, the p-value (0.074) is greater than alpha (0.05).<br>
At this significance level we cannot reject the null hypothesis that 1-day retention is the same for gate_30 and gate_40.
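
As a cross-check, the same comparison can be written as a pooled two-proportion z-test. For two groups the squared z statistic equals the chi-square statistic, so both tests give the same p-value. The sketch below uses only the counts printed above and the standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Counts copied from the notebook output above
n_30, n_40 = 44700, 45489          # players per group
ret_30, ret_40 = 20034, 20119      # players who came back after 1 day

rate_30 = ret_30 / n_30
rate_40 = ret_40 / n_40

# Pooled two-proportion z-test (H0: both groups share one retention rate)
pooled = (ret_30 + ret_40) / (n_30 + n_40)
se = sqrt(pooled * (1 - pooled) * (1 / n_30 + 1 / n_40))
z = (rate_30 - rate_40) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f'1-day retention: gate_30 = {rate_30:.4f}, gate_40 = {rate_40:.4f}')
print(f'z = {z:.3f}, z^2 = {z ** 2:.3f}, p-value = {p_value:.3f}')
```

The printed rates also show the size of the gap: roughly 44.8% for gate_30 versus 44.2% for gate_40.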

### Analysis of categorical data in column named 'retention_7' <a name = "retention7"></a>

In [19]:
n_gate_30_ret_7 = data[data['version'] == 'gate_30'].shape[0]
n_gate_40_ret_7 = data[data['version'] == 'gate_40'].shape[0]
suc_gate_30_ret_7 = data[data['version'] == 'gate_30'].retention_7.sum()
suc_gate_40_ret_7 = data[data['version'] == 'gate_40'].retention_7.sum()

print (n_gate_30_ret_7, n_gate_40_ret_7)
print (suc_gate_30_ret_7, suc_gate_40_ret_7)

44700 45489
8502 8279


In [20]:
from statsmodels.stats import proportion
# Chi-square test for equality of two proportions (retained players / group size)
chisq, pvalue, table = proportion.proportions_chisquare(
    np.array([suc_gate_30_ret_7, suc_gate_40_ret_7]),
    np.array([n_gate_30_ret_7, n_gate_40_ret_7]))

print(f'Results are chisq = {round(chisq, 3)}, pvalue = {round(pvalue, 3)}')

Results are chisq = 10.013, pvalue = 0.002


The p-value is well below alpha, so the difference between the samples is statistically significant. Seven-day retention is higher in the first version of the game (gate_30).

#### Conclusion about data of retention_7 <a name = "retention_7"></a>
According to the chi-square test, the p-value (0.002) is less than alpha (0.05).<br>
At this significance level we reject the null hypothesis that 7-day retention is the same for gate_30 and gate_40.<br>
This means that players in the gate_40 group were less likely to come back to the game 7 days after installing than players in the gate_30 group (8,279 of 45,489, about 18.2%, versus 8,502 of 44,700, about 19.0%).
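
To show how large the 7-day retention gap is, and not only that it is significant, a confidence interval for the difference in proportions is useful. The sketch below computes a 95% Wald (normal-approximation) interval from the counts printed above. The choice of the Wald interval is an assumption made for illustration, not part of the original analysis, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Counts copied from the notebook output above
n_30, n_40 = 44700, 45489        # players per group
ret_30, ret_40 = 8502, 8279      # players who came back after 7 days

rate_30 = ret_30 / n_30
rate_40 = ret_40 / n_40
diff = rate_30 - rate_40

# Unpooled standard error of the difference between two proportions
se = sqrt(rate_30 * (1 - rate_30) / n_30 + rate_40 * (1 - rate_40) / n_40)

# 95% Wald interval: diff +/- z_crit * se, with z_crit of about 1.96
z_crit = NormalDist().inv_cdf(0.975)
ci_low, ci_high = diff - z_crit * se, diff + z_crit * se

print(f'7-day retention: gate_30 = {rate_30:.4f}, gate_40 = {rate_40:.4f}')
print(f'difference = {diff:.4f}, 95% CI = ({ci_low:.4f}, {ci_high:.4f})')
```

The interval lies entirely above zero (roughly 0.3 to 1.3 percentage points in favor of gate_30), which agrees with the chi-square result.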

### Conclusion and recommendations <a name = "recommendations"></a>
The average number of game rounds and 1-day retention do not differ significantly between the two versions,<br>
although both are slightly lower for gate_40.<br>
At the same time, 7-day retention for gate_40 is significantly lower than for gate_30.<br>
Based on this analysis, I recommend keeping the gate at level 30 (the gate_30 version).