# 1. Introduction

Online gaming has become a lucrative industry, with millions of players worldwide. As competition in the market intensifies, user engagement becomes a critical metric for online gaming agencies, and A/B testing is a standard way to check whether a change actually improves it. In this post, we walk through the mechanics of an A/B test for an online gaming agency, using a Kaggle dataset and Python code examples to demonstrate the process.

# 2. Data Source

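We will use the **"Video Game Sales with Ratings"** dataset from Kaggle. The dataset contains information about video game sales and ratings for games released from 1980 onward; the file used here is a snapshot dated 22 December 2016, according to its name. It has 16 columns: Name, Platform, Year_of_Release, Genre, Publisher, NA_Sales, EU_Sales, JP_Sales, Other_Sales, Global_Sales, Critic_Score, Critic_Count, User_Score, User_Count, Developer, and Rating. The score, developer, and rating fields are missing for some games.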

[Read more](https://medium.com/@yennhi95zz/optimizing-user-engagement-in-online-gaming-agency-with-a-b-testing-a-case-study-566777d22a3d)

In [1]:
import pandas as pd
import numpy as np
import scipy.stats as stats
import seaborn as sns
import matplotlib.pyplot as plt

In [2]:
# Load the data
df = pd.read_csv("/kaggle/input/video-game-sales-with-ratings/Video_Games_Sales_as_at_22_Dec_2016.csv")
df.head()

| | Name | Platform | Year_of_Release | Genre | Publisher | NA_Sales | EU_Sales | JP_Sales | Other_Sales | Global_Sales | Critic_Score | Critic_Count | User_Score | User_Count | Developer | Rating |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Wii Sports | Wii | 2006.0 | Sports | Nintendo | 41.36 | 28.96 | 3.77 | 8.45 | 82.53 | 76.0 | 51.0 | 8.0 | 322.0 | Nintendo | E |
| 1 | Super Mario Bros. | NES | 1985.0 | Platform | Nintendo | 29.08 | 3.58 | 6.81 | 0.77 | 40.24 | NaN | NaN | NaN | NaN | NaN | NaN |
| 2 | Mario Kart Wii | Wii | 2008.0 | Racing | Nintendo | 15.68 | 12.76 | 3.79 | 3.29 | 35.52 | 82.0 | 73.0 | 8.3 | 709.0 | Nintendo | E |
| 3 | Wii Sports Resort | Wii | 2009.0 | Sports | Nintendo | 15.61 | 10.93 | 3.28 | 2.95 | 32.77 | 80.0 | 73.0 | 8.0 | 192.0 | Nintendo | E |
| 4 | Pokemon Red/Pokemon Blue | GB | 1996.0 | Role-Playing | Nintendo | 11.27 | 8.89 | 10.22 | 1.0 | 31.37 | NaN | NaN | NaN | NaN | NaN | NaN |


In [3]:
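# Simulate the two groups (Design A and Design B) with two independent random samples of 1,000 rows each.
# Both samples come from the same DataFrame, so they may share rows and no true difference between them is expected.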
sample_A = df.sample(n=1000, random_state=1)
sample_B = df.sample(n=1000, random_state=2)
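
In a real A/B test each user belongs to exactly one group, whereas two separate `df.sample` calls can pick the same row twice. The sketch below shows one way to draw two disjoint groups using only the standard library. The helper name `split_ab` and the toy label range are illustrative assumptions, not part of the original notebook.

```python
import random


def split_ab(labels, n_per_group, seed=0):
    """Randomly assign 2 * n_per_group distinct labels to two disjoint groups.

    `split_ab` is a hypothetical helper written for this illustration.
    """
    labels = list(labels)
    if 2 * n_per_group > len(labels):
        raise ValueError("not enough rows for two disjoint groups")
    rng = random.Random(seed)
    # sample() draws without replacement, so no label is chosen twice
    chosen = rng.sample(labels, 2 * n_per_group)
    return chosen[:n_per_group], chosen[n_per_group:]


# Toy stand-in for the row labels of a DataFrame (assumed, not the real data)
group_a, group_b = split_ab(range(10_000), n_per_group=1000, seed=1)
overlap = set(group_a) & set(group_b)
print(len(group_a), len(group_b), len(overlap))  # prints: 1000 1000 0
```

With the notebook's DataFrame, the same idea would be `split_ab(df.index, 1000)` followed by `df.loc[group_a]` and `df.loc[group_b]`.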

In [4]:
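# Summarize each sample with its mean and standard deviation (np.std uses the population formula, ddof=0, by default).
# These numbers only describe the samples; whether the difference is significant is tested in the next cell.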
print("Sample A mean:", np.mean(sample_A["Global_Sales"]))
print("Sample B mean:", np.mean(sample_B["Global_Sales"]))
print("Sample A std:", np.std(sample_A["Global_Sales"]))
print("Sample B std:", np.std(sample_B["Global_Sales"]))

Sample A mean: 0.57697
Sample B mean: 0.5191699999999999
Sample A std: 1.8222463387533532
Sample B std: 1.4615469924364388
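
To judge how large the gap between the two means is relative to the spread of the data, one option is a standardized effect size such as Cohen's d. The sketch below computes it from the summary statistics printed above. Cohen's d is an addition for illustration and is not computed in the original notebook; the variable names are assumptions.

```python
import math

# Summary statistics copied from the output above
n = 1000  # rows per sample
mean_a, mean_b = 0.57697, 0.5191699999999999
std_a, std_b = 1.8222463387533532, 1.4615469924364388  # population std (ddof=0)

# Convert population variances (ddof=0) to sample variances (ddof=1)
var_a = std_a**2 * n / (n - 1)
var_b = std_b**2 * n / (n - 1)

# With equal group sizes, the pooled variance is the average of the two
pooled_sd = math.sqrt((var_a + var_b) / 2)

# Cohen's d: difference in means measured in pooled standard deviations
cohens_d = (mean_a - mean_b) / pooled_sd
print("Cohen's d:", cohens_d)
```

A value this close to zero means the gap between the means is a small fraction of one standard deviation, which is what one would expect from two random samples of the same data.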


In [5]:
# Two-sample t-test (equal variances assumed by default): compares the means of the two samples and
# returns the probability of seeing a difference at least this large if the true means were equal.
t, p = stats.ttest_ind(sample_A["Global_Sales"], sample_B["Global_Sales"])
print("t-value:", t)
print("p-value:", p)

t-value: 0.7820697550511971
p-value: 0.4342662497844598
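
To make the test less of a black box, the t-value can be reproduced by hand from the means and standard deviations printed earlier. The sketch below does this with the pooled-variance formula, then approximates the two-sided p-value with a normal distribution. The normal approximation is an assumption made here to avoid extra dependencies; it is reasonable with 1,998 degrees of freedom but is not exactly what `stats.ttest_ind` computes.

```python
import math

# Summary statistics copied from the earlier output
n_a = n_b = 1000
mean_a, mean_b = 0.57697, 0.5191699999999999
std_a, std_b = 1.8222463387533532, 1.4615469924364388  # population std (ddof=0)

# Sample variances (ddof=1), as used by the t-test
var_a = std_a**2 * n_a / (n_a - 1)
var_b = std_b**2 * n_b / (n_b - 1)

# Pooled variance and standard error of the difference in means
dof = n_a + n_b - 2
pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / dof
std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))

t_stat = (mean_a - mean_b) / std_err

# Two-sided p-value under a normal approximation to the t distribution
p_approx = math.erfc(abs(t_stat) / math.sqrt(2))

print("t-value:", t_stat)
print("approximate p-value:", p_approx)
```

The t-value should agree with the one printed above, and the approximate p-value should land close to 0.434.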


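After performing A/B testing on the "Video Game Sales with Ratings" dataset, we found that Design A had a slightly higher mean value of global sales than Design B (about 0.577 versus 0.519). However, the t-test did not show a significant difference between the two designs: the p-value of about 0.434 is well above 0.05. Therefore, we fail to reject the null hypothesis and cannot conclude that either design is more effective in engaging users. This is the expected result, because both samples were drawn at random from the same data. In a real A/B test, each group would be shown a different design, and the comparison would use an engagement metric rather than sales.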