# Teamfight Tactics Monetization Analysis
## Data
### Data Source
The data is a mock-up generated with ChatGPT (GPT-4o). Conversation link: https://chatgpt.com/c/6718fc2e-8e1c-800b-aeb7-33697fa34480 (JC)

### Data Structure and Key Variables
We’ll generate data for the following variables:

### Player Info:
- **`player_id`**: Unique identifier for each player.
- **`region`**: Region of the player (e.g., NA, EU, APAC).
- **`total_playtime_hours`**: Total hours spent playing TFT.
- **`days_active`**: Number of days the player has been active in the game.

### Monetization:
- **`spending_category`**: Categorizes spending (e.g., Free, Small Spender, Dolphin, Whale).
- **`total_spent`**: Total amount spent by the player on in-game purchases.
- **`battle_pass_purchases`**: Whether the player purchased the battle pass (stored as 1 = yes, 0 = no).
- **`skins_purchased`**: Number of skins purchased by the player.
- **`loot_boxes_purchased`**: Number of loot boxes purchased by the player.

### Engagement Metrics:
- **`days_since_last_purchase`**: How many days ago the player made their last purchase.
- **`daily_sessions`**: Average number of gaming sessions per day.
- **`sessions_per_week`**: How many sessions per week the player engages in.
- **`churn_status`**: Whether the player is at risk of churn (Yes/No).

### Retention and Conversions:
- **`d1_retention`**: Whether the player returned on day 1 after first playing (stored as 1 = yes, 0 = no).
- **`d7_retention`**: Whether the player returned on day 7 after first playing (stored as 1 = yes, 0 = no).
- **`d30_retention`**: Whether the player returned on day 30 after first playing (stored as 1 = yes, 0 = no).
- **`conversion_rate`**: Percentage likelihood of the player making a purchase. This variable is not produced by the generation code and does not appear in the dataset.
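
In the generated data the retention flags are stored as 1/0, so the share of players retained at each horizon is simply the column mean. The sketch below illustrates this on the retention values of the first five displayed rows; the name `sample` is only for illustration, and the notebook's `df` would be used in practice.

```python
import pandas as pd

# Retention flags of the first five rows shown in the notebook output.
sample = pd.DataFrame({
    "d1_retention": [1, 1, 1, 1, 1],
    "d7_retention": [1, 0, 1, 1, 0],
    "d30_retention": [0, 1, 0, 1, 0],
})

retention_cols = ["d1_retention", "d7_retention", "d30_retention"]

# The mean of a 0/1 flag equals the fraction of players who returned.
retention_rates = sample[retention_cols].mean()
print(retention_rates)
```

With only five rows these rates are not meaningful; the same call on the full dataset gives the overall D1, D7 and D30 retention.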

### Revenue Metrics:
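- **`arpu`**: Average Revenue Per User. In this dataset it is computed per player as `total_spent / days_active`.
- **`arppu`**: Average Revenue Per Paying User. Computed the same way for players with `total_spent > 0`, and set to 0 otherwise.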
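
ARPU and ARPPU are more commonly reported as cohort-level aggregates: total revenue divided by the number of players, or by the number of paying players. The following is a minimal sketch of that aggregate form, using the `total_spent` values of the first five displayed rows. It is an alternative view and not how the dataset's `arpu` and `arppu` columns were produced.

```python
import pandas as pd

# total_spent for the first five rows shown in the notebook output.
sample = pd.DataFrame({
    "player_id": [1, 2, 3, 4, 5],
    "total_spent": [0, 200, 0, 0, 50],
})

total_revenue = sample["total_spent"].sum()
n_players = len(sample)
n_payers = int((sample["total_spent"] > 0).sum())

# Cohort-level ARPU: revenue spread over every player.
arpu = total_revenue / n_players

# Cohort-level ARPPU: revenue spread over paying players only.
# Guard against a cohort with no payers.
arppu = total_revenue / n_payers if n_payers else float("nan")

print(f"ARPU={arpu:.2f}, ARPPU={arppu:.2f}")
```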

In [None]:
# # Code that generated the dataset:
# import pandas as pd
# import numpy as np

# # Set random seed for reproducibility
# np.random.seed(42)

# # Number of players
# num_players = 10000

# # Simulating player data
# player_data = {
#     'player_id': np.arange(1, num_players + 1),
#     'region': np.random.choice(['NA', 'EU', 'APAC', 'LATAM'], size=num_players, p=[0.35, 0.30, 0.20, 0.15]),
#     'total_playtime_hours': np.random.normal(100, 50, num_players).clip(min=0).astype(int),
#     'days_active': np.random.randint(1, 365, num_players),
#     'spending_category': np.random.choice(['Free', 'Small Spender', 'Dolphin', 'Whale'], size=num_players, p=[0.60, 0.25, 0.10, 0.05]),
#     'total_spent': np.random.choice([0, 5, 10, 50, 100, 200], size=num_players, p=[0.60, 0.10, 0.10, 0.10, 0.05, 0.05]),
#     'battle_pass_purchases': np.random.choice([0, 1], size=num_players, p=[0.75, 0.25]),
#     'skins_purchased': np.random.poisson(2, num_players),
#     'loot_boxes_purchased': np.random.poisson(3, num_players),
#     'days_since_last_purchase': np.random.randint(0, 60, num_players),
#     'daily_sessions': np.random.normal(3, 1.5, num_players).clip(min=0).astype(int),
#     'sessions_per_week': np.random.normal(21, 10, num_players).clip(min=1).astype(int),
#     'churn_status': np.random.choice(['Yes', 'No'], size=num_players, p=[0.30, 0.70]),
#     'd1_retention': np.random.choice([1, 0], size=num_players, p=[0.80, 0.20]),
#     'd7_retention': np.random.choice([1, 0], size=num_players, p=[0.50, 0.50]),
#     'd30_retention': np.random.choice([1, 0], size=num_players, p=[0.25, 0.75]),
# }

# # Creating DataFrame
# df_players = pd.DataFrame(player_data)

# # Calculating ARPU and ARPPU (here: each player's spend per active day)
# df_players['arpu'] = df_players['total_spent'] / df_players['days_active']
# df_players['arppu'] = np.where(df_players['total_spent'] > 0, df_players['total_spent'] / df_players['days_active'], 0)

# # Display the first few rows of the dataset
# # (ace_tools is only available inside the ChatGPT sandbox; use df_players.head() locally)
# import ace_tools as tools; tools.display_dataframe_to_user(name="Teamfight Tactics Monetization Dataset", dataframe=df_players)
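
In the generation code above, `spending_category` and `total_spent` are sampled independently, so a player can be labelled `Free` while having a non-zero `total_spent`. One possible way to keep the two consistent is to derive the category from the amount. The sketch below assumes hypothetical tier boundaries (0, 10 and 100) that are not part of the original code.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Same spend distribution as the original generator, on a smaller sample.
total_spent = rng.choice(
    [0, 5, 10, 50, 100, 200],
    size=1000,
    p=[0.60, 0.10, 0.10, 0.10, 0.05, 0.05],
)

# Hypothetical tier boundaries, chosen only for illustration:
# 0 -> Free, (0, 10] -> Small Spender, (10, 100] -> Dolphin, > 100 -> Whale.
bins = [-np.inf, 0, 10, 100, np.inf]
labels = ["Free", "Small Spender", "Dolphin", "Whale"]
spending_category = pd.cut(total_spent, bins=bins, labels=labels)

df_consistent = pd.DataFrame({
    "total_spent": total_spent,
    "spending_category": spending_category,
})
print(df_consistent.groupby("spending_category", observed=True)["total_spent"].agg(["min", "max", "count"]))
```

Deriving the label from the amount changes the category proportions relative to the original `p=[0.60, 0.25, 0.10, 0.05]`, so the thresholds would need tuning if those proportions matter.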

In [5]:
import pandas as pd
import numpy as np

In [24]:
# Display settings (to see all data)
pd.set_option('display.max_colwidth', None)
pd.set_option('display.width', 1000)  # Set a high value to ensure no line breaks

pd.set_option('display.max_columns', 200)  # limit on columns (replace 200 by None to see all)
pd.set_option('display.max_rows', 1000)     # limit on rows (replace 1000 by None to see all)

In [25]:
# Loading The Dataset
df = pd.read_csv("../2-prepared Data/Teamfight_Tactics_Monetization_Dataset.csv")
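
The `Unnamed: 0` column in the displayed output most likely comes from the CSV having been written together with its row index. Assuming that is the case, passing `index_col=0` to `pd.read_csv` treats that first column as the index instead of data. The sketch below demonstrates the difference on an in-memory stand-in for the file.

```python
import io
import pandas as pd

# Stand-in for the CSV file: a small frame written with its default index.
csv_text = pd.DataFrame(
    {"player_id": [1, 2], "region": ["EU", "LATAM"]}
).to_csv()

# Default read: the saved index comes back as an extra "Unnamed: 0" column.
with_index_column = pd.read_csv(io.StringIO(csv_text))

# index_col=0: the first column is used as the row index.
without_index_column = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(with_index_column.columns))
print(list(without_index_column.columns))
```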

In [26]:
display(df)

Unnamed: 0,player_id,region,total_playtime_hours,days_active,spending_category,total_spent,battle_pass_purchases,skins_purchased,loot_boxes_purchased,days_since_last_purchase,daily_sessions,sessions_per_week,churn_status,d1_retention,d7_retention,d30_retention,arpu,arppu
0,1,EU,25,62,Free,0,0,1,2,11,0,11,No,1,1,0,0.000000,0.000000
1,2,LATAM,43,262,Free,200,0,1,2,52,1,23,Yes,1,0,1,0.763359,0.763359
2,3,APAC,119,218,Free,0,1,4,2,38,1,41,No,1,1,0,0.000000,0.000000
3,4,EU,41,232,Dolphin,0,0,1,5,45,0,17,No,1,1,1,0.000000,0.000000
4,5,,155,336,Small Spender,50,0,0,4,1,0,37,No,1,0,0,0.148810,0.148810
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
9995,9996,LATAM,98,340,Dolphin,0,0,2,5,43,2,27,Yes,1,0,0,0.000000,0.000000
9996,9997,LATAM,139,8,Free,0,0,1,3,28,4,1,No,1,0,0,0.000000,0.000000
9997,9998,LATAM,149,270,Small Spender,10,1,4,6,4,3,26,No,1,0,0,0.037037,0.037037
9998,9999,EU,65,318,Free,50,0,3,1,49,1,12,No,1,1,0,0.157233,0.157233
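
The displayed rows hint at two data-quality issues worth checking: one row appears to have no `region` value, and some `spending_category` labels disagree with `total_spent` (a `Free` player with 200 spent, a `Dolphin` with 0). A minimal sketch of both checks follows, run on the first five displayed rows and assuming the empty region is read as a missing value. On the full data, `df` would replace `sample`.

```python
import numpy as np
import pandas as pd

# First five rows of the displayed output, restricted to the relevant columns.
sample = pd.DataFrame({
    "player_id": [1, 2, 3, 4, 5],
    "region": ["EU", "LATAM", "APAC", "EU", np.nan],
    "spending_category": ["Free", "Free", "Free", "Dolphin", "Small Spender"],
    "total_spent": [0, 200, 0, 0, 50],
})

# Count missing values per column.
missing_per_column = sample.isna().sum()

# Players labelled Free who nevertheless spent money.
free_but_paid = sample[(sample["spending_category"] == "Free") & (sample["total_spent"] > 0)]

# Players in a paying tier with no recorded spend.
paid_tier_but_zero = sample[(sample["spending_category"] != "Free") & (sample["total_spent"] == 0)]

print(missing_per_column)
print(free_but_paid)
print(paid_tier_but_zero)
```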


In [30]:
# Checking for duplicates
duplicates = df[df.duplicated()]
display(duplicates)

# No fully duplicated rows found (the result is empty)

Unnamed: 0,player_id,region,total_playtime_hours,days_active,spending_category,total_spent,battle_pass_purchases,skins_purchased,loot_boxes_purchased,days_since_last_purchase,daily_sessions,sessions_per_week,churn_status,d1_retention,d7_retention,d30_retention,arpu,arppu
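
`df.duplicated()` compares whole rows, and `Unnamed: 0` holds a distinct value on every displayed row, so a full-row comparison may never flag anything even if a player were repeated. A stricter check, sketched below on a hypothetical three-row frame that contains a deliberate repeat, looks at `player_id` alone or ignores the index-like column.

```python
import pandas as pd

# Hypothetical data with a deliberately repeated player, for illustration only.
sample = pd.DataFrame({
    "Unnamed: 0": [0, 1, 2],
    "player_id": [1, 2, 2],
    "region": ["EU", "LATAM", "LATAM"],
})

# Full-row comparison: the unique index-like column hides the repeat.
full_row_dupes = sample[sample.duplicated()]

# Key-based comparison: every row whose player_id occurs more than once.
key_dupes = sample[sample.duplicated(subset="player_id", keep=False)]

# Content comparison: drop the index-like column before comparing rows.
content_dupes = sample[sample.drop(columns="Unnamed: 0").duplicated()]

print(len(full_row_dupes), len(key_dupes), len(content_dupes))
```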
