# Predicting Player Activity on a Minecraft Research Server


### Introduction


In this investigation, we predict the total hours played by each player in the players.csv dataset. The data come from a study led by Frank Wood, in which players' actions and profiles were observed on a Minecraft server. One of the study's key challenges is predicting which types of players are likely to generate large amounts of gameplay data. Knowing this would allow the researchers to target recruitment more effectively and plan the computing resources needed to operate the server. For this report, we ask whether the data describing each player in players.csv can predict that player's total hours played. The dataset contains 196 observations and 7 variables (experience, subscription status, hashed email, played hours, name, gender, and age), summarized below:


| Variable Name | Type | Description |
|---|---|---|
| experience | character | Player's experience level: beginner, regular, amateur, veteran, or pro |
| subscribe | logical | Whether the player subscribed to the newsletter (TRUE = subscribed, FALSE = not subscribed); 144 players are subscribed and 52 are not |
| hashedemail | character | Hashed player email, used as an anonymized identifier |
| played_hours | numeric | Total number of hours the player has spent playing |
| name | character | Player's name |
| gender | character | Player's gender |
| age | numeric | Player's age in years |
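
The counts quoted in the summary (number of rows, number of columns, and the subscribed/not-subscribed split) are the kind of facts that can be checked directly in R. The sketch below shows one way to do that. It uses a small hypothetical tibble, `players_toy`, with the same column names and types as the table above, so that it runs on its own; the rows are made up for illustration and are not taken from players.csv.

```r
library(tidyverse)

# Hypothetical stand-in for players.csv: same columns as the summary table,
# but the four rows below are invented for illustration only.
players_toy <- tibble(
  experience   = c("pro", "amateur", "veteran", "beginner"),
  subscribe    = c(TRUE, TRUE, FALSE, TRUE),
  hashedemail  = c("a1f3", "b2e4", "c3d5", "d4c6"),
  played_hours = c(30.3, 0.7, 0.0, 1.5),
  name         = c("Player1", "Player2", "Player3", "Player4"),
  gender       = c("Female", "Male", "Male", "Non-binary"),
  age          = c(19, 21, 17, 22)
)

# Number of observations (rows) and variables (columns)
data_shape <- dim(players_toy)

# How many players are subscribed vs. not subscribed
subscribe_counts <- players_toy |>
  count(subscribe)

# Column names and types at a glance
glimpse(players_toy)
```

Run against the real data (for example, after reading it with `read_csv("players.csv")`), the same calls would be expected to report the 196 rows, 7 columns, and 144/52 subscription split described above.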


To begin exploring our question, we chose the predictor variables that we expected to be most informative about total hours played: gender, age, subscription status, and experience level. Gender and age capture basic demographic differences that may affect gaming habits. Subscription status reflects a player’s level of interest or engagement, which may relate to how much they play. Experience level indicates how comfortable a player is with the game, which can affect how long they stay active on the server. This narrowed our question to: ***Can gender, age, subscription status, and experience level be used to predict the total hours played by a player?***
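
As a rough illustration of this variable choice, the sketch below keeps the response (`played_hours`) and the four chosen predictors, drops the identifier columns, and converts the categorical predictors to factors. The tibble `players_toy` and its rows are hypothetical and exist only so the example runs by itself; treating `gender`, `subscribe`, and `experience` as factors is an assumption about how they might be prepared for modelling, not a step stated in this report.

```r
library(tidyverse)

# Hypothetical rows for illustration only; not taken from players.csv
players_toy <- tibble(
  experience   = c("pro", "amateur", "veteran"),
  subscribe    = c(TRUE, FALSE, TRUE),
  hashedemail  = c("a1f3", "b2e4", "c3d5"),
  played_hours = c(12.5, 0.2, 3.0),
  name         = c("Player1", "Player2", "Player3"),
  gender       = c("Female", "Male", "Non-binary"),
  age          = c(19, 24, 31)
)

# Keep the response and the four chosen predictors; drop name and hashedemail.
# Categorical predictors are converted to factors (an assumed preparation step).
players_model <- players_toy |>
  select(played_hours, gender, age, subscribe, experience) |>
  mutate(across(c(gender, subscribe, experience), as.factor))

players_model
```

Dropping `name` and `hashedemail` reflects that they identify individual players rather than describe them, so they are not among the chosen predictors.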

### Method & Results

# Loading appropriate packages 
library(tidyverse)
library(repr)
library(tidymodels)