# TITLE

## Introduction

### Background

Understanding player engagement is a critical aspect of game research because it influences game design, resource allocation, and targeted player recruitment. This has led to increased interest in analyzing player activity data. A research group in the Computer Science department at the University of British Columbia (UBC), led by Frank Wood, is investigating how players interact with video games by collecting their in-game behavioral data.

For their study, the research team has set up a Minecraft server, where players' in-game actions are recorded as they navigate through the virtual world. The data collected provides an understanding of how different types of players engage with the game. However, not all players contribute equally. Some engage with the game significantly more than others, and running such a project efficiently requires targeted recruitment strategies to attract the players who will contribute more data. By identifying player characteristics that correlate with high levels of engagement, the research team can focus their recruitment efforts on individuals who are most likely to generate the data that they are looking for.

### Research Question

This report aims to answer the following question:

**"Can a player's experience level, age, and gender predict the total number of hours they play, as recorded in the players dataset?"**

By analyzing how these player characteristics relate to total playtime, we can determine which groups are the most engaged. These findings will help the research team refine recruitment strategies and allocate their resources more efficiently.

### The Datasets

To answer this question, we used two datasets: <br>
**`players.csv`** &mdash; Contains general demographic and experience-related information about players <br>
**`sessions.csv`** &mdash; The logs of individual game sessions

***
The descriptive summary of the variables in the **`players.csv`** dataset: <br>
<br> - Number of observations: **196** (indicates 196 different users/players).
<br> - Number of variables: **7** (listed below)

| Variable | Type | Description |
| --- | --- | --- | 
| `experience`| categorical (chr) | Refers to the player's experience level (amateur, regular, veteran, pro). |
| `subscribe` | logical (lgl) | Whether the player is a subscriber to the game-related newsletter. |
| `hashedEmail` | categorical (chr) | A hashed version of the player's email address that serves as an anonymized identifier. Hashing is a one-way transformation rather than encryption, so the actual email addresses are not stored, which protects player privacy. |
| `played_hours` | numerical (dbl) | Total hours played by the player. |
| `name` | categorical (chr) | Player's name. |
| `gender` | categorical (chr) | Gender of the player. | 
| `Age` | numerical (dbl) | Player's age. |

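The column types in the table can be applied when the file is read. The following is a minimal sketch in Python using only the standard library. The two inline rows are made-up placeholders rather than real records, the helper `parse_player` is hypothetical, and the `TRUE`/`FALSE` encoding of `subscribe` and the `NA` marker for missing ages are assumptions about how the CSV stores those values.

```python
import csv
import io

# Made-up rows for illustration only; real data would come from players.csv.
SAMPLE_PLAYERS_CSV = """experience,subscribe,hashedEmail,played_hours,name,gender,Age
pro,TRUE,abc123,30.3,PlayerA,Male,19
amateur,FALSE,def456,0.0,PlayerB,Female,21
"""

def parse_player(row):
    """Convert one raw CSV row (all strings) to the types listed in the table."""
    return {
        "experience": row["experience"],
        # Assumed encoding of the logical column.
        "subscribe": row["subscribe"].strip().upper() == "TRUE",
        "hashedEmail": row["hashedEmail"],
        "played_hours": float(row["played_hours"]),
        "name": row["name"],
        "gender": row["gender"],
        # Assumed missing-value markers; a missing age becomes None.
        "Age": float(row["Age"]) if row["Age"] not in ("", "NA") else None,
    }

players = [parse_player(r) for r in csv.DictReader(io.StringIO(SAMPLE_PLAYERS_CSV))]
print(len(players), "players,", len(players[0]), "variables")
```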

Other notes for the **`players.csv`** dataset:
<br> - Some variables, such as `gender` and `experience`, may not be evenly distributed across their categories, which may bias predictions.
<br> - `played_hours` may contain outliers: some players have extreme values, which could skew averages and affect the modeling.

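Both concerns can be checked with a few summary statistics. The sketch below uses hypothetical values chosen only to illustrate the effect; they are not taken from `players.csv`. It computes the share of players in each category and compares the mean with the median, which is less sensitive to extreme values.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical values for illustration; not taken from players.csv.
experience = ["amateur", "amateur", "amateur", "regular", "veteran", "pro"]
played_hours = [0.0, 0.1, 0.3, 0.5, 1.2, 150.0]

# Class balance: share of players in each experience level.
counts = Counter(experience)
shares = {level: n / len(experience) for level, n in counts.items()}

# A single extreme value pulls the mean far above the median.
hours_mean = mean(played_hours)
hours_median = median(played_hours)

print(shares)
print(f"mean={hours_mean:.2f}, median={hours_median:.2f}")
```

A large gap between the mean and the median, as in this toy example, is one sign that a robust summary or a transformation of `played_hours` may be worth considering before modeling.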
***
The descriptive summary of the variables in the **`sessions.csv`** dataset: <br>
<br> - Number of observations: **1535** (indicates 1535 recorded sessions).
<br> - Number of variables: **5** (listed below)

| Variable | Type | Description |
| --- | --- | --- | 
| `hashedEmail`| categorical (chr) | Anonymized player identifier, matches the **`players.csv`** dataset. |
| `start_time` | categorical (chr) | Date and time at which the player's session started, stored as text. |
| `end_time` | categorical (chr) | Date and time at which the player's session ended, stored as text. |
| `original_start_time` | numerical (dbl) | A timestamp version of `start_time`. |
| `original_end_time` | numerical (dbl) | A timestamp version of `end_time`. |

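Because `start_time` and `end_time` are stored as text, they need to be parsed before session lengths can be computed. The sketch below shows one way to do this in Python. The timestamp layout in `TIME_FORMAT` is an assumption and should be adjusted to match the file, the helper `session_minutes` is hypothetical, and the example values are made up.

```python
from datetime import datetime

# Assumed timestamp layout (day/month/year hour:minute); adjust to match the file.
TIME_FORMAT = "%d/%m/%Y %H:%M"

def session_minutes(start_time, end_time, fmt=TIME_FORMAT):
    """Return the session length in minutes from two timestamp strings."""
    start = datetime.strptime(start_time, fmt)
    end = datetime.strptime(end_time, fmt)
    return (end - start).total_seconds() / 60

# Made-up example values, not a row from sessions.csv.
print(session_minutes("01/06/2024 18:12", "01/06/2024 18:24"))
```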
Other notes for the **`sessions.csv`** dataset:
<br> - There are far more session records (1535) than players (196), so at least some players have multiple sessions.

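Since `hashedEmail` appears in both datasets, it can be used to count sessions per player. The following is a minimal sketch with made-up identifiers; real values are hashed email strings, and only the key column is shown.

```python
from collections import Counter

# Made-up identifiers for illustration; real values are hashed email strings.
players = [{"hashedEmail": "abc123"}, {"hashedEmail": "def456"}, {"hashedEmail": "ghi789"}]
sessions = [
    {"hashedEmail": "abc123"},
    {"hashedEmail": "abc123"},
    {"hashedEmail": "abc123"},
    {"hashedEmail": "def456"},
]

# Count session rows per identifier, then look up each player (0 if no sessions).
session_counts = Counter(s["hashedEmail"] for s in sessions)
sessions_per_player = {
    p["hashedEmail"]: session_counts.get(p["hashedEmail"], 0) for p in players
}
print(sessions_per_player)
```

Players who never logged a session keep a count of zero here, which makes them easy to distinguish from players with many sessions.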
## Methods & Results