# <center>The Rise and Fall of Fall Guys</center>

# Introduction

The video game Fall Guys has drawn a lot of criticism and hate lately. The game was very popular shortly after its release, but after many complaints its popularity seems to have plummeted, with people calling it a "dead" game. I wanted to see what people are saying about the game, how they feel about it, and whether opinions have changed since its initial release. To do this, I will analyze customer reviews from the Steam store written since the game was released.

For context, Fall Guys: Ultimate Knockout is a PC game released on August 4, 2020. It can be purchased from the [Steam Store](https://store.steampowered.com/app/1097150/Fall_Guys_Ultimate_Knockout/). On Steam, users who have bought the game or received it as a gift can leave a review and give it a "thumbs up" or "thumbs down", which indicates whether the user recommends the game.

<i>All of the data for this project was collected on February 13, 2021.</i>

# Outline

# Part 1: Understanding the Data

## The data

`reviews.csv` contains every English-language review written on Steam for Fall Guys. It includes information such as the number of hours played, whether the reviewer recommended the game, and when the review was written.

Column | Definition
--- | ---
Review | User review of the game
Recommended | True if the reviewer gave a positive recommendation, false otherwise
Total_Playtime | Total playtime (in minutes) at the time the review was written
Review_Timestamp | Date and time the review was written, formatted as m/d/y h:mm in GMT
Last_Played | Unix timestamp (in seconds) of when the user last played the game

More information on the Steam review fields: [https://partner.steamgames.com/doc/store/getreviews](https://partner.steamgames.com/doc/store/getreviews)
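
As a quick illustration of the units in the table above, the sketch below converts `Total_Playtime` from minutes to hours on three rows copied from the dataset. The `Playtime_Hours` column name is an assumption made for this example and is not a field in `reviews.csv`.

```python
import pandas as pd

# Three rows copied from reviews.csv; only the columns needed here are kept.
sample = pd.DataFrame({
    "Recommended": [True, False, True],
    "Total_Playtime": [275, 4164, 33],  # minutes, per the column definitions
})

# Playtime_Hours is a hypothetical helper column, not part of the dataset.
sample["Playtime_Hours"] = sample["Total_Playtime"] / 60

print(sample)
```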

## Importing libraries and loading the data

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

In [12]:
# Loading the data
df = pd.read_csv('data/reviews.csv', encoding='ISO-8859-1')

df.drop("Steam ID", axis=1, inplace=True) # Steam ID identifies each user, which we don't need

In [13]:
df

Unnamed: 0,Review,Recommended,Total_Playtime,Review_Timestamp,Last_Played
0,Wall Guys Sucks,True,275,2/13/21 8:25,1613204675
1,1,False,4164,2/13/21 8:04,1608127700
2,very nice game i bought it yesterday i love it...,True,33,2/13/21 7:34,1613203192
3,its ok,True,318,2/13/21 6:51,1613192434
4,ehyehvvchvvcdgyvv dhdhwdfhycfydgvdyyeegt heecduvh,True,8158,2/13/21 6:49,1613202336
...,...,...,...,...,...
128652,"Honestly, one of the most refreshing and bold ...",True,375,8/4/20 7:20,1602679579
128653,Was it so hard to focus more on additional map...,False,13,8/4/20 7:18,1596528167
128654,[h1]9/10[/h1]\n it is really cool that more an...,True,2116,8/4/20 7:12,1605827503
128655,https://www.youtube.com/watch?v=anPEhQvsBr4\n\...,True,3712,8/4/20 7:12,1601014874
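
An alternative to dropping `Steam ID` after loading is to skip it at read time. The sketch below is one possible way to do that with the `usecols` parameter of `pd.read_csv`; it reads from an in-memory string rather than `data/reviews.csv`, and the `Steam ID` values in it are made-up placeholders.

```python
import io
import pandas as pd

# Stand-in for data/reviews.csv; the Steam ID values are made-up placeholders.
csv_text = """Steam ID,Review,Recommended,Total_Playtime,Review_Timestamp,Last_Played
111,Wall Guys Sucks,True,275,2/13/21 8:25,1613204675
222,its ok,True,318,2/13/21 6:51,1613192434
"""

# usecols accepts a callable that is evaluated on each column name;
# columns for which it returns False are never loaded.
reviews = pd.read_csv(io.StringIO(csv_text), usecols=lambda name: name != "Steam ID")

print(reviews.columns.tolist())
```

Either approach leaves the same columns in the DataFrame; skipping the column at read time simply avoids holding the user identifiers in memory at all.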


## Cleaning the data

The `Last_Played` field is a Unix timestamp (seconds since January 1, 1970 UTC), which is hard for humans to read. To make it more readable, we'll convert it to a year-month-day date.

In [41]:
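# Unix time is in seconds, so unit='s' is needed (the default treats integers as nanoseconds)
df['Last_Played'] = pd.to_datetime(df['Last_Played'], unit='s').dt.date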

In [42]:
df

Unnamed: 0,Review,Recommended,Total_Playtime,Review_Timestamp,Last_Played
0,Wall Guys Sucks,True,275,2/13/21 8:25,2021-02-13
1,1,False,4164,2/13/21 8:04,2020-12-16
2,very nice game i bought it yesterday i love it...,True,33,2/13/21 7:34,2021-02-13
3,its ok,True,318,2/13/21 6:51,2021-02-13
4,ehyehvvchvvcdgyvv dhdhwdfhycfydgvdyyeegt heecduvh,True,8158,2/13/21 6:49,2021-02-13
...,...,...,...,...,...
128652,"Honestly, one of the most refreshing and bold ...",True,375,8/4/20 7:20,2020-10-14
128653,Was it so hard to focus more on additional map...,False,13,8/4/20 7:18,2020-08-04
128654,[h1]9/10[/h1]\n it is really cool that more an...,True,2116,8/4/20 7:12,2020-11-19
128655,https://www.youtube.com/watch?v=anPEhQvsBr4\n\...,True,3712,8/4/20 7:12,2020-09-25
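
The dates shown above depend on the integers being read as seconds. A minimal sketch of the difference, using three `Last_Played` values copied from the raw table (the variable names are chosen for this example only):

```python
import pandas as pd

# Last_Played values copied from the raw table.
last_played = pd.Series([1613204675, 1608127700, 1596528167])

# unit="s" tells pandas the integers count seconds since 1970-01-01 UTC.
as_seconds = pd.to_datetime(last_played, unit="s").dt.date

# Without a unit, integers are read as nanoseconds, so every value
# lands within a couple of seconds of the epoch.
as_default = pd.to_datetime(last_played).dt.date

print(as_seconds.tolist())
print(as_default.tolist())
```

If a converted column ever shows nothing but 1970-01-01, a missing `unit` argument is a likely cause.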


To keep the `Review_Timestamp` and `Last_Played` fields consistent, we'll drop the hour:minute part of `Review_Timestamp` and format it as year-month-day, matching `Last_Played` (the time of day is not needed for our analysis).

In [44]:
df['Review_Timestamp'] = pd.to_datetime(df.Review_Timestamp).dt.date

In [45]:
df

Unnamed: 0,Review,Recommended,Total_Playtime,Review_Timestamp,Last_Played
0,Wall Guys Sucks,True,275,2021-02-13,2021-02-13
1,1,False,4164,2021-02-13,2020-12-16
2,very nice game i bought it yesterday i love it...,True,33,2021-02-13,2021-02-13
3,its ok,True,318,2021-02-13,2021-02-13
4,ehyehvvchvvcdgyvv dhdhwdfhycfydgvdyyeegt heecduvh,True,8158,2021-02-13,2021-02-13
...,...,...,...,...,...
128652,"Honestly, one of the most refreshing and bold ...",True,375,2020-08-04,2020-10-14
128653,Was it so hard to focus more on additional map...,False,13,2020-08-04,2020-08-04
128654,[h1]9/10[/h1]\n it is really cool that more an...,True,2116,2020-08-04,2020-11-19
128655,https://www.youtube.com/watch?v=anPEhQvsBr4\n\...,True,3712,2020-08-04,2020-09-25
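
The conversion above relies on pandas inferring the m/d/y layout. As a sketch of a more explicit variant, the example below passes a `format` string matching the timestamps in the raw table, and also shows `.dt.normalize()` as one possible alternative to `.dt.date`. The variable names are assumptions for this example.

```python
import pandas as pd

# Review_Timestamp strings copied from the raw table.
raw = pd.Series(["2/13/21 8:25", "2/13/21 6:51", "8/4/20 7:12"])

# Explicit format: month/day/2-digit year, then hour:minute.
parsed = pd.to_datetime(raw, format="%m/%d/%y %H:%M")

# .dt.date gives plain Python date objects (object dtype), as in the cell above.
as_dates = parsed.dt.date

# .dt.normalize() keeps the datetime64 dtype with the time set to midnight,
# which may be more convenient for later date arithmetic or resampling.
as_midnight = parsed.dt.normalize()

print(as_dates.tolist())
print(as_midnight.dtype)
```

An explicit format raises an error on rows that do not match, which can surface malformed timestamps early instead of silently misreading them.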
