Which genres and game tags are best received by players on Steam (the biggest online video game marketplace and platform)?



I used the following genres and game tags for analysis:
- Genres
    - Action
    - Adventure
    - Strategy
    - Casual
    - RPG (Role-playing Game)
    - Massively Multiplayer
- Tags
    - Singleplayer
    - Multiplayer
    - Rogue-like
    - RTS (Real-time Strategy)
    - FPS (First-person Shooter)
    - MOBA (Multiplayer Online Battle Arena)

In [25]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

First, I sorted the games by positive votes and kept only the columns I was interested in:

- Game name
- \# of positive votes
- \# of negative votes
- Genres that this game belongs to
- Tags that this game belongs to

In [26]:
steam = pd.read_csv('games.csv')

highest_positive = steam.sort_values(by='Positive', ascending=False)[['Name', 'Positive', 'Negative', 'Genres', 'Tags']]

highest_positive.head(10)

highest_positive[highest_positive['Tags'].str.contains('Rhythm', na=False)]

Unnamed: 0,Name,Positive,Negative,Genres,Tags
3325,Geometry Dash,182534,12372,"Action,Indie","Difficult,Music,Great Soundtrack,Precision Pla..."
1637,Sekiro™: Shadows Die Twice - GOTY Edition,179498,9339,"Action,Adventure","Souls-like,Difficult,Action,Singleplayer,Ninja..."
55061,Helltaker,104471,2160,"Adventure,Free to Play,Indie","Free to Play,Cute,Demons,Puzzle,Indie,Anime,Gr..."
46796,Muse Dash,66332,4419,"Action,Casual,Indie","Music,Rhythm,Anime,Cute,Female Protagonist,Sex..."
33104,Beat Saber,61280,2549,Indie,"VR,Rhythm,Music,Great Soundtrack,Moddable,Indi..."
...,...,...,...,...,...
62729,Saber Ship,0,1,Indie,"VR,Rhythm,Music-Based Procedural Generation,Sc..."
13303,Garden Madness,0,1,Indie,"Runner,Clicker,Indie,Rhythm,Arcade,2D,Casual,P..."
64252,StarMaker VR,0,1,"Action,Indie","On-Rails Shooter,VR,FPS,Music-Based Procedural..."
65221,Dancing Hime: Rhythm Matching,0,3,"Casual,Early Access","Casual,Match 3,Rhythm,VR,Early Access,Relaxing..."


To measure how well received a genre or tag is, I took the 10 games with the most positive votes on Steam in each genre or tag. I then calculated each game's positive vote percentage (positive votes divided by total votes), averaged it over those 10 games, and compared the averages across genres and tags.
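As a minimal sketch of the metric described above, the snippet below computes the positive vote percentage for a few made-up games and averages it. The game names and vote counts are hypothetical and exist only to illustrate the arithmetic; the column names `Positive` and `Negative` match the dataset, while `PositivePct` is a name introduced here.

```python
import pandas as pd

# Hypothetical vote counts, for illustration only (not taken from games.csv).
toy = pd.DataFrame({
    'Name': ['Game A', 'Game B', 'Game C'],
    'Positive': [90, 150, 40],
    'Negative': [10, 50, 10],
})

# Positive vote percentage per game: positive / (positive + negative) * 100.
toy['PositivePct'] = toy['Positive'] / (toy['Positive'] + toy['Negative']) * 100

# Unweighted mean: every game counts equally, however many votes it has.
average_pct = toy['PositivePct'].mean()

print(toy)
print(average_pct)
```

Because the mean is unweighted, a game with few votes influences the average as much as a game with many.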

In [27]:
data = {
    'category': [],
    'positive': []
}

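def get_average_positive(filter_type, category):
    '''Record the average positive vote percentage of the top 10 games of a genre or tag.

    The result is appended to the global `data` dict rather than returned,
    so re-running the calls without resetting `data` adds duplicate rows.

    Keyword arguments:
    filter_type -- 'Genres' or 'Tags' depending on what to filter by
    category -- the actual genre or tag to filter by
    '''
    # Literal substring match on the comma-separated column; missing values are excluded.
    matches = highest_positive[filter_type].str.contains(category, na=False, regex=False)
    # highest_positive is already sorted by 'Positive', so head(10) is the top 10.
    top_ten = highest_positive[matches].head(10)

    # Share of positive votes per game, computed column-wise.
    positive_percentages = top_ten['Positive'] / (top_ten['Positive'] + top_ten['Negative'])

    data['category'].append(category)
    data['positive'].append(positive_percentages.mean() * 100)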
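One thing to keep in mind about the filter in the function above: `str.contains` matches substrings, so filtering `Tags` on 'Multiplayer' also matches entries such as 'Massively Multiplayer'. The sketch below contrasts substring matching with exact matching on the comma-separated list. The toy rows and the helper name `has_exact_label` are hypothetical and not part of the original analysis.

```python
import pandas as pd

# Hypothetical rows, for illustration only.
toy = pd.DataFrame({
    'Name': ['Game A', 'Game B', 'Game C'],
    'Tags': ['Multiplayer,FPS', 'Massively Multiplayer,RPG', None],
})

def has_exact_label(series, label):
    '''Return a boolean mask that is True where `label` is exactly one of the
    comma-separated entries (hypothetical helper, not in the notebook).'''
    return series.fillna('').str.split(',').apply(
        lambda labels: label in [item.strip() for item in labels]
    )

# Substring match, as used in the analysis.
substring_mask = toy['Tags'].str.contains('Multiplayer', na=False, regex=False)
# Exact match against each individual tag.
exact_mask = has_exact_label(toy['Tags'], 'Multiplayer')

print(toy[substring_mask]['Name'].tolist())
print(toy[exact_mask]['Name'].tolist())
```

Whether exact matching would change the rankings is not something this notebook checks.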

In [28]:
get_average_positive('Genres', 'Action')
get_average_positive('Genres', 'Adventure')
get_average_positive('Genres', 'Strategy')
get_average_positive('Genres', 'Casual')
get_average_positive('Genres', 'RPG')
get_average_positive('Genres', 'Massively Multiplayer')

get_average_positive('Tags', 'Singleplayer')
get_average_positive('Tags', 'Multiplayer')
get_average_positive('Tags', 'Rogue-like')
get_average_positive('Tags', 'RTS')
get_average_positive('Tags', 'FPS')
get_average_positive('Tags', 'MOBA')

positive_genres = pd.DataFrame(data)

positive_genres.sort_values(by='positive', ascending=False)

Unnamed: 0,category,positive
8,Rogue-like,96.628269
6,Singleplayer,93.406937
3,Casual,92.14518
2,Strategy,90.869242
4,RPG,90.111013
10,FPS,87.279299
7,Multiplayer,87.162297
0,Action,86.430413
9,RTS,86.359162
1,Adventure,86.023193


The table above ranks the genres and tags of interest in descending order of average positive vote percentage.

Some notes about the findings:
- Rogue-likes are the best-received category. They lead by a wide margin, beating second place (Singleplayer) by 3.22 percentage points.
- Massively Multiplayer games are the least well-received category, trailing the next-lowest one (MOBA) by 4.16 percentage points.
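
The first gap can be checked directly from the table above. The two values below are copied from that output; the variable names are only for illustration.

```python
# Average positive vote percentages copied from the table above.
rogue_like = 96.628269
singleplayer = 93.406937

# A difference between two percentages is measured in percentage points.
gap = rogue_like - singleplayer
print(round(gap, 2))  # prints 3.22
```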

Some limitations to keep in mind when interpreting the results:
- Steam categorizes games in two ways: genres and tags. Genres are official categories that Steam recognizes and lets you search by. Tags are user-defined: players assign whatever tags they feel fit a game, and the game is then listed under the tags assigned most often. Because tagging is subjective and done by a game's own players rather than by Steam, the tag results could be skewed.
- Most video games belong to more than one genre. The results could be skewed by very popular games that appear in the top 10 of several genres. A game may also be less popular within one of its genres than within another.
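
To get a feel for how much genre overlap there is, one could count how many genres each game lists. This is a minimal sketch, assuming `Genres` is a comma-separated string as in the output shown earlier; the toy rows and the `GenreCount` column are hypothetical.

```python
import pandas as pd

# Hypothetical rows, for illustration only.
toy = pd.DataFrame({
    'Name': ['Game A', 'Game B', 'Game C'],
    'Genres': ['Action,Indie', 'Indie', 'Action,Casual,Indie'],
})

# Number of genres listed for each game.
toy['GenreCount'] = toy['Genres'].str.split(',').str.len()

# Share of games that belong to more than one genre.
multi_genre_share = (toy['GenreCount'] > 1).mean()

print(toy[['Name', 'GenreCount']])
print(multi_genre_share)
```

Running the same two lines on the real `Genres` column would show how common multi-genre games are in the dataset.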