# Analysis of Steam gaming data (Kapezov Shyngys)

### Content
+ Introduction
+ Data Description and Objectives
+ Data preparation

### 1. Introduction
Steam is an online digital distribution service for PC games and software, developed and maintained by Valve. Steam also serves as a technical copy-protection layer, a multiplayer and streaming platform, and a social network for players. Prior to 2009, most games released on Steam used traditional anti-piracy measures, including the assignment and distribution of product keys and support for digital rights management tools such as SecuROM or non-malicious rootkits.

Streaming downloads: Steam supports streaming content downloads, which lets content be prioritized. The part of the game that is required to start is downloaded first, and the remaining files are downloaded in the background. Loading of a game level is suspended if the files it needs have not been downloaded yet. Streaming content requires additional effort from the developer, so few games actively use it.

Gifting: a user can buy a game for another person as a gift, or give someone an “extra” copy of a game that was bought again as part of a collection. Gifts have become the most common way of purchasing Steam games through third parties for users who cannot buy a game directly because they lack a credit card. After the money is transferred, the intermediary buys the game as a gift with a credit card and sends it to the buyer by email or through Steam.

Source: https://en.wikipedia.org/wiki/Steam_(service)


<img src="attachment:Steam.png" width="150">

### 2. Data Description and Objectives
The gaming industry grows year after year, and Steam games are popular among young people. Games such as Counter-Strike and Dota 2 have over a million owners. I find this kind of data interesting to analyze because I encounter it day to day.

The chosen dataset was compiled in 2016 from statistics going back to 2009. To make the project more current, present-day statistics will be collected later by web scraping. The main columns used are:
+ RecommendationCount - integer, number of recommendations the game has received
+ SteamSpyPlayersEstimate - best estimate of total number of people who have played the game since March 2009
+ PlatformWindows - games that are available on Windows
+ PlatformLinux - games that are available on Linux
+ PlatformMac - games that are available on Mac
+ CategorySinglePlayer - single player game category
+ CategoryMultiplayer - multiplayer game category
+ CategoryCoop - cooperative game category
+ CategoryMMO - MMO game category
+ GenreIsIndie - indie game genre
+ GenreIsAction - action game genre
+ GenreIsAdventure - adventure game genre
+ GenreIsCasual - casual game genre
+ GenreIsStrategy - strategy game genre
+ GenreIsRPG - RPG game genre
+ GenreIsSimulation - simulation game genre

Research questions:
1. Analyze how a game's recommendation count relates to its number of players.
2. Analyze which operating systems games are available on.
3. Analyze which category (single-player, multiplayer, cooperative, MMO, VR) is the most popular.
4. Identify the three most popular and the three least popular game genres.
5. Analyze the difference in the number of owners of the top 100 games between 2009 and the present.

### 3. Data preparation

In [1]:
# preparation for the first research question
import numpy as np
import pandas as pd


In [2]:
# columns required for the first research question
col_li = ["QueryName", "RecommendationCount", "SteamSpyPlayersEstimate"]
pd1 = pd.read_csv('games-features.csv', usecols=col_li)

In [3]:
# dropna() returns a new DataFrame, so the result is assigned back
pd1 = pd1.dropna()
# sort in descending order by SteamSpyPlayersEstimate
pd2 = pd1.sort_values(by='SteamSpyPlayersEstimate', ascending=False)
# keep only rows where both counts are non-zero
pd3 = pd2[pd2.RecommendationCount != 0]
pd3[pd3.SteamSpyPlayersEstimate != 0]

Unnamed: 0,QueryName,RecommendationCount,SteamSpyPlayersEstimate
23,Dota 2,590480,90687580
20,Team Fortress 2,383949,37878812
27,Counter-Strike: Global Offensive,1427633,25150372
4028,Unturned,222301,21438373
22,Left 4 Dead 2,140726,13583400
...,...,...,...
7546,Mooch,109,1076
8807,Ultimate Arena,158,897
8364,Divergence: Online,118,803
1947,CRYENGINE,224,803
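
One possible way to approach the first research question is to relate the two columns directly. The sketch below is only an illustration: the `RecommendationRate` column and the `games` and `ranked` names are assumptions that do not appear in the notebook, and the toy frame reuses three rows from the output above instead of reading `games-features.csv`.

```python
import pandas as pd

# Toy frame built from three rows of the output above.
games = pd.DataFrame({
    "QueryName": ["Dota 2", "Team Fortress 2", "Counter-Strike: Global Offensive"],
    "RecommendationCount": [590480, 383949, 1427633],
    "SteamSpyPlayersEstimate": [90687580, 37878812, 25150372],
})

# Assumed metric: recommendations per estimated player.
games["RecommendationRate"] = (
    games["RecommendationCount"] / games["SteamSpyPlayersEstimate"]
)

# Rank games from the highest to the lowest rate.
ranked = games.sort_values(by="RecommendationRate", ascending=False)
print(ranked[["QueryName", "RecommendationRate"]])
```

Because rows with zero players were already filtered out in the cell above, the division would not hit a zero denominator on the prepared data.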


In [4]:
# preparation for the second research question
col_li = ["QueryName", "PlatformWindows", "PlatformLinux", "PlatformMac", "PCMinReqsText", "LinuxMinReqsText", "MacMinReqsText"]
pd11 = pd.read_csv('games-features.csv', usecols=col_li)

In [5]:
pd11.dropna()

Unnamed: 0,QueryName,PlatformWindows,PlatformLinux,PlatformMac,PCMinReqsText,LinuxMinReqsText,MacMinReqsText
0,Counter-Strike,True,True,True,Minimum: 500 mhz processor 96mb ram 16mb video...,Minimum: Linux Ubuntu 12.04 Dual-core from Int...,Minimum: OS X Snow Leopard 10.6.3 1GB RAM 4GB...
1,Team Fortress Classic,True,True,True,Minimum: 500 mhz processor 96mb ram 16mb video...,Minimum: Linux Ubuntu 12.04 Dual-core from Int...,Minimum: OS X Snow Leopard 10.6.3 1GB RAM 4GB...
2,Day of Defeat,True,True,True,Minimum: 500 mhz processor 96mb ram 16mb video...,Minimum: Linux Ubuntu 12.04 Dual-core from Int...,Minimum: OS X Snow Leopard 10.6.3 1GB RAM 4GB...
3,Deathmatch Classic,True,True,True,Minimum: 500 mhz processor 96mb ram 16mb video...,Minimum: Linux Ubuntu 12.04 Dual-core from Int...,Minimum: OS X Snow Leopard 10.6.3 1GB RAM 4GB...
4,Half-Life: Opposing Force,True,True,True,Minimum: 500 mhz processor 96mb ram 16mb video...,Minimum: Linux Ubuntu 12.04 Dual-core from Int...,Minimum: OS X Snow Leopard 10.6.3 1GB RAM 4GB...
...,...,...,...,...,...,...,...
13352,Baseball Riot,True,False,False,Minimum:OS: Windows XP / Vista / 7 / 8 / 10Pro...,,
13353,Passage 4,True,False,False,Minimum:OS: Windows 2000/XP/Vista/7/8/10Proces...,,
13354,Piximalism,True,False,False,Minimum:OS: Microsoft(r) Windows(r) XP / Vista...,,
13355,Technoball,True,False,False,Minimum:OS: Windows 7 (64-bit)Processor: 2.5 G...,,
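
Since the platform columns are boolean, availability per operating system could be summarized by summing and averaging them. This is a minimal sketch under that assumption; the `platforms`, `counts` and `shares` names are illustrative, and the toy frame copies four rows from the output above.

```python
import pandas as pd

# Toy frame built from four rows of the output above.
platforms = pd.DataFrame({
    "QueryName": ["Counter-Strike", "Day of Defeat", "Baseball Riot", "Technoball"],
    "PlatformWindows": [True, True, True, True],
    "PlatformLinux": [True, True, False, False],
    "PlatformMac": [True, True, False, False],
})

platform_cols = ["PlatformWindows", "PlatformLinux", "PlatformMac"]

# Summing booleans counts the True values: games available per platform.
counts = platforms[platform_cols].sum()

# The mean of a boolean column is the share of games available on it.
shares = platforms[platform_cols].mean()

print(counts)
print(shares)
```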


In [6]:
# preparation for the third research question
col_li = ["QueryName", "CategorySinglePlayer", "CategoryMultiplayer", "CategoryCoop", "CategoryMMO", "CategoryVRSupport"]
pd21 = pd.read_csv('games-features.csv', usecols=col_li)
pd21.dropna()

Unnamed: 0,QueryName,CategorySinglePlayer,CategoryMultiplayer,CategoryCoop,CategoryMMO,CategoryVRSupport
0,Counter-Strike,False,True,False,False,False
1,Team Fortress Classic,False,True,False,False,False
2,Day of Defeat,False,True,False,False,False
3,Deathmatch Classic,False,True,False,False,False
4,Half-Life: Opposing Force,True,True,False,False,False
...,...,...,...,...,...,...
13352,Baseball Riot,True,False,False,False,False
13353,Passage 4,True,False,False,False,False
13354,Piximalism,True,False,False,False,False
13355,Technoball,True,True,True,False,False
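
For the third research question, one way to rank categories is to count the games flagged in each category column and sort the counts. The sketch below assumes that "popular" means "number of games in the category"; the `categories` and `popularity` names are illustrative, and the five rows are copied from the output above, so the ranking says nothing about the full dataset.

```python
import pandas as pd

# Toy frame built from five rows of the output above.
categories = pd.DataFrame({
    "QueryName": ["Counter-Strike", "Team Fortress Classic",
                  "Half-Life: Opposing Force", "Baseball Riot", "Technoball"],
    "CategorySinglePlayer": [False, False, True, True, True],
    "CategoryMultiplayer": [True, True, True, False, True],
    "CategoryCoop": [False, False, False, False, True],
    "CategoryMMO": [False, False, False, False, False],
    "CategoryVRSupport": [False, False, False, False, False],
})

# Every column except the game name is a boolean category flag.
category_cols = [c for c in categories.columns if c.startswith("Category")]

# Count games per category and order from most to least common.
popularity = categories[category_cols].sum().sort_values(ascending=False)
print(popularity)
```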


In [7]:
# preparation for the fourth research question
col_li = ["QueryName", "GenreIsIndie", "GenreIsAction", "GenreIsAdventure", "GenreIsCasual", "GenreIsStrategy", "GenreIsRPG", "GenreIsSimulation", "GenreIsSports", "GenreIsRacing"]
pd41 = pd.read_csv('games-features.csv', usecols=col_li)
pd41.dropna()

Unnamed: 0,QueryName,GenreIsIndie,GenreIsAction,GenreIsAdventure,GenreIsCasual,GenreIsStrategy,GenreIsRPG,GenreIsSimulation,GenreIsSports,GenreIsRacing
0,Counter-Strike,False,True,False,False,False,False,False,False,False
1,Team Fortress Classic,False,True,False,False,False,False,False,False,False
2,Day of Defeat,False,True,False,False,False,False,False,False,False
3,Deathmatch Classic,False,True,False,False,False,False,False,False,False
4,Half-Life: Opposing Force,False,True,False,False,False,False,False,False,False
...,...,...,...,...,...,...,...,...,...,...
13352,Baseball Riot,True,False,False,True,False,False,False,True,False
13353,Passage 4,True,False,False,True,False,False,False,False,False
13354,Piximalism,True,True,True,True,False,False,False,False,False
13355,Technoball,True,True,False,True,False,False,False,True,False
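
The three most and least common genres could be picked from per-genre counts with `Series.nlargest` and `Series.nsmallest`. This is a sketch only: the `genres`, `genre_counts`, `top3` and `bottom3` names are assumptions, and the six rows are copied from the output above, so the result reflects the sample and not the whole dataset.

```python
import pandas as pd

# Toy frame built from six rows of the output above.
genres = pd.DataFrame({
    "QueryName": ["Counter-Strike", "Half-Life: Opposing Force", "Baseball Riot",
                  "Passage 4", "Piximalism", "Technoball"],
    "GenreIsIndie":      [False, False, True, True, True, True],
    "GenreIsAction":     [True, True, False, False, True, True],
    "GenreIsAdventure":  [False, False, False, False, True, False],
    "GenreIsCasual":     [False, False, True, True, True, True],
    "GenreIsStrategy":   [False] * 6,
    "GenreIsRPG":        [False] * 6,
    "GenreIsSimulation": [False] * 6,
    "GenreIsSports":     [False, False, True, False, False, True],
    "GenreIsRacing":     [False] * 6,
})

# Count how many games carry each genre flag.
genre_counts = genres.drop(columns="QueryName").sum()

# With ties, nlargest/nsmallest keep the first matching columns by default.
top3 = genre_counts.nlargest(3)
bottom3 = genre_counts.nsmallest(3)

print(top3)
print(bottom3)
```

When several genres share the same count, as the zero-count genres do in this sample, the choice among them is arbitrary, so ties are worth checking before reporting a "least popular" list.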


In [8]:
# preparation for the fifth research question
col_li = ["QueryName", "SteamSpyOwners"]
pd41 = pd.read_csv('games-features.csv', usecols=col_li)
pd42 = pd41.dropna()
pd43 = pd42.sort_values(by='SteamSpyOwners', ascending=False)
# top 100 games by SteamSpyOwners in the dataset
pd43.head(100)

Unnamed: 0,QueryName,SteamSpyOwners
23,Dota 2,90687580
20,Team Fortress 2,37878812
4028,Unturned,27025292
27,Counter-Strike: Global Offensive,25833156
22,Left 4 Dead 2,15574539
...,...,...
1546,XCOM: Enemy Unknown,3498393
30,Killing Floor,3496958
4473,Life Is Strange™,3446920
2015,Defiance,3409615
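
Once present-day owner counts have been collected, the comparison for the fifth research question could be done by joining the two tables on the game name. The sketch below is hypothetical throughout: the `owner_change` helper, the `OwnersDiff` column, and the placeholder games and numbers are invented for illustration and are not real Steam statistics.

```python
import pandas as pd

def owner_change(old: pd.DataFrame, new: pd.DataFrame) -> pd.DataFrame:
    """Join two owner tables on QueryName and compute the difference."""
    # An inner join keeps only the games present in both tables.
    merged = old.merge(new, on="QueryName", how="inner",
                       suffixes=("_old", "_new"))
    merged["OwnersDiff"] = (
        merged["SteamSpyOwners_new"] - merged["SteamSpyOwners_old"]
    )
    return merged.sort_values(by="OwnersDiff", ascending=False)

# Placeholder data only; names and values are made up.
old = pd.DataFrame({"QueryName": ["Game A", "Game B", "Game C"],
                    "SteamSpyOwners": [100, 250, 400]})
new = pd.DataFrame({"QueryName": ["Game A", "Game B", "Game D"],
                    "SteamSpyOwners": [180, 200, 50]})

result = owner_change(old, new)
print(result)
```

An inner join silently drops games that appear in only one table; `how="outer"` with `indicator=True` would keep them and mark where each row came from, which may be useful if scraped names do not match the dataset exactly.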
