# Analysis of the Professional Counter-Strike: Global Offensive Circuit
### CMSC 320 Final Project
### Richard Chen
___

## Introduction and Motivation
____
![](https://seeklogo.com/images/C/csgo-logo-CAA0A4D48A-seeklogo.com.png)


Counter-Strike: Global Offensive (CS:GO) is a competitive PC game developed by Valve Corporation in 2012. It is part of a franchise of games stretching back to the release of the first Counter-Strike in 2000. CS:GO is played from a first-person perspective, with two teams of 5 players competing to be the first to win 16 rounds. For the first 15 rounds of the game, one team plays the offensive side, which is tasked with planting a bomb and having it detonate. The other team plays the defensive side, which is tasked with defusing the bomb or preventing it from being planted at all. The teams then switch sides for the remaining rounds. Either team can also win a round by eliminating all enemy players. If the teams are tied 15-15 after the 30th round, 6 more rounds are added as overtime, which ends once a team wins 4 of those 6 rounds. At any point, there are 7 active maps in the competitive pool. Matches are played as a **best of 3 maps** to decide who wins the series overall.

Because of the competitive nature of the game, a professional circuit has thrived over the past 8 years. With record player numbers in the base game (1.1 million concurrent players) and increased investment from traditional North American sports franchises such as the Dallas Cowboys, CS:GO might be a sneak preview of the **future of entertainment**. A predictive model of what makes a competitive CS:GO team successful could be worth **millions** as the industry becomes more mainstream. Success with such a model could open more opportunities for esports as a whole.

In this tutorial, I will be analyzing a dataset from Kaggle Datasets. I will be going step-by-step through the data pipeline of professional CS:GO teams to determine winning trends and strategies. My goal is to lay the framework and an example for others to build upon when analyzing a competitive multiplayer game. This tutorial and analysis will be written with Python 3 in mind.

## Dataset

The dataset used for this tutorial was found via Kaggle Datasets and is available [here.](https://www.kaggle.com/mateusdmachado/csgo-professional-matches) Once you download the files from the link, you should find 4 CSV files that store the data we're interested in. The dataset covers competitive CS:GO matches from November 2015 to March 2020. The data was originally scraped from www.hltv.org.

 - Results.csv: stores data about map scores and team rankings
 - Picks.csv: stores data about teams' map picks and vetoes during the pre-match selection between opposing teams
 - Economy.csv: stores data about round start equipment values for all rounds played
 - Players.csv: stores individual performances of professional players on each map
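
As a minimal sketch, the four files could be loaded in one pass into a dictionary of DataFrames. The helper name `load_tables`, the lowercase file names, and the folder path in the usage note are assumptions based on the path used later in this tutorial; adjust them to match your download.

```python
from pathlib import Path

import pandas as pd

# Assumed file stems; the names on disk may differ in capitalization.
TABLE_NAMES = ["results", "picks", "economy", "players"]


def load_tables(data_dir):
    """Read each expected CSV in data_dir into a DataFrame keyed by table name."""
    data_dir = Path(data_dir)
    tables = {}
    for name in TABLE_NAMES:
        path = data_dir / f"{name}.csv"
        if not path.exists():
            # Fail early with a readable message instead of a pandas traceback.
            raise FileNotFoundError(f"expected {path} (check the download folder)")
        tables[name] = pd.read_csv(path)
    return tables
```

For example, `tables = load_tables("./csgo-professional-matches")` followed by `tables["results"].shape` would report the row and column counts of the results table.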

## Getting Started

First we'll need to import the following Python libraries:
 - [Pandas](https://pandas.pydata.org/)
 - [Beautiful Soup (OPTIONAL)](https://www.crummy.com/software/BeautifulSoup/bs4/doc/)

In [12]:
# Importing the required libraries
import pandas as pd

## Data Scraping
Because the author of the CS:GO Kaggle dataset already scraped the data from www.hltv.org, I will not explicitly show the steps needed to do that. However, more information can be found in the documentation for **Beautiful Soup**, a Python library used to parse an HTML page and extract data from it. [Link here to explore Beautiful Soup.](https://www.crummy.com/software/BeautifulSoup/bs4/doc/)
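
For readers curious what that parsing step might look like, here is a minimal sketch. The HTML snippet, its class names, and the helper `parse_results` are hypothetical and written for illustration; they are not hltv.org's real markup or the dataset author's scraper. The team names and scores are taken from the results table used in this tutorial.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML for illustration only; this is NOT hltv.org's real markup.
SAMPLE_HTML = """
<table>
  <tr class="result"><td class="team">Espada</td><td class="score">16</td>
      <td class="team">Tricked</td><td class="score">10</td></tr>
  <tr class="result"><td class="team">fnatic</td><td class="score">12</td>
      <td class="team">BIG</td><td class="score">16</td></tr>
</table>
"""


def parse_results(html):
    """Return one dict per result row found in the given HTML string."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.find_all("tr", class_="result"):
        # Collect the two team cells and the two score cells in document order.
        teams = [td.get_text(strip=True) for td in tr.find_all("td", class_="team")]
        scores = [int(td.get_text(strip=True)) for td in tr.find_all("td", class_="score")]
        rows.append({"team_1": teams[0], "team_2": teams[1],
                     "result_1": scores[0], "result_2": scores[1]})
    return rows


parsed = parse_results(SAMPLE_HTML)
print(parsed)
```

A list of dictionaries like `parsed` can be passed straight to `pd.DataFrame(parsed)`, which is one way scraped rows end up in a CSV file.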

## Data Tidying
After the data is scraped, I use the Pandas library's read_csv function to load it into a data structure called a "DataFrame." This will allow me to easily manipulate rows and columns based on their values. The DataFrame will be the basis for making our data usable in the rest of the pipeline.

In [15]:
# loading the match results from the CSV file into a dataframe
df = pd.read_csv("./csgo-professional-matches/results.csv")
# displaying the first 10 rows of our dataframe
df.head(10)

Unnamed: 0,date,team_1,team_2,_map,result_1,result_2,map_winner,starting_ct,ct_1,t_2,t_1,ct_2,event_id,match_id,rank_1,rank_2,map_wins_1,map_wins_2,match_winner
0,2020-03-18,Recon 5,TeamOne,Dust2,0,16,2,2,0,1,0,15,5151,2340454,62,63,0,2,2
1,2020-03-18,Recon 5,TeamOne,Inferno,13,16,2,2,8,6,5,10,5151,2340454,62,63,0,2,2
2,2020-03-18,New England Whalers,Station7,Inferno,12,16,2,1,9,6,3,10,5243,2340461,140,118,12,16,2
3,2020-03-18,Rugratz,Bad News Bears,Inferno,7,16,2,2,0,8,7,8,5151,2340453,61,38,0,2,2
4,2020-03-18,Rugratz,Bad News Bears,Vertigo,8,16,2,2,4,5,4,11,5151,2340453,61,38,0,2,2
5,2020-03-17,Singularity,Endpoint,Overpass,13,16,2,2,8,6,5,10,5247,2340456,71,41,0,2,2
6,2020-03-17,Singularity,Endpoint,Vertigo,11,16,2,1,6,9,5,7,5247,2340456,71,41,0,2,2
7,2020-03-17,Espada,Tricked,Dust2,16,10,1,2,3,8,13,2,5247,2340455,56,77,2,0,1
8,2020-03-17,Espada,Tricked,Nuke,16,10,1,1,7,8,9,2,5247,2340455,56,77,2,0,1
9,2020-03-17,fnatic,BIG,Mirage,12,16,2,1,9,6,3,10,5226,2340397,5,18,1,2,2
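
A few tidying steps follow naturally from this output. The sketch below is one possible approach, not a fixed recipe: it uses three rows copied from the table above (with a subset of the columns, so it runs without the download), drops the `Unnamed: 0` column, converts `date` to a datetime type, and checks that `map_winner` agrees with the round scores. The names `sample`, `tidy`, and `winner_consistent` are introduced here for illustration.

```python
import io

import pandas as pd

# Three rows copied from the output above, with a subset of columns for brevity.
SAMPLE_CSV = """Unnamed: 0,date,team_1,team_2,_map,result_1,result_2,map_winner
0,2020-03-18,Recon 5,TeamOne,Dust2,0,16,2
7,2020-03-17,Espada,Tricked,Dust2,16,10,1
9,2020-03-17,fnatic,BIG,Mirage,12,16,2
"""

sample = pd.read_csv(io.StringIO(SAMPLE_CSV))

# "Unnamed: 0" is what pandas calls a header-less first column, which usually
# means a row index was saved into the CSV; it carries no match information.
tidy = sample.drop(columns=["Unnamed: 0"])

# Parse the date strings so the rows can be sorted and filtered by time.
tidy["date"] = pd.to_datetime(tidy["date"])

# Sanity check: the recorded winner (1 or 2) should be the team with more rounds.
expected_winner = (tidy["result_2"] > tidy["result_1"]).astype(int) + 1
tidy["winner_consistent"] = tidy["map_winner"] == expected_winner

print(tidy.dtypes)
```

The same steps should apply to the full `df` loaded above, assuming its columns match the header shown in the output.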


## Sources

 - [Kaggle CS:GO Professional Matches](https://www.kaggle.com/mateusdmachado/csgo-professional-matches)
 - [Pandas Python Library](https://pandas.pydata.org/)
 - [Beautiful Soup](https://www.crummy.com/software/BeautifulSoup/bs4/doc/)
