# Esports: Basic exploratory analysis

My personal interest in competitive games has led me to watch a lot of esports tournaments over the past few years. The esports industry is continuously growing and its future seems promising. The number of viewers watching these tournaments is increasing every year, and many kids are no longer interested in traditional sports but dream of becoming professional gamers.

I used this dataset to practice my pandas and numpy skills while gaining additional insight into an industry I am personally interested in.

**Number of peak viewers for esports tournaments in 2018**

![PeakViewers](https://i.imgur.com/96EWnfq.png)

Source: https://escharts.com/2018


#### Importing modules and data
Let's start off by importing the modules and datasets we will need in this notebook.

In [None]:
# Import the needed modules
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Read the main data sets
teams_df = pd.read_csv(r'../input/esports-earnings-for-players-teams-by-game/highest_earning_teams.csv', sep=',')
players_df = pd.read_csv(r'../input/esports-earnings-for-players-teams-by-game/highest_earning_players.csv', sep=',')

#### Exploring teams data
By constructing a pandas dataframe, we are able to easily perform operations on the data contained in the dataframe.

In [None]:
#Show the structure and first rows of the teams data
teams_df.head()
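
Beyond the first rows, a quick look at the shape, column types and missing values helps before running any calculations. The following is a minimal sketch on a small made-up table: `toy_teams` and its values are hypothetical stand-ins, and only the column names are taken from the code used in this notebook.

```python
import pandas as pd

# Toy stand-in for teams_df; all values are invented for illustration
toy_teams = pd.DataFrame({
    "TeamName": ["Team A", "Team B", "Team C"],
    "TotalUSDPrize": [1500.0, 300.0, None],
    "TotalTournaments": [3, 1, 2],
    "Game": ["Game X", "Game X", "Game Y"],
})

n_rows, n_cols = toy_teams.shape             # number of rows and columns
column_types = toy_teams.dtypes              # data type of every column
missing_per_column = toy_teams.isna().sum()  # missing values per column

print(n_rows, n_cols)
print(column_types)
print(missing_per_column)
```

The same three calls should work unchanged on `teams_df` and `players_df`.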

I thought it would be interesting to see which games are included in the dataset.

In [None]:
#Show all the unique games included in the dataset
teams_df['Game'].unique()

I thought it would be interesting to show the five biggest esports teams by total prize winnings and additionally calculate the total prize money won in esports tournaments.

In [None]:
#Show the five largest teams by the amount of prize money they have won
teams_df.nlargest(5, 'TotalUSDPrize')

In [None]:
#Show the sum of all prize money in the dataset, rounded to two decimals
round(teams_df['TotalUSDPrize'].sum(), 2)

I assumed that the amount of prize money differs hugely between games, and a pie chart seemed suitable for visualizing those differences.

In [None]:
#Show a pie chart of all the different games and prize money sums in the dataset

earnings_per_game = teams_df.groupby(['Game'])['TotalUSDPrize'].sum()

earnings_per_game.plot(kind="pie")
plt.show()
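
Slices in a pie chart can be hard to compare by eye, so it may help to print each game's share of the total prize money as well. This is a minimal sketch on hypothetical data: `toy_teams` and its numbers are invented, and the grouping mirrors the `groupby` used for the chart.

```python
import pandas as pd

# Toy stand-in for teams_df; all values are invented for illustration
toy_teams = pd.DataFrame({
    "Game": ["Game X", "Game X", "Game Y", "Game Z"],
    "TotalUSDPrize": [600.0, 200.0, 150.0, 50.0],
})

# Sum the prize money per game, as done for the pie chart
earnings_per_game = toy_teams.groupby("Game")["TotalUSDPrize"].sum()

# Convert the sums to percentages of the overall total, largest first
share_per_game = (earnings_per_game / earnings_per_game.sum() * 100).round(1)
share_per_game = share_per_game.sort_values(ascending=False)

print(share_per_game)
```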

Personally, I was interested in finding out which esports teams were the most successful on average. By dividing the total prize money a team has won by the number of tournaments it has played, we can calculate the average prize winnings per tournament for every team.

In [None]:
#Divide the total prize money by the number of tournaments
avg_earnings = teams_df['TotalUSDPrize'].div(teams_df['TotalTournaments'], axis='index')

#Add the team names and total tournaments played to the calculated averages
avg_earnings_df = pd.concat([teams_df['TeamName'], teams_df['TotalTournaments'], round(avg_earnings, 2)], axis=1)

#Name the columns
avg_earnings_df.columns = ['Team', 'TotalTournaments', 'AvgEarnings']

#Show the top 10 esports teams with the highest prize money averages
avg_earnings_df.nlargest(10, 'AvgEarnings')

eStar Gaming seems to be particularly successful, with average prize winnings of $522,664.73 for every tournament they have played.
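
One caveat with per-tournament averages is that a team with very few tournaments can rank highly on the strength of a single large prize. Whether that applies to any team in the table above depends on its `TotalTournaments` value. The sketch below compares a ranking with and without a minimum number of tournaments; the toy data and the `MIN_TOURNAMENTS` cut-off are hypothetical choices, not part of the dataset.

```python
import pandas as pd

# Toy stand-in for teams_df; all values are invented for illustration
toy_teams = pd.DataFrame({
    "TeamName": ["Team A", "Team B", "Team C"],
    "TotalUSDPrize": [1000.0, 900.0, 5000.0],
    "TotalTournaments": [1, 3, 10],
})

MIN_TOURNAMENTS = 3  # hypothetical cut-off, adjust as needed

# Add the average prize money per tournament as a new column
with_avg = toy_teams.assign(
    AvgEarnings=(toy_teams["TotalUSDPrize"] / toy_teams["TotalTournaments"]).round(2)
)

# Ranking over all teams versus only teams with enough tournaments
all_ranked = with_avg.nlargest(3, "AvgEarnings")
experienced_ranked = with_avg[
    with_avg["TotalTournaments"] >= MIN_TOURNAMENTS
].nlargest(3, "AvgEarnings")

print(all_ranked)
print(experienced_ranked)
```

In the toy data, Team A leads the unfiltered ranking after a single tournament and drops out once the cut-off is applied.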

#### Exploring player data
I am specifically interested in Counter-Strike: Global Offensive (CSGO) players as I play and watch this game a lot.
First, I will explore the data in a general manner similar to what we did with the esports team data.

In [None]:
#Show the structure and first rows of the players data
players_df.head()

It could be interesting to find out which countries have the most esports players.

In [None]:
#Count the number of players for each country
players_per_country = players_df['CountryCode'].value_counts()

#Select the top 10 countries with most players in the dataset
top10 = players_per_country.nlargest(10)
top10

I expected that Western countries such as the US and Germany would have a lot more esports players, but apparently Korea and China have a lot of professional esports players. A visualization could help with understanding the differences.

In [None]:
#Visualize the top 10 countries with the most esports players in a bar plot.
top10.plot(kind="bar", color=['purple'])
plt.show()
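
Raw counts can also be expressed as a share of all players, which makes the countries easier to compare. A minimal sketch, assuming a `CountryCode` column as in the cell above; the toy country codes are invented for illustration.

```python
import pandas as pd

# Toy stand-in for players_df; the country codes are invented
toy_players = pd.DataFrame(
    {"CountryCode": ["kr", "kr", "kr", "cn", "cn", "us", "de", "dk"]}
)

# normalize=True returns fractions instead of counts; convert to percentages
share_per_country = (
    toy_players["CountryCode"].value_counts(normalize=True).mul(100).round(1)
)

# Keep the two countries with the largest share
top2_share = share_per_country.nlargest(2)
print(top2_share)
```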

I thought about other details that could be interesting before diving deeper into the CSGO scene. Looking at the best-earning player for every game could provide some insight into the differences between these games.

In [None]:
#Show the first listed player for every game; this is the best earner only if the rows are already sorted by earnings within each game
players_df.groupby('Game').head(1)
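
`groupby(...).head(1)` returns the first row of each group, so it depends on the row order. A sketch of an order-independent alternative uses `idxmax` to look up the row with the highest earnings per game. The column names `TotalUSDPrize` and `CurrentHandle` are assumptions about the players table, and the toy values are invented.

```python
import pandas as pd

# Toy stand-in for players_df, deliberately not sorted by earnings
toy_players = pd.DataFrame({
    "CurrentHandle": ["p1", "p2", "p3", "p4"],
    "Game": ["Game X", "Game Y", "Game X", "Game Y"],
    "TotalUSDPrize": [100.0, 250.0, 400.0, 50.0],
})

# Index label of the highest-earning row within each game
best_idx = toy_players.groupby("Game")["TotalUSDPrize"].idxmax()

# Select those rows from the original table
best_per_game = toy_players.loc[best_idx]

# For comparison: the first listed row of each game
first_per_game = toy_players.groupby("Game").head(1)

print(best_per_game)
print(first_per_game)
```

On the toy data the two results differ for Game X, because its top earner is not the first row listed.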

In [None]:
#Show the number of players for each game in the dataset
players_df['Game'].value_counts()

Let's dive deeper into CSGO and its most successful players.

In [None]:
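#Filter all the players that play CSGO and work on an independent copy
CSGO_players = players_df.loc[players_df['Game'] == 'Counter-Strike: Global Offensive'].copy()
#Renumber the rows from 0 so that label-based selections refer to positions in this table
CSGO_players = CSGO_players.reset_index(drop=True)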

#Show the first 10 CSGO players
CSGO_players.head(10)

The dataset does not include the name of the team each player plays for. I wanted to add a new column to store every player's current team.

In [None]:
#Add an empty column to store each CSGO player's current team
CSGO_players.insert(loc=8, column="CurrentTeam", value='')

#Add team names to the first six CSGO players (.loc slices by index label and includes both ends)
CSGO_players.loc[0:4, 'CurrentTeam'] = "Astralis"
CSGO_players.loc[5, 'CurrentTeam'] = "Team Liquid"
CSGO_players.head(10)
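
Assigning teams by row position breaks as soon as the row order changes. A sketch of a more robust variant maps each player's handle to a team through a dictionary. The `CurrentHandle` column name and the player handles below are hypothetical, and the handle-to-team pairs are placeholders rather than real roster data.

```python
import pandas as pd

# Toy stand-in for CSGO_players; handles and country codes are invented
toy_csgo = pd.DataFrame({
    "CurrentHandle": ["player_a", "player_b", "player_c"],
    "CountryCode": ["dk", "dk", "us"],
})

# Hypothetical lookup table from player handle to current team
team_by_handle = {"player_a": "Astralis", "player_c": "Team Liquid"}

# Players missing from the lookup get an empty string instead of NaN
toy_csgo["CurrentTeam"] = toy_csgo["CurrentHandle"].map(team_by_handle).fillna("")

print(toy_csgo)
```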

To conclude this exploratory analysis, I wanted to visualize which countries CSGO players come from.

In [None]:
#Visualize the countries with more than five professional CSGO players in a pie chart
country_counts = CSGO_players['CountryCode'].value_counts()
countries = country_counts[country_counts > 5]
plt.pie(countries, labels = countries.index, shadow = True, radius=1.5, autopct = '%1.1f%%')
plt.show()
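
Filtering out the smaller countries means the percentages in the chart only describe the countries shown, not all CSGO players. One way around this, sketched below on invented data, is to collect the remaining countries in an "Other" slice. The `threshold` name and the toy country codes are assumptions for illustration.

```python
import pandas as pd

# Toy stand-in for CSGO_players['CountryCode']; the codes are invented
toy_codes = pd.Series(
    ["dk"] * 7 + ["se"] * 6 + ["us"] * 2 + ["fr"] * 1, name="CountryCode"
)

threshold = 5  # same cut-off as in the cell above
counts = toy_codes.value_counts()

# Countries above the threshold keep their own slice
large = counts[counts > threshold]

# All remaining countries are summed into a single "Other" slice
other_total = counts[counts <= threshold].sum()
pie_data = pd.concat([large, pd.Series({"Other": other_total})])

print(pie_data)
```

`pie_data` can be passed to `plt.pie` in the same way as `countries`, with `labels=pie_data.index`.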

I enjoyed working with this dataset and want to thank Jack Daoud (https://www.kaggle.com/jackdaoud) for the opportunity. The dataset provided me with a lot of new insights and, most of all, allowed me to practice and demonstrate my Python skills.