# Capstone Project: Winner Winner Chickie Dinner?
<font size="4">Winner Prediction for Fortnite</font>

---

## Table of Contents

### [Overview](#Overview)

### [Problem Statement](#ProbState)

### [Part 1: Data Collection](#Part1)
1. [**Imports (All imported libraries are added here)**](#imports)
2. [**Web scraping**](#scrape)    
    2.1 [Events Archive](#events)   
    2.2 [Match IDs](#matchid)
3. [**Progress so far**](#progress)

---

<a id='Overview'></a>
## Overview

Electronic sports (or e-sports) are organized multiplayer video game competitions between professional players, who compete either as individuals or as teams. The video game genres normally associated with e-sports are Multiplayer Online Battle Arena (MOBA), First-Person Shooter (FPS), Battle Royale, Real-Time Strategy (RTS), Fighting, Collectible Card Games (CCG) and Sports & Racing.

Based on reports by Newzoo (a games and e-sports analytics group), global e-sports revenue for 2020 was projected to exceed 1 billion dollars, with a global audience of 495 million. ([Source](https://venturebeat.com/2020/02/25/newzoo-global-esports-will-top-1-billion-in-2020-with-china-as-the-top-market/)). Historical data from Esports Earnings shows tournament prize money of over 200 million dollars in both 2019 and 2021, and 125 million dollars in 2020 despite the COVID-19 pandemic, so it comes as no surprise that e-sports has drawn increasing attention in recent years. ([Source](https://www.esportsearnings.com/history)).

Singapore hosted the first Asian edition of Gamescom (the world's largest video game industry event) as well as the Global Esports Games in 2021. There is little doubt that the e-sports sector will become increasingly important there, with Singapore as the epicenter for the South-East Asia region. ([Source](https://www.straitstimes.com/business/companies-markets/singapore-sets-sights-on-becoming-world-force-in-e-sports)).

<a id='ProbState'></a>
## Problem Statement

With e-sports this popular, a familiar form of spectator entertainment inevitably follows: guessing the winner. This project therefore tackles the subject of winner prediction.

Among the various video game genres, MOBA, FPS and RTS tend to be more popular, and their winners are easier to guess since the matches are played between two teams. This project instead takes on the task of predicting the winner of a match in a Battle Royale game, where many players or teams compete in the same match. The focus is on Fortnite, currently the third-highest prize-awarding game. ([Source](https://www.esportsearnings.com/games)).

As a battle royale, Fortnite offers several match modes, distinguished by the number of players who fight together as one group for the win: 'Solo' for individual players (every player for themselves), 'Duos' for teams of 2, 'Trios' for teams of 3 and 'Squads' for teams of 4.

To keep the scope manageable, the main task of this project is to develop a classification model that predicts the winner of a solo match in Fortnite. Since winner prediction usually takes place before the match begins, the model will be based on historical player statistics.

<a id='Part1'></a>
## Part 1: Data Collection
This notebook covers the first part of the web scraping process: gathering the `match ids`, which allow us to identify and scrape the match statistics needed for our data.

<a id='imports'></a>
### 1. Imports (All imported libraries are added here)

In [1]:
# Import libraries
import pandas as pd
import requests
import re
from bs4 import BeautifulSoup

import selenium
from selenium import webdriver

<a id='scrape'></a>
### 2. Web scraping

In [2]:
# Defining base url
base_url = "https://fortnitetracker.com"

<a id='events'></a>
### 2.1 Events Archive

We begin by scraping the events archive, which is the starting point for gathering the data needed for this project.

In [3]:
# URL of the archived events page
url = base_url + '/events/archived/'

In [4]:
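# Fetch the archive page and parse the HTML
req = requests.get(url)
soup = BeautifulSoup(req.content, "lxml")

# Each tournament is rendered as an <a class="fne-poster__child"> element
tourneylist = soup.find_all('a', {'class': "fne-poster__child"})

# Compiling dataframe of tournaments
tournaments = []
for i in tourneylist:
    tourney = {}
    # The first <span> holds the region name
    tourney['region'] = i.find('span').text
    # Relative link to the event page
    tourney['href'] = i.attrs['href']
    # The second class of the second <span> encodes the status, e.g. 'fne-status--ended'
    tourney['status'] = i.find_all('span')[1].attrs['class'][1]
    tournaments.append(tourney)
tournaments_df = pd.DataFrame(tournaments)
tournaments_df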

| | region | href | status |
|---|---|---|---|
| 0 | Europe | /events/epicgames_S19_TriosHypeCup_Champion_EU | fne-status--upcoming |
| 1 | NA East | /events/epicgames_S19_TriosHypeCup_Champion_NAE | fne-status--upcoming |
| 2 | NA West | /events/epicgames_S19_TriosHypeCup_Champion_NAW | fne-status--upcoming |
| 3 | Oceania | /events/epicgames_S19_TriosHypeCup_Champion_OCE | fne-status--upcoming |
| 4 | Asia | /events/epicgames_S19_TriosHypeCup_Champion_ASIA | fne-status--upcoming |
| ... | ... | ... | ... |
| 2011 | Oceania | /events/epicgames_ShareTheLove_Placement_Duo_OCE | fne-status--ended |
| 2012 | Europe | /events/epicgames_ShareTheLove_Placement_Solo_EU | fne-status--ended |
| 2013 | Brazil | /events/epicgames_ShareTheLove_Placement_Solo_BR | fne-status--ended |
| 2014 | Asia | /events/epicgames_ShareTheLove_Placement_Solo_... | fne-status--ended |
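The extraction above relies on a particular markup shape: each tournament is an `<a class="fne-poster__child">` element whose first `<span>` holds the region and whose second `<span>` carries the status as its second CSS class. The following is a minimal offline sketch of that same logic. It swaps BeautifulSoup for the standard library's `html.parser` so that it runs without extra dependencies, and both the `parse_posters` helper and the sample HTML are hypothetical, written only to mirror the structure the notebook's code expects; the real page may differ.

```python
from html.parser import HTMLParser


class PosterParser(HTMLParser):
    """Collect region, href and status from tournament poster links."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._current = None         # record for the <a> currently being read
        self._span_index = -1        # position of the current <span> inside the <a>
        self._in_first_span = False  # True while inside the region <span>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "a" and "fne-poster__child" in classes:
            self._current = {"region": "", "href": attrs.get("href"), "status": None}
            self._span_index = -1
        elif tag == "span" and self._current is not None:
            self._span_index += 1
            self._in_first_span = self._span_index == 0
            # Second class of the second <span>, as in the notebook's code
            if self._span_index == 1 and len(classes) > 1:
                self._current["status"] = classes[1]

    def handle_data(self, data):
        if self._in_first_span and self._current is not None:
            self._current["region"] += data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_first_span = False
        elif tag == "a" and self._current is not None:
            self.rows.append(self._current)
            self._current = None


def parse_posters(html):
    """Hypothetical helper: return one dict per tournament poster link."""
    parser = PosterParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical markup that mimics the assumed page structure
SAMPLE_HTML = """
<div>
  <a class="fne-poster__child" href="/events/demo_EU">
    <span>Europe</span>
    <span class="fne-status fne-status--ended">Ended</span>
  </a>
  <a class="fne-poster__child" href="/events/demo_ASIA">
    <span>Asia</span>
    <span class="fne-status fne-status--upcoming">Upcoming</span>
  </a>
</div>
"""

for row in parse_posters(SAMPLE_HTML):
    print(row)
```

Testing the parsing against a small fixed snippet like this makes it easier to notice when a change in the site's markup, rather than a network problem, is the reason a scrape returns nothing.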


In [5]:
# Excluding ongoing or upcoming tournaments
tournaments_df = tournaments_df[tournaments_df['status'] == 'fne-status--ended'].reset_index(drop = True)
tournaments_df

| | region | href | status |
|---|---|---|---|
| 0 | Europe | /events/epicgames_S19_PlaystationCup_EU | fne-status--ended |
| 1 | NA East | /events/epicgames_S19_PlaystationCup_NAE | fne-status--ended |
| 2 | NA West | /events/epicgames_S19_PlaystationCup_NAW | fne-status--ended |
| 3 | Brazil | /events/epicgames_S19_PlaystationCup_BR | fne-status--ended |
| 4 | Middle East | /events/epicgames_S19_PlaystationCup_ME | fne-status--ended |
| ... | ... | ... | ... |
| 1948 | Oceania | /events/epicgames_ShareTheLove_Placement_Duo_OCE | fne-status--ended |
| 1949 | Europe | /events/epicgames_ShareTheLove_Placement_Solo_EU | fne-status--ended |
| 1950 | Brazil | /events/epicgames_ShareTheLove_Placement_Solo_BR | fne-status--ended |
| 1951 | Asia | /events/epicgames_ShareTheLove_Placement_Solo_... | fne-status--ended |


<a id='matchid'></a>
### 2.2 Match IDs

Now that we have the URLs for each of the events, we scrape the match ids that identify the matches we are going to use for analysis.
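Each event page URL is formed from the base URL and the relative `href` scraped from the archive. The notebook does this by string concatenation; as a small alternative sketch, `urllib.parse.urljoin` from the standard library gives the same result for these links and also copes with an `href` that is already absolute. The `event_url` helper is a hypothetical name introduced only for illustration.

```python
from urllib.parse import urljoin

BASE_URL = "https://fortnitetracker.com"  # same base URL as in the notebook


def event_url(href, base_url=BASE_URL):
    """Hypothetical helper: absolute URL for a scraped event link."""
    return urljoin(base_url, href)


# href value taken from the tournaments table above
print(event_url("/events/epicgames_S19_PlaystationCup_EU"))
# → https://fortnitetracker.com/events/epicgames_S19_PlaystationCup_EU
```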

In [6]:
%%time
# Scraping the match ids
driver = webdriver.Chrome()
matches = []

for i in tournaments_df.index:
    print(f'Scraping tournament index {i}')
    try:
        driver.get(base_url + tournaments_df['href'][i])
        imp_leaderboard = driver.execute_script('return imp_leaderboard')
    
        for j in imp_leaderboard['entries'][0]['sessionHistory']:
            match = {}
            match['tournament'] = imp_leaderboard['eventId']
            match['event_round'] = imp_leaderboard['eventWindowId']
            match['match_id'] = j['sessionId']
            matches.append(match)
                    
    # Catch any failure (for example, the page does not define the
    # `imp_leaderboard` JavaScript variable, or expected keys are missing),
    # report it and move on to the next tournament.
    except Exception as err:
        print(f'Error in scraping ({type(err).__name__}), skipping.')
        continue
   
driver.quit()

Scraping tournament index 0
Scraping tournament index 1
Scraping tournament index 2
...
Scraping tournament index 1833
Scraping tournament index 1834
...
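The loop above reads the `imp_leaderboard` JavaScript variable from each event page and turns the `sessionHistory` of the first leaderboard entry into match records. The flattening step can be sketched on its own, without a browser, as below. The key names (`eventId`, `eventWindowId`, `entries`, `sessionHistory`, `sessionId`) come from the notebook's code, while the `extract_matches` helper and the sample payload are hypothetical. Unlike the notebook's loop, this sketch returns an empty list instead of raising when keys are missing.

```python
def extract_matches(leaderboard):
    """Hypothetical helper: flatten one leaderboard payload into match records."""
    entries = leaderboard.get("entries") or []
    if not entries:
        return []
    # Mirror the notebook: only the first leaderboard entry's history is read
    history = entries[0].get("sessionHistory") or []
    return [
        {
            "tournament": leaderboard.get("eventId"),
            "event_round": leaderboard.get("eventWindowId"),
            "match_id": session["sessionId"],
        }
        for session in history
        if "sessionId" in session
    ]


# Hypothetical payload shaped like the structure the notebook's code assumes
sample = {
    "eventId": "demo_event",
    "eventWindowId": "demo_event_Round1",
    "entries": [
        {"sessionHistory": [{"sessionId": "aaa"}, {"sessionId": "bbb"}]},
        {"sessionHistory": [{"sessionId": "ccc"}]},
    ],
}

for record in extract_matches(sample):
    print(record)
```

Keeping the flattening in a plain function makes it possible to check the record layout against a saved payload before running the full, slow browser loop.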

In [7]:
# Passing to dataframe
matches_df = pd.DataFrame(matches)
matches_df

| | tournament | event_round | match_id |
|---|---|---|---|
| 0 | epicgames_S19_PlaystationCup_EU | S19_PlaystationCup_EU_Event1_Round2 | 2f5edf4cf77e4d62a1a596097527fc7c |
| 1 | epicgames_S19_PlaystationCup_EU | S19_PlaystationCup_EU_Event1_Round2 | cda0a24adfed404e884236740bacc1f7 |
| 2 | epicgames_S19_PlaystationCup_EU | S19_PlaystationCup_EU_Event1_Round2 | 011af1ad879c45e0ad8607647220efe0 |
| 3 | epicgames_S19_PlaystationCup_EU | S19_PlaystationCup_EU_Event1_Round2 | c0d7f093a3ba48e0960c2c2c74014b75 |
| 4 | epicgames_S19_PlaystationCup_EU | S19_PlaystationCup_EU_Event1_Round2 | 001e52411f634f7a938c76c0681482cd |
| ... | ... | ... | ... |
| 15748 | epicgames_ShareTheLove_Prospect_Solo_OCE | ShareTheLove_Prospect_Solo_OCE_Event5 | 0b0964ff5a0d4271aeb8ad9174924202 |
| 15749 | epicgames_ShareTheLove_Prospect_Solo_OCE | ShareTheLove_Prospect_Solo_OCE_Event5 | 371895f9332948e6970a748461a92cdf |
| 15750 | epicgames_ShareTheLove_Prospect_Solo_OCE | ShareTheLove_Prospect_Solo_OCE_Event5 | 1e2590adc4d743028839efdfbd5c3ed6 |
| 15751 | epicgames_ShareTheLove_Prospect_Solo_OCE | ShareTheLove_Prospect_Solo_OCE_Event5 | 0f1d50843a304eda8e51ae6857cde1ac |


In [9]:
# Saving to file
matches_df.to_csv('../datasets/match_ids.csv', index = False)

<a id='progress'></a>
### 3. Progress so far
In [Part 1-1 Webscraping Match IDs](./1-1.webscraping_match_ids.ipynb), we finished scraping the match ids from the `Fortnite Tracker` website. These ids will be used to scrape the match sessions, which contain all of the match statistics for our analysis.

Note that the scraped match ids are not comprehensive, since various issues with the website made web scraping difficult. This is not considered a serious issue, as there are enough match sessions for analysis.
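One way to put a number on how incomplete the scrape is would be to compare the tournaments that were requested against the tournaments that actually produced match ids. The sketch below is a rough illustration, not part of the original notebook. It assumes that the last path segment of each `href` equals the `tournament` value stored with the matches, which holds for the sample rows shown above but is not guaranteed for every event. The `scrape_coverage` helper is a hypothetical name.

```python
def scrape_coverage(tournament_hrefs, matches):
    """Hypothetical helper: share of tournaments with at least one match id.

    Assumes the last path segment of each href equals the `tournament`
    value recorded on the matches (an assumption based on sample rows).
    """
    expected = {href.rstrip("/").rsplit("/", 1)[-1] for href in tournament_hrefs}
    scraped = {match["tournament"] for match in matches}
    missing = sorted(expected - scraped)
    covered = len(expected) - len(missing)
    ratio = covered / len(expected) if expected else 0.0
    return ratio, missing


# Small illustration using values that appear in the tables above
hrefs = [
    "/events/epicgames_S19_PlaystationCup_EU",
    "/events/epicgames_S19_PlaystationCup_NAE",
]
matches = [
    {
        "tournament": "epicgames_S19_PlaystationCup_EU",
        "event_round": "S19_PlaystationCup_EU_Event1_Round2",
        "match_id": "2f5edf4cf77e4d62a1a596097527fc7c",
    },
]

ratio, missing = scrape_coverage(hrefs, matches)
print(f"coverage: {ratio:.0%}, missing: {missing}")
```

With the notebook's own data, the inputs would be `tournaments_df['href']` and the `matches` list built during scraping.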

We will continue next in: [Part 1-2 Webscraping Match Sessions](./1-2.webscraping_match_sessions.ipynb)