# 1. Libraries

In [156]:
# Import Libraries
import requests
import pandas as pd
import os
from dotenv import load_dotenv

# 2. Introduction

The purpose of this notebook is to gather data on the European ARMS Challenge (EUAC) tournament series. <br>
The EUAC is an online biweekly tournament series for the video game "ARMS" released for the Nintendo Switch console in 2017. <br>
The EUAC started in 2017 and ran consistently until 2022.<br>
<br>
More information can be found in the References below.

This notebook only gathers the data, acts as a proof of concept, and formats the data for future use and analysis. <br>
If successful, the data is then exported to csv files. <br>

# 3. Objectives

- Determine what data is necessary
- Gather EUAC#1 data from start.gg using their api
- Gather EUAC#2 to EUAC#110 data from Challonge.com using their api
- Format the data into a pandas dataframe
- Export to csv

## 3.1. Objective 1

### Determine what data is necessary

For each tournament, we will need: <br>
- The participants who entered
- Each match that was played
- The result of each match
- The date the tournament took place
- The final placements
- Each player's seeding

## 3.2. Objective 2 - Data Gathering

The tournaments were mostly hosted on Challonge.com but the first one was hosted on start.gg. <br>

### 3.2.1. - Start.gg

Start.gg's API uses GraphQL as its query language. <br>
The GraphQL queries are wrapped in multiline strings so they can be sent to the start.gg API to retrieve the data. <br>
Queries also expect an API token to be passed in the request header. This token is loaded from a .env file.<br>
<br>
A link to the tutorial that was followed to come up with the queries can be found in References.
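
As a rough sketch of what such a request contains, the snippet below builds the headers and JSON body without sending anything. It assumes the endpoint accepts the standard GraphQL `variables` field, so the slug does not have to be written into the query string. `build_request` and the dummy token are hypothetical names used only for illustration.

```python
import json

API_URL = "https://api.start.gg/gql/alpha"  # same endpoint this notebook posts to

# The slug is passed as a GraphQL variable instead of being hard-coded in the query
TOURNAMENT_QUERY = """
query TournamentQuery($slug: String) {
  tournament(slug: $slug) {
    name
    events {
      name
      id
    }
  }
}
"""

def build_request(query, variables, token):
    """Return the headers and the JSON body for a GraphQL POST request."""
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"query": query, "variables": variables})
    return headers, body

headers, body = build_request(
    TOURNAMENT_QUERY,
    {"slug": "tournament/eu-arms-challenge-1"},
    token="dummy-token",  # placeholder, not a real token
)
print(headers["Authorization"])
```

Keeping the request construction separate from the network call makes it easy to inspect what will be sent before spending API quota.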

In [157]:
# Load API Keys stored in .env file
load_dotenv(".env")
START_GG_API_TOKEN = os.getenv("START_GG_API_TOKEN")

The tournament was hosted at this url: https://www.start.gg/tournament/eu-arms-challenge-1/events <br>
Some of the information that is needed requires the id of the event, which is obtained here using the url slug.

In [158]:
# GraphQL query wrapped in a multiline string. 
query = """
query TournamentQuery {
  tournament(slug: "tournament/eu-arms-challenge-1") {
    name
    events {
      name
      id
    }
  }
}
"""

In [159]:
# Sends the query and returns the response. Will be using this a lot
def run_query(query):
    url = "https://api.start.gg/gql/alpha"
    headers = {
        "Authorization": f"Bearer {START_GG_API_TOKEN}",
        "Content-Type": "application/json"
    }
    response = requests.post(url, json={'query': query}, headers=headers)
    return response.json()

In [160]:
data = run_query(query)

In [161]:
data

{'data': {'tournament': {'name': 'EU ARMS Challenge #1',
   'events': [{'name': 'EU ARMS Challenge #1', 'id': 53835}]}},
 'extensions': {'cacheControl': {'version': 1,
   'hints': [{'path': ['tournament'], 'maxAge': 300, 'scope': 'PRIVATE'}]},
  'queryComplexity': 2},
 'actionRecords': []}

In [162]:
eventId = data["data"]["tournament"]["events"][0]["id"]

In [163]:
eventId

53835

The id of the event is 53835

Using the event id, a query can be written to get the players, their seedings and placements, and the start date of the tournament. <br>
Some, but not all, of this information could have been retrieved with the earlier query.

In [164]:
# Query for players, seedings, placements

query = """
query {
  event(id: 53835) {
    id
    name
    startAt
    standings(query: {
      page: 1
    }) {
      nodes {
        placement
        entrant {
          id
          name
          seeds {
            seedNum
          }
        }
      }
    }
  }
}
"""

In [165]:
data = run_query(query)

In [166]:
data

{'data': {'event': {'id': 53835,
   'name': 'EU ARMS Challenge #1',
   'startAt': 1508605200,
   'standings': {'nodes': [{'placement': 1,
      'entrant': {'id': 1152276,
       'name': 'FR | Maxou0708',
       'seeds': [{'seedNum': 20}]}},
     {'placement': 2,
      'entrant': {'id': 1118825,
       'name': 'Rapha_MTH',
       'seeds': [{'seedNum': 6}]}},
     {'placement': 3,
      'entrant': {'id': 1133810, 'name': 'Sabaca', 'seeds': [{'seedNum': 8}]}},
     {'placement': 4,
      'entrant': {'id': 1152521,
       'name': 'TCM | Raffa',
       'seeds': [{'seedNum': 21}]}},
     {'placement': 5,
      'entrant': {'id': 1152258,
       'name': 'FrankTank',
       'seeds': [{'seedNum': 19}]}},
     {'placement': 5,
      'entrant': {'id': 1114568, 'name': 'ocrim', 'seeds': [{'seedNum': 2}]}},
     {'placement': 7,
      'entrant': {'id': 1139232,
       'name': 'Alumento',
       'seeds': [{'seedNum': 11}]}},
     {'placement': 7,
      'entrant': {'id': 1133727,
       'name': 'SC☆Mo

The response is a nested dictionary. Next, we gather the list of participants with their placements and seeds by digging through it.

In [167]:
data["data"]["event"]["name"]

'EU ARMS Challenge #1'

In [168]:
data["data"]["event"]["startAt"]

1508605200

In [169]:
tournamentDate = data["data"]["event"]["startAt"]

In [170]:
tournamentDate

1508605200

This is the UNIX timestamp of when the tournament took place. Only the date is required; the time of day is unnecessary.

In [171]:
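from datetime import datetime, timezone

# utcfromtimestamp() is deprecated in recent Python versions; pass the UTC timezone explicitly
date = datetime.fromtimestamp(tournamentDate, tz=timezone.utc).strftime('%d/%m/%y')
print(date)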

21/10/17


In [172]:
data["data"]["event"]["standings"]

{'nodes': [{'placement': 1,
   'entrant': {'id': 1152276,
    'name': 'FR | Maxou0708',
    'seeds': [{'seedNum': 20}]}},
  {'placement': 2,
   'entrant': {'id': 1118825, 'name': 'Rapha_MTH', 'seeds': [{'seedNum': 6}]}},
  {'placement': 3,
   'entrant': {'id': 1133810, 'name': 'Sabaca', 'seeds': [{'seedNum': 8}]}},
  {'placement': 4,
   'entrant': {'id': 1152521,
    'name': 'TCM | Raffa',
    'seeds': [{'seedNum': 21}]}},
  {'placement': 5,
   'entrant': {'id': 1152258,
    'name': 'FrankTank',
    'seeds': [{'seedNum': 19}]}},
  {'placement': 5,
   'entrant': {'id': 1114568, 'name': 'ocrim', 'seeds': [{'seedNum': 2}]}},
  {'placement': 7,
   'entrant': {'id': 1139232, 'name': 'Alumento', 'seeds': [{'seedNum': 11}]}},
  {'placement': 7,
   'entrant': {'id': 1133727, 'name': 'SC☆Momso', 'seeds': [{'seedNum': 7}]}},
  {'placement': 9,
   'entrant': {'id': 1114889,
    'name': 'VilleViljar',
    'seeds': [{'seedNum': 4}]}},
  {'placement': 9,
   'entrant': {'id': 1152242,
    'name': 'TC

In [173]:
data["data"]["event"]["standings"]["nodes"][0] #Information for one player

{'placement': 1,
 'entrant': {'id': 1152276,
  'name': 'FR | Maxou0708',
  'seeds': [{'seedNum': 20}]}}

In [174]:
# Placement
data["data"]["event"]["standings"]["nodes"][0]["placement"]

1

In [175]:
# Name
data["data"]["event"]["standings"]["nodes"][0]["entrant"]["name"]

'FR | Maxou0708'

In [176]:
# id
data["data"]["event"]["standings"]["nodes"][0]["entrant"]["id"]

1152276

In [177]:
# Seeding
data["data"]["event"]["standings"]["nodes"][0]["entrant"]["seeds"][0]["seedNum"]

20

In [178]:
# Putting it all together in a for loop. Store information in arrays

playerArray = []
placementArray = []
seedArray = []
idArray = []

for entrant in data["data"]["event"]["standings"]["nodes"]:
    playerArray.append(entrant["entrant"]["name"])
    idArray.append(entrant["entrant"]["id"])
    placementArray.append(entrant["placement"])
    seedArray.append(entrant["entrant"]["seeds"][0]["seedNum"])

In [179]:
# Make a pandas dataframe of the arrays

playerdf = pd.DataFrame({
    "Start ID": idArray,
    "Player": playerArray,
    "Seed": seedArray,
    "Placement": placementArray
})

In [180]:
playerdf.head()

Unnamed: 0,Start ID,Player,Seed,Placement
0,1152276,FR | Maxou0708,20,1
1,1118825,Rapha_MTH,6,2
2,1133810,Sabaca,8,3
3,1152521,TCM | Raffa,21,4
4,1152258,FrankTank,19,5


In [181]:
playerdf.to_csv("EUAC1Placements.csv", index=False)

All players, their Start ids, their placements and their seedings have been gathered. <br>To note for later: players can sign up with "tags", which are separated from the name by a |. However, a | can also be present within a tag.
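
A minimal sketch of how a tag could be separated from a name later on. It assumes that the last `" | "` (with spaces on both sides) is the separator, which holds for the names seen so far but is not guaranteed by start.gg. `split_tag` is a hypothetical helper, not part of the notebook.

```python
def split_tag(name, separator=" | "):
    """Split 'TAG | Player' on the LAST separator; the tag is None when there is none."""
    tag, found, player = name.rpartition(separator)
    if not found:
        return None, name
    return tag, player

# Names taken from the standings and set outputs in this notebook
examples = ["FR | Maxou0708", "Rapha_MTH", "FR|TCM | InkAlyut"]
for example in examples:
    print(split_tag(example))
```

Splitting on the last separator keeps a `|` that is part of the tag itself, as in `FR|TCM`, inside the tag.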

Because tournaments on Start.gg are structured hierarchically, we need to use the event ID to get the phase ID, use the phase ID to get the phase group ID, and then use the phase group ID to obtain the sets that were played in the tournament along with their reported scores.
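
The chain can be sketched offline with trimmed copies of the responses this notebook receives. The helper names (`phase_ids`, `phase_group_ids`, `sets_query`) are hypothetical; the point is only that each id is read from the previous response and fed into the next query instead of being typed in by hand.

```python
# Trimmed copies of the responses returned for EUAC#1 in this notebook
event_response = {"data": {"event": {"id": 53835, "phases": [{"id": 159661, "name": "Bracket"}]}}}
phase_response = {"data": {"phase": {"phaseGroups": {"nodes": [{"id": 431370}]}}}}

def phase_ids(response):
    """Event response -> list of phase ids."""
    return [phase["id"] for phase in response["data"]["event"]["phases"]]

def phase_group_ids(response):
    """Phase response -> list of phase group ids."""
    return [node["id"] for node in response["data"]["phase"]["phaseGroups"]["nodes"]]

def sets_query(phase_group_id, page=1, per_page=100):
    """Build the sets query for one phase group from its id."""
    return f"""
query {{
  phaseGroup(id: {phase_group_id}) {{
    sets(page: {page}, perPage: {per_page}) {{
      nodes {{
        id
        displayScore
        fullRoundText
        winnerId
        slots {{ entrant {{ name }} }}
      }}
    }}
  }}
}}"""

for group_id in phase_group_ids(phase_response):
    print(sets_query(group_id))
```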

In [182]:
# Query to get Phase ID

query = """
query {
  event(id: 53835) {
    id
    name
    phases {
      id
      name
    }
  }
}"""

In [183]:
data = run_query(query)

In [184]:
data

{'data': {'event': {'id': 53835,
   'name': 'EU ARMS Challenge #1',
   'phases': [{'id': 159661, 'name': 'Bracket'}]}},
 'extensions': {'cacheControl': {'version': 1,
   'hints': [{'path': ['event'], 'maxAge': 60, 'scope': 'PRIVATE'}]},
  'queryComplexity': 2},
 'actionRecords': []}

In [185]:
# Phase ID
data["data"]["event"]["phases"][0]["id"]

159661

In [186]:
# Query to get Phase Group ID
query = """
query {
  phase(id: 159661) {
    phaseGroups {
      nodes {
        id
      }
    }
  }
}"""

In [187]:
data = run_query(query)

In [188]:
data

{'data': {'phase': {'phaseGroups': {'nodes': [{'id': 431370}]}}},
 'extensions': {'cacheControl': {'version': 1, 'hints': None},
  'queryComplexity': 1},
 'actionRecords': []}

In [189]:
# Phase Group ID
data["data"]["phase"]["phaseGroups"]["nodes"][0]["id"]

431370

In [190]:
# Query to get all sets

query = """
query {
  phaseGroup(id: 431370) {
    sets(page: 1, perPage: 100) {
      nodes {
        id
        displayScore
        fullRoundText
        winnerId
        slots {
          entrant {
            name
          }
        }
      }
    }
  }
}"""

In [191]:
data = run_query(query)

In [192]:
# Display all sets
data

{'data': {'phaseGroup': {'sets': {'nodes': [{'id': 10598232,
      'displayScore': 'Rapha_MTH 1 - FR | Maxou0708 3',
      'fullRoundText': 'Grand Final',
      'winnerId': 1152276,
      'slots': [{'entrant': {'name': 'Rapha_MTH'}},
       {'entrant': {'name': 'FR | Maxou0708'}}]},
     {'id': 10598233,
      'displayScore': 'FR | Maxou0708 3 - Rapha_MTH 0',
      'fullRoundText': 'Grand Final Reset',
      'winnerId': 1152276,
      'slots': [{'entrant': {'name': 'FR | Maxou0708'}},
       {'entrant': {'name': 'Rapha_MTH'}}]},
     {'id': 10598231,
      'displayScore': 'Sabaca 0 - Rapha_MTH 2',
      'fullRoundText': 'Winners Final',
      'winnerId': 1118825,
      'slots': [{'entrant': {'name': 'Sabaca'}},
       {'entrant': {'name': 'Rapha_MTH'}}]},
     {'id': 10598295,
      'displayScore': 'DQ',
      'fullRoundText': 'Losers Final',
      'winnerId': 1152276,
      'slots': [{'entrant': {'name': 'Sabaca'}},
       {'entrant': {'name': 'FR | Maxou0708'}}]},
     {'id': 1059829

In [193]:
# Showing a set example
data["data"]["phaseGroup"]["sets"]["nodes"][0]

{'id': 10598232,
 'displayScore': 'Rapha_MTH 1 - FR | Maxou0708 3',
 'fullRoundText': 'Grand Final',
 'winnerId': 1152276,
 'slots': [{'entrant': {'name': 'Rapha_MTH'}},
  {'entrant': {'name': 'FR | Maxou0708'}}]}

In [194]:
# Retrieving a player's name
data["data"]["phaseGroup"]["sets"]["nodes"][0]["slots"][0]["entrant"]["name"]

'Rapha_MTH'

The sets are returned starting from the "end" of the tournament: the last match played is returned first.

In [195]:
data["data"]["phaseGroup"]["sets"]["nodes"][-3]

{'id': 10598210,
 'displayScore': 'DQ',
 'fullRoundText': 'Winners Round 1',
 'winnerId': 1152122,
 'slots': [{'entrant': {'name': 'Kotorious BRD'}},
  {'entrant': {'name': 'Altair'}}]}

In [196]:
# Showing a set's score
data["data"]["phaseGroup"]["sets"]["nodes"][7]["displayScore"]

'ocrim 0 - FR | Maxou0708 2'

Player names and scores can be read from "displayScore", but DQs will require a bit more work. <br>
The player names will be taken from "slots" and the winner looked up in the dataframe by winner id; the loser is the other player. <br>
<br>
To get the score, the string is split on " - "; the last character of each part is that player's score (this assumes single-digit scores).
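
As an alternative sketch, a regular expression can pull out both names and both scores in one step and also copes with scores of more than one digit. It assumes the `"<name> <score> - <name> <score>"` layout seen in the outputs above; anything else, including `"DQ"`, yields `None`. `parse_display_score` is a hypothetical helper.

```python
import re

# "<name1> <score1> - <name2> <score2>", e.g. "ocrim 0 - FR | Maxou0708 2"
SCORE_PATTERN = re.compile(
    r"^(?P<name1>.+) (?P<score1>\d+) - (?P<name2>.+) (?P<score2>\d+)$"
)

def parse_display_score(display_score):
    """Return ((name1, score1), (name2, score2)), or None if the text is not a score."""
    match = SCORE_PATTERN.match(display_score)
    if match is None:
        return None
    return (
        (match["name1"], int(match["score1"])),
        (match["name2"], int(match["score2"])),
    )

print(parse_display_score("ocrim 0 - FR | Maxou0708 2"))
print(parse_display_score("DQ"))
```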

In [197]:
scoreSplit = data["data"]["phaseGroup"]["sets"]["nodes"][7]["displayScore"].split(" - ")
scoreSplit

['ocrim 0', 'FR | Maxou0708 2']

In [198]:
# Function to retrieve player name from dataframe with their start id no.

def retrieve_player(df, pid):
    player = df[df["Start ID"] == pid]
    if not player.empty:
        return player.iloc[0]["Player"]
    else:
        print(f"No player matching ID {pid} was found")

In [199]:
retrieve_player(playerdf, 1152276)

'FR | Maxou0708'

In [200]:
import re

In [201]:
player1Array = []
player2Array = []
winnerArray = []
loserArray = []
matchArray = []
scoreArray = []
matchNo = 1

for sets in reversed(data["data"]["phaseGroup"]["sets"]["nodes"]):
    #Setting Player 1 and Player 2
    player1 = sets["slots"][0]["entrant"]["name"]
    player2 = sets["slots"][1]["entrant"]["name"]
    player1Array.append(player1)
    player2Array.append(player2)
    
    # Determining Winner and Loser
    winner = retrieve_player(playerdf, sets["winnerId"])
    winnerArray.append(winner)
    # If winner is p1, then loser is p2. Otherwise, p1 must be the loser
    if winner == player1:
        loser = player2
    else:
        loser = player1
    loserArray.append(loser)
    
    # Match no
    matchArray.append(matchNo)
    matchNo += 1
    
    # Score
    displayScore = sets["displayScore"]
    # Catch DQs and register them as "0--1"
    if displayScore == "DQ":
        score = "0--1"
    else:
        # Write score in perspective of winner. Higher score always first
        # Ex. 2-0. Never 0-2
        scoreSplit = displayScore.split(" - ")
        games = sorted((int(scoreSplit[0][-1]), int(scoreSplit[1][-1])), reverse=True)
        score = f"{games[0]}-{games[1]}"
    # Always append exactly one score per set so all arrays keep the same length
    scoreArray.append(score)

In [202]:
df = pd.DataFrame({
    "Player1": player1Array,
    "Player2": player2Array,
    "Winner": winnerArray,
    "Score": scoreArray,
    "Loser": loserArray,
    "MatchNo": matchArray,
    "EUAC": 1,
    "Date": date
})

In [203]:
df.head()

Unnamed: 0,Player1,Player2,Winner,Score,Loser,MatchNo,EUAC,Date
0,Alumento,Owdy,Alumento,2-0,Owdy,1,1,21/10/17
1,BambooBoss,FrankTank,FrankTank,2-0,BambooBoss,2,1,21/10/17
2,Kotorious BRD,Altair,Kotorious BRD,0--1,Altair,3,1,21/10/17
3,RD | | Dushni,TCM | Raffa,TCM | Raffa,2-0,RD | | Dushni,4,1,21/10/17
4,FR|TCM | InkAlyut,FR | Maxou0708,FR|TCM | InkAlyut,2-1,FR | Maxou0708,5,1,21/10/17


In [204]:
df.to_csv("EUAC1Sets.csv", index=False)

### 3.2.2 - Challonge.com

#### 3.2.2.1 - Acquiring URLs

Using the direct url was fine for the single Start.gg tournament, but there are too many Challonge links to handle them one by one. <br>
Instead, we will acquire the links by parsing the wiki page for the series, which lists them all.

In [205]:
from bs4 import BeautifulSoup

url = "https://armswiki.org/wiki/EU_ARMS_Challenge"

response = requests.get(url)
soup = BeautifulSoup(response.content, "html.parser")

In [206]:
print(soup)

<!DOCTYPE html>

<html class="client-nojs" dir="ltr" lang="en">
<head>
<meta charset="utf-8"/>
<title>EU ARMS Challenge - ARMS Institute, the ARMS Wiki</title>
<script>document.documentElement.className="client-js";RLCONF={"wgBreakFrames":false,"wgSeparatorTransformTable":["",""],"wgDigitTransformTable":["",""],"wgDefaultDateFormat":"dmy","wgMonthNames":["","January","February","March","April","May","June","July","August","September","October","November","December"],"wgRequestId":"8099ca58d868fceee76cd9bc","wgCanonicalNamespace":"","wgCanonicalSpecialPageName":false,"wgNamespaceNumber":0,"wgPageName":"EU_ARMS_Challenge","wgTitle":"EU ARMS Challenge","wgCurRevisionId":23064,"wgRevisionId":23064,"wgArticleId":4223,"wgIsArticle":true,"wgIsRedirect":false,"wgAction":"view","wgUserName":null,"wgUserGroups":["*"],"wgCategories":["Tournaments"],"wgPageViewLanguage":"en","wgPageContentLanguage":"en","wgPageContentModel":"wikitext","wgRelevantPageName":"EU_ARMS_Challenge","wgRelevantArticleId":

In [207]:
# Reduce data down to the div containing all the links needed
tournamentData = soup.find_all('div', attrs={'class':"mw-body-content mw-content-ltr"})

In [208]:
tournamentArray = []
for links in tournamentData:
    tournamentArray.append(links.find_all("a", class_="external text"))

In [209]:
tournamentArray

[[<a class="external text" href="https://smash.gg/tournament/eu-arms-challenge-1/details" rel="nofollow">EU ARMS Challenge #1</a>,
  <a class="external text" href="https://challonge.com/EUCHALLLENGE2" rel="nofollow">EU ARMS Challenge #2</a>,
  <a class="external text" href="https://challonge.com/EUCHALLLENGE3" rel="nofollow">EU ARMS Challenge #3</a>,
  <a class="external text" href="https://challonge.com/EUCHALLENGE4_Europe" rel="nofollow">EU ARMS Challenge #4 (EUROPE BRACKET)</a>,
  <a class="external text" href="https://challonge.com/EUChallenge5" rel="nofollow">EU ARMS Challenge #5</a>,
  <a class="external text" href="https://challonge.com/EUChallenge6" rel="nofollow">EU ARMS Challenge #6</a>,
  <a class="external text" href="https://challonge.com/EUChallenge7" rel="nofollow">EU ARMS Challenge #7</a>,
  <a class="external text" href="https://challonge.com/EUChallenge8" rel="nofollow">EU ARMS Challenge #8</a>,
  <a class="external text" href="https://challonge.com/EUChallenge9" rel=

A list of lists in a list... Could have done this another way
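
One other way is to collect the links into a flat list straight away. The notebook uses BeautifulSoup, where calling `find_all("a", class_="external text")` on the soup itself would already return a flat list. The sketch below swaps in the standard library's `HTMLParser` purely so that it runs offline on two anchors copied from the output above. `ExternalLinkParser` is a hypothetical class name.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class ExternalLinkParser(HTMLParser):
    """Collect href values of <a> tags whose class list contains 'external'."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        if tag == "a" and "external" in classes and attributes.get("href"):
            self.links.append(attributes["href"])

# Two anchors copied from the wiki output above
sample_html = (
    '<a class="external text" href="https://smash.gg/tournament/eu-arms-challenge-1/details" rel="nofollow">EU ARMS Challenge #1</a>'
    '<a class="external text" href="https://challonge.com/EUCHALLLENGE2" rel="nofollow">EU ARMS Challenge #2</a>'
)

parser = ExternalLinkParser()
parser.feed(sample_html)

# Keep Challonge links only and reduce each one to its URL slug
challonge_slugs = [
    urlparse(link).path.strip("/")
    for link in parser.links
    if urlparse(link).hostname == "challonge.com"
]
print(challonge_slugs)
```

Comparing the hostname rather than searching for the substring "challonge" avoids matching an unrelated link that merely mentions the site.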

In [210]:
linkArray = []
for links in tournamentArray:
    for link in links:
        linkArray.append(link["href"]) # Extract only the href attribute (the link)

In [211]:
linkArray

['https://smash.gg/tournament/eu-arms-challenge-1/details',
 'https://challonge.com/EUCHALLLENGE2',
 'https://challonge.com/EUCHALLLENGE3',
 'https://challonge.com/EUCHALLENGE4_Europe',
 'https://challonge.com/EUChallenge5',
 'https://challonge.com/EUChallenge6',
 'https://challonge.com/EUChallenge7',
 'https://challonge.com/EUChallenge8',
 'https://challonge.com/EUChallenge9',
 'https://challonge.com/EUChallenge10',
 'https://challonge.com/EUChallenge11',
 'https://challonge.com/EUChallenge12',
 'https://challonge.com/EUChallenge13',
 'https://challonge.com/EUChallenge14',
 'https://challonge.com/EUChallenge15',
 'https://challonge.com/EUChallenge16',
 'https://challonge.com/EUChallenge17',
 'https://challonge.com/EUChallenge18',
 'https://challonge.com/EUChallenge19',
 'https://challonge.com/EUChallenge20',
 'https://challonge.com/EUChallenge21',
 'https://challonge.com/EUChallenge22',
 'https://challonge.com/EUChallenge23',
 'https://challonge.com/EUChallenge24',
 'https://challonge

In [212]:
# Remove non-challonge links

challongeArray = []

for links in linkArray:
    if "challonge" in links:
        challongeArray.append(links)

In [213]:
challongeArray

['https://challonge.com/EUCHALLLENGE2',
 'https://challonge.com/EUCHALLLENGE3',
 'https://challonge.com/EUCHALLENGE4_Europe',
 'https://challonge.com/EUChallenge5',
 'https://challonge.com/EUChallenge6',
 'https://challonge.com/EUChallenge7',
 'https://challonge.com/EUChallenge8',
 'https://challonge.com/EUChallenge9',
 'https://challonge.com/EUChallenge10',
 'https://challonge.com/EUChallenge11',
 'https://challonge.com/EUChallenge12',
 'https://challonge.com/EUChallenge13',
 'https://challonge.com/EUChallenge14',
 'https://challonge.com/EUChallenge15',
 'https://challonge.com/EUChallenge16',
 'https://challonge.com/EUChallenge17',
 'https://challonge.com/EUChallenge18',
 'https://challonge.com/EUChallenge19',
 'https://challonge.com/EUChallenge20',
 'https://challonge.com/EUChallenge21',
 'https://challonge.com/EUChallenge22',
 'https://challonge.com/EUChallenge23',
 'https://challonge.com/EUChallenge24',
 'https://challonge.com/EUChallenge25',
 'https://challonge.com/EUChallenge26',

#### 3.2.2.2 - Challonge API Exploration

This section loads the API keys and explores the Challonge API to figure out how to collect the data.

In [214]:
# Load API Keys from .env file
API_KEY = os.getenv("API_KEY")
API_SECRET = os.getenv("API_SECRET")

In [215]:
import challonge

# Tell pychallonge about your Challonge API credentials (http://api.challonge.com/v1).
challonge.set_credentials(API_KEY, API_SECRET)

In [216]:
# See available methods
methods_and_attributes = dir(challonge)
print(methods_and_attributes)

['ChallongeException', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', 'api', 'attachments', 'fetch', 'get_credentials', 'get_timezone', 'matches', 'participants', 'set_credentials', 'set_timezone', 'set_user_agent', 'tournaments']


In [217]:
# Retrieve a tournament by its id (or its url).
tournament = challonge.tournaments.show(challongeArray[0].split("/")[-1]) # Requires only URL slug

In [218]:
tournament

{'id': 3973078,
 'name': 'EU ARMS Challenge #2',
 'url': 'EUCHALLLENGE2',
 'description': '<p><span style="background-color: rgb(39, 42, 51);">Tournament hosted by EUARMSCompetitve\xa0Discord!</span></p><p><span style="background-color: rgb(39, 42, 51);">The tournament is EU exclusive, meaning anyone outside of EU can\'t participate.</span></p><p>If we get 32+ participants the winner of the tournament will get a\xa0Nintendo eShop Card 15 €!</p><p><span style="background-color: rgb(39, 42, 51);"><br>You must be part of the EU ARMS discord in order to participate and coordinate your matchups\xa0</span><span style="background-color: rgb(39, 42, 51);">\xa0</span>https://discord.gg/W486K28</p><p>Rules:\xa0\xa0https://goo.gl/PkXDZo<br><br></p>',
 'tournament_type': 'double elimination',
 'started_at': datetime.datetime(2017, 11, 12, 14, 7, 13, 162000, tzinfo=<DstTzInfo 'Europe/London' GMT0:00:00 STD>),
 'completed_at': datetime.datetime(2017, 11, 12, 17, 22, 0, 232000, tzinfo=<DstTzInfo 'Eur

In [219]:
matches = challonge.matches.index(tournament["id"])

In [220]:
matches[0]

{'id': 103347707,
 'tournament_id': 3973078,
 'state': 'complete',
 'player1_id': 64282703,
 'player2_id': 64283849,
 'player1_prereq_match_id': None,
 'player2_prereq_match_id': None,
 'player1_is_prereq_match_loser': False,
 'player2_is_prereq_match_loser': False,
 'winner_id': 64283849,
 'loser_id': 64282703,
 'started_at': datetime.datetime(2017, 11, 12, 14, 7, 13, 287000, tzinfo=<DstTzInfo 'Europe/London' GMT0:00:00 STD>),
 'created_at': datetime.datetime(2017, 11, 12, 14, 7, 12, 863000, tzinfo=<DstTzInfo 'Europe/London' GMT0:00:00 STD>),
 'updated_at': datetime.datetime(2017, 11, 12, 14, 21, 3, 299000, tzinfo=<DstTzInfo 'Europe/London' GMT0:00:00 STD>),
 'identifier': 'A',
 'has_attachment': False,
 'round': 1,
 'player1_votes': None,
 'player2_votes': None,
 'group_id': None,
 'attachment_count': None,
 'scheduled_time': None,
 'location': None,
 'underway_at': None,
 'optional': False,
 'completed_at': datetime.datetime(2017, 11, 12, 14, 21, 3, 433000, tzinfo=<DstTzInfo 'Europe

Each match stores the players' ids, the winner id, the loser id and the score.

In [221]:
# Retrieve the participants for a given tournament.
participants = challonge.participants.index(tournament["id"])

In [222]:
participants[0]

{'id': 63926376,
 'tournament_id': 3973078,
 'name': '',
 'seed': 1,
 'active': True,
 'created_at': datetime.datetime(2017, 11, 5, 12, 9, 29, 690000, tzinfo=<DstTzInfo 'Europe/London' GMT0:00:00 STD>),
 'updated_at': datetime.datetime(2017, 11, 5, 12, 9, 29, 690000, tzinfo=<DstTzInfo 'Europe/London' GMT0:00:00 STD>),
 'invite_email': None,
 'final_rank': 5,
 'misc': None,
 'icon': None,
 'on_waiting_list': False,
 'invitation_id': None,
 'group_id': None,
 'checked_in_at': datetime.datetime(2017, 11, 12, 13, 39, 2, 915000, tzinfo=<DstTzInfo 'Europe/London' GMT0:00:00 STD>),
 'ranked_member_id': None,
 'custom_field_response': None,
 'clinch': None,
 'integration_uids': None,
 'challonge_username': 'InkA_',
 'challonge_user_id': 1902903,
 'challonge_email_address_verified': True,
 'removable': False,
 'participatable_or_invitation_attached': True,
 'confirm_remove': True,
 'invitation_pending': False,
 'display_name_with_invitation_email_address': 'InkA_',
 'email_hash': 'f1dcf32d96b85

Participants stores a player's name, username, seed, challonge id, and tournament player id

In [223]:
print(len(participants))

17


In [224]:
participants[-1]

{'id': 64282712,
 'tournament_id': 3973078,
 'name': '',
 'seed': 17,
 'active': False,
 'created_at': datetime.datetime(2017, 11, 12, 11, 30, 59, 551000, tzinfo=<DstTzInfo 'Europe/London' GMT0:00:00 STD>),
 'updated_at': datetime.datetime(2017, 11, 12, 11, 30, 59, 551000, tzinfo=<DstTzInfo 'Europe/London' GMT0:00:00 STD>),
 'invite_email': None,
 'final_rank': None,
 'misc': None,
 'icon': None,
 'on_waiting_list': False,
 'invitation_id': None,
 'group_id': None,
 'checked_in_at': None,
 'ranked_member_id': None,
 'custom_field_response': None,
 'clinch': None,
 'integration_uids': None,
 'challonge_username': 'ThatD',
 'challonge_user_id': 2532908,
 'challonge_email_address_verified': False,
 'removable': False,
 'participatable_or_invitation_attached': True,
 'confirm_remove': True,
 'invitation_pending': False,
 'display_name_with_invitation_email_address': 'ThatD',
 'email_hash': '54e045ca1ed55fe5d90d1a4c6980a9db',
 'username': 'ThatD',
 'display_name': 'ThatD',
 'attached_partic

The list also includes participants who signed up but did not check in, meaning they did not take part in the tournament. <br>
These are denoted by "checked_in_at": None and "active": False.

Players in a "match" only have their player id instead of their name or challonge id.

In [225]:
matches[17]["scores_csv"]

'86-101'

Users enter the scores themselves, so they can be misleading (here '86-101'). This will be dealt with later during data cleaning.
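
A possible starting point for that cleaning step is sketched below. It assumes that "scores_csv" holds one or more comma-separated `"<player1>-<player2>"` entries, and it uses a hypothetical cut-off (`MAX_PLAUSIBLE_SCORE`) that would need to be set to the real set length used by the tournament. Neither helper is part of pychallonge.

```python
import re

SET_PATTERN = re.compile(r"^(-?\d+)-(-?\d+)$")
MAX_PLAUSIBLE_SCORE = 5  # hypothetical cut-off, adjust to the tournament's rules

def parse_scores_csv(scores_csv):
    """Return a list of (player1_score, player2_score) tuples, one per comma-separated entry."""
    results = []
    for part in scores_csv.split(","):
        match = SET_PATTERN.match(part.strip())
        if match is None:
            raise ValueError(f"Unrecognised score: {part!r}")
        results.append((int(match.group(1)), int(match.group(2))))
    return results

def is_plausible(scores_csv):
    """True when every score lies between 0 and MAX_PLAUSIBLE_SCORE."""
    return all(
        0 <= score <= MAX_PLAUSIBLE_SCORE
        for pair in parse_scores_csv(scores_csv)
        for score in pair
    )

print(parse_scores_csv("86-101"), is_plausible("86-101"))
print(parse_scores_csv("3-0"), is_plausible("3-0"))
```

Flagging implausible rows instead of correcting them automatically leaves the decision about each one to the later cleaning pass.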

#### 3.2.2.3 - Challonge Data Collection

In this section, we add the players from the second EUAC to the table from the first EUAC (playerdf). <br>
For the time being, Start ID and Challonge ID are kept separate rather than merged into one ID column. Players from the first EUAC may have entered later EUACs, but it is not yet possible to identify them. <br>
Later on, we will identify the players who took part in both the first EUAC and subsequent ones.

In [226]:
playerdf.head()

Unnamed: 0,Start ID,Player,Seed,Placement
0,1152276,FR | Maxou0708,20,1
1,1118825,Rapha_MTH,6,2
2,1133810,Sabaca,8,3
3,1152521,TCM | Raffa,21,4
4,1152258,FrankTank,19,5


In [227]:
def get_seed(name, df):
    player = df[df["Player"] == name]
    if not player.empty:
        return player.iloc[0]["Seed"]
    else:
        print(f"No player named {name} was found")

In [230]:
def get_placement(name, df):
    player = df[df["Player"] == name]
    if not player.empty:
        return player.iloc[0]["Placement"]
    else:
        print(f"No player named {name} was found")

In [229]:
get_seed("Rapha_MTH", playerdf)

6

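In [None]:
p1Seed = []
p2Seed = []
p1Placement = []
p2Placement = []

# Look up the seed and placement of both players for every EUAC#1 match
for p1, p2 in zip(df["Player1"], df["Player2"]):
    p1Seed.append(get_seed(p1, playerdf))
    p2Seed.append(get_seed(p2, playerdf))
    p1Placement.append(get_placement(p1, playerdf))
    p2Placement.append(get_placement(p2, playerdf))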

The Seed and Placement columns from the first EUAC are no longer needed in the player dataframe.

In [145]:
# Remove Seed and Placement columns
playerdf = playerdf.drop(["Seed", "Placement"], axis = "columns")

In [146]:
# Add players from second EUAC to the player dataframe
nameArray = []
idArray = []
for players in participants:
    if players["active"]:
        idArray.append(players["challonge_user_id"])
        nameArray.append(players["challonge_username"])

challongedf = pd.DataFrame({
    "Player": nameArray,
    "Challonge ID": idArray
})

# ignore_index=True gives the combined table a fresh 0..n-1 index
playerdf = pd.concat([playerdf, challongedf], ignore_index=True, sort=False)

In [147]:
playerdf.head()

Unnamed: 0,Start ID,Player,Challonge ID
0,1152276.0,FR | Maxou0708,
1,1118825.0,Rapha_MTH,
2,1133810.0,Sabaca,
3,1152521.0,TCM | Raffa,
4,1152258.0,FrankTank,


The "participants" and "matches" API responses expose different ids, and not both give access to names. <br>
A player gets a new participant id for each tournament but also has a single Challonge user id. A match only holds the participant ids, so a player's name is not accessible from it. <br>
To get around this limitation, functions are written to look this data up. <br>
A loop can then compile the data and append it to the existing dataframe containing the matches (df).
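
One way to make these lookups cheap is to index the participants by their per-tournament id once, rather than scanning the list for every match. The sketch below uses two participants trimmed from the output above; `index_by_tournament_id` and `challonge_name` are hypothetical helpers and differ from the functions the notebook defines.

```python
# Two participants trimmed from the API output above (most fields omitted)
sample_participants = [
    {"id": 63926376, "challonge_username": "InkA_", "challonge_user_id": 1902903, "active": True},
    {"id": 64282712, "challonge_username": "ThatD", "challonge_user_id": 2532908, "active": False},
]

def index_by_tournament_id(participants):
    """Map each per-tournament participant id to its participant record."""
    return {participant["id"]: participant for participant in participants}

def challonge_name(lookup, participant_id):
    """Return the Challonge username, or None if the id is unknown."""
    participant = lookup.get(participant_id)
    return participant["challonge_username"] if participant else None

lookup = index_by_tournament_id(sample_participants)
print(challonge_name(lookup, 63926376))
```

Passing the lookup in as an argument also removes the dependency on a global `participants` variable, which matters once several tournaments are processed in a loop.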

In [148]:
# Get a challonge id given their player id.
# Player id is unique for each tournament.
# Necessary for missing information from some api pulls

def get_challonge_id(uid):
    for i in participants:
        if i["id"] == uid:
            return i["challonge_user_id"]
    return None  # No participant with this id

In [149]:
# Returns challonge name given a player's tournament id
def get_challonge_name(uid):
    for i in participants:
        if i["id"] == uid:
            return i["challonge_username"]
    return None  # No participant with this id

In [150]:
# Get tournament date
date = tournament["started_at"]
date

datetime.datetime(2017, 11, 12, 14, 7, 13, 162000, tzinfo=<DstTzInfo 'Europe/London' GMT0:00:00 STD>)

In [151]:
# Format date
date = date.strftime("%d/%m/%y")
date

'12/11/17'

In [152]:
df.head()

Unnamed: 0,Player1,Player2,Winner,Score,Loser,MatchNo,EUAC,Date
0,Alumento,Owdy,Alumento,2-0,Owdy,1,1,21/10/17
1,BambooBoss,FrankTank,FrankTank,2-0,BambooBoss,2,1,21/10/17
2,Kotorious BRD,Altair,Kotorious BRD,0--1,Altair,3,1,21/10/17
3,RD | | Dushni,TCM | Raffa,TCM | Raffa,2-0,RD | | Dushni,4,1,21/10/17
4,FR|TCM | InkAlyut,FR | Maxou0708,FR|TCM | InkAlyut,2-1,FR | Maxou0708,5,1,21/10/17


In [153]:
tournamentNo = 2
matchNo = 1

for i in matches:
    
    # Get player details
    player1Id = i["player1_id"]
    player2Id = i["player2_id"]
    player1CId = get_challonge_id(player1Id)
    player2CId = get_challonge_id(player2Id)
    player1Name = get_challonge_name(player1Id)
    player2Name = get_challonge_name(player2Id)
    
    
    # Determine winner and loser
    if i["player1_id"] == i["winner_id"]:
        winner = player1Name
        loser = player2Name
    else:
        loser = player1Name
        winner = player2Name
    
    # Score
    score = i["scores_csv"]
    
    # Add to dataframe
    df.loc[len(df)] = [player1Name, player2Name, winner, score, loser, matchNo, tournamentNo, date]
    matchNo += 1

In [154]:
df.tail()

Unnamed: 0,Player1,Player2,Winner,Score,Loser,MatchNo,EUAC,Date
60,InkA_,_Rem_,_Rem_,86-101,InkA_,18,2,12/11/17
61,GameFroggit,Ripha,Ripha,0-2,GameFroggit,19,2,12/11/17
62,_Rem_,Ripha,_Rem_,3-0,Ripha,20,2,12/11/17
63,Frank001,_Rem_,Frank001,3-0,_Rem_,21,2,12/11/17
64,Raffa_,Frank001,Raffa_,3-0,Frank001,22,2,12/11/17


The next step is to put this all together in one big loop and finish compiling all the match data.

In [155]:
from tqdm import tqdm

In [None]:
import time

# challonge.tournaments.show() only needs the URL slug.
# EUAC#2 (challongeArray[0]) was already added above, so start from the next link.
urlArray = [link.split("/")[-1] for link in challongeArray[1:]]

for url in tqdm(urlArray, desc="Processing", unit="tournament"):
    try:
        # Retrieve a tournament by its id (or its url), then its matches and participants
        tournament = challonge.tournaments.show(url)
        matches = challonge.matches.index(tournament["id"])
        participants = challonge.participants.index(tournament["id"])
    except Exception as error:
        print(f"Error at {url}: {error}")
        continue

    # Add players who checked in to the player dataframe
    newPlayers = []
    for i in participants:
        if i["active"]:
            if i["challonge_username"] is None:
                # No Challonge account: fall back to the display name and tournament id
                newPlayers.append({"Player": i["display_name"], "Challonge ID": i["id"]})
            else:
                newPlayers.append({"Player": i["challonge_username"], "Challonge ID": i["challonge_user_id"]})
    playerdf = pd.concat([playerdf, pd.DataFrame(newPlayers)], ignore_index=True)
    playerdf = playerdf.drop_duplicates(subset=["Player", "Challonge ID"], ignore_index=True)

    # Building dataframe info
    tournamentno = int(re.findall(r'\d+', url)[0])
    date = tournament["started_at"].strftime("%d/%m/%y")

    # Acquiring player information
    matchno = 0
    for i in matches:
        if i["suggested_play_order"] is None:
            matchno += 1
        else:
            matchno = i["suggested_play_order"]

        # get_challonge_name reads the participants list loaded above
        player1name = get_challonge_name(i["player1_id"])
        player2name = get_challonge_name(i["player2_id"])
        # Seeds and final placements are in participants ("seed", "final_rank"),
        # but df has no columns for them yet, so they are not looked up here

        if i["player1_id"] == i["winner_id"]:
            winner = player1name
            loser = player2name
        else:
            loser = player1name
            winner = player2name

        # Getting score
        score = i["scores_csv"]

        # Add to dataframe (one row per match)
        df.loc[len(df)] = [player1name, player2name, winner, score, loser, matchno, tournamentno, date]

    time.sleep(3)  # Avoid too many api calls

# References

- https://armswiki.org/wiki/EU_ARMS_Challenge
- https://developer.start.gg/docs/examples/queries/get-event
- https://github.com/ZEDGR/pychallonge