# Get Required Data

We want to train a machine learning classifier to predict the outcome of a game of Dota 2 once the heroes have been selected. Let's look at exactly what data we need and where to get it.

### Required Data

#### Required Input:

 - Radiant Hero IDs 0-4
 - Dire Hero IDs 0-4
 - Date
 
#### Required Output:

 - Winner (Radiant or Dire)
 
 
#### Optional Input:

 - Radiant Player IDs 0-4
 - Dire Player IDs 0-4
 - Patch number
 - Starting items
 - Starting gold


For now, we'll build the simplest possible system, so we'll start with only the required input and output.

In [1]:
import os
import json
import time
import datetime
import requests

In [2]:
# Read the key once so that STEAM_API_KEY is always defined, even when it is missing.
STEAM_API_KEY = os.environ.get('STEAM_API_KEY')
if STEAM_API_KEY is None:
  print("No API Key :(")
else:
  print("Found API Key.")

Found API Key.


In [6]:
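base_url = 'https://api.steampowered.com'

def __request(method, path, **kwargs):
  """Send a request to the Steam Web API and return the decoded JSON body."""
  url = base_url + path
  # Attach the API key to every request's query string.
  kwargs.setdefault('params', dict()).update(key=STEAM_API_KEY)
  kwargs.setdefault('timeout', 10)  # seconds; avoid hanging on a stalled connection
  response = requests.request(method, url, **kwargs)
  response.raise_for_status()  # fail loudly on HTTP errors instead of parsing an error page
  return response.json()

def get_match_history_by_seq_num(seq_num, num_matches, **params):
  """Fetch up to num_matches matches, starting at match sequence number seq_num."""
  path = '/IDOTA2Match_570/GetMatchHistoryBySequenceNum/V001'
  params.update(start_at_match_seq_num=seq_num)
  params.update(matches_requested=num_matches)
  return __request('get', path, params=params)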
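
To make it easier to see what the helper above actually sends, here is a minimal offline sketch that builds a roughly equivalent GET URL using only the standard library. The function name `build_url` and the placeholder key `YOUR_KEY` are illustrative assumptions, not part of the notebook; `requests` assembles the query string itself when given `params`.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.steampowered.com"
PATH = "/IDOTA2Match_570/GetMatchHistoryBySequenceNum/V001"

def build_url(seq_num, num_matches, api_key):
    """Return the full GET URL for one match-history request (hypothetical helper)."""
    params = {
        "start_at_match_seq_num": seq_num,
        "matches_requested": num_matches,
        "key": api_key,  # placeholder here; never print or commit a real key
    }
    return f"{BASE_URL}{PATH}?{urlencode(params)}"

url = build_url(5126114401, 1, "YOUR_KEY")
print(url)
```

Printing the URL this way can help when debugging a request in a browser or with another HTTP client, as long as the real key is kept out of logs.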

In [7]:
response = get_match_history_by_seq_num(5126114401, 1)

In [16]:
def get_hero_id(response, player_index):
    # player_index is the position (0-9) in the match's player list, not an account ID.
    players = response['result']['matches'][0]['players']
    return players[player_index]['hero_id']

In [20]:
for i in range(0, 10):
    print(get_hero_id(response, i))

62
14
106
28
11
10
85
112
2
48
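
The loop above prints all ten hero IDs in list order, but the model needs to know which five belong to Radiant and which to Dire. One way to split them, sketched below, relies on each player's `player_slot` field, where the highest of its 8 bits is set for Dire players. The function name `split_heroes_by_team` is hypothetical, and `example_match` is a hand-built stand-in: it reuses the hero IDs printed above and assumes slots 0-4 and 128-132, which the notebook does not actually show.

```python
def split_heroes_by_team(match):
    """Return (radiant_hero_ids, dire_hero_ids) for one match dict."""
    radiant, dire = [], []
    for player in match["players"]:
        # The top bit of the 8-bit player_slot marks the team: set means Dire.
        if player["player_slot"] & 0x80:
            dire.append(player["hero_id"])
        else:
            radiant.append(player["hero_id"])
    return radiant, dire

# Hand-built example (assumed slot values, hero IDs taken from the output above).
slots = [0, 1, 2, 3, 4, 128, 129, 130, 131, 132]
heroes = [62, 14, 106, 28, 11, 10, 85, 112, 2, 48]
example_match = {
    "players": [{"player_slot": s, "hero_id": h} for s, h in zip(slots, heroes)]
}

radiant, dire = split_heroes_by_team(example_match)
print(radiant)
print(dire)
```

With a real response, the same function could be called on `response['result']['matches'][0]`; checking the slot rather than trusting list order avoids mislabeling teams if the order ever differs.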


In [23]:
response['result']['matches'][0]['radiant_win']

True

In [24]:
response['result']['matches'][0]['start_time']

1628362217
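
The `start_time` value is a Unix timestamp (seconds since 1970-01-01 UTC), so it still has to be converted to get the date listed in the required input. A minimal sketch using the standard library, with the variable names chosen here for illustration:

```python
from datetime import datetime, timezone

start_time = 1628362217  # value returned for the match above

# Interpret the timestamp explicitly as UTC so the result does not depend on the local time zone.
started_at = datetime.fromtimestamp(start_time, tz=timezone.utc)

print(started_at.isoformat())  # 2021-08-07T18:50:17+00:00
print(started_at.date())       # 2021-08-07
```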

Not so bad! Next, let's collect this information in bulk.