### This script gets the list of all games with their ID numbers and outputs a file at `./data/game_id.csv`
### As of 11/8/2019, there are 345727 games. More information about the API can be found at https://rawg.io/apidocs, and its endpoints are listed at https://api.rawg.io/docs/

In [14]:
import json
import requests
from pprint import pprint
import os
import csv
from time import time
import concurrent.futures
import functools
import math
from pandas import json_normalize
import pandas as pd

## Multithreading
#### This function requests pages of games (20 games per page) and saves each page as a JSON file in `./data/game_id/`. As of 11/8/2019, there are 17272 pages

In [2]:
def threading(start_index, pages_per_worker, urls, downloaded_files, headers):
    """Download one worker's batch of page URLs and save each page as JSON."""
    batch = urls[start_index : start_index + pages_per_worker]
    for url in batch:
        page_no = url.rsplit("page=", 1)[-1]
        # Skip pages already saved by a previous run
        if page_no in downloaded_files:
            continue
        try:
            # Request API
            response = requests.get(url, headers=headers)
            response.raise_for_status()
            json_data = response.json()

            # Get wanted data
            D = {game["id"]: game["slug"] for game in json_data["results"]}

            # Save data
            with open(f"./data/game_id/{page_no}.json", "w", encoding="utf8") as f:
                json.dump(D, f)
        except Exception:
            print(f"Error with {url}")
    # Verbose notification (the original referenced an undefined name N here)
    print(f"Done from page {start_index + 1} to {start_index + len(batch)}")
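To illustrate what one saved page file contains, the sketch below applies the same dictionary comprehension to a made-up response (the two games are invented, not real API output). One detail worth knowing: `json.dump` writes the integer IDs as strings, so they come back as `str` keys when a file is reloaded.

```python
import json

# Invented sample shaped like the "results" list used above; not real API data
sample_response = {
    "results": [
        {"id": 101, "slug": "example-game-one"},
        {"id": 102, "slug": "example-game-two"},
    ]
}

D = {game["id"]: game["slug"] for game in sample_response["results"]}

# JSON object keys are always strings, so the int IDs are converted on dump
text = json.dumps(D)
reloaded = json.loads(text)

print(text)
print(list(reloaded))
```

This is why the IDs can be written straight to the CSV later without any further conversion.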

#### Create the output folder if it does not exist

In [3]:
if not os.path.exists('data/game_id/'):
    os.makedirs('data/game_id/')

#### The following code uses a thread pool to speed up the download. Up to 50 workers run at the same time, and each worker requests its own batch of pages. This reduced the run time from ~4 hours to ~40 minutes for 17272 pages

In [4]:
# Make the first request to get the total amount of pages to get
headers = { 'User-Agent': 'App Name: Education purpose',}
json_data = json.loads(requests.get(r"https://api.rawg.io/api/games", headers=headers).text)
no_of_pages = math.ceil(json_data["count"]/20)

# Set up number of workers
max_workers = 50
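# Round up (and never go below 1) so the range step is valid and no batches are left over
pages_per_worker = max(1, math.ceil(no_of_pages / max_workers))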
start_index = range(0, no_of_pages, pages_per_worker)

# Make urls
url = "https://api.rawg.io/api/games?page=1"
urls = [url[:-1] + str(i) for i in range(1, no_of_pages + 1)]
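To see how the start indices and the batch size divide the URL list among workers, here is a minimal offline sketch. The page count and worker count are made up, and `fake_worker` is a hypothetical stand-in that only reports which pages it would request, so no network calls are made. The sketch rounds the batch size up so that the number of batches never exceeds the number of workers.

```python
import concurrent.futures
import functools
import math


def fake_worker(start_index, pages_per_worker, urls):
    # Stand-in for the real download function: return the page numbers
    # this worker would request instead of calling the API.
    batch = urls[start_index : start_index + pages_per_worker]
    return [int(url.rsplit("page=", 1)[-1]) for url in batch]


# Hypothetical small numbers, chosen only to keep the output readable
no_of_pages = 23
max_workers = 5
pages_per_worker = max(1, math.ceil(no_of_pages / max_workers))
start_index = range(0, no_of_pages, pages_per_worker)
urls = [f"https://api.rawg.io/api/games?page={i}" for i in range(1, no_of_pages + 1)]

with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
    worker = functools.partial(fake_worker, pages_per_worker=pages_per_worker, urls=urls)
    # executor.map yields results in the same order as start_index
    batches = list(executor.map(worker, start_index))

for start, batch in zip(start_index, batches):
    print(f"start_index={start}: pages {batch[0]} to {batch[-1]}")
```

Each worker receives one start index and slices its own batch from the shared URL list, so every page is covered exactly once.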

In [6]:
# Skip pages that were already downloaded
downloaded_files = {file.split(".",1)[0] for file in os.listdir("data/game_id/")}

# Time
t0=time()
with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
    temp = functools.partial(threading,
                             pages_per_worker=pages_per_worker,
                             urls=urls,
                             downloaded_files=downloaded_files,
                             headers=headers,
                            )
    executor.map(temp, start_index)
print(f"Time taken: {time()-t0}")

Error with https://api.rawg.io/api/games?page=10335
Error with https://api.rawg.io/api/games?page=9864
Error with https://api.rawg.io/api/games?page=5640
Error with https://api.rawg.io/api/games?page=8929
Error with https://api.rawg.io/api/games?page=14077
Error with https://api.rawg.io/api/games?page=9865
Error with https://api.rawg.io/api/games?page=10336
Error with https://api.rawg.io/api/games?page=5641
Error with https://api.rawg.io/api/games?page=17830
Error with https://api.rawg.io/api/games?page=4229
Error with https://api.rawg.io/api/games?page=6106
Error with https://api.rawg.io/api/games?page=14078
Error with https://api.rawg.io/api/games?page=2830
Error with https://api.rawg.io/api/games?page=10337
Error with https://api.rawg.io/api/games?page=21584
Error with https://api.rawg.io/api/games?page=8462
Error with https://api.rawg.io/api/games?page=1415
Error with https://api.rawg.io/api/games?page=5642
Error with https://api.rawg.io/api/games?page=20640
Error with https://api.

Error with https://api.rawg.io/api/games?page=16131
Error with https://api.rawg.io/api/games?page=15132
Error with https://api.rawg.io/api/games?page=19859
Error with https://api.rawg.io/api/games?page=1981
Error with https://api.rawg.io/api/games?page=2937
Error with https://api.rawg.io/api/games?page=11362
Error with https://api.rawg.io/api/games?page=18396
Error with https://api.rawg.io/api/games?page=3424
Error with https://api.rawg.io/api/games?page=9991
Error with https://api.rawg.io/api/games?page=20309
Error with https://api.rawg.io/api/games?page=11850
Error with https://api.rawg.io/api/games?page=568
Error with https://api.rawg.io/api/games?page=6712
Error with https://api.rawg.io/api/games?page=21224
Error with https://api.rawg.io/api/games?page=1513
Error with https://api.rawg.io/api/games?page=20741
Error with https://api.rawg.io/api/games?page=12326
Error with https://api.rawg.io/api/games?page=8606
Error with https://api.rawg.io/api/games?page=5779
Error with https://api

Error with https://api.rawg.io/api/games?page=13378
Error with https://api.rawg.io/api/games?page=22697
Error with https://api.rawg.io/api/games?page=21780
Error with https://api.rawg.io/api/games?page=12910
Error with https://api.rawg.io/api/games?page=23297
Error with https://api.rawg.io/api/games?page=9154
Error with https://api.rawg.io/api/games?page=13826
Error with https://api.rawg.io/api/games?page=4484
Error with https://api.rawg.io/api/games?page=19962
Error with https://api.rawg.io/api/games?page=6338
Error with https://api.rawg.io/api/games?page=7287
Error with https://api.rawg.io/api/games?page=16272
Error with https://api.rawg.io/api/games?page=12431
Error with https://api.rawg.io/api/games?page=6828
Error with https://api.rawg.io/api/games?page=22698
Error with https://api.rawg.io/api/games?page=10086
Error with https://api.rawg.io/api/games?page=13379
Error with https://api.rawg.io/api/games?page=8189
Error with https://api.rawg.io/api/games?page=9155
Error with https://

Error with https://api.rawg.io/api/games?page=19142
Error with https://api.rawg.io/api/games?page=17277
Error with https://api.rawg.io/api/games?page=4584
Error with https://api.rawg.io/api/games?page=2744
Error with https://api.rawg.io/api/games?page=11070
Error with https://api.rawg.io/api/games?page=14906
Error with https://api.rawg.io/api/games?page=6940
Error with https://api.rawg.io/api/games?page=9226
Error with https://api.rawg.io/api/games?page=18147
Error with https://api.rawg.io/api/games?page=11622
Error with https://api.rawg.io/api/games?page=17707
Error with https://api.rawg.io/api/games?page=5053
Error with https://api.rawg.io/api/games?page=2193
Error with https://api.rawg.io/api/games?page=14910
Error with https://api.rawg.io/api/games?page=18148
Error with https://api.rawg.io/api/games?page=3660
Error with https://api.rawg.io/api/games?page=19151
Error with https://api.rawg.io/api/games?page=7411
Error with https://api.rawg.io/api/games?page=3135
Error with https://ap

Error with https://api.rawg.io/api/games?page=251
Error with https://api.rawg.io/api/games?page=12064
Error with https://api.rawg.io/api/games?page=6979
Error with https://api.rawg.io/api/games?page=2785
Error with https://api.rawg.io/api/games?page=10734
Error with https://api.rawg.io/api/games?page=21420
Error with https://api.rawg.io/api/games?page=8878
Error with https://api.rawg.io/api/games?page=14939
Error with https://api.rawg.io/api/games?page=12543
Error with https://api.rawg.io/api/games?page=18637
Error with https://api.rawg.io/api/games?page=22379
Error with https://api.rawg.io/api/games?page=20499
Error with https://api.rawg.io/api/games?page=15839
Error with https://api.rawg.io/api/games?page=21869
Error with https://api.rawg.io/api/games?page=17328
Error with https://api.rawg.io/api/games?page=747
Error with https://api.rawg.io/api/games?page=10216
Error with https://api.rawg.io/api/games?page=12078
Error with https://api.rawg.io/api/games?page=2801
Error with https://a

Error with https://api.rawg.io/api/games?page=20537
Error with https://api.rawg.io/api/games?page=1251
Error with https://api.rawg.io/api/games?page=12580
Error with https://api.rawg.io/api/games?page=9776
Error with https://api.rawg.io/api/games?page=20951
Error with https://api.rawg.io/api/games?page=7034
Error with https://api.rawg.io/api/games?page=15874
Error with https://api.rawg.io/api/games?page=18682
Error with https://api.rawg.io/api/games?page=21902
Error with https://api.rawg.io/api/games?page=15415
Error with https://api.rawg.io/api/games?page=13996
Error with https://api.rawg.io/api/games?page=18214
Error with https://api.rawg.io/api/games?page=9289
Error with https://api.rawg.io/api/games?page=17777
Error with https://api.rawg.io/api/games?page=22404
Error with https://api.rawg.io/api/games?page=3256
Error with https://api.rawg.io/api/games?page=774
Error with https://api.rawg.io/api/games?page=19229
Error with https://api.rawg.io/api/games?page=7899
Error with https://a
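
Pages that fail are never written to disk, and the download cell skips pages that are already saved, so re-running that cell retries only the failures. A small helper like the following could list which pages are still missing before a re-run. It is a hypothetical sketch, not part of the original notebook, and it assumes files are named `<page>.json` as above.

```python
import os


def missing_pages(no_of_pages, folder="data/game_id/"):
    """Return the sorted page numbers that have no saved JSON file yet."""
    downloaded = set()
    for name in os.listdir(folder):
        stem, ext = os.path.splitext(name)
        # Only count files that look like "<page>.json"
        if ext == ".json" and stem.isdigit():
            downloaded.add(int(stem))
    return [page for page in range(1, no_of_pages + 1) if page not in downloaded]
```

Calling `missing_pages(no_of_pages)` after a run gives the list of pages to retry; an empty list means every page was saved.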

#### Load each JSON file in `data/game_id/` and write its contents to a CSV file saved at `data/game_id.csv`

In [7]:
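with open("data/game_id.csv", "w", newline="", encoding="utf8") as f:
    csv_file = csv.writer(f, lineterminator="\n")
    for file in os.listdir("./data/game_id/"):
        try:
            with open(f"./data/game_id/{file}", "r", encoding="utf8") as json_file:
                json_data = json.load(json_file)
        except (OSError, json.JSONDecodeError):
            # Report the unreadable file and skip it instead of reusing stale data
            print(file)
            continue
        for game_id, game_name in json_data.items():
            csv_file.writerow([game_id, game_name])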
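
For later steps that need the IDs, the CSV can be read back with the standard `csv` module. The helper below is a hypothetical sketch (the name `load_game_ids` is not from the original notebook); it assumes the two-column, header-less layout written above, with the ID first and the slug second.

```python
import csv


def load_game_ids(path="data/game_id.csv"):
    """Read the two-column CSV (id, slug) back into a dict keyed by int id."""
    with open(path, newline="", encoding="utf8") as f:
        return {int(game_id): slug for game_id, slug in csv.reader(f)}
```

Comparing `len(load_game_ids())` with the `count` field returned by the API is one quick way to check whether any pages are still missing.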