# Find categories on the main page

On the main page of the Steam store, there is a four-tab section that lists games under different labels:

- Specials
- Popular Upcoming
- Top Sellers
- New & Trending

I came across an article whose author used web scraping and HTML post-processing to extract the games and their prices.

However, we can use a single API to collect the games and then use the /api/appdetails API to retrieve further information about each one (such as its price).

This approach also lets us track more games by tweaking the parameters passed to the API.

The /search/results/ API is used to collect the games.

In [10]:
from datetime import datetime
import time
import requests
import pickle
from pathlib import Path
import re

In [11]:
def print_log(*args):
    print(f"[{str(datetime.now())[:-3]}] ", end="")
    print(*args)

In [12]:
def get_search_results(params):
    req_sr = requests.get(
        "https://store.steampowered.com/search/results/",
        params=params)
    
    if req_sr.status_code != 200:
        print_log(f"Failed to get search results: {req_sr.status_code}")
        return {"items": []}
    
    try:
        search_results = req_sr.json()
    except Exception as e:
        print_log(f"Failed to parse search results: {e}")
        return {"items": []}
    
    return search_results

In [13]:
def get_app_details(appid):
    # Items whose logo URL could not be parsed carry appid None; nothing to request
    if appid is None:
        print_log("App Id is None.")
        return {}

    while True:
        appdetails_req = requests.get(
            "https://store.steampowered.com/api/appdetails/",
            params={"appids": appid, "cc": "hk", "l": "english"})

        if appdetails_req.status_code == 200:
            # The response body is keyed by the app id as a string
            appdetails = appdetails_req.json()
            appdetails = appdetails[str(appid)]
            print_log(f"App Id: {appid} - {appdetails['success']}")
            break

        elif appdetails_req.status_code == 429:
            # Rate limited: wait, then retry the same app id
            print_log("Too many requests. Sleep for 10 sec")
            time.sleep(10)
            continue

        elif appdetails_req.status_code == 403:
            print_log("Forbidden to access. Sleep for 5 min.")
            time.sleep(5 * 60)
            continue

        else:
            print_log("ERROR: status code:", appdetails_req.status_code)
            print_log(f"Error in App Id: {appid}.")
            appdetails = {}
            break

    return appdetails

To retrieve different search results, set the "filter" field to different values. The available values can be found by looking at the URL of the search page (i.e. store.steampowered.com/search/...)

From what I observed, there are five values:
- "" (empty): default behaviour. Usually used with the field "sort_by"="Released_DESC" to retrieve the most recently published games on Steam
- "popularnew": stands for "Popular New Releases"; corresponds to the "New & Trending" tab
- "topsellers": corresponds to the "Top Sellers" tab. Free-to-play games can be included or excluded (see the "hidef2p" field in the params below)
- "globaltopsellers": corresponds to the "Global Top Sellers" button in the "See more:" section of the "Top Sellers" tab
- "popularcommingsoon": corresponds to the "Popular Upcoming" tab

For the Specials tab, set the "specials" field to 1 in the params passed to the API
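
The mapping between tabs and parameters described above can be kept in one place. The following is a minimal sketch: the helper name `build_search_params` and the tab labels used as dictionary keys are my own naming and not part of the Steam API, and the filter values are simply the ones observed above.

```python
# Hypothetical lookup table: tab label -> extra params, using the values observed above
TAB_PARAMS = {
    "New & Trending": {"filter": "popularnew"},
    "Top Sellers": {"filter": "topsellers"},
    "Global Top Sellers": {"filter": "globaltopsellers"},
    "Popular Upcoming": {"filter": "popularcommingsoon"},
    "Specials": {"filter": "", "specials": 1},
}


def build_search_params(tab, page=1):
    """Return the query params for one page of the given tab."""
    params = {"json": 1, "page": page}   # same "json" and "page" fields as used in this notebook
    params.update(TAB_PARAMS[tab])
    return params


print(build_search_params("Specials", page=2))
```

The resulting dictionary can be passed as `params` to `get_search_results`.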

To keep things simple, for each game/app in the returned results, we call /api/appdetails once to get all the information about that application on Steam for further analysis
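
As an illustration of such further analysis, the sketch below pulls the price out of one appdetails entry. It assumes that a successful entry has the shape `{"success": True, "data": {...}}` and that priced games carry a `"price_overview"` dictionary with `"final"` (in the smallest currency unit) and `"currency"` keys. These field names are assumptions about the response and are not shown in this notebook, so check them against a real response. The function name `extract_price` and the sample dictionary are hypothetical.

```python
def extract_price(appdetail):
    """Return (amount, currency) for one appdetails entry, or None if no price is present."""
    if not appdetail or not appdetail.get("success"):
        return None
    price = appdetail.get("data", {}).get("price_overview")
    if not price:
        # Assumed case: entries without price information (e.g. free games)
        return None
    # Assumes "final" is expressed in hundredths of the currency unit
    return price["final"] / 100, price["currency"]


# Hand-written sample in the assumed shape, not real Steam data
sample = {
    "success": True,
    "data": {
        "name": "Example Game",
        "price_overview": {"currency": "HKD", "final": 12800},
    },
}
print(extract_price(sample))
```

Returning `None` for failed or unpriced entries keeps the caller free to decide how to treat them.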

In [4]:
execute_datetime = datetime.now()

search_result_folder_path = Path(f"search_results_{execute_datetime.strftime('%Y%m%d')}")
search_result_folder_path.mkdir(exist_ok=True)

In [17]:
# a list of filters
params_list = [
    {"filter": "topsellers"},
    {"filter": "globaltopsellers"},
    {"filter": "popularnew"},
    {"filter": "popularcommingsoon"},
    {"filter": "", "specials": 1}
]
page_list = list(range(1, 5))

params_sr_default = {
    "filter": "topsellers",
    "hidef2p": 1,
    "page": 1,              # page is used to go through different parts of the ranking. Each page contains 25 results
    "json": 1
}

for update_param in params_list:

    items_all = []
    if update_param["filter"]:
        filename = f"{update_param['filter']}_{execute_datetime.strftime('%Y%m%d')}.pkl"
    else:
        filename = f"specials_{execute_datetime.strftime('%Y%m%d')}.pkl"

    if (search_result_folder_path / filename).exists():
        print_log(f"File {filename} exists. Skip.")
        continue

    for page_no in page_list:
        param = params_sr_default.copy()
        param.update(update_param)
        param["page"] = page_no

        search_results = get_search_results(param)
        print_log(search_results)

        if not search_results:
            continue

        items = search_results.get("items", [])

        # preprocess the search results to retrieve the appid of each game
        for item in items:
            try:
                item["appid"] = re.search(r"steam/\w+/(\d+)", item["logo"]).group(1)      # the URL can be steam/bundles/{appid} or steam/apps/{appid}
            except Exception as e:
                print_log(f"Failed to extract appid: {e}")
                item["appid"] = None

        # request for game information using appid
        for item in items:
            appid = item["appid"]
            appdetails = get_app_details(appid)
            item["appdetail"] = appdetails

        items_all.extend(items)

    # save the search results
    with open(search_result_folder_path / filename, "wb") as f:
        pickle.dump(items_all, f)
    print_log(f"Saved {filename}")
    

[2024-05-28 11:09:55.199] File topsellers_20240528.pkl exists. Skip.
[2024-05-28 11:09:55.199] File globaltopsellers_20240528.pkl exists. Skip.
[2024-05-28 11:09:55.199] File popularnew_20240528.pkl exists. Skip.
[2024-05-28 11:09:55.199] File popularcommingsoon_20240528.pkl exists. Skip.
[2024-05-28 11:09:55.632] {'desc': '', 'items': [{'name': 'Planet Coaster', 'logo': 'https://shared.cloudflare.steamstatic.com/store_item_assets/steam/apps/493340/capsule_sm_120.jpg?t=1709820332'}, {'name': 'Red Dead Redemption 2', 'logo': 'https://shared.cloudflare.steamstatic.com/store_item_assets/steam/apps/1174180/capsule_sm_120.jpg?t=1714055653'}, {'name': 'Ready or Not', 'logo': 'https://shared.cloudflare.steamstatic.com/store_item_assets/steam/apps/1144200/capsule_sm_120.jpg?t=1707410886'}, {'name': "Tom Clancy's Rainbow Six® Siege", 'logo': 'https://shared.cloudflare.steamstatic.com/store_item_assets/steam/apps/359550/capsule_sm_120.jpg?t=1715902855'}, {'name': 'Total War: WARHAMMER III', 'log
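
To make the appid extraction step easier to follow, here is the same regular expression applied to one logo URL from the output above. This is only a small sketch; the helper name `extract_id_from_logo` is hypothetical and returns `None` instead of raising when no id is found.

```python
import re

# Same pattern as in the loop above: the numeric id follows steam/apps/ or steam/bundles/
LOGO_ID_PATTERN = re.compile(r"steam/\w+/(\d+)")


def extract_id_from_logo(logo_url):
    """Return the numeric id embedded in a logo URL as a string, or None if absent."""
    match = LOGO_ID_PATTERN.search(logo_url or "")
    return match.group(1) if match else None


# Logo URL taken from the output above (Planet Coaster)
url = ("https://shared.cloudflare.steamstatic.com/store_item_assets/"
       "steam/apps/493340/capsule_sm_120.jpg?t=1709820332")
print(extract_id_from_logo(url))            # 493340
print(extract_id_from_logo("no id here"))   # None
```

Note that the id is returned as a string, which matches how it is stored in `item["appid"]`.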

## Read search results of a particular date

In [21]:
date_to_read = datetime(2024, 5, 28)

items_listlist = []

for file in search_result_folder_path.iterdir():
    if file.suffix == ".pkl":
        date = datetime.strptime(file.stem.split("_")[-1], "%Y%m%d")
        if date == date_to_read:
            with open(file, "rb") as f:
                items = pickle.load(f)
            print_log(f"Read {file}")
            items_listlist.append(items)
            

[2024-05-28 11:17:42.246] Read search_results/topsellers_20240528.pkl
[2024-05-28 11:17:42.250] Read search_results/popularcommingsoon_20240528.pkl
[2024-05-28 11:17:42.252] Read search_results/specials_20240528.pkl
[2024-05-28 11:17:42.255] Read search_results/globaltopsellers_20240528.pkl
[2024-05-28 11:17:42.259] Read search_results/popularnew_20240528.pkl
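
Since the same game may appear in more than one list, a possible next step is to flatten `items_listlist` and keep one entry per appid. The following is a minimal sketch that uses hand-written sample data in place of the pickled results; the function name `merge_unique_items` is my own.

```python
def merge_unique_items(items_listlist):
    """Flatten a list of item lists, keeping the first entry seen for each appid."""
    merged = {}
    for items in items_listlist:
        for item in items:
            appid = item.get("appid")
            if appid is None:
                continue   # skip entries whose appid could not be extracted
            merged.setdefault(appid, item)
    return list(merged.values())


# Hand-written sample in the same shape as the saved items (not real results)
sample_listlist = [
    [{"name": "Game A", "appid": "1"}, {"name": "Game B", "appid": "2"}],
    [{"name": "Game B", "appid": "2"}, {"name": "Game C", "appid": None}],
]
print([item["name"] for item in merge_unique_items(sample_listlist)])
```

A dictionary keyed by appid preserves the order in which games were first seen, so the earlier lists take priority.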
