## Objective

The objective of this notebook is to retrieve the historical sales data for every shoe in our product ID list. We query the StockX API with GET requests, passing the product ID and paging parameters in the URL, and save each response as a JSON file.

In [1]:
import pandas as pd
import requests
import time
import csv
import json
import psycopg2 as pg2
from psycopg2.extras import RealDictCursor, Json
from sqlalchemy import create_engine

%run sql_test.py

## Import Jordan Product IDs

In [2]:
df = pd.read_csv('../data/jordan_ids.csv')

In [3]:
df.drop(columns='Unnamed: 0', inplace=True)

## Functions for the StockX API to retrieve historical sales data as JSON

The functions below request historical sales data from the StockX API using the unique ID of each shoe and save every page of results as a JSON file. `stockxApiQuery` sleeps 1 second after saving a page, `grab_all_sales` sleeps another second between pages, and `get_all_shoes` waits 10 seconds between shoe IDs. When a page comes back with no sales, `stockxApiQuery` returns `False`, and `grab_all_sales` stops paging and sends no further requests for that shoe.

In [4]:
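def stockxApiQuery(product_id, page, limit=1000, order='DESC'):
    """Fetch one page of sales for a product and save it as a JSON file.

    Returns True if the page contained sales, False otherwise.
    """
    url = f'https://stockx.com/api/products/{product_id}/activity?state=480&currency=USD&limit={limit}&page={page}&sort=createdAt&order={order}'
    headers = {'User-agent': 'Ben01'}
    response = requests.get(url, headers=headers)
    the_json = response.json()
    # .get avoids a KeyError if the response has no 'ProductActivity' key
    activity = the_json.get('ProductActivity')
    if not activity:
        return False
    # Timestamp the file name so each page is written to its own file
    now = f'{time.time():.0f}'
    with open(f'../data/raw_json/{product_id}_{now}.json', 'w') as f:
        json.dump(activity, f)
    time.sleep(1)
    return True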
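
The query string in the URL above can also be assembled from a dictionary, which makes each parameter easier to read and change. The following is a minimal sketch, not part of the original notebook: `build_activity_url` is a hypothetical helper, and `'example-id'` is a placeholder rather than a real product ID.

```python
from urllib.parse import urlencode

def build_activity_url(product_id, page, limit=1000, order='DESC'):
    """Hypothetical helper: build the activity URL from a params dict."""
    params = {
        'state': 480,
        'currency': 'USD',
        'limit': limit,
        'page': page,
        'sort': 'createdAt',
        'order': order,
    }
    # urlencode joins the pairs as key=value&key=value in dict order
    return f'https://stockx.com/api/products/{product_id}/activity?{urlencode(params)}'

# 'example-id' is a placeholder, not a real product ID
url = build_activity_url('example-id', page=1)
print(url)
```

`requests.get` also accepts a `params=` argument that does this encoding itself, so the same dictionary could be passed there instead of being formatted into the URL.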

In [5]:
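def grab_all_sales(shoe_id):
    """Request successive pages for one shoe until a page comes back empty."""
    page = 1
    # stockxApiQuery returns False once a page has no sales, which ends the loop
    while stockxApiQuery(shoe_id, page=page):
        page += 1
        time.sleep(1)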
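
The stopping rule can be checked offline, without calling the API. This sketch replaces the real query with a stand-in that serves pages from a dictionary. The names `paginate`, `fake_query`, and `sample_pages` are hypothetical, and the `'amount'` field is illustrative only, not the real response schema.

```python
def paginate(query, first_page=1):
    """Call query(page) on successive pages until it returns a falsy value.

    Returns the number of pages that contained data.
    """
    page = first_page
    while query(page):
        page += 1
    return page - first_page

# Hypothetical data: two pages with sales, everything after that is empty
sample_pages = {1: [{'amount': 200}], 2: [{'amount': 180}]}
calls = []

def fake_query(page):
    """Stand-in for stockxApiQuery: True only while the page has sales."""
    calls.append(page)
    return bool(sample_pages.get(page))

pages_with_data = paginate(fake_query)
print(pages_with_data, calls)
```

The loop always makes one request more than the number of pages with data, because the empty page is what signals the end.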

In [6]:
def get_all_shoes(shoe_list):
    for shoe_id in shoe_list:
        grab_all_sales(shoe_id)
        time.sleep(10)

In [7]:
shoe_list = df['product_id']

In [8]:
# get_all_shoes(shoe_list)
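
After a run, one quick sanity check is to count how many page files were written per shoe. This sketch assumes the `<product_id>_<timestamp>.json` naming used when saving. `count_pages_per_shoe` is a hypothetical helper, and the demo writes placeholder files to a temporary directory instead of reading `../data/raw_json`.

```python
import json
import tempfile
from collections import Counter
from pathlib import Path

def count_pages_per_shoe(raw_dir):
    """Count saved JSON files per product ID in raw_dir."""
    counts = Counter()
    for path in Path(raw_dir).glob('*.json'):
        # Split on the last underscore so IDs containing underscores stay intact
        product_id, _timestamp = path.stem.rsplit('_', 1)
        counts[product_id] += 1
    return counts

# Demo with placeholder file names in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    names = ('shoe-a_1600000000.json', 'shoe-a_1600000002.json', 'shoe-b_1600000015.json')
    for name in names:
        with open(Path(tmp) / name, 'w') as f:
            json.dump([], f)
    counts = count_pages_per_shoe(tmp)

print(dict(counts))
```

A shoe from `shoe_list` that is missing from the counts either had no sales or its requests failed, which may be worth checking before further processing.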