# Python API Example - Price Release Data Import and Storage in Dataframe

This guide is designed to provide an example of how to access the Spark API:
- The path to your client credentials is the only input needed to run this script (just before Section 2)
- This script displays the raw outputs of requests to the API, and then shows you how to format those outputs for easy reading and analysis
- This script can be copied and pasted by customers for quick use of the API
- Once comfortable with the process, you can change the variables that are called to produce your own custom analysis products (Section 2 onwards in this guide).
 
__N.B. This guide is just for Price release data. If you're looking for other API data products (such as Freight routes or Netbacks), please refer to their corresponding code example files.__


### Have any questions?

If you have any questions regarding our API, or need help accessing specific datasets, please contact us at:

__data@sparkcommodities.com__

or refer to our API website for more information about this endpoint: https://www.sparkcommodities.com/api/request/contracts.html

## 1. Importing Data

Here we define the functions that allow us to retrieve the valid credentials to access the Spark API.

This section can remain unchanged for most Spark API users.

In [1]:
# Importing libraries for calling the API
import json
import os
import sys
from base64 import b64encode
from urllib.parse import urljoin
import pandas as pd


try:
    from urllib import request, parse
    from urllib.error import HTTPError
except ImportError:
    raise RuntimeError("Python 3 required")

In [2]:
# Defining functions for API request

API_BASE_URL = "https://api.sparkcommodities.com"


def retrieve_credentials(file_path=None):
    """
    Find credentials either by reading the client_credentials file or reading
    environment variables
    """
    if file_path is None:
        client_id = os.getenv("SPARK_CLIENT_ID")
        client_secret = os.getenv("SPARK_CLIENT_SECRET")
        if not client_id or not client_secret:
            raise RuntimeError(
                "SPARK_CLIENT_ID and SPARK_CLIENT_SECRET environment vars required"
            )
    else:
        # Parse the file
        if not os.path.isfile(file_path):
            raise RuntimeError("The file {} doesn't exist".format(file_path))

        with open(file_path) as fp:
            lines = [l.replace("\n", "") for l in fp.readlines()]

        if lines[0] in ("clientId,clientSecret", "client_id,client_secret"):
            client_id, client_secret = lines[1].split(",")
        else:
            print("First line read: '{}'".format(lines[0]))
            raise RuntimeError(
                "The specified file {} doesn't look like a Spark API client "
                "credentials file".format(file_path)
            )

    print(">>>> Found credentials!")
    print(
        ">>>> Client_id={}, client_secret={}****".format(client_id, client_secret[:5])
    )

    return client_id, client_secret


def do_api_post_query(uri, body, headers):
    """
    OAuth2 authentication requires a POST request with client credentials before accessing the API. 
    This POST request will return an Access Token which will be used for the API GET request.
    """
    url = urljoin(API_BASE_URL, uri)

    data = json.dumps(body).encode("utf-8")

    # HTTP POST request
    req = request.Request(url, data=data, headers=headers)
    try:
        response = request.urlopen(req)
    except HTTPError as e:
        print("HTTP Error: ", e.code)
        print(e.read())
        sys.exit(1)

    resp_content = response.read()

    # The server must return HTTP 201. Raise an error if this is not the case
    assert response.status == 201, resp_content

    # The server returned a JSON response
    content = json.loads(resp_content)

    return content


def do_api_get_query(uri, access_token):
    """
    After receiving an Access Token, we can request information from the API.
    """
    url = urljoin(API_BASE_URL, uri)

    headers = {
        "Authorization": "Bearer {}".format(access_token),
        "Accept": "application/json",
    }

    # HTTP GET request
    req = request.Request(url, headers=headers)
    try:
        response = request.urlopen(req)
    except HTTPError as e:
        print("HTTP Error: ", e.code)
        print(e.read())
        sys.exit(1)

    resp_content = response.read()

    # The server must return HTTP 200. Raise an error if this is not the case
    assert response.status == 200, resp_content

    # The server returned a JSON response
    content = json.loads(resp_content)

    return content


def get_access_token(client_id, client_secret):
    """
    Get a new access_token. Access tokens are the thing that applications use to make
    API requests. Access tokens must be kept confidential in storage.

    # Procedure:

    Do a POST query with `grantType` and `scopes` in the body. A basic authorization
    HTTP header is required. The "Basic" HTTP authentication scheme is defined in
    RFC 7617, which transmits credentials as `clientId:clientSecret` pairs, encoded
    using base64.
    """

    # Note: for the sake of this example, we choose to use the Python urllib from the
    # standard lib. One should consider using https://requests.readthedocs.io/

    payload = "{}:{}".format(client_id, client_secret).encode()
    headers = {
        "Authorization": b64encode(payload).decode(),
        "Accept": "application/json",
        "Content-Type": "application/json",
    }
    body = {
        "grantType": "clientCredentials",
        "scopes": "read:prices,read:routes",
    }

    content = do_api_post_query(uri="/oauth/token/", body=body, headers=headers)

    print(
        ">>>> Successfully fetched an access token {}****, valid {} seconds.".format(
            content["accessToken"][:5], content["expiresIn"]
        )
    )

    return content["accessToken"]
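
The base64 step described in the `get_access_token` docstring can be seen in isolation with a short round trip. This is only a sketch: the client ID and secret are made-up placeholders, and `encode_client_credentials` is a hypothetical helper name that is not part of the Spark examples.

```python
from base64 import b64decode, b64encode


def encode_client_credentials(client_id, client_secret):
    """Join the pair as `clientId:clientSecret` and base64-encode it."""
    payload = "{}:{}".format(client_id, client_secret).encode()
    return b64encode(payload).decode()


# Placeholder values for illustration only, not real credentials
encoded = encode_client_credentials("my-client-id", "my-client-secret")

# Decoding recovers the original `clientId:clientSecret` string
decoded = b64decode(encoded).decode()
print(encoded)
print(decoded)
```

Because base64 is an encoding rather than encryption, the encoded string should be treated as carefully as the secret itself.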

## Defining Fetch Request

Here is where we define what type of data we want to fetch from the API.

In my fetch request, I use the URL:

__uri="/v1.0/contracts/"__

This is to query contract price data specifically. Other data products (such as shipping route costs) require different URLs in the fetch request (refer to other Python API examples).
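
As a small illustration of how the request functions turn a `uri` into a full address, the sketch below joins the base URL from this guide with the contracts path using `urljoin` from the standard library. Nothing is sent over the network here.

```python
from urllib.parse import urljoin

API_BASE_URL = "https://api.sparkcommodities.com"

# The leading slash makes the path absolute relative to the host
contracts_url = urljoin(API_BASE_URL, "/v1.0/contracts/")
print(contracts_url)
```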

In [3]:
# Define function for listing contracts from API
def list_contracts(access_token):
    """
    Fetch available contracts. Return contract ticker symbols

    # Procedure:

    Do a GET query to /v1.0/contracts/ with a Bearer token authorization HTTP header.
    """
    content = do_api_get_query(uri="/v1.0/contracts/", access_token=access_token)

    print(">>>> All the contracts you can fetch")
    tickers = []
    for contract in content["data"]:
        print(contract["fullName"])
        tickers.append(contract["id"])

    return tickers

## N.B. Credentials

Here we call the above functions, and input the file path to our credentials.

N.B. You must have downloaded your client credentials CSV file before proceeding. Please refer to the API documentation if you have not downloaded them already. Instructions for downloading your credentials can be found here:

https://api.sparkcommodities.com/redoc#section/Authentication/Create-an-Oauth2-Client


The code then prints the available prices that are callable from the API, and their corresponding Python ticker names are displayed as a list at the bottom of the Output.
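
To check a credentials file before running the cell, it can help to see the two-line layout that `retrieve_credentials` accepts: a header line, then one `id,secret` line. The sketch below writes a throwaway file with placeholder values and parses it the same way. The values and the temporary file are assumptions for illustration, not real credentials.

```python
import os
import tempfile

# Placeholder credentials in the two-line layout the parser expects
sample = "clientId,clientSecret\nmy-client-id,my-client-secret\n"

with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as fp:
    fp.write(sample)
    sample_path = fp.name

with open(sample_path) as fp:
    lines = [line.rstrip("\n") for line in fp]

# The first line must be one of the two accepted headers
header_ok = lines[0] in ("clientId,clientSecret", "client_id,client_secret")
client_id, client_secret = lines[1].split(",")

os.remove(sample_path)  # clean up the throwaway file
print(header_ok, client_id)
```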

In [4]:
# Insert file path to your client credentials here
client_id, client_secret = retrieve_credentials(file_path="/tmp/client_credentials.csv")

# Authenticate:
access_token = get_access_token(client_id, client_secret)

# Fetch all contracts:
tickers = list_contracts(access_token)


print(tickers)

>>>> Found credentials!
>>>> Client_id=875f483b-19de-421a-8e9b-dceff6703e83, client_secret=6cdf8****
>>>> Successfully fetched an access token eyJhb****, valid 604799 seconds.
>>>> All the contracts you can fetch
Spark25F Pacific 160 TFDE
Spark30F Atlantic 160 TFDE
Spark25S Pacific
Spark25Fo Pacific
Spark25FFA Pacific
Spark25FFAYearly Pacific
Spark30S Atlantic
Spark30Fo Atlantic
Spark30FFA Atlantic
Spark30FFAYearly Atlantic
SparkNWE DES 1H
SparkNWE-B 1H
SparkNWE DES 2H
SparkNWE-B 2H
SparkNWE-B F
SparkNWE DES F
SparkNWE-B Fo
SparkNWE DES Fo
SparkNWE-DES-Fin Monthly
SparkNWE-Fin Monthly
SparkSWE-B F
SparkSWE DES F
SparkSWE-B Fo
SparkSWE DES Fo
SparkSWE-DES-Fin Monthly
SparkSWE-Fin Monthly
['spark25f', 'spark30f', 'spark25s', 'spark25fo', 'spark25ffa-monthly', 'spark25ffa-yearly', 'spark30s', 'spark30fo', 'spark30ffa-monthly', 'spark30ffa-yearly', 'sparknwe-1h', 'sparknwe-b-1h', 'sparknwe-2h', 'sparknwe-b-2h', 'sparknwe-b-f', 'sparknwe-f', 'sparknwe-b-fo', 'sparknwe-fo', 'sparknwe-des-fin

## 2. Latest Price Release

Here we call the latest price release and print it in a readable format. This is done using the URL:

__/v1.0/contracts/{contract_ticker_symbol}/price-releases/latest/__

'tickers[2]' is the Python ticker called here. 'tickers' refers to the printed list above, so we can see that 'tickers[2]' refers to 'spark25s'.

We then save the entire dataset as a local variable called 'my_dict'.

__N.B. The first two tickers, 'spark25f' and 'spark30f', are deprecated. Historical data for these tickers is available up until 2022-04-01 (yyyy-mm-dd)__

For more information on API updates, please refer to the API documentation:

https://api.sparkcommodities.com/redoc#section/API-Changelog
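
Selecting a ticker by position depends on the order in which the API returns contracts. One possible alternative, sketched here, is to look the ticker up by name and skip the deprecated ones. `sample_tickers` is a shortened stand-in for the `tickers` list printed earlier, and `DEPRECATED` is a hypothetical constant based on the note above.

```python
# First few ticker ids, copied from the list printed in Section 1
sample_tickers = ["spark25f", "spark30f", "spark25s", "spark25fo", "spark25ffa-monthly"]

# Tickers this guide flags as deprecated
DEPRECATED = {"spark25f", "spark30f"}

active_tickers = [t for t in sample_tickers if t not in DEPRECATED]

# Look the ticker up by name so the choice survives a reordered list
wanted = "spark25s"
if wanted not in active_tickers:
    raise ValueError("{} is not an active ticker".format(wanted))

my_ticker = wanted
print(sample_tickers.index(wanted), my_ticker)
```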

In [5]:
## Defining the function


def fetch_latest_price_releases(access_token, ticker):
    """
    For a contract, fetch then display the latest price release

    # Procedure:

    Do GET queries to /v1.0/contracts/{contract_ticker_symbol}/price-releases/latest/
    with a Bearer token authorization HTTP header.
    """
    content = do_api_get_query(
        uri="/v1.0/contracts/{}/price-releases/latest/".format(ticker),
        access_token=access_token,
    )

    release_date = content["data"]["releaseDate"]

    print(">>>> Get latest price release for {}".format(ticker))
    print("release date =", release_date)

    data_points = content["data"]["data"][0]["dataPoints"]

    for data_point in data_points:
        period_start_at = data_point["deliveryPeriod"]["startAt"]

        spark_prices = dict()
        for unit, prices in data_point["derivedPrices"].items():
            spark_prices[unit] = prices["spark"]

        print(f"Spark Price={spark_prices} for period starting on {period_start_at}")
        print(ticker)

    return content["data"]


## Calling that function and storing the output

# Here we store the entire dataset called from the API

my_dict = fetch_latest_price_releases(access_token, tickers[2])

>>>> Get latest price release for spark25s
release date = 2025-01-14
Spark Price={'usdPerDay': '21250', 'usdPerMMBtu': '0.53'} for period starting on 2025-01-29
spark25s


In [6]:
# Shows how the raw output is formatted
print(my_dict)

{'id': 20250114, 'contractId': 'spark25s', 'releaseDate': '2025-01-14', 'previousPriceRelease': {'id': 20250113, 'releaseDate': '2025-01-13'}, 'nextPriceRelease': {'id': 20250115, 'releaseDate': '2025-01-15'}, 'assessmentWindowClosedAt': '2025-01-14T11:30:00Z', 'assessmentWindowOpenedAt': '2025-01-14T08:00:00Z', 'data': [{'revisionNumber': 0, 'revisionPublishedAt': '2025-01-14T11:02:42.068771Z', 'numberOfAssessors': None, 'dataPoints': [{'index': 0, 'deliveryPeriod': {'type': 'days', 'startAt': '2025-01-29', 'endAt': '2025-02-28', 'name': 'SparkS', 'lastAssessmentDate': None}, 'yourAssessedPrice': None, 'derivedPrices': {'usdPerDay': {'spark': '21250', 'sparkMin': '20000', 'sparkMax': '25000', 'portfolioPlayer': None, 'portfolioPlayerMin': None, 'portfolioPlayerMax': None, 'shipOwner': None, 'shipOwnerMin': None, 'shipOwnerMax': None}, 'usdPerMMBtu': {'spark': '0.53', 'sparkMin': '0.52', 'sparkMax': '0.56', 'portfolioPlayer': None, 'portfolioPlayerMin': None, 'portfolioPlayerMax': None

### N.B.

Here we extract the prices and not the entire dataset, and this is saved as a dictionary called 'spark_prices'.

In [7]:
# extract the prices
data_points = my_dict["data"][0]["dataPoints"]

for data_point in data_points:
    period_start_at = data_point["deliveryPeriod"]["startAt"]

    spark_prices = dict()
    for unit, prices in data_point["derivedPrices"].items():
        spark_prices[unit] = prices["spark"]

    print(spark_prices)

{'usdPerDay': '21250', 'usdPerMMBtu': '0.53'}
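
The prices arrive as strings, and the loop above keeps only the last data point's prices because `spark_prices` is reassigned on each pass. A minimal sketch of one way to handle both points: convert the values to floats and key them by delivery period start. `sample_data_points` is a trimmed stand-in for `my_dict["data"][0]["dataPoints"]`, using values from the output above.

```python
# Trimmed stand-in for my_dict["data"][0]["dataPoints"]
sample_data_points = [
    {
        "deliveryPeriod": {"startAt": "2025-01-29", "endAt": "2025-02-28"},
        "derivedPrices": {
            "usdPerDay": {"spark": "21250", "sparkMin": "20000", "sparkMax": "25000"},
            "usdPerMMBtu": {"spark": "0.53", "sparkMin": "0.52", "sparkMax": "0.56"},
        },
    }
]

prices_by_period = {}
for data_point in sample_data_points:
    period_start_at = data_point["deliveryPeriod"]["startAt"]
    # Convert each string price to a float so it can be used in arithmetic
    prices_by_period[period_start_at] = {
        unit: float(prices["spark"])
        for unit, prices in data_point["derivedPrices"].items()
    }

print(prices_by_period)
```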


## 3. Historical Prices

Here we perform a similar task, but with historical prices instead. This is done using the URL:

__/v1.0/contracts/{contract_ticker_symbol}/price-releases/?limit={limit}&offset={offset}__

First we define the function that imports the data from the Spark API.

We then call that function with two parameters:
- 'ticker': the ticker you want to call.
    - We define the variable 'my_ticker' after the function definition, and set it to 'tickers[2]', which corresponds to Spark25S.
    - Change this variable to whichever price product you need.
- 'limit': the number of datapoints you want to call. We use 'limit=10', which returns the last 10 datapoints (the Spark25S spot price for the last 10 business days).
    - For __Premium__ Users, set this limit to however many datapoints you need.
    - For __Trial__ Users, the limit must not exceed 14 datapoints, as historical data is limited to 2 weeks on this plan.


We save the output as a local variable called 'my_dict_hist'.
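
The query string can also be assembled with `urlencode` from the standard library instead of manual string formatting. The sketch below is one way to do that; `build_price_release_uri` is a hypothetical helper name, not part of the Spark examples.

```python
from urllib.parse import urlencode


def build_price_release_uri(ticker, limit=4, offset=None):
    """Hypothetical helper: assemble the historical price-release URI."""
    params = {"limit": limit}
    if offset is not None:
        params["offset"] = offset
    return "/v1.0/contracts/{}/price-releases/?{}".format(ticker, urlencode(params))


print(build_price_release_uri("spark25s", limit=10))
print(build_price_release_uri("spark25s", limit=10, offset=10))
```

`urlencode` escapes special characters, which matters little for integers but makes the helper safer if more parameters are added later.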

In [8]:
def fetch_historical_price_releases(access_token, ticker, limit=4, offset=None):
    """
    For a selected contract, this endpoint returns all the Price Releases you can
    access according to your current subscription, ordered by release date descending.

    **Note**: Unlimited access to historical data and full forward curves is only
    available to those with Premium access. Get in touch to find out more.

    **Params**

    limit: optional integer value to set an upper limit on the number of price
           releases returned by the endpoint. Default here is 4.

    offset: optional integer value to set from where to start returning data.
            Default is 0.

    # Procedure:

    Do GET queries to /v1.0/contracts/{contract_ticker_symbol}/price-releases/
    with a Bearer token authorization HTTP header.
    """
    print(">>>> Get price releases for {}".format(ticker))

    query_params = "?limit={}".format(limit)
    if offset is not None:
        query_params += "&offset={}".format(offset)

    content = do_api_get_query(
        uri="/v1.0/contracts/{}/price-releases/{}".format(ticker, query_params),
        access_token=access_token,
    )

    my_dict = content["data"]

    for release in content["data"]:
        release_date = release["releaseDate"]

        print("- release date =", release_date)

        data_points = release["data"][0]["dataPoints"]

        for data_point in data_points:
            period_start_at = data_point["deliveryPeriod"]["startAt"]

            spark_prices = dict()
            for unit, prices in data_point["derivedPrices"].items():
                spark_prices[unit] = prices["spark"]

            print(
                f"Spark Price={spark_prices} for period starting on {period_start_at}"
            )

    return my_dict

### N.B. Plan Limits

__Premium__ Plan users have __no__ limits on historical data.

__Trial__ Plan users only have access to the latest 2 weeks worth of historical data. Therefore the limit parameter cannot exceed 14.
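
For longer histories, `limit` and `offset` could be combined to fetch releases page by page. The sketch below assumes that `offset` counts releases from the most recent one, which this guide does not spell out. `fetch_all_releases` and `fake_fetch_page` are hypothetical names, and the fake data stands in for the API so the example runs offline.

```python
def fetch_all_releases(fetch_page, page_size=10, max_releases=30):
    """
    Hypothetical helper: call `fetch_page(limit, offset)` repeatedly and
    concatenate the results until a short page or `max_releases` is reached.
    """
    releases = []
    offset = 0
    while len(releases) < max_releases:
        limit = min(page_size, max_releases - len(releases))
        page = fetch_page(limit, offset)
        releases.extend(page)
        if len(page) < limit:
            break  # a short page means nothing more is available
        offset += len(page)
    return releases


# Stand-in for the API: 25 fake releases, sliced the way limit/offset is assumed to work
fake_releases = [{"id": i} for i in range(25)]


def fake_fetch_page(limit, offset):
    return fake_releases[offset:offset + limit]


all_releases = fetch_all_releases(fake_fetch_page, page_size=10, max_releases=30)
print(len(all_releases))
```

With the real API, `fetch_page` could wrap `fetch_historical_price_releases` and pass `limit` and `offset` through. Trial users would keep `max_releases` at 14 or below.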

In [10]:
### Define which price product you want to retrieve
my_ticker = tickers[2]

my_dict_hist = fetch_historical_price_releases(access_token, my_ticker, limit=10)

>>>> Get price releases for spark25s
- release date = 2025-01-14
Spark Price={'usdPerDay': '21250', 'usdPerMMBtu': '0.53'} for period starting on 2025-01-29
- release date = 2025-01-13
Spark Price={'usdPerDay': '21500', 'usdPerMMBtu': '0.53'} for period starting on 2025-01-28
- release date = 2025-01-10
Spark Price={'usdPerDay': '21750', 'usdPerMMBtu': '0.53'} for period starting on 2025-01-25
- release date = 2025-01-09
Spark Price={'usdPerDay': '21750', 'usdPerMMBtu': '0.53'} for period starting on 2025-01-24
- release date = 2025-01-08
Spark Price={'usdPerDay': '21750', 'usdPerMMBtu': '0.54'} for period starting on 2025-01-23
- release date = 2025-01-07
Spark Price={'usdPerDay': '21750', 'usdPerMMBtu': '0.54'} for period starting on 2025-01-22
- release date = 2025-01-06
Spark Price={'usdPerDay': '21750', 'usdPerMMBtu': '0.54'} for period starting on 2025-01-21
- release date = 2025-01-03
Spark Price={'usdPerDay': '21750', 'usdPerMMBtu': '0.55'} for period starting on 2025-01-18
- r

### Formatting into a Pandas DataFrame

The returned data contains several nested lists and dictionaries. Once we know which variables we want, we can store those values in separate lists and create a Pandas DataFrame from them.

Within a new dictionary, we create empty lists for variables:
- Release Dates
- Start of Period
- Ticker
- Price in dollars/day
- Price in dollars/MMBtu
- The spread of the data used to calculate the Spot Price
    - Min
    - Max

The dictionary is then transformed into a Pandas Dataframe for readability and ease of use. 

## N.B. 
This JSON structure is not consistent across all datasets, and so might need to be amended when calling other Spark contracts.
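
Since keys can differ between contracts, one defensive option is to read nested values through a small helper that returns a default instead of raising. This is a sketch only: `dig` is a hypothetical helper, and `sample_point` is a trimmed stand-in in which `usdPerMMBtu` is deliberately missing.

```python
def dig(obj, *keys, default=None):
    """Hypothetical helper: follow keys/indices into nested dicts and lists."""
    for key in keys:
        try:
            obj = obj[key]
        except (KeyError, IndexError, TypeError):
            # Missing key, index out of range, or a None along the way
            return default
    return obj


# Trimmed stand-in for one data point; only "usdPerDay" is present here
sample_point = {
    "deliveryPeriod": {"startAt": "2025-01-29"},
    "derivedPrices": {"usdPerDay": {"spark": "21250"}},
}

print(dig(sample_point, "derivedPrices", "usdPerDay", "spark"))
print(dig(sample_point, "derivedPrices", "usdPerMMBtu", "spark"))
```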

In [9]:
# Defining the function for storing and formatting the data into a Pandas DataFrame

def store_and_format(dict_hist):
    stored_data = {
        "ticker": [],
        "Period Start": [],
        "USDperday": [],
        "USDperdayMax": [],
        "USDperdayMin": [],
        "USDperMMBtu": [],
        "Release Date": []
    }

    for release in dict_hist:
        release_date = release["releaseDate"]

        data_points = release["data"][0]["dataPoints"]

        for data_point in data_points:
            # Append the ticker and release date once per data point so that
            # every column has the same length, even for multi-period contracts
            stored_data['ticker'].append(release["contractId"])
            stored_data['Release Date'].append(release_date)

            period_start_at = data_point["deliveryPeriod"]["startAt"]
            stored_data['Period Start'].append(period_start_at)

            stored_data['USDperday'].append(data_point["derivedPrices"]["usdPerDay"]["spark"])
            stored_data['USDperdayMin'].append(data_point["derivedPrices"]["usdPerDay"]["sparkMin"])
            stored_data['USDperdayMax'].append(data_point["derivedPrices"]["usdPerDay"]["sparkMax"])

            stored_data['USDperMMBtu'].append(data_point["derivedPrices"]["usdPerMMBtu"]["spark"])

    # Build the DataFrame once, after all releases have been collected
    historical_df = pd.DataFrame(stored_data)

    # Prices arrive as strings, so convert them to numeric types
    historical_df["USDperday"] = pd.to_numeric(historical_df["USDperday"])
    historical_df["USDperdayMax"] = pd.to_numeric(historical_df["USDperdayMax"])
    historical_df["USDperdayMin"] = pd.to_numeric(historical_df["USDperdayMin"])

    historical_df["USDperMMBtu"] = pd.to_numeric(historical_df["USDperMMBtu"])
    historical_df["Release Date"] = pd.to_datetime(historical_df["Release Date"])

    return historical_df

In [10]:
# Running the function to store the values
historical_df = store_and_format(my_dict_hist)
historical_df.head()

| | ticker | Period Start | USDperday | USDperdayMax | USDperdayMin | USDperMMBtu | Release Date |
|---|---|---|---|---|---|---|---|
| 0 | spark25s | 2024-11-01 | 43750 | 47500 | 38000 | 0.66 | 2024-10-17 |
| 1 | spark25s | 2024-10-31 | 46500 | 50000 | 42500 | 0.68 | 2024-10-16 |
| 2 | spark25s | 2024-10-30 | 50250 | 55500 | 45000 | 0.7 | 2024-10-15 |
| 3 | spark25s | 2024-10-29 | 51750 | 57500 | 48000 | 0.71 | 2024-10-14 |
| 4 | spark25s | 2024-10-26 | 52250 | 57500 | 50000 | 0.71 | 2024-10-11 |
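
Once the data is in a DataFrame, simple derived columns are straightforward. As a sketch, the example below rebuilds a few columns from the sample rows shown above, computes the Min to Max spread for each release, and sorts the rows oldest first. `sample_df` and the `USDperdaySpread` column name are assumptions for illustration.

```python
import pandas as pd

# Rows copied from the sample output above
sample_df = pd.DataFrame(
    {
        "Release Date": pd.to_datetime(
            ["2024-10-17", "2024-10-16", "2024-10-15", "2024-10-14", "2024-10-11"]
        ),
        "USDperday": [43750, 46500, 50250, 51750, 52250],
        "USDperdayMax": [47500, 50000, 55500, 57500, 57500],
        "USDperdayMin": [38000, 42500, 45000, 48000, 50000],
    }
)

# Width of the Min to Max range on each release date
sample_df["USDperdaySpread"] = sample_df["USDperdayMax"] - sample_df["USDperdayMin"]

# Oldest release first, a common order for time-series plots
sample_df = sample_df.sort_values("Release Date").reset_index(drop=True)
print(sample_df[["Release Date", "USDperday", "USDperdaySpread"]])
```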


# Analytics Gallery
Want to gain market insights using our data?

Take a look at our [Analytics Gallery](https://www.sparkcommodities.com/api/code-examples/analytics-examples.html) on the Spark API website, which includes:

- __Freight Spot Price Seasonality Chart__ - Compare freight spot prices across several years, to understand how the current market compares to historical prices at equivalent periods (e.g. Dec22 vs Dec23 vs Dec24).

Want to create meaningful charts using our data?

View our Freight Spot Price Seasonality Chart [here](https://www.sparkcommodities.com/api/code-examples/analytics-examples.html). 
