# Python API Example - Access Terminal Costs Import and Storage in Dataframe

This guide shows how to access the Spark API:
- The path to your client credentials is the only input needed to run this script (entered in the Credentials section, just before Section 2)
- The script first displays the raw outputs of API requests, then shows how to format those outputs for easy reading and analysis
- The script can be copied and pasted by customers for quick use of the API

__N.B. This guide covers Access terminal data only. If you're looking for other API data products (such as contract prices, freight routes or netbacks), please refer to their corresponding code example files.__

### Have any questions?

If you have any questions regarding our API, or need help accessing specific datasets, please contact us at:

__data@sparkcommodities.com__

or refer to our API website for more information about this endpoint: https://www.sparkcommodities.com/api/request/access.html

## 1. Importing Data

Here we define the functions that allow us to retrieve the valid credentials to access the Spark API.

This section can remain unchanged for most Spark API users.

In [1]:
# import libraries for calling the API
import json
import os
import sys
import pandas as pd
from base64 import b64encode
from urllib.parse import urljoin
from pprint import pprint

try:
    from urllib import request, parse
    from urllib.error import HTTPError
except ImportError:
    raise RuntimeError("Python 3 required")

In [2]:
# defining query functions 
API_BASE_URL = "https://api.sparkcommodities.com"


def retrieve_credentials(file_path=None):
    """
    Find credentials either by reading the client_credentials file or reading
    environment variables
    """
    if file_path is None:
        client_id = os.getenv("SPARK_CLIENT_ID")
        client_secret = os.getenv("SPARK_CLIENT_SECRET")
        if not client_id or not client_secret:
            raise RuntimeError(
                "SPARK_CLIENT_ID and SPARK_CLIENT_SECRET environment vars required"
            )
    else:
        # Parse the file
        if not os.path.isfile(file_path):
            raise RuntimeError("The file {} doesn't exist".format(file_path))

        with open(file_path) as fp:
            lines = [l.replace("\n", "") for l in fp.readlines()]

        if lines[0] in ("clientId,clientSecret", "client_id,client_secret"):
            client_id, client_secret = lines[1].split(",")
        else:
            print("First line read: '{}'".format(lines[0]))
            raise RuntimeError(
                "The specified file {} doesn't look like a Spark API client "
                "credentials file".format(file_path)
            )

    print(">>>> Found credentials!")
    print(
        ">>>> Client_id={}****, client_secret={}****".format(
            client_id[:5], client_secret[:5]
        )
    )

    return client_id, client_secret


def do_api_post_query(uri, body, headers):
    """
    OAuth2 authentication requires a POST request with client credentials before accessing the API. 
    This POST request will return an Access Token which will be used for the API GET request.
    """
    url = urljoin(API_BASE_URL, uri)

    data = json.dumps(body).encode("utf-8")

    # HTTP POST request
    req = request.Request(url, data=data, headers=headers)
    try:
        response = request.urlopen(req)
    except HTTPError as e:
        print("HTTP Error: ", e.code)
        print(e.read())
        sys.exit(1)

    resp_content = response.read()

    # The server must return HTTP 201. Raise an error if this is not the case
    assert response.status == 201, resp_content

    # The server returned a JSON response
    content = json.loads(resp_content)

    return content


def do_api_get_query(uri, access_token):
    """
    After receiving an Access Token, we can request information from the API.
    """
    url = urljoin(API_BASE_URL, uri)

    headers = {
        "Authorization": "Bearer {}".format(access_token),
        "accept": "application/json",
    }

    print(f"Fetching {url}")

    # HTTP GET request
    req = request.Request(url, headers=headers)
    try:
        response = request.urlopen(req)
    except HTTPError as e:
        print("HTTP Error: ", e.code)
        print(e.read())
        sys.exit(1)

    resp_content = response.read()

    # The server must return HTTP 200. Raise an error if this is not the case
    assert response.status == 200, resp_content

    # The server returned a JSON response
    content = json.loads(resp_content)

    return content


def get_access_token(client_id, client_secret):
    """
    Get a new access_token. Access tokens are what applications use to make
    API requests. Access tokens must be kept confidential in storage.

    # Procedure:

    Do a POST query with `grantType` and `scopes` in the body. A basic authorization
    HTTP header is required. The "Basic" HTTP authentication scheme is defined in
    RFC 7617, which transmits credentials as `clientId:clientSecret` pairs, encoded
    using base64.
    """

    # Note: for the sake of this example, we choose to use the Python urllib from the
    # standard lib. One should consider using https://requests.readthedocs.io/

    payload = "{}:{}".format(client_id, client_secret).encode()
    headers = {
        "Authorization": b64encode(payload).decode(),
        "Accept": "application/json",
        "Content-Type": "application/json",
    }
    body = {
        "grantType": "clientCredentials",
        "scopes": "read:access",
    }

    content = do_api_post_query(uri="/oauth/token/", body=body, headers=headers)

    print(
        ">>>> Successfully fetched an access token {}****, valid {} seconds.".format(
            content["accessToken"][:5], content["expiresIn"]
        )
    )

    return content["accessToken"]
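
The docstring of 'get_access_token' describes credentials being joined as a `clientId:clientSecret` pair and encoded using base64. The following is a minimal sketch of that encoding step on its own. The helper name `encode_client_credentials` and the values `demo-id` and `demo-secret` are placeholders invented for illustration; they are not part of the Spark API.

```python
from base64 import b64decode, b64encode


def encode_client_credentials(client_id, client_secret):
    """Return the base64 text of 'client_id:client_secret' (hypothetical helper)."""
    pair = "{}:{}".format(client_id, client_secret).encode("utf-8")
    return b64encode(pair).decode("ascii")


# Placeholder values, not real credentials
encoded = encode_client_credentials("demo-id", "demo-secret")

# Base64 is an encoding, not encryption: decoding recovers the original pair
decoded = b64decode(encoded).decode("utf-8")
print(decoded)
```

Because the encoded value can be decoded by anyone who sees it, it should be treated with the same care as the client secret itself.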

## N.B. Credentials

Here we call the above functions, and input the file path to our credentials.

N.B. You must have downloaded your client credentials CSV file before proceeding. Please refer to the API documentation if you have not downloaded them already.

The code then prints a masked preview of the credentials it found, followed by a confirmation that an access token was fetched and how many seconds it remains valid.
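
For reference, 'retrieve_credentials' expects a two-line CSV file: a header line (`clientId,clientSecret` or `client_id,client_secret`) followed by one line holding the two values. If no file path is given, it reads the `SPARK_CLIENT_ID` and `SPARK_CLIENT_SECRET` environment variables instead. The sketch below illustrates the file layout only: it writes a throwaway file with invented placeholder values and reads it back with the standard `csv` module.

```python
import csv
import os
import tempfile

# Write a throwaway credentials file with placeholder values (not real credentials)
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as fp:
    fp.write("clientId,clientSecret\n")
    fp.write("demo-id,demo-secret\n")
    demo_path = fp.name

# Read it back: the first row is the header, the second row holds the two values
with open(demo_path, newline="") as fp:
    header, values = list(csv.reader(fp))[:2]

# Clean up the temporary file
os.remove(demo_path)

client_id_demo, client_secret_demo = values
print(header, client_id_demo[:4] + "****")
```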

In [3]:
# Insert file path to your client credentials here
client_id, client_secret = retrieve_credentials(file_path="/tmp/client_credentials.csv")

# Authenticate:
access_token = get_access_token(client_id, client_secret)

>>>> Found credentials!
>>>> Client_id=875f4****, client_secret=6cdf8****
>>>> Successfully fetched an access token eyJhb****, valid 604799 seconds.


## 2. Latest Price Release

Here we call the latest price release and print it in a readable format. This is done using the URL:

__/beta/sparkr/releases/latest/__


We then save the entire dataset as a local variable called 'latest'.

In [4]:
## Defining the latest release function


def fetch_latest_price_releases(access_token):
    content = do_api_get_query(
        uri="/beta/sparkr/releases/latest/", access_token=access_token
    )

    return content["data"]


## Calling that function and storing the output

latest = fetch_latest_price_releases(access_token)

Fetching https://api.sparkcommodities.com/beta/sparkr/releases/latest/
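
The URL printed above is built inside 'do_api_get_query' by joining `API_BASE_URL` with the endpoint path. A short sketch of that step in isolation, using only the standard library:

```python
from urllib.parse import urljoin

API_BASE_URL = "https://api.sparkcommodities.com"

# A path with a leading slash is resolved against the root of the base URL
latest_url = urljoin(API_BASE_URL, "/beta/sparkr/releases/latest/")
print(latest_url)
```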


In [5]:
# Checking structure of data
latest[0]

{'releaseDate': '2024-10-17',
 'terminalCode': 'grain-lng',
 'terminalName': 'Isle of Grain',
 'perVesselSize': {'160000': {'deliveryMonths': [{'month': '2024-11-01',
     'costsInUsdPerMmbtu': {'total': '1.285',
      'breakdown': {'basic-slot-berth': {'type': 'basic-slot-berth',
        'value': '0.0',
        'description': 'Slot (Berth)'},
       'basic-slot-unload-storage-regas': {'type': 'basic-slot-unload-storage-regas',
        'value': '0.0',
        'description': 'Slot (Unload, Storage, Regas)'},
       'basic-slot-berth-unload-storage-regas': {'type': 'basic-slot-berth-unload-storage-regas',
        'value': '0.355',
        'description': 'Slot (Berth, Unload, Storage, Regas)'},
       'additional-storage': {'type': 'additional-storage',
        'value': '0.0',
        'description': 'Additional Storage'},
       'additional-send-out': {'type': 'additional-send-out',
        'value': '0.209',
        'description': 'Additional Send Out'},
       'fuel-gas-losses-gas-in-kin

In [6]:
# Showing available vessel sizes
print(list(latest[0]["perVesselSize"]))

['160000', '174000']
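
To make the nesting easier to follow, the sketch below walks a trimmed copy of the record shown above down to a single cost value. Only two breakdown entries are kept, so `sample` is an abbreviated illustration rather than a full API response.

```python
# Trimmed copy of the record shown above; only two breakdown entries are kept
sample = {
    "releaseDate": "2024-10-17",
    "terminalName": "Isle of Grain",
    "perVesselSize": {
        "160000": {
            "deliveryMonths": [
                {
                    "month": "2024-11-01",
                    "costsInUsdPerMmbtu": {
                        "total": "1.285",
                        "breakdown": {
                            "basic-slot-berth-unload-storage-regas": {"value": "0.355"},
                            "additional-send-out": {"value": "0.209"},
                        },
                    },
                }
            ]
        }
    },
}

# Vessel size -> list of delivery months -> cost dictionary
first_month = sample["perVesselSize"]["160000"]["deliveryMonths"][0]
costs = first_month["costsInUsdPerMmbtu"]

# Values arrive as strings, so convert before doing arithmetic
total = float(costs["total"])
send_out = float(costs["breakdown"]["additional-send-out"]["value"])
print(first_month["month"], total, send_out)
```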


## Storing Data as a Dataframe

Here we define a function that stores regas costs in a Dataframe, which makes the data easier to read and allows specific datasets to be selected easily.

__N.B.__ Gas in Kind, Entry Capacity and Commodity Charge cost components are only available for Premium subscribers. For Trial API users, these components are wrapped up into one variable, named 'other costs'. 

__Want to upgrade?__ Contact data@sparkcommodities.com to find out more!
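
If a cost component is absent from a response, a direct key lookup raises a `KeyError`. The sketch below shows one defensive pattern, returning NaN for a missing component. It assumes that unavailable components are simply missing from the breakdown, which this guide does not confirm, and it does not assume the key name used for 'other costs'. The helper `breakdown_value` and the `trial_breakdown` data are invented for illustration.

```python
import math


def breakdown_value(breakdown, key):
    """Return the component as a float, or NaN when the key is absent (hypothetical helper)."""
    entry = breakdown.get(key)
    return float(entry["value"]) if entry is not None else math.nan


# Invented breakdown in which one component is missing
trial_breakdown = {"additional-send-out": {"value": "0.209"}}

present = breakdown_value(trial_breakdown, "additional-send-out")
missing = breakdown_value(trial_breakdown, "entry-capacity")
print(present, missing)
```

NaN is used rather than 0.0 so that a missing component is not mistaken for a genuine zero cost.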

In [7]:
def organise_dataframe(latest):
    """
    This function sorts the API content into a dataframe. The columns available are Release Date, Terminal, Month, Vessel Size,
    Total $/MMBtu and one column per cost component (all in $/MMBtu).
    Each row corresponds to one combination of Terminal, Vessel Size and Month.
    """
    # create columns
    data_dict = {
        'Release Date':[],
        'Terminal':[],
        'Month':[],
        'Vessel Size':[],
        'Total $/MMBtu':[],
        'Basic Slot (Berth)':[],
        'Basic Slot (Unload/Stor/Regas)':[],
        'Basic Slot (B/U/S/R)':[],
        'Additional Storage':[],
        'Additional Sendout':[],
        'Gas in Kind': [],
        'Entry Capacity':[],
        'Commodity Charge':[]
    }

    # loop for each Terminal
    for l in latest:
        # use the vessel sizes of the current terminal, not those of the first one
        sizes_available = list(l['perVesselSize'].keys())

        # loop for each available size
        for s in sizes_available:
            
            # loop for each month (in the form: YYYY-MM-DD)
            for month in range(len(l['perVesselSize'][f'{s}']['deliveryMonths'])):
                
                # assigning values to each column
                data_dict['Release Date'].append(l["releaseDate"])
                data_dict['Terminal'].append(l["terminalName"])
                data_dict['Month'].append(l['perVesselSize'][f'{s}']['deliveryMonths'][month]['month'])
                data_dict['Vessel Size'].append(s)
                data_dict['Total $/MMBtu'].append(float(l['perVesselSize'][f'{s}']['deliveryMonths'][month]["costsInUsdPerMmbtu"]["total"]))
                
                data_dict['Basic Slot (Berth)'].append(float(l['perVesselSize'][f'{s}']['deliveryMonths'][month]["costsInUsdPerMmbtu"]["breakdown"]['basic-slot-berth']['value']))
                data_dict['Basic Slot (Unload/Stor/Regas)'].append(float(l['perVesselSize'][f'{s}']['deliveryMonths'][month]["costsInUsdPerMmbtu"]["breakdown"]['basic-slot-unload-storage-regas']['value']))
                data_dict['Basic Slot (B/U/S/R)'].append(float(l['perVesselSize'][f'{s}']['deliveryMonths'][month]["costsInUsdPerMmbtu"]["breakdown"]['basic-slot-berth-unload-storage-regas']['value']))
                data_dict['Additional Storage'].append(float(l['perVesselSize'][f'{s}']['deliveryMonths'][month]["costsInUsdPerMmbtu"]["breakdown"]['additional-storage']['value']))
                data_dict['Additional Sendout'].append(float(l['perVesselSize'][f'{s}']['deliveryMonths'][month]["costsInUsdPerMmbtu"]["breakdown"]['additional-send-out']['value']))
                data_dict['Gas in Kind'].append(float(l['perVesselSize'][f'{s}']['deliveryMonths'][month]["costsInUsdPerMmbtu"]["breakdown"]['fuel-gas-losses-gas-in-kind']['value']))
                data_dict['Entry Capacity'].append(float(l['perVesselSize'][f'{s}']['deliveryMonths'][month]["costsInUsdPerMmbtu"]["breakdown"]['entry-capacity']['value']))
                data_dict['Commodity Charge'].append(float(l['perVesselSize'][f'{s}']['deliveryMonths'][month]["costsInUsdPerMmbtu"]["breakdown"]['commodity-charge']['value']))
                
    
    # convert into dataframe
    df = pd.DataFrame(data_dict)
    
    df['Month'] = pd.to_datetime(df['Month'])
    df['Release Date'] = pd.to_datetime(df['Release Date'])
    
    return df
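
The long index chains in 'organise_dataframe' can be shortened by iterating over dictionary items directly. The following is a minimal sketch of that idea using only the standard library, not a replacement for the function above. The names `flatten_release` and `demo_month`, and the `Demo Terminal` record, are invented for illustration. Unlike 'organise_dataframe', it names the cost columns after each component's 'description' field.

```python
def flatten_release(record):
    """Yield one flat row per vessel size and delivery month (hypothetical helper)."""
    for size, size_data in record["perVesselSize"].items():
        for delivery in size_data["deliveryMonths"]:
            costs = delivery["costsInUsdPerMmbtu"]
            row = {
                "Release Date": record["releaseDate"],
                "Terminal": record["terminalName"],
                "Month": delivery["month"],
                "Vessel Size": size,
                "Total $/MMBtu": float(costs["total"]),
            }
            # Add one column per breakdown component, named by its description
            for component in costs["breakdown"].values():
                row[component["description"]] = float(component["value"])
            yield row


def demo_month(month, total, storage):
    """Build one invented delivery-month entry in the shape shown earlier."""
    return {
        "month": month,
        "costsInUsdPerMmbtu": {
            "total": total,
            "breakdown": {
                "additional-storage": {
                    "type": "additional-storage",
                    "value": storage,
                    "description": "Additional Storage",
                }
            },
        },
    }


# Invented record for a made-up terminal
record = {
    "releaseDate": "2024-10-17",
    "terminalName": "Demo Terminal",
    "perVesselSize": {
        "160000": {"deliveryMonths": [demo_month("2024-11-01", "0.50", "0.10")]},
        "174000": {"deliveryMonths": [demo_month("2024-11-01", "0.40", "0.08")]},
    },
}

rows = list(flatten_release(record))
print(len(rows), rows[0])
```

A list of row dictionaries like `rows` can be passed straight to `pd.DataFrame(rows)` to obtain a dataframe.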


In [8]:
prices_df = organise_dataframe(latest)

In [9]:
# Example of calling specific data for a chosen terminal and vessel size

prices_df[(prices_df['Terminal'] == 'Zeebrugge') & (prices_df['Vessel Size'] == '174000')]

Unnamed: 0,Release Date,Terminal,Month,Vessel Size,Total $/MMBtu,Basic Slot (Berth),Basic Slot (Unload/Stor/Regas),Basic Slot (B/U/S/R),Additional Storage,Additional Sendout,Gas in Kind,Entry Capacity,Commodity Charge
60,2024-10-17,Zeebrugge,2024-11-01,174000,0.42,0.0,0.0,0.177,0.001,0.009,0.166,0.057,0.01
61,2024-10-17,Zeebrugge,2024-12-01,174000,0.43,0.0,0.0,0.177,0.001,0.009,0.168,0.065,0.01
62,2024-10-17,Zeebrugge,2025-01-01,174000,0.44,0.0,0.0,0.177,0.001,0.009,0.169,0.074,0.01
63,2024-10-17,Zeebrugge,2025-02-01,174000,0.433,0.0,0.0,0.177,0.001,0.009,0.17,0.066,0.01
64,2024-10-17,Zeebrugge,2025-03-01,174000,0.421,0.0,0.0,0.177,0.001,0.009,0.169,0.055,0.01
65,2024-10-17,Zeebrugge,2025-04-01,174000,0.4,0.0,0.0,0.177,0.001,0.009,0.163,0.04,0.01
66,2024-10-17,Zeebrugge,2025-05-01,174000,0.387,0.0,0.0,0.178,0.001,0.009,0.162,0.027,0.01
67,2024-10-17,Zeebrugge,2025-06-01,174000,0.38,0.0,0.0,0.178,0.001,0.009,0.161,0.021,0.01
68,2024-10-17,Zeebrugge,2025-07-01,174000,0.38,0.0,0.0,0.178,0.001,0.009,0.161,0.021,0.01
69,2024-10-17,Zeebrugge,2025-08-01,174000,0.38,0.0,0.0,0.178,0.001,0.009,0.161,0.021,0.01


## 3. Historical Prices

Here we perform a similar task, but with historical prices instead. This is done using the URL:

__/beta/sparkr/releases/?limit={limit}&offset={offset}__

First we define the function that imports the data from the Spark API.

We then call that function, which accepts 2 parameters:
- 'limit': controls how many datapoints are returned. Here we use 'limit=3', which returns the last 3 datapoints (Terminal price data for the last 3 business days).
    - Alter this limit to however many datapoints you need.
    - The default is 4 (set in the function signature). This value is used if the limit parameter is not passed.
- 'offset': optional, with a default value of None. It sets how many business days back the returned data should start.
    - For example, offset=2 gets terminal data from 2 business days ago.


We save the output as a local variable called 'historical'
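
As a sketch of how the two parameters end up in the request URL, the query string can also be built with `urllib.parse.urlencode` from the standard library. The helper name `build_release_query` is invented for illustration; the function defined in this guide assembles the same string with `format`.

```python
from urllib.parse import urlencode


def build_release_query(limit=4, offset=None):
    """Return the query string for the releases endpoint (hypothetical helper)."""
    params = {"limit": limit}
    # 'offset' is only sent when the caller provides it
    if offset is not None:
        params["offset"] = offset
    return "?" + urlencode(params)


query_without_offset = build_release_query(limit=3)
query_with_offset = build_release_query(limit=3, offset=2)
print(query_without_offset)
print(query_with_offset)
```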

In [10]:
## Defining the function


def fetch_price_releases(access_token, limit=4, offset=None):
    query_params = "?limit={}".format(limit)
    if offset is not None:
        query_params += "&offset={}".format(offset)

    content = do_api_get_query(
        uri="/beta/sparkr/releases/{}".format(query_params), access_token=access_token
    )

    return content["data"]


## Calling that function and storing the output

historical = fetch_price_releases(access_token, limit=3)

Fetching https://api.sparkcommodities.com/beta/sparkr/releases/?limit=3


In [11]:
# checking raw data structure
historical[0]

{'releaseDate': '2024-10-17',
 'terminalCode': 'montoir',
 'terminalName': 'Montoir',
 'perVesselSize': {'160000': {'deliveryMonths': [{'month': '2024-11-01',
     'costsInUsdPerMmbtu': {'total': '0.367',
      'breakdown': {'basic-slot-berth': {'type': 'basic-slot-berth',
        'value': '0.028',
        'description': 'Slot (Berth)'},
       'basic-slot-unload-storage-regas': {'type': 'basic-slot-unload-storage-regas',
        'value': '0.176',
        'description': 'Slot (Unload, Storage, Regas)'},
       'basic-slot-berth-unload-storage-regas': {'type': 'basic-slot-berth-unload-storage-regas',
        'value': '0.0',
        'description': 'Slot (Berth, Unload, Storage, Regas)'},
       'additional-storage': {'type': 'additional-storage',
        'value': '0.0',
        'description': 'Additional Storage'},
       'additional-send-out': {'type': 'additional-send-out',
        'value': '0.0',
        'description': 'Additional Send Out'},
       'fuel-gas-losses-gas-in-kind': {'ty

### Storing as a DataFrame
We can reuse our 'organise_dataframe' function to parse the content into a dataframe.

In [12]:
# running function
hist_df = organise_dataframe(historical)

In [13]:
hist_df

Unnamed: 0,Release Date,Terminal,Month,Vessel Size,Total $/MMBtu,Basic Slot (Berth),Basic Slot (Unload/Stor/Regas),Basic Slot (B/U/S/R),Additional Storage,Additional Sendout,Gas in Kind,Entry Capacity,Commodity Charge
0,2024-10-17,Montoir,2024-11-01,160000,0.367,0.028,0.176,0.00,0.0,0.0,0.062,0.101,0.0
1,2024-10-17,Montoir,2024-12-01,160000,0.368,0.028,0.176,0.00,0.0,0.0,0.063,0.101,0.0
2,2024-10-17,Montoir,2025-01-01,160000,0.369,0.028,0.176,0.00,0.0,0.0,0.064,0.101,0.0
3,2024-10-17,Montoir,2025-02-01,160000,0.369,0.028,0.176,0.00,0.0,0.0,0.064,0.101,0.0
4,2024-10-17,Montoir,2025-03-01,160000,0.369,0.028,0.176,0.00,0.0,0.0,0.064,0.101,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...
1369,2024-10-15,Dunkerque,2025-06-01,174000,0.714,0.000,0.000,0.61,0.0,0.0,0.000,0.104,0.0
1370,2024-10-15,Dunkerque,2025-07-01,174000,0.714,0.000,0.000,0.61,0.0,0.0,0.000,0.104,0.0
1371,2024-10-15,Dunkerque,2025-08-01,174000,0.714,0.000,0.000,0.61,0.0,0.0,0.000,0.104,0.0
1372,2024-10-15,Dunkerque,2025-09-01,174000,0.714,0.000,0.000,0.61,0.0,0.0,0.000,0.104,0.0


## N.B. Historical Data Limits

Currently, a maximum of 30 historical datasets can be called at one time due to the size of the data file.

If more data points are required, the code below can be used. It calls 30 historical datasets at a time, and uses the 'offset' parameter to reach datasets further back in the historical database. To call more history, increase the 'n30_offset' argument passed to the function. 'n30_offset' is the number of additional 30-dataset requests executed after the initial one, so the function makes 'n30_offset + 1' requests in total.

In [14]:
def loop_historical_data(token, n30_offset):
    # fetch the most recent 30 datasets and initialise the dataframe
    historical = fetch_price_releases(access_token=token, limit=30)
    hist_df = organise_dataframe(historical)

    # loop through earlier historical data, adding each batch to the historical dataframe
    for i in range(1, n30_offset + 1):
        historical = fetch_price_releases(access_token=token, limit=30, offset=i * 30)
        hist_df = pd.concat([hist_df, organise_dataframe(historical)])

    return hist_df
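
To see which slices of history a given 'n30_offset' covers, the offsets used by 'loop_historical_data' can be listed without calling the API. This is a small sketch with an invented helper name, `paging_offsets`. The first request in the function above sends no offset, which is represented here as 0 on the assumption that the two are equivalent.

```python
def paging_offsets(n30_offset, page_size=30):
    """List the offset of each request, starting with the initial unshifted call."""
    return [i * page_size for i in range(n30_offset + 1)]


# n30_offset=2 means the initial request plus 2 further requests
offsets = paging_offsets(2)
print(offsets)
```

Each entry corresponds to one API request, so the length of the list is the total number of requests made.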