# Python API Example - Freight Route Data Import and Storage in Dataframe

This guide provides an example of how to access the Spark API:
- The path to your client credentials is the only input needed to run this script (entered just before Section 2)
- The script displays the raw outputs of API requests and shows how to format them for easy reading and analysis
- Customers can copy and paste this script for quick use of the API
- Once comfortable with the process, you can change the variables that are called to produce your own custom analysis products (Section 2 onwards in this guide).

__N.B. This guide covers Freight route data only. If you're looking for other API data products (such as Price releases or Netbacks), please refer to the corresponding code example files.__

### Have any questions?

If you have any questions regarding our API, or need help accessing specific datasets, please contact us at:

__data@sparkcommodities.com__

or refer to our API website for more information about this endpoint: https://www.sparkcommodities.com/api/request/routes.html

# 1. Importing Data

Here we define the functions that allow us to retrieve the valid credentials to access the Spark API.

This section can remain unchanged for most Spark API users.

In [40]:
## Importing libraries for calling the API
import json
import os
import sys
from base64 import b64encode
from urllib.parse import urljoin
import pandas as pd


try:
    from urllib import request, parse
    from urllib.error import HTTPError
except ImportError:
    raise RuntimeError("Python 3 required")

In [41]:
## Defining functions for API request

API_BASE_URL = "https://api.sparkcommodities.com"


def retrieve_credentials(file_path=None):
    """
    Find credentials either by reading the client_credentials file or reading
    environment variables
    """
    if file_path is None:
        client_id = os.getenv("SPARK_CLIENT_ID")
        client_secret = os.getenv("SPARK_CLIENT_SECRET")
        if not client_id or not client_secret:
            raise RuntimeError(
                "SPARK_CLIENT_ID and SPARK_CLIENT_SECRET environment vars required"
            )
    else:
        # Parse the file
        if not os.path.isfile(file_path):
            raise RuntimeError("The file {} doesn't exist".format(file_path))

        with open(file_path) as fp:
            lines = [line.replace("\n", "") for line in fp.readlines()]

        if lines[0] in ("clientId,clientSecret", "client_id,client_secret"):
            client_id, client_secret = lines[1].split(",")
        else:
            print("First line read: '{}'".format(lines[0]))
            raise RuntimeError(
                "The specified file {} doesn't look like a Spark API client "
                "credentials file".format(file_path)
            )

    print(">>>> Found credentials!")
    print(
        ">>>> Client_id={}, client_secret={}****".format(client_id, client_secret[:5])
    )

    return client_id, client_secret


def do_api_post_query(uri, body, headers):
    """
    OAuth2 authentication requires a POST request with client credentials before accessing the API. 
    This POST request will return an Access Token which will be used for the API GET request.
    """
    url = urljoin(API_BASE_URL, uri)

    data = json.dumps(body).encode("utf-8")

    # HTTP POST request
    req = request.Request(url, data=data, headers=headers)
    try:
        response = request.urlopen(req)
    except HTTPError as e:
        print("HTTP Error: ", e.code)
        print(e.read())
        sys.exit(1)

    resp_content = response.read()

    # The server must return HTTP 201. Raise an error if this is not the case
    assert response.status == 201, resp_content

    # The server returned a JSON response
    content = json.loads(resp_content)

    return content


def do_api_get_query(uri, access_token):
    """
    After receiving an Access Token, we can request information from the API.
    """
    url = urljoin(API_BASE_URL, uri)

    headers = {
        "Authorization": "Bearer {}".format(access_token),
        "Accept": "application/json",
    }

    # HTTP GET request
    req = request.Request(url, headers=headers)
    try:
        response = request.urlopen(req)
    except HTTPError as e:
        print("HTTP Error: ", e.code)
        print(e.read())
        sys.exit(1)

    resp_content = response.read()

    # The server must return HTTP 200. Raise an error if this is not the case
    assert response.status == 200, resp_content

    # The server returned a JSON response
    content = json.loads(resp_content)

    return content


def get_access_token(client_id, client_secret):
    """
    Get a new access_token. Access tokens are what applications use to make
    API requests. Access tokens must be kept confidential in storage.

    # Procedure:

    Do a POST query with `grantType` and `scopes` in the body. A basic authorization
    HTTP header is required. The "Basic" HTTP authentication scheme is defined in
    RFC 7617, which transmits credentials as `clientId:clientSecret` pairs, encoded
    using base64.
    """

    # Note: for the sake of this example, we choose to use the Python urllib from the
    # standard lib. One should consider using https://requests.readthedocs.io/

    payload = "{}:{}".format(client_id, client_secret).encode()
    headers = {
        "Authorization": b64encode(payload).decode(),
        "Accept": "application/json",
        "Content-Type": "application/json",
    }
    body = {
        "grantType": "clientCredentials",
        "scopes": "read:prices,read:routes",
    }

    content = do_api_post_query(uri="/oauth/token/", body=body, headers=headers)

    print(
        ">>>> Successfully fetched an access token {}****, valid {} seconds.".format(
            content["accessToken"][:5], content["expiresIn"]
        )
    )

    return content["accessToken"]
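
The docstring of `get_access_token` describes sending the credentials as a base64-encoded `clientId:clientSecret` pair. As a minimal sketch of that encoding step on its own (the credential values below are placeholders, not real ones), the round trip looks roughly like this:

```python
from base64 import b64decode, b64encode

# Placeholder credentials for illustration only (assumed values, not real ones)
client_id = "example-client-id"
client_secret = "example-client-secret"

# Join the pair with a colon and encode it, as get_access_token does
payload = "{}:{}".format(client_id, client_secret).encode()
encoded = b64encode(payload).decode()

# Decoding recovers the original pair: base64 is an encoding, not encryption
decoded = b64decode(encoded).decode()
print(decoded == "example-client-id:example-client-secret")
```

Because the encoded string can be reversed by anyone who sees it, it should be treated as carefully as the credentials themselves.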

## Defining Fetch Request

Here is where we define what type of data we want to fetch from the API.

The fetch request uses the URI:

__uri="/v1.0/routes/"__

This queries shipping route data specifically. Other data products (such as price releases) require different URIs in the fetch request (refer to the other Python API examples).
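
As a small sketch of how the URI is combined with the base URL, the snippet below repeats the `urljoin` call used by the request functions in this guide. The route UUID is the first one shown in the raw output of Section 2; nothing here contacts the API.

```python
from urllib.parse import urljoin

API_BASE_URL = "https://api.sparkcommodities.com"

# The route-listing URI from this guide
routes_url = urljoin(API_BASE_URL, "/v1.0/routes/")
print(routes_url)

# A per-route URI, built from a route UUID that appears in this guide's output
route_uuid = "003a3297-b4ed-4c49-9ab2-65122c6f6de8"
route_url = urljoin(API_BASE_URL, "/v1.0/routes/{}/".format(route_uuid))
print(route_url)
```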

In [42]:
## Define function for listing routes from API
def list_routes(access_token):
    """
    Fetch available routes. Return route ids and Spark price release dates.

    The URI used returns a list of all available Spark routes. With these routes, we can find the price breakdown of a specific route.

    # Procedure:

    Do a GET query to /v1.0/routes/ with a Bearer token authorization HTTP header.
    """
    content = do_api_get_query(uri="/v1.0/routes/", access_token=access_token)

    print(">>>> All the routes you can fetch")
    tickers = []
    for contract in content["data"]["routes"]:
        tickers.append(contract["uuid"])

    reldates = content["data"]["sparkReleaseDates"]

    dicto1 = content["data"]

    return tickers, reldates, dicto1

## N.B. Credentials

Here we call the above functions, and input the file path to our credentials.

N.B. You must have downloaded your client credentials CSV file before proceeding. Please refer to the API documentation if you have not downloaded it already. Instructions for downloading your credentials can be found here:

https://api.sparkcommodities.com/redoc#section/Authentication/Create-an-Oauth2-Client

The code then stores:
- the UUIDs of the callable routes available ('routes')
- the Spark freight release dates that are callable ('reldates')
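
`retrieve_credentials` expects a two-line CSV: a header row (`clientId,clientSecret` or `client_id,client_secret`) followed by one row of values. The sketch below illustrates that layout with a temporary file and placeholder values; both are assumptions made for illustration, and a real file comes from the download step described above.

```python
import os
import tempfile

# Placeholder contents in the layout that retrieve_credentials checks for
sample = "clientId,clientSecret\nexample-client-id,example-client-secret\n"

# Write the sample to a temporary file so the example is self-contained
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as fp:
    fp.write(sample)
    sample_path = fp.name

# Read it back in the same way retrieve_credentials does
with open(sample_path) as fp:
    lines = [line.rstrip("\n") for line in fp]
os.remove(sample_path)

header_ok = lines[0] in ("clientId,clientSecret", "client_id,client_secret")
client_id, client_secret = lines[1].split(",")

print(header_ok, client_id)
```

If no file path is given, the function in this guide falls back to the `SPARK_CLIENT_ID` and `SPARK_CLIENT_SECRET` environment variables instead.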

In [43]:
## Input your file location here
client_id, client_secret = retrieve_credentials(file_path="/tmp/client_credentials.csv")


# Authenticate:
access_token = get_access_token(client_id, client_secret)

# Fetch all routes:
routes, reldates, dicto1 = list_routes(access_token)

>>>> Found credentials!
>>>> Client_id=875f483b-19de-421a-8e9b-dceff6703e83, client_secret=6cdf8****
>>>> Successfully fetched an access token eyJhb****, valid 604799 seconds.
>>>> All the routes you can fetch


# 2. Describing available routes

We have now saved all available routes as a dictionary. We can check how this looks, and then filter the routes by several characteristics.

In [44]:
## Raw dictionary

print(dicto1)

{'routes': [{'uuid': '003a3297-b4ed-4c49-9ab2-65122c6f6de8', 'loadPort': {'uuid': '003dec0a-ce8f-41db-8c24-4d7ef6addf70', 'type': 'export', 'region': 'atlantic', 'name': 'Sabine Pass'}, 'dischargePort': {'uuid': '003b319e-b29e-4853-b4ee-85794d5bacba', 'type': 'import', 'region': 'atlantic', 'name': 'Stade'}, 'via': None}, {'uuid': '003511be-a06d-407b-8d13-22a6ac99f59d', 'loadPort': {'uuid': '00381c87-4180-4430-80f1-bf828099124f', 'type': 'export', 'region': 'pacific', 'name': 'NWS'}, 'dischargePort': {'uuid': '0030d930-6574-4049-a739-327a16620429', 'type': 'import', 'region': 'atlantic', 'name': 'Ravenna'}, 'via': 'suez'}, {'uuid': '00376e89-c9a4-4d49-8100-43ec6ad89793', 'loadPort': {'uuid': '003f9d1b-b4ad-4de9-8c8d-bd7fbcacd3df', 'type': 'export', 'region': 'pacific', 'name': 'Ras Laffan'}, 'dischargePort': {'uuid': '00348162-8284-447d-b641-1f06b9078fdd', 'type': 'combined', 'region': 'pacific', 'name': 'Gate'}, 'via': 'cogh'}, {'uuid': '003fb354-6fc4-406c-86d5-89c015d227a7', 'loadPor
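
Before moving to a DataFrame, the raw dictionary can also be filtered directly. The sketch below uses the first three routes from the output above, trimmed to the fields it needs, and keeps only routes that load in the Pacific region. The variable names are illustrative.

```python
# Three routes copied from the raw output above, trimmed to the fields used here
sample = {
    "routes": [
        {
            "uuid": "003a3297-b4ed-4c49-9ab2-65122c6f6de8",
            "loadPort": {"region": "atlantic", "name": "Sabine Pass"},
            "dischargePort": {"region": "atlantic", "name": "Stade"},
            "via": None,
        },
        {
            "uuid": "003511be-a06d-407b-8d13-22a6ac99f59d",
            "loadPort": {"region": "pacific", "name": "NWS"},
            "dischargePort": {"region": "atlantic", "name": "Ravenna"},
            "via": "suez",
        },
        {
            "uuid": "00376e89-c9a4-4d49-8100-43ec6ad89793",
            "loadPort": {"region": "pacific", "name": "Ras Laffan"},
            "dischargePort": {"region": "pacific", "name": "Gate"},
            "via": "cogh",
        },
    ]
}

# Keep only routes whose load port is in the Pacific region
pacific_loads = [
    r for r in sample["routes"] if r["loadPort"]["region"] == "pacific"
]

for r in pacific_loads:
    label = "{} to {}".format(r["loadPort"]["name"], r["dischargePort"]["name"])
    print(label, "via", r["via"])
```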

In [45]:
## Store route characteristics as a DataFrame

def check_and_store_characteristics(dict1):
    """
    Store some of the route characteristics in lists, then combine them into a DataFrame
    (pandas raises an error if the lists differ in length).
    N.B. these are not all the characteristics available!
    Check the output of the raw dictionary (above) to see all available characteristics.
    """

    routes_info = {
        "UUID": [],
        "Load Location": [],
        "Discharge Location": [],
        "Via": [],
        "Load Region": [],
        "Discharge Region": [],
        "Load UUID": [],
        "Discharge UUID": []
    }
    for route in dict1["routes"]:
        
        routes_info['UUID'].append(route["uuid"])
        routes_info['Via'].append(route["via"])

        routes_info['Load UUID'].append(route["loadPort"]["uuid"])
        routes_info['Load Location'].append(route["loadPort"]["name"])
        routes_info['Load Region'].append(route["loadPort"]["region"])

        routes_info['Discharge UUID'].append(route["dischargePort"]["uuid"])
        routes_info['Discharge Location'].append(route["dischargePort"]["name"])
        routes_info['Discharge Region'].append(route["dischargePort"]["region"])
        
    
    route_df = pd.DataFrame(routes_info)

    return route_df


### Exploring the data


In [46]:
## We use the stored route characteristics to create the dataframe
route_df = check_and_store_characteristics(dicto1)

route_df.head()

Unnamed: 0,UUID,Load Location,Discharge Location,Via,Load Region,Discharge Region,Load UUID,Discharge UUID
0,003a3297-b4ed-4c49-9ab2-65122c6f6de8,Sabine Pass,Stade,,atlantic,atlantic,003dec0a-ce8f-41db-8c24-4d7ef6addf70,003b319e-b29e-4853-b4ee-85794d5bacba
1,003511be-a06d-407b-8d13-22a6ac99f59d,NWS,Ravenna,suez,pacific,atlantic,00381c87-4180-4430-80f1-bf828099124f,0030d930-6574-4049-a739-327a16620429
2,00376e89-c9a4-4d49-8100-43ec6ad89793,Ras Laffan,Gate,cogh,pacific,pacific,003f9d1b-b4ad-4de9-8c8d-bd7fbcacd3df,00348162-8284-447d-b641-1f06b9078fdd
3,003fb354-6fc4-406c-86d5-89c015d227a7,Hammerfest,Gate,,atlantic,pacific,003f92ce-86d5-4d03-9761-311036c47812,00348162-8284-447d-b641-1f06b9078fdd
4,0034630c-1c15-42a0-8236-39a85ad929da,Hammerfest,Futtsu,panama,atlantic,pacific,003f92ce-86d5-4d03-9761-311036c47812,003c2da6-6a74-4e29-aef6-a789a747ac65


# 3. Analysing a Specific Route


Here we define the function that allows us to pull data for a specific route and release date.

We then define a given route ID ('my_route') and release date ('my_release') below the function, and these values are printed out for the user to check the parameters.


__NOTE:__ The 'congestion_laden' and 'congestion_ballast' options are limited to the following combinations: (0,0), (1,1), (4,4), (7,7), (10,10), (15,15), (20,20). Both parameters must have the same value.
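
One way to catch an invalid combination before sending a request is a small local check. The helper below is hypothetical (it is not part of the Spark API or of this guide's functions) and simply encodes the rule stated in the note above.

```python
# Allowed values, as listed in the note above
ALLOWED_CONGESTION_DAYS = {0, 1, 4, 7, 10, 15, 20}


def check_congestion(congestion_laden, congestion_ballast):
    """Hypothetical helper: raise ValueError for combinations the note rules out."""
    if congestion_laden != congestion_ballast:
        raise ValueError("laden and ballast congestion days must be equal")
    if congestion_laden not in ALLOWED_CONGESTION_DAYS:
        raise ValueError(
            "congestion days must be one of {}".format(sorted(ALLOWED_CONGESTION_DAYS))
        )
    return congestion_laden, congestion_ballast


print(check_congestion(4, 4))

try:
    check_congestion(4, 7)
except ValueError as err:
    print("Rejected:", err)
```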

In [47]:
## Defining the function

def fetch_route_data(access_token, ticker, release, congestion_laden=None, congestion_ballast=None):
    """
    For a route, fetch then display the route details

    # Procedure:

    Do GET queries to https://api.sparkcommodities.com/v1.0/routes/{route_uuid}/
    with a Bearer token authorization HTTP header.
    """

    query_params = "?release-date={}".format(release)
    if congestion_laden is not None:
        query_params += "&congestion-laden-days={}".format(congestion_laden)
    if congestion_ballast is not None:
        query_params += "&congestion-ballast-days={}".format(congestion_ballast)

    uri = "/v1.0/routes/{}/{}".format(ticker, query_params)
    print(uri)

    content = do_api_get_query(uri=uri, access_token=access_token)

    my_dict = content["data"]

    print(">>>> Get route information for {}".format(ticker))

    return my_dict
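
The query string above is assembled by hand. As an alternative sketch, `urllib.parse.urlencode` from the standard library can build the same string from a dictionary. The helper name `build_route_uri` is hypothetical; the parameter names are the ones used in `fetch_route_data`.

```python
from urllib.parse import urlencode


def build_route_uri(ticker, release, congestion_laden=None, congestion_ballast=None):
    """Hypothetical helper that mirrors the query string built in fetch_route_data."""
    params = {"release-date": release}
    if congestion_laden is not None:
        params["congestion-laden-days"] = congestion_laden
    if congestion_ballast is not None:
        params["congestion-ballast-days"] = congestion_ballast
    return "/v1.0/routes/{}/?{}".format(ticker, urlencode(params))


# Route UUID and release date taken from the examples in this guide
uri = build_route_uri("003b2bb4-4c8a-41ee-93a1-803f0accf226", "2024-09-25")
print(uri)
```

A benefit of `urlencode` is that it escapes any special characters in the values, which manual string formatting does not.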

### N.B. Plan Limits

__Premium__ Users can choose any release date, as they have full access to the dataset.

__Trial__ Users must choose a release date within the last 2 weeks.

In [48]:
## Calling that function and storing the output

# Here we store the entire dataset called from the API

load = 'Sabine Pass'
discharge = 'Futtsu'
via = 'cogh'

my_route = route_df[(route_df["Load Location"] == load) & \
                    (route_df["Discharge Location"] == discharge) & \
                    (route_df['Via'] == via)]['UUID'].values[0]

my_release = '2024-09-25'

my_dict = fetch_route_data(access_token, my_route, release=my_release)

/v1.0/routes/003b2bb4-4c8a-41ee-93a1-803f0accf226/?release-date=2024-09-25
>>>> Get route information for 003b2bb4-4c8a-41ee-93a1-803f0accf226


In [26]:
## Calling that dictionary to see how it is structured

my_dict

{'uuid': '003b2bb4-4c8a-41ee-93a1-803f0accf226',
 'name': 'Sabine Pass to Futtsu (via COGH)',
 'loadPortUuid': '003dec0a-ce8f-41db-8c24-4d7ef6addf70',
 'dischargePortUuid': '003c2da6-6a74-4e29-aef6-a789a747ac65',
 'via': 'cogh',
 'congestionDays': 0,
 'congestionLadenDays': 0,
 'congestionBallastDays': 0,
 'assumptions': [{'type': 'load-port', 'value': 'Sabine Pass', 'unit': None},
  {'type': 'discharge-port', 'value': 'Futtsu', 'unit': None},
  {'type': 'distance', 'value': '15,855', 'unit': 'NM'},
  {'type': 'round-trip-duration', 'value': '83', 'unit': 'days'},
  {'type': 'flex-days', 'value': '3', 'unit': 'days'},
  {'type': 'port-days', 'value': '2', 'unit': 'days'},
  {'type': 'congestion-days', 'value': '0', 'unit': 'days'},
  {'type': 'canal-days', 'value': '0', 'unit': 'days'},
  {'type': 'discharge-volume', 'value': '3,300,941', 'unit': 'MMBtu'},
  {'type': 'discharge-volume-174', 'value': '3,640,864', 'unit': 'MMBtu'},
  {'type': 'lng-freight-rate-source', 'value': 'Spark25F
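
The `assumptions` entry is a list of type/value/unit records, with numeric values given as strings. A minimal sketch of turning a few of them into numbers, using the first four records copied from the output above:

```python
# First few assumptions copied from the output above
assumptions = [
    {"type": "load-port", "value": "Sabine Pass", "unit": None},
    {"type": "discharge-port", "value": "Futtsu", "unit": None},
    {"type": "distance", "value": "15,855", "unit": "NM"},
    {"type": "round-trip-duration", "value": "83", "unit": "days"},
]

# Index the list by assumption type for direct lookup
by_type = {item["type"]: item for item in assumptions}

# Numeric values arrive as strings, some with thousands separators,
# so the commas are stripped before conversion
distance_nm = int(by_type["distance"]["value"].replace(",", ""))
round_trip_days = int(by_type["round-trip-duration"]["value"])

print(distance_nm, round_trip_days)
```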

In [27]:
## Define a variable storing the route start-end
route_name = my_dict["name"]

### Storing Data as a DataFrame

We extract some relevant data for the chosen route, including the spot price and forward prices. These are stored in a Pandas Dataframe for readability and ease of use.

In [49]:
## Defining the function to store as dataframe
def organise_dataframe(my_dict):
    my_route  = {
            "Period": [],
            "Start Date": [],
            "End Date": [],
            "Cost in USD": [],
            "Cost in USDperMMBtu": [],
            "Hire Cost in USD": [],
        }

    for data in my_dict["dataPoints"]:
        my_route['Start Date'].append(data["deliveryPeriod"]["startAt"])
        my_route['End Date'].append(data["deliveryPeriod"]["endAt"])
        my_route['Period'].append(data["deliveryPeriod"]["name"])

        my_route['Cost in USD'].append(data["costsInUsd"]["total"])
        my_route['Cost in USDperMMBtu'].append(data["costsInUsdPerMmbtu"]["total"])

        my_route['Hire Cost in USD'].append(data["costsInUsd"]["hire"])


    my_route_df = pd.DataFrame(my_route)


    ## Changing the data type of these columns from 'string' to numbers.
    ## This allows us to easily plot a forward curve, as well as perform statistical analysis on the prices.
    my_route_df["Cost in USD"] = pd.to_numeric(my_route_df["Cost in USD"])
    my_route_df["Hire Cost in USD"] = pd.to_numeric(my_route_df["Hire Cost in USD"])
    my_route_df["Cost in USDperMMBtu"] = pd.to_numeric(my_route_df["Cost in USDperMMBtu"])
    
    return my_route_df

In [50]:
my_route_df = organise_dataframe(my_dict)
my_route_df.head()

Unnamed: 0,Period,Start Date,End Date,Cost in USD,Cost in USDperMMBtu,Hire Cost in USD
0,Spot (Physical),2024-10-10,2024-11-09,9745145,2.677,6204250
1,M+1,2024-10-01,2024-10-31,10830479,2.975,7241750
2,M+2,2024-11-01,2024-11-30,11035800,3.031,7387000
3,M+3,2024-12-01,2024-12-31,9809541,2.694,6121250
4,M+4,2025-01-01,2025-01-31,8330837,2.288,4627250


# Panama Canal Congestion

The Spark API allows you to account for congestion delays for any route passing through the Panama canal. This is done via two optional query parameters in the __'fetch_route_data'__ function: 'congestion_laden' and 'congestion_ballast'.

- Set both congestion parameters to the number of delay days needed
    - These should be given as integers from the allowed options listed below: e.g. congestion_laden = 4, congestion_ballast = 4
- If the congestion parameters are not specified, as in the examples above, the congestion value defaults to 0.
- If the congestion parameters are set for a route that does not go through the Panama canal, a 404 error is triggered
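
To avoid the 404 described in the last bullet, the congestion arguments can be passed only when the route actually goes via Panama. The helper below is a hypothetical sketch, not part of the Spark API; it returns keyword arguments that match the signature of `fetch_route_data` in this guide.

```python
def congestion_kwargs(via, days):
    """
    Hypothetical helper: return congestion keyword arguments for
    fetch_route_data only when the route goes via the Panama canal.
    """
    if via != "panama" or days is None:
        return {}
    return {"congestion_laden": days, "congestion_ballast": days}


print(congestion_kwargs("panama", 4))
print(congestion_kwargs("cogh", 4))
```

The result could then be unpacked into the call, for example `fetch_route_data(access_token, route_uuid, release=my_release, **congestion_kwargs(via, 4))`, where `route_uuid` stands for whichever route UUID you selected.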

Below is an example of using the congestion parameters.

In [51]:
## First, check which routes go via the Panama canal

route_df[route_df["Via"] == "panama"].head()

Unnamed: 0,UUID,Load Location,Discharge Location,Via,Load Region,Discharge Region,Load UUID,Discharge UUID
4,0034630c-1c15-42a0-8236-39a85ad929da,Hammerfest,Futtsu,panama,atlantic,pacific,003f92ce-86d5-4d03-9761-311036c47812,003c2da6-6a74-4e29-aef6-a789a747ac65
25,003e5067-132d-4660-824c-ef17386f73fb,Sabine Pass,Tianjin,panama,atlantic,pacific,003dec0a-ce8f-41db-8c24-4d7ef6addf70,003a90f9-9cde-436f-9bec-4f9e786c2ea7
31,003ba153-1f89-477a-b5a1-9fcf90e13c74,Corpus Christi,Futtsu,panama,atlantic,pacific,0030c461-9a63-403d-8f53-9327ea773517,003c2da6-6a74-4e29-aef6-a789a747ac65
33,003b2421-367d-4a92-b2db-9e0096f5d69f,Corpus Christi,Tianjin,panama,atlantic,pacific,0030c461-9a63-403d-8f53-9327ea773517,003a90f9-9cde-436f-9bec-4f9e786c2ea7
40,0030cd3b-af5b-4dcd-b4ad-38be7a030ff6,Cove Point,Futtsu,panama,atlantic,pacific,003e8539-3a98-48fa-b35d-0ba061beea4e,003c2da6-6a74-4e29-aef6-a789a747ac65


### N.B. Congestion Days Options

The 'congestion_laden' and 'congestion_ballast' options are limited to the following combinations: (0,0), (1,1), (4,4), (7,7), (10,10), (15,15), (20,20). Both parameters must have the same value.

### N.B. Plan Limits

__Premium__ Users can choose any release date, as they have full access to the dataset.

__Trial__ Users must choose a release date within the last 2 weeks, i.e. 'reldates[13]' is the earliest date possible.
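
The note above implies that 'reldates' is ordered with the most recent release first, so a Trial user can work with the first 14 entries. The sketch below illustrates the slicing with a hypothetical list of dates; real release dates come from `list_routes` and may skip weekends and holidays.

```python
from datetime import date, timedelta

# Hypothetical release dates, newest first, standing in for the 'reldates' list
newest = date(2025, 1, 14)
reldates_example = [(newest - timedelta(days=i)).isoformat() for i in range(30)]

# Per the note above, a Trial plan can use indices 0 to 13 inclusive
trial_dates = reldates_example[:14]

print(trial_dates[0])   # most recent release in this example list
print(trial_dates[-1])  # earliest release available to a Trial user here
```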

In [53]:
# Specify which route we want to use ('cong_route'): we can find specific routes by filtering the dataframe.
# as well as release date ('cong_release')
# and amount of congestion days (cong_days)

cong_route = route_df[(route_df['Via']=='panama') & \
                      (route_df['Load Location'] == 'Sabine Pass') & \
                      (route_df['Discharge Location'] == 'Futtsu')]['UUID'].tolist()[0]
cong_release = reldates[8]
cong_days_laden = 4
cong_days_ballast = 4

# Fetch the route data with these specifications
cong_dict = fetch_route_data(
    access_token, cong_route, release=cong_release, congestion_laden=cong_days_laden, congestion_ballast=cong_days_ballast
)

/v1.0/routes/0036c5e9-f3c8-4a67-95c8-3afaefa39ff3/?release-date=2025-01-02&congestion-laden-days=4&congestion-ballast-days=4
>>>> Get route information for 0036c5e9-f3c8-4a67-95c8-3afaefa39ff3


In [54]:
# Fetching data for the same route but without congestion delays ('nocong_dict'), for comparison
nocong_dict = fetch_route_data(access_token, cong_route, release=cong_release)

# Save the name of the route
congroute_name = nocong_dict["name"]

/v1.0/routes/0036c5e9-f3c8-4a67-95c8-3afaefa39ff3/?release-date=2025-01-02
>>>> Get route information for 0036c5e9-f3c8-4a67-95c8-3afaefa39ff3


In [55]:
# Call the 'organise_dataframe' function to organise each dictionary into a readable dataframe.
cong_df = organise_dataframe(cong_dict)
nocong_df = organise_dataframe(nocong_dict)

In [56]:
cong_df.head()

Unnamed: 0,Period,Start Date,End Date,Cost in USD,Cost in USDperMMBtu,Hire Cost in USD
0,Spot (Physical),2025-01-17,2025-02-16,6201723,1.681,1228500
1,M+1,2025-02-01,2025-02-28,6091642,1.652,1120500
2,M+2,2025-03-01,2025-03-31,5970677,1.619,1026000
3,M+3,2025-04-01,2025-04-30,5961121,1.616,1026000
4,M+4,2025-05-01,2025-05-31,6067751,1.645,1120500


In [57]:
nocong_df.head()

Unnamed: 0,Period,Start Date,End Date,Cost in USD,Cost in USDperMMBtu,Hire Cost in USD
0,Spot (Physical),2025-01-17,2025-02-16,5622011,1.504,1228500
1,M+1,2025-02-01,2025-02-28,5526134,1.478,1120500
2,M+2,2025-03-01,2025-03-31,5420778,1.45,1026000
3,M+3,2025-04-01,2025-04-30,5412455,1.447,1026000
4,M+4,2025-05-01,2025-05-31,5505326,1.473,1120500
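
To make the effect of congestion easier to read, the two tables can be compared period by period. The sketch below uses the 'Cost in USD' values copied from the two outputs above (4 congestion days versus none) and plain Python lists, so it runs without the API.

```python
# 'Cost in USD' columns copied from the two tables above
periods = ["Spot (Physical)", "M+1", "M+2", "M+3", "M+4"]
cost_cong = [6201723, 6091642, 5970677, 5961121, 6067751]
cost_nocong = [5622011, 5526134, 5420778, 5412455, 5505326]

# Extra cost with congestion, per period
extras = [c - n for c, n in zip(cost_cong, cost_nocong)]

for period, extra, base in zip(periods, extras, cost_nocong):
    print("{:<16} {:>8,} USD ({:.1f}%)".format(period, extra, 100 * extra / base))
```

With the DataFrames from this guide, the same difference could be computed as `cong_df["Cost in USD"] - nocong_df["Cost in USD"]`, since both share the same row index.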


# Analytics Gallery

Want to gain market insights using our data?

Take a look at our [Analytics Gallery](https://www.sparkcommodities.com/api/code-examples/analytics-examples.html) on the Spark API website, which includes:

- __Routes Contract Month Evolution & Seasonality__ - For a month of interest, track how that month has priced in historically (e.g. Dec22 vs Dec23 vs Dec24), providing insight into how the current year's contract (e.g. Dec25) might price in over the coming months for a Route of your choice.

Want to create meaningful charts using our data?

View our Route Seasonality Chart [here](https://www.sparkcommodities.com/api/code-examples/analytics-examples.html). 

