# Python API Example - Arb Breakevens Data Import and Storage in Dataframe

This guide shows an example of how to call the Arb Breakevens API endpoint and store the returned data in a pandas DataFrame.

__N.B. This guide covers Arb Breakevens data only. If you're looking for other API data products (such as Netbacks, Price releases or Freight routes), please refer to their corresponding code example files.__

### Have any questions?

If you have any questions regarding our API, or need help accessing specific datasets, please contact us at:

__data@sparkcommodities.com__

## 1. Importing Data

Here we define the functions that allow us to retrieve the valid credentials to access the Spark API.

__This section can remain unchanged for most Spark API users.__

In [20]:
# Importing libraries for calling the API
import json
import os
import sys
import pandas as pd
from base64 import b64encode
from urllib.parse import urljoin


try:
    from urllib import request, parse
    from urllib.error import HTTPError
except ImportError:
    raise RuntimeError("Python 3 required")

In [21]:
# Defining functions for API request

API_BASE_URL = "https://api.sparkcommodities.com"


def retrieve_credentials(file_path=None):
    """
    Find credentials either by reading the client_credentials file or reading
    environment variables
    """
    if file_path is None:
        client_id = os.getenv("SPARK_CLIENT_ID")
        client_secret = os.getenv("SPARK_CLIENT_SECRET")
        if not client_id or not client_secret:
            raise RuntimeError(
                "SPARK_CLIENT_ID and SPARK_CLIENT_SECRET environment vars required"
            )
    else:
        # Parse the file
        if not os.path.isfile(file_path):
            raise RuntimeError("The file {} doesn't exist".format(file_path))

        with open(file_path) as fp:
            lines = [l.replace("\n", "") for l in fp.readlines()]

        if lines[0] in ("clientId,clientSecret", "client_id,client_secret"):
            client_id, client_secret = lines[1].split(",")
        else:
            print("First line read: '{}'".format(lines[0]))
            raise RuntimeError(
                "The specified file {} doesn't look like a Spark API client "
                "credentials file".format(file_path)
            )

    print(">>>> Found credentials!")
    print(
        ">>>> Client_id={}, client_secret={}****".format(client_id, client_secret[:5])
    )

    return client_id, client_secret


def do_api_post_query(uri, body, headers):
    """
    OAuth2 authentication requires a POST request with client credentials before accessing the API. 
    This POST request will return an Access Token which will be used for the API GET request.
    """
    url = urljoin(API_BASE_URL, uri)

    data = json.dumps(body).encode("utf-8")

    # HTTP POST request
    req = request.Request(url, data=data, headers=headers)
    try:
        response = request.urlopen(req)
    except HTTPError as e:
        print("HTTP Error: ", e.code)
        print(e.read())
        sys.exit(1)

    resp_content = response.read()

    # The server must return HTTP 201. Raise an error if this is not the case
    assert response.status == 201, resp_content

    # The server returned a JSON response
    content = json.loads(resp_content)

    return content


def do_api_get_query(uri, access_token, format='json'):
    """
    After receiving an Access Token, we can request information from the API.
    """
    url = urljoin(API_BASE_URL, uri)

    if format == 'json':
        headers = {
            "Authorization": "Bearer {}".format(access_token),
            "Accept": "application/json",
        }
    elif format == 'csv':
        headers = {
            "Authorization": "Bearer {}".format(access_token),
            "Accept": "text/csv"
        }
    else:
        # Without this branch, an unsupported format would fail later with a NameError
        raise ValueError("format must be 'json' or 'csv', got {!r}".format(format))

    # HTTP GET request
    req = request.Request(url, headers=headers)
    try:
        response = request.urlopen(req)
    except HTTPError as e:
        print("HTTP Error: ", e.code)
        print(e.read())
        sys.exit(1)

    resp_content = response.read()

    # The server must return HTTP 200. Raise an error if this is not the case
    assert response.status == 200, resp_content

    # Storing response based on requested format
    if format == 'json':
        content = json.loads(resp_content)
    elif format == 'csv':
        content = resp_content

    return content


def get_access_token(client_id, client_secret):
    """
    Get a new access_token. Access tokens are the thing that applications use to make
    API requests. Access tokens must be kept confidential in storage.

    # Procedure:

    Do a POST query with `grantType` and `scopes` in the body. A basic authorization
    HTTP header is required. The "Basic" HTTP authentication scheme is defined in
    RFC 7617, which transmits credentials as `clientId:clientSecret` pairs, encoded
    using base64.
    """

    # Note: for the sake of this example, we choose to use the Python urllib from the
    # standard lib. One should consider using https://requests.readthedocs.io/

    payload = "{}:{}".format(client_id, client_secret).encode()
    headers = {
        "Authorization": b64encode(payload).decode(),
        "Accept": "application/json",
        "Content-Type": "application/json",
    }
    body = {
        "grantType": "clientCredentials",
        "scopes": "read:netbacks,read:access,read:prices,read:routes",
    }

    content = do_api_post_query(uri="/oauth/token/", body=body, headers=headers)

    print(
        ">>>> Successfully fetched an access token {}****, valid {} seconds.".format(
            content["accessToken"][:5], content["expiresIn"]
        )
    )

    return content["accessToken"]
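
The docstring of `get_access_token` describes the credentials as a `clientId:clientSecret` pair encoded with base64. As a minimal sketch of that encoding step, using made-up placeholder credentials (not real values), the header value can be built and decoded again like this:

```python
from base64 import b64decode, b64encode

# Placeholder credentials, for illustration only (not real values)
sample_client_id = "my-client-id"
sample_client_secret = "my-client-secret"

# Join the pair with a colon and base64-encode it, as get_access_token does
payload = "{}:{}".format(sample_client_id, sample_client_secret).encode()
encoded = b64encode(payload).decode()

# Decoding recovers the original pair: base64 is an encoding, not encryption
decoded_id, decoded_secret = b64decode(encoded).decode().split(":", 1)
print(encoded)
print(decoded_id, decoded_secret)
```

Because the encoded value can be reversed by anyone who sees it, it should be treated with the same care as the raw client secret.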

## Reference Data fetching

In the fetch request, we use the URL:

__uri="/v1.0/netbacks/reference-data/"__

This query gives an overview of all available netbacks and their corresponding arb breakevens, listing the available ports and the possible routes to/from them (e.g. via Suez, Panama etc.).

In [22]:
# Define the function for listing all netbacks
def list_netbacks(access_token):
    """
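    Fetch the netbacks reference data. Return the FoB port UUIDs ("tickers"),
    port names, available via points, Spark release dates and the raw data.

    # Procedure:

    Do a GET query to /v1.0/netbacks/reference-data/ with a Bearer token
    authorization HTTP header.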
    """
    content = do_api_get_query(
        uri="/v1.0/netbacks/reference-data/", access_token=access_token
    )

    print(">>>> All the routes you can fetch")
    tickers = []
    fobPort_names = []

    availablevia = []

    for contract in content["data"]["staticData"]["fobPorts"]:
        tickers.append(contract["uuid"])
        fobPort_names.append(contract["name"])

        availablevia.append(contract["availableViaPoints"])

    reldates = content["data"]["staticData"]["sparkReleases"]

    dicto1 = content["data"]

    return tickers, fobPort_names, availablevia, reldates, dicto1

## N.B. Credentials

Here we call the above functions, and input the file path to our credentials.

N.B. You must have downloaded your client credentials CSV file before proceeding. Please refer to the API documentation if you have not downloaded them already. Instructions for downloading your credentials can be found here:

https://api.sparkcommodities.com/redoc#section/Authentication/Create-an-Oauth2-Client
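
Judging by the checks in `retrieve_credentials`, the credentials file is expected to contain a header line (`clientId,clientSecret` or `client_id,client_secret`) followed by one line of values. The sketch below writes a temporary file with invented placeholder values and parses it the same way; the file contents are assumptions for illustration, not a real credentials file.

```python
import os
import tempfile

# Placeholder file contents (not real credentials)
sample = "clientId,clientSecret\nmy-client-id,my-client-secret\n"

with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as fp:
    fp.write(sample)
    sample_path = fp.name

try:
    with open(sample_path) as fp:
        lines = [line.rstrip("\n") for line in fp]
    # Same header check as retrieve_credentials
    if lines[0] not in ("clientId,clientSecret", "client_id,client_secret"):
        raise RuntimeError("Unexpected header: {!r}".format(lines[0]))
    sample_id, sample_secret = lines[1].split(",")
finally:
    os.remove(sample_path)  # clean up the temporary file

print(sample_id)
```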


In [None]:
# Input the path to your client credentials here
client_id, client_secret = retrieve_credentials(file_path="/tmp/client_credentials.csv")

# Authenticate:
access_token = get_access_token(client_id, client_secret)
print(access_token)

# Fetch all contracts:
tickers, fobPort_names, availablevia, reldates, dicto1 = list_netbacks(access_token)

>>>> Found credentials!
>>>> Client_id=01c23590-ef6c-4a36-8237-c89c3f1a3b2a, client_secret=80763****


>>>> Successfully fetched an access token eyJhb****, valid 604799 seconds.
eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJ0eXBlIjoiYWNjZXNzVG9rZW4iLCJzdWIiOiIwMWMyMzU5MC1lZjZjLTRhMzYtODIzNy1jODljM2YxYTNiMmEiLCJzdWJUeXBlIjoib2F1dGgtY2xpZW50IiwiZXhwIjoxNzU1ODU3MDIzLCJoYXNoZWRTZWNyZXQiOiJwYmtkZjJfc2hhMjU2JDYwMDAwMCRoTXRMNDlrMUZUaVVzTE42Njlqc2pPJHVCSXNxcml5b1NHVzJTS1AvaHVLNHh3eTZ4d3VDN001aUdGRm43N2l4S1U9Iiwib3JnVXVpZCI6IjQ5MzhiMGJiLTVmMjctNDE2NC04OTM4LTUyNTdmYmQzNTNmZiIsImNsaWVudFR5cGUiOiJvYXV0aC1jbGllbnQifQ.f65SrH9FbI8DqkAfclLHftUXS6U7fqbmz77996ZZVQ0
>>>> All the routes you can fetch


In [24]:
# Prints the callable route options, corresponding to each Route ID number shown above
# I.e. availablevia[2] shows the available route options for tickers[2]

print(availablevia)

[['suez', None], ['panama', None], ['cogh', 'panama', 'suez', None], ['suez', None], ['cogh', 'panama', 'suez', None], ['cogh', 'panama', 'suez', None], ['cogh', 'suez', None], ['cogh', 'suez', None], ['cogh', None], ['cogh', 'panama', None], ['cogh', 'panama', 'suez', None], ['cogh', 'panama', 'suez', None], ['panama', None], ['cogh', 'suez', None], ['suez', None], ['cogh', 'panama', 'suez', None], ['magellan-straits', 'panama', None], ['cogh', 'panama', 'suez', None], ['cogh', 'suez', None], ['cogh', 'panama', 'suez', None], ['magellan-straits', 'panama', None], ['cogh', 'panama', 'suez', None], ['cogh', 'panama', 'suez', None], ['panama', None], ['cogh', 'suez', None], ['cogh', 'panama', 'suez', None], [None], ['cogh', None], ['cogh', 'panama', 'suez', None], ['cogh', 'panama', 'suez', None], ['panama', None], ['suez', None], ['cogh', 'magellan-straits', 'panama', None], [None], ['cogh', 'suez', None], ['cogh', 'panama', 'suez', None], ['cogh', 'suez', None]]


In [25]:
# Print the names of each of the ports, corresponding to Route ID and availablevia details shown above
# Some of these options are currently unavailable. 
# Please refer to the Netbacks tool on the Spark Platform to check which Netbacks are currently available

print(fobPort_names)

['Wheatstone', 'Woodfibre LNG', 'Corpus Christi', 'Bintulu', 'Lake Charles', 'Elba Island', 'Das Island', 'Gorgon', 'Congo LNG', 'Atlantic LNG', 'Bethioua', 'Cove Point', 'Peru LNG', 'Ras Laffan', 'Murmansk', 'Hammerfest', 'LNG Canada', 'Rio Grande LNG', 'Yamal', 'Plaquemines', 'Costa Azul', 'Altamira', 'Sabine Pass', 'Puerto Libertad', 'NWS', 'Delfin FLNG', 'Bioko', 'Bonny LNG', 'Freeport', 'Cameron (Liqu.)', 'Kamchatka', 'GLNG', 'Argentina LNG', 'Soyo', 'Qalhat', 'Calcasieu Pass', 'Tangguh']


In [26]:
# Shows the structure of the raw dictionary called
dicto1

{'staticData': {'viaPoints': [{'code': 'panama', 'name': 'Panama'},
   {'code': 'suez', 'name': 'Suez'},
   {'code': 'cogh', 'name': 'COGH'},
   {'code': 'magellan-straits', 'name': 'Strait of Magellan'}],
  'fobPorts': [{'uuid': '00398967-3ee1-4b26-bcdb-805ad19dbcce',
    'name': 'Wheatstone',
    'availableViaPoints': ['suez', None]},
   {'uuid': '00314d16-eada-4f37-bff3-d844708aeb45',
    'name': 'Woodfibre LNG',
    'availableViaPoints': ['panama', None]},
   {'uuid': '0030c461-9a63-403d-8f53-9327ea773517',
    'name': 'Corpus Christi',
    'availableViaPoints': ['cogh', 'panama', 'suez', None]},
   {'uuid': '003342b7-ba5b-4f0e-b6df-4d95837a5691',
    'name': 'Bintulu',
    'availableViaPoints': ['suez', None]},
   {'uuid': '003ff22f-77d8-413f-9997-2c6280e7c28c',
    'name': 'Lake Charles',
    'availableViaPoints': ['cogh', 'panama', 'suez', None]},
   {'uuid': '00352a22-e959-4233-b93d-d23a0da3dfed',
    'name': 'Elba Island',
    'availableViaPoints': ['cogh', 'panama', 'suez', N

### Reformatting

For a more accessible data format, we filter the data to only retrieve ports that have available Netbacks data. We then reformat this into a DataFrame.

In [27]:
# Define formatting data function
def format_store(available_via, fob_names, tickrs):
    dict_store = {
        "Index": [],
        "Ports": [],
        "Ticker": [],
        "Available Via": []
    }
    
    c = 0
    for a in available_via:
        ## Check which routes have non-empty Netbacks data and save indices
        if len(a) != 0:
            dict_store['Index'].append(c)

            # Use these indices to retrieve the corresponding Netbacks info
            dict_store['Ports'].append(fob_names[c])
            dict_store['Ticker'].append(tickrs[c])
            dict_store['Available Via'].append(available_via[c])
        c += 1
    # Show available Netbacks ports in a DataFrame (with corresponding indices)
    dict_df = pd.DataFrame(dict_store)
    return dict_df


# Run formatting data function
available_df = format_store(availablevia,fobPort_names,tickers)

# View some of the dataframe
available_df.head()

Unnamed: 0,Index,Ports,Ticker,Available Via
0,0,Wheatstone,00398967-3ee1-4b26-bcdb-805ad19dbcce,"[suez, None]"
1,1,Woodfibre LNG,00314d16-eada-4f37-bff3-d844708aeb45,"[panama, None]"
2,2,Corpus Christi,0030c461-9a63-403d-8f53-9327ea773517,"[cogh, panama, suez, None]"
3,3,Bintulu,003342b7-ba5b-4f0e-b6df-4d95837a5691,"[suez, None]"
4,4,Lake Charles,003ff22f-77d8-413f-9997-2c6280e7c28c,"[cogh, panama, suez, None]"
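
As an alternative sketch, a similar table can be built directly from the `fobPorts` records in the reference data. The sample below copies three entries from the `dicto1` output above; the column names mirror those used by `format_store`, and the `explode` step is an optional extra rather than part of the original workflow.

```python
import pandas as pd

# Three fobPorts entries copied from the reference-data output above
fob_ports = [
    {"uuid": "00398967-3ee1-4b26-bcdb-805ad19dbcce", "name": "Wheatstone",
     "availableViaPoints": ["suez", None]},
    {"uuid": "00314d16-eada-4f37-bff3-d844708aeb45", "name": "Woodfibre LNG",
     "availableViaPoints": ["panama", None]},
    {"uuid": "0030c461-9a63-403d-8f53-9327ea773517", "name": "Corpus Christi",
     "availableViaPoints": ["cogh", "panama", "suez", None]},
]

# One row per port, renamed to match the columns used by format_store
ports_df = pd.DataFrame(fob_ports).rename(
    columns={"name": "Ports", "uuid": "Ticker", "availableViaPoints": "Available Via"}
)

# Optional: one row per (port, via point) pair, which is easier to filter
pairs_df = ports_df.explode("Available Via").reset_index(drop=True)
print(pairs_df)
```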


## Fetching Arb Breakevens Data specific to one port

Now that we can see all the Netbacks data available to us, we can choose which ports to request Arb Breakevens data for (by referring to 'available_df' above).

The first step is to choose which port ID ('my_ticker') we want. We check what possible routes are available for this port ('possible_via') and then choose one ('my_via').

__This is where you should input the specific Netbacks parameters you want to see__

In [28]:
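# Choose the port to request data for

# Here we define which port we want
port = "Sabine Pass"
# Select the single matching row explicitly; calling int() on a one-element
# Series is deprecated in recent pandas versions
ti = int(available_df.loc[available_df["Ports"] == port, "Index"].iloc[0])
my_ticker = tickers[ti]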

print(my_ticker)

003dec0a-ce8f-41db-8c24-4d7ef6addf70



In [29]:
# See possible route passage options
possible_via = availablevia[tickers.index(my_ticker)]
print(possible_via)

['cogh', 'panama', 'suez', None]


In [30]:
# Choose route passage
my_via = possible_via[0]
print(my_via)

cogh
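
Picking `possible_via[0]` depends on the order of the list. A small sketch of selecting a via point by name instead, with a check that the port supports it, is shown here; `choose_via` is a hypothetical helper, and the list is copied from the output above.

```python
def choose_via(possible_via, wanted):
    """Hypothetical helper: return `wanted` if the port supports it, else raise."""
    if wanted not in possible_via:
        raise ValueError(
            "via point {!r} not available; choose from {}".format(wanted, possible_via)
        )
    return wanted


# Via points listed for Sabine Pass in the output above
sample_possible_via = ["cogh", "panama", "suez", None]

sample_via = choose_via(sample_possible_via, "panama")
print(sample_via)
```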


## Data Import Function

Here we define the function that fetches Arb Breakevens data in the data format of choice. In the fetch request, we use the URL:

__uri="/v1.0/netbacks/arb-breakevens/"__

We then print the output. The data function takes 6 inputs:
- __breakeven -__  which breakeven you're looking to pull, 'jkm-ttf' or 'freight' 
- __ticker -__  which FoB port ticker to use
- __via -__  which via point to use ('cogh', 'panama' etc.)
- __start -__  what date you want the historical data to start from. Format in yyyy-mm-dd
- __end -__  what date you want the historical data to end at (inclusive). Format in yyyy-mm-dd
- __format -__ which format to return the data in, 'json' or 'csv'. Metadata is only available via JSON

__This function does not need to be altered by the user.__
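
The inputs listed above end up in the request URI as query parameters. As a sketch, the same URI can be assembled with `urllib.parse.urlencode`, which handles escaping and makes it easy to skip unset parameters; `build_breakevens_uri` is a hypothetical helper, and the parameter names (`fob-port`, `via-point`, `start`, `end`) are taken from the function in this section.

```python
from urllib.parse import urlencode


def build_breakevens_uri(ticker, breakeven="jkm-ttf", via=None, start=None, end=None):
    """Hypothetical helper: build the arb-breakevens URI for one FoB port."""
    params = {"fob-port": ticker, "via-point": via, "start": start, "end": end}
    # Drop parameters that were not supplied
    params = {key: value for key, value in params.items() if value is not None}
    return "/v1.0/netbacks/arb-breakevens/{}/?{}".format(breakeven, urlencode(params))


sample_uri = build_breakevens_uri(
    "003dec0a-ce8f-41db-8c24-4d7ef6addf70",
    via="cogh",
    start="2025-07-20",
    end="2025-07-30",
)
print(sample_uri)
```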

In [31]:
## Defining the function
from io import StringIO

def fetch_breakevens(access_token, ticker, via=None, breakeven='jkm-ttf', start=None, end=None, format='json'):
    """
    For a FoB port, fetch the Arb Breakevens data.
    https://api.sparkcommodities.com/v1.0/netbacks/arb-breakevens/
    """
    query_params = breakeven + '/' + "?fob-port={}".format(ticker)

    if via is not None:
        query_params += "&via-point={}".format(via)
    if start is not None:
        query_params += "&start={}".format(start)
    if end is not None:
        query_params += "&end={}".format(end)

    uri = "/v1.0/netbacks/arb-breakevens/{}".format(query_params)
    print(uri)
    content = do_api_get_query(uri=uri, access_token=access_token, format=format)

    if format == 'json':
        # JSON responses wrap the records in a 'data' key
        result = content['data']
    else:
        # CSV responses arrive as bytes; decode and load into a DataFrame
        result = pd.read_csv(StringIO(content.decode('utf-8')))

    return result

## Calling that function and storing the output - JSON version
my_dict = fetch_breakevens(access_token, my_ticker, via='cogh', breakeven='jkm-ttf', start='2025-07-20', end='2025-07-30')


/v1.0/netbacks/arb-breakevens/jkm-ttf/?fob-port=003dec0a-ce8f-41db-8c24-4d7ef6addf70&via-point=cogh&start=2025-07-20&end=2025-07-30


In [32]:
# JSON data sample
my_dict

[{'releaseDate': '2025-07-30',
  'prices': [{'deliveryMonthIndex': 'M+2',
    'deliveryMonthStart': '2025-09',
    'deliveryMonthArb': None,
    'jkmTtfSpreadBreakeven': None,
    'deliveryDate': '2025-09-15',
    'nweCargoLoadDate': '2025-08-29',
    'neaCargoLoadDate': '2025-08-02'},
   {'deliveryMonthIndex': 'M+3',
    'deliveryMonthStart': '2025-10',
    'deliveryMonthArb': '-0.159',
    'jkmTtfSpreadBreakeven': '0.389',
    'deliveryDate': '2025-10-15',
    'nweCargoLoadDate': '2025-09-28',
    'neaCargoLoadDate': '2025-09-01'},
   {'deliveryMonthIndex': 'M+4',
    'deliveryMonthStart': '2025-11',
    'deliveryMonthArb': '-0.163',
    'jkmTtfSpreadBreakeven': '0.288',
    'deliveryDate': '2025-11-15',
    'nweCargoLoadDate': '2025-10-29',
    'neaCargoLoadDate': '2025-10-02'},
   {'deliveryMonthIndex': 'M+5',
    'deliveryMonthStart': '2025-12',
    'deliveryMonthArb': '-0.138',
    'jkmTtfSpreadBreakeven': '0.458',
    'deliveryDate': '2025-12-15',
    'nweCargoLoadDate': '2025-1
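
The JSON response nests a list of `prices` under each release date, with numeric values given as strings or `None`. A minimal sketch of flattening that structure into one row per delivery month follows; `flatten_breakevens` is a hypothetical helper, and the sample holds two entries copied from the output above.

```python
def flatten_breakevens(releases):
    """Hypothetical helper: one flat row per (release date, delivery month)."""
    rows = []
    for release in releases:
        for price in release["prices"]:
            row = {"releaseDate": release["releaseDate"]}
            row.update(price)
            # Numeric fields arrive as strings or None; convert where present
            for key in ("deliveryMonthArb", "jkmTtfSpreadBreakeven"):
                if row[key] is not None:
                    row[key] = float(row[key])
            rows.append(row)
    return rows


# Two entries copied from the JSON sample above
sample_json = [
    {
        "releaseDate": "2025-07-30",
        "prices": [
            {"deliveryMonthIndex": "M+2", "deliveryMonthStart": "2025-09",
             "deliveryMonthArb": None, "jkmTtfSpreadBreakeven": None,
             "deliveryDate": "2025-09-15", "nweCargoLoadDate": "2025-08-29",
             "neaCargoLoadDate": "2025-08-02"},
            {"deliveryMonthIndex": "M+3", "deliveryMonthStart": "2025-10",
             "deliveryMonthArb": "-0.159", "jkmTtfSpreadBreakeven": "0.389",
             "deliveryDate": "2025-10-15", "nweCargoLoadDate": "2025-09-28",
             "neaCargoLoadDate": "2025-09-01"},
        ],
    }
]

flat_rows = flatten_breakevens(sample_json)
print(flat_rows[1])
```

The resulting list of dictionaries can be passed straight to `pd.DataFrame(flat_rows)` if a DataFrame is preferred.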

## CSV example

In [33]:
# Calling the same data in CSV format. This option automatically converts the data to a pandas DataFrame.
df = fetch_breakevens(access_token, my_ticker, via='cogh', breakeven='jkm-ttf', start='2025-07-20', end='2025-07-30', format='csv')

/v1.0/netbacks/arb-breakevens/jkm-ttf/?fob-port=003dec0a-ce8f-41db-8c24-4d7ef6addf70&via-point=cogh&start=2025-07-20&end=2025-07-30


In [34]:
# CSV data sample as a Pandas DataFrame
df

Unnamed: 0,ReleaseDate,DeliveryMonthIndex,DeliveryMonthStart,DeliveryDate,NWECargoLoadDate,NEACargoLoadDate,DeliveryMonthArb,JKMTTFSpreadBreakeven,FobPortUUID,FobPortName,ViaPoint
0,2025-07-30,M+2,2025-09,2025-09-15,2025-08-29,2025-08-02,,,003dec0a-ce8f-41db-8c24-4d7ef6addf70,sabine-pass,cogh
1,2025-07-30,M+3,2025-10,2025-10-15,2025-09-28,2025-09-01,-0.159,0.389,003dec0a-ce8f-41db-8c24-4d7ef6addf70,sabine-pass,cogh
2,2025-07-30,M+4,2025-11,2025-11-15,2025-10-29,2025-10-02,-0.163,0.288,003dec0a-ce8f-41db-8c24-4d7ef6addf70,sabine-pass,cogh
3,2025-07-30,M+5,2025-12,2025-12-15,2025-11-28,2025-11-01,-0.138,0.458,003dec0a-ce8f-41db-8c24-4d7ef6addf70,sabine-pass,cogh
4,2025-07-30,M+6,2026-01,2026-01-15,2025-12-29,2025-12-02,-0.173,0.598,003dec0a-ce8f-41db-8c24-4d7ef6addf70,sabine-pass,cogh
...,...,...,...,...,...,...,...,...,...,...,...
75,2025-07-21,M+7,2026-02,2026-02-15,2026-01-29,2026-01-02,-0.395,0.818,003dec0a-ce8f-41db-8c24-4d7ef6addf70,sabine-pass,cogh
76,2025-07-21,M+8,2026-03,2026-03-15,2026-02-26,2026-01-30,-0.548,0.723,003dec0a-ce8f-41db-8c24-4d7ef6addf70,sabine-pass,cogh
77,2025-07-21,M+9,2026-04,2026-04-15,2026-03-29,2026-03-02,-0.480,0.643,003dec0a-ce8f-41db-8c24-4d7ef6addf70,sabine-pass,cogh
78,2025-07-21,M+10,2026-05,2026-05-15,2026-04-28,2026-04-01,-0.221,0.498,003dec0a-ce8f-41db-8c24-4d7ef6addf70,sabine-pass,cogh
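
Once the CSV data is in a DataFrame, typical next steps are parsing the dates and dropping rows that have no value. A sketch on three rows copied from the output above follows (only a subset of the columns is kept for brevity; the averaging step is just an illustration of further processing).

```python
from io import StringIO

import pandas as pd

# Three rows copied from the CSV output above, with a subset of the columns
csv_text = """ReleaseDate,DeliveryMonthIndex,DeliveryMonthStart,DeliveryMonthArb,JKMTTFSpreadBreakeven
2025-07-30,M+2,2025-09,,
2025-07-30,M+3,2025-10,-0.159,0.389
2025-07-30,M+4,2025-11,-0.163,0.288
"""

sample_df = pd.read_csv(StringIO(csv_text), parse_dates=["ReleaseDate"])

# Empty fields are read as NaN; drop the rows where the breakeven is missing
valid_df = sample_df.dropna(subset=["JKMTTFSpreadBreakeven"])

# Mean breakeven per release date
mean_by_release = valid_df.groupby("ReleaseDate")["JKMTTFSpreadBreakeven"].mean()
print(mean_by_release)
```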
