# onc library tutorial


The _onc_ library is built on the Python _requests_ library, a popular library for making HTTP requests. You can in fact use _requests_ directly to interact with the Oceans 3.0 API, but the _onc_ library is more convenient in several cases (boolean parsing, one-click data product download, automatic download of all pages, etc.). This tutorial demonstrates both approaches.

:::{Tip}
This is a Jupyter notebook. You can download the file [here](https://github.com/OceanNetworksCanada/api-python-client/blob/main/doc/source/Tutorial/onc_Library_Tutorial.ipynb).
:::


In [None]:
# Install some libraries

# 1. onc: this is an onc library tutorial, right?
# 2. requests: an alternative (vanilla) way to make HTTP requests to Oceans 3.0 API.
# 3. pandas: because it's useful and fun!
# 4. python-dotenv: a handy library to hide the token outside the notebook.

import sys

!{sys.executable} -m pip install --upgrade requests pandas python-dotenv onc

In [31]:
# Get the token from your Oceans 3.0 profile page
# Put "TOKEN=[YOUR_TOKEN]" in a .env file.

from dotenv import load_dotenv
import os

load_dotenv(override=True)
token = os.getenv(
    "TOKEN",
    "",  # Put your token here if you don't want to use .env file to store the token.
)
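
An empty token only shows up later as a failed request, so it can help to fail early. The following is a minimal sketch; `require_token` is a hypothetical helper and not part of the onc library, and the value passed in is a placeholder rather than a real token.

```python
def require_token(token):
    """Return the stripped token, or raise if it is missing (hypothetical helper)."""
    cleaned = (token or "").strip()
    if not cleaned:
        raise ValueError(
            "No Oceans 3.0 token found. Put TOKEN=[YOUR_TOKEN] in a .env file "
            "or pass the token directly."
        )
    return cleaned


# Placeholder value for illustration only
print(require_token("  my-placeholder-token  "))
```

In the notebook, this could be called as `token = require_token(token)` right after `os.getenv`.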

In [None]:
import requests
import pandas as pd
from onc import ONC

onc = ONC(token)

# For not overflowing the max-width of sphinx-rtd-theme
pd.set_option("display.max_colwidth", 30)
pd.set_option("display.max_columns", 5)
pd.set_option("display.max_rows", 5)

## 1. Searching with discovery services

To download data from Ocean Networks Canada's database (Oceans 3.0), you need to specify the type of data you are interested in and where it originates (i.e. the location and the specific instrument, or device).

In the Oceans 3.0 API, unique codes identify every location, device, property, data product type, etc. You include these codes in a group of filters to retrieve the information you're interested in.

The Oceans 3.0 API **Discovery services** let you explore the hierarchy of ONC's database and find the codes your filters need (they work like a "search" function).

The example below uses the _getLocations_ method to search for locations whose name includes _"Bullseye"_ (e.g. _"Clayoquot Slope Bullseye Vent"_).
It returns a list of all locations that match the search filters provided.


<font color=#2E8B57>Using ONC library</font>


In [None]:
# 1. Define your filter parameter
params = {
    "locationName": "Bullseye",
}

# 2. Call methods in the onc library
onc.getLocations(params)

<font color=#2E8B57>Using requests library</font>


In [None]:
# 1. Define your filter parameter
params_requests = {
    "locationName": "Bullseye",
    "token": token,
}

# 2. Define your base url for this query
url = "https://data.oceannetworks.ca/api/locations"

# 3. Run your request
r = requests.get(url, params=params_requests)

# 4. Parse the json file
r.json()

Each entry of this list contains metadata for that location, e.g. the locationName, the geographical coordinates and depth, a description field, and the URL for the **Oceans 3.0 Data Search tool**.
The field **locationCode** contains the string "NC89", which is needed for the next steps.
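
Because the response is a list of dictionaries, the codes can be pulled out with a comprehension. This is a minimal sketch that assumes each entry has `locationName` and `locationCode` keys, as described above. `location_codes` is a hypothetical helper, and the sample list is illustrative data rather than a real API response.

```python
def location_codes(locations, name_contains=""):
    """Map locationName -> locationCode for entries whose name contains the text."""
    needle = name_contains.lower()
    return {
        loc["locationName"]: loc["locationCode"]
        for loc in locations
        if needle in loc["locationName"].lower()
    }


# Illustrative sample shaped like the list described above (not real output)
sample = [
    {"locationName": "Bullseye", "locationCode": "NC89"},
    {"locationName": "Hypothetical Site", "locationCode": "XXXX"},
]
print(location_codes(sample, "bullseye"))
```

With the onc library, the same helper could be applied to the list returned by `onc.getLocations(params)`.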


### What device categories are available here at the location NC89?


<font color=#2E8B57>Using ONC library</font>


In [None]:
# 1. Define your filter parameter
params = {
    "locationCode": "NC89",
}

# 2. Call methods in the onc library
result = onc.getDeviceCategories(params)

# 3. Read it into a DataFrame
pd.DataFrame(result)

<font color=#2E8B57>Using requests library</font>


In [None]:
# 1. Define your filter parameter
params_requests = {
    "locationCode": "NC89",
    "token": token,
}

# 2. Define your base url for this query
url = "https://data.oceannetworks.ca/api/deviceCategories"

# 3. Run your request
r = requests.get(url, params=params_requests)

# 4. Read it into a DataFrame
pd.DataFrame(r.json())

### What properties are available for the device category CTD at this location NC89?


<font color=#2E8B57>Using ONC library</font>


In [None]:
# 1. Define your filter parameter
params = {
    "locationCode": "NC89",
    "deviceCategoryCode": "CTD",
}

# 2. Call methods in the onc library
r = onc.getProperties(params)

# 3. Read it into a DataFrame
pd.DataFrame(r)

<font color=#2E8B57>Using requests library</font>


In [None]:
# 1. Define your filter parameter
params_requests = {
    "locationCode": "NC89",
    "deviceCategoryCode": "CTD",
    "token": token,
}

# 2. Define your base url for this query
url = "https://data.oceannetworks.ca/api/properties"

# 3. Run your request
r = requests.get(url, params=params_requests)

# 4. Read it into a DataFrame
pd.DataFrame(r.json())

### What data product types are available for the device category CTD at this location NC89?


<font color=#2E8B57>Using ONC library</font>


In [None]:
# 1. Define your filter parameter
params = {
    "locationCode": "NC89",
    "deviceCategoryCode": "CTD",
}

# 2. Call methods in the onc library
r = onc.getDataProducts(params)

# 3. Read it into a DataFrame
pd.DataFrame(r)

<font color=#2E8B57>Using requests library</font>


In [None]:
# 1. Define your filter parameter
params_requests = {
    "locationCode": "NC89",
    "deviceCategoryCode": "CTD",
    "token": token,
}

# 2. Define your base url for this query
url = "https://data.oceannetworks.ca/api/dataProducts"

# 3. Run your request
r = requests.get(url, params=params_requests)

# 4. Read it into a DataFrame
pd.DataFrame(r.json())

## 2. Downloading data

Once you determine the exact filters that identify the information you are interested in, there are different methods available to download data.

1. Near real-time scalar data sensor readings for a given timeframe
2. Near real-time raw data for a given timeframe
3. Download archived files containing raw data or processed data
4. Download data products that are also available via Oceans 3.0 Data Search Tool


### 2.1 Near real-time scalar data download

In this example we want to download one minute of **Pressure** sensor data from a **CTD** at location "Bullseye" (locationCode: **NC89**).
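
The `dateFrom` and `dateTo` filters in this tutorial use UTC timestamps with millisecond precision, such as `2020-06-20T00:00:00.000Z`. Rather than typing them by hand, they can be derived from a start time and a duration. This is a minimal sketch using only the standard library; `to_onc_timestamp` and `time_window` are hypothetical helper names.

```python
from datetime import datetime, timedelta, timezone


def to_onc_timestamp(moment):
    """Format an aware datetime like '2020-06-20T00:00:00.000Z'."""
    utc = moment.astimezone(timezone.utc)
    return utc.isoformat(timespec="milliseconds").replace("+00:00", "Z")


def time_window(start, **duration):
    """Return dateFrom/dateTo filters covering `duration` starting at `start`."""
    end = start + timedelta(**duration)
    return {"dateFrom": to_onc_timestamp(start), "dateTo": to_onc_timestamp(end)}


# One minute starting at midnight UTC on 2020-06-20
window = time_window(datetime(2020, 6, 20, tzinfo=timezone.utc), minutes=1)
print(window)
```

The resulting dictionary can be merged into a filter, for example `params = {"locationCode": "NC89", **window}`.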


<font color=#2E8B57>Using ONC library</font>


In [None]:
# 1. Define your filter parameter
params = {
    "locationCode": "NC89",
    "deviceCategoryCode": "CTD",
    "dateFrom": "2020-06-20T00:00:00.000Z",
    "propertyCode": "pressure",
    "dateTo": "2020-06-20T00:01:00.000Z",
}

# 2. Call methods in the onc library
r = onc.getDirectByLocation(params)

# 3. Print the dictionary keys (fields) of the response to get a sense of what it contains
print(r.keys())

# 4. Read the "sensorData" field into a DataFrame - this is the data from your requested "Pressure" sensor
pressure = pd.DataFrame(r["sensorData"][0]["data"])
pressure

<font color=#2E8B57>Using requests library</font>


In [None]:
# 1. Define your filter parameter to obtain scalar data for 10 seconds
params_requests = {
    "locationCode": "NC89",
    "deviceCategoryCode": "CTD",
    "dateFrom": "2020-06-20T00:00:00.000Z",
    "propertyCode": "pressure",
    "dateTo": "2020-06-20T00:00:10.000Z",
    "token": token,
}

# 2. Define your base url for this query
url = "https://data.oceannetworks.ca/api/scalardata/location"

# 3. Run your request
r = requests.get(url, params_requests)

# 4. Print the dictionary keys (fields) of the response to get a sense of what it contains. Note that this method returns a JSON object.
print(r.json().keys())

# 5. Read the "sensorData" field into a DataFrame - this is the data from your requested "Pressure" sensor
pressure = pd.DataFrame(r.json()["sensorData"][0]["data"])
pressure

### 2.2 Near real-time raw data readings

In this example we want to download one minute of raw data from a **CTD** at location "Bullseye" (locationCode: **NC89**).


<font color=#2E8B57>Using ONC library</font>


In [None]:
# 1. Define your filter parameter
params = {
    "locationCode": "NC89",
    "deviceCategoryCode": "CTD",
    "dateFrom": "2020-06-20T00:00:00.000Z",
    "dateTo": "2020-06-20T00:01:00.000Z",
}

# 2. Call methods in the onc library
r = onc.getDirectRawByLocation(params)

# 3. Print the dictionary keys (fields) of the response to get a sense of what it contains
print(r.keys())

# 4. Read the "data" field into a DataFrame. The column "readings" contains the raw data and the column "times" the Oceans 3.0 timestamps associated with each data reading.
pd.DataFrame(r["data"])

<font color=#2E8B57>Using requests library</font>


In [None]:
# 1. Define your filter parameter
params_requests = {
    "locationCode": "NC89",
    "deviceCategoryCode": "CTD",
    "dateFrom": "2020-06-20T00:00:00.000Z",
    "dateTo": "2020-06-20T00:00:10.000Z",
    "token": token,
}

# 2. Define your base url for this query
url = "https://data.oceannetworks.ca/api/rawdata/location"

# 3. Run your request
r = requests.get(url, params_requests)

# 4. Read it into a DataFrame
pd.DataFrame(r.json()["data"])

#### 2.2.1. Downloading more data

:::{Admonition} Pagination of response due to too many data rows
:class: note

If the number of data rows exceeds 100,000, not all the data will be returned. The rest of the data can be queried using the _next_ key in the response.

1. If you use **onc**.

`getDirectRawByLocation` supports a boolean `allPages` parameter. When set to `True`, it will try to retrieve all the pages.

2. If you use **requests**.

You have to manually query the next pages until the `next` key in the response json is `None`, and concatenate all the data together.
:::


<font color=#2E8B57>Using ONC library</font>


In [None]:
# 1. Define your filter parameter with a longer date range (2 days of data)
params_longer_range = {
    "locationCode": "NC89",
    "deviceCategoryCode": "CTD",
    "dateFrom": "2020-06-20T00:00:00.000Z",
    "dateTo": "2020-06-22T00:00:00.000Z",
}

# 2. Call methods in the onc library
r = onc.getDirectRawByLocation(params_longer_range, allPages=True)

# 3. Read it into a DataFrame
pd.DataFrame(r["data"])

<font color=#2E8B57>Using requests library</font>


In [None]:
# 1. Define your filter parameter with a longer date range
params_requests_longer_range = {
    "locationCode": "NC89",
    "deviceCategoryCode": "CTD",
    "dateFrom": "2020-06-20T00:00:00.000Z",
    "dateTo": "2020-06-21T20:00:00.000Z",
    "token": token,
}

# 2. Define your base url for this query
url = "https://data.oceannetworks.ca/api/rawdata/location"

# 3. Run your request (the url is still the same)
r = requests.get(url, params_requests_longer_range)

# 4. Read it into a DataFrame
pd.DataFrame(r.json()["data"])

#### Now check the parameter field "next"


In [None]:
r.json()["next"]

##### Update the **dateFrom** parameter to get the next page


In [None]:
params_requests_longer_range["dateFrom"] = r.json()["next"]["parameters"]["dateFrom"]
r = requests.get(url, params_requests_longer_range)
pd.DataFrame(r.json()["data"])

#### Now check the parameter field "next"


In [None]:
print(r.json()["next"])
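
The manual steps above can be wrapped in a loop. The following is a minimal sketch, not the onc library's implementation. `fetch_all_pages` and `get_json` are hypothetical names, and the sketch assumes that `data` is a dictionary of lists (such as "readings" and "times") and that `next` is either `None` or carries the new `dateFrom` under `parameters`, as in the cells above. The fake pages stand in for real responses.

```python
def fetch_all_pages(get_json, params):
    """Follow the `next` key until it is None and merge the lists found in `data`.

    get_json: callable that takes a params dict and returns the parsed JSON response.
    """
    params = dict(params)  # copy so the caller's dict is not modified
    merged = {}
    while True:
        page = get_json(params)
        for key, values in page["data"].items():
            merged.setdefault(key, []).extend(values)
        next_page = page.get("next")
        if next_page is None:
            return merged
        # Same update as the manual step: move dateFrom to the start of the next page
        params["dateFrom"] = next_page["parameters"]["dateFrom"]


# Stand-in for the API: two fake pages keyed by dateFrom (illustrative only)
fake_pages = {
    "page-1": {
        "data": {"readings": ["a", "b"], "times": ["t1", "t2"]},
        "next": {"parameters": {"dateFrom": "page-2"}},
    },
    "page-2": {"data": {"readings": ["c"], "times": ["t3"]}, "next": None},
}
all_data = fetch_all_pages(lambda p: fake_pages[p["dateFrom"]], {"dateFrom": "page-1"})
print(all_data)
```

With requests, `get_json` could be `lambda p: requests.get(url, params=p).json()`, and the merged dictionary can be passed to `pd.DataFrame` as before.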

### 2.3. Downloading archived files

A faster way to download data products and processed data files that are available in Oceans 3.0 (if it suits your needs) is to leverage how ONC scripts auto-generate and archive data products of different types at set time intervals. You can directly download these data product files from our files archive, as long as you know their unique filename.

In the following example, we get the list of archived files available for a camera (deviceCategoryCode: **VIDEOCAM**) at Ridley Island (locationCode: **RISS**) for a 5-minute time range.


<font color=#2E8B57>Using ONC library</font>


In [None]:
# 1. Define your filter parameter
params = {
    "locationCode": "RISS",
    "deviceCategoryCode": "VIDEOCAM",
    "dateFrom": "2016-12-01T00:00:00.000Z",
    "dateTo": "2016-12-01T00:05:00.000Z",
}

# 2. Call methods in the onc library
r = onc.getListByLocation(params)

# 3. This is the list of archived files:
r["files"]

<font color=#2E8B57>Using requests library</font>


In [None]:
# 1. Define your filter parameter
params_requests = {
    "locationCode": "RISS",
    "deviceCategoryCode": "VIDEOCAM",
    "dateFrom": "2016-12-01T00:00:00.000Z",
    "dateTo": "2016-12-01T00:05:00.000Z",
    "token": token,
}

# 2. Define your base url for this query
url_location = "https://data.oceannetworks.ca/api/archivefile/location"

# 3. Run your request
r = requests.get(url_location, params_requests)

# 4. This is the list of archived files:
r.json()["files"]
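
When the list is long, it can be narrowed down before downloading anything. This is a minimal sketch that assumes the `files` field is a list of filename strings; `files_with_extension` is a hypothetical helper, and the second name in the sample list is made up for illustration.

```python
from pathlib import PurePosixPath


def files_with_extension(filenames, extension):
    """Keep only the filenames whose final suffix matches the extension."""
    wanted = "." + extension.lower().lstrip(".")
    return [name for name in filenames if PurePosixPath(name).suffix.lower() == wanted]


# Illustrative list: the first name appears in this tutorial, the second is hypothetical
sample_files = [
    "AXISQ6044PTZACCC8E334C53_20161201T000001.000Z.jpg",
    "HYPOTHETICAL_20161201T000000.000Z.mp4",
]
print(files_with_extension(sample_files, "jpg"))
```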

Once you have the file names, you can use the method **"getFile()"** to download individual files:


<font color=#2E8B57>Using ONC library</font>


In [32]:
# 1. Call methods in the onc library with the filename. The file is downloaded in the output folder.
onc.getFile("AXISQ6044PTZACCC8E334C53_20161201T000001.000Z.jpg", overwrite=True)

{'url': 'https://data.oceannetworks.ca/api/archivefiles?token=[YOUR_TOKEN]&method=getFile&filename=AXISQ6044PTZACCC8E334C53_20161201T000001.000Z.jpg',
 'status': 'completed',
 'size': 113511,
 'downloadTime': 0.235,
 'file': 'AXISQ6044PTZACCC8E334C53_20161201T000001.000Z.jpg'}

<font color=#2E8B57>Using requests library</font>


In [None]:
# 1. Define your filter parameter with the filename
params = {
    "filename": "AXISQ6044PTZACCC8E334C53_20161201T000001.000Z.jpg",
    "token": token,
}

# 2. Define your base url for this query
url_location = "https://data.oceannetworks.ca/api/archivefile/download"

# 3. Run your request
r = requests.get(url_location, params)

# 4. Save the file
# with open(params["filename"], "wb") as f:
#     f.write(r.content)
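
The commented-out lines above write the file into the current directory. A slightly more defensive variant is sketched below: it writes into a folder, creates the folder if needed, and refuses to overwrite an existing file unless asked to. `save_bytes` is a hypothetical helper; the demo writes placeholder bytes into a temporary directory instead of a real download.

```python
import tempfile
from pathlib import Path


def save_bytes(content, filename, out_dir="output", overwrite=False):
    """Write bytes to out_dir/filename and return the resulting path."""
    folder = Path(out_dir)
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / Path(filename).name  # drop any directory part of the name
    if target.exists() and not overwrite:
        raise FileExistsError(f"{target} already exists")
    target.write_bytes(content)
    return target


# Demo with placeholder bytes in a temporary folder
with tempfile.TemporaryDirectory() as tmp:
    saved = save_bytes(b"example bytes", "example.jpg", out_dir=tmp)
    print(saved.name, saved.stat().st_size)
```

With the response from the cell above, the call would be `save_bytes(r.content, params["filename"])`, ideally after `r.raise_for_status()` so that an error page is not saved as a file.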

### 2.4 Downloading data products

Other than using Oceans 3.0 [Data Search](https://data.oceannetworks.ca/DataSearch), we can request the ONC server to generate a **data product**. This is done through the data product delivery services methods.

:::{Hint}
This service should **ONLY** be used when the requested files are not already provided by the **ArchiveFiles** services (see **2.3** above). The data product delivery services re-generate files on ONC's web machines, and this process can often take a very long time. If you request data files for very long time ranges and large file sizes, ONC's system will sometimes slow down or stall and require manual intervention.

We therefore encourage you to check other services before requesting data through this service. If you are unsure what to use, feel free to contact us.
:::

**This process will require three steps before you will be able to see the downloaded data product on your computer:**

1. **Request** the data.
2. **Run** the Request.
3. **Download** the data.

The following example downloads two **PNG** files with plots for 30 minutes of data from a CTD (find them in the **"output"** folder beside this Jupyter notebook). The filter includes codes for **location**, **deviceCategory**, and **dataProduct**, as well as the file **extension** and a time interval. It also includes a couple of filters that configure this specific data product type (starting with the **"dpo\_"** prefix), which can be obtained from the [Data Product Options documentation](https://wiki.oceannetworks.ca/display/O2A/Available+Data+Products). You can download more than 120 different types of data products including audio & video.


<font color=#2E8B57>Using ONC library</font>

The onc library performs all three steps (methods) in one call, so it is preferred over the requests library for this task.


In [None]:
# 1. Define your filter parameter
params = {
    "locationCode": "NC89",
    "deviceCategoryCode": "CTD",
    "dataProductCode": "TSSP",
    "extension": "png",
    "dateFrom": "2017-01-19T00:00:00.000Z",
    "dateTo": "2017-01-19T00:30:00.000Z",
    "dpo_qualityControl": "1",
    "dpo_resample": "none",
}

# 2. Call methods in the onc library
onc.orderDataProduct(params)

<font color=#2E8B57>Using requests library</font>


In [None]:
# 1. Request the data

# Define your base url for this query
url_request = "https://data.oceannetworks.ca/api/dataProductDelivery/request"

# Define your filter parameter
params_requests = {
    "locationCode": "NC89",
    "deviceCategoryCode": "CTD",
    "dataProductCode": "TSSP",
    "extension": "png",
    "dateFrom": "2017-01-19T00:00:00.000Z",
    "dateTo": "2017-01-19T00:30:00.000Z",
    "dpo_qualityControl": "1",
    "dpo_resample": "none",
    "token": token,
}

request = requests.get(url_request, params=params_requests)
request.json()

In [None]:
##### requests continued #####

# 2. Run the request

# Note: you have to execute this cell multiple times until the return shows the "status": "complete"
# Note: Depending on your request, you can have more than one file ('fileCount').
#       You will need to individually download these files by using the index parameter.

url_run = "https://data.oceannetworks.ca/api/dataProductDelivery/run"

requestID = request.json()["dpRequestId"]

params_requests = {
    "dpRequestId": requestID,
    "token": token,
}

r = requests.get(url_run, params_requests)
r.json()
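
Instead of re-running the cell by hand, the run step can be polled in a loop. This is a minimal sketch under stated assumptions: the run response is a list of dictionaries that each carry a "status" key which eventually becomes "complete", as the notes above suggest. `wait_until_complete` and `run_once` are hypothetical names, and the "running" status in the canned responses is an illustrative value.

```python
import time


def wait_until_complete(run_once, interval=5.0, max_attempts=60, sleep=time.sleep):
    """Call run_once() until every entry reports status 'complete'; return that response."""
    for _ in range(max_attempts):
        runs = run_once()
        if runs and all(run.get("status") == "complete" for run in runs):
            return runs
        sleep(interval)
    raise TimeoutError(f"Data product not complete after {max_attempts} attempts")


# Demo with canned responses instead of real HTTP calls (illustrative values)
canned = iter(
    [
        [{"dpRunId": 1, "status": "running"}],
        [{"dpRunId": 1, "status": "complete"}],
    ]
)
runs = wait_until_complete(lambda: next(canned), sleep=lambda seconds: None)
print(runs[0]["dpRunId"])
```

With requests, `run_once` could be `lambda: requests.get(url_run, params_requests).json()`. A real loop would also want to stop early if the response reports an error.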

In [None]:
##### requests continued #####

# Find the RunID for the next step
RunId = r.json()[0]["dpRunId"]
RunId

In [None]:
##### requests continued #####

# 3. Download the data

url_download = "https://data.oceannetworks.ca/api/dataProductDelivery/download"

params_requests = {
    "dpRunId": RunId,
    "token": token,
    "index": "1",
}

r = requests.get(url_download, params_requests)
r  # Rerun this cell until the response code is 200.

# r.headers["Content-Disposition"] ends with "filename=XXX.png"
# filename = r.headers["Content-Disposition"].split("filename=")[-1].strip('"')
# with open(filename, "wb") as f:
#     f.write(r.content)
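
Slicing the header at a fixed position breaks as soon as the prefix changes length. One alternative is to let the standard library parse the header value. This is a minimal sketch; `filename_from_content_disposition` is a hypothetical helper, and the header value below is an example string rather than a captured response.

```python
from email.message import Message


def filename_from_content_disposition(header_value):
    """Return the filename parameter of a Content-Disposition value, or None."""
    message = Message()
    message["Content-Disposition"] = header_value
    return message.get_filename()


print(filename_from_content_disposition("attachment; filename=XXX.png"))
```

In the cell above, this could be used as `filename_from_content_disposition(r.headers["Content-Disposition"])` once the response code is 200.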

:::{admonition} Another option to get the data
:class: tip

Obtain your downloads from your user FTP directory (More -> User Directory) in Oceans 3.0.
Navigate to the folder that contains the runId: You will see the files in this folder.

![UserDirectory.png](../_static/Tutorial/onc_Library_Tutorial/UserDirectory.png)

:::
