Using REST API to access NES LTER data
A REST API provides a set of consistent URLs that, when fetched, return data in machine-readable form that may be used by application code.
Note: The NES-LTER REST API is currently experimental and subject to change. Not all data are currently available. The number of data types will increase as we continue to develop the backend code and flesh out the API. During the development phase, the server will sometimes be taken down for software updates and so will be briefly unavailable. In addition, there may be bugs and errors encountered even when making correct requests.
The REST API provides a set of URLs from which your code can directly access conveniently-formatted NES-LTER datasets. To do this, construct a URL according to the instructions below, and use one of the provided code snippets to fetch the data.
Several data types are currently provided per cruise:
- List of cruises available (csv) (json)
- CTD casts and metadata related to them
- Underway data
- Station metadata
- Event logs
- Nutrients
- Chlorophyll
- HPLC
In addition to fetching the data directly into your code, you can access the URL with your browser and download the data. The downloaded file will have a filename that distinguishes it from other similar files.
When you request data from a URL, there are two reasons why you might get a 404 "not found" response: either your URL is not correctly formatted according to the REST URL syntax, or the requested data are not available.
You can view the README for each data type to learn more about what data are available.
URLs begin with a standard URL prefix, followed by a data type indicator, the cruise ID, and then parameters including a final filename. For example, to fetch CTD cast 8 from en608, use the following URL:
Example URLs for all data types are given below.
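The URL structure described above can be sketched as a small Python helper. This is only an illustration, not part of the API: the name build_url is hypothetical, and the prefix is copied from the example URLs that appear elsewhere in this document.

```python
# Hypothetical helper; the prefix is copied from example URLs in this document.
API_PREFIX = "https://nes-lter-data.whoi.edu/api"


def build_url(data_type, cruise, filename):
    """Join the prefix, data type indicator, cruise ID, and final filename."""
    parts = [API_PREFIX, data_type, cruise, filename]
    # Strip stray slashes so the pieces are joined by exactly one "/".
    return "/".join(part.strip("/") for part in parts)


# Reproduces two URLs that appear elsewhere in this document.
cast_url = build_url("ctd", "en608", "cast_7.csv")
summary_url = build_url("ctd", "en627", "bottle_summary.csv")
print(cast_url)
print(summary_url)
```

Building URLs in one place makes it easier to loop over cruises or casts without retyping the prefix.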
CTD metadata provides a list of casts for a given cruise along with metadata about each cast (e.g., time, lat/lon, nearest station). For example:
CTD bottle data provides a list of niskins along with data from the CTD as it has been provided by the SeaBird CTD processing software. For example:
CTD bottle summary data provides a concise summary of when and where each bottle was fired. For example:
https://nes-lter-data.whoi.edu/api/ctd/en627/bottle_summary.csv
CTD profiles for each cast are available at URLs like:
Please note that column headers for CTD bottle data and profiles are described in the SeaBird Data Processing Software Manual, Appendix VI: Output Variable Names, e.g., pp. 161 - 174 at this link.
Underway 1-minute data is provided per-cruise. Use a URL based on the following pattern:
Station metadata briefly describes each station along with pertinent information like lat/lon. Use the following URL pattern:
Station metadata is provided per-cruise because the stations may change from cruise to cruise.
Event logs are available per-cruise. The event logs are corrected to align with the other products, including correction of timestamps and positions.
For example:
Use the same pattern as for underway, stations, and events above, but replace the data type indicator with nut, chl, or hplc.
For example:
In general, to view the README, simply type /README at the end of your URL pattern (link to code with the complete set of README URL patterns).
For example:
The following code snippets show how to read NES-LTER data into your preferred language using the REST API. To use each example, change the URL to the URL for whichever dataset you want.
If the URL pattern is incorrect or no data are available, the server will respond with HTTP 404, "not found".
To read data from one of these URLs from MATLAB, use the following lines of code.
myreadtable = @(filename)readtable(filename,'Delimiter',',');
options = weboptions('ContentType', 'table', 'Timeout', 30,'ContentReader',myreadtable);
mytable = webread('https://nes-lter-data.whoi.edu/api/ctd/en608/cast_7.csv', options);
The result is a MATLAB table.
Note that MATLAB's default timeout is 5 s and sometimes the API does not return quickly enough, hence the Timeout setting in weboptions.
In Python, use Pandas read_csv:
import pandas as pd
pd.read_csv('https://nes-lter-data.whoi.edu/api/ctd/en608/cast_7.csv')
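Because a 404 can mean either a malformed URL or missing data, it can help to handle that case explicitly. The following is a minimal sketch using only the Python standard library. The name fetch_rows is hypothetical, and the 30-second timeout is an illustrative choice that mirrors the MATLAB example.

```python
import csv
import io
import urllib.error
import urllib.request


def fetch_rows(url, timeout=30):
    """Fetch a CSV URL; return a list of row dicts, or None on HTTP 404."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            text = response.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        if err.code == 404:
            # Either the URL is malformed or the data are not available.
            return None
        raise  # Other HTTP errors are left for the caller to handle.
    # Each row becomes a dict keyed by column header; values stay as strings.
    return list(csv.DictReader(io.StringIO(text)))
```

Calling fetch_rows with one of the URLs in this document returns None rather than raising an exception when the server answers 404, so a loop over many casts or cruises can skip missing items.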
In R, use read.csv:
read.csv('https://nes-lter-data.whoi.edu/api/ctd/en608/cast_7.csv')
In Julia use the HTTP, CSV, and DataFrames packages:
using HTTP
using CSV
using DataFrames
DataFrame(CSV.File(HTTP.get("https://nes-lter-data.whoi.edu/api/ctd/en608/cast_7.csv").body))
When possible, dates and times are provided in ISO 8601 format in the UTC timezone. Using those fields typically requires additional parsing, and default rules for parsing dates and times in tools such as Excel and MATLAB will often fail or produce unexpected results because of limitations of those tools. For some datatypes, date and/or time are provided in separate columns in various other formats (e.g., seconds since the beginning of a CTD cast).
Here are some code snippets for parsing ISO 8601 formatted timestamps:
In MATLAB, use datenum:
iso8601format = 'yyyy-mm-dd HH:MM:SS';
timestamp = '2019-02-22 12:34:56+00:00';
mydatenum = datenum(timestamp(1:19), iso8601format)
Use Pandas to_datetime:
import pandas as pd
mydatetime = pd.to_datetime('2019-02-22 12:34:56+00:00', utc=True)
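If Pandas is not available, Python's standard library can parse the same timestamp. This is a minimal sketch, assuming Python 3.7 or later, where datetime.fromisoformat accepts this form of timestamp.

```python
from datetime import datetime, timezone

# Parse an ISO 8601 timestamp that carries an explicit UTC offset.
mydatetime = datetime.fromisoformat("2019-02-22 12:34:56+00:00")

# Normalize to UTC in case a timestamp ever carries a different offset.
mydatetime_utc = mydatetime.astimezone(timezone.utc)
print(mydatetime_utc.isoformat())
```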
In R, use as.POSIXlt or strptime:
mydatetime <- as.POSIXlt("2019-02-22 12:34:56+00:00", tz="UTC")
or
iso8601format <- "%Y-%m-%d %H:%M:%S"
mydatetime <- strptime("2019-02-22 12:34:56+00:00", iso8601format, tz="UTC")
In Excel, the following formula will convert an ISO 8601 timestamp in cell A2 to an Excel timestamp value (see this answer for details):
=DATEVALUE(MID(A2,1,10))+TIMEVALUE(MID(A2,12,8))
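To see what the formula computes, the same slicing can be reproduced in Python. This sketch assumes Excel's default 1900 date system, in which serial values for modern dates count days since 1899-12-30. The name excel_serial is hypothetical. Like the formula, it ignores the UTC offset at the end of the timestamp.

```python
from datetime import date, time

# Day zero of Excel's default 1900 date system (valid for dates after 1900-03-01).
EXCEL_EPOCH = date(1899, 12, 30)


def excel_serial(timestamp):
    """Convert 'YYYY-MM-DD hh:mm:ss...' text to an Excel serial date value."""
    day = date.fromisoformat(timestamp[0:10])     # same text as MID(A2,1,10)
    clock = time.fromisoformat(timestamp[11:19])  # same text as MID(A2,12,8)
    whole_days = (day - EXCEL_EPOCH).days         # the DATEVALUE part
    seconds = clock.hour * 3600 + clock.minute * 60 + clock.second
    return whole_days + seconds / 86400           # the TIMEVALUE part is a day fraction


print(excel_serial("2019-02-22 12:34:56+00:00"))
```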
REST endpoints respond with JSON if no extension is specified or if the extension provided is "json". The JSON format is the one provided by Pandas to_json.
Some JSON-only endpoints are or will be included. For example, this endpoint returns a list of CTD cast numbers for en608:
And this endpoint returns a list of cruises: