### Authenticating with the M2M API

Log in at https://ooinet.oceanobservatories.org/ and obtain your <b>API username and API token</b> from your profile (top right corner).

In [14]:
username = ''
token = ''

Optionally, you can handle authentication outside the notebook by setting up a .netrc file in your home directory and reading it from Python. Open your terminal and run:
```
$ touch .netrc
$ chmod 600 .netrc
$ vim .netrc

```
Add the following to your .netrc file:

```
machine ooinet.oceanobservatories.org
login OOIAPI-TEMPD1SPK4K0X
password TEMPCXL48ET2XT
```

Replace the login and password values with your own API username and token. Save the file and uncomment the following cell.

In [15]:
# import netrc
# credentials = netrc.netrc()  # reads ~/.netrc; a separate name avoids shadowing the module
# remoteHostName = "ooinet.oceanobservatories.org"
# info = credentials.authenticators(remoteHostName)
# username = info[0]
# token = info[2]
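
As a minimal sketch of the same lookup with some error handling, the helper below returns `None` when the file or the host entry is missing instead of raising. The function name `read_credentials`, the temporary file and the `EXAMPLE-*` values are illustrative assumptions, not part of the OOI tooling.

```python
import netrc
import os
import tempfile


def read_credentials(host, netrc_path=None):
    """Return (login, password) for host, or None if no entry can be read.

    netrc_path=None falls back to ~/.netrc, as in the cell above.
    """
    try:
        entry = netrc.netrc(netrc_path).authenticators(host)
    except (FileNotFoundError, netrc.NetrcParseError):
        return None
    if entry is None:
        return None
    login, _account, password = entry
    return login, password


# Demonstration against a throwaway file holding placeholder values
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.netrc")
    with open(demo_path, "w") as f:
        f.write(
            "machine ooinet.oceanobservatories.org\n"
            "login EXAMPLE-USER\n"
            "password EXAMPLE-TOKEN\n"
        )
    os.chmod(demo_path, 0o600)
    demo_credentials = read_credentials("ooinet.oceanobservatories.org", demo_path)
    missing_credentials = read_credentials("example.invalid", demo_path)
```

In the notebook you would call `read_credentials("ooinet.oceanobservatories.org")` and unpack the result into `username` and `token` when it is not `None`.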

### Setting up the data request url

The values entered below, which are used to build the M2M data request URL, can be found at
http://ooi.visualocean.net/instruments/view/RS03ECAL-MJ03E-06-BOTPTA302. You need the reference designator, delivery method and stream name. To request all available data, leave the start and end times unspecified.

In [16]:
reference_designator = 'RS03ECAL-MJ03E-06-BOTPTA302'
method = 'streamed'
stream = 'botpt_lily_sample'
# beginDT = '2014-09-27T01:01:01.000Z' #example format to specify a specific time range
beginDT = None
endDT = None

In the next step we will build the data request url and specify the parameters.

In [17]:
base_url = 'https://ooinet.oceanobservatories.org/api/m2m/12576/sensor/inv'

subsite = reference_designator[:8]
node = reference_designator[9:14]
sensor = reference_designator[15:27]

data_request_url = '/'.join((base_url, subsite, node, sensor, method, stream))

params = {
    'beginDT': beginDT,
    'endDT': endDT,
}
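
The fixed slice positions above work for this reference designator because its parts have the expected lengths. As a hedged alternative, the sketch below splits on the first two hyphens instead and leaves unset time bounds out of the parameters. The function name `build_request` and its argument names are assumptions made for illustration.

```python
BASE_URL = 'https://ooinet.oceanobservatories.org/api/m2m/12576/sensor/inv'


def build_request(reference_designator, method, stream, begin=None, end=None,
                  base_url=BASE_URL):
    """Return (url, params) for a data request."""
    # 'RS03ECAL-MJ03E-06-BOTPTA302' -> 'RS03ECAL', 'MJ03E', '06-BOTPTA302'
    subsite, node, sensor = reference_designator.split('-', 2)
    url = '/'.join((base_url, subsite, node, sensor, method, stream))
    # Keep only the time bounds that were actually given
    bounds = {'beginDT': begin, 'endDT': end}
    params = {key: value for key, value in bounds.items() if value is not None}
    return url, params


example_url, example_params = build_request(
    'RS03ECAL-MJ03E-06-BOTPTA302', 'streamed', 'botpt_lily_sample',
    begin='2014-09-27T01:01:01.000Z',
)
```

Calling it without `begin` and `end` yields an empty `params` dictionary, which corresponds to requesting all available data.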

### Requesting the data

Next we will send off the request. When you send a request to the API, you receive a response that tells you whether the request was successful. A 200-level code means OK, while a 400- or 500-level code means something went wrong.
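
To make the status classes concrete, here is a small sketch that uses only the standard library to turn a numeric code into a readable label. The helper name `describe_status` is an assumption introduced for this example.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a short human-readable description of an HTTP status code."""
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        phrase = 'Unknown'
    if 200 <= code < 300:
        outcome = 'success'
    elif 400 <= code < 500:
        outcome = 'client error'
    elif 500 <= code < 600:
        outcome = 'server error'
    else:
        outcome = 'other'
    return '{} {} ({})'.format(code, phrase, outcome)


ok_label = describe_status(200)
auth_label = describe_status(401)
server_label = describe_status(503)
```

With the `requests` library, calling `raise_for_status()` on a response object is a common way to stop early, since it raises an exception for 400- and 500-level codes.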

In [18]:
import requests
import time

In [19]:
r = requests.get(data_request_url, params=params, auth=(username, token))
r

<Response [200]>

Next we can examine the content of the response, which tells us the URL locations where the data is being delivered. You could also read it as `r.text` instead of `r.json()`, but the JSON form is generally preferred, as it is easier to parse.

In [20]:
urls = r.json()
urls

{u'allURLs': [u'https://opendap.oceanobservatories.org/thredds/catalog/ooi/ooidatateam@gmail.com/20180226T202756-RS03ECAL-MJ03E-06-BOTPTA302-streamed-botpt_lily_sample/catalog.html',
  u'https://opendap.oceanobservatories.org/async_results/ooidatateam@gmail.com/20180226T202756-RS03ECAL-MJ03E-06-BOTPTA302-streamed-botpt_lily_sample'],
 u'numberOfSubJobs': 8795,
 u'outputURL': u'https://opendap.oceanobservatories.org/thredds/catalog/ooi/ooidatateam@gmail.com/20180226T202756-RS03ECAL-MJ03E-06-BOTPTA302-streamed-botpt_lily_sample/catalog.html',
 u'requestUUID': u'670da58f-d2c6-4f89-910d-eb195b2eb2b6',
 u'sizeCalculation': 4476292936,
 u'timeCalculation': 18353}

The first url in the response is the location on THREDDS where the data is being served.

In [21]:
print(urls['allURLs'][0])

https://opendap.oceanobservatories.org/thredds/catalog/ooi/ooidatateam@gmail.com/20180226T202756-RS03ECAL-MJ03E-06-BOTPTA302-streamed-botpt_lily_sample/catalog.html


The second url in the response is the regular Apache server location for the data.

In [33]:
print(urls['allURLs'][1])

https://opendap.oceanobservatories.org/async_results/ooidatateam@gmail.com/20180226T202756-RS03ECAL-MJ03E-06-BOTPTA302-streamed-botpt_lily_sample
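
Relying on list positions works for the response shown here. As a slightly more defensive sketch, the two locations can also be picked out by the path segment they contain. The helper name `split_result_urls` is an assumption, and the approach presumes that the `/thredds/` and `/async_results/` segments seen above are always present.

```python
def split_result_urls(all_urls):
    """Return (thredds_url, async_results_url); either may be None if absent."""
    thredds = next((u for u in all_urls if '/thredds/' in u), None)
    async_results = next((u for u in all_urls if '/async_results/' in u), None)
    return thredds, async_results


# The two URLs from the response above, deliberately given in reverse order
example_all_urls = [
    'https://opendap.oceanobservatories.org/async_results/ooidatateam@gmail.com/20180226T202756-RS03ECAL-MJ03E-06-BOTPTA302-streamed-botpt_lily_sample',
    'https://opendap.oceanobservatories.org/thredds/catalog/ooi/ooidatateam@gmail.com/20180226T202756-RS03ECAL-MJ03E-06-BOTPTA302-streamed-botpt_lily_sample/catalog.html',
]
thredds_url, async_url = split_result_urls(example_all_urls)
```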


We will use this second location to programmatically check for a status.txt file to be written, containing the text 'request completed'. This indicates that the request is completed and the system has finished writing out the data to this location. This step can take anywhere from a few minutes to considerably longer for large requests; the run shown below took about 25 minutes.

In [23]:
%%time
check_complete = urls['allURLs'][1] + '/status.txt'
for i in range(1800): 
    r = requests.get(check_complete)
    if r.status_code == requests.codes.ok:
        print('request completed')
        break
    else:
        time.sleep(1)

request completed
CPU times: user 24.2 s, sys: 1.71 s, total: 25.9 s
Wall time: 24min 53s
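
The loop above treats any successful response for status.txt as completion. A hedged variant is sketched below: it also checks the file body for the marker text and reports a timeout explicitly. The function `wait_for_completion`, the injected `fetch` callable and the fake responses are assumptions made so that the example runs without network access.

```python
import time


def wait_for_completion(fetch, status_url, marker='request completed',
                        timeout=1800, interval=1.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll status_url until its body contains marker.

    fetch(url) must return a (status_code, text) pair.
    Returns True on completion and False if timeout seconds pass first.
    """
    deadline = clock() + timeout
    while True:
        status_code, text = fetch(status_url)
        if status_code == 200 and marker in text:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Fake server: two "not found" replies, then the finished status file
fake_replies = iter([(404, ''), (404, ''), (200, 'request completed')])


def fake_fetch(url):
    return next(fake_replies)


demo_done = wait_for_completion(fake_fetch, 'https://example.invalid/status.txt',
                                sleep=lambda seconds: None)
demo_timed_out = wait_for_completion(lambda url: (404, ''),
                                     'https://example.invalid/status.txt',
                                     timeout=0, sleep=lambda seconds: None)
```

In the notebook, `fetch` could wrap `requests.get` and return the response's `status_code` and `text`. A longer `interval` than one second would reduce the number of requests sent while waiting.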


### Reading the data into your notebook without downloading the data

Instead of downloading the data and then reading it into your notebook, you can also just read the data directly from the THREDDS location `urls['allURLs'][0]`.

In [25]:
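import re

url = urls['allURLs'][0]
tds_url = 'https://opendap.oceanobservatories.org/thredds/dodsC'
datasets = requests.get(url).text
# collect every href target found in the catalog page
thredds_urls = re.findall(r'href=[\'"]?([^\'" >]+)', datasets)
data_urls = []
for i in thredds_urls:
    if i[-3:] == '.nc':
        i = i[21:]  # drop the first 21 characters (here the 'catalog.html?dataset=' prefix)
        data_urls.append(i)
# join with '/' so the result is a valid URL on every operating system
datasets = ['/'.join((tds_url, i)) for i in data_urls]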
datasets

[u'https://opendap.oceanobservatories.org/thredds/dodsC/ooi/ooidatateam@gmail.com/20180226T202756-RS03ECAL-MJ03E-06-BOTPTA302-streamed-botpt_lily_sample/deployment0001_RS03ECAL-MJ03E-06-BOTPTA302-streamed-botpt_lily_sample_20180113T000000-20180226T202752.nc',
 u'https://opendap.oceanobservatories.org/thredds/dodsC/ooi/ooidatateam@gmail.com/20180226T202756-RS03ECAL-MJ03E-06-BOTPTA302-streamed-botpt_lily_sample/deployment0001_RS03ECAL-MJ03E-06-BOTPTA302-streamed-botpt_lily_sample_20170830T000000-20180112T235959.nc',
 u'https://opendap.oceanobservatories.org/thredds/dodsC/ooi/ooidatateam@gmail.com/20180226T202756-RS03ECAL-MJ03E-06-BOTPTA302-streamed-botpt_lily_sample/deployment0001_RS03ECAL-MJ03E-06-BOTPTA302-streamed-botpt_lily_sample_20170420T000000-20170829T235959.nc',
 u'https://opendap.oceanobservatories.org/thredds/dodsC/ooi/ooidatateam@gmail.com/20180226T202756-RS03ECAL-MJ03E-06-BOTPTA302-streamed-botpt_lily_sample/deployment0001_RS03ECAL-MJ03E-06-BOTPTA302-streamed-botpt_lily_samp
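
The catalog parsing can also be wrapped in a function and tried on a small HTML snippet before pointing it at the real catalog. This is a sketch only: the function name `opendap_urls` and the sample markup are assumptions, and the markup imitates the `catalog.html?dataset=` links that the cell above expects.

```python
import re

TDS_URL = 'https://opendap.oceanobservatories.org/thredds/dodsC'


def opendap_urls(catalog_html, tds_url=TDS_URL):
    """Return OPeNDAP URLs for every .nc link found in a THREDDS catalog page."""
    hrefs = re.findall(r'href=[\'"]?([^\'" >]+)', catalog_html)
    urls = []
    for href in hrefs:
        if not href.endswith('.nc'):
            continue
        # Keep only the dataset path that follows 'dataset='
        path = href.split('dataset=', 1)[-1]
        urls.append(tds_url.rstrip('/') + '/' + path.lstrip('/'))
    return urls


# Hypothetical catalog fragment with one data file, one .ncml file and one navigation link
sample_html = '''
<a href="catalog.html?dataset=ooi/user@example.com/request-id/deployment0001_example.nc">data</a>
<a href="catalog.html?dataset=ooi/user@example.com/request-id/deployment0001_example.ncml">ncml</a>
<a href="/thredds/catalog.html">up</a>
'''
sample_urls = opendap_urls(sample_html)
```

Splitting on `dataset=` avoids depending on the exact length of the prefix, at the cost of assuming that the marker appears in every data link.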

Now we have a list of files on THREDDS that xarray can read directly into the notebook. We will look at the first two files to check the x and y tilt labelling.
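
Note that the list printed above starts with the most recent file, so `datasets[0:2]` selects the two newest files. If you prefer chronological order, the start timestamp embedded in each file name can serve as a sort key. The sketch below uses the three complete file names from the output above; the helper name `start_stamp` is an assumption.

```python
import re


def start_stamp(url):
    """Return the start timestamp embedded in a file name, or '' if none is found."""
    match = re.search(r'_(\d{8}T\d{6})-\d{8}T\d{6}\.nc$', url)
    return match.group(1) if match else ''


prefix = 'deployment0001_RS03ECAL-MJ03E-06-BOTPTA302-streamed-botpt_lily_sample_'
example_files = [
    prefix + '20180113T000000-20180226T202752.nc',
    prefix + '20170830T000000-20180112T235959.nc',
    prefix + '20170420T000000-20170829T235959.nc',
]
# The timestamps sort correctly as plain strings because they are zero padded
chronological = sorted(example_files, key=start_stamp)
```

Applying `sorted(datasets, key=start_stamp)` to the full URLs works the same way, because the pattern is anchored at the end of the string.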

In [26]:
import xarray as xr
import pandas as pd

In [29]:
%%time
from dask.diagnostics import ProgressBar
with ProgressBar():
    ds = xr.open_mfdataset(datasets[0:2])
    ds = ds.swap_dims({'obs': 'time'})
    ds = ds.chunk({'time': 100})

[########################################] | 100% Completed |  3.2s
CPU times: user 1min 38s, sys: 4.33 s, total: 1min 42s
Wall time: 1min 42s


In [30]:
ds

<xarray.Dataset>
Dimensions:                      (time: 15192311)
Coordinates:
    obs                          (time) int64 dask.array<shape=(15192311,), chunksize=(100,)>
  * time                         (time) datetime64[ns] 2018-01-13 ...
    lat                          (time) float64 dask.array<shape=(15192311,), chunksize=(100,)>
    lon                          (time) float64 dask.array<shape=(15192311,), chunksize=(100,)>
Data variables:
    deployment                   (time) int32 dask.array<shape=(15192311,), chunksize=(100,)>
    id                           (time) |S64 dask.array<shape=(15192311,), chunksize=(100,)>
    compass_direction            (time) float64 dask.array<shape=(15192311,), chunksize=(100,)>
    date_time_string             (time) object dask.array<shape=(15192311,), chunksize=(100,)>
    driver_timestamp             (time) datetime64[ns] dask.array<shape=(15192311,), chunksize=(100,)>
    ingestion_timestamp          (time) datetime64[ns] dask.array<s

In [31]:
ds['lily_x_tilt']

<xarray.DataArray 'lily_x_tilt' (time: 15192311)>
dask.array<shape=(15192311,), dtype=float64, chunksize=(100,)>
Coordinates:
    obs      (time) int64 dask.array<shape=(15192311,), chunksize=(100,)>
  * time     (time) datetime64[ns] 2018-01-13 2018-01-13T00:00:01 ...
    lat      (time) float64 dask.array<shape=(15192311,), chunksize=(100,)>
    lon      (time) float64 dask.array<shape=(15192311,), chunksize=(100,)>
Attributes:
    comment:                  Seafloor High-Resolution Tilt measurements are ...
    long_name:                High-Resolution X-Tilt
    data_product_identifier:  BOTTILT-XTLT_L0
    units:                    urad
    _ChunkSizes:              10000

In [32]:
ds['lily_y_tilt']

<xarray.DataArray 'lily_y_tilt' (time: 15192311)>
dask.array<shape=(15192311,), dtype=float64, chunksize=(100,)>
Coordinates:
    obs      (time) int64 dask.array<shape=(15192311,), chunksize=(100,)>
  * time     (time) datetime64[ns] 2018-01-13 2018-01-13T00:00:01 ...
    lat      (time) float64 dask.array<shape=(15192311,), chunksize=(100,)>
    lon      (time) float64 dask.array<shape=(15192311,), chunksize=(100,)>
Attributes:
    comment:                  Seafloor High-Resolution Tilt measurements are ...
    long_name:                High-Resolution Y-Tilt
    data_product_identifier:  BOTTILT-YTLT_L0
    units:                    urad
    _ChunkSizes:              10000
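
The visual comparison above can also be expressed as a small check that the axis letter in each variable name matches its `long_name` and `data_product_identifier` attributes. This is a sketch under assumptions: the function `check_axis_labels` is hypothetical, and it takes plain dictionaries so that it runs without xarray.

```python
import re


def check_axis_labels(attrs_by_variable):
    """Return (variable, problem) pairs where attributes disagree with the axis in the name."""
    problems = []
    for name, attrs in attrs_by_variable.items():
        match = re.search(r'_([xy])_tilt$', name)
        if not match:
            continue
        axis = match.group(1).upper()
        long_name = attrs.get('long_name', '')
        identifier = attrs.get('data_product_identifier', '')
        if axis + '-Tilt' not in long_name:
            problems.append((name, 'long_name does not mention ' + axis + '-Tilt'))
        if axis + 'TLT' not in identifier:
            problems.append((name, 'identifier does not contain ' + axis + 'TLT'))
    return problems


# Attribute values copied from the two outputs above
observed = {
    'lily_x_tilt': {'long_name': 'High-Resolution X-Tilt',
                    'data_product_identifier': 'BOTTILT-XTLT_L0'},
    'lily_y_tilt': {'long_name': 'High-Resolution Y-Tilt',
                    'data_product_identifier': 'BOTTILT-YTLT_L0'},
}
observed_problems = check_axis_labels(observed)

# A deliberately swapped entry to show what a mismatch looks like
swapped_problems = check_axis_labels({
    'lily_x_tilt': {'long_name': 'High-Resolution Y-Tilt',
                    'data_product_identifier': 'BOTTILT-YTLT_L0'},
})
```

In the notebook the input could be built as `{name: ds[name].attrs for name in ('lily_x_tilt', 'lily_y_tilt')}`. An empty result only shows that the metadata is self-consistent; it does not verify the physical orientation of the sensor.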