![header](http://eurogoos.eu/wp-content/uploads/SOCIB-logo.jpg)

# SOCIB API TRAINING
<div style="text-align: right"><i> 01-Part-one-out-of-04 </i></div>

---

<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item">
    <li><span><a href="#Introduction" data-toc-modified-id="Introduction-1">
        <span class="toc-item-num">1&nbsp;&nbsp;</span>Introduction</a></span>
    </li>
    <li><span><a href="#Setup" data-toc-modified-id="Setup-2">
        <span class="toc-item-num">2&nbsp;&nbsp;</span>Setup</a></span>
        <ul class="toc-item">
            <li>
                <span><a href="#Importing-modules" data-toc-modified-id="Importing-modules-2.1">
                    <span class="toc-item-num">2.1&nbsp;&nbsp;</span>Importing modules</a></span>
            </li>
            <li>
                <span><a href="#API-token" data-toc-modified-id="API-token-2.2">
                    <span class="toc-item-num">2.2&nbsp;&nbsp;</span>API token</a></span>
            </li>
        </ul>
    <li>
        <span><a href="#Querying-SOCIB-API-for-entries/files" data-toc-modified-id="Query-SOCIB-API-for-entries/files:"><span class="toc-item-num">3&nbsp;&nbsp;</span>Querying SOCIB API for entries/files</a></span>
        <ul>
            <li><span><a href="#Generic-query" data-toc-modified-id="Generic-query"><span class="toc-item-num">3.1&nbsp;&nbsp;</span>Generic query</a></span></li>
            <li><span><a href="#Custom-query" data-toc-modified-id="Custom-query"><span class="toc-item-num">3.2&nbsp;&nbsp;</span>Custom query</a></span></li>
                <ul>
                    <li><span><a href="#Time-range" data-toc-modified-id="Time-range"><span class="toc-item-num">3.2.1&nbsp;&nbsp;</span>Time range</a></span></li>
                    <li><span><a href="#Update-time" data-toc-modified-id="Update-time"><span class="toc-item-num">3.2.2&nbsp;&nbsp;</span>Update time</a></span></li>
                    <li><span><a href="#Version-datetime" data-toc-modified-id="Version-datetime"><span class="toc-item-num">3.2.3&nbsp;&nbsp;</span>Version datetime</a></span></li>
                    <li><span><a href="#Processing-level" data-toc-modified-id="Processing-level"><span class="toc-item-num">3.2.4&nbsp;&nbsp;</span>Processing level</a></span></li>
                    <li><span><a href="#Data-mode" data-toc-modified-id="Data-mode"><span class="toc-item-num">3.2.5&nbsp;&nbsp;</span>Data mode</a></span></li>
                    <li><span><a href="#Feature-type" data-toc-modified-id="Feature-type"><span class="toc-item-num">3.2.6&nbsp;&nbsp;</span>Feature type</a></span></li>
                    <li><span><a href="#Instrument-type" data-toc-modified-id="Instrument-type"><span class="toc-item-num">3.2.7&nbsp;&nbsp;</span>Instrument type</a></span></li>
                    <li><span><a href="#Platform-type" data-toc-modified-id="Platform-type"><span class="toc-item-num">3.2.8&nbsp;&nbsp;</span>Platform type</a></span></li>
                    <li><span><a href="#Data-entity" data-toc-modified-id="Data-entity"><span class="toc-item-num">3.2.9&nbsp;&nbsp;</span>Data entity</a></span></li></ul>
        </ul>
     </li>
    <li><span><a href="#Next-tutorial" data-toc-modified-id="Next-tutorial">
        <span class="toc-item-num">4&nbsp;&nbsp;</span>Next tutorial</a></span></li>
    </ul>
</div>

---

# ENTRIES

## Introduction 

The SOCIB API is a door users can knock on to get information about the Balearic Islands Coastal Ocean Observing and Forecast System (SOCIB). The API is reached through a generic URL (the SOCIB API URL). The path elements that trigger a response when appended to that generic URL are called `ENDPOINTS`.
Among the available possibilities:
<ul>
    <li>measured variables (<span class="alert-info">/standard-variables/</span>)</li>
    <li>stock of instruments (<span class="alert-info">/instrument-types/</span>) and platforms (<span class="alert-info">/platform-types/</span>)</li>
    <li>data maturity (<span class="alert-info">/processing-levels/</span> and <span class="alert-info">/data-modes/</span>)</li>
    <li>kind of data (<span class="alert-info">/feature-types/</span>)</li>
    <li>data entities (<span class="alert-success"><b>/entries/</b></span>, <span class="alert-info">/data-sources/</span>, <span class="alert-info">/instruments/</span>, <span class="alert-info">/platforms/</span>, <span class="alert-info">/data-products/</span>)</li>

</ul>

<br>This notebook focuses on the <span class="alert-success" style=""><b>/entries/</b></span> endpoint.

---

## Setup

### Importing modules

We will rely on a set of Python modules to work with the SOCIB API.<br> `Please run the next cell` so that they are available in this Jupyter Notebook:

In [None]:
import warnings
warnings.filterwarnings("ignore")

import datetime
import json
import pandas
import requests
import IPython
from json2html import *

<div class="alert alert-block alert-warning" style="margin-left: 2em">
<b>Tip!</b>
    
***  
If importing any of them raises an error, install the missing package first by running one of the following in a dedicated cell:<ul><li><i>`!conda install packageName --yes`</i>, or</li><li><i>`!pip install packageName`</i></li></ul>

### API token

To query the SOCIB API you first need a <i>token</i> (API key).<br>To get one, please visit the [API home page](http://api.socib.es/home/) and fill in the form at the bottom. The <i>token</i> will be emailed to you.

`Please run the next cell if you want to have a glimpse of the API home page`:

In [None]:
IPython.display.HTML('<iframe src="http://api.socib.es/home/" width="100%" height="500"></iframe>')

`Please set your api_key in the next cell, then run it and the cell below to load the token and the request headers in memory for later use`:

In [None]:
api_key = '' #write here the token emailed to you

In [None]:
api_url = 'http://api.socib.es'
headers = {
    'accept': 'application/vnd.socib+json',
    'apikey': api_key,
}
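
Typing the token directly into a notebook makes it easy to share it by accident. As a minimal sketch of an alternative, the token can be read from an environment variable. The variable name `SOCIB_API_KEY` and the helper `build_headers` are hypothetical choices made for this example, not part of the SOCIB API:

```python
import os


def build_headers(token=None, env_var='SOCIB_API_KEY'):
    """Return the request headers, taking the token from the argument or the environment.

    'SOCIB_API_KEY' is a hypothetical variable name chosen for this example.
    """
    token = token or os.environ.get(env_var, '')
    if not token:
        # Fail early instead of sending a request with an empty token
        raise ValueError('No API token found: pass one or set %s' % env_var)
    return {
        'accept': 'application/vnd.socib+json',
        'apikey': token,
    }


# Example with an obviously fake token
print(build_headers(token='my-fake-token'))
```

Failing early on an empty token avoids a less obvious error later, when the first query is sent.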

<div class="alert alert-block alert-warning" style="margin-left: 2em">
<b>Tip!</b>
    
***  
If you do not remember your token or want to request a new one, please write to <i>data.center@socib.es</i> with the following `subject`: 'SOCIB API TOKEN: UPDATE/REMIND REQUEST'

---

## Querying SOCIB API for entries/files

### Generic query

In [Getting Started](../01-GettingStarted) we already saw how to query the SOCIB API for netCDF files (the basic storage entity for metocean observations). Let's `run the next cell` to recall the exact URL to target:

In [None]:
end_point = '/entries/'
url = '%s%s' % (api_url, end_point)
entries_request = requests.get(url, headers=headers)
entries_response = json.loads(entries_request.text)
print('API generic query url for entries: '+ url)
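
The cell above decodes the response body without checking whether the request succeeded, so a rejected token or a wrong URL would only show up later as a confusing error. The following is a minimal sketch of a more defensive variant. The helper name `get_json` and its `session` argument are hypothetical; `raise_for_status()` and `json()` are standard methods of a `requests` response:

```python
def get_json(url, headers, session=None, timeout=30):
    """Send a GET request and return the decoded JSON body.

    `session` is any object with a requests-like `get` method. It defaults to
    the `requests` module, imported lazily so that the helper can also be
    exercised with a stand-in object when no network is available.
    """
    if session is None:
        import requests
        session = requests
    response = session.get(url, headers=headers, timeout=timeout)
    # Raise an exception for 4xx/5xx answers instead of decoding an error page
    response.raise_for_status()
    return response.json()


# Usage with the variables defined earlier in this notebook:
# entries_response = get_json(api_url + '/entries/', headers)
```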

The response layout for any data entity (entries, data-sources, platforms and data-products) is more complex than the plain response of the other API endpoints: because of the large number of elements involved, the response to a data entity query is paginated. The response of such ENDPOINTS is therefore an object with 4 main fields:

In [None]:
pandas.DataFrame.from_dict([entries_response])

- **count** - <i>containing the total number of files/netCDFs</i>
- **results** - <i>containing a list/array of 8 files/netCDFs </i>
- **next** - <i>containing the api url to get the next 8 files/netCDFs from the total amount. None in the case of the last page.</i>
- **previous** - <i>containing the api url to get the previous 8 files/netCDFs from the total amount. None in case of the first page.</i>
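
Since each page holds 8 results, the value in **count** tells how many requests a full download would need. A small sketch of that arithmetic follows; the helper name `number_of_pages` and the totals in the loop are made up for illustration:

```python
import math


def number_of_pages(count, page_size=8):
    """Number of requests needed to retrieve `count` results at `page_size` per page."""
    return math.ceil(count / page_size)


# Hypothetical totals, for illustration only
for total in (0, 8, 9, 1000):
    print(total, 'results ->', number_of_pages(total), 'page(s)')
```

With the real API, the value to pass would be `entries_response['count']`.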

To work with these endpoints users will need to loop over all pages (by subsequently querying the urls returned in **next**) to get the complete list of entries/files and not just a sample (8 per page). An example of a loop is given next:

<div class="alert alert-block alert-warning" style="margin-left: 2em">
<b>Warning!</b>
    
***  
<b>Please do not run the loop below as it might take forever</b>. To reduce the number of pages to loop over when getting the full list of entries/files, `query parameters` must be provided. If you do not know what `query parameters` are, check the next section (**custom query**)

In [None]:
end_point = '/entries/'
url = '%s%s' % (api_url, end_point)
entries_results_buffer = []
while url is not None:
    print(url)
    entries_request = requests.get(url, headers=headers)
    entries_response = json.loads(entries_request.text)
    entries_results_buffer.extend(entries_response['results'])
    url = entries_response['next']
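
A safer way to experiment with pagination is to cap the number of pages read. The sketch below follows the **next** links as described above but stops after `max_pages`. The names `collect_results`, `fetch_page` and `max_pages` are hypothetical, and the pages are stand-ins so that the example runs without network access:

```python
def collect_results(first_url, fetch_page, max_pages=3):
    """Follow 'next' links, stopping after `max_pages` pages or at the last page.

    `fetch_page` is any callable that takes a URL and returns the decoded
    response (a dict with at least the 'results' and 'next' fields).
    """
    results = []
    url = first_url
    pages_read = 0
    while url is not None and pages_read < max_pages:
        page = fetch_page(url)
        results.extend(page['results'])
        url = page['next']
        pages_read += 1
    return results


# Stand-in pages keyed by fake URLs; with the real API one could pass
# fetch_page=lambda u: requests.get(u, headers=headers).json()
fake_pages = {
    'page1': {'results': ['a', 'b'], 'next': 'page2'},
    'page2': {'results': ['c', 'd'], 'next': 'page3'},
    'page3': {'results': ['e'], 'next': None},
}
print(collect_results('page1', fake_pages.get, max_pages=2))
```

Passing the fetching function as an argument keeps the loop logic separate from the HTTP call, which makes it easy to try out on fake data first.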

### Custom query

So far, the generic API query for entries/netCDFs has been presented. To search for specific entries/netCDFs, the generic query can be customized with so-called `query parameters`. <br>`Query parameters` are key-value pairs that, when appended to a generic URL, filter the generic response so that only the matching elements are returned. The available ones are presented below:
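
Because query parameters are plain key-value pairs, they can be kept in a dictionary and turned into a query string with the standard library. This is a minimal sketch: the helper name `build_entries_url` is hypothetical, while `processing_level` and `data_mode` are two of the parameters covered in this notebook:

```python
from urllib.parse import urlencode


def build_entries_url(api_url, **query_parameters):
    """Build an /entries/ URL from keyword arguments, skipping parameters set to None."""
    selected = {key: value for key, value in query_parameters.items() if value is not None}
    url = '%s/entries/' % api_url
    if selected:
        # urlencode joins the pairs with '&' and escapes special characters
        url = '%s?%s' % (url, urlencode(selected))
    return url


print(build_entries_url('http://api.socib.es', processing_level='L1', data_mode='dt'))
```

Note that `urlencode` also escapes characters such as `:` in datetime values (as `%3A`), which is the standard URL encoding of the same value.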

<div class="alert alert-block alert-warning" style="margin-left: 2em">
<b>Warning!</b>
    
***  
Each of the available `query parameters` is explained separately below, but remember that you can combine them in the same query.

#### Time range

Let's imagine that our target is the set of entries/files covering a certain time range. In this case, the `initial_datetime` and `end_datetime` query parameters are to be used. <br>Here is an example of how to use them:

In [None]:
end_point = '/entries/'
query_parameters = 'initial_datetime=2018-01-01T01:01:01&end_datetime=2018-12-31T23:59:59'
url = '%s%s?%s' % (api_url, end_point, query_parameters)
entries_request = requests.get(url, headers=headers)
entries_response = json.loads(entries_request.text)

print('API custom query : '+ url)
print('There are %s netCDFs/files with data between the specified dates...'%(entries_response['count']))
print('Find next a preview of the first %s ones'%(len(entries_response['results'])))
pandas.DataFrame.from_dict(entries_response['results'])
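
Writing the datetimes by hand is error-prone, so it can help to build the two parameters from `datetime` objects. A minimal sketch, assuming the same `YYYY-MM-DDTHH:MM:SS` format used in the cell above; the helper name `time_range_parameters` is hypothetical:

```python
import datetime

# Same layout as the datetimes written by hand in the cell above
DATETIME_FORMAT = '%Y-%m-%dT%H:%M:%S'


def time_range_parameters(start, end):
    """Return the query string selecting entries between two datetimes."""
    if end < start:
        raise ValueError('end must not be earlier than start')
    return 'initial_datetime=%s&end_datetime=%s' % (
        start.strftime(DATETIME_FORMAT),
        end.strftime(DATETIME_FORMAT),
    )


start = datetime.datetime(2018, 1, 1)
end = datetime.datetime(2018, 12, 31, 23, 59, 59)
print(time_range_parameters(start, end))
```

The returned string can be used as `query_parameters` in the same way as in the cell above.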

#### Update time

Let's imagine that our target is the set of entries/files updated after a specific time. In this case, the `last_update_datetime` query parameter is to be used.
<br>Here is an example of how to get all the entries updated since yesterday:

In [None]:
yesterday = (datetime.datetime.today() - datetime.timedelta(days=1)).strftime('%Y-%m-%dT%H:%M:%S')
yesterday

In [None]:
end_point = '/entries/'
query_parameters = 'last_update_datetime=%s'%(yesterday)
url = '%s%s?%s' % (api_url, end_point, query_parameters)
entries_request = requests.get(url, headers=headers)
entries_response = json.loads(entries_request.text)

print('API custom query : '+ url)
print('There are %s netCDFs/files with data updated after %s'%(entries_response['count'], yesterday))
print('Find next a preview of the first %s ones'%(len(entries_response['results'])))
pandas.DataFrame.from_dict(entries_response['results'])

#### Version datetime

NetCDFs/files might be reprocessed at some point to correct errors produced in the real-time processing data flow. When reprocessing is undertaken, a version is set for the affected netCDFs/files. By default, the API returns the latest version of the files but, if a user would like to retrieve an older version, the `version_datetime` query parameter is to be used.
<br>Here is an example of how to retrieve the version of the netCDFs/files that was valid at a given time:

In [None]:
oneyearago = (datetime.datetime.today() - datetime.timedelta(days=365)).strftime('%Y-%m-%dT%H:%M:%S')
oneyearago

In [None]:
end_point = '/entries/'
query_parameters = 'version_datetime=%s'%(oneyearago)
url = '%s%s?%s' % (api_url, end_point, query_parameters)
entries_request = requests.get(url, headers=headers)
entries_response = json.loads(entries_request.text)

print('API custom query : '+ url)
print('There are %s netCDFs/files in the version valid at %s'%(entries_response['count'], oneyearago))
print('Find next a preview of the first %s ones'%(len(entries_response['results'])))
pandas.DataFrame.from_dict(entries_response['results'])

#### Processing level

As explained in [Getting Started](../01-GettingStarted), SOCIB produces files at several processing levels. Let's imagine that our target is to retrieve only entries/files with a certain level of processing. In this case, the `processing_level` query parameter is to be used.
<br>Here is an example of how to retrieve only those entries matching a given processing level: <br>

In [None]:
end_point = '/entries/'
query_parameters = 'processing_level=L1'
url = '%s%s?%s' % (api_url, end_point, query_parameters)
entries_request = requests.get(url, headers=headers)
entries_response = json.loads(entries_request.text)

print('API custom query : '+ url)
print('There are %s netCDFs/files matching such processing level'%(entries_response['count']))
print('Find next a preview of the first %s ones'%(len(entries_response['results'])))
pandas.DataFrame.from_dict(entries_response['results'])

#### Data mode

As explained in [Getting Started](../01-GettingStarted), SOCIB produces files in several data modes. Let's imagine that our target is to retrieve only entries/files with a certain data mode. In this case, the `data_mode` query parameter is to be used.
<br>Here is an example of how to retrieve only those entries matching a given data mode: <br>

In [None]:
end_point = '/entries/'
query_parameters = 'data_mode=dt'
url = '%s%s?%s' % (api_url, end_point, query_parameters)
entries_request = requests.get(url, headers=headers)
entries_response = json.loads(entries_request.text)

print('API custom query : '+ url)
print('There are %s netCDFs/files matching such data mode'%(entries_response['count']))
print('Find next a preview of the first %s ones'%(len(entries_response['results'])))
pandas.DataFrame.from_dict(entries_response['results'])

#### Feature type

As explained in [Getting Started](../01-GettingStarted), SOCIB distinguishes different feature types, or geometrical representations of the data collected. Let's imagine that our target is to retrieve only entries/files of a certain feature type. In this case, the `feature_type` query parameter is to be used.
<br>Here is an example of how to retrieve only those entries matching a given feature type: <br>

In [None]:
end_point = '/entries/'
query_parameters = 'feature_type=grid'
url = '%s%s?%s' % (api_url, end_point, query_parameters)
entries_request = requests.get(url, headers=headers)
entries_response = json.loads(entries_request.text)

print('API custom query : '+ url)
print('There are %s netCDFs/files matching such feature type'%(entries_response['count']))
print('Find next a preview of the first %s ones'%(len(entries_response['results'])))
pandas.DataFrame.from_dict(entries_response['results'])

#### Instrument type

As explained in [Getting Started](../01-GettingStarted), the data in the netCDFs has been reported by a certain instrument and platform ensemble. Let's imagine that we are interested in retrieving only those entries/files that contain data involving a certain type of instrument. In this case, the `instrument_type` query parameter is to be used.
<br>Here is an example of how to retrieve only those entries with data involving a certain instrument type: <br>

In [None]:
end_point = '/entries/'
query_parameters = 'instrument_type=Currentmeter'
url = '%s%s?%s' % (api_url, end_point, query_parameters)
entries_request = requests.get(url, headers=headers)
entries_response = json.loads(entries_request.text)

print('API custom query : '+ url)
print('There are %s netCDFs/files matching such instrument type'%(entries_response['count']))
print('Find next a preview of the first %s ones'%(len(entries_response['results'])))
pandas.DataFrame.from_dict(entries_response['results'])

#### Platform type

As explained in [Getting Started](../01-GettingStarted), the data in the netCDFs has been reported by a certain instrument and platform ensemble. Let's imagine that we are interested in retrieving only those entries/files that contain data involving a certain type of platform. In this case, the `platform_type` query parameter is to be used.
<br>Here is an example of how to retrieve only those entries with data involving a certain platform type: <br>

In [None]:
end_point = '/entries/'
query_parameters = 'platform_type=Weather Station'
url = '%s%s?%s' % (api_url, end_point, query_parameters)
entries_request = requests.get(url, headers=headers)
entries_response = json.loads(entries_request.text)

print('API custom query : '+ url)
print('There are %s netCDFs/files matching such platform type'%(entries_response['count']))
print('Find next a preview of the first %s ones'%(len(entries_response['results'])))
pandas.DataFrame.from_dict(entries_response['results'])
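
The platform type used above contains a space, which is not a valid character in a URL and has to be percent-encoded before the request is sent. Encoding it explicitly makes the printed URL unambiguous. This is a minimal sketch with the standard library: the helper name `encode_query` is hypothetical, and since this notebook does not state whether the API also accepts `+` for spaces, `%20` is used here:

```python
from urllib.parse import quote, urlencode


def encode_query(parameters):
    """Encode a dict of query parameters, writing spaces as %20 rather than '+'."""
    return urlencode(parameters, quote_via=quote)


query_parameters = encode_query({'platform_type': 'Weather Station'})
print(query_parameters)
```

The resulting string can be appended to the `/entries/` URL after the `?`, exactly like the hand-written `query_parameters` in the cell above.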

#### Data entity

As explained in [Getting Started](../01-GettingStarted), the netCDFs can be grouped into collections for better management: data-products (sharing the same research audience), data-sources (sharing the same instrument and platform ensemble) or platforms (sharing the same platform). To retrieve the entries of one of these collections, the `data_product`, `data_source` and `platform` query parameters are to be used, passing the ID of the entity as the value.
<br>Here is an example of how to retrieve only those entries with data involving a certain platform: <br>

In [None]:
end_point = '/entries/'
query_parameters = 'platform=Buoy_CanalDeIbiza'
url = '%s%s?%s' % (api_url, end_point, query_parameters)
entries_request = requests.get(url, headers=headers)
entries_response = json.loads(entries_request.text)

print('API custom query : '+ url)
print('There are %s netCDFs/files matching such platform'%(entries_response['count']))
print('Find next a preview of the first %s ones'%(len(entries_response['results'])))
pandas.DataFrame.from_dict(entries_response['results'])

---

## Next tutorial

<div class="alert alert-block alert-success" style="margin-left: 2em">
<b>More!</b>
    
***  
To learn much more about SOCIB entries, see the next dedicated notebooks:
<ul>
    <li><span><a href="02-entry-viewers.ipynb">02-entry-viewers</a></span></li>
    <li><span><a href="03-entry-services.ipynb">03-entry-services</a></span></li>
    <li><span><a href="04-entry-data.ipynb">04-entry-data</a></span></li>
</ul>