# CKAN API

http://docs.ckan.org/en/ckan-2.6.5/api/index.html
For the CKAN DataStore
https://docs.ckan.org/en/2.8/maintaining/datastore.html#the-datastore-api

EMN data hubs use **<font color='steelblue'>CKAN version 2.6</font>**.
Note: as of 2019-04-08, an upgrade to 2.8 is in progress.

### API Tools

##### Postman

A sandbox for REST APIs. It also generates code snippets in several languages. Good for testing!

* Download: https://www.getpostman.com/
* CKAN API Postman collection: https://github.com/EMN-Data/ckan-api-postman

##### Python

**ckanapi** CLI

https://github.com/ckan/ckanapi

**ckanapi** python package

https://github.com/ckan/ckanapi#ckanapi-python-module

manually with requests

http://docs.python-requests.org/en/master/

`pip install requests`

manually with urllib (part of the Python standard library, so no installation is needed)

https://docs.python.org/3/library/urllib.html

In [None]:
import json
import pprint as pp
import urllib.request as ur
import urllib.parse as up

# CKAN Structure Overview

The EMN data hubs are built on a [CKAN framework](http://docs.ckan.org/en/ckan-2.7.3/user-guide.html).

The CKAN **web application** has a hierarchical layout. From top to bottom:

* ##### Projects and Sub-Projects
* ##### Datasets
* ##### Resources

The CKAN **API** has the same structure, but **datasets are called _packages_**, and a **project** is often synonymous with a **_group_ or _organization_**.

The CKAN API documentation does not explicitly mention projects, but the documentation for _groups_ and _organizations_ applies to projects.

# API Use Cases

* Get existing projects
* Get user's specific projects
* Get a project's details
* Query specific data records from a CSV resource
* More to come later...

The API can also be used to modify or upload datasets. We are currently building an application that lets researchers upload programmatically, particularly for loading multiple files into existing or new datasets.

# Building Requests

Each request follows the same format:

`<datahub>/api/3/action/<action>`

> Examples
>
> `https://datahub.duramat.org/api/3/action/project_list`
> `https://datahub.duramat.org/api/3/action/resource_show`

### Actions

Actions also follow a common format: they start with the entity and end with a verb. Entities are things like **project**, **package**, **resource**, **datastore**, and **revision** (plus many others). Verbs are **list**, **show**, **create**, **update**, **patch**, **search**, and **delete**. Together these give you a **`project_list`** or a **`resource_show`**, or, if you need to delete a dataset, **`package_delete`**.

The helper below will generate a URI for an **<action\>** that we will use in each request.

In [None]:
# build url pieces
emn_datahub = 'https://datahub.duramat.org'

# Helper to build a URI for a given API action
def action(a):
    return '{}/api/3/action/{}'.format(emn_datahub, a)

action('project_list')

### Get your API token

Most API calls require your API token. In general, any time you need to create or modify content, an API token is required. To read or download **public** datasets, no API token is required.

1. Log in to the data hub
2. Click on your user name in the top ribbon
3. Your API token is at the bottom of the left column

In [None]:
# Set your API token
api_token = 'c649045a-9d7a-4a5e-a200-bd667ba0b67a'
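Pasting a token into a notebook makes it easy to share by accident. As a minimal sketch, the token could instead be read from an environment variable. The variable name `CKAN_API_TOKEN` and the helper `load_api_token` are assumptions made for this example, not something the data hub defines.

```python
import os


def load_api_token(env_var='CKAN_API_TOKEN'):
    """Return the token stored in the given environment variable, or None.

    The variable name is a hypothetical choice for this sketch.
    """
    token = os.environ.get(env_var, '').strip()
    return token or None


# None means no token is set, which is still enough for public reads
token_from_env = load_api_token()
```

If `token_from_env` is not `None`, it can be used anywhere `api_token` appears in the cells that follow.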

# CKAN API responses

Every CKAN response will be a JSON object.

Example **success** response:

```json
{
    "success": true,
    "result": <array|object>,
    "help": ""
}
```

Example **error** response:

```json
{
    "success": false,
    "error": "",
    "help": ""
}
```
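Because both shapes share the `success` flag, one small helper can check every response. The sketch below is one way to do that; `CkanApiError` and `unwrap_ckan_response` are names invented for this example. Note that an error may also arrive with a non-2xx HTTP status, in which case `urllib` raises `HTTPError` before a helper like this sees the body.

```python
import json


class CkanApiError(Exception):
    """Raised when a CKAN response body reports "success": false."""


def unwrap_ckan_response(raw_body):
    """Parse a CKAN JSON response and return its "result" value.

    Raises CkanApiError, carrying the "error" value, if the call failed.
    """
    payload = json.loads(raw_body)
    if not payload.get('success', False):
        raise CkanApiError(payload.get('error'))
    return payload['result']
```

In the walkthrough cells, this could stand in for the `json.loads(...)` call followed by `data['result']`.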

# Walkthrough

## Projects

Note that projects are called _organizations_ or _groups_ in the CKAN API documentation.

### project_list

http://docs.ckan.org/en/ckan-2.6.5/api/index.html#ckan.logic.action.get.organization_list

In [None]:
request = ur.Request(action('project_list')) 

The CKAN API is RESTful. For the most part, it uses GET and POST. An authorization header is required for both types of requests. POST requests require an additional header to tell the API to expect JSON in the body of the request.

In [None]:
# Using only authorization for the GET header
request.add_header('Authorization', api_token)

Complete the GET request by sending it and then parsing the response body, which is a JSON document.

In [None]:
response = ur.urlopen(request)
data = json.loads(response.read().decode('utf-8'))   
pp.pprint (data['result'], indent=4)
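The cells above cover a GET request. For a POST, the parameters travel as JSON in the request body and need the extra content-type header mentioned earlier. The following is a rough sketch of how such a request might be assembled with `urllib`; `build_post_request` is a name made up for this example, the token is a placeholder, and nothing is sent until the request is passed to `ur.urlopen`.

```python
import json
import urllib.request as ur


def build_post_request(url, payload, token):
    """Build (but do not send) a POST request with a JSON body."""
    body = json.dumps(payload).encode('utf-8')
    request = ur.Request(url, data=body, method='POST')
    request.add_header('Authorization', token)
    # Tell the API to expect JSON in the body of the request
    request.add_header('Content-Type', 'application/json')
    return request


example_post = build_post_request(
    'https://datahub.duramat.org/api/3/action/project_list_for_user',
    {'permission': 'create_dataset'},
    'your-api-token',  # placeholder, not a real token
)
print(example_post.get_method(), example_post.get_header('Content-type'))
```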

### project_list_for_user

Alternatively, use `project_list_for_user` to get a list of the projects in which you have permission to perform a given action. For this one we need to build a dictionary of parameters and URL-encode it into the query string of the request.

http://docs.ckan.org/en/ckan-2.6.5/api/index.html#ckan.logic.action.get.organization_list_for_user

In [None]:
# Make dictionary of parameters and encode them into uri string
params = {'permission': 'create_dataset'}
param_string = up.urlencode( params )
#Build composite url for request
url = action('project_list_for_user') + '?' + param_string
# https://datahub.duramat.org/api/3/action/project_list_for_user
request = ur.Request(url)
# Add authorization
request.add_header('Authorization', api_token)
response = ur.urlopen(request)
data = json.loads(response.read().decode('utf-8'))   
pp.pprint (data['result'], indent=4)
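The GET cells in this walkthrough repeat the same steps: build the action URL, append the encoded parameters, and add the authorization header. As a sketch, those steps could be folded into one helper. `build_get_request` and `DATAHUB` are names chosen for this example, and the token is a placeholder.

```python
import urllib.parse as up
import urllib.request as ur

DATAHUB = 'https://datahub.duramat.org'


def build_get_request(action_name, params=None, token=None):
    """Build (but do not send) a GET request for a CKAN action."""
    url = '{}/api/3/action/{}'.format(DATAHUB, action_name)
    if params:
        # Encode the parameter dictionary into the query string
        url += '?' + up.urlencode(params)
    request = ur.Request(url)
    if token:
        request.add_header('Authorization', token)
    return request


example_get = build_get_request(
    'project_list_for_user',
    {'permission': 'create_dataset'},
    token='your-api-token',  # placeholder, not a real token
)
print(example_get.full_url)
```

The returned object can be passed to `ur.urlopen` exactly as in the cells above.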

### project_show

http://docs.ckan.org/en/ckan-2.6.5/api/index.html#ckan.logic.action.get.organization_show

The next example requires a project ID. I have chosen a public dataset from the Data Hub: the Enphase Micro-Inverter sub-project, by Bill Marion, under PV Field Data.

To find the project ID, click on the project name in the "project tree" on the left of the page. The ID is found at the upper right, above the project description.

In [None]:
project_id = '6448b09a-f253-4e5c-9a97-e9e88b628df5'
params = {
    'id': project_id,
    'include_datasets': True,
    'include_users': False
}
param_string = up.urlencode( params )
#Build composite url for request
url = action('project_show') + '?' + param_string
# https://datahub.duramat.org/api/3/action/project_show
request = ur.Request(url)
# Add authorization
request.add_header('Authorization', api_token)
response = ur.urlopen(request)
data = json.loads(response.read().decode('utf-8'))   
pp.pprint (data['result'], indent=4)

### Datasets and Data Files/Resources

Remember, datasets are called packages in the CKAN API documentation. Searching a data file for particular values requires the DataStore API, a subset of the CKAN API:

https://docs.ckan.org/en/2.8/maintaining/datastore.html#the-datastore-api

The data resource can be queried for particular values if it has been pushed into the DataStore. This is part of the advantage of storing data files as CSV whenever possible. The process for accessing the data is similar to what has been done above.

In this case we are using the `datastore_search` action. I will target a public dataset in the same Enphase Micro-Inverter sub-project and search the Denver Info file (a CSV) for all Canadian Solar records. Note: many more parameters can be used in the query; they are listed in the documentation under `ckanext.datastore.logic.action.datastore_search`.

We will need the resource ID for the file. It can be found either by searching the project's datasets (using the results from above) or by clicking on the resource and then clicking the green DATA API button at the top right. The examples shown by the Data API button include the UUID for the resource.

In [None]:
resource_id='31c4a04d-5bbc-4776-963b-a2d8fab994c2'
params = {
    'resource_id': resource_id,
    'q': 'Canadian Solar'
}
param_string = up.urlencode( params )
#Build composite url for request
url = action('datastore_search') + '?' + param_string
# https://datahub.duramat.org/api/3/action/datastore_search
request = ur.Request(url)
# Add authorization
request.add_header('Authorization', api_token)
response = ur.urlopen(request)
data = json.loads(response.read().decode('utf-8'))   
pp.pprint (data['result'], indent=4)
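The query above uses only `q`, a full-text search. `datastore_search` also accepts parameters such as `filters`, `limit`, and `offset`. The sketch below shows how a URL using them might be built. The helper name `datastore_search_url` is an invention for this example, and the column name `Manufacturer` is a guess, so check the actual field names of the resource. It assumes that `filters` is passed as a JSON-encoded string when it is sent in a query string.

```python
import json
import urllib.parse as up


def datastore_search_url(base_url, resource_id, q=None, filters=None,
                         limit=100, offset=0):
    """Build a datastore_search URL with optional search and paging parameters."""
    params = {'resource_id': resource_id, 'limit': limit, 'offset': offset}
    if q:
        params['q'] = q
    if filters:
        # Assumption: exact-match filters go in the query string as JSON
        params['filters'] = json.dumps(filters)
    return '{}/api/3/action/datastore_search?{}'.format(
        base_url, up.urlencode(params))


example_url = datastore_search_url(
    'https://datahub.duramat.org',
    '31c4a04d-5bbc-4776-963b-a2d8fab994c2',
    filters={'Manufacturer': 'Canadian Solar'},  # hypothetical column name
    limit=5,
)
print(example_url)
```

Increasing `offset` by `limit` on each call is a simple way to page through a large resource.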