# CKAN API

https://docs.ckan.org/en/2.8/api/index.html

EMN data hubs use **<font color='steelblue'>CKAN version 2.6</font>**.

### API Tools

##### Postman

A sandbox for REST APIs. It also generates code snippets in several languages.

* Download: https://www.getpostman.com/
* CKAN API Postman collection: https://github.com/EMN-Data/ckan-api-postman

##### Python

**ckanapi** CLI

https://github.com/ckan/ckanapi

**ckanapi** Python package

https://github.com/ckan/ckanapi#ckanapi-python-module

Manual HTTP calls with **requests**

http://docs.python-requests.org/en/master/

`pip install requests==2.18.4`

In [None]:
import requests
import json

# CKAN Structure Overview

The EMN data hubs are built on a [CKAN framework](http://docs.ckan.org/en/ckan-2.7.3/user-guide.html).

The CKAN **web application** has a hierarchical layout. From top to bottom:

* ##### Projects
* ##### Datasets
* ##### Resources

The CKAN **API** has the same structure, but **datasets are called _packages_**, and **project** is often synonymous with **_group_ or _organization_**.

The CKAN API documentation does not explicitly mention projects, but the documentation for _groups_ and _organizations_ applies to them.

# API Use Cases

* Get existing projects, datasets (_packages_), and resources
* Edit dataset or resource level metadata
* Add new resources to datasets
* Add new datasets to projects

# Building Requests

Each request follows the same format:

`<datahub>/api/3/action/<action>`

> Examples
>
> `https://datahub.h2awsm.org/api/3/action/project_list`
> `https://datahub.h2awsm.org/api/3/action/resource_show`

### Actions

Actions also follow a common format: they start with the entity and end with a verb. Entities are things like **project**, **package**, **resource**, and **revision** (plus many others). Verbs are **list**, **show**, **create**, **update**, **patch**, and **delete**. Together they give you actions such as **`project_list`** or **`resource_show`**, or, if you need to delete a dataset, **`package_delete`**.

The helper below will generate a URI for an **<action\>** that we will use in each request.

In [None]:
emn_datahub = 'https://datahub.h2awsm.org'

# Helper to build a URI for a given API action
action = lambda a: '{}/api/3/action/{}'.format(emn_datahub, a)

action('project_list')
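The entity-plus-verb naming pattern can also be spelled out in code. The sketch below is a hypothetical variant of the helper above: `action_url` and its `entity`, `verb`, and `datahub` parameters are illustrative names, not part of CKAN or the data hub.

```python
# Hypothetical helper: compose an action name from an entity and a verb,
# then build the full URI in the <datahub>/api/3/action/<action> format.
EMN_DATAHUB = 'https://datahub.h2awsm.org'


def action_url(entity, verb, datahub=EMN_DATAHUB):
    """Return '<datahub>/api/3/action/<entity>_<verb>'."""
    return '{}/api/3/action/{}_{}'.format(datahub.rstrip('/'), entity, verb)


print(action_url('project', 'list'))
print(action_url('resource', 'show'))
```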

### Get your API token

Most API calls require your API token. In general, any time you need to create or modify content, an API token is required. To read or download **public** datasets, no API token is required.

1. Log in to the data hub
2. Click on your user name in the top ribbon
3. Your API token is at the bottom of the left column

In [None]:
# Set your API token
api_token = 'fadc3218-8a6c-46a5-be7b-b93983620f2f'

The CKAN API is RESTful. For the most part, it uses GET and POST. An authorization header is required for both types of requests (except when reading public content, as noted above). POST requests require an additional header to tell the API to expect JSON in the body of the request.

In [None]:
# Include these headers in GET requests
get_headers = {
    'authorization': api_token
}

# Include these headers in POST requests
post_headers = {
    'authorization': api_token,
    'content-type': 'application/json;charset=utf-8'
}
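If you would rather not paste a token into the notebook, one option is to read it from an environment variable. This is only a sketch: `CKAN_API_TOKEN` is an assumed variable name and `build_headers` is a hypothetical helper; neither comes from CKAN or the data hub.

```python
import os


def build_headers(api_token, json_body=False):
    """Return request headers; add a JSON content type for POST bodies."""
    headers = {'authorization': api_token}
    if json_body:
        headers['content-type'] = 'application/json;charset=utf-8'
    return headers


# CKAN_API_TOKEN is an assumed name; export it in your shell beforehand.
# Falls back to an empty string when the variable is not set.
token_from_env = os.environ.get('CKAN_API_TOKEN', '')

headers_for_get = build_headers(token_from_env)
headers_for_post = build_headers(token_from_env, json_body=True)
```

The two dictionaries have the same shape as `get_headers` and `post_headers` above and could be passed to `requests` in their place.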

# CKAN API responses

Every CKAN response will be a JSON object.

Example **success** response:

```json
{
    "success": true,
    "result": <array|object>,
    "help": ""
}
```

Example **error** response:

```json
{
    "success": false,
    "error": "",
    "help": ""
}
```
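Because both shapes carry a `success` flag, a small wrapper can check it before reading `result`. The helper below is a minimal sketch; `unwrap` is a hypothetical name and the sample payloads are made up to match the shapes shown above.

```python
def unwrap(payload):
    """Return payload['result'] on success; raise RuntimeError otherwise."""
    if payload.get('success'):
        return payload['result']
    raise RuntimeError('CKAN API error: {}'.format(payload.get('error')))


# Made-up payloads shaped like the success and error responses
ok = {'success': True, 'result': ['project-a', 'project-b'], 'help': ''}
failed = {'success': False, 'error': 'Not found', 'help': ''}

print(unwrap(ok))
try:
    unwrap(failed)
except RuntimeError as err:
    print(err)
```

Calling `unwrap(response.json())` instead of `response.json()['result']` would surface the API's error content rather than a `KeyError` when a call fails.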

# Walkthrough

## Projects

Remember, projects are called _organizations_ or _groups_ in the CKAN API documentation.

### project_list

http://docs.ckan.org/en/ckan-2.6.5/api/index.html#ckan.logic.action.get.organization_list

In [None]:
# https://datahub.h2awsm.org/api/3/action/project_list
response = requests.get(action('project_list'), 
                        headers=get_headers)

projects = response.json()['result']

print(json.dumps(projects, indent=2))

### project_list_for_user

Alternatively, use `project_list_for_user` to get a list of the projects in which you have permission to perform a given action.

http://docs.ckan.org/en/ckan-2.6.5/api/index.html#ckan.logic.action.get.organization_list_for_user

In [None]:
params = {'permission': 'create_dataset'}

# https://datahub.h2awsm.org/api/3/action/project_list_for_user
response = requests.get(action('project_list_for_user'), 
                        headers=get_headers, 
                        params=params)

my_projects = response.json()['result']

print(json.dumps(my_projects, indent=2))

In [None]:
project_id = [project['id'] for project in my_projects if project['title'] == 'API Demo'][0]

print(project_id)
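Indexing with `[0]` raises an `IndexError` when no project has the requested title. A more forgiving lookup might look like the sketch below; `find_project_id` is a hypothetical helper and the sample list is made up.

```python
def find_project_id(projects, title):
    """Return the id of the first project with the given title, or None."""
    return next((p['id'] for p in projects if p.get('title') == title), None)


# Made-up entries with the same keys used in the cell above
sample_projects = [
    {'id': 'id-1', 'title': 'Another Project'},
    {'id': 'id-2', 'title': 'API Demo'},
]

print(find_project_id(sample_projects, 'API Demo'))       # id-2
print(find_project_id(sample_projects, 'No Such Title'))  # None
```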

### project_show

http://docs.ckan.org/en/ckan-2.6.5/api/index.html#ckan.logic.action.get.organization_show

In [None]:
params = {
    'id': project_id,
    'include_datasets': True,
    'include_users': False
}

# https://datahub.h2awsm.org/api/3/action/project_show
response = requests.get(action('project_show'), 
                        headers=get_headers, 
                        params=params)

project = response.json()['result']

print(json.dumps(project, indent=2))
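For GET requests, `requests` turns the `params` dictionary into a URL query string. The sketch below approximates that step with the standard library's `urlencode`, using a placeholder project id, so you can see roughly what URL is sent. The exact encoding `requests` applies may differ in edge cases.

```python
from urllib.parse import urlencode

# 'example-project-id' is a placeholder, not a real project id
example_params = {
    'id': 'example-project-id',
    'include_datasets': True,
    'include_users': False
}

base_url = 'https://datahub.h2awsm.org/api/3/action/project_show'

# Booleans are converted with str(), so True becomes the text 'True'
full_url = '{}?{}'.format(base_url, urlencode(example_params))
print(full_url)
```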

## Datasets

Remember, datasets are called _packages_ in the CKAN API documentation.

### package_create

http://docs.ckan.org/en/ckan-2.6.5/api/index.html#ckan.logic.action.create.package_create

In [None]:
# owner_org = project id
dataset_metadata = {
    'name': 'api-walkthrough',
    'owner_org': project_id,
    'maintainer_email': 'nick.wunder@nrel.gov',
    'institution': 'National Renewable Energy Laboratory'
}

# https://datahub.h2awsm.org/api/3/action/package_create
response = requests.post(action('package_create'), 
                         headers=post_headers, 
                         data=json.dumps(dataset_metadata))

new_dataset = response.json()['result']

print(json.dumps(new_dataset, indent=2))
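The `package_create` documentation says a dataset `name` must be 2 to 100 characters long and contain only lowercase alphanumeric characters, `-`, and `_`. A sketch of a helper that derives such a name from a free-form title follows; `to_dataset_name` is a hypothetical function, and the data hub may enforce additional rules not covered here.

```python
import re


def to_dataset_name(title):
    """Lowercase the title and replace runs of disallowed characters with '-'."""
    name = re.sub(r'[^a-z0-9_-]+', '-', title.lower()).strip('-')
    if not 2 <= len(name) <= 100:
        raise ValueError('dataset name must be 2-100 characters: {!r}'.format(name))
    return name


print(to_dataset_name('API Walkthrough'))  # api-walkthrough
```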

## Resources

### resource_create

http://docs.ckan.org/en/ckan-2.6.5/api/index.html#ckan.logic.action.create.resource_create

#### Single file upload

In [None]:
file_name = '/Users/nwunder2/Projects/emn/ckan-api-demo/data/charlotte_perkins_gilman_the_yellow_walpaper.txt'

resource_metadata = {
    'package_id': new_dataset['id'],
    'name': 'charlotte_perkins_gilman_the_yellow_walpaper.txt'
}

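# The upload is sent as multipart form data, so use get_headers
# (no JSON content type). The with block closes the file afterwards.
with open(file_name, 'rb') as f:
    response = requests.post(action('resource_create'),
                             data=resource_metadata,
                             headers=get_headers,
                             files=[('upload', f)])

new_resource = response.json()['result']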

print(json.dumps(new_resource, indent=2))

#### Multiple file upload

In [None]:
def upload(file_name, resource_metadata):
    """Upload one file as a new resource and return the parsed JSON response."""
    print('Uploading \n{}'.format(json.dumps(resource_metadata, indent=2)))
    # The with block closes the file once the upload finishes
    with open(file_name, 'rb') as f:
        r = requests.post(action('resource_create'),
                          data=resource_metadata,
                          headers=get_headers,
                          files=[('upload', f)])
    print('Status: {}\n'.format(r.status_code))
    return r.json()

In [None]:
from os import listdir

path = '/Users/nwunder2/Projects/emn/ckan-api-demo/data/books'

# Skip hidden files (names starting with a dot)
file_names = [f for f in listdir(path) if not f.startswith('.')]
print(file_names)

In [None]:
resources_metadata = []

for file_name in file_names:
    resource_metadata = {
        'package_id': new_dataset['id'],
        'name': file_name,
        'data_tool': 'multi-spectra'
    }
    resources_metadata.append(resource_metadata)

new_resources = []

for resource_metadata in resources_metadata:
    # upload() returns the full JSON response, not just its 'result'
    new_resource_obj = upload('{}/{}'.format(path, resource_metadata['name']),
                              resource_metadata=resource_metadata)
    new_resources.append(new_resource_obj)
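Each item collected by the loop above is a full response payload, so it carries its own `success` flag. One way to summarize a batch is sketched below; `split_results` is a hypothetical helper and the sample payloads are made up.

```python
def split_results(responses):
    """Separate CKAN response payloads into created resources and errors."""
    created = [r['result'] for r in responses if r.get('success')]
    errors = [r.get('error') for r in responses if not r.get('success')]
    return created, errors


# Made-up payloads standing in for the list built by the upload loop
sample_responses = [
    {'success': True, 'result': {'id': 'res-1', 'name': 'a.txt'}},
    {'success': False, 'error': 'upload failed'},
]

created, errors = split_results(sample_responses)
print('{} created, {} failed'.format(len(created), len(errors)))
```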

### resource_patch

http://docs.ckan.org/en/ckan-2.6.5/api/index.html#ckan.logic.action.patch.resource_patch

# <font color="red">¡CAUTION!</font> patch versus update

There are two ways to update existing content, **update** and **patch**. 

**The patch endpoint is almost always preferred over update** because it lets you change only what you need. The update call resets any properties or metadata not included in the request; patch keeps existing values unless the request explicitly changes them.
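The difference can be modeled locally with plain dictionaries. This is a simplified illustration of the behavior described above, not CKAN's implementation; `simulate_update` and `simulate_patch` are hypothetical names, and a real server may preserve some system-managed fields on update.

```python
def simulate_update(existing, changes):
    """Model of update: the stored record becomes exactly what was sent."""
    return dict(changes)


def simulate_patch(existing, changes):
    """Model of patch: sent fields overwrite, all other fields are kept."""
    merged = dict(existing)
    merged.update(changes)
    return merged


existing = {'id': 'res-1', 'name': 'notes.txt', 'description': 'old text'}
changes = {'id': 'res-1', 'description': 'new text'}

print(simulate_update(existing, changes))  # 'name' is gone
print(simulate_patch(existing, changes))   # 'name' is kept
```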

In [None]:
resource_metadata = {
    'id': new_resource['id'],
    'description': '## The Yellow Wallpaper\n### by Charlotte Perkins Gilman\n_A splendid short story._'
}

response = requests.post(action('resource_patch'),
                         data=json.dumps(resource_metadata),
                         headers=post_headers)

modified_resource = response.json()['result']

print(json.dumps(modified_resource, indent=2))