# DATA.GOV.IE API USAGE DEMO


The objective of this notebook is to show developers how the DATA.GOV.IE API can be used to fetch organizations, datasets and their details for analysis or dashboard purposes. The API actions and their parameters are described in detail at: https://docs.ckan.org/en/2.8/api/ .

This notebook covers the following:

<ul>
<li>Demo on how to use different API calls:
    <ul>
    <li> organization_list </li>
    <li> organization_show </li>
    <li> package_list </li>
    <li> package_show </li>
    </ul>
</li>
</ul>


<b>Please note: all available parameter options for the different API calls are listed at: https://docs.ckan.org/en/2.8/api/ .</b>
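
As a minimal sketch of how the calls in this notebook are addressed: each action is reached at `/api/3/action/<action_name>` on the site, and its parameters are sent as a query string. The helper name `build_action_url` is an assumption made for illustration and is not part of the API. It uses only the standard library, so the resulting URL can be inspected without sending a request.

```python
from urllib.parse import urlencode


def build_action_url(base_url, action, params=None):
    """Build the URL for an API action call (hypothetical helper for illustration)."""
    url = base_url.rstrip('/') + '/api/3/action/' + action
    if params:
        # Parameters are appended as an ordinary query string.
        url += '?' + urlencode(params)
    return url


print(build_action_url('https://data.gov.ie', 'organization_list'))
print(build_action_url('https://data.gov.ie/', 'package_show',
                       {'id': 'railway-stations-osi-national-250k-map-of-ireland'}))
```

Passing the same dictionary to `requests.get(url, params=...)` has the same effect; building the URL by hand is mainly useful for debugging.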

**Install requirements**

In [1]:
!pip install fuzzywuzzy PyHamcrest python-Levenshtein pyjstat plotly seaborn matplotlib



#### Import all necessary libraries

In [2]:
import requests
from fuzzywuzzy import fuzz
from fuzzywuzzy import process
import sys
import pandas as pd
from pyjstat import pyjstat
import numpy as np
import plotly.plotly as py  # plotly < 4 only; this module moved to chart_studio.plotly in plotly 4+
import plotly.graph_objs as go
import matplotlib.pyplot as plt
import seaborn as sns
import collections
from IPython.display import display, HTML

## Demo on using API action organization_list:

This action returns the list of organizations, optionally with brief details for each one. organization_list can also be used to find an organization's ID given its name, and it accepts a number of optional parameters. A demo of both uses is given below:

Note: Details on which parameters can be used are given at: https://docs.ckan.org/en/2.8/api/ .

#### Demo on fetching organization id given name:

This uses fuzzy matching and returns the ID of the organization whose name best matches the given one. To get the list of organizations we use action/organization_list.

Case 1: Calling the API with default parameters

Case 2: Calling the API with parameters

The organization name to search for is: <b>"Ordance Survey Ireland"</b>. Note that the name is misspelled, so only a fuzzy match can find it.
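
To illustrate the idea of fuzzy matching on its own, here is a minimal offline sketch. It uses the standard library's `difflib.SequenceMatcher` as a stand-in for `fuzz.ratio`, so the scores run from 0 to 1 and will not be identical to fuzzywuzzy's 0 to 100 scores. The function name `best_match`, the normalisation step and the third sample ID are assumptions made for illustration.

```python
from difflib import SequenceMatcher


def best_match(query, candidates, threshold=0.75):
    """Return the candidate most similar to query, or None if nothing is close enough."""
    if not candidates:
        return None

    def normalise(text):
        # Compare case-insensitively and treat hyphens in IDs as spaces.
        return text.lower().replace('-', ' ')

    scored = [(SequenceMatcher(None, normalise(query), normalise(c)).ratio(), c)
              for c in candidates]
    score, name = max(scored)
    return name if score >= threshold else None


# The first two IDs appear in this notebook; the third is purely illustrative.
sample_ids = ['ordnance-survey-ireland', 'central-statistics-office', 'example-county-council']
print(best_match('Ordance Survey Ireland', sample_ids))
```

The threshold plays the same role as the cut-off of 75 used in the functions of this notebook: below it, the best candidate is rejected rather than returned.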


##### Case 1:

In [3]:
def find_organization(url, org_name):
    """Return the ID of the organization whose name best matches org_name."""
    try:
        # With default parameters, organization_list returns a plain list of organization names.
        response = requests.get(url + '/api/3/action/organization_list')
        results = response.json()

        if not results['success']:
            raise SystemError

        if len(results['result']) == 0:
            return "No matching Organization found"

        # Score every organization name against the search string (0-100).
        ratio = {name: fuzz.ratio(org_name, name) for name in results['result']}
        best_name = max(ratio, key=ratio.get)

        # Accept the best candidate only if it is a reasonably close match.
        if ratio[best_name] >= 75:
            return {'Id': best_name}

        return "No matching Organization found"

    except SystemError:
        print("Failure in request - bad action/url")
        sys.exit(1)

    except Exception as e:
        print(e)
        sys.exit(1)
        
find_organization('https://data.gov.ie', 'Ordance Survey Ireland')

{'Id': 'ordnance-survey-ireland'}
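
The functions in this notebook call `sys.exit(1)` when a request fails, which is abrupt inside a notebook session. As a hedged alternative, the sketch below checks the `success` flag of a decoded response and raises an exception instead. The names `ApiActionError` and `unwrap_result` are hypothetical, and the shape of the sample error dictionary is an assumption about what a failed call returns.

```python
class ApiActionError(Exception):
    """Raised when a response reports success = False (hypothetical name)."""


def unwrap_result(payload):
    """Return payload['result'], or raise ApiActionError if the call failed."""
    if not payload.get('success'):
        # Assumption: failed calls carry an 'error' entry describing the problem.
        raise ApiActionError(payload.get('error', 'unknown error'))
    return payload['result']


ok = {'success': True, 'result': ['ordnance-survey-ireland']}
failed = {'success': False, 'error': {'message': 'Not found'}}

print(unwrap_result(ok))
try:
    unwrap_result(failed)
except ApiActionError as exc:
    print('request failed:', exc)
```

A caller can then decide whether to retry, report the error or stop, rather than having the helper end the whole session.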

##### Case 2: With parameters

In [4]:
def organization_details_title_status(url, org_name, param): # Action: organization_list
    """Return the full record of the organization whose display name best matches org_name."""
    try:
        # With all_fields=True, organization_list returns one dictionary per organization.
        response = requests.get(url, params=param)
        results = response.json()

        if not results['success']:
            raise SystemError

        if len(results['result']) == 0:
            return "No matching Organization found"

        # Score each organization's display name against the search string (0-100).
        ratio = {index: fuzz.ratio(org_name, org['display_name'])
                 for index, org in enumerate(results['result'])}
        best_index = max(ratio, key=ratio.get)

        # Accept the best candidate only if it is a reasonably close match.
        if ratio[best_index] >= 75:
            return results['result'][best_index]

        return "No matching Organization found"

    except SystemError:
        print("Failure in request - bad action/url")
        sys.exit(1)

    except Exception as e:
        print(e)
        sys.exit(1)
        

param = {'all_fields': True}
res = organization_details_title_status('https://data.gov.ie/api/3/action/organization_list', 'ordnance survey ireland', param)
print(res)
print("\n")
print(res.keys())

{'display_name': 'Ordnance Survey Ireland', 'description': '', 'image_display_url': '', 'package_count': 155, 'created': '2018-03-05T11:22:58.866536', 'name': 'ordnance-survey-ireland', 'is_organization': True, 'state': 'active', 'image_url': '', 'type': 'organization', 'title': 'Ordnance Survey Ireland', 'revision_id': '3922dff9-b761-45ac-a46c-20eb581ea7ec', 'num_followers': 0, 'id': 'c2f170ca-63d0-4498-9e81-759827708e97', 'approval_status': 'approved'}


dict_keys(['display_name', 'description', 'image_display_url', 'package_count', 'created', 'name', 'is_organization', 'state', 'image_url', 'type', 'title', 'revision_id', 'num_followers', 'id', 'approval_status'])
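
Note that each record carries both a `name` (the readable slug returned in Case 1) and an `id` (a UUID). As a small sketch, the records returned with `all_fields` can be turned into a lookup table from one to the other. The function name `index_by_name` is an assumption; the sample record is trimmed from the output above.

```python
def index_by_name(organizations):
    """Map each organization's 'name' (slug) to its 'id' (UUID)."""
    return {org['name']: org['id'] for org in organizations}


# Trimmed copy of the record printed above.
sample = [{'name': 'ordnance-survey-ireland',
           'id': 'c2f170ca-63d0-4498-9e81-759827708e97',
           'display_name': 'Ordnance Survey Ireland',
           'package_count': 155}]

lookup = index_by_name(sample)
print(lookup['ordnance-survey-ireland'])
```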


## Demo on using API action organization_show:

This API call gives the details of an organization: its status, number of resources, packages/datasets, type of datasets and more. A demo is given below:

In [5]:
def organisation_details(url, param): # Action: organization_show
    """Return (full organization record, summary dict) for the organization in param['id']."""
    try:
        # Use the function argument here, not the global variable `params`.
        response = requests.get(url + 'organization_show', params=param)
        results = response.json()

        if not results['success']:
            raise SystemError

        if len(results['result']) == 0:
            raise Exception("Empty result returned for organization_show")

        org = results['result']

        # Keep only the fields of interest for a quick summary.
        res = {'no_of_users': len(org['users']),
               'packages_count': org['package_count'],
               'num_followers': org['num_followers'],
               'display_name': org['display_name'],
               'approval_status': org['approval_status'],
               'id': org['id']}

        return org, res

    except SystemError:
        print("Failure in request - bad organization name. "
              "Hint: Get organization ID from find organization function above")
        sys.exit(1)

    except Exception as e:
        print(e)
        sys.exit(1)


params = {'id': 'ordnance-survey-ireland', 'include_users': True, 'include_dataset_count': True,
          'include_groups': True, 'include_tags': True, 'include_datasets': True}
com_res, res = organisation_details('https://data.gov.ie/api/3/action/', params)
print("Extracted Organization Details:\n\n", res)

print("\n\n****************\n\n")
print("Available datasets/packages - \n")
print(com_res['packages'])

print("\n\n\n****************\n\n\n")
print("Complete Details of the API response - \n")
print(com_res)

Extracted Organization Details:

 {'no_of_users': 1, 'packages_count': 155, 'num_followers': 0, 'display_name': 'Ordnance Survey Ireland', 'approval_status': 'approved', 'id': 'c2f170ca-63d0-4498-9e81-759827708e97'}


****************


Available datasets/packages - 

[{'owner_org': 'c2f170ca-63d0-4498-9e81-759827708e97', 'maintainer': None, 'issued': '2016-10-11', 'private': False, 'maintainer_email': None, 'num_tags': 5, 'contact_name': 'OSI Data Contact', 'id': '3054fb53-b1b4-4148-b2ac-11c99524327d', 'metadata_created': '2015-10-20T10:33:12.746697', 'metadata_modified': '2018-10-19T04:49:01.444400', 'author': None, 'author_email': None, 'theme': 'Environment', 'state': 'active', 'relationships_as_object': [], 'license_id': 'cc-by', 'contact_phone': '-', 'updated': '2016-09-01', 'num_resources': 6, 'title': 'Baronies - OSi National Placenames Gazetteer', 'contact_email': 'custserv@osi.ie', 'groups': [], 'creator_user_id': 'ef08264c-676d-48c5-aaad-6ce8aa18d521', 'relationships_as_subj
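
For dashboard purposes, the `packages` list can be aggregated with `collections.Counter`. The following is a minimal offline sketch: the first sample record reuses fields from the output above, while the second record and the function name `count_by` are hypothetical.

```python
from collections import Counter


def count_by(packages, field):
    """Count how many packages share each value of the given field."""
    return Counter(pkg.get(field) for pkg in packages)


sample_packages = [
    # Fields taken from the first package in the output above.
    {'title': 'Baronies - OSi National Placenames Gazetteer',
     'theme': 'Environment', 'license_id': 'cc-by', 'num_resources': 6},
    # Hypothetical record, added only to make the counts non-trivial.
    {'title': 'Hypothetical dataset',
     'theme': 'Transport', 'license_id': 'cc-by', 'num_resources': 2},
]

print(count_by(sample_packages, 'theme'))
print(count_by(sample_packages, 'license_id'))
print('total resources:', sum(pkg['num_resources'] for pkg in sample_packages))
```

With the real response, `com_res['packages']` could be passed in place of `sample_packages`.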

## Demo on extracting the dataset/package details using API (package_show):

Let us consider the package 'railway-stations-osi-national-250k-map-of-ireland', which belongs to Ordnance Survey Ireland (organization id 'c2f170ca-63d0-4498-9e81-759827708e97', obtained from organization_show), and explore its details.

This gives the details of the resources and their formats for a given package id:

In [6]:
def dataset_details(url, param):
    
    try:
        
        response = requests.get(url+'api/3/action/package_show', param)
        results = response.json()
        
        if not results['success']:
            raise SystemError

        if len(results['result']) == 0:
            raise Exception

        else:
            
            res_pkg = {
                   'maintainer': results['result']['maintainer'],
                   'num_resources': results['result']['num_resources'], 
                   'resource_formats': [x['format'] for x in results['result']['resources']],
                   'resource_star_rating': results['result']['qa']['openness_score'],
                   'owner_org': results['result']['owner_org']
                  }
            
            return res_pkg, results['result']
            
    except SystemError:
        print("Failure in request - bad package id. "\
              "Hint: Get the package id from the 'packages' list returned by organization_show")
        sys.exit(1)
        
    except Exception as e:
        print(e)
        sys.exit(1)
    
    
param = {'id': 'railway-stations-osi-national-250k-map-of-ireland'}
res_pkg, comp_res_pkg = dataset_details('https://data.gov.ie/', param)

print("Extracted Details:\n\n", res_pkg)
#print("\n\n*************\n\n")
#print("Complete dataset or package details - \n")
#print(comp_res_pkg)

Extracted Details:

 {'maintainer': None, 'num_resources': 6, 'resource_formats': ['HTML', 'Esri REST', 'GeoJSON', 'CSV', 'KML', 'ZIP'], 'resource_star_rating': 3, 'owner_org': 'c2f170ca-63d0-4498-9e81-759827708e97'}
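
The summary above indexes the response directly, so a missing key raises a `KeyError`. As a hedged sketch, and assuming that some packages may lack the `qa` block or other fields, the same summary can be built defensively with `dict.get`. The function name `summarise_package` and the trimmed sample record are illustrative.

```python
def summarise_package(package):
    """Build the same summary as above while tolerating missing keys."""
    qa = package.get('qa') or {}  # assumption: 'qa' may be absent or empty
    return {
        'maintainer': package.get('maintainer'),
        'num_resources': package.get('num_resources', 0),
        'resource_formats': [r.get('format') for r in package.get('resources', [])],
        'resource_star_rating': qa.get('openness_score'),
        'owner_org': package.get('owner_org'),
    }


# Trimmed sample without a 'qa' block.
sample = {'maintainer': None, 'num_resources': 1,
          'resources': [{'format': 'CSV'}],
          'owner_org': 'c2f170ca-63d0-4498-9e81-759827708e97'}
print(summarise_package(sample))
```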


## Demo on extracting one of the resources (CSV) through the API (package_show):

The function below shows how to extract one of the resources (e.g. the one in CSV format) from a selected package. In this case the selected package is 'railway-stations-osi-national-250k-map-of-ireland', owned by the organization with id 'c2f170ca-63d0-4498-9e81-759827708e97'.

In [7]:
def extract_pkg_data(url, pkg_id):
    
    try:
        
        param = {'id': pkg_id}
        response = requests.get(url, param)
        results = response.json()
        
        if not results['success']:
            raise SystemError

        if len(results['result']) == 0:
            return ("No package found")
        
        # Index 3 is the CSV resource of this particular package; other packages may order their resources differently.
        dataset = pd.read_csv(results['result']['resources'][3]['url'])
        
        return dataset
    
    except SystemError:
        
        print("Request Failure, please check the URL or the parameters")
        sys.exit(1)
        
    except Exception as e:
        
        print(e)
        sys.exit(1)
        
dt_rwly_stn = extract_pkg_data('https://data.gov.ie/api/3/action/package_show', 'railway-stations-osi-national-250k-map-of-ireland')
dt_rwly_stn.head()

Unnamed: 0,X,Y,OBJECTID,FCsubtype,NAMN1
0,-8.487385,54.271232,1,1,Sligo
1,-8.494182,54.186333,2,1,Collooney
2,-9.159261,54.113052,3,1,Ballina
3,-8.520787,54.08797,4,1,Ballymote
4,-6.411718,54.000091,5,1,Dundalk
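
The function above relies on the CSV resource being at index 3, which holds for this package but not in general. A minimal sketch of selecting a resource by its `format` field instead is shown below. The function name `find_resource_url` is an assumption and the sample URLs are placeholders, not real resources.

```python
def find_resource_url(package, wanted_format):
    """Return the URL of the first resource in the given format, or None."""
    for resource in package.get('resources', []):
        # Compare case-insensitively; a missing format is treated as an empty string.
        if (resource.get('format') or '').upper() == wanted_format.upper():
            return resource.get('url')
    return None


# Placeholder data shaped like the 'resources' list of a package_show result.
sample_package = {'resources': [
    {'format': 'HTML', 'url': 'https://example.org/stations.html'},
    {'format': 'GeoJSON', 'url': 'https://example.org/stations.geojson'},
    {'format': 'CSV', 'url': 'https://example.org/stations.csv'},
]}

print(find_resource_url(sample_package, 'csv'))
```

The returned URL could then be passed to `pd.read_csv`, after checking that it is not `None`.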


## Demo on listing all available packages on DATA.GOV.IE (package_list):

Calling this action without parameters is not recommended, because it returns the entire list of packages on the site:

In [8]:
response = requests.get('https://data.gov.ie/api/3/action/package_list')
results = response.json()
# Check that the call succeeded before using the result.
assert results['success'] is True
print(results['result'])
print('API Status: ', results['success'])

API Status:  True
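
According to the API documentation linked at the top, package_list accepts `limit` and `offset` parameters, so the list can be fetched in pages rather than all at once. The sketch below shows the paging loop. It takes the page-fetching function as an argument so that it runs offline; `iter_package_names`, `fake_fetch` and the sample names are assumptions made for illustration.

```python
def iter_package_names(fetch_page, page_size=100):
    """Yield package names page by page until an empty page is returned."""
    offset = 0
    while True:
        page = fetch_page(limit=page_size, offset=offset)
        if not page:
            return
        yield from page
        offset += len(page)


# Offline stand-in for the real request. With requests, a page could be fetched as:
#   requests.get(url, params={'limit': limit, 'offset': offset}).json()['result']
fake_names = ['dataset-%d' % i for i in range(7)]


def fake_fetch(limit, offset):
    return fake_names[offset:offset + limit]


print(list(iter_package_names(fake_fetch, page_size=3)))
```

Because the function is a generator, a caller can stop early once enough names have been seen, without requesting the remaining pages.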
