# Experimenting with the ONS API

Starting here: https://developer.ons.gov.uk/tour/getting-started/#results-section

In [2]:
import pandas as pd

## Query the datasets endpoint

The 'getting started' page suggests getting a list of datasets "using the [list datasets](https://api.beta.ons.gov.uk/v1/datasets) endpoint."


In [None]:
#store the url
datasetsurl = "https://api.beta.ons.gov.uk/v1/datasets"
#fetch the json from that url
datasets = pd.read_json(datasetsurl)
#show
datasets

Unnamed: 0,@context,count,items,limit,offset,total_count
0,https://cdn.ons.gov.uk/assets/json-ld/context....,20,"{'contacts': [{'email': '+44 (0)1329 444661', ...",20,0,43
1,https://cdn.ons.gov.uk/assets/json-ld/context....,20,"{'contacts': [{'email': 'pop.info@ons.gov.uk',...",20,0,43
2,https://cdn.ons.gov.uk/assets/json-ld/context....,20,"{'contacts': [{'email': 'pop.info@ons.gov.uk',...",20,0,43
3,https://cdn.ons.gov.uk/assets/json-ld/context....,20,"{'contacts': [{'email': 'pop.info@ons.gov.uk',...",20,0,43
4,https://cdn.ons.gov.uk/assets/json-ld/context....,20,"{'contacts': [{'email': 'pop.info@ons.gov.uk',...",20,0,43
5,https://cdn.ons.gov.uk/assets/json-ld/context....,20,"{'contacts': [{'email': 'pop.info@ons.gov.uk',...",20,0,43
6,https://cdn.ons.gov.uk/assets/json-ld/context....,20,"{'contacts': [{'email': 'pop.info@ons.gov.uk',...",20,0,43
7,https://cdn.ons.gov.uk/assets/json-ld/context....,20,{'contacts': [{'email': 'health.data@ons.gov.u...,20,0,43
8,https://cdn.ons.gov.uk/assets/json-ld/context....,20,{'contacts': [{'email': 'health.data@ons.gov.u...,20,0,43
9,https://cdn.ons.gov.uk/assets/json-ld/context....,20,{'contacts': [{'email': 'vacancy.survey@ons.go...,20,0,43
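
The `count`, `limit`, `offset` and `total_count` columns suggest the response is paginated: this page holds 20 of the 43 datasets. A minimal sketch of how the remaining pages might be addressed, assuming the endpoint accepts `limit` and `offset` as query parameters (a guess based on the field names, not something tested here). `page_urls` is a hypothetical helper, not part of the API.

```python
from urllib.parse import urlencode

DATASETS_URL = "https://api.beta.ons.gov.uk/v1/datasets"

def page_urls(total_count, limit, base_url=DATASETS_URL):
    """Build one URL per page of results (hypothetical helper)."""
    # ASSUMPTION: the API accepts 'limit' and 'offset' query parameters
    return [
        f"{base_url}?{urlencode({'limit': limit, 'offset': offset})}"
        for offset in range(0, total_count, limit)
    ]

# values taken from the response above: 43 datasets, 20 per page
for url in page_urls(total_count=43, limit=20):
    print(url)
```

Each URL could then be read in the same way as the first page.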


## Drill down into 'items'

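That's not very easy to understand: `pd.read_json` has turned each top-level key of the JSON into a column, repeating single values such as `count` and `total_count` on every row, while each cell of `items` holds a whole nested record for one dataset. So we need to drill down into one of the branches.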

In [None]:
#drill down into 'items'
datasets['items']

0     {'contacts': [{'email': '+44 (0)1329 444661', ...
1     {'contacts': [{'email': 'pop.info@ons.gov.uk',...
2     {'contacts': [{'email': 'pop.info@ons.gov.uk',...
3     {'contacts': [{'email': 'pop.info@ons.gov.uk',...
4     {'contacts': [{'email': 'pop.info@ons.gov.uk',...
5     {'contacts': [{'email': 'pop.info@ons.gov.uk',...
6     {'contacts': [{'email': 'pop.info@ons.gov.uk',...
7     {'contacts': [{'email': 'health.data@ons.gov.u...
8     {'contacts': [{'email': 'health.data@ons.gov.u...
9     {'contacts': [{'email': 'vacancy.survey@ons.go...
10    {'contacts': [{'email': 'faster.indicators@ons...
11    {'contacts': [{'email': 'cpi@ons.gov.uk', 'nam...
12    {'contacts': [{'email': 'pop.info@ons.gov.uk',...
13    {'contacts': [{'email': 'health.data@ons.gov.u...
14    {'contacts': [{'email': 'health.data@ons.gov.u...
15    {'contacts': [{'email': 'regionalgdp@ons.gov.u...
16    {'contacts': [{'email': 'QualityOfLife@ons.gov...
17    {'contacts': [{'email': 'hpi@ons.gov.uk', 

## You need to iterate

If we try to drill down further we get an error.

In [None]:
#drill down into 'items' > 'title'
datasets['items']['title']

KeyError: ignored

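This is because `datasets['items']` holds one record per row (in the JSON, 'items' is a list), so pandas looks for a row labelled 'title' and doesn't find one. We need to iterate through the rows instead.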

In [None]:
#loop through 'items' and print the 'title' of each one
for i in datasets['items']:
  print(i['title'])

Local authority ageing statistics, based on annual mid-year population estimates
Local authority ageing statistics, population projections for older people
Local authority ageing statistics, older people economic activity
Local authority ageing statistics, net internal migration people aged 65 and over and 85 and over
Local authority ageing statistics, sex ratios for people aged 65 and over and 85 and over
Local authority ageing statistics, household projections for older people
Local authority ageing statistics, projected sex ratios for older people
Deaths registered weekly in England and Wales by age and sex
Deaths registered weekly in England and Wales by region
Faster Indicators - Online Job Advert Estimates
Coronavirus and the latest indicators for the UK economy and society: Shipping indicators
Consumer Prices Index including owner occupiers' housing costs (CPIH)
Population Estimates for UK, England and Wales, Scotland and Northern Ireland
Death registrations and occurrences by h

...or index:

In [None]:
#drill down into the first item in 'items' and then its 'title'
datasets['items'][0]['title']

'Local authority ageing statistics, based on annual mid-year population estimates'
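
Looping and indexing both work on one record at a time. As a sketch of two ways to handle all the records at once: a list comprehension to collect the titles, and `pd.json_normalize` to flatten nested records into columns. `sample_items` is a hypothetical two-record stand-in for the API's 'items' list, trimmed to a few fields; pairing the CPIH title with the `cpih01` ID is an assumption made for illustration.

```python
import pandas as pd

# hypothetical sample shaped like two entries of the API's 'items' list
sample_items = [
    {
        "id": "ageing-population-estimates",
        "title": "Local authority ageing statistics, based on annual mid-year population estimates",
        "links": {"self": {"href": "https://api.beta.ons.gov.uk/v1/datasets/ageing-population-estimates"}},
    },
    {
        "id": "cpih01",
        "title": "Consumer Prices Index including owner occupiers' housing costs (CPIH)",
        "links": {"self": {"href": "https://api.beta.ons.gov.uk/v1/datasets/cpih01"}},
    },
]
items = pd.Series(sample_items)

# collect every title in one go
titles = [item["title"] for item in items]
print(titles)

# flatten nested dicts into columns with dotted names, e.g. 'links.self.href'
flat = pd.json_normalize(sample_items)
print(flat[["id", "links.self.href"]])
```

With the real data, `list(datasets['items'])` could be passed to `pd.json_normalize` in place of `sample_items`.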

Some keys, such as 'links', have further branches:

In [None]:
#drill down into the first item in 'items' and then 'links'
datasets['items'][0]['links']

{'editions': {'href': 'https://api.beta.ons.gov.uk/v1/datasets/ageing-population-estimates/editions'},
 'latest_version': {'href': 'https://api.beta.ons.gov.uk/v1/datasets/ageing-population-estimates/editions/time-series/versions/1',
  'id': '1'},
 'self': {'href': 'https://api.beta.ons.gov.uk/v1/datasets/ageing-population-estimates'},
 'taxonomy': {'href': 'https://api.beta.ons.gov.uk/v1/peoplepopulationandcommunity/birthsdeathsandmarriages/ageing'}}

In [None]:
#drill down into the first item in 'items', then 'links' > 'editions' > 'href'
datasets['items'][0]['links']['editions']['href']

'https://api.beta.ons.gov.uk/v1/datasets/ageing-population-estimates/editions'
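
Chained lookups like this raise an error as soon as one key is missing. A minimal sketch of a more forgiving lookup follows; `dig` is a hypothetical helper name, and the `links` dict is copied from part of the output above.

```python
def dig(nested, *keys, default=None):
    """Follow keys (or list indexes) through nested dicts and lists.

    Returns `default` as soon as any step is missing (hypothetical helper).
    """
    current = nested
    for key in keys:
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            return default
    return current

# part of the 'links' branch shown above
links = {
    "editions": {"href": "https://api.beta.ons.gov.uk/v1/datasets/ageing-population-estimates/editions"},
    "latest_version": {
        "href": "https://api.beta.ons.gov.uk/v1/datasets/ageing-population-estimates/editions/time-series/versions/1",
        "id": "1",
    },
}

print(dig(links, "editions", "href"))
print(dig(links, "no_such_key", "href", default="not found"))
```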

## Get the latest version

The latest version can be found at the path given below:

In [None]:
#drill down into the first item in 'items', then 'links' > 'latest_version' > 'href'
datasets['items'][0]['links']['latest_version']['href']

'https://api.beta.ons.gov.uk/v1/datasets/ageing-population-estimates/editions/time-series/versions/1'
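
That URL packs three pieces of information into its path: the dataset ID, the edition and the version number. A rough sketch of pulling them apart, assuming every `latest_version` link follows the same `/datasets/{id}/editions/{edition}/versions/{version}` pattern as this one; `parse_version_href` is a hypothetical helper.

```python
from urllib.parse import urlparse

def parse_version_href(href):
    """Split a latest_version href into its parts (hypothetical helper)."""
    # ASSUMPTION: the path always looks like
    # /v1/datasets/{id}/editions/{edition}/versions/{version}
    parts = urlparse(href).path.strip("/").split("/")
    # parts -> ['v1', 'datasets', id, 'editions', edition, 'versions', version]
    return {"dataset_id": parts[2], "edition": parts[4], "version": parts[6]}

href = "https://api.beta.ons.gov.uk/v1/datasets/ageing-population-estimates/editions/time-series/versions/1"
print(parse_version_href(href))
```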

## Get a specific dataset

The [next page in the tour](https://developer.ons.gov.uk/tour/latest-release/) mentions the get dataset endpoint.

This isn't [documented](https://developer.ons.gov.uk/dataset/datasets-id/) massively clearly - when it says `/datasets/{id}` what it *means* is that you need to add `/datasets/` to the end of the API's base URL, followed by the ID code for the dataset you want.

What is the base URL? Well the example [given in the tour](https://developer.ons.gov.uk/tour/latest-release/) is `https://api.beta.ons.gov.uk/v1/datasets/cpih01` so we can break this into 3 parts and guess that: 

1. `https://api.beta.ons.gov.uk/v1` is the base part of the API
2. `/datasets/` is the *endpoint*: the kind of resource being requested.
3. `cpih01`, then, must be the ID of the example dataset.
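
The three guessed parts can be put back together in code. This is only a sketch of the guess above: `dataset_url` is a hypothetical helper, and the base URL is the one inferred from the tour's example.

```python
# base URL as guessed from the tour's example
BASE_URL = "https://api.beta.ons.gov.uk/v1"

def dataset_url(dataset_id, base_url=BASE_URL):
    """Join the base URL, '/datasets/' and a dataset ID (hypothetical helper)."""
    return f"{base_url}/datasets/{dataset_id}"

print(dataset_url("cpih01"))
```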

This is all just guesswork that we need to test. First let's try to fetch that example URL.

In [3]:
#store the url
testurl = "https://api.beta.ons.gov.uk/v1/datasets/cpih01"
#read it
testdata = pd.read_json(testurl)
testdata

ValueError: ignored

Not a good start. 

## Reading from the JSON file locally

After a lot of googling around and trial and error, I came across [this Stack Overflow thread](https://stackoverflow.com/questions/16573332/jsondecodeerror-expecting-value-line-1-column-1-char-0) which contained some code that worked - if the file is local.

We start off by importing the `json` library so we can use the `json.loads()` function.

In [7]:
#import json library
import json

In [18]:
#store the path
json_file_path = "cpih01.json"

#load the json at that path
with open(json_file_path, 'r') as j:
    contents = json.loads(j.read())
#show the contents
contents

{'@context': 'https://cdn.ons.gov.uk/assets/json-ld/context.json',
 'contacts': [{'email': 'cpi@ons.gov.uk',
   'name': 'Chris Payne and Philip Gooding',
   'telephone': '+44 (0)1633 456900'}],
 'description': "CPIH is the most comprehensive measure of inflation. It extends CPI to include a measure of the costs associated with owning, maintaining and living in one's own home, known as owner occupiers' housing costs (OOH), along with council tax. This dataset provides CPIH time series (2005 to latest published month), allowing users to customise their own selection, view or download.",
 'id': 'cpih01',
 'keywords': ['Inflation'],
 'license': 'Open Government Licence v3.0',
 'links': {'editions': {'href': 'https://api.beta.ons.gov.uk/v1/datasets/cpih01/editions'},
  'latest_version': {'href': 'https://api.beta.ons.gov.uk/v1/datasets/cpih01/editions/time-series/versions/15',
   'id': '15'},
  'self': {'href': 'https://api.beta.ons.gov.uk/v1/datasets/cpih01'},
  'taxonomy': {'href': 'https

That at least tells us we can work with the JSON locally. 

This means, as a workaround, we can [add an extra step](https://stackoverflow.com/questions/22676/how-to-download-a-file-over-http) that downloads the JSON to a local file first.

In [21]:
import urllib.request
urllib.request.urlretrieve("https://api.beta.ons.gov.uk/v1/datasets/cpih01", "thejson.json")

('thejson.json', <http.client.HTTPMessage at 0x7f95d8ed4d50>)

In [24]:
#store the path
json_file_path = "thejson.json"

#load the json at that path
with open(json_file_path, 'r') as j:
    contents = json.loads(j.read())
#show the URL of the latest version
contents['links']['latest_version']['href']

'https://api.beta.ons.gov.uk/v1/datasets/cpih01/editions/time-series/versions/15'
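
The read step can also be written with `json.load()`, which parses straight from the open file rather than from a string. A small sketch follows; so that it runs without a download, it first writes a hypothetical two-field stand-in for the file to a temporary folder.

```python
import json
import os
import tempfile

# hypothetical stand-in for the downloaded file, trimmed to two fields
sample = {"id": "cpih01", "keywords": ["Inflation"]}

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "thejson.json")
    # write the sample so the example runs without a download
    with open(path, "w", encoding="utf-8") as f:
        json.dump(sample, f)
    # json.load reads and parses from the open file in one step
    with open(path, "r", encoding="utf-8") as f:
        contents = json.load(f)

print(contents["id"])
```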

## From the URL

Meanwhile, back to the URL approach, where [this post](https://python.plainenglish.io/from-api-to-pandas-getting-json-data-with-python-df127f699b6b) turns out to have the answer:

In [36]:
# import some libraries to fetch URLs and deal with certificates
import urllib3
import certifi

#create a connection pool manager that verifies HTTPS certificates
http = urllib3.PoolManager(
       cert_reqs='CERT_REQUIRED',
       ca_certs=certifi.where())

In [37]:
url = "https://api.beta.ons.gov.uk/v1/datasets/cpih01"
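r = http.request('GET', url)
# r.status holds the HTTP status code (200 means success); it is not
# displayed here because only a cell's last expression is shown
r.status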

# decode json data into a dict object
data = json.loads(r.data.decode('utf-8'))
data

{'@context': 'https://cdn.ons.gov.uk/assets/json-ld/context.json',
 'contacts': [{'email': 'cpi@ons.gov.uk',
   'name': 'Chris Payne and Philip Gooding',
   'telephone': '+44 (0)1633 456900'}],
 'description': "CPIH is the most comprehensive measure of inflation. It extends CPI to include a measure of the costs associated with owning, maintaining and living in one's own home, known as owner occupiers' housing costs (OOH), along with council tax. This dataset provides CPIH time series (2005 to latest published month), allowing users to customise their own selection, view or download.",
 'id': 'cpih01',
 'keywords': ['Inflation'],
 'license': 'Open Government Licence v3.0',
 'links': {'editions': {'href': 'https://api.beta.ons.gov.uk/v1/datasets/cpih01/editions'},
  'latest_version': {'href': 'https://api.beta.ons.gov.uk/v1/datasets/cpih01/editions/time-series/versions/15',
   'id': '15'},
  'self': {'href': 'https://api.beta.ons.gov.uk/v1/datasets/cpih01'},
  'taxonomy': {'href': 'https
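
For comparison, a sketch of the same fetch-and-decode step using only the standard library. The earlier `urlretrieve` call suggests the standard library can reach the API, but this version has not been run against it, so treat that as an assumption. `fetch_json` and `fake_opener` are hypothetical names; the fake opener stands in for the network so the example runs offline.

```python
import io
import json
import urllib.request

def fetch_json(url, opener=urllib.request.urlopen):
    """Fetch a URL and parse the body as JSON (hypothetical helper)."""
    # 'opener' is swappable so the function can be tried without a network call
    with opener(url) as response:
        return json.load(response)

def fake_opener(url):
    # stand-in response: a trimmed version of the cpih01 record shown above
    return io.BytesIO(b'{"id": "cpih01", "keywords": ["Inflation"]}')

data = fetch_json("https://api.beta.ons.gov.uk/v1/datasets/cpih01", opener=fake_opener)
print(data["id"])
```

Dropping the `opener` argument would make the call go to the real URL.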