# Open Energy Shared Dataset Retrieval example

This Jupyter Notebook uses Open Energy Search to identify a Shared dataset that requires authorisation using [Open Energy access control](https://docs.openenergy.org.uk/main/access_control_specification.html) in order to download it.

Shared data is one of the [three classes of data](https://icebreakerone.org/open-shared-closed/) (Open, Shared, Closed) in the Open Energy ecosystem, and denotes data that may be used by members of the Open Energy ecosystem subject to conditions set by the data owner.

You or your organisation must have registered for an Open Energy account and created access credentials before using this notebook. This process is detailed in another Jupyter notebook: [Setting up a shared data connection](https://colab.research.google.com/github/icebreakerone/open-energy-python-infrastructure/blob/main/examples/jupyter/setting_up_a_shared_data_connection.ipynb).

For an example of using Open Energy Search to find and access Open datasets, see [Open Energy Open Dataset Retrieval Example](https://colab.research.google.com/github/icebreakerone/open-energy-python-infrastructure/blob/main/examples/jupyter/open_dataset_retrieval.ipynb).

## Install dependencies

In [None]:
!pip install ib1.openenergy.support pandas geopandas matplotlib rtree
!oe_install_cacerts

## Set up connection to Open Energy CKAN Server

In [None]:
from ckanapi import RemoteCKAN
ua = 'openenergyexample/1.0'
oeserver = RemoteCKAN('https://search.openenergy.org.uk', user_agent=ua)

## Search for term

In [None]:
search_term = 'bis headquarters'
search_results = oeserver.action.package_search(q=search_term)

## Show table of results

In [None]:
import pandas as pd
# Flatten one level of nesting, then keep only the columns of interest
columns = ('organization.title', 'title', 'license_title', 'num_resources', 'id')
search_results_df = pd.json_normalize(search_results['results'], max_level=1).filter(items=columns)
display(search_results_df)
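
The dotted column name `organization.title` is produced by `pd.json_normalize`, which flattens nested dictionaries. The following is a minimal sketch of that behaviour using a made-up record; `sample_results` and all of its values are invented for illustration and are not real search results.

```python
import pandas as pd

# Hypothetical record shaped like one entry of search_results['results'];
# every value here is invented for illustration.
sample_results = [
    {
        "id": "example-id",
        "title": "Example dataset",
        "license_title": "Example licence",
        "num_resources": 2,
        "organization": {"title": "Example Org", "name": "example-org"},
    }
]

# max_level=1 flattens one level of nesting, producing dotted column names
flat = pd.json_normalize(sample_results, max_level=1)

# filter(items=...) keeps only the listed columns, in the order given
columns = ["organization.title", "title", "license_title", "num_resources", "id"]
summary = flat.filter(items=columns)
print(summary.columns.tolist())
```

Columns that are not listed, such as `organization.name`, are dropped by the filter.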

## Select a package

In [None]:
from ipywidgets import widgets as wgt
from IPython.display import HTML, Markdown

# id from search results
package_id = '047ce029-400f-4772-a812-5477c38e58aa' 
package = oeserver.action.package_show(id=package_id)
resources_df = pd.DataFrame(package['resources'], columns=['name', 'format', 'size', 'url'])

#Custom styles: bold labels and tighter line spacing
display(HTML("<style>.ib-label { font-weight:bold; } .widget-label { margin-bottom: 10px; } .widget-html > .widget-html-content { line-height:1.5; margin-bottom: 10px;}</style>"))
items = [
    wgt.HTML('Organization'), wgt.HTML(package['organization']['name']),
    wgt.HTML('Title'), wgt.HTML(package['title']),
    wgt.HTML('Name'), wgt.HTML(package['name']),
    wgt.HTML('Is Open'), wgt.HTML('Open' if package['isopen'] else 'Not open'),
    wgt.HTML('Notes'), wgt.HTML(package['notes'])
]
for i in items[::2]:
    i.add_class("ib-label")

gb = wgt.GridBox(items, layout=wgt.Layout(grid_template_columns="100px fit-content(60%)"))
display(gb)

display(Markdown('### Resources'))
display(resources_df)

## Get the OpenAPI spec for the OE server

In [None]:
import requests
import json

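# Find the URL of the package's API resource (the last match wins if there are several)
url = ''
for res in package['resources']:
    if res.get('type') == 'api':
        url = res['url']

if not url:
    raise ValueError('No resource of type "api" found in this package')

result = requests.get(url)
result.raise_for_status()  # stop early on an HTTP error
openapi_spec = result.json()
print(json.dumps(openapi_spec, indent=2))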
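
The printed spec can be long. As a rough sketch, the `servers` and `paths` entries of an OpenAPI document can be combined to list the operations it offers. The `sample_spec` dictionary and the `list_operations` helper below are hypothetical and exist only for illustration; a real Open Energy spec will contain more fields.

```python
# Hypothetical, heavily trimmed OpenAPI document; real specs contain more fields.
sample_spec = {
    "openapi": "3.0.0",
    "servers": [{"url": "https://example.invalid/api"}],
    "paths": {
        "/readings": {"get": {"summary": "Fetch readings as CSV"}},
    },
}

def list_operations(spec):
    """Return (METHOD, full URL, summary) for each operation in the spec."""
    http_methods = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}
    base_url = spec["servers"][0]["url"]  # assumes at least one server entry
    operations = []
    for path, path_item in spec["paths"].items():
        for method, operation in path_item.items():
            # Path items may also hold non-operation keys such as "parameters"
            if method in http_methods:
                operations.append((method.upper(), base_url + path, operation.get("summary", "")))
    return operations

for op in list_operations(sample_spec):
    print(op)
```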

### Mount Drive to access stored key, certificate and client ID

These files were created in [Setting up a shared data connection.ipynb](https://colab.research.google.com/drive/18xWWO_CxWZIylP04EjSBiJDWKAWa_EVg#scrollTo=m6OrQrJ6N9Sf).

Mount your Google Drive in the notebook environment. This will pop up a warning and then take you through a standard permission flow to allow access to your Drive.

If you have your Open Energy key and certificate stored with a different mechanism, replace this step.

In [None]:
from google.colab import drive
drive.mount('/content/drive')
!ls /content/drive/MyDrive/oe-certs

In [None]:
oe_key = '/content/drive/MyDrive/oe-certs/oe.key'
oe_cert = '/content/drive/MyDrive/oe-certs/oe.pem'
client_id = ''

with open('/content/drive/MyDrive/oe-certs/client_id.txt') as f:
    client_id = f.readline().strip()  # drop the trailing newline

print('Client ID: {0}'.format(client_id))
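
Before creating a session, it can help to confirm that all three credential files are present. The sketch below is one possible check. The `missing_credentials` helper is hypothetical, and the demonstration runs against a temporary directory rather than a real Drive mount.

```python
from pathlib import Path
import tempfile

def missing_credentials(cert_dir, names=("oe.key", "oe.pem", "client_id.txt")):
    """Return the names of expected files that are absent from cert_dir."""
    cert_dir = Path(cert_dir)
    return [name for name in names if not (cert_dir / name).is_file()]

# Demonstration against a temporary directory containing only a placeholder key
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "oe.key").write_text("placeholder")
    demo_missing = missing_credentials(tmp)

print(demo_missing)
```

In the notebook itself, the same function could be pointed at `/content/drive/MyDrive/oe-certs`. An empty list would mean nothing is missing.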

### Create FAPI session

In [None]:
from ib1.openenergy.support import FAPISession

client = FAPISession(client_id=client_id,
                     issuer_url='https://matls-auth.directory.energydata.org.uk',
                     requested_scopes='directory:software',
                     private_key=oe_key,
                     certificate=oe_cert)

## Fetch the dataset

In [None]:
import io

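# Build the request URL from the first server and the first path in the OpenAPI spec
url = openapi_spec['servers'][0]['url'] + list(openapi_spec['paths'].keys())[0]

# Request the data over the authorised FAPI session
csv_data = client.session.get(url=url).text

# Parse the CSV text into a DataFrame
df = pd.read_csv(io.StringIO(csv_data))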
display(df)
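
The response body arrives as a single string, while `pd.read_csv` expects a path or a file-like object, which is why the text is wrapped in `io.StringIO`. Here is a small sketch with invented CSV text standing in for the response. The column names follow those used later in this notebook, but the values are made up.

```python
import io
import pandas as pd

# Invented CSV text standing in for the response body
csv_text = "date,time,electricity_kwh\n01/04/19,00:30,1.5\n01/04/19,01:00,2.0\n"

# Wrap the string in StringIO so read_csv can treat it like an open file
sample_df = pd.read_csv(io.StringIO(csv_text))
print(sample_df.dtypes)
```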

### Plot it out

In [None]:
import matplotlib.pyplot as plt
from datetime import datetime

def datetime_from_date_time(d, t):
    return datetime.strptime(d+' '+t, '%d/%m/%y %H:%M')

df['date_obj'] = list(map(datetime_from_date_time, df['date'], df['time']))
df.sort_values(by='date_obj', inplace=True)

# Keep readings after 1 April 2019 (df is already sorted by date_obj)
df2 = df[df.date_obj > datetime.strptime('01/04/19', '%d/%m/%y')]

x = df2['date_obj']
y = df2['electricity_kwh']

plt.figure(figsize=(15,5))
plt.plot(x, y)
# Rotate and align the x-axis date labels; this must run after the axes exist
plt.gcf().autofmt_xdate()
plt.show()
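
As an alternative to mapping `strptime` over each row, the date and time columns can be parsed in one vectorised call with `pd.to_datetime`. The sketch below uses invented rows and assumes the same `%d/%m/%y %H:%M` layout as the plotting cell. It is an optional variation, not the notebook's own method.

```python
import pandas as pd

# Invented rows using the same date/time layout the plotting cell assumes
sample = pd.DataFrame({
    "date": ["02/04/19", "01/04/19"],
    "time": ["00:30", "23:30"],
    "electricity_kwh": [1.5, 2.0],
})

# Parse both columns in one vectorised call instead of row-by-row strptime
sample["date_obj"] = pd.to_datetime(sample["date"] + " " + sample["time"], format="%d/%m/%y %H:%M")
sample = sample.sort_values(by="date_obj")

# Keep readings after midnight on 1 April 2019
recent = sample[sample["date_obj"] > pd.Timestamp(2019, 4, 1)]
print(recent)
```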