# Open Energy Open Dataset Retrieval Example

This Jupyter Notebook uses Open Energy Search to find an Open dataset, that is, one that is publicly available without access controls. It checks the dataset's format from its metadata, then downloads the data and displays it in a visualisation.

Open data is one of the [three classes of data](https://icebreakerone.org/open-shared-closed/) (Open, Shared, Closed) in the Open Energy ecosystem. Open data is freely available without restrictions on how it is used.

Open Energy Search also indexes Shared data: data that may be used by Open Energy members subject to conditions set by the data owner. For an example of using Open Energy Search and [Open Energy access control](https://docs.openenergy.org.uk/main/access_control_specification.html) to retrieve Shared datasets, see the [Open Energy Shared Dataset Retrieval Example](https://colab.research.google.com/github/icebreakerone/open-energy-python-infrastructure/blob/main/examples/jupyter/shared_dataset_retrieval.ipynb).

## Install dependencies

In [None]:
!pip install ib1.openenergy.support ckanapi pandas geopandas matplotlib rtree ipywidgets

## Set up connection to Open Energy CKAN Server

In [None]:
from ckanapi import RemoteCKAN
ua = 'openenergyexample/1.0'
oeserver = RemoteCKAN('https://search.openenergy.org.uk', user_agent=ua)

## Search for a term

In [None]:
search_term = 'ev'
search_results = oeserver.action.package_search(q=search_term)

## Show table of results

In [None]:
import pandas as pd
search_results_df = pd.json_normalize(search_results['results'], max_level=1).filter(items=('organization.title', 'title','license_title','num_resources', 'id'))
display(search_results_df)
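Search results can include Shared as well as Open datasets. The following is a minimal sketch for keeping only the open ones, assuming each result carries CKAN's boolean `isopen` field (the same field this notebook reads from the selected package later on). Whether that flag lines up with the Open Energy "Open" class for every dataset is an assumption. The function name `open_only` and the sample records are made up for illustration.

```python
def open_only(results):
    """Return only the results whose licence CKAN flags as open."""
    return [r for r in results if r.get('isopen')]


# Made-up records in the shape of package_search results, for illustration.
sample_results = [
    {'id': 'a', 'title': 'Example open dataset', 'isopen': True},
    {'id': 'b', 'title': 'Example shared dataset', 'isopen': False},
    {'id': 'c', 'title': 'Example dataset without a licence flag'},
]

open_results = open_only(sample_results)
print([r['id'] for r in open_results])
```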

## Select a package

In [None]:
from IPython.display import Markdown
# id from search results
package_id = 'a49e073c-018e-4e3e-965b-501396dc4e31'
package = oeserver.action.package_show(id=package_id)
# Uncomment to inspect the raw metadata (needs `import json` first)
# print(json.dumps(package, indent=2))
pd.set_option('display.max_colwidth', None)
package_df = pd.json_normalize(package, max_level=1).filter(items=('organization.title', 'title','notes','num_resources'))
resources_df = pd.DataFrame(package['resources'], columns=['name', 'format', 'size', 'url'])
display(Markdown('### Package details'))
display(package_df)
display(Markdown('### Resources'))
display(resources_df)

## Alternative display style

In [None]:
from ipywidgets import widgets as wgt
from IPython.display import HTML

# Custom styles: bold labels and tighter line spacing
display(HTML("<style>.ib-label { font-weight:bold; } .widget-label { margin-bottom: 10px; } .widget-html > .widget-html-content { line-height:1.5; margin-bottom: 10px;}</style>"))
items = [
    wgt.HTML('Organization'), wgt.HTML(package['organization']['name']),
    wgt.HTML('Title'), wgt.HTML(package['title']),
    wgt.HTML('Name'), wgt.HTML(package['name']),
    wgt.HTML('Is Open'), wgt.HTML('Open' if package['isopen'] else 'Not open'),
    wgt.HTML('Notes'), wgt.HTML(package.get('notes') or '')  # notes may be missing or empty
]
for i in items[::2]:
    i.add_class("ib-label")

gb = wgt.GridBox(items, layout=wgt.Layout(grid_template_columns="100px fit-content(60%)"))
display(gb)

display(Markdown('### Resources'))
display(resources_df)

## Choose resource number

In [None]:
selected_res_index = 0
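Rather than hard-coding the index, the resource could be chosen by its format. This sketch assumes each resource dict has a `format` key, as shown in the resources table; `first_index_with_format` is a hypothetical helper and the sample list is invented for illustration.

```python
def first_index_with_format(resources, wanted='CSV'):
    """Return the index of the first resource in the given format, or None."""
    wanted = wanted.upper()
    for index, resource in enumerate(resources):
        # Format strings are free text, so compare case-insensitively.
        if (resource.get('format') or '').upper() == wanted:
            return index
    return None


# Invented resource list in the shape of package['resources'].
sample_resources = [
    {'name': 'Documentation', 'format': 'PDF'},
    {'name': 'Data file', 'format': 'csv'},
]

print(first_index_with_format(sample_resources))
```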

## If it's a CSV, fetch it

In [None]:
csv_df = None

res = package['resources'][selected_res_index]
# Format strings in the metadata are free text, so compare case-insensitively
if (res.get('format') or '').upper() == 'CSV':
    csv_df = pd.read_csv(res['url'])
    title = wgt.HTML(res['name'])
    title.add_class('ib-label')
    display(title)
    display(csv_df)
else:
    # Ignore other types for now
    print("Sorry, can't fetch non-CSV data yet!")
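The format check could be extended with a small dispatch table. This is only a sketch using the standard library, not the notebook's approach: `PARSERS`, `parse_csv` and `load_resource` are hypothetical names, JSON is included purely as an example of a second format, and the resource and canned response are invented so that the code runs offline.

```python
import csv
import io
import json


def parse_csv(text):
    """Parse CSV text into a list of row dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


# Map upper-cased format strings to parser functions.
PARSERS = {
    'CSV': parse_csv,
    'JSON': json.loads,
}


def load_resource(resource, fetch_text):
    """Fetch and parse a resource, or return None for unsupported formats.

    `fetch_text` is any callable that takes a URL and returns its body as
    a string, so the download mechanism stays swappable.
    """
    parser = PARSERS.get((resource.get('format') or '').upper())
    if parser is None:
        return None
    return parser(fetch_text(resource['url']))


# Invented resource and canned response, so the sketch runs offline.
sample_resource = {'format': 'csv', 'url': 'https://example.org/data.csv'}
canned = {'https://example.org/data.csv': 'Latitude,Longitude\n51.5,-0.1\n'}

rows = load_resource(sample_resource, canned.__getitem__)
print(rows)
```

Passing the fetcher in as an argument keeps the parsing logic testable without touching the network.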

## Only for Electric Vehicle Capacity Map - plot on chart

In [None]:
import matplotlib.pyplot as plt
import geopandas
from shapely.geometry import Polygon

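if csv_df is not None and res['name'] == 'Electric Vehicle  Capacity Map':
    fig, ax = plt.subplots(figsize=(10,10))
    # Note: geopandas.datasets was removed in GeoPandas 1.0, so this call needs geopandas<1.0
    countries = geopandas.read_file(geopandas.datasets.get_path("naturalearth_lowres"))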

    xmin = csv_df['Longitude'].min()
    xmax = csv_df['Longitude'].max()
    ymin = csv_df['Latitude'].min()
    ymax = csv_df['Latitude'].max()

    box = Polygon([(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax), (xmin, ymin)])

    zoom_area = countries.clip(box)

    zoom_area.plot(color="lightgrey", ax=ax)

    csv_df.plot(x="Longitude", y="Latitude", kind="scatter",
                title="Substations",
                ax=ax)
    plt.show()
else:
    print("You'll need to make your own visualisation of this one")
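The clipping box in the plotting cell is the tight bounding rectangle of the data points, so points at the extremes sit right on the edge of the map. Below is a sketch of the same calculation with an optional margin, in plain Python. `bounding_box`, `box_corners` and the `margin` parameter are illustrative names, and the coordinates are made up.

```python
def bounding_box(longitudes, latitudes, margin=0.0):
    """Return (xmin, ymin, xmax, ymax) around the points, grown by
    `margin` degrees on every side."""
    xmin, xmax = min(longitudes), max(longitudes)
    ymin, ymax = min(latitudes), max(latitudes)
    return (xmin - margin, ymin - margin, xmax + margin, ymax + margin)


def box_corners(xmin, ymin, xmax, ymax):
    """Closed ring of corners, in the order the notebook passes to Polygon."""
    return [(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax), (xmin, ymin)]


# Made-up coordinates for illustration.
lons = [-3.0, -1.0, -2.0]
lats = [51.0, 53.0, 52.0]

bbox = bounding_box(lons, lats, margin=0.5)
print(bbox)
print(box_corners(*bbox))
```

With the notebook's data, `bounding_box(csv_df['Longitude'], csv_df['Latitude'], margin=0.5)` would supply the values for `Polygon(box_corners(...))`.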