TODO:

1. Folium => Can we do categorical plots?
2. Titles + menu
3. Storyline
4. Can we remove warning?

------------ Maybe-------------

4. Readme in notebook folder
5. Join query with swisstopo

<div class="notebook-buttons" style="display:flex; padding-top: 5rem;padding-bottom: 2.5rem;line-height: 2.15;">
    <a href="https://colab.research.google.com/github/zazuko/notebooks/blob/master/notebooks/electricity-prices/electricity-prices.ipynb">
        <div id="colab-link" style="display: flex;padding-right: 3.5rem;padding-bottom: 0.625rem;border-bottom: 1px solid #ececed; align-items: center;">
            <img class="call-to-action-img" src="../img/colab.svg" width="30" height="30" style="margin-right: 10px;margin-top: auto;margin-bottom: auto;">
            <div class="call-to-action-txt">Run in Google Colab</div>
        </div>
    </a>
    <a href="https://raw.githubusercontent.com/zazuko/notebooks/master/notebooks/electricity-prices/electricity-prices.ipynb" download>
        <div id="download-link" style="display: flex;padding-right: 3.5rem;padding-bottom: 0.625rem;border-bottom: 1px solid #ececed; height: auto;align-items: center;">
            <img class="call-to-action-img" src="../img/download.svg" width="22" height="30" style="margin-right: 10px;margin-top: auto;margin-bottom: auto;">
            <div class="call-to-action-txt">Download Notebook</div>
        </div>
    </a>
    <a href="https://github.com/zazuko/notebooks/blob/master/notebooks/electricity-prices/electricity-prices.ipynb">
        <div id="github-link" style="display: flex;padding-right: 3.5rem;padding-bottom: 0.625rem;border-bottom: 1px solid #ececed; height: auto;align-items: center;">
            <img class="call-to-action-img" src="../img/github.svg" width="25" height="30" style="margin-right: 10px;margin-top: auto;margin-bottom: auto;">
            <div class="call-to-action-txt">View on GitHub</div>
        </div>
    </a>
</div>

# Electricity prices in Switzerland

ElCom, the Federal Electricity Commission, collects data on the electricity tariffs for households and companies. This data is published as [Linked Data](https://en.wikipedia.org/wiki/Linked_data). 

In this tutorial, we will show **how to work with Linked Data**, using the data on electricity tariffs as an example.   
We will look into how to query, process, and visualize it.   


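## Contents
[1. SPARQL endpoints](#SPARQL-endpoints)  
[2. Electricity tariffs](#Electricity-tariffs)  
[3. Prices across regions](#Prices-across-regions)  
[4. Prices across regions and tariffs](#Prices-across-regions-and-tariffs)  
[5. Competition between energy providers](#Competition-between-energy-providers)  
[6. Free grid access](#Free-grid-access)  
[7. Aidfee waiver](#Aidfee-waiver)  
[8. Companies benefiting from free grid](#Companies-benefiting-from-free-grid)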

### SPARQL endpoints

#### For electricity tariffs
Data on all electricity tariffs is published as Linked Data. It can be accessed with [SPARQL queries](https://www.w3.org/TR/rdf-sparql-query/).   
You can send queries using HTTP requests. The API endpoint is **[https://int.lindas.admin.ch/query](https://int.lindas.admin.ch/query).**  
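
As a rough sketch of what such an HTTP request can look like, the snippet below builds (but does not send) a POST request in the style of the SPARQL 1.1 Protocol, using only the Python standard library. The helper name `build_sparql_request` is made up for this illustration, and the endpoint URL is the one the link above points to.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_sparql_request(endpoint: str, query: str) -> Request:
    """Build a SPARQL protocol POST request (hypothetical helper)."""
    # The query travels as a form-encoded `query` parameter in the body
    body = urlencode({"query": query}).encode("utf-8")
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        # Ask the endpoint for results in the SPARQL JSON results format
        "Accept": "application/sparql-results+json",
    }
    return Request(endpoint, data=body, headers=headers, method="POST")


request = build_sparql_request(
    "https://int.lindas.admin.ch/query",
    "SELECT * WHERE { ?s ?p ?o } LIMIT 3",
)
# Passing `request` to urllib.request.urlopen() would actually send the query
print(request.get_method(), request.full_url)
```

In the rest of this notebook, `graphly` takes care of this step for us.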

#### For geodata
Different municipalities may have different tariffs. To locate them on a map, we will work with Swiss geodata, which is also published as Linked Data. It can be accessed through the API endpoint **[https://ld.geo.admin.ch/query](https://ld.geo.admin.ch/query).**  

Let's use `SparqlClient` from [graphly](https://github.com/zazuko/graphly) to communicate with both databases. 
Graphly will allow us to:
* send SPARQL queries
* automatically add prefixes to all queries
* convert responses to `pandas` or `geopandas` data frames
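
To give a feel for the last point: SPARQL endpoints commonly return results in the W3C SPARQL JSON results format, which a client has to flatten into rows. The sketch below does this with the standard library only. It is not graphly's actual implementation; the helper name `bindings_to_rows` is hypothetical and the sample response is invented.

```python
import json

# Invented sample response in the SPARQL JSON results format
sample_response = json.loads("""
{
  "head": {"vars": ["category", "energy"]},
  "results": {"bindings": [
    {"category": {"type": "literal", "value": "H1"},
     "energy": {"type": "literal", "value": "7.5",
                "datatype": "http://www.w3.org/2001/XMLSchema#decimal"}},
    {"category": {"type": "literal", "value": "H2"}}
  ]}
}
""")


def bindings_to_rows(response: dict) -> list:
    """Flatten SPARQL JSON results into one dict per row (hypothetical helper)."""
    columns = response["head"]["vars"]
    rows = []
    for binding in response["results"]["bindings"]:
        # Unbound variables are missing from a binding; use None for them
        rows.append({col: binding.get(col, {}).get("value") for col in columns})
    return rows


rows = bindings_to_rows(sample_response)
print(rows)
```

A list of dicts like `rows` can be passed directly to `pandas.DataFrame`. Note that this sketch keeps all values as strings and ignores the `datatype` field.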

In [None]:
import string
import re
import json

import pandas as pd
import plotly.express as px
from plotly.subplots import make_subplots
import plotly.graph_objects as go
import matplotlib.pyplot as plt
import mapclassify

import matplotlib as mpl
import matplotlib.cm as cm
import folium

from graphly.api_client import SparqlClient

# TODO: 
# 1. Ask Geoimpact for list of "Firmen mit eigene Transformatorenstation"
# 2. Join query with swisstopo
# 3. New notebook: Electricity prices over time. Were market prices reflected on end consumers ? 

In [None]:
# Uncomment to install dependencies in Colab environment
#!pip install mapclassify
#!pip install git+https://github.com/zazuko/graphly.git

In [None]:
sparql = SparqlClient("https://int.lindas.admin.ch/query")
geosparql = SparqlClient("https://ld.geo.admin.ch/query")

sparql.add_prefixes({
    "schema": "<http://schema.org/>",
    "cube": "<https://cube.link/>",
    "elcom": "<https://energy.ld.admin.ch/elcom/electricityprice/dimension/>"
})

geosparql.add_prefixes({
    "dct": "<http://purl.org/dc/terms/>",
    "geonames": "<http://www.geonames.org/ontology#>",
    "schema": "<http://schema.org/>",
    "geosparql": "<http://www.opengis.net/ont/geosparql#>",
})

SPARQL queries can become very long. To improve readability, we will work with [prefixes](https://en.wikibooks.org/wiki/SPARQL/Prefixes).
 
Using the `add_prefixes` method, we can define persistent prefixes. 
Every time you send a query, `graphly` will now automatically add the prefixes for you.
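
Conceptually, adding prefixes means prepending `PREFIX` declarations to the query text. The snippet below is a minimal sketch of that idea, not graphly's actual code; the helper name `with_prefixes` is hypothetical.

```python
def with_prefixes(query: str, prefixes: dict) -> str:
    """Prepend PREFIX declarations to a SPARQL query (hypothetical helper)."""
    # Each entry maps a short name to an IRI already wrapped in angle brackets
    declarations = [f"PREFIX {name}: {iri}" for name, iri in prefixes.items()]
    return "\n".join(declarations + [query])


prefixes = {
    "schema": "<http://schema.org/>",
    "cube": "<https://cube.link/>",
}
full_query = with_prefixes("SELECT * WHERE { ?s schema:name ?name } LIMIT 3", prefixes)
print(full_query)
```

With the prefixes in place, `schema:name` in the query body is shorthand for `<http://schema.org/name>`.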

### Electricity tariffs

In [None]:
query = """
SELECT ?municipality_id ?category ?energy ?grid ?aidfee (?community_fees + ?aidfee as ?taxes) ?fixcosts ?variablecosts 
FROM <https://lindas.admin.ch/elcom/electricityprice>
WHERE {
    <https://energy.ld.admin.ch/elcom/electricityprice/observation/> cube:observation ?observation.
    
    ?observation
      elcom:category/schema:name ?category;
      elcom:municipality ?municipality_id;
      elcom:period "2020"^^<http://www.w3.org/2001/XMLSchema#gYear>;
      elcom:product <https://energy.ld.admin.ch/elcom/electricityprice/product/standard>;
      elcom:fixcosts ?fixcosts;
      elcom:total ?variablecosts;
      elcom:gridusage ?grid;
      elcom:energy ?energy;
      elcom:charge ?community_fees;
      elcom:aidfee ?aidfee.
      
}
ORDER BY ?municipality_id ?category ?variablecosts
"""

prices = sparql.send_query(query)
prices = prices.groupby(["municipality_id", "category"]).first().reset_index()
prices.head()

In [None]:
query = """
SELECT DISTINCT ?category ?description
WHERE {
  GRAPH <https://lindas.admin.ch/elcom/electricityprice> {
    
    ?s <https://energy.ld.admin.ch/elcom/electricityprice/dimension/category> ?category_uri.
    ?category_uri schema:name ?category .
    ?category_uri schema:description ?description .
  }
}
ORDER BY ?category
"""
df = sparql.send_query(query)
df.head()

In [None]:
def extract_consumption(description: str) -> int:
    """
    Extract average electricity consumption from a description.
    Args:
        description:  Category description for electricity prices
        
    Returns: 
        int:          Electricity consumption in kWh/year
    
    """
    
    number_as_string = description.split(" kWh/Jahr")[0]
    return int(number_as_string.translate(str.maketrans('', '', string.punctuation)))
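
The parsing steps can be traced on a single string. The description below is hypothetical; the real ElCom descriptions may be formatted differently.

```python
import string

# Hypothetical description; the real texts may be formatted differently
description = "1'600 kWh/Jahr: 2-Zimmerwohnung mit Elektroherd"

# Keep only the part in front of the unit
number_as_string = description.split(" kWh/Jahr")[0]

# str.maketrans("", "", chars) builds a table that deletes every char in `chars`
cleaned = number_as_string.translate(str.maketrans("", "", string.punctuation))
consumption = int(cleaned)
print(consumption)  # 1600
```

Spaces are not part of `string.punctuation`, so a number written with a space as thousands separator would not be cleaned by this approach.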

In [None]:
cat2description = dict(zip(df.category, df.description))
cat2consumption = dict(zip(df.category, [extract_consumption(d) for d in df.description]))

prices["consumption"] = prices["category"].map(cat2consumption)
prices["monthly_bill"] = ((prices.consumption*prices.variablecosts/12 + prices.fixcosts)/100).round(2)
prices.head()
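
The bill formula can be checked by hand. The numbers below are made up, and the units are an assumption read off the formula itself: `variablecosts` in Rappen per kWh and `fixcosts` in Rappen per month, with the division by 100 converting Rappen to CHF.

```python
# Made-up example values (not taken from the ElCom data)
consumption = 4500      # kWh per year
variablecosts = 20.0    # assumed unit: Rp./kWh
fixcosts = 500.0        # assumed unit: Rp./month

# Spread the yearly consumption-based costs over 12 months
monthly_variable_costs = consumption * variablecosts / 12
# Add the fixed part and convert Rappen to CHF
monthly_bill = round((monthly_variable_costs + fixcosts) / 100, 2)
print(monthly_bill)  # 80.0
```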

In [None]:
query = """    
SELECT ?municipality_id ?municipality ?population ?boundary 

WHERE {
  ?muni_iri dct:hasVersion ?version ;
            geonames:featureCode geonames:A.ADM3 .
  
  ?version schema:validUntil "2020-12-31"^^<http://www.w3.org/2001/XMLSchema#date>;
           geonames:population ?population;
           schema:name ?municipality;
           geosparql:hasGeometry/geosparql:asWKT ?boundary.
  
  BIND(IRI(REPLACE(STR(?muni_iri), "https://ld.geo.admin.ch/boundaries/", "https://ld.admin.ch/")) AS ?municipality_id)
}

"""
communes = geosparql.send_query(query)
communes = communes.set_crs(epsg=4326)
communes.head()
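
The `BIND(IRI(REPLACE(...)))` line in the query rewrites each geodata IRI so that it matches the municipality IRIs used in the tariff data, which is what makes the later join on `municipality_id` possible. The same rewrite in Python, as a sketch: the helper name is hypothetical, and the shape of the input IRI is an assumption inferred from the pattern in the query.

```python
def to_tariff_iri(geo_iri: str) -> str:
    """Mirror the REPLACE() from the query in Python (hypothetical helper)."""
    return geo_iri.replace("https://ld.geo.admin.ch/boundaries/", "https://ld.admin.ch/")


# Assumed shape of a boundary IRI, inferred from the REPLACE() pattern
geo_iri = "https://ld.geo.admin.ch/boundaries/municipality/3695"
print(to_tariff_iri(geo_iri))  # https://ld.admin.ch/municipality/3695
```

Unlike `str.replace`, SPARQL's `REPLACE` treats its pattern as a regular expression; for this particular pattern the result is the same.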

In [None]:
communes.plot()

In [None]:
join = pd.merge(communes[["municipality_id", "municipality", "population", "boundary"]], prices, how="inner", on="municipality_id")
join.drop(columns=["variablecosts", "consumption"], inplace=True)
join.head()

## Prices across regions

In [None]:
style_function = lambda x: {'fillColor': '#ffffff', 
                            'color':'#000000', 
                            'fillOpacity': 0.1, 
                            'weight': 0.1}
highlight_function = lambda x: {'fillColor': '#000000', 
                                'color':'#000000', 
                                'fillOpacity': 0.50, 
                                'weight': 0.1}

def plot_tariffs(df, variable, variable_description, title):
    
    classifier = mapclassify.NaturalBreaks(y=df[variable], k=5)
    bins = [df[variable].min()] + list(classifier.bins)

    m = folium.Map(location=[46.83, 8.13], zoom_start=8, tiles="cartodbpositron")

    folium.Choropleth(
        geo_data=json.loads(df.to_json()),
        data=df,
        columns=["municipality", variable],
        key_on="feature.properties.municipality",
        fill_color="YlGn",
        line_opacity=0,
        fill_opacity=1,
        bins=bins,
        reset=True,
        legend_name=variable_description,
    ).add_to(m)

    hover = folium.features.GeoJson(
        df,
        style_function=style_function, 
        control=False,
        highlight_function=highlight_function, 
        tooltip=folium.features.GeoJsonTooltip(
            fields=['municipality', variable],
            aliases=['Municipality: ', variable_description + ": "],
            style=("background-color: white; color: #333333; font-family: arial; font-size: 12px; padding: 10px;") 
        )
    )

    folium.LayerControl().add_to(m)
    m.add_child(hover)
    m.keep_in_front(hover)
    
    title_html = '''<h3 align="center" style="font-size:16px"><b>{}</b></h3>'''.format(title)   
    m.get_root().html.add_child(folium.Element(title_html))

    return m

In [None]:
category = "H2"
plotting_df = join[join.category == category]
plot_tariffs(plotting_df, "monthly_bill", "Average monthly bill [CHF]", cat2description[category])

In [None]:
category = "C2"
plotting_df = join[join.category == category]
plot_tariffs(plotting_df, "monthly_bill", "Average monthly bill [CHF]", cat2description[category])

## Prices across regions and tariffs 

In [None]:
cols = 2

titles = {"C1": "8 000 kWh/year\nSmall company", "C7": "1 500 000 kWh/year\nBig company with own transformer"}
categories = list(titles.keys())
fig, axs = plt.subplots(1, cols, figsize=(20,8))

for col in range(cols):
    cat = categories[col]
    plotting_df = join[join.category == cat]
    plotting_df.plot(ax=axs[col], column="monthly_bill", cmap="GnBu", scheme="Natural_Breaks", k=5, legend=True, legend_kwds={"fmt": "{:.0f}"})
    axs[col].set_axis_off()
    axs[col].set_title(titles[cat])
        
fig.suptitle("Average monthly bill for enterprises", fontsize=24)
plt.show()

In [None]:
cols = 2

titles = {"H1": "1 600 kWh/year\n2-room apartment", "H5": "7 500 kWh/year\n5-room house"}
categories = list(titles.keys())
fig, axs = plt.subplots(1, cols, figsize=(20,8))

for col in range(cols):
    cat = categories[col]
    plotting_df = join[join.category == cat]
    plotting_df.plot(ax=axs[col], column="monthly_bill", cmap="YlGn", scheme="Natural_Breaks", k=5, legend=True, legend_kwds={"fmt": "{:.0f}"})
    axs[col].set_axis_off()
    axs[col].set_title(titles[cat])
        
fig.suptitle("Average monthly bill for households", fontsize=24)
plt.show()

## Competition between energy providers

The query below finds the municipalities that have more than one electricity provider.

In [None]:
query = """
PREFIX cube: <https://cube.link/>
PREFIX elcom: <https://energy.ld.admin.ch/elcom/electricityprice/dimension/>
PREFIX schema: <http://schema.org/>

SELECT ?municipality_id (MAX(?providers) AS ?providers)
FROM <https://lindas.admin.ch/elcom/electricityprice>
WHERE {
  SELECT ?municipality_id ?category (COUNT(*) AS ?providers)

  WHERE {
      <https://energy.ld.admin.ch/elcom/electricityprice/observation/> cube:observation ?observation.

      ?observation
        elcom:category/schema:name ?category;
        elcom:municipality ?municipality_id;
        elcom:period "2020"^^<http://www.w3.org/2001/XMLSchema#gYear>;
        elcom:product <https://energy.ld.admin.ch/elcom/electricityprice/product/standard>;
        elcom:energy ?energy.
  }
  GROUP BY ?municipality_id ?category
  HAVING (COUNT(*) > 1)
}
GROUP BY ?municipality_id
"""
df = sparql.send_query(query)
df.head()
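
The nested query may be easier to follow as plain Python. The sketch below reproduces its two aggregation steps on invented data, under the reading the query implies: several observations for the same municipality and category mean several providers.

```python
from collections import Counter

# Invented observations: (municipality_id, category), one per provider
observations = [
    ("m/1", "H1"), ("m/1", "H1"), ("m/1", "H2"),
    ("m/2", "H1"), ("m/2", "H2"),
    ("m/3", "H1"), ("m/3", "H1"), ("m/3", "H1"),
]

# Inner query: COUNT(*) per (municipality, category), HAVING COUNT(*) > 1
counts = Counter(observations)
multiple = {key: n for key, n in counts.items() if n > 1}

# Outer query: MAX(?providers) per municipality
providers = {}
for (municipality, _category), n in multiple.items():
    providers[municipality] = max(providers.get(municipality, 0), n)

print(providers)  # {'m/1': 2, 'm/3': 3}
```

Municipality `m/2` has exactly one observation per category, so it drops out of the result, just as single-provider municipalities drop out of the query result.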

Let's get the boundaries for these municipalities.

In [None]:
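# Municipalities missing from the query result have a single provider
join2 = pd.merge(communes[["municipality_id", "municipality", "population", "boundary"]], df, how="left", on="municipality_id")
join2["providers"] = join2["providers"].fillna(1)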
join2.head()

In [None]:
join2.providers = join2.providers.astype(int).astype(str)
join2 = join2.set_index("municipality")
data = json.loads(join2.to_json())

In [None]:
def num2color(value, minimum, maximum):
    return mpl.colors.rgb2hex(cm.plasma((value-minimum)/maximum))
    
providers = join2.providers.unique()
colormap = {i: num2color(int(i), 1, len(providers)-1) for i in providers}

In [None]:
fig = px.choropleth(join2, geojson=data, locations=join2.index, 
                    color="providers",
                    color_discrete_map=colormap,
                    projection="transverse mercator",
                    hover_name=join2.index,
                    #hover_data={"buckets": False},
                    labels={"providers": "Electricity providers"})
fig.update_geos(fitbounds="locations", visible=False)
fig.update_layout(margin={"r":0,"l":0,"b":0})
fig.update_traces(marker_line_width=0)
fig.show()

### Grid costs and population density 
Which regions are the most expensive to maintain? Does that correlate with their area?

In [None]:
join["hectares"] = join.to_crs(epsg=3035).area/10000 # In hectares
join["population_density"] = join["population"]/join["hectares"]
join.head(3)
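
The unit conversion behind these two columns, worked through for a made-up municipality. EPSG:3035 is a metric equal-area projection, so the computed areas are in square metres.

```python
# Made-up municipality: 12.5 km² and 5,000 inhabitants
area_m2 = 12_500_000
population = 5_000

hectares = area_m2 / 10_000                  # 1 ha = 10,000 m²
population_density = population / hectares   # inhabitants per hectare
print(hectares, population_density)  # 1250.0 4.0
```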

In [None]:
var = "monthly_bill"
dff = join[join.category == "H1"][["population", var]]
dff[var] = dff[var].round(2)

dff = dff.groupby([var]).sum().reset_index()
dff["frequency"] = dff["population"]/dff["population"].sum()*100

dff.head()

In [None]:
# Relationship between grid costs, and population density
dff = join[join.category == "C1"]

fig = px.scatter(dff, y="grid", x="population_density", hover_data=["municipality"],
                labels={
                     "population_density": "Inhabitants per ha",
                     "grid": "Grid costs per kWh"})
fig.show()

## Free grid access

In [None]:
join = join.assign(pays_grid=lambda x: x.grid!=0, 
                   pays_aidfee=lambda x: x.aidfee!=0,
                   pays_energy=lambda x: x.energy!=0)

join.head(3)

In [None]:
df = join[join.category=="C1"]
df = df.set_index("municipality")
data = json.loads(df.to_json())

In [None]:
fig = px.choropleth(df, geojson=data, locations=df.index, 
                    color="pays_grid",
                    projection="transverse mercator",
                    hover_name=df.index,
                    #hover_data={"buckets": False},
                    labels={"pays_grid": "Paid grid usage"})
fig.update_geos(fitbounds="locations", visible=False)
fig.update_layout(margin={"r":0,"l":0,"b":0})
fig.update_traces(marker_line_width=0)
fig.show()

In [None]:
# These communes do not pay for grid
temp = join[["municipality_id", "municipality", "pays_grid", "population"]].groupby(["municipality_id", "municipality", "population"]).all().reset_index()
free_grid_pop = temp[(~temp.pays_grid)]["population"].sum()
temp[(~temp.pays_grid)]
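
It is worth spelling out what `groupby(...).all()` followed by `~` selects. The sketch below mimics it on invented rows with the standard library: a municipality is selected as soon as one of its categories has zero grid costs, not only when all of them do.

```python
from collections import defaultdict

# Invented rows: (municipality, category, pays_grid)
rows = [
    ("A", "C1", True), ("A", "H1", True),
    ("B", "C1", False), ("B", "H1", True),
    ("C", "C1", False), ("C", "H1", False),
]

flags = defaultdict(list)
for municipality, _category, pays_grid in rows:
    flags[municipality].append(pays_grid)

# Equivalent of groupby(...).all(): True only if every category pays
pays_in_all_categories = {m: all(values) for m, values in flags.items()}
# Equivalent of temp[~temp.pays_grid]
selected = [m for m, pays in pays_in_all_categories.items() if not pays]
print(selected)  # ['B', 'C']
```

If only municipalities with free grid access in every category are of interest, aggregating the negated flag (as done for `has_free_energy` in the next section) would be the alternative.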

## Free energy

In [None]:
# These communes do not pay for energy
join["has_free_energy"] = ~join.pays_energy
temp = join[["municipality_id", "municipality", "has_free_energy", "population"]].groupby(["municipality_id", "municipality", "population"]).all().reset_index()
temp[temp.has_free_energy]

OK, there is no commune with free electricity.

## Aidfee waiver

In [None]:
fig = px.choropleth(df, geojson=data, locations=df.index, 
                    color="pays_aidfee",
                    projection="transverse mercator",
                    hover_name=df.index,
                    labels={"pays_aidfee": "Pays aidfee"},
                    color_discrete_map={True: "#00CC96", False: "#AB63FA"}
                   )
fig.update_geos(fitbounds="locations", visible=False)
fig.update_layout(margin={"r":0,"l":0,"b":0})
fig.update_traces(marker_line_width=0)
fig.show()

How many people profit from it?

In [None]:
# These communes do not pay aidfee
temp = join[["municipality_id", "municipality", "pays_aidfee", "population"]].groupby(["municipality_id", "municipality", "population"]).all().reset_index()
free_aidfee_pop = temp[(~temp.pays_aidfee)]["population"].sum()

temp[(~temp.pays_aidfee)]

## How many people benefit from it?

In [None]:
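# Total population of all municipalities with tariff data (assumed baseline for the shares)
total_pop = temp["population"].sum()

fig = make_subplots(rows=1, cols=2, subplot_titles=["Grid access", "Aidfee"], specs=[[{"type": "pie"}, {"type": "pie"}]])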

fig.add_trace(go.Pie(
     values=[free_grid_pop,total_pop-free_grid_pop],
     labels=["Not paying", "Paying"],
     name="Grid access"), 
     row=1, col=1)

fig.add_trace(go.Pie(
     values=[free_aidfee_pop,total_pop-free_aidfee_pop],
     labels=["Not paying", "Paying"],
     name="Aidfee"),
     row=1, col=2)

fig.update_annotations(yshift=-280)
fig.update_layout(height=400, title={"text": "Population benefiting from free tariffs", "x": 0.5})
fig.show()


## Companies benefiting from free grid 

Are there any companies that profit from it?

Zefix: find companies registered in these municipalities (and hence entitled to free electricity).
Use only AGs (Aktiengesellschaften, i.e. stock corporations).

Switzerland has around 2500 enterprises with their own transformers => they could benefit from these locations.

In [None]:
query = """
SELECT ?company ?type ?company_iri
WHERE {
    ?company_iri a <https://schema.ld.admin.ch/ZefixOrganisation>.
    ?company_iri <http://schema.org/legalName> ?company.
    ?company_iri <https://schema.ld.admin.ch/municipality> <https://ld.admin.ch/municipality/3695>.
    ?company_iri <http://schema.org/additionalType>/<http://schema.org/name> ?type .
  
  FILTER(LANG(?type) = "de")
}
ORDER BY ?type ?company
"""
df = sparql.send_query(query)
df.head(7)

In [None]:
# Municipalities where at least one category pays nothing for grid usage
municipalities_free_grid = join["municipality_id"][~(join.pays_grid)].unique()

query = """
SELECT ?municipality ?company ?company_iri 
WHERE {{
    ?company_iri a <https://schema.ld.admin.ch/ZefixOrganisation>.
    ?company_iri <http://schema.org/legalName> ?company.
    ?company_iri <https://schema.ld.admin.ch/municipality> ?municipality_iri.
    ?company_iri <http://schema.org/additionalType> <https://ld.admin.ch/ech/97/legalforms/0106> .
  
    ?municipality_iri <http://schema.org/name> ?municipality.
  
    FILTER(?municipality_iri IN({}))
}}
ORDER BY ?municipality ?company
""".format("<" + ">,<".join(municipalities_free_grid) + ">")
df = sparql.send_query(query)
df
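
The query above is a string template: doubled braces `{{ }}` survive `str.format` as literal braces, while `{}` is replaced by the list of municipality IRIs. A standalone sketch of that mechanism, with hypothetical IRIs and a shortened query:

```python
# Hypothetical municipality IRIs; in the notebook they come from the joined data
municipality_iris = [
    "https://ld.admin.ch/municipality/3695",
    "https://ld.admin.ch/municipality/261",
]

template = """
SELECT ?company_iri
WHERE {{
  ?company_iri <https://schema.ld.admin.ch/municipality> ?municipality_iri .
  FILTER(?municipality_iri IN({}))
}}
"""

# Wrap every IRI in angle brackets and separate them with commas
iri_list = ", ".join(f"<{iri}>" for iri in municipality_iris)
query_text = template.format(iri_list)
print(query_text)
```

Before sending such a query, it is worth checking that the list of IRIs is not empty.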

## Join DFs

In [None]:
query = """
PREFIX cube: <https://cube.link/>
PREFIX elcom: <https://energy.ld.admin.ch/elcom/electricityprice/dimension/>
PREFIX schema: <http://schema.org/>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX geonames: <http://www.geonames.org/ontology#>
PREFIX geosparql: <http://www.opengis.net/ont/geosparql#>

SELECT ?municipality ?category ?energy ?grid (?community_fees + ?aidfee as ?taxes) ?fixcosts ?variablecosts 
FROM <https://lindas.admin.ch/elcom/electricityprice>
WHERE {
    <https://energy.ld.admin.ch/elcom/electricityprice/observation/> cube:observation ?observation.
    
    ?observation
      elcom:category <https://energy.ld.admin.ch/elcom/electricityprice/category/H1>;
      elcom:municipality ?muni_iri;
      elcom:period "2020"^^<http://www.w3.org/2001/XMLSchema#gYear>;
      elcom:product <https://energy.ld.admin.ch/elcom/electricityprice/product/standard>;
      elcom:fixcosts ?fixcosts;
      elcom:total ?variablecosts;
      elcom:gridusage ?grid;
      elcom:energy ?energy;
      elcom:charge ?community_fees;
      elcom:aidfee ?aidfee.
    
    BIND(IRI(REPLACE(STR(?muni_iri),"https://ld.admin.ch/", "https://ld.geo.admin.ch/boundaries/")) AS ?municipality) .
  
    {SERVICE <https://ld.geo.admin.ch/query> {
      SELECT ?municipality ?name ?population ?polygon WHERE {
        
        ?municipality dct:hasVersion ?version ;
                      geonames:featureCode geonames:A.ADM3 .
        
        ?version schema:validUntil "2020-12-31"^^<http://www.w3.org/2001/XMLSchema#date>;
             geonames:population ?population ;
             schema:name ?name .
        
        ?version geosparql:hasGeometry/geosparql:asWKT ?polygon
    }
    }}
}
LIMIT 500
"""

#df = sparql.send_query(query)
#df.head()