# Objective

In this project I wanted to find out:
1. How has the energy system changed in each country from 1965 to the present?
2. To what degree are our countries making progress on reducing electricity generation from high-carbon sources?
3. How are we compensating for this reduction: with **Renewables** or **Nuclear**?

<br>
The datasets were downloaded from **Our World in Data** and the **World Resources Institute**. The data is publicly available and can be downloaded from the links below:<br>
Sources: 

- <a href="https://ourworldindata.org/energy">Our World in Data</a>
- <a href="https://datasets.wri.org/dataset/globalpowerplantdatabase">World Resources Institute</a>


### Contents:<a class="anchor" id="cero"></a>
1. [Primary energy consumption from Fossil Fuels, Nuclear, and Renewables](#uno)
<br> Here I illustrate the % share of energy consumption from each source for each country, and the global shift in energy sources from 1965 to 2020.<br>
    - [result_1](#uno-uno)<br>
    - [result_2](#uno-dos)<br><br>

2. [Compensated by Renewables or Nuclear?](#dos)
<br> Which direction did our countries take when reducing fossil-based energy sources?
    - [result_compensation](#dos-uno)<br><br>

3. [The geolocation of all power plants in Europe](#tres)
<br> Here you can find the location of all power plants, along with information on plant capacity, generation, ownership, and fuel type.
    - [result_geolocations](#tres-uno)<br><br>
    
4. [Share of energy consumption/production in each country](#cuatro)
<br> Finally, a treemap of the percentage share of each source in consumption or production in 2019.
    - [result_consumption](#cuatro-uno)<br>
    - [result_production](#cuatro-dos)

In [None]:
#Importing the libraries
import pandas as pd
import geopandas as gpd
from plotly.subplots import make_subplots
import plotly.express as px
import plotly.graph_objects as go

# for making gif
# from chart_studio.plotly import image as PlotlyImage
# from PIL import Image as PILImage
# import io
# import chart_studio
import glob
from PIL import Image

## Primary energy consumption from Fossil Fuels, Nuclear, and Renewables <a class="anchor" id="uno"></a>
jump back to [contents](#cero)

#### Import, Clean, Transform

In [None]:
# import the data
df = pd.read_csv('../input/energy-consumption-and-generation-in-the-globe/Primary-energy-consumption-from-fossilfuels-nuclear-renewables.csv')
df.head()


Add the continent of each country from geopandas:

In [None]:
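# note: gpd.datasets was deprecated in GeoPandas 0.14 and removed in 1.0, so this cell needs geopandas < 1.0
cont = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))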
cont = cont[['name','continent']]

df = df.merge(cont, how='left', left_on='Entity', right_on='name').drop(columns=['name'])

Check whether any countries still have no continent assigned:

In [None]:
df.loc[pd.isna(df.continent),:].Entity.unique()
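
The same check can also be expressed with the `indicator` option of `merge`, which records whether each row found a match. The sketch below uses made-up toy frames (`energy`, `lookup`) rather than the real datasets, so the names and values are assumptions for illustration only:

```python
import pandas as pd

# toy stand-ins for the energy table and the name-to-continent lookup
energy = pd.DataFrame({'Entity': ['France', 'North Macedonia', 'World'],
                       'Fossil': [50.0, 80.0, 85.0]})
lookup = pd.DataFrame({'name': ['France', 'Macedonia'],
                       'continent': ['Europe', 'Europe']})

# indicator=True adds a '_merge' column: 'both' or 'left_only' for a left join
merged = energy.merge(lookup, how='left', left_on='Entity',
                      right_on='name', indicator=True)

# entities with no counterpart in the lookup table
unmatched = merged.loc[merged['_merge'] == 'left_only', 'Entity'].unique()
print(list(unmatched))
```

Aggregates such as `World` show up in this list alongside countries whose names are simply spelled differently in the two sources, so the list still needs a manual look.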

Since I will focus only on Europe, there is one country that needs to be updated. Then I will keep only the European countries:

In [None]:
df.loc[df.Entity=='North Macedonia','continent'] = 'Europe'

# keep european countries
df = df[df['continent']=='Europe'].reset_index(drop=True)

# rename the columns
df.rename(columns={
    'Fossil fuels (% sub energy)': 'Fossil',
    'Nuclear (% sub energy)': 'Nuclear',
    'Renewables (% sub energy)': 'Renewables'},inplace=True)

df.head()

Apparently, for some countries the data was not available between 1965 and 1989:

In [None]:
bins=max(df['Year'])-min(df['Year'])+1
df['Year'].hist(bins=bins, grid=False)
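
The histogram shows how many rows exist per year. As a complementary check, here is a minimal sketch on toy data (the entities `A` and `B` are hypothetical) that counts the reporting countries per year and finds the first year each country appears:

```python
import pandas as pd

# toy data: country A reports from 1965, country B only from 1966
toy = pd.DataFrame({
    'Entity': ['A', 'A', 'A', 'B', 'B'],
    'Year':   [1965, 1966, 1967, 1966, 1967],
})

# number of distinct countries with a record in each year
per_year = toy.groupby('Year')['Entity'].nunique()

# first year with data for each country
first_year = toy.groupby('Entity')['Year'].min()

print(per_year)
print(first_year)
```

Applied to the real `df`, `first_year` would make it easy to list which countries only start reporting later in the period.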

#### Visualization

#### result_1 <a class="anchor" id="uno-uno"></a>

In [None]:
fig = px.choropleth(df,
    locations='Code',
    color='Nuclear',
    locationmode='ISO-3',
    animation_frame="Year",
)

fig.update_layout(
    title=dict(
        text='Primary energy consumption from Nuclear energy',
        x=.5,
        font_size=18,
        ),

    geo=dict(
        bgcolor='#8ad6ff',
        lakecolor='#8ad6ff',
        projection_type='miller',
        scope='europe'
        ),
    
    width = 700,
    height = 700,
    coloraxis=dict(colorscale='Reds',cmin=df['Nuclear'].min(),cmax=df['Nuclear'].max())
)

fig.layout.updatemenus[0].buttons[0].args[1]["frame"]["duration"] = 100
fig.show()

#### result_2 <a class="anchor" id="uno-dos"></a>

Now let's see how all three energy sources evolve over time:

In [None]:
steps = list()    
for i in range(df['Year'].min(),df['Year'].max()+1):
    step = dict(
        method='restyle',
        args=['visible', [False] * len(df['Year'].unique()) * 2],
        label=' {}'.format(i)
    )
    steps.append(step)

# create a dataframe for styling
map_ = pd.DataFrame({'Nuclear': [0.45,0.8,'YlOrRd'],
                    'Fossil': [.95,.8,'Greys'],
                    'Renewables': [0.45,0.3,'Greens'],
                     })

# create subplots
ite=0
for year in range(df['Year'].min(),df['Year'].max()+1):
    fig = make_subplots(rows=2, cols=2, 
                        specs=[[{"type": "choropleth"}, {"type": "choropleth"}], 
                               [{"type": "choropleth"}, {"type": "bar"}]], 
                        subplot_titles=('Nuclear', 'Fossil', 'Renewables','Global Energy Consumption in Europe'))

    layout = dict(
        autosize = False,
        width = 1000,
        height = 1000,
        plot_bgcolor='rgba(0,0,0,0.1)',
        xaxis= {'title': 'Energy source','domain':[0.6, 0.95]},
        yaxis= {'title': 'Share (%)', 'range': [0, 100],'domain':[0.15, 0.45]},

        sliders=[dict(
            active=ite,
            steps=steps,
            y=1.2,
            )],
        )

    # plot the three maps
    df2 = df[df['Year']==year].reset_index(drop=True)
    r=0
    z=0
    for index, col in enumerate(map_):

        geo_key = 'geo'+str(index+1)

        fig.add_trace(go.Choropleth(
                locations=df2['Code'],
                z=df2[col].astype(float),
                locationmode='ISO-3',
                colorscale=map_[col][2],
                autocolorscale=False,
                marker_line_color='white',
                geo=geo_key,
                zmin=df[col].min(),
                zmax=df[col].max(),

                colorbar=dict(
                    title = '% energy',
                    thickness=10, 
                    x=map_[col][0], 
                    y=map_[col][1],
                    len=0.35),
                    ), 

                row=1+r, col=z+1)

        z = z+1
        if z==2:
            r = r+1
            z = z-2

        layout[geo_key] = dict(
                scope = 'europe',
                projection_type='natural earth',
                domain = dict( x = [], y = [] ),
                lakecolor='#7cd6fc',
                )


    # position of maps
    z = 0
    COLS = 2
    ROWS = 2
    for y in reversed(range(ROWS)):
        for x in range(COLS):
            geo_key = 'geo'+str(z+1)
            layout[geo_key]['domain']['x'] = [float(x)/float(COLS), float(x+1)/float(COLS)-.03]
            layout[geo_key]['domain']['y'] = [float(y)/float(ROWS), float(y+1)/float(ROWS)]
            z=z+1
            if z > 2:
                break


    # bar plot
    x1 = ['Nuclear', 'Fossil', 'Renewables']
    y1 = [df2['Nuclear'].mean(), df2['Fossil'].mean(), df2['Renewables'].mean()]

    fig.add_trace(go.Bar(
            x=x1, 
            y=y1,
            text=y1,
            texttemplate = '%{text:.2s}%',
            textposition='auto',
            marker_color=['red','black','green'],
            width=.8,
            ), 
            row=2, col=2)


    ite=ite+1


    fig.add_annotation(
        text='Source:\
<a href="https://ourworldindata.org/energy">\
Our World in Data</a>',    
        xref="paper", 
        yref="paper",
        font_color='black',
        x=0, y=0, 
        showarrow=False)

    fig.layout.annotations[0].update(y=0.97)
    fig.layout.annotations[1].update(y=0.97)
    fig.layout.annotations[2].update(y=0.48)
    fig.layout.annotations[3].update(y=0.48)

    fig.update_layout(layout)
#     fig.write_image("images/"+str(year)+".jpg",scale=2)

fig.show()
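
The map-positioning loop in the cell above fills a 2 x 2 grid from the top row down and leaves the last slot free for the bar chart. A minimal sketch of that calculation on its own, with a hypothetical helper `grid_domains` (not part of plotly) and the same 0.03 horizontal gap:

```python
def grid_domains(rows, cols, n, gap=0.03):
    """Return n dicts of 'x'/'y' domain ranges for a rows x cols grid,
    filled left to right, top row first."""
    domains = []
    for r in reversed(range(rows)):      # the top row has the largest y range
        for c in range(cols):
            if len(domains) == n:        # stop once n panels are placed
                return domains
            domains.append({
                'x': [c / cols, (c + 1) / cols - gap],
                'y': [r / rows, (r + 1) / rows],
            })
    return domains

for i, d in enumerate(grid_domains(2, 2, 3), start=1):
    print(f"geo{i}", d)
```

Each dictionary could be assigned to `layout['geo1']['domain']`, `layout['geo2']['domain']`, and so on, which is what the nested loop does in place.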

After saving the images, let's make a GIF :)

In [None]:
# # filepaths
# fp_in = "images/*.jpg"
# fp_out = "results/primary_consumption.gif"

# img, *imgs = [Image.open(f) for f in sorted(glob.glob(fp_in))]
# img.save(fp=fp_out, format='GIF', append_images=imgs,
#          save_all=True, duration=200, loop=0)

## Compensated by Renewables or Nuclear? <a class="anchor" id="dos"></a>
jump back to [contents](#cero)

Here I was curious to know how our countries reacted to the need to reduce high-carbon energy sources like oil and gas. Some countries turned to nuclear energy as a solution, while others chose to move towards green energy. Let's have a look!<br><br>
The metric I used to quantify the shift toward either source is:<br>

-**fossil** x (**renewable** - **nuclear**)

<br> 
where each bold term denotes the slope of a linear regression fit to that source's share from 1965 to the present. When the fossil share is falling, a positive value means the decline was mainly compensated by renewables, and a negative value means it was mainly compensated by nuclear.
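
To make the sign of this metric concrete, here is a minimal worked sketch on synthetic data. It uses `numpy.polyfit` for the slopes instead of scikit-learn, and the helper `shift_score` and both "countries" are hypothetical:

```python
import numpy as np

def shift_score(t, fossil, nuclear, renewables):
    """Return -slope(fossil) * (slope(renewables) - slope(nuclear))."""
    def slope(y):
        # degree-1 fit; the first coefficient is the slope
        return np.polyfit(t, y, 1)[0]
    return -slope(fossil) * (slope(renewables) - slope(nuclear))

t = np.arange(10)  # years since the start of the series

# country A: fossil falls 2 points/year, renewables rise 2 points/year
score_a = shift_score(t, 90 - 2 * t, 0 * t, 10 + 2 * t)

# country B: fossil falls 2 points/year, nuclear rises 2 points/year
score_b = shift_score(t, 90 - 2 * t, 10 + 2 * t, 0 * t)

print(score_a > 0, score_b < 0)
```

With exactly linear inputs the slopes are -2, 2 and 0, so country A scores about +4 and country B about -4: the sign tells which source took over, and the magnitude grows with both the fossil decline and the gap between the two low-carbon trends.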

In [None]:
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt
import numpy as np

# the `normalize` argument was removed in scikit-learn 1.2; the fitted slope does not depend on it
regr = LinearRegression()

list_=df['Entity'].unique()
dff = pd.DataFrame(index=range(len(list_)),columns=["Entity", "R/F",'continent'])

for l in range(len(list_)):
    df3=df[df['Entity']==list_[l]]

    Y=['Nuclear','Fossil','Renewables']
    X = df3[['Year']]

    z = np.zeros(3)
    for i in range(len(Y)):
        y=df3[[Y[i]]]
        regr.fit(X, y)
        # coef_ has shape (1, 1) because y is a one-column DataFrame
        z[i]=regr.coef_.ravel()[0]


    out2=-z[1]*(z[2]-z[0])
    
    dff.at[l,'Entity'] = list_[l]
    dff.at[l,'R/F'] = out2
    dff.at[l,'continent'] = 'Europe'
    
dff['R/F'] = pd.to_numeric(dff['R/F'])

Find the countries which had the highest relative shift from fossil energy to renewables and nuclear energy, respectively:

In [None]:
df3=df[df['Entity']==dff['Entity'].iloc[dff['R/F'].idxmax()]]
df3.plot.line(x='Year', y=['Nuclear','Fossil','Renewables'],
              title=dff['Entity'].iloc[dff['R/F'].idxmax()],
              color=['r','k','g'])
plt.show()

In [None]:
df3=df[df['Entity']==dff['Entity'].iloc[dff['R/F'].idxmin()]]
df3.plot.line(x='Year', y=['Nuclear','Fossil','Renewables'],
              title=dff['Entity'].iloc[dff['R/F'].idxmin()],
              color=['r','k','g'])
plt.show()
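
The two cells above pick the extremes with `idxmax` and `idxmin`. A small sketch of the same lookup on a toy score table (the entities and scores are made up) also shows how to get the full ranking:

```python
import pandas as pd

scores = pd.DataFrame({
    'Entity': ['A', 'B', 'C'],   # hypothetical countries
    'R/F': [0.8, -1.5, 0.1],     # made-up scores
})

# idxmax/idxmin return index labels, so .loc is the matching accessor
most_renewable = scores.loc[scores['R/F'].idxmax(), 'Entity']
most_nuclear = scores.loc[scores['R/F'].idxmin(), 'Entity']

# full ranking from most renewable-leaning to most nuclear-leaning
ranking = scores.sort_values('R/F', ascending=False)['Entity'].tolist()
print(most_renewable, most_nuclear, ranking)
```

The notebook's `iloc[...idxmax()]` gives the same answer only because `dff` has a default 0..n-1 index, where labels and positions coincide.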

#### result_compensation <a class="anchor" id="dos-uno"></a>

In [None]:
ite=0
steps = list()    
for i in range(df['Year'].min(),df['Year'].max()+1):
    step = dict(
        method='restyle',
        args=['visible', [False] * len(df['Year'].unique()) * 2],
        label=' {}'.format(i)
    )
    steps.append(step)
    
    
for year in range(df['Year'].min(),df['Year'].max()+1):
    fig = make_subplots(rows=2, cols=2, 
                            specs=[[{"type": "bar"}, {"type": "bar"}], 
                                   [{"type": "treemap", "colspan": 2}, None]], 
                            subplot_titles=('Iceland', 'France', 
                                            '% Shift from fossil energy to low-carbon sources'
                                            ))

    layout = dict(
        autosize = False,
        width = 1000,
        height = 1000,
        plot_bgcolor='rgba(0,0,0,0.1)',
        showlegend=False,
        xaxis= {'title': 'Energy source','domain':[0, 0.45]},
        yaxis= {'title': 'Share (%)', 'range': [0, 100],'domain':[0.55, 1]},
        xaxis2= {'title': 'Energy source','domain':[0.55, 1]},
        yaxis2= {'title': 'Share (%)', 'range': [0, 100],'domain':[0.55, 1]},

        sliders=[dict(
            active=ite,
            steps=steps,
            y=1.2
            )]
        )

    # first bar "Iceland"
    df2 = df[df['Year']==year].reset_index(drop=True)
    df3=df2[df2['Entity']=='Iceland']
    x1 = ['Nuclear', 'Fossil', 'Renewables']
    y1 = [df3['Nuclear'].mean(), df3['Fossil'].mean(), df3['Renewables'].mean()]
    fig.add_trace(go.Bar(
                x=x1, y=y1,
                text=y1,
                texttemplate = '%{text:.2s}%',
                textposition='auto',
                marker_color=['red','black','green'],
                width=.8,
            ), row=1, col=1)


    # second bar "France"
    df3=df2[df2['Entity']=='France']
    x1 = ['Nuclear', 'Fossil', 'Renewables']
    y1 = [df3['Nuclear'].mean(), df3['Fossil'].mean(), df3['Renewables'].mean()]
    fig.add_trace(go.Bar(
                x=x1, y=y1,
                text=y1,
                texttemplate = '%{text:.2s}%',
                textposition='auto',
                marker_color=['red','black','green'],
                width=.8,
            ), row=1, col=2)

    fig.add_trace(go.Treemap(
        labels=dff['Entity'].to_list(),
        parents=dff['continent'].to_list(),
        values=dff['R/F'].abs().to_list(),
        branchvalues='total',
        marker=dict(
            colors=dff['R/F'].to_list(),
            colorscale=["darkred",'white', "darkgreen"],
            cmid=0),
        ), row=2, col=1)

    fig.update_layout(layout)
    ite=ite+1

    fig.add_annotation(
        text='Source:\
<a href="https://ourworldindata.org/energy">\
Our World in Data</a>',    
        xref="paper", 
        yref="paper",
        font_color='black',
        x=0, y=-0.05, 
        showarrow=False)
#     fig.write_image("images1/"+str(year)+".jpg")
fig.show()

After saving the images, let's make a GIF :)

In [None]:
# # filepaths
# fp_in = "images1/*.jpg"
# fp_out = "results/compensation.gif"

# img, *imgs = [Image.open(f) for f in sorted(glob.glob(fp_in))]
# img.save(fp=fp_out, format='GIF', append_images=imgs,
#          save_all=True, duration=200, loop=0)

## The geolocation of all power plants in Europe <a class="anchor" id="tres"></a>
jump back to [contents](#cero)

#### Import, Clean, Transform

In [None]:
# import data
df = pd.read_csv('../input/energy-consumption-and-generation-in-the-globe/global_power_plant_database_last.csv')
df.head()

In [None]:
# keep only desired columns
df = df[['country_long','latitude','longitude','primary_fuel','estimated_generation_gwh_2017']]

Add the continent of each country from geopandas:

In [None]:
cont = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
cont = cont[['name','continent']]

df = df.merge(cont, how='left', left_on='country_long', right_on='name').drop(columns=['name'])

Check whether any countries still have no continent assigned:

In [None]:
df.loc[pd.isna(df.continent),:].country_long.unique()

In [None]:
continent_map = {'Bosnia and Herzegovina':'Europe','Czech Republic':'Europe'}
for country_, continent_ in continent_map.items():
    df.loc[df.country_long==country_, 'continent'] = continent_

# keep european countries
df = df[df['continent']=='Europe'].reset_index(drop=True)
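
The manual continent fixes can also be applied without a loop. This is a minimal sketch on a toy frame, where `plants` and `overrides` are hypothetical names, that fills only the missing continents from a dictionary and then filters to Europe:

```python
import pandas as pd

plants = pd.DataFrame({
    'country_long': ['Czech Republic', 'Germany', 'Brazil'],
    'continent': [None, 'Europe', 'South America'],
})
overrides = {'Czech Republic': 'Europe'}   # hypothetical manual fixes

# map() yields NaN for countries not in the dict, so fillna() only
# touches rows whose continent is missing and has an override
plants['continent'] = plants['continent'].fillna(
    plants['country_long'].map(overrides))

europe = plants[plants['continent'] == 'Europe'].reset_index(drop=True)
print(europe['country_long'].tolist())
```

Unlike the loop, this version never overwrites a continent that the merge already assigned.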

Check missing values:

In [None]:
df.isna().sum()

In [None]:
# remove them
df=df.dropna()
df.head()

#### result_geolocations <a class="anchor" id="tres-uno"></a>

In [None]:
# px.set_mapbox_access_token(open(".mapbox_token").read())

color_discrete_map ={
    'Hydro':'#425fff',
    'Gas':'#f569ff',
    'Wind':'#7cfcf4',
    'Solar':'#fff94d',
    'Oil':'#45010c',
    'Waste':'#77ff52',
    'Nuclear':'#ff3838',
    'Coal':'#c9c8c5',
    'Geothermal':'#ff8000',
    'Other':'#82ffaa'
    }

fig = px.scatter_mapbox(df, 
                        lat="latitude", 
                        lon="longitude", 
                        color="primary_fuel", 
                        size="estimated_generation_gwh_2017",
                        size_max=12, 
                        color_discrete_map = color_discrete_map)


fig.update_layout(
#     mapbox_style="dark",
    mapbox_style="carto-positron",
    showlegend=True,
    
    legend=dict(
            x=.04,
            y=.96,
            title='Type',
            title_font_color='white',
            bgcolor='rgba(0,0,0,0)',
            font_color='white',
            ),
    
    mapbox=dict(
        bearing=0,
        center=go.layout.mapbox.Center(lat=52, lon=10),
        zoom=3,
            ),
    
    title=dict(
    text='Location and types of power plant in Europe (2017)',
    x=.5,
    font_size=25,
        ),
    width=1000,
    height=700,
    )


fig.add_annotation(
    text='Source:\
<a href="https://datasets.wri.org/dataset/globalpowerplantdatabase">\
World Resources Institute</a>',
    xref="paper", 
    yref="paper",
    font_color='white',
    x=0, y=0, 
    showarrow=False)

# fig.write_image('results/Location-types-of-power-plant-in-Europe-2017.jpg',scale=2)
fig.show()

## Share of energy consumption/production in each country <a class="anchor" id="cuatro"></a>
jump back to [contents](#cero)

#### Import, Clean, Transform

In [None]:
df = pd.read_csv('../input/energy-consumption-and-generation-in-the-globe/share-energy-consum-by-source.csv')
df.head()

Add the continent of each country from geopandas:

In [None]:
cont = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
cont = cont[['name','continent']]

df = df.merge(cont, how='left', left_on='Entity', right_on='name').drop(columns=['name'])

Check whether any countries still have no continent assigned:

In [None]:
df.loc[pd.isna(df.continent),:].Entity.unique()

Since I will focus only on Europe, there is one country that needs to be updated. Then I will keep only the European countries:

In [None]:
df.loc[df.Entity=='North Macedonia','continent'] = 'Europe'

# keep european countries
df = df[df['continent']=='Europe'].reset_index(drop=True)

# rename the columns
df.rename(columns={
    'Oil (% sub energy)': 'Oil',
    'Coal (% sub energy)': 'Coal',
    'Solar (% sub energy)': 'Solar',
    'Nuclear (% sub energy)': 'Nuclear',
    'Hydro (% sub energy)': 'Hydro',
    'Wind (% sub energy)': 'Wind',
    'Gas (% sub energy)': 'Gas',
    'Other renewables (% sub energy)': 'Other renewables',
    }, inplace=True)

df.head()

Since the columns hold the % share of energy consumption, the sources should sum to exactly 100%.<br>
However, for some rows this is not the case (e.g. ~99.9%); below I slightly adjust the values of the *Other renewables* column so that each row sums to 100%.

In [None]:
fuel_type=['Oil', 'Coal', 'Solar', 'Nuclear', 'Hydro', 'Wind', 'Gas']

df['Other renewables'] = 100 - df[fuel_type].sum(axis=1)

# if 'Other renewables' comes out negative, subtract the excess from 'Oil' and set it to 0
index_=df.index[df['Other renewables']<0]
for i in range(len(index_)):
    df.at[index_[i],'Oil'] = df['Oil'].iloc[index_[i]]+df['Other renewables'].iloc[index_[i]]
    df.at[index_[i],'Other renewables'] = 0
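
The adjustment in the cell above can be sketched in vectorized form. The toy table below has only three made-up source columns, so it illustrates the idea rather than replacing the notebook code:

```python
import pandas as pd

toy = pd.DataFrame({
    'Oil':  [40.0, 60.0],
    'Coal': [30.0, 30.0],
    'Gas':  [29.9, 10.5],   # row 0 sums to 99.9, row 1 to 100.5
})
sources = ['Oil', 'Coal', 'Gas']

# what is missing (positive) or in excess (negative) relative to 100%
residual = 100 - toy[sources].sum(axis=1)

# a positive residual becomes 'Other renewables'; a negative one is taken off 'Oil'
toy['Other renewables'] = residual.clip(lower=0)
toy['Oil'] = toy['Oil'] + residual.clip(upper=0)

row_totals = toy[sources + ['Other renewables']].sum(axis=1)
print(row_totals.round(6).tolist())
```

Taking the excess from `Oil` mirrors the choice made in the notebook; spreading it proportionally over all sources would be another reasonable option.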

Now let's create a new DataFrame adapted for *plotly.treemap*

In [None]:
# list of desired columns
columns = ['Country','Year','fuel_type','fuel_type_val']

# list of unique countries
list_=df['Entity'].unique()

# list of energy sources
fuel_type = ['Oil', 'Coal', 'Solar', 'Nuclear', 'Hydro', 'Wind', 'Gas', 'Other renewables']

# list of year span
years=df['Year'].unique()

# preallocate the number of rows of the new dataframe
index=len(df)*len(fuel_type)

# create the dataframe
df_new=pd.DataFrame(index=range(index),columns=columns)

# insert the values from the old df
z=0
for i in range(len(df)):
    for f in range(len(fuel_type)):
        
        df_new.at[z,'Country'] = df['Entity'].iloc[i]
        df_new.at[z,'Year'] = df['Year'].iloc[i]
        df_new.at[z,'fuel_type'] = fuel_type[f]
        df_new.at[z,'fuel_type_val'] = df[fuel_type[f]].iloc[i]
        z=z+1
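
The wide-to-long reshape done by the nested loop above is what `DataFrame.melt` provides. A minimal sketch on a toy frame with two hypothetical countries and two sources:

```python
import pandas as pd

wide = pd.DataFrame({
    'Entity': ['A', 'B'],
    'Year': [2019, 2019],
    'Oil': [60.0, 20.0],
    'Hydro': [40.0, 80.0],
})

# one row per (country, year, source) instead of one column per source
long = wide.melt(
    id_vars=['Entity', 'Year'],
    value_vars=['Oil', 'Hydro'],
    var_name='fuel_type',
    value_name='fuel_type_val',
).rename(columns={'Entity': 'Country'})

print(long)
```

The rows come out grouped by source rather than by country, which makes no difference to `px.treemap`. The value column also keeps a numeric dtype, whereas the cell-by-cell fill starts from an empty object-typed frame.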

#### Visualization

I would like to show the results for the last year, which is **2019**.

In [None]:
# filter only data for 2019
df_new=df_new[df_new['Year']==2019]

# create a map of colorcodes for each energy source
color_discrete_map ={
    'Hydro':'#425fff',
    'Gas':'#f569ff',
    'Wind':'#7cfcf4',
    'Solar':'#fff94d',
    'Oil':'#45010c',
    'Nuclear':'#ff3838',
    'Coal':'#c9c8c5',
    'Other renewables':'#82ffaa',
    '(?)':'#b1bbc9',}

#### result_consumption <a class="anchor" id="cuatro-uno"></a>

In [None]:
fig = px.treemap(
    df_new, path=[px.Constant('Europe'),'Country', 'fuel_type'], 
    values='fuel_type_val',
    color='fuel_type',
    color_discrete_map = color_discrete_map,
    )

fig.update_layout(    
    title=dict(
    text='Share of energy consumption in Europe (2019)',
    x=.5,
    font_size=18,
        ),
    width=600,
    height=800,
    )

fig.add_annotation(
    text='Source:\
<a href="https://ourworldindata.org/energy">\
Our World in Data</a>',    
    xref="paper", 
    yref="paper",
    font_color='black',
    x=0, y=-0.05, 
    showarrow=False)

# fig.write_image('results/energy_consumption_2019.jpg',scale=2)
fig.show()

Now do exactly the same thing for the energy production data

#### Import, Clean, Transform

In [None]:
# import the data
df = pd.read_csv('../input/energy-consumption-and-generation-in-the-globe/share-elec-produc-by-source.csv')

# assign the continent to each country
cont = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
cont = cont[['name','continent']]
df = df.merge(cont, how='left', left_on='Entity', right_on='name').drop(columns=['name'])
df.loc[df.Entity=='North Macedonia','continent'] = 'Europe'

# keep european countries
df = df[df['continent']=='Europe'].reset_index(drop=True)

# keep only the countries for which data was available in the previous section
df = df[df['Entity'].isin(list_)]

# rename the columns
df.rename(columns={'Oil (% electricity)': 'Oil',
                   'Coal (% electricity)': 'Coal',
                   'Solar (% electricity)': 'Solar',
                   'Nuclear (% electricity)': 'Nuclear',
                   'Hydro (% electricity)': 'Hydro',
                   'Wind (% electricity)': 'Wind',
                   'Gas (% electricity)': 'Gas',
                   'Other renewables (% electricity)': 'Other renewables',
                   },inplace=True)


# normalize the share of energy production to 100%
fuel_type=['Oil', 'Coal', 'Solar', 'Nuclear', 'Hydro', 'Wind', 'Gas']
df['Other renewables'] = 100 - df[fuel_type].sum(axis=1)


# create a new dataframe
columns = ['Country','Year','fuel_type','fuel_type_val']
fuel_type = ['Oil', 'Coal', 'Solar', 'Nuclear', 'Hydro', 'Wind', 'Gas', 'Other renewables']
years=df['Year'].unique()
index=len(df)*len(fuel_type)
df_new=pd.DataFrame(index=range(index),columns=columns)
z=0
for i in range(len(df)):
    for f in range(len(fuel_type)):
        
        df_new.at[z,'Country'] = df['Entity'].iloc[i]
        df_new.at[z,'Year'] = df['Year'].iloc[i]
        df_new.at[z,'fuel_type'] = fuel_type[f]
        df_new.at[z,'fuel_type_val'] = df[fuel_type[f]].iloc[i]
        z=z+1

#### Visualization

In [None]:
# filter only data for 2019
df_new=df_new[df_new['Year']==2019]

# create a map of colorcodes for each energy source
color_discrete_map ={
    'Hydro':'#425fff',
    'Gas':'#f569ff',
    'Wind':'#7cfcf4',
    'Solar':'#fff94d',
    'Oil':'#45010c',
    'Nuclear':'#ff3838',
    'Coal':'#c9c8c5',
    'Other renewables':'#82ffaa',
    '(?)':'#b1bbc9',}

#### result_production <a class="anchor" id="cuatro-dos"></a>

In [None]:
fig = px.treemap(df_new, path=[px.Constant('Europe'),'Country', 'fuel_type'], 
                 values='fuel_type_val',
                 color='fuel_type',
                 color_discrete_map = color_discrete_map,
                 )

fig.update_layout(    
    title=dict(
    text='Share of energy production in Europe (2019)',
    x=.5,
    font_size=18,
        ),
    width=600,
    height=800,
    )

# fig.write_image('results/energy_production_2019.jpg',scale=2)
fig.show()