# Exploratory Data Analysis in Action - EDA: Renewable Energies

Renewable Energies questions:
- Q1: How many power plants are present worldwide that utilize Renewable Energies and how many that do not?
- Q2: How much energy is produced in a green way as opposed to fossil and nuclear energy?
- Q3: Which European countries have the highest share of energy production from renewable sources? And which ones the lowest?
- Q4: Inspect a country of your choice. How much energy is gained from individual fuel types? Is the result trustworthy?
- Q5: Which types of renewable energies produce the most energy in total? Is there a difference between continents?
- Q6: _Come up with your own question_

In [2]:
%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt
import geopandas as gpd
import sys
sys.path.append("../src/")

from helper import cuteplot
from helper import minmax_scaler
plt.rcParams["figure.figsize"] = [22,10]

In [3]:
# pd.read_pickle opens and closes each file itself, so no file handles are left open.
gdf_world = pd.read_pickle("../data/gdf_world.p")
gdf_europe = pd.read_pickle("../data/gdf_europe.p")
gdf_germany = pd.read_pickle("../data/gdf_germany.p")

> **Q1: How many power plants are present worldwide that utilize Renewable Energies and how many that do not?**

In [9]:
gdf_world_clean = gdf_world.drop_duplicates(subset ='geometry')
len(gdf_world_clean.loc[gdf_world_clean['green']==True])

23458

In [12]:
len(gdf_world_clean.loc[gdf_world_clean['green']==False])

9215

In [11]:
gdf_world_clean.groupby('green')['geometry'].count()

green
False     9215
True     23458
Name: geometry, dtype: int64

> **Q2: How much energy is produced in a green way as opposed to fossil and nuclear energy?**

In [54]:
green_energy = gdf_world_clean.loc[gdf_world_clean['green']==True]
unsustainable_energy = gdf_world_clean.loc[gdf_world_clean['green']==False]
c = gdf_world_clean['capacity in MW'].sum()
c

5167012.669828

In [52]:
a = green_energy['capacity in MW'].sum()
a

1487577.0025099998

In [53]:
b = unsustainable_energy['capacity in MW'].sum()
b

3679435.667318

In [28]:
print(a/c*100,'% green')
print(b/c*100,'% unsustainable')

28.7898849406057 % green
71.21011505939431 % unsustainable
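
The two percentages above can also be obtained in one step by grouping on `green` and dividing by the grand total. The following is a minimal sketch on a toy frame: the numbers are invented for illustration, and only the column names `green` and `capacity in MW` are taken from the notebook. Note that the column holds installed capacity (MW), so the result is a proxy for energy produced rather than a measure of actual generation.

```python
import pandas as pd

# Toy stand-in for gdf_world_clean; the numbers are made up for illustration.
toy = pd.DataFrame({
    "green": [True, True, False, False],
    "capacity in MW": [100.0, 200.0, 300.0, 400.0],
})

# Sum capacity per group, then divide by the grand total to get shares in percent.
capacity_by_group = toy.groupby("green")["capacity in MW"].sum()
share_percent = capacity_by_group / capacity_by_group.sum() * 100

print(share_percent.round(1))
```

Applied to the real data, the same two lines would replace the separate `a`, `b` and `c` variables.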


In [55]:
#b.isnull().sum()
#unsustainable_energy
#green_energy

> **Q3: Which European countries have the highest share of energy production from renewable sources? And which ones the lowest?**

> _Hint: consider using `df.groupby()` with a list of columns to group on. You can use `df.unstack()` afterwards to get rid of the multi-dimensional Index._
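
The hint can be sketched as follows on a toy frame. The data and the helper column `kind` are assumptions made for illustration; only `country`, `green` and `capacity in MW` appear in the notebook. For the European question, `gdf_europe` is presumably the more fitting input than the worldwide frame. Unstacking puts both categories into the same row per country, so the share is computed per country name rather than per row position.

```python
import pandas as pd

# Toy stand-in for the plant table; all values are invented.
plants = pd.DataFrame({
    "country": ["A", "A", "A", "B", "B", "C"],
    "green": [True, False, False, True, True, False],
    "capacity in MW": [50.0, 100.0, 50.0, 30.0, 10.0, 80.0],
})

# Hypothetical helper column with readable labels instead of True/False.
plants["kind"] = plants["green"].map({True: "green", False: "non_green"})

# Group on a list of columns, then unstack the inner level into columns.
# fill_value=0 covers countries that lack one of the two categories.
capacity = (
    plants.groupby(["country", "kind"])["capacity in MW"]
    .sum()
    .unstack(fill_value=0)
)

# Green share per country, ranked from highest to lowest.
green_share = capacity["green"] / (capacity["green"] + capacity["non_green"])
ranking = green_share.sort_values(ascending=False)

print(ranking)
print("highest:", ranking.index[0], "lowest:", ranking.index[-1])
```

Using `.index[0]` and `.index[-1]` on the sorted Series picks the first and last rows by position, which is what a ranking needs.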

In [78]:
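# Sum capacity per country for each group. green_energy and unsustainable_energy come from
# gdf_world_clean, so this covers all countries, not only European ones.
share_green = green_energy.groupby(['country'],as_index=False)['capacity in MW'].sum()
share_unsustain = unsustainable_energy.groupby(['country'],as_index=False)['capacity in MW'].sum()

#share_green.isnull().sum()
#share_unsustain.isnull().sum()
# Caution: with as_index=False both frames carry a 0..n-1 index, so this division pairs rows by
# row number, not by country. It is only valid if both frames list the same countries in the same order.
share_green['result'] = share_green['capacity in MW']/(share_green['capacity in MW']+share_unsustain['capacity in MW'])

result = share_green[['country','result']]
result = result.sort_values(by='result', ascending=False)

# Caution: [0] selects the row labelled 0 (the alphabetically first country), not the top-ranked
# row after sorting; result['country'].iloc[0] would return the latter.
result['country'][0]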

'Afghanistan'

In [None]:
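# Index both sums by country so the division aligns on country names.
green_cap = green_energy.groupby('country')['capacity in MW'].sum()
total_cap = gdf_world_clean.groupby('country')['capacity in MW'].sum()
share_total = (green_cap / total_cap).fillna(0)  # countries without green plants get share 0

share_total.sort_values(ascending=False)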

In [36]:
green_energy.groupby('country')['capacity in MW'].sum().sort_values(ascending=False)

country
China                       6.951384e-02
United States of America    4.811607e-02
Brazil                      2.323240e-02
Canada                      1.853978e-02
India                       1.389281e-02
                                ...     
Cuba                        3.715880e-06
Kuwait                      1.935354e-06
Palestine                   1.470869e-06
Niger                       1.354748e-06
Suriname                    9.676771e-07
Name: capacity in MW, Length: 144, dtype: float64

> **Q4: Inspect a country of your choice. How much energy is gained from individual fuel types? Is the result trustworthy?**

> _Hint: Double check your result with another source. Is the dataset trustworthy for your country?_

> _Note: Consider plotting plants on a map of the country. Consider using the `map_extent` argument in the `cuteplot()` function._
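
One possible starting point for Q4 is a per-fuel summary with absolute capacity and percentage share, which can then be compared against figures from another source. This is a sketch on invented data; the column name `fuel` is an assumption, so check the real name with `gdf_germany.columns` first.

```python
import pandas as pd

# Toy stand-in for one country's plants. The column name "fuel" is an
# assumption, and all values are invented.
country_plants = pd.DataFrame({
    "fuel": ["Coal", "Wind", "Solar", "Coal", "Gas", "Wind"],
    "capacity in MW": [800.0, 150.0, 50.0, 600.0, 300.0, 100.0],
})

# Capacity per fuel type and its share of the country total, in percent.
by_fuel = country_plants.groupby("fuel")["capacity in MW"].sum()
summary = pd.DataFrame({
    "capacity in MW": by_fuel,
    "share in %": by_fuel / by_fuel.sum() * 100,
}).sort_values("capacity in MW", ascending=False)

print(summary.round(1))
```

Large gaps between these shares and an external reference would suggest that plants are missing from the dataset for the chosen country.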

> **Q5: Which types of renewable energies produce the most energy in total? Is there a difference between continents?**
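
A possible approach to Q5 is to rank fuel types by summed capacity and then build a continent-by-fuel table to compare regions. The sketch below uses invented data; the column names `continent` and `fuel` are assumptions and may differ in the actual GeoDataFrame.

```python
import pandas as pd

# Toy stand-in for the green plants; "continent" and "fuel" are assumed
# column names, and all values are invented.
green_plants = pd.DataFrame({
    "continent": ["Europe", "Europe", "Asia", "Asia", "Asia", "Africa"],
    "fuel": ["Wind", "Hydro", "Hydro", "Solar", "Hydro", "Solar"],
    "capacity in MW": [120.0, 80.0, 500.0, 60.0, 300.0, 40.0],
})

# Worldwide ranking of renewable fuel types by summed capacity.
total_by_fuel = (
    green_plants.groupby("fuel")["capacity in MW"].sum().sort_values(ascending=False)
)

# Continent-by-fuel table; combinations without plants become 0.
by_continent = green_plants.pivot_table(
    index="continent", columns="fuel", values="capacity in MW",
    aggfunc="sum", fill_value=0,
)

# Renewable fuel type with the largest capacity on each continent.
top_fuel = by_continent.idxmax(axis=1)

print(total_by_fuel)
print(top_fuel)
```

`pivot_table` with `aggfunc="sum"` yields the same table as grouping on both columns and unstacking, so either form works here.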

> **Q6: _Come up with your own question_**