# Analysis of the Datasets and Strategy for AI-Powered Renewable Energy Consumption & Forecasting Dashboard

In [29]:
# Goal: analyze the datasets and work out the best forecasting strategy for the AI-Powered Renewable Energy Consumption & Forecasting Dashboard
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px



## Datasets Loading
### 1. Our World in Data (OWID) - Global Energy Dataset

In [30]:
# loading the datasets
# owid data
owid_data = pd.read_csv('data/OWID/owid-energy-data.csv')
owid_bookcoode = pd.read_csv('data/OWID/owid-energy-codebook.csv')

print(
    owid_data.shape,
    owid_bookcoode.shape,
)

(21812, 130) (130, 4)


In [31]:
owid_data.head()

Unnamed: 0,country,year,iso_code,population,gdp,biofuel_cons_change_pct,biofuel_cons_change_twh,biofuel_cons_per_capita,biofuel_consumption,biofuel_elec_per_capita,...,solar_share_elec,solar_share_energy,wind_cons_change_pct,wind_cons_change_twh,wind_consumption,wind_elec_per_capita,wind_electricity,wind_energy_per_capita,wind_share_elec,wind_share_energy
0,ASEAN (Ember),2000,,,,,,,,,...,0.0,,,,,,0.0,,0.0,
1,ASEAN (Ember),2001,,,,,,,,,...,0.0,,,,,,0.0,,0.0,
2,ASEAN (Ember),2002,,,,,,,,,...,0.0,,,,,,0.0,,0.0,
3,ASEAN (Ember),2003,,,,,,,,,...,0.0,,,,,,0.0,,0.0,
4,ASEAN (Ember),2004,,,,,,,,,...,0.0,,,,,,0.0,,0.0,


### 2. Global Energy Consumption & Renewable Generation (Kaggle)

In [32]:
# global energy data
continent_consumption = pd.read_csv("data/global_energy/Continent_Consumption_TWH.csv")
country_consumption = pd.read_csv("data/global_energy/Country_Consumption_TWH.csv")
non_renewable_total_power_generation = pd.read_csv(
    "data/global_energy/nonRenewablesTotalPowerGeneration.csv"
)
renewable_power_generation_97_17 = pd.read_csv(
    "data/global_energy/renewablePowerGeneration97-17.csv"
)
renewable_total_power_generation = pd.read_csv(
    "data/global_energy/renewablesTotalPowerGeneration.csv"
)
top_20_countries_power_generatoin = pd.read_csv(
    "data/global_energy/top20CountriesPowerGeneration.csv"
)

# print the shapes of the datasets
print(
    continent_consumption.shape,
    country_consumption.shape,
    non_renewable_total_power_generation.shape,
    renewable_power_generation_97_17.shape,
    renewable_total_power_generation.shape,
    top_20_countries_power_generatoin.shape,
)

(31, 12) (31, 45) (8, 2) (28, 5) (9, 2) (20, 6)


In [33]:
continent_consumption.head()

Unnamed: 0,Year,World,OECD,BRICS,Europe,North America,Latin America,Asia,Pacific,Africa,Middle-East,CIS
0,1990,101855.54,52602.49,26621.07,20654.88,24667.23,5373.06,24574.19,1197.89,4407.77,2581.86,16049.4
1,1991,102483.56,53207.25,26434.99,20631.62,24841.68,5500.99,24783.53,1186.26,4535.7,2744.68,15898.21
2,1992,102588.23,53788.75,25993.05,20189.68,25341.77,5628.92,25690.67,1209.52,4582.22,3081.95,14339.79
3,1993,103646.56,54614.48,26283.8,20189.68,25830.23,5675.44,26876.93,1267.67,4721.78,3349.44,13246.57
4,1994,104449.03,55579.77,25993.05,20085.01,26365.21,5989.45,28098.08,1279.3,4803.19,3640.19,11606.74


In [34]:
country_consumption.head()

Unnamed: 0,Year,China,United States,Brazil,Belgium,Czechia,France,Germany,Italy,Netherlands,...,Australia,New Zealand,Algeria,Egypt,Nigeria,South Africa,Iran,Kuwait,Saudi Arabia,United Arab Emirates
0,1990,874,1910,141,48,50,225,351,147,67,...,86,14,22,33,66,90,69,9,58,20
1,1991,848,1925,143,50,45,237,344,150,69,...,85,14,23,33,70,92,77,3,68,23
2,1992,877,1964,145,51,44,234,338,149,69,...,87,14,24,34,72,88,81,9,77,22
3,1993,929,1998,148,49,43,238,335,149,70,...,91,15,24,35,74,94,87,12,80,23
4,1994,973,2036,156,52,41,231,333,147,70,...,91,15,23,34,72,98,97,14,84,26


In [36]:
non_renewable_total_power_generation.head()

Unnamed: 0,Mode of Generation,Contribution (TWh)
0,Coal,9863.33
1,Natural Gas,5882.82
2,Nuclear,2636.03
3,Oil,841.87
4,Waste,114.04


In [37]:
renewable_power_generation_97_17.head()

Unnamed: 0,Year,Hydro(TWh),Biofuel(TWh),Solar PV (TWh),Geothermal (TWh)
0,1990,2191.67,3.88,0.09,36.42
1,1991,2268.63,4.19,0.1,37.39
2,1992,2267.16,4.63,0.12,39.3
3,1993,2397.67,5.61,0.15,40.23
4,1994,2419.73,7.31,0.17,41.05


In [38]:
renewable_total_power_generation.head()

Unnamed: 0,Mode of Generation,Contribution (TWh)
0,Hydro,9863.33
1,Wind,5882.82
2,Biofuel,2636.03
3,Solar PV,841.87
4,Geothermal,114.04


In [39]:
top_20_countries_power_generatoin.head()

Unnamed: 0,Country,Hydro(TWh),Biofuel(TWh),Solar PV (TWh),Geothermal (TWh),Total (TWh)
0,China,1189.84,295.02,79.43,0.125,1819.94
1,USA,315.62,277.91,58.95,18.96,758.619
2,Brazil,370.9,42.37,52.25,0.0,466.35
3,Canada,383.48,29.65,7.12,0.0,424.09
4,India,141.8,51.06,43.76,0.0,262.65


### 3. Energy Generation & Consumption (from multiple sources)

In [41]:
electricity_consumption_statistics = pd.read_csv('data/IRR_cleaned/ELECSTAT_CLEANED.csv')
heat_generations = pd.read_csv('data/IRR_cleaned/HEATGEN_CLEANED.csv')
share_of_renewables = pd.read_csv('data/IRR_cleaned/RESHARE_CLEANED.csv')
investment_in_energy_infrastructure = pd.read_csv('data/IRR_cleaned/PUBFIN_CLEANED.csv')

print(
    electricity_consumption_statistics.shape,
    heat_generations.shape,
    share_of_renewables.shape,
    investment_in_energy_infrastructure.shape,
)

(9648, 7) (9254, 5) (10123, 4) (197064, 4)


In [42]:
electricity_consumption_statistics.head()

Unnamed: 0,Region_Tech_Desc,Category,Data_Type,Year,Electricity_Output_GWh,Grid_Connection,Miscellaneous
0,World,Total renewable,Electricity Generation (GWh),2000.0,2846212.0,,
1,World,Total renewable,Electricity Installed Capacity (MW),2000.0,752238.54,,
2,World,Solar energy,Electricity Generation (GWh),2000.0,1312.13,,
3,World,Solar energy,Electricity Installed Capacity (MW),2000.0,1215.68,,
4,World,Wind energy,Electricity Generation (GWh),2000.0,30944.47,,


In [43]:
heat_generations.head()

Unnamed: 0,Country/Area,Technology,Grid Connection,Year,Heat Generation (TJ)
0,Albania,Coal and peat,Heat (Commercial),2000,84.0
1,Albania,Coal and peat,Heat (Commercial),2001,109.0
2,Albania,Coal and peat,Heat (Commercial),2002,77.0
3,Albania,Oil,Combined Heat and Power (CHP),2000,145.0
4,Albania,Oil,Combined Heat and Power (CHP),2001,138.0


In [44]:
share_of_renewables.head()

Unnamed: 0,Region/Country,Indicator,Year,Value
0,World,RE share of electricity generation (%),2000,18.31
1,World,,2001,17.82
2,World,,2002,17.76
3,World,,2003,17.29
4,World,,2004,17.73


In [45]:
investment_in_energy_infrastructure.head()

Unnamed: 0,Country,Technology,Year,Investment
0,Afghanistan,On-grid Solar photovoltaic,2022,0.0
1,Afghanistan,On-grid Solar photovoltaic,2021,0.0
2,Afghanistan,On-grid Solar photovoltaic,2020,0.0
3,Afghanistan,On-grid Solar photovoltaic,2019,4.38
4,Afghanistan,On-grid Solar photovoltaic,2018,48.17


# 1. Understanding the Datasets

After reviewing the three dataset groups loaded above, we can categorize them as follows:


## 1. Dataset Group 1: Our World in Data (OWID) - Global Energy Dataset

- **owid-energy-data.csv**: A dataset covering global energy production, electricity mix, and energy consumption trends from various sources (hydro, wind, solar, fossil fuels, etc.).

- **owid-energy-codebook.csv**: A codebook detailing column descriptions and data sources for the OWID dataset.
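
The first rows of `owid_data` shown earlier belong to "ASEAN (Ember)" and have an empty `iso_code`, which suggests that the table mixes regional aggregates with individual countries. The sketch below separates the two, assuming that a missing `iso_code` marks an aggregate row; this assumption should be checked against the codebook before relying on it. The `owid_sample` frame is a small stand-in, and its "Exampleland" row is hypothetical.

```python
import pandas as pd

# Stand-in for owid_data: two rows taken from the head() output above,
# plus one HYPOTHETICAL country row ("Exampleland", "XXX") for illustration.
owid_sample = pd.DataFrame(
    {
        "country": ["ASEAN (Ember)", "ASEAN (Ember)", "Exampleland"],
        "year": [2000, 2001, 2000],
        "iso_code": [None, None, "XXX"],
        "solar_share_elec": [0.0, 0.0, 0.0],  # placeholder for the hypothetical row
    }
)

# Assumption: rows without an ISO code are regional aggregates, not countries.
is_aggregate = owid_sample["iso_code"].isna()
countries_only = owid_sample[~is_aggregate]
aggregates_only = owid_sample[is_aggregate]

print(countries_only["country"].unique())
print(aggregates_only["country"].unique())
```

Keeping aggregates in a separate frame avoids double counting when country-level values are summed for the dashboard.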

## 2. Dataset Group 2: Global Energy Consumption & Renewable Generation
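- **renewablePowerGeneration97-17.csv:** Tracks yearly renewable power generation in TWh for hydro, biofuel, solar PV, and geothermal. The filename suggests 1997 to 2017, but the loaded table has 28 rows and its first rows start at 1990; wind does not appear among the loaded columns.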

- **renewablesTotalPowerGeneration.csv:** Summarizes total renewable power generation in TWh globally.

- **nonRenewablesTotalPowerGeneration.csv:** Summarizes total non-renewable power generation in TWh globally.

- **top20CountriesPowerGeneration.csv:** Lists renewable power generation for the top 20 countries, broken down by source (hydro, biofuel, solar PV, geothermal) with a total, in TWh.

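- **Country_Consumption_TWH.csv:** Records yearly energy consumption per country in wide format, with one column per country.

- **Continent_Consumption_TWH.csv:** Records yearly energy consumption for the world and for regional groupings (continents plus blocs such as OECD, BRICS, and CIS).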
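
Both consumption tables are stored in wide format (one column per country or region), whereas plotting and forecasting code usually expects one row per year and entity. A minimal sketch of that reshaping is given below; `country_sample` reuses a few values from the `country_consumption.head()` output, and the output column names `Country` and `Consumption` are chosen here for illustration.

```python
import pandas as pd

# Stand-in for country_consumption, using values from the head() output above.
country_sample = pd.DataFrame(
    {
        "Year": [1990, 1991, 1992],
        "China": [874, 848, 877],
        "United States": [1910, 1925, 1964],
        "Brazil": [141, 143, 145],
    }
)

# Wide -> long: every country column becomes (Country, Consumption) rows.
country_long = country_sample.melt(
    id_vars="Year", var_name="Country", value_name="Consumption"
)

print(country_long.head())
```

The same call should work for `continent_consumption`, since it also has a `Year` column followed by one column per region.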



## 3. Dataset Group 3: Energy Generation & Consumption (from multiple sources)

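- **HEATGEN_CLEANED.csv:** Contains heat generation in terajoules (TJ) by country or area, technology, grid-connection type, and year. This dataset could be useful for understanding how different energy sources contribute to heat production.

- **ELECSTAT_CLEANED.csv:** Contains electricity statistics by region, technology category, and year. Despite the variable name `electricity_consumption_statistics`, the rows shown report electricity generation (GWh) and installed capacity (MW), distinguished by the `Data_Type` column. Both are stored in the `Electricity_Output_GWh` column, so the unit of a value depends on its `Data_Type`.
- **RESHARE_CLEANED.csv:** Contains renewable energy share indicators by region or country and year, for example "RE share of electricity generation (%)". In the rows shown, the `Indicator` label is filled only on the first row of a series and is blank afterwards.
- **PUBFIN_CLEANED.csv:** Contains public investment in energy infrastructure by country, technology, and year. The unit of the `Investment` column is not visible in the preview.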
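
Two quirks visible in the `head()` outputs for this group would likely need handling before modelling: `share_of_renewables` has blank `Indicator` cells after the first row, and `electricity_consumption_statistics` mixes generation (GWh) and capacity (MW) in one value column. The sketch below shows one possible cleanup on small stand-in frames copied from those outputs. It assumes that a blank `Indicator` means "same indicator as the row above", which is not confirmed by the source.

```python
import pandas as pd

# Stand-in for share_of_renewables (values from the head() output above).
share_sample = pd.DataFrame(
    {
        "Region/Country": ["World"] * 5,
        "Indicator": ["RE share of electricity generation (%)", None, None, None, None],
        "Year": [2000, 2001, 2002, 2003, 2004],
        "Value": [18.31, 17.82, 17.76, 17.29, 17.73],
    }
)

# Assumption: a blank Indicator repeats the label of the row above it.
share_filled = share_sample.copy()
share_filled["Indicator"] = share_filled["Indicator"].ffill()

# Stand-in for electricity_consumption_statistics (values from head() above).
elec_sample = pd.DataFrame(
    {
        "Region_Tech_Desc": ["World"] * 5,
        "Category": ["Total renewable", "Total renewable", "Solar energy",
                     "Solar energy", "Wind energy"],
        "Data_Type": ["Electricity Generation (GWh)", "Electricity Installed Capacity (MW)",
                      "Electricity Generation (GWh)", "Electricity Installed Capacity (MW)",
                      "Electricity Generation (GWh)"],
        "Year": [2000.0] * 5,
        "Electricity_Output_GWh": [2846212.0, 752238.54, 1312.13, 1215.68, 30944.47],
    }
)

# Give generation and capacity their own columns so units are never mixed.
elec_wide = elec_sample.pivot_table(
    index=["Region_Tech_Desc", "Category", "Year"],
    columns="Data_Type",
    values="Electricity_Output_GWh",
    aggfunc="first",
).reset_index()

print(share_filled)
print(elec_wide)
```

If the real file contains several indicators per region, the forward fill should be validated against the raw source, because a blank at the start of a new series would inherit the previous label.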