<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Import-libraries-and-initialize-main-variables" data-toc-modified-id="Import-libraries-and-initialize-main-variables-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Import libraries and initialize main variables</a></span></li><li><span><a href="#Read-data-from-API-and-CSV" data-toc-modified-id="Read-data-from-API-and-CSV-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Read data from API and CSV</a></span><ul class="toc-item"><li><span><a href="#AEMET" data-toc-modified-id="AEMET-2.1"><span class="toc-item-num">2.1&nbsp;&nbsp;</span>AEMET</a></span></li><li><span><a href="#REE" data-toc-modified-id="REE-2.2"><span class="toc-item-num">2.2&nbsp;&nbsp;</span>REE</a></span></li><li><span><a href="#Other-data" data-toc-modified-id="Other-data-2.3"><span class="toc-item-num">2.3&nbsp;&nbsp;</span>Other data</a></span></li></ul></li><li><span><a href="#Saving-Data-to-CSV" data-toc-modified-id="Saving-Data-to-CSV-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Saving Data to CSV</a></span></li></ul></div>

# Import libraries and initialize main variables

The Lectura_AEMET_REE library contains the code to import, clean and save the data read through the AEMET and REE APIs.

To read the AEMET data, we use the aemet library (https://pypi.org/project/python-aemet/), which provides methods for working with the AEMET OpenData API.

The AEMET API requires a private key to read the data. To obtain your own key, follow the instructions at: https://opendata.aemet.es/centrodedescargas/altaUsuario.
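Since the key is private, it is safer to keep it out of the notebook itself. A minimal sketch, assuming the key is stored in an environment variable; the variable name `AEMET_API_KEY` and the helper `get_aemet_key` are hypothetical and are not part of the aemet library or of Lectura_AEMET_REE:

```python
import os


def get_aemet_key(env_var="AEMET_API_KEY"):
    """Return the AEMET key stored in an environment variable.

    The variable name is an assumption made for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your AEMET OpenData key"
        )
    return key
```

The returned string can then be handed to whatever component needs the key, so the key never appears in the saved notebook.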

The REE data is read with the requests library through the REE REST API. For more info: https://www.ree.es/es/apidatos

In [1]:
# Import the main classes to read AEMET and REE from a custom library
from Lectura_AEMET_REE import Ingestion_AEMET,Ingestion_REE

import pandas as pd
import numpy as np
import datetime as dt

pd.options.display.max_columns=None

# Create the objects used to read and save the data.
Ing_AEMET=Ingestion_AEMET()
Ing_REE=Ingestion_REE()

# Date interval to read REE and AEMET data
date_ini="2016-01-01T00:00:00UTC"
date_end="2021-06-30T00:00:00UTC"

# Read data from API and CSV

##  AEMET

To read the meteorological data we use the aemet library. Its Aemet and Estacion classes provide the daily data of each weather station for a date range (date_ini to date_end), restricted to the provinces of interest for the study.

In [2]:
# Read the data from AEMET. It takes about 15-20 min per year of data because the API limits the number of requests per minute.
df_weather=Ing_AEMET.read_weather_dates(date_ini,date_end)

Reading list of id of weather stations...


  0%|          | 0/291 [00:00<?, ?it/s]

Reading AEMET data from 2016-01-01T00:00:00UTC to 2016-12-31T00:00:00UTC ...


  0%|          | 0/291 [00:00<?, ?it/s]

Reading AEMET data from 2017-01-01T00:00:00UTC to 2017-12-31T00:00:00UTC ...

Reading AEMET data from 2018-01-01T00:00:00UTC to 2018-12-31T00:00:00UTC ...

Reading AEMET data from 2019-01-01T00:00:00UTC to 2019-12-31T00:00:00UTC ...

Reading AEMET data from 2020-01-01T00:00:00UTC to 2020-12-31T00:00:00UTC ...

Reading AEMET data from 2021-01-01T00:00:00UTC to 2021-06-30T00:00:00UTC ...

Finish reading AEMET date from 2016-01-01 to 2021-06-20
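
The log above suggests that the date range is read one calendar year at a time. A sketch of how such a split could be done with the standard library; `yearly_chunks` is a hypothetical helper written for illustration, not the actual implementation inside Lectura_AEMET_REE:

```python
import datetime as dt


def yearly_chunks(start, end):
    """Split [start, end] into consecutive sub-ranges that never cross a year boundary."""
    chunks = []
    current = start
    while current <= end:
        # Each chunk ends on 31 December or on the overall end date, whichever comes first
        chunk_end = min(dt.date(current.year, 12, 31), end)
        chunks.append((current, chunk_end))
        # The next chunk starts on 1 January of the following year
        current = dt.date(current.year + 1, 1, 1)
    return chunks


chunks = yearly_chunks(dt.date(2016, 1, 1), dt.date(2021, 6, 30))
for first, last in chunks:
    print(first.isoformat(), "->", last.isoformat())
```

Reading in smaller chunks keeps each batch of requests bounded, which helps when the API limits the number of requests per minute.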


In [3]:
df_weather.head()

Unnamed: 0,fecha,indicativo,nombre,provincia,altitud,tmed,prec,tmin,horatmin,tmax,horatmax,dir,velmedia,racha,horaracha,sol,presMax,horaPresMax,presMin,horaPresMin
0,2016-01-01,0252D,ARENYS DE MAR,BARCELONA,74,112,2,78,06:20,147,12:40,24,17,72,14:00,,,,,
1,2016-01-02,0252D,ARENYS DE MAR,BARCELONA,74,118,0,75,23:40,161,13:10,24,11,97,11:40,,,,,
2,2016-01-03,0252D,ARENYS DE MAR,BARCELONA,74,100,0,58,05:50,141,13:20,24,19,92,16:00,,,,,
3,2016-01-04,0252D,ARENYS DE MAR,BARCELONA,74,116,9,80,22:50,153,10:10,32,8,89,02:40,,,,,
4,2016-01-05,0252D,ARENYS DE MAR,BARCELONA,74,99,1,56,23:59,142,12:30,32,25,86,17:30,,,,,


## REE

To read the REE data I use the Python requests library against the REE API, obtaining the electricity generation by technology and electrical system.
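
As a rough illustration of what such a read involves, the sketch below builds a query URL and flattens a response into rows with the same columns as the dataframe previewed in this section. The endpoint path, the query parameter names and the JSON layout (`included`, `attributes`, `values`) are assumptions about the REE API and should be checked against its documentation; `build_ree_url` and `flatten_generation` are hypothetical helpers, not functions from Lectura_AEMET_REE. The sample payload reuses the first row of the preview, so no network call is made.

```python
from urllib.parse import urlencode

# Assumed endpoint for the generation structure; verify against the REE API docs.
BASE_URL = "https://apidatos.ree.es/es/datos/generacion/estructura-generacion"


def build_ree_url(start, end, time_trunc="day"):
    """Build the query URL for a date range (parameter names are assumptions)."""
    params = {"start_date": start, "end_date": end, "time_trunc": time_trunc}
    return f"{BASE_URL}?{urlencode(params)}"


def flatten_generation(payload, system):
    """Turn the assumed JSON layout into one flat row per technology and day."""
    rows = []
    for series in payload.get("included", []):
        attrs = series["attributes"]
        for point in attrs["values"]:
            rows.append({
                "value": point["value"],
                "percentage": point["percentage"],
                "datetime": point["datetime"],
                "title": attrs["title"],
                "type": attrs["type"],
                "system": system,
            })
    return rows


# Hand-written sample in the assumed layout, using the first row of the preview
sample = {
    "included": [{
        "attributes": {
            "title": "Hidráulica",
            "type": "Renovable",
            "values": [{
                "value": 29281.0,
                "percentage": 0.054518,
                "datetime": "2016-01-01T00:00:00.000+01:00",
            }],
        },
    }],
}

url = build_ree_url("2016-01-01T00:00", "2016-01-31T23:59")
rows = flatten_generation(sample, "peninsular")
```

In a real run the payload would come from something like `requests.get(url).json()`, and the list of rows can be passed directly to `pd.DataFrame`.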


In [4]:
df_ree=Ing_REE.read_ree_dates(date_ini,date_end)

  0%|          | 0/5 [00:00<?, ?it/s]

In [5]:
df_ree.head()

Unnamed: 0,value,percentage,datetime,title,type,system
0,29281.0,0.054518,2016-01-01T00:00:00.000+01:00,Hidráulica,Renovable,peninsular
1,35544.881,0.057414,2016-01-02T00:00:00.000+01:00,Hidráulica,Renovable,peninsular
2,35910.705,0.05415,2016-01-03T00:00:00.000+01:00,Hidráulica,Renovable,peninsular
3,65268.886,0.097287,2016-01-04T00:00:00.000+01:00,Hidráulica,Renovable,peninsular
4,79718.832,0.111964,2016-01-05T00:00:00.000+01:00,Hidráulica,Renovable,peninsular


## Other data

To improve the models and make them more realistic, I add the holiday information from the Madrid City Council work calendar.
https://datos.gob.es/en/catalogo/l01280796-calendario-laboral

I also create a new variable with the day of the week, because REE generation is closely related to it.
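
A small sketch of this step on toy data, to make the behaviour of the left merge explicit. The two frames below are hypothetical stand-ins for the weather and holiday dataframes, with made-up temperature values. Dates that are missing from the calendar get `NaN` after the merge, so the flag is filled with 0 here, and the weekday is derived from `fecha`, which is always present:

```python
import pandas as pd

# Hypothetical toy data standing in for the weather and holiday dataframes
weather = pd.DataFrame({
    "fecha": pd.to_datetime(["2016-01-01", "2016-01-02", "2016-01-04"]),
    "tmed": [11.2, 11.8, 11.6],
})
holidays = pd.DataFrame({
    "Dia": pd.to_datetime(["2016-01-01", "2016-01-06"]),
    "Holiday": [1, 1],
})

# The left merge keeps every weather row; unmatched dates get NaN in 'Holiday'
merged = weather.merge(holidays, how="left", left_on="fecha", right_on="Dia")
merged["Holiday"] = merged["Holiday"].fillna(0).astype(int)

# pandas numbers weekdays from Monday=0 to Sunday=6
merged["weekday"] = merged["fecha"].dt.dayofweek
merged = merged.drop(columns="Dia")
```

With this convention 2016-01-01, a Friday, gets weekday 4, which matches the preview of the merged dataframe.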

In [6]:
df_holidays=pd.read_csv(Ing_AEMET.path_Data+'calendario.csv',
                        sep=';',
                        encoding = 'latin-1',
                        usecols=['Dia','Tipo de Festivo'],
                        dtype=str)

# Only use national holidays
df_holidays['Holiday']=(df_holidays['Tipo de Festivo']=='Festivo nacional').astype(int)
df_holidays['Dia']=pd.to_datetime(df_holidays['Dia'])

df_holidays.drop(columns='Tipo de Festivo',inplace=True)
df_holidays.dropna(inplace=True)


In [7]:
df_weather['fecha']=pd.to_datetime(df_weather['fecha'])

# Adding the national holidays to weather dataframe.
df_weather=df_weather.merge(df_holidays,how='left',left_on='fecha',right_on='Dia')

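# Create a new variable with the day of the week (Monday=0), taken from 'fecha'
# so it is also defined for dates missing from the holiday calendar
df_weather['weekday']=df_weather['fecha'].dt.dayofweek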
df_weather.drop(columns='Dia',inplace=True)


In [8]:
df_weather.head()

Unnamed: 0,fecha,indicativo,nombre,provincia,altitud,tmed,prec,tmin,horatmin,tmax,horatmax,dir,velmedia,racha,horaracha,sol,presMax,horaPresMax,presMin,horaPresMin,Holiday,weekday
0,2016-01-01,0252D,ARENYS DE MAR,BARCELONA,74,112,2,78,06:20,147,12:40,24,17,72,14:00,,,,,,1,4
1,2016-01-02,0252D,ARENYS DE MAR,BARCELONA,74,118,0,75,23:40,161,13:10,24,11,97,11:40,,,,,,0,5
2,2016-01-03,0252D,ARENYS DE MAR,BARCELONA,74,100,0,58,05:50,141,13:20,24,19,92,16:00,,,,,,0,6
3,2016-01-04,0252D,ARENYS DE MAR,BARCELONA,74,116,9,80,22:50,153,10:10,32,8,89,02:40,,,,,,0,0
4,2016-01-05,0252D,ARENYS DE MAR,BARCELONA,74,99,1,56,23:59,142,12:30,32,25,86,17:30,,,,,,1,1


# Saving Data to CSV

To avoid reading all the data again on each execution, the AEMET and REE dataframes are saved in CSV format. By default they are saved in '../Data/'.
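
The same idea can be written as a small cache pattern: load the CSV if it already exists, otherwise fetch the data and save it. This is a sketch with hypothetical names (`load_or_fetch`, `fake_fetch`, the file name); it does not describe how `save_to_csv` is implemented in Lectura_AEMET_REE.

```python
from pathlib import Path
import tempfile

import pandas as pd


def load_or_fetch(csv_path, fetch):
    """Return the cached CSV if it exists; otherwise call fetch() and cache the result."""
    csv_path = Path(csv_path)
    if csv_path.exists():
        return pd.read_csv(csv_path)
    df = fetch()
    csv_path.parent.mkdir(parents=True, exist_ok=True)
    # index=False avoids an extra 'Unnamed: 0' column when the file is read back
    df.to_csv(csv_path, index=False)
    return df


calls = []


def fake_fetch():
    """Stand-in for a slow API read; records how many times it was called."""
    calls.append(1)
    return pd.DataFrame({"fecha": ["2016-01-01"], "tmed": [11.2]})


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "Data" / "weather.csv"
    first = load_or_fetch(path, fake_fetch)   # fetches and writes the CSV
    second = load_or_fetch(path, fake_fetch)  # reads the cached CSV instead
```

Only the first call triggers the fetch, so a long API read is paid once per dataset.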

In [9]:
Ing_AEMET.save_to_csv(df_weather)
Ing_REE.save_to_csv(df_ree) 