# Session 05: Getting data from APIs

In this session, we will learn how to get data from APIs. 

## What is an API?

API stands for Application Programming Interface. It is a set of rules that allows one software application to interact with another. APIs are used to define the methods for communication between different software components.

There are several types of APIs, such as web APIs, library APIs, and operating system APIs. In this session, we will focus on web APIs. Web APIs are used to enable communication between different software applications over the internet. We can use web APIs to get data from external sources, such as weather data, stock prices, and social media posts.

To use a web API, we send a request to the API server and receive a response. The request is usually addressed to a URL, and the response usually arrives as JSON or XML data.
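
To make the idea of a request expressed as a URL concrete, here is a minimal sketch that uses only the Python standard library. The endpoint `https://api.example.com/weather` and its `city` and `units` parameters are made up for illustration and do not belong to any real API.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint and parameters, used only for illustration
base_url = "https://api.example.com/weather"
params = {"city": "Madrid", "units": "metric"}

# A GET request carries its parameters in the query string of the URL
full_url = f"{base_url}?{urlencode(params)}"
print(full_url)

# The server reads the same parameters back out of the URL
parsed = parse_qs(urlsplit(full_url).query)
print(parsed)
```

Libraries such as `requests` build this query string for you when you pass a `params` dictionary.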

Types of requests:
- `GET`: Retrieve data from the server.
- `POST`: Send data to the server.
- `PUT`: Update data on the server.
- `DELETE`: Delete data from the server.

Types of responses:
- JSON: JavaScript Object Notation.
- XML: Extensible Markup Language.

Status codes:
- `200`: OK.
- `400`: Bad request.
- `401`: Unauthorized.
- `404`: Not found.
- `500`: Internal server error.
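
As a rough illustration of how these codes group together, the sketch below uses the standard `http.HTTPStatus` enum to look up the official phrase for a code and classify it by range. The helper name `describe_status` is hypothetical.

```python
from http import HTTPStatus

def describe_status(code):
    """Return a short description of an HTTP status code."""
    phrase = HTTPStatus(code).phrase
    if 200 <= code < 300:
        kind = "success"
    elif 400 <= code < 500:
        kind = "client error"
    elif 500 <= code < 600:
        kind = "server error"
    else:
        kind = "other"
    return f"{code} {phrase} ({kind})"

# The five codes listed above
for code in (200, 400, 401, 404, 500):
    print(describe_status(code))
```

Codes in the 4xx range usually mean that something in our request needs fixing, while 5xx codes point to a problem on the server side.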

We are going to focus on the `GET` request and JSON response in this session.

## Getting access to an API

We will use the ESIOS API to get power market data from the Spanish electricity market.

* [ESIOS API](https://api.esios.ree.es/)

You will need an API key to access the ESIOS API. You can request one by sending an email to `consultasios@ree.es` with the subject `Personal token request`. Mention that you are a student and that you are using the API for educational purposes.

## Making a request to the API

Once you've received your token, we can start making requests to the ESIOS API. We will use the `requests` library in Python to make HTTP requests to the API server.

But first we need to load the token from a `.env` file, so that the token itself is not hard-coded in the notebook.

In [4]:
# load environment variables from the .env file
import os
from dotenv import load_dotenv

load_dotenv()

# get the API token from the environment variables
api_token = os.getenv('API_KEY_PYTHON_CLASS')
api_token[:5]

'3fdd9'
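
A `.env` file is a plain-text file with one `NAME=value` pair per line, so in this session it would contain a line starting with `API_KEY_PYTHON_CLASS=`. If the variable is missing, `os.getenv` silently returns `None`, and the error only shows up later. The sketch below shows one way to fail early instead; the helper `require_env` and the variable `DEMO_API_KEY` are hypothetical names used only for this example.

```python
import os

def require_env(name):
    """Return the value of an environment variable or fail with a clear message."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name!r} is not set; check your .env file")
    return value

# Demonstration with a throwaway variable (not a real token)
os.environ["DEMO_API_KEY"] = "not-a-real-token"
print(require_env("DEMO_API_KEY")[:5])
```

Printing only the first few characters, as in the cell above, lets you confirm that the token was loaded without exposing it.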

Next, we need to install the `requests` library. In a notebook cell, prefix the command with `!`; in a terminal, run it as shown.

```bash
pip install requests
```

Now we are ready to make a request to the ESIOS API. You can read the documentation of the API to understand how to make requests and what data you can get.

For now, we will focus on some examples that I will show you.

## Example 1: Get the list of available indicators

The first example is to get the list of available indicators from the ESIOS API.

These indicators represent different types of data that we can get from the API, such as electricity prices, demand, generation, etc.

Let's make a request to the API to get the list of available indicators as a JSON response, and then save the response to a file.

Documentation: [ESIOS API - Getting a list of indicators](https://api.esios.ree.es/doc/indicator/getting_a_list_of_indicators.html)


In [7]:
import requests
import pandas as pd

# the API endpoint
url = 'https://api.esios.ree.es/indicators'

# define the headers
headers = {
    'Accept': 'application/json; application/vnd.esios-api-v1+json',
    'Content-Type': 'application/json',
    'x-api-key': f'{api_token}'
}

# make the GET request
response = requests.get(url, headers=headers)

# check the status code before parsing: an error response may not be valid JSON
if response.status_code != 200:
    print(f'Error {response.status_code}: {response.text}')
else:
    response_json = response.json()

    # make a directory to store the data if it doesn't exist
    os.makedirs('./data', exist_ok=True)

    # save the list of indicators to a CSV file
    indicators = response_json['indicators']
    df = pd.DataFrame(indicators)
    df.to_csv('./data/indicators.csv', index=False)
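
The cell above stores the indicator list as CSV. If you would rather keep the raw JSON, the sketch below writes a payload to disk with the standard `json` module and reads it back. The `sample` dictionary is a hand-made stand-in that mimics the `indicators` structure with two entries, and the temporary directory is used only so that the example does not touch your `./data` folder.

```python
import json
import tempfile
from pathlib import Path

# Hand-made stand-in for the API response (two indicators only)
sample = {
    "indicators": [
        {"name": "Generación programada PBF Hidráulica UGH", "short_name": "Hidráulica UGH", "id": 1},
        {"name": "Generación programada PBF Nuclear", "short_name": "Nuclear", "id": 4},
    ]
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "indicators.json"
    # ensure_ascii=False keeps accented characters readable in the file
    path.write_text(json.dumps(sample, ensure_ascii=False, indent=2), encoding="utf-8")
    restored = json.loads(path.read_text(encoding="utf-8"))

print(restored["indicators"][1]["short_name"])
```

Keeping the raw JSON preserves nested fields that a flat CSV file may not represent well.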

## Example 2: Understanding the indicators

The second example is to get the id of a specific indicator from the ESIOS API. We are going to read the list of available indicators from the file that we saved in the previous example, and then select an indicator id.

In [8]:
# reading the data from the CSV file

indicators = pd.read_csv('./data/indicators.csv')

indicators.head()

Unnamed: 0,name,description,short_name,id
0,Generación programada PBF Hidráulica UGH,"<p>Es el programa de energía diario, con desgl...",Hidráulica UGH,1
1,Generación programada PBF Hidráulica no UGH,"<p>Es el programa de energía diario, con desgl...",Hidráulica no UGH,2
2,Generación programada PBF Turbinación bombeo,"<p>Es el programa de energía diario, con desgl...",Turbinación bombeo,3
3,Generación programada PBF Nuclear,"<p>Es el programa de energía diario, con desgl...",Nuclear,4
4,Generación programada PBF Hulla antracita Anex...,"<p>Es el programa de energía diario, con desgl...",Hulla antracita RD 134/2010,5
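
The `description` column contains HTML markup such as `<p>` tags, and some descriptions also carry entities like `&oacute;`. A minimal sketch of how that text could be cleaned with the standard library follows. The helper `clean_description` is a hypothetical name, the input is a shortened sample rather than a full description, and the regular expression is a crude tag stripper, not a full HTML parser.

```python
import html
import re

def clean_description(text):
    """Strip HTML tags and decode entities such as &oacute;."""
    without_tags = re.sub(r"<[^>]+>", " ", text)
    decoded = html.unescape(without_tags)
    # collapse runs of whitespace (including \r\n) into single spaces
    return " ".join(decoded.split())

print(clean_description("\r\n\r\n<p>Es la previsi&oacute;n de consumo</p>"))
```

With the DataFrame loaded above, something like `indicators['description'].apply(clean_description)` should apply the same cleaning to every row, assuming the column has no missing values.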


Let's sweep through the list of available indicators, looking for all the indicators that contain the word `Previsión` (forecast in Spanish) or `prevista` (forecasted) in their name. We will use the `str.contains` method from pandas to filter the indicators.

In [9]:
# looking for `Previsión` or `prevista` in the `name` column (it means forecast in Spanish)

prevision_indicators = indicators[indicators['name'].str.contains('Previsión', case=False)]
prevista_indicators = indicators[indicators['name'].str.contains('prevista', case=False)]
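
Both keywords can also be matched in a single pass, because `str.contains` treats its pattern as a regular expression by default. The sketch below runs on a small stand-in table whose names and ids are copied from the outputs in this session; it is an illustration, not a replacement for the two filters above.

```python
import pandas as pd

# Small stand-in for the indicators table (three rows only)
sample = pd.DataFrame({
    "name": [
        "Generación programada PBF Nuclear",
        "Previsión de la producción eólica peninsular",
        "Previsión demanda anual",
    ],
    "id": [4, 541, 1774],
})

# "|" means "or" in a regular expression; case=False ignores capitalisation
mask = sample["name"].str.contains("previsión|prevista", case=False, regex=True)
forecast_ids = sample.loc[mask, "id"].tolist()
print(forecast_ids)
```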

In [10]:
prevision_indicators

Unnamed: 0,name,description,short_name,id
459,Previsión diaria de la demanda eléctrica penin...,\r\n\r\n<p>Es la previsi&oacute;n de consumo q...,Previsión diaria,460
460,Previsión mensual de la demanda eléctrica peni...,<p>Previsi&oacute;n de la demanda del sistema ...,Previsión mensual,461
538,Previsión de la producción eólica peninsular,<p>Previsi&oacute;n horaria de energ&iacute;a ...,Previsión eólica,541
576,Previsión semanal de la demanda eléctrica peni...,<p>Previsión de la demanda del sistema peninsu...,Previsión semanal,603
577,Previsión máxima mensual de la demanda eléctri...,<p>Previsión de la demanda del sistema peninsu...,Previsión máxima mensual,604
592,Previsión mínima mensual de la demanda eléctri...,<p>Previsión de la demanda del sistema peninsu...,Previsión mínima mensual,619
593,Previsión máxima anual de la demanda eléctrica...,<p>Previsión de la demanda del sistema peninsu...,Previsión máxima anual,620
594,Previsión mínima anual de la demanda eléctrica...,<p>Previsión de la demanda del sistema peninsu...,Previsión mínima anual,621
1427,Previsión demanda anual,<p>La Circular 4/2019 de la CNMC por la que se...,Demanda anual,1774
1428,Previsión diaria D+1 demanda,<p>La Circular 4/2019 de la CNMC por la que se...,Previsión diaria D+1 demanda,1775


In [11]:
prevista_indicators

Unnamed: 0,name,description,short_name,id
497,Capacidad de intercambio prevista (NTC) con Fr...,<p>M&aacute;ximo valor admisible del programa ...,Francia horizonte semanal importación,498
498,Capacidad de intercambio prevista (NTC) con Po...,<p>M&aacute;ximo valor admisible del programa ...,Portugal horizonte semanal importación,499
499,Capacidad de intercambio prevista (NTC) con Ma...,<p>M&aacute;ximo valor admisible del programa ...,Marruecos horizonte semanal importación,500
500,Capacidad de intercambio prevista (NTC) con An...,<p>M&aacute;ximo valor admisible del programa ...,Andorra horizonte semanal importación,501
501,Capacidad de intercambio prevista (NTC) con Fr...,<p>M&aacute;ximo valor admisible del programa ...,Francia horizonte semanal exportación,502
502,Capacidad de intercambio prevista (NTC) con Po...,<p>M&aacute;ximo valor admisible del programa ...,Portugal horizonte semanal exportación,503
503,Capacidad de intercambio prevista (NTC) con Ma...,<p>M&aacute;ximo valor admisible del programa ...,Marruecos horizonte semanal exportación,504
504,Capacidad de intercambio prevista (NTC) con An...,<p>M&aacute;ximo valor admisible del programa ...,Andorra horizonte semanal exportación,505
507,Capacidad de intercambio prevista (NTC) con Fr...,<p>M&aacute;ximo valor admisible del programa ...,Francia horizonte mensual importación,510
508,Capacidad de intercambio prevista (NTC) con Po...,<p>M&aacute;ximo valor admisible del programa ...,Portugal horizonte mensual importación,511


There seem to be two indicators for forecasted demand. Let's get the data for both and compare them.

* Indicator 460
* Indicator 544

## Example 3: Get the data for an indicator

The third example is to get the data for a specific indicator from the ESIOS API. We are going to use the indicator id that we selected in the previous example to get the data for that indicator between two dates.

Our goal is to check the forecasted demand for Christmas Eve. We start with a first request to see what the response looks like.

Documentation: [ESIOS API - Getting the data for an indicator between two dates](https://api.esios.ree.es/doc/indicator/getting_a_specific_indicator_filtering_values_by_a_date_range.html)

In [12]:
indicator_id = 460

# when we want to get the data for a specific date range, we need to provide the start and end date
# in the ISO 8601 format: YYYY-MM-DDTHH:MM:SSZ
# since the hour in Spain is UTC+1 in winter and UTC+2 in summer, we need to adjust the hour
start_date = '2025-01-23T23:00:00Z' 
end_date = '2025-12-24T22:59:59Z'

# the API endpoint
url = f'https://api.esios.ree.es/indicators/{indicator_id}'

# define the headers
headers = {
    'Accept': 'application/json; application/vnd.esios-api-v1+json',
    'Content-Type': 'application/json',
    'x-api-key': f'{api_token}'
}

# define the parameters, in our case the start and end date
params = {
    'start_date': start_date,
    'end_date': end_date
}

# Make the GET request
response = requests.get(url, headers=headers, params=params)
response_json = response.json()
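
The hour adjustment described in the comments above can also be computed instead of worked out by hand. The sketch below uses the standard `zoneinfo` module (Python 3.9+) to turn a local calendar day in Spain into a UTC start and end string. The helper `local_day_to_utc_range` is a hypothetical name, and the `Europe/Madrid` time zone is an assumption that matches peninsular Spain.

```python
from datetime import date, datetime, time, timedelta, timezone
from zoneinfo import ZoneInfo

def local_day_to_utc_range(day, tz_name="Europe/Madrid"):
    """Return (start, end) UTC strings covering one local calendar day."""
    tz = ZoneInfo(tz_name)
    start_local = datetime.combine(day, time.min, tzinfo=tz)
    # one second before the next local midnight
    end_local = datetime.combine(day + timedelta(days=1), time.min, tzinfo=tz) - timedelta(seconds=1)
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return (
        start_local.astimezone(timezone.utc).strftime(fmt),
        end_local.astimezone(timezone.utc).strftime(fmt),
    )

print(local_day_to_utc_range(date(2024, 12, 24)))  # winter, UTC+1
print(local_day_to_utc_range(date(2024, 7, 1)))    # summer, UTC+2
```

The two returned strings can be passed directly as `start_date` and `end_date`.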

In [13]:
response_json

{'indicator': {'name': 'Previsión diaria de la demanda eléctrica peninsular',
  'short_name': 'Previsión diaria',
  'id': 460,
  'composited': False,
  'step_type': 'step',
  'disaggregated': False,
  'magnitud': [{'name': 'Potencia', 'id': 20}],
  'tiempo': [{'name': 'Quince minutos', 'id': 218}],
  'geos': [{'geo_id': 8741, 'geo_name': 'Península'}],
  'values_updated_at': '2025-01-23T18:28:38.000+01:00',
  'values': [{'value': 26779.0,
    'datetime': '2025-01-24T00:00:00.000+01:00',
    'datetime_utc': '2025-01-23T23:00:00Z',
    'tz_time': '2025-01-23T23:00:00.000Z',
    'geo_id': 8741,
    'geo_name': 'Península'},
   {'value': 26341.0,
    'datetime': '2025-01-24T00:15:00.000+01:00',
    'datetime_utc': '2025-01-23T23:15:00Z',
    'tz_time': '2025-01-23T23:15:00.000Z',
    'geo_id': 8741,
    'geo_name': 'Península'},
   {'value': 25898.0,
    'datetime': '2025-01-24T00:30:00.000+01:00',
    'datetime_utc': '2025-01-23T23:30:00Z',
    'tz_time': '2025-01-23T23:30:00.000Z',
    ...

It looks like the data is stored under the key `'values'`, nested inside `'indicator'`, in the JSON response. Let's extract it into a DataFrame.

In [14]:
raw_data = response_json['indicator']['values']

data = pd.DataFrame(raw_data)

data.tail()

Unnamed: 0,value,datetime,datetime_utc,tz_time,geo_id,geo_name
667,31177.0,2025-01-30T22:45:00.000+01:00,2025-01-30T21:45:00Z,2025-01-30T21:45:00.000Z,8741,Península
668,30251.0,2025-01-30T23:00:00.000+01:00,2025-01-30T22:00:00Z,2025-01-30T22:00:00.000Z,8741,Península
669,29558.0,2025-01-30T23:15:00.000+01:00,2025-01-30T22:15:00Z,2025-01-30T22:15:00.000Z,8741,Península
670,28958.0,2025-01-30T23:30:00.000+01:00,2025-01-30T22:30:00Z,2025-01-30T22:30:00.000Z,8741,Península
671,28429.0,2025-01-30T23:45:00.000+01:00,2025-01-30T22:45:00Z,2025-01-30T22:45:00.000Z,8741,Península
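
To see how the nested structure is navigated without pandas, here is a small sketch on a trimmed copy of the response shown above (two entries and only a few keys kept). Using `.get()` with defaults is one possible way to avoid a `KeyError` when the API returns an error payload instead of an indicator.

```python
# Trimmed copy of the response shown above
sample_response = {
    "indicator": {
        "id": 460,
        "short_name": "Previsión diaria",
        "values": [
            {"value": 26779.0, "datetime_utc": "2025-01-23T23:00:00Z", "geo_name": "Península"},
            {"value": 26341.0, "datetime_utc": "2025-01-23T23:15:00Z", "geo_name": "Península"},
        ],
    }
}

# .get() with defaults returns an empty list when a level is missing
values = sample_response.get("indicator", {}).get("values", [])
pairs = [(row["datetime_utc"], row["value"]) for row in values]
print(pairs)
```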


Let's analyze the data and plot it.

In [15]:
# unique geo_name values

data['geo_name'].unique()

array(['Península'], dtype=object)

It looks like the data has a granularity of 15 minutes. Let's convert the `datetime` column from strings to datetime values and plot the data.

In [16]:
data['datetime'] = pd.to_datetime(data['datetime'])
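
The 15-minute granularity can be checked rather than read off by eye. The sketch below parses three timestamps copied from the response shown earlier and computes the gap between consecutive ones, using only the standard library.

```python
from datetime import datetime

# Timestamps copied from the response shown earlier in this example
stamps = [
    "2025-01-24T00:00:00.000+01:00",
    "2025-01-24T00:15:00.000+01:00",
    "2025-01-24T00:30:00.000+01:00",
]

parsed = [datetime.fromisoformat(s) for s in stamps]
# difference between each timestamp and the one before it, in minutes
steps = {(b - a).total_seconds() / 60 for a, b in zip(parsed, parsed[1:])}
print(steps)
```

On the full DataFrame, `data['datetime'].diff().value_counts()` should give a similar summary once the column has been converted.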

In [17]:
from plotly import express as px

fig = px.line(data, x='datetime', y='value', color='geo_name', title='Electricity demand forecast (15 min)')

fig.show()

Next, let's compare the forecasted demand for Christmas Eve with the forecast for the day before and see whether they differ. We should choose the date range carefully: API requests are often not free, so we want to get both days in as few requests as possible.

To do that, we adjust the date range so that a single request covers both December 23rd and Christmas Eve (December 24th).
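
Another way to avoid unnecessary requests is to cache responses on disk, so that re-running a cell does not call the API again. The sketch below is one possible approach, not part of the ESIOS API: `cached_fetch` and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for a real request so that the example runs without a token.

```python
import hashlib
import json
import tempfile
from pathlib import Path

def cached_fetch(fetch, params, cache_dir):
    """Return fetch(params), reusing a JSON file on disk when one exists."""
    key = hashlib.sha256(json.dumps(params, sort_keys=True).encode("utf-8")).hexdigest()
    path = Path(cache_dir) / f"{key}.json"
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    result = fetch(params)
    path.write_text(json.dumps(result), encoding="utf-8")
    return result

# Stand-in for a real API call; it only records how often it is invoked
calls = []
def fake_fetch(params):
    calls.append(params)
    return {"values": [1, 2, 3]}

with tempfile.TemporaryDirectory() as tmp_dir:
    query = {"start_date": "2024-12-22T23:00:00Z", "end_date": "2024-12-24T22:59:59Z"}
    first = cached_fetch(fake_fetch, query, tmp_dir)
    second = cached_fetch(fake_fetch, query, tmp_dir)

print(len(calls))
```

Keep in mind that forecasts can be revised after they are published, so a cached copy may be out of date for recent or future dates.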

In [18]:
indicator_id = 460

# when we want to get the data for a specific date range, we need to provide the start and end date
# in the ISO 8601 format: YYYY-MM-DDTHH:MM:SSZ
# since the hour in Spain is UTC+1 in winter and UTC+2 in summer, we need to adjust the hour
start_date = '2024-12-22T23:00:00Z' 
end_date = '2024-12-24T22:59:59Z'

# the API endpoint
url = f'https://api.esios.ree.es/indicators/{indicator_id}'

# define the headers
headers = {
    'Accept': 'application/json; application/vnd.esios-api-v1+json',
    'Content-Type': 'application/json',
    'x-api-key': f'{api_token}'
}

# define the parameters, in our case the start and end date
params = {
    'start_date': start_date,
    'end_date': end_date
}

# Make the GET request
response = requests.get(url, headers=headers, params=params)
response_json = response.json()

# convert to a DataFrame
raw_data = response_json['indicator']['values']

data = pd.DataFrame(raw_data)

data['datetime'] = pd.to_datetime(data['datetime'])
data['date'] = data['datetime'].dt.date

data.head()

Unnamed: 0,value,datetime,datetime_utc,tz_time,geo_id,geo_name,date
0,24441.0,2024-12-23 00:00:00+01:00,2024-12-22T23:00:00Z,2024-12-22T23:00:00.000Z,8741,Península,2024-12-23
1,23917.0,2024-12-23 00:15:00+01:00,2024-12-22T23:15:00Z,2024-12-22T23:15:00.000Z,8741,Península,2024-12-23
2,23399.0,2024-12-23 00:30:00+01:00,2024-12-22T23:30:00Z,2024-12-22T23:30:00.000Z,8741,Península,2024-12-23
3,22897.0,2024-12-23 00:45:00+01:00,2024-12-22T23:45:00Z,2024-12-22T23:45:00.000Z,8741,Península,2024-12-23
4,22401.0,2024-12-23 01:00:00+01:00,2024-12-23T00:00:00Z,2024-12-23T00:00:00.000Z,8741,Península,2024-12-23


Since we want to compare the 15-minute data, let's extract the hour and minute from the `datetime` column and plot the data for the two dates.

In [19]:
data['time'] = data['datetime'].dt.time

data.head()

Unnamed: 0,value,datetime,datetime_utc,tz_time,geo_id,geo_name,date,time
0,24441.0,2024-12-23 00:00:00+01:00,2024-12-22T23:00:00Z,2024-12-22T23:00:00.000Z,8741,Península,2024-12-23,00:00:00
1,23917.0,2024-12-23 00:15:00+01:00,2024-12-22T23:15:00Z,2024-12-22T23:15:00.000Z,8741,Península,2024-12-23,00:15:00
2,23399.0,2024-12-23 00:30:00+01:00,2024-12-22T23:30:00Z,2024-12-22T23:30:00.000Z,8741,Península,2024-12-23,00:30:00
3,22897.0,2024-12-23 00:45:00+01:00,2024-12-22T23:45:00Z,2024-12-22T23:45:00.000Z,8741,Península,2024-12-23,00:45:00
4,22401.0,2024-12-23 01:00:00+01:00,2024-12-23T00:00:00Z,2024-12-23T00:00:00.000Z,8741,Península,2024-12-23,01:00:00


Let's create two variables, one for each of the two dates we are interested in.

In [20]:
from datetime import datetime

december_23 = data[data['date'] == datetime(2024, 12, 23).date()]
december_24 = data[data['date'] == datetime(2024, 12, 24).date()]

december_23


Unnamed: 0,value,datetime,datetime_utc,tz_time,geo_id,geo_name,date,time
0,24441.0,2024-12-23 00:00:00+01:00,2024-12-22T23:00:00Z,2024-12-22T23:00:00.000Z,8741,Península,2024-12-23,00:00:00
1,23917.0,2024-12-23 00:15:00+01:00,2024-12-22T23:15:00Z,2024-12-22T23:15:00.000Z,8741,Península,2024-12-23,00:15:00
2,23399.0,2024-12-23 00:30:00+01:00,2024-12-22T23:30:00Z,2024-12-22T23:30:00.000Z,8741,Península,2024-12-23,00:30:00
3,22897.0,2024-12-23 00:45:00+01:00,2024-12-22T23:45:00Z,2024-12-22T23:45:00.000Z,8741,Península,2024-12-23,00:45:00
4,22401.0,2024-12-23 01:00:00+01:00,2024-12-23T00:00:00Z,2024-12-23T00:00:00.000Z,8741,Península,2024-12-23,01:00:00
...,...,...,...,...,...,...,...,...
91,28360.0,2024-12-23 22:45:00+01:00,2024-12-23T21:45:00Z,2024-12-23T21:45:00.000Z,8741,Península,2024-12-23,22:45:00
92,27687.0,2024-12-23 23:00:00+01:00,2024-12-23T22:00:00Z,2024-12-23T22:00:00.000Z,8741,Península,2024-12-23,23:00:00
93,26933.0,2024-12-23 23:15:00+01:00,2024-12-23T22:15:00Z,2024-12-23T22:15:00.000Z,8741,Península,2024-12-23,23:15:00
94,26237.0,2024-12-23 23:30:00+01:00,2024-12-23T22:30:00Z,2024-12-23T22:30:00.000Z,8741,Península,2024-12-23,23:30:00


Let's plot the two series now. December 23rd was a Monday and December 24th was a Tuesday, so the day of the week alone should not make the demand very different.
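
The weekdays can be double-checked with the standard library before plotting:

```python
import calendar
from datetime import date

for day in (date(2024, 12, 23), date(2024, 12, 24)):
    # weekday() returns 0 for Monday through 6 for Sunday
    print(day.isoformat(), calendar.day_name[day.weekday()])
```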

In [21]:
# use plotly graph objects to plot the two series

from plotly import graph_objects as go

fig = go.Figure()

fig.add_trace(go.Scatter(x=december_23['time'], y=december_23['value'], mode='lines', name='December 23'))
fig.add_trace(go.Scatter(x=december_24['time'], y=december_24['value'], mode='lines', name='December 24'))

fig.update_layout(title='Electricity demand forecast (15 min)', xaxis_title='Time', yaxis_title='Demand (MW)')

fig.show()

There is quite a difference in the demand forecast for the two days. Let's look at the previous Monday and Tuesday to see whether the difference is due to the day of the week.

In [22]:
indicator_id = 460

# when we want to get the data for a specific date range, we need to provide the start and end date
# in the ISO 8601 format: YYYY-MM-DDTHH:MM:SSZ
# since the hour in Spain is UTC+1 in winter and UTC+2 in summer, we need to adjust the hour
start_date = '2024-12-15T23:00:00Z' 
end_date = '2024-12-17T22:59:59Z'

# the API endpoint
url = f'https://api.esios.ree.es/indicators/{indicator_id}'

# define the headers
headers = {
    'Accept': 'application/json; application/vnd.esios-api-v1+json',
    'Content-Type': 'application/json',
    'x-api-key': f'{api_token}'
}

# define the parameters, in our case the start and end date
params = {
    'start_date': start_date,
    'end_date': end_date
}

# make the GET request
response = requests.get(url, headers=headers, params=params)
response_json = response.json()

# convert to a DataFrame
raw_data = response_json['indicator']['values']

data = pd.DataFrame(raw_data)

data['datetime'] = pd.to_datetime(data['datetime'])
data['date'] = data['datetime'].dt.date
data['time'] = data['datetime'].dt.time

december_16 = data[data['date'] == datetime(2024, 12, 16).date()]
december_17 = data[data['date'] == datetime(2024, 12, 17).date()]

fig = go.Figure()

fig.add_trace(go.Scatter(x=december_16['time'], y=december_16['value'], mode='lines', name='December 16'))
fig.add_trace(go.Scatter(x=december_17['time'], y=december_17['value'], mode='lines', name='December 17'))

fig.update_layout(title='Electricity demand forecast (15 min)', xaxis_title='Time', yaxis_title='Demand (MW)')

fig.show()

## Example 4: Create a function to get the data for an indicator between two dates

Instead of repeating the same code to get the data for an indicator between two dates, we can create a function that takes the indicator id, start date, and end date as input parameters, and returns the data for that indicator between the two dates as a DataFrame.

In [23]:
def get_indicator(indicator_id, start_date, end_date):
    """Return the values of an ESIOS indicator between two dates as a DataFrame."""

    # use the same environment variable that we loaded at the start of the session
    api_token = os.getenv('API_KEY_PYTHON_CLASS')

    url = f'https://api.esios.ree.es/indicators/{indicator_id}'

    headers = {
        'Accept': 'application/json; application/vnd.esios-api-v1+json',
        'Content-Type': 'application/json',
        'x-api-key': f'{api_token}'
    }

    params = {
        'start_date': start_date,
        'end_date': end_date
    }

    response = requests.get(url, headers=headers, params=params)
    # raise an exception for 4xx/5xx responses instead of failing later on a missing key
    response.raise_for_status()

    raw_data = response.json()['indicator']['values']

    return pd.DataFrame(raw_data)

get_indicator(544, '2024-12-15T23:00:00Z', '2024-12-17T22:59:59Z')

Unnamed: 0,value,datetime,datetime_utc,tz_time,geo_id,geo_name
0,26392.0,2024-12-16T00:00:00.000+01:00,2024-12-15T23:00:00Z,2024-12-15T23:00:00.000Z,8741,Península
1,26091.0,2024-12-16T00:05:00.000+01:00,2024-12-15T23:05:00Z,2024-12-15T23:05:00.000Z,8741,Península
2,25846.0,2024-12-16T00:10:00.000+01:00,2024-12-15T23:10:00Z,2024-12-15T23:10:00.000Z,8741,Península
3,25656.0,2024-12-16T00:15:00.000+01:00,2024-12-15T23:15:00Z,2024-12-15T23:15:00.000Z,8741,Península
4,25489.0,2024-12-16T00:20:00.000+01:00,2024-12-15T23:20:00Z,2024-12-15T23:20:00.000Z,8741,Península
...,...,...,...,...,...,...
571,29209.0,2024-12-17T23:35:00.000+01:00,2024-12-17T22:35:00Z,2024-12-17T22:35:00.000Z,8741,Península
572,28963.0,2024-12-17T23:40:00.000+01:00,2024-12-17T22:40:00Z,2024-12-17T22:40:00.000Z,8741,Península
573,28723.0,2024-12-17T23:45:00.000+01:00,2024-12-17T22:45:00Z,2024-12-17T22:45:00.000Z,8741,Península
574,28489.0,2024-12-17T23:50:00.000+01:00,2024-12-17T22:50:00Z,2024-12-17T22:50:00.000Z,8741,Península


We can now see that indicators 460 and 544 differ in granularity: indicator 460 reports values every 15 minutes, while indicator 544 reports them every 5 minutes.
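
To compare the two indicators on the same time grid, the 5-minute series can be aggregated to 15 minutes. The sketch below does this with pandas `resample` on the first five rows of indicator 544, copied from the output above. Averaging within each window is an assumption made for illustration; whether it matches how indicator 460 is built is not something this session establishes.

```python
import pandas as pd

# First five rows of indicator 544, copied from the output above
five_min = pd.DataFrame({
    "datetime": [
        "2024-12-16T00:00:00.000+01:00",
        "2024-12-16T00:05:00.000+01:00",
        "2024-12-16T00:10:00.000+01:00",
        "2024-12-16T00:15:00.000+01:00",
        "2024-12-16T00:20:00.000+01:00",
    ],
    "value": [26392.0, 26091.0, 25846.0, 25656.0, 25489.0],
})
five_min["datetime"] = pd.to_datetime(five_min["datetime"])

# Average the 5-minute values inside each 15-minute window
fifteen_min = five_min.set_index("datetime")["value"].resample("15min").mean()
print(fifteen_min)
```

Note that the second window here contains only two of its three 5-minute values, because the sample is cut short.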

In [24]:
# CHECK OUT FASTAPI LIBRARY