## Preprocessing the data

To create a Darts TimeSeries object and split it into train/validation series, first load the load and temperature data into a pandas DataFrame and clean it:

In [6]:
import pandas as pd
from darts import TimeSeries

# Read the 2023 load data and the temperature data into pandas DataFrames
load2023 = pd.read_csv("Load_Data/Total Load - Day Ahead _ Actual_2023.csv", delimiter=",")
weather_data = pd.read_csv("Temperature Data/smhi-opendata_1_98230_202301_202412.csv", delimiter=";", skiprows=9)

# Keep only the part of 'Time (UTC)' before ' - ' as the timestamp for Darts
load2023['Time (UTC)'] = load2023['Time (UTC)'].str.split(' - ').str[0]
# Drop the day-ahead forecast column; only the actual load is kept
load2023 = load2023.drop('Day-ahead Total Load Forecast [MW] - BZN|SE3', axis=1)
# Rename the columns to shorter names
load2023 = load2023.rename(columns={'Time (UTC)': 'completetime', 'Actual Total Load [MW] - BZN|SE3': 'Load [MW]'})
# Parse the timestamp strings into datetime values
load2023['datetime'] = pd.to_datetime(load2023['completetime'], format='%d.%m.%Y %H:%M')
load2023 = load2023.drop(['completetime'], axis=1)
# Add the temperature column (aligned on the default row index, so this
# assumes the weather rows line up hour by hour with the load rows)
load2023['Temprature'] = weather_data['Lufttemperatur']

# Reorder the columns
load2023 = load2023[['datetime', 'Load [MW]', 'Temprature']]

print(load2023)




                datetime  Load [MW]  Temprature
0    2023-01-01 00:00:00       8943         3.6
1    2023-01-01 01:00:00       8929         3.0
2    2023-01-01 02:00:00       8887         2.6
3    2023-01-01 03:00:00       8859         2.4
4    2023-01-01 04:00:00       8880         2.0
...                  ...        ...         ...
8755 2023-12-31 19:00:00      11561        -1.9
8756 2023-12-31 20:00:00      11276        -1.7
8757 2023-12-31 21:00:00      11035        -1.5
8758 2023-12-31 22:00:00      10831        -1.9
8759 2023-12-31 23:00:00      10722        -1.9

[8760 rows x 3 columns]
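
The temperature column above is attached by row position, which only gives the right values if both files contain exactly the same hourly rows in the same order. A minimal sketch of a timestamp-based alternative follows, using small synthetic frames rather than the real files. The weather column names `Datum` and `Tid (UTC)` are assumptions about the SMHI file layout and should be checked against the actual CSV header; `Lufttemperatur` is the column used in the cell above.

```python
import pandas as pd

# Synthetic stand-in for the cleaned load data (four hourly rows)
load = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2023-01-01 00:00", "2023-01-01 01:00",
        "2023-01-01 02:00", "2023-01-01 03:00",
    ]),
    "Load [MW]": [8943, 8929, 8887, 8859],
})

# Synthetic stand-in for the weather data; the 02:00 reading is missing on purpose.
# "Datum" and "Tid (UTC)" are assumed column names, not confirmed by the notebook.
weather = pd.DataFrame({
    "Datum": ["2023-01-01", "2023-01-01", "2023-01-01"],
    "Tid (UTC)": ["00:00:00", "01:00:00", "03:00:00"],
    "Lufttemperatur": [3.6, 3.0, 2.4],
})

# Build one timestamp column from the date and time columns
weather["datetime"] = pd.to_datetime(weather["Datum"] + " " + weather["Tid (UTC)"])

# A left join keeps every load row; hours without a reading become NaN
merged = load.merge(weather[["datetime", "Lufttemperatur"]], on="datetime", how="left")
merged = merged.rename(columns={"Lufttemperatur": "Temprature"})

# Count the hours that have no temperature reading
n_missing = int(merged["Temprature"].isna().sum())
print(merged)
print("Missing temperature values:", n_missing)
```

With a join on the timestamp, a gap in the weather file shows up as a missing value in the affected hour instead of silently shifting all later temperatures by one row.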


Adding features to this dataset can help us better understand it. Let's start with two simple calendar features: the day of the week (0 = Monday through 6 = Sunday) and the hour of the day (0 to 23).
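
As a quick check of the conventions behind these two features, the sketch below applies the pandas `dt.dayofweek` and `dt.hour` accessors to three hand-picked timestamps (illustrative values, not rows taken from the dataset):

```python
import pandas as pd

# Three known dates: Sunday 1 Jan 2023, Monday 2 Jan 2023, Saturday 7 Jan 2023
stamps = pd.Series(pd.to_datetime([
    "2023-01-01 00:00",
    "2023-01-02 13:00",
    "2023-01-07 23:00",
]))

day_of_week = stamps.dt.dayofweek.tolist()  # Monday = 0 ... Sunday = 6
hour_of_day = stamps.dt.hour.tolist()       # 0 ... 23

print(day_of_week)  # [6, 0, 5]
print(hour_of_day)  # [0, 13, 23]
```

This matches the value 6 that appears for 1 January 2023, a Sunday, in the output of the next cell.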

In [7]:
load2023 = load2023.reset_index(drop=True)
load2023['Day_of_week'] = load2023['datetime'].dt.dayofweek
load2023['Hour_of_day'] = load2023['datetime'].dt.hour
load2023 = load2023.set_index('datetime')
load2023

datetime,Load [MW],Temprature,Day_of_week,Hour_of_day
2023-01-01 00:00:00,8943,3.6,6,0
2023-01-01 01:00:00,8929,3.0,6,1
2023-01-01 02:00:00,8887,2.6,6,2
2023-01-01 03:00:00,8859,2.4,6,3
2023-01-01 04:00:00,8880,2.0,6,4
...,...,...,...,...
2023-12-31 19:00:00,11561,-1.9,6,19
2023-12-31 20:00:00,11276,-1.7,6,20
2023-12-31 21:00:00,11035,-1.5,6,21
2023-12-31 22:00:00,10831,-1.9,6,22
