In [None]:
import pandas as pd
import numpy as np

### Exploring preprocessed data

We start with "./METR-LA/data_in12_out12.pkl". The data itself is the same in other configurations, e.g., "./METR-LA/data_in2016_out12.pkl"; what changes are the indexes, which define how the time series is partitioned into windows of the given sizes.

In [None]:
path_data = "./METR-LA/data_in12_out12.pkl"
data = pd.read_pickle(path_data)

# data['processed_data'] holds a 3D array whose last dimension contains the 3 features associated with
# each time step of a time series: the detected speed, the time of day, and the day of the week.
display(data.keys())
display(data['processed_data'].shape)

In [None]:
# Shape of the processed data: (num timesteps, num_cells, num features)
display(data['processed_data'].shape)

# 1st feature: (normalized) values detected from sensors.
display(data['processed_data'][:,:,0])
# 2nd feature: time of day, normalized in [0,1].
display(data['processed_data'][:,:,1])
# 3rd feature: day of the week (in [0,6]).
display(data['processed_data'][:,:,2])
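
To make the feature layout concrete, here is a minimal sketch that builds a small synthetic array with the same (timesteps, cells, features) layout and slices it the same way. The sizes, the 5-minute sampling interval, and the random values are assumptions made for illustration; none of them are read from the pickle.

```python
import numpy as np

# Hypothetical sizes, for illustration only.
steps_per_day = 288                      # assumes one reading every 5 minutes
num_timesteps = steps_per_day * 7        # one week
num_cells = 4

rng = np.random.default_rng(0)
t = np.arange(num_timesteps)

values = rng.standard_normal((num_timesteps, num_cells))   # stand-in for normalized readings
time_of_day = (t % steps_per_day) / steps_per_day          # normalized time of day
day_of_week = (t // steps_per_day) % 7                     # day of the week

# Repeat the two temporal features over the cells and stack on the last axis.
toy = np.stack(
    [
        values,
        np.repeat(time_of_day[:, None], num_cells, axis=1),
        np.repeat(day_of_week[:, None], num_cells, axis=1),
    ],
    axis=-1,
)

print(toy.shape)
print(toy[:, :, 1].min(), toy[:, :, 1].max())
print(np.unique(toy[:, :, 2]))
```

Slicing on the last axis, as in the cell above, recovers one feature at a time for every time step and cell.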

### Exploring preprocessed indexes

This is where the inXXX_outYYY configurations actually differ.

An index pickle contains a dictionary. The dictionary maps each partition to its windows (examples), each expressed as a triple (x, y, z):
(x, y) delimits the window in the past, while (y, z) delimits the future window to predict.

The windows (examples) are partitioned into train (70%), validation (10%), and test (20%). The partitioning is done on the time axis.
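
The following sketch shows how such triples could be built and split chronologically on a toy series. It assumes the triples act as half-open slice bounds into the time axis and that one window is created per time step; `build_index` and `split_index` are hypothetical helpers, not the functions used by the preprocessing step.

```python
import numpy as np

def build_index(num_timesteps, history_len, future_len):
    """Hypothetical helper: one (x, y, z) triple per valid forecast origin y."""
    return [
        (y - history_len, y, y + future_len)
        for y in range(history_len, num_timesteps - future_len + 1)
    ]

def split_index(index, train_ratio=0.7, valid_ratio=0.1):
    """Chronological split: earliest windows for training, latest for testing."""
    n_train = round(len(index) * train_ratio)
    n_valid = round(len(index) * valid_ratio)
    return {
        "train": index[:n_train],
        "valid": index[n_train:n_train + n_valid],
        "test": index[n_train + n_valid:],
    }

series = np.arange(100)  # toy 1D series standing in for the time axis
toy_index = split_index(build_index(len(series), history_len=12, future_len=12))

x, y, z = toy_index["test"][0]
history = series[x:y]    # past window (model input)
future = series[y:z]     # future window (prediction target)

print({name: len(windows) for name, windows in toy_index.items()})
print(history, future)
```

Because the split follows the time axis, every training window starts earlier than every test window. Consecutive windows still overlap, so windows near a partition boundary share time steps with the neighbouring partition.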

In [None]:
path_index = "./METR-LA/index_in12_out12.pkl"
index = pd.read_pickle(path_index)
display(index.keys())

display(len(index['train']))
display(len(index['valid']))
display(len(index['test']))

display(index['test'])

### Exploring scalers computed during the preprocessing step

These are the scalers computed to normalize the detected speeds (standard normalization, i.e., subtracting the mean and dividing by the standard deviation).

In [None]:
path_scaler = "./METR-LA/scaler_in2016_out12.pkl"
scaler = pd.read_pickle(path_scaler)
display(scaler)
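
As a rough illustration of what standard normalization does, the sketch below standardizes synthetic readings and then inverts the transformation. The internal structure of the scaler pickle is not assumed here: `mean`, `std`, `transform`, and `inverse_transform` are hypothetical names, and computing the statistics on the first 70% of the time steps only is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
speeds = rng.uniform(20.0, 70.0, size=(1000, 4))  # synthetic readings, (timesteps, cells)

# Assumption: statistics come from the training portion (first 70%) only.
train_end = int(0.7 * len(speeds))
mean = speeds[:train_end].mean()
std = speeds[:train_end].std()

def transform(x):
    """Standardize with the training statistics."""
    return (x - mean) / std

def inverse_transform(x_norm):
    """Map standardized values back to the original scale."""
    return x_norm * std + mean

normalized = transform(speeds)
restored = inverse_transform(normalized)

print(normalized[:train_end].mean(), normalized[:train_end].std())
print(np.allclose(restored, speeds))
```

The inverse transformation is what lets predictions made on normalized values be reported in the original units.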