# Experiment procedure
This notebook describes the procedure for the experiment reported in the paper and shows how to reproduce its results.

TODO
- predict.py

### Input

Preprocessed tabular data, stored in pickle format, with the following columns (declared in `config.py`):
- `datetime_id`: one-dimensional integer ID of datetime information
- `road_id`: one-dimensional integer ID of road information
- `search_1h`: time-specified search count at one-hour intervals
- `search_unspec_1d`: non-time-specified search count at one-day intervals
- `allCars`: traffic volume at one-hour intervals
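
Before running the later steps, it can help to confirm that a dataset actually carries these columns. The sketch below is only an illustration: the helper `missing_columns` and the constant `REQUIRED_COLUMNS` are hypothetical names, and the column list is copied from the list above rather than imported from `config.py`.

```python
# Column names copied from the list above; in the project they are declared in config.py.
REQUIRED_COLUMNS = ("datetime_id", "road_id", "search_1h", "search_unspec_1d", "allCars")


def missing_columns(columns):
    """Return the required columns that are absent from `columns`, in declaration order."""
    present = set(columns)
    return [name for name in REQUIRED_COLUMNS if name not in present]


# Hypothetical example: a table that lacks the daily search count.
example = ["datetime_id", "road_id", "search_1h", "allCars"]
print(missing_columns(example))
```

With pandas, the same check would presumably be called as `missing_columns(df.columns)` after loading the pickle with `pd.read_pickle`.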

## 0. preprocess raw traffic data and create a dataset in pickle format

In [1]:
# !python3.9 df_preprocess.py ./datasets_1h/all.pkl -s 2021/4/2 -e 2022/6/1
# !python3.9 df_preprocess.py ./datasets_1h/mini.pkl -s 2021/4/2 -e 2021/6/1

## 1. format and standardize the input into a spatio-temporal dataset
- represent the tabular input as a tensor ($D \times T \times S$)
    - $D$: number of features
    - $T$: time series length
    - $S$: number of road segments on E17
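
The reshaping from a table to a $D \times T \times S$ tensor can be sketched as follows. This is a minimal illustration with NumPy, not the code in `transformer.py`: the function `table_to_tensor` is a hypothetical name, and filling missing (time, road) pairs with NaN is an assumption.

```python
import numpy as np


def table_to_tensor(datetime_id, road_id, features):
    """Arrange N table rows into a (D, T, S) tensor.

    datetime_id, road_id: length-N integer IDs, one pair per row.
    features: (N, D) array of feature values.
    Cells with no matching row stay NaN (an assumption of this sketch).
    """
    datetime_id = np.asarray(datetime_id)
    road_id = np.asarray(road_id)
    features = np.asarray(features, dtype=float)

    # Map raw IDs to dense 0-based positions along the T and S axes.
    t_values, t_index = np.unique(datetime_id, return_inverse=True)
    s_values, s_index = np.unique(road_id, return_inverse=True)

    tensor = np.full((features.shape[1], len(t_values), len(s_values)), np.nan)
    tensor[:, t_index, s_index] = features.T  # (D, N) values scattered into (D, T, S)
    return tensor


# Toy table: T = 2 time steps, S = 3 road segments, D = 2 features.
datetime_id = [0, 0, 0, 1, 1, 1]
road_id = [0, 1, 2, 0, 1, 2]
features = [[1, 10], [2, 20], [3, 30], [4, 40], [5, 50], [6, 60]]
tensor = table_to_tensor(datetime_id, road_id, features)
print(tensor.shape)
```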

In [2]:
!python3.9 transformer.py -h

usage: transformer.py [-h] [-s] [-v] [--scaler_path SCALER_PATH]
                      input_path output_feature_path output_label_path

positional arguments:
  input_path            path of dataframe pickle
  output_feature_path   path of dataframe pickle
  output_label_path     path of dataframe pickle

optional arguments:
  -h, --help            show this help message and exit
  -s, --scale           whether to standardize data
  -v, --valid           set the scaler to validation mode (default: train)
  --scaler_path SCALER_PATH
                        path to save scaler


In [3]:
# !python3.9 transformer.py -s <train data> --scaler_path <scaler file> <tensor file of standardized train features> <tensor file of train labels>
# !python3.9 transformer.py -s -v <test data> --scaler_path <scaler file> <tensor file of standardized test features> <tensor file of test labels>
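
The two commands above suggest that the scaler is fitted on the training data and then reused on the test data. The sketch below illustrates that pattern under stated assumptions: the function `standardize`, the per-feature mean/std statistics, and the pickle layout of the scaler file are all hypothetical and may differ from what `transformer.py` does.

```python
import os
import pickle
import tempfile

import numpy as np


def standardize(x, scaler_path, valid=False):
    """Standardize a (D, T, S) tensor per feature.

    Train mode (valid=False) fits mean/std on x and saves them to scaler_path.
    Validation mode (valid=True) loads the saved statistics and reuses them,
    so no information from the test set leaks into the scaling.
    """
    if valid:
        with open(scaler_path, "rb") as f:
            stats = pickle.load(f)
    else:
        stats = {
            "mean": x.mean(axis=(1, 2), keepdims=True),
            "std": x.std(axis=(1, 2), keepdims=True),
        }
        with open(scaler_path, "wb") as f:
            pickle.dump(stats, f)
    return (x - stats["mean"]) / stats["std"]


# Synthetic tensors standing in for the train and test features.
rng = np.random.default_rng(0)
train = rng.normal(5.0, 2.0, size=(2, 48, 3))
test = rng.normal(5.0, 2.0, size=(2, 24, 3))

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "scaler.pkl")
    train_std = standardize(train, path)             # fit and save
    test_std = standardize(test, path, valid=True)   # load and reuse
```

A feature that is constant over the training set would have zero standard deviation; this sketch does not guard against that case.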

## 2. training models

In [4]:
!python3.9 train.py -h

usage: train.py [-h] [-e EPOCH] [-d {cpu,cuda,cuda:0,cuda:1,cuda:2,cuda:3}]
                [--model_name MODEL_NAME] [--log_path LOG_PATH]
                [--log_steps LOG_STEPS] [--background] [-r RANDOM_STATE]
                [--lstm_dropout LSTM_DROPOUT]
                {T-ED,ST-ED,TE-ED,STE-ED,STE-ED-S}

positional arguments:
  {T-ED,ST-ED,TE-ED,STE-ED,STE-ED-S}
                        training model

optional arguments:
  -h, --help            show this help message and exit
  -e EPOCH, --epoch EPOCH
                        number of epochs
  -d {cpu,cuda,cuda:0,cuda:1,cuda:2,cuda:3}, --device {cpu,cuda,cuda:0,cuda:1,cuda:2,cuda:3}
                        device on which training and inference are run
  --model_name MODEL_NAME
                        file name of saved model's parameter
  --log_path LOG_PATH   file name of model training log
  --log_steps LOG_STEPS
                        step size between epochs to output training logs
  --background          whether to output l

### 2.1 STE-ED-S

In [10]:
# !python3.9 train.py STE-ED-S -d cuda:3 -e 1 -r 1000

### 2.2 other DNNs

In [6]:
# !python3.9 train.py T-ED -d cuda:3 -e 10 -r 1000

### 2.3 classical baselines

In [10]:
!python3.9 baseline.py HA --result results/ha_result.pkl --X_train datasets_1h/features_train.pkl --X_test datasets_1h/features_test.pkl
!python3.9 baseline.py PeriodicHA --result results/pha_result.pkl --X_train datasets_1h/features_train.pkl --X_test datasets_1h/features_test.pkl
# !python3.9 baseline.py PeriodicHA --result results/pha_result.pkl
# !python3.9 baseline.py ARIMA --result results/arima.pkl
# !python3.9 baseline.py RF --result results/rf.pkl
# !python3.9 baseline.py SVR --result results/svr.pkl

-------------------- Train Loss: MAE = 472.626, RMSE = 674.248 --------------------
-------------------- Test Loss: MAE = 496.209, RMSE = 700.697 --------------------
saved result to results/ha_result.pkl
-------------------- Train Loss: MAE = 137.353, RMSE = 244.623 --------------------
-------------------- Test Loss: MAE = 159.868, RMSE = 285.426 --------------------
saved result to results/pha_result.pkl
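
For readers unfamiliar with these baselines and metrics, the following sketch shows one common reading of them. It is not the code in `baseline.py`: the functions `historical_average` and `periodic_average`, the 24-hour period, and the synthetic data are assumptions made for illustration only.

```python
import numpy as np


def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(y_true - y_pred)))


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def historical_average(y_train, n_test):
    """y_train: (T, S). Predict each road's training mean at every test step."""
    return np.tile(y_train.mean(axis=0), (n_test, 1))


def periodic_average(y_train, n_test, period=24):
    """Predict the training mean of the same slot in the period (hour of day for period=24)."""
    t_train = y_train.shape[0]
    slots = np.arange(t_train) % period
    profile = np.stack([y_train[slots == k].mean(axis=0) for k in range(period)])
    test_slots = (t_train + np.arange(n_test)) % period
    return profile[test_slots]


# Synthetic traffic with an exact daily cycle: 4 days, 3 road segments.
day = 100.0 + 50.0 * np.sin(2 * np.pi * np.arange(24) / 24)
series = np.tile(day, 4)[:, None] + 10.0 * np.arange(3)[None, :]
y_train, y_test = series[:72], series[72:]

ha_pred = historical_average(y_train, len(y_test))
pha_pred = periodic_average(y_train, len(y_test))
print(f"HA:         MAE = {mae(y_test, ha_pred):.3f}, RMSE = {rmse(y_test, ha_pred):.3f}")
print(f"PeriodicHA: MAE = {mae(y_test, pha_pred):.3f}, RMSE = {rmse(y_test, pha_pred):.3f}")
```

On data with a strong daily cycle, the periodic variant tracks the cycle while the plain average cannot, which matches the direction of the gap in the logged results above.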
