
# Category 5

Sequence

Build and train a neural network to predict a time-indexed variable from a univariate series: weekly U.S. on-highway diesel prices (all types) for the period 1994-2021.

Using a **window of the past 10 observations of 1 feature**, train the model to predict the **next 10 observations** of that feature.

If you follow all the rules stated above and throughout this question
while training your neural network, a validation **MAE of approximately
0.02 or less on the normalized validation dataset** may earn you top marks.

# Import, URL

In [None]:
import urllib.request  # the submodule must be imported explicitly for urlretrieve
import pandas as pd
import tensorflow as tf
from tensorflow.keras.layers import Dense, Conv1D, LSTM, Bidirectional
from tensorflow.keras.models import Sequential
from tensorflow.keras.callbacks import ModelCheckpoint

In [None]:
url = 'https://www.dropbox.com/s/eduk281didil1km/Weekly_U.S.Diesel_Retail_Prices.csv?dl=1'
urllib.request.urlretrieve(url, 'Weekly_U.S.Diesel_Retail_Prices.csv')

('Weekly_U.S.Diesel_Retail_Prices.csv',
 <http.client.HTTPMessage at 0x780fd4e133a0>)

In [None]:
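def normalize_series(data, min, max):
    # Shift by the minimum, then divide by the maximum. This is not standard
    # min-max scaling, which would divide by (max - min): the largest scaled
    # value here is (max - min) / max rather than 1.
    data = data - min
    data = data / max
    return data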
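
To make the effect of this scaling concrete, here is a minimal pure-Python sketch that compares the arithmetic used by `normalize_series` with conventional min-max scaling. The helper names and the price list are hypothetical and chosen only for illustration.

```python
def scale_like_notebook(values):
    """Apply (x - min) / max, the arithmetic used by normalize_series."""
    lo, hi = min(values), max(values)
    return [(v - lo) / hi for v in values]


def scale_min_max(values):
    """Apply conventional min-max scaling: (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


prices = [1.0, 2.0, 4.0]  # hypothetical prices, not taken from the dataset
notebook_scaled = scale_like_notebook(prices)  # [0.0, 0.25, 0.75]
minmax_scaled = scale_min_max(prices)          # last value is exactly 1.0
```

Both variants map the minimum to 0. Only the conventional form maps the maximum to 1.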

In [None]:
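def windowed_dataset(series, batch_size, n_past=10, n_future=10, shift=1):
    ds = tf.data.Dataset.from_tensor_slices(series)
    # Slide a window of n_past + n_future steps; incomplete windows are dropped.
    ds = ds.window(size=n_past + n_future, shift=shift, drop_remainder=True)
    # Each window is itself a dataset; flatten it into a single tensor.
    ds = ds.flat_map(lambda w: w.batch(n_past + n_future))
    # Split each window into (inputs, targets).
    ds = ds.map(lambda w: (w[:n_past], w[n_past:]))
    return ds.batch(batch_size).prefetch(1)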
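
The windowing logic can be illustrated without TensorFlow. The sketch below is a pure-Python stand-in, not the `tf.data` pipeline itself: `make_windows` is a hypothetical helper that reproduces the same slicing on a plain list so the input/target pairs are easy to inspect.

```python
def make_windows(series, n_past=10, n_future=10, shift=1):
    """Return (past, future) pairs, keeping only complete windows."""
    total = n_past + n_future
    pairs = []
    # Stop early enough that every window has exactly `total` elements,
    # which mirrors drop_remainder=True.
    for start in range(0, len(series) - total + 1, shift):
        window = series[start:start + total]
        pairs.append((window[:n_past], window[n_past:]))
    return pairs


series = list(range(25))      # hypothetical series: 0, 1, ..., 24
pairs = make_windows(series)  # 25 - 20 + 1 = 6 windows
first_past, first_future = pairs[0]
```

With 25 observations and a combined window length of 20, six windows are produced. The first pair uses observations 0 to 9 as inputs and 10 to 19 as targets.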

In [None]:
# parse_dates=True parses the 'Week of' index as dates; infer_datetime_format is deprecated in pandas 2.x.
df = pd.read_csv('Weekly_U.S.Diesel_Retail_Prices.csv', parse_dates=True, index_col='Week of', header=0)
df.head(20)

| Week of | Weekly U.S. No 2 Diesel Retail Prices Dollars per Gallon |
|---|---|
| 1994-03-21 | 1.106 |
| 1994-03-28 | 1.107 |
| 1994-04-04 | 1.109 |
| 1994-04-11 | 1.108 |
| 1994-04-18 | 1.105 |
| 1994-04-25 | 1.106 |
| 1994-05-02 | 1.104 |
| 1994-05-09 | 1.101 |
| 1994-05-16 | 1.099 |
| 1994-05-23 | 1.099 |


In [None]:
N_FEATURES = len(df.columns)
N_FEATURES

1

# Split dataset

In [None]:
# Normalize
data = df.values
data = normalize_series(data, data.min(axis=0), data.max(axis=0))

# Split data
SPLIT_TIME = int(len(data) * 0.8)
x_train = data[:SPLIT_TIME]
x_valid = data[SPLIT_TIME:]

In [None]:
BATCH_SIZE = 32
N_PAST = 10
N_FUTURE = 10
SHIFT = 1

In [None]:
train_set = windowed_dataset(series=x_train, batch_size=BATCH_SIZE,
                             n_past=N_PAST, n_future=N_FUTURE,
                             shift=SHIFT)

valid_set = windowed_dataset(series=x_valid, batch_size=BATCH_SIZE,
                             n_past=N_PAST, n_future=N_FUTURE,
                             shift=SHIFT)
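
Because the series is split chronologically and each part is windowed separately, the number of windows and batches per split follows directly from the row counts. The following sketch works through that arithmetic; the helper names and the row count of 1000 are hypothetical, not the actual size of the dataset.

```python
import math


def chronological_split(n_rows, train_fraction=0.8):
    """Return (train_rows, valid_rows) for an unshuffled split."""
    split_time = int(n_rows * train_fraction)
    return split_time, n_rows - split_time


def count_batches(n_rows, batch_size=32, n_past=10, n_future=10, shift=1):
    """Count batches when only complete windows are kept and the last,
    possibly smaller, batch is retained."""
    total = n_past + n_future
    if n_rows < total:
        return 0
    n_windows = (n_rows - total) // shift + 1
    return math.ceil(n_windows / batch_size)


train_rows, valid_rows = chronological_split(1000)  # hypothetical row count
train_batches = count_batches(train_rows)  # 781 windows -> 25 batches
valid_batches = count_batches(valid_rows)  # 181 windows -> 6 batches
```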

# Modeling, checkpoint, compile, fit, load, evaluate

In [None]:
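model = Sequential([
    # Causal padding keeps the sequence length at N_PAST.
    Conv1D(filters=32, kernel_size=5, padding='causal', activation='relu', input_shape=[N_PAST, 1]),
    # return_sequences=True keeps one output per time step.
    Bidirectional(LSTM(32, return_sequences=True)),
    Bidirectional(LSTM(32, return_sequences=True)),
    # Dense layers act on the last axis, so the time dimension is preserved.
    Dense(32, activation='relu'),
    Dense(16, activation='relu'),
    Dense(N_FEATURES),
])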
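
Every layer in this model preserves the time dimension, so the output has one value per input step. That lines up with the targets only because `N_PAST` and `N_FUTURE` are both 10. The length-preserving behavior of `padding='causal'` can be sketched in pure Python as below; `causal_conv1d` is a hypothetical single-filter helper for illustration, not the Keras implementation.

```python
def causal_conv1d(x, kernel):
    """Single-channel causal convolution: output[t] uses only x[<= t]."""
    k = len(kernel)
    # Left-pad with k - 1 zeros so the output has the same length as x.
    padded = [0.0] * (k - 1) + list(x)
    return [
        sum(kernel[j] * padded[t + j] for j in range(k))
        for t in range(len(x))
    ]


x = [1.0, 2.0, 3.0, 4.0]
kernel = [1.0, 1.0, 1.0]        # running sum over the last three steps
out = causal_conv1d(x, kernel)  # [1.0, 3.0, 6.0, 9.0]

# Changing the last input leaves all earlier outputs unchanged.
out_changed = causal_conv1d([1.0, 2.0, 3.0, 100.0], kernel)
```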

In [None]:
checkpoint_path = 'model/my_checkpoint.ckpt'
checkpoint = ModelCheckpoint(filepath=checkpoint_path,
                             save_weights_only=True,
                             save_best_only=True,
                             monitor='val_mae',
                             verbose=1)
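
With `save_best_only=True` and `monitor='val_mae'`, weights are written only when the monitored value improves, and for this metric a lower value counts as better. A minimal sketch of that bookkeeping follows, using a hypothetical helper and made-up `val_mae` values.

```python
def epochs_that_save(val_maes):
    """Return the 1-based epochs at which a new best (lowest) value occurs."""
    best = float("inf")
    saved = []
    for epoch, value in enumerate(val_maes, start=1):
        if value < best:
            best = value
            saved.append(epoch)
    return saved


hypothetical_val_mae = [0.30, 0.20, 0.25, 0.10]
saved_epochs = epochs_that_save(hypothetical_val_mae)  # [1, 2, 4]
```

Epoch 3 is skipped because 0.25 is worse than the best value seen so far, so the checkpoint file still holds the epoch 2 weights until epoch 4 replaces them.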

In [None]:
optimizer = tf.keras.optimizers.Adam(0.0001)
model.compile(optimizer = optimizer, loss = tf.keras.losses.Huber(), metrics=['mae'])
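
The model is trained on the Huber loss but judged on MAE, so it helps to see how the two differ. The sketch below is a pure-Python version of the per-element Huber formula, assuming the Keras default `delta=1.0`; the function name is hypothetical.

```python
def huber(error, delta=1.0):
    """Quadratic for small errors, linear for large ones."""
    abs_error = abs(error)
    if abs_error <= delta:
        return 0.5 * error * error
    return delta * (abs_error - 0.5 * delta)


small = huber(0.5)   # 0.125, quadratic regime
large = huber(2.0)   # 1.5, linear regime
```

On normalized data, where errors are usually well below 1, the loss sits in the quadratic regime, which is why the reported loss is much smaller than the reported MAE.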

In [None]:
model.fit(train_set, validation_data = valid_set,
          epochs=50,
          callbacks=[checkpoint])

Epoch 1/50
     32/Unknown - 19s 13ms/step - loss: 0.0465 - mae: 0.2308
Epoch 1: val_mae improved from inf to 0.30319, saving model to model/my_checkpoint.ckpt
Epoch 2/50
Epoch 2: val_mae improved from 0.30319 to 0.19678, saving model to model/my_checkpoint.ckpt
Epoch 3/50
Epoch 3: val_mae improved from 0.19678 to 0.07862, saving model to model/my_checkpoint.ckpt
Epoch 4/50
Epoch 4: val_mae improved from 0.07862 to 0.03929, saving model to model/my_checkpoint.ckpt
Epoch 5/50
Epoch 5: val_mae improved from 0.03929 to 0.03826, saving model to model/my_checkpoint.ckpt
Epoch 6/50
Epoch 6: val_mae improved from 0.03826 to 0.03654, saving model to model/my_checkpoint.ckpt
Epoch 7/50
Epoch 7: val_mae improved from 0.03654 to 0.03510, saving model to model/my_checkpoint.ckpt
Epoch 8/50
Epoch 8: val_mae improved from 0.03510 to 0.03366, saving model to model/my_checkpoint.ckpt
Epoch 9/50
Epoch 9: val_mae improved from 0.03366 to 0.03282, saving model to model/my_checkpoint.ckpt
Epoch 10/50
...

<keras.callbacks.History at 0x780fc02e4af0>

In [None]:
model.load_weights(checkpoint_path)

In [None]:
model.evaluate(valid_set)
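
The MAE reported here is in normalized units. Since `normalize_series` divides by the series maximum, multiplying a normalized MAE by that maximum gives the error in dollars per gallon. The sketch below checks this on a tiny example; all values, including the minimum and maximum, are hypothetical and not taken from the dataset.

```python
def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def normalize(values, lo, hi):
    """Same arithmetic as normalize_series: (x - min) / max."""
    return [(v - lo) / hi for v in values]


lo, hi = 1.0, 4.0        # hypothetical series minimum and maximum
actual = [2.0, 4.0]      # hypothetical prices in dollars per gallon
predicted = [2.5, 3.0]   # hypothetical forecasts in dollars per gallon

mae_dollars = mean_absolute_error(actual, predicted)
mae_normalized = mean_absolute_error(
    normalize(actual, lo, hi), normalize(predicted, lo, hi)
)
# Undo the scaling: the subtraction of the minimum cancels in differences.
mae_recovered = mae_normalized * hi
```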