# LSTM Time Series Forecasting Input / Output Shape

### 1. Univariate Multi-step Input LSTM and Single-step Output

### 2. Multivariate Multi-step Input LSTM and Single-step Output

### 3. Univariate Multi-step Input LSTM and Multi-step Output

### 4. Multivariate Multi-step Input LSTM and Multi-step Output

In [1]:
import tensorflow as tf
import numpy as np
from tensorflow.keras.models import Sequential

## 1. Univariate Multi-step Input and Single-step output LSTM 

- Single variable, multi-timestep input, single-timestep output

    e.g., input the stock prices of the past 3 days $\rightarrow$ predict tomorrow's price

### Data preparation

In [46]:
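def windowed_dataset(series, window_size, batch_size, shuffle_buffer):
    # Add a trailing feature axis so each window has shape (window_size + 1, 1).
    series = tf.expand_dims(series, axis=-1)
    ds = tf.data.Dataset.from_tensor_slices(series)
    # Each window holds window_size inputs plus the one value to predict.
    ds = ds.window(window_size + 1, shift=1, drop_remainder=True)
    ds = ds.flat_map(lambda window: window.batch(window_size + 1))
    ds = ds.map(lambda window: (window[:-1], window[-1]))
    # shuffle_buffer is unused here, so windows stay in chronological order.
    ds = ds.batch(batch_size).prefetch(1)
    return ds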

In [47]:
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]

window_size = 3
batch_size = 1
dataset = windowed_dataset(raw_seq, window_size, batch_size, 10)
dataset

<PrefetchDataset shapes: ((None, None, 1), (None, 1)), types: (tf.int32, tf.int32)>

In [55]:
for x, y in dataset:
    print(x.numpy().shape, y.numpy().shape)
    print(x.numpy())
    print(y.numpy())
    break

(1, 3, 1) (1, 1)
[[[10]
  [20]
  [30]]]
[[40]]
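
The loop above prints only the first batch. As a rough illustration of what the whole pipeline yields, here is a plain-Python sketch (no TensorFlow, no shuffling) that lists every input/target pair. The helper name `make_windows` is hypothetical and is not part of the notebook.

```python
# Plain-Python sketch of the sliding-window split (hypothetical helper).
def make_windows(series, window_size):
    """Return (inputs, targets): each input is `window_size` consecutive
    values and the target is the value that immediately follows them."""
    inputs, targets = [], []
    for start in range(len(series) - window_size):
        inputs.append(series[start:start + window_size])
        targets.append(series[start + window_size])
    return inputs, targets


raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
inputs, targets = make_windows(raw_seq, window_size=3)
for x, y in zip(inputs, targets):
    print(x, "->", y)
```

Nine values with a 3-step input and a 1-step target give six pairs, the last one being `[60, 70, 80] -> 90`.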


In [63]:
tf.keras.backend.clear_session()
model = Sequential([
    tf.keras.layers.LSTM(50, activation='relu', 
                         input_shape=[window_size, 1]),
    tf.keras.layers.Dense(1)
])

In [65]:
x_input = np.array([[70, 80, 90]])
x_input = x_input.reshape(1, window_size, 1)  # (batch, timesteps, features)
yhat = model.predict(x_input)

print(x_input.shape, yhat.shape)

(1, 3, 1) (1, 1)
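
Keras LSTM layers take input of shape `(batch, timesteps, features)`, which is why the flat sequence is reshaped to `(1, 3, 1)` before prediction. The sketch below shows the same nesting with plain Python lists. Both helpers, `to_lstm_batch` and `nested_shape`, are hypothetical names used only for illustration.

```python
# Plain-Python view of the (batch, timesteps, features) layout.
def to_lstm_batch(sequence):
    """Wrap a flat list as one sample with one feature per timestep."""
    return [[[value] for value in sequence]]


def nested_shape(obj):
    """Return the shape of a regularly nested list as a tuple."""
    shape = []
    while isinstance(obj, list):
        shape.append(len(obj))
        if not obj:
            break
        obj = obj[0]
    return tuple(shape)


x_input = to_lstm_batch([70, 80, 90])
print(x_input)                # [[[70], [80], [90]]]
print(nested_shape(x_input))  # (1, 3, 1)
```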


## 2. Multivariate Multi-step Input and Single-step Output LSTM 

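- Multiple variables, multi-timestep input $\rightarrow$ single-timestep output

    e.g., input the past 3 days of stock prices and exchange rates to predict the next day's stock price (or exchange rate)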
    ```
    [[ 10,  15,  25],
     [ 20,  25,  45],
     [ 30,  35,  65]]   --> [40]   
    ```

- Define the input sequences

In [66]:
in_seq1 = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = np.array([15, 25, 35, 45, 55, 65, 75, 85, 95])

out_seq = np.array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
out_seq

array([ 25,  45,  65,  85, 105, 125, 145, 165, 185])

- Reshape each series into a `[rows, columns]` structure and stack the columns horizontally.

In [67]:
in_seq1 = in_seq1.reshape(-1, 1)
in_seq2 = in_seq2.reshape(-1, 1)
out_seq = out_seq.reshape(-1, 1)

dataset = np.hstack((in_seq1, in_seq2, out_seq))
dataset

array([[ 10,  15,  25],
       [ 20,  25,  45],
       [ 30,  35,  65],
       [ 40,  45,  85],
       [ 50,  55, 105],
       [ 60,  65, 125],
       [ 70,  75, 145],
       [ 80,  85, 165],
       [ 90,  95, 185]])

### Multiple Input Series

- Two or more parallel input series, and one output that depends on those input time series

    ```
     [[10 15]  
      [20 25]  
      [30 35]] -->  [65]
    ```
```
ds.map(lambda window: (window[:window_size, 0:2], window[-1][-1]))
```

In [68]:
window_size = 3
batch_size = 1

# One dataset element per row: (in_seq1, in_seq2, out_seq).
ds = tf.data.Dataset.from_tensor_slices(dataset)
ds = ds.window(window_size, shift=1, drop_remainder=True)
ds = ds.flat_map(lambda window: window.batch(window_size))
ds = ds.map(lambda window: (window[:window_size, 0:2], window[-1][-1]))
ds = ds.batch(batch_size).prefetch(1)
ds

<PrefetchDataset shapes: ((None, None, 2), (None,)), types: (tf.int64, tf.int64)>

In [69]:
for x, y in ds:
    print(x.numpy().shape, y.numpy().shape)
    print(x.numpy())
    print(y.numpy())
    break

(1, 3, 2) (1,)
[[[10 15]
  [20 25]
  [30 35]]]
[65]
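
In this variant the target sits inside the window itself: the inputs are the first two columns of all three rows, and the target is the last column of the final row. A plain-Python sketch of that slicing, assuming the same 9 x 3 table, might look like this. The helper name `make_windows_multi_input` is hypothetical.

```python
# Rebuild the table: columns are (in_seq1, in_seq2, out_seq = in_seq1 + in_seq2).
rows = [[a, b, a + b] for a, b in zip(range(10, 100, 10), range(15, 100, 10))]


def make_windows_multi_input(rows, window_size, n_inputs):
    """Inputs: first `n_inputs` columns of each row in the window.
    Target: last column of the window's final row."""
    pairs = []
    for start in range(len(rows) - window_size + 1):
        window = rows[start:start + window_size]
        x = [row[:n_inputs] for row in window]
        y = window[-1][-1]
        pairs.append((x, y))
    return pairs


pairs = make_windows_multi_input(rows, window_size=3, n_inputs=2)
for x, y in pairs:
    print(x, "->", y)
```

Because no extra row is needed for the target, nine rows produce seven windows here rather than six.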


In [70]:
tf.keras.backend.clear_session()
model = Sequential([
    tf.keras.layers.LSTM(50, activation='relu', input_shape=[window_size, 2]),
    tf.keras.layers.Dense(1)
])

x_input = np.array([[80, 85], [90, 95], [100, 105]])
x_input = x_input.reshape((1, 3, 2))
yhat = model.predict(x_input)
print(x_input.shape, yhat.shape)

(1, 3, 2) (1, 1)


### Multivariate Multi-step Input and Single-step output LSTM

- Multiple time series are given in parallel, and a prediction is made for each of them

    e.g., input the past 3 days of stock price, exchange rate, and interest rate $\rightarrow$ predict the next day's value of all three at once

       [[ 10,  15,  25],
        [ 20,  25,  45],
        [ 30,  35,  65]]   --> [ 40,  45,  85]

In [72]:
window_size = 3
batch_size = 1
n_features = 3

ds = tf.data.Dataset.from_tensor_slices(dataset)
ds = ds.window(window_size+1, shift=1, drop_remainder=True)
ds = ds.flat_map(lambda window: window.batch(window_size+1))
ds = ds.map(lambda window: (window[:window_size], window[-1]))
ds = ds.batch(batch_size).prefetch(1)
ds

<PrefetchDataset shapes: ((None, None, 3), (None, 3)), types: (tf.int64, tf.int64)>

In [73]:
for x, y in ds:
    print(x.numpy().shape, y.numpy().shape)
    print(x.numpy())
    print(y.numpy())
    break

(1, 3, 3) (1, 3)
[[[10 15 25]
  [20 25 45]
  [30 35 65]]]
[[40 45 85]]
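
For parallel series the target is the entire next row, not a single column. A minimal plain-Python sketch of that rule, assuming the same table as above, follows. The helper name `make_windows_parallel` is hypothetical.

```python
# Rebuild the table: columns are (in_seq1, in_seq2, out_seq = in_seq1 + in_seq2).
rows = [[a, b, a + b] for a, b in zip(range(10, 100, 10), range(15, 100, 10))]


def make_windows_parallel(rows, window_size):
    """Inputs: `window_size` consecutive rows. Target: the full next row."""
    pairs = []
    for start in range(len(rows) - window_size):
        x = rows[start:start + window_size]
        y = rows[start + window_size]
        pairs.append((x, y))
    return pairs


pairs = make_windows_parallel(rows, window_size=3)
print(pairs[0])   # first window and its target row
print(len(pairs)) # number of windows
```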


In [81]:
tf.keras.backend.clear_session()
model = Sequential([
    tf.keras.layers.LSTM(50, activation='relu',
                         input_shape=[window_size, n_features]),
    tf.keras.layers.Dense(n_features)
])

x_input = np.array([[70, 75, 145], [80, 85,165], [90, 95, 185]])
x_input = x_input.reshape((1, 3, 3))
yhat = model.predict(x_input)
print(x_input.shape, yhat.shape)

(1, 3, 3) (1, 3)


## 3. Univariate Multi-step Input Multi-step Output LSTM

- Single variable, multi-timestep input $\rightarrow$ multi-timestep output

    e.g., take the stock prices of the past 3 days as input and predict the prices for the next 2 days
    
```
dataset.map(lambda w: (w[:-n_steps_out], w[-n_steps_out:]))
```

In [82]:
X_train = [10, 20, 30, 40, 50, 60, 70, 80, 90]

window_size = 3
batch_size = 1

In [83]:
def windowed_dataset(series, window_size, batch_size, shuffle_buffer, n_steps_out):
    series = tf.expand_dims(series, axis=-1)
    ds = tf.data.Dataset.from_tensor_slices(series)
    ds = ds.window(window_size + n_steps_out, shift=1, drop_remainder=True)
    ds = ds.flat_map(lambda w: w.batch(window_size+n_steps_out))
    ds = ds.shuffle(shuffle_buffer).map(lambda w: (w[:-n_steps_out], w[-n_steps_out:]))
    ds = ds.batch(batch_size).prefetch(1)
    return ds

In [84]:
window_size = 3
n_steps_out = 2
shuffle_buffer_size = 100

train_set = windowed_dataset(X_train, window_size, batch_size, 
                             shuffle_buffer_size, n_steps_out)
train_set

<PrefetchDataset shapes: ((None, None, 1), (None, None, 1)), types: (tf.int32, tf.int32)>

In [85]:
for x, y in train_set:
    print(x.numpy().shape, y.numpy().shape)
    print(x.numpy())
    print(y.numpy())
    break

(1, 3, 1) (1, 2, 1)
[[[20]
  [30]
  [40]]]
[[[50]
  [60]]]
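
The printed batch starts at 20 rather than 10 because the pipeline shuffles the windows. To see all pairs in chronological order, the split can be sketched in plain Python. The helper name `make_multistep_windows` is hypothetical.

```python
def make_multistep_windows(series, window_size, n_steps_out):
    """Split each window of length window_size + n_steps_out into
    (first window_size values, last n_steps_out values)."""
    total = window_size + n_steps_out
    pairs = []
    for start in range(len(series) - total + 1):
        window = series[start:start + total]
        pairs.append((window[:-n_steps_out], window[-n_steps_out:]))
    return pairs


X_train = [10, 20, 30, 40, 50, 60, 70, 80, 90]
pairs = make_multistep_windows(X_train, window_size=3, n_steps_out=2)
for x, y in pairs:
    print(x, "->", y)
```

Each window now spans five values, so nine values give five pairs.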


In [87]:
tf.keras.backend.clear_session()
model = Sequential([
    tf.keras.layers.LSTM(50, activation='relu',
                         input_shape=[window_size, 1]),
    tf.keras.layers.Dense(n_steps_out)
])

x_input = np.array([[70, 80, 90]])
x_input = x_input.reshape(1, -1, 1)
yhat = model.predict(x_input)
print(x_input.shape, yhat.shape)

(1, 3, 1) (1, 2)


## 4. Multivariate Multi-step Input Multi-step Output LSTM

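- Multiple variables, multi-timestep input $\rightarrow$ multi-timestep output

    e.g., input the past 3 days of stock price, exchange rate, and interest rate $\rightarrow$ predict the stock price for the next 2 days
```
       [[ 10,  15,  25],
        [ 20,  25,  45],
        [ 30,  35,  65]]   --> [ 40, 50]
```

```
ds.map(lambda w: (w[:-n_steps_out], w[-n_steps_out:][0]))
```

Note that `w[-n_steps_out:][0]` keeps only the first of the `n_steps_out` target rows, so each target is a single row of shape `(n_features,)`, as the output further down shows. Selecting the first column with `w[-n_steps_out:, 0]` instead would return the next `n_steps_out` stock prices.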

In [88]:
in_seq1 = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = np.array([15, 25, 35, 45, 55, 65, 75, 85, 95])

out_seq = np.array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
out_seq

array([ 25,  45,  65,  85, 105, 125, 145, 165, 185])

In [95]:
in_seq1 = in_seq1.reshape(-1, 1)
in_seq2 = in_seq2.reshape(-1, 1)
out_seq = out_seq.reshape(-1, 1)

X_train = np.hstack((in_seq1, in_seq2, out_seq))
X_train

array([[ 10,  15,  25],
       [ 20,  25,  45],
       [ 30,  35,  65],
       [ 40,  45,  85],
       [ 50,  55, 105],
       [ 60,  65, 125],
       [ 70,  75, 145],
       [ 80,  85, 165],
       [ 90,  95, 185]])

In [98]:
def windowed_dataset(series, window_size, batch_size, shuffle_buffer, n_steps_out):
    ds = tf.data.Dataset.from_tensor_slices(series)
    ds = ds.window(window_size + n_steps_out, shift=1, drop_remainder=True)
    ds = ds.flat_map(lambda w: w.batch(window_size+n_steps_out))
    ds = ds.shuffle(shuffle_buffer).map(lambda w: (w[:-n_steps_out], w[-n_steps_out:][0]))
    ds = ds.batch(batch_size).prefetch(1)
    return ds

In [102]:
window_size = 3  #input steps
n_steps_out = 2
shuffle_buffer_size = 100

train_set = windowed_dataset(X_train, window_size, batch_size, 
                             shuffle_buffer_size, n_steps_out)
train_set

<PrefetchDataset shapes: ((None, None, 3), (None, 3)), types: (tf.int64, tf.int64)>

In [103]:
for x, y in train_set:
    print(x.numpy().shape, y.numpy().shape)
    print(x.numpy())
    print(y.numpy())
    break

(1, 3, 3) (1, 3)
[[[20 25 45]
  [30 35 65]
  [40 45 85]]]
[[ 50  55 105]]
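
The target printed above is one full row. If the goal is the next two values of the first column only (the stock price in the example), the slicing could be sketched in plain Python as follows. This is an assumed alternative, not the notebook's pipeline, and the helper name `make_windows_multistep_target` and its `target_col` parameter are hypothetical.

```python
# Rebuild the table: columns are (in_seq1, in_seq2, out_seq = in_seq1 + in_seq2).
rows = [[a, b, a + b] for a, b in zip(range(10, 100, 10), range(15, 100, 10))]


def make_windows_multistep_target(rows, window_size, n_steps_out, target_col=0):
    """Inputs: `window_size` full rows.
    Target: `target_col` of the following `n_steps_out` rows."""
    total = window_size + n_steps_out
    pairs = []
    for start in range(len(rows) - total + 1):
        window = rows[start:start + total]
        x = window[:window_size]
        y = [row[target_col] for row in window[window_size:]]
        pairs.append((x, y))
    return pairs


pairs = make_windows_multistep_target(rows, window_size=3, n_steps_out=2)
for x, y in pairs:
    print(x, "->", y)
```

With targets of length `n_steps_out`, a final `Dense(n_steps_out)` layer would match the target shape, whereas `Dense(n_features)` matches the full-row target used in the notebook.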


In [106]:
tf.keras.backend.clear_session()

n_features = 3

model = Sequential([
    tf.keras.layers.LSTM(100, activation='relu', 
                         input_shape=(window_size, n_features)),
    tf.keras.layers.Dense(n_features),
])

x_input = np.array([[10,  15,  25], [20,  25,  45], [30,  35,  65]])
x_input = x_input.reshape((1, window_size, n_features))

yhat = model.predict(x_input)
print(x_input.shape, yhat.shape)

(1, 3, 3) (1, 3)
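
As a recap, the sketch below collects the batch shapes printed in each section and computes how many windows nine rows yield in each case. The shapes are copied from the outputs above with `batch_size = 1`. The helper `count_windows` is a hypothetical name, and its formula assumes `shift=1` with `drop_remainder=True`.

```python
def count_windows(n_rows, window_size, n_steps_out):
    """Number of full windows of length window_size + n_steps_out
    obtained by sliding over n_rows rows one step at a time."""
    return max(n_rows - (window_size + n_steps_out) + 1, 0)


# (label, rows taken beyond the input window, X batch shape, y batch shape)
cases = [
    ("1. univariate, single-step", 1, (1, 3, 1), (1, 1)),
    ("2. multiple input series", 0, (1, 3, 2), (1,)),
    ("2. parallel series", 1, (1, 3, 3), (1, 3)),
    ("3. univariate, multi-step", 2, (1, 3, 1), (1, 2, 1)),
    ("4. multivariate, multi-step", 2, (1, 3, 3), (1, 3)),
]

summary = [
    (label, count_windows(9, 3, extra), x_shape, y_shape)
    for label, extra, x_shape, y_shape in cases
]
for label, n_windows, x_shape, y_shape in summary:
    print(f"{label}: {n_windows} windows, X {x_shape}, y {y_shape}")
```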
