# LSTM Time Series Forecasting Input / Output Shape

### Univariate Multi-step Input LSTM and Single-step Output

### Multivariate Multi-step Input LSTM and Single-step Output

**The number of input time steps (multi-step size) equals the window size.**

In [1]:
import tensorflow as tf
import numpy as np
from tensorflow.keras.models import Sequential

## 1. Univariate Multi-step Input and Single-step output LSTM 

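- A single variable over multiple time steps as input, a single time step as output

- input features: 1, output units: 1

    e.g. the closing prices of the past 3 days as input $\rightarrow$ a forecast of tomorrow's price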

In [2]:
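def windowed_ds(series, window_size, batch_size, shuffle_buffer):
    # Note: shuffle_buffer is accepted but not used; windows are emitted in order
    ds = tf.data.Dataset.from_tensor_slices(series)
    # Each window holds window_size input values plus 1 target value
    ds = ds.window(window_size + 1, shift=1, drop_remainder=True)
    ds = ds.flat_map(lambda window: window.batch(window_size + 1))
    # Split into (inputs, target): all but the last value, then the last value
    ds = ds.map(lambda window: (window[:-1], window[-1]))
    ds = ds.batch(batch_size).prefetch(1)
    return ds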

In [3]:
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]

window_size = 3
batch_size = 1
dataset = windowed_ds(raw_seq, window_size, batch_size, 10)
dataset

<PrefetchDataset shapes: ((None, None), (None,)), types: (tf.int32, tf.int32)>

In [4]:
for x, y in dataset:
    print(x.numpy().shape, y.numpy().shape)
    print(x.numpy())
    print(y.numpy())
    break

(1, 3) (1,)
[[10 20 30]]
[40]
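The same windowing can be sketched in plain NumPy to make the shapes explicit. This is an illustration rather than part of the notebook: it assumes NumPy 1.20 or later for `sliding_window_view`, and the names `windows`, `X`, `y`, and `X_3d` are hypothetical. The batch printed above has shape `(1, 3)`, whereas an LSTM layer declared with `input_shape=[window_size, 1]` expects `(batch, timesteps, features)`, so the sketch also adds a trailing feature axis.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

raw_seq = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90])
window_size = 3

# Each row holds window_size input values followed by 1 target value.
windows = sliding_window_view(raw_seq, window_size + 1)

X = windows[:, :-1]        # inputs:  (samples, timesteps)
y = windows[:, -1]         # targets: (samples,)

# Add a feature axis so the inputs become (samples, timesteps, features).
X_3d = X[..., np.newaxis]

print(X.shape, y.shape, X_3d.shape)
print(X[0], y[0])
```

The first row reproduces the pair shown above: inputs `[10 20 30]` with target `40`.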


In [5]:
model = Sequential([
    tf.keras.layers.LSTM(50, activation='relu', input_shape=[window_size, 1]),
    tf.keras.layers.Dense(1)
])

model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
lstm (LSTM)                  (None, 50)                10400     
_________________________________________________________________
dense (Dense)                (None, 1)                 51        
Total params: 10,451
Trainable params: 10,451
Non-trainable params: 0
_________________________________________________________________
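The parameter counts in the summary can be checked by hand. A standard LSTM layer has four gate blocks, each with input weights, recurrent weights, and a bias, so the count is `4 * (units * (features + units) + units)`. The sketch below assumes the default Keras settings (bias enabled); the helper names `lstm_param_count` and `dense_param_count` are hypothetical.

```python
def lstm_param_count(units, n_features):
    """Parameters of a standard LSTM layer with bias.

    Four gate blocks, each with:
      - input weights:     n_features * units
      - recurrent weights: units * units
      - bias:              units
    """
    return 4 * (units * (n_features + units) + units)


def dense_param_count(n_inputs, n_outputs):
    """Parameters of a fully connected layer with bias."""
    return n_inputs * n_outputs + n_outputs


lstm_params = lstm_param_count(units=50, n_features=1)
dense_params = dense_param_count(n_inputs=50, n_outputs=1)
print(lstm_params, dense_params, lstm_params + dense_params)
```

With 50 units and 1 input feature this gives 10,400 for the LSTM and 51 for the dense layer, matching the 10,451 total in the summary. With 2 input features the LSTM count becomes 10,600.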


In [6]:
# The last window_size values of the series
x_input = np.array([[70, 80, 90]])
# Reshape to (batch, timesteps, features)
x_input = x_input.reshape(1, window_size, 1)
yhat = model.predict(x_input)

print(x_input.shape, yhat.shape)

(1, 3, 1) (1, 1)


## 2. Multivariate Multi-step Input and Single-step Output LSTM 

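- Several variables over multiple time steps as input $\rightarrow$ a single time step as output

- input features: n, output units: 1

    e.g. feed the past 3 days of stock prices and exchange rates to predict the next day's stock price (or exchange rate)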
    ```
    [[ 10,  15],
     [ 20,  25],
     [ 30,  35]]   --> [40]   
    ```

- Define the input sequences

In [7]:
in_stock = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_forex = np.array([15, 25, 35, 45, 55, 65, 75, 85, 95])

out_seq = in_stock[3:]
out_seq

array([40, 50, 60, 70, 80, 90])

- Reshape each series into a `[rows, columns]` structure and stack the columns horizontally.

In [8]:
in_stock = in_stock.reshape(-1, 1)
in_forex = in_forex.reshape(-1, 1)

raw_seq = np.hstack((in_stock, in_forex))
raw_seq

array([[10, 15],
       [20, 25],
       [30, 35],
       [40, 45],
       [50, 55],
       [60, 65],
       [70, 75],
       [80, 85],
       [90, 95]])

In [9]:
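def windowed_ds(series, window_size, batch_size, shuffle_buffer):
    # Note: shuffle_buffer is accepted but not used; windows are emitted in order
    ds = tf.data.Dataset.from_tensor_slices(series)
    # Each window holds window_size input rows plus 1 target row
    ds = ds.window(window_size + 1, shift=1, drop_remainder=True)
    ds = ds.flat_map(lambda window: window.batch(window_size + 1))
    # Inputs: all features of the first window_size rows
    # Target: feature 0 (stock price) of the last row
    ds = ds.map(lambda window: (window[:-1], window[-1, 0]))
    ds = ds.batch(batch_size).prefetch(1)
    return ds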

In [10]:
window_size = 3
batch_size = 1
dataset = windowed_ds(raw_seq, window_size, batch_size, 10)
dataset

<PrefetchDataset shapes: ((None, None, 2), (None,)), types: (tf.int32, tf.int32)>

In [11]:
for x, y in dataset:
    print(x.numpy().shape, y.numpy().shape)
    print(x.numpy(), y.numpy())
    break

(1, 3, 2) (1,)
[[[10 15]
  [20 25]
  [30 35]]] [40]
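As a cross-check, the multivariate windows can also be built in plain NumPy and compared against `out_seq = in_stock[3:]` defined earlier. This is a sketch under assumptions: it needs NumPy 1.20 or later for `sliding_window_view`, and the names `windows`, `X`, and `y` are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

in_stock = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_forex = np.array([15, 25, 35, 45, 55, 65, 75, 85, 95])
raw_seq = np.column_stack((in_stock, in_forex))   # (9, 2)
window_size = 3

# Sliding over axis 0 appends the window axis last: (samples, features, window).
windows = sliding_window_view(raw_seq, window_size + 1, axis=0)
# Reorder to (samples, window, features).
windows = windows.transpose(0, 2, 1)

X = windows[:, :-1, :]   # first window_size rows, all features
y = windows[:, -1, 0]    # stock price (feature 0) of the row after the window

print(X.shape, y.shape)
print(X[0], y[0])
print(np.array_equal(y, in_stock[window_size:]))
```

Each input has shape `(timesteps, features) = (3, 2)`, and the targets are the stock series shifted by `window_size`.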


In [17]:
model = Sequential([
    tf.keras.layers.LSTM(50, activation='relu', input_shape=[window_size, 2]),
    tf.keras.layers.Dense(1)
])

model.summary()

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
lstm_2 (LSTM)                (None, 50)                10600     
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 51        
Total params: 10,651
Trainable params: 10,651
Non-trainable params: 0
_________________________________________________________________


In [20]:
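# The last window_size time steps, each with [stock, forex] features
x_input = np.array([[70, 75], [80, 85], [90, 95]])
# Reshape to (batch, timesteps, features)
x_input = x_input.reshape(1, window_size, 2)
yhat = model.predict(x_input)

print(x_input.shape, yhat.shape)

(1, 3, 2) (1, 1)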


### Stock Price Prediction Using Multiple Variables

In [21]:
import yfinance as yf

df = yf.download('AAPL', start='2015-01-01', end='2019-12-31', progress=False)
df.head()

| Date | Open | High | Low | Close | Adj Close | Volume |
|---|---|---|---|---|---|---|
| 2014-12-31 | 28.205 | 28.282499 | 27.5525 | 27.594999 | 24.951866 | 165613600 |
| 2015-01-02 | 27.8475 | 27.860001 | 26.8375 | 27.3325 | 24.714504 | 212818400 |
| 2015-01-05 | 27.0725 | 27.1625 | 26.352501 | 26.5625 | 24.018261 | 257142000 |
| 2015-01-06 | 26.635 | 26.8575 | 26.157499 | 26.565001 | 24.020519 | 263188400 |
| 2015-01-07 | 26.799999 | 27.049999 | 26.674999 | 26.9375 | 24.357342 | 160423600 |


In [22]:
# Select the Close (column 3) and Volume (column 5) columns
dataset = df.iloc[:, [3, 5]].values
dataset.shape

(1258, 2)

In [23]:
window_size = 3
batch_size = 1

ds = tf.data.Dataset.from_tensor_slices(dataset)
# Each window holds window_size input rows plus 1 target row
ds = ds.window(window_size+1, shift=1, drop_remainder=True)
ds = ds.flat_map(lambda w: w.batch(window_size+1))
# Inputs: [Close, Volume] of the first window_size rows; target: Close of the last row
ds = ds.map(lambda w: (w[:-1], w[-1][0]))
ds = ds.batch(batch_size).prefetch(1)

for x, y in ds:
    print(x.shape, y.shape)
    print()
    print(x.numpy(), y.numpy())
    break

(1, 3, 2) (1,)

[[[2.75949993e+01 1.65613600e+08]
  [2.73325005e+01 2.12818400e+08]
  [2.65625000e+01 2.57142000e+08]]] [26.56500053]
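With `shift=1` and `drop_remainder=True`, a series of `N` rows yields `N - window_size` complete windows, because each window spans `window_size + 1` rows. The sketch below works through that arithmetic for the two series used here; the helper names `num_windows` and `num_batches` are hypothetical.

```python
import math


def num_windows(n_rows, window_size):
    """Complete (inputs, target) windows for shift=1 and drop_remainder=True."""
    # Each window spans window_size + 1 rows, so the count is
    # n_rows - (window_size + 1) + 1 = n_rows - window_size.
    return max(n_rows - window_size, 0)


def num_batches(n_rows, window_size, batch_size):
    """Batches produced when the final partial batch is kept."""
    return math.ceil(num_windows(n_rows, window_size) / batch_size)


toy_windows = num_windows(n_rows=9, window_size=3)
aapl_windows = num_windows(n_rows=1258, window_size=3)
aapl_batches = num_batches(n_rows=1258, window_size=3, batch_size=1)
print(toy_windows, aapl_windows, aapl_batches)
```

The 9-value toy series gives 6 windows, matching the 6 targets in `out_seq`, and the 1258-row AAPL array gives 1255 windows, or 1255 batches at `batch_size = 1`.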
