# Prompt 
* HW assignment: Using dataset selected for final project:


* Perform feature engineering
* Estimate baseline model
* Estimate different model and/or different loss function to improve model performance
* Hint: Determine what metric(s) is/are appropriate for your use case
* Interpret results
* Explain what you did and why

## Write-up
* Goal: Forecast the last 6 weeks of the time series for each store.
* Kaggle Prompt: Rossmann operates over 3,000 drug stores in 7 European countries. Currently, Rossmann store managers are tasked with predicting their daily sales for up to six weeks in advance. Store sales are influenced by many factors, including promotions, competition, school and state holidays, seasonality, and locality. With thousands of individual managers predicting sales based on their unique circumstances, the accuracy of results can be quite varied. In their first Kaggle competition, Rossmann is challenging you to predict 6 weeks of daily sales for 1,115 stores located across Germany. Reliable sales forecasts enable store managers to create effective staff schedules that increase productivity and motivation. By helping Rossmann create a robust prediction model, you will help store managers stay focused on what’s most important to them: their customers and their teams! 
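
As a quick arithmetic check on the six-week goal, the sketch below works out where a 42-day holdout window would start. It assumes the series ends on 2015-07-31 (the date at the top of `train.head()` in this notebook); `last_day`, `horizon_days`, `cutoff`, and `holdout_days` are illustrative names, not part of the original code.

```python
from datetime import date, timedelta

# Assumption: the training series ends on 2015-07-31
# (the date shown at the top of train.head()).
last_day = date(2015, 7, 31)
horizon_days = 6 * 7  # six weeks of daily forecasts

# First day of the holdout window, counting last_day itself as the final day.
cutoff = last_day - timedelta(days=horizon_days - 1)
holdout_days = (last_day - cutoff).days + 1

print(cutoff, holdout_days)
```

Under that assumption the window starts on 2015-06-20, which matches the split date used in the Prophet section.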

**Variable Importance**

I first fit a GBM as a quick check on which variables matter. I do not set this up as a standard time series forecasting problem: I add no lags (previous day, week, or month), I do not split the dataset by time, and I do not prevent the train and test periods from overlapping. Even so, the variable importance output shows that the following variables are the most important: 

    1. "Open": if store is open
    2. "Promo": if that Day is a promo day
    3. "CompetitionDistance": nearest competitor distance
    
**Feature Engineering**

I could not think of many additional features. Because "Open" and "Promo" ranked so highly, I added two that seemed promising: the number of days from each date back to the previous open day, and the number of days back to the previous promo day.
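
A minimal sketch of the "days since the previous event" idea, written in plain Python for readability. The notebook's `day_dist` function does the equivalent with pandas (`groupby`, `ffill`, `shift`); the helper name `days_since_previous_event` and the toy data here are hypothetical.

```python
from datetime import date

def days_since_previous_event(dates, flags):
    """For each date, count the full days between it and the most recent
    earlier date whose flag was 1 (None if there is no earlier event)."""
    result = []
    last_event = None
    for day, flag in zip(dates, flags):
        # Only events strictly before the current day are considered.
        result.append(None if last_event is None else (day - last_event).days - 1)
        if flag == 1:
            last_event = day
    return result

# Hypothetical single-store example: promo on the 1st and 2nd, then none.
dates = [date(2015, 7, d) for d in range(1, 6)]
promo = [1, 1, 0, 0, 0]
daydiff_promo = days_since_previous_event(dates, promo)
print(daydiff_promo)
```

The `- 1` means an event on the immediately preceding day gives 0, and `None` stands in for the `NaN` that the pandas version produces before a store's first event.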

**1st Attempt: Prophet w/o addtl regressors**
As a baseline, I use <a href="https://research.fb.com/prophet-forecasting-at-scale/">Prophet from Facebook</a>. It is a time series model that works well as a baseline because it makes relatively few assumptions about the underlying data structure.

**2nd Attempt: Prophet w/ addtl regressors**
Prophet is also simple to implement and accepts additional regressors. The second version of the model adds regressors for the top 3 variables by importance, plus the two features I created, "daydiff_open_dt" and "daydiff_promo_dt". These measure the number of days from the date being forecast back to the previous day the store was open and to the previous day the store ran a promo, respectively.

**Metric Selection**
The competition uses RMSPE (Root Mean Squared Percentage Error) and excludes any day with 0 sales. I like that it penalizes large errors. However, because each error is measured relative to the actual sales value, it treats errors at very different scales as equal, so I also recommend looking at RMSE. Since the competition says the purpose is to help managers staff their stores effectively, I assume that staffing scales with sales volume (a certain number of salespeople per unit of sales). The problem with RMSPE is this: with a single store/day, if we forecast 50,000 in sales and the actual comes in at 100,000, the RMSPE is 50%. If we instead forecast 50 in sales and the actual comes in at 100, the RMSPE is **also** 50%. To me, the first is a much bigger error in terms of staffing (given some ratio of sales to salespeople). RMSE reflects this: it is 50,000 for the first scenario and 50 for the second.
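
The two scenarios above can be reproduced with a few lines of standard-library Python. This is only a sketch: the functions mirror the notebook's `rmse` and `rmspe` definitions (zero-sales days excluded, percentage error taken relative to the actual value) but operate on plain lists, and the names `large` and `small` are made up for the example.

```python
import math

def _nonzero_pairs(actual, forecast):
    # Days with zero actual sales are excluded, as in the competition metric.
    return [(a, f) for a, f in zip(actual, forecast) if a != 0]

def rmse(actual, forecast):
    pairs = _nonzero_pairs(actual, forecast)
    return math.sqrt(sum((a - f) ** 2 for a, f in pairs) / len(pairs))

def rmspe(actual, forecast):
    # Each error is divided by the actual value before squaring.
    pairs = _nonzero_pairs(actual, forecast)
    return math.sqrt(sum(((a - f) / a) ** 2 for a, f in pairs) / len(pairs))

large = ([100_000], [50_000])  # (actual, forecast) for a high-volume store/day
small = ([100], [50])          # (actual, forecast) for a low-volume store/day

for name, (actual, forecast) in [("large", large), ("small", small)]:
    print(name, "rmse:", rmse(actual, forecast), "rmspe:", rmspe(actual, forecast))
```

Both scenarios give the same RMSPE, while RMSE differs by a factor of 1,000, which is why the two metrics are reported side by side.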



* I noticed that there seemed to be an 

### Import Packages and Data

In [1]:
from multiprocessing import Pool, cpu_count

import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import pandas as pd
pd.set_option('display.float_format', lambda x: '%.3f' % x)

In [2]:
pwd

'/home/paperspace/ANDREW-MINKYU-SANG/final_proj/workprod'

In [3]:
cd /home/paperspace/ANDREW-MINKYU-SANG/

/home/paperspace/ANDREW-MINKYU-SANG


In [4]:
# cd /Users/andrewsang/Documents/ucla_stats/

In [5]:
ls

[0m[01;34mClass0-Intro[0m/  [01;34mClass2[0m/  [01;34mClass4[0m/  [01;34mfinal_proj[0m/  req.txt
[01;34mClass1[0m/        [01;34mClass3[0m/  [01;34mClass5[0m/  README.md    requirements.txt


In [6]:
train = pd.read_csv('final_proj/input_data/train.csv')
test = pd.read_csv('final_proj/input_data/test.csv')
store = pd.read_csv('final_proj/input_data/store.csv')



In [7]:
# train.loc[:,'Date'] = pd.to_datetime(train['Date'])
train.head()

Unnamed: 0,Store,DayOfWeek,Date,Sales,Customers,Open,Promo,StateHoliday,SchoolHoliday
0,1,5,2015-07-31,5263,555,1,1,0,1
1,2,5,2015-07-31,6064,625,1,1,0,1
2,3,5,2015-07-31,8314,821,1,1,0,1
3,4,5,2015-07-31,13995,1498,1,1,0,1
4,5,5,2015-07-31,4822,559,1,1,0,1


### Functions for Data Manipulation and Feature Engineering

In [8]:
def data_manipulation(dataframe):
    dataframe = dataframe.merge(store,how='left',on='Store')
    mask = pd.isnull(dataframe['CompetitionDistance'])
    # replace 3 stores with median where competition distance is null
    dataframe.loc[mask,'CompetitionDistance'] = store['CompetitionDistance'].median() 
    return dataframe

In [9]:
def day_dist(dataframe, start_dt):
    # For each row, measure the days between the current date and the most
    # recent earlier date (within the same store) on which `start_dt` is set.
    dataframe = dataframe.sort_values(['Store','Date'])
    # ffill carries the last event date forward; shift(1) excludes the current row
    dataframe['prev_'+start_dt] = dataframe.groupby('Store')[start_dt] \
                                           .transform(lambda x: x.ffill().shift(1))
    # minus 1 so that an event on the previous day gives 0;
    # stays NaN where the store has no earlier event
    dataframe['daydiff_'+start_dt] = (dataframe['Date'] - dataframe['prev_'+start_dt]) / \
                                      np.timedelta64(1, 'D') - 1
    return dataframe

def ftr_eng(dataframe):
    # parse dates first so the event-date columns below are true datetimes
    dataframe['Date'] = pd.to_datetime(dataframe['Date'])
    # event-date columns: the row's date where the condition holds, NaT otherwise
    dataframe['promo_dt'] = dataframe['Date'].where(dataframe['Promo']==1)
    dataframe['closed_dt'] = dataframe['Date'].where(dataframe['Open']==0)
    dataframe['open_dt'] = dataframe['Date'].where(dataframe['Open']==1)

    dataframe = day_dist(dataframe,'open_dt') # days since the most recent earlier open day
    dataframe = day_dist(dataframe,'promo_dt') # days since the most recent earlier promo day
    return dataframe

In [10]:
def piv_df(dataframe):
    return pd.pivot_table(dataframe,values='Sales',index=['Store'], columns=['Date']).reset_index()

## Use GBM for Variable Importance

In [11]:
df = pd.read_csv('final_proj/input_data/train.csv')
df = data_manipulation(df)
df = ftr_eng(df)

In [13]:
import h2o
from h2o.estimators.gbm import H2OGradientBoostingEstimator

h2o.init()

hf = h2o.H2OFrame(df)

# Set up X and Y columns
excluded = ['Store','Sales','Customers','open_dt','closed_dt','promo_dt']
X = [c for c in df.columns if c not in excluded]
y = 'Sales'

# Split Frame
train, valid, test = hf.split_frame([0.6, 0.2], seed=1234)

# Specify Model
gbm = H2OGradientBoostingEstimator(seed=123)
gbm.train(X, y, training_frame=train, validation_frame=valid)

# Summary
gbm

Checking whether there is an H2O instance running at http://localhost:54321..... not found.
Attempting to start a local H2O server...
  Java Version: openjdk version "1.8.0_191"; OpenJDK Runtime Environment (build 1.8.0_191-8u191-b12-2ubuntu0.16.04.1-b12); OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)
  Starting server from /home/paperspace/anaconda3/envs/fastai/lib/python3.6/site-packages/h2o/backend/bin/h2o.jar
  Ice root: /tmp/tmp3xk96i_3
  JVM stdout: /tmp/tmp3xk96i_3/h2o_paperspace_started_from_python.out
  JVM stderr: /tmp/tmp3xk96i_3/h2o_paperspace_started_from_python.err
  Server is running at http://127.0.0.1:54321
Connecting to H2O server at http://127.0.0.1:54321... successful.


H2O cluster uptime:,01 secs
H2O cluster timezone:,America/New_York
H2O data parsing timezone:,UTC
H2O cluster version:,3.22.1.2
H2O cluster version age:,25 days
H2O cluster name:,H2O_from_python_paperspace_4wg7xo
H2O cluster total nodes:,1
H2O cluster free memory:,6.543 Gb
H2O cluster total cores:,8
H2O cluster allowed cores:,8


Model Details
H2OGradientBoostingEstimator :  Gradient Boosting Machine
Model Key:  GBM_model_python_1550116348767_1


ModelMetricsRegression: gbm
** Reported on train data. **

MSE: 4727991.869292869
RMSE: 2174.3945983406206
MAE: 1461.935261250391
RMSLE: NaN
Mean Residual Deviance: 4727991.869292869

ModelMetricsRegression: gbm
** Reported on validation data. **

MSE: 4713480.94515923
RMSE: 2171.055260733644
MAE: 1458.884191873281
RMSLE: NaN
Mean Residual Deviance: 4713480.94515923
Scoring History: 


,timestamp,duration,number_of_trees,training_rmse,training_mae,training_deviance,validation_rmse,validation_mae,validation_deviance
,2019-02-13 22:53:09,0.018 sec,0.0,3848.8251131,2885.9870348,14813454.7514988,3851.0255084,2889.8053439,14830397.4666661
,2019-02-13 22:53:11,2.210 sec,1.0,3629.3644376,2724.6232774,13172286.2210690,3630.8297633,2727.6975636,13182924.7700954
,2019-02-13 22:53:12,2.704 sec,2.0,3441.1792135,2585.5925094,11841714.3796366,3441.8948054,2587.9473207,11846639.8515635
,2019-02-13 22:53:12,2.961 sec,3.0,3280.2457228,2464.2569246,10760012.0016976,3280.4042185,2466.0787872,10761051.8369250
,2019-02-13 22:53:12,3.181 sec,4.0,3143.2508450,2360.7670172,9880025.8745346,3142.6874460,2361.9834542,9876484.3831225
,2019-02-13 22:53:13,3.379 sec,5.0,3026.4024134,2270.1669576,9159111.5680627,3025.3213713,2270.9906560,9152569.3998255
,2019-02-13 22:53:13,3.580 sec,6.0,2927.6902372,2191.6179454,8571370.1251473,2926.3661259,2191.8715543,8563618.7030274
,2019-02-13 22:53:13,3.754 sec,7.0,2844.1301747,2123.0809102,8089076.4507835,2842.2654479,2122.9725609,8078472.8761332
,2019-02-13 22:53:13,3.920 sec,8.0,2772.5379590,2061.5546515,7686966.7339068,2770.3757016,2061.0668882,7674981.5278258


Variable Importances: 


variable,relative_importance,scaled_importance,percentage
Open,21930090430464.0000000,1.0,0.6765801
Promo,3505222582272.0000000,0.1598362,0.1081420
CompetitionDistance,1262618411008.0000000,0.0575747,0.0389539
DayOfWeek,934970982400.0000000,0.0426342,0.0288454
StoreType,920981995520.0000000,0.0419963,0.0284138
Assortment,794724663296.0000000,0.0362390,0.0245186
Promo2SinceWeek,632819679232.0000000,0.0288562,0.0195236
Promo2SinceYear,592458612736.0000000,0.0270158,0.0182783
CompetitionOpenSinceMonth,459221336064.0000000,0.0209402,0.0141677




## Prophet

In [14]:
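def prophetFormat(dataframe):
    # Prophet expects the target in column 'y' and the timestamp in column 'ds'
    dataframe = dataframe.rename(index=str, columns={"Sales": "y","Date": "ds"})
    # assign columns directly so the converted dtype replaces the old one
    dataframe['ds'] = pd.to_datetime(dataframe['ds'])
    for c in ['Promo','Open','CompetitionDistance']:
        dataframe[c] = dataframe[c].astype(float)
    dataframe['floor'] = 0
    # missing values (e.g. daydiff_* before a store's first event) become 0
    dataframe = dataframe.fillna(0)
    return dataframe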

In [15]:
# note: releases from 1.0 onward are packaged as `prophet` (from prophet import Prophet)
from fbprophet import Prophet

og = pd.read_csv('final_proj/input_data/train.csv')
og = data_manipulation(og)
og = ftr_eng(og)
og = prophetFormat(og)

date_change = pd.to_datetime('2015-06-20') # this date leaves 42 days post
mask = og['ds']>=date_change
test = og[mask]
train = og[~mask]



In [35]:
store_list = train['Store'].unique().tolist()

def run_indiv_forecast(idx, addtl_reg=False):
    # pull the train and holdout rows for one store
    store_id = store_list[idx]
    sample = train.loc[train['Store']==store_id,:]
    test_sample = test.loc[test['Store']==store_id,:]
    
    # create model, add regressors
    m = Prophet(daily_seasonality=False)
    if addtl_reg:
        m.add_regressor('Promo',mode='multiplicative')
        m.add_regressor('Open',mode='multiplicative')
        m.add_regressor('CompetitionDistance')
        m.add_regressor('daydiff_open_dt')
        m.add_regressor('daydiff_promo_dt')
    m.fit(sample)
    
    # forecast the holdout dates, merge w actuals
    forecast = m.predict(test_sample)
    forecast['Store'] = store_id
    fc = forecast[['Store','ds','yhat']]
    fc = fc.merge(test_sample.loc[:,['ds','y']],how='left',on=['ds'])
    return fc

def run_individ_forecast_w_reg(idx):
    return run_indiv_forecast(idx,addtl_reg=True)

### 1st Attempt: Prophet w/o addtl regressors

The parallelization approach is adapted from: https://medium.com/devschile/forecasting-multiples-time-series-using-prophet-in-parallel-2515abd1a245

In [21]:
cpu_count()

8

In [25]:
import time

# Run Version without External Regressors
store_list = train.Store.unique()
start_time = time.time()

p = Pool(cpu_count())
seriesidx = np.arange(0,train['Store'].nunique())
predictions = list(p.imap(run_indiv_forecast, seriesidx))
p.close()
p.join()

print("-- %s seconds --" % (time.time() - start_time))

# stack the per-store forecasts into a single frame
results = pd.concat(predictions)



-- 287.1863389015198 seconds --


In [26]:
# Output results to csv
results.to_csv('results.csv')

In [49]:
# Create Evaluation Metrics
def rmse(dataframe):
    interim = dataframe.loc[dataframe['y']!=0,:]
    return np.sqrt(np.mean((interim['y'] - interim['yhat'])**2))

def rmspe(dataframe):
    interim = dataframe.loc[dataframe['y']!=0,:]
    return np.sqrt(np.mean(((interim['y']-interim['yhat'])/interim['y'])**2))

In [50]:
print('rmse is '+str(rmse(results)))
print('rmspe is '+str(rmspe(results)))

rmse is 1630.4690648652977
rmspe is 0.2315375072219016


### 2nd Attempt: Prophet w/ addtl regressors

In [37]:
# Run Version with External Regressors
store_list = train.Store.unique()
results2 = pd.DataFrame()

start_time = time.time()

p = Pool(cpu_count())
seriesidx = np.arange(0,train['Store'].nunique())
predictions = list(p.imap(run_individ_forecast_w_reg, seriesidx))
p.close()
p.join()

print("--- %s seconds ---" % (time.time() - start_time))

# stack the per-store forecasts into a single frame
results2 = pd.concat(predictions)



--- 765.4674119949341 seconds ---


In [38]:
# Output second version of the model to csv
results2.to_csv('results2.csv')

In [51]:
print('rmse is '+str(rmse(results2)))
print('rmspe is '+str(rmspe(results2)))

rmse is 1601.6889262847183
rmspe is 0.21327051770936867


In [None]:
# Components

# fig1 = m.plot(forecast)
# fig2 = m.plot_components(forecast)

# Please Don't Go Further

## First Model (LSTM)

First, let's try using Seq2Seq to see whether it is a good way of forecasting. I am following this tutorial: https://github.com/JEddy92/TimeSeries_Seq2Seq/blob/master/notebooks/TS_Seq2Seq_Intro.ipynb

In [None]:
train.head(3)

In [None]:
train.columns[1]

In [None]:
# assumes `train` is in wide format (e.g. from piv_df): 'Store' first, then one column per date
data_start_date = train.columns[1]
data_end_date = train.columns[-1]
print('Data ranges from %s to %s' % (data_start_date, data_end_date))

In [None]:
def plot_random_series(train, n_series):
    
    sample = train.sample(n_series, random_state=8)
    page_labels = sample['Store'].tolist()
    series_samples = sample.loc[:,data_start_date:data_end_date]
    
    plt.figure(figsize=(10,6))
    
    for i in range(series_samples.shape[0]):
        pd.Series(series_samples.iloc[i]).astype(np.float64).plot(linewidth=1.5)
    
    plt.title('Randomly Selected Daily Store Sales')
    plt.legend(page_labels)
    
plot_random_series(train, 3)

In [None]:
from datetime import timedelta

pred_steps = 14
pred_length = timedelta(pred_steps)

first_day = pd.to_datetime(data_start_date) 
last_day = pd.to_datetime(data_end_date)

val_pred_start = last_day - pred_length + timedelta(1)
val_pred_end = last_day

train_pred_start = val_pred_start - pred_length
train_pred_end = val_pred_start - timedelta(days=1)

enc_length = train_pred_start - first_day

train_enc_start = first_day
train_enc_end = train_enc_start + enc_length - timedelta(1)

val_enc_start = train_enc_start + pred_length
val_enc_end = val_enc_start + enc_length - timedelta(1)

print('Train encoding:', train_enc_start, '-', train_enc_end)
print('Train prediction:', train_pred_start, '-', train_pred_end, '\n')
print('Val encoding:', val_enc_start, '-', val_enc_end)
print('Val prediction:', val_pred_start, '-', val_pred_end)

print('\nEncoding interval:', enc_length.days)
print('Prediction interval:', pred_length.days)

In [None]:
date_to_index = pd.Series(index=pd.Index([pd.to_datetime(c) for c in train.columns[1:]]),
                          data=[i for i in range(len(train.columns[1:]))])

series_array = train[train.columns[1:]].values

def get_time_block_series(series_array, date_to_index, start_date, end_date):
    
    inds = date_to_index[start_date:end_date]
    return series_array[:,inds]

def transform_series_encode(series_array):
    
    series_array = np.nan_to_num(series_array) # filling NaN with 0
    series_mean = series_array.mean(axis=1).reshape(-1,1) 
    series_array = series_array - series_mean
    series_array = series_array.reshape((series_array.shape[0],series_array.shape[1], 1))
    
    return series_array, series_mean

def transform_series_decode(series_array, encode_series_mean):
    
    series_array = series_array - encode_series_mean
    series_array = series_array.reshape((series_array.shape[0],series_array.shape[1], 1))
    
    return series_array

In [None]:
from keras.models import Model
from keras.layers import Input, Conv1D, Dense, Dropout, Lambda, concatenate
from keras.optimizers import Adam
from keras.layers import LSTM

# convolutional layer parameters
n_filters = 32
filter_width = 2
dilation_rates = [2**i for i in range(8)]

# define an input history series and pass it through a stack of dilated causal convolutions. 
history_seq = Input(shape=(None, 1))
x = history_seq

for dilation_rate in dilation_rates:
    x = Conv1D(filters=n_filters,
               kernel_size=filter_width, 
               padding='causal',
               dilation_rate=dilation_rate)(x)

x = Dense(128, activation='relu')(x)
x = Dropout(.5)(x)
x = Dense(1)(x)

# extract the last 14 time steps as the training target
# (named to avoid shadowing the built-in `slice`)
def slice_last_steps(x, seq_length):
    return x[:,-seq_length:,:]

pred_seq_train = Lambda(slice_last_steps, arguments={'seq_length':14})(x)

model = Model(history_seq, pred_seq_train)

model.summary()

In [None]:
first_n_samples = 1000
batch_size = 128
epochs = 10

# sample of series from train_enc_start to train_enc_end  
encoder_input_data = get_time_block_series(series_array, date_to_index, 
                                           train_enc_start, train_enc_end)[:first_n_samples]
encoder_input_data, encode_series_mean = transform_series_encode(encoder_input_data)

# sample of series from train_pred_start to train_pred_end 
decoder_target_data = get_time_block_series(series_array, date_to_index, 
                                            train_pred_start, train_pred_end)[:first_n_samples]
decoder_target_data = transform_series_decode(decoder_target_data, encode_series_mean)

# for each series:
# 1) encoder portion: subtract the series mean taken over the encoding window
# 2) decoder portion: take the 14-day prediction window and subtract that same mean

# we append a lagged history of the target series to the input data, 
# so that we can train with teacher forcing
lagged_target_history = decoder_target_data[:,:-1,:1]
encoder_input_data = np.concatenate([encoder_input_data, lagged_target_history], axis=1)

# the model input is now the encoding window followed by
# the target window without its last day
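
The centering and teacher-forcing steps described in the comments above can be sketched with plain lists for a single series. The values and the names `history`, `target`, and `model_input` are made up for illustration; the notebook itself works on 3-D NumPy arrays via `transform_series_encode` and `transform_series_decode`.

```python
# Toy series for one store: 6 days of history, 3 days to predict (made-up values).
history = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0]
target = [15.0, 13.0, 16.0]

# Centre both windows on the mean of the encoder (history) window only.
history_mean = sum(history) / len(history)
encoder_input = [v - history_mean for v in history]
decoder_target = [v - history_mean for v in target]

# Teacher forcing: append the true targets, minus the last one, to the history,
# so each of the final len(target) model outputs sees the true previous value.
model_input = encoder_input + decoder_target[:-1]

print(history_mean, len(model_input), model_input)
```

With a causal model that emits one output per input step, the last three outputs of `model_input` line up with the three entries of `decoder_target`.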


In [None]:
model.compile(Adam(), loss='mean_absolute_percentage_error')
history = model.fit(encoder_input_data, 
                    decoder_target_data,
                    batch_size=batch_size,
                    epochs=epochs,
                    validation_split=0.2)

In [None]:
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])

plt.xlabel('Epoch')
plt.ylabel('Mean Absolute Percentage Error Loss')
plt.title('Loss Over Time')
plt.legend(['Train','Valid'])

In [None]:
def predict_sequence(input_sequence):

    history_sequence = input_sequence.copy()
    pred_sequence = np.zeros((1,pred_steps,1)) # initialize output (pred_steps time steps)  
    
    for i in range(pred_steps):
        
        # record next time step prediction (last time step of model output) 
        last_step_pred = model.predict(history_sequence)[0,-1,0]
        pred_sequence[0,i,0] = last_step_pred
        
        # add the next time step prediction to the history sequence
        history_sequence = np.concatenate([history_sequence, 
                                           last_step_pred.reshape(-1,1,1)], axis=1)

    return pred_sequence

In [None]:
encoder_input_data = get_time_block_series(series_array, date_to_index, val_enc_start, val_enc_end)
encoder_input_data, encode_series_mean = transform_series_encode(encoder_input_data)

decoder_target_data = get_time_block_series(series_array, date_to_index, val_pred_start, val_pred_end)
decoder_target_data = transform_series_decode(decoder_target_data, encode_series_mean)

In [None]:
encode_series_mean[0:1]

In [None]:
def predict_and_plot(encoder_input_data, encode_series_mean, decoder_target_data, sample_ind, enc_tail_len=14):

    encode_series = encoder_input_data[sample_ind:sample_ind+1,:,:] 
    encode_series_mean = encode_series_mean[sample_ind:sample_ind+1] 
    pred_series = predict_sequence(encode_series)
    
    encode_series = encode_series.reshape(-1,1)
    encode_series += encode_series_mean
    pred_series = pred_series.reshape(-1,1)
    pred_series += encode_series_mean
    target_series = decoder_target_data[sample_ind,:,:1].reshape(-1,1) 
    target_series += encode_series_mean
    
    encode_series_tail = np.concatenate([encode_series[-enc_tail_len:],target_series[:1]])
    x_encode = encode_series_tail.shape[0]
    
    plt.figure(figsize=(10,6))   
    
    plt.plot(range(1,x_encode+1),encode_series_tail)
    plt.plot(range(x_encode,x_encode+pred_steps),target_series,color='orange')
    plt.plot(range(x_encode,x_encode+pred_steps),pred_series,color='teal',linestyle='--')
    
    plt.title('Encoder Series Tail of Length %d, Target Series, and Predictions' % enc_tail_len)
    plt.legend(['Encoding Series','Target Series','Predictions'])

In [None]:
predict_and_plot(encoder_input_data, encode_series_mean, decoder_target_data,0)

issues with install: https://forums.fast.ai/t/fastai-v0-7-install-issues-thread/24652 

## Try version with actual LSTM

https://github.com/jfpuget/Kaggle/blob/master/WebTrafficPrediction/keras_simple.ipynb