<a href="https://colab.research.google.com/github/jeffheaton/app_deep_learning/blob/main/assignments/assignment_yourname_class10.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# T81-558: Applications of Deep Neural Networks
* Instructor: [Jeff Heaton](https://sites.wustl.edu/jeffheaton/), McKelvey School of Engineering, [Washington University in St. Louis](https://engineering.wustl.edu/index.html)
* For more information visit the [class website](https://sites.wustl.edu/jeffheaton/t81-558/).

**Module 10 Assignment: Time Series Neural Network**

**Student Name: Your Name**

# Assignment Instructions

For this assignment, you will use an LSTM to predict a time series contained in the data file **[series-31-num.csv](https://data.heatonresearch.com/data/t81-558/datasets/series-31-num.csv)**.  The code that you will use to complete this will be similar to the sunspots example from the course module.  This data set contains two columns: *time* and *value*.  Create an LSTM network and train it with a sequence size of 5 and a prediction window of 1.  If you use a different sequence size, you will not have the correct number of submission rows.  The data set is relatively simple, and you should easily be able to get an RMSE below 1.0.  FYI, I generated this dataset by fitting a cubic spline to a series of random points.
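To make "sequence size of 5, prediction window of 1" and the RMSE target concrete, here is a minimal sketch on a toy series (not the assignment data). The helper names `make_windows` and `rmse` are hypothetical and exist only for this illustration. The "persistence" baseline simply repeats the last value of each window, which gives you a rough number to compare a trained LSTM against.

```python
import numpy as np

def make_windows(values, seq_size):
    """Slide a window of seq_size values over the series; the target is the next value."""
    x, y = [], []
    for i in range(len(values) - seq_size):
        x.append(values[i:i + seq_size])   # seq_size consecutive inputs
        y.append(values[i + seq_size])     # prediction window of 1: the value right after
    return np.array(x), np.array(y)

def rmse(pred, actual):
    """Root mean squared error between two equal-length arrays."""
    diff = np.asarray(pred, dtype=float) - np.asarray(actual, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

# Toy series for illustration only: 0.0, 1.0, ..., 9.0
toy = np.arange(10, dtype=float)
x_toy, y_toy = make_windows(toy, seq_size=5)

# Persistence baseline: predict that the next value equals the last value in the window.
baseline = x_toy[:, -1]

print("Windows:", x_toy.shape, "Targets:", y_toy.shape)
print("Baseline RMSE:", rmse(baseline, y_toy))
```

A series of length N yields N - 5 windows, which is why changing the sequence size changes the number of submission rows.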

This file contains a time series data set, so do not randomize the order of the rows!  For your training data, use all *time* values less than 3000, and for the test, use the remaining values, those greater than or equal to 3000.  For the submit file, please send me the results of your test evaluation.  You should have two columns: *time* and *value*.  The column *time* should be the time at the beginning of each predicted sequence.  The *value* should be the next value that your neural network predicted for each of the sequences.

Your submission file will look similar to:

|time|value|
|-|-|
|3000|37.022846|
|3001|37.030582|
|3002|37.03816|
|3003|37.045563|
|3004|37.0528|
|...|...|
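One way to assemble a data frame in this layout is sketched below. It assumes that "the time at the beginning of each predicted sequence" means the first time step of each 5-value input window, which is consistent with the sample table starting at 3000. The frame `demo_test` and the zero-valued `demo_pred` are synthetic placeholders, not real data or real model output.

```python
import numpy as np
import pandas as pd

SEQ = 5  # sequence size required by the assignment

# Synthetic stand-in for the test portion of the data (hypothetical values).
demo_test = pd.DataFrame({
    "time": np.arange(3000, 3010),
    "value": np.linspace(37.0, 37.09, 10),
})

# One prediction per window; a trained model would supply these values.
n_windows = len(demo_test) - SEQ
demo_pred = np.zeros(n_windows)  # placeholder predictions

# Pair each prediction with the first time step of the window that produced it.
demo_submit = pd.DataFrame({
    "time": demo_test["time"].to_numpy()[:n_windows],
    "value": demo_pred,
})
print(demo_submit)
```

If your model returns predictions with an extra dimension, for example shape `(n, 1)`, flatten them before building the frame so that *value* is a single column.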

# Google CoLab Instructions

If you are using Google CoLab, it will be necessary to mount your GDrive so that you can send your notebook during the submit process. Running the following code will map your GDrive to ```/content/drive```.

In [None]:
try:
    from google.colab import drive
    drive.mount('/content/drive', force_remount=True)
    COLAB = True
    print("Note: using Google CoLab")
except:
    print("Note: not using Google CoLab")
    COLAB = False

# Assignment Submit Function

You will submit the ten programming assignments electronically.  The following **submit** function can be used to do this.  My server will perform a basic check of each assignment and let you know if it sees any underlying problems. 

**It is unlikely that you will need to modify this function.**

In [None]:
import base64
import os
import numpy as np
import pandas as pd
import requests
import PIL
import PIL.Image
import io

# This function submits an assignment.  You can submit an assignment as many times as you like;
# only the final submission counts.  The parameters are as follows:
# data - List of pandas dataframes or images.
# key - Your student key that was emailed to you.
# no - The assignment class number, should be 1 through 10.
# source_file - The full path to your Python or IPYNB file.  This must have "_class1" as part of its name.
#               The number must match your assignment number.  For example "_class2" for class assignment #2.
def submit(data,key,no,source_file=None):
    if source_file is None and '__file__' not in globals(): raise Exception('Must specify a filename when using a Jupyter notebook.')
    if source_file is None: source_file = __file__
    suffix = '_class{}'.format(no)
    if suffix not in source_file: raise Exception('{} must be part of the filename.'.format(suffix))
    with open(source_file, "rb") as image_file:
        encoded_python = base64.b64encode(image_file.read()).decode('ascii')
    ext = os.path.splitext(source_file)[-1].lower()
    if ext not in ['.ipynb','.py']: raise Exception("Source file is {}; it must be .py or .ipynb".format(ext))
    payload = []
    for item in data:
        if type(item) is PIL.Image.Image:
            buffered = io.BytesIO()  # io is imported above; a bare BytesIO would raise NameError
            item.save(buffered, format="PNG")
            payload.append({'PNG':base64.b64encode(buffered.getvalue()).decode('ascii')})
        elif type(item) is pd.core.frame.DataFrame:
            payload.append({'CSV':base64.b64encode(item.to_csv(index=False).encode('ascii')).decode("ascii")})
    r= requests.post("https://api.heatonresearch.com/assignment-submit",
        headers={'x-api-key':key}, json={ 'payload': payload,'assignment': no, 'ext':ext, 'py':encoded_python})
    if r.status_code==200:
        print("Success: {}".format(r.text))
    else: print("Failure: {}".format(r.text))

# Assignment #10 Sample Code

The following code provides a starting point for this assignment.

In [None]:
import numpy as np
def to_sequences(seq_size, obs):
    # Build overlapping windows of seq_size values (x) and the value that follows each window (y).
    x = []
    y = []

    for i in range(len(obs)-seq_size):
        window = obs[i:(i+seq_size)]
        after_window = obs[i+seq_size]
        # Wrap each value in its own list so every time step has exactly one feature.
        window = [[v] for v in window]
        x.append(window)
        y.append(after_window)

    return np.array(x),np.array(y)
    

# This is your student key that I emailed to you at the beginning of the semester.
key = "H3B554uPhc3f8kirGGBYA7cYuDOamhXM87OY6QH1"  # This is an example key and will not work.


# You must also identify your source file.  (modify for your local setup)
#file='/resources/t81_558_deep_learning/assignment_yourname_class10.ipynb'  # IBM Data Science Workbench
#file='C:\\users\\jeff\\projects\\t81_558_deep_learning\\assignments\\assignment_yourname_class10.ipynb'  # Windows
file='/Users/jeff/projects/t81_558_deep_learning/assignments/assignment_yourname_class10.ipynb'  # Mac/Linux

# Read from time series file
df = pd.read_csv("https://data.heatonresearch.com/data/t81-558/datasets/series-31-num.csv")  

print("Starting file:")
print(df[0:10])

print("Ending file:")
print(df[-10:])

df_train = df[df['time']<3000]
df_test = df[df['time']>=3000]

spots_train = df_train['value'].tolist()
spots_test = df_test['value'].tolist()

print("Training set has {} observations.".format(len(spots_train)))
print("Test set has {} observations.".format(len(spots_test)))

SEQUENCE_SIZE = 5
x_train,y_train = to_sequences(SEQUENCE_SIZE,spots_train)
x_test,y_test = to_sequences(SEQUENCE_SIZE,spots_test)

print("Shape of training set: {}".format(x_train.shape))
print("Shape of test set: {}".format(x_test.shape))


In [None]:
# continue here


submit(source_file=file,data=[submit_df],key=key,no=10)