# PyTorch Price Predictor

Notes:
- PyTorch LSTMs expect batched input as 3D tensors (with the default `batch_first=False`).
 - Axis 0 - the time steps of the sequence itself
 - Axis 1 - instances in the mini-batch
 - Axis 2 - elements (features) of the input at each time step
 
To be honest, I don't fully understand how I'm supposed to shape my tensors based on this explanation. I'll need to find another article that explains it.
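For reference, here is a minimal sketch of what those three axes look like in practice. The sizes (24 steps, a batch of 4, 1 feature, hidden size 8) are assumptions chosen for illustration only, not values taken from this dataset.

```python
import torch
import torch.nn as nn

# Illustrative sizes only (assumptions, not from the price data).
seq_len, batch_size, n_features = 24, 4, 1

# Default nn.LSTM layout (batch_first=False): (seq_len, batch, input_size)
x = torch.randn(seq_len, batch_size, n_features)

lstm = nn.LSTM(input_size=n_features, hidden_size=8)  # hidden_size picked arbitrarily
out, (h_n, c_n) = lstm(x)

print(out.shape)  # torch.Size([24, 4, 8]) -> one hidden vector per step, per batch item
print(h_n.shape)  # torch.Size([1, 4, 8])  -> final hidden state for each batch item
```

So for a single price series, axis 2 would have size 1 (just the close price), axis 0 would be the window length, and axis 1 would be however many windows go into one batch.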

Let's break this down. 

I want my dataset split into chunks of, say, 24 hourly prices for now, with the next hour's price as the prediction target. So let's get the dataset into that form first.

In [1]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

from pathlib import Path
import pandas as pd
import numpy as np

torch.manual_seed(1)

DOWNLOAD_DIR = Path('../download')

In [2]:
price = pd.read_csv(DOWNLOAD_DIR / 'price.csv')

In [3]:
price.shape

(92974, 5)

In [4]:
price[price.c.isna()].tail(50)

Unnamed: 0,timestamp,c,h,l,o
67799,2018-04-11T22:00:00Z,,,,
67800,2018-04-11T23:00:00Z,,,,
69304,2018-06-13T15:00:00Z,,,,
71175,2018-08-30T14:00:00Z,,,,
73477,2018-12-04T12:00:00Z,,,,
73478,2018-12-04T13:00:00Z,,,,
73479,2018-12-04T14:00:00Z,,,,
73480,2018-12-04T15:00:00Z,,,,
73481,2018-12-04T16:00:00Z,,,,
73482,2018-12-04T17:00:00Z,,,,


In [5]:
price_nonans = price.iloc[76307:]
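The cutoff index above is hard-coded. As a rough sketch of how a cutoff could be derived instead, assuming the goal is simply to keep everything after the last NaN in a column, something like the following might work. The `toy` series is made-up data standing in for `price.c`, and this will not necessarily reproduce the cutoff used above, which may have been chosen for other reasons.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the `c` (close) column; the values are made up.
toy = pd.Series([1.0, np.nan, 2.0, 3.0, np.nan, 4.0, 5.0, 6.0])

# Integer positions of every NaN in the series.
nan_positions = np.flatnonzero(toy.isna().to_numpy())

# Start one past the last NaN, or at 0 if the series has no NaNs.
start = int(nan_positions[-1]) + 1 if len(nan_positions) else 0

clean_tail = toy.iloc[start:]
print(start)                # 5
print(clean_tail.tolist())  # [4.0, 5.0, 6.0]
```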

In [6]:
# Almost 2 years of data (~100 weeks)
price_nonans.shape[0] / 24 / 365

1.9026255707762558

In [7]:
df = price.iloc[-105:].c

In [8]:
df.shape

(105,)

In [9]:
len(df) // 10

10

In [10]:
df.iloc[10]

55572.59765529383

In [11]:
# This works!
def make_seqs_and_targets(series, seq_length):
    """
    Given a pandas.Series and a sequence length, return a tuple (seqs, targets)
    of two equal-length lists. Each seq is a window of seq_length consecutive
    values, and each target is the element that immediately follows that window.
    """
    seqs = []
    targets = []
    i = 0
    while i + seq_length < len(series):
        seq = series.iloc[i : i + seq_length].values
        target = np.array(series.iloc[i + seq_length])
        seqs.append(seq)
        targets.append(target)
        i += 1
    return seqs, targets

In [12]:
s, t = make_seqs_and_targets(price_nonans.c, 168)

In [13]:
len(s)

16499

In [15]:
len(t)

16499
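With sequences and targets lined up one-to-one, the next step is presumably getting them into the 3D shape an LSTM wants. A minimal sketch, assuming a single feature per time step; the toy `seqs` and `targets` below are made-up stand-ins with the same structure as the two lists returned above, and the names `X`, `y`, and `X_seq_first` are just illustrative.

```python
import numpy as np
import torch

# Toy stand-ins for the two lists: 5 windows of length 3, each with a 0-d target.
seqs = [np.arange(i, i + 3, dtype=np.float64) for i in range(5)]
targets = [np.array(float(i + 3)) for i in range(5)]

# Stack into (batch, seq_len), then add a trailing feature axis -> (batch, seq_len, 1).
X = torch.tensor(np.stack(seqs), dtype=torch.float32).unsqueeze(-1)
y = torch.tensor(np.stack(targets), dtype=torch.float32)

# nn.LSTM's default layout is (seq_len, batch, input_size), so swap the first two axes.
X_seq_first = X.permute(1, 0, 2)

print(X.shape)            # torch.Size([5, 3, 1])
print(X_seq_first.shape)  # torch.Size([3, 5, 1])
print(y.shape)            # torch.Size([5])
```

The alternative is to leave `X` as `(batch, seq_len, 1)` and construct the LSTM with `batch_first=True`; either should work as long as the layout and the flag agree.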

In [16]:
list(zip(s[:5], t[:5]))

[(array([4129.6333425 , 4131.42946255, 4141.06476421, 4145.63077138,
         4159.55295648, 4155.00144503, 4161.61229033, 4154.55174481,
         4152.54323958, 4150.66313728, 4150.8128514 , 4154.40034138,
         4160.41022024, 4157.46569937, 4159.75713817, 4173.99505677,
         4192.8113557 , 4189.92539449, 4490.02393638, 4706.37669363,
         4688.01536359, 4657.22378832, 4673.81210589, 4747.73296815,
         4749.45380022, 4760.50620157, 4778.4061988 , 4778.39600084,
         4743.85111957, 4732.15571578, 4723.92672658, 4719.55270228,
         4727.58642079, 4747.00887215, 4783.10781374, 4784.71542909,
         4795.09481063, 4886.74969679, 4937.42033696, 4947.69117344,
         5006.68737329, 5037.20849485, 4888.93154987, 4927.17655526,
         4950.51696874, 4952.88614535, 4929.27843683, 4965.28678152,
         4962.66564373, 4970.00570972, 4994.30297079, 5027.3984022 ,
         5022.24714142, 5014.13154785, 5099.72593403, 5128.95524971,
         5153.63170469, 5128.94703

In [None]:
class BasicLSTM(nn.Module):
    
    def __init__(self):
        super().__init__()