## Preparing Features and Labels:

Now that we understand the basics of __time-series__ analysis, let's see how to create __features__ and __labels__ so that we can apply machine learning to time-series data. We take a fixed-size __window__ of consecutive data points as the __features__, and the value that immediately follows the window becomes the __label__ for those features.
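
Before turning to TensorFlow, the idea can be sketched in plain Python. This is only an illustration: the helper name `make_windows` and its parameters are made up for this example and are not part of any library.

```python
def make_windows(series, window_size):
    """Return (features, label) pairs built with a sliding window."""
    pairs = []
    # Stop early enough that every window has a following value to use as its label.
    for start in range(len(series) - window_size):
        features = series[start:start + window_size]
        label = series[start + window_size]
        pairs.append((features, label))
    return pairs


series = list(range(10))
pairs = make_windows(series, window_size=4)

for features, label in pairs:
    print(features, label)
```

With ten values and a window of four features, this yields six pairs, the same pairs the `tf.data` pipeline in the following sections produces.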

In [1]:
import tensorflow as tf 

## Creating Dataset:

In [32]:
dataset = tf.data.Dataset.range(10)

for val in dataset:
    print(val.numpy())  # .numpy() converts the tensor to a NumPy int64 scalar

0
1
2
3
4
5
6
7
8
9


## Creating Window:

Let's now slide a __window__ of five values over the dataset, moving one step at a time. Each window will later supply the features and their __label__.

In [30]:
dataset = dataset.window(5, shift=1)

for window in dataset:
    for val in window:
        print(val.numpy(), end=' ')
        
    print()

0 1 2 3 4 
1 2 3 4 5 
2 3 4 5 6 
3 4 5 6 7 
4 5 6 7 8 
5 6 7 8 9 
6 7 8 9 
7 8 9 
8 9 
9 
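
The shrinking rows at the end of the output can be checked directly. The following is a small sketch, assuming eager execution (the TensorFlow 2 default); the variable name `window_lengths` is introduced here only for illustration.

```python
import tensorflow as tf

dataset = tf.data.Dataset.range(10).window(5, shift=1)

# Each element of a windowed dataset is itself a small Dataset,
# so we materialise it as a list to measure its length.
window_lengths = [len(list(window)) for window in dataset]
print(window_lengths)
```

The last four windows contain fewer than five values because the source runs out of data, which is why a model expecting fixed-size inputs cannot use them as they are.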


## Regular Sized Window:

Let's now edit our code slightly so that every iteration yields a __Regular Sized__ window.

In [33]:
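dataset = tf.data.Dataset.range(10)  # start again from the raw values, not the already-windowed dataset
dataset = dataset.window(5, shift=1, drop_remainder=True)  # drop_remainder=True discards windows with fewer than 5 values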

for window in dataset:
    for val in window:
        print(val.numpy(), end=' ')
        
    print()

0 1 2 3 4 
1 2 3 4 5 
2 3 4 5 6 
3 4 5 6 7 
4 5 6 7 8 
5 6 7 8 9 
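
As a quick sanity check, the number of full windows for `shift=1` should be `n - window_size + 1`. The sketch below counts them; the names `n`, `window_size`, and `num_windows` are chosen here for illustration.

```python
import tensorflow as tf

n, window_size = 10, 5
dataset = tf.data.Dataset.range(n)
dataset = dataset.window(window_size, shift=1, drop_remainder=True)

# Count the windows by iterating once over the outer dataset.
num_windows = sum(1 for _ in dataset)
print(num_windows, n - window_size + 1)
```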


## Window to Array:

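Now let's convert each window into a single tensor, which we can then view as a __NumPy array__. `window()` yields nested datasets, so we use `flat_map` with `batch(5)` to flatten each window into one tensor of five values.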

In [34]:
dataset = tf.data.Dataset.range(10)

dataset = dataset.window(5, shift=1, drop_remainder=True)
dataset = dataset.flat_map(lambda window: window.batch(5))

for window in dataset:
    print(window.numpy())

[0 1 2 3 4]
[1 2 3 4 5]
[2 3 4 5 6]
[3 4 5 6 7]
[4 5 6 7 8]
[5 6 7 8 9]
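
To see what `flat_map` changes, it can help to compare the element types before and after flattening. This is a minimal sketch; the names `nested`, `flat`, and `windows` are introduced here only for illustration.

```python
import tensorflow as tf

nested = tf.data.Dataset.range(10).window(5, shift=1, drop_remainder=True)
flat = nested.flat_map(lambda window: window.batch(5))

# Before flattening, each element is a nested Dataset; afterwards it is a tensor.
print(isinstance(nested.element_spec, tf.data.DatasetSpec))
print(isinstance(flat.element_spec, tf.TensorSpec))

# Collect the flattened windows as plain Python lists.
windows = [w.numpy().tolist() for w in flat]
print(windows[0], windows[-1])
```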


## Splitting the Data (Features & Labels):

Let's now take all the values in each window except the last one as the __Features__ and use the last one as the __Label__.

In [35]:
dataset = tf.data.Dataset.range(10)

dataset = dataset.window(5, shift=1, drop_remainder=True)
dataset = dataset.flat_map(lambda window: window.batch(5))
dataset = dataset.map(lambda window: (window[:-1], window[-1]))

for x,y in dataset:
    print(x.numpy(), y.numpy())

[0 1 2 3] 4
[1 2 3 4] 5
[2 3 4 5] 6
[3 4 5 6] 7
[4 5 6 7] 8
[5 6 7 8] 9
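
Both `window[-1]` and `window[-1:]` appear in this walkthrough as ways to pick the label, and they differ in shape. The sketch below illustrates the difference on a single hand-built window; the variable names are chosen for this example.

```python
import tensorflow as tf

window = tf.constant([0, 1, 2, 3, 4], dtype=tf.int64)

features = window[:-1]       # first four values
scalar_label = window[-1]    # indexing drops the dimension: shape ()
vector_label = window[-1:]   # slicing keeps the dimension: shape (1,)

print(features.numpy(), scalar_label.numpy(), vector_label.numpy())
print(scalar_label.shape, vector_label.shape)
```

Which form is preferable depends on the shape the model's output layer produces; the slice form gives each label an explicit dimension of size one.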


## Shuffle the Dataset:

In [40]:
dataset = tf.data.Dataset.range(10)
dataset = dataset.window(5, shift=1, drop_remainder=True)
dataset = dataset.flat_map(lambda window: window.batch(5))
dataset = dataset.map(lambda window: (window[:-1], window[-1:]))
dataset = dataset.shuffle(buffer_size=10)  # a buffer at least as large as the dataset (6 windows) gives a full shuffle

for x,y in dataset:
    print(x.numpy(), y.numpy())

[0 1 2 3] [4]
[2 3 4 5] [6]
[5 6 7 8] [9]
[3 4 5 6] [7]
[1 2 3 4] [5]
[4 5 6 7] [8]
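
Shuffling reorders the (features, label) pairs but should never separate a label from its window. The following sketch checks this; the `seed` argument and the names `pairs` and `aligned` are additions made here for illustration.

```python
import tensorflow as tf

dataset = tf.data.Dataset.range(10)
dataset = dataset.window(5, shift=1, drop_remainder=True)
dataset = dataset.flat_map(lambda window: window.batch(5))
dataset = dataset.map(lambda window: (window[:-1], window[-1:]))
shuffled = dataset.shuffle(buffer_size=10, seed=42)

pairs = [(x.numpy().tolist(), y.numpy().tolist()) for x, y in shuffled]

# In this toy series each label is simply the last feature plus one,
# so we can verify that every pair stayed intact after shuffling.
aligned = all(y[0] == x[-1] + 1 for x, y in pairs)
print(len(pairs), aligned)
```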


## Batching the Data:

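Finally, let's divide our dataset into batches of two windows each. `prefetch(1)` lets the pipeline prepare the next batch while the current one is being consumed.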

In [39]:
dataset = tf.data.Dataset.range(10)
dataset = dataset.window(5, shift=1, drop_remainder=True)
dataset = dataset.flat_map(lambda window: window.batch(5))
dataset = dataset.map(lambda window: (window[:-1], window[-1:]))
dataset = dataset.shuffle(buffer_size=10)
dataset = dataset.batch(2).prefetch(1)

for x,y in dataset:
    print('x: ', x.numpy())
    print('y: ', y.numpy())

x:  [[3 4 5 6]
 [2 3 4 5]]
y:  [[7]
 [6]]
x:  [[0 1 2 3]
 [1 2 3 4]]
y:  [[4]
 [5]]
x:  [[5 6 7 8]
 [4 5 6 7]]
y:  [[9]
 [8]]
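
The steps above can be collected into one reusable function. This is a sketch under stated assumptions rather than a definitive implementation: the function name `windowed_dataset` and its parameters are hypothetical, and it reads from a NumPy array via `tf.data.Dataset.from_tensor_slices` instead of `Dataset.range`.

```python
import numpy as np
import tensorflow as tf


def windowed_dataset(series, window_size, batch_size, shuffle_buffer):
    """Build shuffled, batched (features, label) pairs from a 1-D series."""
    dataset = tf.data.Dataset.from_tensor_slices(series)
    # One extra value per window so the last element can serve as the label.
    dataset = dataset.window(window_size + 1, shift=1, drop_remainder=True)
    dataset = dataset.flat_map(lambda window: window.batch(window_size + 1))
    dataset = dataset.map(lambda window: (window[:-1], window[-1:]))
    dataset = dataset.shuffle(shuffle_buffer)
    return dataset.batch(batch_size).prefetch(1)


series = np.arange(10, dtype=np.float32)
batches = list(windowed_dataset(series, window_size=4, batch_size=2, shuffle_buffer=10))

for x, y in batches:
    print(x.numpy().shape, y.numpy().shape)
```

Here `window_size` counts only the features, so the function windows over `window_size + 1` values, which matches the window of five used throughout this section.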
