In [1]:
import torch

Let's create some dummy data

In [2]:
x = torch.arange(10, 100, 10)
print(f"Length of x: {len(x)}")
x

Length of x: 9


tensor([10, 20, 30, 40, 50, 60, 70, 80, 90])

Let this be a timeseries with 9 observations.
Often, you don't need all the observations to make a prediction.

Imagine you are predicting weather: is the weather from 4 months ago relevant for the prediction of tomorrow? Probably not that much.
So, typically, we will have to determine a relevant window of time. 

This will vary from case to case. In this case, let the optimal window be 3.

In [3]:
idx = torch.tensor([0, 1, 2])
x[idx]

tensor([10, 20, 30])

This is our first training example. Because time moves forward, we can slide the window one step ahead and extract an additional training example, also of length 3:

In [4]:
idx = torch.tensor([1, 2, 3])
x[idx]

tensor([20, 30, 40])

And we can continue this all the way to the end.

In [5]:
idx = torch.tensor([6, 7, 8])
x[idx]

tensor([70, 80, 90])

This process is easy to scale, because we can provide multiple indices at once:

In [6]:
idx = torch.tensor([
    [0, 1, 2],
    [1, 2, 3]
])
x[idx]

tensor([[10, 20, 30],
        [20, 30, 40]])

This gives us two different training examples.
Now, instead of typing out these indices by hand, we want to generate the complete set of indices for an array of a given length and a given window size.

In [8]:
n_time = 3
n_window = len(x) - n_time + 1
n_window

7

The minimal case is extracting one window of 3 from an array of length 3, which is why we add 1 in the formula.
For every extra element beyond the window length, we can extract one more window.
This explains the `len(x) - n_time` part.

This formula gives the maximum number of slices we can extract from an array `x`: with a window of 3, we can squeeze out 7 training examples.
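
As a quick sanity check of this formula, the sketch below counts the windows by brute force with plain Python slicing and compares the count with `length - n_time + 1`. The helper name `count_windows` is made up for this illustration and is not used elsewhere in the notebook.

```python
def count_windows(length: int, n_time: int) -> int:
    """Count, by brute force, how many windows of size n_time fit in a sequence."""
    data = list(range(length))
    # Keep every start position whose window still fits inside the sequence
    windows = [
        data[start:start + n_time]
        for start in range(length)
        if start + n_time <= length
    ]
    return len(windows)


# The brute-force count should agree with the formula for every length we try
for length in range(3, 12):
    assert count_windows(length, 3) == length - 3 + 1

print(count_windows(9, 3))  # 7, the same as len(x) - n_time + 1 above
```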

Now, the first index will look like this:

In [9]:
time = torch.arange(0, n_time).reshape(1, -1)
time

tensor([[0, 1, 2]])

To get each next slice, we add 1 to every element of the previous one.
Because we calculated that we can extract 7 slices, we need the offsets 0 through 6.

In [10]:
window = torch.arange(0, n_window).reshape(-1, 1)
window

tensor([[0],
        [1],
        [2],
        [3],
        [4],
        [5],
        [6]])

In [11]:
time.shape, window.shape

(torch.Size([1, 3]), torch.Size([7, 1]))

Using broadcasting, we can now simply add these two tensors.

We are adding a (1,3) and a (7,1) matrix. This might seem weird, because you would expect to need two
(7,3) matrices, but Torch expands every dimension of size 1 to match the size of the other tensor in that dimension.
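
To make the broadcasting step less magical, here is a minimal sketch that expands both operands to (7,3) by hand with `expand` and checks that the result matches the broadcast sum. The names `time_full`, `window_full`, `explicit` and `broadcast` are only used for this illustration.

```python
import torch

time = torch.arange(0, 3).reshape(1, -1)    # shape (1, 3)
window = torch.arange(0, 7).reshape(-1, 1)  # shape (7, 1)

# Expand both operands to (7, 3) by hand; expand only stretches size-1 dimensions
time_full = time.expand(7, 3)      # every row is [0, 1, 2]
window_full = window.expand(7, 3)  # row i is [i, i, i]

explicit = time_full + window_full
broadcast = time + window          # Torch does the same expansion implicitly

print(explicit.shape)
print(torch.equal(explicit, broadcast))
```

Both routes should give the same (7,3) index matrix. Broadcasting just saves us from writing out the expansion.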

In [12]:
idx = time + window
idx

tensor([[0, 1, 2],
        [1, 2, 3],
        [2, 3, 4],
        [3, 4, 5],
        [4, 5, 6],
        [5, 6, 7],
        [6, 7, 8]])

This is exactly what we need! The first index is still `[0, 1, 2]`, the second is the same plus 1, so `[1, 2, 3]`, and so on.

Let's try this out on our timeseries:

In [13]:
x[idx]

tensor([[10, 20, 30],
        [20, 30, 40],
        [30, 40, 50],
        [40, 50, 60],
        [50, 60, 70],
        [60, 70, 80],
        [70, 80, 90]])

So, that worked! We started with a long timeseries of 9 steps and ended up with seven examples to feed our model, each with a length of 3.

We can wrap this all into a function:

In [14]:
Tensor = torch.Tensor
def window(x: Tensor, n_time: int) -> Tensor:
    """
    Generates an index that can be used to window a timeseries.
    E.g. the single series [0, 1, 2, 3, 4, 5] can be windowed into 4 timeseries with
    length 3 like this:

    [0, 1, 2]
    [1, 2, 3]
    [2, 3, 4]
    [3, 4, 5]

    We now can feed 4 different timeseries into the model, instead of 1, all
    with the same length.
    """
    n_window = len(x) - n_time + 1
    time = torch.arange(0, n_time).reshape(1, -1)
    # named offsets so the local variable does not shadow the function name
    offsets = torch.arange(0, n_window).reshape(-1, 1)
    idx = time + offsets
    return idx

Now we can easily change the window size:

In [15]:
idx = window(x, 5)
x[idx]

tensor([[10, 20, 30, 40, 50],
        [20, 30, 40, 50, 60],
        [30, 40, 50, 60, 70],
        [40, 50, 60, 70, 80],
        [50, 60, 70, 80, 90]])

In [16]:
idx = window(x, 6)
x[idx]

tensor([[10, 20, 30, 40, 50, 60],
        [20, 30, 40, 50, 60, 70],
        [30, 40, 50, 60, 70, 80],
        [40, 50, 60, 70, 80, 90]])
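
As a cross-check, PyTorch also ships `Tensor.unfold(dimension, size, step)`, which slides a window along one dimension. The sketch below rebuilds the index inline (so it runs on its own) and compares both routes on the one-dimensional series. It is a comparison under these assumptions, not a replacement for the index approach.

```python
import torch

x = torch.arange(10, 100, 10)
n_time = 5

# Index-based windowing, same construction as in the function above
n_window = len(x) - n_time + 1
idx = torch.arange(n_time).reshape(1, -1) + torch.arange(n_window).reshape(-1, 1)
by_index = x[idx]

# Built-in alternative: windows of size n_time along dimension 0, step 1
by_unfold = x.unfold(0, n_time, 1)

print(by_index.shape, by_unfold.shape)
print(torch.equal(by_index, by_unfold))
```

For tensors with more than one dimension, `unfold` appends the window dimension at the end, so the layout differs from `x[idx]`. The index approach keeps the timesteps in the second dimension, which is the layout used in the rest of this notebook.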

Will this scale to more dimensions?
Let's imagine we have 5 timesteps, with 3 features observed at every timestep.
We can organize that into a `(5x3)` matrix instead of the one-dimensional raw data we started this notebook with.

In [17]:
x = torch.randint(0, 10, (5, 3))
x

tensor([[7, 8, 9],
        [9, 8, 9],
        [0, 7, 9],
        [9, 2, 1],
        [7, 7, 5]])

Now let's window it in chunks of 4 timesteps:

In [18]:
idx = window(x, 4)
x[idx]

tensor([[[7, 8, 9],
         [9, 8, 9],
         [0, 7, 9],
         [9, 2, 1]],

        [[9, 8, 9],
         [0, 7, 9],
         [9, 2, 1],
         [7, 7, 5]]])

We can use the windowed index to generate two training examples, each covering four timesteps.

We can also apply this to batched data. Let us have a batch (B) of 32 examples, each with 6 timesteps (T) and 2 features (F). This is organized in a `(B, T, F)` tensor.

We can now apply the window to the 0th example and squeeze 3 training examples of 4 timesteps each out of it:

In [21]:
batch = torch.randint(0, 10, (32, 6, 2))
x = batch[0]
idx = window(x, 4)
x_windowed = x[idx]

In [22]:
x.shape, x_windowed.shape

(torch.Size([6, 2]), torch.Size([3, 4, 2]))
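
The same index can also be applied to every example in the batch at once by indexing the time dimension. The following is a minimal sketch of one way to do that. The names `windowed` and `flat` are introduced here for illustration, and the index is rebuilt inline so the block runs on its own.

```python
import torch

batch = torch.randint(0, 10, (32, 6, 2))  # (B, T, F)
n_time = 4

# Build the window index over the time dimension (dimension 1 of the batch)
n_window = batch.shape[1] - n_time + 1
idx = torch.arange(n_time).reshape(1, -1) + torch.arange(n_window).reshape(-1, 1)

# Index the time dimension for all examples at once
windowed = batch[:, idx]  # (B, n_window, n_time, F)

# Merge the batch and window dimensions so every window is its own example
flat = windowed.reshape(-1, n_time, batch.shape[2])  # (B * n_window, n_time, F)

print(windowed.shape, flat.shape)
```

Here `windowed[0]` holds the same three windows as `x_windowed` would for the 0th example. Whether to flatten into `flat` depends on how the model expects its input, so treat that last step as optional.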