# Forecasting Introduction

Before we dive in and start building forecasting models, we need to flesh out some details.

In [1]:
import numpy as np
import pandas as pd

import matplotlib.pyplot as plt

## Time series to tabular format

We will start in general terms and then work through a specific example.

We start with a time series: a sequence of time steps `t`, each with an associated observation.

In [2]:
def generateObsTable(L):
    # Build a placeholder time series with time steps t = 1, ..., L-1
    # (range(1, L) excludes L itself)
    t_values = list(range(1, L))
    obs_values = ["Obs" + str(i) for i in t_values]
    df = pd.DataFrame({
        't': t_values,
        'Obs': obs_values
    })
    return df

# 14 observations: Obs1, ..., Obs14
df_ts = generateObsTable(15)
df_ts

index,t,Obs
0,1,Obs1
1,2,Obs2
2,3,Obs3
3,4,Obs4
4,5,Obs5
5,6,Obs6
6,7,Obs7
7,8,Obs8
8,9,Obs9
9,10,Obs10
10,11,Obs11
11,12,Obs12
12,13,Obs13
13,14,Obs14


Supervised learning algorithms expect *tabular* data: one row per example, with input columns and a target column. We can convert our time series by taking, for example, a window of two consecutive observations as the inputs and the next observation as the target:
* Independent variables Obs1 and Obs2, dependent variable Obs3.
* Independent variables Obs2 and Obs3, dependent variable Obs4.
* And so on until we run out of data.

In [3]:
def transformTable(df, N):
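    # Require N <= len(df) - 2, which leaves at least two rows in the result
    if N >= len(df) - 1:
        raise ValueError("N is too large for the provided DataFrame.")
    
    # Column x{i+1} is the series offset by i steps; y is offset by N steps.
    # Every column is trimmed to the same length, len(df) - N.
    data = {}
    for i in range(N):
        data[f'x{i+1}'] = df['Obs'].iloc[i:-N+i].values
    data['y'] = df['Obs'].iloc[N:].values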
    
    df_new = pd.DataFrame(data)
    return df_new

In [4]:
df_tab = transformTable(df_ts,2)
df_tab

index,x1,x2,y
0,Obs1,Obs2,Obs3
1,Obs2,Obs3,Obs4
2,Obs3,Obs4,Obs5
3,Obs4,Obs5,Obs6
4,Obs5,Obs6,Obs7
5,Obs6,Obs7,Obs8
6,Obs7,Obs8,Obs9
7,Obs8,Obs9,Obs10
8,Obs9,Obs10,Obs11
9,Obs10,Obs11,Obs12
10,Obs11,Obs12,Obs13
11,Obs12,Obs13,Obs14
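
On numeric data, the same windowing can also be built with `Series.shift`. The sketch below is one possible alternative to `transformTable`, not a replacement for it; the helper name `make_lag_table` and the toy series are assumptions made purely for illustration.

```python
import pandas as pd

def make_lag_table(series, n_lags):
    """Build a supervised-learning table from a 1-D series (hypothetical helper).

    Columns x1..xN hold the N observations preceding each target y,
    oldest first, matching the layout of the table above.
    """
    s = pd.Series(series).reset_index(drop=True)
    data = {}
    for i in range(n_lags):
        # x1 is the oldest lag (shift by n_lags), xN the most recent (shift by 1)
        data[f"x{i + 1}"] = s.shift(n_lags - i)
    data["y"] = s
    # The first n_lags rows have no complete history, so drop them
    return pd.DataFrame(data).dropna().reset_index(drop=True)

# Toy numeric series (illustrative values, not from a real dataset)
toy = [10.0, 12.0, 15.0, 19.0, 24.0, 30.0]
lag_table = make_lag_table(toy, n_lags=2)
print(lag_table)
```

With six observations and two lags, the resulting table has four rows, the first being `(10.0, 12.0, 15.0)`.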


How do we make forecasts from this?
* Take our last two observations, Obs13 and Obs14, and feed them to the regressor trained on the table above to produce Pred15.
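
As a concrete sketch of this step, the code below fits a linear regressor to a small numeric lag table and predicts the next value. The use of ordinary least squares via `np.linalg.lstsq`, the toy series, and all variable names are assumptions made for illustration; any supervised regressor could take its place.

```python
import numpy as np

# Toy series where each value is the sum of the previous two (illustrative only)
series = np.array([1.0, 1.0, 2.0, 3.0, 5.0, 8.0, 13.0, 21.0])
n_lags = 2

# Lag table: each row of X holds n_lags consecutive values, y is the value after them
X = np.column_stack([series[i:len(series) - n_lags + i] for i in range(n_lags)])
y = series[n_lags:]

# Add an intercept column and fit by least squares
X_design = np.column_stack([X, np.ones(len(X))])
coef, *_ = np.linalg.lstsq(X_design, y, rcond=None)

# One-step forecast: feed the last n_lags observations to the fitted model
last_window = np.append(series[-n_lags:], 1.0)  # trailing 1.0 is the intercept term
next_pred = float(last_window @ coef)
print(next_pred)
```

Because the toy series satisfies y = x1 + x2 exactly, the fitted coefficients are close to (1, 1, 0) and the forecast is 34 up to floating-point error. Real data will not fit this cleanly.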

In [5]:
def addPredictedRow(df):
    # Extract the last row
    last_row = df.iloc[-1]
    n_features = len(last_row) - 1  # every column except 'y'
    
    # Slide the window forward by one step: x1 <- x2, ..., xN <- y
    new_data = {}
    for i in range(n_features):
        source = f'x{i+2}' if i + 2 <= n_features else 'y'
        new_data[f'x{i+1}'] = last_row[source]
    
    # The new target is a placeholder for the next prediction:
    # take the step number from the last 'y' value and increment it
    last_num = int(last_row['y'].replace('Obs', '').replace('Pred', ''))
    new_data['y'] = f'Pred{last_num + 1}'
    
    new_row_df = pd.DataFrame([new_data])
    df = pd.concat([df, new_row_df], ignore_index=True)
    return df

In [6]:
df_tab = addPredictedRow(df_tab)
df_tab

index,x1,x2,y
0,Obs1,Obs2,Obs3
1,Obs2,Obs3,Obs4
2,Obs3,Obs4,Obs5
3,Obs4,Obs5,Obs6
4,Obs5,Obs6,Obs7
5,Obs6,Obs7,Obs8
6,Obs7,Obs8,Obs9
7,Obs8,Obs9,Obs10
8,Obs9,Obs10,Obs11
9,Obs10,Obs11,Obs12
10,Obs11,Obs12,Obs13
11,Obs12,Obs13,Obs14
12,Obs13,Obs14,Pred15


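Let's apply this one more time. The new row now uses Pred15 as an input, so the forecast for step 16 is built on an earlier forecast rather than on an observation.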

In [7]:
df_tab = addPredictedRow(df_tab)
df_tab

index,x1,x2,y
0,Obs1,Obs2,Obs3
1,Obs2,Obs3,Obs4
2,Obs3,Obs4,Obs5
3,Obs4,Obs5,Obs6
4,Obs5,Obs6,Obs7
5,Obs6,Obs7,Obs8
6,Obs7,Obs8,Obs9
7,Obs8,Obs9,Obs10
8,Obs9,Obs10,Obs11
9,Obs10,Obs11,Obs12
10,Obs11,Obs12,Obs13
11,Obs12,Obs13,Obs14
12,Obs13,Obs14,Pred15
13,Obs14,Pred15,Pred16
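
Repeating this step is often called recursive (or iterated) multi-step forecasting: each prediction is appended to the history and becomes an input for the next step. Here is a minimal numeric sketch, assuming a one-step predictor is already available as a callable. The names `recursive_forecast` and `sum_of_window` are hypothetical, and `sum_of_window` is only a stand-in, not a trained model.

```python
import numpy as np

def recursive_forecast(history, predict_one, n_lags, horizon):
    """Forecast `horizon` steps ahead by feeding predictions back in as inputs."""
    values = list(history)
    preds = []
    for _ in range(horizon):
        window = np.asarray(values[-n_lags:])  # most recent n_lags values, oldest first
        next_value = float(predict_one(window))
        preds.append(next_value)
        values.append(next_value)  # the prediction becomes an input for the next step
    return preds

def sum_of_window(window):
    # Stand-in for a trained regressor: predicts the sum of the window
    return window.sum()

forecast = recursive_forecast([1.0, 1.0, 2.0, 3.0, 5.0], sum_of_window, n_lags=2, horizon=3)
print(forecast)
```

Because later forecasts depend on earlier ones, any error in an early prediction is carried into the steps that follow, so accuracy typically degrades as the horizon grows.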
