1. Lag Features
Lag features shift the target variable back by n time steps.

They help the model learn from previous values.

Example: For forecasting sales today, you use yesterday’s sales, last week’s sales, etc.

In [1]:
import pandas as pd

data = {
    'date': [
        '2024-01-01','2024-01-02','2024-01-03','2024-01-04','2024-01-05',
        '2024-01-06','2024-01-07','2024-01-08','2024-01-09','2024-01-10'
    ],
    'value': [100,110,105,120,130,128,140,150,160,155]
}

df = pd.DataFrame(data)
df['date'] = pd.to_datetime(df['date'])
df = df.set_index('date')
print(df)


            value
date             
2024-01-01    100
2024-01-02    110
2024-01-03    105
2024-01-04    120
2024-01-05    130
2024-01-06    128
2024-01-07    140
2024-01-08    150
2024-01-09    160
2024-01-10    155


In [2]:
df['lag_1'] = df['value'].shift(1)      # Yesterday
df['lag_7'] = df['value'].shift(7)      # Last week
df['lag_30'] = df['value'].shift(30)    # Last month (approx); all NaN here because the sample has only 10 rows
print(df)

            value  lag_1  lag_7  lag_30
date                                   
2024-01-01    100    NaN    NaN     NaN
2024-01-02    110  100.0    NaN     NaN
2024-01-03    105  110.0    NaN     NaN
2024-01-04    120  105.0    NaN     NaN
2024-01-05    130  120.0    NaN     NaN
2024-01-06    128  130.0    NaN     NaN
2024-01-07    140  128.0    NaN     NaN
2024-01-08    150  140.0  100.0     NaN
2024-01-09    160  150.0  110.0     NaN
2024-01-10    155  160.0  105.0     NaN
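The NaN values in the output above appear wherever a lag reaches back before the start of the series. A minimal sketch of one way to handle this, assuming the incomplete rows can simply be dropped before training; the helper `add_lags` and the names `sample`, `lagged`, and `train` are hypothetical and used only for illustration:

```python
import pandas as pd

# Same 10-day sample series as above.
dates = pd.date_range(start="2024-01-01", periods=10, freq="D")
sample = pd.DataFrame(
    {"value": [100, 110, 105, 120, 130, 128, 140, 150, 160, 155]},
    index=dates,
)

def add_lags(frame, column, lags):
    """Return a copy of `frame` with one lag_<n> column per entry in `lags`."""
    out = frame.copy()
    for n in lags:
        out[f"lag_{n}"] = out[column].shift(n)
    return out

lagged = add_lags(sample, "value", lags=[1, 7])

# Rows whose lags reach before the start of the series contain NaN.
# Dropping them keeps only rows with a complete feature set.
train = lagged.dropna()
print(train)
```

With lags of 1 and 7 on ten rows, only the last three days survive, so the largest lag sets how much history is lost.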


2. Rolling Window Features

These capture the mean, sum, standard deviation, etc., of recent periods.
They help the model understand short-term trends.
Example: last 3-day average, last 7-day maximum.

In [9]:
import pandas as pd
import numpy as np

# 1. Create time series data (30 days)
dates = pd.date_range(start="2024-01-01", periods=30, freq='D')
values = np.random.randint(100, 200, size=30)   # random values for example

df = pd.DataFrame({
    "date": dates,
    "value": values
})
df = df.set_index('date')
# 2. Rolling window features (7-day)
df['roll_mean_7'] = df['value'].rolling(window=7).mean()
df['roll_std_7']  = df['value'].rolling(window=7).std()
df['roll_max_7']  = df['value'].rolling(window=7).max()
df['roll_min_7']  = df['value'].rolling(window=7).min()

print(df)

            value  roll_mean_7  roll_std_7  roll_max_7  roll_min_7
date                                                              
2024-01-01    126          NaN         NaN         NaN         NaN
2024-01-02    121          NaN         NaN         NaN         NaN
2024-01-03    140          NaN         NaN         NaN         NaN
2024-01-04    176          NaN         NaN         NaN         NaN
2024-01-05    105          NaN         NaN         NaN         NaN
2024-01-06    157          NaN         NaN         NaN         NaN
2024-01-07    148   139.000000   23.888630       176.0       105.0
2024-01-08    161   144.000000   24.372115       176.0       105.0
2024-01-09    148   147.857143   22.161743       176.0       105.0
2024-01-10    106   143.000000   27.300794       176.0       105.0
2024-01-11    192   145.285714   30.950037       192.0       105.0
2024-01-12    147   151.285714   25.414656       192.0       106.0
2024-01-13    153   150.714286   25.309513       192.0       106.0
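Note that `rolling(window=7)` on the raw column includes the current row's value in each window. When the rolling statistic is a feature for predicting that same row, one common precaution is to shift the series by one step first, so the window only covers earlier days. A minimal sketch of that idea, assuming a daily series like the one above; the frame `demo`, the seed, and the new column names are illustrative, not part of the original notebook:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible
dates = pd.date_range(start="2024-01-01", periods=30, freq="D")
demo = pd.DataFrame({"value": rng.integers(100, 200, size=30)}, index=dates)

# rolling(7) on the raw column includes the current day's value.
demo["roll_mean_7"] = demo["value"].rolling(window=7).mean()

# Shifting by one step first restricts the window to the 7 days before the row.
demo["roll_mean_7_past"] = demo["value"].shift(1).rolling(window=7).mean()

# min_periods=1 returns a partial-window mean instead of NaN for the early rows.
demo["roll_mean_7_partial"] = demo["value"].rolling(window=7, min_periods=1).mean()

print(demo.head(10))
```

The shifted version has one more NaN row at the start, which is the cost of keeping the current value out of its own feature.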

3. Time-based Features (Day, Month, Year, Day of Week)
Extract calendar patterns.
Example: sales increase on weekends, power usage changes by month.

In [14]:
df['day'] = df.index.day
df['month'] = df.index.month
df['year'] = df.index.year
df['day_of_week'] = df.index.dayofweek   # Monday=0, Sunday=6
df['is_weekend'] = (df['day_of_week'] >= 5).astype(int)
print(df[['day','month','year','day_of_week','is_weekend']])


            day  month  year  day_of_week  is_weekend
date                                                 
2024-01-01    1      1  2024            0           0
2024-01-02    2      1  2024            1           0
2024-01-03    3      1  2024            2           0
2024-01-04    4      1  2024            3           0
2024-01-05    5      1  2024            4           0
2024-01-06    6      1  2024            5           1
2024-01-07    7      1  2024            6           1
2024-01-08    8      1  2024            0           0
2024-01-09    9      1  2024            1           0
2024-01-10   10      1  2024            2           0
2024-01-11   11      1  2024            3           0
2024-01-12   12      1  2024            4           0
2024-01-13   13      1  2024            5           1
2024-01-14   14      1  2024            6           1
2024-01-15   15      1  2024            0           0
2024-01-16   16      1  2024            1           0
2024-01-17   17      1  2024            2           0
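An integer `day_of_week` places Sunday (6) far from Monday (0), even though the two days are adjacent in the calendar. One option some models benefit from is a sine/cosine encoding that places the weekdays on a circle. A minimal sketch, assuming a 7-day cycle; the frame `cal` and the columns `dow_sin` and `dow_cos` are hypothetical names:

```python
import numpy as np
import pandas as pd

dates = pd.date_range(start="2024-01-01", periods=14, freq="D")
cal = pd.DataFrame(index=dates)
cal["day_of_week"] = cal.index.dayofweek  # Monday=0, Sunday=6

# Map the 7 weekdays onto a circle so Sunday (6) sits next to Monday (0).
angle = 2 * np.pi * cal["day_of_week"] / 7
cal["dow_sin"] = np.sin(angle)
cal["dow_cos"] = np.cos(angle)

print(cal.head(7).round(3))
```

Tree-based models can usually work with the raw integer, so this encoding is mainly worth trying with linear models or neural networks.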

4. Holiday Features

Holidays often cause sudden increases/decreases in demand.
Adding holiday flags can improve forecast accuracy.

In [15]:
%pip install holidays

Note: you may need to restart the kernel to use updated packages.

In [None]:
import holidays

# holidays.India() fills in its dates lazily, one year at a time, when a date is looked up.
# Index.isin() only iterates over the still-empty mapping and flags nothing,
# so test each date for membership instead.
indian_holidays = holidays.India()

df['holiday'] = [int(d in indian_holidays) for d in df.index]

# Republic Day (26 January) lies inside the sample range and should be flagged.
print(df.loc[pd.Timestamp('2024-01-26'), 'holiday'])


1
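Demand often shifts on the days around a holiday as well, and some holidays that matter to a business (regional or company-specific ones) may not be in any library. A minimal sketch using a hand-maintained date list; the list `custom_holidays`, the frame `cal`, and the `pre_holiday` / `post_holiday` columns are assumptions for illustration:

```python
import pandas as pd

dates = pd.date_range(start="2024-01-01", periods=30, freq="D")
cal = pd.DataFrame(index=dates)

# Hand-maintained holiday list (assumption for illustration).
# 2024-01-26 is Republic Day in India; add other dates as needed.
custom_holidays = pd.to_datetime(["2024-01-26"])

# isin works directly here because both sides are plain datetime values.
cal["holiday"] = cal.index.isin(custom_holidays).astype(int)

# Flag the day before and the day after a holiday.
cal["pre_holiday"] = cal["holiday"].shift(-1, fill_value=0)
cal["post_holiday"] = cal["holiday"].shift(1, fill_value=0)

print(cal.loc["2024-01-24":"2024-01-28"])
```

Whether the neighbouring-day flags help depends on the data, so it is worth checking them against a validation set before keeping them.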
