# **8. Working with Date and Time**

## **🌀 7. Rolling, Expanding, and Shifting Windows**

In [1]:
import pandas as pd 
import numpy as np

### 1. **What it does and When to Use It**

These techniques allow you to perform operations on **moving or cumulative windows** of data, especially useful in **time series analysis**.

| Technique     | What it does                                                                       | When to use                                      |
| ------------- | ---------------------------------------------------------------------------------- | ------------------------------------------------ |
| **Rolling**   | Applies a function over a **fixed-size moving window**                             | For running averages, volatility, or moving sums |
| **Expanding** | Applies a function over an **expanding window** (from the beginning of the series) | For cumulative statistics                        |
| **Shifting**  | Shifts data **forward or backward** in time                                        | For lagging/leading comparisons in forecasting   |
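
As a quick illustration of the three techniques in the table above, the sketch below applies each one to the same small Series. The values are made up for the example and are not part of the sales data used later.

```python
import pandas as pd

# Toy data, chosen only so the arithmetic is easy to follow
s = pd.Series([10, 20, 30, 40, 50])

rolling_mean = s.rolling(window=3).mean()  # mean of the current row and the two before it
expanding_sum = s.expanding().sum()        # running total from the first row onward
lagged = s.shift(1)                        # each row receives the previous row's value

summary = pd.DataFrame(
    {
        "value": s,
        "rolling_mean": rolling_mean,
        "expanding_sum": expanding_sum,
        "lagged": lagged,
    }
)
print(summary)
```

The first two rolling values and the first lagged value are NaN because there is not yet enough history to fill them.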


### 2. **Syntax and Core Parameters**

#### **Rolling**

```python
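# agg_func is a placeholder for a reducer such as mean(), sum(), or std()
# min_periods=None (the default) means "window size" for integer windows
df.rolling(window, min_periods=None, center=False).agg_func()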
```

#### **Expanding**

```python
df.expanding(min_periods=1).agg_func()
```

#### **Shifting**

```python
df.shift(periods=1, freq=None)
```

### 🔧 Key Parameters:

| Parameter     | Description                                           |
| ------------- | ----------------------------------------------------- |
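| `window`      | Size of the moving window: a number of rows (e.g., 3) or a time offset such as `'7D'` on a datetime-like index |
| `min_periods` | Minimum number of non-NA observations a window needs to produce a value; otherwise the result is NaN. Defaults to the window size for integer windows in `rolling()` and to 1 in `expanding()` |
| `center`      | If `True`, the result is labeled at the center of the window; if `False` (default), at its right edge |
| `periods`     | Number of periods to shift. Positive moves values to later rows (lag); negative moves them to earlier rows (lead) |
| `freq`        | Offset used to shift the index labels while leaving the values in place (e.g., `'D'`) |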
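
The effect of `min_periods` and `center` is easiest to see side by side. The following is a minimal sketch on made-up values, not on the notebook's data.

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

# Default: min_periods equals the window size, so the first two rows are NaN
default = s.rolling(window=3).mean()

# min_periods=1: emit a value as soon as one observation is available
partial = s.rolling(window=3, min_periods=1).mean()

# center=True: the label sits in the middle of the window,
# so NaNs appear at both ends instead of only at the start
centered = s.rolling(window=3, center=True).mean()

print(pd.DataFrame({"default": default, "partial": partial, "centered": centered}))
```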


### 3. **Different Methods and Techniques**

#### ✅ **Rolling Window with Mean**

```python
df['rolling_avg'] = df['sales'].rolling(window=3).mean()
```

#### ✅ **Rolling with Other Built-in Aggregations**

```python
df['rolling_max'] = df['sales'].rolling(5).max()
```
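
`max()` is a built-in reducer. For a user-defined aggregation, one option is `Rolling.apply`, which calls a function on each window. The sketch below computes a rolling range (max minus min); the helper name `value_range` and the sample values are illustrative assumptions.

```python
import numpy as np
import pandas as pd

sales = pd.Series([88, 138, 151, 131, 131, 172], dtype="float64")

def value_range(window: np.ndarray) -> float:
    """Spread between the largest and smallest value in one window."""
    return window.max() - window.min()

# raw=True passes each window as a NumPy array rather than a Series,
# which is usually faster for simple numeric functions
rolling_range = sales.rolling(window=3).apply(value_range, raw=True)
print(rolling_range)
```

Built-in reducers are generally faster than `apply`, so a custom function is worth reaching for only when no built-in method fits.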

#### ✅ **Expanding Window**

```python
df['expanding_sum'] = df['sales'].expanding().sum()
```

#### ✅ **Shifting**

```python
df['lagged'] = df['sales'].shift(1)       # Lag
df['lead'] = df['sales'].shift(-1)        # Lead
```

#### ✅ **Shift Index with Frequency**

```python
df.index = df.index.shift(1, freq='D')    # Shift date index forward by 1 day
```
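
Shifting values and shifting the index are easy to confuse. The sketch below contrasts the two on a hypothetical three-row frame (`df_demo` is not the notebook's `df`).

```python
import pandas as pd

idx = pd.date_range("2025-07-01", periods=3, freq="D")
df_demo = pd.DataFrame({"sales": [100, 110, 120]}, index=idx)

# Without freq: the index stays put, values move down one row, first row becomes NaN
value_shift = df_demo.shift(1)

# With freq: values stay attached to their rows, every timestamp moves one day later
index_shift = df_demo.shift(1, freq="D")

print(value_shift)
print(index_shift)
```

With `freq`, no data is lost and no NaN is introduced; only the labels change.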

### 4. **Examples on Real/Pseudo Data**

In [None]:
date_rng = pd.date_range(start='2025-07-01', periods=10, freq='D')
df = pd.DataFrame({'sales': np.random.randint(50, 200, size=10)}, index=date_rng)

df

|            | sales |
| ---------- | ----- |
| 2025-07-01 | 88    |
| 2025-07-02 | 138   |
| 2025-07-03 | 151   |
| 2025-07-04 | 131   |
| 2025-07-05 | 131   |
| 2025-07-06 | 172   |
| 2025-07-07 | 69    |
| 2025-07-08 | 102   |
| 2025-07-09 | 69    |
| 2025-07-10 | 158   |


In [5]:
# 1. Rolling 3-day average
df['rolling_avg'] = df['sales'].rolling(window=3).mean()
df

|            | sales | rolling_avg |
| ---------- | ----- | ----------- |
| 2025-07-01 | 88    | NaN         |
| 2025-07-02 | 138   | NaN         |
| 2025-07-03 | 151   | 125.666667  |
| 2025-07-04 | 131   | 140.0       |
| 2025-07-05 | 131   | 137.666667  |
| 2025-07-06 | 172   | 144.666667  |
| 2025-07-07 | 69    | 124.0       |
| 2025-07-08 | 102   | 114.333333  |
| 2025-07-09 | 69    | 80.0        |
| 2025-07-10 | 158   | 109.666667  |


In [7]:
# 2. Expanding sum
df['expanding_sum'] = df['sales'].expanding().sum()
df

|            | sales | rolling_avg | expanding_sum |
| ---------- | ----- | ----------- | ------------- |
| 2025-07-01 | 88    | NaN         | 88.0          |
| 2025-07-02 | 138   | NaN         | 226.0         |
| 2025-07-03 | 151   | 125.666667  | 377.0         |
| 2025-07-04 | 131   | 140.0       | 508.0         |
| 2025-07-05 | 131   | 137.666667  | 639.0         |
| 2025-07-06 | 172   | 144.666667  | 811.0         |
| 2025-07-07 | 69    | 124.0       | 880.0         |
| 2025-07-08 | 102   | 114.333333  | 982.0         |
| 2025-07-09 | 69    | 80.0        | 1051.0        |
| 2025-07-10 | 158   | 109.666667  | 1209.0        |


In [8]:
# 3. Lag and Lead using shift
df['lagged'] = df['sales'].shift(1)
df['lead'] = df['sales'].shift(-1)

df

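|            | sales | rolling_avg | expanding_sum | lagged | lead  |
| ---------- | ----- | ----------- | ------------- | ------ | ----- |
| 2025-07-01 | 88    | NaN         | 88.0          | NaN    | 138.0 |
| 2025-07-02 | 138   | NaN         | 226.0         | 88.0   | 151.0 |
| 2025-07-03 | 151   | 125.666667  | 377.0         | 138.0  | 131.0 |
| 2025-07-04 | 131   | 140.0       | 508.0         | 151.0  | 131.0 |
| 2025-07-05 | 131   | 137.666667  | 639.0         | 131.0  | 172.0 |
| 2025-07-06 | 172   | 144.666667  | 811.0         | 131.0  | 69.0  |
| 2025-07-07 | 69    | 124.0       | 880.0         | 172.0  | 102.0 |
| 2025-07-08 | 102   | 114.333333  | 982.0         | 69.0   | 69.0  |
| 2025-07-09 | 69    | 80.0        | 1051.0        | 102.0  | 158.0 |
| 2025-07-10 | 158   | 109.666667  | 1209.0        | 69.0   | NaN   |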


### 5. **Common Pitfalls and Best Practices**

| ❌ Pitfall                                          | ✅ Best Practice                                                    |
| -------------------------------------------------- | ------------------------------------------------------------------ |
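| Being surprised by leading NaNs from `rolling()`   | The first `window - 1` rows are NaN by default; lower `min_periods` or handle the NaNs explicitly |
| Misinterpreting lag/lead direction in `shift()`    | Positive `periods` gives each row an earlier value (lag); negative gives it a later value (lead) |
| Using time-offset windows without a datetime index | Integer windows work on any index; offsets such as `'7D'` and `shift(freq=...)` need a datetime-like index, sorted for offset windows |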
| Forgetting `center=True` when needed               | Use `center=True` if you want centered windows for symmetry        |
| Expecting `expanding()` to behave like `rolling()` | Know that `expanding()` starts from the first data point and grows |
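
The difference between row-based and time-based windows matters most when the data has gaps. The sketch below uses made-up values with a deliberate three-day gap to show how an integer window and an offset window can disagree.

```python
import pandas as pd

# Irregular timestamps: note the gap between 2025-07-02 and 2025-07-05
idx = pd.to_datetime(["2025-07-01", "2025-07-02", "2025-07-05", "2025-07-06"])
s = pd.Series([10.0, 20.0, 30.0, 40.0], index=idx)

# Integer window: always the last 2 rows, however far apart they are in time
by_rows = s.rolling(window=2).sum()

# Offset window: only rows that fall within the last 2 calendar days
by_time = s.rolling(window="2D").sum()

print(pd.DataFrame({"value": s, "by_rows": by_rows, "by_time": by_time}))
```

On 2025-07-05 the row-based sum still includes the value from 2025-07-02, while the time-based sum does not, because that observation is more than two days old.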


### 6. **Real World Use Cases**

#### 📉 Financial Analytics

* Rolling volatility of stock prices (standard deviation over 30-day windows)
* 7-day moving average of closing prices
* Expanding average for long-term trend
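
The three financial ideas above could be sketched as follows. The price series is synthetic (a seeded random walk), and computing volatility on daily percentage returns is one common choice assumed here, not the only one.

```python
import numpy as np
import pandas as pd

# Synthetic closing prices: a seeded random walk around 100 (illustrative only)
rng = np.random.default_rng(0)
dates = pd.date_range("2025-01-01", periods=60, freq="D")
close = pd.Series(100 + rng.normal(0, 1, size=60).cumsum(), index=dates)

returns = close.pct_change()                # day-over-day percentage change
vol_30d = returns.rolling(window=30).std()  # 30-day rolling volatility of returns
ma_7d = close.rolling(window=7).mean()      # 7-day moving average of the close
expanding_avg = close.expanding().mean()    # average of all prices seen so far

print(pd.DataFrame({"close": close, "ma_7d": ma_7d, "vol_30d": vol_30d}).tail())
```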

#### 🛒 E-commerce

* Rolling weekly average of user visits or sales
* Lead/lag comparisons to measure effect of promotions

#### 🧠 Machine Learning Features

* Create lagged features for time series forecasting
* Rolling mean/variance as features to smooth noisy data
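
A minimal sketch of such feature engineering is shown below. The column names (`sales_lag_1`, `sales_roll_mean_3`) and the sample values are assumptions for illustration. The rolling feature is computed on `shift(1)` so that each row only sees values from earlier rows, which avoids leaking the target into its own feature.

```python
import pandas as pd

dates = pd.date_range("2025-07-01", periods=6, freq="D")
df_feat = pd.DataFrame({"sales": [88, 138, 151, 131, 131, 172]}, index=dates)

# Lagged copies of the target: yesterday's and the day before yesterday's sales
for lag in (1, 2):
    df_feat[f"sales_lag_{lag}"] = df_feat["sales"].shift(lag)

# Mean of the three days strictly before the current row
df_feat["sales_roll_mean_3"] = df_feat["sales"].shift(1).rolling(window=3).mean()

# Rows without a full history contain NaN and are dropped before training
train = df_feat.dropna()
print(train)
```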

#### ⚙️ Manufacturing / IoT

* 5-minute moving average of sensor temperature readings
* Detect anomalies using rolling standard deviation
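
One simple way to flag anomalies with rolling statistics is a z-score against a trailing baseline. The readings, the window of 4, and the threshold of 3 below are all illustrative assumptions; real sensor data would need tuning.

```python
import pandas as pd

# Made-up temperature readings with one obvious spike
readings = pd.Series([20.0, 20.5, 19.8, 20.2, 20.1, 35.0, 20.3, 19.9])

# Baseline from the 4 readings before the current one (shift keeps the
# current reading out of its own baseline)
baseline = readings.shift(1).rolling(window=4)

# Distance from the baseline mean, in units of the baseline standard deviation
z_score = (readings - baseline.mean()) / baseline.std()

# Rows without a full baseline have a NaN z-score and are never flagged
is_anomaly = z_score.abs() > 3
print(readings[is_anomaly])
```

Note that once the spike enters the baseline it inflates the rolling standard deviation, so the readings right after it are less likely to be flagged.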

#### 📊 Marketing

* Expanding cumulative spend of advertising campaigns
* Shifting email open rates to analyze behavior lag

