# **8. Working with Date and Time**

## 📈 **6. Resampling and Frequency Conversion**

In [1]:
import pandas as pd 
import numpy as np

### 1. **What it does and When to Use It**

**Resampling** is the process of changing the frequency of your time series data.

* **Downsampling** = Reducing frequency (e.g., daily → monthly)
* **Upsampling** = Increasing frequency (e.g., daily → hourly)

📌 **Used when**:

* Aggregating time-based data (e.g., total monthly sales from daily sales)
* Filling in missing time intervals
* Analyzing trends over different time windows
* Aligning time series to a uniform frequency
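
As a minimal sketch of the last two points, the snippet below aligns irregular readings to a uniform hourly grid. The `readings` series and its values are invented for illustration.

```python
import pandas as pd

# Hypothetical sensor readings taken at irregular times
readings = pd.Series(
    [10.0, 12.0, 11.0, 15.0],
    index=pd.to_datetime([
        "2025-07-01 00:05",
        "2025-07-01 00:40",
        "2025-07-01 02:10",
        "2025-07-01 02:50",
    ]),
    name="value",
)

# One row per hour; readings inside the same hour are averaged
hourly = readings.resample("h").mean()
# 00:00 -> 11.0, 01:00 -> NaN (no readings), 02:00 -> 13.0

# Fill the empty hour with the last known value
hourly_filled = hourly.ffill()
```

The empty 01:00 bin appears as `NaN` rather than being dropped, which is what makes the resulting index uniform.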

### 2. **Syntax and Core Parameters**

```python
df.resample(rule, on=None, closed=None, label=None).mean()  # or .sum(), .agg(...), .ffill(), ...
```

### 🔧 Core Parameters:

| Parameter | Description                                                                                |
| --------- | ------------------------------------------------------------------------------------------ |
| `rule`    | Frequency string such as `'D'` (day), `'h'` (hour), `'ME'` (month end), `'W'` (week), `'YE'` (year end); pandas before 2.2 uses `'H'`, `'M'`, `'A'` instead |
| `on`      | Column to resample on instead of the index (must be datetime-like)                         |
| `label`   | Which bin edge labels the bin: `'left'` or `'right'`                                       |
| `closed`  | Which side of each bin interval is closed: `'left'` or `'right'`                           |
| `method`  | Not a `resample()` argument; for upsampling, chain `.ffill()` / `.bfill()`, or pass `method='ffill'` / `'bfill'` to `asfreq()` |
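
The effect of `closed` and `label` is easiest to see on a tiny series. This is a sketch on made-up values; for a `'2h'` rule the defaults are `closed='left'` and `label='left'`.

```python
import pandas as pd

# Five hourly points: 00:00 .. 04:00 with values 1..5
idx = pd.date_range("2025-07-01 00:00", periods=5, freq="h")
s = pd.Series([1, 2, 3, 4, 5], index=idx)

# Default: bins [00:00, 02:00), [02:00, 04:00), [04:00, 06:00), labeled by left edge
left = s.resample("2h").sum()
# 00:00 -> 3, 02:00 -> 7, 04:00 -> 5

# Right-closed bins (22:00, 00:00], (00:00, 02:00], (02:00, 04:00], labeled by right edge
right = s.resample("2h", closed="right", label="right").sum()
# 00:00 -> 1, 02:00 -> 5, 04:00 -> 9
```

Both results share the same labels, but each timestamp that sits exactly on a bin edge moves to the earlier bin when `closed='right'`.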


### 3. **Different Methods and Techniques**

#### ✅ **Downsampling** (Aggregation)

```python
df.resample('ME').sum()   # Monthly sum (use 'M' before pandas 2.2)
df.resample('W').mean()   # Weekly average
```

#### ✅ **Upsampling** (More granular frequency)

```python
df.resample('h').ffill()  # Hourly, forward-filled
df.resample('15min').bfill()  # Every 15 min, backward-filled
```

#### ✅ **Specifying datetime column (if not index)**

```python
df.resample('D', on='timestamp').mean()
```
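
A small sketch of `on=` with a hypothetical `events` frame whose datetime lives in a regular column rather than the index:

```python
import pandas as pd

# Hypothetical event log with a plain RangeIndex
events = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2025-07-01 08:00",
        "2025-07-01 20:00",
        "2025-07-02 09:00",
    ]),
    "value": [10.0, 20.0, 30.0],
})

# Resample on the column; it becomes the index of the result
daily = events.resample("D", on="timestamp").mean()
# 2025-07-01 -> 15.0, 2025-07-02 -> 30.0

# Same numbers via an explicit datetime index
daily_alt = events.set_index("timestamp").resample("D").mean()
```

Using `on=` leaves `events` untouched, which is convenient when the frame is also used with its original index elsewhere.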

#### ✅ **Custom aggregation**

```python
df.resample('ME').agg({'sales': 'sum', 'visits': 'mean'})  # 'M' before pandas 2.2
```

#### ✅ **Using `asfreq()` for frequency conversion without fill**

```python
df.asfreq('h')  # Convert to hourly, NaNs inserted if data missing
```
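
To make the difference concrete, here is a sketch comparing `asfreq()` with and without a fill method against `resample().ffill()`, using an invented three-day series:

```python
import pandas as pd

# Hypothetical daily series
daily = pd.Series(
    [1.0, 2.0, 3.0],
    index=pd.date_range("2025-07-01", periods=3, freq="D"),
)

# Reindex to a 12-hour grid; the new 12:00 slots have no data
as_12h = daily.asfreq("12h")
# values: 1.0, NaN, 2.0, NaN, 3.0

# Same grid, but new slots take the previous observation
as_12h_filled = daily.asfreq("12h", method="ffill")

# Equivalent result through resample
via_resample = daily.resample("12h").ffill()
```

`asfreq()` only selects or inserts rows at the new timestamps; it never aggregates, so it is not a substitute for `resample()` when downsampling.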


### 4. **Examples on Real/Pseudo Data**

In [3]:
# Generate hourly data for 2 days
range_ = pd.date_range('2025-07-01', periods=48, freq='h')
df = pd.DataFrame({'temperature': np.random.randint(20, 40, size=48)},
                  index=range_)

df

|                     | temperature |
| ------------------- | ----------- |
| 2025-07-01 00:00:00 | 28          |
| 2025-07-01 01:00:00 | 25          |
| 2025-07-01 02:00:00 | 29          |
| 2025-07-01 03:00:00 | 24          |
| 2025-07-01 04:00:00 | 23          |
| 2025-07-01 05:00:00 | 37          |
| 2025-07-01 06:00:00 | 33          |
| 2025-07-01 07:00:00 | 33          |
| 2025-07-01 08:00:00 | 26          |
| 2025-07-01 09:00:00 | 24          |


In [4]:
# 1. Downsampling: daily average temperature

df.resample('D').mean()

|            | temperature |
| ---------- | ----------- |
| 2025-07-01 | 30.0        |
| 2025-07-02 | 30.083333   |


In [7]:
# 2. Upsampling: from hourly to every 15 minutes, forward-filled
df.resample('15min').ffill()

|                     | temperature |
| ------------------- | ----------- |
| 2025-07-01 00:00:00 | 28          |
| 2025-07-01 00:15:00 | 28          |
| 2025-07-01 00:30:00 | 28          |
| 2025-07-01 00:45:00 | 28          |
| 2025-07-01 01:00:00 | 25          |
| ...                 | ...         |
| 2025-07-02 22:00:00 | 21          |
| 2025-07-02 22:15:00 | 21          |
| 2025-07-02 22:30:00 | 21          |
| 2025-07-02 22:45:00 | 21          |


In [9]:
# 3. Custom aggregation
df.resample('D').agg({'temperature': ['min', 'mean', 'max']})

|            | temperature (min) | temperature (mean) | temperature (max) |
| ---------- | ----------------- | ------------------ | ----------------- |
| 2025-07-01 | 20                | 30.0               | 39                |
| 2025-07-02 | 20                | 30.083333          | 39                |


### 5. **Common Pitfalls and Best Practices**

| ❌ Pitfall                                      | ✅ Best Practice                                                                                                                 |
| ---------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------- |
| Not setting or specifying a datetime index     | Use `df.set_index('date')` or `on='date'`                                                                                       |
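| Forgetting to aggregate after `resample()`     | `resample()` alone returns a lazy `Resampler` object; follow it with an aggregation or fill method (e.g., `.mean()`)             |
| Upsampling without fill method gives NaNs      | Use `.ffill()`, `.bfill()`, or `.interpolate()`                                                                                 |
| Confusion between `resample()` and `groupby()` | `resample()` bins rows by time frequency; plain `groupby()` groups by exact key values (use `pd.Grouper(freq=...)` for time bins) |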
| Using incorrect frequency strings              | Refer to [pandas frequency aliases](https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#dateoffset-objects) |
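
Two of the pitfalls in the table above can be checked directly. The sketch below uses deterministic values (0 to 47) instead of random ones so the results are predictable.

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2025-07-01", periods=48, freq="h")
df = pd.DataFrame({"temperature": np.arange(48, dtype=float)}, index=idx)

# resample() on its own computes nothing; it returns a Resampler object
resampler = df.resample("D")
is_frame = isinstance(resampler, pd.DataFrame)  # False

# The result only exists after an aggregation is applied
daily = resampler.mean()
# 2025-07-01 -> 11.5, 2025-07-02 -> 35.5

# groupby() becomes time-aware when given a pd.Grouper with a frequency
via_grouper = df.groupby(pd.Grouper(freq="D")).mean()
```

`pd.Grouper(freq=...)` is mainly useful when time bins need to be combined with other grouping keys in a single `groupby()` call.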


### 6. **Real World Use Cases**

#### 📊 Financial Sector:

* Resample stock data to daily, weekly, or monthly closing prices
* Compute 7-day or monthly moving averages for trends
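
As an illustration of the first bullet, this sketch builds daily closing prices and open/high/low/close bars from intraday data. The `prices` series and its values are invented.

```python
import pandas as pd

# Hypothetical intraday prices, one quote every 4 hours
prices = pd.Series(
    [100.0, 102.0, 101.0, 105.0, 104.0, 103.0],
    index=pd.date_range("2025-07-01 09:00", periods=6, freq="4h"),
)

# Daily close: the last quote observed on each calendar day
daily_close = prices.resample("D").last()
# 2025-07-01 -> 105.0, 2025-07-02 -> 103.0

# Daily open/high/low/close bars in one call
daily_bars = prices.resample("D").ohlc()
# 2025-07-01 -> open 100, high 105, low 100, close 105
# 2025-07-02 -> open 104, high 104, low 103, close 103
```

A moving average over the resampled series would then be a `rolling()` call on `daily_close`, which is a separate step from resampling.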

#### 🌡️ IoT/Weather Data:

* Downsample hourly temperature readings to daily max/min
* Upsample sensor data to fill in missing periods with estimates

#### 🛒 E-commerce:

* Weekly aggregated sales from daily transactions
* Hourly visit patterns from server logs

#### 🚌 Transport:

* Aggregate trip durations per day/week
* Analyze traffic volume over different time buckets (15min/hourly)

#### 🏥 Healthcare:

* Patient vitals monitoring every minute → summarize to hourly
* Resample medication logs for compliance tracking
