# **8. Working with Date and Time**

## **3. Filtering and Indexing by Dates**

In [12]:
import pandas as pd 
import numpy as np

### ✅ 1. **What it does and when to use it**

#### 👉 What it does:

Filtering and indexing by dates allows you to **select rows based on specific time periods**, such as a particular date, month, year, or date range. It also includes using **datetime indexes** to slice datasets more efficiently.

#### 📌 When to use:

* When analyzing **time-bounded subsets** of data (e.g., all records from July 2024).
* When preparing **monthly, quarterly, or yearly reports**.
* When working with **time-series forecasting**, which often requires trimming the data to specific time windows.
* When your datetime column is **set as the index**, so that you can use **date slicing** much like numerical slicing.

### 🧾 2. Syntax and Core Parameters

There are **two primary modes**:

#### ✅ 1. **Boolean Filtering (with datetime column)**

```python
df[(df['date'] >= '2024-01-01') & (df['date'] <= '2024-03-31')]
```

#### ✅ 2. **Datetime Index Filtering** (using `.loc[]`)

```python
df.loc['2024-01-01']                   # specific date
df.loc['2024-01']                      # whole month
df.loc['2024']                         # whole year
df.loc['2024-01-01':'2024-03-31']      # range
```

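> ✅ Make sure `df.index` is a `DatetimeIndex` (ideally sorted) for `.loc[]` slicing to work like this. A partial string such as `'2024-01'` selects every row that falls inside that period.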
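
A minimal sketch of preparing a frame for this kind of slicing, assuming the dates arrive as unsorted strings. The `raw` and `ts` names and the sample values are hypothetical and only serve the illustration.

```python
import pandas as pd

# Hypothetical raw data: dates arrive as strings, in no particular order.
raw = pd.DataFrame({
    'date': ['2024-02-10', '2024-01-05', '2024-03-01', '2023-12-31'],
    'sales': [120, 100, 150, 90],
})

# Convert to datetime, move the column into the index, and sort it.
ts = (
    raw.assign(date=pd.to_datetime(raw['date']))
       .set_index('date')
       .sort_index()
)

print(isinstance(ts.index, pd.DatetimeIndex))  # True
print(ts.loc['2024-01':'2024-02'])             # rows from January and February 2024
print(ts.loc['2024'])                          # every row from 2024
```

Sorting is not strictly required for every lookup, but it keeps range slices predictable.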


### 🧠 3. Different Methods and Techniques

| Technique                        | Description                                                  |
| -------------------------------- | ------------------------------------------------------------ |
| Boolean masks                    | Filter using comparisons on a datetime column                |
| Range filtering                  | Use `&` and comparison operators on datetime                 |
| `pd.date_range()`                | Generate a sequence of dates (a `DatetimeIndex`) to test membership against |
| `.between()`                     | Shorthand for `>=` and `<=` on a column; both endpoints are inclusive by default |
| `.isin()` with `pd.date_range()` | Filter for multiple specific dates                           |
| `.loc[]` with `DatetimeIndex`    | Fast and readable slicing by date/time strings               |


### 🧪 4. Examples on Real/Pseudo Data

In [13]:
df = pd.DataFrame({
    'date': pd.date_range('2024-01-01', periods=10, freq='D'),
    'sales': [100, 200, 150, 120, 180, 250, 90, 300, 160, 220]
})

df

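|     | date       | sales |
| --- | ---------- | ----- |
| 0   | 2024-01-01 | 100   |
| 1   | 2024-01-02 | 200   |
| 2   | 2024-01-03 | 150   |
| 3   | 2024-01-04 | 120   |
| 4   | 2024-01-05 | 180   |
| 5   | 2024-01-06 | 250   |
| 6   | 2024-01-07 | 90    |
| 7   | 2024-01-08 | 300   |
| 8   | 2024-01-09 | 160   |
| 9   | 2024-01-10 | 220   |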


#### ✅ Example 1: Boolean filtering with `&`

In [14]:
mask = (df['date'] > '2024-01-03') & (df['date'] <= '2024-01-06')
df[mask]

|     | date       | sales |
| --- | ---------- | ----- |
| 3   | 2024-01-04 | 120   |
| 4   | 2024-01-05 | 180   |
| 5   | 2024-01-06 | 250   |
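
The parentheses around each comparison matter because `&` binds more tightly than `>=` and `<=` in Python. The sketch below shows the safe pattern on a small hypothetical frame (`demo`, `start`, and `end` are illustrative names) and compares against `pd.Timestamp` objects instead of strings, which works the same way.

```python
import pandas as pd

demo = pd.DataFrame({
    'date': pd.date_range('2024-01-01', periods=5, freq='D'),
    'sales': [100, 200, 150, 120, 180],
})

start, end = pd.Timestamp('2024-01-02'), pd.Timestamp('2024-01-04')

# Correct: each comparison is wrapped in parentheses before combining with &.
ok = demo[(demo['date'] >= start) & (demo['date'] <= end)]
print(ok['sales'].tolist())  # [200, 150, 120]

# Without parentheses, Python would try to evaluate `start & demo['date']`
# first, so the line below raises an error instead of filtering:
# demo[demo['date'] >= start & demo['date'] <= end]
```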


#### ✅ Example 2: Filtering using `.between()`

In [15]:
df['date'].between('2024-01-04', '2024-01-08')

0    False
1    False
2    False
3     True
4     True
5     True
6     True
7     True
8    False
9    False
Name: date, dtype: bool

In [16]:
df[df['date'].between('2024-01-04', '2024-01-08')]

|     | date       | sales |
| --- | ---------- | ----- |
| 3   | 2024-01-04 | 120   |
| 4   | 2024-01-05 | 180   |
| 5   | 2024-01-06 | 250   |
| 6   | 2024-01-07 | 90    |
| 7   | 2024-01-08 | 300   |
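
Both endpoints are included in the result above because `.between()` is inclusive by default. A short sketch of the `inclusive` argument follows, assuming pandas 1.3 or newer, where it accepts the strings `'both'`, `'left'`, `'right'`, and `'neither'`. The `dates` series is a stand-in for the `date` column.

```python
import pandas as pd

dates = pd.Series(pd.date_range('2024-01-01', periods=10, freq='D'))

# Default behaviour: inclusive='both' keeps 2024-01-04 and 2024-01-08.
both = dates.between('2024-01-04', '2024-01-08')

# inclusive='left' keeps the start date but drops the end date.
left = dates.between('2024-01-04', '2024-01-08', inclusive='left')

# inclusive='neither' drops both endpoints.
neither = dates.between('2024-01-04', '2024-01-08', inclusive='neither')

print(both.sum(), left.sum(), neither.sum())  # 5 4 3
```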


#### ✅ Example 3: Filtering with `pd.date_range()` and `.isin()`

In [17]:
date_list = pd.date_range('2024-01-02', '2024-01-04')
date_list

DatetimeIndex(['2024-01-02', '2024-01-03', '2024-01-04'], dtype='datetime64[ns]', freq='D')

In [18]:
df[df['date'].isin(date_list)]

|     | date       | sales |
| --- | ---------- | ----- |
| 1   | 2024-01-02 | 200   |
| 2   | 2024-01-03 | 150   |
| 3   | 2024-01-04 | 120   |
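
`.isin()` is most useful when the dates are not contiguous, since a contiguous range can be handled by `.between()`. The sketch below assumes a hypothetical list of promotion days (`promo_days`) and rebuilds the sample data under the name `demo` so that it runs on its own. Note that `.isin()` matches exact timestamps, so a column with a time-of-day component would need its time stripped first.

```python
import pandas as pd

demo = pd.DataFrame({
    'date': pd.date_range('2024-01-01', periods=10, freq='D'),
    'sales': [100, 200, 150, 120, 180, 250, 90, 300, 160, 220],
})

# Hypothetical list of non-contiguous dates, e.g. promotion days.
promo_days = pd.to_datetime(['2024-01-02', '2024-01-06', '2024-01-09'])

# Keep only the rows whose date appears in the list.
promo_sales = demo[demo['date'].isin(promo_days)]
print(promo_sales['sales'].tolist())  # [200, 250, 160]
```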


#### ✅ Example 4: Set `date` as index and use `.loc[]`

In [19]:
df.set_index('date', inplace=True)

df

| date       | sales |
| ---------- | ----- |
| 2024-01-01 | 100   |
| 2024-01-02 | 200   |
| 2024-01-03 | 150   |
| 2024-01-04 | 120   |
| 2024-01-05 | 180   |
| 2024-01-06 | 250   |
| 2024-01-07 | 90    |
| 2024-01-08 | 300   |
| 2024-01-09 | 160   |
| 2024-01-10 | 220   |


In [None]:
df.loc['2024-01-05'] # Specific date

sales    180
Name: 2024-01-05 00:00:00, dtype: int64

In [22]:
df.loc['2024-01'] # Slice by month

| date       | sales |
| ---------- | ----- |
| 2024-01-01 | 100   |
| 2024-01-02 | 200   |
| 2024-01-03 | 150   |
| 2024-01-04 | 120   |
| 2024-01-05 | 180   |
| 2024-01-06 | 250   |
| 2024-01-07 | 90    |
| 2024-01-08 | 300   |
| 2024-01-09 | 160   |
| 2024-01-10 | 220   |


In [23]:
# Date range

df.loc['2024-01-03': '2024-01-06']

| date       | sales |
| ---------- | ----- |
| 2024-01-03 | 150   |
| 2024-01-04 | 120   |
| 2024-01-05 | 180   |
| 2024-01-06 | 250   |
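
Unlike positional slicing, a `.loc[]` date slice includes its end label. With intraday timestamps, a date-only end label covers the whole day. The sketch below illustrates this on hypothetical readings taken every 12 hours (`readings` and its values are assumptions for the example).

```python
import pandas as pd

# Hypothetical readings every 12 hours over three days.
idx = pd.to_datetime([
    '2024-01-01 00:00', '2024-01-01 12:00',
    '2024-01-02 00:00', '2024-01-02 12:00',
    '2024-01-03 00:00', '2024-01-03 12:00',
])
readings = pd.DataFrame({'value': [0, 1, 2, 3, 4, 5]}, index=idx)

# The end label '2024-01-02' covers the whole day, so the 12:00 row is included.
subset = readings.loc['2024-01-01':'2024-01-02']
print(subset['value'].tolist())  # [0, 1, 2, 3]

# A single date string selects every row on that day.
one_day = readings.loc['2024-01-02']
print(one_day['value'].tolist())  # [2, 3]
```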


### ⚠️ 5. Common Pitfalls and Best Practices

| Pitfall                              | What Happens                                                                                  | Best Practice                                                    |
| ------------------------------------ | --------------------------------------------------------------------------------------------- | ---------------------------------------------------------------- |
| Index is not datetime                | `.loc['2024']` typically raises `KeyError`, because the string is treated as an exact label   | Convert with `pd.to_datetime()` first, then set the index        |
| Mixing formats                       | `'01/05/2024'` can be read as January 5 or May 1, unlike `'2024-01-05'`                        | Stick to ISO format (`YYYY-MM-DD`)                               |
| Missing parentheses                  | `&` binds tighter than `>=`, so `df['col'] >= a & df['col'] <= b` raises an error             | Wrap each condition in `()` before combining with `&`            |
| Using `inplace=True` unintentionally | `set_index(..., inplace=True)` modifies `df` itself, so later `df['date']` lookups fail       | Prefer `df = df.set_index('date')` so the change is explicit     |
| Time not stripped                    | `df['date'] <= '2024-01-01'` compares against midnight, so later rows that day are excluded   | Use `.dt.normalize()` or an exclusive bound on the following day |
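
The last pitfall is easy to miss because nothing fails: rows are silently dropped. A minimal sketch, assuming a hypothetical event log (`events`, with columns `ts` and `amount`) that carries a time-of-day component:

```python
import pandas as pd

# Hypothetical event log with a time-of-day component.
events = pd.DataFrame({
    'ts': pd.to_datetime(['2024-03-30 09:00', '2024-03-31 00:00',
                          '2024-03-31 15:30', '2024-04-01 08:00']),
    'amount': [10, 20, 30, 40],
})

# '2024-03-31' is parsed as midnight, so the 15:30 row is silently dropped.
naive = events[events['ts'] <= '2024-03-31']

# Option 1: use an exclusive upper bound on the next day.
half_open = events[events['ts'] < '2024-04-01']

# Option 2: strip the time with .dt.normalize() before comparing.
normalized = events[events['ts'].dt.normalize() <= '2024-03-31']

print(naive['amount'].tolist())       # [10, 20]
print(half_open['amount'].tolist())   # [10, 20, 30]
print(normalized['amount'].tolist())  # [10, 20, 30]
```

The exclusive bound is often the simpler choice, since it avoids creating a temporary normalized column.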


### 🌍 6. Real World Use Cases

| Use Case                  | Description                                                |
| ------------------------- | ---------------------------------------------------------- |
| **Retail sales analysis** | Filter sales by month or holiday seasons to analyze demand |
| **Website traffic**       | Analyze logs from specific days or peak weeks              |
| **Stock market data**     | Slice stock prices between two dates for trend analysis    |
| **Healthcare**            | Filter patient visits in a given date range or flu season  |
| **Manufacturing**         | Monitor output or errors over specific time windows        |
| **IoT Devices**           | Extract activity over a particular week or shift           |


## ✅ Summary: Filtering and Indexing by Dates

| Concept          | Summary                                                       |
| ---------------- | ------------------------------------------------------------- |
| Boolean filters  | Work on date columns directly using comparison                |
| `.loc[]` slicing | Works when datetime is index (clean & powerful)               |
| Useful methods   | `.between()`, `.isin()`, `pd.date_range()`                    |
| Prerequisite     | Apply `pd.to_datetime()` and set the index before `.loc[]`    |
| Key for          | Time-bound analysis, reporting, forecasting, and segmentation |

