# **8. Working with Date and Time**

## **1. DateTime Conversion and Parsing**

In [2]:
import pandas as pd 
import numpy as np

### ✅ 1. **What** it does and **when** to use it

#### 👉 What it does:

DateTime Conversion and Parsing is the process of converting string/object data into **datetime objects** so that pandas can understand and operate on them **chronologically**.

#### 📌 Why it matters:

* Enables **time-based filtering**, resampling, rolling averages, and time arithmetic.
* Essential for **time series analysis**, trend evaluation, and indexing by dates.
* Many raw datasets (CSV, logs, APIs) store dates as strings, which are not natively usable for analysis unless parsed.
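
To make the last point concrete, here is a minimal sketch with made-up values: dates held as strings sort alphabetically, not chronologically. The `raw` Series and its `MM/DD/YYYY` layout are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical date strings in MM/DD/YYYY layout (dtype: object).
raw = pd.Series(["10/01/2024", "02/15/2024", "12/31/2023"])

# Sorting the strings compares characters, so December 2023 lands last.
sorted_as_text = raw.sort_values().tolist()

# After parsing, sorting follows the calendar.
parsed = pd.to_datetime(raw, format="%m/%d/%Y")
sorted_as_dates = parsed.sort_values().dt.strftime("%Y-%m-%d").tolist()

print(sorted_as_text)
print(sorted_as_dates)
```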

#### 🕰️ When to use it:

* When reading data from CSVs or JSON with date columns stored as strings.
* When you need to filter, resample, or do arithmetic on dates.
* Before setting a `DatetimeIndex` or performing `.resample()`, time-based `.rolling()` windows, or `.shift()` with a `freq`.
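
The situations above can be sketched as follows. The `raw` frame, its column names, and the 7-day bucket size are illustrative assumptions, not part of any real dataset.

```python
import pandas as pd

# Hypothetical raw data: dates arrive as plain strings.
raw = pd.DataFrame({
    "day": ["2024-01-01", "2024-01-02", "2024-01-09", "2024-01-10"],
    "sales": [10, 20, 30, 40],
})

# Convert once, up front; the explicit format documents the expected layout.
raw["day"] = pd.to_datetime(raw["day"], format="%Y-%m-%d")

# Time-based filtering now compares timestamps, not strings.
recent = raw[raw["day"] >= "2024-01-05"]

# A DatetimeIndex allows resampling, here into 7-day buckets.
weekly = raw.set_index("day")["sales"].resample("7D").sum()

print(recent)
print(weekly)
```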


### 🧾 2. Syntax and Core Parameters

#### 🧱 Core Function:

```python
pd.to_datetime(arg, errors='raise', format=None, dayfirst=False, utc=False)
```

#### 📌 Key Parameters:

| Parameter  | Purpose                                                                                                                               |
| ---------- | ------------------------------------------------------------------------------------------------------------------------------------- |
| `arg`      | The input to convert (scalar, list, Series, or DataFrame of date parts)                                                               |
| `errors`   | `{'raise', 'coerce', 'ignore'}` → how to handle invalid parsing; `'coerce'` yields `NaT` (`'ignore'` is deprecated since pandas 2.2) |
| `format`   | The `strftime`-style layout of the input strings (`"%Y-%m-%d"` etc.)                                                                  |
| `dayfirst` | If True, interprets dates like 01/05/2024 as **1 May** rather than 5 January                                                          |
| `utc`      | If True, returns timezone-aware values converted to UTC                                                                               |
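
A short sketch of how these parameters change the result. The input strings are invented for illustration.

```python
import pandas as pd

# dayfirst changes how an ambiguous string is read.
month_first = pd.to_datetime("01/05/2024")               # read as 5 January 2024
day_first = pd.to_datetime("01/05/2024", dayfirst=True)  # read as 1 May 2024

# utc=True returns a timezone-aware value; the +02:00 offset is converted to UTC.
aware = pd.to_datetime("2024-05-01 12:00:00+02:00", utc=True)

# errors='coerce' replaces unparseable values with NaT instead of raising.
coerced = pd.to_datetime(pd.Series(["2024-05-01", "not_a_date"]), errors="coerce")

print(month_first, day_first, aware)
print(coerced)
```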


### 🧠 3. Different Methods and Techniques

#### ✅ Method 1: Convert a single column to datetime

```python
df['date'] = pd.to_datetime(df['date'])
```

#### ✅ Method 2: Specify format for performance

```python
df['date'] = pd.to_datetime(df['date'], format='%d-%m-%Y')
```

#### ✅ Method 3: Handle errors gracefully

```python
df['date'] = pd.to_datetime(df['date'], errors='coerce')  # invalid dates → NaT
```
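
Coercion hides the failures, so it helps to check which inputs became `NaT`. The sketch below assumes a hypothetical `date` column containing one malformed string and one genuinely missing value.

```python
import pandas as pd

# Hypothetical column: two valid dates, one malformed string, one missing value.
df = pd.DataFrame({"date": ["2024-03-01", "2024-03-02", "03/2024?", None]})

parsed = pd.to_datetime(df["date"], format="%Y-%m-%d", errors="coerce")

# Rows that had a value in the input but failed to parse.
failed = df.loc[parsed.isna() & df["date"].notna(), "date"]

print(failed.tolist())
```

Keeping the original strings around until `failed` has been reviewed makes it possible to fix the source data instead of silently losing rows.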

#### ✅ Method 4: Convert a scalar or list of strings

```python
pd.to_datetime('2023-12-25')
pd.to_datetime(['2023/01/01', '2023/01/02'])
```
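
The return type depends on the input. A quick check, using the same strings as above:

```python
import pandas as pd

single = pd.to_datetime("2023-12-25")                  # scalar in, Timestamp out
many = pd.to_datetime(["2023/01/01", "2023/01/02"])    # list in, DatetimeIndex out

print(type(single).__name__)
print(type(many).__name__)
```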

#### ✅ Method 5: Parse multiple datetime parts (year, month, day)

```python
df['date'] = pd.to_datetime(df[['year', 'month', 'day']])
```
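
When assembling from parts, `pd.to_datetime` looks for columns named `year`, `month`, and `day` (optionally `hour`, `minute`, and so on). If the source uses other names, rename them first. The abbreviated column names below are hypothetical.

```python
import pandas as pd

# Hypothetical frame whose column names do not match what to_datetime expects.
parts = pd.DataFrame({"yr": [2024, 2024], "mo": [7, 8], "dy": [25, 10], "hr": [9, 18]})

# Rename to the recognized names before assembling.
renamed = parts.rename(columns={"yr": "year", "mo": "month", "dy": "day", "hr": "hour"})
stamps = pd.to_datetime(renamed[["year", "month", "day", "hour"]])

print(stamps)
```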


### 🧪 4. Examples on Real/Pseudo Data

In [3]:
data = {
    'order_id': [1, 2, 3],
    'order_date': ['2023-08-01', '2023-08-02', 'not_a_date'],
}
df = pd.DataFrame(data)

df

   order_id  order_date
0         1  2023-08-01
1         2  2023-08-02
2         3  not_a_date


#### ✅ Convert string column to datetime:

In [None]:
pd.to_datetime(df['order_date'], errors='coerce')

0   2023-08-01
1   2023-08-02
2          NaT
Name: order_date, dtype: datetime64[ns]

In [None]:
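converted_dates = df['order_date'].astype('datetime64[ns]', errors='ignore')
converted_dates

# With errors='ignore', astype returns the original Series unchanged when any value
# fails to convert. The single 'not_a_date' entry is why the whole column stays object.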

0    2023-08-01
1    2023-08-02
2    not_a_date
Name: order_date, dtype: object

#### ✅ Using format for custom string:

In [None]:
pd.to_datetime(df['order_date'], format='%Y-%m-%d', errors='coerce')

# The format argument must describe the layout of the input strings,
# not the layout you want in the output.

0   2023-08-01
1   2023-08-02
2          NaT
Name: order_date, dtype: datetime64[ns]

In [16]:
df1 = pd.DataFrame({
    'event_time': ['01-07-2024', '05-12-2024']
})

df1

   event_time
0  01-07-2024
1  05-12-2024


In [18]:
pd.to_datetime(df1['event_time'], format='%d-%m-%Y')

0   2024-07-01
1   2024-12-05
Name: event_time, dtype: datetime64[ns]
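
The same strings also parse without error under a month-first format, which is why the format has to come from knowledge of the data source. A small sketch using the `event_time` values shown above:

```python
import pandas as pd

event_time = pd.Series(["01-07-2024", "05-12-2024"])

# Both calls succeed, but they produce different dates.
as_day_first = pd.to_datetime(event_time, format="%d-%m-%Y")
as_month_first = pd.to_datetime(event_time, format="%m-%d-%Y")

print(as_day_first.dt.strftime("%Y-%m-%d").tolist())
print(as_month_first.dt.strftime("%Y-%m-%d").tolist())
```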

#### ✅ Parse from multiple columns:

In [21]:
df = pd.DataFrame({
    'year': [2024, 2024],
    'month': [7, 8],
    'day': [25, 10]
})

df

   year  month  day
0  2024      7   25
1  2024      8   10


In [22]:
pd.to_datetime(df[['year', 'month', 'day']])

0   2024-07-25
1   2024-08-10
dtype: datetime64[ns]

### ⚠️ 5. Common Pitfalls and Best Practices

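| Pitfall              | Description                                                                                                                   | How to Avoid                                                                                                                          |
| -------------------- | ----------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------- |
| **Wrong format**     | A mismatched `format` raises an error, or yields `NaT` under `errors='coerce'`; a day/month swap parses to the wrong date.     | Pass the `format=` that matches the input strings and spot-check a few results.                                                       |
| **Mixed formats**    | Inconsistent formats in one column raise an error or, with `errors='coerce'`, silently become `NaT`.                           | Normalize the strings first, or use `errors='coerce'` and review the rows that became `NaT` (pandas 2.0+ also accepts `format='mixed'`). |
| **Ambiguous dates**  | e.g., `01-02-2023` could be Jan 2 or Feb 1.                                                                                   | Use `dayfirst=True`, an explicit `format=`, or consistent formatting at the source.                                                   |
| **NaT vs NaN**       | Invalid datetime results in `NaT` (Not a Time), not `NaN`.                                                                    | Use `.isna()` or `.notna()` to detect missing dates.                                                                                  |
| **Time zone issues** | Parsed datetimes are timezone-naive unless the input carries an offset or `utc=True` is set.                                  | Localize with `.tz_localize()` or convert with `.tz_convert()` later.                                                                 |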
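
The last two rows can be sketched as follows. The sample values and the choice of `Europe/Paris` as a target zone are assumptions for illustration.

```python
import pandas as pd

# One valid timestamp and one invalid entry; coerce turns the latter into NaT.
s = pd.to_datetime(pd.Series(["2024-01-15 09:00", "bad"]), errors="coerce")

# .isna() detects NaT the same way it detects NaN.
missing = s.isna()

# The parsed values are naive. Declare them as UTC, then convert to another zone.
localized = s.dt.tz_localize("UTC")
converted = localized.dt.tz_convert("Europe/Paris")

print(missing.tolist())
print(converted)
```

`tz_localize` attaches a zone without shifting the clock time, while `tz_convert` shifts an already-aware value into another zone.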


### 🌍 6. Real World Use Cases

| Use Case                      | Description                                                                                        |
| ----------------------------- | -------------------------------------------------------------------------------------------------- |
| **E-commerce orders**         | Convert string order timestamps into datetime to calculate delivery times or filter recent orders. |
| **Financial datasets**        | Convert price history to datetime for time series plots, rolling averages, or volatility checks.   |
| **IoT sensor logs**           | Parse log timestamps to track anomalies or usage patterns.                                         |
| **Marketing campaigns**       | Convert campaign start/end strings to datetime to measure performance over time.                   |
| **Flight or booking systems** | Parse booking and travel times to calculate duration, sort schedules, and more.                    |
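
As one illustration of the e-commerce case, here is a minimal sketch that computes delivery times. The `orders` frame, its column names, and the 3-day threshold are made up for the example.

```python
import pandas as pd

# Hypothetical order data with timestamps stored as strings.
orders = pd.DataFrame({
    "order_id": [1, 2],
    "ordered_at": ["2023-08-01 10:00", "2023-08-02 15:30"],
    "delivered_at": ["2023-08-03 09:00", "2023-08-07 15:30"],
})

# Parse both timestamp columns with the same explicit format.
for col in ["ordered_at", "delivered_at"]:
    orders[col] = pd.to_datetime(orders[col], format="%Y-%m-%d %H:%M")

# Subtracting datetimes yields Timedelta values.
orders["delivery_time"] = orders["delivered_at"] - orders["ordered_at"]

# Flag orders that took longer than three days.
slow = orders[orders["delivery_time"] > pd.Timedelta(days=3)]

print(orders[["order_id", "delivery_time"]])
print(slow["order_id"].tolist())
```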


## ✅ Summary for Section 1: DateTime Conversion and Parsing

| Concept      | Key Takeaway                                                              |
| ------------ | ------------------------------------------------------------------------- |
| Conversion   | Use `pd.to_datetime()` to convert strings or separate fields to datetime. |
| Access       | Enables time-based filtering, indexing, and arithmetic.                   |
| Format       | Explicit format speeds parsing and avoids ambiguity.                      |
| Safe Parsing | Use `errors='coerce'` to handle bad data gracefully.                      |
| Real Use     | Foundation for time series analytics and temporal modeling.               |

