# Time Series Analysis in Pandas

## Overview

**Time Series** = Data indexed by timestamps (stock prices, weather readings, sensor data, etc.)

### Why Time Series in Pandas?

Pandas provides powerful tools for:
- 📅 **Date/time parsing** and manipulation
- 🔢 **Resampling** (change frequency)
- 📊 **Rolling calculations** (moving averages)
- ⏰ **Time-based indexing** and slicing
- 🌍 **Time zone handling**
- 📈 **Trend analysis** and forecasting prep

### Key Concepts

```
TIME SERIES HIERARCHY:

Timestamp     → Single point in time (2024-01-15 14:30:00)
DatetimeIndex → Array of timestamps (index for time series)
Period        → Time span (January 2024, Q1 2024)
Timedelta     → Duration (5 days, 3 hours)
```
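The four types above interact in a few predictable ways. The following is a minimal sketch (the variable names are illustrative, not part of any API) showing how a `Timestamp`, a `Timedelta`, a `Period`, and a `DatetimeIndex` relate to one another:

```python
import pandas as pd

ts = pd.Timestamp("2024-01-15 14:30:00")     # single point in time
delta = pd.Timedelta(days=5, hours=3)        # duration
period = pd.Period("2024-01", freq="M")      # the whole of January 2024

# Timestamp + Timedelta gives another Timestamp
later = ts + delta
print(later)                                 # 2024-01-20 17:30:00

# A Period has a start and an end; a Timestamp falls inside it
print(period.start_time, period.end_time)
print(ts.to_period("M") == period)           # True

# A DatetimeIndex is an array of Timestamps
idx = pd.DatetimeIndex([ts, later])
print(idx[1] - idx[0] == delta)              # True
```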

### Common Operations

| Operation | Purpose | Example |
|-----------|---------|----------|
| **Parsing** | String → DateTime | `pd.to_datetime('2024-01-15')` |
| **Indexing** | Set time as index | `df.set_index('date')` |
| **Slicing** | Select time range | `df.loc['2024-01':'2024-03']` |
| **Resampling** | Change frequency | `df.resample('M').sum()` |
| **Rolling** | Moving windows | `df.rolling(7).mean()` |
| **Shifting** | Lag/lead values | `df.shift(1)` |

### Frequency Codes

```python
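# pandas 2.2 deprecated several spellings; the newer alias is shown in parentheses
'D'    # Day
'W'    # Week
'M'    # Month end ('ME' in pandas 2.2+)
'MS'   # Month start
'Q'    # Quarter end ('QE' in pandas 2.2+)
'Y'    # Year end ('YE' in pandas 2.2+)
'H'    # Hour ('h' in pandas 2.2+)
'T'    # Minute ('min' in pandas 2.2+)
'S'    # Second ('s' in pandas 2.2+)
'B'    # Business day
'BM'   # Business month end ('BME' in pandas 2.2+)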
```

### What We'll Learn
1. ✅ DateTime basics and parsing
2. ✅ DatetimeIndex and time-based indexing
3. ✅ Resampling (up/down sampling)
4. ✅ Rolling windows and moving averages
5. ✅ Time shifts and lags
6. ✅ Date ranges and frequencies
7. ✅ Time zones
8. ✅ Real-world time series analysis

In [None]:
import pandas as pd
import numpy as np
from datetime import datetime, timedelta
import warnings
warnings.filterwarnings('ignore')

# Display settings
pd.set_option('display.max_rows', 20)
pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)
pd.set_option('display.precision', 2)

print("✅ Libraries imported")
print(f"Pandas version: {pd.__version__}")
print(f"Current date/time: {pd.Timestamp.now()}")

## 1. DateTime Basics

### Python datetime vs Pandas Timestamp

| Type | Module | Use Case |
|------|--------|----------|
| `datetime` | Python built-in | General date/time |
| `Timestamp` | Pandas | Single point in time (scalar; subclass of `datetime`) |
| `DatetimeIndex` | Pandas | Time series index |

### Creating DateTime Objects

```python
# Python datetime
from datetime import datetime
dt = datetime(2024, 1, 15, 14, 30)

# Pandas Timestamp
ts = pd.Timestamp('2024-01-15 14:30:00')
ts = pd.Timestamp(2024, 1, 15, 14, 30)

# Parse strings
pd.to_datetime('2024-01-15')
pd.to_datetime('15/01/2024', format='%d/%m/%Y')
```

### Common Date Formats

```python
'2024-01-15'           # ISO format (YYYY-MM-DD)
'15/01/2024'           # DD/MM/YYYY
'01-15-2024'           # MM-DD-YYYY
'2024-01-15 14:30:00'  # With time
'Jan 15, 2024'         # Text month
'15-Jan-2024'          # Short month
```

### Format Codes

| Code | Meaning | Example |
|------|---------|----------|
| `%Y` | 4-digit year | 2024 |
| `%y` | 2-digit year | 24 |
| `%m` | Month (01-12) | 01 |
| `%d` | Day (01-31) | 15 |
| `%H` | Hour (00-23) | 14 |
| `%M` | Minute (00-59) | 30 |
| `%S` | Second (00-59) | 45 |
| `%b` | Short month | Jan |
| `%B` | Full month | January |
| `%a` | Short day | Mon |
| `%A` | Full day | Monday |

In [None]:
print("=== DATETIME BASICS ===\n")

# Example 1: Create Timestamps
print("Example 1: Creating Timestamps\n")
ts1 = pd.Timestamp('2024-01-15')
ts2 = pd.Timestamp(2024, 1, 15, 14, 30)
ts3 = pd.Timestamp.now()

print(f"From string: {ts1}")
print(f"From components: {ts2}")
print(f"Current time: {ts3}")
print()

# Example 2: Parse different formats
print("Example 2: Parse various date formats\n")
dates = [
    '2024-01-15',           # ISO
    '15/01/2024',           # DD/MM/YYYY
    'Jan 15, 2024',         # Text
    '2024-01-15 14:30:00'   # With time
]

for date_str in dates:
    parsed = pd.to_datetime(date_str)
    print(f"{date_str:25} → {parsed}")
print()

# Example 3: Custom format parsing
print("Example 3: Parse custom format\n")
custom_date = '15-Jan-2024'
parsed = pd.to_datetime(custom_date, format='%d-%b-%Y')
print(f"Custom: {custom_date} → {parsed}")
print()

# Example 4: Extract components
print("Example 4: Extract date components\n")
ts = pd.Timestamp('2024-01-15 14:30:45')
print(f"Full timestamp: {ts}")
print(f"Year: {ts.year}")
print(f"Month: {ts.month}")
print(f"Day: {ts.day}")
print(f"Hour: {ts.hour}")
print(f"Minute: {ts.minute}")
print(f"Day of week: {ts.day_name()}")
print(f"Quarter: {ts.quarter}")
print()

# Example 5: Convert DataFrame column
print("Example 5: Convert DataFrame column to datetime\n")
df = pd.DataFrame({
    'date_str': ['2024-01-01', '2024-01-02', '2024-01-03'],
    'value': [100, 110, 105]
})
print("Before conversion:")
print(df.dtypes)

df['date'] = pd.to_datetime(df['date_str'])
print("\nAfter conversion:")
print(df.dtypes)
print("\n", df)
print()

# Example 6: Handle errors
print("Example 6: Handle parsing errors\n")
mixed_dates = ['2024-01-01', 'invalid', '2024-01-03']

# errors='coerce' converts invalid to NaT (Not a Time)
parsed_dates = pd.to_datetime(mixed_dates, errors='coerce')
print("With errors='coerce':")
print(parsed_dates)
print("\nInvalid dates become NaT (Not a Time)")

## 2. DatetimeIndex - Time-Based Indexing

### What is DatetimeIndex?

**DatetimeIndex** = Special index type for time series data

### Benefits
- 🎯 **Powerful slicing**: Select by date ranges
- 🔄 **Resampling**: Change frequency easily
- 📊 **Time-based operations**: Rolling, shifting, etc.
- 🚀 **Performance**: Backed by NumPy `datetime64` arrays, so lookups and slicing are vectorized

### Creating DatetimeIndex

```python
# Method 1: From date range
dates = pd.date_range('2024-01-01', periods=10, freq='D')

# Method 2: From list of strings
dates = pd.to_datetime(['2024-01-01', '2024-01-02', '2024-01-03'])

# Method 3: Set existing column as index
df.set_index('date_column', inplace=True)
```

### Time-Based Slicing

```python
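# Exact date (use .loc; df['2024-01-15'] would look for a column with that name)
df.loc['2024-01-15']

# Partial dates
df.loc['2024']                       # Entire year
df.loc['2024-01']                    # Entire month

# Date ranges (both endpoints are included)
df.loc['2024-01':'2024-03']          # Jan to Mar
df.loc['2024-01-15':'2024-02-15']

# Row slices also work without .loc, but .loc makes the intent explicit
df['2024-01':'2024-03']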
```

### Partial String Indexing

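A date string that is less precise than the index selects every row in that period. On a DataFrame, use `.loc`: pandas 2.x treats plain `df['2024']` as a column lookup and raises `KeyError`.

```python
df.loc['2024']        # All data from 2024
df.loc['2024-01']     # All data from January 2024
df.loc['2024Q1']      # Q1 2024 (quarter strings are parsed too)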
```
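A short sketch of string-based selection through `.loc`, which behaves the same way across pandas versions. The frame `demo` and its `value` column are hypothetical; note that string slices include both endpoints:

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2024-01-01", periods=90, freq="D")
demo = pd.DataFrame({"value": np.arange(90)}, index=idx)

january = demo.loc["2024-01"]                    # all 31 January rows
window = demo.loc["2024-01-15":"2024-02-15"]     # both endpoints included
one_day = demo.loc["2024-01-15"]                 # exact match: a single row (Series)

print(len(january), len(window), one_day["value"])   # 31 32 14
```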

### Properties

```python
df.index.year         # Extract years
df.index.month        # Extract months
df.index.day          # Extract days
df.index.dayofweek    # Day of week (0=Mon)
df.index.quarter      # Quarter
df.index.is_month_start  # Boolean
df.index.is_month_end    # Boolean
```

In [None]:
print("=== DATETIMEINDEX EXAMPLES ===\n")

# Create sample time series data
np.random.seed(42)
dates = pd.date_range('2024-01-01', periods=90, freq='D')
ts_data = pd.DataFrame({
    'sales': np.random.randint(100, 200, 90) + np.random.randn(90) * 10,
    'customers': np.random.randint(50, 100, 90),
    'temperature': 20 + np.random.randn(90) * 5
}, index=dates)

# Example 1: Basic DatetimeIndex
print("Example 1: Time series with DatetimeIndex\n")
print(ts_data.head(10))
print(f"\nIndex type: {type(ts_data.index)}")
print(f"Frequency: {ts_data.index.freq}")
print()

# Example 2: Partial string indexing
print("="*70)
print("Example 2: Partial string indexing (most powerful feature!)\n")

print("All data from January 2024:")
print(ts_data.loc['2024-01'].head())

print("\nAll data from February:")
print(ts_data.loc['2024-02'].head())
print()

# Example 3: Date range slicing
print("="*70)
print("Example 3: Date range slicing\n")
jan_to_feb = ts_data['2024-01-15':'2024-02-15']
print(f"Data from Jan 15 to Feb 15 ({len(jan_to_feb)} days):")
print(jan_to_feb.head())
print()

# Example 4: Extract specific date
print("="*70)
print("Example 4: Extract specific date\n")
single_day = ts_data.loc['2024-01-15']
print("Data for January 15, 2024:")
print(single_day)
print()

# Example 5: DatetimeIndex properties
print("="*70)
print("Example 5: Extract date components\n")
ts_data['year'] = ts_data.index.year
ts_data['month'] = ts_data.index.month
ts_data['day'] = ts_data.index.day
ts_data['day_name'] = ts_data.index.day_name()
ts_data['is_weekend'] = ts_data.index.dayofweek >= 5

print(ts_data[['sales', 'month', 'day', 'day_name', 'is_weekend']].head(10))
print()

# Example 6: Filter by day of week
print("="*70)
print("Example 6: Filter weekends vs weekdays\n")
weekends = ts_data[ts_data['is_weekend']]
weekdays = ts_data[~ts_data['is_weekend']]

print(f"Average sales on weekends: {weekends['sales'].mean():.2f}")
print(f"Average sales on weekdays: {weekdays['sales'].mean():.2f}")
print()

# Example 7: Filter by month
print("="*70)
print("Example 7: Compare months\n")
monthly_avg = ts_data.groupby(ts_data.index.month)['sales'].mean()
print("Average sales by month:")
print(monthly_avg)

# Clean up temporary columns
ts_data = ts_data[['sales', 'customers', 'temperature']]

## 3. Date Ranges and Frequencies

### pd.date_range()

Create sequences of dates with specific frequencies.

### Syntax

```python
# Supply exactly three of start, end, periods, and freq
pd.date_range(
    start='2024-01-01',    # Start date
    end='2024-12-31',      # End date
    # periods=100,         # Number of periods (alternative to start or end)
    freq='D'               # Frequency
)
```
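`pd.date_range` expects exactly three of `start`, `end`, `periods`, and `freq`. The sketch below (dates chosen arbitrarily) shows three equivalent ways to build the same ten-day range, the evenly spaced variant without `freq`, and the error raised when all four are given:

```python
import pandas as pd

a = pd.date_range(start="2024-01-01", end="2024-01-10", freq="D")
b = pd.date_range(start="2024-01-01", periods=10, freq="D")
c = pd.date_range(end="2024-01-10", periods=10, freq="D")
print(a.equals(b) and b.equals(c))     # True

# start + end + periods (no freq) gives evenly spaced points
d = pd.date_range(start="2024-01-01", end="2024-01-10", periods=4)
print(d.day.tolist())                  # [1, 4, 7, 10]

# Supplying all four arguments is rejected
rejected = False
try:
    pd.date_range(start="2024-01-01", end="2024-01-10", periods=10, freq="D")
except ValueError as exc:
    rejected = True
    print("ValueError:", exc)
```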

### Frequency Codes

| Code | Description | Example |
|------|-------------|----------|
| **D** | Calendar day | 2024-01-01, 2024-01-02, ... |
| **B** | Business day | Skip weekends |
| **W** | Weekly | Every Sunday |
| **W-MON** | Weekly (Monday) | Every Monday |
| **M** / **ME** | Month end | 2024-01-31, 2024-02-29, ... |
| **MS** | Month start | 2024-01-01, 2024-02-01, ... |
| **Q** / **QE** | Quarter end | 2024-03-31, 2024-06-30, ... |
| **QS** | Quarter start | 2024-01-01, 2024-04-01, ... |
| **Y** / **YE** | Year end | 2024-12-31 |
| **YS** | Year start | 2024-01-01 |
| **H** / **h** | Hourly | Every hour |
| **T** / **min** | Minutely | Every minute |
| **S** / **s** | Secondly | Every second |

Where two spellings are listed, pandas 2.2 deprecated the first in favor of the second. `ME`, `QE`, and `YE` are not recognized by versions older than 2.2.

### Multiples

```python
'2D'   # Every 2 days
'3W'   # Every 3 weeks
'6M'   # Every 6 months ('6ME' in pandas 2.2+)
'4H'   # Every 4 hours ('4h' in pandas 2.2+)
```

### Business Day Frequencies

```python
'B'    # Business day (Mon-Fri)
'BM'   # Business month end ('BME' in pandas 2.2+)
'BMS'  # Business month start
'BQ'   # Business quarter end ('BQE' in pandas 2.2+)
'BY'   # Business year end ('BYE' in pandas 2.2+)
```

### Common Patterns

```python
# Daily for a year
pd.date_range('2024-01-01', '2024-12-31', freq='D')

# 100 business days
pd.date_range('2024-01-01', periods=100, freq='B')

# Monthly dates
pd.date_range('2024-01-01', periods=12, freq='MS')

# Hourly for a day
pd.date_range('2024-01-01', periods=24, freq='h')
```

In [None]:
print("=== DATE RANGE EXAMPLES ===\n")

# Example 1: Daily dates
print("Example 1: Daily dates for January 2024\n")
daily = pd.date_range('2024-01-01', '2024-01-10', freq='D')
print(daily)
print()

# Example 2: Business days (skip weekends)
print("Example 2: Business days (Mon-Fri only)\n")
business_days = pd.date_range('2024-01-01', periods=10, freq='B')
print(business_days)
print("\nNotice: Skips weekends!")
print()

# Example 3: Weekly dates
print("="*70)
print("Example 3: Weekly dates (every Monday)\n")
weekly = pd.date_range('2024-01-01', periods=8, freq='W-MON')
print(weekly)
print()

# Example 4: Monthly dates
print("="*70)
print("Example 4: Month start vs month end\n")
month_start = pd.date_range('2024-01-01', periods=6, freq='MS')
month_end = pd.date_range('2024-01-01', periods=6, freq='M')

print("Month Start:")
print(month_start)
print("\nMonth End:")
print(month_end)
print()

# Example 5: Quarterly dates
print("="*70)
print("Example 5: Quarterly dates (quarter start)\n")
quarterly = pd.date_range('2024-01-01', periods=8, freq='QS')
print(quarterly)
print()

# Example 6: Custom frequency (every 2 days)
print("="*70)
print("Example 6: Every 2 days\n")
every_2_days = pd.date_range('2024-01-01', periods=10, freq='2D')
print(every_2_days)
print()

# Example 7: Hourly timestamps
print("="*70)
print("Example 7: Hourly timestamps for a day\n")
hourly = pd.date_range('2024-01-01', periods=24, freq='h')
print(hourly[:12])  # Show first 12 hours
print()

# Example 8: Create time series with date range
print("="*70)
print("Example 8: Create time series DataFrame\n")
dates = pd.date_range('2024-01-01', periods=5, freq='D')
df = pd.DataFrame({
    'sales': [100, 110, 105, 120, 115],
    'costs': [60, 65, 62, 70, 68]
}, index=dates)
print(df)
print()

# Example 9: Business month end
print("="*70)
print("Example 9: Business month end (last business day of month)\n")
bm = pd.date_range('2024-01-01', periods=6, freq='BM')
print(bm)
print("\nUseful for monthly reports!")

## 4. Resampling - Change Frequency

### What is Resampling?

**Resampling** = Convert time series from one frequency to another

### Two Types

```
DOWNSAMPLING (Aggregation):
High freq → Low freq (e.g., Daily → Monthly)
Need to aggregate (sum, mean, etc.)

Daily:  | | | | | | | | ... (30 values)
           ↓ resample('M').sum()
Monthly:  |-------------|      (1 value)

UPSAMPLING (Interpolation):
Low freq → High freq (e.g., Monthly → Daily)
Need to fill gaps (ffill, bfill, interpolate)

Monthly:  |-------------|      (1 value)
           ↓ resample('D').ffill()
Daily:  | | | | | | | | ... (30 values)
```

### Syntax

```python
# Downsampling (aggregation)
df.resample('M').sum()      # Monthly sum
df.resample('W').mean()     # Weekly average
df.resample('Q').agg(['sum', 'mean'])  # Multiple

# Upsampling (interpolation)
df.resample('D').ffill()    # Forward fill
df.resample('D').bfill()    # Backward fill
df.resample('D').interpolate()  # Linear interpolation
```

### Common Aggregations

| Method | Purpose | Example |
|--------|---------|----------|
| `.sum()` | Total | Total sales per month |
| `.mean()` | Average | Average temperature per week |
| `.count()` | Number of values | Transactions per day |
| `.min()` | Minimum | Lowest price per quarter |
| `.max()` | Maximum | Peak usage per hour |
| `.std()` | Std deviation | Volatility per month |
| `.first()` | First value | Opening price |
| `.last()` | Last value | Closing price |

### Fill Methods (Upsampling)

```python
# Forward fill (repeat last known value)
df.resample('D').ffill()

# Backward fill (use next known value)
df.resample('D').bfill()

# Linear interpolation
df.resample('D').interpolate()

# Fill with specific value
df.resample('D').asfreq(fill_value=0)
```
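To see how the four fill strategies differ, it can help to put them side by side. This is a small sketch on made-up monthly values; `filled` is just a comparison frame built for the illustration:

```python
import pandas as pd

monthly = pd.Series(
    [1000.0, 1100.0],
    index=pd.date_range("2024-01-01", periods=2, freq="MS"),
)

# Upsample Jan 1 .. Feb 1 to daily and compare the strategies column by column
filled = pd.DataFrame({
    "ffill": monthly.resample("D").ffill(),                  # repeat last known value
    "bfill": monthly.resample("D").bfill(),                  # use next known value
    "interpolate": monthly.resample("D").interpolate(),      # straight line between points
    "asfreq_0": monthly.resample("D").asfreq(fill_value=0),  # new rows get 0
})

# First two and last two days of the upsampled range
print(filled.iloc[[0, 1, 30, 31]])
```

Note that `fill_value` in `asfreq` only applies to rows created by the upsampling; values that were already present are kept.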

### Common Patterns

```python
# Daily → Monthly
df.resample('M').sum()           # Total per month

# Hourly → Daily
df.resample('D').mean()          # Daily average

# Minute → Hour
df.resample('h').agg(['min', 'max', 'mean'])

# Monthly → Daily (upsample)
df.resample('D').ffill()         # Fill forward
```

In [None]:
print("=== RESAMPLING EXAMPLES ===\n")

# Create high-frequency data
np.random.seed(42)
dates = pd.date_range('2024-01-01', periods=90, freq='D')
daily_data = pd.DataFrame({
    'sales': np.random.randint(100, 200, 90) + np.random.randn(90) * 10,
    'orders': np.random.randint(20, 50, 90)
}, index=dates)

# Example 1: Daily to Weekly (downsample)
print("Example 1: Daily → Weekly (sum)\n")
print("Original daily data (first 14 days):")
print(daily_data.head(14))

weekly = daily_data.resample('W').sum()
print("\nWeekly totals:")
print(weekly.head())
print()

# Example 2: Daily to Monthly (downsample)
print("="*70)
print("Example 2: Daily → Monthly\n")
monthly = daily_data.resample('M').agg({
    'sales': 'sum',       # Total sales
    'orders': 'mean'      # Average orders per day
})
print("Monthly summary:")
print(monthly)
print()

# Example 3: Multiple aggregations
print("="*70)
print("Example 3: Multiple aggregations per column\n")
monthly_detailed = daily_data.resample('M').agg({
    'sales': ['sum', 'mean', 'min', 'max'],
    'orders': ['sum', 'mean']
})
print(monthly_detailed.round(2))
print()

# Example 4: Upsampling with forward fill
print("="*70)
print("Example 4: Upsampling - Monthly → Daily\n")
monthly_simple = pd.DataFrame({
    'target': [1000, 1100, 1200]
}, index=pd.date_range('2024-01-01', periods=3, freq='MS'))

print("Monthly data:")
print(monthly_simple)

daily_upsampled = monthly_simple.resample('D').ffill()
print("\nUpsampled to daily (forward fill):")
print(daily_upsampled.head(10))
print(f"\nShape changed from {monthly_simple.shape} to {daily_upsampled.shape}")
print()

# Example 5: Interpolation
print("="*70)
print("Example 5: Upsampling with interpolation\n")
daily_interpolated = monthly_simple.resample('D').interpolate()
print("Interpolated values (smooth transition):")
print(daily_interpolated.head(10))
print()

# Example 6: Business month end
print("="*70)
print("Example 6: Resample to business month end\n")
business_monthly = daily_data.resample('BM').sum()  # Business month end
print("Business month end totals:")
print(business_monthly)
print()

# Example 7: Quarter aggregation
print("="*70)
print("Example 7: Quarterly summary\n")
quarterly = daily_data.resample('Q').agg({
    'sales': ['sum', 'mean'],
    'orders': 'sum'
})
print(quarterly.round(2))
print()

# Example 8: OHLC (Open-High-Low-Close)
print("="*70)
print("Example 8: OHLC summary (stock-like)\n")
# resample().ohlc() returns open (first), high (max), low (min), close (last)
weekly_ohlc = daily_data['sales'].resample('W').ohlc()
print(weekly_ohlc.head().round(2))
print("\nUseful for financial time series!")

## 5. Rolling Windows - Moving Calculations

### What are Rolling Windows?

**Rolling** = Apply function to sliding window of data

Also called: **Moving average**, **Sliding window**, **Rolling calculation**

### Visual Example

```
Data: [10, 20, 30, 40, 50, 60]

3-day rolling mean:
Window 1: [10, 20, 30] → 20
Window 2:     [20, 30, 40] → 30
Window 3:         [30, 40, 50] → 40
Window 4:             [40, 50, 60] → 50

Result: [NaN, NaN, 20, 30, 40, 50]
```

### Syntax

```python
# Basic rolling
df.rolling(window=7).mean()     # 7-period moving average
df.rolling(window=7).sum()      # 7-period rolling sum

# With min_periods (handle start)
df.rolling(window=7, min_periods=1).mean()

# Center the window
df.rolling(window=7, center=True).mean()
```
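Applying these three variants to the six-value series from the visual example makes the effect of `min_periods` and `center` concrete. A minimal sketch:

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50, 60])

plain = s.rolling(3).mean()                     # needs 3 values, so 2 leading NaN
partial = s.rolling(3, min_periods=1).mean()    # averages whatever is available
centered = s.rolling(3, center=True).mean()     # label sits in the middle of the window

print(plain.tolist())      # [nan, nan, 20.0, 30.0, 40.0, 50.0]
print(partial.tolist())    # [10.0, 15.0, 20.0, 30.0, 40.0, 50.0]
print(centered.tolist())   # [nan, 20.0, 30.0, 40.0, 50.0, nan]
```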

### Common Rolling Functions

| Function | Purpose | Use Case |
|----------|---------|----------|
| `.mean()` | Moving average | Smooth trends |
| `.sum()` | Rolling total | Trailing totals (e.g., last 7 days) |
| `.std()` | Rolling std dev | Volatility |
| `.min()` | Rolling minimum | Support levels |
| `.max()` | Rolling maximum | Resistance levels |
| `.median()` | Rolling median | Robust average |
| `.count()` | Non-NaN count | Data quality |
| `.apply()` | Custom function | Any calculation |

### Parameters

```python
window=7           # Window size (required)
min_periods=1      # Min observations needed
center=False       # Center the window labels
win_type=None      # Window type (e.g., 'gaussian')
```

### Common Patterns

```python
# 7-day moving average (smooth short-term)
df['MA_7'] = df['price'].rolling(7).mean()

# 30-day moving average (smooth long-term)
df['MA_30'] = df['price'].rolling(30).mean()

# Rolling volatility (30-day)
df['volatility'] = df['returns'].rolling(30).std()

# Bollinger Bands
df['MA_20'] = df['price'].rolling(20).mean()
df['std_20'] = df['price'].rolling(20).std()
df['upper'] = df['MA_20'] + 2 * df['std_20']
df['lower'] = df['MA_20'] - 2 * df['std_20']
```

### Types of Windows

**Standard Rolling**: Equal weight to all values
```python
df.rolling(7).mean()
```

**Exponential Moving Average**: Recent values weighted more
```python
df.ewm(span=7).mean()  # More responsive
```

**Centered Window**: Label at center
```python
df.rolling(7, center=True).mean()
```

In [None]:
print("=== ROLLING WINDOW EXAMPLES ===\n")

# Create sample data with trend and noise
np.random.seed(42)
dates = pd.date_range('2024-01-01', periods=60, freq='D')
trend = np.linspace(100, 150, 60)
noise = np.random.randn(60) * 10
price = trend + noise

df = pd.DataFrame({'price': price}, index=dates)

# Example 1: Simple moving average
print("Example 1: 7-day moving average\n")
df['MA_7'] = df['price'].rolling(7).mean()
print(df[['price', 'MA_7']].head(10))
print("\nNotice: First 6 values are NaN (not enough data)")
print()

# Example 2: Multiple moving averages
print("="*70)
print("Example 2: Multiple moving averages (short and long term)\n")
df['MA_7'] = df['price'].rolling(7).mean()
df['MA_30'] = df['price'].rolling(30).mean()
print(df[['price', 'MA_7', 'MA_30']].tail(10).round(2))
print("\nMA_7 reacts faster to changes, MA_30 is smoother")
print()

# Example 3: min_periods parameter
print("="*70)
print("Example 3: Use min_periods to handle early values\n")
df['MA_7_minp'] = df['price'].rolling(7, min_periods=1).mean()
print(df[['price', 'MA_7', 'MA_7_minp']].head(10).round(2))
print("\nWith min_periods=1, no NaN at start")
print()

# Example 4: Rolling statistics
print("="*70)
print("Example 4: Multiple rolling statistics\n")
df['rolling_mean'] = df['price'].rolling(7).mean()
df['rolling_std'] = df['price'].rolling(7).std()
df['rolling_min'] = df['price'].rolling(7).min()
df['rolling_max'] = df['price'].rolling(7).max()

print(df[['price', 'rolling_mean', 'rolling_std', 'rolling_min', 'rolling_max']].tail(10).round(2))
print()

# Example 5: Bollinger Bands
print("="*70)
print("Example 5: Bollinger Bands (trading indicator)\n")
window = 20
df['MA'] = df['price'].rolling(window).mean()
df['STD'] = df['price'].rolling(window).std()
df['Upper_Band'] = df['MA'] + 2 * df['STD']
df['Lower_Band'] = df['MA'] - 2 * df['STD']

print(df[['price', 'MA', 'Upper_Band', 'Lower_Band']].tail(10).round(2))
print("\nPrice between bands = normal, outside = potential signal")
print()

# Example 6: Exponential moving average
print("="*70)
print("Example 6: Exponential moving average (EMA)\n")
df['EMA_7'] = df['price'].ewm(span=7).mean()
df['MA_7_regular'] = df['price'].rolling(7).mean()
print(df[['price', 'MA_7_regular', 'EMA_7']].tail(10).round(2))
print("\nEMA gives more weight to recent values")
print()

# Example 7: Centered window
print("="*70)
print("Example 7: Centered window (useful for retrospective analysis)\n")
df['MA_centered'] = df['price'].rolling(7, center=True).mean()
df['MA_normal'] = df['price'].rolling(7).mean()
print(df[['price', 'MA_normal', 'MA_centered']].iloc[3:10].round(2))
print()

# Example 8: Custom rolling function
print("="*70)
print("Example 8: Custom rolling function (price range)\n")
def price_range(x):
    return x.max() - x.min()

df['7day_range'] = df['price'].rolling(7).apply(price_range)
print(df[['price', '7day_range']].tail(10).round(2))
print("\nShows volatility: higher range = more volatile")

# Clean up for next examples
df = df[['price']]

## 6. Time Shifts and Lags

### What is Shifting?

**Shifting** = Move data forward or backward in time

### Visual Example

```
Original:  [10, 20, 30, 40, 50]

shift(1):  [NaN, 10, 20, 30, 40]  ← Lag (previous value)
shift(-1): [20, 30, 40, 50, NaN]  ← Lead (next value)
```

### Syntax

```python
# Shift values (lag/lead)
df.shift(1)       # Lag by 1 period
df.shift(-1)      # Lead by 1 period
df.shift(7)       # Lag by 7 periods

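# Shift index (time); values stay attached to their rows
df.shift(1, freq='D')    # Move every date forward 1 day
df.shift(1, freq='M')    # Roll every date forward to a month end (not "+1 month")
df.shift(1, freq=pd.DateOffset(months=1))  # Move every date forward 1 calendar month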
```

### Common Use Cases

**1. Calculate Changes**
```python
# Absolute change
df['change'] = df['price'] - df['price'].shift(1)

# Percentage change
df['pct_change'] = df['price'].pct_change()
# Equivalent to:
# (df['price'] - df['price'].shift(1)) / df['price'].shift(1)
```

**2. Compare with Previous Period**
```python
df['yesterday'] = df['sales'].shift(1)
df['last_week'] = df['sales'].shift(7)
df['last_month'] = df['sales'].shift(30)
```

**3. Calculate Returns**
```python
# Daily returns
df['returns'] = df['price'].pct_change()

# Log returns
df['log_returns'] = np.log(df['price'] / df['price'].shift(1))
```

**4. Create Lagged Features (ML)**
```python
df['lag_1'] = df['value'].shift(1)
df['lag_2'] = df['value'].shift(2)
df['lag_7'] = df['value'].shift(7)
```

### diff() vs pct_change() vs shift()

```python
# Original
values = [100, 110, 105, 120]

# diff() - absolute difference
# [NaN, 10, -5, 15]

# pct_change() - fractional change (multiply by 100 for percent)
# [NaN, 0.10, -0.045, 0.143]

# shift(1) - previous value
# [NaN, 100, 110, 105]
```

### Time-Based Shifts

```python
# Shift index by time period
df.shift(1, freq='D')    # Shift dates forward 1 day
df.shift(-1, freq='M')   # Roll dates back to the previous month end

# tshift() was deprecated in pandas 1.1 and removed in 2.0; use shift with freq
df.shift(freq='D')       # Same as shift(1, freq='D')
```
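The difference between shifting values and shifting the index is easy to miss. The sketch below uses a three-row series with arbitrary values; it passes offset objects (`pd.offsets.MonthEnd`, `pd.DateOffset`) instead of string aliases so that it does not depend on which alias spelling the installed pandas version accepts:

```python
import pandas as pd

s = pd.Series([10, 20, 30], index=pd.date_range("2024-01-14", periods=3, freq="D"))

lagged = s.shift(1)                                     # values move, index stays
moved = s.shift(1, freq="D")                            # index moves, values stay
month_end = s.shift(1, freq=pd.offsets.MonthEnd())      # every date rolls to 2024-01-31
next_month = s.shift(1, freq=pd.DateOffset(months=1))   # same day, next calendar month

print(lagged.tolist())                                  # [nan, 10.0, 20.0]
print(moved.index.strftime("%Y-%m-%d").tolist())        # ['2024-01-15', '2024-01-16', '2024-01-17']
print(month_end.index.strftime("%Y-%m-%d").tolist())    # ['2024-01-31', '2024-01-31', '2024-01-31']
print(next_month.index.strftime("%Y-%m-%d").tolist())   # ['2024-02-14', '2024-02-15', '2024-02-16']
```

Shifting with `freq` never introduces `NaN`, because no value leaves its row; only the labels change.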

In [None]:
print("=== TIME SHIFT EXAMPLES ===\n")

# Create sample data
dates = pd.date_range('2024-01-01', periods=10, freq='D')
df = pd.DataFrame({
    'price': [100, 105, 103, 110, 108, 115, 112, 120, 118, 125]
}, index=dates)

# Example 1: Basic shift (lag)
print("Example 1: Lag by 1 period (previous value)\n")
df['prev_price'] = df['price'].shift(1)
print(df[['price', 'prev_price']])
print()

# Example 2: Calculate daily change
print("="*70)
print("Example 2: Calculate daily change\n")
df['change'] = df['price'] - df['price'].shift(1)
print(df[['price', 'prev_price', 'change']])
print()

# Example 3: Percentage change
print("="*70)
print("Example 3: Percentage change (daily return)\n")
df['pct_change'] = df['price'].pct_change() * 100
print(df[['price', 'change', 'pct_change']].round(2))
print()

# Example 4: Multiple lags
print("="*70)
print("Example 4: Multiple lag periods\n")
df['lag_1'] = df['price'].shift(1)
df['lag_2'] = df['price'].shift(2)
df['lag_3'] = df['price'].shift(3)
print(df[['price', 'lag_1', 'lag_2', 'lag_3']])
print()

# Example 5: Lead (negative shift)
print("="*70)
print("Example 5: Lead - next day's value\n")
df['next_day'] = df['price'].shift(-1)
df['future_change'] = df['next_day'] - df['price']
print(df[['price', 'next_day', 'future_change']])
print()

# Example 6: diff() method
print("="*70)
print("Example 6: diff() - shortcut for change calculation\n")
df['diff_1'] = df['price'].diff()     # Same as price - price.shift(1)
df['diff_2'] = df['price'].diff(2)    # 2-period difference
print(df[['price', 'diff_1', 'diff_2']])
print()

# Example 7: Cumulative calculations
print("="*70)
print("Example 7: Cumulative sum of changes\n")
df['daily_change'] = df['price'].diff()
df['cumulative_gain'] = df['daily_change'].cumsum()
print(df[['price', 'daily_change', 'cumulative_gain']].round(2))
print()

# Example 8: Compare with last week
print("="*70)
print("Example 8: Week-over-week comparison\n")

# Create weekly data
weekly_dates = pd.date_range('2024-01-01', periods=8, freq='W')
weekly_df = pd.DataFrame({
    'sales': [1000, 1100, 1050, 1200, 1150, 1300, 1250, 1400]
}, index=weekly_dates)

weekly_df['last_week'] = weekly_df['sales'].shift(1)
weekly_df['wow_change'] = weekly_df['sales'] - weekly_df['last_week']
weekly_df['wow_pct'] = weekly_df['sales'].pct_change() * 100

print(weekly_df.round(2))
print()

# Example 9: ML feature creation
print("="*70)
print("Example 9: Create lagged features for ML\n")
features = pd.DataFrame({
    'value': [10, 15, 12, 18, 20, 17, 22, 25]
})

# Create multiple lag features
for i in range(1, 4):
    features[f'lag_{i}'] = features['value'].shift(i)

print(features)
print("\nThese lagged features can be used as ML inputs")

# Clean up
df = df[['price']]

## 7. Time Zones

### Time Zone Basics

- **Timezone-naive**: No timezone info (default)
- **Timezone-aware**: Has timezone info

### Common Operations

```python
# Localize (add timezone to naive)
df.tz_localize('US/Eastern')

# Convert (change timezone of aware)
df.tz_convert('Europe/London')

# Remove timezone
df.tz_localize(None)
```

### Common Time Zones

```python
'UTC'             # Coordinated Universal Time
'US/Eastern'      # EST/EDT (New York)
'US/Pacific'      # PST/PDT (Los Angeles)
'Europe/London'   # GMT/BST
'Asia/Tokyo'      # JST
'Asia/Kolkata'    # IST (India)
'Australia/Sydney' # AEST/AEDT
```

### Workflow

```
1. Parse dates → timezone-naive
2. tz_localize() → Add timezone (make aware)
3. tz_convert() → Convert to different timezone
4. Analyze in target timezone
```

### Best Practices

- 🌍 Store data in **UTC**
- 🔄 Convert to local for **display**
- ⚠️ Can't mix **tz-naive** and **tz-aware** values in arithmetic; aware values in different zones are compared by their UTC instant
- 📅 Be aware of **DST** (Daylight Saving Time)
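Two of these points can be checked directly. The sketch below (timestamps chosen for illustration) shows that aware timestamps in different zones compare by their UTC instant, that mixing naive and aware values raises `TypeError`, and that a DST transition makes a calendar day shorter than 24 hours:

```python
import pandas as pd

naive = pd.Timestamp("2024-01-15 09:00")
ny = pd.Timestamp("2024-01-15 09:00", tz="US/Eastern")
london = pd.Timestamp("2024-01-15 14:00", tz="Europe/London")

# Aware timestamps in different zones refer to the same UTC instant here
print(ny == london)        # True
print(london - ny)         # 0 days 00:00:00

# Mixing naive and aware values is rejected
mixed_rejected = False
try:
    naive - ny
except TypeError as exc:
    mixed_rejected = True
    print("TypeError:", exc)

# DST: US clocks jump forward on 2024-03-10, so that calendar day has 23 hours
start = pd.Timestamp("2024-03-10 00:00", tz="US/Eastern")
end = pd.Timestamp("2024-03-11 00:00", tz="US/Eastern")
print(end - start)         # 0 days 23:00:00
```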

In [None]:
print("=== TIME ZONE EXAMPLES ===\n")

# Example 1: Create timezone-naive vs aware
print("Example 1: Timezone-naive vs timezone-aware\n")
naive = pd.Timestamp('2024-01-15 14:30:00')
aware = pd.Timestamp('2024-01-15 14:30:00', tz='US/Eastern')

print(f"Naive: {naive}")
print(f"  Timezone: {naive.tz}")
print(f"\nAware: {aware}")
print(f"  Timezone: {aware.tz}")
print()

# Example 2: Localize (add timezone)
print("="*70)
print("Example 2: Add timezone to naive timestamps\n")
dates = pd.date_range('2024-01-01', periods=5, freq='h')
df = pd.DataFrame({'value': range(5)}, index=dates)

print("Original (naive):")
print(df.index)

# Localize to US Eastern
df_eastern = df.copy()
df_eastern.index = df_eastern.index.tz_localize('US/Eastern')
print("\nLocalized to US/Eastern:")
print(df_eastern.index)
print()

# Example 3: Convert between timezones
print("="*70)
print("Example 3: Convert between timezones\n")

# Start with US Eastern
eastern_time = pd.Timestamp('2024-01-15 14:30:00', tz='US/Eastern')
print(f"US/Eastern:     {eastern_time}")

# Convert to other timezones
utc_time = eastern_time.tz_convert('UTC')
london_time = eastern_time.tz_convert('Europe/London')
tokyo_time = eastern_time.tz_convert('Asia/Tokyo')
india_time = eastern_time.tz_convert('Asia/Kolkata')

print(f"UTC:            {utc_time}")
print(f"London:         {london_time}")
print(f"Tokyo:          {tokyo_time}")
print(f"India:          {india_time}")
print()

# Example 4: DataFrame timezone conversion
print("="*70)
print("Example 4: Convert DataFrame timezone\n")

# Create data in UTC
utc_dates = pd.date_range('2024-01-01', periods=5, freq='h', tz='UTC')
df_utc = pd.DataFrame({'value': [10, 15, 12, 18, 20]}, index=utc_dates)

print("Original (UTC):")
print(df_utc)

# Convert to Eastern
df_eastern = df_utc.tz_convert('US/Eastern')
print("\nConverted to US/Eastern:")
print(df_eastern)
print()

# Example 5: Remove timezone
print("="*70)
print("Example 5: Remove timezone (make naive)\n")
print("Timezone-aware:")
print(df_eastern.index)

df_naive = df_eastern.tz_localize(None)
print("\nMade naive (timezone removed):")
print(df_naive.index)
print()

# Example 6: Working with UTC (best practice)
print("="*70)
print("Example 6: Store in UTC, display in local\n")

# Parse dates from different sources (all to UTC)
ny_time = pd.Timestamp('2024-01-15 09:00:00', tz='US/Eastern')
london_time = pd.Timestamp('2024-01-15 14:00:00', tz='Europe/London')

# Convert both to UTC for storage
ny_utc = ny_time.tz_convert('UTC')
london_utc = london_time.tz_convert('UTC')

print("Stored in UTC:")
print(f"  NY event:     {ny_utc}")
print(f"  London event: {london_utc}")

# Convert back for display
print("\nDisplay in Tokyo time:")
print(f"  NY event:     {ny_utc.tz_convert('Asia/Tokyo')}")
print(f"  London event: {london_utc.tz_convert('Asia/Tokyo')}")
print()

print("💡 Best Practice: Store in UTC, convert for display")

## 8. Real-World Time Series Analysis

### Common Analysis Patterns

**1. Sales Analysis**
```python
# Daily to monthly
monthly_sales = df.resample('M').sum()

# Moving average for trend
df['trend'] = df['sales'].rolling(30).mean()

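# Year-over-year growth (assumes one row per day with no gaps;
# 365 rows back is one day off when the span crosses Feb 29)
df['yoy_growth'] = df['sales'].pct_change(365) * 100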
```
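`pct_change(365)` counts rows, not calendar years, so it drifts by a day whenever the span includes 29 February. One possible alternative, sketched here on synthetic data (the series `sales` and the helper names are hypothetical), is to move last year's values forward by one calendar year with `shift(freq=...)` and let pandas align on the index:

```python
import numpy as np
import pandas as pd

idx = pd.date_range("2023-01-01", "2024-03-31", freq="D")
sales = pd.Series(np.linspace(1000, 1500, len(idx)), index=idx)   # synthetic data

# Row-based: 365 rows before 2024-03-31 is 2023-04-01, because 2024 contains Feb 29
row_based = sales.pct_change(365) * 100
print(idx[-1] - pd.Timedelta(days=365))      # 2023-04-01 00:00:00

# Calendar-based: relabel the 2023 values one calendar year later, then align on the index
prev_year = sales.loc["2023"].shift(freq=pd.DateOffset(years=1))
calendar_based = (sales / prev_year - 1) * 100

last_day = pd.Timestamp("2024-03-31")
print(round(row_based.loc[last_day], 2), round(calendar_based.loc[last_day], 2))
```

Dates with no counterpart one year earlier, such as 2024-02-29, come out as `NaN` in `calendar_based`.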

**2. Stock Price Analysis**
```python
# Daily returns
df['returns'] = df['close'].pct_change()

# Moving averages
df['SMA_50'] = df['close'].rolling(50).mean()
df['SMA_200'] = df['close'].rolling(200).mean()

# Volatility
df['volatility'] = df['returns'].rolling(30).std()
```

**3. Sensor Data**
```python
# Resample high-frequency to minutes
df_min = df.resample('min').mean()

# Smooth noise
df['smoothed'] = df['reading'].rolling(10).mean()

# Detect anomalies
df['mean'] = df['reading'].rolling(100).mean()
df['std'] = df['reading'].rolling(100).std()
df['anomaly'] = abs(df['reading'] - df['mean']) > 3 * df['std']
```
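
To make the anomaly rule above concrete, here is a minimal sketch on synthetic data. The readings, the injected spike, and the name `spike_at` are assumptions made up for the illustration; only the rolling mean/std rule comes from the pattern above.

```python
import numpy as np
import pandas as pd

# Synthetic sensor readings: 500 one-minute samples around 20.0 (assumed data)
rng = np.random.default_rng(0)
idx = pd.date_range('2024-01-01', periods=500, freq='min')
reading = pd.Series(20 + rng.normal(0, 1, len(idx)), index=idx, name='reading')

# Inject one obvious spike so there is something to detect (hypothetical position)
spike_at = idx[300]
reading.loc[spike_at] += 50

# Rolling baseline over the last 100 samples, including the current one
roll_mean = reading.rolling(100).mean()
roll_std = reading.rolling(100).std()

# Flag readings more than 3 rolling standard deviations from the rolling mean
anomaly = (reading - roll_mean).abs() > 3 * roll_std

print("Spike flagged:", bool(anomaly.loc[spike_at]))
print("Flagged samples:", int(anomaly.sum()))
```

The first 99 rows can never be flagged because the rolling statistics are still NaN there. A few ordinary readings may also be flagged by chance, which is expected with a 3-sigma rule.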

**4. Web Traffic**
```python
# Hourly to daily
daily = df.resample('D').sum()

# Day of week patterns
df['dow'] = df.index.dayofweek
dow_pattern = df.groupby('dow')['visits'].mean()

# Compare to last week (use the daily frame: 7 rows = 7 days)
daily['last_week'] = daily['visits'].shift(7)
daily['wow_change'] = daily['visits'].pct_change(7)
```

In [None]:
print("=== REAL-WORLD TIME SERIES ANALYSIS ===\n")

# Scenario 1: E-commerce Sales Analysis
print("Scenario 1: E-commerce Sales Analysis\n")

# Create realistic sales data
np.random.seed(42)
dates = pd.date_range('2023-01-01', '2024-03-31', freq='D')
n_days = len(dates)

# Base sales with weekly pattern and trend
trend = np.linspace(1000, 1500, n_days)
weekly_pattern = 200 * np.sin(np.arange(n_days) * 2 * np.pi / 7)
noise = np.random.randn(n_days) * 50
sales = trend + weekly_pattern + noise

df = pd.DataFrame({'sales': sales}, index=dates)

# Analysis
print("Daily sales data (sample):")
print(df.head())

# Monthly summary
monthly = df.resample('M').agg({
    'sales': ['sum', 'mean', 'std']
})
monthly.columns = ['total', 'avg_daily', 'std']
print("\nMonthly summary:")
print(monthly.tail(6).round(0))

# Moving averages
df['MA_7'] = df['sales'].rolling(7).mean()
df['MA_30'] = df['sales'].rolling(30).mean()

print("\nWith moving averages:")
print(df[['sales', 'MA_7', 'MA_30']].tail().round(0))

# Year-over-year growth (365 rows back; one calendar day off after Feb 29, 2024)
df['yoy_change'] = df['sales'].pct_change(365) * 100
print(f"\nCurrent YoY growth: {df['yoy_change'].iloc[-1]:.1f}%")
print()

# Scenario 2: Temperature Data Analysis
print("="*70)
print("Scenario 2: Temperature Monitoring\n")

# Create hourly temperature data
hourly_dates = pd.date_range('2024-01-01', periods=24*7, freq='H')
temp_base = 20
daily_cycle = 5 * np.sin(np.arange(len(hourly_dates)) * 2 * np.pi / 24)
temp = temp_base + daily_cycle + np.random.randn(len(hourly_dates))

temp_df = pd.DataFrame({'temperature': temp}, index=hourly_dates)

print("Hourly temperature (first day):")
print(temp_df.head(24).round(1))

# Daily statistics (select the column, aggregate, then rename 'mean' to 'avg')
daily_temp = (
    temp_df['temperature']
    .resample('D')
    .agg(['min', 'max', 'mean'])
    .rename(columns={'mean': 'avg'})
)

print("\nDaily temperature summary:")
print(daily_temp.round(1))
print()

# Scenario 3: Stock Price Technical Analysis
print("="*70)
print("Scenario 3: Stock Price Technical Analysis\n")

# Simulate stock prices
stock_dates = pd.date_range('2024-01-01', periods=100, freq='B')
returns = np.random.randn(100) * 0.02
price = 100 * np.exp(np.cumsum(returns))

stock_df = pd.DataFrame({'close': price}, index=stock_dates)

# Technical indicators
stock_df['SMA_20'] = stock_df['close'].rolling(20).mean()
stock_df['SMA_50'] = stock_df['close'].rolling(50).mean()
stock_df['returns'] = stock_df['close'].pct_change()
stock_df['volatility'] = stock_df['returns'].rolling(20).std() * np.sqrt(252)

# Bollinger Bands
stock_df['BB_middle'] = stock_df['close'].rolling(20).mean()
stock_df['BB_std'] = stock_df['close'].rolling(20).std()
stock_df['BB_upper'] = stock_df['BB_middle'] + 2 * stock_df['BB_std']
stock_df['BB_lower'] = stock_df['BB_middle'] - 2 * stock_df['BB_std']

print("Stock analysis (last 10 days):")
print(stock_df[['close', 'SMA_20', 'SMA_50', 'volatility']].tail(10).round(2))

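# Trading signal (simple). During the SMA warm-up rows a comparison with NaN
# is False, so those rows are labelled 'SELL'; only later rows are meaningful.
stock_df['signal'] = np.where(stock_df['SMA_20'] > stock_df['SMA_50'], 'BUY', 'SELL')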
print(f"\nCurrent signal: {stock_df['signal'].iloc[-1]}")
print()

# Scenario 4: Web Traffic Analysis
print("="*70)
print("Scenario 4: Website Traffic Patterns\n")

# Daily traffic
traffic_dates = pd.date_range('2024-01-01', periods=30, freq='D')
base_traffic = 10000
weekend_boost = [1.3 if d in [5, 6] else 1.0 for d in traffic_dates.dayofweek]
traffic = base_traffic * np.array(weekend_boost) + np.random.randn(30) * 500

traffic_df = pd.DataFrame({'visits': traffic}, index=traffic_dates)
traffic_df['dow'] = traffic_df.index.day_name()
traffic_df['is_weekend'] = traffic_df.index.dayofweek >= 5

print("Traffic by day of week:")
dow_avg = traffic_df.groupby('dow')['visits'].mean().round(0)
# Sort by day of week
day_order = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
dow_avg = dow_avg.reindex(day_order)
print(dow_avg)

print("\nWeekend vs Weekday:")
print(traffic_df.groupby('is_weekend')['visits'].mean().round(0))

## 9. Best Practices & Common Pitfalls

### Best Practices ✅

**1. Always Set DatetimeIndex**
```python
# ✅ Good
df['date'] = pd.to_datetime(df['date'])
df = df.set_index('date')

# ❌ Avoid: leaving df['date'] as plain strings
# (no date slicing, no resampling, no .dt accessors)
```

**2. Use Explicit Format for Parsing**
```python
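# ✅ Explicit: no guessing, and values that do not match the format raise an error
pd.to_datetime(df['date'], format='%Y-%m-%d')

# ❌ Pandas must infer the format (can be slower, and ambiguous for day/month order)
pd.to_datetime(df['date'])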
```

**3. Handle Missing Dates**
```python
# ✅ Reindex to a complete calendar (missing days become NaN rows)
complete_dates = pd.date_range(start=df.index.min(), 
                                end=df.index.max(), 
                                freq='D')
df = df.reindex(complete_dates)
```
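
A small sketch of the same idea on made-up data, showing how the gaps can be listed before they are filled. The values and the choice of forward fill are assumptions for illustration; the right fill strategy depends on the data.

```python
import pandas as pd

# Daily series with two days missing (Jan 3 and Jan 6); values are made up
idx = pd.to_datetime(['2024-01-01', '2024-01-02', '2024-01-04',
                      '2024-01-05', '2024-01-07'])
df = pd.DataFrame({'sales': [10.0, 12.0, 9.0, 11.0, 13.0]}, index=idx)

# Build the complete calendar and list the dates that are absent
complete_dates = pd.date_range(df.index.min(), df.index.max(), freq='D')
missing = complete_dates.difference(df.index)
print("Missing dates:", list(missing.strftime('%Y-%m-%d')))

# Reindex: the missing days appear as NaN rows
df_full = df.reindex(complete_dates)

# Forward fill is one option; interpolate() or fillna(0) may suit other data
df_filled = df_full.ffill()
print(df_filled)
```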

**4. Store in UTC**
```python
# ✅ Standard practice
df.index = df.index.tz_localize('US/Eastern').tz_convert('UTC')

# ❌ Store in local timezone
df.index = df.index.tz_localize('US/Eastern')
```

**5. Use min_periods for Rolling**
```python
# ✅ Fewer leading NaN (early values average at least 10 points instead of 30)
df['MA'] = df['price'].rolling(30, min_periods=10).mean()

# ❌ First 29 rows are NaN
df['MA'] = df['price'].rolling(30).mean()
```
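
The effect of `min_periods` is easy to check on a toy series. This is only a sketch with made-up prices (1, 2, ..., 60); note the trade-off that the early averages are based on fewer observations.

```python
import numpy as np
import pandas as pd

# Toy prices 1.0 .. 60.0 on consecutive days (assumed data)
prices = pd.Series(np.arange(1.0, 61.0),
                   index=pd.date_range('2024-01-01', periods=60, freq='D'))

strict = prices.rolling(30).mean()                   # needs a full 30-point window
relaxed = prices.rolling(30, min_periods=10).mean()  # needs at least 10 points

print("NaN count, strict: ", int(strict.isna().sum()))   # 29
print("NaN count, relaxed:", int(relaxed.isna().sum()))  # 9

# The first non-NaN relaxed value covers only 10 points: the mean of 1..10
print(relaxed.iloc[9])  # 5.5
```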

### Common Pitfalls ❌

**1. Mixing Timezone-Aware and Naive**
```python
# ❌ Error: Can't combine
df_utc + df_naive  # TypeError

# ✅ Convert both to same
df_naive = df_naive.tz_localize('UTC')
df_utc + df_naive
```
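
The same rule applies to individual timestamps. A minimal sketch, assuming the naive value is known to represent UTC (that assumption is the important part: `tz_localize` only labels the value, it does not shift it):

```python
import pandas as pd

naive = pd.Timestamp('2024-01-15 09:00')
aware = pd.Timestamp('2024-01-15 09:00', tz='UTC')

# Ordering comparisons between naive and aware timestamps are rejected
try:
    naive < aware
except TypeError as exc:
    print("TypeError:", exc)

# Fix: state which zone the naive value is in, then compare
naive_as_utc = naive.tz_localize('UTC')
print(naive_as_utc == aware)  # True
```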

**2. Not Sorting Index**
```python
# ❌ Unsorted index breaks label-based slicing
df.loc['2024-01':'2024-03']  # May raise KeyError or return unexpected rows

# ✅ Sort first
df = df.sort_index()
df.loc['2024-01':'2024-03']
```
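
A quick way to check whether sorting is needed is `index.is_monotonic_increasing`. The sketch below uses a deliberately scrambled toy frame (the data and the scrambling order are made up):

```python
import numpy as np
import pandas as pd

idx = pd.date_range('2024-01-01', periods=31, freq='D')
df = pd.DataFrame({'value': np.arange(31)}, index=idx)

# Scramble the rows: odd positions first, then even positions
unsorted = df.iloc[list(range(1, 31, 2)) + list(range(0, 31, 2))]
print(unsorted.index.is_monotonic_increasing)  # False

# Sort once, then label-based slicing behaves as expected
ordered = unsorted.sort_index()
print(ordered.index.is_monotonic_increasing)   # True

# Label slices include both endpoints: Jan 10 through Jan 15 is 6 rows
window = ordered.loc['2024-01-10':'2024-01-15']
print(len(window))  # 6
```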

**3. Using String Dates**
```python
# ❌ String comparison (lexicographic: only correct for ISO 'YYYY-MM-DD' strings)
df[df['date'] > '2024-01-01']

# ✅ DateTime comparison
df['date'] = pd.to_datetime(df['date'])
df[df['date'] > pd.Timestamp('2024-01-01')]
```

**4. Forgetting DST**
```python
# ⚠️ DST can cause issues
# March 10, 2024: US clocks spring forward at 02:00 local time,
# so 02:00-02:59 does not exist and the day has only 23 hours.

# ✅ Be aware when working with local times
# (the index below jumps from 01:00 straight to 03:00)
dates = pd.date_range('2024-03-10', periods=24,
                      freq='H', tz='US/Eastern')
```
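
The missing hour can be inspected directly. This sketch uses `freq='60min'` rather than the hourly alias only so that it runs unchanged across pandas versions, and `nonexistent='shift_forward'` is just one possible policy for wall-clock times that fall inside the gap.

```python
import pandas as pd

tz = 'US/Eastern'
start = pd.Timestamp('2024-03-10', tz=tz)
end = pd.Timestamp('2024-03-11', tz=tz)

# The local calendar day is only 23 hours long
print(end - start)  # 0 days 23:00:00

# Hourly steps across the day: the 02:00 local hour never appears
hours = pd.date_range(start, end, freq='60min')
print(sorted(set(hours.hour)))

# Localizing a wall-clock time inside the gap needs an explicit policy
gap_time = pd.Timestamp('2024-03-10 02:30')
shifted = gap_time.tz_localize(tz, nonexistent='shift_forward')
print(shifted)  # 2024-03-10 03:00:00-04:00
```

Without the `nonexistent` argument, localizing `02:30` on that date raises an error, which is often the safer default for data validation.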

**5. Resampling Without Aggregation**
```python
# ❌ Incomplete
df.resample('M')  # Just returns Resampler object

# ✅ Must specify aggregation
df.resample('M').sum()
```

### Performance Tips 🚀

**1. Use Categorical for Repeated Dates**
```python
# For a datetime column whose rows repeat the same few months
# (df['date'] must already be datetime64, otherwise .dt is unavailable)
df['month'] = df['date'].dt.to_period('M').astype('category')
```

**2. Vectorize Operations**
```python
# ✅ Fast
df['year'] = df.index.year

# ❌ Slow
df['year'] = df.index.map(lambda x: x.year)
```

**3. Downsample Before Heavy Operations**
```python
# If you don't need high frequency
df_daily = df.resample('D').mean()  # Reduce size first
df_daily.rolling(30).mean()         # Then analyze
```

## 10. Practice Exercises

### Beginner Level (1-5)

1. **Parse dates**
   - Convert string column to datetime
   - Set as index

2. **Extract components**
   - Extract year, month, day from DatetimeIndex
   - Find day of week

3. **Date range**
   - Create 30 days of dates
   - Create 12 months of dates

4. **Basic slicing**
   - Select specific month
   - Select date range

5. **Simple resample**
   - Convert daily to weekly sum
   - Convert daily to monthly mean

### Intermediate Level (6-10)

6. **Moving average**
   - Calculate 7-day and 30-day MA
   - Compare short vs long term trends

7. **Calculate returns**
   - Daily percentage change
   - Cumulative returns

8. **Lag features**
   - Create 3 lag features
   - Calculate difference from previous day

9. **Weekend analysis**
   - Filter weekends vs weekdays
   - Compare averages

10. **Quarterly summary**
    - Resample to quarters
    - Multiple aggregations

### Advanced Level (11-15)

11. **Bollinger Bands**
    - Calculate 20-day MA and std
    - Create upper and lower bands

12. **Year-over-year growth**
    - Calculate YoY percentage change
    - Handle missing years

13. **Time zone conversion**
    - Parse timestamps in local timezone
    - Convert to UTC

14. **Fill missing dates**
    - Identify gaps in date index
    - Reindex with complete date range
    - Forward fill values

15. **Custom resampling**
    - OHLC summary (open-high-low-close)
    - Week starting Monday

### Challenge Problems (16-20)

16. **Complete sales dashboard**
    - Daily, weekly, monthly views
    - MoM and YoY growth
    - Moving averages

17. **Seasonality detection**
    - Extract seasonal patterns
    - Day of week effects
    - Month of year effects

18. **Anomaly detection**
    - Rolling mean and std
    - Flag values > 3 std from mean

19. **Multi-timezone analysis**
    - Combine data from different timezones
    - Normalize to UTC
    - Analyze by local business hours

20. **Forecasting prep**
    - Create lagged features (1-7 days)
    - Add rolling features (mean, std)
    - Add date features (dow, month, quarter)

## Quick Reference Card

### DateTime Parsing

```python
# Parse strings
pd.to_datetime('2024-01-15')
pd.to_datetime(df['date'], format='%Y-%m-%d')

# Create timestamps
pd.Timestamp('2024-01-15 14:30:00')
pd.Timestamp(2024, 1, 15, 14, 30)

# Date ranges
pd.date_range('2024-01-01', periods=30, freq='D')
pd.date_range('2024-01-01', '2024-12-31', freq='W')
```

### DatetimeIndex Operations

```python
# Set index
df.set_index('date', inplace=True)

# Time-based slicing (use .loc to select rows by date string)
df.loc['2024-01']                # Entire month
df.loc['2024-01-15']             # Specific day
df.loc['2024-01':'2024-03']      # Date range

# Extract components
df.index.year, df.index.month, df.index.day
df.index.day_name(), df.index.dayofweek
df.index.quarter, df.index.isocalendar().week
```

### Resampling

```python
# Downsampling (aggregation)
df.resample('W').sum()           # Weekly total
df.resample('M').mean()          # Monthly average
df.resample('Q').agg(['sum', 'mean'])  # Multiple

# Upsampling (interpolation)
df.resample('D').ffill()         # Forward fill
df.resample('D').bfill()         # Backward fill
df.resample('D').interpolate()   # Linear interpolation
```
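
Besides `sum` and `mean`, a resampler also offers `.ohlc()` (open, high, low, close per bin). A small sketch on made-up prices, assuming the default weekly bins that end on Sunday:

```python
import pandas as pd

# Two weeks of daily prices starting on a Monday (values are made up)
idx = pd.date_range('2024-01-01', periods=14, freq='D')
price = pd.Series(range(1, 15), index=idx, dtype='float64')

# 'W' bins end on Sunday by default, so each bin here covers Monday..Sunday
weekly = price.resample('W').ohlc()
print(weekly)
```

With prices 1 to 14, the first week opens at 1 and closes at 7, and the second opens at 8 and closes at 14.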

### Rolling Windows

```python
# Moving average
df['MA_7'] = df['price'].rolling(7).mean()
df['MA_30'] = df['price'].rolling(30).mean()

# Other rolling functions
df.rolling(7).sum()              # Rolling sum
df.rolling(7).std()              # Rolling std dev
df.rolling(7).min()              # Rolling minimum
df.rolling(7).max()              # Rolling maximum

# With min_periods
df.rolling(7, min_periods=1).mean()

# Exponential moving average
df.ewm(span=7).mean()
```
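
The windows above count rows. When timestamps are irregular, a window given as an offset string such as `'3D'` covers a span of time instead. The sketch below contrasts the two on a tiny made-up series:

```python
import pandas as pd

# Irregularly spaced observations (assumed data)
idx = pd.to_datetime(['2024-01-01', '2024-01-02', '2024-01-05', '2024-01-20'])
s = pd.Series([1.0, 2.0, 3.0, 4.0], index=idx)

by_rows = s.rolling(2).sum()      # always the last 2 rows, however far apart
by_time = s.rolling('3D').sum()   # rows within the last 3 days: (t - 3 days, t]

print(by_rows.tolist())  # [nan, 3.0, 5.0, 7.0]
print(by_time.tolist())  # [1.0, 3.0, 3.0, 4.0]
```

Time-based windows require a sorted DatetimeIndex, and they default to `min_periods=1`, which is why the first value is not NaN.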

### Time Shifts

```python
# Lag (previous values)
df['prev'] = df['value'].shift(1)
df['last_week'] = df['value'].shift(7)

# Lead (future values)
df['next'] = df['value'].shift(-1)

# Calculate changes
df['change'] = df['price'].diff()        # Absolute
df['pct_change'] = df['price'].pct_change()  # Percentage
```
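
Put together, these calls give a simple lag-feature table. A minimal sketch with made-up values; the column names `lag_1` and `lag_2` are illustrative only:

```python
import pandas as pd

idx = pd.date_range('2024-01-01', periods=5, freq='D')
df = pd.DataFrame({'value': [10.0, 12.0, 15.0, 11.0, 14.0]}, index=idx)

# Lag features: the value from 1 and 2 rows earlier on each row
for lag in (1, 2):
    df[f'lag_{lag}'] = df['value'].shift(lag)

# diff() is value minus the previous row; pct_change() divides that by the previous row
df['change'] = df['value'].diff()
df['pct_change'] = df['value'].pct_change()

# Rows without a full history contain NaN; drop them before modelling
features = df.dropna()
print(features)
```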

### Time Zones

```python
# Localize (add timezone to naive)
df.index = df.index.tz_localize('US/Eastern')

# Convert (change timezone)
df.index = df.index.tz_convert('UTC')

# Create with timezone
pd.Timestamp('2024-01-15', tz='US/Eastern')
pd.date_range('2024-01-01', periods=10, freq='D', tz='UTC')
```

### Frequency Codes

```python
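'D'    # Day
'W'    # Week
'M'    # Month end (pandas 2.2+: 'ME')
'MS'   # Month start
'Q'    # Quarter end (pandas 2.2+: 'QE')
'QS'   # Quarter start
'Y'    # Year end (pandas 2.2+: 'YE')
'H'    # Hour (pandas 2.2+: 'h')
'T'    # Minute (pandas 2.2+: 'min')
'S'    # Second (pandas 2.2+: 's')
'B'    # Business day
'BM'   # Business month end (pandas 2.2+: 'BME')

# Multiples
'2D'   # Every 2 days
'3W'   # Every 3 weeks
'6M'   # Every 6 months (pandas 2.2+: '6ME')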
```

### Common Patterns

```python
# Pattern 1: Parse and set index
df['date'] = pd.to_datetime(df['date'])
df = df.set_index('date').sort_index()

# Pattern 2: Moving averages
df['MA_7'] = df['price'].rolling(7).mean()
df['MA_30'] = df['price'].rolling(30).mean()

# Pattern 3: Resample to monthly
monthly = df.resample('M').agg({'sales': 'sum', 'orders': 'count'})

# Pattern 4: Calculate returns
df['returns'] = df['price'].pct_change()
df['cum_returns'] = (1 + df['returns']).cumprod() - 1

# Pattern 5: Fill missing dates
complete_idx = pd.date_range(df.index.min(), df.index.max(), freq='D')
df = df.reindex(complete_idx).ffill()
```
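
As a sanity check on Pattern 4, compounding the daily returns should reproduce the total change in price since the first day. A short sketch with made-up prices:

```python
import pandas as pd

price = pd.Series([100.0, 110.0, 121.0, 108.9],
                  index=pd.date_range('2024-01-01', periods=4, freq='D'))

returns = price.pct_change()               # NaN, 0.10, 0.10, -0.10
cum_returns = (1 + returns).cumprod() - 1  # compounded return since the first day

# The same quantity computed directly from prices
direct = price / price.iloc[0] - 1

print(pd.DataFrame({'cum_returns': cum_returns, 'direct': direct}).round(4))
```

The only difference is the first row: `cum_returns` is NaN there because no previous price exists, while `direct` is 0.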

## Summary

### Key Concepts Mastered ✅

**1. DateTime Fundamentals**
- **Timestamp**: Single point in time
- **DatetimeIndex**: Time series index
- **Parsing**: String → DateTime conversion
- **Components**: Extract year, month, day, etc.

**2. Time-Based Indexing**
- **Partial string indexing**: `df.loc['2024-01']`
- **Date range slicing**: `df.loc['2024-01':'2024-03']`
- **Boolean filtering**: Day of week, month, etc.
- **Date ranges**: `pd.date_range()`

**3. Resampling**
- **Downsampling**: High freq → Low freq (aggregate)
- **Upsampling**: Low freq → High freq (interpolate)
- **Frequency conversion**: Daily → Weekly → Monthly
- **Multiple aggregations**: sum, mean, min, max

**4. Rolling Windows**
- **Moving averages**: Smooth trends
- **Rolling statistics**: Mean, std, min, max
- **Window sizes**: 7, 30, 90 days
- **EMA**: Exponentially weighted moving average (`ewm`)

**5. Time Shifts**
- **Lag**: Previous values (shift(1))
- **Lead**: Future values (shift(-1))
- **Differences**: diff(), pct_change()
- **Lagged features**: For ML models

**6. Time Zones**
- **Localize**: Add timezone to naive
- **Convert**: Change timezone
- **Best practice**: Store in UTC
- **Common zones**: US/Eastern, Europe/London, Asia/Tokyo

---

### Method Summary

| Operation | Method | Purpose |
|-----------|--------|---------|
| **Parse** | `pd.to_datetime()` | String → DateTime |
| **Create range** | `pd.date_range()` | Generate dates |
| **Slice** | `df.loc['2024-01']` | Select time period |
| **Resample** | `.resample('M')` | Change frequency |
| **Rolling** | `.rolling(7)` | Moving window |
| **Shift** | `.shift(1)` | Lag/lead values |
| **Diff** | `.diff()` | Calculate change |
| **Pct change** | `.pct_change()` | Percentage change |

---

### Real-World Applications

**Finance**
- Stock price analysis
- Moving averages (SMA, EMA)
- Bollinger Bands
- Daily returns and volatility

**Business**
- Sales trend analysis
- Monthly/quarterly reporting
- YoY and MoM growth
- Seasonality detection

**IoT/Sensors**
- High-frequency data aggregation
- Noise smoothing
- Anomaly detection
- Downsampling for storage

**Web Analytics**
- Traffic patterns
- Day of week analysis
- Time zone normalization
- Hourly → Daily aggregation

---

### Common Workflows

**Workflow 1: Sales Analysis**
```
1. Parse dates → DatetimeIndex
2. Resample to monthly → totals
3. Calculate MoM growth
4. Add moving averages
5. Identify trends
```
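
The steps of Workflow 1 might look like this in code. It is only a sketch on synthetic data (a flat 100 per day in January, 110 in February, 121 in March), and it uses the `'MS'` month-start alias so that it runs unchanged on older and newer pandas versions.

```python
import pandas as pd

# Synthetic raw data with dates stored as strings (assumed input format)
dates = pd.date_range('2024-01-01', '2024-03-31', freq='D')
raw = pd.DataFrame({
    'date': dates.strftime('%Y-%m-%d'),
    'sales': dates.month.map({1: 100.0, 2: 110.0, 3: 121.0}),
})

# 1. Parse dates -> DatetimeIndex
raw['date'] = pd.to_datetime(raw['date'], format='%Y-%m-%d')
df = raw.set_index('date').sort_index()

# 2. Resample to monthly totals ('MS' labels each bin with the month start)
monthly = df['sales'].resample('MS').sum()

# 3. Month-over-month growth in percent
mom = monthly.pct_change() * 100

# 4. Moving average on the daily series
df['MA_7'] = df['sales'].rolling(7).mean()

# 5. Inspect the trend
print(pd.DataFrame({'total': monthly, 'mom_pct': mom.round(1)}))
```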

**Workflow 2: Stock Analysis**
```
1. Load OHLC data
2. Calculate daily returns
3. Compute moving averages (20, 50, 200)
4. Calculate Bollinger Bands
5. Generate trading signals
```

**Workflow 3: Sensor Data**
```
1. Parse timestamps
2. Resample high-freq → minute/hour
3. Apply rolling average (smooth)
4. Detect anomalies (> 3 std)
5. Fill missing values
```

---

### Remember

- 📅 **Always** set DatetimeIndex for time series
- 🌍 **Store in UTC**, convert for display
- 📊 **Resample** to reduce frequency for analysis
- 📈 **Rolling** windows smooth noise and show trends
- ⏰ **Shift** creates lag features for ML
- 🔄 **Sort index** before slicing, rolling, or resampling

---

### Next Steps

After mastering time series:
1. **Advanced Indexing** - MultiIndex time series
2. **Visualization** - Time series plotting
3. **Forecasting** - ARIMA, Prophet, ML models
4. **Seasonality** - Decomposition and seasonal adjustment
5. **Performance** - Optimize large time series

---

**Happy Time Series Analysis! 🐼📈⏰**