
# ⏳ Time Series Data Preparation & Analysis

This notebook provides **code templates and checklists** for **preparing and analyzing time series data**.

### 🔹 What’s Covered:
- Handling datetime features
- Checking stationarity & trend analysis
- Feature engineering for time series
- Forecasting with simple models


In [None]:

# Ensure required libraries are installed (Uncomment if necessary)
# !pip install pandas numpy matplotlib statsmodels



## 📆 Handling Datetime Features

✅ Convert timestamps to a proper datetime format.  
✅ Extract **year, month, day, weekday, hour** features.  
✅ Handle **time zone conversions** if necessary.  


In [None]:

import pandas as pd

# Sample dataset with timestamps
df = pd.DataFrame({
    'timestamp': ["2023-01-01 12:00:00", "2023-02-15 15:30:00", "2023-03-20 18:45:00"]
})

# Convert to datetime format
df['timestamp'] = pd.to_datetime(df['timestamp'])

# Extract useful features (weekday is 0 for Monday through 6 for Sunday)
df['year'] = df['timestamp'].dt.year
df['month'] = df['timestamp'].dt.month
df['day'] = df['timestamp'].dt.day
df['weekday'] = df['timestamp'].dt.weekday
df['hour'] = df['timestamp'].dt.hour

print(df.head())
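
The checklist above mentions time zone conversions, which the cell does not show. A minimal sketch follows, assuming the sample timestamps were recorded in UTC and should be viewed in New York time. Both the source zone and the target zone are assumptions made for illustration, as is the variable name `ts`.

```python
import pandas as pd

# Hypothetical naive timestamps, assumed here to have been recorded in UTC
ts = pd.Series(pd.to_datetime([
    "2023-01-01 12:00:00", "2023-02-15 15:30:00", "2023-03-20 18:45:00"
]))

# Attach a time zone to the naive timestamps (the clock values do not change)
ts_utc = ts.dt.tz_localize("UTC")

# Convert to another zone (the wall-clock time shifts, the instant does not)
ts_ny = ts_utc.dt.tz_convert("America/New_York")

print(ts_ny)
print(ts_ny.dt.hour.tolist())
```

Extract hour or weekday features after the conversion, since those features depend on the local wall-clock time.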



## 🔍 Checking Stationarity

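✅ Plot the **rolling mean and standard deviation** to spot trends or changing variance.  
✅ Apply the **Augmented Dickey-Fuller (ADF) test**: its null hypothesis is a unit root (non-stationarity), so a small p-value (e.g. below 0.05) is evidence of stationarity.  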


In [None]:

import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.stattools import adfuller

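# Create synthetic daily data. The 3-row sample above is too short for
# rolling windows, the ADF test, or ARIMA, so build a longer frame here.
rng = np.random.default_rng(42)  # fixed seed for reproducible output
df = pd.DataFrame({'timestamp': pd.date_range("2023-01-01", periods=120, freq="D")})
df['month'] = df['timestamp'].dt.month
df['value'] = rng.normal(size=len(df)) + df['month']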

# Rolling mean & standard deviation
plt.figure(figsize=(8,4))
df['value'].rolling(window=3).mean().plot(label="Rolling Mean")
df['value'].rolling(window=3).std().plot(label="Rolling Std Dev")
plt.legend()
plt.title("Rolling Mean & Standard Deviation")
plt.show()

# Augmented Dickey-Fuller Test
adf_test = adfuller(df['value'])
print(f"ADF Statistic: {adf_test[0]}")
print(f"p-value: {adf_test[1]}")



## 🔨 Feature Engineering for Time Series

✅ Create **lag features** for past values.  
✅ Use **rolling window features** for trend detection.  
✅ Encode seasonal patterns using **Fourier terms** (sine/cosine pairs at the seasonal period).  


In [None]:

# Lag feature creation (shifting data by 1 time step)
df['value_lag1'] = df['value'].shift(1)

# Rolling window feature (mean of last 3 observations)
df['rolling_mean'] = df['value'].rolling(window=3).mean()

print(df.head())
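
The cell above covers lag and rolling features but not the seasonal encoding from the checklist. One common way to do it is with sine/cosine pairs at the seasonal period. This is a minimal sketch, assuming a daily series with a weekly cycle (`period = 7`); the frame `feats` and the choice of two harmonics are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical daily index; period=7 assumes a weekly cycle
feats = pd.DataFrame({'timestamp': pd.date_range("2023-01-01", periods=28, freq="D")})
period = 7
t = np.arange(len(feats))

# One sine/cosine pair per harmonic k; higher harmonics capture sharper seasonal shapes
for k in (1, 2):
    feats[f'sin_{k}'] = np.sin(2 * np.pi * k * t / period)
    feats[f'cos_{k}'] = np.cos(2 * np.pi * k * t / period)

print(feats.head(8))
```

Unlike a raw day-of-week integer, these features are continuous across the cycle boundary, so the end of one cycle sits next to the start of the next in feature space.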



## 📈 Simple Time Series Forecasting

✅ Use **Naïve, Moving Average, or ARIMA models**.  
✅ Compare **forecasting performance** using error metrics.  


In [None]:

from statsmodels.tsa.arima.model import ARIMA

# Fit an ARIMA model (p=1, d=1, q=1)
model = ARIMA(df['value'].dropna(), order=(1,1,1))
model_fit = model.fit()

# Generate forecast
forecast = model_fit.forecast(steps=3)
print("Next 3 Forecasted Values:", forecast)
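
The checklist also names naïve and moving-average forecasts and error metrics. The sketch below shows both baselines scored on a chronological hold-out with MAE and RMSE. The series `y`, the 80/20 split, and the 5-step averaging window are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Hypothetical series: linear trend plus noise
y = pd.Series(0.1 * np.arange(100) + rng.normal(size=100))

# Chronological split: the test set is strictly later than the training set
train, test = y.iloc[:80], y.iloc[80:]

# Naive forecast: repeat the last observed training value
naive_pred = np.full(len(test), train.iloc[-1])

# Moving-average forecast: repeat the mean of the last 5 training values
ma_pred = np.full(len(test), train.iloc[-5:].mean())

def mae(actual, pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(np.asarray(actual) - pred)))

def rmse(actual, pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((np.asarray(actual) - pred) ** 2)))

for name, pred in [("naive", naive_pred), ("moving average", ma_pred)]:
    print(f"{name}: MAE={mae(test, pred):.3f}, RMSE={rmse(test, pred):.3f}")
```

A model such as ARIMA is only worth its extra complexity if it beats these baselines on the same hold-out period.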



## ✅ Best Practices & Common Pitfalls

- **Check stationarity**: Many classical models assume stationary data, so apply differencing if the series has a trend.  
- **Use lag features carefully**: The right lag length depends on the dataset, and shifting introduces NaN values in the first rows.  
- **Watch for seasonality**: Consider Fourier terms or dummy variables for seasonal patterns.  
- **Validate forecasting models**: Use chronological train-test splits (no shuffling) and compare against baseline models.  
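
One way to ground the choice of lag length is to inspect the autocorrelation at several candidate lags. The sketch below is a rough heuristic, not a definitive selection rule. It assumes a hypothetical series `s` with a built-in 7-step cycle and uses pandas' `Series.autocorr`.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)

# Hypothetical series with a built-in 7-step cycle plus noise
t = np.arange(210)
s = pd.Series(np.sin(2 * np.pi * t / 7) + 0.3 * rng.normal(size=len(t)))

# Autocorrelation at each candidate lag from 1 to 14
autocorrs = {lag: s.autocorr(lag=lag) for lag in range(1, 15)}

# Pick the lag with the highest positive autocorrelation
best_lag = max(autocorrs, key=autocorrs.get)
print(f"Most positively autocorrelated lag: {best_lag}")
```

Lags at multiples of the seasonal period tend to score highest here, which suggests adding a lag feature at that period in addition to the lag-1 feature.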
