# Task 4.1: Temporal & Circadian Features

**Goal**: Engineer time-based features to capture circadian rhythms and academic schedules.

**Features**:
- **Cyclical Time**: Sine/cosine encoding of hour of day (period 24), day of week (period 7), and week of term (period 10).
- **Day Parts**: Night, Morning, Afternoon, Evening.
- **Academic Context**: Weekend, Term Phase (Early, Midterm, Finals).
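
The cyclical encoding listed above maps a periodic value onto the unit circle, so the end of a cycle sits next to its start (hour 23 is adjacent to hour 0). A minimal sketch follows. The helper name `encode_cyclical` and its output column names are illustrative assumptions and may not match what `process_temporal_features` actually produces.

```python
import numpy as np
import pandas as pd


def encode_cyclical(values: pd.Series, period: float) -> pd.DataFrame:
    """Map a periodic value onto the unit circle (hypothetical helper)."""
    angle = 2 * np.pi * values / period
    return pd.DataFrame(
        {"sin": np.sin(angle), "cos": np.cos(angle)}, index=values.index
    )


hours = pd.Series(range(24), name="hour_of_day")
hour_enc = encode_cyclical(hours, period=24)

# Hour 23 and hour 0 are neighbours on the circle, unlike in the raw integers.
gap_23_to_0 = np.linalg.norm(hour_enc.loc[23] - hour_enc.loc[0])
gap_0_to_1 = np.linalg.norm(hour_enc.loc[1] - hour_enc.loc[0])
print(f"23->0 distance: {gap_23_to_0:.4f}, 0->1 distance: {gap_0_to_1:.4f}")
```

The same helper would apply to `day_of_week` with `period=7` and `week_of_term` with `period=10`. A 1-based input such as week 1 to 10 only rotates the points around the circle and does not change their spacing.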

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from pathlib import Path
import sys

# Add src to path
sys.path.append('../../')

from src.features.temporal_features import process_temporal_features

## 1. Load Data
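Load the training dataset. If the full dataset is missing, fall back to the test subset, and if neither file exists, generate hourly dummy data for demonstration.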

In [None]:
DATA_DIR = Path('../../data/processed')
TEST_DATA_DIR = Path('../../data/processed_test')

# Try loading main dataset, fallback to test dataset
if (DATA_DIR / 'train.parquet').exists():
    df = pd.read_parquet(DATA_DIR / 'train.parquet')
    print("Loaded full training data")
elif (TEST_DATA_DIR / 'train.parquet').exists():
    df = pd.read_parquet(TEST_DATA_DIR / 'train.parquet')
    print("Loaded test subset data (full dataset not generated yet)")
else:
    # Create dummy data for demonstration if nothing exists
    print("No data found. Generating dummy data for demonstration.")
    # End at 23:00 on the last day so week_of_term stays within 1-10
    dates = pd.date_range(start='2013-03-27', end='2013-06-04 23:00', freq='h')
    df = pd.DataFrame({'timestamp': dates})
    df['hour_of_day'] = df['timestamp'].dt.hour
    df['day_of_week'] = df['timestamp'].dt.dayofweek
    df['week_of_term'] = ((df['timestamp'] - df['timestamp'].min()).dt.days // 7) + 1

print(f"Shape: {df.shape}")
df.head()

## 2. Apply Feature Engineering

In [None]:
df_features = process_temporal_features(df)
df_features.head()

## 3. Visualization

### 3.1 Cyclical Encoding Check
Plot `hour_sin` against `hour_cos`. The 24 hours should fall evenly spaced on the unit circle, like the positions on a clock face.

In [None]:
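# One row per distinct hour; the first 24 rows are not guaranteed to cover hours 0-23
hour_points = df_features.drop_duplicates('hour_of_day').sort_values('hour_of_day')

plt.figure(figsize=(6, 6))
sns.scatterplot(data=hour_points, x='hour_sin', y='hour_cos', hue='hour_of_day', palette='viridis', s=100)
plt.title("Cyclical Hour Encoding (24h Clock)")
plt.xlabel("Sin(Hour)")
plt.ylabel("Cos(Hour)")
plt.axis('equal')  # equal axis scaling so the circle is not drawn as an ellipse
plt.grid(True)
plt.show()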
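
The visual check can be backed by a numerical one: every encoded point should lie at distance 1 from the origin, and the 24 hours should map to 24 distinct positions. The sketch below builds a stand-in frame so that it runs on its own. The column names `hour_sin` and `hour_cos` come from the plot above, while the formula used to fill them is an assumption about how `process_temporal_features` encodes the hour.

```python
import numpy as np
import pandas as pd

# Stand-in for df_features so the snippet is self-contained; in this notebook,
# use df_features[['hour_of_day', 'hour_sin', 'hour_cos']] instead.
hours = np.arange(24)
check = pd.DataFrame({
    "hour_of_day": hours,
    "hour_sin": np.sin(2 * np.pi * hours / 24),  # assumed encoding formula
    "hour_cos": np.cos(2 * np.pi * hours / 24),
})

# Distance of each point from the origin; should be 1 up to float error.
radius = np.hypot(check["hour_sin"], check["hour_cos"])
on_circle = np.allclose(radius, 1.0)

# Number of distinct (sin, cos) positions; should equal the number of hours.
n_points = check[["hour_sin", "hour_cos"]].round(6).drop_duplicates().shape[0]
print(f"All points on unit circle: {on_circle}; distinct positions: {n_points}")
```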

### 3.2 Day Parts Distribution

In [None]:
plt.figure(figsize=(8, 4))
sns.countplot(data=df_features, x='day_part', order=['Morning', 'Afternoon', 'Evening', 'Night'])
plt.title("Distribution of Data by Day Part")
plt.show()
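
For reference, one way day parts could be derived from `hour_of_day` is sketched below. The boundaries (Night 00-05, Morning 06-11, Afternoon 12-17, Evening 18-23) and the helper name `assign_day_part` are assumptions chosen for illustration; the actual cut-offs in `process_temporal_features` may differ.

```python
import pandas as pd

# Assumed boundaries (not taken from process_temporal_features):
# Night 00-05, Morning 06-11, Afternoon 12-17, Evening 18-23.
DAY_PART_BINS = [0, 6, 12, 18, 24]
DAY_PART_LABELS = ["Night", "Morning", "Afternoon", "Evening"]


def assign_day_part(hour: pd.Series) -> pd.Series:
    """Bin hour-of-day (0-23) into four labelled day parts (hypothetical helper)."""
    # right=False makes each bin include its left edge: [0, 6), [6, 12), ...
    return pd.cut(hour, bins=DAY_PART_BINS, labels=DAY_PART_LABELS, right=False)


demo = pd.DataFrame({"hour_of_day": range(24)})
demo["day_part"] = assign_day_part(demo["hour_of_day"])
day_part_counts = demo["day_part"].value_counts().to_dict()
print(day_part_counts)
```

Under these equal-width boundaries, evenly sampled hourly data would give bars of equal height, so uneven bars in the plot above would point to uneven sampling or to different cut-offs.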

### 3.3 Term Phase

In [None]:
plt.figure(figsize=(8, 4))
sns.countplot(data=df_features, x='term_phase', order=['Early', 'Midterm', 'Finals'])
plt.title("Distribution of Data by Term Phase")
plt.show()
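
A term phase can be derived from `week_of_term` in a similar way. The sketch below assumes a 10-week term split into weeks 1-3 (Early), 4-7 (Midterm), and 8-10 (Finals). These boundaries and the helper name `assign_term_phase` are illustrative assumptions, not the confirmed behaviour of `process_temporal_features`.

```python
import pandas as pd

# Assumed week boundaries for a 10-week term (illustrative only):
# weeks 1-3 -> Early, weeks 4-7 -> Midterm, weeks 8-10 -> Finals.
TERM_PHASE_BINS = [0, 3, 7, 10]
TERM_PHASE_LABELS = ["Early", "Midterm", "Finals"]


def assign_term_phase(week_of_term: pd.Series) -> pd.Series:
    """Map a 1-based week number to a term phase (hypothetical helper)."""
    # Default right=True gives the intervals (0, 3], (3, 7], (7, 10].
    return pd.cut(week_of_term, bins=TERM_PHASE_BINS, labels=TERM_PHASE_LABELS)


weeks = pd.Series(range(1, 11), name="week_of_term")
phases = assign_term_phase(weeks)
phase_counts = phases.value_counts().to_dict()
print(phase_counts)
```

With these assumed boundaries the Midterm phase spans four weeks and the other two span three, so a slightly taller Midterm bar would be expected even for evenly sampled data.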