# TemporalScope Tutorial: TimeFrame and Backend-Agnostic Data Loading

## TimeFrame Modes

The `TimeFrame` class supports two key modes for handling temporal data:

1. **Implicit & Static Time Series** (Default Mode):
   - Time column is treated as a feature for static modeling
   - Supports mixed-frequency workflows
   - No strict temporal ordering enforced
   - Use when: Building ML models where time is just another feature
   - Example: `enforce_temporal_uniqueness=False` (default)

2. **Strict Time Series**:
   - Enforces strict temporal ordering and uniqueness
   - Suitable for forecasting tasks
   - Can validate by groups using `id_col`
   - Use when: Building forecasting models requiring temporal integrity
   - Example: `enforce_temporal_uniqueness=True`
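
To make the distinction concrete, here is a minimal pandas-only sketch of the kind of check that strict mode implies: no timestamp may repeat within a group. The helper name `find_duplicate_timestamps` is hypothetical and is not part of TemporalScope; the library's own validation may work differently.

```python
import pandas as pd


def find_duplicate_timestamps(df, time_col, id_col=None):
    """Return rows whose timestamp repeats (within a group if id_col is set).

    Hypothetical helper for illustration; not TemporalScope's implementation.
    """
    keys = [time_col] if id_col is None else [id_col, time_col]
    # keep=False marks every member of a duplicated key, not just the repeats
    return df[df.duplicated(subset=keys, keep=False)]


df = pd.DataFrame({"id": [1, 1, 2, 2], "time": [1, 2, 1, 1]})

# Without groups, time=1 is shared by rows 0, 2 and 3.
global_dupes = find_duplicate_timestamps(df, "time")

# Within groups, only id=2 repeats time=1 (rows 2 and 3).
group_dupes = find_duplicate_timestamps(df, "time", id_col="id")
print(group_dupes)
```

A frame that would fail a global uniqueness check can still be valid per group, which is why the strict mode accepts an `id_col`.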

## Engineering Design Overview

The `TimeFrame` class uses Narwhals for backend-agnostic DataFrame operations and is designed with several key assumptions:

1. **Preprocessed Data Assumption**:
   - TemporalScope assumes users provide clean, preprocessed data
   - As with TensorFlow and GluonTS, preprocessing should be handled before data is passed to TemporalScope

2. **Time Column Constraints**:
   - `time_col` must be a numeric index or a timestamp
   - Critical for operations like sliding window partitioning and temporal XAI

3. **Numeric Features Requirement**:
   - All features (except `time_col`) must be numeric
   - Ensures compatibility with ML models and XAI techniques

4. **Universal Model Assumption**:
   - Models operate on the entire dataset without hidden groupings
   - Enables seamless integration with SHAP, Boruta-SHAP, and LIME
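
Because the data is assumed to arrive preprocessed, it can help to verify the second and third assumptions before constructing a `TimeFrame`. The following is a rough pandas sketch; `check_assumptions` is a hypothetical helper written for this tutorial, not a TemporalScope function.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype, is_numeric_dtype


def check_assumptions(df, time_col):
    """Return a list of human-readable problems; an empty list means the checks pass.

    Hypothetical helper for illustration; not part of TemporalScope.
    """
    problems = []
    time = df[time_col]
    # Time column constraint: numeric index or timestamp
    if not (is_numeric_dtype(time) or is_datetime64_any_dtype(time)):
        problems.append(f"{time_col!r} must be numeric or datetime, got {time.dtype}")
    # Numeric features requirement: every other column must be numeric
    for col in df.columns:
        if col != time_col and not is_numeric_dtype(df[col]):
            problems.append(f"feature {col!r} is not numeric ({df[col].dtype})")
    return problems


clean = pd.DataFrame({"ds": pd.to_datetime(["2020-01-01", "2020-04-01"]), "x": [1.0, 2.0]})
dirty = clean.assign(region=["north", "south"])

print(check_assumptions(clean, "ds"))
print(check_assumptions(dirty, "ds"))
```

Returning a list of problems rather than raising on the first one makes it easier to fix every offending column in a single pass.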

## Backend Support

TemporalScope leverages Narwhals for backend-agnostic operations, supporting:

- **Production Environment**:
  - `pandas`: Core DataFrame library (default)
  - `narwhals`: Backend-agnostic operations

- **Test Environment** (via hatch):
  - `modin`: Parallelized pandas operations
  - `pyarrow`: Apache Arrow-based processing
  - `polars`: High-performance Rust implementation
  - `dask`: Distributed computing framework

This separation keeps production deployments lightweight, since only pandas and Narwhals are required, while the test suite still exercises every backend listed above.

In [1]:
import pandas as pd
import narwhals as nw

from temporalscope.core.temporal_data_loader import TimeFrame
from temporalscope.datasets.datasets import DatasetLoader

# Load example data
loader = DatasetLoader("macrodata")
data = loader.load_data()

# Create TimeFrame (default mode: time as static feature)
tf = TimeFrame(data, time_col="ds", target_col="realgdp")

# Display configuration
print("TimeFrame Configuration:")
print(f"Mode: {tf.mode}")
print(f"Sort Order: {'Ascending' if tf.ascending else 'Descending'}")

# Preview data
print("\nData Preview:")
print(tf.df.head())

Loading dataset: 'macrodata'
DataFrame shape: (203, 13)
Target column: realgdp
TimeFrame Configuration:
Mode: single_target
Sort Order: Ascending

Data Preview:
    realgdp  realcons  realinv  realgovt  realdpi    cpi     m1  tbilrate  \
0  2710.349    1707.4  286.898   470.045   1886.9  28.98  139.7      2.82   
1  2778.801    1733.7  310.859   481.301   1919.7  29.15  141.7      3.08   
2  2775.488    1751.8  289.226   491.260   1916.4  29.35  140.5      3.82   
3  2785.204    1753.7  299.356   484.052   1931.3  29.37  140.0      4.33   
4  2847.699    1770.5  331.722   462.199   1955.5  29.54  139.6      3.50   

   unemp      pop  infl  realint         ds  
0    5.8  177.146  0.00     0.00 1959-01-01  
1    5.1  177.830  2.34     0.74 1959-04-01  
2    5.3  178.657  2.74     1.09 1959-07-01  
3    5.6  179.386  0.27     4.06 1959-10-01  
4    5.2  180.007  2.31     1.19 1960-01-01  


## Example: Group-Level Temporal Uniqueness

`TimeFrame` can validate temporal uniqueness within each group, which is essential for multi-entity time series:

In [2]:
# Create sample multi-entity data
df = pd.DataFrame(
    {
        "id": [1, 1, 2, 2],
        "time": [1, 2, 1, 3],  # Note: Different groups can share timestamps
        "feature": [0.1, 0.2, 0.3, 0.4],
        "target": [10, 20, 30, 40],
    }
)

# Create TimeFrame with group-level temporal validation
tf = TimeFrame(df, time_col="time", target_col="target", enforce_temporal_uniqueness=True, id_col="id")

print("Data with valid temporal uniqueness within groups:")
print(tf.df)

Data with valid temporal uniqueness within groups:
   id  time  feature  target
0   1     1      0.1      10
2   2     1      0.3      30
1   1     2      0.2      20
3   2     3      0.4      40
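
The row labels in the output above (0, 2, 1, 3) match an ascending sort on `time`. As a sketch, and assuming the ordering does come from sorting on `time_col`, plain pandas reproduces it:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "id": [1, 1, 2, 2],
        "time": [1, 2, 1, 3],
        "feature": [0.1, 0.2, 0.3, 0.4],
        "target": [10, 20, 30, 40],
    }
)

# A stable sort keeps tied timestamps in their original relative order,
# so row 0 (id=1, time=1) stays ahead of row 2 (id=2, time=1).
sorted_df = df.sort_values("time", kind="stable")
print(sorted_df.index.tolist())
```

Rows from different groups are interleaved after the sort, so any per-entity processing still needs to group by `id` explicitly.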


## Example: TimeFrame Metadata

`TimeFrame` includes a `metadata` container for user-defined information, intended for extensibility and future ML framework integrations:

In [3]:
# Store custom metadata
tf.metadata["model_config"] = {
    "type": "LSTM",
    "framework": "PyTorch",
    "hyperparameters": {"hidden_size": 64, "num_layers": 2},
}

print("TimeFrame Metadata:")
print(tf.metadata)

TimeFrame Metadata:
{'model_config': {'type': 'LSTM', 'framework': 'PyTorch', 'hyperparameters': {'hidden_size': 64, 'num_layers': 2}}}
