# Getting Started with TimeSmith

TimeSmith is a time series machine learning library with strict layer boundaries and a clean architecture. This notebook demonstrates the core concepts and workflow.

## What Makes TimeSmith Special?

1. **Strict Layer Boundaries**: Clean separation of concerns
2. **Task Semantics**: Tasks hold meaning, models hold parameters
3. **Composition**: Flexible pipelines and adapters
4. **Type Safety**: Runtime validators for data structures
5. **Production Ready**: Error handling, logging, serialization

Let's dive in!


In [None]:
# Import TimeSmith and standard libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from timesmith import (
    ForecastTask,
    SimpleMovingAverageForecaster,
    backtest_forecaster,
    summarize_backtest,
    make_forecaster_pipeline,
)
from timesmith.examples import LogTransformer, NaiveForecaster

# Set random seed for reproducibility
np.random.seed(42)

print("TimeSmith imported successfully!")
print(f"TimeSmith version: {__import__('timesmith').__version__}")


## 1. Create Time Series Data

Let's create a realistic time series with trend and seasonality.


In [None]:
# Create a time series with trend and seasonality
dates = pd.date_range("2020-01-01", periods=100, freq="D")

# Trend component
trend = np.linspace(100, 150, len(dates))

# Seasonal component (weekly pattern)
seasonal = 10 * np.sin(2 * np.pi * np.arange(len(dates)) / 7)

# Noise
noise = np.random.normal(0, 5, len(dates))

# Combine
y = pd.Series(trend + seasonal + noise, index=dates, name="value")

# Plot
plt.figure(figsize=(12, 5))
plt.plot(y.index, y.values, linewidth=1.5, alpha=0.7)
plt.title("Sample Time Series Data", fontsize=14, fontweight="bold")
plt.xlabel("Date")
plt.ylabel("Value")
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

print(f"Data shape: {y.shape}")
print(f"Date range: {y.index[0]} to {y.index[-1]}")
print(f"Mean: {y.mean():.2f}, Std: {y.std():.2f}")

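A quick way to confirm that the synthetic series really carries the weekly pattern is to remove the trend and compare autocorrelations at different lags. The sketch below uses only NumPy and pandas. The linear least-squares detrending and the `_demo` names are choices made here for illustration, not part of TimeSmith.

```python
import numpy as np
import pandas as pd

# Rebuild a series with the same recipe as above (local generator, so the
# global NumPy random state used elsewhere in the notebook is left untouched).
rng = np.random.default_rng(42)
dates_demo = pd.date_range("2020-01-01", periods=100, freq="D")
t = np.arange(len(dates_demo))
y_demo = pd.Series(
    np.linspace(100, 150, len(t))          # trend
    + 10 * np.sin(2 * np.pi * t / 7)       # weekly seasonality
    + rng.normal(0, 5, len(t)),            # noise
    index=dates_demo,
    name="value",
)

# Remove the trend with a straight-line least-squares fit.
slope, intercept = np.polyfit(t, y_demo.to_numpy(), deg=1)
detrended = y_demo - (slope * t + intercept)

# A 7-day cycle should correlate positively with itself one week back
# and negatively roughly half a cycle back.
lag7 = detrended.autocorr(lag=7)
lag3 = detrended.autocorr(lag=3)
print(f"Estimated slope per day: {slope:.3f}")
print(f"Autocorrelation at lag 7: {lag7:.2f}")
print(f"Autocorrelation at lag 3: {lag3:.2f}")
```

A clearly positive value at lag 7 next to a negative value at lag 3 is what a weekly cycle should produce.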

## 2. Validate Data with TimeSmith Types

TimeSmith provides runtime validators to ensure data quality.


In [None]:
from timesmith.typing import assert_series, is_series

# Check if data is SeriesLike
print(f"Is SeriesLike: {is_series(y)}")

# Validate the data
assert_series(y, name="y")
print("Data validation passed!")

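To give a feel for what a runtime validator of this kind can check, here is a minimal sketch in plain pandas. `check_series_like` is a hypothetical helper written for this notebook. The specific checks are assumptions and are not taken from the implementation of `assert_series`.

```python
import pandas as pd


def check_series_like(obj, name="y"):
    """Hypothetical validator: raise a descriptive error for unusable input."""
    if not isinstance(obj, pd.Series):
        raise TypeError(f"{name} must be a pandas Series, got {type(obj).__name__}")
    if not isinstance(obj.index, pd.DatetimeIndex):
        raise TypeError(f"{name} must have a DatetimeIndex")
    if not obj.index.is_monotonic_increasing:
        raise ValueError(f"{name} index must be sorted in increasing order")
    if obj.index.has_duplicates:
        raise ValueError(f"{name} index must not contain duplicate timestamps")
    if obj.isna().any():
        raise ValueError(f"{name} must not contain missing values")


good = pd.Series(
    [1.0, 2.0, 3.0],
    index=pd.date_range("2020-01-01", periods=3, freq="D"),
)
check_series_like(good)  # passes silently

# Reversing the series breaks the ordering requirement.
bad = good.iloc[::-1]
error_message = None
try:
    check_series_like(bad, name="bad")
except ValueError as exc:
    error_message = str(exc)
    print(f"Rejected: {error_message}")
```

Failing early with a named argument in the message makes a bad input much easier to trace than an error raised deep inside a model.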

## 3. Create a Forecast Task

Tasks hold semantics: they define what problem we're solving, independently of the model that solves it.


In [None]:
# Create a forecast task
task = ForecastTask(
    y=y,
    fh=14,  # Forecast 14 days ahead
    frequency="D"  # Daily frequency
)

print("Task created:")
print(f"  - Data length: {len(task.y)}")
print(f"  - Forecast horizon: {task.fh}")
print(f"  - Frequency: {task.frequency}")

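The idea that a task carries meaning rather than parameters can be illustrated with a small immutable container. `ToyForecastTask` and its `future_index` method are hypothetical names invented for this sketch. They mirror the three arguments used above but say nothing about how `ForecastTask` is implemented.

```python
from dataclasses import dataclass

import pandas as pd
from pandas.tseries.frequencies import to_offset


@dataclass(frozen=True)
class ToyForecastTask:
    """Hypothetical stand-in: the problem statement, with no model state."""

    y: pd.Series
    fh: int
    frequency: str

    def future_index(self) -> pd.DatetimeIndex:
        """Timestamps of the `fh` steps that follow the last observation."""
        start = self.y.index[-1] + to_offset(self.frequency)
        return pd.date_range(start=start, periods=self.fh, freq=self.frequency)


y_small = pd.Series(
    range(10),
    index=pd.date_range("2020-01-01", periods=10, freq="D"),
)
toy_task = ToyForecastTask(y=y_small, fh=3, frequency="D")
future = toy_task.future_index()
print(future)
```

Because the horizon and frequency live on the task, any forecaster can be asked to answer the same question without repeating those details.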

## 4. Fit and Predict with a Forecaster

Let's use a simple moving average forecaster.


In [None]:
# Create and fit forecaster
forecaster = SimpleMovingAverageForecaster(window=7)
forecaster.fit(y)

# Make predictions
forecast = forecaster.predict(fh=14)

print(f"Forecast shape: {forecast.y_pred.shape}")
print(f"Forecast values:\n{forecast.y_pred}")

# Visualize
plt.figure(figsize=(12, 6))
plt.plot(y.index[-30:], y.values[-30:], label="Historical", linewidth=2)
plt.plot(forecast.y_pred.index, forecast.y_pred.values, 
         label="Forecast", linewidth=2, linestyle="--", marker="o")
plt.axvline(x=y.index[-1], color="red", linestyle=":", alpha=0.7, label="Cutoff")
plt.title("Time Series Forecast", fontsize=14, fontweight="bold")
plt.xlabel("Date")
plt.ylabel("Value")
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

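For intuition, a simple moving average forecast can be written in a few lines of pandas. This is a sketch of the textbook technique (a flat forecast equal to the mean of the last `window` observations). `moving_average_forecast` is a hypothetical helper, and `SimpleMovingAverageForecaster` may differ in its details.

```python
import numpy as np
import pandas as pd


def moving_average_forecast(y, window, fh):
    """Repeat the mean of the last `window` observations for `fh` steps.

    Assumes `y` has a regular DatetimeIndex whose `freq` attribute is set.
    """
    if not 1 <= window <= len(y):
        raise ValueError("window must be between 1 and len(y)")
    level = y.iloc[-window:].mean()
    # Build fh timestamps after the last observation (drop the cutoff itself).
    future = pd.date_range(start=y.index[-1], periods=fh + 1, freq=y.index.freq)[1:]
    return pd.Series(level, index=future, name="y_pred")


y_ramp = pd.Series(
    np.arange(10.0),
    index=pd.date_range("2020-01-01", periods=10, freq="D"),
)
sma_pred = moving_average_forecast(y_ramp, window=4, fh=3)
print(sma_pred)
```

On a trending series such a flat forecast lags behind the data, which is worth keeping in mind when reading the plot above.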

## 5. Build a Pipeline

TimeSmith makes it easy to compose transformers and forecasters.


In [None]:
# Create a pipeline: log transform → naive forecaster
transformer = LogTransformer(offset=1.0)
forecaster = NaiveForecaster()
pipeline = make_forecaster_pipeline(transformer, forecaster=forecaster)

# Fit and predict
pipeline.fit(y)
pipeline_forecast = pipeline.predict(fh=14)

print("Pipeline forecast completed!")
print(f"Forecast shape: {pipeline_forecast.y_pred.shape}")

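The composition pattern behind such a pipeline is small enough to sketch by hand: transform the data, fit and predict in the transformed space, then invert the transform on the way out. All three classes below (`ToyLog`, `ToyNaive`, `ToyPipeline`) are hypothetical and written only for illustration. They are not TimeSmith's classes and make no claim about how `make_forecaster_pipeline` works internally.

```python
import numpy as np
import pandas as pd


class ToyLog:
    """Hypothetical log transform with an offset to keep the argument positive."""

    def __init__(self, offset=1.0):
        self.offset = offset

    def transform(self, y):
        return np.log(y + self.offset)

    def inverse_transform(self, y):
        return np.exp(y) - self.offset


class ToyNaive:
    """Hypothetical naive forecaster: repeat the last observed value."""

    def fit(self, y):
        self.last_ = y.iloc[-1]
        self.index_ = y.index  # assumes a regular index with `freq` set
        return self

    def predict(self, fh):
        future = pd.date_range(
            start=self.index_[-1], periods=fh + 1, freq=self.index_.freq
        )[1:]
        return pd.Series(self.last_, index=future, name="y_pred")


class ToyPipeline:
    """Transform, forecast in the transformed space, then invert."""

    def __init__(self, transformer, forecaster):
        self.transformer = transformer
        self.forecaster = forecaster

    def fit(self, y):
        self.forecaster.fit(self.transformer.transform(y))
        return self

    def predict(self, fh):
        return self.transformer.inverse_transform(self.forecaster.predict(fh))


y_toy = pd.Series(
    np.arange(10.0),
    index=pd.date_range("2020-01-01", periods=10, freq="D"),
)
toy_pred = ToyPipeline(ToyLog(offset=1.0), ToyNaive()).fit(y_toy).predict(fh=3)
print(toy_pred)
```

With a naive forecaster the round trip through the log returns the last observed value unchanged. The transform only changes the result for models whose predictions depend on the scale of the data.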

## 6. Run a Backtest

Evaluate your model with time series cross-validation.


In [None]:
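# Run backtest. Note: `forecaster` now refers to the NaiveForecaster assigned
# in section 5, not the moving average forecaster from section 4.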
result = backtest_forecaster(forecaster, task)

# Summarize results
summary = summarize_backtest(result)

print("Backtest Results:")
print("=" * 50)
print("\nAggregate Metrics:")
for key, value in summary["aggregate_metrics"].items():
    print(f"  {key}: {value:.4f}")

print(f"\nNumber of folds: {len(result.results)}")
print("\nPer-fold metrics:")
print(result.results[["fold_id", "mae", "rmse", "mape"]].head())

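Time series cross-validation differs from ordinary k-fold in that every fold trains only on data that precedes its test window. The sketch below shows one common scheme, an expanding training window, together with the three metrics printed above. `expanding_window_backtest` and `naive_forecast` are hypothetical helpers. The fold layout is an assumption and may not match what `backtest_forecaster` does.

```python
import numpy as np
import pandas as pd


def expanding_window_backtest(y, fh, n_folds, forecast_fn):
    """Score `forecast_fn` on the last `n_folds` consecutive windows of length `fh`."""
    if len(y) - n_folds * fh < 1:
        raise ValueError("not enough data for the requested folds")
    rows = []
    for fold_id in range(n_folds):
        # Each fold sees everything before its cutoff and nothing after it.
        cutoff = len(y) - (n_folds - fold_id) * fh
        train = y.iloc[:cutoff]
        actual = y.iloc[cutoff:cutoff + fh].to_numpy()
        error = actual - forecast_fn(train, fh)
        rows.append(
            {
                "fold_id": fold_id,
                "mae": np.mean(np.abs(error)),
                "rmse": np.sqrt(np.mean(error**2)),
                # MAPE in percent; undefined if any actual value is zero.
                "mape": np.mean(np.abs(error / actual)) * 100,
            }
        )
    return pd.DataFrame(rows)


def naive_forecast(train, fh):
    """Repeat the last training value `fh` times."""
    return np.full(fh, train.iloc[-1])


y_line = pd.Series(
    np.arange(1.0, 21.0),
    index=pd.date_range("2020-01-01", periods=20, freq="D"),
)
folds = expanding_window_backtest(y_line, fh=2, n_folds=3, forecast_fn=naive_forecast)
print(folds)
```

On this straight-line series the naive forecast is always one and two steps behind, so the absolute errors are identical in every fold while the percentage error shrinks as the values grow.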

## Summary

You've learned:
- How to create and validate time series data
- How to define forecast tasks
- How to fit and predict with forecasters
- How to build pipelines
- How to evaluate models with backtesting

**Next Steps:**
- Check out the other notebooks for advanced features
- Explore network analysis capabilities
- Learn about model serialization and production workflows
