step_date_cyclic(...) to feature code time into sin and cos #702

@lindeloev

Updated: I initially proposed an API like step_date(..., cyclic = TRUE). I now think that would be confusing, because features = "month" would then mean "time of month" whereas it currently means "month of year". Instead, I propose a new function called step_date_cyclic.

Feature

When forecasting, cyclical time trends can be important predictors. I propose adding a new function called step_date_cyclic, which feature-codes POSIXct columns as trigonometric columns, as mlr3pipelines::PipeOpDateFeatures() does. E.g., for features = "month" it would add two columns: sin(time) and cos(time), where time is scaled so that one month corresponds to one full period. This avoids the loss of information inherent in the current categorical feature coding. And since many learners support numerical features, it lets more models be used for forecasting.

Here's one exposition of the rationale behind this feature coding: https://towardsdatascience.com/cyclical-features-encoding-its-about-time-ce23581845ca

R code demo

Say the user has data with some timestamps:

df = data.frame(
  timestamps = seq(as.POSIXct("2021-02-01"), as.POSIXct("2021-05-01"), length.out = 300)
)

Calling step_date_cyclic(..., features = "month") would add two new columns:

timestamps_secs = as.numeric(df$timestamps)
secs_per_month = 30 * 24 * 60 * 60
df$timestamps_month_sin = sin(timestamps_secs * 2 * pi / secs_per_month)
df$timestamps_month_cos = cos(timestamps_secs * 2 * pi / secs_per_month)

Visually:

plot(timestamps_month_sin ~ timestamps, df)
points(timestamps_month_cos ~ timestamps, df, col = "red")

Jointly, these two columns uniquely identify each time of month:

plot(timestamps_month_cos ~ timestamps_month_sin, df)
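
The uniqueness claim can also be checked numerically. The following is a minimal sketch, not part of the proposal itself: the names phase and recovered are illustrative, tz = "UTC" is added only to make the result reproducible, and the 30-day month is the same approximation used in the demo. It confirms that every (sin, cos) pair lies on the unit circle and that atan2() maps each pair back to its position within the period.

```r
# Rebuild the example timestamps (UTC is an assumption for reproducibility)
timestamps <- seq(as.POSIXct("2021-02-01", tz = "UTC"),
                  as.POSIXct("2021-05-01", tz = "UTC"),
                  length.out = 300)
secs_per_month <- 30 * 24 * 60 * 60

# Fraction of the (approximate) month elapsed, in [0, 1)
phase <- (as.numeric(timestamps) %% secs_per_month) / secs_per_month
month_sin <- sin(2 * pi * phase)
month_cos <- cos(2 * pi * phase)

# Every (sin, cos) pair lies on the unit circle
stopifnot(all(abs(month_sin^2 + month_cos^2 - 1) < 1e-12))

# atan2() inverts the encoding: each pair maps back to exactly one phase
recovered <- (atan2(month_sin, month_cos) / (2 * pi)) %% 1
stopifnot(all(abs(recovered - phase) < 1e-9))
```

Neither column alone has this property: each sin value (and each cos value) occurs twice per period, which is why both columns are needed.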

Postscript

For each feature, two columns would be added. The computation is the same in every case; only secs_per_{feature} changes.
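
To illustrate how this could generalize across features, here is a rough sketch in base R. The helper add_cyclic_features and the secs_per lookup are hypothetical names made up for this example, not existing recipes functions, and the period lengths are fixed approximations (30-day month, 365-day year) in the spirit of the demo above.

```r
# Hypothetical lookup of period lengths in seconds (approximations)
secs_per <- c(
  hour  = 60 * 60,
  day   = 24 * 60 * 60,
  week  = 7 * 24 * 60 * 60,
  month = 30 * 24 * 60 * 60,
  year  = 365 * 24 * 60 * 60
)

# Hypothetical helper: adds <col>_<feature>_sin and <col>_<feature>_cos
add_cyclic_features <- function(data, col, features = names(secs_per)) {
  features <- match.arg(features, names(secs_per), several.ok = TRUE)
  secs <- as.numeric(data[[col]])  # seconds since the Unix epoch
  for (feature in features) {
    angle <- 2 * pi * secs / secs_per[[feature]]
    data[[paste(col, feature, "sin", sep = "_")]] <- sin(angle)
    data[[paste(col, feature, "cos", sep = "_")]] <- cos(angle)
  }
  data
}

df <- data.frame(
  timestamps = seq(as.POSIXct("2021-02-01", tz = "UTC"),
                   as.POSIXct("2021-05-01", tz = "UTC"),
                   length.out = 300)
)
df <- add_cyclic_features(df, "timestamps", features = c("day", "month"))
names(df)
```

Note that in this sketch the phase is measured from the Unix epoch with a fixed period length, so a cycle does not start exactly at a calendar month or year boundary. An actual implementation might prefer to align with calendar boundaries.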
