Cyclic variables, also known as circular predictors, appear in many machine learning tasks that involve time-related or angular data, such as time of day, month of the year, or compass direction. A raw numeric encoding hides the wrap-around: hour 23 and hour 0 are one hour apart, yet their values differ by 23. To represent cyclic variables in a meaningful way, you can use techniques like **circular encoding** or **trigonometric encoding**. In this notebook, you will find examples of both methods in Python.

# Circular Encoding

Circular encoding transforms a cyclic variable into two new features: the sine and the cosine of the angle 2π · value / period. Together, these two features give the variable's position on a unit circle.

Here's an example using the hour of the day as a cyclic variable:

In [1]:
import pandas as pd
import numpy as np

# Sample data
data = pd.DataFrame({'hour_of_day': [0, 3, 12, 15, 21]})

# Circular encoding
data['hour_sin'] = np.sin(2 * np.pi * data['hour_of_day'] / 24)
data['hour_cos'] = np.cos(2 * np.pi * data['hour_of_day'] / 24)

print(data)

   hour_of_day      hour_sin  hour_cos
0            0  0.000000e+00  1.000000
1            3  7.071068e-01  0.707107
2           12  1.224647e-16 -1.000000
3           15 -7.071068e-01 -0.707107
4           21 -7.071068e-01  0.707107


In this example, 'hour_sin' and 'hour_cos' are the circular encoding of the 'hour_of_day' variable. They represent the time of day as points on the unit circle.
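One way to see why the unit-circle representation helps is to compare distances before and after encoding. The sketch below is illustrative only; the helper name `encode_hour` is made up for this example and is not part of the cells above.

```python
import numpy as np

def encode_hour(hour):
    """Map an hour in [0, 24) to its (sin, cos) point on the unit circle."""
    angle = 2 * np.pi * hour / 24
    return np.array([np.sin(angle), np.cos(angle)])

# On the raw scale, 23:00 and 00:00 look 23 units apart
raw_gap = abs(23 - 0)

# After encoding, the Euclidean distance reflects the true one-hour gap
wrap_dist = np.linalg.norm(encode_hour(23) - encode_hour(0))
adjacent_dist = np.linalg.norm(encode_hour(11) - encode_hour(12))

# Hours on opposite sides of the clock are a full diameter apart
opposite_dist = np.linalg.norm(encode_hour(0) - encode_hour(12))

print(raw_gap, wrap_dist, adjacent_dist, opposite_dist)
```

Any two hours that are one hour apart end up at the same distance in the encoded space, whether or not they straddle midnight.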

# Trigonometric Encoding

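Trigonometric encoding is similar to circular encoding but stores the sine and cosine in a single complex-valued feature. Because the point lies on the unit circle, the amplitude (modulus) is always 1, and the phase carries the position within the cycle.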

Here's how you can use trigonometric encoding for the hour of the day:

In [2]:
import pandas as pd
import numpy as np

# Sample data
data = pd.DataFrame({'hour_of_day': [0, 3, 12, 15, 21]})

# Trigonometric encoding: sine as the real part, cosine as the imaginary part
angle = 2 * np.pi * data['hour_of_day'] / 24
data['hour_trig'] = np.sin(angle) + 1j * np.cos(angle)

print(data)

   hour_of_day           hour_trig
0            0  0.000000+1.000000j
1            3  0.707107+0.707107j
2           12  0.000000-1.000000j
3           15 -0.707107-0.707107j
4           21 -0.707107+0.707107j


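In this example, 'hour_trig' is a complex number with modulus 1 that represents the hour of the day. Its real part is the sine and its imaginary part is the cosine of the hour angle, so it carries the same information as 'hour_sin' and 'hour_cos' in a single column.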
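As a quick check on what the complex feature contains, the sketch below rebuilds it with plain NumPy, inspects its modulus, and recovers the hour from its phase. It assumes the same convention as the cell above (sine in the real part, cosine in the imaginary part); the variable names are illustrative. The last step splits the complex value into two real columns, which may be needed because many estimators expect real-valued input.

```python
import numpy as np

hours = np.array([0, 3, 12, 15, 21])
angle = 2 * np.pi * hours / 24

# Same convention as the cell above: real part = sine, imaginary part = cosine
hour_trig = np.sin(angle) + 1j * np.cos(angle)

# Every point lies on the unit circle, so the modulus is 1 throughout
modulus = np.abs(hour_trig)

# With sine in the real part, np.angle returns pi/2 minus the hour angle
recovered_angle = (np.pi / 2 - np.angle(hour_trig)) % (2 * np.pi)
recovered_hours = recovered_angle * 24 / (2 * np.pi)

# Two real-valued columns, equivalent to the sine/cosine pair
as_columns = np.column_stack([hour_trig.real, hour_trig.imag])

print(modulus)
print(recovered_hours)
```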

You can apply these encoding techniques to other cyclic variables like days of the week, months of the year, or compass directions. Just make sure the divisor matches the variable's period (e.g., 24 for hours of the day, 7 for days of the week, 12 for months of the year, 360 for compass directions in degrees).
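A minimal sketch of how the period can be made a parameter is shown below. The helper `add_cyclic_features`, the column names, and the sample values are all hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd

def add_cyclic_features(df, column, period):
    """Return a copy of df with <column>_sin and <column>_cos added."""
    angle = 2 * np.pi * df[column] / period
    out = df.copy()
    out[f"{column}_sin"] = np.sin(angle)
    out[f"{column}_cos"] = np.cos(angle)
    return out

# Hypothetical sample data: weekday in 0..6, compass heading in degrees
sample = pd.DataFrame({
    "day_of_week": [0, 3, 6],
    "heading_deg": [0, 90, 270],
})

encoded = add_cyclic_features(sample, "day_of_week", period=7)
encoded = add_cyclic_features(encoded, "heading_deg", period=360)

print(encoded.round(3))
```

Passing the period explicitly keeps the same function usable for any cycle length, as long as the values in the column span exactly one period.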

# Remarks

- The sine and cosine features already lie in [-1, 1], so they usually need no extra scaling. When using them in your machine learning pipeline, remember to scale or standardize your other features so that all inputs have similar magnitudes.
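As a rough check on feature magnitudes, the sketch below compares the range of the encoded columns with a hypothetical unscaled feature and then standardizes that feature by hand. The `temperature_c` column and its values are invented for illustration.

```python
import numpy as np
import pandas as pd

hours = np.arange(24)
features = pd.DataFrame({
    "hour_sin": np.sin(2 * np.pi * hours / 24),
    "hour_cos": np.cos(2 * np.pi * hours / 24),
    # Hypothetical unscaled feature, made up for this example
    "temperature_c": np.linspace(5.0, 28.0, 24),
})

# Minimum and maximum of each column
ranges = features.agg(["min", "max"])
print(ranges)

# Manual z-score of the unscaled feature (sample standard deviation)
temp = features["temperature_c"]
features["temperature_z"] = (temp - temp.mean()) / temp.std()
```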