# DatetimeSinusoidCalculator
This notebook demonstrates the `DatetimeSinusoidCalculator` class. This transformer derives features in a dataframe by calculating the sine or cosine of a datetime column in a given unit (e.g. hour), with the option to scale the period of the sine or cosine to match the natural period of the unit (e.g. 24 for hours).

In [1]:
import tubular
import tests.test_data as d
from tubular.dates import DatetimeSinusoidCalculator

In [2]:
tubular.__version__

'0.3.3'

## Load dummy dataset

In [3]:
df = d.create_datediff_test_df()
df.shape

(8, 2)

In [4]:
df

| | a | b |
|---|---|---|
| 0 | 1993-09-27 11:58:58 | 2020-05-01 12:59:59 |
| 1 | 2000-03-19 12:59:59 | 2019-12-25 11:58:58 |
| 2 | 2018-11-10 11:59:59 | 2018-11-10 11:59:59 |
| 3 | 2018-10-10 11:59:59 | 2018-11-10 11:59:59 |
| 4 | 2018-10-10 11:59:59 | 2018-09-10 09:59:59 |
| 5 | 2018-10-10 10:59:59 | 2015-11-10 11:59:59 |
| 6 | 2018-12-10 11:59:59 | 2015-11-10 12:59:59 |
| 7 | 1985-07-23 11:59:59 | 2015-07-23 11:59:59 |


## Simple usage

### Initialising DatetimeSinusoidCalculator
The user must specify the following:
- `columns`: the column(s) to operate on, either a single column name or a list of column names.
- `method`: which function to calculate. Accepted values are 'sin', 'cos' or a list containing both.
- `units`: the time unit on which the calculation is carried out. Accepted values are 'year', 'month', 'day', 'hour', 'minute', 'second', 'microsecond'.
- `period`: the period of the output, in the units specified above. Leave this at its default value to keep the period of the sinusoid as 2 pi.
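To make the role of `period` concrete, here is a minimal sketch in plain Python, independent of tubular. The helper `sinusoid` is hypothetical and only assumes that the transformer rescales the extracted unit so that one full cycle spans `period`; it is not taken from the library's source.

```python
import math


def sinusoid(value, method="sin", period=2 * math.pi):
    """Hypothetical helper: sin or cos of `value`, scaled so one cycle spans `period`."""
    func = {"sin": math.sin, "cos": math.cos}[method]
    return func(value * 2 * math.pi / period)


# With the default period the scaling factor is 1, so this is plain sin(value).
print(sinusoid(1.0))

# With period=12, month 3 sits a quarter of the way round the yearly cycle,
# so the angle is pi / 2.
print(sinusoid(3, "sin", period=12))
```

Under this assumption, choosing `period` equal to the natural cycle length of the unit (12 for months, 24 for hours) makes the output repeat exactly once per cycle.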




In [5]:
cosine_month_calculator = DatetimeSinusoidCalculator(
    ["a", "b"],
    ["sin", "cos"],
    "month",
    12,
)

### DatetimeSinusoidCalculator fit
There is no `fit` method for the `DatetimeSinusoidCalculator` class, as it does not learn anything from the input data `X`.

### DatetimeSinusoidCalculator transform
With the transformer initialised as above, `transform` adds four columns to the dataframe: `sin_a`, `cos_a`, `sin_b` and `cos_b`.

In [6]:
df_2 = cosine_month_calculator.transform(df)

In [7]:
df_2

| | a | b | sin_a | cos_a | sin_b | cos_b |
|---|---|---|---|---|---|---|
| 0 | 1993-09-27 11:58:58 | 2020-05-01 12:59:59 | -1.0 | -1.83697e-16 | 0.5 | -0.8660254 |
| 1 | 2000-03-19 12:59:59 | 2019-12-25 11:58:58 | 1.0 | 6.123234000000001e-17 | -2.449294e-16 | 1.0 |
| 2 | 2018-11-10 11:59:59 | 2018-11-10 11:59:59 | -0.5 | 0.8660254 | -0.5 | 0.8660254 |
| 3 | 2018-10-10 11:59:59 | 2018-11-10 11:59:59 | -0.8660254 | 0.5 | -0.5 | 0.8660254 |
| 4 | 2018-10-10 11:59:59 | 2018-09-10 09:59:59 | -0.8660254 | 0.5 | -1.0 | -1.83697e-16 |
| 5 | 2018-10-10 10:59:59 | 2015-11-10 11:59:59 | -0.8660254 | 0.5 | -0.5 | 0.8660254 |
| 6 | 2018-12-10 11:59:59 | 2015-11-10 12:59:59 | -2.449294e-16 | 1.0 | -0.5 | 0.8660254 |
| 7 | 1985-07-23 11:59:59 | 2015-07-23 11:59:59 | -0.5 | -0.8660254 | -0.5 | -0.8660254 |
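As a sanity check, the values in column `a` can be reproduced by hand with the standard library. This is a sketch that assumes the transformer computes `sin(2 * pi * month / period)` and the matching cosine; the helper `month_sin_cos` is hypothetical and not part of tubular. Entries such as `-1.83697e-16` in the output are floating-point approximations of zero.

```python
import math
from datetime import datetime

PERIOD = 12  # months in a year, matching the `period` passed to the transformer


def month_sin_cos(ts):
    """Hypothetical helper: sine and cosine of the month, one cycle per year."""
    angle = 2 * math.pi * ts.month / PERIOD
    return math.sin(angle), math.cos(angle)


# Values of column `a` for rows 0 and 2 of the dummy dataset.
row0_a = datetime(1993, 9, 27, 11, 58, 58)
row2_a = datetime(2018, 11, 10, 11, 59, 59)

sin0, cos0 = month_sin_cos(row0_a)  # September: three quarters of the way round
sin2, cos2 = month_sin_cos(row2_a)  # November

print(sin0, cos0)
print(sin2, cos2)
```

The printed pairs can be compared with the `sin_a` and `cos_a` entries for rows 0 and 2.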


In [8]:
df_2.dtypes

a        datetime64[ns]
b        datetime64[ns]
sin_a           float64
cos_a           float64
sin_b           float64
cos_b           float64
dtype: object