# SeriesDtMethodTransformer
This notebook demonstrates the `SeriesDtMethodTransformer` class. This transformer applies a `pd.Series.dt` method to a specific column in the input `X`. <br>
Because the transformer is generic, many `pd.Series.dt` methods are available within the package without a dedicated transformer having to be implemented for each one. <br>
Most of the `pd.Series.dt` methods simply access attributes, e.g. [pd.Series.dt.year](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.dt.year.html), while a few are actually callable, e.g. [pd.Series.dt.to_period](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.dt.to_period.html).
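
The difference between the two kinds of `pd.Series.dt` member can be seen with plain pandas. The snippet below is a minimal sketch on a small illustrative series (the values are chosen for this example only): `dt.year` is read as an attribute, whereas `dt.to_period` has to be called and accepts keyword arguments.

```python
import pandas as pd

# Small illustrative series; not part of the notebook's dataset.
s = pd.Series(pd.to_datetime(["1993-09-27 11:58:58", "2000-03-19 12:59:59"]))

# Attribute-style member: accessed without parentheses.
years = s.dt.year

# Callable member: must be invoked, and takes keyword arguments.
periods = s.dt.to_period(freq="M")

print(years.tolist())
print(periods.astype(str).tolist())
```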

In [1]:
import pandas as pd
import numpy as np
import datetime

In [2]:
import tubular
from tubular.dates import SeriesDtMethodTransformer

In [3]:
tubular.__version__

'0.3.0'

## Create dummy dataset

In [4]:
df = pd.DataFrame(
    {
        "a": [
            datetime.datetime(1993, 9, 27, 11, 58, 58),
            datetime.datetime(2000, 3, 19, 12, 59, 59),
            datetime.datetime(2018, 11, 10, 11, 59, 59),
            datetime.datetime(2018, 10, 10, 11, 59, 59),
            datetime.datetime(2018, 10, 10, 11, 59, 59),
            datetime.datetime(2018, 10, 10, 10, 59, 59),
            datetime.datetime(2018, 12, 10, 11, 59, 59),
            datetime.datetime(1985, 7, 23, 11, 59, 59),
        ],
        "b": [
            datetime.datetime(2020, 5, 1, 12, 59, 59),
            datetime.datetime(2019, 12, 25, 11, 58, 58),
            datetime.datetime(2018, 11, 10, 11, 59, 59),
            datetime.datetime(2018, 11, 10, 11, 59, 59),
            datetime.datetime(2018, 9, 10, 9, 59, 59),
            datetime.datetime(2015, 11, 10, 11, 59, 59),
            datetime.datetime(2015, 11, 10, 12, 59, 59),
            datetime.datetime(2015, 7, 23, 11, 59, 59),
        ],
    }
)

In [5]:
df

| | a | b |
|---|---|---|
| 0 | 1993-09-27 11:58:58 | 2020-05-01 12:59:59 |
| 1 | 2000-03-19 12:59:59 | 2019-12-25 11:58:58 |
| 2 | 2018-11-10 11:59:59 | 2018-11-10 11:59:59 |
| 3 | 2018-10-10 11:59:59 | 2018-11-10 11:59:59 |
| 4 | 2018-10-10 11:59:59 | 2018-09-10 09:59:59 |
| 5 | 2018-10-10 10:59:59 | 2015-11-10 11:59:59 |
| 6 | 2018-12-10 11:59:59 | 2015-11-10 12:59:59 |
| 7 | 1985-07-23 11:59:59 | 2015-07-23 11:59:59 |


In [6]:
df.dtypes

a    datetime64[ns]
b    datetime64[ns]
dtype: object

## Simple usage

### Initialising SeriesDtMethodTransformer

The transformer is initialised with the following arguments: <br>
- `new_column_name` the name of the column to assign the outputs of the `pd.Series.dt` method to <br>
- `pd_method_name` the name of the `pd.Series.dt` method to be called <br>
- `column` the column in the `DataFrame` passed to the `transform` method to be transformed <br>
- `pd_method_kwargs` a dictionary of keyword arguments passed to the `pd.Series.dt` method when it is called; this only applies if the method is callable, otherwise it is ignored <br>
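
To make the role of `pd_method_kwargs` concrete, the following is a rough sketch of the described behaviour written with plain pandas. The helper `apply_dt_method` is hypothetical and is not tubular's actual implementation; it only mirrors the argument names listed above and assumes that keyword arguments are forwarded when the `dt` member is callable and dropped otherwise.

```python
import pandas as pd


def apply_dt_method(X, column, pd_method_name, new_column_name, pd_method_kwargs=None):
    """Hypothetical helper illustrating the idea; not tubular's code."""
    kwargs = pd_method_kwargs or {}
    # Look up the requested member on the .dt accessor by name.
    member = getattr(X[column].dt, pd_method_name)
    X = X.copy()  # leave the caller's DataFrame untouched in this sketch
    # Callable members (e.g. to_period) receive the kwargs;
    # plain attributes (e.g. month) are used as they are.
    X[new_column_name] = member(**kwargs) if callable(member) else member
    return X


# Illustrative data, chosen for this sketch only.
demo = pd.DataFrame({"a": pd.to_datetime(["1993-09-27", "2018-11-10"])})

with_month = apply_dt_method(demo, "a", "month", "a_month")
with_period = apply_dt_method(demo, "a", "to_period", "a_period", {"freq": "M"})

print(with_month["a_month"].tolist())
print(with_period["a_period"].astype(str).tolist())
```

Dispatching on `callable(member)` is one simple way to handle both kinds of `dt` member through a single code path.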

In [7]:
month_transformer = SeriesDtMethodTransformer(
    column="a", pd_method_name="month", new_column_name="a_month"
)

### SeriesDtMethodTransformer fit
There is no fit method for the `SeriesDtMethodTransformer` as the methods that it can run do not 'learn' anything from the data.

### SeriesDtMethodTransformer transform
When running transform with this configuration, a new column `a_month` is added to the input `X`, containing the result of running `df['a'].dt.month`.

In [8]:
df_2 = month_transformer.transform(df)

In [9]:
df_2[["a", "a_month"]].head()

| | a | a_month |
|---|---|---|
| 0 | 1993-09-27 11:58:58 | 9 |
| 1 | 2000-03-19 12:59:59 | 3 |
| 2 | 2018-11-10 11:59:59 | 11 |
| 3 | 2018-10-10 11:59:59 | 10 |
| 4 | 2018-10-10 11:59:59 | 10 |
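
As a sanity check, the same values can be reproduced with pandas alone. The sketch below rebuilds the first three rows of column `a` and uses `DataFrame.assign` to add the month column without the transformer; the variable names `check` and `check_2` are only illustrative.

```python
import datetime

import pandas as pd

# First three rows of column "a" from the dummy dataset above.
check = pd.DataFrame(
    {
        "a": [
            datetime.datetime(1993, 9, 27, 11, 58, 58),
            datetime.datetime(2000, 3, 19, 12, 59, 59),
            datetime.datetime(2018, 11, 10, 11, 59, 59),
        ]
    }
)

# Equivalent of the transformer's output column, computed directly.
check_2 = check.assign(a_month=check["a"].dt.month)

print(check_2["a_month"].tolist())
```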
