# ToDatetimeTransformer
This notebook demonstrates the `ToDatetimeTransformer` class, which converts a specified column to datetime type. The transformer performs the conversion by calling `pd.to_datetime`.

In [1]:
import pandas as pd
import numpy as np

In [2]:
import tubular
from tubular.dates import ToDatetimeTransformer

In [3]:
tubular.__version__

'0.3.0'

## Load dummy dataset

In [4]:
df = pd.DataFrame(
    {"a": [1950, 1960, 2000, 2001, np.nan, 2010], "b": [1, 2, 3, 4, 5, np.nan]}
)

In [5]:
df

| | a | b |
|---|---|---|
| 0 | 1950.0 | 1.0 |
| 1 | 1960.0 | 2.0 |
| 2 | 2000.0 | 3.0 |
| 3 | 2001.0 | 4.0 |
| 4 | NaN | 5.0 |
| 5 | 2010.0 | NaN |


## Simple usage

### Initialising ToDatetimeTransformer
The user must specify the following:
- `column` giving the column to convert to datetime
- `new_column_name` giving the name of the column to assign the converted column to

The user can optionally supply `to_datetime_kwargs`, a dictionary of keyword arguments that is passed to `pd.to_datetime` when it is called in the `transform` method. The available arguments are listed in the [pandas documentation](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.to_datetime.html).
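To give a feel for what such keyword arguments do, the sketch below calls `pd.to_datetime` directly, without tubular. The series names and the year strings are made up for illustration; `format` and `errors` are standard `pd.to_datetime` arguments.

```python
import pandas as pd

# Year strings parsed with an explicit format; the missing value becomes NaT.
years = pd.Series(["1950", "1960", None])
parsed = pd.to_datetime(years, format="%Y")

# errors="coerce" turns values that do not match the format into NaT
# instead of raising an exception.
messy = pd.Series(["2000", "not a year"])
coerced = pd.to_datetime(messy, format="%Y", errors="coerce")

print(parsed)
print(coerced)
```

Any such argument could be placed in the `to_datetime_kwargs` dictionary, for example `{"format": "%Y", "errors": "coerce"}`.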

In [6]:
to_dt_1 = ToDatetimeTransformer(
    column="a", new_column_name="a_dt", to_datetime_kwargs={"format": "%Y"}
)

### ToDatetimeTransformer fit
There is no `fit` method for the `ToDatetimeTransformer` class; it does not learn anything from the input data `X`.
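Because nothing is learned from the data, the whole behaviour can be captured by a stateless wrapper around `pd.to_datetime`. The following is a minimal sketch of that idea, not tubular's actual implementation: the class name `SimpleToDatetime`, the example column names, and the choice to copy the input are all assumptions made for illustration.

```python
import pandas as pd


class SimpleToDatetime:
    """Hypothetical stand-in for illustration; not tubular's source code."""

    def __init__(self, column, new_column_name, to_datetime_kwargs=None):
        self.column = column
        self.new_column_name = new_column_name
        self.to_datetime_kwargs = to_datetime_kwargs or {}

    def transform(self, X):
        # Work on a copy so the caller's DataFrame is left unchanged
        # (an assumption of this sketch).
        X = X.copy()
        X[self.new_column_name] = pd.to_datetime(
            X[self.column], **self.to_datetime_kwargs
        )
        return X


raw = pd.DataFrame({"year": ["1950", "2001", None]})
out = SimpleToDatetime("year", "year_dt", {"format": "%Y"}).transform(raw)
print(out)
```

Since the object holds only its configuration, `transform` can be called on any DataFrame that contains the named column, with no prior fitting step.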

### ToDatetimeTransformer transform
After `transform` runs, the returned DataFrame contains a new column called `a_dt` holding the converted values.

In [7]:
df_2 = to_dt_1.transform(df)

In [8]:
df_2

| | a | b | a_dt |
|---|---|---|---|
| 0 | 1950.0 | 1.0 | 1950-01-01 |
| 1 | 1960.0 | 2.0 | 1960-01-01 |
| 2 | 2000.0 | 3.0 | 2000-01-01 |
| 3 | 2001.0 | 4.0 | 2001-01-01 |
| 4 | NaN | 5.0 | NaT |
| 5 | 2010.0 | NaN | 2010-01-01 |


In [9]:
df_2.dtypes

a              float64
b              float64
a_dt    datetime64[ns]
dtype: object