**Extracting Patterns from the Timestamp Using Sine and Cosine**

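Encoding the timestamp with sine and cosine can make sense when the patterns are cyclical, that is, when activity runs 24/7 and the end of one cycle leads straight into the start of the next.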

In this way, 23:59 is very close to 00:00, as it should be.

We map each cyclical variable onto a circle such that the lowest value for that variable appears right next to the largest value. We compute the x- and y-components of that point using the cosine and sine functions, respectively. Here's what it looks like for the "hours" variable. Zero (midnight) is on the right, and the hours increase counterclockwise around the circle.

Source: David Kaleko (http://blog.davidkaleko.com/feature-engineering-cyclical-features.html)

![title](http://i65.tinypic.com/2akh56x.jpg)
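
As a minimal sketch of the idea (the helper name `encode_hour` is hypothetical and not part of the original notebook), the snippet below places hours on the unit circle and compares distances between the encoded points. On the raw scale, 23 and 0 are 23 units apart; on the circle, they should be exactly as close as 0 and 1.

```python
import numpy as np

def encode_hour(hour, period=24):
    """Map an hour (0-23) to (x, y) = (cos, sin) on the unit circle.

    Hypothetical helper for illustration only.
    """
    angle = 2.0 * np.pi * hour / period
    return np.array([np.cos(angle), np.sin(angle)])

# Euclidean distances between encoded hours
d_23_0 = np.linalg.norm(encode_hour(23) - encode_hour(0))   # across midnight
d_0_1 = np.linalg.norm(encode_hour(0) - encode_hour(1))     # ordinary neighbours
d_0_12 = np.linalg.norm(encode_hour(0) - encode_hour(12))   # opposite sides

print(d_23_0, d_0_1, d_0_12)
```

The first two distances are equal, and midnight versus noon is the largest possible distance (the diameter of the circle). Both the sine and the cosine are needed: either one alone maps two different hours to the same value.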

In [None]:
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

import os
print(os.listdir("../input"))

In [None]:
#import only a few rows
train = pd.read_csv('../input/train.csv', nrows=1000)

In [None]:
train.columns = ['ip', 'app', 'device', 'os', 'channel', 'click_time', 'attributed_time', 'is_attributed']

**STEP 1: Breaking Up Date Data into Multiple Features**

The goal is to break up the timestamp into separate columns for year, month, day, hour and minute.

In [None]:
df = pd.DataFrame(data=train, columns=['click_time'])

In [None]:
df.click_time = pd.to_datetime(df.click_time)

In [None]:
df['new_formatted_date'] = df.click_time.dt.strftime('%d/%m/%y %H:%M')

In [None]:
df.new_formatted_date.head(3)

In [None]:
# pandas.Series.dt
df['month'] = df.click_time.dt.month
df['day'] = df.click_time.dt.day
df['year'] = df.click_time.dt.year
df['hour'] = df.click_time.dt.hour
df['minute'] = df.click_time.dt.minute
df.head(3)

In [None]:
print('Unique values of month:', df.month.unique())
print('Unique values of day:', df.day.unique())
print('Unique values of year:', df.year.unique())
print('Unique values of hour:', df.hour.unique())
print('Unique values of minute:', df.minute.unique())

**The Magic :-)**

Only day, hour and minute are relevant in this dataset.
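
The same transformation is applied to each of these columns, only with a different period, so it could be wrapped in a small helper. This is a sketch under assumptions: the function name `add_cyclical` and the tiny demo frame are made up for illustration and are not part of the original notebook or the competition data.

```python
import numpy as np
import pandas as pd

def add_cyclical(frame, column, period):
    """Return a copy of `frame` with <column>_sin and <column>_cos added.

    Hypothetical helper; `period` is the length of one full cycle
    (e.g. 24 for hours, 60 for minutes).
    """
    out = frame.copy()
    angle = 2.0 * np.pi * out[column] / period
    out[column + '_sin'] = np.sin(angle)
    out[column + '_cos'] = np.cos(angle)
    return out

# Tiny made-up example, not the competition data
demo = pd.DataFrame({'hour': [0, 6, 12, 18, 23]})
demo = add_cyclical(demo, 'hour', period=24)
print(demo.round(3))
```

Passing the period explicitly keeps the choice visible for each column, which helps avoid feeding the wrong column or the wrong cycle length into the formula.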

In [None]:
# Day of month (1-31). A period of 30 is an approximation:
# day 31 lands on the same point as day 1.
df['day_sin'] = np.sin(df.day*(2.*np.pi/30))
df['day_cos'] = np.cos(df.day*(2.*np.pi/30))

In [None]:
# Hour (0-23), period 24
df['hour_sin'] = np.sin(df.hour*(2.*np.pi/24))
df['hour_cos'] = np.cos(df.hour*(2.*np.pi/24))

In [None]:
# Minute (0-59), period 60
df['minute_sin'] = np.sin(df.minute*(2.*np.pi/60))
df['minute_cos'] = np.cos(df.minute*(2.*np.pi/60))

In [None]:
# Concatenate
concatenated = pd.concat([train, df], axis=1)

In [None]:
# Define X
X = concatenated[['ip', 'app', 'device', 'os', 'channel', 'day_sin','day_cos', 'hour_sin', 'hour_cos',
                  'minute_sin', 'minute_cos']]

In [None]:
# Now we have timestamp with sin and cosine:
X.head(3)

I stop here. The rest is easy.

I tested this approach against Bojan Tunguz's post and did NOT get better accuracy, but it might work in another case.
https://www.kaggle.com/tunguz/xgboost-starter