Notes by Jake VanderPlas
https://github.com/jakevdp/PythonDataScienceHandbook

Basic pandas functionality:
http://pandas.pydata.org/pandas-docs/stable/basics.html#dt-accessor

### Native Python date and time

To create a date, use Python's built-in `datetime` type:

In [1]:
from datetime import datetime
datetime(year=2015, month=7, day=4)

datetime.datetime(2015, 7, 4, 0, 0)

`dateutil` parses dates written in a variety of string formats:

In [2]:
from dateutil import parser
date = parser.parse("4th of July, 2015")
date

datetime.datetime(2015, 7, 4, 0, 0)
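
A small sketch of the same idea with a few other spellings of the same date. The input strings are made up for illustration and are not from the original notes. The `dayfirst=True` argument is used for the last one because `dateutil` reads ambiguous numeric dates month-first by default:

```python
from datetime import datetime
from dateutil import parser

# Illustrative inputs (assumed for this sketch), all meaning 4 July 2015
samples = ["2015-07-04", "July 4, 2015", "4 Jul 2015"]
parsed = [parser.parse(s) for s in samples]
print(parsed)

# Ambiguous numeric date: read as day/month/year only with dayfirst=True
ambiguous = parser.parse("04/07/2015", dayfirst=True)
print(ambiguous)
```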

If you have a `datetime` object, you can get the day of the week:
- Here we use the standard string format code `"%A"` (full weekday name). See the `datetime` documentation for details.

In [3]:
date.strftime('%A')


'Saturday'
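
For reference, a minimal sketch of a few other standard format codes applied to the same date. The particular layouts are chosen only for illustration, and the weekday and month names assume the default (English) locale:

```python
from datetime import datetime

d = datetime(2015, 7, 4)

weekday = d.strftime('%A')        # full weekday name
iso = d.strftime('%Y-%m-%d')      # year-month-day with zero padding
dotted = d.strftime('%d.%m.%Y')   # day.month.year
short = d.strftime('%a %d %b %Y') # abbreviated weekday and month names

print(weekday, iso, dotted, short, sep='\n')
```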

**The main drawback of these modules is that they are rather slow when working with large arrays of dates.**

### NumPy `datetime64` type

- A date is encoded as a 64-bit integer
- `datetime64` requires a specific input format (ISO 8601 style, such as `YYYY-MM-DD`)

In [4]:
import numpy as np
date = np.array('2015-07-04', dtype=np.datetime64)
date

array('2015-07-04', dtype='datetime64[D]')
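
To illustrate the strict input format, here is a hedged sketch: ISO-style strings are accepted and the time unit is inferred from the precision of the string, while a free-form string that `dateutil` would handle is rejected. The variable names are only for this example:

```python
import numpy as np

# ISO 8601 strings are accepted; the unit follows the precision given
day = np.datetime64('2015-07-04')            # day precision
minute = np.datetime64('2015-07-04 12:00')   # minute precision
print(day.dtype, minute.dtype)

# A free-form date string is not parsed by NumPy
try:
    np.datetime64('4th of July, 2015')
    parsed_ok = True
except ValueError:
    parsed_ok = False
print(parsed_ok)
```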

Once the date is in this format, however, you can use it in vectorized operations:

In [5]:
date + np.arange(12)

array(['2015-07-04', '2015-07-05', '2015-07-06', '2015-07-07',
       '2015-07-08', '2015-07-09', '2015-07-10', '2015-07-11',
       '2015-07-12', '2015-07-13', '2015-07-14', '2015-07-15'],
      dtype='datetime64[D]')
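
As a rough sketch of what "vectorized" means here, the same twelve dates can be built with an explicit Python loop over native `date` objects. The comparison below only checks that both approaches give the same dates; it does not measure speed:

```python
from datetime import date, timedelta
import numpy as np

# Native Python: one addition per element in an explicit loop
start = date(2015, 7, 4)
py_dates = [start + timedelta(days=i) for i in range(12)]

# NumPy: a single vectorized addition over the whole array
np_dates = np.array('2015-07-04', dtype=np.datetime64) + np.arange(12)

# tolist() converts day-precision datetime64 values to datetime.date objects
same = np_dates.tolist() == py_dates
print(same)
```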

**However, NumPy's `datetime64` lacks some of the convenient operations that native Python `datetime` objects have.**
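
One example of such a missing operation is `strftime`. A minimal sketch, assuming a day-precision value: convert the `datetime64` scalar back to a native Python object with `.item()` and format that instead:

```python
import numpy as np

d64 = np.datetime64('2015-07-04')

# datetime64 scalars do not provide strftime
has_strftime = hasattr(d64, 'strftime')
print(has_strftime)

# For day precision, item() returns a datetime.date, which does
py_date = d64.item()
weekday = py_date.strftime('%A')
print(weekday)
```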

## Date and time in Pandas

- Pandas uses the `Timestamp` object
- From a group of these objects it builds a `DatetimeIndex`, which can be used as the index of a `Series` or `DataFrame`
- It is as fast as NumPy's `datetime64` and offers the convenient operations of native Python `datetime` objects

In [6]:
import pandas as pd
date = pd.to_datetime("4th of July, 2015")
date

Timestamp('2015-07-04 00:00:00')

In [7]:
date.strftime("%A")

'Saturday'

In [8]:
date + pd.to_timedelta(np.arange(12), 'D')

DatetimeIndex(['2015-07-04', '2015-07-05', '2015-07-06', '2015-07-07',
               '2015-07-08', '2015-07-09', '2015-07-10', '2015-07-11',
               '2015-07-12', '2015-07-13', '2015-07-14', '2015-07-15'],
              dtype='datetime64[ns]', freq=None)
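
The two strengths combine: the convenient operations are also available in vectorized form on a whole `DatetimeIndex`. A short sketch using `day_name()` and `dayofweek` on three consecutive days (the three-day range is chosen only to keep the output small):

```python
import numpy as np
import pandas as pd

dates = pd.to_datetime('2015-07-04') + pd.to_timedelta(np.arange(3), unit='D')

names = dates.day_name()   # weekday names for every element at once
dow = dates.dayofweek      # Monday=0 ... Sunday=6

print(list(names))
print(list(dow))
```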

### Indexing by time

Create a `Series` object with data indexed by time:

In [9]:
index = pd.DatetimeIndex(['2014-07-04','2014-08-04', '2015-07-04', '2015-08-04' ])
data = pd.Series([0, 1, 2, 3], index=index)
data

2014-07-04    0
2014-08-04    1
2015-07-04    2
2015-08-04    3
dtype: int64

Now we can use the usual index operations, for example slicing:

In [10]:
data['2014-07-04' : '2015-07-04']

2014-07-04    0
2014-08-04    1
2015-07-04    2
dtype: int64

There are also special operations available only for date indices, for example slicing by year:

In [11]:
data['2015']

2015-07-04    2
2015-08-04    3
dtype: int64
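
The same partial-string selection can also be written with `.loc`, the explicit label-based indexer, and it works at other resolutions such as a single month. A hedged sketch that rebuilds the series from above so it can run on its own:

```python
import pandas as pd

index = pd.DatetimeIndex(['2014-07-04', '2014-08-04', '2015-07-04', '2015-08-04'])
data = pd.Series([0, 1, 2, 3], index=index)

year_2015 = data.loc['2015']            # every entry in 2015
aug_2014 = data.loc['2014-08']          # every entry in August 2014
span = data.loc['2014-08':'2015-07']    # both end months are included

print(year_2015, aug_2014, span, sep='\n\n')
```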

### Pandas time series data structures

+ For timestamps (specific moments in time), Pandas uses the `Timestamp` type. It replaces the native Python `datetime` and is based on the more efficient `numpy.datetime64`. The corresponding index structure is `DatetimeIndex`
+ For time periods, Pandas uses the `Period` type, which encodes a fixed-frequency interval based on `numpy.datetime64`. The corresponding index structure is `PeriodIndex`
+ For time deltas (durations), Pandas uses the `Timedelta` type, based on `numpy.timedelta64`. The corresponding index structure is `TimedeltaIndex`
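
A minimal sketch of the three scalar types side by side. The dates are arbitrary and chosen only for illustration:

```python
import pandas as pd

# A moment in time
ts = pd.Timestamp('2015-07-04')

# A whole day as an interval with a start and an end
period = pd.Period('2015-07-04', freq='D')
print(period.start_time, period.end_time)

# A duration, here obtained by subtracting two timestamps
delta = pd.Timestamp('2015-07-06') - ts
print(delta)
```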

`pd.to_datetime()` accepts strings and parses many date formats. It returns a `Timestamp` if a single date is given and a `DatetimeIndex` if a sequence of dates is passed:

In [12]:
dates = pd.to_datetime([datetime(2015, 7, 3), '4th of July, 2015',
                        '2015-Jul-6', '07-07-2015', '20150708', '12/31/2015'])
dates

DatetimeIndex(['2015-07-03', '2015-07-04', '2015-07-06', '2015-07-07',
               '2015-07-08', '2015-12-31'],
              dtype='datetime64[ns]', freq=None)
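
A short sketch of the two return types, plus the `format` argument, which removes the guesswork for strings such as `'07-08-2015'` that could be read month-first or day-first. The day-first layout here is an assumption made for the example:

```python
import pandas as pd

single = pd.to_datetime('2015-07-04')                 # one string
many = pd.to_datetime(['2015-07-04', '2015-07-05'])   # a list of strings
print(type(single).__name__, type(many).__name__)

# Explicit format: read the string as day-month-year
explicit = pd.to_datetime('07-08-2015', format='%d-%m-%Y')
print(explicit)
```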

We can convert a `DatetimeIndex` object into a `PeriodIndex` with `to_period()`, passing a frequency code. Here we use `'D'`, which denotes a daily frequency:

In [13]:
dates.to_period('D')

PeriodIndex(['2015-07-03', '2015-07-04', '2015-07-06', '2015-07-07',
             '2015-07-08', '2015-12-31'],
            dtype='period[D]', freq='D')
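
Other frequency codes work the same way. As a hedged sketch, `'M'` maps each date to the calendar month that contains it, so several dates can collapse to the same period. The dates are given as ISO strings here to keep the example independent of the parsing above:

```python
import pandas as pd

dates = pd.to_datetime(['2015-07-03', '2015-07-04', '2015-12-31'])

# Monthly periods: the two July dates fall into the same period
months = dates.to_period('M')
labels = [str(p) for p in months]
print(labels)
```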

A `TimedeltaIndex` is returned when one date is subtracted from a `DatetimeIndex`:

In [14]:
dates - dates[0]

TimedeltaIndex(['0 days', '1 days', '3 days', '4 days', '5 days', '181 days'], dtype='timedelta64[ns]', freq=None)
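
If plain numbers are needed instead of `Timedelta` values, the `.days` attribute of the resulting index gives whole days. A minimal sketch using the same six dates written as ISO strings:

```python
import pandas as pd

dates = pd.to_datetime(['2015-07-03', '2015-07-04', '2015-07-06',
                        '2015-07-07', '2015-07-08', '2015-12-31'])

deltas = dates - dates[0]     # TimedeltaIndex
days = list(deltas.days)      # whole days as integers
print(days)
```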

In [None]:
!curl -o FremontBridge.csv https://data.seattle.gov/api/views/65db-xm6k/rows.csv?access