# Introduction

- Time stamps reference particular moments in time (e.g., July 4th, 2015, at 7:00 a.m.).

- Time intervals and periods reference a length of time between a particular beginning and end point; for example, the year 2015. Periods usually reference a special case of time intervals in which each interval is of uniform length and does not overlap (e.g., the 24-hour periods that make up days).


- Time deltas or durations reference an exact length of time (e.g., a duration of 22.56 seconds).
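
The three notions above can be sketched with Python's standard library alone. This is only an illustration: the variable names are made up for this example, and modeling an interval as a plain start/end pair with an exclusive end is an assumption, not something the notes prescribe.

```python
from datetime import datetime, timedelta

# Time stamp: one particular moment
stamp = datetime(2015, 7, 4, 7, 0)

# Time interval: a begin point and an end point (assumed here: end is exclusive);
# this pair covers the year 2015
year_start, year_end = datetime(2015, 1, 1), datetime(2016, 1, 1)

# Duration: an exact length of time
duration = timedelta(seconds=22.56)

in_interval = year_start <= stamp < year_end  # does the stamp fall inside 2015?
interval_length = year_end - year_start       # subtracting two stamps gives a duration
later = stamp + duration                      # stamp plus duration gives a new stamp

print(in_interval, interval_length.days, later)
```

Subtracting two time stamps yields a duration, and adding a duration to a time stamp yields another time stamp; the same relationships hold for the NumPy and Pandas types used in the rest of these notes.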

In [6]:
import numpy as np
import pandas as pd

---

# Dates and Times in Python

---Native Python dates and times: datetime and dateutil---

In [1]:
from datetime import datetime
datetime(year=2019, month=7, day=11)

datetime.datetime(2019, 7, 11, 0, 0)

In [2]:
from dateutil import parser
date = parser.parse("11th of July, 2019")
date

datetime.datetime(2019, 7, 11, 0, 0)

In [5]:
date.strftime('%A')

'Thursday'

---Typed arrays of times: Numpy's datetime64---

In [24]:
# requires a very specific input format
date = np.array('2019-07-11', dtype=np.datetime64)
date

array('2019-07-11', dtype='datetime64[D]')

In [11]:
date + np.arange(12)

array(['2019-07-11', '2019-07-12', '2019-07-13', '2019-07-14',
       '2019-07-15', '2019-07-16', '2019-07-17', '2019-07-18',
       '2019-07-19', '2019-07-20', '2019-07-21', '2019-07-22'],
      dtype='datetime64[D]')

In [13]:
date + np.arange(0, 10, 2)

array(['2019-07-11', '2019-07-13', '2019-07-15', '2019-07-17',
       '2019-07-19'], dtype='datetime64[D]')

Because a NumPy datetime64 array stores its dates in a single typed array, operations on it are vectorized and run much faster than the same work on native Python datetime objects.
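
To make the comparison concrete, here is a small sketch that shifts twelve dates by one week in both styles. It does not measure timing; it only contrasts the Python-level loop with the single vectorized NumPy operation and checks that both give the same dates. The variable names are illustrative.

```python
from datetime import datetime, timedelta
import numpy as np

# Native Python: one datetime object per element, shifted in a Python-level loop
py_dates = [datetime(2019, 7, 11) + timedelta(days=i) for i in range(12)]
py_shifted = [d + timedelta(days=7) for d in py_dates]

# NumPy: one typed array, shifted in a single vectorized operation
np_dates = np.array('2019-07-11', dtype=np.datetime64) + np.arange(12)
np_shifted = np_dates + np.timedelta64(7, 'D')

# tolist() on a day-resolution array returns datetime.date objects,
# so the two results can be compared directly
same = [d.date() for d in py_shifted] == np_shifted.tolist()
print(np_shifted.dtype, same)
```

The gap in speed grows with array size, since the NumPy version avoids creating and dispatching on one Python object per date.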

In [15]:
np.datetime64('2019-07-11') # day-based datetime

numpy.datetime64('2019-07-11')

In [17]:
np.datetime64('2019-07-11 12:00') # minute-based time

numpy.datetime64('2019-07-11T12:00')

In [19]:
np.datetime64('2019-07-11 12:59:59.50', 'ns') # nanosecond-based time

numpy.datetime64('2019-07-11T12:59:59.500000000')

---Dates and times in Pandas: Best of both worlds---

In [25]:
date = pd.to_datetime('11th of July, 2019')
date

Timestamp('2019-07-11 00:00:00')

In [27]:
date1 = pd.to_datetime('2019-07-11')
date1

Timestamp('2019-07-11 00:00:00')

In [29]:
date2 = pd.to_datetime(np.datetime64('1997-09-21'))
date2

Timestamp('1997-09-21 00:00:00')

In [30]:
date2.strftime('%A')

'Sunday'

In [31]:
date + pd.to_timedelta(np.arange(12), 'D')

DatetimeIndex(['2019-07-11', '2019-07-12', '2019-07-13', '2019-07-14',
               '2019-07-15', '2019-07-16', '2019-07-17', '2019-07-18',
               '2019-07-19', '2019-07-20', '2019-07-21', '2019-07-22'],
              dtype='datetime64[ns]', freq=None)

In [32]:
date + pd.to_timedelta(np.arange(10), 'W')

DatetimeIndex(['2019-07-11', '2019-07-18', '2019-07-25', '2019-08-01',
               '2019-08-08', '2019-08-15', '2019-08-22', '2019-08-29',
               '2019-09-05', '2019-09-12'],
              dtype='datetime64[ns]', freq=None)

In [33]:
pd.to_timedelta(np.arange(10), 'D')

TimedeltaIndex(['0 days', '1 days', '2 days', '3 days', '4 days', '5 days',
                '6 days', '7 days', '8 days', '9 days'],
               dtype='timedelta64[ns]', freq=None)

---

# Pandas Time Series: Indexing by Time

In [35]:
index = pd.DatetimeIndex(['2014-07-04', '2014-08-04',
                          '2015-07-04', '2015-08-04'])
data = pd.Series([0, 1, 2, 3], index=index)
data

2014-07-04    0
2014-08-04    1
2015-07-04    2
2015-08-04    3
dtype: int64

In [36]:
data['2014-08-04':'2015-07-04']

2014-08-04    1
2015-07-04    2
dtype: int64

In [37]:
data['2015']

2015-07-04    2
2015-08-04    3
dtype: int64

In [38]:
data['2015-07']

2015-07-04    2
dtype: int64

---

# Pandas Time Series Data Structures

- For time stamps, Pandas provides the Timestamp type. It is essentially a replacement for Python’s native datetime, but is based on the more efficient numpy.datetime64 data type. The associated index structure is DatetimeIndex.

- For time periods, Pandas provides the Period type. This encodes a fixed-frequency interval based on numpy.datetime64. The associated index structure is PeriodIndex.

- For time deltas or durations, Pandas provides the Timedelta type. Timedelta is a more efficient replacement for Python’s native datetime.timedelta type, and is based on numpy.timedelta64. The associated index structure is TimedeltaIndex.
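
As a rough sketch of how the three scalar types relate to one another and to their index structures (the specific dates and variable names are chosen only for illustration):

```python
import pandas as pd

stamp = pd.Timestamp('2019-07-11 12:00')   # a single moment
period = pd.Period('2019-07', freq='M')    # the whole month of July 2019
delta = pd.Timedelta(days=1, hours=12)     # an exact duration

# A Period knows its own boundaries, so membership can be tested directly
inside = period.start_time <= stamp <= period.end_time

# Timestamp + Timedelta gives another Timestamp
shifted = stamp + delta

# Each scalar type has a matching index structure
stamp_index = pd.DatetimeIndex([stamp])
period_index = pd.PeriodIndex([period])
delta_index = pd.TimedeltaIndex([delta])

print(inside, shifted)
print(type(stamp_index).__name__, type(period_index).__name__,
      type(delta_index).__name__)
```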

In [40]:
# passing a single date to to_datetime() yields a Timestamp
# passing a series of dates yields a DatetimeIndex by default
# to_datetime() accepts many different input formats
dates = pd.to_datetime([datetime(2019, 7, 11), '12th of July, 2019', 
                        '2019-Jul-13', '14-07-2019', '20190715'])
dates

DatetimeIndex(['2019-07-11', '2019-07-12', '2019-07-13', '2019-07-14',
               '2019-07-15'],
              dtype='datetime64[ns]', freq=None)

In [41]:
# convert to a PeriodIndex by passing a frequency code, e.g. 'D' for daily
dates.to_period('D')

PeriodIndex(['2019-07-11', '2019-07-12', '2019-07-13', '2019-07-14',
             '2019-07-15'],
            dtype='period[D]', freq='D')

In [42]:
# a TimedeltaIndex is created, for example, when one date is subtracted from a DatetimeIndex
dates - dates[0]

TimedeltaIndex(['0 days', '1 days', '2 days', '3 days', '4 days'], dtype='timedelta64[ns]', freq=None)

---Regular sequences: pd.date_range()---

In [43]:
pd.date_range('2019-07-11', '2019-07-19')

DatetimeIndex(['2019-07-11', '2019-07-12', '2019-07-13', '2019-07-14',
               '2019-07-15', '2019-07-16', '2019-07-17', '2019-07-18',
               '2019-07-19'],
              dtype='datetime64[ns]', freq='D')

In [44]:
pd.date_range('2019-07-11', periods=8)

DatetimeIndex(['2019-07-11', '2019-07-12', '2019-07-13', '2019-07-14',
               '2019-07-15', '2019-07-16', '2019-07-17', '2019-07-18'],
              dtype='datetime64[ns]', freq='D')

In [57]:
pd.date_range('2019-07-11', periods=8, freq='H')

DatetimeIndex(['2019-07-11 00:00:00', '2019-07-11 01:00:00',
               '2019-07-11 02:00:00', '2019-07-11 03:00:00',
               '2019-07-11 04:00:00', '2019-07-11 05:00:00',
               '2019-07-11 06:00:00', '2019-07-11 07:00:00'],
              dtype='datetime64[ns]', freq='H')

In [48]:
pd.period_range('2019-07', periods=8, freq='M')

PeriodIndex(['2019-07', '2019-08', '2019-09', '2019-10', '2019-11', '2019-12',
             '2020-01', '2020-02'],
            dtype='period[M]', freq='M')

In [53]:
pd.date_range('2019-07-11', periods=8, freq='MS')

DatetimeIndex(['2019-08-01', '2019-09-01', '2019-10-01', '2019-11-01',
               '2019-12-01', '2020-01-01', '2020-02-01', '2020-03-01'],
              dtype='datetime64[ns]', freq='MS')

In [52]:
pd.timedelta_range(0, periods=10, freq='H')

TimedeltaIndex(['00:00:00', '01:00:00', '02:00:00', '03:00:00', '04:00:00',
                '05:00:00', '06:00:00', '07:00:00', '08:00:00', '09:00:00'],
               dtype='timedelta64[ns]', freq='H')

# Frequencies and Offsets

In [54]:
# the freq keyword sets the spacing between entries;
# codes can be combined, here 2 hours (2H) and 30 minutes (30T)
pd.timedelta_range(0, periods=10, freq='2H30T')

TimedeltaIndex(['00:00:00', '02:30:00', '05:00:00', '07:30:00', '10:00:00',
                '12:30:00', '15:00:00', '17:30:00', '20:00:00', '22:30:00'],
               dtype='timedelta64[ns]', freq='150T')

In [55]:
from pandas.tseries.offsets import BDay
pd.date_range('2019-07-11', periods=5, freq=BDay())

DatetimeIndex(['2019-07-11', '2019-07-12', '2019-07-15', '2019-07-16',
               '2019-07-17'],
              dtype='datetime64[ns]', freq='B')

---

# Resampling, Shifting, and Windowing

In [58]:
# pandas_datareader is a third-party package and must be installed separately
from pandas_datareader import data
goog = data.DataReader('GOOG', start='2004', end='2016',
                       data_source='google')

ModuleNotFoundError: No module named 'pandas_datareader'
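
Since the download above does not run without pandas_datareader, the three operations named in this section can still be sketched on a small synthetic series. The data below is made up (a simple ramp of values, not real stock prices), and the 7-day bin and window sizes are arbitrary choices for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a daily price series (assumption: not real market data)
index = pd.date_range('2019-07-01', periods=28, freq='D')
prices = pd.Series(np.arange(28, dtype=float), index=index)

# Resampling: group the daily values into 7-day bins and average each bin
weekly_mean = prices.resample('7D').mean()

# Shifting: move the values forward by one row, then compare with the original
shifted = prices.shift(1)
daily_change = prices - shifted

# Windowing: mean over a sliding window of 7 consecutive rows
rolling_mean = prices.rolling(7).mean()

print(weekly_mean)
print(daily_change.head())
print(rolling_mean.tail())
```

Resampling reduces the series to one value per bin, while shifting and rolling keep the original index; both of the latter leave NaN at the start, where there is no earlier data to draw on.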