# More Time Series

## Creating date ranges with the `date_range` function
You will rarely create dates one at a time in pandas. The **`date_range`** function generates regularly spaced ranges of Timestamps and gives you fine control over the endpoints and the spacing. It returns a **`DatetimeIndex`** object (not a list or a Python `range`), which is designed to be used as the index of a Series or DataFrame. You can also assign it to a DataFrame column or use it as the values of a Series. In each case the data type will be **`datetime64`**.
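
To make the three uses described above concrete, here is a minimal sketch. The variable names and the sample values are made up for illustration; only `date_range`, `Series` and `DataFrame` come from pandas.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype

# A short range of three consecutive days (illustrative values)
dates = pd.date_range(start="2016-01-01", periods=3, freq="D")
print(type(dates).__name__)  # DatetimeIndex

# 1. Use the range as the index of a Series
as_index = pd.Series([10, 20, 30], index=dates)

# 2. Use the range as the values of a Series
as_values = pd.Series(dates)

# 3. Assign the range to a DataFrame column
df = pd.DataFrame({"value": [10, 20, 30]})
df["when"] = dates

# Both the Series values and the column hold a datetime64 dtype
print(as_values.dtype, df["when"].dtype)
print(is_datetime64_any_dtype(df["when"]))  # True
```

The exact resolution shown in the dtype (for example `[ns]`) can depend on the pandas version, which is why the check above uses `is_datetime64_any_dtype` rather than comparing against a fixed string.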

Two common ways of making a range of dates are outlined below:
* Supply the start and end dates along with a frequency
* Supply exactly one of the start or end date, then give the number of periods and the frequency

Start and end dates can be strings or datetime/Timestamp objects.

In [42]:
# Give start and end dates and generate every calendar day between them (both endpoints are included).
# Notice in the output that the dtype is datetime64, there are 152 days, and the frequency is 'D' (daily).
# Also notice that the result is a DatetimeIndex and not a list. A DatetimeIndex is backed by a
# datetime64 array, which makes it faster than a list and lets it support date-aware operations
pd.date_range(start="2016-01-01", end="2016-05-31", freq='D')

DatetimeIndex(['2016-01-01', '2016-01-02', '2016-01-03', '2016-01-04',
               '2016-01-05', '2016-01-06', '2016-01-07', '2016-01-08',
               '2016-01-09', '2016-01-10',
               ...
               '2016-05-22', '2016-05-23', '2016-05-24', '2016-05-25',
               '2016-05-26', '2016-05-27', '2016-05-28', '2016-05-29',
               '2016-05-30', '2016-05-31'],
              dtype='datetime64[ns]', length=152, freq='D')
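
The earlier remark that start and end dates can be strings or datetime/Timestamp objects can be checked with a short sketch. It builds the same daily range three ways and compares the resulting dates; the variable names are illustrative.

```python
import datetime as dt

import pandas as pd

# Same range, with the endpoints given as strings
from_strings = pd.date_range(start="2016-01-01", end="2016-05-31", freq="D")

# ... as standard-library datetime objects
from_datetimes = pd.date_range(
    start=dt.datetime(2016, 1, 1), end=dt.datetime(2016, 5, 31), freq="D"
)

# ... and as pandas Timestamp objects
from_timestamps = pd.date_range(
    start=pd.Timestamp("2016-01-01"), end=pd.Timestamp("2016-05-31"), freq="D"
)

# Compare element by element as Timestamps
same = list(from_strings) == list(from_datetimes) == list(from_timestamps)
print(same)               # True
print(len(from_strings))  # 152
```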

In [43]:
# Same range, but keep only business days (Monday through Friday).
# Holidays are not excluded, so 2016-01-01 still appears
pd.date_range(start="2016-01-01", end="2016-05-31", freq='B')

DatetimeIndex(['2016-01-01', '2016-01-04', '2016-01-05', '2016-01-06',
               '2016-01-07', '2016-01-08', '2016-01-11', '2016-01-12',
               '2016-01-13', '2016-01-14',
               ...
               '2016-05-18', '2016-05-19', '2016-05-20', '2016-05-23',
               '2016-05-24', '2016-05-25', '2016-05-26', '2016-05-27',
               '2016-05-30', '2016-05-31'],
              dtype='datetime64[ns]', length=108, freq='B')

## Where are these frequencies coming from?
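Those frequencies are called [offset aliases](http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases) and determine the interval between consecutive dates in the range. The table below lists the most common offsets and their aliases (the strings you pass to `freq`). Newer pandas releases (2.2 and later) rename several of these aliases, for example `ME` instead of `M` and `h` instead of `H`, and warn when the older spellings are used, so check the documentation for the version you have installed.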

<table border="1" class="docutils">
<colgroup>
<col width="13%" />
<col width="87%" />
</colgroup>
<thead valign="bottom">
<tr class="row-odd"><th class="head">Alias</th>
<th class="head">Description</th>
</tr>
</thead>
<tbody valign="top">
<tr class="row-even"><td>B</td>
<td>business day frequency</td>
</tr>
<tr class="row-odd"><td>C</td>
<td>custom business day frequency (experimental)</td>
</tr>
<tr class="row-even"><td>D</td>
<td>calendar day frequency</td>
</tr>
<tr class="row-odd"><td>W</td>
<td>weekly frequency</td>
</tr>
<tr class="row-even"><td>M</td>
<td>month end frequency</td>
</tr>
<tr class="row-odd"><td>BM</td>
<td>business month end frequency</td>
</tr>
<tr class="row-even"><td>CBM</td>
<td>custom business month end frequency</td>
</tr>
<tr class="row-odd"><td>MS</td>
<td>month start frequency</td>
</tr>
<tr class="row-even"><td>BMS</td>
<td>business month start frequency</td>
</tr>
<tr class="row-odd"><td>CBMS</td>
<td>custom business month start frequency</td>
</tr>
<tr class="row-even"><td>Q</td>
<td>quarter end frequency</td>
</tr>
<tr class="row-odd"><td>BQ</td>
<td>business quarter end frequency</td>
</tr>
<tr class="row-even"><td>QS</td>
<td>quarter start frequency</td>
</tr>
<tr class="row-odd"><td>BQS</td>
<td>business quarter start frequency</td>
</tr>
<tr class="row-even"><td>A</td>
<td>year end frequency</td>
</tr>
<tr class="row-odd"><td>BA</td>
<td>business year end frequency</td>
</tr>
<tr class="row-even"><td>AS</td>
<td>year start frequency</td>
</tr>
<tr class="row-odd"><td>BAS</td>
<td>business year start frequency</td>
</tr>
<tr class="row-even"><td>BH</td>
<td>business hour frequency</td>
</tr>
<tr class="row-odd"><td>H</td>
<td>hourly frequency</td>
</tr>
<tr class="row-even"><td>T, min</td>
<td>minutely frequency</td>
</tr>
<tr class="row-odd"><td>S</td>
<td>secondly frequency</td>
</tr>
<tr class="row-even"><td>L, ms</td>
<td>milliseconds</td>
</tr>
<tr class="row-odd"><td>U, us</td>
<td>microseconds</td>
</tr>
<tr class="row-even"><td>N</td>
<td>nanoseconds</td>
</tr>
</tbody>
</table>
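
As a rough illustration of how the alias changes the result, the sketch below runs the same start and end dates through a handful of aliases from the table and reports how many dates each one produces. The aliases chosen here (`D`, `B`, `W`, `MS`, `QS`) are ones that, to my knowledge, keep the same spelling across pandas versions; the `counts` dictionary is just a helper for this example.

```python
import pandas as pd

counts = {}
for alias in ["D", "B", "W", "MS", "QS"]:
    rng = pd.date_range(start="2016-01-01", end="2016-05-31", freq=alias)
    counts[alias] = len(rng)
    # Show the number of dates and the first and last date for each alias
    print(f"{alias:>2}: {len(rng):3d} dates, first={rng[0].date()}, last={rng[-1].date()}")
```

Daily and business-day frequencies fill the range densely, while the weekly, month-start and quarter-start aliases only emit the dates that fall on their anchor points inside the range.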

In [44]:
# Same range again, but keep only the last business day of each month.
# The range spans 5 months, so there are only 5 dates
pd.date_range(start="2016-01-01", end="2016-05-31", freq='BM')

DatetimeIndex(['2016-01-29', '2016-02-29', '2016-03-31', '2016-04-29',
               '2016-05-31'],
              dtype='datetime64[ns]', freq='BM')

In [45]:
# use the periods argument to specify how many dates you want.
# Specify only one of either start or end date
pd.date_range(start="2016-01-10", periods=10, freq='W')

DatetimeIndex(['2016-01-10', '2016-01-17', '2016-01-24', '2016-01-31',
               '2016-02-07', '2016-02-14', '2016-02-21', '2016-02-28',
               '2016-03-06', '2016-03-13'],
              dtype='datetime64[ns]', freq='W-SUN')
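
The output above shows `freq='W-SUN'`: the plain `W` alias is anchored on Sundays. The following sketch, with illustrative variable names, shows two consequences: counting backwards from an end date gives the same ten Sundays, and a start date that is not a Sunday is rolled forward to the next one.

```python
import pandas as pd

# Ten weekly dates counted forward from a start date (a Sunday)
forward = pd.date_range(start="2016-01-10", periods=10, freq="W")

# Ten weekly dates counted backward from the last date of that range
backward = pd.date_range(end="2016-03-13", periods=10, freq="W")

same_dates = list(forward) == list(backward)
print(same_dates)  # True

# 2016-01-11 is a Monday, so the range begins on the following Sunday
rolled = pd.date_range(start="2016-01-11", periods=2, freq="W")
rolled_days = [str(d.date()) for d in rolled]
print(rolled_days)  # ['2016-01-17', '2016-01-24']
```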

## An actual time series
So far we have worked with standard Python and pandas date functionality without any data attached to the dates. We will now attach values to the dates and begin to explore slicing a Series of time series data.

In [46]:
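# Build a year and a half of business-day data filled with random values
# (numpy is assumed to be imported as np, alongside pandas as pd)
idx = pd.date_range(start='2014-01-01', end='2015-06-30', freq='B')
s = pd.Series(np.random.rand(len(idx)), index=idx)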

In [47]:
s.head(15)

2014-01-01    0.400304
2014-01-02    0.782518
2014-01-03    0.659919
2014-01-06    0.919413
2014-01-07    0.005965
2014-01-08    0.722603
2014-01-09    0.451332
2014-01-10    0.294067
2014-01-13    0.619016
2014-01-14    0.914934
2014-01-15    0.574649
2014-01-16    0.542240
2014-01-17    0.752337
2014-01-20    0.667220
2014-01-21    0.831314
Freq: B, dtype: float64