# Resample Data
## Pandas Resample
You've learned about bucketing to different periods of time like Months. Let's see how it's done. We'll start with an example series of days.

In [1]:
import numpy as np
import pandas as pd

dates = pd.date_range('10/10/2018', periods=11, freq='D')
close_prices = np.arange(len(dates))

close = pd.Series(close_prices, dates)
close

2018-10-10     0
2018-10-11     1
2018-10-12     2
2018-10-13     3
2018-10-14     4
2018-10-15     5
2018-10-16     6
2018-10-17     7
2018-10-18     8
2018-10-19     9
2018-10-20    10
Freq: D, dtype: int64

Let's say we want to bucket these days into 3 day periods. To do that, we'll use the [DataFrame.resample](https://pandas.pydata.org/pandas-docs/version/0.21/generated/pandas.DataFrame.resample.html) function. The first parameter in this function is a string called `rule`, which is a representation of how to resample the data. This string representation is made using an offset alias. You can find a list of them [here](http://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases). To create 3 day periods, we'll set `rule` to "3D".

In [2]:
close.resample('3D')

DatetimeIndexResampler [freq=<3 * Days>, axis=0, closed=left, label=left, convention=start, base=0]

This returns a `DatetimeIndexResampler` object. It's an intermediate object similar to the `GroupBy` object. Just like group by, it breaks the original data into groups. That means, we'll have to apply an operation to these groups. Let's make it simple and get the first element from each group.

In [3]:
close.resample('3D').first()

2018-10-10    0
2018-10-13    3
2018-10-16    6
2018-10-19    9
dtype: int64

You might notice that this is the same as `.iloc[::3]`

In [4]:
close.iloc[::3]

2018-10-10    0
2018-10-13    3
2018-10-16    6
2018-10-19    9
Freq: 3D, dtype: int64

So, why use the `resample` function instead of `.iloc[::3]` or the `groupby` function?

The `resample` function shines when handling time and/or date specific tasks. In fact, you can't use this function if the index isn't a [time-related class](https://pandas.pydata.org/pandas-docs/version/0.21/timeseries.html#overview).

In [5]:
try:
    # Attempt resample on a series without a time index
    pd.Series(close_prices).resample('W')
except TypeError:
    print('It threw a TypeError.')
else:
    print('It worked.')

It threw a TypeError.


One of the resampling tasks it can help with is resampling on periods, like weeks. Let's resample `close` from it's days frequency to weeks. We'll use the "W" offset allies, which stands for Weeks.

In [6]:
pd.DataFrame({
    'days': close,
    'weeks': close.resample('W').first()})

Unnamed: 0,days,weeks
2018-10-10,0.0,
2018-10-11,1.0,
2018-10-12,2.0,
2018-10-13,3.0,
2018-10-14,4.0,0.0
2018-10-15,5.0,
2018-10-16,6.0,
2018-10-17,7.0,
2018-10-18,8.0,
2018-10-19,9.0,


The weeks offset considers the start of a week on a Monday. Since 2018-10-10 is a Wednesday, the first group only looks at the first 5 items. There are offsets that handle more complicated problems like filtering for Holidays. For now, we'll only worry about resampling for days, weeks, months, quarters, and years. The frequency you want the data to be in, will depend on how often you'll be trading. If you're making trade decisions based on reports that come out at the end of the year, we might only care about a frequency of years or months.
## OHLC
Now that you've seen how Pandas resamples time series data, we can apply this to Open, High, Low, and Close (OHLC). Pandas provides the [`Resampler.ohlc`](https://pandas.pydata.org/pandas-docs/version/0.21.0/generated/pandas.core.resample.Resampler.ohlc.html#pandas.core.resample.Resampler.ohlc) function will convert any resampling frequency to OHLC data. Let's get the Weekly OHLC.

In [7]:
close.resample('W').ohlc()

Unnamed: 0,open,high,low,close
2018-10-14,0,4,0,4
2018-10-21,5,10,5,10


Can you spot a potential problem with that? It has to do with resampling data that has already been resampled.

We're getting the OHLC from close data. If we want OHLC data from already resampled data, we should resample the first price from the open data, resample the highest price from the high data, etc..

To get the weekly closing prices from `close`, you can use the [`Resampler.last`](https://pandas.pydata.org/pandas-docs/version/0.21.0/generated/pandas.core.resample.Resampler.last.html#pandas.core.resample.Resampler.last) function.

In [8]:
close.resample('W').last()

2018-10-14     4
2018-10-21    10
Freq: W-SUN, dtype: int64

## Quiz
Implement `days_to_weeks` function to resample OHLC price data to weekly OHLC price data. You find find more Resampler functions [here](https://pandas.pydata.org/pandas-docs/version/0.21.0/api.html#id44) for calculating high and low prices.

In [9]:
import quiz_tests


def days_to_weeks(open_prices, high_prices, low_prices, close_prices):
    """Converts daily OHLC prices to weekly OHLC prices.
    
    Parameters
    ----------
    open_prices : DataFrame
        Daily open prices for each ticker and date
    high_prices : DataFrame
        Daily high prices for each ticker and date
    low_prices : DataFrame
        Daily low prices for each ticker and date
    close_prices : DataFrame
        Daily close prices for each ticker and date

    Returns
    -------
    open_prices_weekly : DataFrame
        Weekly open prices for each ticker and date
    high_prices_weekly : DataFrame
        Weekly high prices for each ticker and date
    low_prices_weekly : DataFrame
        Weekly low prices for each ticker and date
    close_prices_weekly : DataFrame
        Weekly close prices for each ticker and date
    """
    
    # TODO: Implement Function
    open_prices_weekly = open_prices.resample('W').first()
    high_prices_weekly = high_prices.resample('W').max()
    low_prices_weekly = open_prices.resample('W').min()
    close_prices_weekly = open_prices.resample('W').last()
    
    return open_prices_weekly, high_prices_weekly, low_prices_weekly, close_prices_weekly


quiz_tests.test_days_to_weeks(days_to_weeks)

AssertionError: Wrong value for days_to_weeks.

INPUT open_prices:
            GCZG  SDAX  ZDTQ
2018-10-10    24    21    43
2018-10-11    14    22    41
2018-10-12    29    23    44
2018-10-13    44    14    13
2018-10-14    31    28    34
2018-10-15    36    49    27
2018-10-16    48    20    46
2018-10-17    48    37    27
2018-10-18    16    42    22
2018-10-19    23    36    32
2018-10-20    13    31    28
2018-10-21    23    33    18
2018-10-22    14    47    45
2018-10-23    28    21    31
2018-10-24    31    36    40
2018-10-25    19    25    46
2018-10-26    30    46    48
2018-10-27    19    34    35
2018-10-28    24    13    24
2018-10-29    48    15    39
2018-10-30    16    34    14
2018-10-31    37    30    28
2018-11-01    34    24    20
2018-11-02    17    15    38
2018-11-03    44    15    22
2018-11-04    24    36    28
2018-11-05    12    41    49
2018-11-06    24    27    14

INPUT high_prices:
            GCZG  SDAX  ZDTQ
2018-10-10    48    48    43
2018-10-11    42    49    47
2018-10-12    45    47    48
2018-10-13    48    46    48
2018-10-14    49    49    46
2018-10-15    40    49    49
2018-10-16    49    44    49
2018-10-17    49    46    48
2018-10-18    46    49    49
2018-10-19    49    47    47
2018-10-20    45    49    46
2018-10-21    45    49    49
2018-10-22    49    48    48
2018-10-23    48    49    49
2018-10-24    49    49    48
2018-10-25    48    48    49
2018-10-26    48    47    48
2018-10-27    47    49    49
2018-10-28    47    49    49
2018-10-29    48    49    48
2018-10-30    49    49    47
2018-10-31    48    47    48
2018-11-01    47    48    47
2018-11-02    49    49    45
2018-11-03    49    49    49
2018-11-04    47    46    48
2018-11-05    47    47    49
2018-11-06    49    49    46

INPUT low_prices:
            GCZG  SDAX  ZDTQ
2018-10-10    12    12    13
2018-10-11    12    14    15
2018-10-12    13    14    12
2018-10-13    14    14    13
2018-10-14    12    12    14
2018-10-15    12    12    12
2018-10-16    12    12    12
2018-10-17    13    12    13
2018-10-18    12    12    13
2018-10-19    14    12    14
2018-10-20    12    12    12
2018-10-21    13    14    16
2018-10-22    14    13    13
2018-10-23    13    14    12
2018-10-24    14    12    14
2018-10-25    15    12    13
2018-10-26    12    12    12
2018-10-27    12    13    15
2018-10-28    14    12    12
2018-10-29    12    12    12
2018-10-30    12    14    13
2018-10-31    12    12    13
2018-11-01    13    14    15
2018-11-02    12    12    12
2018-11-03    12    14    12
2018-11-04    12    12    13
2018-11-05    12    12    12
2018-11-06    16    12    14

INPUT close_prices:
            GCZG  SDAX  ZDTQ
2018-10-10    27    45    15
2018-10-11    40    49    40
2018-10-12    25    26    36
2018-10-13    26    36    19
2018-10-14    25    34    46
2018-10-15    22    39    45
2018-10-16    40    14    17
2018-10-17    42    46    33
2018-10-18    35    41    49
2018-10-19    14    24    31
2018-10-20    41    18    13
2018-10-21    36    27    18
2018-10-22    16    16    45
2018-10-23    37    24    16
2018-10-24    43    40    28
2018-10-25    39    29    45
2018-10-26    38    20    43
2018-10-27    44    13    34
2018-10-28    23    17    47
2018-10-29    25    14    38
2018-10-30    48    44    23
2018-10-31    37    24    33
2018-11-01    40    28    17
2018-11-02    31    12    44
2018-11-03    29    40    49
2018-11-04    18    30    13
2018-11-05    27    16    47
2018-11-06    31    32    14

OUTPUT open_prices_weekly:
            GCZG  SDAX  ZDTQ
2018-10-14    24    21    43
2018-10-21    36    49    27
2018-10-28    14    47    45
2018-11-04    48    15    39
2018-11-11    12    41    49

OUTPUT high_prices_weekly:
            GCZG  SDAX  ZDTQ
2018-10-14    44    28    44
2018-10-21    48    49    46
2018-10-28    31    47    48
2018-11-04    48    36    39
2018-11-11    24    41    49

OUTPUT low_prices_weekly:
            GCZG  SDAX  ZDTQ
2018-10-14    14    14    13
2018-10-21    13    20    18
2018-10-28    14    13    24
2018-11-04    16    15    14
2018-11-11    12    27    14

OUTPUT close_prices_weekly:
            GCZG  SDAX  ZDTQ
2018-10-14    31    28    34
2018-10-21    23    33    18
2018-10-28    24    13    24
2018-11-04    24    36    28
2018-11-11    24    27    14

EXPECTED OUTPUT FOR open_prices_weekly:
            GCZG  SDAX  ZDTQ
2018-10-14    24    21    43
2018-10-21    36    49    27
2018-10-28    14    47    45
2018-11-04    48    15    39
2018-11-11    12    41    49

EXPECTED OUTPUT FOR high_prices_weekly:
            GCZG  SDAX  ZDTQ
2018-10-14    49    49    48
2018-10-21    49    49    49
2018-10-28    49    49    49
2018-11-04    49    49    49
2018-11-11    49    49    49

EXPECTED OUTPUT FOR low_prices_weekly:
            GCZG  SDAX  ZDTQ
2018-10-14    12    12    12
2018-10-21    12    12    12
2018-10-28    12    12    12
2018-11-04    12    12    12
2018-11-11    12    12    12

EXPECTED OUTPUT FOR close_prices_weekly:
            GCZG  SDAX  ZDTQ
2018-10-14    25    34    46
2018-10-21    36    27    18
2018-10-28    23    17    47
2018-11-04    18    30    13
2018-11-11    31    32    14


## Quiz Solution
If you're having trouble, you can check out the quiz solution [here](resample_data_solution.ipynb).