<a href="https://colab.research.google.com/github/wfwan/github-slideshow/blob/master/Section2_Unit3.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Weekly Price Data & Resampling Techniques

So far you have learnt how to download one data point per day. Strategies are not always tested at daily frequency, though: you might need finer data, such as a data point for each hour, every 30 minutes or even each minute, or coarser data, such as weekly or monthly bars. In this notebook, you will learn how the `period` and `interval` parameters of yfinance select the time frame, download daily price data, and resample it into weekly bars. An important point to note is that you can resample high-frequency data to a lower frequency, but not the other way round.

You will perform the following steps:
1. Download Daily Data
2. Resample Data
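
The point about resampling direction can be checked on a small synthetic series. The sketch below uses made-up daily closing prices (illustrative values, not market data): downsampling to weekly keeps one real value per week, while upsampling the weekly series back to daily only creates empty slots.

```python
import pandas as pd

# Ten made-up daily closes on business days (illustrative values only)
days = pd.date_range("2021-05-24", periods=10, freq="B")
daily_close = pd.Series(
    [5.26, 5.20, 5.15, 5.12, 5.09, 5.18, 5.10, 5.02, 4.95, 4.90],
    index=days,
)

# High -> low frequency: each week keeps its last observed close
weekly_close = daily_close.resample("W").last()

# Low -> high frequency: pandas can create the daily slots, but it has
# no information to put in them, so the new rows are NaN
upsampled = weekly_close.resample("D").asfreq()

print(weekly_close)
print(upsampled.isna().sum(), "of", len(upsampled), "daily rows are empty")
```

Filling those gaps (for example with `ffill()`) only repeats the weekly value; it does not recover the original daily prices.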

# Import Libraries

In [2]:
# For data manipulation
import pandas as pd

# To fetch financial data
!pip install yfinance
import yfinance as yf

# For visualisation
import matplotlib.pyplot as plt
%matplotlib inline
# Note: on matplotlib 3.6 and later this style is named 'seaborn-v0_8-darkgrid'
plt.style.use('seaborn-darkgrid')


# Download Daily Data

The `download` method of yfinance has parameters `period` and `interval`. You can play around with these parameters to download data for different periods and intervals.

You can download the minute data for up to seven days from Yahoo! Finance. The syntax for downloading the minute data of an asset for 5 days is as below:

yf.download(tickers, period="5d", interval="1m", auto_adjust=True)

Parameters:
1. tickers: Ticker symbol of the asset
2. period: The length of history required, counted back from today. Valid values include 1d, 5d, 1mo, 3mo, 6mo, 1y, 2y, 5y, 10y, ytd and max.
3. interval: The frequency of the data. The valid intervals are 1m, 2m, 5m, 15m, 30m, 60m, 90m, 1h, 1d, 5d, 1wk, 1mo and 3mo.
4. auto_adjust: True to download prices adjusted for splits and dividends, else False.
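
As a rough illustration of how these parameters fit together, the sketch below checks a `period`/`interval` pair before it is passed to `yf.download`. The helper `build_download_kwargs` is hypothetical (it is not part of yfinance), and the download call itself is left commented out because it needs network access.

```python
# Values accepted by yfinance for `period` and `interval`
VALID_PERIODS = {"1d", "5d", "1mo", "3mo", "6mo", "1y", "2y", "5y", "10y", "ytd", "max"}
VALID_INTERVALS = {"1m", "2m", "5m", "15m", "30m", "60m", "90m",
                   "1h", "1d", "5d", "1wk", "1mo", "3mo"}


def build_download_kwargs(period, interval, auto_adjust=True):
    """Hypothetical helper: validate the arguments and return them as kwargs."""
    if period not in VALID_PERIODS:
        raise ValueError(f"Unsupported period: {period!r}")
    if interval not in VALID_INTERVALS:
        raise ValueError(f"Unsupported interval: {interval!r}")
    return {"period": period, "interval": interval, "auto_adjust": auto_adjust}


# Five days of one-minute bars
minute_kwargs = build_download_kwargs("5d", "1m")
print(minute_kwargs)

# With network access, the request would then be:
# import yfinance as yf
# minute_data = yf.download("7113.KL", **minute_kwargs)
```

Catching a typo such as `"7m"` locally gives a clearer error than waiting for the data request to come back empty.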

In [16]:
# Download daily data for Top Glove (resampled to weekly bars later in this notebook)
topglove_weekly_data = yf.download(tickers="7113.KL", start="2021-05-23", end="2021-07-03", interval="1d", auto_adjust=False)

# Set the index to a datetime object
topglove_weekly_data.index = pd.to_datetime(topglove_weekly_data.index)

# Display the first 5 rows
topglove_weekly_data.head()

[*********************100%***********************]  1 of 1 completed


Date,Open,High,Low,Close,Adj Close,Volume
2021-05-24,5.3,5.34,5.23,5.26,5.200047,8666000
2021-05-25,5.26,5.27,5.15,5.2,5.140731,17405900
2021-05-27,5.21,5.29,5.11,5.12,5.061643,50654100
2021-05-28,5.13,5.17,5.07,5.09,5.031985,13000300
2021-05-31,5.09,5.2,5.09,5.18,5.120959,7125700


# Resample Data

During strategy modelling, you might need to work with a custom frequency of stock market data, such as 15 minutes, 1 hour or even 1 month. If you have minute-level data, you can easily construct 15-minute, 1-hour or daily candles by resampling it, so you don't have to buy them separately. In the same way, daily data can be resampled into weekly or monthly candles.

In this case, you can use the pandas `resample()` method to convert the stock data to the frequency of your choice.

The first step is to define a dictionary with the conversion logic. For example, the open of the new candle is the first open in the interval, the high is the maximum high, the low is the minimum low, the close is the last close, and the volume is the sum of all volumes. The keys Open, High, Low, Close and Volume must match the column names in your data frame.

In [7]:
# Aggregate function
ohlcv_dict = {'Open': 'first',
              'High': 'max',
              'Low': 'min',
              'Close': 'last',
              'Volume': 'sum'
              }
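
To see what each entry in the dictionary does, the sketch below builds one candle by hand from three made-up consecutive bars (illustrative numbers, not market data), applying the same rules in plain Python.

```python
# Three consecutive made-up bars, oldest first (illustrative values only)
bars = [
    {"Open": 5.30, "High": 5.34, "Low": 5.28, "Close": 5.31, "Volume": 1000},
    {"Open": 5.31, "High": 5.33, "Low": 5.25, "Close": 5.26, "Volume": 1500},
    {"Open": 5.26, "High": 5.29, "Low": 5.23, "Close": 5.27, "Volume": 800},
]

# Combine the three bars into a single candle
candle = {
    "Open": bars[0]["Open"],                    # 'first': open of the earliest bar
    "High": max(b["High"] for b in bars),       # 'max': highest high
    "Low": min(b["Low"] for b in bars),         # 'min': lowest low
    "Close": bars[-1]["Close"],                 # 'last': close of the latest bar
    "Volume": sum(b["Volume"] for b in bars),   # 'sum': total traded volume
}
print(candle)
```

pandas applies exactly this kind of per-column rule to every time bucket when the dictionary is passed to `agg()`.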

You can now use the `resample()` method to resample the data to the desired frequency.

Syntax:

DataFrame.resample(interval).agg(aggregate)

Parameters:
1. interval: Resampling interval such as 15T for 15 minutes (H is for hours, D is for days, W is for weeks, M is for months). Since pandas 2.2, the aliases T, H and M are deprecated in favour of min, h and ME.
2. aggregate: Dictionary with aggregating values to be used while resampling

Returns:
Resampled dataframe
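
The syntax above can be tried without downloading anything. The following sketch generates one hour of synthetic one-minute bars (made-up prices, for illustration only) and resamples them to 15-minute and 60-minute candles. It uses the `min` alias, which is accepted by both older and newer pandas versions.

```python
import pandas as pd

# 60 synthetic one-minute bars starting at 09:00 (made-up prices)
idx = pd.date_range("2021-06-01 09:00", periods=60, freq="min")
close = pd.Series([5.0 + 0.01 * i for i in range(60)], index=idx)
minute_data = pd.DataFrame({
    "Open": close - 0.005,
    "High": close + 0.01,
    "Low": close - 0.01,
    "Close": close,
    "Volume": 100,
})

# Same aggregation rules as in the notebook
ohlcv_dict = {"Open": "first", "High": "max", "Low": "min",
              "Close": "last", "Volume": "sum"}

# Resample to 15-minute and 60-minute candles
data_15min = minute_data.resample("15min").agg(ohlcv_dict)
data_60min = minute_data.resample("60min").agg(ohlcv_dict)

print(data_15min)
print(data_60min)
```

Each 15-minute candle is labelled with the start of its bucket (09:00, 09:15 and so on), which is the pandas default for intraday frequencies.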

# Resample daily data to 1 week data

In [17]:
# Resample the daily data to weekly bars
topglove_weekly_data_1W = topglove_weekly_data.resample('1W').agg(ohlcv_dict)

# Drop the missing values
topglove_weekly_data_1W.dropna(inplace=True)

# Display the first 5 rows
topglove_weekly_data_1W.head()

Date,Open,High,Low,Close,Volume
2021-05-30,5.3,5.34,5.07,5.09,89726300
2021-06-06,5.09,5.2,4.88,4.9,71006100
2021-06-13,4.89,4.95,4.7,4.72,109615400
2021-06-20,4.98,5.09,4.61,4.7,142950500
2021-06-27,4.71,4.73,4.31,4.35,86808900
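
The dates in the weekly table (2021-05-30, 2021-06-06, ...) are Sundays, because the `W` alias in pandas is shorthand for `W-SUN` and each bar is labelled with the end of its week. If you would rather label weeks by their last trading day, an anchored alias such as `W-FRI` can be used. The sketch below shows the difference on made-up daily data (illustrative values only).

```python
import pandas as pd

# Ten made-up daily bars on business days (illustrative values only)
days = pd.date_range("2021-05-24", periods=10, freq="B")
daily = pd.DataFrame({
    "Close": [5.26, 5.20, 5.15, 5.12, 5.09, 5.18, 5.10, 5.02, 4.95, 4.90],
    "Volume": [100, 200, 150, 300, 250, 120, 180, 220, 160, 140],
}, index=days)

rules = {"Close": "last", "Volume": "sum"}

# Weeks ending on Sunday (the default anchor for 'W')
by_sunday = daily.resample("W").agg(rules)

# Weeks ending on Friday
by_friday = daily.resample("W-FRI").agg(rules)

print(by_sunday)
print(by_friday)
```

For Monday-to-Friday data the aggregated values are identical; only the index labels change.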


# Tweak the code

You can tweak the code in the following ways:
1. Use an asset of your choice other than Top Glove and download its data.
2. Use a different time interval to resample the data.
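
One possible starting point for these tweaks is to wrap the resampling step in a small function, so that the asset and the interval become arguments. The helper `resample_ohlcv` below is a hypothetical name, and the data is synthetic (made-up prices); swap in the output of `yf.download` for a real asset.

```python
import pandas as pd


def resample_ohlcv(data, rule):
    """Hypothetical helper: resample OHLCV columns to the given pandas rule."""
    ohlcv_dict = {"Open": "first", "High": "max", "Low": "min",
                  "Close": "last", "Volume": "sum"}
    # Drop buckets that contain no trading days
    return data.resample(rule).agg(ohlcv_dict).dropna()


# Synthetic daily bars on business days (made-up prices, illustration only)
days = pd.date_range("2021-05-24", "2021-07-02", freq="B")
close = pd.Series([5.0 + 0.01 * i for i in range(len(days))], index=days)
daily_data = pd.DataFrame({
    "Open": close - 0.005,
    "High": close + 0.01,
    "Low": close - 0.01,
    "Close": close,
    "Volume": 100,
})

# 'MS' labels each monthly bar with the first day of its month
monthly_data = resample_ohlcv(daily_data, "MS")
print(monthly_data)
```

Passing `"W"` instead of `"MS"` reproduces the weekly resampling used in this notebook.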