# Notebook Instructions

1. If you are new to Jupyter notebooks, please go through this introductory manual <a href='https://quantra.quantinsti.com/quantra-notebook' target="_blank">here</a>.
1. Any changes made in this notebook will be lost when you close the browser window. **You can download the notebook to save your work on your PC.**
1. Before running this notebook on your local PC:<br>
i.  You need to set up a Python environment and the relevant packages on your local PC. To do so, go through the section on "**Run Codes Locally on Your Machine**" in the course.<br>
ii. You need to **download the zip file available in the last unit** of this course. The zip file contains the data files and/or Python modules that might be required to run this notebook.

## Minute Price Data & Resampling Techniques

So far you have learnt how to download daily data, that is, one data point per day. But sometimes you need more granularity to test your strategies, such as a data point for each hour, every 30 minutes or even every minute. In this notebook, you will learn how to download minute-level data and how to resample it into different time frames such as 15 minutes and 1 hour. An important point to note here is that you can resample high-frequency data to low-frequency data, but not the other way round: the coarser data does not contain the prices in between.
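
The one-way nature of resampling can be illustrated with a minimal sketch. The prices below are synthetic values made up for illustration, not market data, and the variable names are hypothetical.

```python
import numpy as np
import pandas as pd

# Synthetic one-minute closing prices (illustrative values, not market data)
minute_index = pd.date_range("2025-01-02 09:30", periods=10, freq="min")
minute_close = pd.Series(np.arange(100.0, 110.0), index=minute_index)

# Downsampling: each 5-minute close is fully determined by the minute data
five_minute_close = minute_close.resample("5min").last()

# Upsampling: the 5-minute series holds no information about the minutes
# in between, so the new rows are NaN unless you fill them artificially
back_to_minutes = five_minute_close.resample("1min").asfreq()

print(five_minute_close)
print(back_to_minutes)
```

Going from minutes to 5 minutes loses nothing that the 5-minute bars need, whereas going back leaves gaps that can only be filled by assumption (for example, forward filling).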

You will perform the following steps:
1. [Download Minute Data](#minute-data)
2. [Resample Data](#resample-data)

Note: At times, downloading the data may give you errors due to changes in Yahoo! Finance.

In such cases, it is recommended to upgrade the `yfinance` package using `pip install --upgrade yfinance`. If the command is provided in a 'Raw NBConvert' cell, change its cell type to 'Code' before running it. Then restart the kernel and run all of the following cells again.

## Import Libraries

In [1]:
# To fetch financial data
import yfinance as yf

# For visualisation
import matplotlib.pyplot as plt
%matplotlib inline
plt.style.use('seaborn-v0_8-darkgrid')

# Ignore warnings
import warnings 
warnings.filterwarnings('ignore')

<a id='minute-data'></a> 
## Download Minute Data

The `download` function of `yfinance` has the parameters `period` and `interval`. You can play around with these parameters to download data for different periods and intervals.

You can download the minute data for up to seven days from Yahoo! Finance. The syntax for downloading 5 days of minute data for an asset is shown below:
```python
yf.download(tickers, period="5d", interval="1m", auto_adjust=True)
```

Parameters:
1. **tickers:** Ticker of the asset.
2. **period:** This is the length of the data history required. The valid values are `1d, 5d, 1mo, 3mo, 6mo, 1y, 2y, 5y, 10y, ytd, max`.
3. **interval:** This is the frequency of data. The valid intervals are `1m, 2m, 5m, 15m, 30m, 60m, 90m, 1h, 1d, 5d, 1wk, 1mo, 3mo`.
4. **auto_adjust:** `True` to download prices adjusted for splits and dividends, else `False`.

In [2]:
# Download the minute data for Apple
apple_minute_data = yf.download(tickers="AAPL", period="5d", interval="1m",
                                auto_adjust=True, multi_level_index=False)

# Display the first 5 rows
apple_minute_data.head()

[*********************100%***********************]  1 of 1 completed


| Datetime | Close | High | Low | Open | Volume |
|---|---|---|---|---|---|
| 2025-04-08 13:30:00+00:00 | 185.740005 | 186.899994 | 185.369995 | 186.729996 | 4394549 |
| 2025-04-08 13:31:00+00:00 | 186.485001 | 186.850006 | 185.520004 | 185.720001 | 794782 |
| 2025-04-08 13:32:00+00:00 | 186.809998 | 187.520004 | 186.479996 | 186.485001 | 736933 |
| 2025-04-08 13:33:00+00:00 | 186.25 | 187.160004 | 186.059998 | 186.800003 | 616096 |
| 2025-04-08 13:34:00+00:00 | 186.309998 | 186.888901 | 186.0401 | 186.252396 | 641008 |
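
The `+00:00` suffix in the index shows that the timestamps are in UTC, so the first row at 13:30 corresponds to 09:30 in New York. If you prefer to read the data in exchange time, a sketch like the following may help. It uses a small synthetic dataframe as a stand-in for the downloaded data, and the names `prices` and `prices_ny` are hypothetical.

```python
import pandas as pd

# Synthetic stand-in for the downloaded data: three UTC-indexed rows
utc_index = pd.date_range("2025-04-08 13:30", periods=3, freq="min", tz="UTC")
prices = pd.DataFrame({"Close": [100.0, 101.0, 102.0]}, index=utc_index)

# Convert the timezone-aware index from UTC to US Eastern time
prices_ny = prices.tz_convert("America/New_York")

print(prices_ny.index[0])
```

The conversion only changes how the timestamps are displayed; the rows and their order stay the same.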


<a id='resample-data'></a> 
## Resample Data

During strategy modelling, you might need stock market data at a custom frequency such as 15 minutes, 1 hour or even 1 month. If you have minute-level data, you can construct 15-minute, 1-hour or daily candles by resampling it, so you don't have to buy them separately.

To do this, you can use the pandas `resample()` method to convert the stock data to the frequency of your choice.

The first step is to define a dictionary with the conversion logic. For example, the open of a resampled candle is the first open in the interval, the high is the maximum of the highs, the low is the minimum of the lows, the close is the last close and the volume is the sum of the volumes. The names `Open`, `High`, `Low`, `Close` and `Volume` must match the column names in your dataframe.

In [3]:
# Aggregate function
ohlcv_dict = {'Open': 'first',
              'High': 'max',
              'Low': 'min',
              'Close': 'last',
              'Volume': 'sum'
             }

You can now use the `resample()` method to resample the data to the desired frequency.

Syntax:
```python
DataFrame.resample(interval).agg(aggregate)
```

Parameters:
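1. **interval:** Resampling interval, such as `15min` for 15 minutes (`h` is for hours, `D` is for days, `ME` is for month end). Older pandas versions spell these aliases `15T`, `H` and `M`; these spellings are deprecated in recent pandas versions.
2. **aggregate:** Dictionary that maps each column name to the aggregation function to be used while resampling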

Returns: <br>
Resampled dataframe
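
To see what the aggregation dictionary does, here is a small worked sketch on six synthetic one-minute candles. The values are made up for illustration, and `minute_df` and `three_minute_df` are hypothetical names.

```python
import pandas as pd

# Six synthetic one-minute candles (illustrative values only)
index = pd.date_range("2025-01-02 09:30", periods=6, freq="min")
minute_df = pd.DataFrame(
    {
        "Open": [10.0, 10.2, 10.1, 10.4, 10.3, 10.6],
        "High": [10.3, 10.4, 10.5, 10.6, 10.7, 10.8],
        "Low": [9.9, 10.0, 10.0, 10.2, 10.1, 10.5],
        "Close": [10.2, 10.1, 10.4, 10.3, 10.6, 10.7],
        "Volume": [100, 200, 150, 300, 250, 50],
    },
    index=index,
)

# Same conversion logic as in the notebook
ohlcv_dict = {"Open": "first", "High": "max", "Low": "min",
              "Close": "last", "Volume": "sum"}

# Merge every three one-minute candles into one 3-minute candle
three_minute_df = minute_df.resample("3min").agg(ohlcv_dict)

print(three_minute_df)
```

The first 3-minute candle takes its open (10.0) from the 09:30 row and its close (10.4) from the 09:32 row, while its high (10.5), low (9.9) and volume (450) are computed across all three rows.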

### Resample minute data to 15-minute data

In [4]:
# Resample data to 15-minute data ('15min' replaces the deprecated '15T' alias)
apple_minute_data_15M = apple_minute_data.resample('15min').agg(ohlcv_dict)

# Drop the missing values
apple_minute_data_15M.dropna(inplace=True)

# Display the first 5 rows
apple_minute_data_15M.head()

| Datetime | Open | High | Low | Close | Volume |
|---|---|---|---|---|---|
| 2025-04-08 13:30:00+00:00 | 186.729996 | 187.520004 | 185.100006 | 186.330002 | 11828895 |
| 2025-04-08 13:45:00+00:00 | 186.997498 | 188.559998 | 185.257294 | 188.0 | 5180120 |
| 2025-04-08 14:00:00+00:00 | 188.0 | 189.570007 | 187.779999 | 189.285004 | 4673625 |
| 2025-04-08 14:15:00+00:00 | 189.270004 | 190.335007 | 189.220001 | 189.330002 | 3654037 |
| 2025-04-08 14:30:00+00:00 | 189.320007 | 189.929993 | 188.600006 | 188.630005 | 2533728 |
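
The reason for dropping missing values is that `resample()` creates a bin for every interval between the first and last timestamp, including the hours when the market is closed. The sketch below shows this on two synthetic sessions separated by an overnight gap. The values and the names `gap_df`, `resampled` and `cleaned` are made up for illustration.

```python
import pandas as pd

# Two synthetic sessions with an overnight gap between them (illustrative values)
day_one = pd.date_range("2025-01-02 15:58", periods=2, freq="min")
day_two = pd.date_range("2025-01-03 09:30", periods=2, freq="min")
gap_df = pd.DataFrame(
    {"Close": [10.0, 10.1, 10.5, 10.6], "Volume": [100, 200, 300, 400]},
    index=day_one.append(day_two),
)

# Resampling fills the whole overnight gap with empty 15-minute bins
resampled = gap_df.resample("15min").agg({"Close": "last", "Volume": "sum"})

# Empty bins have NaN prices (and a summed volume of 0)
empty_bins = int(resampled["Close"].isna().sum())

# dropna() removes those bins and keeps only bars that contain trades
cleaned = resampled.dropna()

print(len(resampled), empty_bins, len(cleaned))
```

Only the two bins that actually contain data survive `dropna()`; every bin in between is discarded.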


### Resample minute data to 1-hour data

In [5]:
# Resample data to 1-hour data ('1h' replaces the deprecated '1H' alias)
apple_minute_data_1H = apple_minute_data.resample('1h').agg(ohlcv_dict)

# Drop the missing values
apple_minute_data_1H.dropna(inplace=True)

# Display the first 5 rows
apple_minute_data_1H.head()

| Datetime | Open | High | Low | Close | Volume |
|---|---|---|---|---|---|
| 2025-04-08 13:00:00+00:00 | 186.729996 | 188.559998 | 185.100006 | 188.0 | 17009015 |
| 2025-04-08 14:00:00+00:00 | 188.0 | 190.335007 | 186.179993 | 186.199997 | 14173356 |
| 2025-04-08 15:00:00+00:00 | 186.214996 | 186.940002 | 183.029999 | 184.470001 | 9988049 |
| 2025-04-08 16:00:00+00:00 | 184.470001 | 184.868195 | 178.940903 | 179.168503 | 11863108 |
| 2025-04-08 17:00:00+00:00 | 179.149994 | 179.600006 | 174.850006 | 178.145004 | 14867061 |
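
Note that the first hourly bar above is labelled 13:00 although the minute data starts at 13:30. By default, pandas anchors the bins to the top of the hour, so the first bar covers only 30 minutes. If you would rather anchor the bars to the first timestamp of the session, one option is the `offset` argument of `resample()`. The sketch below uses a synthetic series, and the names `session`, `default_bars` and `aligned_bars` are hypothetical.

```python
import pandas as pd

# Synthetic session: 120 one-minute values starting at 13:30 UTC (illustrative)
index = pd.date_range("2025-04-08 13:30", periods=120, freq="min", tz="UTC")
session = pd.Series(range(120), index=index, dtype="float64")

# Default: bins start at 13:00, 14:00, 15:00, so the first and last bars are partial
default_bars = session.resample("1h").last()

# offset shifts the bin edges by 30 minutes: bins start at 13:30 and 14:30
aligned_bars = session.resample("1h", offset="30min").last()

print(default_bars)
print(aligned_bars)
```

Whether partial bars matter depends on your strategy; the point is only that the bar boundaries are a choice you make when resampling.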


### Resample minute data to 4-hour data

In [6]:
# Resample data to 4-hour data ('4h' replaces the deprecated '4H' alias)
apple_minute_data_4H = apple_minute_data.resample('4h').agg(ohlcv_dict)

# Drop the missing values
apple_minute_data_4H.dropna(inplace=True)

# Display the first 5 rows
apple_minute_data_4H.head()

| Datetime | Open | High | Low | Close | Volume |
|---|---|---|---|---|---|
| 2025-04-08 12:00:00+00:00 | 186.729996 | 190.335007 | 183.029999 | 184.470001 | 41170420 |
| 2025-04-08 16:00:00+00:00 | 184.470001 | 184.868195 | 169.210098 | 172.770004 | 62287294 |
| 2025-04-09 12:00:00+00:00 | 172.179001 | 180.5 | 171.889999 | 178.729996 | 54801593 |
| 2025-04-09 16:00:00+00:00 | 178.710007 | 200.610001 | 177.875 | 198.850006 | 102814911 |
| 2025-04-10 12:00:00+00:00 | 189.164993 | 194.779907 | 186.449997 | 186.649902 | 45434357 |


## Tweak the code

You can tweak the code in the following ways:

1. Download the data for an asset of your choice other than `AAPL`.
2. Use a different time interval to resample the data.
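
As a starting point for the second tweak, the resampling steps used in this notebook can be wrapped in a small helper. This is only a sketch: the function name `resample_ohlcv` and the synthetic `demo` dataframe are hypothetical and not part of the course code.

```python
import pandas as pd

# Same conversion logic as in the notebook
OHLCV_RULES = {"Open": "first", "High": "max", "Low": "min",
               "Close": "last", "Volume": "sum"}


def resample_ohlcv(minute_df, interval):
    """Resample OHLCV data to the given interval and drop empty bars."""
    resampled = minute_df.resample(interval).agg(OHLCV_RULES)
    return resampled.dropna()


# Synthetic demo data: 60 identical one-minute candles (illustrative values)
index = pd.date_range("2025-01-02 09:30", periods=60, freq="min")
demo = pd.DataFrame(
    {"Open": 1.0, "High": 2.0, "Low": 0.5, "Close": 1.5, "Volume": 10},
    index=index,
)

# Try several intervals and compare the number of bars produced
for interval in ["5min", "30min", "1h"]:
    print(interval, len(resample_ohlcv(demo, interval)))
```

With real data, you would pass the downloaded minute dataframe in place of `demo`.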