# Notebook Instructions

1. If you are new to Jupyter notebooks, please go through this introductory manual <a href='https://quantra.quantinsti.com/quantra-notebook' target="_blank">here</a>.
1. Any changes made in this notebook will be lost after you close the browser window. **You can download the notebook to save your work on your PC.**
1. Before running this notebook on your local PC:<br>
i.  You need to set up a Python environment and the relevant packages on your local PC. To do so, go through the section on "**Run Codes Locally on Your Machine**" in the course.<br>
ii. You need to **download the zip file available in the last unit** of this course. The zip file contains the data files and/or Python modules that might be required to run this notebook.

## Minute Price Data & Resampling Techniques

So far, you have learnt how to download daily data. To test some strategies, however, you need finer granularity, such as a data point for each hour, every 30 minutes or even every minute. In this notebook, you will learn how to download minute-level data and how to resample it into other time frames, such as 15 minutes and 1 hour. An important point to note is that you can resample high-frequency data into low-frequency data, but not the other way round: minute bars can be aggregated into hourly bars, but hourly bars cannot be split back into the minute bars they came from.
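The one-way nature of resampling can be illustrated with a small sketch. The prices below are made-up numbers on a synthetic hourly index, not market data. Downsampling collapses several points into one, while upsampling only creates empty rows that would have to be filled by assumption.

```python
import pandas as pd

# Synthetic hourly closing prices (made-up values, for illustration only)
index = pd.date_range("2024-01-02 08:00", periods=4, freq="60min")
hourly = pd.Series([100.0, 101.0, 99.5, 102.0], index=index, name="Close")

# Downsampling: the four hourly points collapse into a single 4-hour close
four_hourly = hourly.resample("240min").last()

# Upsampling: new 30-minute rows appear, but they contain no information (NaN)
half_hourly = hourly.resample("30min").asfreq()

print(four_hourly)
print(half_hourly)
```

Filling the empty rows, for example with `ffill()`, repeats the last known price. It does not recover what happened inside the hour.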

You will perform the following steps:
1. [Download Minute Data](#minute-data)
2. [Resample Data](#resample-data)

Note: Downloading the data may occasionally fail because of changes on the Yahoo! Finance side.

In such cases, upgrade the `yfinance` package using `pip install --upgrade yfinance`. To do so, change the type of the cell below from 'Raw NBConvert' to 'Code'. Then restart the kernel and run all of the following cells again.

## Import Libraries

In [1]:
# To fetch financial data
import yfinance as yf

# For visualisation
import matplotlib.pyplot as plt
%matplotlib inline
plt.style.use('seaborn-v0_8-darkgrid')

# Ignore warnings
import warnings 
warnings.filterwarnings('ignore')

<a id='minute-data'></a> 
## Download Minute Data

The `download` method of `yfinance` has parameters `period` and `interval`. You can play around with these parameters to download data for different periods and intervals.

You can download the minute data for up to seven days from Yahoo! Finance. The syntax for downloading the minute data of an asset for 5 days is as below:
```python
yf.download(tickers, period="5d", interval="1m", auto_adjust=True)
```

Parameters:
1. **tickers:** Ticker of the asset.
2. **period:** The span of data required, in days, months or years. The valid periods are `1d, 5d, 1mo, 3mo, 6mo, 1y, 2y, 5y, 10y, ytd, max`.
3. **interval:** The frequency of the data. The valid intervals are `1m, 2m, 5m, 15m, 30m, 60m, 90m, 1h, 1d, 5d, 1wk, 1mo, 3mo`.
4. **auto_adjust:** `True` to download adjusted data, else `False`.
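A typo in `period` or `interval` is easy to make, so it can help to check the values before calling `download`. The sketch below is a hypothetical helper, not part of `yfinance`. Its sets of valid values are copied from the parameter list above, and the library itself may accept other values.

```python
# Values copied from the parameter list above (an assumption, not the library's API)
VALID_PERIODS = {"1d", "5d", "1mo", "3mo", "6mo", "1y", "2y", "5y", "10y", "ytd", "max"}
VALID_INTERVALS = {"1m", "2m", "5m", "15m", "30m", "60m", "90m",
                   "1h", "1d", "5d", "1wk", "1mo", "3mo"}


def build_download_kwargs(ticker, period="5d", interval="1m"):
    """Hypothetical helper: validate the arguments and return them as keyword arguments."""
    if period not in VALID_PERIODS:
        raise ValueError(f"Unknown period: {period!r}")
    if interval not in VALID_INTERVALS:
        raise ValueError(f"Unknown interval: {interval!r}")
    return {"tickers": ticker, "period": period,
            "interval": interval, "auto_adjust": True}


kwargs = build_download_kwargs("AAPL", period="5d", interval="15m")
print(kwargs)
```

With network access, the result can be passed on as `yf.download(**kwargs)`.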

In [2]:
# Download the minute data for Apple
apple_minute_data = yf.download(tickers="AAPL", period="5d", interval="1m", auto_adjust=True)

# Display the first 5 rows
apple_minute_data.head()

[*********************100%%**********************]  1 of 1 completed


| Datetime | Open | High | Low | Close | Volume |
|---|---|---|---|---|---|
| 2024-08-30 09:30:00-04:00 | 230.0 | 230.399994 | 229.679993 | 229.785004 | 1537267 |
| 2024-08-30 09:31:00-04:00 | 229.789993 | 229.949997 | 229.345001 | 229.660004 | 270139 |
| 2024-08-30 09:32:00-04:00 | 229.619995 | 230.039993 | 229.559998 | 229.800003 | 156875 |
| 2024-08-30 09:33:00-04:00 | 229.830002 | 230.050003 | 229.729996 | 229.990005 | 129943 |
| 2024-08-30 09:34:00-04:00 | 229.970001 | 230.25 | 229.720001 | 230.039993 | 175263 |


<a id='resample-data'></a> 
## Resample Data

During strategy modelling, you might need stock market data at a custom frequency, such as 15 minutes, 1 hour or even 1 month. If you have minute-level data, you can construct the 15-minute, 1-hour or daily candles yourself by resampling, so you don't have to buy them separately.

In this case, you can use the pandas `resample()` method to convert the stock data to the frequency of your choice.

The first step is to define a dictionary with the conversion logic. The open of a resampled candle is the first open in the window, the high is the maximum high, the low is the minimum low, the close is the last close and the volume is the sum of the volumes. The keys `Open`, `High`, `Low`, `Close` and `Volume` must match the column names in your dataframe.

In [3]:
# Aggregate function
ohlcv_dict = {'Open': 'first',
              'High': 'max',
              'Low': 'min',
              'Close': 'last',
              'Volume': 'sum'
             }

You can now use the `resample()` method to resample the data to the desired frequency.

Syntax:
```python
DataFrame.resample(interval).agg(aggregate)
```

Parameters:
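1. **interval:** Resampling interval such as `15T` for 15 minutes (`H` is for hours, `D` is for days, `M` is for months). From pandas 2.2 onwards, `T`, `H` and `M` are deprecated in favour of `min`, `h` and `ME`.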
2. **aggregate:** Dictionary with aggregating values to be used while resampling

Returns: <br>
Resampled dataframe
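To see the aggregation logic at work, here is a minimal sketch on three made-up one-minute bars. The values are illustrative, not market data. The alias `15min` is used in place of `15T`.

```python
import pandas as pd

# Three made-up one-minute bars (illustrative values only)
index = pd.date_range("2024-01-02 09:30", periods=3, freq="1min")
bars = pd.DataFrame({"Open": [100.0, 100.5, 100.2],
                     "High": [100.6, 100.9, 100.4],
                     "Low": [99.8, 100.1, 99.9],
                     "Close": [100.5, 100.2, 100.3],
                     "Volume": [1000, 400, 600]}, index=index)

# Same conversion logic as the dictionary defined above
ohlcv_dict = {"Open": "first", "High": "max", "Low": "min",
              "Close": "last", "Volume": "sum"}

# All three bars fall into the 09:30 window, so one candle is returned
candle = bars.resample("15min").agg(ohlcv_dict)
print(candle)
```

The resulting candle opens at the first bar's open, closes at the last bar's close, spans the highest high and lowest low, and carries the total volume.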

### Resample minute data to 15 minutes data

In [4]:
# Resample data to 15 minutes data
apple_minute_data_15M = apple_minute_data.resample('15T').agg(ohlcv_dict)

# Drop the missing values
apple_minute_data_15M.dropna(inplace=True)

# Display the first 5 rows
apple_minute_data_15M.head()

| Datetime | Open | High | Low | Close | Volume |
|---|---|---|---|---|---|
| 2024-08-30 09:30:00-04:00 | 230.0 | 230.399994 | 229.345001 | 229.815002 | 3823522 |
| 2024-08-30 09:45:00-04:00 | 229.830002 | 230.119995 | 228.809998 | 229.089996 | 2089964 |
| 2024-08-30 10:00:00-04:00 | 229.100006 | 229.320007 | 228.5401 | 228.776901 | 1757321 |
| 2024-08-30 10:15:00-04:00 | 228.630005 | 229.199997 | 228.600006 | 229.195007 | 1291462 |
| 2024-08-30 10:30:00-04:00 | 229.199997 | 230.009995 | 229.130005 | 229.529999 | 1371963 |
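The `dropna` step matters because `resample()` creates a window for every 15 minutes between the first and the last timestamp, including the hours when the market is closed. The sketch below shows this on two synthetic sessions of made-up prices, assuming a 09:30 to 15:59 trading day.

```python
import numpy as np
import pandas as pd

# Two synthetic sessions of one-minute bars, 09:30 to 15:59 each (made-up prices)
first = pd.date_range("2024-01-02 09:30", "2024-01-02 15:59", freq="1min")
second = pd.date_range("2024-01-03 09:30", "2024-01-03 15:59", freq="1min")
index = first.append(second)

rng = np.random.default_rng(0)
close = 100 + rng.normal(0, 0.05, len(index)).cumsum()
minute = pd.DataFrame({"Open": close, "High": close + 0.02,
                       "Low": close - 0.02, "Close": close,
                       "Volume": 100}, index=index)

ohlcv_dict = {"Open": "first", "High": "max", "Low": "min",
              "Close": "last", "Volume": "sum"}

resampled = minute.resample("15min").agg(ohlcv_dict)

# Overnight windows have no trades: prices are NaN and volume sums to 0
n_empty = int(resampled["Open"].isna().sum())

# Dropping them leaves only the windows inside trading hours
cleaned = resampled.dropna()
print(len(resampled), n_empty, len(cleaned))
```

A quick sanity check after any resampling is that the total volume is unchanged, that is, `cleaned["Volume"].sum()` equals `minute["Volume"].sum()`.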


### Resample minute data to 1 hour data

In [5]:
# Resample data to 1 hour data
apple_minute_data_1H = apple_minute_data.resample('1H').agg(ohlcv_dict)

# Drop the missing values
apple_minute_data_1H.dropna(inplace=True)

# Display the first 5 rows
apple_minute_data_1H.head()

| Datetime | Open | High | Low | Close | Volume |
|---|---|---|---|---|---|
| 2024-08-30 09:00:00-04:00 | 230.0 | 230.399994 | 228.809998 | 229.089996 | 5913486 |
| 2024-08-30 10:00:00-04:00 | 229.100006 | 230.009995 | 228.5401 | 228.690002 | 5542615 |
| 2024-08-30 11:00:00-04:00 | 228.699997 | 229.0 | 227.550003 | 228.619995 | 4370349 |
| 2024-08-30 12:00:00-04:00 | 228.610001 | 228.875 | 227.479996 | 228.350006 | 3015741 |
| 2024-08-30 13:00:00-04:00 | 228.369995 | 228.699997 | 227.679306 | 227.759995 | 2088426 |
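Note that the first hourly candle above is labelled 09:00 although the first minute bar is at 09:30. By default, pandas anchors the windows to the start of the day, so the first candle covers only 30 minutes of trading. If you prefer candles that start at the market open, the `offset` argument of `resample()` shifts the windows. The sketch below uses a synthetic session and counts the minute bars in each window. The 09:30 open is an assumption that matches the data shown above.

```python
import pandas as pd

# One synthetic session of one-minute closes, 09:30 to 15:59 (constant dummy price)
index = pd.date_range("2024-01-02 09:30", "2024-01-02 15:59", freq="1min")
minute_close = pd.Series(100.0, index=index, name="Close")

# Default: windows start on the hour, so the first one holds only 30 bars
default_counts = minute_close.resample("60min").count()

# Shift the windows by 30 minutes so that they start at 09:30
aligned_counts = minute_close.resample("60min", offset="30min").count()

print(default_counts.head(3))
print(aligned_counts.head(3))
```

With the offset, the short candle moves from the start of the session to the end. Which convention is preferable depends on the strategy being tested.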


### Resample minute data to 4 hours data

In [6]:
# Resample data to 4 hours data
apple_minute_data_4H = apple_minute_data.resample('4H').agg(ohlcv_dict)

# Drop the missing values
apple_minute_data_4H.dropna(inplace=True)

# Display the first 5 rows
apple_minute_data_4H.head()

| Datetime | Open | High | Low | Close | Volume |
|---|---|---|---|---|---|
| 2024-08-30 08:00:00-04:00 | 230.0 | 230.399994 | 227.550003 | 228.619995 | 15826450 |
| 2024-08-30 12:00:00-04:00 | 228.610001 | 229.850006 | 227.479996 | 228.860001 | 16713128 |
| 2024-09-03 08:00:00-04:00 | 228.238495 | 229.0 | 223.270004 | 223.539993 | 18875611 |
| 2024-09-03 12:00:00-04:00 | 223.550003 | 224.309998 | 221.169998 | 222.75 | 19550022 |
| 2024-09-04 08:00:00-04:00 | 221.449997 | 221.755005 | 217.479996 | 219.320007 | 21601707 |


## Tweak the code

You can tweak the code in the following ways:

1. Download the data for an asset of your choice other than `AAPL`.
2. Use a different time interval to resample the data.
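To try several intervals quickly, the resampling steps can be wrapped in a function. The sketch below is one possible way to do it. `resample_ohlcv` is a hypothetical helper name, and the demo runs on a synthetic session of made-up prices so that it works without a download.

```python
import numpy as np
import pandas as pd

OHLCV_AGG = {"Open": "first", "High": "max", "Low": "min",
             "Close": "last", "Volume": "sum"}


def resample_ohlcv(data, interval):
    """Hypothetical helper: resample OHLCV data and drop the empty windows."""
    missing = [col for col in OHLCV_AGG if col not in data.columns]
    if missing:
        raise KeyError(f"Missing columns: {missing}")
    return data.resample(interval).agg(OHLCV_AGG).dropna()


# Synthetic one-minute session, 09:30 to 15:59 (made-up prices)
index = pd.date_range("2024-01-02 09:30", "2024-01-02 15:59", freq="1min")
rng = np.random.default_rng(1)
close = 100 + rng.normal(0, 0.05, len(index)).cumsum()
minute = pd.DataFrame({"Open": close, "High": close + 0.02,
                       "Low": close - 0.02, "Close": close,
                       "Volume": 100}, index=index)

for interval in ("5min", "30min"):
    print(interval, len(resample_ohlcv(minute, interval)))
```

With the downloaded data, the same call would be `resample_ohlcv(apple_minute_data, "30min")`, provided the dataframe has the five columns named above.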