<a href="https://colab.research.google.com/github/srilav/neuralnetwork/blob/main/M4_MP5_NB_Stock_Trading_using_DRL.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Advanced Certification Program in Computational Data Science
## A program by IISc and TalentSprint
### Mini Project: Stock Trading using DRL

## Learning Objectives

At the end of the experiment, you will be able to:

* perform stock trading using Deep Reinforcement Learning
* build an environment for the agent and perform stock trading
* experiment with the SAC model and improve the reward
* create a dashboard for stock trading using `jupyter-dash`

## Information

Deep reinforcement learning combines artificial neural networks with a framework of reinforcement learning that helps software agents learn how to reach their goals. That is, it unites function approximation and target optimization, mapping states and actions to the rewards they lead to.

Reinforcement learning refers to goal-oriented algorithms, which learn how to achieve a complex objective (goal) or how to maximize along a particular dimension over many steps; for example, they can maximize the points won in a game over many moves. Reinforcement learning algorithms can start from a blank slate, and under the right conditions, achieve superhuman performance. Like a pet incentivized by scolding and treats, these algorithms are penalized when they make the wrong decisions and rewarded when they make the right ones – this is reinforcement.

![img](https://miro.medium.com/max/974/0*NgZ_bq_nUOq73jK_.png)

**SAC:** Soft Actor-Critic (SAC) is designed for RL tasks involving continuous actions. Its defining feature is a modified RL objective: instead of only seeking to maximize the lifetime reward, SAC also seeks to maximize the entropy of the policy. The term 'entropy' has many interpretations depending on the application; here it measures how random the policy's action choices are, so a higher-entropy policy explores more.
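
For reference, the entropy-regularized objective described in the SAC paper (Haarnoja et al., 2018) can be written as follows, where the temperature $\alpha$ weights the entropy term against the reward and $\rho_\pi$ denotes the state-action distribution induced by the policy $\pi$:

$$
J(\pi) = \sum_{t=0}^{T} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}\left[ r(s_t, a_t) + \alpha\, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \right],
\qquad
\mathcal{H}\big(\pi(\cdot \mid s_t)\big) = \mathbb{E}_{a \sim \pi(\cdot \mid s_t)}\left[ -\log \pi(a \mid s_t) \right]
$$

Setting $\alpha = 0$ recovers the usual expected-reward objective.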

![img](https://miro.medium.com/max/353/0*5Y3SzMyOQZBRUhrh.png)

Fig: Actor-Critic architecture. Source: Medium

The actor is learned with a policy-gradient approach, and the critic is learned in a value-based fashion. In SAC, there are three networks: the first represents the state-value function (V), parameterised by ψ; the second is the policy function, parameterised by ϕ; and the third represents the soft Q-function, parameterised by θ.

Read more about SAC [here](https://arxiv.org/abs/1801.01290).

## Dataset

### Dataset Description

The dataset chosen for this mini project is NIFTY 50 stock trading data: the price history and trading volumes of the fifty stocks in the NIFTY 50 index of the NSE (National Stock Exchange of India). All data is at the day level, with prices and trading volumes recorded for each stock. The NIFTY 50 is a diversified 50-stock index covering 13 sectors of the economy.

See the constituents of the NIFTY 50 at the following [link](https://www1.nseindia.com/products/content/equities/indices/nifty_50.htm).

**Note:** Choose the NIFTY 50 ticker symbols and download the stock data from '2009-01-01' to '2021-09-01' using YahooDownloader.

## Grading = 10 Points

In [2]:
#@title Install FinRL, other necessary libraries and extensions
!pip -qq install git+https://github.com/AI4Finance-LLC/FinRL-Library.git

!pip install -q jupyter-dash==0.3.0rc1 dash-bootstrap-components

!pip install pyyaml==5.4.1

!pip install macrodemos --upgrade
!pip install -q dash==2.0.0

In [3]:
#@title Download the data
!wget -qq https://cdn.iisc.talentsprint.com/CDS/MiniProjects/nifty50list.csv

!wget -qq https://cdn.iisc.talentsprint.com/CDS/MiniProjects/df_account_value.csv

### Import required packages

In [4]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib
matplotlib.use('Agg')
import warnings
warnings.filterwarnings("ignore")
import datetime
import os
from finrl import config
from finrl.finrl_meta.preprocessor.yahoodownloader import YahooDownloader
from finrl.finrl_meta.preprocessor.preprocessors import FeatureEngineer, data_split
from finrl.finrl_meta.env_stock_trading.env_stocktrading import StockTradingEnv
from finrl.agents.stablebaselines3.models import DRLAgent
from finrl.plot import backtest_stats, backtest_plot, get_daily_return, get_baseline
import sys
sys.path.append("../FinRL-Library")

### Data Loading (1 point)

* Read the ticker symbols of Nifty 50 and add `.NS` extension

* Using the symbols download the stock prices data using YahooDownloader

Hint: [YahooDownloader](https://gist.githubusercontent.com/BruceYanghy/6c37022257cfe765d551c1b173570bd4/raw/b2c69214c6316ff8fa46d9b14d437ec6a1edeef2/DownloadData.py)
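
As a minimal sketch of the first step, the `.NS` suffix can be appended with pandas string concatenation. The three-row frame is a toy stand-in for `nifty50list.csv` (only its `Symbol` column is assumed), and `nifty_tickers` is a hypothetical name.

```python
import pandas as pd

# Toy stand-in for nifty50list.csv; only the `Symbol` column is assumed here.
symbols = pd.DataFrame({"Symbol": ["ACC", "DMART", "MCDOWELL-N"]})

# Append the NSE suffix used by Yahoo Finance and convert to a plain list.
nifty_tickers = (symbols["Symbol"].str.strip() + ".NS").tolist()
print(nifty_tickers)

# The list would then be passed to the downloader, for example:
# YahooDownloader(start_date, end_date, ticker_list=nifty_tickers).fetch_data()
```

Passing a plain list rather than a pandas Series keeps the downloader's input independent of the DataFrame index.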

In [5]:
path = "/content/nifty50list.csv"
# YOUR CODE HERE

In [6]:
DOW_50_TICKER = pd.read_csv(path)

In [7]:
DOW_50_TICKER.Symbol

0            ACC
1     ABBOTINDIA
2       ADANIENT
3     ADANIGREEN
4     ADANITRANS
5          ALKEM
6      AMBUJACEM
7     APOLLOHOSP
8     AUROPHARMA
9          DMART
10    BAJAJHLDNG
11    BANDHANBNK
12    BERGEPAINT
13        BIOCON
14      BOSCHLTD
15      CADILAHC
16        COLPAL
17           DLF
18         DABUR
19          GAIL
20         GLAND
21      GODREJCP
22       HDFCAMC
23       HAVELLS
24     HINDPETRO
25       ICICIGI
26    ICICIPRULI
27           IGL
28    INDUSTOWER
29        NAUKRI
30        INDIGO
31      JUBLFOOD
32           LTI
33         LUPIN
34           MRF
35        MARICO
36    MUTHOOTFIN
37          NMDC
38      PETRONET
39    PIDILITIND
40           PEL
41          PGHH
42           PNB
43       SBICARD
44       SIEMENS
45    TORNTPHARM
46           UBL
47    MCDOWELL-N
48          VEDL
49       YESBANK
Name: Symbol, dtype: object

In [8]:
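""" Download and save the data in a pandas DataFrame """

start_date = '2009-01-01'
end_date = '2021-09-01'
# NOTE: the symbols are passed exactly as read from the CSV, without the
# `.NS` suffix requested above; this is a likely cause of the
# "No data found" failures in the output.
df = YahooDownloader(start_date,
                     end_date,
                     ticker_list = DOW_50_TICKER.Symbol).fetch_data()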

[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed

1 Failed download:
- ABBOTINDIA: No data found, symbol may be delisted
[*********************100%***********************]  1 of 1 completed

1 Failed download:
- ADANIENT: No data found, symbol may be delisted
[*********************100%***********************]  1 of 1 completed

1 Failed download:
- ADANIGREEN: No data found, symbol may be delisted
[*********************100%***********************]  1 of 1 completed

1 Failed download:
- ADANITRANS: No data found, symbol may be delisted
[*********************100%***********************]  1 of 1 completed

1 Failed download:
- ALKEM: No data found, symbol may be delisted
[*********************100%***********************]  1 of 1 completed

1 Failed download:
- AMBUJACEM: No data found, symbol may be delisted
[*********************100%***********************]  1 of 1 completed

1 Failed download:
- AP

In [9]:
df.head()

| | date | open | high | low | close | volume | tic | day |
|---|---|---|---|---|---|---|---|---|
| 0 | 2009-01-02 | 20.57 | 20.76 | 19.52 | 11.407827 | 763300.0 | ACC | 4 |
| 1 | 2009-01-02 | 20.48 | 20.98 | 20.299999 | 20.74 | 888830.0 | IGL | 4 |
| 2 | 2009-01-02 | 104.230003 | 108.129997 | 104.230003 | 106.400002 | 2973855.0 | PNB | 4 |
| 3 | 2009-01-02 | 31.8818 | 33.472698 | 30.290899 | 31.5909 | 11970970.0 | UBL | 4 |
| 4 | 2009-01-05 | 19.77 | 19.77 | 18.66 | 11.186768 | 1191100.0 | ACC | 0 |


### Preprocess Data (1 point)

FinRL uses a `FeatureEngineer` class to preprocess data. Technical indicators commonly used in the analysis of financial markets include:

1. `relative strength index` (RSI): compares the size of recent gains and losses over a specified time period.
2. `moving average convergence divergence` (MACD): an indicator of price momentum and short-term trend.
3. `commodity channel index` (CCI): an indicator that helps identify cyclical trends.
4. `directional index` (DX): measures the strength of a trend by comparing positive and negative directional movement.

* Configure the technical indicators and apply feature engineering

  Hint: `FeatureEngineer()`
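
To make the first indicator concrete, here is a minimal pandas sketch of RSI using Wilder-style exponential smoothing. The function name `rsi`, the window lengths, and the toy price series are assumptions for illustration; the values produced by `FeatureEngineer` (through `stockstats`) may differ in smoothing details.

```python
import pandas as pd

def rsi(close: pd.Series, window: int = 14) -> pd.Series:
    """Relative strength index with Wilder-style smoothing (illustrative)."""
    delta = close.diff()
    gain = delta.clip(lower=0)        # positive price changes, else 0
    loss = (-delta).clip(lower=0)     # magnitude of negative changes, else 0
    avg_gain = gain.ewm(alpha=1 / window, min_periods=window, adjust=False).mean()
    avg_loss = loss.ewm(alpha=1 / window, min_periods=window, adjust=False).mean()
    rs = avg_gain / avg_loss          # becomes inf when there are no losses
    return 100 - 100 / (1 + rs)

# Toy series: steadily rising and steadily falling prices.
rising = pd.Series(range(1, 21), dtype=float)
falling = pd.Series(range(20, 0, -1), dtype=float)
rsi_rising = rsi(rising, window=5)
rsi_falling = rsi(falling, window=5)
print(rsi_rising.iloc[-1], rsi_falling.iloc[-1])
```

A series with only gains sits at the top of the 0 to 100 range and a series with only losses at the bottom, which is the intuition behind reading high RSI as "overbought" and low RSI as "oversold".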

In [10]:
df['date'] = pd.to_datetime(df['date'],format='%Y-%m-%d')

In [11]:
df.sort_values(['date','tic'],ignore_index=True).head()

| | date | open | high | low | close | volume | tic | day |
|---|---|---|---|---|---|---|---|---|
| 0 | 2009-01-02 | 20.57 | 20.76 | 19.52 | 11.407827 | 763300.0 | ACC | 4 |
| 1 | 2009-01-02 | 20.48 | 20.98 | 20.299999 | 20.74 | 888830.0 | IGL | 4 |
| 2 | 2009-01-02 | 104.230003 | 108.129997 | 104.230003 | 106.400002 | 2973855.0 | PNB | 4 |
| 3 | 2009-01-02 | 31.8818 | 33.472698 | 30.290899 | 31.5909 | 11970970.0 | UBL | 4 |
| 4 | 2009-01-05 | 19.77 | 19.77 | 18.66 | 11.186768 | 1191100.0 | ACC | 0 |


In [12]:
""" Perform Feature Engineering """
df = FeatureEngineer(use_technical_indicator=True, use_turbulence=False).preprocess_data(df.copy())


# add covariance matrix as states
df=df.sort_values(['date','tic'],ignore_index=True)
df.index = df.date.factorize()[0]

cov_list = []
# look back is one year
lookback=252
for i in range(lookback,len(df.index.unique())):
  data_lookback = df.loc[i-lookback:i,:]
  price_lookback=data_lookback.pivot_table(index = 'date',columns = 'tic', values = 'close')
  return_lookback = price_lookback.pct_change().dropna()
  covs = return_lookback.cov().values 
  cov_list.append(covs)
  
df_cov = pd.DataFrame({'date':df.date.unique()[lookback:],'cov_list':cov_list})
df = df.merge(df_cov, on='date')
df = df.sort_values(['date','tic']).reset_index(drop=True)
df.head()

Successfully added technical indicators


| | date | open | high | low | close | volume | tic | day | macd | boll_ub | boll_lb | rsi_30 | cci_30 | dx_30 | close_30_sma | close_60_sma | cov_list |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2010-01-04 | 28.389999 | 28.77 | 27.1 | 16.8873 | 628900.0 | ACC | 0 | 0.194004 | 17.939631 | 16.249297 | 51.571446 | 37.393202 | 3.214785 | 16.939093 | 16.667816 | [[0.0015858912972847303]] |
| 1 | 2010-01-05 | 27.360001 | 27.440001 | 26.76 | 16.616022 | 861400.0 | ACC | 1 | 0.128163 | 17.948087 | 16.220495 | 49.640913 | -47.587372 | 4.697722 | 16.929434 | 16.676236 | [[0.001585361394983224]] |
| 2 | 2010-01-06 | 26.93 | 27.09 | 26.15 | 16.307745 | 1621800.0 | ACC | 2 | 0.050525 | 17.949781 | 16.217567 | 47.548468 | -96.729077 | 16.913411 | 16.911759 | 16.672719 | [[0.0015689189754917656]] |
| 3 | 2010-01-07 | 26.469999 | 26.950001 | 26.299999 | 16.579025 | 476400.0 | ACC | 3 | 0.010763 | 17.93076 | 16.255701 | 49.486762 | -78.902058 | 16.913411 | 16.900867 | 16.671593 | [[0.0015693420297262731]] |
| 4 | 2010-01-08 | 26.73 | 26.82 | 26.360001 | 16.424885 | 253500.0 | ACC | 4 | -0.032809 | 17.935578 | 16.246566 | 48.434704 | -85.174112 | 16.913411 | 16.899428 | 16.673681 | [[0.0015692177919720898]] |


### Exploratory Data Analysis (2 points)

#### Describe the statistics of the data

In [13]:
# YOUR CODE HERE

#### Find how many times the stock prices ended lower than their opening prices in 2019 versus 2020

**Hint:** Open - Close per day
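
One way to approach the hint, sketched on toy data: flag the days where `close` is below `open`, then count the flags per calendar year. The frame `toy` and its values are made up for illustration; the column names follow the downloaded data.

```python
import pandas as pd

# Toy data with the same column names as the downloaded frame.
toy = pd.DataFrame({
    "date": pd.to_datetime(["2019-03-01", "2019-03-04",
                            "2020-03-02", "2020-03-03", "2020-03-04"]),
    "tic": ["AAA"] * 5,
    "open": [10.0, 11.0, 12.0, 12.5, 11.0],
    "close": [10.5, 10.8, 11.5, 12.0, 11.2],
})

# True on days where the stock closed below its opening price.
down_day = toy["close"] < toy["open"]

# Summing booleans counts the True values within each year.
down_days_per_year = down_day.groupby(toy["date"].dt.year).sum()
print(down_days_per_year.loc[[2019, 2020]])
```

Adding `toy["tic"]` as a second grouping key would give the same count per stock.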

In [14]:
# YOUR CODE HERE

#### Find the stock that shows the highest increase in stock price per day, over the entire time period

In [15]:
# YOUR CODE HERE

#### Find the loss percentage of each stock considering open and closing prices of each day

**Hint:** `sum(open - close) / len(instances)`
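
The hint translates to a per-stock mean of `open - close`. The sketch below computes that on toy data and, as an assumption about what "percentage" means here, also expresses each day's loss relative to the opening price.

```python
import pandas as pd

# Toy data; values are invented for illustration.
toy = pd.DataFrame({
    "tic": ["AAA", "AAA", "BBB", "BBB"],
    "open": [10.0, 12.0, 100.0, 100.0],
    "close": [9.0, 12.0, 101.0, 103.0],
})

# sum(open - close) / len(instances), computed per stock.
avg_loss = (toy["open"] - toy["close"]).groupby(toy["tic"]).mean()

# Assumption: the same quantity relative to the opening price, in percent.
loss_pct = ((toy["open"] - toy["close"]) / toy["open"] * 100).groupby(toy["tic"]).mean()

print(avg_loss)
print(loss_pct)
```

A negative value means the stock gained on average between open and close.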

In [16]:
# YOUR CODE HERE

#### Find the top 10 stocks with the highest volume

In [17]:
# YOUR CODE HERE

#### Plot the closing value of the stock with the highest volume and returns

In [18]:
# YOUR CODE HERE

#### Daily Returns of the stocks

* Apply pct_change() and extract daily returns

* Plot the histogram of daily returns

* Find the stock with maximum daily return
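
A minimal sketch of these steps on toy data. Grouping by `tic` before calling `pct_change()` keeps the first row of one stock from being compared with the last row of another; the frame and its values are invented for illustration.

```python
import pandas as pd

toy = pd.DataFrame({
    "date": pd.to_datetime(["2021-01-04", "2021-01-05", "2021-01-06"] * 2),
    "tic": ["AAA"] * 3 + ["BBB"] * 3,
    "close": [100.0, 110.0, 99.0, 50.0, 50.0, 75.0],
}).sort_values(["tic", "date"])

# Daily return per stock; the first row of each stock is NaN.
toy["daily_return"] = toy.groupby("tic")["close"].pct_change()

# Row holding the single largest daily return (NaN values are skipped).
best = toy.loc[toy["daily_return"].idxmax()]
print(best["tic"], best["date"].date(), round(best["daily_return"], 4))

# Histogram of daily returns (uncomment in a notebook with an inline backend):
# toy["daily_return"].dropna().hist(bins=50)
```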

In [19]:
# YOUR CODE HERE

### Train & Trade Data Split

In real-life trading, the model needs to be updated periodically using rolling windows. Here, we slice the data just once into a train set and a trade set.
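
The rolling-window idea can be sketched as a generator that yields successive (train, trade) date ranges, each shifted forward by one trade period. The helper `rolling_windows` and the window lengths are hypothetical; they are not part of FinRL.

```python
import pandas as pd

def rolling_windows(dates, train_days, trade_days):
    """Yield (train_dates, trade_dates) pairs that roll forward by trade_days."""
    dates = pd.DatetimeIndex(dates).unique().sort_values()
    start = 0
    while start + train_days + trade_days <= len(dates):
        train_dates = dates[start:start + train_days]
        trade_dates = dates[start + train_days:start + train_days + trade_days]
        yield train_dates, trade_dates
        start += trade_days  # retrain after every trade window

# Toy calendar of 10 business days: train on 6, trade on the next 2.
calendar = pd.bdate_range("2020-01-01", periods=10)
windows = list(rolling_windows(calendar, train_days=6, trade_days=2))
for train_dates, trade_dates in windows:
    print(train_dates[0].date(), "to", train_dates[-1].date(),
          "| trade", trade_dates[0].date(), "to", trade_dates[-1].date())
```

Each trade window starts immediately after its train window ends, so the model never sees the dates it is evaluated on.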

In [20]:
PATH_TO_MODEL_DIR = 'drive/MyDrive/FinRLManystock/'
print(PATH_TO_MODEL_DIR)
import os
# Create the output sub-directories if they do not already exist
for sub_dir in ('saved', 'trained_model_data', 'tensor', 'results'):
    os.makedirs(PATH_TO_MODEL_DIR + sub_dir, exist_ok=True)

drive/MyDrive/FinRLManystock/


In [21]:
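# Assumed split point (not in the original, which used the full date range for
# both sets): train on data before split_date and trade on the remainder.
split_date = '2019-01-01'
train = data_split(df, start_date, split_date)
trade = data_split(df, split_date, end_date)
train.to_csv(PATH_TO_MODEL_DIR + 'saved' + '/train_MULTI.csv',index=False)
trade.to_csv(PATH_TO_MODEL_DIR + 'saved' + '/trade_MULTI.csv',index=False)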

### Build Environment (1 point)


* Define the following kwargs for the stock trading environment

  * stock_dim: (int) number of unique stocks
  * hmax : (int) maximum number of shares to trade
  * initial_amount: (int) start money
  * transaction_cost_pct : (float) transaction cost percentage per trade
  * reward_scaling: (float) scaling factor for reward, good for training
  * tech_indicator_list: (list) a list of technical indicator names (modified from config.py)

In [22]:
from finrl import config
## stockstats technical indicator column names
## check https://pypi.org/project/stockstats/ for different names
TECHNICAL_INDICATORS_LIST = ["macd","boll_ub","boll_lb","rsi_30", "cci_30", "dx_30","close_30_sma","close_60_sma"]

stock_dimension = len(train.tic.unique())
state_space = 1 + 2*stock_dimension + len(TECHNICAL_INDICATORS_LIST)*stock_dimension
#state_space = stock_dimension
#state_space = 156
print(f"Stock data Dimensions: {stock_dimension}, State Spaces: {state_space}")
env_kwargs = {
    "hmax": 100,
    "num_stock_shares": 50,
    "initial_amount": 1000000,
    # "transaction_cost_pct": 0.001,
    "buy_cost_pct": 0.001,
    "sell_cost_pct": 0.001,
    "state_space": state_space,
    "stock_dim": stock_dimension,
    "tech_indicator_list": TECHNICAL_INDICATORS_LIST,
    "action_space": stock_dimension,
    "reward_scaling": 1e-4,
}
e_train_gym = StockTradingEnv(df = train, **env_kwargs)
env_train, _ = e_train_gym.get_sb_env()
print(type(env_train))

Stock data Dimensions: 1, State Spaces: 11
<class 'stable_baselines3.common.vec_env.dummy_vec_env.DummyVecEnv'>


In [34]:
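""" Build the stock trading Environment """
# YOUR CODE HERE
# Reuse the `trade` split created above instead of splitting again.
e_trade_gym = StockTradingEnv(df = trade, **env_kwargs)
env_trade, obs_trade = e_trade_gym.get_sb_env()

# NOTE: DRL_prediction appears to expect a trained model as its first argument;
# none has been trained at this point, which is a likely cause of the error below.
df_account_value, df_actions = DRLAgent.DRL_prediction(e_trade_gym, env_trade, obs_trade)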

AttributeError: ignored

In [26]:
agent = DRLAgent(env = env_train)
DDPG_PARAMS = {"batch_size": 64, "buffer_size": 500000, "learning_rate": 0.0001}
model_ddpg = agent.get_model("ddpg",model_kwargs = DDPG_PARAMS)

trained_ddpg = agent.train_model(model=model_ddpg, 
                             tb_log_name='ddpg',
                             total_timesteps=30000)

{'batch_size': 64, 'buffer_size': 500000, 'learning_rate': 0.0001}
Using cuda device


TypeError: ignored

In [1]:
agent = DRLAgent(env = env_train)
PPO_PARAMS = {
    "n_steps": 2048,
    "ent_coef": 0.005,
    "learning_rate": 0.0001,
    "batch_size": 128,}
model_ppo = agent.get_model("ppo",model_kwargs = PPO_PARAMS)
trained_ppo = agent.train_model(model=model_ppo, 
                             tb_log_name='ppo',
                             total_timesteps=80000)

NameError: ignored

In [57]:
#Model: A2C
agent = DRLAgent(env = env_train)
A2C_PARAMS = {"n_steps": 5, "ent_coef": 0.005, "learning_rate": 0.0002}
model_a2c = agent.get_model(model_name="a2c",model_kwargs = A2C_PARAMS)
trained_a2c = agent.train_model(model=model_a2c, 
                                tb_log_name='a2c',
                                total_timesteps=50000)

{'n_steps': 5, 'ent_coef': 0.005, 'learning_rate': 0.0002}
Using cpu device


TypeError: ignored

In [56]:
agent = DRLAgent(env = env_train)
A2C_PARAMS = {"n_steps": 5, "ent_coef": 0.005, "learning_rate": 0.0002}
model_a2c = agent.get_model(model_name="a2c",model_kwargs = A2C_PARAMS)
trained_a2c = agent.train_model(model_a2c,"a2c")

{'n_steps': 5, 'ent_coef': 0.005, 'learning_rate': 0.0002}
Using cpu device


TypeError: ignored

### Implement DRL Algorithm (2 points)

Use Soft Actor-Critic (SAC) for stock trading; it is one of the more recent state-of-the-art algorithms and is known for its stability.

* Define the SAC parameters and train the SAC model
* Optimize the parameters to improve the reward
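
As a starting point for tuning, a parameter dictionary might look like the sketch below. The keys are keyword arguments accepted by the Stable-Baselines3 `SAC` class; the values are hypothetical and not tuned for this dataset. The commented calls mirror the `get_model` / `train_model` pattern used for the other algorithms in this notebook and assume that `agent` and the training environment were created successfully.

```python
# Hypothetical starting values; adjust them while monitoring the reward.
SAC_PARAMS = {
    "batch_size": 128,        # transitions sampled per gradient step
    "buffer_size": 100000,    # replay buffer capacity
    "learning_rate": 0.0003,
    "learning_starts": 100,   # environment steps collected before training begins
    "ent_coef": "auto_0.1",   # learn the entropy temperature, starting from 0.1
}
print(SAC_PARAMS)

# Assumed usage, following the pattern of the earlier cells:
# model_sac = agent.get_model("sac", model_kwargs=SAC_PARAMS)
# trained_sac = agent.train_model(model=model_sac,
#                                 tb_log_name="sac",
#                                 total_timesteps=30000)
```

The entropy coefficient is usually the first parameter worth experimenting with, since it controls how strongly the policy is pushed toward exploration.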

In [None]:
""" Train SAC Model """
# YOUR CODE HERE

#### Optional: Implement other DRL Algorithms

### Trading (1 point)

* Build the Environment for trading
* Use the trained SAC model to trade

In [None]:
""" Create trading env and make prediction and get the account value change """
# YOUR CODE HERE

### Backtesting Performance (Optional)

Backtesting plays a key role in evaluating the performance of a trading strategy. It assesses the viability of a strategy by measuring how it would have performed on historical data. If the strategy backtests well, traders and analysts may have more confidence in employing it going forward. An automated backtesting tool is preferred because it reduces human error.

`FinRL` uses a set of functions to do the backtesting with the [Quantopian pyfolio](https://github.com/quantopian/pyfolio) package. It is easy to use and consists of various individual plots that provide a comprehensive picture of the performance of a trading strategy.

* Perform backtest on the account values and baseline data
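
To show what a backtest summarizes, here is a hand-rolled sketch of three common statistics computed from an account-value series. It is not the `backtest_stats` implementation; the function name `simple_backtest_stats`, the zero risk-free rate, the 252 trading days per year, and the toy values are assumptions.

```python
import numpy as np
import pandas as pd

def simple_backtest_stats(account_value: pd.Series, periods_per_year: int = 252) -> dict:
    """Cumulative return, annualized Sharpe ratio (risk-free rate 0), max drawdown."""
    returns = account_value.pct_change().dropna()
    cumulative_return = account_value.iloc[-1] / account_value.iloc[0] - 1
    std = returns.std()
    sharpe = np.sqrt(periods_per_year) * returns.mean() / std if std > 0 else float("nan")
    # Drawdown: relative drop from the running peak of the account value.
    max_drawdown = (account_value / account_value.cummax() - 1).min()
    return {
        "cumulative_return": cumulative_return,
        "sharpe": sharpe,
        "max_drawdown": max_drawdown,
    }

# Toy account values over four days.
values = pd.Series([1_000_000, 1_010_000, 990_000, 1_050_000], dtype=float)
stats = simple_backtest_stats(values)
print(stats)
```

Running the same function on the baseline index values gives numbers that can be compared directly with the agent's.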

In [None]:
""" BackTest Stats """
# YOUR CODE HERE

#### Plot the Backtest plot with baseline ticker as "^NSEI"

In [None]:
""" BackTest Plot """
# YOUR CODE HERE

### DashBoard (2 points)

Dash is an open-source, low-code framework for rapidly building data apps in Python, R, Julia, and F#. It binds a user interface to Python code with very little boilerplate.

Dash apps are rendered in the web browser, which makes them cross-platform and mobile-ready.

Dash is released under the permissive MIT license. Plotly develops Dash and also offers a platform for writing and deploying Dash apps.

Refer to the Dash documentation [here](https://dash.plotly.com/), in particular Part 2 (Layout) and Part 3 (Basic callbacks) of the Dash tutorial.

To learn more about JupyterDash, see [here](https://medium.com/plotly/introducing-jupyterdash-811f1f57c02e).



#### Create the dashboard using Dash HTML components

Hint: [Layout](https://dash.plotly.com/layout) , [callbacks](https://dash.plotly.com/basic-callbacks)

* Scatter plot of stock closing price
  * Create a dropdown for ticker symbols
  * Create a plot of stock closing prices that updates when a ticker is selected from the dropdown
* Bar plot of the account value resulting from the DRL agent's trades
  * Create two dropdowns for selecting the start and end dates
  * Create a bar plot showing the account value between the start and end dates, updated whenever a dropdown changes
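
The data selection behind the two callbacks can be sketched independently of Dash. The helper names `closing_prices_for` and `account_value_between` are hypothetical, and the `date` and `account_value` columns of the account-value frame are assumptions about `df_account_value.csv`.

```python
import pandas as pd

def closing_prices_for(df: pd.DataFrame, ticker: str) -> pd.DataFrame:
    """Rows of date and close for one ticker, ordered by date."""
    return df.loc[df["tic"] == ticker, ["date", "close"]].sort_values("date")

def account_value_between(df_account_value: pd.DataFrame, start, end) -> pd.DataFrame:
    """Account-value rows whose date lies in [start, end]."""
    dates = pd.to_datetime(df_account_value["date"])
    mask = (dates >= pd.Timestamp(start)) & (dates <= pd.Timestamp(end))
    return df_account_value.loc[mask]

# Toy inputs standing in for the price data and the agent's account values.
prices = pd.DataFrame({
    "date": pd.to_datetime(["2021-01-04", "2021-01-05"] * 2),
    "tic": ["AAA", "AAA", "BBB", "BBB"],
    "close": [100.0, 101.0, 50.0, 49.0],
})
account = pd.DataFrame({
    "date": ["2021-01-01", "2021-01-02", "2021-01-03", "2021-01-04", "2021-01-05"],
    "account_value": [1000000.0, 1001000.0, 999000.0, 1003000.0, 1005000.0],
})

selected_prices = closing_prices_for(prices, "AAA")
selected_account = account_value_between(account, "2021-01-02", "2021-01-04")
print(selected_prices)
print(selected_account)
```

Inside a callback, the first result could be passed to `px.scatter(selected_prices, x="date", y="close")` and the second to `px.bar(selected_account, x="date", y="account_value")`, with the dropdown values supplied as the callback inputs.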


In [None]:
import plotly.express as px
from jupyter_dash import JupyterDash
from dash import dcc
from dash import html
from dash.dependencies import Input, Output

In [None]:
""" Build App """
app = JupyterDash(__name__)
# YOUR CODE HERE

In [None]:
# Run app and display result on external browser
app.run_server(mode='external')

### Report Analysis

* Discuss the parameters used to increase the reward
* Report the safest stocks to trade without much loss
* Comment on the Dashboard application and user interface
