# Advanced Certification Program in Computational Data Science
## A program by IISc and TalentSprint
### Mini Project: Stock Trading using DRL

**DISCLAIMER:** THIS NOTEBOOK IS PROVIDED ONLY AS A REFERENCE SOLUTION NOTEBOOK FOR THE MINI-PROJECT. THERE MAY BE OTHER POSSIBLE APPROACHES/METHODS TO ACHIEVE THE SAME RESULTS.

## Learning Objectives

At the end of the experiment, you will be able to

* perform stock trading using Deep Reinforcement Learning
* build an environment for the agent and perform stock trading
* experiment with the SAC model and improve the reward
* create a dashboard for stock trading using `jupyter-dash`

## Information

Deep reinforcement learning combines artificial neural networks with a framework of reinforcement learning that helps software agents learn how to reach their goals. That is, it unites function approximation and target optimization, mapping states and actions to the rewards they lead to.

Reinforcement learning refers to goal-oriented algorithms, which learn how to achieve a complex objective (goal) or how to maximize along a particular dimension over many steps; for example, they can maximize the points won in a game over many moves. Reinforcement learning algorithms can start from a blank slate, and under the right conditions, achieve superhuman performance. Like a pet incentivized by scolding and treats, these algorithms are penalized when they make the wrong decisions and rewarded when they make the right ones – this is reinforcement.

![img](https://miro.medium.com/max/974/0*NgZ_bq_nUOq73jK_.png)

**SAC:** Soft Actor-Critic (SAC) is designed for RL tasks involving continuous actions. Its defining feature is a modified RL objective function: instead of maximizing only the expected lifetime reward, SAC also maximizes the entropy of the policy. Entropy here measures how random the policy's action choices are, so the entropy term rewards the agent for continuing to explore.
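To make the entropy term concrete, here is a minimal sketch, assuming a one-dimensional Gaussian policy. The function names and the weight `alpha` are hypothetical and chosen for illustration; they are not part of FinRL or the SAC paper's code.

```python
import math

def gaussian_entropy(std: float) -> float:
    """Differential entropy (in nats) of a 1-D Gaussian with standard deviation `std`."""
    return 0.5 * math.log(2 * math.pi * math.e * std ** 2)

def entropy_augmented_reward(reward: float, std: float, alpha: float = 0.1) -> float:
    """Reward plus an entropy bonus; `alpha` (hypothetical value) weights the bonus."""
    return reward + alpha * gaussian_entropy(std)

# A wider action distribution has higher entropy, so it earns a larger bonus.
narrow = entropy_augmented_reward(reward=1.0, std=0.5)
wide = entropy_augmented_reward(reward=1.0, std=2.0)
print(narrow, wide)
```

With equal rewards, the wider (more exploratory) policy scores higher, which is the behaviour the entropy term is meant to encourage.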

![img](https://miro.medium.com/max/353/0*5Y3SzMyOQZBRUhrh.png)

Fig: Actor-Critic architecture. Source: Medium

The actor is learned with a policy-gradient approach, and the critic is learned in a value-based fashion. SAC uses three networks: the first represents the state-value function V, parameterised by ψ; the second is a policy function, parameterised by ϕ; and the third represents the soft Q-function, parameterised by θ.
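As a sketch of how these quantities relate, the maximum-entropy objective and the soft state-value function can be written as follows. The temperature $\alpha$ is written explicitly here as an assumption of this sketch; the SAC paper notes that it can be absorbed into the reward scale.

$$
J(\pi) = \sum_{t} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}\left[ r(s_t, a_t) + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \right]
$$

$$
V_\psi(s_t) \approx \mathbb{E}_{a_t \sim \pi_\phi}\left[ Q_\theta(s_t, a_t) - \alpha \log \pi_\phi(a_t \mid s_t) \right]
$$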

Read more about SAC [here](https://arxiv.org/abs/1801.01290).

## Dataset

### Dataset Description

The dataset chosen for this mini project is NIFTY 50 stock trading data: the price history and trading volumes of the fifty stocks in the NIFTY 50 index from NSE (National Stock Exchange) India. All stock data is at day level, with pricing and trading values for each stock. The NIFTY 50 is a diversified 50-stock index accounting for 13 sectors of the economy.

See the stocks in the NIFTY 50 index at the following [link](https://www1.nseindia.com/products/content/equities/indices/nifty_50.htm)

**Note:** Choose the NIFTY 50 ticker symbols and download the stock data from '2009-01-01' to '2021-09-01' using `YahooDownloader`.

In [None]:
#@title Install FinRL
!pip -qq install git+https://github.com/AI4Finance-LLC/FinRL-Library.git

In [None]:
#@title Download the data
!wget -qq https://cdn.iisc.talentsprint.com/CDS/MiniProjects/nifty50list.csv

### Import required packages

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib
matplotlib.use('Agg')
import warnings
warnings.filterwarnings("ignore")
import datetime
import os
from finrl.apps import config
from finrl.neo_finrl.preprocessor.yahoodownloader import YahooDownloader
from finrl.neo_finrl.preprocessor.preprocessors import FeatureEngineer, data_split
from finrl.neo_finrl.env_stock_trading.env_stocktrading import StockTradingEnv
from finrl.drl_agents.stablebaselines3.models import DRLAgent
from finrl.plot import backtest_stats, backtest_plot, get_daily_return, get_baseline
import sys
sys.path.append("../FinRL-Library")

### Data Loading

* Read the ticker symbols of Nifty 50 and add `.NS` extension

* Using the symbols download the stock prices data using YahooDownloader

In [None]:
nif = pd.read_csv("/content/nifty50list.csv")
nif.Symbol = nif.Symbol + ".NS"
nif.Symbol.values

In [None]:
# Download and save the data in a pandas DataFrame:
df = YahooDownloader(start_date = '2009-01-01',
                          end_date = '2021-09-01',
                          ticker_list = nif.Symbol.values).fetch_data()

print(df.sort_values(['date','tic'],ignore_index=True).head())

In [None]:
df.tic.nunique()

In [None]:
df.date.min(), df.date.max()

In [None]:
df.dtypes

### Preprocess Data

FinRL uses a `FeatureEngineer` class to preprocess data. Some of the technical indicators used in the analysis of financial markets include:

1. Relative strength index (RSI): measures the size of recent gains and losses over a specified time period.
2. Moving average convergence divergence (MACD): an indicator of price momentum and short-term trend.
3. Commodity channel index (CCI): an indicator that helps identify cyclical trends.
4. Directional index (DX): measures the strength of directional price movement; it is part of the directional movement trading system.
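To show what such an indicator computes, here is a minimal sketch of RSI using simple rolling averages. The function name `simple_rsi` and the 14-day window are assumptions for illustration; the library FinRL relies on may use a different smoothing method, so its values can differ.

```python
import pandas as pd

def simple_rsi(close: pd.Series, window: int = 14) -> pd.Series:
    """RSI from simple rolling means of gains and losses (illustrative variant)."""
    delta = close.diff()                      # day-over-day price change
    gains = delta.clip(lower=0)               # keep rises, zero out falls
    losses = (-delta).clip(lower=0)           # keep falls as positive numbers
    avg_gain = gains.rolling(window).mean()
    avg_loss = losses.rolling(window).mean()
    rs = avg_gain / avg_loss                  # relative strength
    return 100 - 100 / (1 + rs)               # NaN where prices are flat for the whole window

rising = pd.Series(range(1, 31), dtype=float)
falling = pd.Series(range(30, 0, -1), dtype=float)
print(simple_rsi(rising).iloc[-1], simple_rsi(falling).iloc[-1])
```

A steadily rising series sits at the top of the 0 to 100 range and a steadily falling one at the bottom; real prices fall in between.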

* Configure the technical indicators and apply feature engineering

  Hint: `FeatureEngineer()`

In [None]:
# Perform Feature Engineering
tech_indicator_list = config.TECHNICAL_INDICATORS_LIST
print(tech_indicator_list)

In [None]:
df.shape

In [None]:
fe = FeatureEngineer(
                    use_technical_indicator=True,
                    tech_indicator_list = tech_indicator_list,
                    use_turbulence=False,
                    user_defined_feature = False)

df = fe.preprocess_data(df)
df.head()

In [None]:
df.shape

### Exploratory Data Analysis

#### Describe the statistics of the data

In [None]:
# max and min
df.close.max(), df.close.min()

In [None]:
# mean of close
df.close.mean()

In [None]:
# stock with min closing value
df[df.close == df.close.min()]

#### How many times did stock prices close lower than they opened in 2019 vs. in 2020?

Hint: Open - Close per day

In [None]:
df19 = df[(df.date >'2018-12-31')&(df.date<'2020-01-01')]
df20 = df[(df.date >'2019-12-31')&(df.date<'2021-01-01')]
price_decrease19 = len(df19[df19.open > df19.close])
price_decrease20 = len(df20[df20.open > df20.close])
price_decrease19, price_decrease20

#### Find the loss percentage of each stock considering open and closing prices of each day

**Hint:** `sum(open - close) / len(instances)`

In [None]:
# Average loss per down day for each stock.
# Note: following the hint, this is the mean of (open - close) in price units, not a true percentage.
price_decrease = df[df.open > df.close]
for sym in sorted(set(price_decrease.tic)):
  symbol_price_decrease = price_decrease[price_decrease.tic == sym]
  loss_pct = sum(symbol_price_decrease['open'] - symbol_price_decrease['close']) / len(symbol_price_decrease)
  print("symbol is {}, and average loss per down day is {}".format(sym,loss_pct))

In [None]:
# Average loss per down day across all stocks (price units)
price_decrease = df[df.open > df.close]
print("No.of instances of price decrease: {} out of {}".format(len(price_decrease),len(df)))
loss = sum(price_decrease['open'] - price_decrease['close']) / len(price_decrease)
loss

#### Find the stock that shows the highest increase in stock price per day, over the entire time period

In [None]:
stock_high = df.copy()
stock_high['gain'] = stock_high['close'] - stock_high['open']
stock_high[stock_high['gain'] == stock_high['gain'].max()]

#### Top 10 Stocks with high volume

In [None]:
# Total traded volume per ticker, largest first
df.groupby('tic')['volume'].sum().sort_values(ascending=False).head(10)

#### Plot the closing value of stock with highest volume and returns

In [None]:
%matplotlib inline

# Index by date (keeping the column for plotting), then select the window by label
mrf = df[df.tic == "MRF.NS"].set_index('date', drop=False)
mrf = mrf.loc["2021-06-01":"2021-08-30"]
plt.figure(figsize=(20,8))
plt.plot('date','close',data=mrf)
plt.xlabel('Date')
plt.ylabel('Close Price')
plt.xticks(rotation=45)
plt.title("MRF stock closing value")
plt.show()

#### Daily Returns of the stocks

* Apply pct_change() and extract daily returns

* Plot the histogram of daily returns

* Find the stock with maximum daily return

In [None]:
df1 = df.copy()
# Compute returns within each ticker so that a stock's price is never compared with another stock's
df1['Daily Returns'] = df1.groupby('tic')['close'].pct_change()
df1['Daily Returns'].hist()

In [None]:
# Maximum Daily return stock
df1[df1['Daily Returns'] == df1['Daily Returns'].max()]

### Train & Trade Data Split

In real-life trading, the model needs to be updated periodically using rolling windows. Here, we slice the data just once into a train set and a trade set.
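For reference, a rolling-window scheme could be sketched as below. The helper `rolling_windows` and its parameters are hypothetical and are not part of FinRL; the sketch only produces date boundaries, which could then be used to slice the data for each window.

```python
from typing import Iterator, List, Tuple

def rolling_windows(dates: List[str], train_size: int,
                    trade_size: int) -> Iterator[Tuple[str, str, str, str]]:
    """Yield (train_start, train_end, trade_start, trade_end), all inclusive.

    `dates` must be sorted; each window moves forward by `trade_size` days.
    """
    start = 0
    while start + train_size + trade_size <= len(dates):
        train = dates[start:start + train_size]
        trade = dates[start + train_size:start + train_size + trade_size]
        yield train[0], train[-1], trade[0], trade[-1]
        start += trade_size

# Ten trading days, train on four, trade on the next two, then roll forward.
days = [f"2021-01-{d:02d}" for d in range(1, 11)]
windows = list(rolling_windows(days, train_size=4, trade_size=2))
for w in windows:
    print(w)
```

Each trade window starts right after its train window ends, so the model is always evaluated on dates it has not seen.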

In [None]:
# Train and trade data
train = data_split(df, start = '2009-01-02', end = '2021-01-01')
trade = data_split(df, start = '2021-01-01', end = '2021-09-01')
# Check the length of the two datasets
print(len(train))
print(len(trade))

In [None]:
train.close.max(), train.close.min()

### Build Environment

* stock_dim: (int) number of unique stocks

* hmax : (int) maximum number of shares to trade

* initial_amount: (int) start money

* buy_cost_pct, sell_cost_pct: (float) transaction cost percentage per buy and per sell trade

* reward_scaling: (float) scaling factor for reward, good for training

* tech_indicator_list: (list) a list of technical indicator names (modified from config.py)
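The sketch below illustrates how these parameters shape the state and the actions, assuming the state is laid out as cash, then prices, then holdings, then one block per indicator, and that raw actions in [-1, 1] are scaled by `hmax` to share counts. The helpers `build_state` and `scale_actions` are hypothetical and only mirror that assumed layout; they are not FinRL functions.

```python
from typing import Dict, List

def build_state(cash: float, prices: List[float], holdings: List[int],
                indicators: Dict[str, List[float]]) -> List[float]:
    """Flatten cash, per-stock prices, holdings, and indicator values into one vector."""
    state = [cash] + list(prices) + list(holdings)
    for values in indicators.values():        # one block of stock_dim values per indicator
        state += list(values)
    return state

def scale_actions(raw_actions: List[float], hmax: int) -> List[int]:
    """Map actions in [-1, 1] to share counts: negative sells, positive buys."""
    return [int(a * hmax) for a in raw_actions]

stock_dim = 2
indicators = {"macd": [0.4, -0.2], "rsi_30": [55.0, 47.5]}
state = build_state(1_000_000.0, [100.0, 250.0], [0, 10], indicators)
print(len(state), 1 + 2 * stock_dim + len(indicators) * stock_dim)
print(scale_actions([-0.5, 0.25], hmax=50))
```

The state length matches `1 + 2*stock_dim + n_indicators*stock_dim`: one entry for cash, two per stock for price and holdings, and one per stock for each indicator.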

In [None]:
# Compute State Space and Action Space
stock_dimension = len(train.tic.unique())
state_space = 1 + 2*stock_dimension + len(config.TECHNICAL_INDICATORS_LIST) * stock_dimension
print(f"Stock data Dimensions: {stock_dimension}, State Spaces: {state_space}")

# Initialize an environment class
env_kwargs = {
    "hmax": 50, 
    "initial_amount": 189628060, # Sum of total stocks closing value
    "buy_cost_pct": 0.001,
    "sell_cost_pct": 0.001, 
    "state_space": state_space, 
    "stock_dim": stock_dimension, 
    "tech_indicator_list": config.TECHNICAL_INDICATORS_LIST, 
    "action_space": stock_dimension, 
    "reward_scaling": 1e-4}

e_train_gym = StockTradingEnv(df = train, **env_kwargs)
env_train, _ = e_train_gym.get_sb_env()
print(type(env_train))

### Implement DRL Algorithms

Use Soft Actor-Critic (SAC) for stock trading. It is one of the more recent state-of-the-art DRL algorithms and is known for its stability.

* Define the SAC parameters and train the SAC model
* Optimize the parameters to improve the reward

In [None]:
# Train SAC Model
agent = DRLAgent(env = env_train)
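SAC_PARAMS = {
    "batch_size": 128,         # transitions sampled per gradient update
    "buffer_size": 100,        # replay buffer capacity; very small, consider raising it when tuning
    "learning_rate": 0.001,
    "learning_starts": 200,    # environment steps collected before gradient updates begin
    "ent_coef": "auto_0.1"     # learn the entropy coefficient automatically, starting from 0.1
}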
model_sac = agent.get_model("sac",model_kwargs = SAC_PARAMS)
trained_sac = agent.train_model(model=model_sac, 
                             tb_log_name='sac',
                             total_timesteps=30000)

### Trading

* Build the Environment for trading
* Use the trained SAC model to trade

In [None]:
# Trade data
trade.head()

In [None]:
# Create trading env
e_trade_gym = StockTradingEnv(df = trade, **env_kwargs)

# Make prediction and get the account value change
df_account_value, df_actions = DRLAgent.DRL_prediction(model = trained_sac, environment = e_trade_gym)

In [None]:
df_account_value.head()

### Backtesting Performance **(Optional)**

Backtesting plays a key role in evaluating the performance of a trading strategy. It assesses the viability of a strategy by discovering how it would have performed on historical data. If the backtest results are good, traders and analysts may have increased confidence to employ the strategy going forward. An automated backtesting tool is preferred because it reduces human error.

`FinRL` uses a set of functions to do the backtesting with [Quantopian pyfolio](https://github.com/quantopian/pyfolio) package. It is easy to use and consists of various individual plots that provide a comprehensive image of the performance of a trading strategy.
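To give a feel for the kind of statistics a backtest reports, here is a minimal sketch computed directly with pandas. The function `backtest_summary`, the 252 trading days per year, and the zero risk-free rate are assumptions of this sketch; pyfolio's exact definitions may differ.

```python
import math
import pandas as pd

def backtest_summary(account_value: pd.Series, periods_per_year: int = 252) -> dict:
    """Cumulative return, annualised Sharpe ratio (risk-free rate 0), and max drawdown."""
    returns = account_value.pct_change().dropna()
    cumulative_return = account_value.iloc[-1] / account_value.iloc[0] - 1
    sharpe = returns.mean() / returns.std() * math.sqrt(periods_per_year)
    running_max = account_value.cummax()                     # highest value seen so far
    max_drawdown = (account_value / running_max - 1).min()   # worst drop from a peak
    return {"cumulative_return": cumulative_return,
            "sharpe": sharpe,
            "max_drawdown": max_drawdown}

toy_values = pd.Series([100.0, 110.0, 99.0, 120.0])
summary = backtest_summary(toy_values)
print(summary)
```

On the toy series the account ends 20% up, and the worst peak-to-trough drop is the fall from 110 to 99, that is 10%.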

* Perform backtest on the account values and baseline data

In [None]:
# BackTestStats
perf_stats_all = backtest_stats(account_value = df_account_value)
perf_stats_all = pd.DataFrame(perf_stats_all)

In [None]:
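# Baseline stats: use the NIFTY 50 index (^NSEI) over the trade period as the benchmark
baseline_df = get_baseline(ticker = '^NSEI', start = '2021-01-01', end = '2021-09-01')
stats = backtest_stats(baseline_df, value_col_name = 'close')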

In [None]:
df_account_value.date.min(),df_account_value.date.max()

### Plot the Backtest plot with baseline ticker as "^NSEI"

In [None]:
# BackTestPlot
%matplotlib inline
backtest_plot(account_value = df_account_value,     # pass the account value memory into the backtest functions
              baseline_ticker = '^NSEI',             # baseline ticker for the NIFTY 50 index (others: Dow Jones ^DJI, S&P 500 ^GSPC, NASDAQ 100 ^NDX)
              baseline_start = '2021-01-01', 
              baseline_end = '2021-08-31')

### DashBoard

Dash is a simple open source library. It is the original low-code framework for rapidly building data apps in Python, R, Julia, and F#. It can bind a user interface to Python code in less than 10 minutes.

Dash apps are rendered in the web browser. Since Dash apps are viewed in the web browser, Dash is inherently cross-platform and mobile ready.

Dash is released under the permissive MIT license. Plotly develops Dash and also offers a platform for writing and deploying Dash apps.

Refer to Dash Documentation [here](https://dash.plotly.com/). Mainly refer to Part 2 (Layout) and Part 3 (Basic callbacks) within the Dash tutorial in the given documentation. 

To learn more about JupyterDash, see [here](https://medium.com/plotly/introducing-jupyterdash-811f1f57c02e).



In [None]:
# Install the library
!pip install -q jupyter-dash==0.3.0rc1 dash-bootstrap-components

In [None]:
!wget -qq https://cdn.iisc.talentsprint.com/CDS/MiniProjects/df_account_value.csv
df_account_value = pd.read_csv("df_account_value.csv")

#### Create the dashboard using Dash HTML components

Hint: [Layout](https://dash.plotly.com/layout) , [callbacks](https://dash.plotly.com/basic-callbacks)

* Scatter plot of stock closing price
  * Create a dropdown for ticker symbols
  * Create a plot of closing prices that updates when a ticker is selected from the dropdown
* Bar plot of the account balance resulting from the DRL agent's trades
  * Create two dropdowns for selecting the start and end dates
  * Create a bar plot showing the account value between the start and end dates, updating when either dropdown changes
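The date-range logic behind the bar plot can be sketched independently of Dash, which makes it easy to check before wiring it into a callback. The helper `select_date_range` is hypothetical; it assumes a `date` column of ISO-formatted strings (`YYYY-MM-DD`), which compare correctly as plain text.

```python
import pandas as pd

def select_date_range(account_df: pd.DataFrame, start_date: str, end_date: str) -> pd.DataFrame:
    """Return rows with start_date <= date <= end_date, swapping the bounds if reversed."""
    if start_date > end_date:
        start_date, end_date = end_date, start_date
    mask = (account_df["date"] >= start_date) & (account_df["date"] <= end_date)
    return account_df.loc[mask]

# Toy account values standing in for the agent's output
toy_account = pd.DataFrame({
    "date": ["2021-01-04", "2021-01-05", "2021-01-06", "2021-01-07"],
    "account_value": [100.0, 101.5, 99.8, 102.3],
})
selected = select_date_range(toy_account, "2021-01-05", "2021-01-06")
print(selected)
```

A callback can pass its two dropdown values straight to such a helper and hand the result to the plotting call.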


In [None]:
import plotly.express as px
from jupyter_dash import JupyterDash
from dash import dcc
from dash import html
from dash.dependencies import Input, Output

# Build App
app = JupyterDash(__name__)
app.layout = html.Div([
    html.H1("Stocks Dashboard using JupyterDash"),
    html.Label([
        "ticker",
        dcc.Dropdown(
            id='ticker-dropdown', clearable=False,
            # Default to the first ticker alphabetically so the graph renders on load
            value=sorted(df.tic.unique())[0], options=[
                {'label': c, 'value': c}
                for c in sorted(df.tic.unique())
            ])
    ]),
    dcc.Graph(id='graph1'),
    html.H2("Account Balance Given by DRL Agent"),
    html.Label([
        "startDate",
        dcc.Dropdown(
            id='startDate-dropdown', clearable=False,
            # Default to the earliest available date
            value=df_account_value.date.min(), options=[
                {'label': c, 'value': c}
                for c in df_account_value.date.values
            ])
    ]),
    html.Label([
        "endDate",
        dcc.Dropdown(
            id='endDate-dropdown', clearable=False,
            # Default to the latest available date
            value=df_account_value.date.max(), options=[
                {'label': c, 'value': c}
                for c in df_account_value.date.values
            ])
    ]),
    dcc.Graph(id='graph2'),
])

# Callback: redraw the closing-price scatter plot when a ticker is selected
@app.callback(
    Output('graph1', 'figure'),
    [Input("ticker-dropdown", "value")]
)
def update_graph1(ticker):
    ticker_df = df[df.tic == ticker]
    return px.scatter(ticker_df, x='date', y='close')

# Callback: redraw the account-value bar plot when either date changes
@app.callback(
    Output('graph2', 'figure'),
    [Input("startDate-dropdown", "value"),Input("endDate-dropdown", "value")]
)
def update_figure(startDate,endDate):
    # Include both the selected start and end dates in the plotted range
    required = df_account_value[(df_account_value.date >= startDate) & (df_account_value.date <= endDate)]
    return px.bar(required, x='date', y='account_value')

In [None]:
# Run app and display result in an external browser
app.run_server(mode='external')

### Report Analysis

* Discuss the parameters used to increase the reward
* Report the safest stocks to trade without much loss
* Comment on the Dashboard application and user interface


**References:** 

http://finrl.org/tutorial/finrl_multiple_stock.html

FinRL Doc: http://finrl.org/tutorial/finrl_single_stock.html

https://www1.nseindia.com/products/content/equities/indices/nifty_50.htm

https://finance.yahoo.com/quote/%5ENSEI/components/

Reference for participants: https://analyticsindiamag.com/stock-market-prediction-using-finrl/

https://medium.com/plotly/introducing-jupyterdash-811f1f57c02e