<a href="https://colab.research.google.com/github/AI4Finance-Foundation/FinRL/blob/master/tutorials/1-Introduction/FinRL_PortfolioAllocation_NeurIPS_2020.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Deep Reinforcement Learning for Stock Trading from Scratch: Portfolio Allocation

Tutorials to use OpenAI DRL to perform portfolio allocation in one Jupyter Notebook | Presented at NeurIPS 2020: Deep RL Workshop

* This blog is based on our paper: FinRL: A Deep Reinforcement Learning Library for Automated Stock Trading in Quantitative Finance, presented at NeurIPS 2020: Deep RL Workshop.
* Check out medium blog for detailed explanations: 
* Please report any issues to our Github: https://github.com/AI4Finance-Foundation/FinRL/issues
* **Pytorch Version** 



# Content

* [1. Problem Definition](#0)
* [2. Getting Started - Load Python packages](#1)
    * [2.1. Install Packages](#1.1)    
    * [2.2. Check Additional Packages](#1.2)
    * [2.3. Import Packages](#1.3)
    * [2.4. Create Folders](#1.4)
* [3. Download Data](#2)
* [4. Preprocess Data](#3)        
    * [4.1. Technical Indicators](#3.1)
    * [4.2. Perform Feature Engineering](#3.2)
* [5. Build Environment](#4)
    * [5.1. Training & Trade Data Split](#4.1)
    * [5.2. User-defined Environment](#4.2)
    * [5.3. Initialize Environment](#4.3)
* [6. Implement DRL Algorithms](#5)
* [7. Backtesting Performance](#6)
    * [7.1. BackTestStats](#6.1)
    * [7.2. BackTestPlot](#6.2)
    * [7.3. Baseline Stats](#6.3)
    * [7.4. Compare to Stock Market Index](#6.4)

<a id='0'></a>
# Part 1. Problem Definition

The problem is to design an automated trading solution for portfolio allocation. We model the stock trading process as a Markov Decision Process (MDP). We then formulate our trading goal as a maximization problem.

The algorithm is trained using Deep Reinforcement Learning (DRL) algorithms and the components of the reinforcement learning environment are:


* Action: The action space describes the actions through which the agent interacts with the environment. Normally, a ∈ A represents the weight of a stock in the portfolio: a ∈ (-1,1). Assuming our stock pool includes N stocks, we can use a list [a<sub>1</sub>, a<sub>2</sub>, ... , a<sub>N</sub>] to determine the weight of each stock in the portfolio, where a<sub>i</sub> ∈ (-1,1) and a<sub>1</sub>+ a<sub>2</sub>+...+a<sub>N</sub>=1. For example, "The weight of AAPL in the portfolio is 10%" is [0.1 , ...].

* Reward function: r(s, a, s′) is the incentive mechanism for an agent to learn a better action. The reward is the change in portfolio value when action a is taken at state s and the agent arrives at the new state s′, i.e., r(s, a, s′) = v′ − v, where v′ and v represent the portfolio values at states s′ and s, respectively.

* State: The state space describes the observations that the agent receives from the environment. Just as a human trader needs to analyze various information before executing a trade, our trading agent observes many different features to better learn in an interactive environment.

* Environment: a multi-asset market of 50 instruments (40 stocks from Brazil, the US, Europe, and China, plus 10 commodity futures)
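
The action and reward definitions above can be illustrated with a small numerical sketch. It assumes that raw agent outputs are mapped to weights with a softmax, which is one common way to obtain non-negative weights that sum to 1 (it rules out short positions, so the weights lie in (0, 1) rather than (-1, 1)). The function names and numbers are illustrative and are not part of FinRL.

```python
import numpy as np

def softmax_weights(raw_actions):
    """Map unconstrained action scores to portfolio weights that sum to 1."""
    scores = np.asarray(raw_actions, dtype=float)
    exp = np.exp(scores - scores.max())  # subtract the max for numerical stability
    return exp / exp.sum()

def step_reward(value, weights, asset_returns):
    """Return (new_value, reward), where reward = v' - v."""
    new_value = value * (1.0 + float(np.dot(weights, asset_returns)))
    return new_value, new_value - value

# Equal scores give equal weights across four hypothetical assets.
weights = softmax_weights([0.0, 0.0, 0.0, 0.0])
new_value, reward = step_reward(1_000_000, weights, [0.01, -0.02, 0.03, 0.00])
print(weights, new_value, reward)
```

With equal weights of 25% and asset returns averaging 0.5%, the portfolio value rises by about 5,000, which is the reward for that step.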


The stock data used in this case study is obtained from the Yahoo Finance API, and the commodity futures data is loaded from local CSV files. The data contains open-high-low-close (OHLC) prices and volume.


<a id='1'></a>
# Part 2. Getting Started - Load Python Packages

<a id='1.1'></a>
## 2.1. Install all the packages through FinRL library


In [1]:
## install finrl library
#%pip install git+https://github.com/AI4Finance-LLC/FinRL-Library.git


<a id='1.2'></a>
## 2.2. Check that the additional packages are present; if not, install them.
* Yahoo Finance API
* pandas
* numpy
* matplotlib
* stockstats
* OpenAI gym
* stable-baselines3
* PyTorch
* pyfolio

<a id='1.3'></a>
## 2.3. Import Packages

In [2]:
from utils import process_future_data, dates_intersection, add_covariance, StockPortfolioEnv
import pandas as pd
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
matplotlib.use('Agg')
%matplotlib inline
import datetime

from finrl import config
from finrl import config_tickers
from finrl.finrl_meta.preprocessor.yahoodownloader import YahooDownloader
from finrl.finrl_meta.preprocessor.preprocessors import FeatureEngineer, data_split
# StockPortfolioEnv is imported from the local utils.py above; importing FinRL's class of the same name here would shadow it.
from finrl.agents.stablebaselines3.models import DRLAgent
from finrl.plot import backtest_stats, backtest_plot, get_daily_return, get_baseline,convert_daily_return_to_pyfolio_ts
from finrl.finrl_meta.data_processor import DataProcessor
from finrl.finrl_meta.data_processors.processor_yahoofinance import YahooFinanceProcessor
import sys
sys.path.append("../FinRL-Library")

  'Module "zipline.assets" not found; multipliers will not be applied'


<a id='1.4'></a>
## 2.4. Create Folders

In [3]:
import os

# Create the output directories if they do not exist yet.
for directory in (config.DATA_SAVE_DIR, config.TRAINED_MODEL_DIR,
                  config.TENSORBOARD_LOG_DIR, config.RESULTS_DIR):
    os.makedirs("./" + directory, exist_ok=True)

<a id='2'></a>
# Part 3. Download Data
Yahoo Finance is a website that provides stock data, financial news, financial reports, etc. All the data provided by Yahoo Finance is free.
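* This notebook uses FinRL's **YahooFinanceProcessor** class to fetch stock data from the Yahoo Finance API; the commodity futures are loaded from local CSV files with `process_future_data`.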
* Call Limit: Using the Public API (without authentication), you are limited to 2,000 requests per hour per IP (or up to a total of 48,000 requests a day).


In [4]:
arrz = process_future_data('data/ARROZ.csv')
bg   = process_future_data('data/BOI_GORDO.csv')
cf   = process_future_data('data/CAFE.csv')
eth  = process_future_data('data/ETANOL.csv')
mil  = process_future_data('data/MILHO.csv')
mf   = process_future_data('data/MINERIO_FERRO.csv')
gold = process_future_data('data/OURO.csv')
petr = process_future_data('data/PETROLEO.csv')
soj  = process_future_data('data/SOJA.csv')
trg  = process_future_data('data/TRIGO.csv')

In [5]:
arrz

Unnamed: 0,date,open,high,low,close,adjcp,volume,tic,day
0,2000-01-03,5.180,5.250,5.180,5.230,5.230,486.0,ZR,0
1,2000-01-04,5.250,5.310,5.250,5.270,5.270,836.0,ZR,1
2,2000-01-05,5.250,5.290,5.250,5.290,5.290,700.0,ZR,2
3,2000-01-06,5.320,5.390,5.305,5.360,5.360,478.0,ZR,3
4,2000-01-07,5.300,5.340,5.300,5.340,5.340,714.0,ZR,4
...,...,...,...,...,...,...,...,...,...
5737,2022-10-21,16.815,16.870,16.575,16.680,16.680,2003.0,ZR,4
5738,2022-10-24,16.680,16.775,16.560,16.575,16.575,2512.0,ZR,0
5739,2022-10-25,16.560,16.625,16.530,16.530,16.530,1459.0,ZR,1
5740,2022-10-26,16.545,16.550,16.335,16.370,16.370,1921.0,ZR,2


In [6]:
cmds_f =pd.concat([arrz,bg,cf,eth,mil,mf,gold,petr,soj,trg],axis=0) 

In [7]:
cmds_f['tic'].unique().tolist()

['ZR', 'LE', 'KC', 'FL', 'ZC', 'TR', 'GC', 'CB', 'ZS', 'ZW']

In [8]:
cmds_f_l = cmds_f["tic"].unique().tolist()
stocks_br  = ["PETR4.SA", "VALE3.SA", "ITUB4.SA", "MGLU3.SA", "BBAS3.SA", "BBDC4.SA","B3SA3.SA", "PETR3.SA", "RENT3.SA", "ELET3.SA" ]
stocks_usa = ["META", "AAPL", "AMZN", "F", "T", "BAC", "GOOGL", "MSFT", "INTC", "CMCSA"]
stocks_eur = ["iSP.MI", "ENEL.MI", "SAN.MC", "INGA.AS", "ENI.MI", "BBVA.MC", "IBE.MC", "CS.PA", "STLA.MI","DTE.DE"]
stocks_chn = ["601899.SS","600010.SS","600795.SS", "603993.SS", "600157.SS", "601288.SS", "600050.SS", "601398.SS", "600537.SS","600777.SS"]

In [9]:
ativos = list(set().union(stocks_br,stocks_usa,stocks_eur,stocks_chn))

In [10]:
len(ativos)

40

In [11]:
#print(config_tickers.DOW_30_TICKER)

In [12]:
dp = YahooFinanceProcessor()
df = dp.download_data(start_date = '2004-01-01',
                     end_date = '2022-11-07',
                     ticker_list = ativos, time_interval='1D')
# Extend the asset list with the commodity futures tickers.
ativos = list(set().union(stocks_br,stocks_usa,stocks_eur,stocks_chn,cmds_f_l))

[*********************100%***********************]  1 of 1 completed

In [13]:
df = pd.concat([df,cmds_f],axis=0)
df['date']= pd.to_datetime(df['date'])
df = df.sort_values(by='date')

<a id='3'></a>
# Part 4: Preprocess Data
Data preprocessing is a crucial step for training a high-quality machine learning model. We need to check for missing data and perform feature engineering to convert the data into a model-ready state.
* Add technical indicators. In practical trading, various information needs to be taken into account, for example historical stock prices, current holdings, and technical indicators. In this notebook, `FeatureEngineer` adds the indicators listed in `config.INDICATORS`, including MACD (a trend-following indicator) and RSI (a momentum oscillator).
* Add turbulence index. Risk aversion reflects whether an investor prefers to preserve capital, and it influences one's trading strategy under different levels of market volatility. To control risk in a worst-case scenario, such as the financial crisis of 2007–2008, FinRL can employ the financial turbulence index, which measures extreme asset price fluctuations. It is switched off in this notebook (`use_turbulence=False`).
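
To make the indicator step concrete, here is a minimal pandas sketch of two of the indicators, MACD and a 30-day simple moving average. It assumes the common 12/26-period MACD definition with `adjust=False` smoothing; `FeatureEngineer` relies on stockstats, whose smoothing conventions may differ slightly, and `add_basic_indicators` is a hypothetical helper, not a FinRL function.

```python
import pandas as pd

def add_basic_indicators(close: pd.Series) -> pd.DataFrame:
    """Compute MACD (12-period EMA minus 26-period EMA) and a 30-day SMA."""
    ema_fast = close.ewm(span=12, adjust=False).mean()
    ema_slow = close.ewm(span=26, adjust=False).mean()
    return pd.DataFrame({
        "close": close,
        "macd": ema_fast - ema_slow,
        # min_periods=1 lets the first rows use however many prices exist so far
        "close_30_sma": close.rolling(window=30, min_periods=1).mean(),
    })

# Toy price series that rises by 1 each day (illustrative only).
prices = pd.Series(range(1, 101), dtype=float)
indicators = add_basic_indicators(prices)
print(indicators.tail(3))
```

On the first row both moving averages equal the first price, so MACD starts at zero and the SMA equals the close, which matches the pattern in the first rows of the processed data shown further down.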

In [15]:
dates_f3 = dates_intersection(df)
print(df.shape)
df=df[df['date'].isin(dates_f3)]
print(df.shape)

(228365, 9)
(112500, 9)
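
The implementation of `dates_intersection` lives in `utils.py` and is not shown here. A minimal sketch of the idea, assuming it keeps only the dates on which every ticker has a row, could look like this (`common_dates` and the toy data are hypothetical):

```python
import pandas as pd

def common_dates(df: pd.DataFrame) -> list:
    """Return the sorted dates on which every ticker in df has a row."""
    n_tickers = df["tic"].nunique()
    tickers_per_date = df.groupby("date")["tic"].nunique()
    return sorted(tickers_per_date[tickers_per_date == n_tickers].index)

toy = pd.DataFrame({
    "date": pd.to_datetime(["2022-01-03", "2022-01-04", "2022-01-03", "2022-01-05"]),
    "tic": ["AAPL", "AAPL", "ZC", "ZC"],
    "close": [1.0, 2.0, 3.0, 4.0],
})
shared = common_dates(toy)
aligned = toy[toy["date"].isin(shared)]
print(shared, aligned.shape)
```

Aligning dates this way is what makes every ticker end up with the same number of rows: 112,500 rows over 50 tickers is 2,250 rows each.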


In [16]:
fe = FeatureEngineer(
                    use_technical_indicator=True,
                    use_turbulence=False,
                    user_defined_feature = False)

df = fe.preprocess_data(df)

Successfully added technical indicators


In [19]:
df

Unnamed: 0,date,open,high,low,close,adjcp,volume,tic,day,macd,boll_ub,boll_lb,rsi_30,cci_30,dx_30,close_30_sma,close_60_sma
0,2012-10-09,1.792857,1.864285,1.785714,1.853571,1.812031,162731223.0,600010.SS,1,0.000000,1.874080,1.843776,100.000000,66.666667,100.000000,1.853571,1.853571
2250,2012-10-09,3.630000,3.700000,3.630000,3.680000,3.209576,50801915.0,600050.SS,1,0.000000,1.874080,1.843776,100.000000,66.666667,100.000000,3.680000,3.680000
4500,2012-10-09,3.357692,3.492307,3.350000,3.442307,3.148193,93670189.0,600157.SS,1,0.000000,1.874080,1.843776,100.000000,66.666667,100.000000,3.442307,3.442307
6750,2012-10-09,4.163636,4.404545,4.154545,4.354545,4.088676,16474447.0,600537.SS,1,0.000000,1.874080,1.843776,100.000000,66.666667,100.000000,4.354545,4.354545
9000,2012-10-09,1.228947,1.302631,1.228947,1.297368,1.297368,43570465.0,600777.SS,1,0.000000,1.874080,1.843776,100.000000,66.666667,100.000000,1.297368,1.297368
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
103499,2022-10-27,685.000000,689.250000,679.250000,682.250000,682.250000,175423.0,ZC,3,4.532375,700.291508,663.808492,52.666919,18.526497,8.734012,681.866667,654.075000
105749,2022-10-27,16.395000,16.455000,16.280000,16.345000,16.345000,1630.0,ZR,3,-0.244069,17.606281,16.153219,40.886366,-126.424201,38.764618,17.125500,17.168750
107999,2022-10-27,1392.750000,1405.000000,1389.000000,1393.500000,1393.500000,230756.0,ZS,3,-8.795626,1438.890118,1352.134882,47.259055,-41.383004,2.350999,1411.408333,1413.145833
110249,2022-10-27,843.500000,858.000000,833.500000,838.500000,838.500000,68920.0,ZW,3,-4.442215,934.579452,809.120548,47.606769,-68.830991,2.925366,867.266667,832.091667


In [20]:
df['tic'].value_counts()

600010.SS    2250
PETR3.SA     2250
GOOGL        2250
IBE.MC       2250
INGA.AS      2250
INTC         2250
ITUB4.SA     2250
KC           2250
LE           2250
META         2250
MGLU3.SA     2250
MSFT         2250
PETR4.SA     2250
600050.SS    2250
RENT3.SA     2250
SAN.MC       2250
STLA.MI      2250
T            2250
TR           2250
VALE3.SA     2250
ZC           2250
ZR           2250
ZS           2250
ZW           2250
GC           2250
FL           2250
F            2250
ENI.MI       2250
600157.SS    2250
600537.SS    2250
600777.SS    2250
600795.SS    2250
601288.SS    2250
601398.SS    2250
601899.SS    2250
603993.SS    2250
AAPL         2250
AMZN         2250
B3SA3.SA     2250
BAC          2250
BBAS3.SA     2250
BBDC4.SA     2250
BBVA.MC      2250
CB           2250
CMCSA        2250
CS.PA        2250
DTE.DE       2250
ELET3.SA     2250
ENEL.MI      2250
iSP.MI       2250
Name: tic, dtype: int64

In [21]:
df.shape

(112500, 17)

## Add covariance matrix as states

In [22]:
df = add_covariance(df)

In [23]:
df.shape

(99900, 19)
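
The row count drops by 12,600, which is 252 dates times 50 tickers, so the helper appears to use a one-year (252-day) lookback. `add_covariance` itself is defined in `utils.py` and not shown, so the following is only a sketch of the idea under that assumption, with a hypothetical function name and toy data:

```python
import numpy as np
import pandas as pd

def rolling_covariances(df: pd.DataFrame, lookback: int = 252) -> dict:
    """For each date from position `lookback` onward, compute the covariance matrix
    of the last `lookback` daily returns up to and including that date."""
    prices = df.pivot_table(index="date", columns="tic", values="close").sort_index()
    covs = {}
    for i in range(lookback, len(prices)):
        window = prices.iloc[i - lookback:i + 1]   # lookback + 1 prices
        returns = window.pct_change().dropna()     # lookback daily returns
        covs[prices.index[i]] = returns.cov()
    return covs

# Toy data: two tickers, ten business days of random-walk prices.
rng = np.random.default_rng(0)
dates = pd.bdate_range("2022-01-03", periods=10)
toy = pd.concat([
    pd.DataFrame({"date": dates, "tic": tic,
                  "close": 100 + rng.normal(0, 1, len(dates)).cumsum()})
    for tic in ["AAA", "BBB"]
])
covs = rolling_covariances(toy, lookback=5)
print(len(covs), next(iter(covs.values())).shape)
```

The first `lookback` dates have no covariance matrix, which is why the processed frame is shorter than its input.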

<a id='4'></a>
# Part 5. Design Environment
Considering the stochastic and interactive nature of automated stock trading, a financial task is modeled as a **Markov Decision Process (MDP)** problem. The training process involves observing stock price changes, taking an action, and calculating the reward so that the agent adjusts its strategy accordingly. By interacting with the environment, the trading agent derives a trading strategy that maximizes rewards over time.

Our trading environments, based on OpenAI Gym framework, simulate live stock markets with real market data according to the principle of time-driven simulation.


In [24]:
print(df.date.head(1))
print(df.date.tail(1))

0   2013-12-03
Name: date, dtype: datetime64[ns]
99899   2022-10-27
Name: date, dtype: datetime64[ns]


## Training data split: 2013-12-03 to 2018-01-01

In [25]:
train = data_split(df, '2013-12-03','2018-01-01')
trade = data_split(df,'2018-01-02', '2022-10-27')
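
As a rough illustration of what a date-based split does, the sketch below keeps rows in a half-open interval `[start, end)` and re-indexes them by trading-day number. This mirrors our understanding of how FinRL's `data_split` behaves, but treat the interval convention as an assumption and check your installed version; `split_by_date` is a hypothetical stand-in.

```python
import pandas as pd

def split_by_date(df: pd.DataFrame, start: str, end: str) -> pd.DataFrame:
    """Keep rows with start <= date < end and index them by trading-day number."""
    mask = (df["date"] >= start) & (df["date"] < end)
    out = df.loc[mask].sort_values(["date", "tic"], ignore_index=True)
    out.index = out["date"].factorize()[0]  # same index value for all rows of a day
    return out

toy = pd.DataFrame({
    "date": pd.to_datetime(["2022-01-03", "2022-01-04", "2022-01-05"] * 2),
    "tic": ["AAA"] * 3 + ["BBB"] * 3,
    "close": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})
part = split_by_date(toy, "2022-01-03", "2022-01-05")
print(part)
```

Under this convention the end date itself is excluded, so the last day of the returned frame is the trading day before `end`.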

In [26]:
train.head()

Unnamed: 0,date,open,high,low,close,adjcp,volume,tic,day,macd,boll_ub,boll_lb,rsi_30,cci_30,dx_30,close_30_sma,close_60_sma,cov_list,return_list
0,2013-12-03,1.4,1.4,1.4,1.4,1.37216,0.0,600010.SS,1,-0.003455,1.4,1.4,46.404911,-13.801097,6.946997,1.402024,1.416488,"[[0.000501897894437649, 0.00014178723648854168...",tic 600010.SS 600050.SS 600157.SS 6...
0,2013-12-03,3.4,3.48,3.37,3.44,3.039121,103779924.0,600050.SS,1,0.00543,3.462869,3.216131,53.528358,102.492158,25.151774,3.358,3.3605,"[[0.000501897894437649, 0.00014178723648854168...",tic 600010.SS 600050.SS 600157.SS 6...
0,2013-12-03,2.342307,2.376923,2.330769,2.373076,2.23988,39431228.0,600157.SS,1,-0.008643,2.469121,2.281263,46.820058,-20.530516,4.950777,2.370769,2.475512,"[[0.000501897894437649, 0.00014178723648854168...",tic 600010.SS 600050.SS 600157.SS 6...
0,2013-12-03,5.054545,5.163636,5.0,5.145454,4.831295,9891292.0,600537.SS,1,-0.111753,5.601813,4.986823,46.4255,-83.708023,23.351325,5.534697,5.675303,"[[0.000501897894437649, 0.00014178723648854168...",tic 600010.SS 600050.SS 600157.SS 6...
0,2013-12-03,1.457894,1.457894,1.457894,1.457894,1.457894,0.0,600777.SS,1,0.008107,1.517146,1.409695,53.793799,23.767776,5.074069,1.452456,1.406228,"[[0.000501897894437649, 0.00014178723648854168...",tic 600010.SS 600050.SS 600157.SS 6...


## Environment for Portfolio Allocation
##### Got from utils.py 

In [27]:
stock_dimension = len(train.tic.unique())
state_space = stock_dimension
print(f"Stock Dimension: {stock_dimension}, State Space: {state_space}")


Stock Dimension: 50, State Space: 50


In [28]:
config.INDICATORS

['macd',
 'boll_ub',
 'boll_lb',
 'rsi_30',
 'cci_30',
 'dx_30',
 'close_30_sma',
 'close_60_sma']

In [29]:
env_kwargs = {
    "hmax": 100, 
    "initial_amount": 1000000, 
    "transaction_cost_pct": 0.001, 
    "state_space": state_space, 
    "stock_dim": stock_dimension, 
    "tech_indicator_list": config.INDICATORS, 
    "action_space": stock_dimension, 
    "reward_scaling": 1e-4
    
}

e_train_gym = StockPortfolioEnv(df = train, **env_kwargs)

In [30]:
state_space

50

In [31]:
env_train, _ = e_train_gym.get_sb_env()
print(type(env_train))

<class 'stable_baselines3.common.vec_env.dummy_vec_env.DummyVecEnv'>


<a id='5'></a>
# Part 6: Implement DRL Algorithms
* The implementation of the DRL algorithms is based on **OpenAI Baselines** and **Stable Baselines**. Stable Baselines is a fork of OpenAI Baselines, with a major structural refactoring and code cleanups. This notebook uses the Stable-Baselines3 agents through `finrl.agents.stablebaselines3`.
* The FinRL library includes fine-tuned standard DRL algorithms, such as DQN, DDPG, Multi-Agent DDPG, PPO, SAC, A2C, and TD3. We also allow users to design their own DRL algorithms by adapting these algorithms.

In [None]:
# initialize
agent = DRLAgent(env = env_train)

### Model 1: **A2C**


In [None]:
agent = DRLAgent(env = env_train)

A2C_PARAMS = {"n_steps": 5, "ent_coef": 0.005, "learning_rate": 0.0002}
model_a2c = agent.get_model(model_name="a2c",model_kwargs = A2C_PARAMS)

In [None]:
trained_a2c = agent.train_model(model=model_a2c, 
                                tb_log_name='a2c',
                                total_timesteps=50000)

In [None]:
trained_a2c.save('/content/trained_models/trained_a2c.zip')

### Model 2: **PPO**


In [None]:
agent = DRLAgent(env = env_train)
PPO_PARAMS = {
    "n_steps": 2048,
    "ent_coef": 0.005,
    "learning_rate": 0.0001,
    "batch_size": 128,
}
model_ppo = agent.get_model("ppo",model_kwargs = PPO_PARAMS)

In [None]:
trained_ppo = agent.train_model(model=model_ppo, 
                             tb_log_name='ppo',
                             total_timesteps=80000)

In [None]:
trained_ppo.save('/content/trained_models/trained_ppo.zip')

### Model 3: **DDPG**


In [None]:
agent = DRLAgent(env = env_train)
DDPG_PARAMS = {"batch_size": 128, "buffer_size": 50000, "learning_rate": 0.001}


model_ddpg = agent.get_model("ddpg",model_kwargs = DDPG_PARAMS)

In [None]:
trained_ddpg = agent.train_model(model=model_ddpg, 
                             tb_log_name='ddpg',
                             total_timesteps=50000)

In [None]:
trained_ddpg.save('/content/trained_models/trained_ddpg.zip')

### Model 4: **SAC**


In [None]:
agent = DRLAgent(env = env_train)
SAC_PARAMS = {
    "batch_size": 128,
    "buffer_size": 100000,
    "learning_rate": 0.0003,
    "learning_starts": 100,
    "ent_coef": "auto_0.1",
}

model_sac = agent.get_model("sac",model_kwargs = SAC_PARAMS)

In [None]:
trained_sac = agent.train_model(model=model_sac, 
                             tb_log_name='sac',
                             total_timesteps=50000)

In [None]:
trained_sac.save('/content/trained_models/trained_sac.zip')

### Model 5: **TD3**


In [None]:
agent = DRLAgent(env = env_train)
TD3_PARAMS = {"batch_size": 100, 
              "buffer_size": 1000000, 
              "learning_rate": 0.001}

model_td3 = agent.get_model("td3",model_kwargs = TD3_PARAMS)

In [None]:
trained_td3 = agent.train_model(model=model_td3, 
                             tb_log_name='td3',
                             total_timesteps=30000)

In [None]:
trained_td3.save('/content/trained_models/trained_td3.zip')

## Trading
Assume that we have $1,000,000 of initial capital on 2018-01-02, the first day of the trading period. We use each trained model (SAC, DDPG, PPO, A2C, and TD3) to allocate the portfolio across the 50 assets.

In [None]:
df.tail()

In [None]:
e_trade_gym = StockPortfolioEnv(df = trade, **env_kwargs)

In [None]:
trade.shape

In [None]:
#trained_sac
#trained_ddpg
#trained_ppo
#trained_a2c
#trained_td3

df_daily_return, df_actions = DRLAgent.DRL_prediction(model=trained_sac,
                        environment = e_trade_gym)

df_daily_return2, df_actions2 = DRLAgent.DRL_prediction(model=trained_ddpg,
                        environment = e_trade_gym)

df_daily_return3, df_actions3 = DRLAgent.DRL_prediction(model=trained_ppo,
                        environment = e_trade_gym)

df_daily_return4, df_actions4 = DRLAgent.DRL_prediction(model=trained_a2c,
                        environment = e_trade_gym)

df_daily_return5, df_actions5 = DRLAgent.DRL_prediction(model=trained_td3,
                        environment = e_trade_gym)

In [None]:
# Save the daily returns of each model to its own file.
df_daily_return.to_csv('df_daily_return.csv')    # SAC
df_daily_return2.to_csv('df_daily_return2.csv')  # DDPG
df_daily_return3.to_csv('df_daily_return3.csv')  # PPO
df_daily_return4.to_csv('df_daily_return4.csv')  # A2C
df_daily_return5.to_csv('df_daily_return5.csv')  # TD3

In [None]:
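# Save the actions (portfolio weights) of each model to its own file.
df_actions.to_csv('df_actions.csv')    # SAC
df_actions2.to_csv('df_actions2.csv')  # DDPG
df_actions3.to_csv('df_actions3.csv')  # PPO
df_actions4.to_csv('df_actions4.csv')  # A2C
df_actions5.to_csv('df_actions5.csv')  # TD3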

<a id='6'></a>
# Part 7: Backtest Our Strategy
Backtesting plays a key role in evaluating the performance of a trading strategy. An automated backtesting tool is preferred because it reduces human error. We usually use the Quantopian pyfolio package to backtest our trading strategies. It is easy to use and consists of various individual plots that provide a comprehensive picture of the performance of a trading strategy.
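
For intuition about what a backtest report contains, here is a minimal sketch that computes a few standard statistics directly from a series of daily returns. It assumes 252 trading days per year and a zero risk-free rate; it is not pyfolio's implementation, and `basic_perf_stats` is a hypothetical helper.

```python
import numpy as np
import pandas as pd

def basic_perf_stats(daily_returns: pd.Series, periods_per_year: int = 252) -> dict:
    """Cumulative return, annualized return and volatility, Sharpe ratio
    (zero risk-free rate assumed), and maximum drawdown."""
    r = daily_returns.fillna(0.0)
    wealth = (1.0 + r).cumprod()                      # growth of 1 unit of capital
    vol = r.std()
    return {
        "cumulative_return": wealth.iloc[-1] - 1.0,
        "annual_return": wealth.iloc[-1] ** (periods_per_year / len(r)) - 1.0,
        "annual_volatility": vol * np.sqrt(periods_per_year),
        "sharpe_ratio": r.mean() / vol * np.sqrt(periods_per_year) if vol > 0 else np.nan,
        "max_drawdown": (wealth / wealth.cummax() - 1.0).min(),
    }

example_returns = pd.Series([0.01, -0.02, 0.03, -0.01])
example_stats = basic_perf_stats(example_returns)
print(example_stats)
```

In this toy series the largest peak-to-trough loss is the single 2% down day, so the maximum drawdown is -0.02.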

<a id='6.1'></a>
## 7.1 BackTestStats
Pass in the daily returns produced by `DRL_prediction` (here `df_daily_return2`, the DDPG run); this information is recorded by the environment class.


In [None]:
from pyfolio import timeseries
DRL_strat = convert_daily_return_to_pyfolio_ts(df_daily_return2)
perf_func = timeseries.perf_stats 
perf_stats_all = perf_func( returns=DRL_strat, 
                              factor_returns=DRL_strat, 
                                positions=None, transactions=None, turnover_denom="AGB")

In [None]:
print("==============DRL Strategy Stats===========")
perf_stats_all

In [None]:
#baseline stats
print("==============Get Baseline Stats===========")
baseline_df = get_baseline(
        ticker="^DJI", 
        start = df_daily_return.loc[0,'date'],
        end = df_daily_return.loc[len(df_daily_return)-1,'date'])

stats = backtest_stats(baseline_df, value_col_name = 'close')

In [None]:
#dp = YahooFinanceProcessor()
#df_comp = dp.download_data(start_date = '2008-01-01',
#                     end_date = '2021-10-31',
#                     ticker_list = ativos, time_interval='1D')
#
#dates1= df_comp.query('tic == "AAPL"').date.tolist()
#dates2= df_comp.query('tic == "PETR3.SA"').date.tolist()
#dates_final=list(set(dates1).intersection(dates2))
#print(len(dates1),len(dates2))
#print(len(dates_final))
#
#print(df_comp.shape)
#df_comp=df_comp[df_comp['date'].isin(dates_final)]
#print(df_comp.shape)
#df_comp = data_split(df_comp,'2020-07-01', '2021-10-31')

In [None]:
#df_comp=df_comp[['date','close','tic']]
# Use the trade period so the benchmark covers the same dates as the DRL backtest.
df_comp = trade[['date','close','tic']].copy()
df_comp.set_index('date',inplace=True)
res = df_comp.pivot(columns='tic', values='close')

# Asset weights: equal weight for every asset, summing to 1
wts = np.full(res.shape[1], 1.0 / res.shape[1])


ret_data = res.pct_change()

weighted_returns = (wts * ret_data)
weighted_returns.index= pd.to_datetime(weighted_returns.index)
weighted_returns

In [None]:
ptf_return= weighted_returns.sum(axis=1,skipna=False)
ptf_return.name= 'Portfolio Returns'
ptf_return.index=DRL_strat.index
ptf_return

<a id='6.2'></a>
## 7.2 BackTestPlot

In [None]:
import pyfolio
%matplotlib inline

baseline_df = get_baseline(
        ticker='^DJI', start=df_daily_return.loc[0,'date'], end='2022-11-04'
    )

baseline_returns = get_daily_return(baseline_df, value_col_name="close")

with pyfolio.plotting.plotting_context(font_scale=1.1):
        pyfolio.create_full_tear_sheet(returns = DRL_strat,
                                       benchmark_rets=baseline_returns, set_context=False)

In [None]:
with pyfolio.plotting.plotting_context(font_scale=1.1):
        pyfolio.create_full_tear_sheet(returns = DRL_strat,
                                       benchmark_rets=ptf_return, set_context=False)

## Min-Variance Portfolio Allocation

In [None]:
#%pip install PyPortfolioOpt

In [None]:
from pypfopt.efficient_frontier import EfficientFrontier
from pypfopt import risk_models

In [None]:
unique_tic = trade.tic.unique()
unique_trade_date = trade.date.unique()

In [None]:
df.head()

In [None]:
#calculate_portfolio_minimum_variance
portfolio = pd.DataFrame(index = range(1), columns = unique_trade_date)
initial_capital = 1000000
portfolio.loc[0,unique_trade_date[0]] = initial_capital

for i in range(len( unique_trade_date)-1):
    df_temp = df[df.date==unique_trade_date[i]].reset_index(drop=True)
    df_temp_next = df[df.date==unique_trade_date[i+1]].reset_index(drop=True)
    #Sigma = risk_models.sample_cov(df_temp.return_list[0])
    #calculate covariance matrix
    Sigma = df_temp.return_list[0].cov()
    #portfolio allocation
    ef_min_var = EfficientFrontier(None, Sigma,weight_bounds=(0, 0.1))
    #minimum variance
    raw_weights_min_var = ef_min_var.min_volatility()
    #get weights
    cleaned_weights_min_var = ef_min_var.clean_weights()
    
    #current capital
    cap = portfolio.iloc[0, i]
    #current cash invested for each stock
    current_cash = [element * cap for element in list(cleaned_weights_min_var.values())]
    # current held shares
    current_shares = list(np.array(current_cash)
                                      / np.array(df_temp.close))
    # next time period price
    next_price = np.array(df_temp_next.close)
    ##next_price * current share to calculate next total account value 
    portfolio.iloc[0, i+1] = np.dot(current_shares, next_price)
    
portfolio=portfolio.T
portfolio.columns = ['account_value']
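
For reference, the unconstrained minimum-variance problem has a closed-form solution, w = Σ⁻¹1 / (1ᵀΣ⁻¹1). The sketch below implements that formula with NumPy on a made-up covariance matrix. It is not what the loop above does: PyPortfolioOpt solves the problem numerically with the bounds `weight_bounds=(0, 0.1)`, whereas the closed form can produce negative or concentrated weights.

```python
import numpy as np

def min_variance_weights(cov: np.ndarray) -> np.ndarray:
    """Unconstrained global minimum-variance weights: w = S^-1 1 / (1' S^-1 1)."""
    ones = np.ones(cov.shape[0])
    inv_times_ones = np.linalg.solve(cov, ones)  # avoids forming the inverse explicitly
    return inv_times_ones / inv_times_ones.sum()

# Two uncorrelated hypothetical assets with variances 0.04 and 0.01.
cov = np.array([[0.04, 0.00],
                [0.00, 0.01]])
w = min_variance_weights(cov)
print(w)  # the lower-variance asset receives the larger weight
```

With uncorrelated assets the weights are proportional to the inverse variances, giving 20% and 80% here.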

In [None]:
#trained_sac
#trained_ddpg
#trained_ppo
#trained_a2c
sac_cumpod =(df_daily_return.daily_return+1).cumprod()-1
ddpg_cumpod =(df_daily_return2.daily_return+1).cumprod()-1
ppo_cumpod =(df_daily_return3.daily_return+1).cumprod()-1
a2c_cumpod =(df_daily_return4.daily_return+1).cumprod()-1


In [None]:
min_var_cumpod =(portfolio.account_value.pct_change().fillna(0)+1).cumprod()-1

In [None]:
dji_cumpod =(baseline_returns.fillna(0)+1).cumprod()-1

In [None]:
ptf_cumpod =(ptf_return.fillna(0)+1).cumprod()-1

## Plotly: DRL, Min-Variance, DJIA

In [None]:
#%pip install plotly

In [None]:
from datetime import datetime as dt

import matplotlib.pyplot as plt
import plotly
import plotly.graph_objs as go

In [None]:
time_ind = pd.Series(df_daily_return.date)

In [None]:
#trained_sac
#trained_ddpg
#trained_ppo
#trained_a2c


trace0_portfolio = go.Scatter(x = time_ind, y = a2c_cumpod, mode = 'lines', name = 'A2C (Portfolio Allocation)')
trace1_portfolio = go.Scatter(x = time_ind, y = dji_cumpod, mode = 'lines', name = 'DJIA')
trace2_portfolio = go.Scatter(x = time_ind, y = min_var_cumpod, mode = 'lines', name = 'Min-Variance')
trace3_portfolio = go.Scatter(x = time_ind, y = ptf_cumpod, mode = 'lines', name = 'Portfolio Buy & Hold')
trace4_portfolio = go.Scatter(x = time_ind, y = ddpg_cumpod, mode = 'lines', name = 'DDPG')
trace5_portfolio = go.Scatter(x = time_ind, y = sac_cumpod, mode = 'lines', name = 'SAC')
trace6_portfolio = go.Scatter(x = time_ind, y = ppo_cumpod, mode = 'lines', name = 'PPO')

#trace4 = go.Scatter(x = time_ind, y = addpg_cumpod, mode = 'lines', name = 'Adaptive-DDPG')

#trace2 = go.Scatter(x = time_ind, y = portfolio_cost_minv, mode = 'lines', name = 'Min-Variance')
#trace3 = go.Scatter(x = time_ind, y = spx_value, mode = 'lines', name = 'SPX')

In [None]:
fig = go.Figure()
fig.add_trace(trace0_portfolio)
fig.add_trace(trace1_portfolio)
fig.add_trace(trace2_portfolio)
fig.add_trace(trace3_portfolio)
fig.add_trace(trace4_portfolio)
fig.add_trace(trace5_portfolio)
fig.add_trace(trace6_portfolio)




fig.update_layout(
    legend=dict(
        x=0,
        y=1,
        traceorder="normal",
        font=dict(
            family="sans-serif",
            size=15,
            color="black"
        ),
        bgcolor="White",
        bordercolor="white",
        borderwidth=2
        
    ),
)
#fig.update_layout(legend_orientation="h")
fig.update_layout(title={
        #'text': "Cumulative Return using FinRL",
        'y':0.85,
        'x':0.5,
        'xanchor': 'center',
        'yanchor': 'top'})
#with Transaction cost
#fig.update_layout(title =  'Quarterly Trade Date')
fig.update_layout(
#    margin=dict(l=20, r=20, t=20, b=20),

    #paper_bgcolor='rgba(1,1,0,0)',
    #paper_bgcolor='rgb(255,1,0)',
    #plot_bgcolor='rgba(1, 1, 0, 0)',
    #xaxis_title="Date",
    yaxis_title="Cumulative Return",
xaxis={'type': 'date', 
       'tick0': time_ind[0], 
        'tickmode': 'linear', 
       'dtick': 86400000.0 *80}

)
#fig.update_xaxes(showline=True,linecolor='black',showgrid=True, gridwidth=1, gridcolor='LightSteelBlue',mirror=True)
#fig.update_yaxes(showline=True,linecolor='black',showgrid=True, gridwidth=1, gridcolor='LightSteelBlue',mirror=True)
#fig.update_yaxes(zeroline=True, zerolinewidth=1, zerolinecolor='LightSteelBlue')

fig.show()