<a id='0'></a>
# Part 1. Problem Definition

The goal is to design an automated trading solution for multiple cryptocurrencies. We model the crypto trading process as a Markov Decision Process (MDP) and formulate the trading goal as maximizing the value of the portfolio.

The agent is trained using Deep Reinforcement Learning (DRL) algorithms. The components of the reinforcement learning environment are:

* Action: Every hour we can rebalance the portfolio, deciding what percentage of the money to hold in each cryptocurrency considered and in dollars. (To do: implement the dollar part.)
* Reward function: The difference in total portfolio value with respect to the previous hour. (To do: check whether we want to modify this.)
* Environment: For every hour and every crypto we consider the following variables: <br>
    at the moment -> just the covariance and the technical indicators for the eight cryptos <br>
    to implement -> covariance for the day, past 2(?) months of data for all the cryptos <br>
* State: How much money we have in the portfolio.
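
As a rough illustration of the action and reward described in this list, the sketch below applies one hourly rebalancing step to a toy two-asset portfolio. The function name `portfolio_step` and the toy prices are hypothetical, transaction costs are ignored, and this is not the code used inside `StockPortfolioEnv`.

```python
import numpy as np

def portfolio_step(value, weights, prices_now, prices_next):
    """Hold `weights` for one hour and return (new_value, reward).

    Hypothetical helper: the reward is the change in total portfolio
    value over the hour, with no transaction costs.
    """
    weights = np.asarray(weights, dtype=float)
    prices_now = np.asarray(prices_now, dtype=float)
    prices_next = np.asarray(prices_next, dtype=float)
    # Simple return of each asset over the hour.
    asset_returns = prices_next / prices_now - 1.0
    # Portfolio return is the weight-averaged asset return.
    new_value = value * (1.0 + float(weights @ asset_returns))
    reward = new_value - value
    return new_value, reward

# Toy example: 75% in an asset that gains 10%, 25% in one that loses 10%.
new_value, reward = portfolio_step(
    value=100_000.0,
    weights=[0.75, 0.25],
    prices_now=[100.0, 10.0],
    prices_next=[110.0, 9.0],
)
print(new_value, reward)
```

Here the weighted return is 0.75 * 0.10 + 0.25 * (-0.10) = 0.05, so the reward is roughly 5,000 on a starting value of 100,000.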

<a id='1'></a>
# Part 2. Getting Started- Load Python Packages

<a id='1.1'></a>
## 2.1. Install all the packages

In [1]:
!pip install -r requirements.txt --user



<a id='1.2'></a>
## 2.2. Import Packages

In [1]:
import pandas as pd
from config import config
from dataset.download_dataset.cryptodownloader_binance import CryptoDownloader_binance
from preprocessing.preprocessors import FeatureEngineer
from preprocessing.data import data_split
from env.env_portfolio import StockPortfolioEnv
from model.models import DRLAgent
# from trade.backtest import backtest_stats, backtest_plot, get_daily_return, get_baseline,convert_daily_return_to_pyfolio_ts


<a id='1.3'></a>
## 2.3 Create Folders

In [3]:
import os
download_data = False
if not os.path.exists(config.DATA_SAVE_DIR):
    os.makedirs(config.DATA_SAVE_DIR)
    download_data = True
if not os.path.exists(config.TRAINED_MODEL_DIR):
    os.makedirs(config.TRAINED_MODEL_DIR)
if not os.path.exists(config.TENSORBOARD_LOG_DIR):
    os.makedirs(config.TENSORBOARD_LOG_DIR)
if not os.path.exists(config.RESULTS_DIR):
    os.makedirs(config.RESULTS_DIR)

<a id='2'></a>
# Part 3. Download Data

In [4]:
data_downloader = CryptoDownloader_binance(config.START_DATE, config.END_DATE, config.MULTIPLE_TICKER_8, config.DATA_SAVE_DIR, config.DATA_GRANULARITY)
if download_data:    
    data_downloader.download_data()
df = data_downloader.load()

In [5]:
df

Unnamed: 0,date,open,high,low,close,volume,tic
0,2020-01-01 00:00:00,7195.24,7196.25,7175.46,7177.02,511.814901,btc
1,2020-01-01 01:00:00,7176.47,7230.00,7175.71,7216.27,883.052603,btc
2,2020-01-01 02:00:00,7215.52,7244.87,7211.41,7242.85,655.156809,btc
3,2020-01-01 03:00:00,7242.66,7245.00,7220.00,7225.01,783.724867,btc
4,2020-01-01 04:00:00,7225.00,7230.00,7215.03,7217.27,467.812578,btc
...,...,...,...,...,...,...,...
87547,2021-03-31 19:00:00,193.98,195.25,192.35,193.05,38115.533510,ltc
87548,2021-03-31 20:00:00,193.03,196.39,192.48,195.66,32785.188130,ltc
87549,2021-03-31 21:00:00,195.65,196.90,194.63,195.29,20567.010810,ltc
87550,2021-03-31 22:00:00,195.31,197.00,195.23,195.80,8967.744990,ltc


# Part 4: Preprocess Data

In [None]:
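We add eight widely used technical indicators: MACD, the upper and lower Bollinger bands, RSI (30), CCI (30), DX (30), and the 30- and 60-period simple moving averages of the close price.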
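
To make the indicator columns more concrete, here is a minimal sketch of how a simple moving average and Bollinger bands can be computed with pandas. The helper `add_sma_and_bollinger`, its window and its band width are assumptions chosen for illustration; the internals of `FeatureEngineer` are not shown in this notebook and may differ.

```python
import pandas as pd

def add_sma_and_bollinger(close, window=20, n_std=2.0):
    """Return a frame with a rolling mean and Bollinger bands.

    Hypothetical helper, not part of the project's FeatureEngineer.
    """
    sma = close.rolling(window, min_periods=1).mean()
    # Sample standard deviation over the same window (NaN for a single point).
    std = close.rolling(window, min_periods=1).std()
    return pd.DataFrame(
        {
            "close": close,
            "sma": sma,
            "boll_ub": sma + n_std * std,
            "boll_lb": sma - n_std * std,
        }
    )

toy_close = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])
indicators = add_sma_and_bollinger(toy_close, window=3)
print(indicators)
```

In a multi-asset frame such as `df`, a helper like this would need to be applied per ticker (for example through `groupby('tic')`) so that one asset's prices never leak into another's indicators.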

In [6]:
fe = FeatureEngineer(
                    use_technical_indicator=True,
                    use_turbulence=False,
                    user_defined_feature = False)

df = fe.preprocess_data(df)

Successfully added technical indicators


In [7]:
df

Unnamed: 0,date,open,high,low,close,volume,tic,macd,boll_ub,boll_lb,rsi_30,cci_30,dx_30,close_30_sma,close_60_sma
0,2020-01-01 00:00:00,0.032850,0.032850,0.032700,0.032780,1.166001e+06,ada,0.000000,0.033182,0.032588,100.000000,66.666667,100.000000,0.032780,0.032780
10944,2020-01-01 00:00:00,13.715900,13.721100,13.690300,13.698100,6.201669e+04,bnb,0.000000,0.033182,0.032588,100.000000,66.666667,100.000000,13.698100,13.698100
21888,2020-01-01 00:00:00,7195.240000,7196.250000,7175.460000,7177.020000,5.118149e+02,btc,0.000000,0.033182,0.032588,100.000000,66.666667,100.000000,7177.020000,7177.020000
32832,2020-01-01 00:00:00,0.002014,0.002023,0.002008,0.002008,9.630910e+05,doge,0.000000,0.033182,0.032588,100.000000,66.666667,100.000000,0.002008,0.002008
43776,2020-01-01 00:00:00,129.160000,129.190000,128.680000,128.870000,7.769173e+03,eth,0.000000,0.033182,0.032588,100.000000,66.666667,100.000000,128.870000,128.870000
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
43775,2021-03-31 23:00:00,0.053143,0.053911,0.053000,0.053770,4.448111e+07,doge,-0.000103,0.054076,0.052762,50.006619,-10.311768,28.977404,0.053646,0.053938
54719,2021-03-31 23:00:00,1903.970000,1924.210000,1901.620000,1919.370000,2.122478e+04,eth,26.214167,1948.647647,1762.109353,64.618395,164.610885,23.659415,1851.563000,1833.205500
65663,2021-03-31 23:00:00,28.571700,29.440100,28.550100,29.416700,4.557268e+05,link,0.276832,29.035767,26.359743,64.396467,226.662894,36.167462,27.746403,27.922328
76607,2021-03-31 23:00:00,194.620000,197.630000,194.470000,196.700000,2.863109e+04,ltc,0.332601,198.176985,188.501015,55.327678,76.382622,7.269085,194.139000,194.302667


## Add covariance matrix as states

For a given hour, we look back over the previous six months (4320 hourly observations) and compute the covariance matrix of the hourly returns of the close prices of the eight cryptocurrencies. This quantity is particularly interesting in the "crypto world", where all the altcoins (the name for every cryptocurrency other than Bitcoin) are strongly correlated with the behaviour of Bitcoin.
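
Before running this on the full dataset, it may help to see the same computation on a toy price table. The sketch below uses made-up prices for two assets that move in lockstep; only standard pandas calls (`pct_change`, `cov`, `corr`) are used, and the numbers are purely illustrative.

```python
import pandas as pd

# Toy hourly close prices (made-up values); eth is btc scaled by 0.1,
# so the two return series are identical.
prices = pd.DataFrame(
    {
        "btc": [100.0, 110.0, 121.0, 108.9],
        "eth": [10.0, 11.0, 12.1, 10.89],
    },
    index=pd.to_datetime(
        [
            "2020-01-01 00:00:00",
            "2020-01-01 01:00:00",
            "2020-01-01 02:00:00",
            "2020-01-01 03:00:00",
        ]
    ),
)

# Hourly simple returns; the first row is NaN and is dropped.
returns = prices.pct_change().dropna()

# Covariance matrix of returns (one row and column per asset).
cov = returns.cov()
# Correlation is the scale-free version of the same information.
corr = returns.corr()
print(cov)
print(corr)
```

Because the two toy assets have identical returns, their correlation comes out as 1 up to floating-point error, which is the extreme case of the altcoin-follows-Bitcoin behaviour.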

In [8]:
# We rewrite the index in order to have the same for all the cryptos in a given hour.
df=df.sort_values(['date','tic'],ignore_index=True)
df.index = df.date.factorize()[0]
cov_list = []
# lookback window: six months of hourly data (180 days * 24 hours)
lookback=4320
for i in range(lookback,len(df.index.unique())):
  data_lookback = df.loc[i-lookback:i,:]
  price_lookback=data_lookback.pivot_table(index = 'date',columns = 'tic', values = 'close')
  return_lookback = price_lookback.pct_change().dropna()
  covs = return_lookback.cov().values 
  cov_list.append(covs)

# We add the covariance matrices and drop the first six months of data, since they have no full lookback window
df_cov = pd.DataFrame({'date':df.date.unique()[lookback:],'cov_list':cov_list})
df = df.merge(df_cov, on='date')
df = df.sort_values(['date','tic']).reset_index(drop=True)
        

In [9]:
df

Unnamed: 0,date,open,high,low,close,volume,tic,macd,boll_ub,boll_lb,rsi_30,cci_30,dx_30,close_30_sma,close_60_sma,cov_list
0,2020-06-29 00:00:00,0.080260,0.080950,0.080260,0.080940,1.167030e+07,ada,0.000604,0.081911,0.076667,55.930396,108.526409,28.496652,0.078419,0.079352,"[[0.00014507929701762984, 0.000108020019269806..."
1,2020-06-29 00:00:00,15.371400,15.538900,15.353800,15.536400,1.132871e+05,bnb,0.027815,15.576862,15.069758,52.217953,113.438821,8.658749,15.228067,15.472620,"[[0.00014507929701762984, 0.000108020019269806..."
2,2020-06-29 00:00:00,9116.160000,9136.660000,9095.260000,9123.850000,9.816174e+02,btc,16.217347,9201.529316,8979.700684,50.430629,74.364701,1.172282,9054.039000,9101.748833,"[[0.00014507929701762984, 0.000108020019269806..."
3,2020-06-29 00:00:00,0.002319,0.002319,0.002291,0.002306,4.805539e+06,doge,-0.000001,0.002335,0.002287,44.634946,9.097746,9.545452,0.002303,0.002328,"[[0.00014507929701762984, 0.000108020019269806..."
4,2020-06-29 00:00:00,224.890000,225.450000,224.080000,224.950000,1.220172e+04,eth,0.371818,228.510373,219.341627,47.886893,63.945385,4.670359,222.493667,225.876500,"[[0.00014507929701762984, 0.000108020019269806..."
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
52987,2021-03-31 23:00:00,0.053143,0.053911,0.053000,0.053770,4.448111e+07,doge,-0.000103,0.054076,0.052762,50.006619,-10.311768,28.977404,0.053646,0.053938,"[[0.0002524175366490601, 0.0001025937539860113..."
52988,2021-03-31 23:00:00,1903.970000,1924.210000,1901.620000,1919.370000,2.122478e+04,eth,26.214167,1948.647647,1762.109353,64.618395,164.610885,23.659415,1851.563000,1833.205500,"[[0.0002524175366490601, 0.0001025937539860113..."
52989,2021-03-31 23:00:00,28.571700,29.440100,28.550100,29.416700,4.557268e+05,link,0.276832,29.035767,26.359743,64.396467,226.662894,36.167462,27.746403,27.922328,"[[0.0002524175366490601, 0.0001025937539860113..."
52990,2021-03-31 23:00:00,194.620000,197.630000,194.470000,196.700000,2.863109e+04,ltc,0.332601,198.176985,188.501015,55.327678,76.382622,7.269085,194.139000,194.302667,"[[0.0002524175366490601, 0.0001025937539860113..."


In [2]:
import pickle
#pickle.dump(df, open('df','wb'))

In [3]:
df = pickle.load(open('df','rb'))

In [4]:
df["date"].min()

'2020-06-29 00:00:00'

<a id='4'></a>
# Part 5. Design Environment

Given the stochastic and interactive nature of cryptocurrency trading, we model the task as a **Markov Decision Process (MDP)**. Training consists of observing cryptocurrency price changes, taking an action, and computing the reward so that the agent adjusts its strategy accordingly. By interacting with the environment, the trading agent derives a strategy that maximizes rewards over time.

Our trading environments, based on the OpenAI Gym framework, simulate live crypto markets with real data according to the principle of time-driven simulation.

The action space will have a dimension equal to the number of cryptocurrencies plus 1, the extra entry being the money that we keep in dollars. Each value lies between 0 and 1 and the values sum to one, so an action represents the percentage distribution of the portfolio across the different cryptos and dollars. The dollar component is not implemented yet: the environment below is created with `action_space` equal to the number of cryptocurrencies.
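
One common way to turn unconstrained policy outputs into weights that lie in [0, 1] and sum to one is a softmax. The sketch below is an assumption for illustration: the notebook does not show how `StockPortfolioEnv` normalizes actions, and the name `softmax_weights` is hypothetical.

```python
import numpy as np

def softmax_weights(raw_actions):
    """Map raw action values to portfolio weights that sum to one."""
    raw = np.asarray(raw_actions, dtype=float)
    # Subtract the maximum for numerical stability; the result is unchanged.
    exp = np.exp(raw - raw.max())
    return exp / exp.sum()

# Eight cryptos plus one dollar slot, all raw actions equal:
# the portfolio is split evenly across the nine entries.
weights = softmax_weights(np.zeros(9))
print(weights, weights.sum())
```

With equal raw actions every entry receives the same share, and larger raw values shift weight toward the corresponding asset without ever producing a negative position.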

## Training data 

For training data we use the last six months of 2020.

In [5]:
train = data_split(df, '2020-07-01','2020-12-31')

## Environment for Portfolio Allocation


In [6]:
stock_dimension = len(train.tic.unique())
state_space = stock_dimension
print(f"Stock Dimension: {stock_dimension}, State Space: {state_space}")

env_kwargs = {
    "initial_amount": 100000, 
    "transaction_cost_pct": 0.001, 
    "state_space": state_space, 
    "stock_dim": stock_dimension, 
    "tech_indicator_list": config.TECHNICAL_INDICATORS_LIST, 
    "action_space": stock_dimension, 
    "reward_scaling": 1e-4
    
}

e_train_gym = StockPortfolioEnv(df = train, **env_kwargs)

env_train, _ = e_train_gym.get_sb_env()

Stock Dimension: 8, State Space: 8


<a id='5'></a>
# Part 6: Implement DRL Algorithms

In [7]:
# initialize
agent = DRLAgent(env = env_train)

### Model 1: **A2C**


In [8]:
A2C_PARAMS = {"n_steps": 5, "ent_coef": 0.005, "learning_rate": 0.0002}
model_a2c = agent.get_model(model_name="a2c",model_kwargs = A2C_PARAMS)

{'n_steps': 5, 'ent_coef': 0.005, 'learning_rate': 0.0002}
Using cpu device


In [9]:
trained_a2c = agent.train_model(model=model_a2c, 
                                tb_log_name='a2c',
                                total_timesteps=60000)

Logging to ./tensorboard_log/a2c\a2c_2
-------------------------------------
| time/                 |           |
|    fps                | 209       |
|    iterations         | 100       |
|    time_elapsed       | 2         |
|    total_timesteps    | 500       |
| train/                |           |
|    entropy_loss       | -11.3     |
|    explained_variance | -1.19e-07 |
|    learning_rate      | 0.0002    |
|    n_updates          | 99        |
|    policy_loss        | 4.67e+06  |
|    std                | 0.997     |
|    value_loss         | 1.74e+11  |
-------------------------------------
-------------------------------------
| time/                 |           |
|    fps                | 306       |
|    iterations         | 200       |
|    time_elapsed       | 3         |
|    total_timesteps    | 1000      |
| train/                |           |
|    entropy_loss       | -11.3     |
|    explained_variance | -1.19e-07 |
|    learning_rate      | 0.0002    |
|    n_upda

------------------------------------
| time/                 |          |
|    fps                | 502      |
|    iterations         | 1600     |
|    time_elapsed       | 15       |
|    total_timesteps    | 8000     |
| train/                |          |
|    entropy_loss       | -11.2    |
|    explained_variance | 0        |
|    learning_rate      | 0.0002   |
|    n_updates          | 1599     |
|    policy_loss        | 8.6e+06  |
|    std                | 0.987    |
|    value_loss         | 5.57e+11 |
------------------------------------
------------------------------------
| time/                 |          |
|    fps                | 504      |
|    iterations         | 1700     |
|    time_elapsed       | 16       |
|    total_timesteps    | 8500     |
| train/                |          |
|    entropy_loss       | -11.2    |
|    explained_variance | 0        |
|    learning_rate      | 0.0002   |
|    n_updates          | 1699     |
|    policy_loss        | 1.05e+07 |
|

------------------------------------
| time/                 |          |
|    fps                | 519      |
|    iterations         | 3100     |
|    time_elapsed       | 29       |
|    total_timesteps    | 15500    |
| train/                |          |
|    entropy_loss       | -11.2    |
|    explained_variance | 0        |
|    learning_rate      | 0.0002   |
|    n_updates          | 3099     |
|    policy_loss        | 4.57e+06 |
|    std                | 0.978    |
|    value_loss         | 2.29e+11 |
------------------------------------
-------------------------------------
| time/                 |           |
|    fps                | 520       |
|    iterations         | 3200      |
|    time_elapsed       | 30        |
|    total_timesteps    | 16000     |
| train/                |           |
|    entropy_loss       | -11.2     |
|    explained_variance | -1.19e-07 |
|    learning_rate      | 0.0002    |
|    n_updates          | 3199      |
|    policy_loss        | 6

------------------------------------
| time/                 |          |
|    fps                | 524      |
|    iterations         | 4600     |
|    time_elapsed       | 43       |
|    total_timesteps    | 23000    |
| train/                |          |
|    entropy_loss       | -11.1    |
|    explained_variance | 0        |
|    learning_rate      | 0.0002   |
|    n_updates          | 4599     |
|    policy_loss        | 5.8e+06  |
|    std                | 0.971    |
|    value_loss         | 3.09e+11 |
------------------------------------
------------------------------------
| time/                 |          |
|    fps                | 524      |
|    iterations         | 4700     |
|    time_elapsed       | 44       |
|    total_timesteps    | 23500    |
| train/                |          |
|    entropy_loss       | -11.1    |
|    explained_variance | 1.79e-07 |
|    learning_rate      | 0.0002   |
|    n_updates          | 4699     |
|    policy_loss        | 5.42e+06 |
|

-------------------------------------
| time/                 |           |
|    fps                | 525       |
|    iterations         | 6100      |
|    time_elapsed       | 58        |
|    total_timesteps    | 30500     |
| train/                |           |
|    entropy_loss       | -11.1     |
|    explained_variance | -1.19e-07 |
|    learning_rate      | 0.0002    |
|    n_updates          | 6099      |
|    policy_loss        | 7.98e+06  |
|    std                | 0.966     |
|    value_loss         | 7.11e+11  |
-------------------------------------
begin_total_asset:100000
end_total_asset:226750.4073503916
Sharpe:  0.4274366959800641
------------------------------------
| time/                 |          |
|    fps                | 524      |
|    iterations         | 6200     |
|    time_elapsed       | 59       |
|    total_timesteps    | 31000    |
| train/                |          |
|    entropy_loss       | -11.1    |
|    explained_variance | 0        |
|    learn

------------------------------------
| time/                 |          |
|    fps                | 522      |
|    iterations         | 7600     |
|    time_elapsed       | 72       |
|    total_timesteps    | 38000    |
| train/                |          |
|    entropy_loss       | -11      |
|    explained_variance | 1.19e-07 |
|    learning_rate      | 0.0002   |
|    n_updates          | 7599     |
|    policy_loss        | 5.51e+06 |
|    std                | 0.961    |
|    value_loss         | 3.28e+11 |
------------------------------------
------------------------------------
| time/                 |          |
|    fps                | 522      |
|    iterations         | 7700     |
|    time_elapsed       | 73       |
|    total_timesteps    | 38500    |
| train/                |          |
|    entropy_loss       | -11      |
|    explained_variance | 1.19e-07 |
|    learning_rate      | 0.0002   |
|    n_updates          | 7699     |
|    policy_loss        | 5.51e+06 |
|

------------------------------------
| time/                 |          |
|    fps                | 518      |
|    iterations         | 9100     |
|    time_elapsed       | 87       |
|    total_timesteps    | 45500    |
| train/                |          |
|    entropy_loss       | -11      |
|    explained_variance | 0        |
|    learning_rate      | 0.0002   |
|    n_updates          | 9099     |
|    policy_loss        | 5.19e+06 |
|    std                | 0.953    |
|    value_loss         | 3.1e+11  |
------------------------------------
------------------------------------
| time/                 |          |
|    fps                | 518      |
|    iterations         | 9200     |
|    time_elapsed       | 88       |
|    total_timesteps    | 46000    |
| train/                |          |
|    entropy_loss       | -11      |
|    explained_variance | 0        |
|    learning_rate      | 0.0002   |
|    n_updates          | 9199     |
|    policy_loss        | 5.33e+06 |
|

begin_total_asset:100000
end_total_asset:268413.2321142247
Sharpe:  0.5026665592737011
------------------------------------
| time/                 |          |
|    fps                | 517      |
|    iterations         | 10600    |
|    time_elapsed       | 102      |
|    total_timesteps    | 53000    |
| train/                |          |
|    entropy_loss       | -10.9    |
|    explained_variance | 1.19e-07 |
|    learning_rate      | 0.0002   |
|    n_updates          | 10599    |
|    policy_loss        | 3.98e+06 |
|    std                | 0.948    |
|    value_loss         | 1.85e+11 |
------------------------------------
------------------------------------
| time/                 |          |
|    fps                | 517      |
|    iterations         | 10700    |
|    time_elapsed       | 103      |
|    total_timesteps    | 53500    |
| train/                |          |
|    entropy_loss       | -10.9    |
|    explained_variance | 0        |
|    learning_rate      |

### Model 2: **PPO**

In [98]:
# agent = DRLAgent(env = env_train)
# PPO_PARAMS = {
#     "n_steps": 2048,
#     "ent_coef": 0.005,
#     "learning_rate": 0.0001,
#     "batch_size": 128,
# }
# model_ppo = agent.get_model("ppo",model_kwargs = PPO_PARAMS)

In [99]:
# trained_ppo = agent.train_model(model=model_ppo, 
#                              tb_log_name='ppo',
#                              total_timesteps=80000)

## Trading
Assume that we have $1,000,000 initial capital at 2021-01-01.

In [110]:
trade = data_split(df,'2021-01-01', config.END_DATE)
e_trade_gym = StockPortfolioEnv(df = trade, **env_kwargs)

In [111]:
trade.shape

(17280, 16)

In [102]:
trade

Unnamed: 0,date,open,high,low,close,volume,tic,macd,boll_ub,boll_lb,rsi_30,cci_30,dx_30,close_30_sma,close_60_sma,cov_list
0,2021-01-01 00:00:00,0.181340,0.181460,0.178310,0.180510,1.919492e+07,ada,-0.000153,0.184657,0.176516,50.750459,-27.165052,6.100618,0.180782,0.182191,"[[0.00023491493831790994, 7.732551281244585e-0..."
0,2021-01-01 00:00:00,37.359600,37.442300,36.963600,37.376400,9.511383e+04,bnb,-0.083237,37.946399,36.719461,51.231566,-33.515795,11.715168,37.421230,37.590240,"[[0.00023491493831790994, 7.732551281244585e-0..."
0,2021-01-01 00:00:00,28923.630000,29031.340000,28690.170000,28995.130000,2.311811e+03,btc,133.938215,29333.932277,28372.364723,57.001762,31.346237,5.617701,28857.221667,28217.099500,"[[0.00023491493831790994, 7.732551281244585e-0..."
0,2021-01-01 00:00:00,0.004672,0.004701,0.004601,0.004679,2.768207e+07,doge,0.000018,0.004729,0.004558,53.910140,48.169530,6.372370,0.004639,0.004574,"[[0.00023491493831790994, 7.732551281244585e-0..."
0,2021-01-01 00:00:00,736.420000,739.000000,729.330000,734.070000,2.793270e+04,eth,-0.496311,752.068407,727.317593,50.347614,-92.128309,16.290326,741.657333,735.789500,"[[0.00023491493831790994, 7.732551281244585e-0..."
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2159,2021-03-31 23:00:00,0.053143,0.053911,0.053000,0.053770,4.448111e+07,doge,-0.000103,0.054076,0.052762,50.006619,-10.311768,28.977404,0.053646,0.053938,"[[0.0001184825404164931, 7.097453732794227e-05..."
2159,2021-03-31 23:00:00,1903.970000,1924.210000,1901.620000,1919.370000,2.122478e+04,eth,26.214167,1948.647647,1762.109353,64.618395,164.610885,23.659415,1851.563000,1833.205500,"[[0.0001184825404164931, 7.097453732794227e-05..."
2159,2021-03-31 23:00:00,28.571700,29.440100,28.550100,29.416700,4.557268e+05,link,0.276832,29.035767,26.359743,64.396467,226.662894,36.167462,27.746403,27.922328,"[[0.0001184825404164931, 7.097453732794227e-05..."
2159,2021-03-31 23:00:00,194.620000,197.630000,194.470000,196.700000,2.863109e+04,ltc,0.332601,198.176985,188.501015,55.327678,76.382622,7.269085,194.139000,194.302667,"[[0.0001184825404164931, 7.097453732794227e-05..."


In [112]:
df_daily_return, df_actions = DRLAgent.DRL_prediction(model=trained_a2c,
                        environment = e_trade_gym)

begin_total_asset:1000000
end_total_asset:3955745.7869483377
Sharpe:  0.8952605536069735
hit end!


In [104]:
df_daily_return

Unnamed: 0,date,daily_return
0,2021-01-01 00:00:00,0.000000
1,2021-01-01 01:00:00,0.022631
2,2021-01-01 02:00:00,0.003002
3,2021-01-01 03:00:00,0.006520
4,2021-01-01 04:00:00,0.001370
...,...,...
2155,2021-03-31 19:00:00,-0.005520
2156,2021-03-31 20:00:00,0.011206
2157,2021-03-31 21:00:00,-0.000456
2158,2021-03-31 22:00:00,0.001301


In [105]:
df_actions

Unnamed: 0_level_0,ada,bnb,btc,doge,eth,link,ltc,xrp
date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1
2021-01-01 00:00:00,0.125000,0.125000,0.125000,0.125000,0.125000,0.125000,0.125000,0.125000
2021-01-01 01:00:00,0.066554,0.076688,0.180913,0.066554,0.066554,0.180913,0.180913,0.180913
2021-01-01 02:00:00,0.077391,0.130851,0.077391,0.077391,0.198717,0.121681,0.175248,0.141330
2021-01-01 03:00:00,0.099662,0.099662,0.201063,0.139609,0.161020,0.099662,0.099662,0.099662
2021-01-01 04:00:00,0.181079,0.104423,0.104423,0.145047,0.104423,0.104423,0.124155,0.132029
...,...,...,...,...,...,...,...,...
2021-03-31 19:00:00,0.162344,0.116857,0.070699,0.069601,0.189195,0.189195,0.077495,0.124615
2021-03-31 20:00:00,0.219924,0.080906,0.080906,0.080906,0.080906,0.219063,0.156485,0.080906
2021-03-31 21:00:00,0.194661,0.194661,0.071612,0.071612,0.071612,0.071612,0.129571,0.194661
2021-03-31 22:00:00,0.066529,0.093833,0.164047,0.180844,0.180844,0.180844,0.066529,0.066529


In [113]:
df_actions.to_csv('df_actions.csv')

<a id='6'></a>
# Part 7: Backtest

<a id='6.1'></a>
## 7.1 BackTestStats
Pass in `df_account_value`; this information is stored in the env class.
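
While the pyfolio-based cells in this part are commented out, a few summary statistics can be computed directly from a return series. The sketch below is a hedged stand-in, not the pyfolio output: the helper `summarize_returns` is hypothetical, the annualization factor of 24 * 365 hourly periods is an assumption, and the Sharpe value it produces is not guaranteed to match the one printed by the environment.

```python
import numpy as np
import pandas as pd

# Assumption: hourly bars, with crypto markets trading around the clock.
HOURS_PER_YEAR = 24 * 365

def summarize_returns(returns, periods_per_year=HOURS_PER_YEAR):
    """Cumulative return, annualized Sharpe (zero risk-free rate), max drawdown."""
    returns = returns.dropna()
    # Growth of one unit of capital over time.
    equity = (1.0 + returns).cumprod()
    std = returns.std()
    if std > 0:
        sharpe = float(np.sqrt(periods_per_year) * returns.mean() / std)
    else:
        sharpe = float("nan")
    # Largest peak-to-trough decline of the equity curve.
    max_drawdown = float((equity / equity.cummax() - 1.0).min())
    return {
        "cumulative_return": float(equity.iloc[-1] - 1.0),
        "sharpe": sharpe,
        "max_drawdown": max_drawdown,
    }

toy_returns = pd.Series([0.10, -0.50, 0.20])
stats = summarize_returns(toy_returns)
print(stats)
```

With the objects produced in the trading section, this could be called as `summarize_returns(df_daily_return["daily_return"])`, keeping in mind that the `daily_return` column holds hourly returns in this notebook.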


In [107]:
# from pyfolio import timeseries
# DRL_strat = convert_daily_return_to_pyfolio_ts(df_daily_return)
# perf_func = timeseries.perf_stats 
# perf_stats_all = perf_func( returns=DRL_strat, 
#                               factor_returns=DRL_strat, 
#                                 positions=None, transactions=None, turnover_denom="AGB")

In [108]:
# print("==============DRL Strategy Stats===========")
# perf_stats_all

<a id='6.2'></a>
## 7.2 BackTestPlot

In [109]:
# print("==============Compare to IHSG===========")
# %matplotlib inline
# BackTestPlot(df_account_value, 
#              baseline_ticker = '^JKSE', 
#              baseline_start = df_account_value.loc[0,'date'],
#              baseline_end = df_account_value.loc[len(df_account_value)-1,'date'])