
### Automated stock trading using FinRL

The trading agent is trained with Deep Reinforcement Learning (DRL) algorithms. The components of the reinforcement learning environment are:

Action: The action space describes the actions through which the agent interacts with the environment. Normally, a ∈ A includes three actions: a ∈ {−1, 0, 1}, where −1, 0, and 1 represent selling, holding, and buying one share. An action can also be carried out on multiple shares. We use an action space {−k, ..., −1, 0, 1, ..., k}, where k denotes the maximum number of shares per trade. For example, "Buy 10 shares of AAPL" and "Sell 10 shares of AAPL" correspond to 10 and −10, respectively.
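
A minimal sketch of how a continuous policy output could be mapped onto the share counts {−k, ..., k} described here. It assumes the agent emits one value in [−1, 1] per stock and that k acts as a per-trade share cap; the function name `scale_actions` is hypothetical and not part of FinRL.

```python
import numpy as np

def scale_actions(raw_actions, k):
    """Map raw actions in [-1, 1] to integer share counts in [-k, k]."""
    clipped = np.clip(np.asarray(raw_actions, dtype=float), -1.0, 1.0)
    # Truncate toward zero so that a very weak signal results in no trade.
    return (clipped * k).astype(int)

# One raw action per stock: buy, sell, hold, and an out-of-range value.
shares = scale_actions([0.1, -0.1, 0.0, 1.7], k=100)
print(shares)  # 10 = buy 10 shares, -10 = sell 10 shares, 0 = hold, 100 = capped buy
```

With k = 100, a raw action of 0.1 corresponds to "Buy 10 shares" and −0.1 to "Sell 10 shares".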

Reward function: r(s, a, s′) is the incentive mechanism for an agent to learn a better action. It is defined as the change in portfolio value when action a is taken at state s and the agent arrives at the new state s′, i.e., r(s, a, s′) = v′ − v, where v′ and v represent the portfolio values at states s′ and s, respectively.
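
As a worked example of r(s, a, s′) = v′ − v, the sketch below values a small portfolio before and after a price move. It assumes the portfolio value is cash plus the market value of the holdings; the helper `portfolio_value` and all numbers are made up for illustration.

```python
import numpy as np

def portfolio_value(cash, prices, shares):
    """Cash plus the market value of all holdings."""
    return cash + float(np.dot(prices, shares))

cash = 1_000.0
shares = np.array([10, 5])  # holdings stay the same across the step in this example

v = portfolio_value(cash, np.array([100.0, 50.0]), shares)       # value at state s
v_next = portfolio_value(cash, np.array([102.0, 49.0]), shares)  # value at state s'
reward = v_next - v

print(v, v_next, reward)  # 2250.0 2265.0 15.0
```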

State: The state space describes the observations that the agent receives from the environment. Just as a human trader needs to analyze various information before executing a trade, so our trading agent observes many different features to better learn in an interactive environment.

Environment: Dow 30 constituents

Install all the packages through the FinRL library

In [1]:
!python --version



Python 3.10.16


In [17]:
!python -m pip install git+https://github.com/AI4Finance-Foundation/FinRL.git

Collecting git+https://github.com/AI4Finance-Foundation/FinRL.git
  Cloning https://github.com/AI4Finance-Foundation/FinRL.git to c:\users\natna\appdata\local\temp\pip-req-build-l77kt8or
  Resolved https://github.com/AI4Finance-Foundation/FinRL.git to commit bc12fe7b57c483e8fac666f4cf05cbf62077958a
  Installing build dependencies: started
  Installing build dependencies: finished with status 'done'
  Getting requirements to build wheel: started
  Getting requirements to build wheel: finished with status 'done'
  Preparing metadata (pyproject.toml): started
  Preparing metadata (pyproject.toml): finished with status 'done'
Collecting elegantrl@ git+https://github.com/AI4Finance-Foundation/ElegantRL.git (from finrl==0.3.6)
  Cloning https://github.com/AI4Finance-Foundation/ElegantRL.git to c:\users\natna\appdata\local\temp\pip-install-cwe6w9_h\elegantrl_f378379c6c3f4bdeb5934f405e115e04
  Resolved https://github.com/AI4Finance-Foundation/ElegantRL.git to commit 2fa34dd9236498beada8d8443d9

  Running command git clone --filter=blob:none --quiet https://github.com/AI4Finance-Foundation/FinRL.git 'C:\Users\natna\AppData\Local\Temp\pip-req-build-l77kt8or'
  Running command git clone --filter=blob:none --quiet https://github.com/AI4Finance-Foundation/ElegantRL.git 'C:\Users\natna\AppData\Local\Temp\pip-install-cwe6w9_h\elegantrl_f378379c6c3f4bdeb5934f405e115e04'


In [6]:
!pip3 install pandas
!pip install numpy



In [25]:
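import os  # `!set` only affects a temporary subshell, so update the kernel's own PATH instead
os.environ["PATH"] += os.pathsep + r"C:\Users\natna\miniforge3\envs\finRl\Scripts"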


In [3]:
!python -m pip install numpy==1.26.4 scipy==1.12.0 scikit-learn==1.6.1



# !python --version 1.23

Collecting numpy==1.26.4
  Using cached numpy-1.26.4-cp310-cp310-win_amd64.whl.metadata (61 kB)
Collecting scipy==1.12.0
  Using cached scipy-1.12.0-cp310-cp310-win_amd64.whl.metadata (60 kB)
Collecting scikit-learn==1.6.1
  Using cached scikit_learn-1.6.1-cp310-cp310-win_amd64.whl.metadata (15 kB)
Using cached numpy-1.26.4-cp310-cp310-win_amd64.whl (15.8 MB)
Using cached scipy-1.12.0-cp310-cp310-win_amd64.whl (46.2 MB)
Using cached scikit_learn-1.6.1-cp310-cp310-win_amd64.whl (11.1 MB)
Installing collected packages: numpy, scipy, scikit-learn
Successfully installed numpy-1.26.4 scikit-learn-1.6.1 scipy-1.12.0




In [2]:
import numpy
import scipy
import sklearn
print("Numpy version:", numpy.__version__)
print("Scipy version:", scipy.__version__)
print("Scikit-learn version:", sklearn.__version__)


Numpy version: 1.26.4
Scipy version: 1.12.0
Scikit-learn version: 1.6.1


Import Packages

In [1]:
import pandas as pd
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
# matplotlib.use('Agg')
import datetime

%matplotlib inline
from finrl import config
from finrl.meta.preprocessor.yahoodownloader import YahooDownloader
from finrl.meta.preprocessor.preprocessors import FeatureEngineer, data_split
from finrl.meta.env_stock_trading.env_stocktrading import StockTradingEnv
from finrl.agents.stablebaselines3.models import DRLAgent
from finrl.plot import backtest_stats, backtest_plot, get_daily_return, get_baseline

from pprint import pprint

import sys
sys.path.append("../FinRL-Library")

import itertools



In [5]:
pip freeze > requirements.txt

Note: you may need to restart the kernel to use updated packages.


Create Folders

In [6]:
import os

# Create the working directories; exist_ok=True makes the call a no-op if they already exist
for directory in (
    config.DATA_SAVE_DIR,
    config.TRAINED_MODEL_DIR,
    config.TENSORBOARD_LOG_DIR,
    config.RESULTS_DIR,
):
    os.makedirs("./" + directory, exist_ok=True)

Download Data

In [2]:
from finrl import config_tickers
df = YahooDownloader(start_date = '2009-01-01',
                           end_date = '2020-09-30',
                           ticker_list = config_tickers.DOW_30_TICKER).fetch_data()

[*********************100%***********************]  1 of 1 completed

Shape of DataFrame:  (86111, 8)


Add technical Indicators

In [3]:
df = FeatureEngineer(use_technical_indicator=True,
                      tech_indicator_list = config.INDICATORS,
                      use_turbulence=True,
                      user_defined_feature = False).preprocess_data(df.copy())

Successfully added technical indicators
Successfully added turbulence index
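
For intuition about what one of these indicator columns contains, here is a pandas-only sketch of a per-ticker simple moving average, similar in spirit to the `close_30_sma` column. It is an illustration on toy data, not the code path `FeatureEngineer` uses internally; the 3-day window, the tickers, and the column name `close_3_sma` are chosen only to keep the example small.

```python
import pandas as pd

toy = pd.DataFrame({
    "date": ["2020-01-02", "2020-01-03", "2020-01-06", "2020-01-07"] * 2,
    "tic": ["AAA"] * 4 + ["BBB"] * 4,
    "close": [10.0, 11.0, 12.0, 13.0, 20.0, 22.0, 24.0, 26.0],
})

# Rolling mean computed separately for each ticker.
# min_periods=1 averages whatever is available at the start instead of returning NaN.
toy["close_3_sma"] = (
    toy.groupby("tic")["close"]
       .transform(lambda s: s.rolling(window=3, min_periods=1).mean())
)
print(toy)
```

Grouping by `tic` matters because the rows of all tickers share one long DataFrame; without it, the window would mix prices of different stocks.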


In [4]:
df=df.sort_values(['date','tic'],ignore_index=True)
df.index = df.date.factorize()[0]

cov_list = []
# look back is one year
lookback=252
for i in range(lookback,len(df.index.unique())):
  data_lookback = df.loc[i-lookback:i,:]
  price_lookback=data_lookback.pivot_table(index = 'date',columns = 'tic', values = 'close')
  return_lookback = price_lookback.pct_change().dropna()
  covs = return_lookback.cov().values
  cov_list.append(covs)

df_cov = pd.DataFrame({'date':df.date.unique()[lookback:],'cov_list':cov_list})
df = df.merge(df_cov, on='date')
df = df.sort_values(['date','tic']).reset_index(drop=True)
df.head()

Unnamed: 0,date,close,high,low,open,volume,tic,day,macd,boll_ub,boll_lb,rsi_30,cci_30,dx_30,close_30_sma,close_60_sma,turbulence,cov_list
0,2010-01-04,6.447413,7.660714,7.585,7.6225,493729600,AAPL,0,0.118289,6.510971,5.54371,62.133198,168.772848,33.760635,6.031814,5.978972,0.0,"[[0.0004430597365654261, 0.0001369804643834412..."
1,2010-01-04,40.591297,57.869999,56.560001,56.630001,5277400,AMGN,0,0.226693,41.025202,38.556795,52.850028,85.62114,6.350919,39.784668,39.709536,0.0,"[[0.0004430597365654261, 0.0001369804643834412..."
2,2010-01-04,32.828983,41.099998,40.389999,40.810001,6894300,AXP,0,0.288679,33.751081,31.445595,56.779299,1.010171,11.537387,32.716407,31.134859,0.0,"[[0.0004430597365654261, 0.0001369804643834412..."
3,2010-01-04,43.777565,56.389999,54.799999,55.720001,6186700,BA,0,0.501042,44.009591,41.898202,58.805064,81.205664,10.840906,42.319078,40.753002,0.0,"[[0.0004430597365654261, 0.0001369804643834412..."
4,2010-01-04,39.738232,59.189999,57.509998,57.650002,7325600,CAT,0,0.07545,40.134163,38.238726,55.292759,49.350754,8.534279,39.345486,38.969691,0.0,"[[0.0004430597365654261, 0.0001369804643834412..."
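
The lookback loop in the cell above attaches one covariance matrix of daily returns to every date. The same idea on a toy scale, assuming a wide price table with one column per ticker; the tickers, the random prices, and the 5-day lookback are invented for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
dates = pd.bdate_range("2020-01-01", periods=8)
# Toy price table: one row per trading day, one column per ticker.
prices = pd.DataFrame(
    100 * np.cumprod(1 + rng.normal(0, 0.01, size=(8, 3)), axis=0),
    index=dates,
    columns=["AAA", "BBB", "CCC"],
)

lookback = 5
cov_by_date = {}
for i in range(lookback, len(prices)):
    # lookback + 1 prices give lookback daily returns
    window = prices.iloc[i - lookback : i + 1]
    returns = window.pct_change().dropna()
    cov_by_date[prices.index[i]] = returns.cov().values

first_cov = next(iter(cov_by_date.values()))
print(len(cov_by_date), first_cov.shape)
```

Each date from the sixth row onward gets a 3×3 matrix; in the notebook the matrices are 29×29 and the lookback is 252 trading days.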


In real-life trading, the model needs to be retrained periodically using rolling windows. Here, the data is simply split once into a training set and a trading set.

In [5]:
train = data_split(df, '2009-01-01','2019-12-31')
trade = data_split(df, '2020-01-01','2020-09-30')
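
The rolling-window retraining mentioned above is not used in this notebook, but its date bookkeeping could look roughly like the sketch below. The generator `rolling_windows` is hypothetical (not a FinRL function), and the window sizes are toy values.

```python
import pandas as pd

def rolling_windows(dates, train_size, trade_size):
    """Yield (train_dates, trade_dates) pairs that slide forward by trade_size."""
    start = 0
    while start + train_size + trade_size <= len(dates):
        train = dates[start : start + train_size]
        trade = dates[start + train_size : start + train_size + trade_size]
        yield train, trade
        start += trade_size

dates = pd.bdate_range("2019-01-01", periods=10)
windows = list(rolling_windows(dates, train_size=6, trade_size=2))

for train_dates, trade_dates in windows:
    print(
        "train:", train_dates[0].date(), "->", train_dates[-1].date(),
        "| trade:", trade_dates[0].date(), "->", trade_dates[-1].date(),
    )
```

Each window would feed one retraining run followed by one out-of-sample trading period, so the trade dates always come strictly after the training dates.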

State Space and Action Space Calculation

In [6]:
stock_dimension = len(train.tic.unique())
state_space = 1 + 2*stock_dimension + len(config.INDICATORS)*stock_dimension

In [7]:
print(stock_dimension)
print(state_space)

29
291
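
The value 291 can be reproduced by laying out one plausible state vector by hand. The sketch assumes the state is ordered as cash, then one close price per stock, then one holding per stock, then every indicator for every stock; the indicator names are taken from the columns of the DataFrame preview earlier in the notebook, and the placeholder zeros stand in for real data.

```python
# Indicator names as they appear in the DataFrame preview (8 in total).
indicators = ["macd", "boll_ub", "boll_lb", "rsi_30",
              "cci_30", "dx_30", "close_30_sma", "close_60_sma"]
n_stocks = 29

cash = [1_000_000.0]                               # 1 entry: available cash
prices = [0.0] * n_stocks                          # 29 entries: close price per stock
holdings = [0] * n_stocks                          # 29 entries: shares held per stock
features = [0.0] * (len(indicators) * n_stocks)    # 8 * 29 = 232 indicator entries

state = cash + prices + holdings + features
print(len(state))  # 1 + 2*29 + 8*29 = 291
```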


## Environment Details

In [8]:
# Define transaction cost lists for buying and selling stocks
buy_cost_list = sell_cost_list = [0.001] * stock_dimension
# Explanation: 
# - `buy_cost_list` and `sell_cost_list` represent the transaction costs as a percentage for buying and selling stocks.
# - `[0.001] * stock_dimension` creates a list where each element is 0.001 (0.1% transaction fee), repeated for each stock.
# - The use of `=` assigns the same list to both `buy_cost_list` and `sell_cost_list`.

# Initialize the list to track the number of shares owned for each stock
num_stock_shares = [0] * stock_dimension
# Explanation:
# - `num_stock_shares` is a list where each element is initialized to 0, representing that no shares are owned initially.
# - `[0] * stock_dimension` ensures the list length matches the number of stocks (`stock_dimension`).

# Create a dictionary to store environment configuration parameters
env_kwargs = {
    "hmax": 100,  # Maximum number of shares that can be bought or sold in a single transaction.
    "initial_amount": 1_000_000,  # Initial cash available for the agent to trade with (e.g., $1,000,000).
    "num_stock_shares": num_stock_shares,  # Initial portfolio: number of shares owned for each stock.
    "buy_cost_pct": buy_cost_list,  # Transaction cost percentage for buying stocks.
    "sell_cost_pct": sell_cost_list,  # Transaction cost percentage for selling stocks.
    "state_space": state_space,  # Dimension of the state space (e.g., features describing the environment).
    "stock_dim": stock_dimension,  # Number of stocks being traded (dimension of the stock universe).
    "tech_indicator_list": config.INDICATORS,  # List of technical indicators used as features for the state space.
    "action_space": stock_dimension,  # Dimension of the action space (one action per stock).
    "reward_scaling": 1e-4  # Scaling factor for rewards to normalize them and improve learning stability.
}
# Explanation:
# - This dictionary (`env_kwargs`) encapsulates all the necessary parameters required to initialize the stock trading environment.
# - It includes configuration for portfolio management (e.g., `hmax`, `initial_amount`, `num_stock_shares`) and the structure of the RL problem (e.g., `state_space`, `action_space`).

# Initialize the stock trading environment with the training data and configuration parameters
e_train_gym = StockTradingEnv(df=train, **env_kwargs)
# Explanation:
# - `StockTradingEnv` is a custom environment class for stock trading, compliant with OpenAI Gym standards.
# - `df=train` specifies the training data (a DataFrame containing historical stock prices and other features).
# - `**env_kwargs` unpacks the `env_kwargs` dictionary, passing each key-value pair as an argument to the environment initializer.
# - The environment simulates the stock trading process, enabling the RL agent to interact with it by observing states, taking actions, and receiving rewards.
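
To make the 0.1% transaction cost concrete, the sketch below shows its effect on a single trade. It assumes the fee is proportional to the traded value; the helper names are hypothetical and the prices are made up.

```python
def buy_cash_outflow(price, shares, cost_pct):
    """Cash spent to buy `shares` at `price`, including a proportional fee."""
    return price * shares * (1 + cost_pct)

def sell_cash_inflow(price, shares, cost_pct):
    """Cash received for selling `shares` at `price`, net of a proportional fee."""
    return price * shares * (1 - cost_pct)

spent = buy_cash_outflow(price=100.0, shares=10, cost_pct=0.001)
received = sell_cash_inflow(price=100.0, shares=10, cost_pct=0.001)
print(round(spent, 2), round(received, 2))  # 1001.0 999.0
```

A round trip of 10 shares at an unchanged price of $100 therefore loses about $2, which is why frequent small trades are penalized in the reward.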


Environment for training

In [9]:
env_train, _ = e_train_gym.get_sb_env() #get stable baseline environment for training
print(type(env_train))

<class 'stable_baselines3.common.vec_env.dummy_vec_env.DummyVecEnv'>


In [10]:
agent = DRLAgent(env = env_train)
# Set the corresponding values to 'True' for the algorithms that you want to use
if_using_a2c = True
if_using_ddpg = True
if_using_ppo = True
if_using_td3 = False
if_using_sac = False

In [16]:
from stable_baselines3.common.logger import configure

Implement DRL Algorithms

In [11]:
import torch

print('Current version of PyTorch: ', torch.__version__)

if torch.cuda.is_available():  # call the function; the bare attribute is always truthy
  print('PyTorch can use GPUs!')
else:
  print('PyTorch cannot use GPUs.')

Current version of PyTorch:  2.6.0+cu118
PyTorch can use GPUs!


DDPG

Training

In [18]:
agent = DRLAgent(env = env_train)
model_ddpg = agent.get_model("ddpg")

if if_using_ddpg:
  # set up logger
  tmp_path = config.RESULTS_DIR + '/ddpg'
  new_logger_ddpg = configure(tmp_path, ["stdout", "csv", "tensorboard"])
  # Set new logger
  model_ddpg.set_logger(new_logger_ddpg)

{'batch_size': 128, 'buffer_size': 50000, 'learning_rate': 0.001}
Using cuda device
Logging to results/ddpg


In [19]:
trained_ddpg = agent.train_model(model=model_ddpg,
                             tb_log_name='ddpg',
                             total_timesteps=50000) if if_using_ddpg else None

-----------------------------------
| time/              |            |
|    episodes        | 4          |
|    fps             | 75         |
|    time_elapsed    | 133        |
|    total_timesteps | 10060      |
| train/             |            |
|    actor_loss      | 11.8       |
|    critic_loss     | 19.8       |
|    learning_rate   | 0.001      |
|    n_updates       | 9959       |
|    reward          | -2.4185367 |
-----------------------------------
-----------------------------------
| time/              |            |
|    episodes        | 8          |
|    fps             | 75         |
|    time_elapsed    | 266        |
|    total_timesteps | 20120      |
| train/             |            |
|    actor_loss      | 1          |
|    critic_loss     | 2.9        |
|    learning_rate   | 0.001      |
|    n_updates       | 20019      |
|    reward          | -2.4185367 |
-----------------------------------
day: 2514, episode: 10
begin_total_asset: 1000000.00
end_total_a

In [20]:
trained_ddpg.save(config.TRAINED_MODEL_DIR + "/agent_ddpg") if if_using_ddpg else None


Trading

In [42]:
e_trade_gym = StockTradingEnv(df = trade, **env_kwargs)

In [43]:
df_account_value, df_actions = DRLAgent.DRL_prediction(
    model=trained_ddpg,
    environment = e_trade_gym)

hit end!


In [44]:
df_account_value.tail()

Unnamed: 0,date,account_value
183,2020-09-23,923456.79624
184,2020-09-24,928167.969877
185,2020-09-25,937527.298529
186,2020-09-28,950563.141333
187,2020-09-29,944897.941351


Backtesting Performance

In [39]:
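# Download the index series used for the "dji" column of the comparison tables and plots.
# Note: this cell requests "dji", whereas the baseline-stats cells request "^DJI".
df_dji = YahooDownloader(
    start_date='2020-01-01', end_date='2020-09-30', ticker_list=["dji"]
).fetch_data()
df_dji = df_dji[["date", "close"]]
fst_day = df_dji["close"].iloc[0]  # first close, selected by position
# Rescale so the series starts at 1,000,000, matching the agent's initial capital
dji = pd.merge(
    df_dji["date"],
    df_dji["close"].div(fst_day).mul(1000000),
    how="outer",
    left_index=True,
    right_index=True,
).set_index("date")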


[*********************100%***********************]  1 of 1 completed

Shape of DataFrame:  (183, 8)





In [40]:
print("==============Get Backtest Results===========")
now = datetime.datetime.now().strftime('%Y%m%d-%Hh%M')

perf_stats_all = backtest_stats(account_value=df_account_value)
perf_stats_all = pd.DataFrame(perf_stats_all)
perf_stats_all.to_csv("./"+config.RESULTS_DIR+"/perf_stats_all_"+now+'.csv')

Annual return         -0.050662
Cumulative returns    -0.038044
Annual volatility      0.399601
Sharpe ratio           0.069960
Calmar ratio          -0.141434
Stability              0.000611
Max drawdown          -0.358206
Omega ratio            1.014689
Sortino ratio          0.095153
Skew                        NaN
Kurtosis                    NaN
Tail ratio             0.858610
Daily value at risk   -0.050234
dtype: float64
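
As a quick sanity check on the "Cumulative returns" row, the value can be recomputed from the final account values alone. The sketch assumes the cumulative return is the final value divided by the initial 1,000,000 minus one; the final values are copied from the `df_account_value.tail()` outputs of the DDPG, PPO, and A2C sections of this notebook.

```python
initial_amount = 1_000_000

# Last row (2020-09-29) of each df_account_value.tail() output in this notebook.
final_values = {
    "ddpg": 944897.941351,
    "ppo": 919340.924575,
    "a2c": 961955.772083,
}

cumulative_returns = {
    name: value / initial_amount - 1 for name, value in final_values.items()
}
for name, ret in cumulative_returns.items():
    print(f"{name}: {ret:.6f}")
```

The PPO and A2C results (about −0.080659 and −0.038044) agree with the statistics printed in their sections. The DDPG result (about −0.055102) does not agree with the −0.038044 printed in this section, which instead equals the A2C figure. The execution counters suggest that the statistics cell (In [40]) was run before the DDPG trading cells (In [42] to In [44]), so re-running it after trading would be expected to refresh the numbers.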


In [41]:
#baseline stats
print("==============Get Baseline Stats===========")
baseline_df = get_baseline(
        ticker="^DJI",
        start = '2020-01-01',
        end = '2020-09-30')

stats = backtest_stats(baseline_df, value_col_name = 'close')

[*********************100%***********************]  1 of 1 completed

Shape of DataFrame:  (188, 8)
Annual return         -0.065199
Cumulative returns    -0.049054
Annual volatility      0.416030
Sharpe ratio           0.046016
Calmar ratio          -0.175803
Stability              0.012240
Max drawdown          -0.370862
Omega ratio            1.009343
Sortino ratio          0.062829
Skew                        NaN
Kurtosis                    NaN
Tail ratio             0.860019
Daily value at risk   -0.052339
dtype: float64
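
Several rows of these tables can be approximated directly from a series of account values. The sketch below is a simplified illustration that assumes 252 trading days per year and a zero risk-free rate; `basic_stats` is a hypothetical helper, and its conventions may differ in detail from those used by `backtest_stats`.

```python
import numpy as np
import pandas as pd

def basic_stats(values, periods_per_year=252):
    """Cumulative return, annualized volatility, Sharpe ratio and max drawdown."""
    values = pd.Series(values, dtype=float)
    returns = values.pct_change().dropna()
    cumulative = values.iloc[-1] / values.iloc[0] - 1
    volatility = returns.std() * np.sqrt(periods_per_year)
    sharpe = returns.mean() / returns.std() * np.sqrt(periods_per_year)
    drawdown = values / values.cummax() - 1   # distance below the running peak
    return {
        "cumulative_return": cumulative,
        "annual_volatility": volatility,
        "sharpe_ratio": sharpe,
        "max_drawdown": drawdown.min(),
    }

# Toy value series: up 10%, down to 99, then up to 120.
toy_stats = basic_stats([100.0, 110.0, 99.0, 120.0])
print(toy_stats)
```

In the toy series the cumulative return is 20% and the maximum drawdown is −10% (the fall from 110 to 99).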





Back Test Plot

In [45]:
df_result_ddpg = df_account_value.set_index(df_account_value.columns[0])
result = pd.DataFrame(
    {
        "ddpg": df_result_ddpg["account_value"],
        "dji": dji["close"],
    }
)
result

date,ddpg,dji
2020-01-02,1.000000e+06,
2020-01-03,9.978712e+05,1.000000e+06
2020-01-06,9.981349e+05,1.002392e+06
2020-01-07,9.939272e+05,9.982119e+05
2020-01-08,1.000095e+06,1.003848e+06
...,...,...
2020-09-23,9.234568e+05,9.346322e+05
2020-09-24,9.281680e+05,9.364587e+05
2020-09-25,9.375273e+05,9.489818e+05
2020-09-28,9.505631e+05,


In [47]:
plt.rcParams["figure.figsize"] = (15,5)
# DataFrame.plot() creates its own figure, so a separate plt.figure() call would only add an empty one
result.plot()

<Axes: xlabel='date'>

PPO

In [24]:
agent = DRLAgent(env = env_train)
PPO_PARAMS = {
    "n_steps": 2048,
    "ent_coef": 0.01,
    "learning_rate": 0.00025,
    "batch_size": 128,
}
model_ppo = agent.get_model("ppo",model_kwargs = PPO_PARAMS)

if if_using_ppo:
  # set up logger
  tmp_path = config.RESULTS_DIR + '/ppo'
  new_logger_ppo = configure(tmp_path, ["stdout", "csv", "tensorboard"])
  # Set new logger
  model_ppo.set_logger(new_logger_ppo)

{'n_steps': 2048, 'ent_coef': 0.01, 'learning_rate': 0.00025, 'batch_size': 128}
Using cuda device
Logging to results/ppo
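
The PPO settings determine how the training budget is spent. The arithmetic below assumes a single environment inside the `DummyVecEnv` and the Stable-Baselines3 default of 10 optimization epochs per rollout, neither of which is set explicitly in this notebook; `total_timesteps` is the value passed to the training call in the next cell.

```python
import math

n_steps = 2048             # from PPO_PARAMS
batch_size = 128           # from PPO_PARAMS
total_timesteps = 200_000  # from the train_model call
n_envs = 1                 # assumption: one environment in the DummyVecEnv
n_epochs = 10              # assumption: Stable-Baselines3 default for PPO

rollout_size = n_steps * n_envs
minibatches_per_epoch = rollout_size // batch_size
gradient_steps_per_iteration = minibatches_per_epoch * n_epochs
iterations = math.ceil(total_timesteps / rollout_size)

print(minibatches_per_epoch, gradient_steps_per_iteration, iterations)  # 16 160 98
```

Under these assumptions, each iteration collects 2048 environment steps and then makes 10 passes of 16 minibatches over them, and training stops after 98 iterations.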




In [25]:
trained_ppo = agent.train_model(model=model_ppo,
                             tb_log_name='ppo',
                             total_timesteps=200000) if if_using_ppo else None

---------------------------------
| time/              |          |
|    fps             | 147      |
|    iterations      | 1        |
|    time_elapsed    | 13       |
|    total_timesteps | 2048     |
| train/             |          |
|    reward          | 0.337964 |
---------------------------------
-----------------------------------------
| time/                   |             |
|    fps                  | 145         |
|    iterations           | 2           |
|    time_elapsed         | 28          |
|    total_timesteps      | 4096        |
| train/                  |             |
|    approx_kl            | 0.014884099 |
|    clip_fraction        | 0.207       |
|    clip_range           | 0.2         |
|    entropy_loss         | -41.2       |
|    explained_variance   | 0.00748     |
|    learning_rate        | 0.00025     |
|    loss                 | 5.02        |
|    n_updates            | 10          |
|    policy_gradient_loss | -0.0258     |
|    reward           

In [26]:
trained_ppo.save(config.TRAINED_MODEL_DIR + "/agent_ppo") if if_using_ppo else None

Trading

In [48]:
e_trade_gym = StockTradingEnv(df = trade, **env_kwargs)

In [49]:
df_account_value, df_actions = DRLAgent.DRL_prediction(
    model=trained_ppo,
    environment = e_trade_gym)

hit end!


In [50]:
df_account_value.tail()

Unnamed: 0,date,account_value
183,2020-09-23,909194.253185
184,2020-09-24,911009.383914
185,2020-09-25,916962.469833
186,2020-09-28,928577.920684
187,2020-09-29,919340.924575


Backtesting Performance

In [51]:
print("==============Get Backtest Results===========")
now = datetime.datetime.now().strftime('%Y%m%d-%Hh%M')

perf_stats_all = backtest_stats(account_value=df_account_value)
perf_stats_all = pd.DataFrame(perf_stats_all)
perf_stats_all.to_csv("./"+config.RESULTS_DIR+"/perf_stats_all_"+now+'.csv')

Annual return         -0.106606
Cumulative returns    -0.080659
Annual volatility      0.422709
Sharpe ratio          -0.056058
Calmar ratio          -0.294170
Stability              0.000500
Max drawdown          -0.362396
Omega ratio            0.989203
Sortino ratio         -0.076674
Skew                        NaN
Kurtosis                    NaN
Tail ratio             1.107088
Daily value at risk   -0.053350
dtype: float64


In [52]:
#baseline stats
print("==============Get Baseline Stats===========")
baseline_df = get_baseline(
        ticker="^DJI",
        start = '2020-01-01',
        end = '2020-09-30')

stats = backtest_stats(baseline_df, value_col_name = 'close')

[*********************100%***********************]  1 of 1 completed

Shape of DataFrame:  (188, 8)
Annual return         -0.065199
Cumulative returns    -0.049054
Annual volatility      0.416030
Sharpe ratio           0.046016
Calmar ratio          -0.175803
Stability              0.012240
Max drawdown          -0.370862
Omega ratio            1.009343
Sortino ratio          0.062829
Skew                        NaN
Kurtosis                    NaN
Tail ratio             0.860019
Daily value at risk   -0.052339
dtype: float64





In [53]:
df_result_ppo = df_account_value.set_index(df_account_value.columns[0])
result = pd.DataFrame(
    {
        "ppo": df_result_ppo["account_value"],
        "dji": dji["close"],
    }
)
result

date,ppo,dji
2020-01-02,1.000000e+06,
2020-01-03,9.994978e+05,1.000000e+06
2020-01-06,9.999417e+05,1.002392e+06
2020-01-07,9.997660e+05,9.982119e+05
2020-01-08,1.000481e+06,1.003848e+06
...,...,...
2020-09-23,9.091943e+05,9.346322e+05
2020-09-24,9.110094e+05,9.364587e+05
2020-09-25,9.169625e+05,9.489818e+05
2020-09-28,9.285779e+05,


In [54]:
plt.rcParams["figure.figsize"] = (15,5)
# DataFrame.plot() creates its own figure, so a separate plt.figure() call would only add an empty one
result.plot()

<Axes: xlabel='date'>

A2C

In [30]:
agent = DRLAgent(env = env_train)
model_a2c = agent.get_model("a2c")

if if_using_a2c:
  # set up logger
  tmp_path = config.RESULTS_DIR + '/a2c'
  new_logger_a2c = configure(tmp_path, ["stdout", "csv", "tensorboard"])
  # Set new logger
  model_a2c.set_logger(new_logger_a2c)

{'n_steps': 5, 'ent_coef': 0.01, 'learning_rate': 0.0007}
Using cuda device
Logging to results/a2c




In [31]:
trained_a2c = agent.train_model(model=model_a2c,
                             tb_log_name='a2c',
                             total_timesteps=50000) if if_using_a2c else None

----------------------------------------
| time/                 |              |
|    fps                | 126          |
|    iterations         | 100          |
|    time_elapsed       | 3            |
|    total_timesteps    | 500          |
| train/                |              |
|    entropy_loss       | -41.3        |
|    explained_variance | 0.0855       |
|    learning_rate      | 0.0007       |
|    n_updates          | 99           |
|    policy_loss        | -33.9        |
|    reward             | -0.018644849 |
|    std                | 1.01         |
|    value_loss         | 0.967        |
----------------------------------------
---------------------------------------
| time/                 |             |
|    fps                | 126         |
|    iterations         | 200         |
|    time_elapsed       | 7           |
|    total_timesteps    | 1000        |
| train/                |             |
|    entropy_loss       | -41.3       |
|    explained_variance 

In [32]:
trained_a2c.save(config.TRAINED_MODEL_DIR + "/agent_a2c") if if_using_a2c else None

Trading

In [55]:
e_trade_gym = StockTradingEnv(df = trade, **env_kwargs)

In [56]:
df_account_value, df_actions = DRLAgent.DRL_prediction(
    model=trained_a2c,
    environment = e_trade_gym)

hit end!


In [57]:
df_account_value.tail()

Unnamed: 0,date,account_value
183,2020-09-23,932457.358241
184,2020-09-24,935975.452659
185,2020-09-25,949245.242111
186,2020-09-28,965283.748893
187,2020-09-29,961955.772083


Backtesting Performance

In [58]:
print("==============Get Backtest Results===========")
now = datetime.datetime.now().strftime('%Y%m%d-%Hh%M')

perf_stats_all = backtest_stats(account_value=df_account_value)
perf_stats_all = pd.DataFrame(perf_stats_all)
perf_stats_all.to_csv("./"+config.RESULTS_DIR+"/perf_stats_all_"+now+'.csv')

Annual return         -0.050662
Cumulative returns    -0.038044
Annual volatility      0.399601
Sharpe ratio           0.069960
Calmar ratio          -0.141434
Stability              0.000611
Max drawdown          -0.358206
Omega ratio            1.014689
Sortino ratio          0.095153
Skew                        NaN
Kurtosis                    NaN
Tail ratio             0.858610
Daily value at risk   -0.050234
dtype: float64


In [59]:
#baseline stats
print("==============Get Baseline Stats===========")
baseline_df = get_baseline(
        ticker="^DJI",
        start = '2020-01-01',
        end = '2020-09-30')

stats = backtest_stats(baseline_df, value_col_name = 'close')

[*********************100%***********************]  1 of 1 completed

Shape of DataFrame:  (188, 8)
Annual return         -0.065199
Cumulative returns    -0.049054
Annual volatility      0.416030
Sharpe ratio           0.046016
Calmar ratio          -0.175803
Stability              0.012240
Max drawdown          -0.370862
Omega ratio            1.009343
Sortino ratio          0.062829
Skew                        NaN
Kurtosis                    NaN
Tail ratio             0.860019
Daily value at risk   -0.052339
dtype: float64





In [60]:
df_result_a2c = df_account_value.set_index(df_account_value.columns[0])
result = pd.DataFrame(
    {
        "a2c": df_result_a2c["account_value"],
        "dji": dji["close"],
    }
)
result

date,a2c,dji
2020-01-02,1.000000e+06,
2020-01-03,9.985980e+05,1.000000e+06
2020-01-06,9.999666e+05,1.002392e+06
2020-01-07,9.986441e+05,9.982119e+05
2020-01-08,1.004843e+06,1.003848e+06
...,...,...
2020-09-23,9.324574e+05,9.346322e+05
2020-09-24,9.359755e+05,9.364587e+05
2020-09-25,9.492452e+05,9.489818e+05
2020-09-28,9.652837e+05,


In [61]:
plt.rcParams["figure.figsize"] = (15,5)
# DataFrame.plot() creates its own figure, so a separate plt.figure() call would only add an empty one
result.plot()

<Axes: xlabel='date'>

In [65]:
result.plot()
plt.savefig('results.png')