<a href="https://colab.research.google.com/github/AI4Finance-Foundation/FinRL/blob/master/FinRL_StockTrading_NeurIPS_2018.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Deep Reinforcement Learning for Stock Trading from Scratch: Multiple Stock Trading

* **PyTorch Version**



# Content

In [10]:
import pandas as pd
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
# matplotlib.use('Agg')
import datetime

import os
import sys
sys.path.append('..')

from finrl import config
from finrl import config_tickers

%matplotlib inline
from preprocess.default_preprocessors import data_split 
from finrl.metaFinrl.env_stock_trading.env_stocktrading import StockTradingEnv
from finrl.agents.stablebaselines3.models import DRLAgent




from pprint import pprint


import itertools

* [1.Build Environment](#0)  
    * [1.1. Training & Trade Data Split](#0.1)
    * [1.2. User-defined Environment](#0.2)   
    * [1.3. Initialize Environment](#0.3)    
* [2.Implement DRL Algorithms](#1) 
            

<a id='0'></a>
# Part 1. Design Environment
Because automated stock trading is stochastic and interactive, we model it as a **Markov Decision Process (MDP)**. During training, the agent observes a change in stock prices, takes an action, and receives a reward, then adjusts its strategy accordingly. By interacting with the environment over time, the trading agent learns a strategy that maximizes cumulative reward.

Our trading environments, based on the OpenAI Gym framework, simulate live stock markets with real market data, following the principle of time-driven simulation.

The action space describes the actions the agent may take when interacting with the environment. In the simplest case, an action a takes one of three values, {-1, 0, 1}, which represent selling, holding, and buying one share. An action can also cover multiple shares: we use the action space {-k, …, -1, 0, 1, …, k}, where k is the largest number of shares that can be bought and -k the largest number that can be sold in one action. For example, "Buy 10 shares of AAPL" and "Sell 10 shares of AAPL" are encoded as 10 and -10, respectively. Because the policy is defined on a Gaussian distribution, which is symmetric, the continuous action space is normalized to [-1, 1].
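
As a minimal sketch of the normalization described above, a policy output in [-1, 1] can be mapped back to a share count by multiplying by the maximum trade size. The helper name `scale_actions` is hypothetical, `hmax=100` mirrors the value used in `env_kwargs` later in this notebook, and truncating fractional shares toward zero is an assumption rather than a statement about the environment's exact behavior.

```python
import numpy as np

def scale_actions(raw_actions, hmax=100):
    """Map normalized actions in [-1, 1] to integer share counts in [-hmax, hmax]."""
    # Clip first so out-of-range policy outputs cannot exceed the trade limit.
    clipped = np.clip(np.asarray(raw_actions, dtype=float), -1.0, 1.0)
    # Truncate toward zero (assumption): fractional shares are dropped.
    return (clipped * hmax).astype(int)

# Positive values buy, negative values sell, zero holds.
shares = scale_actions([0.25, -0.5, 0.0, 1.5])
print(shares.tolist())
```

Clipping before scaling keeps every trade within the `[-hmax, hmax]` bound even when the Gaussian policy samples a value outside [-1, 1].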

In [16]:
import os
data_path = 'full_preprocessed_data.csv'
processed_full = pd.read_csv(os.path.join(config.DATA_SAVE_DIR, data_path), index_col=[0])

## Training data and Trading data split

In [17]:
# from config.py TRAIN_START_DATE is a string
#config.TRAIN_START_DATE
train_start_date = datetime.datetime(2019,1,1).strftime('%Y-%m-%d')
# from config.py TRAIN_END_DATE is a string
train_end_date = datetime.datetime(2022,1,1).strftime('%Y-%m-%d')
trade_end_date = datetime.datetime(2023,1,1).strftime('%Y-%m-%d')
train = data_split(processed_full, train_start_date ,train_end_date)
trade = data_split(processed_full, train_end_date ,trade_end_date)
print(len(train))
print(len(trade))

4536
1500


In [18]:
train.tail()

Unnamed: 0,date,tic,adj close,close,high,low,open,macd,rsi_30,cci_30,dx_30,close_30_sma,close_60_sma,vix,turbulence
755,2021-12-31,EURUSD=X,1.132503,1.132503,1.137915,1.130506,1.132323,-0.000965,44.911151,109.118753,16.257448,1.129425,1.142246,17.219999,3.822869
755,2021-12-31,GBPUSD=X,1.349837,1.349837,1.354848,1.346747,1.349892,0.002156,52.327576,171.604444,30.39303,1.332,1.347087,17.219999,3.822869
755,2021-12-31,USDCAD=X,1.27444,1.27444,1.275,1.26268,1.2743,0.003544,52.687505,-70.077708,14.859173,1.277305,1.260555,17.219999,3.822869
755,2021-12-31,USDCHF=X,0.9137,0.9137,0.91474,0.91047,0.9139,-0.002014,44.964408,-170.194575,37.535785,0.922676,0.921276,17.219999,3.822869
755,2021-12-31,USDJPY=X,115.063004,115.063004,115.192001,115.004997,115.058998,0.301309,59.394022,120.417399,32.024355,114.023701,113.945,17.219999,3.822869


In [19]:
trade.tail()

Unnamed: 0,date,tic,adj close,close,high,low,open,macd,rsi_30,cci_30,dx_30,close_30_sma,close_60_sma,vix,turbulence
249,2022-12-29,EURUSD=X,1.062925,1.062925,1.067019,1.061233,1.062925,0.008925,60.167083,78.327051,49.139568,1.0509,1.022255,21.440001,6.166426
249,2022-12-29,GBPUSD=X,1.202848,1.202848,1.207584,1.201548,1.203297,0.003764,52.732919,-46.930901,3.480165,1.211069,1.176042,21.440001,6.166426
249,2022-12-29,USDCAD=X,1.35994,1.35994,1.36076,1.35445,1.35994,0.002138,52.738428,36.079932,8.181686,1.353157,1.357071,21.440001,6.166426
249,2022-12-29,USDCHF=X,0.92771,0.92771,0.92872,0.92119,0.92771,-0.006936,40.654053,-97.75697,35.855458,0.938252,0.963402,21.440001,6.166426
249,2022-12-29,USDJPY=X,134.033997,134.033997,134.188004,132.936005,134.033997,-1.870648,41.56499,-88.398318,29.149269,136.605233,141.392699,21.440001,6.166426


In [20]:
config.INDICATORS

['macd', 'rsi_30', 'cci_30', 'dx_30', 'close_30_sma', 'close_60_sma']

In [22]:
stock_dimension = len(train.tic.unique())
state_space = 1 + len(config.INDICATORS)*stock_dimension + 2*stock_dimension
print(f"Stock Dimension: {stock_dimension}, State Space: {state_space}")


Stock Dimension: 6, State Space: 49
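
To make the state-space arithmetic concrete, the sketch below builds a placeholder state vector with the same length. It assumes the `2*stock_dimension` term stands for one price and one holding per ticker, and that the leading `1` is the cash balance; the ordering of the blocks and all values are illustrative only.

```python
import numpy as np

indicators = ["macd", "rsi_30", "cci_30", "dx_30", "close_30_sma", "close_60_sma"]
stock_dimension = 6

cash = np.array([1_000_000.0])                      # 1 entry: account balance (assumed)
prices = np.ones(stock_dimension)                   # placeholder price per ticker
holdings = np.zeros(stock_dimension)                # placeholder units held per ticker
tech = np.zeros(len(indicators) * stock_dimension)  # one value per indicator per ticker

state = np.concatenate([cash, prices, holdings, tech])
state_space = 1 + len(indicators) * stock_dimension + 2 * stock_dimension
print(state.shape[0], state_space)
```

With 6 tickers and 6 indicators this gives 1 + 36 + 12 = 49 entries, which matches the printed state space.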


In [23]:
buy_cost_list = sell_cost_list = [0.001] * stock_dimension
num_stock_shares = [0] * stock_dimension

env_kwargs = {
    "hmax": 100,
    "initial_amount": 1000000,
    "num_stock_shares": num_stock_shares,
    "buy_cost_pct": buy_cost_list,
    "sell_cost_pct": sell_cost_list,
    "state_space": state_space,
    "stock_dim": stock_dimension,
    "tech_indicator_list": config.INDICATORS,
    "action_space": stock_dimension,
    "reward_scaling": 1e-4
}


e_train_gym = StockTradingEnv(df = train, **env_kwargs)

## Environment for Training



In [24]:
env_train, _ = e_train_gym.get_sb_env()
print(type(env_train))

<class 'stable_baselines3.common.vec_env.dummy_vec_env.DummyVecEnv'>




<a id='1'></a>
# Part 2: Implement DRL Algorithms
* The implementation of the DRL algorithms is based on **OpenAI Baselines** and **Stable Baselines**. Stable Baselines is a fork of OpenAI Baselines with a major structural refactoring and code cleanups.
* The FinRL library includes fine-tuned standard DRL algorithms, such as DQN, DDPG,
Multi-Agent DDPG, PPO, SAC, A2C and TD3. We also allow users to
design their own DRL algorithms by adapting these DRL algorithms.

### Model Training: 5 models, A2C, DDPG, PPO, TD3, SAC


### Model 1: A2C


In [None]:
agent = DRLAgent(env = env_train)
model_a2c = agent.get_model("a2c")

{'n_steps': 5, 'ent_coef': 0.01, 'learning_rate': 0.0007}
Using cpu device


In [None]:
model_name = 'a2c_'
trained_a2c = agent.train_model(model=model_a2c, 
                             tb_log_name='a2c',
                             total_timesteps=50000)
trained_a2c.save(os.path.join(config.TRAINED_MODEL_DIR, model_name + ".pth"))

---------------------------------------
| time/                 |             |
|    fps                | 58          |
|    iterations         | 100         |
|    time_elapsed       | 8           |
|    total_timesteps    | 500         |
| train/                |             |
|    entropy_loss       | -96.5       |
|    explained_variance | -52.8       |
|    learning_rate      | 0.0007      |
|    n_updates          | 99          |
|    policy_loss        | -16.7       |
|    reward             | 0.044553936 |
|    std                | 1.02        |
|    value_loss         | 0.0917      |
---------------------------------------
----------------------------------------
| time/                 |              |
|    fps                | 69           |
|    iterations         | 200          |
|    time_elapsed       | 14           |
|    total_timesteps    | 1000         |
| train/                |              |
|    entropy_loss       | -97.4        |
|    explained_variance | -9.89 

### Model 2: DDPG

In [32]:
from stable_baselines3.common.utils import get_schedule_fn
agent = DRLAgent(env = env_train)
DDPG_PARAMS = {"batch_size": 128, "buffer_size": 50, "learning_rate": 0.0025}
model_ddpg = agent.get_model("ddpg",model_kwargs= DDPG_PARAMS,  tensorboard_log = config.TENSORBOARD_LOG_DIR)

{'batch_size': 128, 'buffer_size': 50, 'learning_rate': 0.0025}
Using cuda device


In [33]:
model_name  = 'DDPG_'
total_timesteps = 1000000
trained_ddpg = agent.train_model(model=model_ddpg, 
                             tb_log_name='ddpg',
                             total_timesteps=total_timesteps)
trained_ddpg.save(os.path.join(config.TRAINED_MODEL_DIR, model_name + str(total_timesteps) + ".pth"))

Logging to MARKETS/ForexMarket/TENSORBOARD_LOG_DIR/ddpg_3
----------------------------------
| time/              |           |
|    episodes        | 4         |
|    fps             | 188       |
|    time_elapsed    | 16        |
|    total_timesteps | 3024      |
| train/             |           |
|    actor_loss      | -88.2     |
|    critic_loss     | 22.3      |
|    learning_rate   | 0.0025    |
|    n_updates       | 2268      |
|    reward          | 0.1039026 |
----------------------------------
----------------------------------
| time/              |           |
|    episodes        | 8         |
|    fps             | 196       |
|    time_elapsed    | 30        |
|    total_timesteps | 6048      |
| train/             |           |
|    actor_loss      | -51       |
|    critic_loss     | 19.8      |
|    learning_rate   | 0.0025    |
|    n_updates       | 5292      |
|    reward          | 0.1039026 |
----------------------------------
day: 755, episode: 10
begin_tota

In [34]:
config.TENSORBOARD_LOG_DIR


### Model 3: PPO

In [75]:
agent = DRLAgent(env = env_train)
PPO_PARAMS = config.PPO_PARAMS
model_ppo = agent.get_model("ppo",model_kwargs = PPO_PARAMS, tensorboard_log= config.TENSORBOARD_LOG_DIR)

{'n_steps': 2048, 'ent_coef': 0.01, 'learning_rate': 0.00025, 'batch_size': 64}
Using cuda device


In [76]:
model_ppo.device

device(type='cuda')

In [77]:
model_name  = 'ppo_'
model_version = '50000'
trained_ppo = agent.train_model(model=model_ppo, 
                             tb_log_name='ppo',
                             total_timesteps=50000)
trained_ppo.save(os.path.join(config.TRAINED_MODEL_DIR, model_name + ".pth"))

Logging to MARKETS/ForexMarket/TENSORBOARD_LOG_DIR/ppo_1
day: 521, episode: 120
begin_total_asset: 1000000.00
end_total_asset: 999930.23
total_reward: -69.77
total_cost: 28.03
total_trades: 477
Sharpe: -1.430
--------------------------------------
| time/              |               |
|    fps             | 125           |
|    iterations      | 1             |
|    time_elapsed    | 16            |
|    total_timesteps | 2048          |
| train/             |               |
|    reward          | -0.0018495668 |
--------------------------------------
--------------------------------------------
| time/                   |                |
|    fps                  | 109            |
|    iterations           | 2              |
|    time_elapsed         | 37             |
|    total_timesteps      | 4096           |
| train/                  |                |
|    approx_kl            | 0.00394472     |
|    clip_fraction        | 0.03           |
|    clip_range           | 0.2    

### Model 4: TD3

In [25]:
agent = DRLAgent(env = env_train)
TD3_PARAMS = {"batch_size": 100, 
              "buffer_size": 1000000, 
              "learning_rate": 0.001}

model_td3 = agent.get_model("td3",model_kwargs = TD3_PARAMS)

{'batch_size': 100, 'buffer_size': 1000000, 'learning_rate': 0.001}
Using cuda device


In [26]:
model_name ='td3_'
trained_td3 = agent.train_model(model=model_td3, 
                             tb_log_name='td3',
                             total_timesteps=30000)
trained_td3.save(os.path.join(config.TRAINED_MODEL_DIR, model_name + ".pth"))

-------------------------------------
| time/              |              |
|    episodes        | 4            |
|    fps             | 199          |
|    time_elapsed    | 15           |
|    total_timesteps | 3024         |
| train/             |              |
|    actor_loss      | -422         |
|    critic_loss     | 4.47e+03     |
|    learning_rate   | 0.001        |
|    n_updates       | 2268         |
|    reward          | -0.055712607 |
-------------------------------------
-------------------------------------
| time/              |              |
|    episodes        | 8            |
|    fps             | 195          |
|    time_elapsed    | 30           |
|    total_timesteps | 6048         |
| train/             |              |
|    actor_loss      | -317         |
|    critic_loss     | 1.52e+05     |
|    learning_rate   | 0.001        |
|    n_updates       | 5292         |
|    reward          | -0.055712607 |
-------------------------------------
day: 755, ep

### Model 5: SAC

In [None]:
agent = DRLAgent(env = env_train)
SAC_PARAMS = {
    "batch_size": 128,
    "buffer_size": 1000000,
    "learning_rate": 0.0001,
    "learning_starts": 100,
    "ent_coef": "auto_0.1",
}

model_sac = agent.get_model("sac",model_kwargs = SAC_PARAMS)

{'batch_size': 128, 'buffer_size': 1000000, 'learning_rate': 0.0001, 'learning_starts': 100, 'ent_coef': 'auto_0.1'}
Using cpu device


In [None]:
model_name = 'sac_'
trained_sac = agent.train_model(model=model_sac, 
                             tb_log_name='sac',
                             total_timesteps=60000)
trained_sac.save(os.path.join(config.TRAINED_MODEL_DIR, model_name + ".pth"))

----------------------------------
| time/              |           |
|    episodes        | 4         |
|    fps             | 23        |
|    time_elapsed    | 63        |
|    total_timesteps | 1508      |
| train/             |           |
|    actor_loss      | 1.22e+03  |
|    critic_loss     | 7.89e+03  |
|    ent_coef        | 0.115     |
|    ent_coef_loss   | 1.08e+03  |
|    learning_rate   | 0.0001    |
|    n_updates       | 1407      |
|    reward          | 4.9791036 |
----------------------------------
----------------------------------
| time/              |           |
|    episodes        | 8         |
|    fps             | 19        |
|    time_elapsed    | 156       |
|    total_timesteps | 3016      |
| train/             |           |
|    actor_loss      | 1.66e+03  |
|    critic_loss     | 8.25e+03  |
|    ent_coef        | 0.134     |
|    ent_coef_loss   | 1.01e+03  |
|    learning_rate   | 0.0001    |
|    n_updates       | 2915      |
|    reward         

KeyboardInterrupt: 

### Model 6: RecurrentPPO

In [None]:
import os
import numpy as np

from sb3_contrib import RecurrentPPO
from stable_baselines3.common.evaluation import evaluate_policy


In [74]:
model = RecurrentPPO("MlpLstmPolicy", env=env_train, verbose=1)
model.learn(50000)
model_version = '50000_iter_'
model_name = 'recurrent_ppo'

env = model.get_env()
mean_reward, std_reward = evaluate_policy(model, env, n_eval_episodes=20, warn=False)
print(mean_reward)

model.save(os.path.join(config.TRAINED_MODEL_DIR, model_version + model_name  + ".pth"))

Using cuda device
----------------------------
| time/              |     |
|    fps             | 13  |
|    iterations      | 1   |
|    time_elapsed    | 9   |
|    total_timesteps | 128 |
----------------------------
----------------------------------------
| time/                   |            |
|    fps                  | 13         |
|    iterations           | 2          |
|    time_elapsed         | 19         |
|    total_timesteps      | 256        |
| train/                  |            |
|    approx_kl            | 0.15408333 |
|    clip_fraction        | 0.392      |
|    clip_range           | 0.2        |
|    entropy_loss         | -0.863     |
|    explained_variance   | -0.28      |
|    learning_rate        | 0.0003     |
|    loss                 | 0.465      |
|    n_updates            | 10         |
|    policy_gradient_loss | 0.431      |
|    std                  | 0.998      |
|    value_loss           | 0.00468    |
----------------------------------------


# Trading
Assume that we have $1,000,000 of initial capital on 2022-01-03, the first day of the trade split. We use the trained TD3 model to trade the six forex pairs in the dataset.

### Set turbulence threshold
Set the turbulence threshold above the maximum of the in-sample turbulence data. If the current turbulence index exceeds the threshold, we assume that the current market is volatile.

In [31]:
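# Note: this filter keeps dates strictly between train_end_date and trade_end_date,
# i.e. the trade window, even though the variable names say "insample".
data_risk_indicator = processed_full[(processed_full.date<trade_end_date) & (processed_full.date> train_end_date)]
insample_risk_indicator = data_risk_indicator.drop_duplicates(subset=['date'])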

In [32]:
insample_risk_indicator.vix.describe()

count    250.000000
mean      25.639720
std        4.216336
min       16.600000
25%       22.230000
50%       25.505000
75%       28.930001
max       36.450001
Name: vix, dtype: float64

In [33]:
insample_risk_indicator.vix.quantile(0.996)

35.13528106689452

In [34]:
insample_risk_indicator.turbulence.describe()

count    250.000000
mean      10.574578
std       12.101466
min        0.451542
25%        4.143907
50%        7.096417
75%       13.162082
max       89.601229
Name: turbulence, dtype: float64

In [35]:
insample_risk_indicator.turbulence.quantile(0.996)

88.29403926590331
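
The threshold rule above can be sketched on a tiny synthetic series. The values below are made up for illustration, the column name `turbulence` follows the dataset, and `threshold = 70` is the value passed as `turbulence_threshold` in this notebook; which column the environment actually compares against depends on `risk_indicator_col`.

```python
import pandas as pd

# Synthetic risk series with one value per trading day (not real market data).
risk = pd.DataFrame({
    "date": pd.date_range("2022-01-03", periods=6, freq="B").strftime("%Y-%m-%d"),
    "turbulence": [4.0, 7.0, 13.0, 90.0, 6.0, 2.0],
})

# A high quantile gives a data-driven candidate for the threshold.
q = risk["turbulence"].quantile(0.996)

threshold = 70
risk["is_turbulent"] = risk["turbulence"] > threshold
turbulent_days = risk.loc[risk["is_turbulent"], "date"].tolist()
print(q, turbulent_days)
```

Days flagged as turbulent are the ones on which the environment would treat the market as volatile.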

### Trade

A DRL model needs to be updated periodically to take full advantage of the data; ideally we would retrain it yearly, quarterly, or monthly, and tune the parameters along the way. In this notebook I tune the parameters only once, using the in-sample data from 2019-01 to 2022-01, so some alpha decay is expected as the trading period extends.
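
One way to organize periodic retraining is to generate rolling train/trade windows and retrain on each. The generator below is a hypothetical sketch, not part of FinRL; the 36-month training window and 3-month (quarterly) trade window are assumptions chosen to match the date range used in this notebook.

```python
import pandas as pd

def rolling_windows(start, end, train_months=36, trade_months=3):
    """Yield (train_start, train_end, trade_end) date strings for periodic retraining."""
    train_start = pd.Timestamp(start)
    end = pd.Timestamp(end)
    while True:
        train_end = train_start + pd.DateOffset(months=train_months)
        trade_end = train_end + pd.DateOffset(months=trade_months)
        if trade_end > end:
            break  # not enough data left for a full trade window
        yield tuple(d.strftime("%Y-%m-%d") for d in (train_start, train_end, trade_end))
        # Slide the whole window forward by one trade period.
        train_start = train_start + pd.DateOffset(months=trade_months)

windows = list(rolling_windows("2019-01-01", "2023-01-01"))
for w in windows:
    print(w)
```

Each tuple could then be passed to `data_split` in the same way as the single train/trade split earlier in this notebook, with one model trained per window.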

Numerous hyperparameters – e.g. the learning rate, the total number of samples to train on – influence the learning process and are usually determined by testing some variations.
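
Testing variations can be as simple as enumerating a small grid of candidate settings. The sketch below only builds the candidate dictionaries; the grid values are illustrative assumptions, and no training is run here.

```python
import itertools

# Candidate values to try for each hyperparameter (illustrative only).
grid = {
    "learning_rate": [0.0001, 0.001],
    "batch_size": [64, 128],
}

# Cartesian product of all values, one dict per combination.
candidates = [dict(zip(grid, values)) for values in itertools.product(*grid.values())]
for params in candidates:
    print(params)
```

Each dictionary has the same shape as `TD3_PARAMS` or `SAC_PARAMS` above, so it could be passed as `model_kwargs` to `agent.get_model` and the resulting models compared on held-out data.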

In [36]:
#trade = data_split(processed_full, '2020-07-01','2021-10-31')
e_trade_gym = StockTradingEnv(df = trade, turbulence_threshold = 70, risk_indicator_col='vix', **env_kwargs)
# env_trade, obs_trade = e_trade_gym.get_sb_env()

In [37]:
trade.head()

Unnamed: 0,date,tic,adj close,close,high,low,open,macd,rsi_30,cci_30,dx_30,close_30_sma,close_60_sma,vix,turbulence
0,2022-01-03,AUDUSD=X,0.726818,0.726818,0.727908,0.71859,0.72685,0.001099,51.288051,104.966545,3.520604,0.716629,0.728673,16.6,6.664439
0,2022-01-03,EURUSD=X,1.137346,1.137346,1.137592,1.128541,1.137385,-0.000456,47.814245,122.620797,9.183568,1.12975,1.141925,16.6,6.664439
0,2022-01-03,GBPUSD=X,1.352228,1.352228,1.35318,1.343274,1.352228,0.00305,53.283006,154.852691,21.80004,1.332292,1.34693,16.6,6.664439
0,2022-01-03,USDCAD=X,1.26588,1.26588,1.27781,1.2644,1.26571,0.002133,48.905144,-87.188526,9.888084,1.277323,1.260846,16.6,6.664439
0,2022-01-03,USDCHF=X,0.911975,0.911975,0.91991,0.9115,0.91203,-0.00236,43.914787,-136.171788,14.666355,0.922099,0.921011,16.6,6.664439


In [38]:
model_name = 'td3_.pth'
train_model_path = os.path.join(config.TRAINED_MODEL_DIR, model_name)
# Load the saved TD3 checkpoint; the variable is named after the model it holds
trained_td3 = DRLAgent.DRL_load_from_file(model_name = 'td3' ,
    cwd=train_model_path)

Successfully load model /mnt/f/financial_projects/Deep Reinforcement Learning Approaches on Stock Prediction/FinRL/MARKETS/ForexMarket/TRAINED_MODEL_DIR/td3_.pth


In [39]:
df_account_value, df_actions = DRLAgent.DRL_prediction(
    model=trained_td3, 
    environment=e_trade_gym)



hit end!


In [40]:
import os
action_value_df = pd.merge(df_actions, df_account_value,on='date')
action_value_df.to_csv(os.path.join(config.RESULTS_DIR,'td3_actions_account_value.csv'))

In [41]:
df_actions

date,AUDUSD=X,EURUSD=X,GBPUSD=X,USDCAD=X,USDCHF=X,USDJPY=X
2022-01-03,0,100,100,100,0,0
2022-01-04,0,100,100,100,0,0
2022-01-05,0,100,100,100,0,0
2022-01-06,0,100,100,100,0,0
2022-01-07,0,100,100,100,0,0
...,...,...,...,...,...,...
2022-12-21,0,100,100,100,0,0
2022-12-22,0,100,100,100,0,0
2022-12-23,0,100,100,100,0,0
2022-12-27,0,100,100,100,0,0


### Trade with RecurrentPPO

We use RecurrentPPO (PPO with an LSTM policy) as an alternative to FinRL's DRL agents because our environment is partially observable, so the agent needs memory of past observations when predicting.


In [42]:
#trade = data_split(processed_full, '2020-07-01','2021-10-31')
e_trade_gym = StockTradingEnv(df = trade,  turbulence_threshold = 70,risk_indicator_col='vix', **env_kwargs)
env_trade, obs_trade = e_trade_gym.get_sb_env()



In [43]:
# Rebuild the path of the RecurrentPPO checkpoint saved in the Model 6 cell
model_version = '50000_iter_'
model_name = 'recurrent_ppo'
recurrent_ppo_path = os.path.join(config.TRAINED_MODEL_DIR, model_version + model_name + ".pth")

In [56]:
# Load with the class that saved the checkpoint; a DDPG/TD3 file cannot be loaded as RecurrentPPO
model = RecurrentPPO.load(recurrent_ppo_path)

In [31]:
account_memory = []
actions_memory = []
lstm_states = None  # LSTM hidden state; None lets predict() initialize it
num_envs = 1
episode_starts = np.ones((num_envs,), dtype=bool)  # marks the first step of an episode
obs_trade = env_trade.reset()
n_steps = len(e_trade_gym.df.index.unique())
for i in range(n_steps):
    action, lstm_states = model.predict(obs_trade, state=lstm_states, episode_start=episode_starts, deterministic=True)
    obs_trade, rewards, dones, info = env_trade.step(action)
    episode_starts = dones
    env_trade.render()
    if i == n_steps - 2:
        # Snapshot the memories one step before the last day, before the episode ends
        account_memory = env_trade.env_method(method_name="save_asset_memory")
        actions_memory = env_trade.env_method(method_name="save_action_memory")
    if dones[0]:
        print(i)
        print("hit end!")


105
hit end!


In [52]:
df_account_value = pd.DataFrame(account_memory[0])
df_actions_memory = pd.DataFrame(actions_memory[0])

NameError: name 'account_memory' is not defined

In [33]:
# Use {} placeholders so the trade dates actually appear in the file names
df_account_value.to_csv(os.path.join(config.RESULTS_DIR,'account_value_from_{}_until_{}.csv'.format(config.TRADE_START_DATE,config.TRADE_END_DATE)))
df_actions_memory.to_csv(os.path.join(config.RESULTS_DIR,'actions_memory_from_{}_until_{}.csv'.format(config.TRADE_START_DATE,config.TRADE_END_DATE)))



Unnamed: 0,date,account_value
0,2021-04-26,1.000000e+06
1,2021-04-27,1.017711e+06
2,2021-04-28,1.014192e+06
3,2021-04-29,9.902239e+05
4,2021-04-30,1.067545e+06
...,...,...
101,2021-09-17,8.741583e+05
102,2021-09-20,7.944693e+05
103,2021-09-21,7.521426e+05
104,2021-09-22,8.052165e+05
