# RL in Finance(Test) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/sunnyswag/RL_in_Finance/blob/main/RL_in_Finance_Test.ipynb)

## 1. Clone the GitHub repository, then download and import the required packages
&emsp;&emsp;Install flow: python setup.py -> pip install -r requirements.txt

In [1]:
!pip install git+https://github.com/sunnyswag/RL_in_Finance.git
!pip install git+https://github.com/quantopian/pyfolio.git

Collecting git+https://github.com/sunnyswag/RL_in_Finance.git
  Cloning https://github.com/sunnyswag/RL_in_Finance.git to /tmp/pip-req-build-kj86soz5
  Running command git clone -q https://github.com/sunnyswag/RL_in_Finance.git /tmp/pip-req-build-kj86soz5
Collecting stockstats
  Downloading https://files.pythonhosted.org/packages/32/41/d3828c5bc0a262cb3112a4024108a3b019c183fa3b3078bff34bf25abf91/stockstats-0.3.2-py2.py3-none-any.whl
Collecting tushare
  Downloading https://files.pythonhosted.org/packages/17/76/dc6784a1c07ec040e748c8e552a92e8f4bdc9f3e0e42c65699efcfee032b/tushare-1.2.62-py3-none-any.whl (214kB)
Collecting pyfolio
  Downloading https://files.pythonhosted.org/packages/28/b4/99799b743c4619752f88b70354924132a2e9b82f4656fe7c55eaa9101392/pyfolio-0.9.2.tar.gz (91kB)
Collecting stable-baselines3[extra]
  Downloading https://files.pythonhosted

In [2]:
import pandas as pd
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
import datetime
import time

%matplotlib inline
from utils import config
from utils.pull_data import Pull_data
from utils.preprocessors import FeatureEngineer, split_data
from utils.env import Stock_Trading_Env
from utils.models import DRL_Agent
from utils.backtest import backtest_stats, backtest_plot, get_baseline
import itertools
import sys
sys.path.append("../RL_in_Finance")

  'Module "zipline.assets" not found; multipliers will not be applied'


## 2. Download the data

Data source: Tushare API<br>
Datasets currently used: SSE_50 and CSI_300<br>
Data size: shape [2892 * n, 8], where n is the number of stocks

In [3]:
stock_list = config.SSE_50[:2]
df = Pull_data(stock_list, save_data=False).pull_data()

--- Download started ----
--- Download finished ----
DataFrame size:  (5784, 8)


In [4]:
df.sort_values(['date', 'tic'], ignore_index=True).head()

| | date | tic | open | high | low | close | volume | day |
|---|---|---|---|---|---|---|---|---|
| 0 | 2009-01-05 | 600000.SH | 2.7584 | 2.8115 | 2.7258 | 2.8013 | 503142.56 | 0 |
| 1 | 2009-01-05 | 600009.SH | 9.4665 | 9.6505 | 9.4414 | 9.5836 | 52100.33 | 0 |
| 2 | 2009-01-06 | 600000.SH | 2.8422 | 2.981 | 2.8422 | 2.9565 | 958496.0 | 1 |
| 3 | 2009-01-06 | 600009.SH | 9.525 | 10.1021 | 9.4999 | 10.077 | 104182.13 | 1 |
| 4 | 2009-01-07 | 600000.SH | 2.9565 | 2.9769 | 2.9034 | 2.9177 | 618938.77 | 2 |


In [5]:
print("Data download period: {} to {}".format(config.Start_Date, config.End_Date))

Data download period: 20090101 to 20210101


In [6]:
print("Downloaded stock list: ")
print(stock_list)

Downloaded stock list: 
['600000.SH', '600009.SH']


## 3. Data preprocessing

In [7]:
processed_df = FeatureEngineer(use_technical_indicator=True).preprocess_data(df)

Technical indicators added successfully
Zeroed all rows for companies not yet listed in the current period


In [8]:
print("Technical indicator list: ")
print(config.TECHNICAL_INDICATORS_LIST)
print("Number of technical indicators: {}".format(len(config.TECHNICAL_INDICATORS_LIST)))

Technical indicator list: 
['macd', 'boll_ub', 'boll_lb', 'rsi_30', 'cci_30', 'dx_30', 'close_20_sma', 'close_60_sma', 'close_120_sma']
Number of technical indicators: 9


In [9]:
processed_df.head()

| | date | tic | open | high | low | close | volume | day | macd | boll_ub | boll_lb | rsi_30 | cci_30 | dx_30 | close_20_sma | close_60_sma | close_120_sma |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2009-01-05 | 600000.SH | 2.7584 | 2.8115 | 2.7258 | 2.8013 | 503142.56 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.8013 | 2.8013 | 2.8013 |
| 1 | 2009-01-05 | 600009.SH | 9.4665 | 9.6505 | 9.4414 | 9.5836 | 52100.33 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 9.5836 | 9.5836 | 9.5836 |
| 2 | 2009-01-06 | 600000.SH | 2.8422 | 2.981 | 2.8422 | 2.9565 | 958496.0 | 1.0 | 0.003482 | 3.098386 | 2.659414 | 100.0 | 66.666667 | 100.0 | 2.8789 | 2.8789 | 2.8789 |
| 3 | 2009-01-06 | 600009.SH | 9.525 | 10.1021 | 9.4999 | 10.077 | 104182.13 | 1.0 | 0.01107 | 10.528073 | 9.132527 | 100.0 | 66.666667 | 100.0 | 9.8303 | 9.8303 | 9.8303 |
| 4 | 2009-01-07 | 600000.SH | 2.9565 | 2.9769 | 2.9034 | 2.9177 | 618938.77 | 2.0 | 0.003234 | 3.053371 | 2.730296 | 79.452055 | 53.048306 | 100.0 | 2.891833 | 2.891833 | 2.891833 |
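
The `close_20_sma`, `boll_ub`, and `boll_lb` values for 600000.SH in the rows above can be reproduced with plain pandas. This is only a sketch of the arithmetic, not the code `FeatureEngineer` runs: it assumes that windows shorter than 20 rows are averaged over the rows available so far, and that the Bollinger bands sit two sample standard deviations around that average.

```python
import pandas as pd

# Closing prices of 600000.SH on the first three trading days (taken from the table).
close = pd.Series([2.8013, 2.9565, 2.9177], name="close")

# Assumption: a 20-day window that falls back to the rows available so far.
window = close.rolling(window=20, min_periods=1)
sma_20 = window.mean()
std_20 = window.std()  # sample standard deviation; NaN while only one row exists

# Assumption: bands at +/- 2 standard deviations around the moving average.
boll_ub = sma_20 + 2 * std_20
boll_lb = sma_20 - 2 * std_20

summary = pd.DataFrame(
    {"close_20_sma": sma_20, "boll_ub": boll_ub, "boll_lb": boll_lb}
).round(6)
print(summary)
```

The first row has no standard deviation yet, so its bands come out as NaN here, whereas the processed table stores 0.0 in those cells.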


In [10]:
train_data = split_data(processed_df, config.Start_Trade_Date, config.End_Trade_Date)
test_data = split_data(processed_df, config.End_Trade_Date, config.End_Test_Date)

In [11]:
print("Training data range: {} to {}".format(config.Start_Trade_Date, config.End_Trade_Date))
print("Test data range: {} to {}".format(config.End_Trade_Date, config.End_Test_Date))
print("Training data length: {}, test data length: {}".format(len(train_data), len(test_data)))
print("Training set : test set = {} : {}".format(round(len(train_data)/len(test_data),1), 1))

Training data range: 2009-01-01 to 2019-01-01
Test data range: 2019-01-01 to 2021-01-01
Training data length: 4862, test data length: 974
Training set : test set = 5.0 : 1


In [12]:
train_data.head()

| | date | tic | open | high | low | close | volume | day | macd | boll_ub | boll_lb | rsi_30 | cci_30 | dx_30 | close_20_sma | close_60_sma | close_120_sma |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2009-01-05 | 600000.SH | 2.7584 | 2.8115 | 2.7258 | 2.8013 | 503142.56 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.8013 | 2.8013 | 2.8013 |
| 0 | 2009-01-05 | 600009.SH | 9.4665 | 9.6505 | 9.4414 | 9.5836 | 52100.33 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 9.5836 | 9.5836 | 9.5836 |
| 1 | 2009-01-06 | 600000.SH | 2.8422 | 2.981 | 2.8422 | 2.9565 | 958496.0 | 1.0 | 0.003482 | 3.098386 | 2.659414 | 100.0 | 66.666667 | 100.0 | 2.8789 | 2.8789 | 2.8789 |
| 1 | 2009-01-06 | 600009.SH | 9.525 | 10.1021 | 9.4999 | 10.077 | 104182.13 | 1.0 | 0.01107 | 10.528073 | 9.132527 | 100.0 | 66.666667 | 100.0 | 9.8303 | 9.8303 | 9.8303 |
| 2 | 2009-01-07 | 600000.SH | 2.9565 | 2.9769 | 2.9034 | 2.9177 | 618938.77 | 2.0 | 0.003234 | 3.053371 | 2.730296 | 79.452055 | 53.048306 | 100.0 | 2.891833 | 2.891833 | 2.891833 |
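
Note that the index of `train_data` repeats (0, 0, 1, 1, 2): every stock traded on the same day shares one index value. The sketch below shows one way a date-range split could produce that layout. `split_by_date` is a hypothetical stand-in written for illustration, not the actual `split_data` from `utils.preprocessors`, and the half-open interval `[start, end)` is an assumption.

```python
import pandas as pd

def split_by_date(df, start, end):
    """Keep rows with start <= date < end and index them by trading day."""
    # ISO-formatted date strings compare correctly as plain strings.
    mask = (df["date"] >= start) & (df["date"] < end)
    out = df.loc[mask].sort_values(["date", "tic"], ignore_index=True)
    # factorize() numbers the distinct dates 0, 1, 2, ... in order of appearance,
    # so all tickers of one trading day share the same index value.
    out.index = out["date"].factorize()[0]
    return out

# The first five rows of the processed data, reduced to three columns.
demo = pd.DataFrame({
    "date": ["2009-01-05", "2009-01-05", "2009-01-06", "2009-01-06", "2009-01-07"],
    "tic": ["600000.SH", "600009.SH", "600000.SH", "600009.SH", "600000.SH"],
    "close": [2.8013, 9.5836, 2.9565, 10.077, 2.9177],
})

train_demo = split_by_date(demo, "2009-01-01", "2009-01-07")
print(train_demo)
```

With this layout, `train_demo.loc[day]` returns all stocks for one trading day, which is convenient when an environment steps through the data one day at a time.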


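## 4. Initialize the environment

**state_space consists of four parts:** <br>
1. Current cash balance
2. Current closing price of each stock
3. Current number of shares held for each stock
4. Technical indicator values (number of stocks * number of technical indicators)<br>

&emsp;&emsp;TODO: add trading volume to the state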
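
As a rough illustration of the four parts listed above, the state can be pictured as one flat vector. `build_state` is a hypothetical helper, not part of `Stock_Trading_Env`. The indicator-major ordering (both stocks' `macd`, then both stocks' `boll_ub`, and so on) is an assumption based on the observation printed by the environment test in this notebook.

```python
import numpy as np

INDICATORS = ["macd", "boll_ub", "boll_lb", "rsi_30", "cci_30", "dx_30",
              "close_20_sma", "close_60_sma", "close_120_sma"]

def build_state(cash, closes, holdings, indicators):
    """Concatenate cash, prices, holdings, and indicator values into one vector."""
    parts = [[cash], closes, holdings]
    # Assumed ordering: one block per indicator, each holding one value per stock.
    parts += [indicators[name] for name in INDICATORS]
    return np.concatenate([np.asarray(p, dtype=float) for p in parts])

n_stocks = 2
# Placeholder indicator values (zeros), just to show the layout.
indicators = {name: np.zeros(n_stocks) for name in INDICATORS}
state = build_state(1e6, [2.8013, 9.5836], [0, 0], indicators)
print(state.shape)  # (23,)
```

The length is 1 + 2 * 2 + 9 * 2 = 23 for two stocks and nine indicators.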

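**How the reward is computed:**<br>
* reward: total assets before trading minus total assets after the day's trading, which equals the transaction fees paid that day
* TODO: to be improved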
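
The following worked example shows why the change in total assets within a single day equals the transaction fee: prices are held fixed during the trade, so the only money that leaves the portfolio is the commission. The `buy` helper and the price are made up for illustration. The sign convention (reward = asset change multiplied by `reward_scaling`) is an assumption, chosen because the environment test in this notebook reports a total reward of about minus `cost` times `reward_scaling`.

```python
def buy(cash, shares_held, price, shares, buy_cost_pct=0.001):
    """Buy `shares` at `price`, paying a proportional fee on the trade value."""
    fee = price * shares * buy_cost_pct
    return cash - price * shares - fee, shares_held + shares, fee

price = 10.0                       # illustrative price, unchanged during the trade
cash, shares_held = 1_000_000.0, 0

assets_before = cash + shares_held * price
cash, shares_held, fee = buy(cash, shares_held, price, shares=100)
assets_after = cash + shares_held * price

reward_scaling = 1e-4
reward = (assets_after - assets_before) * reward_scaling   # assumed sign convention

print(assets_before - assets_after, fee, reward)
```

Under this definition the agent is only ever penalised for trading, which may explain why a policy can settle on never trading at all.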

**action_space:**<br>
  * actions ∈ [-100, 100]
  * A positive value means buy, a negative value means sell, and 0 means no trade
  * The absolute value is the number of shares bought or sold
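
One plausible way to arrive at share counts in [-100, 100] is sketched below. It assumes the policy emits raw values in [-1, 1] that are scaled by `hmax`, and that sell orders are capped at the current holdings. Both are assumptions and `to_share_counts` is a hypothetical helper; the real environment may handle these steps differently.

```python
import numpy as np

def to_share_counts(raw_actions, holdings, hmax=100):
    """Map raw policy outputs to signed share counts (buy > 0, sell < 0)."""
    # Assumption: raw actions live in [-1, 1] and are scaled by hmax.
    scaled = (np.clip(raw_actions, -1.0, 1.0) * hmax).astype(int)
    # Assumption: a sell order cannot exceed the shares currently held.
    return np.maximum(scaled, -np.asarray(holdings))

orders = to_share_counts([0.5, -1.2, 0.0], holdings=[0, 30, 10])
print(orders)  # [ 50 -30   0]
```

Here the first stock is bought (50 shares), the second sell order is reduced from 100 to the 30 shares held, and the third stock is left untouched.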

In [13]:
stock_dimension = len(df.tic.unique()) # 2
state_space = 1 + 2*stock_dimension + \
    len(config.TECHNICAL_INDICATORS_LIST)*stock_dimension # 23
print("stock_dimension: {}, state_space: {}".format(stock_dimension, state_space))

stock_dimension: 2, state_space: 23


In [14]:
# Environment initialization parameters
env_kwargs = {
    "stock_dim": stock_dimension, 
    "hmax": 100, 
    "initial_amount": 1e6, 
    "buy_cost_pct": 0.001,
    "sell_cost_pct": 0.001,
    "reward_scaling": 1e-4,
    "state_space": state_space, 
    "action_space": stock_dimension, 
    "tech_indicator_list": config.TECHNICAL_INDICATORS_LIST
}
e_train_gym = Stock_Trading_Env(df = train_data, **env_kwargs)

In [15]:
# Smoke-test the environment with random actions
observation = e_train_gym.reset()  # reset the environment; observation is the initial state
count = 0
total_reward = 0
for t in range(10):
    action = e_train_gym.action_space.sample()  # sample a random action
    observation, reward, done, info = e_train_gym.step(action)  # step the environment to get the next state
    total_reward += reward
    if done:
        break
    count += 1
    time.sleep(0.2)  # wait 0.2 s between steps
print("observation: ", observation)
print("e_train_gym.cost: ", e_train_gym.cost)
print("reward: {}, done: {}".format(total_reward, done))

observation:  [999373.4166775, 3.2587, 10.5955, 218, 0, 0.03743618918699809, 0.06953890696683374, 3.29030605610415, 11.158052250161413, 2.6616575802594857, 9.567620477111317, 76.03339630703236, 68.40717977033304, 174.4215725060299, 78.36201325770573, 68.45426624783556, 67.51593864520234, 2.975981818181818, 10.362836363636365, 2.975981818181818, 10.362836363636365, 2.975981818181818, 10.362836363636365]
e_train_gym.cost:  3.1954224999999994
reward: -0.0003195422499789856, done: False


In [16]:
# Create the vectorized environment used for training
env_train, _ = e_train_gym.get_sb_env()
print(type(env_train))

<class 'stable_baselines3.common.vec_env.dummy_vec_env.DummyVecEnv'>


## 5、开始训练

Framework used: Stable-Baselines3 (`stable_baselines3`)

In [17]:
agent = DRL_Agent(env = env_train)
model_sac = agent.get_model("sac", model_kwargs = config.SAC_PARAMS)

{'batch_size': 64, 'buffer_size': 100000, 'learning_rate': 0.0001, 'learning_starts': 100, 'ent_coef': 'auto_0.1'}
Using cpu device


In [18]:
train_sac = agent.train_model(
    model = model_sac,
    tb_log_name='sac',
    total_timesteps=20000
)

Logging to tensorboard_log/sac/sac_1
----------------------------------
| environment/        |          |
|    portfolio_value  | 1e+06    |
|    total_cost       | 0        |
|    total_reward     | 0        |
|    total_reward_pct | 0        |
|    total_trades     | 0        |
| time/               |          |
|    episodes         | 4        |
|    fps              | 55       |
|    time_elapsed     | 176      |
|    total timesteps  | 9724     |
| train/              |          |
|    actor_loss       | 5.03e+03 |
|    critic_loss      | 7.99e+04 |
|    ent_coef         | 0.253    |
|    ent_coef_loss    | 56.7     |
|    learning_rate    | 0.0001   |
|    n_updates        | 9623     |
----------------------------------
Days: 2430, episode: 10
Total assets at start: 1000000.0
Total assets at end: 1000000.0
Total reward: 0.0
Total transaction fees: 0
Total number of trades: 0
----------------------------------
| environment/        |          |
|    portfolio_value  | 1e+06    |
|    total_cost       | 0        |
|    total_reward     | 0        

## 6. Testing

In [19]:
e_test_gym = Stock_Trading_Env(df = test_data, **env_kwargs)
df_account_value, df_actions = DRL_Agent.DRL_prediction(
    model=model_sac, 
    environment = e_test_gym)

Backtest complete!


In [20]:
print("Backtest window: {} to {}".format(config.End_Trade_Date, config.End_Test_Date))

Backtest window: 2019-01-01 to 2021-01-01


In [21]:
print("Daily account value")
df_account_value.tail()
df_account_value.to_csv("df_account_value.csv", index=False)

Daily account value


In [22]:
print("Trades made each day")
df_actions.tail()

Trades made each day


| date | 600000.SH | 600009.SH |
|---|---|---|
| 2020-12-23 | 0 | 0 |
| 2020-12-24 | 0 | 0 |
| 2020-12-25 | 0 | 0 |
| 2020-12-28 | 0 | 0 |
| 2020-12-29 | 0 | 0 |


## 7. Backtesting

In [23]:
print("---------------------Backtest results---------------------")
pref_stats_all = backtest_stats(account_value=df_account_value)

# perf_stats_all = pd.DataFrame(perf_stats_all)
# now = datetime.datetime.now().strftime('%Y%m%d-%Hh%M')
# perf_stats_all.to_csv("./"+config.RESULTS_DIR+"/perf_stats_all_"+now+'.csv')

---------------------Backtest results---------------------
Annual return          0.0
Cumulative returns     0.0
Annual volatility      0.0
Sharpe ratio           NaN
Calmar ratio           NaN
Stability              0.0
Max drawdown           0.0
Omega ratio            NaN
Sortino ratio          NaN
Skew                   NaN
Kurtosis               NaN
Tail ratio             NaN
Daily value at risk    0.0
dtype: float64
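
The NaN entries above are what one would expect from an account value that never changes: every daily return is zero, so ratio metrics such as the Sharpe ratio reduce to 0 / 0. The sketch below reproduces this with NumPy. `simple_stats` is a hypothetical helper using a common textbook definition (252 trading days, zero risk-free rate), which is an assumption and not necessarily the exact formula used by `backtest_stats`.

```python
import numpy as np

def simple_stats(values, periods_per_year=252):
    """Annualized volatility and Sharpe ratio from a series of account values."""
    values = np.asarray(values, dtype=float)
    returns = values[1:] / values[:-1] - 1.0
    std = returns.std(ddof=1)
    # 0 / 0 yields NaN; suppress the RuntimeWarning NumPy would otherwise emit.
    with np.errstate(divide="ignore", invalid="ignore"):
        sharpe = returns.mean() / std * np.sqrt(periods_per_year)
    return {"annual_volatility": std * np.sqrt(periods_per_year), "sharpe": sharpe}

flat = simple_stats([1e6] * 5)                    # the agent never trades
moving = simple_stats([100.0, 101.0, 100.0, 102.0])  # made-up values for contrast

print(flat)
print(moving)
```

A flat curve therefore says nothing about risk-adjusted performance; it only shows that the trained policy made no trades during the test window.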




In [24]:
# Compute the baseline results
print("---------------------Baseline results---------------------")
baseline_df = get_baseline(config.SSE_50_INDEX, 
              start="20190101",
              end="20210101")
baseline_stats = backtest_stats(baseline_df, value_col_name='close')

---------------------Baseline results---------------------
--- Download started ----
--- Download finished ----
DataFrame size:  (487, 8)
Annual return          0.271107
Cumulative returns     0.589776
Annual volatility      0.189096
Sharpe ratio           1.366667
Calmar ratio           1.487275
Stability              0.618053
Max drawdown          -0.182284
Omega ratio            1.286287
Sortino ratio          1.953108
Skew                        NaN
Kurtosis                    NaN
Tail ratio             1.078570
Daily value at risk   -0.022798
dtype: float64
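
For readers unfamiliar with these metrics, the sketch below computes cumulative return and max drawdown from a short made-up price series, then checks that the Calmar ratio reported above is consistent with annual return divided by the absolute max drawdown. The helper functions are illustrative and use common definitions, which may differ in detail from what `backtest_stats` computes internally.

```python
import numpy as np

def cumulative_return(values):
    """Total return from the first to the last value."""
    values = np.asarray(values, dtype=float)
    return values[-1] / values[0] - 1.0

def max_drawdown(values):
    """Largest peak-to-trough decline, as a negative fraction."""
    values = np.asarray(values, dtype=float)
    running_peak = np.maximum.accumulate(values)
    return (values / running_peak - 1.0).min()

prices = [100.0, 120.0, 90.0, 110.0]   # made-up series for illustration
print(cumulative_return(prices))        # about 0.10
print(max_drawdown(prices))             # -0.25 (from the peak at 120 down to 90)

# Consistency check on the baseline figures reported above.
annual_return, reported_mdd = 0.271107, -0.182284
calmar = annual_return / abs(reported_mdd)
print(round(calmar, 4))                 # close to the reported 1.487275
```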


In [25]:
pref_stats_all.head()

Annual return         0.0
Cumulative returns    0.0
Annual volatility     0.0
Sharpe ratio          NaN
Calmar ratio          NaN
dtype: float64

In [26]:
# Drop the duplicated row from df_account_value
df_account_value.drop(df_account_value.index[1], inplace=True)

In [27]:
baseline_df.head(10)

| | date | tic | open | high | low | close | volume | day |
|---|---|---|---|---|---|---|---|---|
| 0 | 2019-01-02 | 000016.SH | 2262.7908 | 2298.1805 | 2301.0552 | 2252.7479 | 20880697.0 | 2 |
| 1 | 2019-01-03 | 000016.SH | 2269.243 | 2259.4825 | 2287.7778 | 2253.9433 | 18895240.0 | 3 |
| 2 | 2019-01-04 | 000016.SH | 2314.6466 | 2252.7449 | 2316.3528 | 2249.3658 | 25900596.0 | 4 |
| 3 | 2019-01-07 | 000016.SH | 2314.3193 | 2329.0316 | 2331.6031 | 2306.8979 | 25278948.0 | 0 |
| 4 | 2019-01-08 | 000016.SH | 2305.1708 | 2312.1705 | 2312.1705 | 2298.9548 | 18131160.0 | 1 |
| 5 | 2019-01-09 | 000016.SH | 2332.7192 | 2320.9119 | 2360.3601 | 2318.4352 | 28747596.0 | 2 |
| 6 | 2019-01-10 | 000016.SH | 2331.8507 | 2333.2162 | 2345.3313 | 2321.3049 | 22280507.0 | 3 |
| 7 | 2019-01-11 | 000016.SH | 2354.4987 | 2342.0236 | 2360.0609 | 2334.9145 | 18417693.0 | 4 |
| 8 | 2019-01-14 | 000016.SH | 2331.1358 | 2350.256 | 2354.3082 | 2330.0042 | 16462252.0 | 0 |
| 9 | 2019-01-15 | 000016.SH | 2378.3696 | 2337.7021 | 2380.5995 | 2332.4095 | 22466336.0 | 1 |


In [28]:
type(df_account_value)

pandas.core.frame.DataFrame

In [29]:
print("---------------------Plot---------------------")
print("Comparing against the {} index".format(config.SSE_50_INDEX[0]))
backtest_plot(df_account_value,
        baseline_start="20190101",
        baseline_end="20210101",
        baseline_ticker=config.SSE_50_INDEX,
      )

---------------------Plot---------------------
Comparing against the 000016.SH index
--- Download started ----
--- Download finished ----
DataFrame size:  (487, 8)




| | Backtest |
|---|---|
| Start date | 2019-01-02 |
| End date | 2020-12-30 |
| Total months | 23 |
| Annual return | 0.0% |
| Cumulative returns | 0.0% |
| Annual volatility | 0.0% |
| Sharpe ratio | |
| Calmar ratio | |
| Stability | 0.00 |
| Max drawdown | 0.0% |
| Omega ratio | |
| Sortino ratio | |
| Skew | |


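| Worst drawdown periods | Net drawdown in % | Peak date | Valley date | Recovery date | Duration |
|---|---|---|---|---|---|
| 0 | 0.0 | 2019-01-02 | 2019-01-02 | 2019-01-02 | 1.0 |
| 1 | | NaT | NaT | NaT | |
| 2 | | NaT | NaT | NaT | |
| 3 | | NaT | NaT | NaT | |
| 4 | | NaT | NaT | NaT | |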


