# Experiment Overview

## Obtaining Historical Price Data

- [Binance API](https://binance-docs.github.io/apidocs/futures/cn/#185368440e)

## Coins

- BTC
- ETH
- BNB
- SOL
- BUSD (stablecoin, added to the portfolio only when the hedging mechanism is enabled)

## Data Split

- Training period: 2020/10/17 ~ 2021/10/17
- Test period: 2021/10/18 ~ 2023/10/25
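
The split above is purely date-based. A minimal sketch of such a split follows; the `split_by_date` helper and the sample rows are hypothetical and are not part of the notebook's `data` module, and treating both ends of each period as inclusive is an assumption.

```python
from datetime import date

TRAIN_START, TRAIN_END = date(2020, 10, 17), date(2021, 10, 17)
TRADE_START, TRADE_END = date(2021, 10, 18), date(2023, 10, 25)

def split_by_date(rows):
    """Split (date, value) rows into train and trade lists (both ends inclusive)."""
    train = [r for r in rows if TRAIN_START <= r[0] <= TRAIN_END]
    trade = [r for r in rows if TRADE_START <= r[0] <= TRADE_END]
    return train, trade

# Hypothetical rows for illustration only (not real prices)
rows = [
    (date(2021, 10, 16), 1.0),
    (date(2021, 10, 17), 2.0),
    (date(2021, 10, 18), 3.0),
]
train, trade = split_by_date(rows)
train_days = (TRAIN_END - TRAIN_START).days
trade_days = (TRADE_END - TRADE_START).days
```

Because the test period starts the day after the training period ends, no row can fall into both sets.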

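## Hedging Mechanism

#### Rationale
- As the oldest cryptocurrency, Bitcoin tends to drive the rest of the market. When Bitcoin rallies, smaller coins often rise with it, sometimes by even more; when Bitcoin crashes, smaller coins usually crash as well. Bitcoin therefore behaves like the broad-market index of the crypto market.

#### How It Works
- Exploiting this index-like behavior, a momentum indicator is computed on Bitcoin to judge whether current market sentiment is optimistic (favorable for long positions) or pessimistic (unfavorable for long positions).

- When Bitcoin momentum < 0, the market is judged unfavorable for long positions: all coins are sold and converted into a stablecoin to sit out the risk of further declines.
- When Bitcoin momentum > 0, the market is judged favorable for long positions: the stablecoin is converted back into the individual coins to capture sharp upward moves.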
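
The rule above can be sketched as follows. The text does not say which momentum indicator is used, so this sketch assumes a simple rate of change over a fixed lookback; the function names, the lookback, and the prices are all hypothetical, and treating missing or exactly-zero momentum as risk-off is an additional assumption.

```python
def momentum(prices, lookback):
    """Rate of change over `lookback` periods; None until enough history exists."""
    return [
        None if i < lookback else prices[i] / prices[i - lookback] - 1.0
        for i in range(len(prices))
    ]

def hedge_weights(btc_momentum, risky_weights, stable="BUSD"):
    """Return portfolio weights after applying the hedging rule.

    Assumption: missing or zero momentum is treated as risk-off; the text
    only specifies the cases momentum < 0 and momentum > 0.
    """
    if btc_momentum is None or btc_momentum <= 0:
        # Risk-off: sell every coin and hold only the stablecoin
        return {**{k: 0.0 for k in risky_weights}, stable: 1.0}
    # Risk-on: keep the coin weights and hold no stablecoin
    return {**risky_weights, stable: 0.0}

btc = [100.0, 110.0, 105.0, 90.0, 120.0]  # hypothetical BTC prices
mom = momentum(btc, lookback=2)
weights = {"BTC": 0.4, "ETH": 0.3, "BNB": 0.2, "SOL": 0.1}
risk_off = hedge_weights(mom[3], weights)  # momentum is negative here
risk_on = hedge_weights(mom[4], weights)   # momentum is positive here
```

In the notebook itself the agent chooses the coin weights; the sketch only shows how a sign test on Bitcoin momentum could override them.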

## Features
- [A large set of technical indicators](https://github.com/bukosabino/ta)
- [SuperTrend](https://tw.tradingview.com/scripts/supertrend/) indicator
- [Catch22](https://github.com/DynamicsAndNeuralSystems/pycatch22) time-series features
- Average past drawdown
- OHLC (open, high, low, close) features from Microsoft [QLib](https://github.com/microsoft/qlib)

## Model
- Reinforcement learning algorithm: [A2C](https://github.com/openai/baselines/tree/master/baselines/a2c)
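
A2C (advantage actor-critic) updates its policy from short rollouts: it computes discounted returns bootstrapped from the critic's value estimate, then subtracts the critic's prediction to obtain advantages. The sketch below illustrates only that calculation in plain Python. It is not the stable-baselines3 implementation, the function names are hypothetical, and it ignores episode boundaries for brevity.

```python
def n_step_returns(rewards, last_value, gamma=0.99):
    """Discounted returns for one rollout, bootstrapped from the critic's
    value estimate of the state that follows the final step."""
    returns, running = [], last_value
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    return returns[::-1]

def advantages(rewards, values, last_value, gamma=0.99):
    """Advantage = bootstrapped return minus the critic's value estimate."""
    rets = n_step_returns(rewards, last_value, gamma)
    return [g - v for g, v in zip(rets, values)]

# Hypothetical 3-step rollout (the notebook sets n_steps to 5 for A2C)
rewards = [1.0, 0.0, 2.0]
values = [2.0, 3.0, 6.0]   # critic estimates for the visited states
rets = n_step_returns(rewards, last_value=10.0, gamma=0.5)
adv = advantages(rewards, values, last_value=10.0, gamma=0.5)
```

A positive advantage increases the probability of the action that produced it, and a negative one decreases it.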

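## Backtest Analysis

- The benchmark is a portfolio that holds 100% Bitcoin.

#### Key Metrics
- Alpha: excess return relative to the market (Bitcoin). Higher is better; it measures earning power without accounting for risk.
- Sharpe Ratio: return earned per unit of risk taken. Higher is better; it measures risk-adjusted earning power.
- CAGR: compound annual growth rate, the average percentage growth per year. Higher is better.
- Calmar Ratio: CAGR / |Max Drawdown|. Higher is better; it measures earning power relative to the worst-case loss.
- Mean Drawdown: the average percentage decline from a prior peak over the period. A smaller magnitude is better; it reflects resilience to declines.
- Max Drawdown: the largest percentage decline from a prior peak. A smaller magnitude is better.
- Prob of Losing Money: the share of the period spent at a loss. Lower is better; it reflects how long the portfolio stays in the red.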
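
To make these definitions concrete, here is a minimal sketch that computes CAGR, max drawdown, the Calmar ratio, and the Sharpe ratio from an equity curve. It is not the code used by the `backtesting` module: the function names, the sample equity curve, the zero risk-free rate, and the annualization convention are all assumptions.

```python
import math

def cagr(equity, periods_per_year):
    """Compound annual growth rate of an equity curve."""
    years = (len(equity) - 1) / periods_per_year
    return (equity[-1] / equity[0]) ** (1.0 / years) - 1.0

def drawdowns(equity):
    """Decline from the running peak at every step (values are <= 0)."""
    peak, out = equity[0], []
    for v in equity:
        peak = max(peak, v)
        out.append(v / peak - 1.0)
    return out

def max_drawdown(equity):
    """Most negative drawdown over the whole curve."""
    return min(drawdowns(equity))

def calmar(equity, periods_per_year):
    """CAGR divided by the magnitude of the max drawdown."""
    return cagr(equity, periods_per_year) / abs(max_drawdown(equity))

def sharpe(returns, periods_per_year, risk_free=0.0):
    """Annualized mean excess return divided by its sample standard deviation."""
    excess = [r - risk_free / periods_per_year for r in returns]
    mean = sum(excess) / len(excess)
    var = sum((r - mean) ** 2 for r in excess) / (len(excess) - 1)
    return mean / math.sqrt(var) * math.sqrt(periods_per_year)

# Hypothetical quarterly equity curve covering exactly one year
equity = [100.0, 120.0, 90.0, 108.0, 135.0]
returns = [b / a - 1.0 for a, b in zip(equity, equity[1:])]

growth = cagr(equity, periods_per_year=4)
worst = max_drawdown(equity)
ratio = calmar(equity, periods_per_year=4)
risk_adjusted = sharpe(returns, periods_per_year=4)
```

Applying the Calmar definition to the figures in the performance report gives 19.88% / 41.86% ≈ 0.47, close to the reported 0.48; the small gap is likely due to rounding of the inputs.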

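#### From a Return Perspective
- In the final months of 2021, during a brief bull market, the model's return exceeded that of simply holding Bitcoin (16.23% vs. -22.02% in the monthly return tables).
- In 2023 Bitcoin rebounded from its lows and went through several bull runs, so the model underperformed simply holding Bitcoin (67.49% vs. 109.40%).
- Over 2021/10/18 ~ 2023/10/25, the model returned about 44%, beating the -43% from simply holding Bitcoin.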
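
The cumulative figures above can be cross-checked against the CAGR values in the performance report (19.88% for the agent, -24.55% for Bitcoin). The sketch below compounds each CAGR over the test period; the 365-day year is an assumption, and the backtesting library may annualize differently.

```python
from datetime import date

def total_return_from_cagr(cagr, start, end):
    """Compound an annual growth rate over the span between two dates."""
    years = (end - start).days / 365.0  # assumed day-count convention
    return (1.0 + cagr) ** years - 1.0

start, end = date(2021, 10, 18), date(2023, 10, 25)
agent_total = total_return_from_cagr(0.1988, start, end)
btc_total = total_return_from_cagr(-0.2455, start, end)
print(f"agent: {agent_total:.1%}, BTC: {btc_total:.1%}")
```

Under these assumptions the results land at roughly 44% and -43%, consistent with the cumulative returns quoted above.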

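#### From a Risk Perspective
- During the 2022 bear market, the hedging mechanism avoided several sharp sell-offs, so the decline was far smaller than that of simply holding Bitcoin (-25.97% vs. -65.28% for the year).
- Max drawdown: -41.86%, versus -76.70% for simply holding Bitcoin.
- Mean drawdown: -17.19%, versus -55.01% for simply holding Bitcoin.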

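## Conclusion

- Holding cryptocurrencies over the long term is high-risk and high-reward. Simply holding Bitcoin, for example, delivers large gains in rallies but equally large losses in sell-offs, and the outcome depends heavily on whether the entry point was near a low. This study uses an A2C reinforcement learning model, combined with a hedging mechanism for risk control, to manage a long-term cryptocurrency portfolio. The experimental results suggest that this approach strikes a reasonable balance between return and risk, offering investors an alternative to simply holding Bitcoin.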

# Mount Google Drive

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


# Install Packages

In [None]:
!pip install stable-baselines3==1.2.0 > log.txt
!pip install optuna > log.txt
!pip install ta==0.9.0 > log.txt
!pip install git+https://github.com/DynamicsAndNeuralSystems/pycatch22.git
# Install ta-lib
url = 'https://launchpad.net/~mario-mariomedina/+archive/ubuntu/talib/+files'
!wget $url/libta-lib0_0.4.0-oneiric1_amd64.deb -qO libta.deb
!wget $url/ta-lib0-dev_0.4.0-oneiric1_amd64.deb -qO ta.deb
!dpkg -i libta.deb ta.deb
!pip install ta-lib > log.txt

Collecting git+https://github.com/DynamicsAndNeuralSystems/pycatch22.git
  Cloning https://github.com/DynamicsAndNeuralSystems/pycatch22.git to /tmp/pip-req-build-jfgsg7ko
  Running command git clone --filter=blob:none --quiet https://github.com/DynamicsAndNeuralSystems/pycatch22.git /tmp/pip-req-build-jfgsg7ko
  Resolved https://github.com/DynamicsAndNeuralSystems/pycatch22.git to commit cfbc4a3f1dc62e93b2fe7c9d06d358fa25b5aaad
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
  Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: pycatch22
  Building wheel for pycatch22 (pyproject.toml) ... done
  Created wheel for pycatch22: filename=pycatch22-0.4.4-cp310-cp310-linux_x86_64.whl size=113376 sha256=18c015924ba9fe4ad2af74845bff571557fe0f53ead92b2276008360f787d528
  Stored in directory: /tmp/pip-ephem-wheel-cache-mx5

# Configure Parameters

In [None]:
# Google Drive folder where files are stored
FOLDER = '/content/drive/MyDrive/CryptoPortfolioRL2023/'

# Benchmark for the backtest, e.g. Bitcoin
BENCHMARK = 'BTC/USDT'

# Initial investment amount
INITIAL_AMOUNT = 10000

# Total number of training timesteps
N_TIMESTEPS = 50000

# Whether to use the hedging mechanism: [True] or [False]
# (kept as a list because the BUSD index is appended to it below)
HEDGING = [True]

# Model algorithm: 'a2c' or 'ppo'
AGENT_NAME = 'a2c'

# Model hyperparameters
A2C_PARAMS = {"n_steps": 5, "ent_coef": 0.005, "learning_rate": 0.002}
PPO_PARAMS = {"n_steps": 2048,"ent_coef": 0.005,"learning_rate": 0.001,"batch_size": 128}

MODEL_KWARGS = {"a2c": A2C_PARAMS, "ppo": PPO_PARAMS}

# Training and test periods
TRAIN_START_DATE = '2020-10-17'
TRAIN_END_DATE = '2021-10-17'
TRADE_START_DATE = '2021-10-18'
TRADE_END_DATE = '2023-10-25'

# Cryptocurrencies in the portfolio
CRYPTO_LIST = ['BTC', 'ETH', 'BNB', 'SOL']

# Include all features (default)
FEATURE_LIST = ['86ta', 'dd', 'st', 'catch22', 'qlib']

if HEDGING[0]:
    CRYPTO_LIST.append('BUSD')
    FEATURE_LIST.append('hedging')

    index_busd = sorted(CRYPTO_LIST).index('BUSD')
    HEDGING.append(index_busd)

CRYPTO_LIST = [ticker + '/USDT' for ticker in CRYPTO_LIST]
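
The configuration above stores the position of BUSD in the alphabetically sorted ticker list, presumably because the downstream data is ordered by ticker (the output tables list the columns as BNB, BTC, BUSD, ETH, SOL). The following standalone sketch reproduces that step with lowercase copies of the variables so the resulting values are visible:

```python
crypto_list = ['BTC', 'ETH', 'BNB', 'SOL', 'BUSD']

# Alphabetical order decides which slot the stablecoin occupies
ordered = sorted(crypto_list)
index_busd = ordered.index('BUSD')

# Same structure as HEDGING after the append: [enabled flag, stablecoin index]
hedging = [True, index_busd]

# Appending the quote currency does not change the alphabetical order
pairs = sorted(ticker + '/USDT' for ticker in crypto_list)
```

Here `index_busd` evaluates to 2, so `hedging` becomes `[True, 2]`.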

# Data, Environment Setup, and Model Training

In [None]:
from data import *

data_kwargs = {
    "folder": FOLDER,
    "crypto_list": CRYPTO_LIST,
    "feature_list": FEATURE_LIST,
    "train_start_date": TRAIN_START_DATE,
    "train_end_date": TRAIN_END_DATE,
    "trade_start_date": TRADE_START_DATE,
    "trade_end_date": TRADE_END_DATE
}
# Load the data, compute the features, and split into training and test sets
train, trade, tech_indicator_list = prepare_data(**data_kwargs)

from training import *

env_kwargs = {
    "transaction_cost_pct": 0.001,
    "reward_scaling": 1e-4,
    "initial_amount": INITIAL_AMOUNT,
    "tech_indicator_list": tech_indicator_list,
    "hedging": HEDGING,
    "folder": FOLDER
}
# Build the reinforcement learning environments
e_train_gym, e_trade_gym, env_train, crypto_dimension, state_space = building_environment(env_kwargs, train, trade)

train_kwargs = {
    "train": train,
    "trade": trade,
    "e_train_gym": e_train_gym,
    "e_trade_gym": e_trade_gym,
    "env_train": env_train,
    "agent_name": AGENT_NAME,
    "total_timesteps":N_TIMESTEPS,
    "MODEL_KWARGS": MODEL_KWARGS,
    "folder": FOLDER
}
# Train the model, then have it predict over the test period
df_daily_return, df_actions, trained_agent = train_test_agent(**train_kwargs)


Crypto Dimension: 5, State Space: 5
reset at 2020-10-17T08:00:00.000000000
reset at 2021-10-18T08:00:00.000000000
{'n_steps': 5, 'ent_coef': 0.005, 'learning_rate': 0.002}
Using cuda device
reset at 2020-10-17T08:00:00.000000000
Logging to ./a2c/a2c_1
begin_total_asset:10000
end_total_asset:136428.35949255736
Sharpe:  0.21841300335405872
reset at 2020-10-17T08:00:00.000000000
------------------------------------
| time/                 |          |
|    fps                | 33       |
|    iterations         | 100      |
|    time_elapsed       | 14       |
|    total_timesteps    | 500      |
| train/                |          |
|    entropy_loss       | -6.92    |
|    explained_variance | 0.000186 |
|    learning_rate      | 0.002    |
|    n_updates          | 99       |
|    policy_loss        | 9.14e+05 |
|    std                | 0.967    |
|    value_loss         | 2.08e+10 |
------------------------------------
begin_total_asset:10000
end_total_asset:131343.3402685994
Sharpe: 

# Backtest Results

In [None]:
from backtesting import *
# Load the model's saved daily portfolio weights and daily returns
df_daily_return = pd.read_csv(f'{FOLDER}df_daily_return_{AGENT_NAME}_crypto.csv', index_col=0, parse_dates=True)
df_actions = pd.read_csv(f'{FOLDER}df_actions_{AGENT_NAME}_crypto.csv', index_col=0, parse_dates=True)
# Load the benchmark data for the backtest
bm, close_data = prepare_backtesting_data(df_daily_return, df_actions, BENCHMARK, FOLDER)

# Keep only the 08:00 bar of each day, i.e. one row per day (the original data has a 4-hour frequency)
bm = bm[bm.index.hour==8]
close_data = close_data[close_data.index.hour==8]

CASH = 10000
NAME = 'DRL Agent'
# Backtest analysis
backtest_analytics(df_daily_return, bm, df_actions, close_data, cash=CASH, NAME=NAME, BM_NAME=BENCHMARK)



===== Daily Shares =====



| date | BNB/USDT | BTC/USDT | BUSD/USDT | ETH/USDT | SOL/USDT |
| --- | --- | --- | --- | --- | --- |
| 2021-10-18 08:00:00 | 4.3 | 0.0328 | 1999.6 | 0.5340 | 12.7 |
| 2021-10-19 08:00:00 | 2.6 | 0.0309 | 2347.8 | 0.4144 | 18.3 |
| 2021-10-20 08:00:00 | 2.7 | 0.0250 | 2374.8 | 0.5035 | 18.1 |
| 2021-10-21 08:00:00 | 2.7 | 0.0248 | 2403.1 | 0.4671 | 15.7 |
| 2021-10-22 08:00:00 | 2.9 | 0.0272 | 2569.3 | 0.5099 | 15.4 |
| ... | ... | ... | ... | ... | ... |
| 2023-10-20 08:00:00 | 7.4 | 0.0650 | 2888.0 | 1.4684 | 131.5 |
| 2023-10-21 08:00:00 | 7.8 | 0.0682 | 3031.9 | 1.5429 | 125.4 |
| 2023-10-22 08:00:00 | 8.0 | 0.0698 | 3114.5 | 1.5653 | 134.2 |
| 2023-10-23 08:00:00 | 7.7 | 0.0675 | 3088.0 | 1.5078 | 126.9 |



===== Daily Change of Shares =====



| date | BNB/USDT | BTC/USDT | BUSD/USDT | ETH/USDT | SOL/USDT |
| --- | --- | --- | --- | --- | --- |
| 2021-10-18 08:00:00 | 4.3 | 0.0328 | 1999.6 | 0.5340 | 12.7 |
| 2021-10-19 08:00:00 | -1.7 | -0.0019 | 348.2 | -0.1196 | 5.6 |
| 2021-10-20 08:00:00 | 0.1 | -0.0059 | 27.0 | 0.0891 | -0.2 |
| 2021-10-21 08:00:00 | 0.0 | -0.0002 | 28.3 | -0.0364 | -2.4 |
| 2021-10-22 08:00:00 | 0.2 | 0.0024 | 166.2 | 0.0428 | -0.3 |
| ... | ... | ... | ... | ... | ... |
| 2023-10-20 08:00:00 | -0.1 | -0.0030 | 0.0 | -0.0531 | -13.8 |
| 2023-10-21 08:00:00 | 0.4 | 0.0032 | 143.9 | 0.0745 | -6.1 |
| 2023-10-22 08:00:00 | 0.2 | 0.0016 | 82.6 | 0.0224 | 8.8 |
| 2023-10-23 08:00:00 | -0.3 | -0.0023 | -26.5 | -0.0575 | -7.3 |



===== Transaction Fees (0.2%) =====

21.852526632000014

===== Monthly return =====



| year | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 | 10 | 11 | 12 | total |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2021 | nan% | nan% | nan% | nan% | nan% | nan% | nan% | nan% | nan% | 12.93% | 2.84% | 0.08% | 16.23% |
| 2022 | -0.02% | -6.06% | 19.86% | -7.17% | 0.06% | 0.08% | 10.93% | -6.05% | -6.90% | -0.01% | -27.11% | 0.01% | -25.97% |
| 2023 | 33.85% | -0.22% | -5.21% | 5.53% | -1.55% | 0.69% | 10.47% | -2.56% | 0.01% | 17.49% | nan% | nan% | 67.49% |



===== Benchmark Monthly return =====



| year | 01 | 02 | 03 | 04 | 05 | 06 | 07 | 08 | 09 | 10 | 11 | 12 | total |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2021 | nan% | nan% | nan% | nan% | nan% | nan% | nan% | nan% | nan% | 1.09% | -7.65% | -16.47% | -22.02% |
| 2022 | -19.79% | 3.59% | 19.86% | -18.11% | -20.86% | -34.44% | 19.16% | -14.83% | -4.61% | 6.71% | -20.57% | -0.01% | -65.28% |
| 2023 | 39.90% | 1.52% | 22.26% | 2.40% | -4.47% | 9.46% | -4.42% | -6.74% | -1.35% | 28.08% | nan% | nan% | 109.40% |



===== Performance Report =====



| Performance Metrics | DRL Agent | BTC/USDT |
| --- | --- | --- |
| Alpha | 28.41% | * |
| Beta | 0.33 | * |
| Information Ratio | 0.72 | * |
| CAGR | 19.88% | -24.55% |
| Sharp Ratio | 0.68 | -0.19 |
| Calmar Ratio | 0.48 | -0.32 |
| Omega Ratio | 1.17 | 0.97 |
| Mean Drawdown | -17.19% | -55.01% |
| Max Drawdown | -41.86% | -76.7% |
| Prob of Losing Money | 8.96% | 96.61% |



* means losses are 0.83 as bad as profits
