Trend Micro: Taiwan ETF Price Prediction Competition
---
Kenny Hsieh, 2018/4/30

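- [Official competition site](https://tbrain.trendmicro.com.tw/Competitions/Details/2)
- `ETF_Modeling.ipynb` : data loading, data processing, model building, and prediction output (currently predicts only one day ahead)
- `ETF_Price_Performance.ipynb` : evaluates the predictions and computes the score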

## Brief Introduction
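- Data processing, model building, and the final predictions are already done; this stage measures how well the model performed.
- Performance is scored according to the rules on the official competition site, shown in the figure below.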

![](https://i.imgur.com/ebHOIzA.png =500x)

In [40]:
from google.colab import files

uploaded = files.upload()

Saving actual_result.csv to actual_result.csv


## Load the Predict Result
- Load the predictions produced by `ETF_Modeling`

In [41]:
import pandas as pd

predict_result = pd.read_csv("predict_result.csv")
predict_result["Code"] = predict_result["Code"].astype(str).map(lambda x: "00" + x)
predict_result.head()

Unnamed: 0,Code,Date,Trend,Predict
0,50,2018-04-30,1,79.486275
1,51,2018-04-30,-1,31.813707
2,52,2018-04-30,-1,52.205582
3,53,2018-04-30,-1,34.138184
4,54,2018-04-30,-1,23.049755
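
The `"00" + x` step is needed because pandas parses a numeric-looking `Code` column as integers, which drops leading zeros. As a minimal sketch, assuming the saved CSV itself keeps the zero-padded codes (the actual file is not shown here, so this is only an assumption), passing `dtype` to `pd.read_csv` would keep them as text. The inline `csv_text` is a stand-in built from the first two rows of the output.

```python
import io

import pandas as pd

# Stand-in for predict_result.csv; the zero-padded codes are an assumption.
csv_text = (
    "Code,Date,Trend,Predict\n"
    "0050,2018-04-30,1,79.486275\n"
    "0051,2018-04-30,-1,31.813707\n"
)

# Default parsing turns the codes into integers and loses the leading zeros.
codes_as_int = pd.read_csv(io.StringIO(csv_text))

# Forcing the column to str keeps the codes exactly as written in the file.
codes_as_text = pd.read_csv(io.StringIO(csv_text), dtype={"Code": str})

print(codes_as_int["Code"].tolist())
print(codes_as_text["Code"].tolist())
```

If the file was written without the leading zeros, this does not help and the string prefix used in the cell is still required.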


## Load the Actual Result & Combine with Prediction
- Load the actual direction (up/down) and price of the 18 ETFs on 4/30

In [42]:
actual_result = pd.read_csv("actual_result.csv")
actual_result.head()

Unnamed: 0,Code,Price_430,Trend_430
0,50,80.0,1
1,51,32.02,-1
2,52,53.6,1
3,53,34.52,1
4,54,23.23,1


In [43]:
evaluate_predict = pd.concat([predict_result, actual_result], axis = 1)
evaluate_predict = evaluate_predict.iloc[:, [0, 1, 2, 3, 5, 6]]
evaluate_predict.head()

Unnamed: 0,Code,Date,Trend,Predict,Price_430,Trend_430
0,50,2018-04-30,1,79.486275,80.0,1
1,51,2018-04-30,-1,31.813707,32.02,-1
2,52,2018-04-30,-1,52.205582,53.6,1
3,53,2018-04-30,-1,34.138184,34.52,1
4,54,2018-04-30,-1,23.049755,23.23,1
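
`pd.concat(..., axis = 1)` pairs rows by position, so it is only correct when both files list the ETFs in the same order. A possible alternative, shown here as a sketch rather than the notebook's method, is to join on `Code`. The sample frames reuse the first two rows of the tables; writing `Code` as zero-padded strings in both frames is an assumption, since the key columns must share a dtype for the join to match.

```python
import pandas as pd

# First two rows of the prediction table (Code format is assumed).
predict_sample = pd.DataFrame({
    "Code": ["0050", "0051"],
    "Date": ["2018-04-30", "2018-04-30"],
    "Trend": [1, -1],
    "Predict": [79.486275, 31.813707],
})

# Same ETFs, deliberately listed in the opposite order.
actual_sample = pd.DataFrame({
    "Code": ["0051", "0050"],
    "Price_430": [32.02, 80.0],
    "Trend_430": [-1, 1],
})

# Join by key; validate raises if a Code appears more than once on either side.
merged = predict_sample.merge(
    actual_sample, on="Code", how="inner", validate="one_to_one"
)
print(merged)
```

With a key-based join the `Code` column appears only once, so no positional column selection is needed afterwards.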


## Measure the Model Performance
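A perfect score is 18 points: each of the 18 ETFs earns up to 1 point when both its direction and its price are predicted exactly.
- `Trend Score` : 0.5 for a correct up/down prediction
- `Price Score` : computed according to the competition rules (maximum 0.5 for this part)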
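
To make the two parts concrete, here is a small per-ETF sketch of the scoring as it is implemented in this notebook. The function names `price_score`, `trend_score`, and `etf_score` are hypothetical helpers, not part of the competition or of the notebook.

```python
def price_score(predicted: float, actual: float) -> float:
    """Price part: 0.5 scaled by how close the prediction is to the actual price."""
    return (actual - abs(predicted - actual)) / actual * 0.5


def trend_score(predicted_trend: int, actual_trend: int) -> float:
    """Direction part: 0.5 when the predicted direction matches, otherwise 0."""
    return 0.5 if predicted_trend == actual_trend else 0.0


def etf_score(predicted: float, actual: float,
              predicted_trend: int, actual_trend: int) -> float:
    """Total score for one ETF (at most 1.0)."""
    return price_score(predicted, actual) + trend_score(predicted_trend, actual_trend)


# ETF 0050 on 4/30: predicted 79.486275 (up), actual 80.0 (up).
print(round(etf_score(79.486275, 80.0, 1, 1), 6))
# ETF 0052 on 4/30: predicted 52.205582 (down), actual 53.6 (up).
print(round(etf_score(52.205582, 53.6, -1, 1), 6))
```

The price part shrinks linearly with the relative error, so a prediction within about 2% of the actual price still earns roughly 0.49 of the available 0.5.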

In [50]:
# Compute the price score from the competition formula
evaluate_predict["Price_Score"] = ((evaluate_predict["Price_430"] - abs(evaluate_predict["Predict"] - evaluate_predict["Price_430"])) / evaluate_predict["Price_430"]) * 0.5

# Compute the direction score (0.5 when the predicted direction matches)
evaluate_predict["Trend_Score"] = evaluate_predict["Trend"] - evaluate_predict["Trend_430"] 
evaluate_predict["Trend_Score"] = evaluate_predict["Trend_Score"].map(lambda x : 0.5 if x == 0 else 0)

# Add the two scores together
evaluate_predict["Final_Score"] = evaluate_predict["Price_Score"] + evaluate_predict["Trend_Score"]
evaluate_predict

Unnamed: 0,Code,Date,Trend,Predict,Price_430,Trend_430,Price_Score,Trend_Score,Final_Score
0,50,2018-04-30,1,79.486275,80.0,1,0.496789,0.5,0.996789
1,51,2018-04-30,-1,31.813707,32.02,-1,0.496779,0.5,0.996779
2,52,2018-04-30,-1,52.205582,53.6,1,0.486992,0.0,0.486992
3,53,2018-04-30,-1,34.138184,34.52,1,0.49447,0.0,0.49447
4,54,2018-04-30,-1,23.049755,23.23,1,0.49612,0.0,0.49612
5,55,2018-04-30,-1,16.896467,17.2,1,0.491176,0.0,0.491176
6,56,2018-04-30,-1,25.056149,25.4,1,0.493231,0.0,0.493231
7,57,2018-04-30,-1,48.434689,49.31,1,0.491124,0.0,0.491124
8,58,2018-04-30,1,45.758862,45.45,1,0.496602,0.5,0.996602
9,59,2018-04-30,-1,41.104889,42.0,1,0.489344,0.0,0.489344


## Conclusion
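- The final model scored `12.39` (out of `18`).
- Observations on model performance
  - Direction (up/down): most of the lost points come from this part. A likely cause is the lag that LSTM predictions tend to show. In addition, the broad market rose across the board on 4/30, rebounding from 4/27, so the model misjudged the direction.
  - Price gap: performance here is quite good. Every score is around 0.49, meaning the predictions are close to the actual prices. This may also reflect the nature of ETFs, which move with the broad market and rarely swing sharply.
- Future work
  - Tune the LSTM architecture (more LSTM layers, more units per layer, etc.) with the goal of reducing the LSTM lag.
  - Build up finance domain knowledge and add technical indicators (MA, KD, etc.) so the model has more information to learn from.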
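
As a rough illustration of the indicator idea, the sketch below adds a simple moving average (MA) and a stochastic K/D pair to a price table. Everything here is an assumption for illustration: the function name `add_ma_kd`, the column names `High`, `Low`, and `Close`, the window sizes, and the 1/3 smoothing with a starting value of 50, which is one common KD convention and not necessarily the one a given data vendor uses. The sample prices are synthetic.

```python
import pandas as pd


def add_ma_kd(prices: pd.DataFrame, ma_window: int = 5, kd_window: int = 9) -> pd.DataFrame:
    """Return a copy of `prices` with MA, K and D columns added.

    Expects columns named High, Low and Close (assumed names).
    """
    out = prices.copy()
    out["MA"] = out["Close"].rolling(ma_window).mean()

    lowest = out["Low"].rolling(kd_window).min()
    highest = out["High"].rolling(kd_window).max()
    # A flat window would divide by zero, so treat it as missing instead.
    price_range = (highest - lowest).replace(0, float("nan"))
    rsv = (out["Close"] - lowest) / price_range * 100  # raw stochastic value

    k_values, d_values = [], []
    k_prev, d_prev = 50.0, 50.0  # assumed starting values
    for value in rsv:
        if pd.isna(value):
            k_values.append(float("nan"))
            d_values.append(float("nan"))
            continue
        k_prev = (2 / 3) * k_prev + (1 / 3) * value
        d_prev = (2 / 3) * d_prev + (1 / 3) * k_prev
        k_values.append(k_prev)
        d_values.append(d_prev)

    out["K"] = k_values
    out["D"] = d_values
    return out


# Synthetic, steadily rising prices used only to exercise the function.
closes = [float(i) for i in range(10, 25)]
sample = pd.DataFrame({
    "High": [c + 1 for c in closes],
    "Low": [c - 1 for c in closes],
    "Close": closes,
})
features = add_ma_kd(sample)
print(features.tail())
```

The first `kd_window - 1` rows have no K or D value because the rolling window is not yet full, so those rows would need to be dropped or filled before being fed to a model.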

In [52]:
# Compute the final model score

Total_score = evaluate_predict["Final_Score"].sum()
Total_score

12.394004897485868
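
To see where the points are lost, the total can be split into its direction and price parts. The sketch below does this for the ten rows visible in the score table only (the remaining rows are not shown in the output, so they are left out); the `shown` frame and the `summary` dictionary are names introduced here for illustration.

```python
import pandas as pd

# The ten rows visible in the score table; the other ETFs are not shown there.
shown = pd.DataFrame({
    "Trend": [1, -1, -1, -1, -1, -1, -1, -1, 1, -1],
    "Trend_430": [1, -1, 1, 1, 1, 1, 1, 1, 1, 1],
    "Price_Score": [0.496789, 0.496779, 0.486992, 0.49447, 0.49612,
                    0.491176, 0.493231, 0.491124, 0.496602, 0.489344],
})

# True where the predicted direction matches the actual direction.
direction_hit = shown["Trend"] == shown["Trend_430"]

summary = {
    "trend_points": float(direction_hit.sum()) * 0.5,
    "price_points": float(shown["Price_Score"].sum()),
    "direction_hit_rate": float(direction_hit.mean()),
}
print(summary)
```

For these ten rows the direction is correct 3 times out of 10, while the price part stays close to its 0.5 maximum on every row, which matches the observation that most of the lost points come from the direction prediction.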