## AlgoTrader - End-of-Day Prediction Trading with Reinforcement Learning, Based on the FinRL Library

<a href="https://colab.research.google.com/github/AI4Finance-LLC/FinRL-Library/blob/master/FinRL_single_stock_trading.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

 I am using a local copy of the FinRL library source code. This lets us add extra loaders, technical indicators, sentiment data and custom configuration.

In [None]:
# If you are running this in Google Colab, uncomment the next line to install the FinRL library
# !pip install git+https://github.com/AI4Finance-LLC/FinRL-Library.git

 Install the following packages:
 * Yahoo Finance API
 * pandas
 * numpy
 * matplotlib
 * stockstats
 * OpenAI gym
 * stable-baselines
 * tensorflow
 * pyfolio
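
As a side note, `pkg_resources` is deprecated in recent setuptools releases. A minimal sketch of the same "which packages are missing?" check using the standard-library `importlib.metadata` could look like this; the helper name `find_missing` is hypothetical and not part of FinRL.

```python
from importlib import metadata


def find_missing(required):
    """Return the subset of `required` distribution names that are not installed."""
    missing = set()
    for name in required:
        try:
            # Looks up installed distribution metadata; nothing is imported.
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.add(name)
    return missing


# Example: check a few of the packages listed above.
print(sorted(find_missing({"yfinance", "pandas", "stockstats"})))
```

Only the names returned by the helper would then need to be passed to `pip install`.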

In [None]:
# Check whether all required packages are installed and install the missing ones
import pkg_resources
installedPackages = {pkg.key for pkg in pkg_resources.working_set}
required = {'yfinance', 'pandas', 'matplotlib', 'stockstats','stable-baselines','gym','tensorflow'}
missing = required - installedPackages
if missing:
    # Install only the packages that are actually missing (including tensorflow)
    to_install = " ".join(sorted(missing))
    !pip install {to_install}

In [None]:
# import all libraries
import pandas as pd
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
matplotlib.use('Agg')

import datetime
import os
import sys
import warnings

# add FinRL-Library path
sys.path.append("FinRL-Library")

from finrl.config import config
from finrl.marketdata.yahoodownloader import YahooDownloader
from finrl.preprocessing.preprocessors import FeatureEngineer
from finrl.preprocessing.data import data_split
from finrl.env.environment import EnvSetup
from finrl.env.EnvMultipleStock_train import StockEnvTrain
from finrl.env.EnvMultipleStock_trade import StockEnvTrade
from finrl.model.models import DRLAgent
from finrl.trade.backtest import BackTestStats, BaselineStats, BackTestPlot

In [None]:
# Basic setup
#Disable warnings
warnings.filterwarnings('ignore')

# Create the output folders if they do not exist yet
for folder in (config.DATA_SAVE_DIR, config.TRAINED_MODEL_DIR,
               config.TENSORBOARD_LOG_DIR, config.RESULTS_DIR):
    os.makedirs("./" + folder, exist_ok=True)

Download end-of-day trading data. For this I am using the YahooDownloader class from the marketdata package of the
finrl library. Under the hood, this class uses the yfinance library.

Todo:
 - [ ] add loader for IEX Cloud

 -----
class YahooDownloader:
    Provides methods for retrieving daily stock data from
    Yahoo Finance API

    Attributes
    ----------
        start_date : str
            start date of the data (modified from config.py)
        end_date : str
            end date of the data (modified from config.py)
        ticker_list : list
            a list of stock tickers (modified from config.py)

    Methods
    -------
    fetch_data()
        Fetches data from yahoo API

In [None]:
# Read the following settings from config.py:

# start_date
START_DATE = config.START_DATE

# end_date
END_DATE = config.END_DATE

# list of stocks
STOCK_LIST = config.MULTIPLE_STOCK_TICKER

# Download and save the data in a pandas DataFrame:
data_frame = YahooDownloader(start_date = START_DATE,
                          end_date = END_DATE,
                          ticker_list = STOCK_LIST).fetch_data()

In [None]:
data_frame.shape

In [None]:
data_frame.head()

Data preprocessing is a crucial step for training a high quality machine learning model. We need to check for missing
data and to do feature engineering in order to convert the data into a model-ready state.
* FinRL uses a class FeatureEngineer to preprocess the data
* Add technical indicators to the dataset. In practical trading, various information needs to be taken into account,
  for example the historical stock prices, current holding shares, technical indicators, etc.
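
To make the "check for missing data" step concrete, here is a minimal sketch on a tiny synthetic frame. The `date`/`tic`/`close` layout is an assumption about the long format used in this notebook, the values are made up, and forward-filling per ticker is just one possible policy, not necessarily what FeatureEngineer does.

```python
import numpy as np
import pandas as pd

# Synthetic long-format prices (made-up values): one row per date and ticker.
df = pd.DataFrame({
    "date": ["2020-01-02", "2020-01-02", "2020-01-03",
             "2020-01-03", "2020-01-06", "2020-01-06"],
    "tic": ["AAA", "BBB"] * 3,
    "close": [10.0, 20.0, np.nan, 21.0, 11.0, np.nan],
})

# Count missing values per column before cleaning.
missing_per_column = df.isna().sum()
print(missing_per_column)

# Forward-fill within each ticker so values never leak across tickers.
df = df.sort_values(["tic", "date"])
df["close"] = df.groupby("tic")["close"].ffill()

# Restore the date-major ordering used elsewhere in the notebook.
cleaned = df.sort_values(["date", "tic"], ignore_index=True)
print(cleaned)
```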

---

class FeatureEngineer:
    Provides methods for preprocessing the stock price data

    Attributes
    ----------
        df: DataFrame
            data downloaded from Yahoo API
        feature_number : int
            number of features we used
        use_technical_indicator : boolean
            use technical indicators or not
        use_turbulence : boolean
            use turbulence index or not

    Methods
    -------
    preprocess_data()
        main method to do the feature engineering

Technical Indicators
* FinRL uses stockstats to calculate technical indicators such as:
 * Moving Average Convergence Divergence (MACD) - this indicator is used to identify buying and selling cycles. If the
   price goes up during the selling cycle, that shows strong demand. MACD belongs to Trend Indicators.
 * Relative Strength Index (RSI) - when RSI is above 70 it is considered to be overbought and when it is below 30 it
   is considered to be oversold. RSI belongs to Momentum Indicators.
 * Average Directional Index (ADX) - a technical analysis indicator used by some traders to determine the strength of
   a trend. ADX belongs to Trend Indicators. For more go here:
   https://www.investopedia.com/terms/a/adx.asp
 * Commodity Channel Index (CCI) - is a momentum based oscillator used to help determine when an investment vehicle is
   reaching a condition of being overbought or oversold. It is also used to assess price trend direction and strength.
   This information allows traders to determine if they want to enter or exit a trade, refrain from taking a trade, or
   add to an existing position. In this way, the indicator can be used to provide trade signals when it acts in a
   certain way. For more go here: https://www.investopedia.com/terms/c/commoditychannelindex.asp
* stockstats library: supplies a wrapper StockDataFrame based on the pandas.DataFrame with inline stock
  statistics/indicators support. Supported statistics/indicators are:
  * change (in percent)
  * delta
  * permutation (zero based)
  * log return
  * max in range
  * min in range
  * middle = (close + high + low) / 3
  * compare: le, ge, lt, gt, eq, ne
  * count: both backward(c) and forward(fc)
  * SMA: simple moving average
  * EMA: exponential moving average
  * MSTD: moving standard deviation
  * MVAR: moving variance
  * RSV: raw stochastic value
  * RSI: relative strength index
  * KDJ: Stochastic oscillator
  * Bollinger Bands: including upper band and lower band.
  * MACD: moving average convergence divergence. Including signal and histogram. (see note)
  * CR:
  * WR: Williams Overbought/Oversold index
  * CCI: Commodity Channel Index
  * TR: true range
  * ATR: average true range
  * line cross check, cross up or cross down.
  * DMA: Difference of Moving Average (10, 50)
  * DMI: Directional Moving Index, including
  * +DI: Positive Directional Indicator
  * -DI: Negative Directional Indicator
  * ADX: Average Directional Movement Index
  * ADXR: Smoothed Moving Average of ADX
  * TRIX: Triple Exponential Moving Average
  * TEMA: Another Triple Exponential Moving Average
  * VR: Volatility Volume Ratio
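
To show roughly what two of these indicators compute, here is a minimal pandas sketch of MACD and RSI. It is not the stockstats implementation: the periods (12/26/9 for MACD, 14 for RSI) are the commonly used defaults and are an assumption here, and the function names are hypothetical.

```python
import numpy as np
import pandas as pd


def macd(close, fast=12, slow=26, signal=9):
    """Return the MACD line, signal line and histogram for closing prices."""
    ema_fast = close.ewm(span=fast, adjust=False).mean()
    ema_slow = close.ewm(span=slow, adjust=False).mean()
    line = ema_fast - ema_slow
    signal_line = line.ewm(span=signal, adjust=False).mean()
    return line, signal_line, line - signal_line


def rsi(close, window=14):
    """RSI with Wilder-style exponential smoothing; values lie in [0, 100]."""
    delta = close.diff()
    avg_gain = delta.clip(lower=0).ewm(alpha=1 / window, adjust=False).mean()
    avg_loss = (-delta.clip(upper=0)).ewm(alpha=1 / window, adjust=False).mean()
    rs = avg_gain / avg_loss  # becomes inf when there are no losses, so RSI is 100
    return 100 - 100 / (1 + rs)


# Synthetic random-walk prices, for illustration only.
rng = np.random.default_rng(0)
prices = pd.Series(100 + rng.normal(0, 1, 250).cumsum())

macd_line, signal_line, histogram = macd(prices)
rsi_values = rsi(prices)
overbought = rsi_values > 70  # rule of thumb from the text above
oversold = rsi_values < 30
```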

Todo:
 - [ ] add some additional Technical Indicators to the dataset.
       https://www.visualcapitalist.com/12-types-technical-indicators-stocks


In [None]:
## we store the stockstats technical indicator column names in config.py
tech_indicator_list=config.TECHNICAL_INDICATORS_LIST
print(tech_indicator_list)

feature_engineering = FeatureEngineer(
                    use_technical_indicator=True,
                    tech_indicator_list = tech_indicator_list,
                    use_turbulence=True,
                    user_defined_feature = False)

processed = feature_engineering.preprocess_data(data_frame)

In [None]:
processed.sort_values(['date','tic'],ignore_index=True).head(10)

## Design your gym
Considering the stochastic and interactive nature of automated stock trading, a financial task is modeled as
a Markov Decision Process (MDP) problem. The training process involves observing stock price changes, taking an action
and calculating the reward, so that the agent adjusts its strategy accordingly. By interacting with the environment, the
trading agent derives a trading strategy that maximizes rewards over time.

Our trading environments, based on OpenAI Gym framework, simulate live stock markets with real market data according to
the principle of time-driven simulation.
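
The observe/act/reward loop can be sketched with a toy single-stock environment. This is an illustration only, not FinRL's StockEnvTrain: the class name, the state layout (cash, shares, price) and the reward (change in portfolio value) are assumptions, and it mimics the gym `reset`/`step` interface without depending on gym.

```python
class ToyTradingEnv:
    """Single-stock toy environment with a gym-like reset/step interface."""

    def __init__(self, prices, initial_cash=1000.0):
        self.prices = list(prices)
        self.initial_cash = initial_cash

    def reset(self):
        self.t = 0
        self.cash = self.initial_cash
        self.shares = 0
        return self._state()

    def _state(self):
        return (self.cash, self.shares, self.prices[self.t])

    def _value(self):
        return self.cash + self.shares * self.prices[self.t]

    def step(self, action):
        """`action` is the signed number of shares to trade at the current price."""
        price = self.prices[self.t]
        # Clamp the order: never sell more than we hold or buy more than cash allows.
        action = max(action, -self.shares)
        action = min(action, int(self.cash // price))
        value_before = self._value()
        self.cash -= action * price
        self.shares += action
        self.t += 1  # time-driven simulation: move to the next price bar
        reward = self._value() - value_before
        done = self.t == len(self.prices) - 1
        return self._state(), reward, done, {}


env = ToyTradingEnv([10, 11, 12], initial_cash=100.0)
state = env.reset()
total_reward = 0.0
for action in (5, 0):  # buy 5 shares, then hold
    state, reward, done, _ = env.step(action)
    total_reward += reward
```

An RL agent would replace the fixed action sequence with a policy that maps each state to an action.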

The action space describes the actions through which the agent interacts with the environment. In the simplest case,
an action a takes one of three values {-1, 0, 1}, representing selling, holding, and buying one share. An action can
also cover multiple shares: we use the action space {-k,…,-1, 0, 1, …, k}, where k denotes the number of shares to buy
and -k denotes the number of shares to sell. For example, "Buy 10 shares of AAPL" and "Sell 10 shares of AAPL" are 10
and -10, respectively. The continuous action space needs to be normalized to [-1, 1], since the policy is defined on a
Gaussian distribution, which needs to be normalized and symmetric.
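
A minimal sketch of how a normalized action vector could be mapped back to share counts. It assumes the environment scales actions by a maximum trade size (the `hmax = 100` passed to EnvSetup in this notebook); the helper name is hypothetical and the actual environment code may differ.

```python
import numpy as np

HMAX = 100  # assumed maximum number of shares per trade and per stock


def to_share_orders(actions, hmax=HMAX):
    """Map normalized actions in [-1, 1] to signed integer share counts."""
    clipped = np.clip(np.asarray(actions, dtype=float), -1.0, 1.0)
    # Truncate toward zero so tiny actions become "hold".
    return (clipped * hmax).astype(int)


orders = to_share_orders([1.0, -1.0, 0.0, 0.5, 2.5])
print(orders)  # positive = buy, negative = sell, 0 = hold
```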

### Dataset preparation: split the dataset into training data and trading data
#### Training data: 2009-03-01 to 2018-12-31
#### Trade data: 2019-01-01 to 2020-09-30
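
A minimal sketch of a date-based split, assuming a half-open interval (start included, end excluded) and ISO-formatted date strings in a `date` column. The helper `split_by_date` is hypothetical; FinRL's `data_split` may behave differently in detail.

```python
import pandas as pd


def split_by_date(df, start, end):
    """Return rows with start <= date < end, sorted by date and ticker."""
    mask = (df["date"] >= start) & (df["date"] < end)
    return df[mask].sort_values(["date", "tic"], ignore_index=True)


# Synthetic frame (made-up values) spanning the split boundary.
frame = pd.DataFrame({
    "date": ["2018-12-28", "2018-12-31", "2019-01-02", "2019-01-03"],
    "tic": ["AAA"] * 4,
    "close": [10.0, 10.5, 10.2, 10.8],
})

train_part = split_by_date(frame, "2009-03-01", "2019-01-01")
trade_part = split_by_date(frame, "2019-01-01", "2020-12-01")
```

With a half-open interval, no row can land in both sets when the training end date equals the trading start date.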

In [None]:
training_set = data_split(processed, '2009-03-01','2019-01-01')
tradeing_set = data_split(processed, '2019-01-01','2020-12-01')
print(len(training_set))
print(len(tradeing_set))

In [None]:
training_set.head()

In [None]:
tradeing_set.head()

In [None]:
stock_dimension = len(training_set.tic.unique())
state_space = 1 + 2*stock_dimension + len(tech_indicator_list)*stock_dimension
print(f"Stock Dimension: {stock_dimension}, State Space: {state_space}")
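
As a worked example of the state-space formula, one plausible reading (an assumption, not confirmed by this notebook) is: 1 entry for the cash balance, one price and one holding per stock, and one value per technical indicator per stock. The stock and indicator counts below are illustrative.

```python
def state_space_size(n_stocks, n_indicators):
    """1 (cash) + prices + holdings + indicator values for every stock."""
    return 1 + 2 * n_stocks + n_indicators * n_stocks


# Example: 30 stocks and 4 indicators give 1 + 60 + 120 = 181 state entries.
example_size = state_space_size(n_stocks=30, n_indicators=4)
print(example_size)
```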

In [None]:
env_setup = EnvSetup(stock_dim = stock_dimension,
                        state_space = state_space,
                        hmax = 100,
                        initial_amount = 1000000,
                        transaction_cost_pct = 0.001)

env_training = env_setup.create_env_training(data = training_set,
                                          env_class = StockEnvTrain)


env_tradeing, obs_trade = env_setup.create_env_trading(data = tradeing_set,
                                         env_class = StockEnvTrade)


In [None]:
print(type(env_training))

## Environment for Training

In [None]:
env_training = env_setup.create_env_training(data = training_set,
                                          env_class = StockEnvTrain)

## Environment for Trading

In [None]:
env_tradeing, obs_trade = env_setup.create_env_trading(data = tradeing_set,
                                         env_class = StockEnvTrade)

# Implement DRL Algorithms
* The implementation of the DRL algorithms is based on OpenAI Baselines and Stable Baselines. Stable Baselines is a
  fork of OpenAI Baselines, with a major structural refactoring and code cleanups.
* The FinRL library includes fine-tuned standard DRL algorithms, such as DQN, DDPG, Multi-Agent DDPG, PPO, SAC, A2C and TD3.
  It also allows users to design their own DRL algorithms by adapting these. Instead of installing the FinRL
  library, I have included its source code and created my own version.

In [None]:
agent = DRLAgent(env = env_training)

## Model Training: 5 models, A2C, DDPG, PPO, TD3, SAC

### Model 1: A2C

In [None]:
agent = DRLAgent(env = env_training)
model_a2c = agent.get_model("a2c")

In [None]:
trained_a2c = agent.train_model(model=model_a2c,
                             tb_log_name='a2c',
                             total_timesteps=150000)

### Model 2: DDPG

In [None]:
agent = DRLAgent(env = env_training)
model_ddpg = agent.get_model("ddpg")

In [None]:
trained_ddpg = agent.train_model(model=model_ddpg,
                             tb_log_name='ddpg',
                             total_timesteps=50000)

### Model 3: PPO

In [None]:
agent = DRLAgent(env = env_training)
PPO_PARAMS = {
    "n_steps": 2048,
    "ent_coef": 0.01,
    "learning_rate": 0.00025,
    "batch_size": 128,
}
model_ppo = agent.get_model("ppo",model_kwargs = PPO_PARAMS)

In [None]:
trained_ppo = agent.train_model(model=model_ppo,
                             tb_log_name='ppo',
                             total_timesteps=100000)

### Model 4: TD3

In [None]:
agent = DRLAgent(env = env_training)
TD3_PARAMS = {"batch_size": 100,
              "buffer_size": 1000000,
              "learning_rate": 0.001}

model_td3 = agent.get_model("td3",model_kwargs = TD3_PARAMS)

In [None]:
trained_td3 = agent.train_model(model=model_td3,
                             tb_log_name='td3',
                             total_timesteps=50000)

### Model 5: SAC

In [None]:
agent = DRLAgent(env = env_training)
SAC_PARAMS = {
    "batch_size": 128,
    "buffer_size": 100000,
    "learning_rate": 0.0001,
    "learning_starts": 100,
    "ent_coef": "auto_0.1",
}

model_sac = agent.get_model("sac",model_kwargs = SAC_PARAMS)

In [None]:
trained_sac = agent.train_model(model=model_sac,
                             tb_log_name='sac',
                             total_timesteps=80000)

## Trading
Assume that we have $1,000,000 of initial capital on 2019-01-01. We use the trained SAC model to trade the Dow Jones 30 stocks.

### Set turbulence threshold
Set the turbulence threshold to the 99% quantile of the in-sample turbulence data. If the current turbulence index is greater than the threshold, we assume that the current market is volatile.
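
A minimal sketch of this rule on synthetic numbers. The helper name is hypothetical, and how the trading environment reacts to a flagged day is not shown here.

```python
import numpy as np


def turbulence_flags(insample, current, q=0.99):
    """Return the q-quantile threshold and a boolean flag per current value."""
    threshold = np.quantile(insample, q)
    return threshold, np.asarray(current) > threshold


# Synthetic in-sample turbulence values 0..100, for illustration only.
threshold, flags = turbulence_flags(np.arange(101), [50, 98, 100])
print(threshold, flags)
```

Note that a quantile of 1 would return the in-sample maximum, so in-sample days could never exceed the threshold.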

In [None]:
data_turbulence = processed[(processed.date<'2019-01-01') & (processed.date>='2009-01-01')]
insample_turbulence = data_turbulence.drop_duplicates(subset=['date'])

In [None]:
insample_turbulence.turbulence.describe()

In [None]:
# 99% quantile of the in-sample turbulence (a quantile of 1 would return the maximum)
turbulence_threshold = np.quantile(insample_turbulence.turbulence.values,0.99)

In [None]:
turbulence_threshold

### Trade

A DRL model needs to be updated periodically to take full advantage of the data; ideally we would retrain the model yearly, quarterly, or monthly, and tune the parameters along the way. In this notebook I use only the in-sample data from 2009-01 to 2018-12 to tune the parameters once, so there is some alpha decay as the trading period gets longer.

Numerous hyperparameters – e.g. the learning rate, the total number of samples to train on – influence the learning process and are usually determined by testing some variations.
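
A periodic retraining schedule could be sketched as a list of expanding training windows, each followed by a trading window. This is a hypothetical helper, not part of FinRL; the quarterly frequency and the dates mirror the ones used in this notebook.

```python
import pandas as pd


def retraining_windows(train_start, trade_start, trade_end, freq="QS"):
    """Return (train_start, train_end, trade_end) triples, one per period.

    Each model is trained on [train_start, train_end) and then trades on
    [train_end, trade_end), after which it is retrained on more data.
    """
    start, end = pd.Timestamp(trade_start), pd.Timestamp(trade_end)
    cuts = list(pd.date_range(start, end, freq=freq))
    # Make sure the first and last boundaries match the requested range.
    if not cuts or cuts[0] > start:
        cuts.insert(0, start)
    if cuts[-1] < end:
        cuts.append(end)
    fmt = "%Y-%m-%d"
    return [(train_start, a.strftime(fmt), b.strftime(fmt))
            for a, b in zip(cuts[:-1], cuts[1:])]


windows = retraining_windows("2009-03-01", "2019-01-01", "2020-12-01")
for window in windows:
    print(window)
```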

In [None]:
env_trade, obs_trade = env_setup.create_env_trading(data = tradeing_set,
                                         env_class = StockEnvTrade,
                                         turbulence_threshold=250)

df_account_value, df_actions = DRLAgent.DRL_prediction(model=trained_sac,
                        test_data = tradeing_set,
                        test_env = env_trade,
                        test_obs = obs_trade)

In [None]:
df_account_value.shape

In [None]:
df_account_value.head()

<a id='6'></a>
# Part 7: Backtest Our Strategy
Backtesting plays a key role in evaluating the performance of a trading strategy. An automated backtesting tool is preferred because it reduces human error. We usually use the Quantopian pyfolio package to backtest our trading strategies. It is easy to use and provides various individual plots that give a comprehensive picture of a strategy's performance.

<a id='6.1'></a>
## 7.1 BackTestStats
Pass in df_account_value; this information is stored in the env class.
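
To give an idea of the kind of numbers such a report contains, here is a minimal pandas sketch of a few common statistics computed from a series of account values. It is not pyfolio or BackTestStats; the function name, the zero risk-free rate and the 252 trading days per year are assumptions.

```python
import numpy as np
import pandas as pd


def simple_backtest_stats(account_value, periods_per_year=252):
    """Cumulative return, annualized return, Sharpe ratio and max drawdown."""
    values = pd.Series(account_value, dtype=float)
    returns = values.pct_change().dropna()
    cumulative_return = values.iloc[-1] / values.iloc[0] - 1
    annual_return = (1 + cumulative_return) ** (periods_per_year / len(returns)) - 1
    # Sharpe ratio with a zero risk-free rate (assumption).
    sharpe = np.sqrt(periods_per_year) * returns.mean() / returns.std()
    # Drawdown: relative distance from the running peak.
    drawdown = values / values.cummax() - 1
    return {
        "cumulative_return": cumulative_return,
        "annual_return": annual_return,
        "sharpe": sharpe,
        "max_drawdown": drawdown.min(),
    }


# Made-up account values, for illustration only.
stats = simple_backtest_stats([100.0, 110.0, 99.0, 120.0])
print(stats)
```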


In [None]:
print("==============Get Backtest Results===========")
now = datetime.datetime.now().strftime('%Y%m%d-%Hh%M')

perf_stats_all = BackTestStats(account_value=df_account_value)
perf_stats_all = pd.DataFrame(perf_stats_all)
perf_stats_all.to_csv("./"+config.RESULTS_DIR+"/perf_stats_all_"+now+'.csv')

<a id='6.2'></a>
## 7.2 BackTestPlot

In [None]:
print("==============Compare to DJIA===========")
%matplotlib inline
# S&P 500: ^GSPC
# Dow Jones Index: ^DJI
# NASDAQ 100: ^NDX
BackTestPlot(df_account_value,
             baseline_ticker = '^DJI',
             baseline_start = '2019-01-01',
             baseline_end = '2020-12-01')

<a id='6.3'></a>
## 7.3 Baseline Stats

In [None]:
print("==============Get Baseline Stats===========")
baseline_perf_stats=BaselineStats('^DJI',
                                  baseline_start = '2019-01-01',
                                  baseline_end = '2020-12-01')