This notebook uses OpenAI Stable Baselines to train a stock trading agent. Stable Baselines is only compatible with TensorFlow versions up to 1.15.4, so that version is installed explicitly.

In [None]:
!pip install stable-baselines
!pip install tensorflow==1.15.4

This Jupyter notebook was originally run on Google Colab. If you are not running it on Google Colab, skip the following Drive mounting step.

In [10]:
from google.colab import drive
drive.mount('/content/gdrive')

%cd gdrive/MyDrive/Deep-RL-Stock-Trading

Drive already mounted at /content/gdrive; to attempt to forcibly remount, call drive.mount("/content/gdrive", force_remount=True).
[Errno 2] No such file or directory: 'gdrive/MyDrive/Deep-RL-Stock-Trading'
/content/gdrive/MyDrive/Colab Notebooks
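
The Colab-only mounting step above can also be guarded in code rather than skipped by hand. The following is a minimal sketch, assuming that the `google.colab` package is importable only on Colab runtimes; the helper name `running_on_colab` is hypothetical and not part of the original notebook.

```python
def running_on_colab() -> bool:
    """Return True when the google.colab package can be imported."""
    try:
        import google.colab  # noqa: F401  (assumed to exist only on Colab)
    except ImportError:
        return False
    return True


if running_on_colab():
    # Same mounting call as in the cell above.
    from google.colab import drive
    drive.mount('/content/gdrive')
else:
    print('Not on Colab; skipping the Drive mounting step.')
```

With a guard like this, the same notebook can run locally without editing the cell.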


In [5]:
from indicators import get_indicators, sma, macd, rsi, cci
from environment import StockEnvironment

import csv
import matplotlib.pyplot as plt

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader
import torch.nn.utils
import torch.optim as optim
import numpy as np

import gym
from gym import spaces
from stable_baselines import DQN


Reading the stock data from a CSV file that was originally downloaded from Kaggle. The file has one row for each day the market is open, containing the date and the open, high, low, close, and adjusted close prices for that day.

In [3]:
# Column-oriented storage: one list per CSV column, one entry per trading day.
data_dict = {'Date': [], 'Open': [], 'Close': [], 'High': [], 'Low': [], 'Adj Close': []}

with open('./data/stocks/AAPL.csv', newline='') as csvfile:
  reader = csv.DictReader(csvfile)
  for row in reader:
    data_dict['Date'].append(row['Date'])  # dates are kept as strings
    # Price columns are parsed to floats.
    for column in ('Open', 'Close', 'High', 'Low', 'Adj Close'):
      data_dict[column].append(float(row[column]))

indicators = get_indicators(data_dict)
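
`get_indicators` comes from the project's `indicators` module, which is not shown in this notebook. As a rough illustration of what an indicator such as `sma` might compute, here is a minimal sketch of a trailing simple moving average. The function name `simple_moving_average`, its `window` parameter, and the sample prices are assumptions made for illustration; the project's own implementation may differ.

```python
def simple_moving_average(prices, window):
    """Trailing mean of the last `window` prices; None until enough data exists."""
    if window <= 0:
        raise ValueError("window must be positive")
    averages = []
    running_sum = 0.0
    for i, price in enumerate(prices):
        running_sum += price
        if i >= window:
            # Drop the price that just left the window.
            running_sum -= prices[i - window]
        averages.append(running_sum / window if i >= window - 1 else None)
    return averages


closes = [10.0, 11.0, 12.0, 13.0, 14.0]  # made-up sample prices
sma_3 = simple_moving_average(closes, 3)
print(sma_3)  # [None, None, 11.0, 12.0, 13.0]
```

The leading `None` entries mark days where fewer than `window` prices are available.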

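Training a DQN agent on the stock environment and saving the trained model. The agent uses an MLP policy with prioritized experience replay and is trained for 300,000 timesteps.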

In [None]:
env = StockEnvironment(data_dict, indicators, 1200)
model = DQN('MlpPolicy', env, learning_rate=5e-4, prioritized_replay=True, verbose=1)
trained_model = model.learn(300000, log_interval=5)
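
The agent interacts with `StockEnvironment` through the Gym-style `reset`/`step` interface. The project's environment is not shown here, so the toy class below is a hypothetical stand-in that only illustrates that contract, plus a rollout loop that stops when `done` is true. Its three actions, its observation layout (cash, shares, price), and its reward (change in portfolio value) are all assumptions, and it deliberately avoids depending on `gym`.

```python
class ToyTradingEnv:
    """Hypothetical stand-in for illustration; not the project's StockEnvironment."""

    def __init__(self, closes, starting_cash):
        self.closes = closes
        self.starting_cash = starting_cash
        self.reset()

    def reset(self):
        self.day = 0
        self.cash = self.starting_cash
        self.shares = 0
        return self._observation()

    def _observation(self):
        return (self.cash, self.shares, self.closes[self.day])

    def _value(self):
        return self.cash + self.shares * self.closes[self.day]

    def step(self, action):
        # Assumed actions: 0 = hold, 1 = buy one share, 2 = sell one share.
        price = self.closes[self.day]
        value_before = self._value()
        if action == 1 and self.cash >= price:
            self.cash -= price
            self.shares += 1
        elif action == 2 and self.shares > 0:
            self.cash += price
            self.shares -= 1
        self.day += 1
        reward = self._value() - value_before  # change in portfolio value
        done = self.day == len(self.closes) - 1
        return self._observation(), reward, done, {}


env = ToyTradingEnv(closes=[10.0, 12.0, 11.0, 15.0], starting_cash=100.0)
obs = env.reset()
done = False
total_reward = 0.0
while not done:
    action = 1  # fixed "always buy" policy standing in for model.predict(obs)
    obs, reward, done, info = env.step(action)
    total_reward += reward
print(obs, total_reward)  # (67.0, 3, 15.0) 12.0
```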

In [None]:
model.save("./trained models/DQN AAPL Model")

In [None]:
model = DQN.load("./trained models/DQN AAPL Model")

Computing some relevant statistics for the trading agent and printing them. The plot compares the stock price to the portfolio value, with the stock price scaled so that its starting value equals the starting portfolio value, which allows a direct comparison.

In [None]:
env = StockEnvironment(data_dict, indicators, 1200)
obs = env.reset()
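action_list = [0] * 9  # counts how often each action index is chosen
portfolio_value = []
day = 201  # index into data_dict['Close'] used to price the holdings

for i in range(5000):
  action, _states = model.predict(obs)
  action_list[action] += 1
  obs, rewards, done, info = env.step(action)
  # obs[0] and obs[1] are treated as cash and shares held
  portfolio_value.append(obs[0] + obs[1]*data_dict['Close'][day])
  day += 1

env.render()
print(action_list)

# 367 is a hard-coded factor that scales the stock price to the portfolio's starting value
other_prices = [num * 367 for num in data_dict['Close']]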
plt.plot(portfolio_value, label='portfolio value')
plt.plot(other_prices[200:5200], label='stock price')
plt.legend(["portfolio", "price of stock"])
plt.ylabel('value')
plt.xlabel('day')
plt.title('DQN Trading Agent')
plt.show()
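
The scaling used for the plot can also be derived from the data instead of a hard-coded factor. The sketch below shows one way to do this and adds a simple total-return statistic for comparing the agent with buy-and-hold. The helper names `scale_to_start` and `total_return` and the sample numbers are hypothetical and do not come from the original notebook.

```python
def scale_to_start(prices, start_value):
    """Rescale prices so that the first price equals start_value."""
    if not prices or prices[0] == 0:
        raise ValueError("need a non-empty price list with a nonzero first price")
    factor = start_value / prices[0]
    return [price * factor for price in prices]


def total_return(values):
    """Fractional gain from the first to the last value (0.5 means +50%)."""
    return values[-1] / values[0] - 1.0


closes = [2.0, 3.0, 4.0]              # made-up stock prices
portfolio = [1200.0, 1500.0, 1800.0]  # made-up portfolio values

scaled = scale_to_start(closes, portfolio[0])
print(scaled)                   # [1200.0, 1800.0, 2400.0]
print(total_return(portfolio))  # 0.5
print(total_return(scaled))     # 1.0
```

Because the scaled price series starts at the same value as the portfolio, its total return is the buy-and-hold return over the same period.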