### This notebook is inspired by the work of Selim Amrouni, Aymeric Moulin and Philippe Mizrahi
### who proposed a notebook-based implementation of the algorithm of Zhengyao Jiang, Dixing Xu and Jinjun Liang

## Implementation of the DDPG algorithm


In this notebook we present our own implementation of the deep reinforcement learning algorithm proposed in the paper. We then apply it to stock market data; unlike the cryptocurrency market, the stock market is not open around the clock.

To be precise, for the cryptocurrency market we considered 13 cryptocurrencies, and for each of them we looked at 3 main prices (high, low and open). Since that market is continuously open, at each time step t we have open(t) = close(t-1), so the close price carries no additional information.
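The identity open(t) = close(t-1) can be checked directly on a table of candles. The sketch below uses a small made-up `candles` DataFrame (the values and the variable names are illustrative, not taken from the Poloniex files):

```python
import numpy as np
import pandas as pd

# Hypothetical candles for a continuously traded asset.
candles = pd.DataFrame({
    "open":  [1.00, 1.02, 1.01, 1.05],
    "close": [1.02, 1.01, 1.05, 1.04],
})

# Compare each open with the previous close; the first row has no predecessor.
prev_close = candles["close"].shift(1)
matches = np.isclose(candles["open"].iloc[1:], prev_close.iloc[1:])

print(matches.all())
```

If the check fails on real data, the close price does carry extra information and should be kept as a feature.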

Here is the outline of what I will be doing throughout the notebook:

- Preprocessing of the Poloniex cryptocurrency data.
  The data X is composed of:
    - open(t)/open(t-1)
    - high(t)/open(t-1)
    - low(t)/open(t-1)

- Preprocessing of the S&P500 data.
  The final output data X will be a tensor composed of:
    - open(t)/open(t-1)
    - close(t)/open(t-1)
    - high(t)/open(t-1)
    - low(t)/open(t-1)

- Implementation of the deep deterministic policy gradient algorithm
- Possible improvements of the model


<u>Remark</u>: 
Using the ratio of two prices normalizes the data: every asset is expressed as a relative change close to 1, whatever its absolute price level.
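As a minimal sketch of this normalization, assuming a made-up array of opening prices (the values and the name `toy_open` are illustrative, not taken from the data sets):

```python
import numpy as np

# Hypothetical opening prices: 4 time steps (rows) x 2 assets (columns).
toy_open = np.array([[100.0, 10.0],
                     [110.0,  9.0],
                     [ 99.0,  9.9],
                     [ 99.0, 11.0]])

# open(t) / open(t-1): drop the first row of the numerator and the last row
# of the denominator so that row i of each array lines up as (t, t-1).
relative_open = toy_open[1:] / toy_open[:-1]

print(relative_open.shape)
print(relative_open)
```

Both assets end up on a comparable scale even though their price levels differ by a factor of ten.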


# Poloniex data preprocessing


### Libraries and dependencies 

In [2]:
# for navigation in the folders
import os 
import pathlib

# manipulate time
from time import strptime 
from datetime import datetime

from tqdm import tqdm

# plot 
import matplotlib.pyplot as plt
import matplotlib.image as mpimg

import numpy as np
import pandas as pd
import seaborn as sns
import PIL
import pickle

sns.despine()

<Figure size 432x288 with 0 Axes>

In [18]:
data_dir = '/poloniex_data/'
directory = os.getcwd() + data_dir  # path to the data directory

# List all files of the directory, skipping hidden files (e.g. .DS_Store).
# Filtering while building the list avoids removing items from a list that is being iterated.
file_tages = [file for file in os.listdir(directory) if not file.startswith('.')]

stock_names = [file.split('.')[0] for file in file_tages]
stocks = list(file_tages)
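An alternative sketch for listing the data files uses `pathlib`, which is already imported above. The directory and file names here are throwaway placeholders created in a temporary folder, purely for illustration.

```python
import pathlib
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = pathlib.Path(tmp)
    # Create a few empty placeholder files, including a hidden one.
    for name in ["ETHBTC.csv", "LTCBTC.csv", ".DS_Store"]:
        (tmp_dir / name).touch()

    # Keep only CSV files and explicitly skip anything starting with a dot.
    csv_files = sorted(p for p in tmp_dir.glob("*.csv") if not p.name.startswith("."))
    names = [p.stem for p in csv_files]

print(names)
```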


In [30]:
# We are interested in one year of history (30-minute trading periods => 48 periods x 365 days = 17,520,
# i.e. roughly 17,000 rows), so series that are much shorter than that are removed.
for s in stocks:
    try:
        df = pd.read_csv('.' + data_dir + s)
        print(s, len(df))
    except Exception:  # skip files that cannot be read as CSV
        continue

ETCBTC.csv 17032
GNTETH.csv 7001
REPETH.csv 13547
ETHBTC.csv 33878
DOGEBTC.csv 60915
ETHUSDT.csv 33876
BTCUSDT.csv 42010
XRPBTC.csv 51113
DASHBTC.csv 60103
REPBTC.csv 13547
XMRBTC.csv 55285
LTCBTC.csv 61096
GNTBTC.csv 7001
ETCETH.csv 17031
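The selection rule can be sketched as a simple threshold on the row counts. The counts below are copied from the output above; the cutoff `MIN_ROWS = 17000` is an assumption inferred from the comment about one year of 30-minute data, not a value stated explicitly in the notebook.

```python
# Row counts copied from the printed output above.
row_counts = {
    "ETCBTC": 17032, "GNTETH": 7001, "REPETH": 13547, "ETHBTC": 33878,
    "DOGEBTC": 60915, "ETHUSDT": 33876, "BTCUSDT": 42010, "XRPBTC": 51113,
    "DASHBTC": 60103, "REPBTC": 13547, "XMRBTC": 55285, "LTCBTC": 61096,
    "GNTBTC": 7001, "ETCETH": 17031,
}

MIN_ROWS = 17000  # assumed cutoff: roughly one year of 30-minute periods

kept = [name for name, n in row_counts.items() if n >= MIN_ROWS]
shortest = min(row_counts[name] for name in kept)

print(len(kept), kept)
print(shortest)
```

The shortest surviving series determines how many rows can be kept for every pair once the series are aligned.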


In [155]:
# Pairs with roughly one year of data or more
kept_stocks = ['ETCBTC', 'ETHBTC', 'DOGEBTC', 'ETHUSDT', 'BTCUSDT', 'XRPBTC', 'DASHBTC',
               'XMRBTC', 'LTCBTC', 'ETCETH']
len_list = []
for stk in kept_stocks:
    data = pd.read_csv('.' + data_dir + stk + '.csv')
    len_list.append(len(data))

In [156]:
min_list=min(len_list)

In [157]:
list_open = list()
list_close = list()
list_high = list()
list_low = list()

for s in tqdm(kept_stocks):
    # Back-fill missing values; fillna('bfill') would insert the literal string 'bfill' instead
    data = pd.read_csv('.' + data_dir + s + '.csv').bfill()
    # Keep the same number of most recent rows for every pair
    data = data[['open', 'close', 'high', 'low']].tail(min_list)
    list_open.append(data.open.values)
    list_close.append(data.close.values)
    list_high.append(data.high.values)
    list_low.append(data.low.values)

100%|██████████| 10/10 [00:00<00:00, 20.06it/s]
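Back-filling and `fillna('bfill')` are different operations in pandas: the latter replaces missing values with the literal string `'bfill'`, which turns a numeric column into an object column. A minimal sketch on a toy series (the values are made up):

```python
import numpy as np
import pandas as pd

prices = pd.Series([np.nan, 1.0, np.nan, 3.0])

literal = prices.fillna("bfill")  # NaN -> the string 'bfill'
backfilled = prices.bfill()       # NaN -> next valid observation

print(literal.tolist())
print(backfilled.tolist())
```

Only the back-filled version stays numeric, which matters because the prices are divided by each other afterwards.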


In [158]:
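# Rows are time steps, columns are assets.
# Normalizer: open(t-1), i.e. every row except the last one
list_open_normalizer = np.transpose(np.array(list_open))[:-1]
# Numerators: prices at time t, i.e. every row except the first one
list_open = np.transpose(np.array(list_open))[1:]
list_close = np.transpose(np.array(list_close))[1:]
list_high = np.transpose(np.array(list_high))[1:]
list_low = np.transpose(np.array(list_low))[1:]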

In [159]:
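# Stack the three ratios, then reverse the axes:
# (features, data points, assets) -> (assets, data points, features)
X = np.transpose(np.array([list_open / list_open_normalizer,
                           list_high / list_open_normalizer,
                           list_low / list_open_normalizer]))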

##### The shape of our tensor is: $m \times \text{data points} \times \text{features}$
- m = 10: number of assets (cryptocurrency pairs)
- data points = 17030: number of data points
- features = 3: the different prices (open, high, low)
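When checking the shape, it helps to remember that `np.transpose` without an `axes` argument reverses the order of all axes, whereas an explicit `axes=(0, 2, 1)` only swaps the last two. A small sketch on a placeholder array (the sizes are arbitrary):

```python
import numpy as np

n_features, n_points, n_assets = 3, 5, 2
stacked = np.ones((n_features, n_points, n_assets))

reversed_axes = np.transpose(stacked)                  # reverse all axes
swapped_last = np.transpose(stacked, axes=(0, 2, 1))   # swap only the last two axes

print(reversed_axes.shape)
print(swapped_last.shape)
```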
    

In [163]:
np.save('./np_data/Cypto_data.npy',X)

# Stock market data preprocessing

In [168]:
data_dir = '/individual_stocks_5yr/'
directory = os.getcwd() + data_dir  # path to the data directory

# List all files of the directory, skipping hidden files (e.g. .DS_Store).
# Filtering while building the list avoids removing items from a list that is being iterated.
file_tages = [file for file in os.listdir(directory) if not file.startswith('.')]

stock_names = [file.split('.')[0] for file in file_tages]
stocks = list(file_tages)


In [197]:
# We will consider only the stocks that have 1259 data points.

kept_stocks = list()
for stock in stock_names:
    try:
        df = pd.read_csv('.' + data_dir + stock + '.csv')
        if len(df) == 1259:
            kept_stocks.append(stock + '.csv')
    except Exception:  # skip files that cannot be read as CSV
        continue

### We consider only a few specific stocks:

- 'ALK_data.csv',
- 'APC_data.csv',
- 'CAG_data.csv',
- 'CDNS_data.csv',
- 'AMT_data.csv'


In [225]:
kept_stocks_sple=[kept_stocks[3], kept_stocks[7], kept_stocks[12], kept_stocks[37], kept_stocks[42]]
kept_stocks_sple

['ALK_data.csv',
 'APC_data.csv',
 'CAG_data.csv',
 'CDNS_data.csv',
 'AMT_data.csv']

In [244]:
open_price = list()
close_price = list()
high_price = list()
low_price = list()

for s in kept_stocks_sple:
    # Back-fill missing values; fillna('bfill') would insert the literal string 'bfill' instead
    data = pd.read_csv('.' + data_dir + s).bfill()
    data = data[['open', 'close', 'high', 'low']]
    open_price.append(data.open.values)
    close_price.append(data.close.values)
    low_price.append(data.low.values)
    high_price.append(data.high.values)

In [245]:
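# Numerators: prices at time t (every row except the first one)
array_close_price = np.array(close_price).T[1:]
array_open_price = np.array(open_price).T[1:]
array_high_price = np.array(high_price).T[1:]
array_low_price = np.array(low_price).T[1:]
# Normalizer: open(t-1) (every row except the last one), matching the ratios defined in the outline
array_normalizer_open_price = np.array(open_price).T[:-1]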

In [257]:
X = np.transpose(np.array([array_open_price/array_normalizer_open_price,\
              array_close_price/array_normalizer_open_price,\
              array_high_price/array_normalizer_open_price,\
              array_low_price/array_normalizer_open_price]),axes=(0,2,1))
X.shape

(4, 5, 1258)

In [258]:
np.save('./np_data/S&P500_data.npy',X)
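`np.save` does not create missing directories, so the `np_data` folder must exist before the call. A minimal sketch of a save-and-reload round trip, using a temporary directory and a random placeholder array (`X_demo` and the file name are hypothetical):

```python
import os
import tempfile

import numpy as np

# Placeholder tensor with a (features, stocks, data points) layout.
X_demo = np.random.rand(4, 5, 8)

with tempfile.TemporaryDirectory() as tmp:
    out_dir = os.path.join(tmp, "np_data")
    os.makedirs(out_dir, exist_ok=True)  # create the folder if it is missing
    path = os.path.join(out_dir, "demo_data.npy")
    np.save(path, X_demo)
    X_loaded = np.load(path)

print(X_loaded.shape, np.array_equal(X_demo, X_loaded))
```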