In [1]:
import os
import time
import math
import helper
import datetime
import requests
import bs4 as bs
import numpy as np
import pandas as pd
import pickle as pkl
import datetime as dt
import tensorflow as tf
from sklearn import metrics
from matplotlib import style
import yahoo_finance as yahoo
from datetime import timedelta
import matplotlib.pyplot as plt
import matplotlib.cbook as cbook
import matplotlib.dates as mdates
from matplotlib.finance import candlestick_ohlc
from IPython.display import display, Math, Latex

style.use('ggplot')



# 1. Stock features description 

#### Basic daily information 
* *Open*: The first trading price when the market opens for the day.

* *Day High and Low*: The price range within which the stock traded during the day, that is, the maximum and minimum prices that buyers paid for the stock.

* *Close*: The last trading price when the market closes for the day. This is also the prediction target of this research.

* *Volume*: The total number of shares traded during the day.

* *Adj Close*: The stock's closing price on a given trading day, amended to include any distributions and corporate actions that occurred before the next day's open.

When both a stock's price and its volume increase, interest in the stock is growing and the price trend is upward.

When the price increases but the volume decreases, investors are hesitant to commit to buying, because there are no fundamental or psychological factors strong enough to move the larger market participants. In this situation the price trend is likely to change.

When the price decreases and the volume increases, some force is increasing the tendency to sell the stock, which possibly signals a downtrend.

When both the price and the volume decrease, investors are hesitant to commit to selling.
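
The four price and volume cases above can be sketched as a small lookup. This is only an illustration: the function name `price_volume_signal`, its arguments, and the returned labels are hypothetical and are not used elsewhere in this notebook.

```python
def price_volume_signal(price_change: float, volume_change: float) -> str:
    """Map the signs of a day's price and volume changes to the four cases above."""
    if price_change > 0 and volume_change > 0:
        return "rising interest, uptrend"
    if price_change > 0 and volume_change < 0:
        return "hesitant buyers, trend may change"
    if price_change < 0 and volume_change > 0:
        return "selling pressure, possible downtrend"
    if price_change < 0 and volume_change < 0:
        return "hesitant sellers"
    # A zero change in either quantity is not covered by the four cases
    return "no clear signal"


print(price_volume_signal(1.5, 20000))
print(price_volume_signal(-0.8, 35000))
```

In practice the two arguments could be taken from `df['Close'].diff()` and `df['Volume'].diff()`, assuming both columns are available.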



#### Technical indicators (not finished)
A technical indicator is any metric whose value is derived from generic price activity in a stock or asset. Technical indicators try to predict future price levels, or simply the general price direction, of a security by looking at past patterns. 

- Relative Strength Index (RSI): a measure of whether a stock is overbought or oversold. 
- On-Balance Volume (OBV): measures the positive and negative flow of volume in a security, relative to its price over time. It is a simple measure that keeps a cumulative total volume by adding or subtracting each period's volume, depending on the price movement. This measure expands on the basic volume measure by combining volume and price movement. The idea behind this indicator is that volume precedes price movement, so if a security is seeing an increasing OBV, it is a signal that volume is increasing on upward price moves. Decreases mean that the security is seeing increasing volume on down days.
- Accumulation/Distribution Line: 
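
The OBV description above (add the period's volume on an up move, subtract it on a down move) can be sketched with pandas as follows. The function name `on_balance_volume` and the toy data are illustrative assumptions. Starting the running total at zero and ignoring unchanged days are conventions chosen for this sketch.

```python
import numpy as np
import pandas as pd


def on_balance_volume(close: pd.Series, volume: pd.Series) -> pd.Series:
    """Cumulative volume, signed by the direction of each period's price move."""
    # +1 on an up move, -1 on a down move, 0 when unchanged or on the first row
    direction = np.sign(close.diff()).fillna(0)
    return (direction * volume).cumsum()


# Toy data, not taken from the IBM file
close = pd.Series([10.0, 11.0, 10.5, 10.5, 12.0])
volume = pd.Series([100, 150, 120, 80, 200])
obv = on_balance_volume(close, volume)
print(obv.tolist())  # [0.0, 150.0, 30.0, 30.0, 230.0]
```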

| Indicator | Name | Description | Formula |
| :- | :- | :- | :------------------: |
| WILLR | Williams %R | Shows where today's closing price falls within the range of the past 10 days' trading | `(highest - close) / (highest - lowest) * 100` |
| ROCR | Rate of Change | Expresses the current price as a percentage of the price n periods in the past | `(Price(t) / Price(t-n)) * 100` |
| MOM | Momentum | Compares the current price with the price a selected number of periods ago | `Price(t) - Price(t-n)` |
| RSI | Relative Strength Index | Signals overbought and oversold market conditions | |
| CCI | Commodity Channel Index | Identifies cyclical turns in the stock price | |
| ADX | Average Directional Index | Detects whether a trend is developing | |
| TRIX | Triple Exponential Moving Average | Smooths out insignificant movements | |
| MACD | Moving Average Convergence Divergence | Uses different EMAs to signal buy and sell | |
| OBV | On-Balance Volume | Relates trading volume to price change | |
| TSF | Time Series Forecasting | Calculates the linear regression of the 20-day price | |
| ATR | Average True Range | Shows the volatility of the market | |
| MFI | Money Flow Index | Relates typical price to volume | |
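
The three formulas given in the table (WILLR, ROCR, MOM) translate directly into pandas. The sketch below is not the notebook's own feature code. The function names and the toy series are assumptions, and the rolling window is assumed to include the current day.

```python
import pandas as pd


def willr(high: pd.Series, low: pd.Series, close: pd.Series, n: int = 10) -> pd.Series:
    """(highest - close) / (highest - lowest) * 100 over the last n periods."""
    highest = high.rolling(window=n).max()
    lowest = low.rolling(window=n).min()
    return (highest - close) / (highest - lowest) * 100


def rocr(price: pd.Series, n: int = 10) -> pd.Series:
    """Current price as a percentage of the price n periods earlier."""
    return price / price.shift(n) * 100


def mom(price: pd.Series, n: int = 10) -> pd.Series:
    """Difference between the current price and the price n periods earlier."""
    return price - price.shift(n)


# Toy data with a short window so the values are easy to check by hand
high = pd.Series([11.0, 12.0, 13.0, 14.0])
low = pd.Series([9.0, 10.0, 11.0, 12.0])
close = pd.Series([10.0, 11.0, 12.0, 13.0])

w = willr(high, low, close, n=2)
r = rocr(close, n=2)
m = mom(close, n=2)
print(r.iloc[2], m.iloc[2])  # 120.0 2.0
```

The first `n - 1` values of `willr` and the first `n` values of `rocr` and `mom` are `NaN`, because the look-back window is not yet full.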

* Types of features:

    1) Price change: ROCR, MOM

    2) Stock trend discovery: ADX, MFI

    3) Buy and Sell signals: WILLR, RSI, CCI, MACD

    4) Volatility signal: ATR

    5) Volume weights: OBV

    6) Noise elimination and data smoothing: TRIX

#### Dow theory

1. The markets have three basic movements:
    - Primary trend: the major market movement, lasting from many months to many years
    - Secondary trend: a medium movement, lasting from ten days to three months
    - Minor trend: the smallest movement, lasting from a few hours to one month
2. The market trends have three phases:
    - Accumulation phase: occurs at the beginning of an uptrend or downtrend, when investors start buying or selling stocks
    - Public participation phase: occurs when the stock is driven by some factors. In this phase, more investors participate
    - Distribution or panic phase: prices have reached their peak or bottom. In this phase, investors sell in an uptrend and buy in a downtrend. There is a consensus in the market that the stock is trading above or below its fundamental valuation.
    <img src="./market_phase.png">
    
    The above chart shows the three phases of the market trend for Goldman Sachs’ (GS) stock.
3. Stock prices reflect all news
4. Financial market indexes should agree with each other
5. Market trends should be confirmed by volume.
6. Market trends reverse after giving strong signals

#### Elliott Wave Theory

The theory proposes that market trends move in repeated cycles. The cycles are classified by how long they last. 

The cycles include:
- Grand Super cycle – This is the longest wave. It takes place over many centuries.
- Super cycle – This wave lasts for many decades. It lasts ~40–70 years.
- Cycle – This wave lasts from one to several years.
- Primary – This wave lasts for a few months to a couple of years.
- Intermediate – This wave lasts for a few weeks to a few months.
- Minor – This wave lasts for a few weeks.
- Minute – This wave lasts for a few days.
- Minuette – This wave lasts for a few hours.
- Subminuette – This wave lasts for a few minutes.

A cycle has two phases:
- Impulsive phase
- Corrective phase

<img src="./elliot.png">

The above chart shows the two phases of the Elliott Wave for Home Depot’s (HD) stock.

The impulsive wave consists of five waves. It forms higher highs in an uptrend and lower highs in a downtrend.

The corrective wave consists of three waves. It forms lower highs in an uptrend and higher highs in a downtrend.

#### Overbought and oversold



# 2. Data collection and understanding

## 2.1 Amazon 

## 2.2 Woolworth Limited

## 2.3 Myer Limited

In [6]:
# Load the IBM price history and index it by the parsed 'Date.1' column
df = pd.read_csv('./data/ibm/full_ibm.csv')
df['Date.1'] = pd.to_datetime(df['Date.1'])
df.set_index('Date.1', inplace=True)
# Keep only Open, Close and DJIA
df.drop(['Date','High', 'Adj Close', 'NDX', 'SP500', 'Volume', 'Low'], axis=1, inplace=True)
df.head()

Date.1,Open,Close,DJIA
2000-01-03,112.4375,116.0,11357.509766
2000-01-04,114.0,112.0625,10997.929688
2000-01-05,112.9375,116.0,11122.650391
2000-01-06,118.0,114.0,11253.259766
2000-01-07,117.25,113.5,11522.55957


In [7]:
df.describe()

Unnamed: 0,Open,Close,DJIA
count,4277.0,4277.0,4277.0
mean,126.495747,126.581206,12214.937212
std,40.516453,40.535799,2942.788258
min,54.650002,55.07,6547.049805
25%,91.169998,91.339996,10259.740234
50%,116.5,116.540001,11151.339844
75%,161.669998,161.820007,13553.730469
max,215.380005,215.800003,19974.619141


In [9]:
# Rolling statistics of the closing price
df['ma7'] = df['Close'].rolling(window=7).mean()
df['ma30'] = df['Close'].rolling(window=30).mean()
df['std30'] = df['Close'].rolling(window=30).std()

# Feature table: keep Open, DJIA and std30; drop same-day Close and the unshifted averages
df_train = df.drop(['ma7', 'Close', 'ma30'],axis=1)

# 7-day moving average lagged by 11, 18, 25 and 32 trading days
df_train['ma7_1'] = df['ma7'].shift(periods = 11, axis=0)
df_train['ma7_2'] = df['ma7'].shift(periods = 18, axis=0)
df_train['ma7_3'] = df['ma7'].shift(periods = 25, axis=0)
df_train['ma7_4'] = df['ma7'].shift(periods = 32, axis=0)

# NOTE: the ma30_* columns are also lagged copies of 'ma7', not of 'ma30';
# the outputs shown below were produced with this definition
df_train['ma30_1'] = df['ma7'].shift(periods = 62, axis=0)
df_train['ma30_2'] = df['ma7'].shift(periods = 92, axis=0)
df_train['ma30_3'] = df['ma7'].shift(periods = 112, axis=0)
df_train['ma30_4'] = df['ma7'].shift(periods = 142, axis=0)

# Closing price of the previous 1, 2 and 3 trading days
df_train['Close_1'] = df['Close'].shift(periods = 1, axis=0)
df_train['Close_2'] = df['Close'].shift(periods = 2, axis=0)
df_train['Close_3'] = df['Close'].shift(periods = 3, axis=0)
print(df_train.shape)
df_train.head()

(4277, 14)


Date.1,Open,DJIA,std30,ma7_1,ma7_2,ma7_3,ma7_4,ma30_1,ma30_2,ma30_3,ma30_4,Close_1,Close_2,Close_3
2000-01-03,112.4375,11357.509766,,,,,,,,,,,,
2000-01-04,114.0,10997.929688,,,,,,,,,,116.0,,
2000-01-05,112.9375,11122.650391,,,,,,,,,,112.0625,116.0,
2000-01-06,118.0,11253.259766,,,,,,,,,,116.0,112.0625,116.0
2000-01-07,117.25,11522.55957,,,,,,,,,,114.0,116.0,112.0625
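
The lagged columns above all follow one pattern: shift a series down by `k` rows so that the row for day `t` only sees values from day `t - k`. A compact way to express that pattern is sketched below. The helper `add_lagged` and its arguments are hypothetical and are not part of this notebook.

```python
import pandas as pd


def add_lagged(features: pd.DataFrame, source: pd.Series, lags: dict) -> pd.DataFrame:
    """Return a copy of `features` with one shifted copy of `source` per entry in `lags`.

    `lags` maps a new column name to the number of rows to shift by.
    """
    out = features.copy()
    for name, periods in lags.items():
        out[name] = source.shift(periods)
    return out


# Toy series standing in for df['Close']
prices = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0], name="Close")
demo = add_lagged(pd.DataFrame(index=prices.index), prices, {"Close_1": 1, "Close_2": 2})
print(demo)
```

Each shifted column starts with as many `NaN` rows as its lag, which is why the first rows of `df_train` above are mostly empty.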


In [12]:
df_train.dropna(how='any',inplace=True)
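
The number of rows removed by `dropna` follows from the deepest feature: a 7-day rolling mean leaves 6 leading `NaN` values, and shifting it by 142 rows adds 142 more, so 148 of the 4277 rows are dropped. The check below reproduces that arithmetic on a synthetic series (the values themselves are placeholders, not IBM prices).

```python
import numpy as np
import pandas as pd

n_rows = 4277  # number of rows reported by df.describe() above
close = pd.Series(np.arange(n_rows, dtype=float))  # placeholder values

ma7 = close.rolling(window=7).mean()  # first 6 values are NaN
deepest = ma7.shift(142)              # first 6 + 142 = 148 values are NaN

dropped = int(deepest.isna().sum())
remaining = int(deepest.notna().sum())
print(dropped, remaining)  # 148 4129
```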

In [14]:
df_train.describe()

Unnamed: 0,Open,DJIA,std30,ma7_1,ma7_2,ma7_3,ma7_4,ma30_1,ma30_2,ma30_3,ma30_4,Close_1,Close_2,Close_3
count,4129.0,4129.0,4129.0,4129.0,4129.0,4129.0,4129.0,4129.0,4129.0,4129.0,4129.0,4129.0,4129.0,4129.0
mean,127.00557,12268.737885,3.521462,126.904779,126.805701,126.719079,126.645318,126.32323,126.014579,125.751258,125.484766,127.083778,127.070191,127.057128
std,41.129646,2980.199877,2.051704,41.04625,41.02974,41.008755,40.989074,40.946874,40.870083,40.815341,40.754449,41.144277,41.140487,41.136626
min,54.650002,6547.049805,0.592571,57.541429,57.541429,57.541429,57.541429,57.541429,57.541429,57.541429,57.541429,55.07,55.07,55.07
25%,90.5,10241.019531,2.052683,90.584286,90.584286,90.584286,90.584286,90.584286,90.584286,90.584286,90.584286,90.550003,90.550003,90.550003
50%,117.139999,11239.769531,3.048959,116.838572,116.714286,116.581428,116.514288,115.952857,115.585713,115.085714,115.011428,117.400002,117.379997,117.360001
75%,162.339996,13712.209961,4.533557,162.192856,162.124285,161.909999,161.909999,161.909999,161.909999,161.81,161.81,162.300003,162.279999,162.279999
max,215.380005,19974.619141,13.560802,213.824286,213.824286,213.824286,213.824286,213.824286,213.824286,213.824286,213.824286,215.800003,215.800003,215.800003


In [15]:
df_train.head()

Date.1,Open,DJIA,std30,ma7_1,ma7_2,ma7_3,ma7_4,ma30_1,ma30_2,ma30_3,ma30_4,Close_1,Close_2,Close_3
2000-08-03,113.0,10706.580078,4.542157,104.589286,106.8125,113.214271,117.754457,109.875,111.678571,113.196429,115.508929,114.25,110.5,112.25
2000-08-04,116.0,10767.75,4.674236,106.767857,105.098214,112.874985,117.281242,109.75,113.625,111.892857,116.008929,116.0,114.25,110.5
2000-08-07,116.625,10867.009766,4.81781,108.214286,104.392857,112.169628,116.660714,109.392857,116.178571,110.285714,116.892857,115.875,116.0,114.25
2000-08-08,115.6875,10976.889648,5.032185,109.428571,103.607143,111.839271,115.589286,108.107143,117.964286,108.285714,117.410714,116.3125,115.875,116.0
2000-08-09,119.0,10905.830078,5.287005,110.580357,103.455357,110.857128,115.0,107.116071,118.857143,106.535714,117.660714,118.875,116.3125,115.875


In [17]:
close = df[["Close"]].loc[df_train.index]

In [21]:
# Tick locators: major ticks every year, minor ticks every month
years = mdates.YearLocator()
months = mdates.MonthLocator()
yearsFmt = mdates.DateFormatter('%Y')
plt.close()  # discard any figure left over from an earlier cell
fig, ax = plt.subplots(figsize=(10,5))
ax.plot(close.index, close['Close'], color = 'b', label='IBM stock')
# ax.text(index[1], abs_error.max() - 0.01, "MAE={:7.5f}".format(mae), style='italic',
#         bbox={'facecolor':'blue', 'alpha':0.5, 'pad':10})
ax.xaxis.set_major_locator(years)
ax.xaxis.set_major_formatter(yearsFmt)
ax.xaxis.set_minor_locator(months)
# Extend the x-axis to whole calendar years around the data
datemin = datetime.date(close.index.min().year, 1, 1)
datemax = datetime.date(close.index.max().year + 1, 1, 1)
ax.set_xlim(datemin, datemax)
ax.format_xdata = mdates.DateFormatter('%d-%m-%Y')
ax.set_xlabel('Time')
ax.set_ylabel('Price')
ax.legend(loc='upper right', shadow=False)
ax.grid(True)
fig.autofmt_xdate()
plt.savefig("IBM")