### Importing Libraries

In [1]:
!pip install yfinance
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')
%matplotlib inline



### Loading Dataset from Yahoo Finance

In [2]:
import yfinance as yf
msft = yf.Ticker("MSFT")
df1 = msft.history(period="5y")

df1 = df1.reset_index()
df1.head()

,Date,Open,High,Low,Close,Volume,Dividends,Stock Splits
0,2015-07-13,40.71,41.29,40.68,41.21,28178300,0.0,0
1,2015-07-14,41.13,41.59,41.01,41.29,22880300,0.0,0
2,2015-07-15,41.34,41.53,41.12,41.41,26629600,0.0,0
3,2015-07-16,41.64,42.26,41.6,42.23,26271700,0.0,0
4,2015-07-17,42.13,42.34,41.87,42.19,29467100,0.0,0


### Preprocessing and Feature Extraction

In [3]:
df = df1['Open'].values
df = df.reshape(-1, 1)

**Splitting Dataset**

In [4]:
dataset_train = np.array(df[:int(df.shape[0]*0.8)])
dataset_test = np.array(df[int(df.shape[0]*0.8):])  # 1007 training rows & 252 test rows
print(dataset_train.shape)
print(dataset_test.shape)

(1007, 1)
(252, 1)


**Feature Scaling**

In [5]:
from sklearn.preprocessing import MinMaxScaler
sc = MinMaxScaler()
dataset_train = sc.fit_transform(dataset_train)
# Reuse the scaler fitted on the training data so both splits share one scale
dataset_test = sc.transform(dataset_test)
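
Min-max scaling maps each value to `(x - min) / (max - min)`. The following is a minimal sketch of that arithmetic with plain NumPy on made-up prices, assuming the minimum and maximum are taken from the training split and reused for the test split. `fit_min_max` and `apply_min_max` are hypothetical helpers written for illustration; they are not part of scikit-learn.

```python
import numpy as np

def fit_min_max(train):
    """Return the (min, max) pair observed in the training values."""
    return float(train.min()), float(train.max())

def apply_min_max(values, lo, hi):
    """Scale values with a previously fitted (min, max) pair."""
    return (values - lo) / (hi - lo)

train = np.array([[10.0], [20.0], [30.0]])  # made-up training prices
test = np.array([[25.0], [40.0]])           # made-up test prices

lo, hi = fit_min_max(train)
train_scaled = apply_min_max(train, lo, hi)  # values: 0.0, 0.5, 1.0
test_scaled = apply_min_max(test, lo, hi)    # values: 0.75, 1.5

print(train_scaled.ravel())
print(test_scaled.ravel())
```

Test values can fall outside the 0 to 1 range (1.5 here) when a test price exceeds the training maximum; that is expected when the scaler only sees the training data.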

**Creating timesteps for LSTM**<br>
Each sample in `x` holds 50 consecutive values from the dataset, and the matching entry in `y` is the 51st value, i.e. the next day's price to predict.

In [6]:
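def create_dataset(df):
    # Build sliding windows: 50 past values as input, the following value as target
    x = []
    y = []
    for i in range(50, df.shape[0]):
        x.append(df[i-50:i, 0])  # values at positions i-50 .. i-1
        y.append(df[i, 0])       # value at position i
    x = np.array(x)
    y = np.array(y)
    return x, y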
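
To make the windowing easier to follow, here is a small sketch of the same idea on toy data with a window of 3 instead of 50. `make_windows` is a hypothetical helper that mirrors `create_dataset` with the window size as a parameter; the toy series is made up.

```python
import numpy as np

def make_windows(series, window):
    """Split an (n, 1) array into overlapping windows and next-step targets."""
    x, y = [], []
    for i in range(window, series.shape[0]):
        x.append(series[i - window:i, 0])  # the `window` values before position i
        y.append(series[i, 0])             # the value at position i is the target
    return np.array(x), np.array(y)

series = np.arange(6, dtype=float).reshape(-1, 1)  # toy data: 0, 1, ..., 5
x, y = make_windows(series, window=3)

print(x.shape, y.shape)  # (3, 3) (3,)
print(x[0], y[0])        # first window holds 0, 1, 2 and its target is 3

# LSTM layers expect input shaped (samples, timesteps, features)
x3d = x.reshape(x.shape[0], x.shape[1], 1)
print(x3d.shape)         # (3, 3, 1)
```

A series of length n yields n - window samples, which is why the first 50 rows of each split produce no target of their own.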

In [7]:
x_train, y_train = create_dataset(dataset_train)
x_test, y_test = create_dataset(dataset_test)

**Reshaping Features**

In [8]:
x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))
x_test = np.reshape(x_test, (x_test.shape[0], x_test.shape[1], 1))

### Training Model using LSTM

In [9]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import LSTM  
from tensorflow.keras.layers import Dropout

In [10]:
model = Sequential()
model.add(LSTM(units=96, return_sequences=True, input_shape=(x_train.shape[1], 1)))
model.add(Dropout(0.2))
model.add(LSTM(units=96, return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(units=96, return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(units=96))
model.add(Dropout(0.2))
model.add(Dense(units=1))
model.compile(loss='mean_squared_error', optimizer='adam')

In [11]:
model.fit(x_train, y_train, epochs=20, batch_size=32)

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<tensorflow.python.keras.callbacks.History at 0x7fd595b3d6d8>

In [12]:
predictions = model.predict(x_test)
# Map scaled predictions and scaled targets back to the original price range
predictions = sc.inverse_transform(predictions)
preds_y_test = sc.inverse_transform(y_test.reshape(-1, 1))

In [13]:
# Predicted prices
pred_y_test = pd.DataFrame(data=predictions)
pred_y_test.columns = ['pred']

# Actual test prices, already converted back to the original price range
scaled_y_test = pd.DataFrame(data=preds_y_test)
scaled_y_test.columns = ['pred']

### Visualizing Results

In [14]:
from plotly.offline import iplot
from plotly import graph_objs as go
import plotly.express as px


fig = go.Figure()
fig.add_trace(go.Scatter(y=scaled_y_test.pred,
                    mode='lines',
                    name='Actual'))
fig.add_trace(go.Scatter(y=pred_y_test.pred,
                    mode='lines',
                    name='Predicted'))

fig.update_traces(mode='lines')

fig.show()

### Moving Average
Moving averages are used in technical analysis by statisticians, time series analysts, traders and investors.<br>
They show the primary behaviour of the data: whether it is in an uptrend, a downtrend or moving sideways.<br>
Two types of moving average are widely used: ***Simple Moving Average*** and ***Exponential Moving Average***.<br>
<br>
***Simple Moving Average*** (SMA) is calculated by summing a fixed-size set of the most recent observations and dividing<br>
the sum by the number of observations in that set. It is a plain arithmetic mean, so every observation in the set carries<br>
the same weight and none is emphasised over another. As new observations enter the set,<br>
the oldest ones drop out and no longer affect the average.<br>
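
The SMA calculation can be sketched in plain Python on made-up prices. `simple_moving_average` is a hypothetical helper for illustration; it returns `None` where a full window is not yet available, which corresponds to the `NaN` values pandas shows for the first rows of a rolling mean.

```python
def simple_moving_average(values, window):
    """Return the SMA series; positions without a full window are None."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)  # not enough observations yet
        else:
            chunk = values[i + 1 - window:i + 1]  # the last `window` values
            out.append(sum(chunk) / window)
    return out

prices = [10.0, 11.0, 12.0, 13.0, 14.0]  # made-up prices
sma3 = simple_moving_average(prices, window=3)
print(sma3)  # [None, None, 11.0, 12.0, 13.0]
```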

Let's visualize the moving average plots for one year of data.

In [15]:
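def filter_date(s_date, e_date):
    # Boolean masks for rows on or after the start date and on or before the end date
    start_date = df1["Date"] >= s_date
    end_date = df1["Date"] <= e_date
    betw_dates = start_date & end_date

    # Copy so that columns added later do not modify a view of df1
    df = df1.loc[betw_dates].copy()
    return df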

df = filter_date("2019-1-1","2020-1-1")

In [16]:
# Simple moving averages of the Open price over 20, 50 and 100 periods
df['SM20'] = df['Open'].rolling(window=20).mean()
df['SM50'] = df['Open'].rolling(window=50).mean()
df['SM100'] = df['Open'].rolling(window=100).mean()

In [17]:
df.head()

,Date,Open,High,Low,Close,Volume,Dividends,Stock Splits,SM20,SM50,SM100
875,2019-01-02,97.56,99.72,96.96,99.1,35329300,0.0,0,,,
876,2019-01-03,98.1,98.19,95.26,95.45,42579100,0.0,0,,,
877,2019-01-04,97.73,100.46,96.95,99.89,44060600,0.0,0,,,
878,2019-01-07,99.61,101.21,98.96,100.02,35656100,0.0,0,,,
879,2019-01-08,100.98,101.89,99.68,100.75,31514400,0.0,0,,,


In [18]:
fig = go.Figure()
fig.add_trace(go.Scatter(x=df.Date,
                    y=df.Open,
                    mode='lines',
                    name='Open'))
fig.add_trace(go.Scatter(x=df.Date,
                         y=df.SM20,
                    mode='lines',
                    name='SM20'))
fig.add_trace(go.Scatter(x=df.Date,
                         y=df.SM50,
                    mode='lines',
                    name='SM50'))
fig.add_trace(go.Scatter(x=df.Date,
                         y=df.SM100,
                    mode='lines',
                    name='SM100'))

fig.update_traces(mode='lines')

fig.show()

In ***Exponential Moving Average*** (EMA), more weight is given to recent prices, so it is more responsive to the
latest price action than SMA. Its calculation is more involved than the definition suggests: a smoothing factor weights the
current day's price, and the remaining weight goes to the previous day's EMA, so that the two weights always sum to one.
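
The recursion can be sketched in plain Python on made-up prices. The sketch assumes the common smoothing factor `alpha = 2 / (span + 1)` and seeds the series with the first price, which is the recursion pandas uses for `ewm(span=..., adjust=False).mean()`. `exponential_moving_average` is a hypothetical helper for illustration.

```python
def exponential_moving_average(values, span):
    """EMA with alpha = 2 / (span + 1), seeded with the first value."""
    alpha = 2.0 / (span + 1)
    out = [values[0]]
    for price in values[1:]:
        # alpha weights today's price, (1 - alpha) weights yesterday's EMA
        out.append(alpha * price + (1 - alpha) * out[-1])
    return out

prices = [10.0, 10.0, 10.0, 20.0]  # made-up prices with a jump on the last day
ema3 = exponential_moving_average(prices, span=3)
print(ema3)  # [10.0, 10.0, 10.0, 15.0]
```

After the jump to 20 the EMA moves to 15.0, while a 3-period simple average of the last three prices would be about 13.33, which illustrates how the EMA reacts faster to the latest price.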

In [19]:
# Exponential moving averages of the Open price over 20, 50 and 100 periods
# (column index 2 is High, so select Open by name to match the plot below)
df['EMA20'] = df['Open'].ewm(span=20, adjust=False).mean()
df['EMA50'] = df['Open'].ewm(span=50, adjust=False).mean()
df['EMA100'] = df['Open'].ewm(span=100, adjust=False).mean()

In [20]:
fig = go.Figure()

fig.add_trace(go.Scatter(x=df.Date,
                         y=df.Open,
                    mode='lines',
                    name='Open'))
fig.add_trace(go.Scatter(x=df.Date,
                         y=df.EMA20,
                    mode='lines',
                    name='EMA20'))
fig.add_trace(go.Scatter(x=df.Date,
                         y=df.EMA50,
                    mode='lines',
                    name='EMA50'))
fig.add_trace(go.Scatter(x=df.Date,
                         y=df.EMA100,
                    mode='lines',
                    name='EMA100'))

fig.update_traces(mode='lines')

fig.show()

### Conclusion
The SMA and EMA plots above look very similar, but this does not mean that the two moving averages perform the same way.<br>
***Exponential Moving Average*** is typically used with a small amount of data to capture the latest price action, while<br>
***Simple Moving Average*** is typically used with a large amount of data to show the long-term behaviour of the data. Because<br>
the data here was not separated into daily, weekly or monthly time frames, the difference between the two<br>
cannot be seen clearly.