# Theory:

<hr>

<p> 
    <font color="tomato">
        Volume Weighted Average Price [VWAP]:
    </font>
    <br>
    
    It is a technique used by stock traders to reduce noise and identify the underlying price trend. 
    It weights each traded price by the volume traded at that price.
</p>
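
As a minimal sketch of the idea, VWAP over a set of trades can be computed as the total traded value divided by the total traded volume. The function name `vwap` and the sample numbers are illustrative assumptions and are not taken from the NSE dataset.

```python
def vwap(prices, volumes):
    """Return the volume weighted average price of a set of trades."""
    total_volume = sum(volumes)
    if total_volume == 0:
        raise ValueError("total volume must be positive")
    # Each price contributes in proportion to the quantity traded at it.
    traded_value = sum(p * v for p, v in zip(prices, volumes))
    return traded_value / total_volume


# Hypothetical trades: three prices with their traded quantities.
prices = [100.0, 102.0, 101.0]
volumes = [200, 100, 700]
print(vwap(prices, volumes))
```

Because most of the volume traded at 101, the result sits closer to 101 than the plain average of the three prices would suggest.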


<p> 
    <font color="tomato">
        Auto Regressive Integrated Moving Average [ARIMA]:
    </font>
    <br>
    
    ARIMA is a very popular statistical method for time series forecasting. 
    ARIMA models take into account the past values to predict the future values.
    
    There are three important parameters in ARIMA:
        - p (number of past values used to forecast the next value)
        - d (order of differencing)
        - q (number of past forecast errors used to predict future values)
</p>
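
To illustrate the differencing parameter d, the sketch below replaces each value with its change from the previous value, repeated d times. This is what removes a trend before the p and q terms are fitted. The helper name `difference` and the sample series are assumptions made for illustration only.

```python
def difference(series, d=1):
    """Apply first-order differencing d times and return the result."""
    values = list(series)
    for _ in range(d):
        # Each pass shortens the series by one element.
        values = [curr - prev for prev, curr in zip(values, values[1:])]
    return values


linear_trend = [10, 12, 14, 16, 18]
print(difference(linear_trend, d=1))     # constant steps remain

quadratic_trend = [1, 4, 9, 16, 25]
print(difference(quadratic_trend, d=2))  # a second pass is needed here
```

A linear trend becomes constant after one pass, while a quadratic trend needs two, which is why d is chosen as the smallest order that makes the series look stationary.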

## Dataset:
<hr>

<p>
    The dataset was collected from the National Stock Exchange of India [NSE].<br>
    Link: https://www.nseindia.com/products/content/equities/equities/eq_security.htm
    
    Companies:
        - Biocon
        - Britannia
        - Coal India
        - Eicher Motors
        - Heidelberg
        - ICICI Bank
        - ITC
        - Maruti
        - Priya Village Roadshow (PVR)
        - SBI
</p>


# Code:
<hr>

### Global Variables:

In [None]:
rmseStock = {}

### Imports and Global Settings:

In [None]:
# import packages
import numpy as np
import pandas as pd

from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()

# to plot charts and figures
import matplotlib.pyplot as plt
%matplotlib inline


In [None]:
# Matplotlib setting to adjust the plot size
from matplotlib.pylab import rcParams
rcParams['figure.figsize'] = 20,10


In [None]:
# Importing the ARIMA Library
from pmdarima.arima import auto_arima

In [None]:
# Data Normalisation
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler(feature_range=(0, 1))


### Helper Functions:

In [None]:
# Read the Dataset
def readCSV(filename):
    # A forward slash works on Windows, Linux, and macOS
    filename = "Dataset/" + filename + ".csv"
    df = pd.read_csv(filename)
    df = cleanDate(df)
    return df    


# Parse the 'Date' column (format dd-Mon-yyyy) into datetime and use it as the index
def cleanDate(df):
    df['Date'] = pd.to_datetime(df.Date,format='%d-%b-%Y')
    df.index = df['Date']
    return df
    

# Helper function to plot VWAP for stocks
def plotChart(df):
    plt.figure(figsize=(24, 8))
    plt.plot(df['Average Price'], label='VWAP')
    plt.xlabel("Date")
    plt.ylabel("Volume Weighted Average Price")
    plt.legend()
    
    
# Print the entire dataset
def printEntireData(df):
    print(df)
    plotChart(df)
    

# Print a sample of the dataset
def printSampleData(df):
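    length = len(df)
    # Positions of the columns to display (named `cols` to avoid shadowing the built-in `list`)
    cols = [0, 5, 6, 9, 10]
    df1 = df[0:5]
    print(df1[df1.columns[cols]])
    print(".\n.\n.")
    # Last five rows, independent of the dataset length
    df2 = df[-5:]
    print(df2[df2.columns[cols]])
    print("\nDisplaying 10 out of {} rows.".format(length))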


## Implementation of ARIMA model:

<hr>

In [None]:
def arimaModel(df, stock):
    data = df.sort_index(ascending=True, axis=0)

    # Total data: 992
    endTrainingValue = 962
    train = data[:endTrainingValue]
    valid = data[endTrainingValue:]

    predictionPeriod = len(data) - len(train)

    training = train['Average Price']
    validation = valid['Average Price']

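    # auto_arima searches over the orders and returns a model already fitted on `training`,
    # so no separate call to fit() is needed
    model = auto_arima(training, start_p=1, start_q=1, max_p=3, max_q=3,
                       m=12, start_P=0, seasonal=True, d=1, D=1,
                       trace=True, error_action='ignore', suppress_warnings=True)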

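    forecast = model.predict(n_periods = predictionPeriod)
    # Use the raw values so the forecast aligns with the validation dates
    # regardless of the index carried by the object that predict() returns
    forecast = pd.DataFrame(np.asarray(forecast), index = valid.index, columns=['Prediction'])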

    arimaModelError(valid, forecast, stock)
    plotResult(train, valid, forecast, stock)


In [None]:
# Generating Root Mean Square Error (RMSE):
def arimaModelError(valid, forecast, stock):
    rms = np.sqrt(np.mean(np.power((np.array(valid['Average Price']) - np.array(forecast['Prediction'])),2)))
    print("Root Mean Square Error: {}".format(rms))
    rmseStock[stock] = rms
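
As a small worked example of the same metric, the sketch below computes RMSE in plain Python on made-up numbers. The function name `rmse` and the sample values are illustrative assumptions.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical values: errors of -1, 2 and -2 give squared errors 1, 4 and 4.
actual = [100.0, 102.0, 104.0]
predicted = [101.0, 100.0, 106.0]
print(rmse(actual, predicted))  # square root of 3, about 1.732
```

RMSE is expressed in the same units as the price, so a given value means more for a low-priced stock than for a high-priced one.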

In [None]:
# Helper Function to Plot the result of the ARIMA model
def plotResult(train, valid, forecast, stock):
    plt.figure(figsize=(24, 6))
    plt.plot(train['Average Price'], label = "Training Data VWAP")
    plt.plot(valid['Average Price'], label = "Validation Data VWAP")
    plt.plot(forecast['Prediction'], label = "Predicted VWAP")
    plt.title(stock, fontsize = 18)
    plt.legend()


### Reading the Dataset & Executing Models:

In [None]:
stockList = ['BIOCON', 'BRITANNIA', 'COALINDIA', 'EICHERMOT', 'HEIDELBERG', 'ICICIBANK', 'ITC','MARUTI', 'PVR', 'SBIN']

for stock in stockList:
    df = readCSV(stock)
    printSampleData(df)

    # Applying ARIMA Model
    print("Applying ARIMA Model on: {}".format(stock))
    arimaModel(df, stock)
    

In [None]:
print(rmseStock)

## Inference:

For stocks in seasonal sectors, such as Maruti (Automobile) and Priya Village Roadshow (Entertainment), the model could not fit well and produced large RMSE values.

Stocks in other sectors, such as Biocon (Pharmaceutical), Coal India (Mining), and SBI and ICICI Bank (Banking), produced acceptable RMSE values.