# STOCK PREDICTION - STA, EMA

<p> This section presents predictions of how a stock's Open price, Close price and Volume will evolve. A reader (assumed here to be a stock buyer) can use the model predictions to help decide when to buy or sell. Note, however, that stock prices are highly volatile and show no consistent patterns, which makes them difficult to model. Modelling stock prices near-perfectly over time is impossible, but this has not discouraged researchers, and there are many approaches to stock price prediction. We decided to use two short-term methods, standard averaging (STA) and exponential moving average (EMA), and one long-term method, LSTM (Long Short-Term Memory). This file focuses on short-term prediction via STA and EMA. </p>

<p>The short-term predictions via averaging can be used reliably only as one-step-ahead predictions (in our case, one trading day ahead); using them for more than one time step is not recommended, because it can lead to bad results. Put simply, these methods average historically observed values and use the average as the prediction. In standard averaging (STA), the predicted price at time t+1 equals the average of the last N observed values (e.g. the last 100 days): $x_{t+1}=\frac{1}{N}\sum_{i=t-N+1}^{t}x_{i}$. In the exponential moving average method (EMA), the predicted price at time t+1 is calculated as $x_{t+1}=EMA_{t}=\gamma \cdot EMA_{t-1} + (1-\gamma) \cdot x_{t}$, where $EMA_{0}=0$ and $EMA$ is the exponential moving average value maintained over time. The parameter $\gamma$ is the weight given to the previous average, so a small $\gamma$ makes the most recent observation dominate the prediction. </p>
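The two formulas above can be sketched in a few lines of plain Python. This is only an illustration on a made-up toy series; the helper names `sta_forecast` and `ema_forecast` are hypothetical and do not appear elsewhere in this notebook.

```python
def sta_forecast(values, window):
    """Mean of the last `window` observations, used as the forecast for the next step."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    recent = values[-window:]
    return sum(recent) / window


def ema_forecast(values, gamma):
    """Run EMA_t = gamma * EMA_{t-1} + (1 - gamma) * x_t from EMA_0 = 0; return the last EMA."""
    ema = 0.0
    for x in values:
        ema = gamma * ema + (1.0 - gamma) * x
    return ema


prices = [10.0, 11.0, 12.0, 13.0]           # made-up toy series, not market data
sta_next = sta_forecast(prices, window=2)   # mean of 12.0 and 13.0
ema_next = ema_forecast(prices, gamma=0.5)  # previous average and newest value weighted equally
print(sta_next, ema_next)
```

With these toy inputs the STA forecast is the mean of the last two values, while the EMA forecast also carries a geometrically shrinking trace of every earlier value.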

Short-term predictions are single-point predictions. They are therefore presented only numerically, in a table, whereas the LSTM predictions, being a sequence of predicted values, are shown graphically rather than numerically.

TODO: confirm whether the LSTM prediction is done only for the Close price.

<p>While the averaging methods predict only the next value, the LSTM method predicts a future sequence of values (usually 30 days).  </p>

**Note**: The models are not a "golden rule" to be followed no matter what. They offer some guidance on how the stock market might develop and can help you understand the market. Nevertheless, it is not wise to base your investment strategy solely on the models' predictions, and we, as authors of this project, definitely do not encourage you to do so.

**Disclaimer**: The mathematical methods and approaches used in the prediction analyses in this project are based on measures commonly used in the research area. The authors therefore do not claim that the procedures are their own inventions and take no credit for them.

Issues with the tensorflow package: when installing tensorflow (if you do not already have it installed), you may encounter an error such as *Could not install packages due to an EnvironmentError: [WinError 5] Access is denied:* . There are two possible solutions. Either run the command "pip install tensorflow --user", or change the access permissions of the location where the package is going to be installed. In either case, after running the command or modifying the permissions, it is best to close the command line (or e.g. Jupyter Notebook), open a new one and try to install the package again.

So, let's get started with the prediction analysis. First, there are some packages we need to import (if you do not have them installed, please install them).

In [102]:
import numpy as np
import yfinance as yf
import pandas as pd
#!pip install prettytable
from prettytable import PrettyTable #for table
import SP500_data_downloader as SP
from SP500_data_downloader import *

Next, we download stock market data we will be working with.

**NOTE**: To save time, in this particular ipynb file we download data for only 9 tickers (AAPL, MSFT, GE, IBM, AA, DAL, UAL, PEP, KO) with the help of the function *get_data_try()* (for details on how the function is defined, see SP500_data_downloader.py or Data_S&P500_Yahoo.ipynb in the main branch of our repository). The resulting dataset is denoted *aa*.

In [103]:
aa=get_data_try()

In [104]:
aa

Attributes,Adj Close,Adj Close,Adj Close,Adj Close,Adj Close,Adj Close,Adj Close,Adj Close,Adj Close,Close,...,Open,Volume,Volume,Volume,Volume,Volume,Volume,Volume,Volume,Volume
Symbols,AAPL,MSFT,GE,IBM,AA,DAL,UAL,PEP,KO,AAPL,...,KO,AAPL,MSFT,GE,IBM,AA,DAL,UAL,PEP,KO
Date,,,,,,,,,,,,,,,,,,,,,
2015-01-02,24.714506,41.193836,167.903976,114.906494,37.160912,44.267323,66.339996,76.954910,33.559845,27.332500,...,42.259998,212818400.0,27913900.0,5319704.0,5779673.0,4340408.0,8637300.0,6215000.0,3545700.0,9921100.0
2015-01-05,24.018265,40.815037,164.821960,113.098465,35.008007,43.529228,66.150002,76.376366,33.559845,26.562500,...,42.689999,257142000.0,39673900.0,5464316.0,5104898.0,9026467.0,10556500.0,5033400.0,6441000.0,26292600.0
2015-01-06,24.020521,40.215977,161.270920,110.659393,35.265423,42.503109,64.580002,75.797798,33.814697,26.565001,...,42.410000,263188400.0,36447900.0,8288800.0,6429448.0,8063670.0,12880400.0,6051700.0,6195000.0,16897500.0
2015-01-07,24.357346,40.726929,161.337906,109.936157,36.178066,42.278088,65.529999,78.014191,34.236782,26.937500,...,42.799999,160423600.0,29114100.0,5673525.0,4918083.0,6637744.0,10516200.0,5135000.0,6526300.0,13412300.0
2015-01-08,25.293209,41.925045,163.280914,112.325630,37.207718,43.376221,66.639999,79.432068,34.650898,27.972500,...,43.180000,237458000.0,29645200.0,5619172.0,4431693.0,8185851.0,10499300.0,6889500.0,7131600.0,21743600.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2022-01-31,174.557602,310.980011,94.480003,133.570007,56.709999,39.689999,42.880001,173.520004,61.009998,174.779999,...,60.299999,115541600.0,46444500.0,7001600.0,5859000.0,7206100.0,10657300.0,11871000.0,5908000.0,22045300.0
2022-02-01,174.387817,308.760010,97.949997,135.529999,58.169998,40.490002,43.959999,172.339996,60.560001,174.610001,...,60.910000,86213900.0,40950400.0,8149200.0,6206400.0,5939300.0,9668800.0,9614300.0,5952700.0,20841700.0
2022-02-02,175.616257,313.459991,98.040001,137.250000,59.209999,40.520000,44.119999,175.470001,61.180000,175.839996,...,60.619999,84914300.0,36636000.0,5561400.0,5357200.0,5613800.0,10249800.0,12135300.0,5767000.0,20225600.0
2022-02-03,172.679993,301.250000,98.320000,137.779999,62.740002,39.730000,43.080002,175.369995,61.610001,172.899994,...,60.939999,89418100.0,43730000.0,6213300.0,6100800.0,8076600.0,9922000.0,8016500.0,4632700.0,19440500.0


In [105]:
#checking that it is really a dataframe
isinstance(aa, pd.DataFrame)

True

In [106]:
#defining a function for truncating a number to a specific number of decimal places (extra digits are cut off, not rounded)
def truncate(n, decimals=0):
    multiplier = 10 ** decimals
    return int(n * multiplier) / multiplier
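To make the behaviour of *truncate()* concrete, here is a small sketch comparing it with Python's built-in `round()`. The function body is copied from the cell above; the sample numbers are made up for illustration.

```python
def truncate(n, decimals=0):
    # same definition as in the notebook cell above
    multiplier = 10 ** decimals
    return int(n * multiplier) / multiplier


truncated = truncate(2.678, 2)   # digits after the second decimal place are dropped
rounded = round(2.678, 2)        # the built-in rounds to the nearest value instead
negative = truncate(-2.678, 2)   # int() cuts toward zero, so this is not a floor
print(truncated, rounded, negative)
```

Because the cut happens after a floating-point multiplication, values sitting exactly on a decimal boundary may occasionally lose their last digit; for display purposes this is usually harmless.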


Here, you can change which tickers appear in ticker_list. However, you can choose only from the 9 tickers specified above: any other ticker is not a column of *aa*, so the column lookup in the cells below will fail with a KeyError.

TODO: test what happens to the function when ticker_list is empty.

In [107]:
# define the list of tickers
ticker_list=["MSFT", "GE", "AA"]
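One way to guard against an empty list or an unavailable ticker is to check the list before running the predictions. The following is a minimal sketch only; `validate_tickers` and `AVAILABLE_TICKERS` are hypothetical names that are not part of this notebook or of SP500_data_downloader.py.

```python
# The 9 tickers downloaded in this notebook (hypothetical constant name)
AVAILABLE_TICKERS = ["AAPL", "MSFT", "GE", "IBM", "AA", "DAL", "UAL", "PEP", "KO"]


def validate_tickers(requested, available=AVAILABLE_TICKERS):
    """Return the requested tickers as a list, or raise ValueError with a clear message."""
    if not requested:
        raise ValueError("ticker_list is empty; choose at least one ticker")
    unknown = [t for t in requested if t not in available]
    if unknown:
        raise ValueError(f"tickers not in the downloaded data: {unknown}")
    return list(requested)


checked = validate_tickers(["MSFT", "GE", "AA"])
print(checked)
```

Failing early with an explicit message is easier to diagnose than an error raised deep inside the prediction loop.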

## STANDARD AVERAGING (STA)

Now we can move on to the predictions themselves. First, we deal with the standard averaging method (STA). As mentioned at the beginning, its key feature is that the average of historical data is used as the predicted value for time *t+1*.

So, what we do here is compute the average of the Open price, Close price and Volume for each ticker in *ticker_list* within a particular time window (we decided to average over the last 100 observations). The resulting predictions are displayed in the table below.

In [108]:
##--------------------- STANDARD AVERAGING ----------------- ##


# works with the table and the loop, and is fitted to the data downloader
# the only thing missing is to define it directly as a function


# idea - basically calculates the average from historical values within specified time window and uses that as one step ahead prediction


#ticker_list=["MSFT", "GE", "AA","PEP"]

# specify the column names while initializing the table 
myTable = PrettyTable(["Ticker", "Prediction method", "Open", "Close", "Volume"]) 

for ticker in ticker_list:
    data_volume=aa.Volume[ticker_list][ticker]
    data_close=aa.Close[ticker_list][ticker]
    data_open=aa.Open[ticker_list][ticker]
    N = 100 #from how many prices back is the average computed

    std_prediction_list=[1,2,3] #create a list to which we will write the prediction values (it cannot be empty because with lists, we can 
                             # use indexing only to access or modify an item that already exists ) 
    pred_idex=-1

    Open=data_open
    Close=data_close
    Volume=data_volume

    featurelist = [Open, Close, Volume]
    for feature in featurelist:
        pred_idex=pred_idex+1
        std_prediction = truncate(np.mean(feature[feature.size - N:, ]),2)
        std_prediction_list[pred_idex]=std_prediction #assign predicted value to a specific position in std_predictions_list
                
    print("Short-term predictions on ticker", ticker, "are:")
    print(std_prediction_list)
    print("")
    myTable.add_row([ticker, "Standard Averaging", std_prediction_list[0], std_prediction_list[1], std_prediction_list[2]]) 
print("")
print("")
print("Or summarized in the table for better comparison:")
print(myTable)
std_table=myTable    
    

Short-term predictions on ticker MSFT are:
[317.16, 316.93, 30914212.0]

Short-term predictions on ticker GE are:
[100.32, 100.3, 7179619.0]

Short-term predictions on ticker AA are:
[52.3, 52.35, 8240158.0]



Or summarized in the table for better comparison:
+--------+--------------------+--------+--------+------------+
| Ticker | Prediction method  |  Open  | Close  |   Volume   |
+--------+--------------------+--------+--------+------------+
|  MSFT  | Standard Averaging | 317.16 | 316.93 | 30914212.0 |
|   GE   | Standard Averaging | 100.32 | 100.3  | 7179619.0  |
|   AA   | Standard Averaging |  52.3  | 52.35  | 8240158.0  |
+--------+--------------------+--------+--------+------------+


In [109]:
myTable

Ticker,Prediction method,Open,Close,Volume
MSFT,Standard Averaging,317.16,316.93,30914212.0
GE,Standard Averaging,100.32,100.3,7179619.0
AA,Standard Averaging,52.3,52.35,8240158.0


Now we take everything in the cell above and define it as the function *pred_sta()*.

In [110]:
# inserted directly into the function
# works

def pred_sta(ticker_list):
    # specify the column names while initializing the table 
    myTable = PrettyTable(["Ticker", "Prediction method", "Open", "Close", "Volume"]) 

    for ticker in ticker_list:
        data_volume=aa.Volume[ticker_list][ticker]
        data_close=aa.Close[ticker_list][ticker]
        data_open=aa.Open[ticker_list][ticker]
        N = 100 #from how many prices back is the average computed

        std_prediction_list=[1,2,3] #create a list to which we will write the prediction values (it cannot be empty because with lists, we can 
                             # use indexing only to access or modify an item that already exists ) 
        pred_idex=-1

        Open=data_open
        Close=data_close
        Volume=data_volume

        featurelist = [Open, Close, Volume]
        for feature in featurelist:
            pred_idex=pred_idex+1
            std_prediction = truncate(np.mean(feature[feature.size - N:, ]),2)
            std_prediction_list[pred_idex]=std_prediction #assign predicted value to a specific position in std_predictions_list

        
        print("Short-term predictions on ticker", ticker, "are:")
        print(std_prediction_list)
        print("")
        myTable.add_row([ticker, "Standard Averaging", std_prediction_list[0], std_prediction_list[1], std_prediction_list[2]]) 
    print("")
    print("")
    print("Or summarized in the table for better comparison:")
    print(myTable)
    return myTable  # hand the table back so the caller can store it, e.g. std_table = pred_sta(ticker_list)

In [111]:
# check that the function works and does what it is supposed to

pred_sta(ticker_list)

Short-term predictions on ticker MSFT are:
[317.16, 316.93, 30914212.0]

Short-term predictions on ticker GE are:
[100.32, 100.3, 7179619.0]

Short-term predictions on ticker AA are:
[52.3, 52.35, 8240158.0]



Or summarized in the table for better comparison:
+--------+--------------------+--------+--------+------------+
| Ticker | Prediction method  |  Open  | Close  |   Volume   |
+--------+--------------------+--------+--------+------------+
|  MSFT  | Standard Averaging | 317.16 | 316.93 | 30914212.0 |
|   GE   | Standard Averaging | 100.32 | 100.3  | 7179619.0  |
|   AA   | Standard Averaging |  52.3  | 52.35  | 8240158.0  |
+--------+--------------------+--------+--------+------------+
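For comparison, the same window average can be written more compactly with pandas. The sketch below runs on synthetic data rather than on *aa*, the helper name `sta_table` is hypothetical, and the result is not truncated the way the notebook's *truncate()* does it.

```python
import numpy as np
import pandas as pd


def sta_table(frame, window=100):
    """Mean of the last `window` rows of every column, used as a one-step-ahead forecast."""
    return frame.tail(window).mean()


# Synthetic data standing in for one ticker's Open, Close and Volume columns
rng = np.random.default_rng(0)
dates = pd.bdate_range("2021-01-01", periods=250)
toy = pd.DataFrame(
    {
        "Open": 100 + rng.normal(0, 1, 250).cumsum(),
        "Close": 100 + rng.normal(0, 1, 250).cumsum(),
        "Volume": rng.integers(1_000_000, 5_000_000, 250).astype(float),
    },
    index=dates,
)

forecast = sta_table(toy, window=100)  # one value per column
print(forecast.round(2))
```

The explicit loop used in the notebook does the same arithmetic; the pandas form mainly removes the index bookkeeping.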


## EXPONENTIAL MOVING AVERAGE (EMA)

Next, we move on to the exponential moving average (EMA) short-term predictions. As mentioned above, these are a bit more sophisticated, since they follow the exponential moving average methodology instead of just averaging historical values.

The procedure here is quite similar to the one we applied for STA predictions - we compute the exponential moving average of Open price, Close price and Volume for each ticker from the *ticker_list* . The results are again displayed in the table below.

In [112]:
##--------------------- EXPONENTIAL AVERAGING ----------------- ##

# works with the table and the loop, and is fitted to the data downloader
# the only thing missing is to define it directly as a function

# idea - run the exponential moving average over the whole series and use its final value as the one-step-ahead prediction



# specify the column names while initializing the table 
myTable = PrettyTable(["Ticker", "Prediction method", "Open", "Close", "Volume"]) 


for ticker in ticker_list:
    siz=len(aa)
    idex=0
    running_mean = 0.0
    gamma=0.1

    exp_prediction_list=[1,2,3] #create a list to which we will write the prediction values (it cannot be empty because with lists, we can 
                             # use indexing only to access or modify an item that already exists ) 
    pred_idex=-1
    
    data_volume=aa.Volume[ticker_list][ticker]
    data_close=aa.Close[ticker_list][ticker]
    data_open=aa.Open[ticker_list][ticker]

    Open=data_open
    Close=data_close
    Volume=data_volume

    featurelist = [Open, Close, Volume]
    for feature in featurelist:
        running_mean = 0.0
        idex=0
        pred_idex=pred_idex+1
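        while idex < siz:
            # EMA update: weight gamma on the previous average, (1 - gamma) on the newest observation
            # .iloc makes the positional access explicit (the series is indexed by date)
            running_mean = running_mean*gamma + (1.0-gamma)*feature.iloc[idex]
            idex=idex+1
            exp_prediction=truncate(running_mean,2)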
        
        exp_prediction_list[pred_idex]=exp_prediction #assign predicted value to a specific position in exp_predictions_list
    
        #print(exp_prediction)
    print("Short-term EMA predictions on ticker", ticker, "are:")
    print(exp_prediction_list)
    print("")
    myTable.add_row([ticker, "Exponential Moving Average", exp_prediction_list[0], exp_prediction_list[1], exp_prediction_list[2]]) 
print("")
print("")
print("Or summarized in the table for better comparison:")
print(myTable)
exp_table=myTable    
       

Short-term EMA predictions on ticker MSFT are:
[301.13, 305.58, 35870234.06]

Short-term EMA predictions on ticker GE are:
[97.64, 98.93, 6380160.51]

Short-term EMA predictions on ticker AA are:
[61.91, 64.03, 7227031.78]



Or summarized in the table for better comparison:
+--------+----------------------------+--------+--------+-------------+
| Ticker |     Prediction method      |  Open  | Close  |    Volume   |
+--------+----------------------------+--------+--------+-------------+
|  MSFT  | Exponential Moving Average | 301.13 | 305.58 | 35870234.06 |
|   GE   | Exponential Moving Average | 97.64  | 98.93  |  6380160.51 |
|   AA   | Exponential Moving Average | 61.91  | 64.03  |  7227031.78 |
+--------+----------------------------+--------+--------+-------------+


Now we take everything in the cell above and define it as the function *pred_ema()*.

In [113]:
# inserted directly into the function
# works

def pred_ema(ticker_list):
    # specify the column names while initializing the table 
    myTable = PrettyTable(["Ticker", "Prediction method", "Open", "Close", "Volume"]) 


    for ticker in ticker_list:
        siz=len(aa)
        idex=0
        running_mean = 0.0
        gamma=0.1

        exp_prediction_list=[1,2,3] #create a list to which we will write the prediction values (it cannot be empty because with lists, we can 
                             # use indexing only to access or modify an item that already exists ) 
        pred_idex=-1
    
        data_volume=aa.Volume[ticker_list][ticker]
        data_close=aa.Close[ticker_list][ticker]
        data_open=aa.Open[ticker_list][ticker]

        Open=data_open
        Close=data_close
        Volume=data_volume

        featurelist = [Open, Close, Volume]
        for feature in featurelist:
            running_mean = 0.0
            idex=0
            pred_idex=pred_idex+1
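            while idex < siz:
                # EMA update: weight gamma on the previous average, (1 - gamma) on the newest observation
                # .iloc makes the positional access explicit (the series is indexed by date)
                running_mean = running_mean*gamma + (1.0-gamma)*feature.iloc[idex]
                idex=idex+1
                exp_prediction=truncate(running_mean,2)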
        
            exp_prediction_list[pred_idex]=exp_prediction #assign predicted value to a specific position in exp_predictions_list
    
            #print(exp_prediction)
        print("Short-term EMA predictions on ticker", ticker, "are:")
        print(exp_prediction_list)
        print("")
        myTable.add_row([ticker, "Exponential Moving Average", exp_prediction_list[0], exp_prediction_list[1], exp_prediction_list[2]]) 
    print("")
    print("")
    print("Or summarized in the table for better comparison:")
    print(myTable)
    return myTable  # hand the table back so the caller can store it, e.g. exp_table = pred_ema(ticker_list)
    


In [114]:
# check that the function works and does what it is supposed to

pred_ema(ticker_list)

Short-term EMA predictions on ticker MSFT are:
[301.13, 305.58, 35870234.06]

Short-term EMA predictions on ticker GE are:
[97.64, 98.93, 6380160.51]

Short-term EMA predictions on ticker AA are:
[61.91, 64.03, 7227031.78]



Or summarized in the table for better comparison:
+--------+----------------------------+--------+--------+-------------+
| Ticker |     Prediction method      |  Open  | Close  |    Volume   |
+--------+----------------------------+--------+--------+-------------+
|  MSFT  | Exponential Moving Average | 301.13 | 305.58 | 35870234.06 |
|   GE   | Exponential Moving Average | 97.64  | 98.93  |  6380160.51 |
|   AA   | Exponential Moving Average | 61.91  | 64.03  |  7227031.78 |
+--------+----------------------------+--------+--------+-------------+
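The EMA recursion used above can also be reproduced with pandas' `ewm`. The sketch below assumes that prepending a zero to the series is an acceptable way to mimic the $EMA_{0}=0$ starting value; the helper names and the toy prices are made up for illustration.

```python
import numpy as np
import pandas as pd


def ema_loop(values, gamma):
    """Explicit recursion, as in the notebook: start from 0 and update once per observation."""
    ema = 0.0
    for x in values:
        ema = gamma * ema + (1.0 - gamma) * x
    return ema


def ema_pandas(values, gamma):
    """Same recursion via pandas: prepend the EMA_0 = 0 seed and use adjust=False."""
    seeded = pd.Series(np.concatenate(([0.0], np.asarray(values, dtype=float))))
    # with adjust=False pandas computes y_t = (1 - alpha) * y_{t-1} + alpha * x_t
    return seeded.ewm(alpha=1.0 - gamma, adjust=False).mean().iloc[-1]


prices = [50.0, 52.0, 51.0, 55.0, 60.0]  # made-up toy series, not market data
gamma = 0.1
loop_value = ema_loop(prices, gamma)
pandas_value = ema_pandas(prices, gamma)
print(loop_value, pandas_value)
```

With gamma=0.1 the newest observation carries a weight of 0.9, so the EMA prediction stays close to the latest observed value, whereas the 100-day STA reacts much more slowly.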


If we want to directly compare the predictions from these two methods, we can print both of the prediction tables.

In [115]:
print(std_table)
print(exp_table)

+--------+--------------------+--------+--------+------------+
| Ticker | Prediction method  |  Open  | Close  |   Volume   |
+--------+--------------------+--------+--------+------------+
|  MSFT  | Standard Averaging | 317.16 | 316.93 | 30914212.0 |
|   GE   | Standard Averaging | 100.32 | 100.3  | 7179619.0  |
|   AA   | Standard Averaging |  52.3  | 52.35  | 8240158.0  |
+--------+--------------------+--------+--------+------------+
+--------+----------------------------+--------+--------+-------------+
| Ticker |     Prediction method      |  Open  | Close  |    Volume   |
+--------+----------------------------+--------+--------+-------------+
|  MSFT  | Exponential Moving Average | 301.13 | 305.58 | 35870234.06 |
|   GE   | Exponential Moving Average | 97.64  | 98.93  |  6380160.51 |
|   AA   | Exponential Moving Average | 61.91  | 64.03  |  7227031.78 |
+--------+----------------------------+--------+--------+-------------+
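For a side-by-side view, the two sets of predictions could also be combined into a single pandas table. This is a sketch only: the numbers are copied by hand from the two tables above, and the percentage gap is an illustrative addition rather than part of the original analysis.

```python
import pandas as pd

tickers = ["MSFT", "GE", "AA"]

# Values copied from the STA and EMA tables printed above
sta = pd.DataFrame(
    {"Open": [317.16, 100.32, 52.3],
     "Close": [316.93, 100.3, 52.35],
     "Volume": [30914212.0, 7179619.0, 8240158.0]},
    index=tickers,
)
ema = pd.DataFrame(
    {"Open": [301.13, 97.64, 61.91],
     "Close": [305.58, 98.93, 64.03],
     "Volume": [35870234.06, 6380160.51, 7227031.78]},
    index=tickers,
)

# Two-level columns: method on top, feature below
comparison = pd.concat({"STA": sta, "EMA": ema}, axis=1)

# Relative gap of the EMA prediction versus the STA prediction, in percent
gap_pct = ((ema - sta) / sta * 100).round(2)

print(comparison)
print(gap_pct)
```

A positive gap means the EMA prediction lies above the STA prediction for that ticker and feature.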
