In [1]:
import requests
import pandas as pd
import numpy as np
from datetime import datetime
import matplotlib.pyplot as plt
from matplotlib import style
import plotly.graph_objects as go

# EMA Analysis and Experiment
In this section, we explore a trading strategy, the moving average crossing, built on a key indicator: the EMA.

The exponential moving average (EMA) is a technical chart indicator that tracks the price of an investment (like a stock or commodity) over time. The EMA is a type of weighted moving average (WMA) that gives more weighting or importance to recent price data. Like the simple moving average (SMA), the EMA is used to see price trends over time, and watching several EMAs at the same time is easy to do with moving average ribbons. 

### Calculating SMA and EMA

The EMA is designed to improve on the idea of an SMA by giving more weight to the most recent price data, which is considered to be more relevant than older data. Since new data carries greater weight, the EMA responds more quickly to price changes than the SMA does. 

The formula for calculating the EMA is a matter of using a multiplier and starting with the SMA. There are three steps in the calculation:

    (1) Compute the SMA
    (2) Calculate the multiplier for weighting the EMA
    (3) Calculate the current EMA
    
The calculation for the SMA is the same as computing an average or mean. That is, the SMA for any given number of time periods is simply the sum of closing prices for that number of time periods, divided by that same number. So, for example, a 10-day SMA is just the sum of the closing prices for the past 10 days, divided by 10.

The mathematical formula looks like this:

\begin{aligned} &\text{Simple moving average} = \frac{\text{period sum}}{N}\\ &\textbf{where:}\\ &N=\text{number of days in a given period}\\&\text{period sum}=\text{sum of stock closing prices in that period}\\ \end{aligned}
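
As a minimal sketch of the SMA definition above, the rolling mean in pandas computes the same quantity. The prices and the helper name `sma` are made up for illustration and are not part of the original analysis.

```python
import pandas as pd

# Hypothetical closing prices, for illustration only
close = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0, 15.0])

def sma(prices: pd.Series, n: int) -> pd.Series:
    """Sum of the last n closing prices divided by n (NaN until n prices exist)."""
    return prices.rolling(window=n).mean()

sma3 = sma(close, 3)
print(sma3)
```

The first `n - 1` entries are `NaN` because a full window of `n` closing prices is not yet available.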

The formula for calculating the weighting multiplier looks like this:

\begin{aligned} \text{Weighted multiplier} &= 2 \div (\text{selected time period} + 1) \\ &= 2 \div (10 + 1) \\ &= 0.1818 \\ &= 18.18\% \\ \end{aligned}

(In both cases, we’re assuming a 10-day SMA.)

So, when it comes to calculating the EMA of a stock:

\begin{aligned} &EMA = \text{Price}(t) \times k + EMA(y) \times (1-k) \\ &\textbf{where:}\\ &t=\text{today}\\ &y=\text{yesterday}\\ &N=\text{number of days in EMA}\\ &k=2 \div (N + 1)\\ \end{aligned}

The weighting given to the most recent price is greater for a shorter-period EMA than for a longer-period EMA. For example, an 18.18% multiplier is applied to the most recent price data for a 10-day EMA, as we did above, whereas for a 20-day EMA, only a 9.52% multiplier weighting is used. There are also slight variations of the EMA arrived at by using the open, high, low, or median price instead of using the closing price. 
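
The three steps can be sketched in a few lines of Python. This is an illustrative implementation under the assumption that the first EMA value is seeded with the SMA of the first `N` prices; the helper names `multiplier` and `ema_from_sma` and the toy prices are hypothetical, and Alpha Vantage may seed its EMA differently.

```python
import pandas as pd

def multiplier(n: int) -> float:
    """Weighting multiplier k = 2 / (N + 1)."""
    return 2 / (n + 1)

def ema_from_sma(prices: pd.Series, n: int) -> pd.Series:
    """EMA seeded with the SMA of the first n prices, then updated recursively."""
    k = multiplier(n)
    out = pd.Series(float("nan"), index=prices.index, dtype=float)
    if len(prices) < n:
        return out
    # Step 1: the seed is the SMA of the first n prices
    out.iloc[n - 1] = prices.iloc[:n].mean()
    # Step 3: EMA = Price(t) * k + EMA(y) * (1 - k)
    for t in range(n, len(prices)):
        out.iloc[t] = prices.iloc[t] * k + out.iloc[t - 1] * (1 - k)
    return out

# Hypothetical closing prices, for illustration only
close = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0, 15.0])
ema3 = ema_from_sma(close, 3)
print(round(multiplier(10), 4), round(multiplier(20), 4))
print(ema3)
```

For comparison, `prices.ewm(span=n, adjust=False).mean()` in pandas uses the same multiplier but seeds the recursion with the first price rather than the SMA, so early values differ slightly.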


### Fetch Data
Alpha Vantage offers a free stock API for time series data (daily, weekly, monthly, and intraday) as well as various technical indicators, including SMA, EMA, etc. 

So, first, we designed a function to fetch data from Alpha Vantage using the free API. With this function, we can choose the stock, set the lengths of the short-term and long-term moving averages, switch between types of moving average, and change the time interval. In addition, if you have an API key with unlimited access to the website, you can pass your own key to download all the data you need at one time.

Here, we concentrate on daily data, with a default 30-day short term EMA and 150-day long term EMA.

In [2]:
def EMA_data(stock, short = 30, long = 150, indicator = 'EMA', interval = "daily", APIkeys = "6UOESXFNVPGFRYH2"):
    """
    Fetch data from Alpha Vantage and store the csv file locally
    
    Parameters:
    ____________
    stock:      stock abbreviation
    indicator:  moving average indicator
    short:      short term indicator interval period
    long:       long term indicator interval period
    interval:   moving average indicator interval
    APIkeys:    API key for Alpha Vantage
    
    """
    assert short < long, "Length for short-term MA should be less than length for long-term MA!"
    
    # fetch data 
    func = [stock+'_adjusted_close', indicator + str(short), indicator + str(long)]
    functions = ['TIME_SERIES_DAILY_ADJUSTED', indicator+'&time_period='+str(short), indicator+'&time_period='+str(long)]
    urls = {}
    data = {}
    
    ## create a dictionary of urls
    for i in range(len(func)):
        a = 'https://www.alphavantage.co/query?function='+functions[i]+'&symbol='+stock+'&interval='+interval+'&series_type=close&outputsize=full&datatype=csv&apikey='+APIkeys
        urls[func[i]] = a
        
    ## fetch data and transform
    for i in range(len(func)):
        if i == 0:
            data = pd.read_csv(urls[func[i]], index_col = "timestamp").iloc[:,[4]]
        if i == 1:
            df = pd.read_csv(urls[func[i]], index_col = "time")
            data = data.merge(df, left_index = True, right_index = True)
        if i == 2:
            df = pd.read_csv(urls[func[i]], index_col = "time")
            data = data.merge(df, left_index = True, right_index = True)
    data.sort_index(ascending = True, inplace = True)
    data.columns = func
    data.to_csv('data/'+stock+'.csv')
    return data

Because the free API is rate-limited, we fetched the data once, stored the csv file locally, and read the local file afterwards to avoid an IP ban. 
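
One way to make this caching explicit is a small wrapper that only calls the API when no local file exists. This is a sketch, not part of the original notebook: the helper `load_or_fetch` and its `fetch` argument are hypothetical, and it assumes the fetch function returns a DataFrame indexed by date.

```python
from pathlib import Path
import pandas as pd

def load_or_fetch(stock, fetch, folder="data"):
    """Return the cached csv for `stock` if present; otherwise call fetch(stock) and cache it."""
    path = Path(folder) / f"{stock}.csv"
    if path.exists():
        # Cache hit: no request is sent to the API
        return pd.read_csv(path, index_col=0)
    # Cache miss: make sure the folder exists, fetch once, and store the result
    path.parent.mkdir(parents=True, exist_ok=True)
    data = fetch(stock)
    data.to_csv(path)
    return data
```

With the function defined above, a call such as `load_or_fetch("TSLA", EMA_data)` would hit Alpha Vantage only on the first run.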

In [3]:
df = EMA_data("TSLA")

In [4]:
df.head()

Unnamed: 0,TSLA_adjusted_close,EMA30,EMA150
2011-01-31,4.82,5.2508,4.7297
2011-02-01,4.782,5.2206,4.7304
2011-02-02,4.788,5.1927,4.7312
2011-02-03,4.726,5.1625,4.7311
2011-02-04,4.692,5.1322,4.7306


### Design Buy/Sell Algorithm

Now, we need to implement the moving average crossing strategy to indicate when to buy and when to sell. The basic idea is to buy when the short-term MA crosses the long-term MA from below while the long-term MA slopes upward, and to sell when the short-term MA crosses the long-term MA from above while the long-term MA slopes downward. 
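
The crossing itself can be detected by comparing the sign of the gap between the two averages on consecutive days. The sketch below uses made-up values and `shift(1)` to look at the previous day; it only illustrates the crossing test and leaves out the slope condition.

```python
import pandas as pd

# Hypothetical moving averages, for illustration only
short_ma = pd.Series([1.0, 2.0, 3.0, 2.0, 1.0])
long_ma = pd.Series([2.0, 2.5, 2.5, 2.5, 2.5])

diff = short_ma - long_ma   # positive when the short-term MA is above the long-term MA
prev = diff.shift(1)        # the gap on the previous day (NaN on the first day)

up_cross = (prev < 0) & (diff > 0)     # crossed from below
down_cross = (prev > 0) & (diff < 0)   # crossed from above
print(pd.DataFrame({"diff": diff, "up_cross": up_cross, "down_cross": down_cross}))
```

Because comparisons with `NaN` evaluate to `False`, the first day can never be flagged as a crossing.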

Note: Since the data here are daily, we used the log difference and a 5-day smoothing window (by default) to calculate the smoothed slope of the long-term MA. 
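
To see what the smoothed slope measures, consider a hypothetical long-term MA that grows by exactly 1% per day. The log difference is then constant at log(1.01), roughly 0.00995, and so is its 5-day rolling mean. The series below is synthetic and only serves as an illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical long-term MA that grows by exactly 1% per day
long_ma = pd.Series(100.0 * 1.01 ** np.arange(8))

long_slope = np.log(long_ma).diff(1)           # daily log difference
long_slope_sm = long_slope.rolling(5).mean()   # 5-day smoothing
print(long_slope_sm)
```

The first five smoothed values are `NaN`: one day is lost to the difference and four more to the rolling window.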

We designed a function that calculates the buy/sell timing and the percentage change of the stock price over different holding periods, using the data we fetched from Alpha Vantage. Its parameters can be changed to tune the trading strategy. 

For example, you can change the smoothing period as needed, and set the buy and sell thresholds for the slope of the long-term MA to match your risk aversion. In addition, you can evaluate the strategy's profitability by choosing different position holding periods. 
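
The percentage change over a holding period is the forward return: the price `holding` rows later relative to today's price. A minimal sketch with made-up prices follows; the helper name `forward_change` is hypothetical.

```python
import pandas as pd

# Hypothetical closing prices, for illustration only
close = pd.Series([100.0, 110.0, 121.0, 108.9])

def forward_change(prices: pd.Series, holding: int) -> pd.Series:
    """Fractional price change from each row to the row `holding` steps later."""
    return prices.shift(-holding) / prices - 1

chg = forward_change(close, 2)
print(chg)
```

This is algebraically the same as `-prices.diff(-holding) / prices`. The last `holding` rows are `NaN` because the future price is not yet known, and the period is counted in rows (trading days), not calendar days.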

In [5]:
def EMA_calculate(data, smoothing = 5, buy_threshold = 0, sell_threshold = 0, positions = [30]):
    """
    The function would use the data you fetched and indicate the buy/sell signal, 
    the percentage change of the stock price
    
    Parameters:
    ____________
    data:            data fetched from Alpha Vantage
    smoothing:       period of time to smooth the long-term MA
    buy_threshold:   threshold for long-term MA's slope when considering buy
    sell_threshold:  threshold for long-term MA's slope when considering sell
    positions:       holding periods for calculating the percentage change
    
    """
    
    df = data
    profit = []
    
    df['diff'] = data.iloc[:,1] - data.iloc[:,2] # column 3
    # use the log difference and 5-day smoothing (at default) to calculate the smoothed slope of long-term MA
    df['long_slope'] = np.log(data.iloc[:,2]).diff(1) # column 4
    df['long_slope_sm'] = df.iloc[:,4].rolling(smoothing).mean() # column 5
    
    # define crossing from above and below
    df['up_cross'] = False # column 6
    df['down_cross'] = False # column 7
    # start at row 1 so that row i-1 is always the previous day
    # (with i = 0, i-1 = -1 would wrap around to the last row)
    for i in range(1, len(df)):
        df.iloc[i,6] = (df.iloc[i-1,3] < 0) & (df.iloc[i,3] > 0)
        df.iloc[i,7] = (df.iloc[i-1,3] > 0) & (df.iloc[i,3] < 0)
    
    # define buy/sell signal
    df['buy_signal'] = df.up_cross & (df['long_slope_sm'] > buy_threshold)
    df['sell_signal'] = df.down_cross & (df['long_slope_sm'] < sell_threshold)
    
    # calculate the percentage change of stock price with different holding periods
    for i in range(len(positions)):
        profit.append(str(positions[i])+"day_change")
        df[profit[i]] = -df.iloc[:,0].diff(-positions[i])/df.iloc[:,0]
    
    return df

In [6]:
tesla = pd.read_csv("data/TSLA.csv", index_col = 0)
tesla

Unnamed: 0,TSLA_adjusted_close,EMA30,EMA150
2011-01-31,4.820,5.2508,4.7297
2011-02-01,4.782,5.2206,4.7304
2011-02-02,4.788,5.1927,4.7312
2011-02-03,4.726,5.1625,4.7311
2011-02-04,4.692,5.1322,4.7306
...,...,...,...
2021-12-07,1051.750,1053.8184,837.9269
2021-12-08,1068.960,1054.7953,840.9869
2021-12-09,1003.800,1051.5053,843.1434
2021-12-10,1017.030,1049.2811,845.4465


In [7]:
tesla_data = EMA_calculate(tesla, positions = [30,60,90,120])

### Report the performance of the strategy

Since we have calculated the signals and the changes in stock prices, we can report the performance directly by subsetting the dataset. 

In [8]:
tesla_data[tesla_data.buy_signal == True]

Unnamed: 0,TSLA_adjusted_close,EMA30,EMA150,diff,long_slope,long_slope_sm,up_cross,down_cross,buy_signal,sell_signal,30day_change,60day_change,90day_change,120day_change
2011-03-31,5.55,4.7544,4.721,0.0334,0.002354,0.000407,True,False,True,False,-0.007207,-0.01045,-0.096937,-0.068468
2011-10-13,5.588,5.1476,5.1464,0.0012,0.001147,0.001029,True,False,True,False,0.133142,-0.011453,0.235863,0.234073
2012-06-25,6.622,6.2095,6.209,0.0005,0.000902,0.000868,True,False,True,False,-0.086379,-0.062217,-0.126602,0.038961
2012-11-19,6.584,6.0109,5.997,0.0139,0.001318,0.000802,True,False,True,False,0.056197,0.125152,0.346902,1.528554
2015-05-05,46.59,42.9211,42.8688,0.0523,0.001167,0.000985,True,False,True,False,0.117879,0.145267,0.074222,-0.102425
2016-04-04,49.398,43.2033,43.0032,0.2001,0.002,0.001247,True,False,True,False,-0.156687,-0.183003,-0.0864,-0.164217
2016-07-26,45.902,44.0677,44.046,0.0217,0.000565,0.000372,True,False,True,False,-0.121128,-0.113067,-0.207529,0.026448
2017-01-06,45.802,41.481,41.3621,0.1189,0.001442,0.001069,True,False,True,False,0.211257,0.326143,0.336667,0.575259
2018-01-12,67.244,65.0433,64.9363,0.107,0.000478,0.000474,True,False,True,False,0.04393,-0.104961,-0.169978,-0.081256
2018-06-18,74.166,62.7957,62.5949,0.2008,0.002486,0.001862,True,False,True,False,-0.19602,-0.216514,-0.222015,-0.034679


In [9]:
tesla_data[tesla_data.sell_signal == True]

Unnamed: 0,TSLA_adjusted_close,EMA30,EMA150,diff,long_slope,long_slope_sm,up_cross,down_cross,buy_signal,sell_signal,30day_change,60day_change,90day_change,120day_change
2011-03-24,4.466,4.7032,4.7114,-0.0082,-0.0007,-0.000627,False,True,False,True,0.21451,0.164801,0.224362,0.090013
2011-08-22,4.39,5.1998,5.2282,-0.0284,-0.002159,-0.001099,False,True,False,True,0.077904,0.545786,0.308884,0.434624
2012-05-31,5.9,6.2357,6.2358,-0.0001,-0.000721,-0.000368,False,True,False,True,0.161017,-0.0,-0.008475,0.100678
2012-07-30,5.47,6.1842,6.2184,-0.0342,-0.001623,-0.001039,False,True,False,True,0.016453,0.038026,0.24936,0.316271
2014-12-15,40.808,46.1199,46.3826,-0.2627,-0.001611,-0.001343,False,True,False,True,0.005685,-0.075279,0.134827,0.254656
2015-10-14,43.376,48.1568,48.3711,-0.2143,-0.001386,-0.001255,False,True,False,True,0.058834,-0.041636,-0.174659,0.185909
2016-06-21,43.922,44.2856,44.3011,-0.0155,-0.000115,-0.000212,False,True,False,True,0.028141,-0.087382,-0.071035,-0.124903
2016-09-01,40.154,44.1032,44.2201,-0.1169,-0.001234,-0.000582,False,True,False,True,-0.021218,-0.023161,0.144245,0.280072
2017-11-14,61.74,65.165,65.2966,-0.1316,-0.000732,-0.000823,False,True,False,True,0.021574,0.022773,-0.095627,-0.005993
2018-03-22,61.82,65.7897,65.8221,-0.0324,-0.000817,-0.000626,False,True,False,True,-0.04856,0.199709,-0.035458,-0.060045


However, not every buy/sell signal is profitable at every holding period, because a trend-trading strategy behaves differently depending on the market situation at the time. Thus, we need to visualize the time series data and evaluate each signal case by case. 
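
Before inspecting signals one by one, a compact summary per holding period can help. The sketch below is an illustration only: the helper `summarize_signals` is hypothetical, and the toy table stands in for a DataFrame with a boolean signal column and `*day_change` columns like the ones shown above.

```python
import pandas as pd

def summarize_signals(df, signal_col, change_cols):
    """Count, mean change, and share of positive changes for rows where the signal fired."""
    picked = df.loc[df[signal_col], change_cols]
    return pd.DataFrame({
        "n_signals": picked.count(),
        "mean_change": picked.mean(),
        "share_positive": (picked > 0).sum() / picked.count(),
    })

# Toy data, for illustration only
toy = pd.DataFrame({
    "buy_signal": [True, False, True, True],
    "30day_change": [0.10, 0.50, -0.05, 0.01],
})
summary = summarize_signals(toy, "buy_signal", ["30day_change"])
print(summary)
```

For buy signals a positive change afterwards is the desired outcome; for sell signals the reading is reversed, since a falling price after the signal is what confirms it.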

### Data Visualization

In order to evaluate each signal, we designed a function to plot the time series data (adjusted close price, short-term EMA, long-term EMA and buy/sell signals) using plotly with range slider and time frame selection. In particular, the red dashed vertical lines indicate the selling signals, while the green dashed vertical lines indicate the buying signals.

In [130]:
def EMA_visual(data):
    buy = []
    sell = []
    fig = go.Figure()
    
    fig.add_trace(go.Scatter(x=list(data.index), y=list(data.iloc[:,0]), name = data.columns[0]))
    fig.add_trace(go.Scatter(x=list(data.index), y=list(data.iloc[:,1]), name = data.columns[1]))
    fig.add_trace(go.Scatter(x=list(data.index), y=list(data.iloc[:,2]), name = data.columns[2]))

    buy = data[data.buy_signal == True].index.tolist()    
    for i in buy:
        fig.add_vline(x = i, line_width=3, line_dash="dash", line_color="green")
    
    sell = data[data.sell_signal == True].index.tolist()    
    for i in sell:
        fig.add_vline(x = i, line_width=3, line_dash="dash", line_color="red")
        
    # Set title
    fig.update_layout(title_text=data.columns[0][:-15]+" Price and Signal")
    
    # Add range slider
    fig.update_layout(
        xaxis=dict(
                rangeselector=dict(buttons=list([
                        dict(count=3, label="3m", step="month", stepmode="backward"),
                        dict(count=6, label="6m", step="month", stepmode="backward"),
                        dict(count=1, label="1y", step="year", stepmode="backward"),
                        dict(count=3, label="3y", step="year", stepmode="backward"),
                        dict(step="all")])),
                rangeslider=dict(visible=True),
                type="date"),
        yaxis = dict(fixedrange = False),    
    )

    fig.show()

In [131]:
EMA_visual(tesla_data)

In [132]:
print(len(tesla_data[tesla_data.buy_signal == True]))

14


In [133]:
print(len(tesla_data[tesla_data.sell_signal == True]))

14


The graph above visualizes the signals generated by the moving average crossings. We can observe from the graph that:

(1) During a longer upward run, the buy signal performs well at predicting the bottom, and the strategy is profitable if we buy at the buy signal and sell at the sell signal.

(2) During turbulence, or when the stock price stalls, the strategy performs badly.

(3) Sell signals cannot capture the tops, and they lag behind them.

 

#### Solution to (2):
We can change the parameters to ignore the turbulent periods, for example by raising the threshold on the slope of the long-term MA. 

In [158]:
tesla = pd.read_csv("data/TSLA.csv", index_col = 0)
tesla_data_2 = EMA_calculate(tesla, buy_threshold = 0.0008, sell_threshold = -0.0008)
EMA_visual(tesla_data_2)

In [159]:
print(len(tesla_data_2[tesla_data_2.buy_signal == True]))

9


In [160]:
print(len(tesla_data_2[tesla_data_2.sell_signal == True]))

7


#### Solution to (3):
In order to capture the top and bottom better, we need to redesign the sell signal algorithm. 

As we can observe from both graphs, the tops are reached when the stock price crosses the short-term EMA from above, while the gap between the short-term and long-term EMAs is shrinking and the long-term EMA is still increasing (due to the lag effect). In addition, the bottoms are recognized when the stock price crosses the long-term EMA from below, while the gap between the two EMAs is expanding. We can then redesign the signal algorithms accordingly. 
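
The core of such a redesigned signal can be sketched with vectorized comparisons. The example below uses made-up values and, as an assumption, tests the price against the long-term EMA for both the buy and the sell crossing, as the function `EMA_calculate_2` does; the slope threshold is left out to keep the sketch short.

```python
import pandas as pd

# Hypothetical price and EMAs, for illustration only
price = pd.Series([9.0, 11.0, 12.0, 9.5, 9.0])
short_ma = pd.Series([10.0, 10.5, 11.0, 10.6, 10.0])
long_ma = pd.Series([10.0, 10.1, 10.2, 10.2, 10.1])

gap = short_ma / long_ma            # ratio of short-term to long-term EMA
gap_widening = gap > gap.shift(1)
gap_narrowing = gap < gap.shift(1)

above = price > long_ma
below = price < long_ma

# Buy: price crosses the long-term EMA from below while the gap widens
buy = below.shift(1, fill_value=False) & above & gap_widening
# Sell: price crosses the long-term EMA from above while the gap narrows
sell = above.shift(1, fill_value=False) & below & gap_narrowing
print(pd.DataFrame({"price": price, "buy": buy, "sell": sell}))
```

In this toy series the buy condition fires on the second row and the sell condition on the fourth.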

In [286]:
def EMA_calculate_2(data, smoothing = 5, buy_threshold = 0, sell_threshold = 0):
    """
    The function would use the data you fetched and indicate the buy/sell signal, 
    the percentage change of the stock price
    
    Parameters:
    ____________
    data:            data fetched from Alpha Vantage
    smoothing:       period of time to smooth the long-term MA
    buy_threshold:   threshold for long-term MA's slope when considering buy
    sell_threshold:  threshold for long-term MA's slope when considering sell
    
    """
    
    df = data
    profit = []
    
    df['diff'] = data.iloc[:,1]/data.iloc[:,2] # column 3
    df['diff_slope'] = data.iloc[:,3].diff(1) # column 4
    df['diff_slope_sm'] = df.iloc[:,4].rolling(smoothing).mean() # column 5
    df['short_slope'] = np.log(np.log(data.iloc[:,1])).diff(1) # column 6
    df['short_slope_sm'] = df.iloc[:,6].rolling(smoothing).mean() # column 7
    df['long_slope'] = np.log(np.log(data.iloc[:,2])).diff(1) # column 8
    df['long_slope_sm'] = df.iloc[:,8].rolling(smoothing).mean() # column 9
    
    # define buy/sell signal
    # start at row 1 so that row i-1 is always the previous day
    # (with i = 0, i-1 = -1 would wrap around to the last row)
    df['buy_signal'] = False # column 10
    for i in range(1, len(df)):
        df.iloc[i,10] = (df.iloc[i-1,0] < df.iloc[i-1,2]) & (df.iloc[i,0] > df.iloc[i,2]) & (df.iloc[i,3] > df.iloc[i-1,3]) & (df.iloc[i,9] > buy_threshold)
    
    df['sell_signal'] = False # column 11
    for i in range(1, len(df)):
        df.iloc[i,11] = (df.iloc[i-1,0] > df.iloc[i-1,2]) & (df.iloc[i,0] < df.iloc[i,2]) & (df.iloc[i,3] < df.iloc[i-1,3]) & (df.iloc[i,9] < sell_threshold) 
        
    return df

In [287]:
tesla = pd.read_csv("data/TSLA.csv", index_col = 0)
tesla_data_2 = EMA_calculate_2(tesla, buy_threshold = -0.001, sell_threshold = 0.001)
EMA_visual(tesla_data_2)

In [288]:
len(tesla_data_2[tesla_data_2.buy_signal == True])

38

In [289]:
len(tesla_data_2[tesla_data_2.sell_signal == True])

58

In conclusion, the simple moving average crossing may not capture the local maximum (the selling point) efficiently, because moving averages lag the price. In addition, the strategy may not perform well during turbulence (when prices fluctuate or the slope of the long-term average is around zero).

Hence, we redesigned the buy (sell) algorithm to trigger when the stock price crosses the long-term EMA from below (above) with an increasing (decreasing) gap between the EMAs, while the long-term EMA maintains its trend. 

### Algorithm application on other stocks

In [281]:
aapl = pd.read_csv("data/AAPL.csv", index_col = 0)
amzn = pd.read_csv("data/AMZN.csv", index_col = 0)
googl = pd.read_csv("data/GOOGL.csv", index_col = 0)

In [283]:
aapl_data = EMA_calculate_2(aapl, buy_threshold = -0.001, sell_threshold = 0.001)
EMA_visual(aapl_data)

In [284]:
amzn_data = EMA_calculate_2(amzn, buy_threshold = -0.001, sell_threshold = 0.001)
EMA_visual(amzn_data)

In [285]:
googl_data = EMA_calculate_2(googl, buy_threshold = -0.001, sell_threshold = 0.001)
EMA_visual(googl_data)

From the applications to other stocks, we can see that the updated algorithm also captures sustained rises in the stock price. At the same time, the buy/sell timing is closer to the bottoms/tops.