# Formula Sheet for Derived Technical Indicator Features

---

This notebook breaks down the more complex technical indicator functions that will be implemented as features in a model. Some of these features are computationally intensive on larger datasets;
however, it is unlikely they will need to be calculated before the global notebook. This notebook is mainly for practice, reference, and a technical breakdown of what will be calculated at runtime.

[Example Scholarly Article](https://arxiv.org/pdf/2205.06673.pdf) 

[TA-Lib Documentation](https://github.com/TA-Lib/ta-lib-python/blob/master/docs/func_groups/momentum_indicators.md)

---

## List of Indicators

1. [10 day simple moving average (SMA) closing price](#Simple-Moving-Average-(SMA))
2. [50 day simple moving average (SMA) closing price](#Simple-Moving-Average-(SMA))
> *Possibly consolidate above 2 items as difference*
3. Current volume
4. [200 day simple moving average (SMA) volume](#Simple-Moving-Average-(SMA))
> *Possibly consolidate above 2 items as difference*
5. [Weighted moving average (WMA) closing price](#Weighted-Moving-Average-(WMA))
6. [Exponential moving average closing price](#Exponential-Moving-Average-(EMA))
7. [Relative Strength Index (RSI)](#Relative-Strength-Index-(RSI))
8. [Commodity Channel Index (CCI)](#Commodity-Channel-Index-(CCI))
9. [Accumulation Distribution (AD)](#Accumulation-Distribution-(AD))
10.  [Stochastic K%](#Stochastic-K-Percent)
11.  [Stochastic D%](#Stochastic-D-Percent)
12.  [Moving Average Convergence \ Divergence (MACD)](#Moving-Average-Convergence-\-Divergence-(MACD))
    
---

In [2]:
import talib
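# Compatibility mode 1 is TA-Lib's Metastock mode; it changes how some indicators (e.g. EMA) are seeded
talib.set_compatibility(1)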

import numpy as np



### Simple Moving Average (SMA)
The arithmetic mean: the sum of all items divided by the number of items. Here it is applied to closing prices and volume over time: the daily closing prices (or volumes) in a given time range are summed and divided by the number of days in that range.
#### SMA = $\Large\frac{c_1 + c_2 + ... + c_n}{n} = \frac{1}{n}\sum_{i=1}^n c_i$
---

In [3]:
# SMA

def simple_moving_average(arr):
    output = np.average(arr)
    return output

example_inputs = np.array([1.,2.,3.,4.,5.,6.,7.,8.,9.,10.])

simple_moving_average(example_inputs)

5.5
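
The cell above averages a whole array into a single number. As a model feature, the SMA is usually evaluated over a sliding window, giving one value per day. A minimal plain-Python sketch of that idea follows; the helper name `rolling_sma` and the window length `n` are illustrative and not part of the notebook.

```python
def rolling_sma(values, n):
    """Hypothetical helper: SMA of every full n-item window, oldest first."""
    if not 1 <= n <= len(values):
        raise ValueError("n must be between 1 and len(values)")
    # The window ending at index i covers values[i - n + 1 : i + 1].
    return [sum(values[i - n + 1:i + 1]) / n for i in range(n - 1, len(values))]

prices = [1., 2., 3., 4., 5., 6., 7., 8., 9., 10.]
sma_3 = rolling_sma(prices, 3)              # one value per 3-day window
sma_all = rolling_sma(prices, len(prices))  # single window: [5.5], as in the cell above
```

With a window as long as the input, the result collapses to the single average computed by `simple_moving_average`.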

### Weighted Moving Average (WMA)
Sum of all items, each scaled by a weight, divided by the sum of the weights. In this context it is applied to closing prices over time: each daily closing price in a given time range is multiplied by a weight that decreases with the age of the price, and the weighted sum is divided by the sum of the weights. This places emphasis on the most recent closing prices.
#### WMA = $\Large\frac{w_1c_1 + w_2c_2 + ... + w_nc_n}{w_1 + w_2 + ... + w_n} = \frac{\sum_{i=1}^n w_ic_i}{\sum_{i=1}^n w_i}$
---

In [4]:
# WMA accepting an array ordered from oldest to most recent price. Items at the end of the input array are weighted more.


def weighted_moving_average(arr):
    # TA-Lib returns one value per input position; the last one covers the full window
    wma = talib.WMA(arr, timeperiod=len(arr))
    output = wma[-1]
    return output

weighted_moving_average(example_inputs)



7.0
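
For reference, the same value can be reproduced without TA-Lib. This is a minimal sketch that assumes linearly increasing weights 1, 2, ..., n from oldest to newest, which is the weighting that reproduces the 7.0 above; the helper name `linear_wma` is illustrative.

```python
def linear_wma(values):
    """Hypothetical helper: WMA with weights 1..n, oldest price first."""
    weights = range(1, len(values) + 1)
    # Numerator: each price scaled by its weight; denominator: sum of the weights.
    weighted_sum = sum(w * c for w, c in zip(weights, values))
    return weighted_sum / sum(weights)

prices = [1., 2., 3., 4., 5., 6., 7., 8., 9., 10.]
wma = linear_wma(prices)  # 385 / 55 = 7.0
```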

### Exponential Moving Average (EMA)

> Also available as a built-in in pandas (see `ewm`)!
 
Weighted moving average where the weights decay exponentially toward 0 for older items. $N$ is the number of days in the time range.
#### EMA = $c_{today}\frac{2}{1 + N} + EMA_{yesterday}(1 - \frac{2}{1 + N})$
---

In [5]:
# EMA

def exponential_moving_average(arr):
    ema = talib.EMA(np.array(arr), timeperiod=len(arr))
    output = ema[len(ema)-1]
    return output
    
exponential_moving_average(example_inputs)

6.2393684801212155
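
The recursion in the formula can also be written out directly. The sketch below assumes the EMA is seeded with the first price; that is an assumption about how the compatibility mode set at the top of the notebook behaves, and under it the loop lands on the same value as above. `recursive_ema` is an illustrative name.

```python
def recursive_ema(values, n):
    """Hypothetical helper: EMA with smoothing factor 2 / (n + 1)."""
    alpha = 2.0 / (n + 1)
    ema = values[0]  # assumption: seed with the first price
    for price in values[1:]:
        # Today's price is weighted by alpha, yesterday's EMA by (1 - alpha).
        ema = alpha * price + (1 - alpha) * ema
    return ema

prices = [1., 2., 3., 4., 5., 6., 7., 8., 9., 10.]
ema = recursive_ema(prices, len(prices))  # approximately 6.2394
```

A different seed (for example, the SMA of the first `n` prices) would give a different result on a series this short.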

### Relative Strength Index (RSI)
 
Oscillates on a scale of 0 to 100. Involves comparing average gain during up periods vs. average loss during down periods.

 #### RSI = $100 - [\large\frac{100}{1+\frac{gain_{avg}}{loss_{avg}}}]$
---

In [6]:
# RSI

def relative_strength_index(arr):
    rsi = talib.RSI(arr, len(arr))
    output = rsi[len(rsi)-1]
    return output

print(relative_strength_index(example_inputs))
    

100.0
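
To make the ratio concrete, here is a sketch that uses plain averages of the gains and losses across the whole series. This is a simplification: library implementations typically smooth the averages (Wilder's method), so values will generally differ on longer series. The name `simple_rsi` and the second price list are illustrative.

```python
def simple_rsi(values):
    """Hypothetical helper: RSI from unsmoothed average gain and loss."""
    changes = [b - a for a, b in zip(values, values[1:])]
    avg_gain = sum(c for c in changes if c > 0) / len(changes)
    avg_loss = sum(-c for c in changes if c < 0) / len(changes)
    if avg_loss == 0:
        # No down periods (a flat series also lands here): RSI saturates at 100.
        return 100.0
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)

rising = simple_rsi([1., 2., 3., 4., 5., 6., 7., 8., 9., 10.])  # every change is a gain
mixed = simple_rsi([10., 11., 10., 12., 11.])  # gains of 1 and 2, losses of 1 and 1
```

The strictly rising example inputs have no losses at all, which is why the indicator sits at its upper bound of 100.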


### Commodity Channel Index (CCI)
 
Compares the current price to the average price over a given time range. The numerator is the typical price (the average of the current period's high, low, and close) minus the SMA of the typical price over the range. The denominator is a constant (0.015) times the mean deviation (the average absolute difference between each period's typical price and that SMA over the given time range).

#### CCI = $\Large\frac{TP_{today} - SMA_n(TP)}{0.015 \cdot \frac{1}{n}\sum_{i=1}^n |TP_i - SMA_n(TP)|}$, where $TP_i = \frac{high_i + low_i + close_i}{3}$
---

In [7]:
# CCI

def commodity_channel_index(highs, lows, closes):
    assert(len(highs) == len(lows) == len(closes))
    cci = talib.CCI(high=highs, low=lows, close=closes, timeperiod=len(highs))
    output = cci[len(cci)-1]
    return output

example_closes = example_inputs
example_highs = np.array([20.,30.,40.,50.,60.,7.,8.,9.,10.,11.])
example_lows = np.array([0.,1.,2.,3.,4.,5.,6.,7.,8.,9.])

print(commodity_channel_index(highs=example_highs, lows=example_lows, closes=example_closes))

-22.22222222222222
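
The same number can be derived step by step from the definition: the typical price per period, its mean over the window, and the mean absolute deviation. A plain-Python sketch using the example arrays above; `cci_last` is an illustrative name, and the sketch assumes the typical prices are not all identical (otherwise the denominator is zero).

```python
def cci_last(highs, lows, closes):
    """Hypothetical helper: CCI of the last period, window = whole input."""
    typical = [(h + l + c) / 3 for h, l, c in zip(highs, lows, closes)]
    sma = sum(typical) / len(typical)
    # Mean deviation: average absolute distance of each typical price from the SMA.
    mean_dev = sum(abs(tp - sma) for tp in typical) / len(typical)
    return (typical[-1] - sma) / (0.015 * mean_dev)

closes = [1., 2., 3., 4., 5., 6., 7., 8., 9., 10.]
highs = [20., 30., 40., 50., 60., 7., 8., 9., 10., 11.]
lows = [0., 1., 2., 3., 4., 5., 6., 7., 8., 9.]
cci = cci_last(highs, lows, closes)  # (10 - 11.5) / (0.015 * 4.5), about -22.22
```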


### Accumulation Distribution (AD)
 
A running total of volume, weighted by where each period's close falls within its high-low range, used to identify divergences between price and volume. For example, if price is rising but the indicator is falling, it may signal that volume will not sustain the price and that the price could drop.

 #### AD = $\sum_{i=1}^n ( \frac{(close_i - low_i) - (high_i - close_i)}{high_i - low_i} volume_i)$
---

In [28]:
# AD

def accumulation_distribution(highs, lows, closes, volumes):
    ad = talib.AD(high=highs, low=lows, close=closes, volume=volumes)
    output = ad[len(ad)-1]
    return output

example_volumes=np.array([100.,101.,105.,102.,104.,108.,103.,107.,109.,106.])

accumulation_distribution(highs=example_highs, lows=example_lows, closes=example_closes, volumes=example_volumes)

-481.4534557229464
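
The summation can be spelled out as a running total, which also exposes the full indicator line rather than only its last value. A sketch with the example arrays above; the name `ad_line` is illustrative, and treating a zero high-low range as a zero contribution is an assumption of this sketch.

```python
def ad_line(highs, lows, closes, volumes):
    """Hypothetical helper: running A/D total, one value per period."""
    total, line = 0.0, []
    for h, l, c, v in zip(highs, lows, closes, volumes):
        # Close location value in [-1, 1]; assumption: zero range contributes 0.
        clv = ((c - l) - (h - c)) / (h - l) if h != l else 0.0
        total += clv * v
        line.append(total)
    return line

closes = [1., 2., 3., 4., 5., 6., 7., 8., 9., 10.]
highs = [20., 30., 40., 50., 60., 7., 8., 9., 10., 11.]
lows = [0., 1., 2., 3., 4., 5., 6., 7., 8., 9.]
volumes = [100., 101., 105., 102., 104., 108., 103., 107., 109., 106.]
ad = ad_line(highs, lows, closes, volumes)  # ad[-1] is about -481.45
```

In the last five periods the close sits exactly mid-range, so those periods add nothing and the line is flat from the fifth value onward.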

### Stochastic K Percent
 
The numerator is today's close minus the lowest low over the given time range. The denominator is the highest high minus the lowest low over the same range. The ratio is then multiplied by 100.

 #### K = $(\Large\frac{close_{today} - low_{lowest}}{high_{highest} - low_{lowest}}) 100$
---

In [9]:
# Returns a dict with the latest fast K% and fast D%

def fast_stochastic(closes, lows, highs):
    fast_k, fast_d = talib.STOCHF(high=highs, low=lows, close=closes)
    output = {"fast_k": fast_k[len(fast_k)-1], "fast_d": fast_d[len(fast_d)-1]}
    return output


fast_stochastic(highs=example_highs,
                lows=example_lows, closes=example_closes)

{'fast_k': 83.33333333333334, 'fast_d': 33.67794486215539}
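
A sketch of the K% formula for the most recent day. The 5-day lookback is an assumption: it is the window that reproduces the `fast_k` value above from the example arrays. `fast_k_last` is an illustrative name.

```python
def fast_k_last(highs, lows, closes, period=5):
    """Hypothetical helper: K% of the latest close over the last `period` days."""
    highest = max(highs[-period:])  # highest high in the lookback window
    lowest = min(lows[-period:])    # lowest low in the lookback window
    return 100.0 * (closes[-1] - lowest) / (highest - lowest)

closes = [1., 2., 3., 4., 5., 6., 7., 8., 9., 10.]
highs = [20., 30., 40., 50., 60., 7., 8., 9., 10., 11.]
lows = [0., 1., 2., 3., 4., 5., 6., 7., 8., 9.]
k = fast_k_last(highs, lows, closes)  # 100 * (10 - 5) / (11 - 5), about 83.33
```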

### Stochastic D Percent
 
SMA of the stochastic K% over a given time range. 

 #### D = $SMA(k_0, k_1, ... , k_n)$
---

In [10]:
# Returns a dict with the latest slow K% and slow D%

def slow_stochastic(closes, lows, highs):
    slow_k, slow_d = talib.STOCH(high=highs, low=lows, close=closes)
    output = {"slow_k": slow_k[len(slow_k)-1], "slow_d": slow_d[len(slow_d)-1]}
    return output


slow_stochastic(highs=example_highs,
                lows=example_lows, closes=example_closes)

{'slow_k': 33.67794486215539, 'slow_d': 17.024691249521297}
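
D% can be sketched as a short SMA over the most recent K% values. The 5-day K% lookback and 3-day D% window are assumptions; with them the result matches the `fast_d` shown earlier, which is the same number reported as `slow_k` above. The helper names are illustrative.

```python
def fast_k_series(highs, lows, closes, period=5):
    """Hypothetical helper: K% for every day with a full lookback window."""
    ks = []
    for i in range(period - 1, len(closes)):
        highest = max(highs[i - period + 1:i + 1])
        lowest = min(lows[i - period + 1:i + 1])
        ks.append(100.0 * (closes[i] - lowest) / (highest - lowest))
    return ks

def fast_d_last(ks, d_period=3):
    """Hypothetical helper: D% as the SMA of the last `d_period` K% values."""
    return sum(ks[-d_period:]) / d_period

closes = [1., 2., 3., 4., 5., 6., 7., 8., 9., 10.]
highs = [20., 30., 40., 50., 60., 7., 8., 9., 10., 11.]
lows = [0., 1., 2., 3., 4., 5., 6., 7., 8., 9.]
d = fast_d_last(fast_k_series(highs, lows, closes))  # about 33.68
```

Smoothing this D% series once more with another SMA is the usual route to the slow stochastic's D%.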

### Moving Average Convergence \ Divergence (MACD)
 
The 12-day EMA minus the 26-day EMA.

 #### MACD = $EMA_{12days} - EMA_{26days}$
---

In [11]:
# MACD --> talib.MACDFIX returns a triple (MACD, MACD signal, MACD histogram); only the latest MACD value is returned here

import random

def moving_average_convergence_divergence(closes):
    macd, macdsignal, macdhist = talib.MACDFIX(closes)
    output = macd[-1]
    print(output)
    return output


# Random closes, so the printed value changes on every run
example_random_closes = np.array([float(random.randrange(0, 300)) for _ in range(100)])
out = moving_average_convergence_divergence(example_random_closes)

-0.9076387379770949
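
The formula can be sketched from two EMAs. This assumes each EMA is seeded with the first close and uses the 2 / (N + 1) smoothing factor; `talib.MACDFIX` may use slightly different constants, so the sketch illustrates the definition rather than reproducing the value above. The names `ema_series` and `macd_line` are hypothetical.

```python
def ema_series(values, n):
    """Hypothetical helper: EMA per day, seeded with the first value."""
    alpha = 2.0 / (n + 1)
    out = [values[0]]
    for price in values[1:]:
        out.append(alpha * price + (1 - alpha) * out[-1])
    return out

def macd_line(closes, fast=12, slow=26):
    """Hypothetical helper: fast EMA minus slow EMA, one value per day."""
    fast_ema = ema_series(closes, fast)
    slow_ema = ema_series(closes, slow)
    return [f - s for f, s in zip(fast_ema, slow_ema)]

flat = macd_line([50.0] * 40)                      # no trend: MACD stays at about 0
rising = macd_line([float(i) for i in range(40)])  # uptrend: the fast EMA leads
```

On a steadily rising series the 12-day EMA tracks price more closely than the 26-day EMA, so the MACD is positive; on a falling series the sign flips.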
