# Technicals extraction

------

**Approach:** Sift through the technical functions in the `ta.py` file and research sensible parameters for each one. Some functions seem to warrant several runs with different parameters; in that case the values are given as a list.
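As a rough sketch of how the list notation could be turned into one argument tuple per run: `expand_params` below is a hypothetical helper, not something that exists in `ta.py` or in this notebook, and it assumes parameters are passed positionally in the order they are listed.

```python
from itertools import product


def expand_params(spec):
    """Turn {param: value-or-list} into a list of positional-argument tuples.

    A list value means "one run per value"; a scalar is used as-is in every run.
    An empty spec yields a single run with no extra arguments.
    """
    choices = [v if isinstance(v, list) else [v] for v in spec.values()]
    return [tuple(combo) for combo in product(*choices)]


print(expand_params({"n": [5, 20, 90, 260]}))        # [(5,), (20,), (90,), (260,)]
print(expand_params({"n_fast": 12, "n_slow": 26}))   # [(12, 26)]
print(expand_params({}))                             # [()]
```

Each tuple corresponds to a single call of the indicator function.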

### Good to go

- 'MA'
    - n: [5, 20, 90, 260]
- 'STDDEV'
    - n: [5, 20, 90, 260]
- 'RSI'
    - n: [6, 12]
- 'MACD'
    - n_fast: 12
    - n_slow: 26

- 'BBANDS'
    - n: [5, 20, 90, 260]

- 'MFI' (money flow index)
    - n: 14
- 'Chaikin'
    - None
- 'EMA'
    - n: [5, 20, 90, 260]
- 'KST'
    - r: (10, 10, 10, 15)
    - n: (10, 15, 20, 30)
    
- 'TSI'
    - r: 25
    - s: 13

- 'TRIX'
    - n: [5, 20, 90, 260]

- 'STOK'
    - None

- 'STO'
    - n: [5, 20, 90, 260]

- 'ROC'
    - n: [5, 20, 90, 260]

- 'PPSR'
    - None

- 'OBV'
    - n: [5, 20, 90, 260]

- 'MassI'
    - None
    
- 'MOM'
    - n: 1 

- 'COPP' 
    - n: 10

- 'ACCDIST'
    - n: 1

- 'ADX'
    - n: 14
    - n_ADX: 50
- 'ATR'
    - n: 14

### Potential implementation

- differences on any or all of these columns
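
A minimal sketch of what differencing could look like with pandas. The helper name `add_differences`, the `_diff{periods}` suffix, and the toy column values are assumptions for illustration; nothing like this is implemented in the notebook yet.

```python
import pandas as pd


def add_differences(df, columns=None, periods=1):
    """Append period-over-period differences of the chosen columns.

    If no columns are given, every numeric column is differenced.
    """
    if columns is None:
        columns = df.select_dtypes("number").columns
    diffs = df[list(columns)].diff(periods=periods).add_suffix(f"_diff{periods}")
    return df.join(diffs)


# Toy frame standing in for an extracted technicals table
toy = pd.DataFrame({"Close": [10.0, 11.0, 13.0], "MA_5": [10.0, 10.5, 11.0]})
out = add_differences(toy, ["MA_5"])
print(out)
```

The first row of each differenced column is NaN because there is no earlier value to subtract.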

### Missing end data

- 'ULTOSC'
- 'Vortex'
- 'EOM' (ease of movement)
- 'KELCH'
- 'DONCH'
- 'CCI' (commodity channel index)

In [1]:
cd .. 

/home/jovyan/work/dsi-plus-2/critical_feature_extraction


In [2]:
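# Load the technicals module (lib/ta.py)
from lib import ta

import inspect
import string
import os

import pickle

import pandas as pd  # needed for pd.read_csv below; may already be provided by __init__.py

import warnings
warnings.filterwarnings('ignore')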

In [8]:
%run __init__.py

In [3]:
# Hacky way to get a dictionary of all the imported technical functions:
# keep every member of `ta` whose name starts with an uppercase letter
tech_funcs = dict(filter(lambda x: x[0][0] in string.ascii_uppercase, inspect.getmembers(ta)))

In [4]:
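# Each tuple contains the *args for a single run
# Best guesses from the internet, the "A critical extraction .." paper, and the ta.py code
# Only the 5- and 20-period runs are included here; 90 and 260 from the list above are left out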

grid = {"MA": [(5,), (20,)],
        "STDDEV": [(5,), (20,)],
        "RSI": [(6,), (12,)],
        "MACD": [(12, 26)],
        "BBANDS": [(5,), (20,)],
        "MFI": [(14,)],
        "Chaikin": [()],
        "EMA": [(5,), (20,)],
        "KST": [(10, 10, 10, 15, 10, 15, 20, 30)],
        "TSI": [(25, 13)],
        "TRIX": [(5,), (20,)],
        "STOK": [()],
        "STO": [(5,), (20,)],
        "ROC": [(5,), (20,)],
        "PPSR": [()],
        "OBV": [(5,), (20,)],
        "MassI": [()],
        "MOM": [(1,)],
        "COPP": [(10,)],
        "ADX": [(14, 50)],
        "ATR": [(14,)],
        "FORCE": [(2,)],
        "ACCDIST": [(1,)]}


In [5]:
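# Serially apply every technical function in the dictionary to an initial dataframe
def extract_technicals(df, tech_funcs, grid):
    """Run each function once per argument tuple in grid[name], feeding each
    call the dataframe returned by the previous one."""
    output = df
    for name, func in tech_funcs.items():
        arg_list = grid[name]  # raises KeyError if a function has no grid entry
        for arg_tuple in arg_list:
            output = func(output, *arg_tuple)

    return output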
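
To make the chaining concrete, here is a self-contained toy run. The `MA` and `MOM` functions below are simplified stand-ins written for this example (they are not the `ta.py` implementations); the only assumption carried over is that each function takes a dataframe plus positional args and returns the dataframe with new columns joined on. Unlike the cell above, this sketch skips functions that have no grid entry instead of raising.

```python
import pandas as pd


def MA(df, n):
    # Toy moving average on the Close column
    return df.join(df["Close"].rolling(n).mean().rename(f"MA_{n}"))


def MOM(df, n):
    # Toy momentum: n-period difference of Close
    return df.join(df["Close"].diff(n).rename(f"MOM_{n}"))


def extract(df, funcs, grid):
    out = df
    for name, func in funcs.items():
        for args in grid.get(name, []):  # no grid entry -> function is skipped
            out = func(out, *args)
    return out


prices = pd.DataFrame({"Close": [1.0, 2.0, 3.0, 4.0, 5.0]})
funcs = {"MA": MA, "MOM": MOM, "UNUSED": MA}
toy_grid = {"MA": [(2,), (3,)], "MOM": [(1,)]}

result = extract(prices, funcs, toy_grid)
print(list(result.columns))  # ['Close', 'MA_2', 'MA_3', 'MOM_1']
```

Each argument tuple adds one column here; indicators that return several series per call would add more.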

In [6]:
#serialize technical functions extraction objects
tech_func_tools = [tech_funcs, grid]

with open("lib/tech_func_tools.pkl", "wb") as dump_file:
    pickle.dump(tech_func_tools, dump_file)
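
One thing to keep in mind when loading this file elsewhere: `pickle` stores module-level functions by reference (module and name), not by value, so the defining module has to be importable wherever the pickle is loaded. The sketch below shows this with a standard-library function standing in for the `ta` functions; the dictionary contents are made up for the example.

```python
import pickle
import statistics

# Stand-ins for tech_funcs and grid, using a stdlib function
funcs = {"mean": statistics.mean}
toy_grid = {"mean": [()]}

payload = pickle.dumps([funcs, toy_grid])
loaded_funcs, loaded_grid = pickle.loads(payload)

# The function comes back as the very same object, looked up by name on import
print(loaded_funcs["mean"] is statistics.mean)  # True
print(loaded_grid == toy_grid)                  # True
```

For `lib/tech_func_tools.pkl` this presumably means `lib.ta` must be importable from the loading script's working directory.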

### Extract technicals from every individual stock CSV

In [10]:
# Grab the list of CSV names in the directory
src_dir = "data/sandp500/individual_stocks_5yr/"
dst_dir = "data/sandp500/individual_stocks_5yr_TECHNICALS/"
individuals = os.listdir(src_dir)

for csv_name in individuals:
    df = pd.read_csv(os.path.join(src_dir, csv_name))
    try:
        df_technicals = extract_technicals(df, tech_funcs, grid)
        df_technicals.to_csv(os.path.join(dst_dir, csv_name))
    except IndexError:
        print(f"Technical extraction failed on {csv_name}")
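
A possible variation on the loop above, sketched with `pathlib` and a temporary directory so it runs anywhere. `process_directory` is a hypothetical helper; catching `KeyError` in addition to `IndexError`, creating the output directory, and writing without the index column are all assumptions of this sketch rather than what the notebook does.

```python
import tempfile
from pathlib import Path

import pandas as pd


def process_directory(src_dir, dst_dir, transform):
    """Apply `transform` to every CSV in src_dir; return a list of failures."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    failed = []
    for csv_path in sorted(src_dir.glob("*.csv")):
        try:
            df = pd.read_csv(csv_path)
            transform(df).to_csv(dst_dir / csv_path.name, index=False)
        except (IndexError, KeyError) as exc:
            failed.append((csv_path.name, repr(exc)))
    return failed


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "in"
    src.mkdir()
    # One well-formed file and one that lacks the Close column
    pd.DataFrame({"Close": [1.0, 2.0]}).to_csv(src / "OK_data.csv", index=False)
    pd.DataFrame({"Open": [1.0, 2.0]}).to_csv(src / "BAD_data.csv", index=False)

    failures = process_directory(
        src, Path(tmp) / "out", lambda df: df.assign(Ret=df["Close"].diff())
    )
    written = sorted(p.name for p in (Path(tmp) / "out").iterdir())

print(written)                          # ['OK_data.csv']
print([name for name, _ in failures])   # ['BAD_data.csv']
```

Collecting the failures in a list makes it easy to re-run only the problem files later.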


### Testing `extract_technicals` on a single stock CSV

In [11]:
sp = pd.read_csv("data/sandp500/individual_stocks_5yr/A_data.csv")
sp_technicals = extract_technicals(sp, tech_funcs, grid)
sp_technicals.shape

(1258, 49)

### Notes

- `ACCDIST` appears to output all zeros.
- `TRIX` shows barely any variance, and `TRIX_260` is about half missing. The args for this function are probably wrong.
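
Checks like these can be done per column in one pass. The sketch below uses a made-up toy frame (the column name `ACCDIST_1` and all values are assumptions chosen to mimic the symptoms described, not actual output); `column_health` is a hypothetical helper.

```python
import numpy as np
import pandas as pd


def column_health(df):
    """Per-column summary: fraction missing, all-zero flag, standard deviation."""
    numeric = df.select_dtypes("number")
    return pd.DataFrame({
        "missing_frac": numeric.isna().mean(),
        "all_zero": numeric.eq(0).all(),
        "std": numeric.std(),
    })


toy = pd.DataFrame({
    "Close": [1.0, 2.0, 3.0, 4.0],
    "ACCDIST_1": [0.0, 0.0, 0.0, 0.0],
    "TRIX_260": [np.nan, np.nan, 0.001, 0.001],
})

report = column_health(toy)
print(report)
```

Sorting the report by `missing_frac` or `std` on the real `sp_technicals` frame should surface other suspicious columns as well.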