## **This notebook generates a DataFrame of technical indicators for a given stock**

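The stock must have a raw prices file named `{stock}_Close.csv` in the `00_data/raw/single_name` folder. That file is only used to determine the date range; the OHLCV data itself is downloaded from Yahoo Finance, and the result is written to `00_data/raw/technical`.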
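
As a rough illustration of the file layout the loading cell appears to expect, the sketch below parses a tiny in-memory sample with the same `read_csv` arguments used in this notebook. The exact layout (semicolon separator, comma decimal mark, an unnamed index column, `dd/mm/yy` dates) is an assumption inferred from those arguments and from the printed DataFrame, and `sample_csv` is a hypothetical stand-in for the real file.

```python
import io

import pandas as pd

# Hypothetical two-row sample in the assumed layout: ';' separator, ',' decimal mark,
# an unnamed index column, and dates written as dd/mm/yy.
sample_csv = ";Date;Close\n0;18/06/09;918,370\n1;19/06/09;921,227\n"

# Same parsing arguments as the loading cell of this notebook
sample = pd.read_csv(io.StringIO(sample_csv), sep=";", decimal=",")
sample["Date"] = pd.to_datetime(sample["Date"], format="%d/%m/%y")

print(sample.dtypes)
```

If the real file uses a different separator or date format, the `sep`, `decimal`, and `format` arguments would need to change accordingly.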

In [3]:
import pandas as pd
import os
from ta import add_all_ta_features
from ta.utils import dropna
import yfinance as yf

In [4]:
project_dir = "/home/jupyter-tfg2425paula/prediction_project_v3"
os.chdir(project_dir)

raw_data_dir = os.path.join(project_dir, "00_data")
output_data_dir = os.path.join(raw_data_dir, "raw/technical")

securities = "raw/single_name"
stocks_folder = os.path.join(raw_data_dir, securities)

stock = 'SPX'
filename = f'{stock}_Close.csv'

In [5]:
df = pd.read_csv(os.path.join(stocks_folder, filename), sep=";", decimal=",")
df["Date"] = pd.to_datetime(df["Date"], format="%d/%m/%y")
df

Unnamed: 0,Date,Close
0,2009-06-18,918.370
1,2009-06-19,921.227
2,2009-06-22,893.042
3,2009-06-23,895.098
4,2009-06-24,900.940
...,...,...
4023,2024-11-19,5916.980
4024,2024-11-20,5917.109
4025,2024-11-21,5948.711
4026,2024-11-22,5969.340


Many technical indicators can be derived from price and volume data. Some common ones, grouped by family:

**Momentum indicators**

- RSI (Relative Strength Index): Measures the speed and change of price movements.
- Stochastic Oscillator: Compares the closing price to a price range over a period.
- Williams %R: Indicates overbought/oversold levels.
- Awesome Oscillator: Measures momentum using two SMAs (simple moving averages).
- KAMA (Kaufman’s Adaptive Moving Average): Adaptive moving average based on volatility.
- PPO (Percentage Price Oscillator): Measures the difference between a fast and a slow EMA as a percentage of the slow (longer-period) EMA.
- PVO (Percentage Volume Oscillator): Similar to PPO but based on volume.
- ROC (Rate of Change): Measures the percentage change in price.
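
To make two of the momentum definitions above concrete, here is a minimal pandas sketch of ROC and RSI. The function names `rate_of_change` and `rsi`, the default windows, and the choice of Wilder-style smoothing for the RSI are assumptions made for illustration; the values produced by the `ta` library may differ slightly.

```python
import pandas as pd


def rate_of_change(close: pd.Series, window: int = 12) -> pd.Series:
    """Percentage change of the close versus `window` periods earlier."""
    return close.pct_change(periods=window) * 100


def rsi(close: pd.Series, window: int = 14) -> pd.Series:
    """RSI with Wilder-style smoothing (an EMA with alpha = 1 / window)."""
    delta = close.diff()
    gain = delta.clip(lower=0)        # positive moves, zero otherwise
    loss = (-delta).clip(lower=0)     # size of negative moves, zero otherwise
    avg_gain = gain.ewm(alpha=1 / window, min_periods=window, adjust=False).mean()
    avg_loss = loss.ewm(alpha=1 / window, min_periods=window, adjust=False).mean()
    rs = avg_gain / avg_loss          # relative strength
    return 100 - 100 / (1 + rs)


# Illustrative prices only, not market data
demo_close = pd.Series([100.0, 110.0, 121.0, 115.0, 118.0, 117.0, 120.0, 125.0])
print(rate_of_change(demo_close, window=1))
print(rsi(demo_close, window=3))
```

The RSI is bounded between 0 and 100: a series that only rises has no losses, so the value saturates at 100.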

**Trend indicators**

- MACD (Moving Average Convergence Divergence): Identifies trend direction and strength.
- SMA (Simple Moving Average): Calculates the average price over a period.
- EMA (Exponential Moving Average): Weighted moving average that gives more weight to recent prices.
- WMA (Weighted Moving Average): Similar to SMA but with a weighting factor.
- DEMA (Double Exponential Moving Average): Reduces lag by combining an EMA with an EMA of that EMA (2·EMA − EMA(EMA)).
- TEMA (Triple Exponential Moving Average): Further reduces lag compared to DEMA.
- TRIX: Rate of change of a triple-smoothed EMA, used to identify trends while filtering out short-term noise.
- ADX (Average Directional Movement Index): Measures trend strength.
- Aroon Indicator: Measures the time since the highest/lowest point over a period.
- PSAR (Parabolic Stop and Reverse): Provides potential reversal points in a trend.
- Ichimoku Cloud: Identifies support, resistance, and trend strength.
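
The moving averages and MACD from the list above can be sketched in a few lines of pandas. The helper names `sma`, `ema`, and `macd` are hypothetical, and the 12/26/9 periods are the conventional defaults rather than values taken from this notebook.

```python
import pandas as pd


def sma(close: pd.Series, window: int) -> pd.Series:
    """Simple moving average: unweighted mean of the last `window` closes."""
    return close.rolling(window).mean()


def ema(close: pd.Series, span: int) -> pd.Series:
    """Exponential moving average: recent closes receive more weight."""
    return close.ewm(span=span, adjust=False).mean()


def macd(close: pd.Series, fast: int = 12, slow: int = 26, signal: int = 9) -> pd.DataFrame:
    """MACD line (fast EMA minus slow EMA), its signal line, and their difference."""
    macd_line = ema(close, fast) - ema(close, slow)
    signal_line = ema(macd_line, signal)
    return pd.DataFrame(
        {"macd": macd_line, "signal": signal_line, "hist": macd_line - signal_line}
    )


# Illustrative prices only, not market data
demo_close = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])
print(sma(demo_close, window=3))
print(macd(demo_close).tail(1))
```

On a flat price series the fast and slow EMAs coincide, so the MACD line stays at zero; a positive MACD indicates that recent prices sit above the longer-term average.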

**Volatility indicators**

- Bollinger Bands: Measures price volatility and potential breakouts.
- Average True Range (ATR): Measures market volatility.
- Donchian Channels: Identifies breakout levels over a period.
- Keltner Channels: Combines ATR and EMA to define price range.
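
As an example of a volatility indicator, Bollinger Bands can be sketched as a rolling mean plus and minus a multiple of the rolling standard deviation. The function name `bollinger_bands`, the 20-period window, the 2-standard-deviation width, and the use of the population standard deviation (`ddof=0`) are assumptions for illustration.

```python
import pandas as pd


def bollinger_bands(close: pd.Series, window: int = 20, n_std: float = 2.0) -> pd.DataFrame:
    """Middle band (rolling mean) with upper/lower bands `n_std` deviations away."""
    mid = close.rolling(window).mean()
    std = close.rolling(window).std(ddof=0)  # population standard deviation (assumed)
    return pd.DataFrame(
        {"lower": mid - n_std * std, "mid": mid, "upper": mid + n_std * std}
    )


# Illustrative prices only, not market data
demo_close = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])
print(bollinger_bands(demo_close, window=5))
```

The bands widen when recent prices are dispersed and narrow when they are stable, which is why the band width is often read as a volatility measure.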

**Volume indicators**

- OBV (On-Balance Volume): Combines volume and price movements to identify trends.
- CMF (Chaikin Money Flow): Measures money flow volume over a period.
- VWAP (Volume Weighted Average Price): Average price weighted by volume.
- ADI (Accumulation/Distribution Index): Tracks supply and demand using volume and price.
- Ease of Movement (EOM): Relates price movement to volume.
- MFI (Money Flow Index): Combines price and volume to identify overbought/oversold levels.
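
For the volume family, On-Balance Volume is one of the simplest to write by hand: volume is added on up days and subtracted on down days. The function name `on_balance_volume` is hypothetical, and starting the running total at zero is an assumption; libraries may use a different starting convention, so absolute levels can differ while day-to-day changes agree.

```python
import numpy as np
import pandas as pd


def on_balance_volume(close: pd.Series, volume: pd.Series) -> pd.Series:
    """Running total of volume, signed by the direction of the close-to-close move."""
    direction = np.sign(close.diff()).fillna(0)  # +1 up day, -1 down day, 0 unchanged/first
    return (direction * volume).cumsum()


# Illustrative values only, not market data
demo_close = pd.Series([10.0, 11.0, 10.0, 10.0, 12.0])
demo_volume = pd.Series([100, 200, 300, 400, 500])
print(on_balance_volume(demo_close, demo_volume))
```

A rising OBV alongside rising prices suggests that volume is confirming the trend.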

In [10]:
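stock = 'SPX'
ticker = '^GSPC'  # Yahoo Finance ticker for the S&P 500 index

# Download OHLCV data for the date range covered by the raw prices file.
# Note: yfinance treats `end` as exclusive, so the last date in df is not downloaded.
yf_data = yf.download(ticker, start=df["Date"].min().strftime('%Y-%m-%d'), 
                      end=df["Date"].max().strftime('%Y-%m-%d'))

# Drop the ticker level of the column MultiIndex and turn the Date index into a column
yf_data.columns = yf_data.columns.droplevel(1)
yf_data = yf_data.reset_index()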

df_with_indicators = add_all_ta_features(
    yf_data,
    open="Open",      # Column with opening prices
    high="High",      # Column with daily highs
    low="Low",        # Column with daily lows
    close="Close",    # Column with closing prices
    volume="Volume",  # Column with traded volume
    fillna=False      # Leave missing values as NaN; they are handled below
)

# Linearly interpolate gaps in the indicator columns
df_with_indicators = df_with_indicators.interpolate(method="linear")

# Any remaining NaNs sit only at the beginning or the end (e.g. indicator warm-up rows), so drop them
df_with_indicators = df_with_indicators.dropna().reset_index(drop=True)
df_with_indicators

[*********************100%***********************]  1 of 1 completed


Price,Date,Adj Close,Close,High,Low,Open,Volume,volume_adi,volume_obv,volume_cmf,...,momentum_ppo,momentum_ppo_signal,momentum_ppo_hist,momentum_pvo,momentum_pvo_signal,momentum_pvo_hist,momentum_kama,others_dr,others_dlr,others_cr
0,2009-09-29,1060.609985,1060.609985,1069.619995,1057.829956,1063.689941,4949900000,8.822506e+10,63841220000,0.178556,...,1.370567,1.542708,-0.172140,-3.111790,-0.621416,-2.490374,1060.230193,-0.222958,-0.223207,15.488310
1,2009-09-30,1057.079956,1057.079956,1063.400024,1046.469971,1061.020020,5998860000,8.974511e+10,57842360000,0.252401,...,1.286471,1.491460,-0.204989,-1.458611,-0.788855,-0.669757,1060.142136,-0.332830,-0.333385,15.103930
2,2009-10-01,1029.849976,1029.849976,1054.910034,1029.449951,1054.910034,5791450000,8.413565e+10,52050910000,0.217908,...,0.999488,1.393066,-0.393577,-0.481878,-0.727459,0.245581,1057.476868,-2.575962,-2.609721,12.138896
3,2009-10-02,1025.209961,1025.209961,1030.599976,1019.950012,1029.709961,5583240000,8.406746e+10,46467670000,0.173294,...,0.727650,1.259983,-0.532333,-0.030476,-0.588063,0.557586,1053.753481,-0.450552,-0.451571,11.633652
4,2009-10-05,1040.459961,1040.459961,1042.579956,1025.920044,1026.869995,4313310000,8.728303e+10,50780980000,0.165082,...,0.622495,1.132485,-0.509990,-1.594268,-0.789304,-0.804964,1053.210416,1.487500,1.476545,13.294202
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3810,2024-11-18,5893.620117,5893.620117,5908.120117,5865.950195,5874.169922,3983860000,1.852248e+12,1171234940000,-0.078266,...,0.806238,0.827522,-0.021283,4.040025,4.640095,-0.600069,5869.864420,0.391781,0.391016,541.747896
3811,2024-11-19,5916.979980,5916.979980,5923.509766,5855.290039,5870.049805,4036940000,1.855512e+12,1175271880000,-0.055806,...,0.762475,0.814512,-0.052037,3.284419,4.368960,-1.084540,5873.232034,0.396358,0.395575,544.291518
3812,2024-11-20,5917.109863,5917.109863,5920.669922,5860.560059,5914.339844,3772620000,1.858838e+12,1179044500000,-0.014326,...,0.719648,0.795539,-0.075891,2.149823,3.925132,-1.775309,5873.611437,0.002195,0.002195,544.305661
3813,2024-11-21,5948.709961,5948.709961,5963.319824,5887.259766,5940.580078,4230120000,1.861443e+12,1183274620000,-0.005008,...,0.720529,0.780537,-0.060008,2.109997,3.562105,-1.452108,5874.798947,0.534046,0.532625,547.746550
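
The cell above drops rows on the assumption that incomplete rows occur only at the beginning or the end of the data. A small check along the following lines could verify that assumption before calling `dropna()`. The helper name `nan_rows_only_at_edges` is hypothetical and not part of pandas or `ta`.

```python
import numpy as np
import pandas as pd


def nan_rows_only_at_edges(frame: pd.DataFrame) -> bool:
    """True if no incomplete row lies between the first and last complete rows."""
    complete = frame.notna().all(axis=1).to_numpy()
    if not complete.any():
        return True  # no complete rows at all, so there is no interior to check
    first = complete.argmax()                           # position of first complete row
    last = len(complete) - 1 - complete[::-1].argmax()  # position of last complete row
    return bool(complete[first:last + 1].all())


# Illustrative frames only: one with leading NaNs, one with an interior gap
leading = pd.DataFrame({"a": [np.nan, 1.0, 2.0, 3.0], "b": [np.nan, np.nan, 5.0, 6.0]})
interior = pd.DataFrame({"a": [1.0, np.nan, 3.0]})
print(nan_rows_only_at_edges(leading), nan_rows_only_at_edges(interior))
```

If the check returned `False`, `dropna()` would remove rows from the middle of the series and leave gaps in the dates.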


In [11]:
df_with_indicators.to_csv(os.path.join(output_data_dir, f"{stock}_technical.csv"), index=False)