# Stock Price Prediction Using Technical Indicators and Machine Learning

This project aims to predict future stock prices (or the direction of price movement) by deriving technical indicators from historical market data and applying machine learning algorithms for pattern recognition and forecasting.

Steps:

1. [Retrieving the data](#retrieving-the-data) (includes an explanation of the data)
2. [Finding the indicators](#finding-indicators)
3. [Predicting by using technical indicators](#predicting-by-using-technical-indicators)
4. [Results](#results)


### Retrieving the data

In [7]:
import yfinance as yf 
import ta
from datetime import datetime

In [10]:
end_date = datetime.now().strftime("%Y-%m-%d")  # strftime formats the datetime as a "YYYY-MM-DD" string
df = yf.download('AMD', start='2015-01-01', end=end_date)
df

[*********************100%***********************]  1 of 1 completed


Price,Close,High,Low,Open,Volume
Ticker,AMD,AMD,AMD,AMD,AMD
Date,,,,,
2015-01-02,2.670000,2.670000,2.670000,2.670000,0
2015-01-05,2.660000,2.700000,2.640000,2.670000,8878200
2015-01-06,2.630000,2.660000,2.550000,2.650000,13912500
2015-01-07,2.580000,2.650000,2.540000,2.630000,12377600
2015-01-08,2.610000,2.650000,2.560000,2.590000,11136600
...,...,...,...,...,...
2025-11-06,237.699997,253.509995,235.740005,253.470001,66049700
2025-11-07,233.539993,235.869995,224.639999,230.940002,52162600
2025-11-10,243.979996,248.899994,240.500000,242.139999,43361600
2025-11-11,237.520004,248.460007,234.639999,241.660004,61336800


In [12]:
df.columns

MultiIndex([( 'Close', 'AMD'),
            (  'High', 'AMD'),
            (   'Low', 'AMD'),
            (  'Open', 'AMD'),
            ('Volume', 'AMD')],
           names=['Price', 'Ticker'])

In [16]:
if df.columns.nlevels > 1:  # guard: a second droplevel('Ticker') on already flat columns raises KeyError
    df.columns = df.columns.droplevel('Ticker')

In [17]:
df.columns

Index(['Close', 'High', 'Low', 'Open', 'Volume'], dtype='object', name='Price')

In [19]:
df = ta.add_all_ta_features(df, open="Open", high="High", 
                             low="Low", close="Close", volume="Volume")
df



Date,Close,High,Low,Open,Volume,volume_adi,volume_obv,volume_cmf,volume_fi,volume_em,...,momentum_ppo,momentum_ppo_signal,momentum_ppo_hist,momentum_pvo,momentum_pvo_signal,momentum_pvo_hist,momentum_kama,others_dr,others_dlr,others_cr
2015-01-02,2.670000,2.670000,2.670000,2.670000,0,0.000000e+00,0,,,,...,,,,,,,,,,0.000000
2015-01-05,2.660000,2.700000,2.640000,2.670000,8878200,-2.959400e+06,-8878200,,,0.000000,...,,,,,,,,-0.374531,-0.375235,-0.374531
2015-01-06,2.630000,2.660000,2.550000,2.650000,13912500,3.364480e+06,-22790700,,,-0.051393,...,,,,,,,,-1.127818,-1.134227,-1.498126
2015-01-07,2.580000,2.650000,2.540000,2.630000,12377600,-1.124852e+04,-35168300,,,-0.008887,...,,,,,,,,-1.901148,-1.919452,-3.370792
2015-01-08,2.610000,2.650000,2.560000,2.590000,11136600,1.226119e+06,-24031700,,,0.008081,...,,,,,,,,1.162790,1.156081,-2.247198
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2025-11-06,237.699997,253.509995,235.740005,253.470001,66049700,3.523881e+09,8141772500,0.001628,-4.522006e+07,-177.700582,...,6.700559,8.231625,-1.531066,-6.321933,-6.525175,0.203242,244.919966,-7.267971,-7.545626,8802.621354
2025-11-07,233.539993,235.869995,224.639999,230.940002,52162600,3.554398e+09,8089609900,0.113529,-6.975956e+07,-309.369298,...,5.717830,7.728866,-2.011036,-6.558789,-6.531898,-0.026892,244.281359,-1.750107,-1.765602,8646.815978
2025-11-10,243.979996,248.899994,240.500000,242.139999,43361600,3.546965e+09,8132971500,0.146135,4.876832e+06,279.828026,...,5.224229,7.227938,-2.003710,-7.947323,-6.814983,-1.132341,244.269681,4.470327,4.373290,9037.827294
2025-11-11,237.520004,248.460007,234.639999,241.660004,61336800,3.511192e+09,8071634700,0.148372,-5.242489e+07,-70.973606,...,4.567162,6.695783,-2.128621,-6.385125,-6.729011,0.343886,243.923546,-2.647755,-2.683439,8795.880056


In [20]:
df.columns

Index(['Close', 'High', 'Low', 'Open', 'Volume', 'volume_adi', 'volume_obv',
       'volume_cmf', 'volume_fi', 'volume_em', 'volume_sma_em', 'volume_vpt',
       'volume_vwap', 'volume_mfi', 'volume_nvi', 'volatility_bbm',
       'volatility_bbh', 'volatility_bbl', 'volatility_bbw', 'volatility_bbp',
       'volatility_bbhi', 'volatility_bbli', 'volatility_kcc',
       'volatility_kch', 'volatility_kcl', 'volatility_kcw', 'volatility_kcp',
       'volatility_kchi', 'volatility_kcli', 'volatility_dcl',
       'volatility_dch', 'volatility_dcm', 'volatility_dcw', 'volatility_dcp',
       'volatility_atr', 'volatility_ui', 'trend_macd', 'trend_macd_signal',
       'trend_macd_diff', 'trend_sma_fast', 'trend_sma_slow', 'trend_ema_fast',
       'trend_ema_slow', 'trend_vortex_ind_pos', 'trend_vortex_ind_neg',
       'trend_vortex_ind_diff', 'trend_trix', 'trend_mass_index', 'trend_dpo',
       'trend_kst', 'trend_kst_sig', 'trend_kst_diff', 'trend_ichimoku_conv',
       'trend_ichimoku_base

### Finding indicators

#### Technical Analysis Indicators

##### 1. Momentum Indicators  
*Answer: "Is it overbought/oversold?"*

| Indicator | Description |
|-----------|-------------|
| `momentum_rsi` | Relative Strength Index (0 to 100); conventionally overbought above 70, oversold below 30 |
| `momentum_stoch` | Stochastic Oscillator; where the close sits within the recent high-low range (0 to 100) |
| `momentum_roc` | Rate of Change; percentage change in price over a lookback window |
| `momentum_wr` | Williams %R; the Stochastic Oscillator on an inverted scale (-100 to 0) |
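
The oscillators in the table can be reproduced by hand. The sketch below is plain Python on toy numbers, not the `ta` implementation: the function names, the `window=3` toy setting, and the Wilder-style smoothing used for RSI are assumptions made for illustration.

```python
def stochastic_k(highs, lows, closes, window=14):
    """%K for the latest bar: position of the close inside the window's high-low range (0..100)."""
    highest, lowest = max(highs[-window:]), min(lows[-window:])
    if highest == lowest:
        return float("nan")  # flat window: the range is zero, so %K is undefined
    return 100.0 * (closes[-1] - lowest) / (highest - lowest)


def williams_r(highs, lows, closes, window=14):
    """Williams %R for the latest bar: distance of the close from the window high (-100..0)."""
    highest, lowest = max(highs[-window:]), min(lows[-window:])
    if highest == lowest:
        return float("nan")
    return -100.0 * (highest - closes[-1]) / (highest - lowest)


def rsi(closes, window=14):
    """RSI for the latest bar with Wilder smoothing; needs at least window + 1 closes."""
    if len(closes) < window + 1:
        raise ValueError("need at least window + 1 closes")
    changes = [b - a for a, b in zip(closes, closes[1:])]
    gains = [max(c, 0.0) for c in changes]
    losses = [max(-c, 0.0) for c in changes]
    # Seed with a simple average, then smooth the remaining bars recursively.
    avg_gain = sum(gains[:window]) / window
    avg_loss = sum(losses[:window]) / window
    for gain, loss in zip(gains[window:], losses[window:]):
        avg_gain = (avg_gain * (window - 1) + gain) / window
        avg_loss = (avg_loss * (window - 1) + loss) / window
    if avg_loss == 0:
        return 100.0  # no losses in the lookback: RSI saturates at 100
    return 100.0 - 100.0 / (1.0 + avg_gain / avg_loss)


# Toy bars (made-up numbers, not AMD data)
highs = [10.0, 11.0, 12.0, 11.0, 13.0]
lows = [9.0, 10.0, 10.0, 9.0, 11.0]
closes = [9.5, 10.5, 11.0, 10.0, 12.0]

k = stochastic_k(highs, lows, closes, window=3)
wr = williams_r(highs, lows, closes, window=3)
toy_rsi = rsi(closes, window=3)
print(k, wr, toy_rsi)
```

On the toy bars `wr` equals `k - 100`. The same relation holds in the `df_model` output further down, where `momentum_stoch` and `momentum_wr` read 30.379141 and -69.620859 on the same row.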

---

##### 2. Trend Indicators  
*Answer: "Which direction is it going?"*

| Indicator | Description |
|-----------|-------------|
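| `trend_macd` | MACD line: fast EMA minus slow EMA; the sign gives the trend direction, the magnitude its strength |
| `trend_macd_signal` | Signal line: an EMA of the MACD line; the MACD line crossing it is read as a trend change |
| `trend_ema_fast` | Short-window EMA; short-term trend (reacts quickly) |
| `trend_ema_slow` | Long-window EMA; long-term trend (smoother, less sensitive) |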
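
The four trend columns are tied together by simple arithmetic. The following is a dependency-free sketch, assuming the common 12/26/9 windows and an EMA seeded with the first value; the function names are hypothetical and, unlike the `ta` output, no warm-up rows are left as NaN.

```python
def ema(values, window):
    """Exponential moving average with smoothing factor 2 / (window + 1), seeded with the first value."""
    alpha = 2.0 / (window + 1)
    out = [values[0]]
    for value in values[1:]:
        out.append(alpha * value + (1.0 - alpha) * out[-1])
    return out


def macd(closes, fast=12, slow=26, signal=9):
    """Return (macd_line, signal_line, histogram) as lists aligned with closes."""
    fast_ema = ema(closes, fast)
    slow_ema = ema(closes, slow)
    macd_line = [f - s for f, s in zip(fast_ema, slow_ema)]
    signal_line = ema(macd_line, signal)  # the signal line is an EMA of the MACD line
    histogram = [m - s for m, s in zip(macd_line, signal_line)]
    return macd_line, signal_line, histogram


# Toy series (made-up): a steady uptrend and a flat line
rising = [float(p) for p in range(1, 61)]
flat = [5.0] * 40

rising_macd, rising_signal, rising_hist = macd(rising)
flat_macd, _, _ = macd(flat)
print(rising_macd[-1], flat_macd[-1])
```

In an uptrend the fast EMA sits above the slow EMA, so the MACD line is positive; on a flat series it is zero. The selected-feature output further down is consistent with the definition: on 2025-11-06, `trend_ema_fast` minus `trend_ema_slow` is 247.610880 - 232.061465 = 15.549415, which is the `trend_macd` value on that row.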

---

##### 3. Volatility Indicators  
*Answer: "How much is it moving?"*

| Indicator | Description |
|-----------|-------------|
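| `volatility_bbh` | Upper Bollinger Band: moving average plus a multiple of the rolling standard deviation (often read as resistance) |
| `volatility_bbl` | Lower Bollinger Band: moving average minus the same multiple (often read as support) |
| `volatility_atr` | Average True Range; the average size of recent price swings, gaps from the previous close included |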
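
As a rough illustration of the volatility columns, the sketch below computes the bands and the ATR for the latest bar in plain Python. The 20-bar window with 2 standard deviations, the population standard deviation, and the Wilder-style ATR smoothing are assumptions here, and the function names are hypothetical; the `ta` library may differ in details such as how the warm-up rows are handled.

```python
from statistics import mean, pstdev


def bollinger_bands(closes, window=20, num_std=2.0):
    """Return (upper, middle, lower) for the latest bar."""
    recent = closes[-window:]
    middle = mean(recent)
    spread = num_std * pstdev(recent)  # population standard deviation (assumption)
    return middle + spread, middle, middle - spread


def true_range(high, low, prev_close):
    """Largest of the bar's own range and its distance from the previous close."""
    return max(high - low, abs(high - prev_close), abs(low - prev_close))


def average_true_range(highs, lows, closes, window=14):
    """ATR for the latest bar: simple average seed, then Wilder smoothing."""
    ranges = [
        true_range(h, l, pc)
        for h, l, pc in zip(highs[1:], lows[1:], closes[:-1])
    ]
    if len(ranges) < window:
        raise ValueError("need at least window + 1 bars")
    atr = mean(ranges[:window])
    for tr in ranges[window:]:
        atr = (atr * (window - 1) + tr) / window
    return atr


# Toy numbers (made-up, not AMD data)
upper, middle, lower = bollinger_bands([1.0, 2.0, 3.0, 4.0, 5.0], window=5)
gap_tr = true_range(high=10.0, low=8.0, prev_close=11.0)  # gap down from 11
toy_atr = average_true_range(
    highs=[10.0, 11.0, 12.0], lows=[9.0, 9.0, 10.0], closes=[9.5, 10.0, 11.0], window=2
)
print(upper, middle, lower, gap_tr, toy_atr)
```

The `gap_tr` case shows why ATR uses the true range rather than high minus low: the bar itself spans 2, but measured from the previous close the move is 3.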


In [21]:
selected_features = [
    'Close', 'Open', 'High', 'Low', 'Volume',
    
    # Momentum indicators (4)
    'momentum_rsi',           # RSI - most famous momentum indicator
    'momentum_stoch',         # Stochastic Oscillator
    'momentum_roc',           # Rate of Change
    'momentum_wr',            # Williams %R
    
    # Trend indicators (4)
    'trend_macd',             # MACD - most famous trend indicator
    'trend_macd_signal',      # MACD signal line
    'trend_ema_fast',         # Fast moving average
    'trend_ema_slow',         # Slow moving average
    
    # Volatility indicators (3)
    'volatility_bbh',         # Bollinger Band High
    'volatility_bbl',         # Bollinger Band Low
    'volatility_atr',         # Average True Range
]

In [22]:
df_model = df[selected_features].copy()
df_model

Date,Close,Open,High,Low,Volume,momentum_rsi,momentum_stoch,momentum_roc,momentum_wr,trend_macd,trend_macd_signal,trend_ema_fast,trend_ema_slow,volatility_bbh,volatility_bbl,volatility_atr
2015-01-02,2.670000,2.670000,2.670000,2.670000,0,,,,,,,,,,,0.000000
2015-01-05,2.660000,2.670000,2.700000,2.640000,8878200,,,,,,,,,,,0.000000
2015-01-06,2.630000,2.650000,2.660000,2.550000,13912500,,,,,,,,,,,0.000000
2015-01-07,2.580000,2.630000,2.650000,2.540000,12377600,,,,,,,,,,,0.000000
2015-01-08,2.610000,2.590000,2.650000,2.560000,11136600,,,,,,,,,,,0.000000
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2025-11-06,237.699997,253.470001,253.509995,235.740005,66049700,53.473086,30.379141,-0.138639,-69.620859,15.549415,18.386516,247.610880,232.061465,272.317447,212.590550,12.500424
2025-11-07,233.539993,230.940002,235.869995,224.639999,52162600,51.272053,20.970774,1.437692,-79.029226,13.275143,17.364242,245.446128,232.170986,270.815879,215.956118,12.556381
2025-11-10,243.979996,242.139999,248.899994,240.500000,43361600,56.150155,45.570222,3.825690,-54.429778,12.174842,16.326362,245.220570,233.045727,269.247469,220.280528,12.836743
2025-11-11,237.520004,241.660004,248.460007,234.639999,61336800,52.638648,30.348748,-6.088879,-69.651252,10.658712,15.192832,244.035867,233.377155,267.272901,224.198096,12.935070


In [34]:
df[selected_features].isna().sum()
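# The NaNs appear only in the leading warm-up rows, where an indicator's lookback window is not yet full.
# volatility_atr reports 0 NaNs because its warm-up rows hold 0.0 instead.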

Price
Close                 0
Open                  0
High                  0
Low                   0
Volume                0
momentum_rsi         13
momentum_stoch       13
momentum_roc         12
momentum_wr          13
trend_macd           25
trend_macd_signal    33
trend_ema_fast       11
trend_ema_slow       25
volatility_bbh       19
volatility_bbl       19
volatility_atr        0
dtype: int64
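
The observation that the NaNs sit only at the start can be checked programmatically rather than by eye. The helper below is a minimal sketch with a hypothetical name (`nans_only_leading`), shown on a toy frame; on the notebook's data it would be called as `nans_only_leading(df[selected_features])`.

```python
import numpy as np
import pandas as pd


def nans_only_leading(frame: pd.DataFrame) -> bool:
    """True if, in every column, no NaN appears after the first valid value."""
    for col in frame.columns:
        first_valid = frame[col].first_valid_index()
        if first_valid is None:
            continue  # column is entirely NaN, so there is nothing after a valid value
        if frame.loc[first_valid:, col].isna().any():
            return False
    return True


# Toy frames (made-up): warm-up NaNs only, and one with a gap in the middle
warm_up = pd.DataFrame({"a": [np.nan, np.nan, 1.0, 2.0], "b": [np.nan, 1.0, 2.0, 3.0]})
with_gap = pd.DataFrame({"a": [np.nan, 1.0, np.nan, 2.0]})

print(nans_only_leading(warm_up), nans_only_leading(with_gap))
```

If the check returns `True`, `dropna()` removes only a contiguous block at the start, so the remaining rows keep an unbroken timeline.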

In [36]:
df[selected_features].head(35)

Date,Close,Open,High,Low,Volume,momentum_rsi,momentum_stoch,momentum_roc,momentum_wr,trend_macd,trend_macd_signal,trend_ema_fast,trend_ema_slow,volatility_bbh,volatility_bbl,volatility_atr
2015-01-02,2.67,2.67,2.67,2.67,0,,,,,,,,,,,0.0
2015-01-05,2.66,2.67,2.7,2.64,8878200,,,,,,,,,,,0.0
2015-01-06,2.63,2.65,2.66,2.55,13912500,,,,,,,,,,,0.0
2015-01-07,2.58,2.63,2.65,2.54,12377600,,,,,,,,,,,0.0
2015-01-08,2.61,2.59,2.65,2.56,11136600,,,,,,,,,,,0.0
2015-01-09,2.63,2.63,2.64,2.58,8907600,,,,,,,,,,,0.0
2015-01-12,2.63,2.62,2.64,2.55,9979600,,,,,,,,,,,0.0
2015-01-13,2.66,2.64,2.68,2.6,17907400,,,,,,,,,,,0.0
2015-01-14,2.63,2.6,2.66,2.58,9989900,,,,,,,,,,,0.0
2015-01-15,2.52,2.62,2.65,2.49,17744000,,,,,,,,,,,0.084


In [26]:
df_model = df_model.dropna()  # safe to drop: the NaNs sit only in the leading warm-up rows, so no gap opens in the timeline
df_model

Date,Close,Open,High,Low,Volume,momentum_rsi,momentum_stoch,momentum_roc,momentum_wr,trend_macd,trend_macd_signal,trend_ema_fast,trend_ema_slow,volatility_bbh,volatility_bbl,volatility_atr
2015-02-20,3.060000,3.030000,3.130000,3.020000,10666700,60.597387,56.944440,10.869564,-43.055560,0.127181,0.128352,3.017328,2.890147,3.377431,2.414569,0.132561
2015-02-23,3.060000,3.050000,3.100000,3.030000,6323500,60.597387,55.072460,7.368423,-44.927540,0.121164,0.126915,3.023893,2.902729,3.366558,2.486442,0.126305
2015-02-24,3.110000,3.060000,3.120000,3.020000,10916300,62.993571,57.377043,-6.042298,-42.622957,0.119058,0.125343,3.037140,2.918082,3.373224,2.529776,0.123674
2015-02-25,3.100000,3.080000,3.140000,3.060000,6151300,62.179140,46.000004,2.310229,-53.999996,0.115253,0.123325,3.046811,2.931558,3.381391,2.561609,0.119307
2015-02-26,3.080000,3.100000,3.130000,3.060000,8680900,60.494568,53.571383,1.315788,-46.428617,0.109363,0.120533,3.051917,2.942554,3.374810,2.613190,0.114376
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2025-11-06,237.699997,253.470001,253.509995,235.740005,66049700,53.473086,30.379141,-0.138639,-69.620859,15.549415,18.386516,247.610880,232.061465,272.317447,212.590550,12.500424
2025-11-07,233.539993,230.940002,235.869995,224.639999,52162600,51.272053,20.970774,1.437692,-79.029226,13.275143,17.364242,245.446128,232.170986,270.815879,215.956118,12.556381
2025-11-10,243.979996,242.139999,248.899994,240.500000,43361600,56.150155,45.570222,3.825690,-54.429778,12.174842,16.326362,245.220570,233.045727,269.247469,220.280528,12.836743
2025-11-11,237.520004,241.660004,248.460007,234.639999,61336800,52.638648,30.348748,-6.088879,-69.651252,10.658712,15.192832,244.035867,233.377155,267.272901,224.198096,12.935070


In [39]:
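import os

file_path = os.path.join("../data", "amd_data.csv")
os.makedirs(os.path.dirname(file_path), exist_ok=True)  # create ../data if it is missing

# Saves the full df (all ta features, warm-up NaN rows included), not the trimmed df_model.
# An existing file is never overwritten, so delete it first to refresh the data.
if not os.path.exists(file_path):
    df.to_csv(file_path, index=True)
    print(f"Data saved to {file_path}")
else:
    print(f"File already exists: {file_path}")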

Data saved to ../data\amd_data.csv
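
If the saved file is later read from a separate notebook, the date index has to be restored on load. This is a small sketch of that round trip on a toy frame in a temporary directory; `load_prices` is a hypothetical helper, and it assumes the index column is named `Date`, as in the output above.

```python
import os
import tempfile

import pandas as pd


def load_prices(csv_path: str) -> pd.DataFrame:
    """Read a CSV written by df.to_csv(..., index=True) and restore the DatetimeIndex."""
    return pd.read_csv(csv_path, index_col="Date", parse_dates=True)


# Toy round trip (the real file would be ../data/amd_data.csv)
toy = pd.DataFrame(
    {"Close": [2.67, 2.66], "Volume": [0, 8878200]},
    index=pd.DatetimeIndex(["2015-01-02", "2015-01-05"], name="Date"),
)
with tempfile.TemporaryDirectory() as tmp_dir:
    toy_path = os.path.join(tmp_dir, "toy.csv")
    toy.to_csv(toy_path, index=True)
    loaded = load_prices(toy_path)

print(loaded.index.dtype, list(loaded.columns))
```

Without `index_col` and `parse_dates`, the dates come back as an ordinary string column and date-based slicing no longer works.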


##### Q: Should the cleaning and predicting notebooks be separated?

### Predicting by using technical indicators

### Results

### Further considerations 

- Linear regression on price/time
- Monte Carlo simulation of future prices
- Gradient descent fitting
- Correlation or volatility analysis