# Stock Price Prediction Using Technical Indicators and Machine Learning

## Data cleaning

This project aims to predict future stock prices (or the direction of price movement) by using technical indicators derived from historical market data as inputs to machine learning models.

Steps:
1. [Retrieving the data](#retrieving-the-data)
   This step also explains the data.
2. [Finding the indicators](#finding-indicators)
3. [New stock data](#new-stock-data)



### Retrieving the data

In [38]:
import yfinance as yf 
import ta
from datetime import datetime
import numpy as np

In [39]:
end_date = datetime.now().strftime("%Y-%m-%d")  # format today's date as a YYYY-MM-DD string
df = yf.download('AMD', start='2015-01-01', end=end_date)
df



Price,Close,High,Low,Open,Volume
Ticker,AMD,AMD,AMD,AMD,AMD
Date,,,,,
2015-01-02,2.670000,2.670000,2.670000,2.670000,0
2015-01-05,2.660000,2.700000,2.640000,2.670000,8878200
2015-01-06,2.630000,2.660000,2.550000,2.650000,13912500
2015-01-07,2.580000,2.650000,2.540000,2.630000,12377600
2015-01-08,2.610000,2.650000,2.560000,2.590000,11136600
...,...,...,...,...,...
2025-11-10,243.979996,248.899994,240.500000,242.139999,43361600
2025-11-11,237.520004,248.460007,234.639999,241.660004,61336800
2025-11-12,258.890015,263.510010,250.000000,253.130005,108942000
2025-11-13,247.960007,259.630005,246.059998,251.899994,63025700


In [40]:
df.columns

MultiIndex([( 'Close', 'AMD'),
            (  'High', 'AMD'),
            (   'Low', 'AMD'),
            (  'Open', 'AMD'),
            ('Volume', 'AMD')],
           names=['Price', 'Ticker'])

In [41]:
df.columns = df.columns.droplevel('Ticker')

In [42]:
df.columns

Index(['Close', 'High', 'Low', 'Open', 'Volume'], dtype='object', name='Price')
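
The column flattening above can be tried on a small frame without downloading anything. The sketch below is only an illustration: the frame is a hand-built stand-in that reuses one AMD row from the output above, and the single-ticker guard is an added assumption (the notebook simply calls `droplevel`).

```python
import pandas as pd

# Hand-built stand-in for a single-ticker download (one row copied from the AMD output).
columns = pd.MultiIndex.from_product(
    [["Close", "High", "Low", "Open", "Volume"], ["AMD"]],
    names=["Price", "Ticker"],
)
demo = pd.DataFrame([[2.66, 2.70, 2.64, 2.67, 8878200]], columns=columns)

# Drop the ticker level only when it holds a single ticker; with several
# tickers, dropping it would leave duplicate column names.
single_ticker = (
    isinstance(demo.columns, pd.MultiIndex)
    and demo.columns.get_level_values("Ticker").nunique() == 1
)
if single_ticker:
    demo.columns = demo.columns.droplevel("Ticker")

print(list(demo.columns))
```

After the drop, columns can be addressed by plain names such as `demo["Close"]`, which is the form `ta.add_all_ta_features` is given in the next cell.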

In [43]:
df = ta.add_all_ta_features(df, open="Open", high="High", 
                             low="Low", close="Close", volume="Volume")
df



Date,Close,High,Low,Open,Volume,volume_adi,volume_obv,volume_cmf,volume_fi,volume_em,...,momentum_ppo,momentum_ppo_signal,momentum_ppo_hist,momentum_pvo,momentum_pvo_signal,momentum_pvo_hist,momentum_kama,others_dr,others_dlr,others_cr
2015-01-02,2.670000,2.670000,2.670000,2.670000,0,0.000000e+00,0,,,,...,,,,,,,,,,0.000000
2015-01-05,2.660000,2.700000,2.640000,2.670000,8878200,-2.959400e+06,-8878200,,,0.000000,...,,,,,,,,-0.374531,-0.375235,-0.374531
2015-01-06,2.630000,2.660000,2.550000,2.650000,13912500,3.364480e+06,-22790700,,,-0.051393,...,,,,,,,,-1.127818,-1.134227,-1.498126
2015-01-07,2.580000,2.650000,2.540000,2.630000,12377600,-1.124852e+04,-35168300,,,-0.008887,...,,,,,,,,-1.901148,-1.919452,-3.370792
2015-01-08,2.610000,2.650000,2.560000,2.590000,11136600,1.226119e+06,-24031700,,,0.008081,...,,,,,,,,1.162790,1.156081,-2.247198
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2025-11-10,243.979996,248.899994,240.500000,242.139999,43361600,3.546965e+09,8132971500,0.146135,4.876832e+06,279.828026,...,5.224229,7.227938,-2.003710,-7.947323,-6.814983,-1.132341,244.269681,4.470327,4.373290,9037.827294
2025-11-11,237.520004,248.460007,234.639999,241.660004,61336800,3.511192e+09,8071634700,0.148372,-5.242489e+07,-70.973606,...,4.567162,6.695783,-2.128621,-6.385125,-6.729011,0.343886,243.923546,-2.647755,-2.683439,8795.880056
2025-11-12,258.890015,263.510010,250.000000,253.130005,108942000,3.545625e+09,8180576700,0.089607,2.876489e+08,188.558796,...,4.698544,6.296335,-1.597791,1.236951,-5.135819,6.372770,244.074465,8.997141,8.615147,9596.254953
2025-11-13,247.960007,259.630005,246.059998,251.899994,63025700,3.500248e+09,8117551000,0.079703,1.481460e+08,-84.185941,...,4.388535,5.914775,-1.526240,1.071027,-3.894450,4.965476,244.120812,-4.221873,-4.313585,9186.891372


In [44]:
df.columns

Index(['Close', 'High', 'Low', 'Open', 'Volume', 'volume_adi', 'volume_obv',
       'volume_cmf', 'volume_fi', 'volume_em', 'volume_sma_em', 'volume_vpt',
       'volume_vwap', 'volume_mfi', 'volume_nvi', 'volatility_bbm',
       'volatility_bbh', 'volatility_bbl', 'volatility_bbw', 'volatility_bbp',
       'volatility_bbhi', 'volatility_bbli', 'volatility_kcc',
       'volatility_kch', 'volatility_kcl', 'volatility_kcw', 'volatility_kcp',
       'volatility_kchi', 'volatility_kcli', 'volatility_dcl',
       'volatility_dch', 'volatility_dcm', 'volatility_dcw', 'volatility_dcp',
       'volatility_atr', 'volatility_ui', 'trend_macd', 'trend_macd_signal',
       'trend_macd_diff', 'trend_sma_fast', 'trend_sma_slow', 'trend_ema_fast',
       'trend_ema_slow', 'trend_vortex_ind_pos', 'trend_vortex_ind_neg',
       'trend_vortex_ind_diff', 'trend_trix', 'trend_mass_index', 'trend_dpo',
       'trend_kst', 'trend_kst_sig', 'trend_kst_diff', 'trend_ichimoku_conv',
       'trend_ichimoku_base

### Finding indicators

#### Technical Analysis Indicators

##### 1. Momentum Indicators  
*Answer: "Is it overbought/oversold?"*

| Indicator | Description |
|-----------|-------------|
| `momentum_rsi` | Relative Strength Index (0 to 100); readings above 70 are conventionally read as overbought, below 30 as oversold |
| `momentum_stoch` | Stochastic Oscillator; like RSI, but compares the current close with the recent high-low range |
| `momentum_roc` | Rate of Change; measures how fast the price is changing |
| `momentum_wr` | Williams %R; the Stochastic Oscillator on an inverted scale (-100 to 0) |
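
To make the RSI row concrete, here is a minimal sketch of one common formulation (Wilder-style exponential smoothing of gains and losses). The function name `rsi` and the 14-row window are assumptions for illustration; the values produced by `ta` may differ in detail, especially in the first rows.

```python
import pandas as pd

def rsi(close: pd.Series, window: int = 14) -> pd.Series:
    """Relative Strength Index using Wilder-style smoothing (illustrative sketch)."""
    delta = close.diff()
    gain = delta.clip(lower=0)       # upward moves, zero otherwise
    loss = (-delta).clip(lower=0)    # downward moves as positive numbers
    avg_gain = gain.ewm(alpha=1 / window, min_periods=window, adjust=False).mean()
    avg_loss = loss.ewm(alpha=1 / window, min_periods=window, adjust=False).mean()
    rs = avg_gain / avg_loss         # relative strength
    return 100 - 100 / (1 + rs)

# Synthetic prices: one series that only rises, one that only falls.
rising = pd.Series(range(1, 41), dtype=float)
falling = rising[::-1].reset_index(drop=True)
print(rsi(rising).iloc[-1], rsi(falling).iloc[-1])
```

A series that only rises has no losses, so its RSI sits at the top of the scale; a series that only falls sits at the bottom. Real prices land in between, which is where the 70 and 30 thresholds come in.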

---

##### 2. Trend Indicators  
*Answer: "Which direction is it going?"*

| Indicator | Description |
|-----------|-------------|
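| `trend_macd` | MACD line (fast EMA minus slow EMA); its sign and size indicate trend direction and strength |
| `trend_macd_signal` | Signal line (an EMA of the MACD line); the MACD line crossing it is a common trading signal |
| `trend_ema_fast` | Fast exponential moving average; short-term trend, reacts quickly |
| `trend_ema_slow` | Slow exponential moving average; longer-term trend, smoother and less sensitive |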
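
The relationship between the four trend columns can be sketched in a few lines of pandas. The function name `macd` and the 12/26/9 spans are assumptions (they are the conventional choices, not values confirmed by the notebook).

```python
import pandas as pd

def macd(close: pd.Series, fast: int = 12, slow: int = 26, signal: int = 9) -> pd.DataFrame:
    """MACD line, signal line and histogram from a close series (illustrative sketch)."""
    ema_fast = close.ewm(span=fast, min_periods=fast, adjust=False).mean()
    ema_slow = close.ewm(span=slow, min_periods=slow, adjust=False).mean()
    line = ema_fast - ema_slow                                   # MACD line
    signal_line = line.ewm(span=signal, min_periods=signal, adjust=False).mean()
    return pd.DataFrame(
        {"ema_fast": ema_fast, "ema_slow": ema_slow,
         "macd": line, "signal": signal_line, "hist": line - signal_line}
    )

# Synthetic, steadily rising prices: the fast EMA stays above the slow EMA.
trend = pd.Series(range(60), dtype=float)
out = macd(trend)
print(out.tail(3))
```

In an uptrend the fast average tracks the price more closely than the slow one, so the MACD line is positive; the signal line is a further smoothing of that difference.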

---

##### 3. Volatility Indicators  
*Answer: "How much is it moving?"*

| Indicator | Description |
|-----------|-------------|
| `volatility_bbh` | Upper Bollinger Band; often read as a resistance level |
| `volatility_bbl` | Lower Bollinger Band; often read as a support level |
| `volatility_atr` | Average True Range; measures the typical size of price swings |
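
As a rough illustration of the volatility rows, the sketch below builds Bollinger Bands and the true range (the quantity that ATR averages) by hand. The helper names, the 20-row window, the 2-standard-deviation width, and the use of the population standard deviation are all assumptions for this example.

```python
import pandas as pd

def bollinger_bands(close: pd.Series, window: int = 20, n_std: float = 2.0) -> pd.DataFrame:
    """Rolling mean plus/minus n_std rolling standard deviations."""
    mid = close.rolling(window).mean()
    std = close.rolling(window).std(ddof=0)   # population std (assumed convention)
    return pd.DataFrame({"mid": mid, "high": mid + n_std * std, "low": mid - n_std * std})

def true_range(high: pd.Series, low: pd.Series, close: pd.Series) -> pd.Series:
    """Largest of: high-low, |high - previous close|, |low - previous close|."""
    prev_close = close.shift(1)
    candidates = pd.concat(
        [high - low, (high - prev_close).abs(), (low - prev_close).abs()], axis=1
    )
    return candidates.max(axis=1)             # NaNs in the first row are skipped

prices = pd.Series([float(x) for x in range(1, 31)])
bands = bollinger_bands(prices)
tr = true_range(
    high=pd.Series([10.0, 12.0]), low=pd.Series([9.0, 11.0]), close=pd.Series([9.5, 11.5])
)
print(bands.tail(2))
print(tr.tolist())
```

The second true-range value exceeds the day's own high-low spread because the price gapped up from the previous close; capturing such gaps is the reason ATR is built on true range rather than on high minus low.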


In [45]:
selected_features = [
    'Close', 'Open', 'High', 'Low', 'Volume',
    
    # Momentum indicators (4)
    'momentum_rsi',           # RSI - most famous momentum indicator
    'momentum_stoch',         # Stochastic Oscillator
    'momentum_roc',           # Rate of Change
    'momentum_wr',            # Williams %R
    
    # Trend indicators (4)
    'trend_macd',             # MACD - most famous trend indicator
    'trend_macd_signal',      # MACD signal line
    'trend_ema_fast',         # Fast moving average
    'trend_ema_slow',         # Slow moving average
    
    # Volatility indicators (3)
    'volatility_bbh',         # Bollinger Band High
    'volatility_bbl',         # Bollinger Band Low
    'volatility_atr',         # Average True Range
]

In [46]:
df_model = df[selected_features].copy()
df_model

Date,Close,Open,High,Low,Volume,momentum_rsi,momentum_stoch,momentum_roc,momentum_wr,trend_macd,trend_macd_signal,trend_ema_fast,trend_ema_slow,volatility_bbh,volatility_bbl,volatility_atr
2015-01-02,2.670000,2.670000,2.670000,2.670000,0,,,,,,,,,,,0.000000
2015-01-05,2.660000,2.670000,2.700000,2.640000,8878200,,,,,,,,,,,0.000000
2015-01-06,2.630000,2.650000,2.660000,2.550000,13912500,,,,,,,,,,,0.000000
2015-01-07,2.580000,2.630000,2.650000,2.540000,12377600,,,,,,,,,,,0.000000
2015-01-08,2.610000,2.590000,2.650000,2.560000,11136600,,,,,,,,,,,0.000000
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2025-11-10,243.979996,242.139999,248.899994,240.500000,43361600,56.150155,45.570222,3.825690,-54.429778,12.174842,16.326362,245.220570,233.045727,269.247469,220.280528,12.836743
2025-11-11,237.520004,241.660004,248.460007,234.639999,61336800,52.638648,30.348748,-6.088879,-69.651252,10.658712,15.192832,244.035867,233.377155,267.272901,224.198096,12.935070
2025-11-12,258.890015,253.130005,263.510010,250.000000,108942000,61.267867,80.702228,-0.300381,-19.297772,11.054124,14.365090,246.321121,235.266997,268.753819,224.746180,14.240563
2025-11-13,247.960007,251.899994,259.630005,246.059998,63025700,55.679988,54.948196,-3.895199,-45.051804,10.366037,13.565280,246.573257,236.207220,268.702529,226.137470,14.173508


In [47]:
df[selected_features].isna().sum()
# the NaN values seem to be only at the beginning

Price
Close                 0
Open                  0
High                  0
Low                   0
Volume                0
momentum_rsi         13
momentum_stoch       13
momentum_roc         12
momentum_wr          13
trend_macd           25
trend_macd_signal    33
trend_ema_fast       11
trend_ema_slow       25
volatility_bbh       19
volatility_bbl       19
volatility_atr        0
dtype: int64
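
The NaN counts above are what indicator warm-up would produce: a rolling or exponential window of length n typically has no value for its first n - 1 rows. The arithmetic below checks this under assumed look-back windows (the conventional 14/12/26/9/20 settings, which the notebook does not state explicitly).

```python
# Assumed look-back windows (conventional defaults, not confirmed by the notebook).
RSI, STOCH, ROC, WILLIAMS = 14, 14, 12, 14
EMA_FAST, EMA_SLOW, MACD_SIGNAL = 12, 26, 9
BOLLINGER = 20

expected_nans = {
    "momentum_rsi": RSI - 1,
    "momentum_stoch": STOCH - 1,
    "momentum_roc": ROC,                  # compares with the close ROC rows earlier
    "momentum_wr": WILLIAMS - 1,
    "trend_macd": EMA_SLOW - 1,           # limited by the slower EMA
    "trend_macd_signal": (EMA_SLOW - 1) + (MACD_SIGNAL - 1),
    "trend_ema_fast": EMA_FAST - 1,
    "trend_ema_slow": EMA_SLOW - 1,
    "volatility_bbh": BOLLINGER - 1,
    "volatility_bbl": BOLLINGER - 1,
    "volatility_atr": 0,                  # warm-up rows appear as 0.0, not NaN, in the output
}

# Counts copied from the isna().sum() output above.
reported_nans = {
    "momentum_rsi": 13, "momentum_stoch": 13, "momentum_roc": 12, "momentum_wr": 13,
    "trend_macd": 25, "trend_macd_signal": 33, "trend_ema_fast": 11, "trend_ema_slow": 25,
    "volatility_bbh": 19, "volatility_bbl": 19, "volatility_atr": 0,
}

mismatches = {k: (expected_nans[k], v) for k, v in reported_nans.items() if expected_nans[k] != v}
rows_lost = max(reported_nans.values())
print(mismatches)   # {}
print(rows_lost)    # 33
```

Since `dropna()` removes a row if any column is NaN, the slowest indicator (`trend_macd_signal`) decides how many leading rows are lost. Note that `volatility_atr` reports zeros rather than NaN during its warm-up, so it only gets cleaned because other columns are still NaN on those rows.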

In [48]:
df[selected_features].head(35)

Date,Close,Open,High,Low,Volume,momentum_rsi,momentum_stoch,momentum_roc,momentum_wr,trend_macd,trend_macd_signal,trend_ema_fast,trend_ema_slow,volatility_bbh,volatility_bbl,volatility_atr
2015-01-02,2.67,2.67,2.67,2.67,0,,,,,,,,,,,0.0
2015-01-05,2.66,2.67,2.7,2.64,8878200,,,,,,,,,,,0.0
2015-01-06,2.63,2.65,2.66,2.55,13912500,,,,,,,,,,,0.0
2015-01-07,2.58,2.63,2.65,2.54,12377600,,,,,,,,,,,0.0
2015-01-08,2.61,2.59,2.65,2.56,11136600,,,,,,,,,,,0.0
2015-01-09,2.63,2.63,2.64,2.58,8907600,,,,,,,,,,,0.0
2015-01-12,2.63,2.62,2.64,2.55,9979600,,,,,,,,,,,0.0
2015-01-13,2.66,2.64,2.68,2.6,17907400,,,,,,,,,,,0.0
2015-01-14,2.63,2.6,2.66,2.58,9989900,,,,,,,,,,,0.0
2015-01-15,2.52,2.62,2.65,2.49,17744000,,,,,,,,,,,0.084


In [49]:
df_model = df_model.dropna()  # safe to drop: the NaNs are all at the start, so no gaps appear in the timeline
df_model

Date,Close,Open,High,Low,Volume,momentum_rsi,momentum_stoch,momentum_roc,momentum_wr,trend_macd,trend_macd_signal,trend_ema_fast,trend_ema_slow,volatility_bbh,volatility_bbl,volatility_atr
2015-02-20,3.060000,3.030000,3.130000,3.020000,10666700,60.597387,56.944440,10.869564,-43.055560,0.127181,0.128352,3.017328,2.890147,3.377431,2.414569,0.132561
2015-02-23,3.060000,3.050000,3.100000,3.030000,6323500,60.597387,55.072460,7.368423,-44.927540,0.121164,0.126915,3.023893,2.902729,3.366558,2.486442,0.126305
2015-02-24,3.110000,3.060000,3.120000,3.020000,10916300,62.993571,57.377043,-6.042298,-42.622957,0.119058,0.125343,3.037140,2.918082,3.373224,2.529776,0.123674
2015-02-25,3.100000,3.080000,3.140000,3.060000,6151300,62.179140,46.000004,2.310229,-53.999996,0.115253,0.123325,3.046811,2.931558,3.381391,2.561609,0.119307
2015-02-26,3.080000,3.100000,3.130000,3.060000,8680900,60.494568,53.571383,1.315788,-46.428617,0.109363,0.120533,3.051917,2.942554,3.374810,2.613190,0.114376
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2025-11-10,243.979996,242.139999,248.899994,240.500000,43361600,56.150155,45.570222,3.825690,-54.429778,12.174842,16.326362,245.220570,233.045727,269.247469,220.280528,12.836743
2025-11-11,237.520004,241.660004,248.460007,234.639999,61336800,52.638648,30.348748,-6.088879,-69.651252,10.658712,15.192832,244.035867,233.377155,267.272901,224.198096,12.935070
2025-11-12,258.890015,253.130005,263.510010,250.000000,108942000,61.267867,80.702228,-0.300381,-19.297772,11.054124,14.365090,246.321121,235.266997,268.753819,224.746180,14.240563
2025-11-13,247.960007,251.899994,259.630005,246.059998,63025700,55.679988,54.948196,-3.895199,-45.051804,10.366037,13.565280,246.573257,236.207220,268.702529,226.137470,14.173508
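
The claim that dropping NaN rows does not break the timeline can be checked rather than eyeballed. The helper below is a hypothetical addition (`nans_only_at_start` is not part of the notebook); it is demonstrated on small synthetic frames.

```python
import numpy as np
import pandas as pd

def nans_only_at_start(frame: pd.DataFrame) -> bool:
    """True if, in every column, no NaN appears after the first valid value."""
    for col in frame.columns:
        first_valid = frame[col].first_valid_index()
        if first_valid is None:                       # column is entirely NaN
            return False
        if frame.loc[first_valid:, col].isna().any():
            return False
    return True

dates = pd.date_range("2015-01-02", periods=5, freq="B")
warmup_only = pd.DataFrame({"a": [np.nan, np.nan, 1.0, 2.0, 3.0],
                            "b": [np.nan, 4.0, 5.0, 6.0, 7.0]}, index=dates)
with_gap = pd.DataFrame({"a": [np.nan, 1.0, np.nan, 2.0, 3.0]}, index=dates)

print(nans_only_at_start(warmup_only), nans_only_at_start(with_gap))
```

Running `nans_only_at_start(df[selected_features])` before `dropna()` would confirm that only warm-up rows are being removed; a `False` result would mean rows are about to disappear from the middle of the series.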


In [50]:
import os
file_path = os.path.join("../data", "amd_data.csv")

if not os.path.exists(file_path):
    df_model.to_csv(file_path, index=True)
    print(f"Data saved to {file_path}")
else:
    print(f"File already exists: {file_path}")

File already exists: ../data\amd_data.csv
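
The mixed separators in the printed path (`../data\amd_data.csv`) come from joining a forward-slash string with `os.path.join` on Windows. A `pathlib`-based variant of the save step is sketched below; `save_if_missing` is a hypothetical helper, and the demo writes to a temporary directory with made-up values rather than to `../data`.

```python
import tempfile
from pathlib import Path

import pandas as pd

def save_if_missing(frame: pd.DataFrame, path) -> bool:
    """Write frame to CSV unless the file already exists; return True if written."""
    path = Path(path)
    if path.exists():
        return False
    path.parent.mkdir(parents=True, exist_ok=True)   # create the data folder if needed
    frame.to_csv(path, index=True)
    return True

# Small made-up frame with a date index, mimicking the shape of df_model.
demo = pd.DataFrame(
    {"Close": [3.06, 3.11], "Volume": [10666700, 10916300]},
    index=pd.to_datetime(["2015-02-20", "2015-02-24"]).rename("Date"),
)

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "data" / "demo.csv"
    first = save_if_missing(demo, target)            # file is new, so it is written
    second = save_if_missing(demo, target)           # file exists, so it is skipped
    restored = pd.read_csv(target, index_col=0, parse_dates=True)

print(first, second)
print(restored)
```

Reading the file back with `index_col=0, parse_dates=True` restores the `Date` index as datetimes, which matters for any later time-based split.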


### New stock data

In [51]:
end_date = datetime.now().strftime("%Y-%m-%d")  # format today's date as a YYYY-MM-DD string
df_cvx = yf.download('CVX', start='2015-01-01', end=end_date)
df_cvx



Price,Close,High,Low,Open,Volume
Ticker,CVX,CVX,CVX,CVX,CVX
Date,,,,,
2015-01-02,70.993706,71.258559,69.902754,70.394626,5898800
2015-01-05,68.155968,70.123457,67.752380,69.972113,11758100
2015-01-06,68.124428,68.748727,67.146990,68.023533,11591600
2015-01-07,68.067673,69.196459,67.796512,68.893766,10353800
2015-01-08,69.625275,69.644192,68.483873,68.855934,8650800
...,...,...,...,...,...
2025-11-10,155.649994,155.979996,152.369995,155.419998,8436800
2025-11-11,156.240005,157.990005,155.899994,156.750000,6361400
2025-11-12,153.320007,155.789993,152.080002,155.580002,11670900
2025-11-13,155.580002,156.199997,153.929993,154.000000,7995800


In [52]:
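df_cvx.info  # without parentheses this only echoes the bound method and the frame's repr; call df_cvx.info() for the dtype and non-null summary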

<bound method DataFrame.info of Price            Close        High         Low        Open    Volume
Ticker             CVX         CVX         CVX         CVX       CVX
Date                                                                
2015-01-02   70.993706   71.258559   69.902754   70.394626   5898800
2015-01-05   68.155968   70.123457   67.752380   69.972113  11758100
2015-01-06   68.124428   68.748727   67.146990   68.023533  11591600
2015-01-07   68.067673   69.196459   67.796512   68.893766  10353800
2015-01-08   69.625275   69.644192   68.483873   68.855934   8650800
...                ...         ...         ...         ...       ...
2025-11-10  155.649994  155.979996  152.369995  155.419998   8436800
2025-11-11  156.240005  157.990005  155.899994  156.750000   6361400
2025-11-12  153.320007  155.789993  152.080002  155.580002  11670900
2025-11-13  155.580002  156.199997  153.929993  154.000000   7995800
2025-11-14  156.639999  157.214996  154.809998  156.229996   2748790

[

In [59]:
df_cvx.columns

MultiIndex([( 'Close', 'CVX'),
            (  'High', 'CVX'),
            (   'Low', 'CVX'),
            (  'Open', 'CVX'),
            ('Volume', 'CVX')],
           names=['Price', 'Ticker'])

In [60]:
df_cvx.columns = df_cvx.columns.droplevel('Ticker')

In [61]:
df_cvx = ta.add_all_ta_features(df_cvx, open="Open", high="High", 
                             low="Low", close="Close", volume="Volume")
df_cvx



Date,Close,High,Low,Open,Volume,volume_adi,volume_obv,volume_cmf,volume_fi,volume_em,...,momentum_ppo,momentum_ppo_signal,momentum_ppo_hist,momentum_pvo,momentum_pvo_signal,momentum_pvo_hist,momentum_kama,others_dr,others_dlr,others_cr
2015-01-02,70.993706,71.258559,69.902754,70.394626,5898800,3.594164e+06,5898800,,,,...,,,,,,,,,,0.000000
2015-01-05,68.155968,70.123457,67.752380,69.972113,11758100,-4.161176e+06,-5859300,,,-33.126606,...,,,,,,,,-3.997168,-4.079250,-3.997168
2015-01-06,68.124428,68.748727,67.146990,68.023533,11591600,-1.605552e+06,-17450900,,,-13.680726,...,,,,,,,,-0.046276,-0.046287,-4.041595
2015-01-07,68.067673,69.196459,67.796512,68.893766,10353800,-7.948420e+06,-27804700,,,7.418034,...,,,,,,,,-0.083311,-0.083346,-4.121539
2015-01-08,69.625275,69.644192,68.483873,68.855934,8650800,4.203006e+05,-19153900,,,7.612427,...,,,,,,,,2.288314,2.262525,-1.927539
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2025-11-10,155.649994,155.979996,152.369995,155.419998,8436800,3.318800e+08,764031200,-0.133514,2.955124e+06,-26.742964,...,-0.045445,-0.113800,0.068355,4.544900,1.850018,2.694882,154.076470,0.406392,0.405569,119.244780
2025-11-11,156.240005,157.990005,155.899994,156.750000,6361400,3.275884e+08,770392600,-0.183704,3.069149e+06,91.007342,...,0.051607,-0.080719,0.132325,2.810536,2.042121,0.768414,154.123268,0.379063,0.378346,120.075856
2025-11-12,153.320007,155.789993,152.080002,155.580002,11670900,3.237192e+08,758721700,-0.189829,-2.237730e+06,-95.683128,...,-0.023634,-0.069302,0.045668,6.902813,3.014260,3.888553,154.110449,-1.868918,-1.886603,115.962818
2025-11-13,155.580002,156.199997,153.929993,154.000000,7995800,3.273473e+08,766717500,-0.168688,6.634402e+05,32.080575,...,0.034312,-0.048579,0.082891,6.239421,3.659292,2.580129,154.136582,1.474038,1.463279,119.146191


In [63]:
df_cvx_fin = df_cvx[selected_features].copy()
df_cvx_fin

Date,Close,Open,High,Low,Volume,momentum_rsi,momentum_stoch,momentum_roc,momentum_wr,trend_macd,trend_macd_signal,trend_ema_fast,trend_ema_slow,volatility_bbh,volatility_bbl,volatility_atr
2015-01-02,70.993706,70.394626,71.258559,69.902754,5898800,,,,,,,,,,,0.000000
2015-01-05,68.155968,69.972113,70.123457,67.752380,11758100,,,,,,,,,,,0.000000
2015-01-06,68.124428,68.023533,68.748727,67.146990,11591600,,,,,,,,,,,0.000000
2015-01-07,68.067673,68.893766,69.196459,67.796512,10353800,,,,,,,,,,,0.000000
2015-01-08,69.625275,68.855934,69.644192,68.483873,8650800,,,,,,,,,,,0.000000
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2025-11-10,155.649994,155.419998,155.979996,152.369995,8436800,53.239339,49.786637,-0.581249,-50.213363,-0.070210,-0.175718,154.424765,154.494975,157.296590,151.173411,2.777239
2025-11-11,156.240005,156.750000,157.990005,155.899994,6361400,54.748215,58.179407,0.437135,-41.820593,0.079796,-0.124615,154.704033,154.624237,157.485001,151.370001,2.733517
2025-11-12,153.320007,155.580002,155.789993,152.080002,11670900,46.714264,17.464889,-1.262230,-82.535111,-0.036521,-0.106997,154.491106,154.527627,157.404303,151.571700,2.876165
2025-11-13,155.580002,154.000000,156.199997,153.929993,7995800,52.521455,49.295838,0.940762,-50.704162,0.053048,-0.074988,154.658628,154.605581,157.336701,152.026301,2.876548


In [65]:
df_cvx_fin[selected_features].isna().sum()

Price
Close                 0
Open                  0
High                  0
Low                   0
Volume                0
momentum_rsi         13
momentum_stoch       13
momentum_roc         12
momentum_wr          13
trend_macd           25
trend_macd_signal    33
trend_ema_fast       11
trend_ema_slow       25
volatility_bbh       19
volatility_bbl       19
volatility_atr        0
dtype: int64

In [66]:
df_cvx_fin = df_cvx_fin.dropna()
df_cvx_fin

Date,Close,Open,High,Low,Volume,momentum_rsi,momentum_stoch,momentum_roc,momentum_wr,trend_macd,trend_macd_signal,trend_ema_fast,trend_ema_slow,volatility_bbh,volatility_bbl,volatility_atr
2015-02-20,69.156708,69.048453,69.328647,68.456226,7610600,50.513635,60.499211,0.125072,-39.500789,0.479129,0.391771,69.538001,69.058872,72.460901,64.798691,1.467479
2015-02-23,68.685463,68.640887,69.093016,68.360693,6724000,48.319604,23.004200,0.525757,-76.995800,0.375629,0.388542,69.406842,69.031212,72.483022,64.907084,1.400332
2015-02-24,68.749138,68.704562,68.927442,68.354319,5708400,48.644191,24.522729,-0.264704,-75.477271,0.295339,0.369902,69.305657,69.010318,72.487492,64.911483,1.317611
2015-02-25,69.143959,68.851031,69.271317,68.621782,4806500,50.711323,26.166406,0.033533,-73.833594,0.260563,0.348034,69.280780,69.020217,72.530541,64.955883,1.250804
2015-02-26,68.176025,68.704572,68.723676,67.883097,5903100,45.839937,7.266975,-2.099253,-92.733025,0.153133,0.309054,69.110818,68.957685,72.359935,65.404071,1.251810
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2025-11-10,155.649994,155.419998,155.979996,152.369995,8436800,53.239339,49.786637,-0.581249,-50.213363,-0.070210,-0.175718,154.424765,154.494975,157.296590,151.173411,2.777239
2025-11-11,156.240005,156.750000,157.990005,155.899994,6361400,54.748215,58.179407,0.437135,-41.820593,0.079796,-0.124615,154.704033,154.624237,157.485001,151.370001,2.733517
2025-11-12,153.320007,155.580002,155.789993,152.080002,11670900,46.714264,17.464889,-1.262230,-82.535111,-0.036521,-0.106997,154.491106,154.527627,157.404303,151.571700,2.876165
2025-11-13,155.580002,154.000000,156.199997,153.929993,7995800,52.521455,49.295838,0.940762,-50.704162,0.053048,-0.074988,154.658628,154.605581,157.336701,152.026301,2.876548


In [67]:
import os
file_path = os.path.join("../data", "cvx_data.csv")

if not os.path.exists(file_path):
    df_cvx_fin.to_csv(file_path, index=True)  # save the CVX frame (df_model holds the AMD data)
    print(f"Data saved to {file_path}")
else:
    print(f"File already exists: {file_path}")

Data saved to ../data\cvx_data.csv
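
Because the CVX cells repeat the AMD cells step by step, wrapping the steps in one function would reduce the chance of mixing up variables between tickers. The sketch below is hypothetical: `prepare_features` is not part of the notebook, and `toy_indicators` is a stand-in for `ta.add_all_ta_features` so that the example runs on made-up data without network access.

```python
from typing import Callable, Sequence

import pandas as pd

def prepare_features(
    raw: pd.DataFrame,
    features: Sequence[str],
    add_indicators: Callable[[pd.DataFrame], pd.DataFrame],
) -> pd.DataFrame:
    """Flatten columns, add indicators, keep the chosen features, drop warm-up rows."""
    frame = raw.copy()                                  # leave the caller's frame untouched
    if isinstance(frame.columns, pd.MultiIndex):
        frame.columns = frame.columns.droplevel("Ticker")
    frame = add_indicators(frame)
    return frame[list(features)].dropna()

def toy_indicators(frame: pd.DataFrame) -> pd.DataFrame:
    """Stand-in for the real indicator step: a 3-row simple moving average."""
    out = frame.copy()
    out["sma_3"] = out["Close"].rolling(3).mean()
    return out

# Made-up two-level frame shaped like a single-ticker download.
columns = pd.MultiIndex.from_tuples(
    [("Close", "CVX"), ("Volume", "CVX")], names=["Price", "Ticker"]
)
raw = pd.DataFrame(
    [[1.0, 10], [2.0, 20], [3.0, 30], [4.0, 40], [5.0, 50]], columns=columns
)

cvx_ready = prepare_features(raw, ["Close", "sma_3"], toy_indicators)
print(cvx_ready)
```

With the real libraries, the third argument could be a small wrapper around `ta.add_all_ta_features(frame, open="Open", high="High", low="Low", close="Close", volume="Volume")`, and the same call would then serve both AMD and CVX.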
