# **Machine Learning Models Utilizing Walk-Forward Validation**

#### **Author: Zachary Wright, CFA, FRM | Last Updated: 03/03/25**

**Overview:** This is an exercise project that addresses the data leakage problem in time-series prediction. Samples are ordered by time, so the model is trained only on data that precedes the test samples it predicts.

**Planned Updates:**
- Feature engineering with volatility clustering, rolling correlations, and autoregression
- Model comparison and selection
- Model validation metrics
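
As a rough illustration of the planned feature engineering, the sketch below builds one candidate feature of each kind with pandas. It runs on synthetic prices rather than downloaded data, and the column names (`vol_10`, `corr_20`, `ret_lag1`, `ret_lag2`), the window lengths, and the second "benchmark" series are all assumptions made for the example, not choices taken from this project.

```python
import numpy as np
import pandas as pd

# Synthetic daily closes for two hypothetical series (not real market data).
rng = np.random.default_rng(0)
idx = pd.bdate_range("2023-01-02", periods=120)
close = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.02, len(idx)))), index=idx)
bench = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, len(idx)))), index=idx)

ret = close.pct_change()
bench_ret = bench.pct_change()

feats = pd.DataFrame(index=idx)
# Volatility clustering proxy: rolling standard deviation of daily returns.
feats["vol_10"] = ret.rolling(10).std()
# Rolling correlation between the two return series.
feats["corr_20"] = ret.rolling(20).corr(bench_ret)
# Autoregressive terms: returns lagged by one and two days.
feats["ret_lag1"] = ret.shift(1)
feats["ret_lag2"] = ret.shift(2)

# Drop the warm-up rows where a rolling window is not yet full.
feats = feats.dropna()
print(feats.tail())
```

Every feature here is computed from the current and earlier rows only, which keeps it compatible with a time-ordered train/test split.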

**Latest Changes:**
- Started project

**Libraries used:**
- yfinance (Yahoo Finance data)
- pandas
- NumPy
- scikit-learn

---

In [None]:
import yfinance as yf
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report

#Download historical stock data for NVIDIA
ticker = "NVDA"
data = yf.download(ticker, start="2020-01-01", end="2024-01-01")

#Feature Engineering with closing price data
data["MA5"] = data["Close"].rolling(window=5).mean()
data["MA10"] = data["Close"].rolling(window=10).mean()
data["Momentum"] = data["Close"] - data["Close"].shift(4)
data["Daily Return"] = data["Close"].pct_change()

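#Target Binary Variable (1 if next day's close price is higher, 0 otherwise)
#Note: the final row has no next-day close, so its comparison evaluates to False and it is labeled 0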
data["Target"] = (data["Close"].shift(-1) > data["Close"]).astype(int)

#Drop NaN or null values
data.dropna(inplace=True)

#Define Features
features = ["MA5", "MA10", "Momentum", "Daily Return"]
X = data[features]
y = data["Target"]

#Chronological hold-out split (a single walk-forward step)
split_point = int(len(data) * 0.7)  #Use 70% for training, 30% for testing (in time order)
X_train, X_test = X.iloc[:split_point], X.iloc[split_point:]
y_train, y_test = y.iloc[:split_point], y.iloc[split_point:]

#Standardize Features based on Training Data Only
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

#Train Logistic Regression Model
model = LogisticRegression()
model.fit(X_train, y_train)

#Predictions on Future Data
y_pred = model.predict(X_test)

#Evaluation
print(f"\nAccuracy on future data: {accuracy_score(y_test, y_pred):.2f}")
print(classification_report(y_test, y_pred))


Accuracy on future data: 0.47
              precision    recall  f1-score   support

           0       0.45      0.79      0.57       134
           1       0.56      0.22      0.31       166

    accuracy                           0.47       300
   macro avg       0.51      0.50      0.44       300
weighted avg       0.51      0.47      0.43       300
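
To put the 0.47 accuracy in context, it helps to compare it with a naive model that always predicts the majority class. The short calculation below uses only the support counts and accuracy printed in the report above; the variable names are illustrative.

```python
# Class supports copied from the classification report above.
support_down, support_up = 134, 166
total = support_down + support_up

# Accuracy of a naive model that always predicts the majority class (up days).
baseline_accuracy = max(support_down, support_up) / total
model_accuracy = 0.47  # reported accuracy on the test window

print(f"Majority-class baseline: {baseline_accuracy:.2f}")
print(f"Logistic regression:     {model_accuracy:.2f}")
```

On this test window, always predicting an up day would score about 0.55, so the fitted model is currently below that baseline.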



For class 0 (down days), recall is 0.79 and the F1 score is 0.57, so the model in its current state performs best at predicting down days. Recall for class 1 (up days) is only 0.22, and overall accuracy is 0.47. I will use K-Fold Cross Validation next to improve test-sample performance on up days.
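
One caveat for the cross-validation step: a standard shuffled K-fold would place later rows in the training folds and reintroduce the leakage this project is trying to avoid. A time-ordered alternative is scikit-learn's `TimeSeriesSplit`, which trains on an expanding window and tests on the block that follows it. The sketch below is a minimal example under stated assumptions: `X_demo` and `y_demo` are synthetic stand-ins for the real feature matrix and target, and `n_splits=5` is an arbitrary choice.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import TimeSeriesSplit
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the time-ordered features and target (assumption).
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(500, 4))
y_demo = (X_demo[:, 0] + rng.normal(size=500) > 0).astype(int)

tscv = TimeSeriesSplit(n_splits=5)
fold_scores = []
for train_idx, test_idx in tscv.split(X_demo):
    # Every training index precedes every test index, so no future rows leak in.
    assert train_idx.max() < test_idx.min()
    # The pipeline refits the scaler on each training window only.
    clf = make_pipeline(StandardScaler(), LogisticRegression())
    clf.fit(X_demo[train_idx], y_demo[train_idx])
    fold_scores.append(accuracy_score(y_demo[test_idx], clf.predict(X_demo[test_idx])))

print([round(s, 2) for s in fold_scores])
```

Wrapping the scaler and the classifier in one pipeline means the standardization statistics are recomputed inside each fold, matching the "fit on training data only" rule used in the single split above.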