## Applying KNN with Time Series Split, Hyperparameter Tuning, and Dynamic Time Warping (DTW) on Stock Data


This guide outlines the process of building a KNN model to classify stock behavior. Predictive performance may be improved through exploratory data analysis, for example by adding tickers that are correlated with the target ticker as features, and by applying more advanced statistical preprocessing.

# KNN with Time Series Split, Hyperparameter Tuning, and DTW

## 1. Introduction
This study explores the application of the K-Nearest Neighbors (KNN) algorithm for predicting stock price movements. It focuses on:

- Splitting the time series data with `TimeSeriesSplit` so that evaluation respects temporal order.
- Hyperparameter tuning for KNN using metrics like accuracy.
- Applying **Dynamic Time Warping (DTW)** to analyze and compare the last 60 days of stock movements.

---

## 2. Data Description
The dataset contains:

- Daily stock prices and technical indicators.
- An `F_prediction` column, indicating whether the price of the F ticker increases (1) or decreases (0) on the following day.
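
As an illustration of how such a label could be derived, the sketch below builds a next-day direction column from a price series. The `Close` column name and the `add_direction_label` helper are assumptions made for this example, not part of the original dataset description. The treatment of a flat day (labelled 0 here) is also an assumption, since the description only covers increases and decreases.

```python
import pandas as pd

def add_direction_label(df: pd.DataFrame,
                        price_col: str = "Close",        # assumed column name
                        target_col: str = "F_prediction") -> pd.DataFrame:
    """Label each row 1 if the next day's price is higher, else 0."""
    out = df.copy()
    next_price = out[price_col].shift(-1)
    # Assumption: an unchanged price counts as 0 (not an increase).
    out[target_col] = (next_price > out[price_col]).astype(int)
    # The last row has no following day, so its label is undefined; drop it.
    return out.iloc[:-1]

# Tiny hand-made price series, for illustration only.
prices = pd.DataFrame({"Close": [10.0, 10.5, 10.2, 10.2, 10.9]})
labelled = add_direction_label(prices)
print(labelled)
```

Dropping the final row avoids training on a label that would otherwise silently default to 0.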

---

## 3. Methodology

### 3.1 Data Preprocessing
1. **Features and Target**: Extract relevant features such as technical indicators (`SMA`, `EMA`, `MACD`, etc.) and define the target as price direction (0 or 1).
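2. **Scaling**: Standardize the features using `StandardScaler`, fitting the scaler on the training portion of each split only so that no information from the test period leaks into training.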
3. **Time Series Split**: Implement `TimeSeriesSplit` to ensure the training set always precedes the test set, preserving temporal order.
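
The preprocessing steps above can be sketched as follows. The data here is synthetic and stands in for the indicator columns and the `F_prediction` target; the number of splits (5) is an assumption for illustration. The scaler is re-fitted inside each fold so that test-period statistics never influence the training features.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))        # synthetic stand-in for indicator features
y = rng.integers(0, 2, size=200)     # synthetic stand-in for F_prediction

tscv = TimeSeriesSplit(n_splits=5)   # assumed number of folds
fold_ranges = []

for fold, (train_idx, test_idx) in enumerate(tscv.split(X)):
    # Fit the scaler on the training rows only, then apply it to both parts.
    scaler = StandardScaler().fit(X[train_idx])
    X_train = scaler.transform(X[train_idx])
    X_test = scaler.transform(X[test_idx])

    # Record the last training index and the first test index of this fold.
    fold_ranges.append((int(train_idx.max()), int(test_idx.min())))
    print(f"fold {fold}: train rows 0..{train_idx.max()}, "
          f"test rows {test_idx.min()}..{test_idx.max()}")
```

In every fold the last training row comes before the first test row, which is the property that distinguishes `TimeSeriesSplit` from a shuffled k-fold split.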

---

### 3.2 KNN Model
1. **Baseline KNN**: Use KNN for binary classification to predict stock price movement.
2. **Hyperparameter Tuning**: Perform grid search over:
   - **Number of neighbors (\( k \))**: 3, 5, 7, etc.
   - **Distance metrics**: `euclidean`, `manhattan`.
   - **Weighting schemes**: `uniform`, `distance`.
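
A minimal sketch of this grid search, assuming scikit-learn's `GridSearchCV` with a `TimeSeriesSplit` cross-validator and accuracy as the scoring metric. The data is synthetic, and the grid contains only the values named above; the real study may have searched a wider range. Wrapping the scaler and the classifier in a `Pipeline` keeps the scaling inside each fold.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
X = rng.normal(size=(300, 5))                                # synthetic features
y = (X[:, 0] + 0.5 * rng.normal(size=300) > 0).astype(int)   # synthetic target

pipeline = Pipeline([
    ("scaler", StandardScaler()),       # re-fitted on each training fold
    ("knn", KNeighborsClassifier()),
])

param_grid = {
    "knn__n_neighbors": [3, 5, 7],
    "knn__metric": ["euclidean", "manhattan"],
    "knn__weights": ["uniform", "distance"],
}

search = GridSearchCV(
    pipeline,
    param_grid,
    cv=TimeSeriesSplit(n_splits=5),     # assumed number of folds
    scoring="accuracy",
)
search.fit(X, y)

print("best parameters:", search.best_params_)
print("mean CV accuracy:", round(search.best_score_, 3))
```

Odd values of \( k \) avoid tied votes in binary classification when `uniform` weighting is used.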

---

### 3.3 Dynamic Time Warping (DTW)
1. **Application on Last 60 Days**: Use DTW to measure the similarity between the last 60 days of stock prices and historical trends.
2. **Insights**: Identify patterns and correlations for potential decision-making.
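
One way to carry out this comparison is sketched below: a plain dynamic-programming DTW, applied between the most recent 60-day window and every earlier, non-overlapping 60-day window. The helper names (`dtw_distance`, `zscore`, `most_similar_windows`), the per-window z-score normalization, and the absolute-difference point cost are assumptions for this example; the original analysis may have used a DTW library or different settings.

```python
import numpy as np

def dtw_distance(a, b) -> float:
    """Classic O(n*m) DTW with absolute difference as the point cost."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],       # step in a only
                                 cost[i, j - 1],       # step in b only
                                 cost[i - 1, j - 1])   # step in both
    return float(cost[n, m])

def zscore(window):
    """Normalize a window so that shape, not price level, is compared."""
    window = np.asarray(window, dtype=float)
    std = window.std()
    return (window - window.mean()) / std if std > 0 else window - window.mean()

def most_similar_windows(prices, window=60, top_n=3):
    """Return (distance, start_index) of the historical windows closest to the last one."""
    prices = np.asarray(prices, dtype=float)
    query = zscore(prices[-window:])
    results = []
    # Only consider windows that end before the query window begins.
    for start in range(0, len(prices) - 2 * window + 1):
        candidate = zscore(prices[start:start + window])
        results.append((dtw_distance(query, candidate), start))
    results.sort()
    return results[:top_n]

# Synthetic random walk standing in for a closing-price series.
rng = np.random.default_rng(1)
prices = 100 + np.cumsum(rng.normal(size=250))
for dist, start in most_similar_windows(prices):
    print(f"window starting at day {start}: DTW distance {dist:.2f}")
```

The pure-Python loops are adequate for a few hundred windows; longer histories would likely call for a vectorized or library implementation.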


#### Dynamic Time Warping vs Euclidean Distance

Dynamic Time Warping (DTW) and Euclidean Distance are two methods used to measure similarity between time series data. The following image illustrates the difference between these methods:

![Dynamic Time Warping vs Euclidean Matching](image.png)

**Key Points**:
1. **Euclidean Matching**: Compares points in a one-to-one straight-line fashion, which is sensitive to distortions in time series alignment.
2. **Dynamic Time Warping (DTW)**: Allows flexible matching of points, accommodating time distortions and non-linear alignments, making it better for complex time series comparisons.
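
The two measures can be written out to make the contrast concrete. The symbols \( x \), \( y \), and \( D \) are notation introduced here for illustration, and the absolute difference is one common choice of point cost. For two series \( x = (x_1, \dots, x_n) \) and \( y = (y_1, \dots, y_n) \) of equal length, the Euclidean distance pairs the points index by index:

\[
d_E(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}
\]

DTW instead fills a cumulative cost table for series of lengths \( n \) and \( m \), choosing the cheapest of three possible predecessor alignments at each cell:

\[
D(i, j) = |x_i - y_j| + \min\{\, D(i-1, j),\; D(i, j-1),\; D(i-1, j-1) \,\}, \qquad D(0, 0) = 0, \quad D(i, 0) = D(0, j) = \infty
\]

\[
\mathrm{DTW}(x, y) = D(n, m)
\]

As a small worked case, take \( x = (1, 2, 3, 3) \) and \( y = (1, 1, 2, 3) \), which trace the same shape with a one-step delay. The Euclidean distance is \( \sqrt{0 + 1 + 1 + 0} = \sqrt{2} \), while DTW can align \( x_1 \) with both \( y_1 \) and \( y_2 \), and \( y_4 \) with both \( x_3 \) and \( x_4 \), giving a total cost of 0.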



---

## 4. Results
1. **Performance Metrics**: Accuracy for each `TimeSeriesSplit` fold.
2. **Hyperparameter Tuning Results**: Optimal \( k \), distance metric, and weighting scheme.
3. **DTW Analysis**: Patterns identified in the last 60 days of data.

---

## 5. Conclusion
This study applies KNN to short-term stock movement prediction and DTW to similarity analysis of recent price behavior. Used together with a time-ordered validation scheme, the two methods provide a practical framework for financial time series analysis.
