# K-Nearest Neighbors (KNN) - PyTorch Implementation

Multi-class classification on the **Covertype (Forest Cover Type)** dataset using PyTorch GPU acceleration.

**Dataset**: 581,012 samples, 54 features, 7 forest cover types  
**Task**: Predict forest cover type from cartographic variables  
**Key Concept**: KNN is a "lazy learner": it has no training phase beyond storing the data, so all of the computational cost falls at prediction time

## PyTorch Advantages for KNN
- **`torch.cdist`**: Highly optimized pairwise distance computation
- **GPU acceleration**: RTX 4090 CUDA cores for parallel distance calculations
- **Tensor operations**: Efficient top-k selection with `torch.topk`
- **Memory management**: Batched processing to fit in 24GB VRAM
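
The four points above combine into a short batched prediction loop. The sketch below is illustrative only, not this notebook's implementation: the function name `knn_predict`, its parameters, and the default `batch_size` are assumptions, and ties in the majority vote are resolved by whatever `argmax` returns. It assumes float feature tensors and integer (`long`) labels in the range `0..num_classes-1`, all on the same device.

```python
import torch
import torch.nn.functional as F


def knn_predict(X_train, y_train, X_query, k=5, batch_size=1024, num_classes=None):
    """Hypothetical batched KNN classifier (Euclidean distance, majority vote)."""
    if num_classes is None:
        # Assumes labels are 0-indexed integers
        num_classes = int(y_train.max().item()) + 1

    preds = []
    # Process queries in chunks so the (batch, n_train) distance matrix fits in memory
    for start in range(0, X_query.shape[0], batch_size):
        batch = X_query[start:start + batch_size]

        # Pairwise Euclidean distances, shape (batch, n_train)
        dists = torch.cdist(batch, X_train)

        # Indices of the k smallest distances per query, shape (batch, k)
        _, idx = torch.topk(dists, k, dim=1, largest=False)

        # Labels of those neighbors, shape (batch, k)
        neighbor_labels = y_train[idx]

        # Count votes per class, shape (batch, num_classes), then pick the winner
        votes = F.one_hot(neighbor_labels, num_classes).sum(dim=1)
        preds.append(votes.argmax(dim=1))

    return torch.cat(preds)
```

A smaller `batch_size` lowers peak memory at the cost of more kernel launches; the distance matrix for one batch holds `batch_size * n_train` floats, so the right value depends on the training set size and available VRAM.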


In [1]:
# Standard libraries
import numpy as np
import sys

# PyTorch for GPU-accelerated distance computation
import torch

# Add utils to path
sys.path.append('../..')
from utils.data_loader import load_processed_data
from utils.metrics import accuracy, macro_f1_score
from utils.visualization import (
    plot_confusion_matrix_multiclass,
    plot_validation_curve,
    plot_per_class_f1
)
from utils.performance import track_performance

# Check GPU availability
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")
if device.type == 'cuda':
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    # Reported in decimal GB (1e9 bytes), so the figure can exceed the card's nominal 24 GB
    print(f"VRAM: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")

print("Imports complete!")

Using device: cuda
GPU: NVIDIA GeForce RTX 4090
VRAM: 25.8 GB
Imports complete!
