# KNN – Classification and Regression  
## Testing Multiple Values of K

**Task:**  
Test the classification and regression models using different values of K.  
Compare the results and explain the differences that appear.


In [None]:
# Imports
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
import plotly.express as px
import plotly.graph_objects as go
from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import accuracy_score, confusion_matrix, mean_squared_error

In [None]:
# Download the dataset
!pip install -q gdown
!gdown "https://drive.google.com/uc?id=1WO3yoK_Fd-v3JBLBCVNvZVZHfS5Un_Em"

df = pd.read_csv('strength_training_data.csv')
df.head()

## KNN Classification – Strength Level

In [None]:
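# Features: the three lifts plus body weight; target: the strength level label
X = df[['Bench_Press_kg', 'Squat_kg', 'Deadlift_kg', 'Body_Weight_kg']]
y = df['Strength_Level']

# Encode the string labels as integers
le = LabelEncoder()
y_encoded = le.fit_transform(y)

# Hold out 25% of the rows for testing; the fixed seed makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(
    X, y_encoded, test_size=0.25, random_state=42
)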

In [None]:
k_values = [1, 3, 5, 7, 9, 11]
accuracies = []

# Train one classifier per K and record its accuracy on the test set
for k in k_values:
    knn = KNeighborsClassifier(n_neighbors=k, metric='euclidean')
    knn.fit(X_train, y_train)
    y_pred = knn.predict(X_test)
    acc = accuracy_score(y_test, y_pred)
    accuracies.append(acc)
    print(f"K={k} -> Accuracy={acc:.4f}")

In [None]:
fig = px.line(
    x=k_values,
    y=accuracies,
    markers=True,
    labels={'x': 'K', 'y': 'Accuracy'},
    title='KNN Classification – Accuracy vs K'
)
fig.show()
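
The comparison above relies on a single train/test split, so small differences between K values may partly reflect which rows landed in the test set. One possible way to make the comparison more stable is k-fold cross-validation. The following is a minimal sketch under stated assumptions: it uses synthetic data from `make_classification` as a stand-in for the notebook's dataset, and the names `X_demo`, `y_demo`, `cv_means`, and `best_k` are hypothetical, introduced only for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data (assumption): 300 samples, 4 features, 3 classes.
X_demo, y_demo = make_classification(
    n_samples=300, n_features=4, n_informative=3, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=42
)

k_values = [1, 3, 5, 7, 9, 11]
cv_means = []

for k in k_values:
    knn = KNeighborsClassifier(n_neighbors=k, metric='euclidean')
    # 5-fold cross-validation; with an integer cv and a classifier,
    # scikit-learn uses stratified folds by default.
    scores = cross_val_score(knn, X_demo, y_demo, cv=5, scoring='accuracy')
    cv_means.append(scores.mean())
    print(f"K={k} -> mean CV accuracy={scores.mean():.4f} (std={scores.std():.4f})")

# K with the highest mean cross-validated accuracy
best_k = k_values[int(np.argmax(cv_means))]
print(f"Best K by mean CV accuracy: {best_k}")
```

To try this on the notebook's data, `X_demo` and `y_demo` could be replaced with `X` and `y_encoded` from the cells above.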

## KNN Regression – Body Weight

In [None]:
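# Features: the three lifts; target: body weight in kg
X = df[['Bench_Press_kg', 'Squat_kg', 'Deadlift_kg']]
y = df['Body_Weight_kg']

# Same 75/25 split and seed as in the classification section
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)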

In [None]:
k_values = [1, 3, 5, 7, 9, 11]
mse_values = []

# Train one regressor per K and record its test-set MSE (in kg², since the target is in kg)
for k in k_values:
    knn = KNeighborsRegressor(n_neighbors=k, metric='euclidean')
    knn.fit(X_train, y_train)
    y_pred = knn.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    mse_values.append(mse)
    print(f"K={k} -> MSE={mse:.4f}")

In [None]:
fig = px.line(
    x=k_values,
    y=mse_values,
    markers=True,
    labels={'x': 'K', 'y': 'MSE'},
    title='KNN Regression – MSE vs K'
)
fig.show()
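
MSE is expressed in squared units of the target (kg² for body weight), which can make the values hard to interpret. Taking the square root gives the RMSE, which is in the same unit as the target. The sketch below illustrates this on synthetic data from `make_regression`, used here as an assumed stand-in for the notebook's dataset; the names `X_demo`, `y_demo`, and `rmse_values` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in data (assumption): 300 samples, 3 features, noisy target.
X_demo, y_demo = make_regression(
    n_samples=300, n_features=3, noise=10.0, random_state=42
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=42
)

k_values = [1, 3, 5, 7, 9, 11]
rmse_values = []

for k in k_values:
    knn = KNeighborsRegressor(n_neighbors=k, metric='euclidean')
    knn.fit(X_tr, y_tr)
    mse = mean_squared_error(y_te, knn.predict(X_te))
    # RMSE is the square root of MSE, so it is in the target's own unit.
    rmse = np.sqrt(mse)
    rmse_values.append(rmse)
    print(f"K={k} -> MSE={mse:.4f}, RMSE={rmse:.4f}")
```

In the notebook itself, the same conversion could be applied directly with `np.sqrt(mse_values)`.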

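## Conclusions

- **Small K** → overfitting: each prediction depends on only a few neighbors, so the model is sensitive to noise
- **Large K** → underfitting: each prediction is averaged over many neighbors, so the model becomes too general
- **Intermediate K** usually gives the best results on the test set; the exact value depends on the data

The same observations apply to both classification and regression.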
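
One way to see the overfitting and underfitting pattern is to compare accuracy on the training set with accuracy on the test set for each K. The sketch below is illustrative only and rests on assumptions: it uses synthetic data with deliberately noisy labels rather than the notebook's dataset, and the names `X_demo`, `y_demo`, `k_values_demo`, `train_acc`, and `test_acc` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data (assumption). flip_y=0.1 assigns a random class
# to about 10% of the samples, which simulates label noise.
X_demo, y_demo = make_classification(
    n_samples=400, n_features=4, n_informative=3, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, flip_y=0.1, random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=0, stratify=y_demo
)

# A wider range of K than in the notebook, to make both extremes visible.
k_values_demo = [1, 5, 15, 51, 151]
train_acc, test_acc = [], []

for k in k_values_demo:
    knn = KNeighborsClassifier(n_neighbors=k, metric='euclidean')
    knn.fit(X_tr, y_tr)
    # Accuracy on the data the model has seen vs. data it has not seen
    train_acc.append(accuracy_score(y_tr, knn.predict(X_tr)))
    test_acc.append(accuracy_score(y_te, knn.predict(X_te)))
    print(f"K={k} -> train={train_acc[-1]:.4f}, test={test_acc[-1]:.4f}")
```

With K=1, every training point is its own nearest neighbor, so training accuracy is 1.0 as long as no two identical points carry different labels. A large gap between training and test accuracy at small K suggests overfitting, while low accuracy on both sets at very large K suggests underfitting.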
