scikit-learn link: https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_wine.html

- Classes: 3
- Samples: 178 in total (class_0: 59, class_1: 71, class_2: 48)
- Features: 13 (continuous, all positive)
- Source: the Wine dataset from the UCI Machine Learning Repository (some values reformatted)

| Feature name                    | Description                           |
| ------------------------------- | ------------------------------------- |
| alcohol                         | Alcohol content                       |
| malic\_acid                     | Malic acid content                    |
| ash                             | Ash content                           |
| alcalinity\_of\_ash             | Alkalinity of ash                     |
| magnesium                       | Magnesium content                     |
| total\_phenols                  | Total phenols                         |
| flavanoids                      | Flavanoids                            |
| nonflavanoid\_phenols           | Nonflavanoid phenols                  |
| proanthocyanins                 | Proanthocyanidins                     |
| color\_intensity                | Color intensity                       |
| hue                             | Hue                                   |
| od280/od315\_of\_diluted\_wines | OD280/OD315 ratio of diluted wines    |
| proline                         | Proline                               |
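
The dataset facts listed above can be checked directly from the object that `load_wine` returns. The following is a minimal sketch; the variable names `class_counts` and `all_positive` are chosen here for illustration and are not part of the notebook.

```python
import numpy as np
from sklearn.datasets import load_wine

wine = load_wine()
X, y = wine.data, wine.target

# Shape of the feature matrix: (samples, features)
n_samples, n_features = X.shape

# Number of samples per class, keyed by class name
class_counts = dict(zip(wine.target_names.tolist(), np.bincount(y).tolist()))

# True if every feature value is strictly positive
all_positive = bool((X > 0).all())

print(n_samples, n_features)
print(class_counts)
print(all_positive)
print(wine.feature_names)
```

The feature names printed by the last line correspond to the rows of the table above.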


In [1]:
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, classification_report

In [3]:
wine = load_wine()
X, y = wine.data, wine.target
target_names = wine.target_names

In [4]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

In [5]:
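# Standardize: fit the scaler on the training split only,
# then reuse the training mean and standard deviation for the test split

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)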
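
Standardization should learn its mean and standard deviation from the training split only, so that no information from the test split leaks into preprocessing. The sketch below rebuilds the same split as the notebook and checks what `StandardScaler` does. The names `train_means`, `test_means`, and `manual` are illustrative.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learns mean_ and scale_ from the training split
X_test_scaled = scaler.transform(X_test)        # reuses the training statistics

# The training split is centered at 0 with unit variance by construction
train_means = X_train_scaled.mean(axis=0)
train_stds = X_train_scaled.std(axis=0)

# The test split is generally NOT exactly centered, which is expected
test_means = X_test_scaled.mean(axis=0)

# transform() is equivalent to (x - mean_) / scale_ with the stored training statistics
manual = (X_test - scaler.mean_) / scaler.scale_

print(np.round(train_means, 6))
print(np.round(test_means, 3))
```

Calling `fit_transform` on the test split instead would re-center it using its own statistics, so the two splits would no longer be on a common scale.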

In [13]:
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train_scaled, y_train)
y_pred = knn.predict(X_test_scaled)
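
To make the prediction step concrete, here is a brute-force sketch of what a k-nearest-neighbors classifier with uniform weights does: compute Euclidean distances, take the `k` closest training points, and return the majority class. `knn_predict_manual` is a hypothetical helper written for illustration, not the scikit-learn implementation, which may use tree-based search internally.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)


def knn_predict_manual(X_fit, y_fit, X_query, k=3):
    """Hypothetical helper: brute-force k-NN with Euclidean distance and majority vote."""
    n_classes = len(np.unique(y_fit))
    # Pairwise distances, shape (n_query, n_fit)
    dists = np.linalg.norm(X_query[:, None, :] - X_fit[None, :, :], axis=2)
    # Indices of the k closest training points for each query point
    nearest = np.argsort(dists, axis=1)[:, :k]
    # Labels of those neighbors, then a per-row majority vote
    votes = y_fit[nearest]
    return np.array([np.bincount(row, minlength=n_classes).argmax() for row in votes])


manual_pred = knn_predict_manual(X_train_scaled, y_train, X_test_scaled, k=3)
sklearn_pred = (
    KNeighborsClassifier(n_neighbors=3)
    .fit(X_train_scaled, y_train)
    .predict(X_test_scaled)
)

# Fraction of test points where the manual vote matches scikit-learn
agreement = float((manual_pred == sklearn_pred).mean())
print(agreement)
```

Because distances are computed on raw feature differences, features with large ranges such as `proline` would dominate the vote without standardization.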

In [14]:
accuracy_score(y_test, y_pred)

0.9629629629629629

In [15]:
print(classification_report(y_test, y_pred, target_names=target_names))

              precision    recall  f1-score   support

     class_0       0.95      1.00      0.97        19
     class_1       1.00      0.90      0.95        21
     class_2       0.93      1.00      0.97        14

    accuracy                           0.96        54
   macro avg       0.96      0.97      0.96        54
weighted avg       0.97      0.96      0.96        54
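
The report shows per-class precision and recall but not which classes are confused with each other. A confusion matrix gives that view. This is a sketch that rebuilds the same split and a `k=3` model; the names `cm` and `per_class_recall` are illustrative.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)
scaler = StandardScaler().fit(X_train)
knn = KNeighborsClassifier(n_neighbors=3).fit(scaler.transform(X_train), y_train)
y_pred = knn.predict(scaler.transform(X_test))

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_test, y_pred)

# Recall per class: correct predictions on the diagonal divided by the row total (support)
per_class_recall = np.diag(cm) / cm.sum(axis=1)

print(cm)
print(np.round(per_class_recall, 2))
```

Each row total equals the `support` column of the classification report, and off-diagonal entries show where misclassified samples ended up.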



In [16]:
from sklearn.model_selection import GridSearchCV

In [18]:
param_grid = {'n_neighbors': range(1, 31)}
grid = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5)
grid.fit(X_train_scaled, y_train)
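
Running the grid search on data that was scaled beforehand means each validation fold has already influenced the scaler. One common alternative, sketched here as an assumption rather than as the notebook's method, is to put the scaler and the classifier in a `Pipeline` so that scaling is refit inside every cross-validation fold. The names `pipe`, `grid_pipe`, and `best_k_pipe` are illustrative.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Scaling and classification as one estimator, so CV refits the scaler per fold
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("knn", KNeighborsClassifier()),
])

# Parameters of a pipeline step are addressed as <step name>__<parameter>
param_grid = {"knn__n_neighbors": list(range(1, 31))}

grid_pipe = GridSearchCV(pipe, param_grid, cv=5)
grid_pipe.fit(X_train, y_train)  # raw features in; the pipeline handles scaling

best_k_pipe = grid_pipe.best_params_["knn__n_neighbors"]
test_accuracy = grid_pipe.score(X_test, y_test)

print(best_k_pipe)
print(test_accuracy)
```

The selected `k` may differ from the one found on pre-scaled data, since the cross-validation scores are computed slightly differently.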

In [19]:
best_k = grid.best_params_['n_neighbors']
best_k

7
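
The single best value hides how close the other candidates were. The sketch below reads the mean cross-validation accuracy for every `k` from `cv_results_`; the names `ks` and `mean_scores` are illustrative, and the setup repeats the notebook's split and scaling so the block runs on its own.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)
X_train_scaled = StandardScaler().fit_transform(X_train)

param_grid = {"n_neighbors": list(range(1, 31))}
grid = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5)
grid.fit(X_train_scaled, y_train)

# One entry per candidate k, in the order of the parameter grid
ks = np.array([p["n_neighbors"] for p in grid.cv_results_["params"]])
mean_scores = np.asarray(grid.cv_results_["mean_test_score"])

# Print each k with its mean 5-fold accuracy
for k, score in zip(ks, mean_scores):
    print(f"k={k:2d}  mean CV accuracy={score:.4f}")

print("best k:", grid.best_params_["n_neighbors"])
```

When several values of `k` score almost the same, the choice among them is not very consequential, which is consistent with the tuned model and the `k=3` model reaching the same test accuracy.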

In [20]:
# Predict with the best estimator (GridSearchCV refits it on the full training set with the best k)
best_model = grid.best_estimator_
y_pred_best = best_model.predict(X_test_scaled)

In [21]:
accuracy_score(y_test, y_pred_best)

0.9629629629629629

In [22]:
print(classification_report(y_test, y_pred_best, target_names=target_names))

              precision    recall  f1-score   support

     class_0       0.95      1.00      0.97        19
     class_1       1.00      0.90      0.95        21
     class_2       0.93      1.00      0.97        14

    accuracy                           0.96        54
   macro avg       0.96      0.97      0.96        54
weighted avg       0.97      0.96      0.96        54

