# SVM

- A model that improves on linear models
- Useful when the number of training samples is small

<img src="https://github.com/StillWork/image/blob/main/%E1%84%89%E1%85%B3%E1%84%8F%E1%85%B3%E1%84%85%E1%85%B5%E1%86%AB%E1%84%89%E1%85%A3%E1%86%BA%202021-05-30%20%E1%84%8B%E1%85%A9%E1%84%8C%E1%85%A5%E1%86%AB%209.28.37.png?raw=1" align='center'  width=500>

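- SVM models need hyperparameter tuning
  - C: controls the margin (a smaller C allows a wider, softer margin); gamma: controls overfitting (a larger gamma makes the RBF decision boundary follow the training points more closely)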


![C](https://s3.stackabuse.com/media/articles/understanding-svm-hyperparameters-1.png)

![gamma](https://s3.stackabuse.com/media/articles/understanding-svm-hyperparameters-4.png)
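
To make the effect of `gamma` concrete, the following minimal sketch fits an RBF-kernel `SVC` on the breast cancer dataset for a few `gamma` values and compares training and test accuracy. The helper name `gamma_sweep`, the chosen values, and the fixed `random_state` are assumptions for illustration and are not part of the original notebook; exact scores depend on the split.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def gamma_sweep(gammas, C=1.0, random_state=0):
    """Return {gamma: (train_accuracy, test_accuracy)} for an RBF SVC.

    Hypothetical helper, for illustration only.
    """
    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, random_state=random_state
    )
    scores = {}
    for gamma in gammas:
        model = SVC(kernel="rbf", C=C, gamma=gamma).fit(X_train, y_train)
        scores[gamma] = (
            model.score(X_train, y_train),  # accuracy on the training split
            model.score(X_test, y_test),    # accuracy on the held-out split
        )
    return scores


for gamma, (train_acc, test_acc) in gamma_sweep([1, 0.01, 0.0001]).items():
    print(f"gamma={gamma}: train={train_acc:.2f}, test={test_acc:.2f}")
```

A training accuracy far above the test accuracy at large `gamma` is the overfitting pattern that the second figure illustrates.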

# import

In [1]:
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
import matplotlib
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV 
%config InlineBackend.figure_format="retina"

In [2]:
## Load the breast cancer dataset

cancer = load_breast_cancer()
X = pd.DataFrame(cancer.data, columns=cancer.feature_names)
y = cancer.target
print(X.shape)
X[:3]

(569, 30)


,mean radius,mean texture,mean perimeter,mean area,mean smoothness,mean compactness,mean concavity,mean concave points,mean symmetry,mean fractal dimension,...,worst radius,worst texture,worst perimeter,worst area,worst smoothness,worst compactness,worst concavity,worst concave points,worst symmetry,worst fractal dimension
0,17.99,10.38,122.8,1001.0,0.1184,0.2776,0.3001,0.1471,0.2419,0.07871,...,25.38,17.33,184.6,2019.0,0.1622,0.6656,0.7119,0.2654,0.4601,0.1189
1,20.57,17.77,132.9,1326.0,0.08474,0.07864,0.0869,0.07017,0.1812,0.05667,...,24.99,23.41,158.8,1956.0,0.1238,0.1866,0.2416,0.186,0.275,0.08902
2,19.69,21.25,130.0,1203.0,0.1096,0.1599,0.1974,0.1279,0.2069,0.05999,...,23.57,25.53,152.5,1709.0,0.1444,0.4245,0.4504,0.243,0.3613,0.08758


In [11]:
cancer.target[:30]

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1,
       0, 0, 0, 0, 0, 0, 0, 0])

In [4]:
type(cancer)

sklearn.utils.Bunch

In [5]:
cancer.keys()

dict_keys(['data', 'target', 'frame', 'target_names', 'DESCR', 'feature_names', 'filename'])
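
The `Bunch` behaves like a dictionary whose keys are also available as attributes. As a small sketch, the fields listed above can be used to see which class each label value stands for and how many samples each class has. The variable names below are illustrative.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

cancer = load_breast_cancer()

# target_names[i] is the class name for label value i
label_names = list(cancer.target_names)

# Count the samples for each label value (0 and 1)
class_counts = np.bincount(cancer.target)

for label, (name, count) in enumerate(zip(label_names, class_counts)):
    print(f"{label}: {name} ({count} samples)")
```

Label 0 is malignant and label 1 is benign, which is how the rows of the classification reports in this notebook should be read.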

# Data

In [6]:
from sklearn.model_selection import train_test_split 
from sklearn.svm import SVC  
# Default split: 75% train / 25% test; no random_state is set, so results vary between runs
X_train, X_test, y_train, y_test = train_test_split(X, y) 

In [7]:
# Use the model with its default hyperparameters
model = SVC() 
model.fit(X_train, y_train) 
  
y_pred = model.predict(X_test) 
print(classification_report(y_test, y_pred))

              precision    recall  f1-score   support

           0       0.95      0.73      0.83        49
           1       0.88      0.98      0.92        94

    accuracy                           0.90       143
   macro avg       0.91      0.86      0.88       143
weighted avg       0.90      0.90      0.89       143



# Grid search

- A method that runs every combination of the given hyperparameter values
- See also: random search

In [8]:
# Specify the hyperparameter values to search over

param_grid = {'C': [0.1, 1, 10, 100, 1000],  
              'gamma': [1, 0.1, 0.01, 0.001, 0.0001], 
              'kernel': ['rbf']}  
  
grid = GridSearchCV(SVC(), param_grid, refit = True) 
grid.fit(X_train, y_train)

GridSearchCV(estimator=SVC(),
             param_grid={'C': [0.1, 1, 10, 100, 1000],
                         'gamma': [1, 0.1, 0.01, 0.001, 0.0001],
                         'kernel': ['rbf']})
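
To see what "every combination" amounts to for this grid, here is a small sketch using scikit-learn's `ParameterGrid`, which enumerates the same candidates that `GridSearchCV` iterates over. The 5-fold figure assumes the default `cv` of scikit-learn 0.22 and later.

```python
from sklearn.model_selection import ParameterGrid

param_grid = {
    "C": [0.1, 1, 10, 100, 1000],
    "gamma": [1, 0.1, 0.01, 0.001, 0.0001],
    "kernel": ["rbf"],
}

combinations = list(ParameterGrid(param_grid))
n_folds = 5  # assumed: default cv in scikit-learn >= 0.22

print(len(combinations))            # 5 * 5 * 1 = 25 candidates
print(len(combinations) * n_folds)  # 125 cross-validation fits
```

With `refit=True`, one more fit follows: the best candidate is retrained on the whole training set.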

In [9]:
# Show the best hyperparameters
print(grid.best_params_) 

{'C': 100, 'gamma': 0.0001, 'kernel': 'rbf'}
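
Beyond `best_params_`, the fitted search object keeps the cross-validation score of every candidate. The sketch below is self-contained and, to keep it quick, uses a smaller grid than the notebook. The reduced grid, the `random_state`, and the variable names are assumptions for illustration.

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Reduced grid (assumption) so the example runs quickly
small_grid = {"C": [1, 100], "gamma": [0.001, 0.0001], "kernel": ["rbf"]}
search = GridSearchCV(SVC(), small_grid).fit(X_train, y_train)

# One row per candidate, sorted from best to worst mean CV score
results = pd.DataFrame(search.cv_results_)
ranked = results.sort_values("rank_test_score")[
    ["param_C", "param_gamma", "mean_test_score", "std_test_score"]
]
print(ranked)
print(search.best_score_)  # mean CV accuracy of the best candidate
```

Comparing `mean_test_score` across rows shows how close the runner-up settings are, which a single `best_params_` dictionary hides.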


In [10]:
# Predict again with the best model found by the grid search

grid_predictions = grid.predict(X_test) 
  
# print classification report 
print(classification_report(y_test, grid_predictions)) 

              precision    recall  f1-score   support

           0       0.94      0.94      0.94        49
           1       0.97      0.97      0.97        94

    accuracy                           0.96       143
   macro avg       0.95      0.95      0.95       143
weighted avg       0.96      0.96      0.96       143



- Grid search

![grid](https://upload.wikimedia.org/wikipedia/commons/thumb/b/b6/Hyperparameter_Optimization_using_Grid_Search.svg/440px-Hyperparameter_Optimization_using_Grid_Search.svg.png)

- Random search

![random](https://upload.wikimedia.org/wikipedia/commons/thumb/7/74/Hyperparameter_Optimization_using_Random_Search.svg/440px-Hyperparameter_Optimization_using_Random_Search.svg.png)
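
As a counterpart to the grid search above, random search can be sketched with scikit-learn's `RandomizedSearchCV`, which samples a fixed number of candidates instead of trying them all. The log-uniform ranges (chosen to span the same values as the grid), `n_iter=10`, `cv=3`, and the `random_state` values are assumptions for illustration.

```python
from scipy.stats import loguniform
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Continuous ranges instead of fixed lists; sampled on a log scale
param_distributions = {
    "C": loguniform(1e-1, 1e3),
    "gamma": loguniform(1e-4, 1e0),
    "kernel": ["rbf"],
}

random_search = RandomizedSearchCV(
    SVC(), param_distributions, n_iter=10, cv=3, random_state=0
)
random_search.fit(X_train, y_train)

print(random_search.best_params_)
print(random_search.score(X_test, y_test))  # test accuracy of the refit best model
```

The cost is set by `n_iter` rather than by the size of the grid, so adding another hyperparameter does not multiply the number of fits.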