# 5.1 Linear SVM Classification

* You can think of an SVM classifier as fitting the widest possible street between the classes
* SVMs are very sensitive to feature scaling
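
To see why scaling matters, here is a minimal sketch on a tiny toy dataset. The four points, the labels, and `C=100` are hypothetical values chosen only for illustration. The second feature has a much larger spread than the first, so it dominates the distances the SVM works with until `StandardScaler` puts both features on a comparable scale.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical toy data: feature 0 ranges over 1..5, feature 1 over 20..80.
Xs = np.array([[1, 50], [5, 20], [3, 80], [5, 60]], dtype=np.float64)
ys = np.array([0, 0, 1, 1])

# Linear SVM on the raw features.
svm_raw = SVC(kernel="linear", C=100).fit(Xs, ys)

# Same model after standardizing each feature to zero mean and unit variance.
scaler = StandardScaler()
Xs_scaled = scaler.fit_transform(Xs)
svm_scaled = SVC(kernel="linear", C=100).fit(Xs_scaled, ys)

print("feature std before scaling:", Xs.std(axis=0))
print("feature std after scaling: ", Xs_scaled.std(axis=0))
print("weights on raw features:   ", svm_raw.coef_[0])
print("weights on scaled features:", svm_scaled.coef_[0])
```

Comparing the two weight vectors gives a rough sense of how much the fitted street depends on the units of each feature.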

Detecting Iris virginica flowers in the iris dataset

In [1]:
import numpy as np
from sklearn import datasets
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

In [2]:
iris = datasets.load_iris()
X = iris["data"][:, (2, 3)]     # petal length, petal width
y = (iris["target"] == 2).astype(np.float64)    # iris virginica

In [3]:
# A Pipeline chains several transformers and a final estimator into a typical machine-learning workflow
svm_clf = Pipeline([
        ("scaler", StandardScaler()),
        ("linear_svc", LinearSVC(C=1, loss="hinge")),
    ])
svm_clf.fit(X, y)

Pipeline(steps=[('scaler', StandardScaler()),
                ('linear_svc', LinearSVC(C=1, loss='hinge'))])

In [4]:
# Predict
svm_clf.predict([[5.5, 1.7]])

array([1.])

# 5.2 Nonlinear SVM Classification

* One way to handle nonlinear datasets is to add more features
* In some cases, this can make the dataset linearly separable

In [11]:
# Create a Pipeline containing a PolynomialFeatures transformer
from sklearn.datasets import make_moons
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.model_selection import train_test_split

X, y = make_moons(n_samples=1000, noise=0.15)
polynomial_svm_clf = Pipeline([
        ("poly_features", PolynomialFeatures(degree=3)),
        ("scaler", StandardScaler()),
        ("svm_clf", LinearSVC(C=10, loss="hinge", max_iter=5000))
    ])

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
polynomial_svm_clf.fit(X_train, y_train)

Pipeline(steps=[('poly_features', PolynomialFeatures(degree=3)),
                ('scaler', StandardScaler()),
                ('svm_clf', LinearSVC(C=10, loss='hinge', max_iter=5000))])
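
The cell above holds out 30% of the data but never scores on it. A minimal sketch of that evaluation step follows. It rebuilds the same pipeline so that it runs on its own; the `random_state=42` arguments are an assumption added here for reproducibility and are not in the original cell.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.svm import LinearSVC

# random_state is an assumption added for reproducibility.
X, y = make_moons(n_samples=1000, noise=0.15, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

polynomial_svm_clf = Pipeline([
    ("poly_features", PolynomialFeatures(degree=3)),
    ("scaler", StandardScaler()),
    ("svm_clf", LinearSVC(C=10, loss="hinge", max_iter=5000)),
])
polynomial_svm_clf.fit(X_train, y_train)

# Pipeline.score returns the mean accuracy of the final estimator.
train_acc = polynomial_svm_clf.score(X_train, y_train)
test_acc = polynomial_svm_clf.score(X_test, y_test)
print(f"train accuracy: {train_acc:.3f}, test accuracy: {test_acc:.3f}")
```

A large gap between the two numbers would suggest overfitting, in which case lowering `degree` or `C` is a reasonable first thing to try.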

## 5.2.1 Polynomial Kernel

In [12]:
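# Kernel trick: gives the same result as adding many polynomial features, without actually having to add them
from sklearn.svm import SVC
poly_kernel_svm_clf = Pipeline([
    ("scaler", StandardScaler()),
    ("svm_clf", SVC(kernel="poly", degree=3, coef0=1, C=5))     # coef0 controls how much the model is influenced by high-degree versus low-degree terms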
])
poly_kernel_svm_clf.fit(X, y)

Pipeline(steps=[('scaler', StandardScaler()),
                ('svm_clf', SVC(C=5, coef0=1, kernel='poly'))])
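
To make the kernel trick concrete, here is a minimal NumPy sketch for a second-degree polynomial mapping of 2D vectors. The vectors `a` and `b` and the helper `phi` are hypothetical and chosen for illustration; this is the simplest case (degree 2, no constant term), not the `degree=3, coef0=1` setting used above. The kernel $(a^\top b)^2$ equals the dot product of the explicitly transformed vectors, so the transformation never has to be computed.

```python
import numpy as np

def phi(x):
    """Explicit second-degree polynomial mapping of a 2D vector (illustrative helper)."""
    x1, x2 = x
    return np.array([x1 ** 2, np.sqrt(2) * x1 * x2, x2 ** 2])

a = np.array([2.0, 3.0])
b = np.array([1.0, 4.0])

explicit = phi(a) @ phi(b)   # dot product in the transformed 3D space
kernel = (a @ b) ** 2        # same quantity computed in the original 2D space

print(explicit, kernel)
```

Both values come out to 196, up to floating-point rounding.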

## 5.2.2 Similarity Features

A similarity function measures how much each instance resembles a particular landmark.

Gaussian RBF:  
![RBF](https://img-blog.csdnimg.cn/20200612161653951.png?x-oss-process=image/watermark,type_ZmFuZ3poZW5naGVpdGk,shadow_10,text_aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L21vb25saWdodHBlbmc=,size_16,color_FFFFFF,t_70)
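
Assuming the image shows the standard Gaussian RBF, $\phi_\gamma(\mathbf{x}, \ell) = \exp(-\gamma \lVert \mathbf{x} - \ell \rVert^2)$, the similarity features can be sketched as below. The helper name `gaussian_rbf`, the instance, the two landmarks, and `gamma = 0.3` are all illustrative choices, not values taken from the cells above.

```python
import numpy as np

def gaussian_rbf(x, landmark, gamma):
    """Similarity between x and a landmark: exp(-gamma * ||x - landmark||^2)."""
    diff = np.asarray(x, dtype=float) - np.asarray(landmark, dtype=float)
    return np.exp(-gamma * np.sum(diff ** 2))

# One 1D instance and two landmarks (hypothetical values).
x = np.array([-1.0])
landmarks = [np.array([-2.0]), np.array([1.0])]
gamma = 0.3

# Each landmark contributes one new feature.
features = [gaussian_rbf(x, landmark, gamma) for landmark in landmarks]
print(features)
```

The instance is at distance 1 from the first landmark and distance 2 from the second, so the new features are about 0.74 and 0.30. The similarity is 1 at the landmark itself and decays toward 0 with distance.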

## 5.2.3 Gaussian RBF Kernel

In [13]:
rbf_kernel_svm_clf = Pipeline([
        ("scaler", StandardScaler()),
        ("svm_clf", SVC(kernel="rbf", gamma=5, C=0.001))
    ])
rbf_kernel_svm_clf.fit(X, y)

Pipeline(steps=[('scaler', StandardScaler()),
                ('svm_clf', SVC(C=0.001, gamma=5))])

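* Increasing gamma narrows the bell-shaped curve, so each instance's range of influence shrinks and the decision boundary becomes more irregular, wiggling around individual instances
* Decreasing gamma widens the bell-shaped curve, so each instance's range of influence grows and the decision boundary becomes smoother
* If the model overfits, reduce gamma; if it underfits, increase gamma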
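
The effect of gamma described above can be checked with a rough sketch that trains the same pipeline with three gamma values and compares training and test accuracy. The gamma grid, `C=1`, and the `random_state` values are assumptions made for this illustration.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# random_state is an assumption added for reproducibility.
X, y = make_moons(n_samples=1000, noise=0.15, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

results = {}
for gamma in (0.01, 1, 100):   # illustrative grid: small, moderate, large
    clf = Pipeline([
        ("scaler", StandardScaler()),
        ("svm_clf", SVC(kernel="rbf", gamma=gamma, C=1)),
    ])
    clf.fit(X_train, y_train)
    results[gamma] = {
        "train": clf.score(X_train, y_train),
        "test": clf.score(X_test, y_test),
        # n_support_ holds the number of support vectors per class.
        "n_support": int(clf.named_steps["svm_clf"].n_support_.sum()),
    }

for gamma, r in results.items():
    print(f"gamma={gamma}: train={r['train']:.3f}, "
          f"test={r['test']:.3f}, support vectors={r['n_support']}")
```

A training accuracy well above the test accuracy at large gamma points to overfitting, while low accuracy on both at small gamma points to underfitting.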

## 5.2.4 Computational Complexity

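| Class | Time complexity | Out-of-core support | Scaling required | Kernel trick |
| ---- | ---- | ---- | ---- | ---- |
| LinearSVC | $O(m \times n)$ | No | Yes | No |
| SGDClassifier | $O(m \times n)$ | Yes | Yes | No |
| SVC | $O(m^2 \times n)$ to $O(m^3 \times n)$ | No | Yes | Yes |

Here $m$ is the number of training instances and $n$ is the number of features.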
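
Out-of-core support means the model can be trained on data that does not fit in memory. A minimal sketch of this with `SGDClassifier` follows, using hinge loss so that it behaves like a linear SVM and feeding it mini-batches through `partial_fit`. The batch size, number of epochs, and `alpha = 1 / (m * C)` with `C = 1` are assumptions for illustration. Here the batches are slices of an in-memory array; in a real out-of-core setting they would be read from disk.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.linear_model import SGDClassifier
from sklearn.preprocessing import StandardScaler

# random_state is an assumption added for reproducibility.
X, y = make_moons(n_samples=1000, noise=0.15, random_state=42)
m, C = len(X), 1.0
batch_size = 100   # illustrative mini-batch size
classes = np.unique(y)

# First pass: accumulate scaling statistics one batch at a time.
scaler = StandardScaler()
for start in range(0, m, batch_size):
    scaler.partial_fit(X[start:start + batch_size])

# Hinge loss gives a linear SVM objective; alpha is the regularization strength.
sgd_clf = SGDClassifier(loss="hinge", alpha=1 / (m * C), random_state=42)

# Several passes over the data, one mini-batch per partial_fit call.
for epoch in range(5):
    for start in range(0, m, batch_size):
        X_batch = scaler.transform(X[start:start + batch_size])
        y_batch = y[start:start + batch_size]
        sgd_clf.partial_fit(X_batch, y_batch, classes=classes)

acc = sgd_clf.score(scaler.transform(X), y)
print(f"training accuracy: {acc:.3f}")
```

Because the model is linear, it cannot fully separate the two moons. The sketch only illustrates the incremental training pattern.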
