# SVM

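Some people consider the SVM the best off-the-shelf classifier. "Off-the-shelf" here means that, applied unmodified to a data set, it already achieves a low error rate.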

In [1]:
from numpy import *

dataPath = '../../../git_mlaction/machinelearninginaction/Ch06/'

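## Separating data with the maximum margin

Pros:
* Low generalization error;
* Computationally inexpensive;
* Results are easy to interpret;

Cons:
* Sensitive to parameter tuning and to the choice of kernel;
* Without modification, the basic classifier only handles binary classification;

Works with: numeric and nominal values;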

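The goal of the SVM algorithm is to find a hyperplane (the decision boundary; for two-dimensional data it is a straight line) that separates the data so that the margin, i.e. the distance from the hyperplane to the closest points (the support vectors), is as large as possible. In other words, this is once again an optimization problem.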
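
To make "margin" concrete, here is a minimal sketch that measures the distance from a few points to a given hyperplane. The hyperplane `w`, `b` and the three points are made up for illustration and are not taken from the data set used below.

```python
import numpy as np

# Hypothetical hyperplane w.x + b = 0 and toy points, chosen only for illustration
w = np.array([1.0, 0.0])
b = -1.0
points = np.array([[0.0, 0.0], [2.0, 0.0], [4.0, 1.0]])

# Distance from each point to the hyperplane: |w.x + b| / ||w||
distances = np.abs(points.dot(w) + b) / np.linalg.norm(w)

# The margin is set by the closest points; those points are the support vectors
margin = distances.min()
support_idx = np.where(np.isclose(distances, margin))[0]

print(distances)
print(margin)
print(support_idx)
```

An SVM searches over `w` and `b` for the hyperplane that makes this smallest distance as large as possible.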

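## Finding the maximum margin

### The optimization problem the classifier solves (the derivation deserves deeper study)

The task is to find an optimal value subject to certain constraints;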
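
For reference, the constrained problem is usually written in the following soft-margin dual form. The notation is an assumption of this note: $\alpha_i$ are the multipliers (the "alphas" used in the code below), $y_i \in \{-1, 1\}$ are the class labels, $x_i$ are the data vectors, $m$ is the number of samples, and $C$ is the constant passed to `smoSimple`.

```latex
\max_{\alpha}\ \sum_{i=1}^{m} \alpha_i
  - \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m}
    \alpha_i \alpha_j \, y_i y_j \, \langle x_i, x_j \rangle
\qquad \text{subject to} \qquad
0 \le \alpha_i \le C, \qquad \sum_{i=1}^{m} \alpha_i y_i = 0
```

The two constraints are the ones the SMO code enforces: every alpha stays in $[0, C]$, and the label-weighted sum of the alphas stays at 0.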

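### General approach to using an SVM

* Collect the data;
* Prepare the data: numeric values are required;
* Analyze the data: visualizing the separating hyperplane helps;
* Train the algorithm: mostly a matter of tuning two parameters;
* Use the algorithm: an SVM can be applied to almost any classification problem; for multiclass problems the code has to be adapted, because the basic SVM is a binary classifier;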
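
The notes do not say how the code would be adapted for multiclass problems. One common option is one-vs-rest: train one binary SVM per class, relabeling that class as +1 and everything else as -1. The sketch below only shows the relabeling step; `one_vs_rest_labels` and the toy labels are hypothetical names and values introduced for illustration.

```python
def one_vs_rest_labels(labels, positive_class):
    """Map multiclass labels to +1.0 (positive_class) or -1.0 (every other class)."""
    return [1.0 if label == positive_class else -1.0 for label in labels]

# Toy multiclass labels (illustrative only)
labels = ['a', 'b', 'c', 'a']

# One binary label vector per class; each could be fed to a binary SVM trainer
binary_problems = {}
for cls in sorted(set(labels)):
    binary_problems[cls] = one_vs_rest_labels(labels, cls)

print(binary_problems['a'])
print(binary_problems['b'])
```

At prediction time, a one-vs-rest scheme would typically pick the class whose binary classifier gives the largest decision value.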

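## Efficient optimization with the SMO algorithm

### Platt's SMO algorithm

Sequential Minimal Optimization (SMO) is used to train SVMs. The basic idea is to break the large optimization problem into many small ones. The small problems are usually easy to solve, and solving them one after another gives the same result as solving the large problem as a whole, only much faster;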

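How SMO works: each iteration picks two alphas to optimize. Once a **suitable** pair is found, both are changed together. The constraint requires the label-weighted sum of the alphas to stay at 0, so any change to one alpha must be compensated by the other: when the two samples share a label, one alpha increases while the other decreases; when the labels differ, both move in the same direction;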
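
The pairwise update can be traced by hand on a tiny problem. The sketch below runs a single SMO step using the same update rule as `smoSimple` later in this notebook, but with plain NumPy arrays. The two points, the labels, and `C = 10.0` are made up for illustration.

```python
import numpy as np

# Toy problem (illustrative, not from the data set): one point per class
X = np.array([[0.0, 0.0], [2.0, 0.0]])
y = np.array([-1.0, 1.0])
C = 10.0
alphas = np.zeros(2)
b = 0.0
i, j = 0, 1

def error(k):
    # Prediction f(x_k) = sum(alpha * y * <x, x_k>) + b, minus the true label
    return float(np.dot(alphas * y, X.dot(X[k])) + b - y[k])

Ei, Ej = error(i), error(j)

# Bounds that keep alpha_j inside [0, C] while the constraint still holds
if y[i] != y[j]:
    L, H = max(0.0, alphas[j] - alphas[i]), min(C, C + alphas[j] - alphas[i])
else:
    L, H = max(0.0, alphas[j] + alphas[i] - C), min(C, alphas[j] + alphas[i])

# eta is the (negative) curvature along the direction of the pair update
eta = 2.0 * X[i].dot(X[j]) - X[i].dot(X[i]) - X[j].dot(X[j])

alpha_j_old = alphas[j]
alphas[j] = min(H, max(L, alphas[j] - y[j] * (Ei - Ej) / eta))  # update, then clip
alphas[i] += y[i] * y[j] * (alpha_j_old - alphas[j])            # compensate with alpha_i

print(alphas)
print(np.dot(alphas, y))
```

Here the labels differ, so both alphas grow by the same amount and the label-weighted sum stays at 0.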

### Solving small data sets with the simplified SMO algorithm

In [2]:
# Load the data: each line holds two feature values followed by a class label
def loadDataSet(fileName=dataPath+'testSet.txt'):
    dataMat, labelMat = [], []
    with open(fileName) as testFile:
        for line in testFile.readlines():
            words = line.strip().split()
            dataMat.append([float(words[0]), float(words[1])])
            labelMat.append(float(words[2]))
    return dataMat, labelMat

In [3]:
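# Pick a random index in a range that differs from a given index
def selectJrand(i, m):
    '''
    i: the index that must not be returned
    m: exclusive upper bound; the result is drawn from 0..m-1
    '''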
    j = i
    while j==i:
        j = random.randint(m)
    return j

selectJrand(3,10)

8

In [4]:
# Clip a value to a lower and an upper bound
def clipAlpha(aj, H, L):
    '''
    aj: the value to clip
    H: upper bound
    L: lower bound
    '''
    return H if aj > H else (L if L > aj else aj)

print(clipAlpha(5,10,1))
print(clipAlpha(0,10,1))
print(clipAlpha(20,10,1))

5
1
10


In [6]:
# Simplified SMO: get it working first, then understand it step by step
def smoSimple(dataMatIn, classLabels, C, toler, maxIter):
    '''
    dataMatIn: input feature data, one sample per row
    classLabels: class labels (-1 or 1)
    C: the constant C that bounds every alpha, tunable
    toler: error tolerance, tunable
    maxIter: number of consecutive passes without any alpha change before stopping, tunable
    
    Note: SVMs are sensitive to their parameters and to the kernel, so these settings strongly affect the result and leave a lot of room for tuning.
    '''
    
    # Convert the inputs to matrices
    dataMatrix = mat(dataMatIn)
    labelMatrix = mat(classLabels).transpose() # labels as a column vector
    m,n = shape(dataMatrix)
    b = 0 # bias (intercept) of the separating hyperplane
    alphaMatrix = mat(zeros((m, 1))) # m x 1 column vector of alphas, initialized to 0
    curIter = 0
    while curIter < maxIter: # outer loop, bounded by maxIter
        alphaPairsChanged = 0 # number of alpha pairs changed in this pass
        for i in range(m): # inner loop over every sample
            fXi = float(multiply(alphaMatrix, labelMatrix).T * (dataMatrix * dataMatrix[i,:].T)) + b # current prediction
            Ei = fXi - float(labelMatrix[i]) # error between prediction and true label (-1 or 1)
            # Next, use Ei to decide how to update alphaMatrix
            # Optimize this sample only if its error exceeds toler and its alpha can still move within [0, C]
            if ((labelMatrix[i]*Ei < -toler) and (alphaMatrix[i] < C)) or ((labelMatrix[i]*Ei > toler) and (alphaMatrix[i] > 0)):
                j = selectJrand(i, m) # pick a second index j != i; alphas are optimized in pairs so that sum(alpha * label) stays 0
                fXj = float(multiply(alphaMatrix, labelMatrix).T * (dataMatrix * dataMatrix[j,:].T)) + b
                Ej = fXj - float(labelMatrix[j]) # error of the randomly chosen sample j
                # Save the old alphas
                alphaIold = alphaMatrix[i].copy()
                alphaJold = alphaMatrix[j].copy()
                # Bounds L and H that keep alpha j within [0, C]
                if labelMatrix[i] != labelMatrix[j]: # the two samples have different labels
                    L = max(0, alphaMatrix[j] - alphaMatrix[i])
                    H = min(C, C + alphaMatrix[j] - alphaMatrix[i])
                else:
                    L = max(0, alphaMatrix[j] + alphaMatrix[i] - C)
                    H = min(C, alphaMatrix[j] + alphaMatrix[i])
                if L == H:
                    print('L==H,Continue')
                    continue
                eta = 2.0 * dataMatrix[i,:] * dataMatrix[j,:].T - dataMatrix[i,:] * dataMatrix[i,:].T - dataMatrix[j,:] * dataMatrix[j,:].T
                if eta >= 0:
                    print('eta>=0,Continue')
                    continue
                # The optimization step itself: update alpha j, then clip it to [L, H]
                alphaMatrix[j] -= labelMatrix[j] * (Ei - Ej) / eta
                alphaMatrix[j] = clipAlpha(alphaMatrix[j], H, L)
                if abs(alphaMatrix[j] - alphaJold) < 0.00001:
                    print('J have not move enough,Continue')
                    continue
                alphaMatrix[i] += labelMatrix[i] * labelMatrix[j] * (alphaJold - alphaMatrix[j]) # change alpha i by the same amount so sum(alpha * label) stays 0 (opposite direction when the labels match)
                b1 = b-Ei-(labelMatrix[i]*(alphaMatrix[i]-alphaIold)*dataMatrix[i,:]*dataMatrix[i,:].T)-(labelMatrix[j]*(alphaMatrix[j]-alphaJold)*dataMatrix[i,:]*dataMatrix[j,:].T)
                b2 = b-Ej-(labelMatrix[i]*(alphaMatrix[i]-alphaIold)*dataMatrix[i,:]*dataMatrix[j,:].T)-(labelMatrix[j]*(alphaMatrix[j]-alphaJold)*dataMatrix[j,:]*dataMatrix[j,:].T)
                b = b1 if (0<alphaMatrix[i]<C) else (b2 if (0<alphaMatrix[j]<C) else (b1+b2)/2.0)
                alphaPairsChanged += 1
                print('CurIter: %d, i: %d, Pairs Changed: %d' % (curIter, i, alphaPairsChanged))
        # Count only passes in which nothing changed; any change resets the counter
        if alphaPairsChanged == 0:
            curIter += 1
        else:
            curIter = 0
        print('CurIter: %d' % curIter)
    return b, alphaMatrix

In [7]:
dataMatIn, classLabels = loadDataSet()
b, alphas = smoSimple(dataMatIn, classLabels, C=0.6, toler=0.001, maxIter=30)
print(b)
print(alphas[alphas > 0])

CurIter: 0, i: 0, Pairs Changed: 1
J have not move enough,Continue
J have not move enough,Continue
J have not move enough,Continue
L==H,Continue
J have not move enough,Continue
L==H,Continue
J have not move enough,Continue
J have not move enough,Continue
L==H,Continue
L==H,Continue
L==H,Continue
CurIter: 0, i: 29, Pairs Changed: 2
L==H,Continue
L==H,Continue
CurIter: 0, i: 51, Pairs Changed: 3
L==H,Continue
CurIter: 0, i: 54, Pairs Changed: 4
CurIter: 0, i: 55, Pairs Changed: 5
L==H,Continue
L==H,Continue
J have not move enough,Continue
J have not move enough,Continue
L==H,Continue
CurIter: 0, i: 95, Pairs Changed: 6
L==H,Continue
L==H,Continue
CurIter: 0
J have not move enough,Continue
J have not move enough,Continue
L==H,Continue
J have not move enough,Continue
L==H,Continue
L==H,Continue
J have not move enough,Continue
J have not move enough,Continue
L==H,Continue
J have not move enough,Continue
J have not move enough,Continue
CurIter: 0, i: 54, Pairs Changed: 1
J have not move enou

J have not move enough,Continue
CurIter: 7
J have not move enough,Continue
J have not move enough,Continue
L==H,Continue
J have not move enough,Continue
J have not move enough,Continue
J have not move enough,Continue
CurIter: 8
J have not move enough,Continue
J have not move enough,Continue
CurIter: 8, i: 29, Pairs Changed: 1
J have not move enough,Continue
J have not move enough,Continue
J have not move enough,Continue
J have not move enough,Continue
CurIter: 0
J have not move enough,Continue
J have not move enough,Continue
J have not move enough,Continue
J have not move enough,Continue
J have not move enough,Continue
J have not move enough,Continue
CurIter: 1
J have not move enough,Continue
J have not move enough,Continue
J have not move enough,Continue
CurIter: 1, i: 54, Pairs Changed: 1
J have not move enough,Continue
J have not move enough,Continue
CurIter: 0
CurIter: 0, i: 8, Pairs Changed: 1
J have not move enough,Continue
J have not move enough,Continue
J have not move enough,C

CurIter: 1
J have not move enough,Continue
J have not move enough,Continue
J have not move enough,Continue
J have not move enough,Continue
J have not move enough,Continue
J have not move enough,Continue
J have not move enough,Continue
CurIter: 2
J have not move enough,Continue
J have not move enough,Continue
J have not move enough,Continue
J have not move enough,Continue
J have not move enough,Continue
J have not move enough,Continue
CurIter: 3
CurIter: 3, i: 10, Pairs Changed: 1
J have not move enough,Continue
J have not move enough,Continue
J have not move enough,Continue
J have not move enough,Continue
J have not move enough,Continue
J have not move enough,Continue
CurIter: 0
L==H,Continue
L==H,Continue
CurIter: 0, i: 17, Pairs Changed: 1
J have not move enough,Continue
J have not move enough,Continue
J have not move enough,Continue
J have not move enough,Continue
J have not move enough,Continue
CurIter: 0
J have not move enough,Continue
J have not move enough,Continue
J have not mo

J have not move enough,Continue
CurIter: 0
J have not move enough,Continue
J have not move enough,Continue
J have not move enough,Continue
CurIter: 1
J have not move enough,Continue
J have not move enough,Continue
J have not move enough,Continue
CurIter: 2
J have not move enough,Continue
J have not move enough,Continue
J have not move enough,Continue
CurIter: 3
J have not move enough,Continue
J have not move enough,Continue
J have not move enough,Continue
CurIter: 4
J have not move enough,Continue
J have not move enough,Continue
J have not move enough,Continue
CurIter: 5
J have not move enough,Continue
J have not move enough,Continue
J have not move enough,Continue
CurIter: 6
J have not move enough,Continue
J have not move enough,Continue
J have not move enough,Continue
CurIter: 7
J have not move enough,Continue
J have not move enough,Continue
J have not move enough,Continue
CurIter: 8
J have not move enough,Continue
J have not move enough,Continue
J have not move enough,Continue
CurIt

J have not move enough,Continue
J have not move enough,Continue
CurIter: 10
J have not move enough,Continue
J have not move enough,Continue
CurIter: 11
J have not move enough,Continue
J have not move enough,Continue
CurIter: 12
J have not move enough,Continue
J have not move enough,Continue
CurIter: 13
J have not move enough,Continue
J have not move enough,Continue
CurIter: 14
J have not move enough,Continue
J have not move enough,Continue
CurIter: 15
J have not move enough,Continue
J have not move enough,Continue
CurIter: 16
J have not move enough,Continue
CurIter: 16, i: 54, Pairs Changed: 1
J have not move enough,Continue
CurIter: 0
J have not move enough,Continue
J have not move enough,Continue
CurIter: 1
J have not move enough,Continue
J have not move enough,Continue
CurIter: 2
J have not move enough,Continue
J have not move enough,Continue
CurIter: 3
J have not move enough,Continue
J have not move enough,Continue
CurIter: 4
J have not move enough,Continue
J have not move enough,C

In [9]:
for i in range(len(alphas)):
    if alphas[i] > 0:
        print('Support Vector: %s, Class: %f' % (dataMatIn[i], classLabels[i]))

Support Vector: [4.658191, 3.507396], Class: -1.000000
Support Vector: [3.457096, -0.082216], Class: -1.000000
Support Vector: [5.286862, -2.358286], Class: 1.000000
Support Vector: [6.080573, 0.418886], Class: 1.000000
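
Once the alphas and `b` are known, the hyperplane follows from the support vectors alone, since samples with alpha equal to 0 contribute nothing. The sketch below recovers `w` and classifies new points. `calc_w` and `classify` are hypothetical helper names, and the alphas and `b` are the hand-derived solution of a two-point toy problem, not output of the run above.

```python
import numpy as np

def calc_w(alphas, X, y):
    # w = sum_i alpha_i * y_i * x_i; rows with alpha_i == 0 drop out
    return (alphas * y).dot(X)

def classify(x, w, b):
    # Sign of the decision function w.x + b
    return 1.0 if np.dot(w, x) + b > 0 else -1.0

# Toy problem (illustrative): one point per class, solved by hand
X = np.array([[0.0, 0.0], [2.0, 0.0]])
y = np.array([-1.0, 1.0])
alphas = np.array([0.5, 0.5])
b = -1.0

w = calc_w(alphas, X, y)
print(w)
print(classify(np.array([3.0, 0.0]), w, b))
print(classify(np.array([-1.0, 0.0]), w, b))
```

To apply the same idea to the values returned by `smoSimple`, the matrices would first need flattening to 1-D arrays, for example with `np.asarray(alphas).ravel()`.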


## Speeding up optimization with the full Platt SMO algorithm