# Julia Machine Learning: Support Vector Machines with LIBSVM

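This example uses the LIBSVM and StatsBase packages; install them before running the cells below. The cells also load RDatasets for the iris data, so install it the same way if it is missing.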

```
] add LIBSVM
] add StatsBase
```

In [3]:
using Pkg
Pkg.add("LIBSVM")

[32m[1m  Resolving[22m[39m package versions...
[32m[1m  Installed[22m[39m OpenBLAS_jll ──── v0.3.9+3
[32m[1m  Installed[22m[39m ScikitLearnBase ─ v0.5.0
[32m[1m  Installed[22m[39m LIBSVM ────────── v0.4.0
[32m[1m  Installed[22m[39m LIBLINEAR ─────── v0.5.1
[32m[1m   Updating[22m[39m `C:\Users\kai\.julia\environments\v1.4\Project.toml`
 [90m [b1bec4e5][39m[92m + LIBSVM v0.4.0[39m
[32m[1m   Updating[22m[39m `C:\Users\kai\.julia\environments\v1.4\Manifest.toml`
 [90m [2d691ee1][39m[92m + LIBLINEAR v0.5.1[39m
 [90m [b1bec4e5][39m[92m + LIBSVM v0.4.0[39m
 [90m [4536629a][39m[93m ↑ OpenBLAS_jll v0.3.9+2 ⇒ v0.3.9+3[39m
 [90m [6e75b9c4][39m[92m + ScikitLearnBase v0.5.0[39m
[32m[1m   Building[22m[39m LIBLINEAR → `C:\Users\kai\.julia\packages\LIBLINEAR\yTdp5\deps\build.log`
[32m[1m   Building[22m[39m LIBSVM ───→ `C:\Users\kai\.julia\packages\LIBSVM\5Z99T\deps\build.log`
base64 binary data: 4pSMIEVycm9yOiBFcnJvciBidWlsZGluZyBgTElCU1ZNYDogCuK

In [4]:
using LIBSVM, RDatasets, StatsBase

┌ Info: Precompiling LIBSVM [b1bec4e5-fd48-53fe-b0cb-9723c09d164b]
└ @ Base loading.jl:1260


## Loading the data

In [5]:
iris = dataset("datasets", "iris")
first(iris, 6)

| Row | SepalLength | SepalWidth | PetalLength | PetalWidth | Species |
|---|---|---|---|---|---|
|  | Float64 | Float64 | Float64 | Float64 | Categorical… |
| 1 | 5.1 | 3.5 | 1.4 | 0.2 | setosa |
| 2 | 4.9 | 3.0 | 1.4 | 0.2 | setosa |
| 3 | 4.7 | 3.2 | 1.3 | 0.2 | setosa |
| 4 | 4.6 | 3.1 | 1.5 | 0.2 | setosa |
| 5 | 5.0 | 3.6 | 1.4 | 0.2 | setosa |
| 6 | 5.4 | 3.9 | 1.7 | 0.4 | setosa |


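## Preprocessing

Split the table into a feature matrix `X` (one row per flower, one column per measurement) and a label vector `y`.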

In [6]:
X = Matrix(iris[!, 1:4])
y = Vector{String}(iris[!, :Species]);

## Support vector machine model

In [9]:
model = LIBSVM.fit!(SVC(), X, y)

SVC(LIBSVM.Kernel.RadialBasis, 0.25, nothing, 1.0, 3, 0.0, 0.001, true, false, false, LIBSVM.SVM{String}(SVC, LIBSVM.Kernel.RadialBasis, nothing, 4, 3, ["setosa", "versicolor", "virginica"], Int32[1, 2, 3], Float64[], Int32[], LIBSVM.SupportVectors{String,Float64}(45, Int32[7, 19, 19], ["setosa", "setosa", "setosa", "setosa", "setosa", "setosa", "setosa", "versicolor", "versicolor", "versicolor"  …  "virginica", "virginica", "virginica", "virginica", "virginica", "virginica", "virginica", "virginica", "virginica", "virginica"], [4.3 5.7 … 6.5 5.9; 3.0 4.4 … 3.0 3.0; 1.1 1.5 … 5.2 5.1; 0.1 0.4 … 2.0 1.8], Int32[14, 16, 19, 24, 25, 42, 45, 51, 53, 55  …  130, 132, 134, 135, 139, 142, 143, 147, 148, 150], LIBSVM.SVMNode[LIBSVM.SVMNode(1187983392, 8.4879831653e-314), LIBSVM.SVMNode(398020776, 0.0), LIBSVM.SVMNode(0, 0.0), LIBSVM.SVMNode(1, 5.1), LIBSVM.SVMNode(1, 4.8), LIBSVM.SVMNode(1, 4.5), LIBSVM.SVMNode(1, 5.1), LIBSVM.SVMNode(1, 7.0), LIBSVM.SVMNode(0, 0.0), LIBSVM.SVMNode(1187986262,
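
The printed model shows a `RadialBasis` kernel, and the `0.25` right after it looks like the kernel width `gamma` (LIBSVM's usual default of 1 / number of features, here 1/4; that reading of the struct dump is an assumption). As a rough sketch of what such a kernel computes, the snippet below evaluates the standard RBF formula k(x, z) = exp(-gamma * ||x - z||^2) on the first two iris rows shown earlier. `rbf_kernel` is a hypothetical helper written for illustration, not a LIBSVM.jl function.

```julia
# Hypothetical helper: the standard RBF (Gaussian) kernel,
# k(x, z) = exp(-gamma * ||x - z||^2)
rbf_kernel(x, z; gamma) = exp(-gamma * sum(abs2, x .- z))

# First two rows of the iris table shown earlier.
x1 = [5.1, 3.5, 1.4, 0.2]
x2 = [4.9, 3.0, 1.4, 0.2]

# Assumption: gamma = 1 / number of features = 0.25.
gamma = 1 / length(x1)

k_same = rbf_kernel(x1, x1; gamma = gamma)  # identical points give exactly 1.0
k_near = rbf_kernel(x1, x2; gamma = gamma)  # nearby points give a value just below 1

println("k(x1, x1) = ", k_same)
println("k(x1, x2) = ", k_near)
```

The kernel value shrinks toward 0 as two flowers move apart in feature space; a larger `gamma` makes that decay faster, so each support vector influences a smaller neighbourhood.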

## Prediction

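In [10]:
# The ScikitLearn-style predict returns only the vector of predicted labels
predicted_labels = LIBSVM.predict(model, X)

150-element Array{String,1}:
 "setosa"
 "setosa"
 "setosa"
 "setosa"
 "setosa"
 "setosa"
 "setosa"
 "setosa"
 "setosa"
 "setosa"
 "setosa"
 "setosa"
 "setosa"
 ⋮
 "virginica"
 "virginica"
 "virginica"
 "virginica"
 "virginica"
 "virginica"
 "virginica"
 "virginica"
 "virginica"
 "virginica"
 "virginica"
 "virginica"

## Evaluating the model

In [11]:
accuracy(predicted, actual) = mean(predicted .== actual) * 100

accuracy (generic function with 1 method)

In [12]:
accuracy(predicted_labels, y)

The model is scored here on the same 150 rows it was trained on, so the result is training accuracy rather than an estimate of how well the model generalizes. Do not destructure the result of `LIBSVM.predict` as `predicted_labels, decision_values = ...`: that binds `predicted_labels` to the first element only (the string "setosa"), and the comparison against `y` then matches just the 50 setosa rows, giving 33.33333333333333.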
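
Overall accuracy hides which classes get confused with each other. The sketch below, using made-up toy labels rather than the iris predictions, pairs a two-argument accuracy function with a tally of (actual, predicted) label pairs. Both `accuracy_pct` and `confusion_counts` are hypothetical helpers written for this illustration, not part of LIBSVM.jl or StatsBase.

```julia
using Statistics  # mean

# Hypothetical helper: percentage of positions where the two label vectors agree.
accuracy_pct(predicted, actual) = mean(predicted .== actual) * 100

# Hypothetical helper: count how often each (actual, predicted) pair occurs.
function confusion_counts(predicted, actual)
    counts = Dict{Tuple{String,String},Int}()
    for (a, p) in zip(actual, predicted)
        counts[(a, p)] = get(counts, (a, p), 0) + 1
    end
    return counts
end

# Made-up toy labels, for illustration only.
actual    = ["setosa", "setosa", "versicolor", "versicolor", "virginica", "virginica"]
predicted = ["setosa", "setosa", "versicolor", "virginica",  "virginica", "virginica"]

acc = accuracy_pct(predicted, actual)
counts = confusion_counts(predicted, actual)

println("accuracy: ", acc, "%")
for ((a, p), n) in sort(collect(counts))
    println(a, " predicted as ", p, ": ", n)
end
```

Passing the vectors as arguments, instead of reading globals, keeps the function reusable for both the training and the held-out predictions.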

## LIBSVM API
LIBSVM.jl also exposes a lower-level API that closely follows the LIBSVM C interface. See `?svmtrain` for the available options.

In [1]:
using RDatasets, LIBSVM, Printf, StatsBase

# Load Fisher's classic iris data
iris = dataset("datasets", "iris")

# LIBSVM handles multi-class data automatically using a one-against-one strategy
labels = convert(Vector{String}, iris[!, :Species])

# First dimension of input data is features; second is instances
instances = Matrix(iris[!, 1:4])'

# Train SVM on half of the data using default parameters. See documentation
# of svmtrain for options
model = svmtrain(instances[:, 1:2:end], labels[1:2:end]);

# Test model on the other half of the data.
(predicted_labels, decision_values) = svmpredict(model, instances[:, 2:2:end]);

# Compute accuracy
@printf "Accuracy: %.2f%%\n" mean((predicted_labels .== labels[2:2:end]))*100

Accuracy: 93.33%
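
Two conventions in the cell above are easy to trip over: `svmtrain` expects features along the first dimension (one column per instance), and `1:2:end` / `2:2:end` pick the odd- and even-numbered instances. A minimal sketch with a made-up 6×2 table, involving no LIBSVM calls, shows the transpose and the split:

```julia
# Made-up data: 6 instances (rows) with 2 features (columns),
# the usual "one row per observation" table layout.
table = [1.0 10.0;
         2.0 20.0;
         3.0 30.0;
         4.0 40.0;
         5.0 50.0;
         6.0 60.0]

# Transpose so that features run along the first dimension and each
# column is one instance. The cell above uses ', which yields the same
# values as a lazy wrapper; permutedims returns a plain Matrix.
instances = permutedims(table)   # 2×6

train = instances[:, 1:2:end]    # instances 1, 3, 5
test  = instances[:, 2:2:end]    # instances 2, 4, 6

println(size(instances))
println(train[1, :])
println(test[1, :])
```

Because the iris rows are ordered by species, an interleaved split like this keeps all three classes in both halves, which a simple first-half / second-half split would not.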


In [2]:
using LIBSVM, Printf, StatsBase, CSV, DataFrames

# Load Fisher's classic iris data
iris = DataFrame(CSV.File(joinpath(dirname(pathof(DataFrames)), "../docs/src/assets/iris.csv")));

# LIBSVM handles multi-class data automatically using a one-against-one strategy
labels = convert(Vector, iris[!, :Species])

# First dimension of input data is features; second is instances
instances = Matrix(iris[!, 1:4])'

# Train SVM on half of the data using default parameters. See documentation
# of svmtrain for options
model = svmtrain(instances[:, 1:2:end], labels[1:2:end]);

# Test model on the other half of the data.
(predicted_labels, decision_values) = svmpredict(model, instances[:, 2:2:end]);

# Compute accuracy
@printf "Accuracy: %.2f%%\n" mean((predicted_labels .== labels[2:2:end]))*100

Accuracy: 93.33%
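
The comment about a one-against-one strategy means that, for k classes, k(k-1)/2 binary classifiers are trained (one per pair of classes) and each casts a vote at prediction time. The sketch below covers only the voting step, with made-up pairwise winners standing in for real SVM decisions. `one_vs_one_vote` is a hypothetical helper, and the tie-breaking rule is an assumption of this sketch that may differ from LIBSVM's own.

```julia
# Hypothetical helper: majority vote over pairwise winners.
# `winners` holds the class chosen by each pairwise classifier.
function one_vs_one_vote(classes, winners)
    votes = Dict(c => 0 for c in classes)
    for w in winners
        votes[w] += 1
    end
    # Pick the class with the most votes; ties go to the class listed
    # first (an assumption made for this sketch).
    best = classes[1]
    for c in classes
        if votes[c] > votes[best]
            best = c
        end
    end
    return best, votes
end

classes = ["setosa", "versicolor", "virginica"]

# All unordered pairs of classes: k(k-1)/2 = 3 pairwise classifiers.
pairs_of_classes = [(classes[i], classes[j])
                    for i in 1:length(classes) for j in (i + 1):length(classes)]

# Made-up outcome of each pairwise classifier for one sample.
winners = ["versicolor", "virginica", "versicolor"]

label, votes = one_vs_one_vote(classes, winners)
println(length(pairs_of_classes), " pairwise classifiers")
println("predicted: ", label)
```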


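The two loaders give the `Species` column different element types: RDatasets returns a categorical column, while CSV.jl reads plain strings. This is presumably why the RDatasets cells convert the labels to `Vector{String}` before training.

In [3]: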
iris = dataset("datasets", "iris")
eltype.(eachcol(iris))

5-element Array{DataType,1}:
 Float64
 Float64
 Float64
 Float64
 CategoricalString{UInt8}

In [4]:
iris = DataFrame(CSV.File(joinpath(dirname(pathof(DataFrames)), "../docs/src/assets/iris.csv")));
eltype.(eachcol(iris))

5-element Array{DataType,1}:
 Float64
 Float64
 Float64
 Float64
 String

## ScikitLearn API
Alternatively, you can use the ScikitLearn.jl-style API, which accepts the same options as `svmtrain`:

In [36]:
using LIBSVM
using RDatasets: dataset

# Classification with C-SVM
iris = dataset("datasets", "iris")
labels = convert(Vector{String}, iris[!, :Species])
# This API expects one row per instance, so no transpose is needed here
instances = Matrix(iris[!, 1:4])
model = LIBSVM.fit!(SVC(), instances[1:2:end, :], labels[1:2:end])
yp = LIBSVM.predict(model, instances[2:2:end, :])

# Epsilon-SVR regression: predict Temp from Gas
# `dataset` was imported by name above, so call it unqualified
whiteside = dataset("MASS", "whiteside")
X = Array(whiteside[!, :Gas])
# fit! expects a feature matrix, so reshape the vector into an n×1 matrix
if X isa AbstractVector
    X = reshape(X, (length(X), 1))
end
y = Array(whiteside[!, :Temp])
svrmod = LIBSVM.fit!(EpsilonSVR(cost = 10., gamma = 1.), X, y)
yp = LIBSVM.predict(svrmod, X)

56-element Array{Float64,1}:
 -0.7000113490808335
 -0.7996246408701291
  0.6186400419362288
  2.3999767521176216
  3.14483985114843
  3.14483985114843
  3.700283339414352
  5.0855501735619
  3.14483985114843
  4.399762328002755
  4.826573424201148
  4.826573424201148
  5.264403656617971
  ⋮
  4.7246204145302455
  5.000278543993352
  4.724479924584878
  4.7246204145302455
  7.399094305453358
  6.399828505025188
  7.399094305453358
  8.376502151680235
  7.899681575131432
  7.399094305453358
  8.900109144603398
  9.599863791319233
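
To judge the regression fit, the predictions `yp` can be compared with the observed `y` using an error metric. A minimal sketch of root-mean-square error and mean absolute error follows; the numbers are made up for illustration rather than taken from the `whiteside` data, and `rmse` and `mae` are hypothetical helpers, not LIBSVM.jl functions.

```julia
using Statistics  # mean

# Hypothetical helpers for two common regression error metrics.
rmse(predicted, actual) = sqrt(mean(abs2, predicted .- actual))
mae(predicted, actual) = mean(abs, predicted .- actual)

# Made-up values: three exact predictions and one that is off by 2.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 6.0]

println("RMSE: ", rmse(y_pred, y_true))
println("MAE:  ", mae(y_pred, y_true))
```

As with the classifier, scoring `yp` against the same `y` used for fitting measures training error only; a held-out split gives a fairer picture.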

## References
- Marathon example notebook
- [GitHub: LIBSVM.jl](https://github.com/mpastell/LIBSVM.jl)