# Breast Cancer Dataset Prediction with SVM Classification
>Uses the SVC algorithm from the scikit-learn machine learning package

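* (1) Import the libraries and the built-in dataset<br>
The following libraries are imported:<br>
sklearn.datasets: loads built-in datasets such as the breast cancer dataset, `datasets.load_breast_cancer()`<br>
sklearn.svm: provides `SVC`, the support vector machine classifier<br>
sklearn.model_selection: provides `train_test_split` for splitting the data into training and test sets<br>
sklearn.metrics: provides `accuracy_score` for evaluating predictions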

In [1]:
from sklearn import svm
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

## Step 1. Load the data

In [35]:
# I switched to a different dataset; the breast cancer dataset did not work

digits=datasets.load_digits()
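
The notebook does not say why the breast cancer dataset failed. One possible cause (an assumption, not confirmed by the source) is that its 30 features have very different scales, which can make an unscaled polynomial kernel extremely slow to train. The sketch below shows one way the original dataset could be used: standardize the features first, then fit an `SVC`. The variable names (`Xc`, `cancer_clf`, and so on), the `rbf` kernel, `C=1.0`, and `random_state=0` are illustrative choices, not the author's settings.

```python
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Load the built-in breast cancer dataset (binary labels).
cancer = datasets.load_breast_cancer()
Xc, yc = cancer.data, cancer.target

# random_state=0 is an arbitrary choice that makes the split repeatable.
Xc_train, Xc_test, yc_train, yc_test = train_test_split(
    Xc, yc, test_size=0.3, random_state=0, stratify=yc
)

# Assumption: scale every feature to zero mean and unit variance before the SVM.
cancer_clf = make_pipeline(
    StandardScaler(),
    svm.SVC(kernel="rbf", gamma="scale", C=1.0),
)
cancer_clf.fit(Xc_train, yc_train)

cancer_test_acc = cancer_clf.score(Xc_test, yc_test)
print(cancer_test_acc)
```

Putting the scaler inside a pipeline means it is fitted on the training split only, so no information from the test split leaks into the preprocessing.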

In [36]:
digits

{'data': array([[ 0.,  0.,  5., ...,  0.,  0.,  0.],
        [ 0.,  0.,  0., ..., 10.,  0.,  0.],
        [ 0.,  0.,  0., ..., 16.,  9.,  0.],
        ...,
        [ 0.,  0.,  1., ...,  6.,  0.,  0.],
        [ 0.,  0.,  2., ..., 12.,  0.,  0.],
        [ 0.,  0., 10., ..., 12.,  1.,  0.]]),
 'target': array([0, 1, 2, ..., 8, 9, 8]),
 'frame': None,
 'feature_names': ['pixel_0_0',
  'pixel_0_1',
  'pixel_0_2',
  'pixel_0_3',
  'pixel_0_4',
  'pixel_0_5',
  'pixel_0_6',
  'pixel_0_7',
  'pixel_1_0',
  'pixel_1_1',
  'pixel_1_2',
  'pixel_1_3',
  'pixel_1_4',
  'pixel_1_5',
  'pixel_1_6',
  'pixel_1_7',
  'pixel_2_0',
  'pixel_2_1',
  'pixel_2_2',
  'pixel_2_3',
  'pixel_2_4',
  'pixel_2_5',
  'pixel_2_6',
  'pixel_2_7',
  'pixel_3_0',
  'pixel_3_1',
  'pixel_3_2',
  'pixel_3_3',
  'pixel_3_4',
  'pixel_3_5',
  'pixel_3_6',
  'pixel_3_7',
  'pixel_4_0',
  'pixel_4_1',
  'pixel_4_2',
  'pixel_4_3',
  'pixel_4_4',
  'pixel_4_5',
  'pixel_4_6',
  'pixel_4_7',
  'pixel_5_0',
  'pixel_5_1',
 

In [37]:
X=digits.data
y=digits.target

In [38]:
X

array([[ 0.,  0.,  5., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ..., 10.,  0.,  0.],
       [ 0.,  0.,  0., ..., 16.,  9.,  0.],
       ...,
       [ 0.,  0.,  1., ...,  6.,  0.,  0.],
       [ 0.,  0.,  2., ..., 12.,  0.,  0.],
       [ 0.,  0., 10., ..., 12.,  1.,  0.]])
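
To make the raw array above easier to interpret, the following sketch inspects the structure of the digits dataset. Each row of `data` is an 8x8 grayscale image flattened into 64 pixel values, and `images` holds the same pixels in their original layout. The names `first_image`, `same`, and `labels` are introduced here for illustration.

```python
from sklearn import datasets

digits = datasets.load_digits()

# data: one row per sample, 64 pixel columns (a flattened 8x8 image).
print(digits.data.shape)    # (1797, 64)
# images: the same pixels kept in their 8x8 layout.
print(digits.images.shape)  # (1797, 8, 8)

# Reshaping a row of data recovers the corresponding image.
first_image = digits.data[0].reshape(8, 8)
same = bool((first_image == digits.images[0]).all())
print(same)

# target: the digit label for each sample.
labels = sorted(set(digits.target.tolist()))
print(labels)
```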

## Step 2. Split into training and test sets

In [39]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3)


In [40]:
X_train
X_train.shape

(1257, 64)

In [41]:
X_test
X_test.shape

(540, 64)
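
The split in this step is random, so the scores reported later will vary slightly from run to run. A minimal sketch of a repeatable variant follows. `random_state=42` and `stratify=y` are additions for illustration and are not part of the original notebook. Stratifying keeps the proportion of each digit roughly equal in both splits.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split

digits = datasets.load_digits()
X, y = digits.data, digits.target

# Fixing random_state makes the split identical on every run;
# stratify=y preserves the class proportions in both subsets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)
```

The shapes match the ones shown above, because `test_size=0.3` still assigns 540 of the 1797 samples to the test set.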

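## Step 3. Build the model
```
C = penalty parameter for misclassified samples; a larger C means weaker regularization
gamma = kernel coefficient for the "rbf", "poly", and "sigmoid" kernels; it controls how far the influence of a single training sample reaches, which in turn affects the number of support vectors and the training and prediction speed
kernel = selects the kernel function ("linear", "poly", "rbf", "sigmoid")
```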

In [42]:
clf = svm.SVC(kernel="poly", gamma="auto", C=100)
clf.fit(X_train, y_train)

SVC(C=100, gamma='auto', kernel='poly')
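
As a small illustration of the `gamma` parameter, the sketch below checks that `gamma="auto"` is equivalent to setting `gamma` to `1 / n_features`, which is 1/64 for the digits data. The names `clf_auto` and `clf_explicit` and the value `random_state=0` are assumptions made for this example.

```python
import numpy as np
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split

digits = datasets.load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.3, random_state=0
)

n_features = X_train.shape[1]  # 64 pixels per sample

# Same model as in the notebook, using gamma="auto".
clf_auto = svm.SVC(kernel="poly", gamma="auto", C=100).fit(X_train, y_train)
# The same model with gamma written out explicitly as 1 / n_features.
clf_explicit = svm.SVC(kernel="poly", gamma=1.0 / n_features, C=100).fit(X_train, y_train)

# Both models should produce identical predictions.
same_predictions = np.array_equal(
    clf_auto.predict(X_test), clf_explicit.predict(X_test)
)
print(same_predictions)
```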

## Step 4. Predict



In [43]:
clf.predict(X_test)

array([7, 5, 0, 1, 0, 7, 9, 0, 5, 3, 2, 4, 0, 5, 5, 1, 6, 4, 1, 7, 0, 7,
       0, 0, 7, 8, 4, 2, 5, 6, 5, 1, 6, 0, 3, 0, 5, 0, 2, 0, 6, 0, 9, 9,
       6, 8, 7, 0, 0, 2, 7, 8, 3, 3, 9, 4, 8, 0, 8, 2, 0, 2, 2, 9, 2, 7,
       2, 3, 6, 4, 6, 9, 9, 0, 1, 1, 5, 8, 3, 6, 9, 9, 4, 0, 2, 5, 6, 8,
       6, 4, 2, 5, 1, 6, 1, 5, 0, 8, 3, 9, 8, 9, 7, 1, 0, 1, 8, 2, 9, 8,
       4, 5, 6, 7, 6, 1, 8, 6, 9, 0, 7, 4, 3, 0, 9, 6, 2, 1, 0, 8, 9, 5,
       4, 0, 1, 6, 9, 3, 1, 9, 3, 0, 0, 5, 1, 0, 2, 8, 4, 8, 2, 7, 0, 8,
       4, 5, 1, 9, 2, 9, 5, 8, 0, 5, 7, 6, 5, 7, 2, 4, 3, 0, 8, 2, 7, 3,
       8, 5, 7, 8, 8, 9, 3, 1, 5, 7, 3, 7, 5, 1, 9, 4, 3, 7, 9, 3, 3, 7,
       2, 4, 7, 4, 1, 8, 7, 3, 3, 3, 6, 6, 5, 0, 2, 0, 5, 3, 5, 8, 4, 3,
       3, 7, 6, 3, 4, 5, 3, 1, 9, 5, 0, 7, 5, 4, 8, 2, 7, 1, 7, 4, 4, 7,
       6, 1, 9, 1, 2, 6, 7, 5, 1, 8, 4, 5, 4, 6, 1, 5, 5, 7, 4, 1, 0, 0,
       7, 4, 6, 0, 3, 1, 8, 9, 7, 2, 2, 7, 8, 7, 7, 5, 4, 9, 1, 9, 3, 6,
       0, 3, 8, 9, 9, 7, 8, 8, 7, 6, 7, 2, 1, 7, 8,
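
The raw prediction array is hard to read on its own. One way to make it more useful is to compare it with the true labels and list the samples that were misclassified. The sketch below refits the same model on a fresh split (`random_state=0` is an assumption, so the exact samples differ from the run above). `y_pred` and `wrong` are names introduced here.

```python
import numpy as np
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split

digits = datasets.load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.3, random_state=0
)

clf = svm.SVC(kernel="poly", gamma="auto", C=100).fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Indices of test samples where the prediction differs from the true label.
wrong = np.flatnonzero(y_pred != y_test)
print(len(wrong), "of", len(y_test), "test samples misclassified")

# Show up to five misclassified samples.
for i in wrong[:5]:
    print(f"sample {i}: predicted {y_pred[i]}, actual {y_test[i]}")
```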

## Step 5. Accuracy analysis

In [45]:
print(clf.score(X_train, y_train))
print(clf.score(X_test, y_test))

1.0
0.9907407407407407
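
`clf.score` returns the mean accuracy, which is the same quantity computed by the `accuracy_score` function imported in the first cell. The sketch below shows that equivalence and adds a confusion matrix to see which digits are confused with each other. It refits the model on its own split (`random_state=0` is an assumption), so the numbers will not exactly match the output above.

```python
from sklearn import datasets, svm
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

digits = datasets.load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.3, random_state=0
)

clf = svm.SVC(kernel="poly", gamma="auto", C=100).fit(X_train, y_train)
y_pred = clf.predict(X_test)

# accuracy_score compares true and predicted labels directly.
test_acc = accuracy_score(y_test, y_pred)
print(test_acc)
print(clf.score(X_test, y_test))  # same value as test_acc

# Rows are true digits, columns are predicted digits;
# off-diagonal entries count the mistakes.
cm = confusion_matrix(y_test, y_pred)
print(cm)
```

A training score of 1.0 next to a slightly lower test score is expected here. The test score is the one that estimates performance on unseen data.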
