# Support Vector Machine (Classification) Hello World

An example application of Support Vector Machine classification (SVM/SVC), using the breast cancer dataset bundled with scikit-learn.

![image.png](attachment:image.png)

**Reference:**
- https://www.datacamp.com/community/tutorials/svm-classification-scikit-learn-python
- https://scikit-learn.org/stable/modules/svm.html#classification


In [1]:
# Dataset

# Load dataset from scikit-learn dataset library
from sklearn import datasets
cancer = datasets.load_breast_cancer()


In [2]:
# print the names of the 30 features
print("Features: ", cancer.feature_names)

# print the class labels ('malignant', 'benign')
print("Labels: ", cancer.target_names)


# print the feature values of the first 5 samples
print(cancer.data[0:5])

# print the first 5 cancer labels / result (0:malignant, 1:benign)
print(cancer.target[0:5])


Features:  ['mean radius' 'mean texture' 'mean perimeter' 'mean area'
 'mean smoothness' 'mean compactness' 'mean concavity'
 'mean concave points' 'mean symmetry' 'mean fractal dimension'
 'radius error' 'texture error' 'perimeter error' 'area error'
 'smoothness error' 'compactness error' 'concavity error'
 'concave points error' 'symmetry error' 'fractal dimension error'
 'worst radius' 'worst texture' 'worst perimeter' 'worst area'
 'worst smoothness' 'worst compactness' 'worst concavity'
 'worst concave points' 'worst symmetry' 'worst fractal dimension']
Labels:  ['malignant' 'benign']
[[1.799e+01 1.038e+01 1.228e+02 1.001e+03 1.184e-01 2.776e-01 3.001e-01
  1.471e-01 2.419e-01 7.871e-02 1.095e+00 9.053e-01 8.589e+00 1.534e+02
  6.399e-03 4.904e-02 5.373e-02 1.587e-02 3.003e-02 6.193e-03 2.538e+01
  1.733e+01 1.846e+02 2.019e+03 1.622e-01 6.656e-01 7.119e-01 2.654e-01
  4.601e-01 1.189e-01]
 [2.057e+01 1.777e+01 1.329e+02 1.326e+03 8.474e-02 7.864e-02 8.690e-02
  7.017e-02 1.812e-
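
The printed arrays are hard to read at a glance, so a quick sanity check of the dataset's size and class balance can help. The following is a minimal sketch; the variable names `n_samples`, `n_features`, and `class_counts` are introduced here for illustration and are not part of the original notebook.

```python
import numpy as np
from sklearn import datasets

cancer = datasets.load_breast_cancer()

# Rows are samples, columns are features
n_samples, n_features = cancer.data.shape

# Count how many samples fall into each class (0: malignant, 1: benign)
class_counts = {
    str(name): int(count)
    for name, count in zip(cancer.target_names, np.bincount(cancer.target))
}

print("Samples:", n_samples)
print("Features:", n_features)
print("Class counts:", class_counts)
```

The classes are not evenly balanced, which is worth keeping in mind when reading accuracy figures later on.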

In [3]:
# Split dataset into two sets: training set & testing set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = \
    train_test_split(cancer.data, cancer.target, test_size=0.3) # 70% training and 30% test
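
Because `train_test_split` shuffles the data randomly, each run of the cell above produces a different split and therefore slightly different scores. One possible way to make the split repeatable is sketched below; the choices `random_state=42` and `stratify=cancer.target` are assumptions added for illustration and were not used to produce the results shown in this notebook.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split

cancer = datasets.load_breast_cancer()

# random_state fixes the shuffle so the same split is produced on every run;
# stratify keeps the malignant/benign ratio similar in both subsets
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data,
    cancer.target,
    test_size=0.3,
    random_state=42,
    stratify=cancer.target,
)

print("Training samples:", X_train.shape[0])
print("Test samples:", X_test.shape[0])
```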


In [4]:
# Import SVM model
from sklearn import svm

# Create an SVM classifier
clf = svm.SVC(kernel='linear') # can be linear or non-linear (polynomial, rbf, sigmoid)

# Train the model using the training sets
clf.fit(X_train, y_train)

# Predict the response for test dataset
y_pred = clf.predict(X_test)
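
The `kernel` argument noted above accepts `'linear'`, `'poly'`, `'rbf'`, and `'sigmoid'`. The sketch below is one way to compare them; it is not part of the original notebook. It assumes two additions: features are standardized with `StandardScaler` (SVMs are sensitive to feature scale), and scores come from 5-fold cross-validation rather than a single split. The name `kernel_scores` is hypothetical.

```python
from sklearn import datasets
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

cancer = datasets.load_breast_cancer()

kernel_scores = {}
for kernel in ("linear", "poly", "rbf", "sigmoid"):
    # Scaling is fitted inside each fold, so no test-fold statistics leak in
    model = make_pipeline(StandardScaler(), SVC(kernel=kernel))
    scores = cross_val_score(model, cancer.data, cancer.target, cv=5)
    kernel_scores[kernel] = scores.mean()
    print(f"{kernel:>8}: mean accuracy = {scores.mean():.3f}")
```

Which kernel scores best depends on the data and on hyperparameters such as `C` and `gamma`, which are left at their defaults here.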


In [5]:
# Evaluate the predictions with the scikit-learn metrics module
from sklearn import metrics
print("Accuracy: ", metrics.accuracy_score(y_test, y_pred))
print("Precision: ", metrics.precision_score(y_test, y_pred))
print("Recall:", metrics.recall_score(y_test, y_pred))
print("F1 score: ", metrics.f1_score(y_test, y_pred))


Accuracy:  0.9590643274853801
Precision:  0.9734513274336283
Recall: 0.9649122807017544
F1 score:  0.9691629955947135
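
Note that `precision_score`, `recall_score`, and `f1_score` treat label 1 as the positive class by default, which in this dataset is *benign*. If malignant cases are the ones of interest, the scores can be computed with `pos_label=0` instead. The sketch below illustrates this together with a confusion matrix; it retrains the model on its own split, and `random_state=0` and the name `recall_malignant` are assumptions for illustration, so the numbers will not match the output above.

```python
from sklearn import datasets, metrics, svm
from sklearn.model_selection import train_test_split

cancer = datasets.load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, test_size=0.3, random_state=0
)

clf = svm.SVC(kernel="linear").fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Rows are true classes, columns are predicted classes (0: malignant, 1: benign)
cm = metrics.confusion_matrix(y_test, y_pred, labels=[0, 1])
print(cm)

# Recall for the malignant class: share of malignant samples that were detected
recall_malignant = metrics.recall_score(y_test, y_pred, pos_label=0)
print("Recall (malignant):", recall_malignant)

# Per-class precision, recall, and F1 in one table
print(metrics.classification_report(y_test, y_pred, target_names=cancer.target_names))
```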
