In [1]:
library(e1071)    # svm() and tune()
library(ggplot2)
library(class)
library(caret)    # createDataPartition()

"package 'e1071' was built under R version 3.6.3"
Registered S3 methods overwritten by 'ggplot2':
  method         from 
  [.quosures     rlang
  c.quosures     rlang
  print.quosures rlang
"package 'caret' was built under R version 3.6.3"
Loading required package: lattice


In [2]:
data(iris)

In [3]:
head(iris)

Sepal.Length,Sepal.Width,Petal.Length,Petal.Width,Species
5.1,3.5,1.4,0.2,setosa
4.9,3.0,1.4,0.2,setosa
4.7,3.2,1.3,0.2,setosa
4.6,3.1,1.5,0.2,setosa
5.0,3.6,1.4,0.2,setosa
5.4,3.9,1.7,0.4,setosa


In [4]:
cat("Number of rows and columns:", nrow(iris), ncol(iris))

Number of rows and columns: 150 5

In [5]:
# Class distribution of the target
base::table(iris$Species)
# Number of missing values per column
colSums(is.na(iris))


    setosa versicolor  virginica 
        50         50         50 

In [6]:
SEED <- 2021
set.seed(SEED)
# Stratified 80/20 split on Species
in_train <- createDataPartition(iris$Species, p = 0.8, list = FALSE)
ndf_train <- iris[in_train, ]
ndf_test <- iris[-in_train, ]
# Separate the four numeric features from the class label
x_train <- subset(ndf_train, select = -Species)
y_train <- ndf_train$Species
x_test <- subset(ndf_test, select = -Species)
y_test <- ndf_test$Species

In [7]:
cat("Train size is:", dim(x_train),length(y_train),"\n")
cat("Test size is:", dim(x_test),length(y_test))

Train size is: 120 4 120 
Test size is: 30 4 30
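
Because `createDataPartition` samples within each level of the outcome, the 80/20 split should preserve the 50/50/50 class balance of iris. A minimal check, rebuilt from scratch so that it runs on its own (the `_chk` names are illustrative and not part of the cells above):

```r
library(caret)

set.seed(2021)
# list = FALSE returns a one-column matrix; take the column as a plain index vector
idx_chk <- createDataPartition(iris$Species, p = 0.8, list = FALSE)[, 1]
y_train_chk <- iris$Species[idx_chk]
y_test_chk  <- iris$Species[-idx_chk]

# Class counts in each partition; a stratified split keeps them balanced
print(table(y_train_chk))
print(table(y_test_chk))
```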

In [8]:
# Baseline fit with the e1071 defaults (radial kernel, cost = 1)
svm_model <- svm(x_train, y_train)
summary(svm_model)


Call:
svm.default(x = x_train, y = y_train)


Parameters:
   SVM-Type:  C-classification 
 SVM-Kernel:  radial 
       cost:  1 

Number of Support Vectors:  46

 ( 8 20 18 )


Number of Classes:  3 

Levels: 
 setosa versicolor virginica
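
The summary reports a radial kernel. In e1071 this kernel is exp(-gamma * |u - v|^2), and when `gamma` is not supplied `svm()` uses 1 / (number of feature columns), which would be 0.25 for the four iris measurements. The helper below is a hypothetical illustration of the formula, not a function from e1071, and it ignores the feature scaling that `svm()` applies by default:

```r
# Hypothetical helper: radial (RBF) kernel value for two numeric vectors
rbf_kernel <- function(u, v, gamma) {
  exp(-gamma * sum((u - v)^2))
}

u <- c(1, 0)
v <- c(0, 1)

# The squared distance between u and v is 2, so the value is exp(-2 * gamma)
k_gamma_025 <- rbf_kernel(u, v, gamma = 0.25)  # exp(-0.5)
k_gamma_05  <- rbf_kernel(u, v, gamma = 0.5)   # exp(-1)
print(c(k_gamma_025, k_gamma_05))
```

A larger gamma makes the kernel fall off faster with distance, so each support vector influences a smaller neighbourhood.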




In [9]:
pred <- predict(svm_model,x_test)
base::table(pred,y_test)

            y_test
pred         setosa versicolor virginica
  setosa         10          0         0
  versicolor      0          8         0
  virginica       0          2        10
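
The table above has predictions in rows and true labels in columns, so the diagonal holds the correct predictions. As a sketch, the same counts can be turned into accuracy and per-class precision and recall; the matrix is re-entered by hand so that the block runs on its own, and the name `cm` is illustrative:

```r
classes <- c("setosa", "versicolor", "virginica")

# Counts copied from the confusion table above (rows = predicted, columns = true)
cm <- matrix(c(10, 0,  0,
                0, 8,  0,
                0, 2, 10),
             nrow = 3, byrow = TRUE,
             dimnames = list(pred = classes, y_test = classes))

accuracy  <- sum(diag(cm)) / sum(cm)   # correct predictions / all predictions
recall    <- diag(cm) / colSums(cm)    # share of each true class that was found
precision <- diag(cm) / rowSums(cm)    # share of each predicted class that was right

print(accuracy)
print(round(rbind(precision, recall), 3))
```

Both errors are versicolor flowers predicted as virginica, which shows up as a recall of 0.8 for versicolor and a precision of about 0.83 for virginica.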

In [10]:
# Grid search over cost and gamma; tune() uses 10-fold cross-validation by default
svm_tune <- tune(svm, train.x = x_train, train.y = y_train,
                 kernel = "radial",
                 ranges = list(cost = 10^(-1:2), gamma = c(0.5, 1, 2)))

In [11]:
summary(svm_tune)


Parameter tuning of 'svm':

- sampling method: 10-fold cross validation 

- best parameters:
 cost gamma
    1   0.5

- best performance: 0.04166667 

- Detailed performance results:
    cost gamma      error dispersion
1    0.1   0.5 0.07500000 0.08286908
2    1.0   0.5 0.04166667 0.05892557
3   10.0   0.5 0.05000000 0.07027284
4  100.0   0.5 0.05000000 0.07027284
5    0.1   1.0 0.07500000 0.08286908
6    1.0   1.0 0.05833333 0.07905694
7   10.0   1.0 0.05833333 0.07905694
8  100.0   1.0 0.05833333 0.07905694
9    0.1   2.0 0.27500000 0.18023476
10   1.0   2.0 0.05833333 0.07905694
11  10.0   2.0 0.05833333 0.07905694
12 100.0   2.0 0.05833333 0.07905694
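
The object returned by `tune()` also carries the selected values, so they need not be copied by hand from the printed summary. A minimal sketch, fitted on the full iris feature set purely to keep the block self-contained (the notebook's `x_train` and `y_train` would be used in practice; names ending in `_demo` are illustrative):

```r
library(e1071)

set.seed(2021)  # the cross-validation folds are drawn at random
x_demo <- iris[, 1:4]
y_demo <- iris$Species

tune_demo <- tune(svm, train.x = x_demo, train.y = y_demo,
                  kernel = "radial",
                  ranges = list(cost = 10^(-1:2), gamma = c(0.5, 1, 2)))

best_demo  <- tune_demo$best.parameters  # one-row data frame with cost and gamma
model_demo <- tune_demo$best.model       # svm refit with the selected values

print(best_demo)
print(tune_demo$best.performance)        # lowest cross-validated error
```

Reading the parameters from the object avoids transcription mistakes if the grid or the seed changes.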


In [12]:
# Refit with the best parameters reported by tune() (cost = 1, gamma = 0.5)
svm_after_tune <- svm(x_train, y_train, kernel = "radial", cost = 1, gamma = 0.5)
summary(svm_after_tune)


Call:
svm.default(x = x_train, y = y_train, kernel = "radial", gamma = 0.5, 
    cost = 1)


Parameters:
   SVM-Type:  C-classification 
 SVM-Kernel:  radial 
       cost:  1 

Number of Support Vectors:  51

 ( 12 18 21 )


Number of Classes:  3 

Levels: 
 setosa versicolor virginica




In [13]:
pred <- predict(svm_after_tune, x_test)
confm <- base::table(pred, y_test)

In [14]:
cat("Test accuracy: ", sum(diag(confm))/sum(confm))

Test accuracy:  0.9333333