# Chapter 9 - Problem 8

This problem involves the OJ data set, which is part of the ISLR
package.

**A.** Create a training set containing a random sample of 800 observations, 
and a test set containing the remaining observations.

In [3]:
library(ISLR)
train <- sample(1:nrow(OJ),800)
train.set <- OJ[train,]
test.set <- OJ[-train,]
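
Because `sample()` draws a different subset on every run, the counts in the tables that follow will vary between runs. A minimal sketch of a reproducible split is given here. The seed value 1 and the stand-in row count `n` (1070, the total implied by the 800 training and 270 test observations in this problem) are assumptions for illustration, not values taken from the original run.

```r
# Fix the random number generator so the split can be reproduced.
set.seed(1)                       # assumed seed; any fixed integer works

n <- 1070                         # stand-in for nrow(OJ)
train <- sample(seq_len(n), 800)  # 800 row indices drawn without replacement
test  <- setdiff(seq_len(n), train)

length(train)  # 800
length(test)   # 270
```

With the real data, `OJ[train, ]` and `OJ[-train, ]` then give the same two sets on every run.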

**B.** Fit a support vector classifier to the training data using
cost=0.01, with Purchase as the response and the other variables
as predictors. Use the summary() function to produce summary
statistics, and describe the results obtained.

In [5]:
library(e1071)
svmfit <- svm(Purchase~., data = train.set,kernel = "linear", cost = 0.01)
summary(svmfit)


Call:
svm(formula = Purchase ~ ., data = train.set, kernel = "linear", 
    cost = 0.01)


Parameters:
   SVM-Type:  C-classification 
 SVM-Kernel:  linear 
       cost:  0.01 
      gamma:  0.05555556 

Number of Support Vectors:  447

 ( 224 223 )


Number of Classes:  2 

Levels: 
 CH MM




**C.** What are the training and test error rates?

In [8]:
train.pred <- predict(svmfit,train.set)
table(train.pred,train.set$Purchase)

          
train.pred  CH  MM
        CH 418  80
        MM  60 242

In [None]:
Training misclassification error is 17.5%

In [10]:
test.pred <- predict(svmfit,newdata = test.set)
table(test.pred,test.set$Purchase)

         
test.pred  CH  MM
       CH 160  27
       MM  15  68

In [None]:
Test misclassification error is 15.6% (42 of 270).
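
The error rates can be read off the confusion tables as the off-diagonal counts divided by the total. The sketch below recomputes them from the counts printed in part C; `error_rate`, `train.tab` and `test.tab` are hypothetical names introduced for illustration.

```r
# Misclassification rate: share of observations off the diagonal of a
# confusion table (rows = predicted class, columns = true class).
error_rate <- function(tab) 1 - sum(diag(tab)) / sum(tab)

# Counts copied from the training and test tables in part C.
# matrix() fills column by column, so each pair is one "true class" column.
train.tab <- matrix(c(418, 60, 80, 242), nrow = 2,
                    dimnames = list(pred = c("CH", "MM"), truth = c("CH", "MM")))
test.tab  <- matrix(c(160, 15, 27, 68), nrow = 2,
                    dimnames = list(pred = c("CH", "MM"), truth = c("CH", "MM")))

error_rate(train.tab)  # 0.175
error_rate(test.tab)   # 0.1555556
```

With a fitted model, the same helper can be applied directly to `table(test.pred, test.set$Purchase)`.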

**D.** Use the tune() function to select an optimal cost. Consider values
in the range 0.01 to 10.

In [15]:
cost = seq(0.01,10,length = 20)
linear.tune = tune(svm, Purchase ~ ., data = train.set, kernel = "linear", ranges = list(cost = cost))
summary(linear.tune)


Parameter tuning of 'svm':

- sampling method: 10-fold cross validation 

- best parameters:
     cost
 1.587368

- best performance: 0.17 

- Detailed performance results:
         cost   error dispersion
1   0.0100000 0.18250 0.04684490
2   0.5357895 0.17500 0.04409586
3   1.0615789 0.17250 0.04281744
4   1.5873684 0.17000 0.04048319
5   2.1131579 0.17625 0.03972562
6   2.6389474 0.17500 0.04082483
7   3.1647368 0.17750 0.04241004
8   3.6905263 0.17750 0.04241004
9   4.2163158 0.17625 0.04143687
10  4.7421053 0.17750 0.04281744
11  5.2678947 0.17750 0.04281744
12  5.7936842 0.18000 0.04338138
13  6.3194737 0.17625 0.03928617
14  6.8452632 0.17500 0.03818813
15  7.3710526 0.17625 0.03701070
16  7.8968421 0.17625 0.03701070
17  8.4226316 0.17625 0.03701070
18  8.9484211 0.17625 0.03701070
19  9.4742105 0.17625 0.03701070
20 10.0000000 0.17625 0.03701070


The cross-validation error is minimized at a cost of about 1.59; the refit below rounds this to 1.5.
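
The selected cost can be checked against the printed grid without rerunning the tuning. The sketch below rebuilds the grid and copies the cross-validation errors from the linear-kernel summary above; `cv.error` and `best` are hypothetical names. On the fitted object itself, the same value should be available as `linear.tune$best.parameters`.

```r
# Cost grid used for tuning: 20 equally spaced values from 0.01 to 10.
cost <- seq(0.01, 10, length.out = 20)

# Cross-validation errors copied from the linear-kernel summary above.
cv.error <- c(0.18250, 0.17500, 0.17250, 0.17000, 0.17625, 0.17500,
              0.17750, 0.17750, 0.17625, 0.17750, 0.17750, 0.18000,
              0.17625, 0.17500, 0.17625, 0.17625, 0.17625, 0.17625,
              0.17625, 0.17625)

best <- which.min(cv.error)  # index of the smallest CV error
cost[best]                   # 1.587368
cv.error[best]               # 0.17
```

The errors across the grid differ by about one percentage point, well within the reported dispersion, so the choice of cost is not sharply determined here.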

**E.** Compute the training and test error rates using this new value
for cost.

In [19]:
svm.linear <- svm(Purchase~., data = train.set,kernel = "linear", cost = 1.5)
train.pred <- predict(svm.linear,train.set)
table(train.pred,train.set$Purchase)

          
train.pred  CH  MM
        CH 417  72
        MM  61 250

Training misclassification error is 16.6%.

In [21]:
test.pred <- predict(svm.linear,newdata = test.set)
table(test.pred,test.set$Purchase)

         
test.pred  CH  MM
       CH 159  26
       MM  16  69

Test misclassification error is 15.6% (42 of 270). Tuning the cost slightly lowered the training error (17.5% to 16.6%) but left the test error unchanged.

**F.** Repeat parts (b) through (e) using a support vector machine
with a radial kernel. Use the default value for gamma.

In [24]:
svmfit.radial <- svm(Purchase~., data = train.set,kernel = "radial", cost = 0.01)
summary(svmfit.radial)


Call:
svm(formula = Purchase ~ ., data = train.set, kernel = "radial", 
    cost = 0.01)


Parameters:
   SVM-Type:  C-classification 
 SVM-Kernel:  radial 
       cost:  0.01 
      gamma:  0.05555556 

Number of Support Vectors:  646

 ( 324 322 )


Number of Classes:  2 

Levels: 
 CH MM




In [25]:
train.pred <- predict(svmfit.radial,train.set)
table(train.pred,train.set$Purchase)

          
train.pred  CH  MM
        CH 478 322
        MM   0   0

In [26]:
The training misclassification error is 40.3% (322 of 800): with cost = 0.01 the radial SVM predicts CH for every observation.

In [27]:
test.pred <- predict(svmfit.radial,newdata = test.set)
table(test.pred,test.set$Purchase)

         
test.pred  CH  MM
       CH 175  95
       MM   0   0

The test misclassification error is 35.2% (95 of 270).
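
Both tables show every observation predicted as CH, so the error rate is simply the share of MM purchases in each set. The sketch below computes that majority-class baseline from the column totals of the tables above; the variable names are hypothetical.

```r
# Class counts taken from the column totals of the tables above.
train.counts <- c(CH = 478, MM = 322)
test.counts  <- c(CH = 175, MM = 95)

# A classifier that always predicts CH misclassifies every MM observation.
train.baseline <- train.counts[["MM"]] / sum(train.counts)
test.baseline  <- test.counts[["MM"]] / sum(test.counts)

train.baseline  # 0.4025
test.baseline   # 0.3518519
```

Any tuned model should be judged against this baseline rather than against zero.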

In [29]:
radial.tune = tune(svm, Purchase ~ ., data = train.set, kernel = "radial", ranges = list(cost = cost))
summary(radial.tune)


Parameter tuning of 'svm':

- sampling method: 10-fold cross validation 

- best parameters:
     cost
 1.061579

- best performance: 0.1775 

- Detailed performance results:
         cost   error dispersion
1   0.0100000 0.40250 0.04241004
2   0.5357895 0.17875 0.03175973
3   1.0615789 0.17750 0.03763863
4   1.5873684 0.18000 0.03545341
5   2.1131579 0.18500 0.03374743
6   2.6389474 0.18875 0.03458584
7   3.1647368 0.18875 0.03304563
8   3.6905263 0.19000 0.03050501
9   4.2163158 0.18750 0.03227486
10  4.7421053 0.18625 0.03143004
11  5.2678947 0.18625 0.03304563
12  5.7936842 0.18625 0.03408018
13  6.3194737 0.18625 0.03408018
14  6.8452632 0.18750 0.03435921
15  7.3710526 0.18750 0.03435921
16  7.8968421 0.19125 0.03586723
17  8.4226316 0.19250 0.03496029
18  8.9484211 0.19250 0.03496029
19  9.4742105 0.19250 0.03496029
20 10.0000000 0.19375 0.03644345


The minimum cross-validation error is attained at cost = 1.06, rounded to 1 in the refit below.

In [30]:
svm.radial <- svm(Purchase~., data = train.set,kernel = "radial", cost = 1)
train.pred <- predict(svm.radial,train.set)
table(train.pred,train.set$Purchase)

          
train.pred  CH  MM
        CH 434  74
        MM  44 248

The training misclassification error is 14.8% (118 of 800).

In [32]:
test.pred <- predict(svm.radial,newdata = test.set)
table(test.pred,test.set$Purchase)

         
test.pred  CH  MM
       CH 163  27
       MM  12  68

The test misclassification error is 14.4% (39 of 270). Tuning the cost parameter dramatically improves the SVM with a radial kernel.

**G.** Repeat parts (b) through (e) using a support vector machine
with a polynomial kernel. Set degree=2.

In [35]:
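# NOTE: degree is not passed, so svm() uses its default degree = 3 (see the summary below), not the degree = 2 the problem asks for.
svmfit.poly <- svm(Purchase~., data = train.set,kernel = "polynomial", cost = 0.01)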
summary(svmfit.poly)


Call:
svm(formula = Purchase ~ ., data = train.set, kernel = "polynomial", 
    cost = 0.01)


Parameters:
   SVM-Type:  C-classification 
 SVM-Kernel:  polynomial 
       cost:  0.01 
     degree:  3 
      gamma:  0.05555556 
     coef.0:  0 

Number of Support Vectors:  635

 ( 319 316 )


Number of Classes:  2 

Levels: 
 CH MM




In [36]:
train.pred <- predict(svmfit.poly,train.set)
table(train.pred,train.set$Purchase)

          
train.pred  CH  MM
        CH 472 297
        MM   6  25

The training misclassification error is 37.9% (303 of 800).

In [38]:
test.pred <- predict(svmfit.poly,newdata = test.set)
table(test.pred,test.set$Purchase)

         
test.pred  CH  MM
       CH 174  91
       MM   1   4

The test misclassification error is 34.1% (92 of 270).

In [41]:
# NOTE: degree is not passed here either, so the tuning uses the default degree = 3.
poly.tune = tune(svm, Purchase ~ ., data = train.set, kernel = "polynomial", ranges = list(cost = cost))
summary(poly.tune)


Parameter tuning of 'svm':

- sampling method: 10-fold cross validation 

- best parameters:
     cost
 2.638947

- best performance: 0.1875 

- Detailed performance results:
         cost   error dispersion
1   0.0100000 0.37875 0.05434266
2   0.5357895 0.20875 0.04931827
3   1.0615789 0.20625 0.03644345
4   1.5873684 0.19375 0.04299952
5   2.1131579 0.19250 0.04133199
6   2.6389474 0.18750 0.03726780
7   3.1647368 0.18750 0.03173239
8   3.6905263 0.18750 0.03435921
9   4.2163158 0.19125 0.03175973
10  4.7421053 0.19625 0.03634805
11  5.2678947 0.19750 0.03622844
12  5.7936842 0.20000 0.03385016
13  6.3194737 0.20125 0.03143004
14  6.8452632 0.20000 0.03061862
15  7.3710526 0.19875 0.03251602
16  7.8968421 0.19875 0.03143004
17  8.4226316 0.19625 0.03537988
18  8.9484211 0.19625 0.03537988
19  9.4742105 0.19750 0.03574602
20 10.0000000 0.19875 0.03653860


The minimum cross-validation error (0.1875) is attained at cost = 2.64 and is tied at 3.16 and 3.69; the refit below uses cost = 3.

In [42]:
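# NOTE: this call uses kernel = "radial", so the tables below describe a radial SVM with cost = 3.
# For the polynomial fit the problem asks for, use kernel = "polynomial", degree = 2.
svm.poly <- svm(Purchase~., data = train.set,kernel = "radial", cost = 3)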
train.pred <- predict(svm.poly, train.set)
table(train.pred,train.set$Purchase)

          
train.pred  CH  MM
        CH 434  75
        MM  44 247

The training misclassification error is 14.9% (119 of 800).

In [44]:
test.pred <- predict(svm.poly,newdata = test.set)
table(test.pred,test.set$Purchase)

         
test.pred  CH  MM
       CH 160  29
       MM  15  66

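The test misclassification error is 16.3% (44 of 270). Because the refit above uses kernel = "radial", this is the error of a radial SVM with cost = 3, not of a polynomial one. It is still a large improvement over the cost = 0.01 fit.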
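
The part G code leaves `degree` unset (its summary reports degree: 3) and refits with kernel = "radial". A minimal sketch of the fit as the problem states it, with `degree = 2` passed explicitly, might look like the following. The seed and the names `poly2.tune` and `svm.poly2` are assumptions for illustration, and because the split is redrawn here the resulting error rate will not match the tables above.

```r
library(ISLR)   # OJ data
library(e1071)  # svm(), tune()

set.seed(1)     # assumed seed, for a reproducible split and CV folds
train <- sample(seq_len(nrow(OJ)), 800)
train.set <- OJ[train, ]
test.set  <- OJ[-train, ]

# Polynomial kernel with the degree requested in the problem statement.
cost <- seq(0.01, 10, length.out = 20)
poly2.tune <- tune(svm, Purchase ~ ., data = train.set,
                   kernel = "polynomial", degree = 2,
                   ranges = list(cost = cost))

# best.model is the polynomial SVM refit at the selected cost.
svm.poly2 <- poly2.tune$best.model
test.pred <- predict(svm.poly2, newdata = test.set)
poly2.test.error <- mean(test.pred != test.set$Purchase)
poly2.test.error  # test misclassification rate
```

Using `best.model` avoids rounding the selected cost by hand before refitting.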

**H.** Overall, which approach seems to give the best results on this
data?

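The radial SVM with cost = 1 gives the lowest test error, 14.4% (39 of 270), followed by the support vector classifier at 15.6% (42 of 270). The gap is only three test observations, so the support vector classifier, the simplest model of them all, remains a reasonable choice. The tuned fit in part G was run with kernel = "radial", so it does not provide a polynomial-kernel test error to compare.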