### Get the Data

In [3]:
df <- read.csv('cancer.csv')

In [4]:
head(df)

mean.radius,mean.texture,mean.perimeter,mean.area,mean.smoothness,mean.compactness,mean.concavity,mean.concave.points,mean.symmetry,mean.fractal.dimension,...,worst.texture,worst.perimeter,worst.area,worst.smoothness,worst.compactness,worst.concavity,worst.concave.points,worst.symmetry,worst.fractal.dimension,target
17.99,10.38,122.8,1001.0,0.1184,0.2776,0.3001,0.1471,0.2419,0.07871,...,17.33,184.6,2019.0,0.1622,0.6656,0.7119,0.2654,0.4601,0.1189,0
20.57,17.77,132.9,1326.0,0.08474,0.07864,0.0869,0.07017,0.1812,0.05667,...,23.41,158.8,1956.0,0.1238,0.1866,0.2416,0.186,0.275,0.08902,0
19.69,21.25,130.0,1203.0,0.1096,0.1599,0.1974,0.1279,0.2069,0.05999,...,25.53,152.5,1709.0,0.1444,0.4245,0.4504,0.243,0.3613,0.08758,0
11.42,20.38,77.58,386.1,0.1425,0.2839,0.2414,0.1052,0.2597,0.09744,...,26.5,98.87,567.7,0.2098,0.8663,0.6869,0.2575,0.6638,0.173,0
20.29,14.34,135.1,1297.0,0.1003,0.1328,0.198,0.1043,0.1809,0.05883,...,16.67,152.2,1575.0,0.1374,0.205,0.4,0.1625,0.2364,0.07678,0
12.45,15.7,82.57,477.1,0.1278,0.17,0.1578,0.08089,0.2087,0.07613,...,23.75,103.4,741.6,0.1791,0.5249,0.5355,0.1741,0.3985,0.1244,0


In [6]:
colnames(df)

### Building the Model
We'll need the `e1071` package, which provides the `svm()` and `tune()` functions used below.

In [7]:
#install.packages('e1071',repos = 'http://cran.us.r-project.org')

In [8]:
library(e1071)

In [9]:
help(svm)

#### Train/Test Split

In [36]:
# Convert the target to a factor so that svm() fits a classifier rather than a regression
df$target <- factor(df$target)

In [37]:
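library(caTools)
set.seed(101)  # make the random split reproducible

# 70/30 split that keeps the class ratio of the target in both parts
split <- sample.split(df$target, SplitRatio = 0.70)

final.train <- subset(df, split == TRUE)
final.test <- subset(df, split == FALSE)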
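
`sample.split()` is documented as preserving the relative ratios of the labels, so the training and test sets should have roughly the same class balance. A minimal sketch of how to check this, run on a synthetic label vector (the class counts and the names `y.demo`, `split.demo` are made up for illustration and are not taken from `cancer.csv`):

```r
library(caTools)

set.seed(101)
# Hypothetical labels: 200 of class 0 and 300 of class 1 (illustrative only)
y.demo <- factor(rep(c(0, 1), times = c(200, 300)))
split.demo <- sample.split(y.demo, SplitRatio = 0.70)

# Class proportions in each part; both should be close to the overall 40/60 mix
train.props <- prop.table(table(y.demo[split.demo]))
test.props <- prop.table(table(y.demo[!split.demo]))
print(train.props)
print(test.props)
```

The same check on the real split would be `prop.table(table(final.train$target))` and `prop.table(table(final.test$target))`.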

### Model on Training Data

In [38]:
model <- svm(target ~ ., data=final.train)

In [39]:
summary(model)


Call:
svm(formula = target ~ ., data = final.train)


Parameters:
   SVM-Type:  C-classification 
 SVM-Kernel:  radial 
       cost:  1 
      gamma:  0.03333333 

Number of Support Vectors:  98

 ( 51 47 )


Number of Classes:  2 

Levels: 
 0 1




### Example Predictions

In [42]:
predicted.values.test <- predict(model,final.test)

In [43]:
table(predicted.values.test,final.test$target)

                     
predicted.values.test   0   1
                    0  63   1
                    1   1 106
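
The table shows 2 misclassified rows out of 171. As a rough sketch, accuracy and per-class recall can be computed directly from those counts; the matrix below is typed in by hand from the output above, and `conf`, `accuracy` and `recall` are names introduced here for illustration:

```r
# Confusion matrix copied from the table() output above
# (rows = predicted class, columns = actual class)
conf <- matrix(c(63, 1, 1, 106), nrow = 2,
               dimnames = list(predicted = c("0", "1"), actual = c("0", "1")))

# Overall accuracy: correct predictions (the diagonal) over all predictions
accuracy <- sum(diag(conf)) / sum(conf)

# Per-class recall: correct predictions over the actual count of each class
recall <- diag(conf) / colSums(conf)

print(round(accuracy, 4))
print(round(recall, 4))
```

With the live objects from the notebook, `mean(predicted.values.test == final.test$target)` should give the same accuracy without retyping the counts.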

### Advanced - Tuning

In [50]:
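# Predictors and response as separate objects
# (the formula-based tune() call below works on final.train directly and does not use them)
x.train <- subset(final.train, select = -c(target))
y.train <- subset(final.train, select = c(target))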

In [53]:
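# Grid search over gamma and cost; sampling = "fix" scores every pair
# on a single held-out part of final.train instead of cross-validating
obj <- tune(svm, target ~ ., data = final.train,
            ranges = list(gamma = 2^(-1:1), cost = 2^(2:4)),
            tunecontrol = tune.control(sampling = "fix"))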

In [55]:
summary(obj)
  


Parameter tuning of 'svm':

- sampling method: fixed training/validation set 

- best parameters:
 gamma cost
   0.5    4

- best performance: 0.2556391 

- Detailed performance results:
  gamma cost     error dispersion
1   0.5    4 0.2556391         NA
2   1.0    4 0.3533835         NA
3   2.0    4 0.3609023         NA
4   0.5    8 0.2556391         NA
5   1.0    8 0.3533835         NA
6   2.0    8 0.3609023         NA
7   0.5   16 0.2556391         NA
8   1.0   16 0.3533835         NA
9   2.0   16 0.3609023         NA
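
In these results the error does not change with `cost`, and the best `gamma` (0.5) is the smallest value in the grid, while the untuned model used `gamma = 0.0333`. That pattern suggests the grid may not reach the useful range of `gamma`. A hedged sketch of a wider search with 5-fold cross-validation follows; it runs on a two-class subset of the built-in `iris` data as a stand-in for `final.train`, and the grid values and the names `demo` and `obj.cv` are assumptions chosen for illustration, not recommended settings:

```r
library(e1071)

set.seed(101)
# Stand-in data: two of the three iris species, so the problem is binary
demo <- droplevels(subset(iris, Species != "setosa"))

# Search smaller gamma values and score each pair with 5-fold cross-validation
obj.cv <- tune(svm, Species ~ ., data = demo,
               ranges = list(gamma = 10^(-3:0), cost = 2^(0:4)),
               tunecontrol = tune.control(sampling = "cross", cross = 5))

print(obj.cv$best.parameters)
print(obj.cv$best.performance)
```

For the notebook's data, the formula would become `target ~ .` with `data = final.train`.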


In [56]:
help(tune)

In [58]:
obj$best.model


Call:
best.tune(method = svm, train.x = target ~ ., data = final.train, 
    ranges = list(gamma = 2^(-1:1), cost = 2^(2:4)), tunecontrol = tune.control(sampling = "fix"))


Parameters:
   SVM-Type:  C-classification 
 SVM-Kernel:  radial 
       cost:  4 
      gamma:  0.5 

Number of Support Vectors:  374


In [59]:
predicted.values.test <- predict(obj$best.model,final.test)

In [60]:
table(predicted.values.test,final.test$target)

                     
predicted.values.test   0   1
                    0  44   2
                    1  20 105
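
The tuned model misclassifies 22 of the 171 test rows, compared with 2 for the default model, and its support vector count rose from 98 to 374, so this particular grid made the fit worse on the test set. A small sketch that puts the two results side by side, using counts typed in from the two tables (the helper `acc()` and the matrix names are hypothetical):

```r
# Helper: accuracy from a confusion matrix (rows = predicted, columns = actual)
acc <- function(conf) sum(diag(conf)) / sum(conf)

# Counts copied by hand from the two table() outputs in this notebook
conf.default <- matrix(c(63, 1, 1, 106), nrow = 2)
conf.tuned <- matrix(c(44, 20, 2, 105), nrow = 2)

comparison <- c(default = acc(conf.default), tuned = acc(conf.tuned))
print(round(comparison, 4))
```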