In [1]:
// boring imports
var Plot = require('plotly-notebook-js');
var {loadLabelledWine} = require('./utils');


# Supervised Learning


![Supervised learning](images/slide_supervised.png)

### Techniques available in mljs

 - KNN - K Nearest Neighbors
 - **SVM - Support Vector Machines**
 - Naive Bayes
 - Partial Least Squares [regression]
 - Random Forests


In [2]:
var {features, dataset} = loadLabelledWine({verbose: true});

our dataset has 178 rows and  14 columns
Alcohol | Malic Acid | Ash | Alcalinity of ash | Magnesium | Total phenols | Flavanoids | Nonflavanoid phenols | Proanthocyanins | Color intensity | Hue | OD280/OD315 of diluted wines | Proline | Class
---------------------------
14.23 | 1.71 | 2.43 | 15.6 | 127 | 2.8  | 3.06 | 0.28 | 2.29 | 5.64 | 1.04 | 3.92 | 1065 | 1
 13.2 | 1.78 | 2.14 | 11.2 | 100 | 2.65 | 2.76 | 0.26 | 1.28 | 4.38 | 1.05 | 3.4  | 1050 | 1
13.16 | 2.36 | 2.67 | 18.6 | 101 | 2.8  | 3.24 | 0.3  | 2.81 | 5.68 | 1.03 | 3.17 | 1185 | 1
14.37 | 1.95 | 2.5  | 16.8 | 113 | 3.85 | 3.49 | 0.24 | 2.18 | 7.8  | 0.86 | 3.45 | 1480 | 1
13.24 | 2.59 | 2.87 | 21   | 118 | 2.8  | 2.69 | 0.39 | 1.82 | 4.32 | 1.04 | 2.93 | 735  | 1


## Support Vector Machines


##### SVM in mljs

 - docs for the SVM module are [here](https://mljs.github.io/svm/)
 - docs for the various kernel options are [here](https://github.com/mljs/kernel)


In [3]:
var SVM = require('ml-svm');

var options = {
  C: 0.01,  // Regularisation Parameter
  tol: 10e-4,  // numerical tolerance
  maxPasses: 10,  // max number of times to iterate over alphas when no change
  maxIterations: 10000,  // max number of iterations
  kernel: 'rbf', // kernel to use; see the full list in the kernel docs
  kernelOptions: {
    sigma: 0.5  // sets the gaussian width
  }
};

var svm = new SVM(options);

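To get a feel for what `sigma` controls, here is a minimal sketch of a Gaussian (RBF) kernel in plain JavaScript. It assumes the textbook form `exp(-||x - y||^2 / (2 * sigma^2))`; the `gaussianKernel` function is written for illustration only and is not the mljs implementation, so check the kernel docs for the exact scaling used there. The two points are the Alcohol and OD280/OD315 values from the first two rows of the table above.

```javascript
// Textbook Gaussian (RBF) kernel: k(x, y) = exp(-||x - y||^2 / (2 * sigma^2)).
// Assumption: illustrative helper only, not taken from the mljs source.
function gaussianKernel(x, y, sigma) {
  var squaredDistance = 0;
  for (var i = 0; i < x.length; i++) {
    var diff = x[i] - y[i];
    squaredDistance += diff * diff;
  }
  return Math.exp(-squaredDistance / (2 * sigma * sigma));
}

// [Alcohol, OD280/OD315] for the first two wines printed above
var wineA = [14.23, 3.92];
var wineB = [13.2, 3.4];

// A larger sigma gives a wider kernel, so the same pair of points looks more similar
[0.5, 1, 2].forEach(function (sigma) {
  console.log('sigma =', sigma, 'k(a, b) =', gaussianKernel(wineA, wineB, sigma));
});
```

With a small `sigma`, only points that are very close together in feature space have a noticeable kernel value, which is worth keeping in mind when the features are not scaled.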

Next, train the classifier. Let's start with the original 2D feature vectors that we used in our first pass through k-means.

To do this, call `svm.train(inputs, labels)` with the input feature vectors and the true class labels.

In [4]:
var preprocess = require('ml-preprocess');




In [5]:
// prep our input data structures
var inputs = dataset.map(d => [d[0], d[11]]);
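// the SVM needs a label domain of {-1, 1}.
// Note: subtracting 2 maps the wine classes {1, 2, 3} to {-1, 0, 1}, so class 2 becomes 0, which is outside that domain.
var labels = dataset.map(d => d[13] - 2);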

// train the model
svm.train(inputs, labels)

console.log(labels.slice(50,60))
console.log(inputs.slice(50,60).map(i => svm.predict(i)))

[ -1, -1, -1, -1, -1, -1, -1, -1, -1, 0 ]
[ -1, -1, -1, -1, -1, -1, -1, -1, -1, -1 ]
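The last true label printed above is `0`, which is outside the `{-1, 1}` domain that a binary SVM works with. One common way around this is a one-vs-rest mapping, where a single class is treated as positive and every other class as negative. The sketch below is plain JavaScript; `toBinaryLabels` is a hypothetical helper, not part of mljs, and the example classes are made up for illustration.

```javascript
// Hypothetical helper (not part of mljs): map class labels to {-1, 1}.
// `positiveClass` becomes 1 and every other class becomes -1.
function toBinaryLabels(classes, positiveClass) {
  return classes.map(function (c) {
    return c === positiveClass ? 1 : -1;
  });
}

// Made-up class column using the same 1/2/3 coding as the wine data
var exampleClasses = [1, 1, 2, 3, 2, 1];
console.log(toBinaryLabels(exampleClasses, 1));
```

In this notebook the class column could be pulled out with `dataset.map(d => d[13])` and passed to the helper before training.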


#### Margins

In [6]:
console.log(svm.margin(inputs.slice(0,5)))

[ -1.0099802987647664,
  -0.7583221166024475,
  -0.7583221166024475,
  -0.7583221166024475,
  -0.7583221166024475 ]
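In a binary SVM the margin is the signed value of the decision function, and the predicted class is usually taken from its sign. The sketch below assumes that convention (positive margin gives class `1`, anything else gives class `-1`); `labelFromMargin` is a hypothetical helper, and whether `svm.predict` thresholds in exactly this way should be checked against the SVM docs. The first two values are copied from the output above and the third is a made-up positive margin.

```javascript
// Assumption: a positive margin maps to class 1, anything else to class -1.
function labelFromMargin(margin) {
  return margin > 0 ? 1 : -1;
}

var exampleMargins = [
  -1.0099802987647664, // from the output above
  -0.7583221166024475, // from the output above
  0.42                 // made-up positive margin for contrast
];

console.log(exampleMargins.map(labelFromMargin));
```

Under this rule, all five margins printed above correspond to class `-1`, and the magnitude gives a rough sense of how far each point sits from the decision boundary.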


#### Support Vectors

In [10]:
console.log(svm.supportVectors().slice(0,5))

[ 1, 2, 4, 5, 6 ]


#### Measure Accuracy

Use the same confusion matrix approach as earlier to compute accuracy and F1 scores.

In [13]:
var ConfusionMatrix = require('ml-confusion-matrix');
var actuals = labels;
var predicted = inputs.map(i => svm.predict(i));

var C = ConfusionMatrix.fromLabels(actuals, predicted)

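To make the two metrics concrete, here is a minimal sketch that computes accuracy and a per-class F1 score directly from arrays of actual and predicted labels. The `accuracy` and `f1Score` functions are hypothetical helpers in plain JavaScript, not the `ml-confusion-matrix` API; the library may offer equivalent methods, so check its docs. The sample arrays are the ten labels and predictions printed earlier for rows 50 to 59.

```javascript
// Fraction of positions where the prediction matches the actual label
function accuracy(actual, predicted) {
  var correct = 0;
  for (var i = 0; i < actual.length; i++) {
    if (actual[i] === predicted[i]) correct++;
  }
  return correct / actual.length;
}

// F1 for one class: 2 * TP / (2 * TP + FP + FN)
function f1Score(actual, predicted, positiveLabel) {
  var tp = 0, fp = 0, fn = 0;
  for (var i = 0; i < actual.length; i++) {
    var isActual = actual[i] === positiveLabel;
    var isPredicted = predicted[i] === positiveLabel;
    if (isActual && isPredicted) tp++;
    else if (!isActual && isPredicted) fp++;
    else if (isActual && !isPredicted) fn++;
  }
  var denominator = 2 * tp + fp + fn;
  return denominator === 0 ? 0 : (2 * tp) / denominator;
}

// Labels and predictions printed earlier for rows 50 to 59
var sampleActual = [-1, -1, -1, -1, -1, -1, -1, -1, -1, 0];
var samplePredicted = [-1, -1, -1, -1, -1, -1, -1, -1, -1, -1];

console.log('accuracy:', accuracy(sampleActual, samplePredicted));
console.log('F1 for class -1:', f1Score(sampleActual, samplePredicted, -1));
```

The same functions can be applied to the full `actuals` and `predicted` arrays from the cell above.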

#### Discussion

 - How does your accuracy compare with our unsupervised approaches?
 - Is this too good to be true, and should you be suspicious?
 - Can you think of how to get a better accuracy measurement?
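
One possible direction for the last question is to hold some rows back, so that accuracy is measured on data the model never saw during training. The sketch below is a minimal, deterministic hold-out split in plain JavaScript. `holdOutSplit` is a hypothetical helper, not part of mljs. It sends every n-th row to the test set rather than cutting the data in one place, since a contiguous cut can leave whole classes out of one side when rows are grouped by class. The toy rows and labels are made up for illustration.

```javascript
// Hypothetical helper (not part of mljs): every `every`-th row goes to the
// test set and all remaining rows go to the training set.
function holdOutSplit(rows, rowLabels, every) {
  var split = { trainX: [], trainY: [], testX: [], testY: [] };
  rows.forEach(function (row, i) {
    if (i % every === 0) {
      split.testX.push(row);
      split.testY.push(rowLabels[i]);
    } else {
      split.trainX.push(row);
      split.trainY.push(rowLabels[i]);
    }
  });
  return split;
}

// Made-up toy data: ten one-dimensional rows, grouped by class
var toyRows = [[1], [2], [3], [4], [5], [6], [7], [8], [9], [10]];
var toyLabels = [-1, -1, -1, -1, -1, 1, 1, 1, 1, 1];

var parts = holdOutSplit(toyRows, toyLabels, 5);
console.log('train size:', parts.trainX.length, 'test size:', parts.testX.length);
console.log('test labels:', parts.testY);
```

With the wine data, the model would be trained with `svm.train(parts.trainX, parts.trainY)` and the confusion matrix built only from predictions on `parts.testX`.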