In [1]:
// boring imports
var Plot = require('plotly-notebook-js');
var {loadLabelledWine} = require('./utils');

// Utility functions: add max() and min() helpers to Array.prototype
Array.prototype.max = function() {
  return Math.max.apply(null, this);
};

Array.prototype.min = function() {
  return Math.min.apply(null, this);
};

[Function]

# Supervised Learning


![Supervised learning](images/slide_supervised.png)

### Techniques available

 - KNN - K Nearest Neighbors
 - **SVM - Support Vector Machines**
 - Naive Bayes
 - Partial Least Squares [regression]



In [3]:
var {features, dataset} = loadLabelledWine({verbose: true});

our dataset has 178 rows and  14 columns
Alcohol | Malic Acid | Ash | Alcalinity of ash | Magnesium | Total phenols | Flavanoids | Nonflavanoid phenols | Proanthocyanins | Color intensity | Hue | OD280/OD315 of diluted wines | Proline | Class
---------------------------
14.23 | 1.71 | 2.43 | 15.6 | 127 | 2.8  | 3.06 | 0.28 | 2.29 | 5.64 | 1.04 | 3.92 | 1065 | 1
 13.2 | 1.78 | 2.14 | 11.2 | 100 | 2.65 | 2.76 | 0.26 | 1.28 | 4.38 | 1.05 | 3.4  | 1050 | 1
13.16 | 2.36 | 2.67 | 18.6 | 101 | 2.8  | 3.24 | 0.3  | 2.81 | 5.68 | 1.03 | 3.17 | 1185 | 1
14.37 | 1.95 | 2.5  | 16.8 | 113 | 3.85 | 3.49 | 0.24 | 2.18 | 7.8  | 0.86 | 3.45 | 1480 | 1
13.24 | 2.59 | 2.87 | 21   | 118 | 2.8  | 2.69 | 0.39 | 1.82 | 4.32 | 1.04 | 2.93 | 735  | 1


## Support Vector Machines


##### SVM in libsvm-js

 - docs for the SVM module are [here](https://github.com/mljs/libsvm)
 - docs for the various kernel options are [here](https://github.com/mljs/libsvm#SVM.KERNEL_TYPES)


In [6]:
var SVM = require('libsvm-js/asm');

var options = {
    kernel: SVM.KERNEL_TYPES.RBF, // The type of kernel I want to use
    type: SVM.SVM_TYPES.C_SVC,    // The type of SVM I want to run
    gamma: 1,                     // RBF kernel gamma parameter
    cost: 1                       // C_SVC cost parameter
};

var svm = new SVM(options);

'use strict'

Next, train the classifier. Let's start with the original 2D feature vectors that we used in our first pass through k-means.

To do this, call `svm.train(inputs, labels)` with the input feature vectors and the true class labels.

In [7]:
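// Two features per sample: column 0 (Alcohol) and column 10 (Hue)
var inputs = dataset.map(d => [d[0], d[10]]);
// Column 13 holds the true class label
var labels = dataset.map(d => d[13]);

svm.train(inputs, labels);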

console.log(labels.slice(50,70))
console.log(svm.predict(inputs.slice(50,70)))

*
optimization finished, #iter = 60
nu = 0.239281
obj = -26.718387, rho = 0.289015
nSV = 34, nBSV = 28
*
optimization finished, #iter = 50
nu = 0.381209
obj = -30.281812, rho = 0.114424
nSV = 44, nBSV = 38
*
optimization finished, #iter = 52
nu = 0.355454
obj = -34.123841, rho = -0.271579
nSV = 45, nBSV = 39
Total nSV = 89
[ 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2 ]
[ 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 1, 2, 2, 2, 1, 2, 1, 2 ]
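
The two printed arrays can be compared element by element to see how many of these 20 samples were classified correctly. A minimal sketch, with both arrays copied from the output above; `countMatches` is a hypothetical helper, not part of libsvm-js:

```javascript
// Arrays copied from the cell output above (samples 50..69).
const actualSlice    = [1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2];
const predictedSlice = [1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 1, 2, 2, 2, 1, 2, 1, 2];

// Hypothetical helper: counts the positions where both arrays agree.
function countMatches(actual, predicted) {
  return actual.filter((label, i) => label === predicted[i]).length;
}

const matches = countMatches(actualSlice, predictedSlice);
console.log(matches + ' of ' + actualSlice.length + ' correct'); // prints "17 of 20 correct"
```

All three misses in this slice are class 2 samples predicted as class 1.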


#### Support Vectors

In [8]:
console.log(svm.getSVIndices().slice(0,5))

[ 1, 2, 3, 4, 8 ]


#### Measure Accuracy

Use the same confusion matrix approach as earlier to compute accuracy and F1 scores.

In [9]:
var ConfusionMatrix = require('ml-confusion-matrix');

var actuals = labels;
var predicted = svm.predict(inputs)

var C = ConfusionMatrix.fromLabels(actuals, predicted)

var M = C.getMatrix();
var trace = { 
    x: [0,1,2],
    y: [0,1,2],
    z: M,
    type: 'heatmap',
    showscale: false,
    colorscale: [[0, '#3D9970'], [1, '#001f3f']] // colorscale stops are normalized to the range 0..1
};

console.info(C.getAccuracy())
var annotations = [];

// Add one text annotation per cell so each count is printed on the heatmap
M.forEach((row, y) => {
    row.forEach((value, x) => {
        annotations.push({
            x: x,
            y: y,
            text: value,
            font: {
                family: 'Arial',
                size: 12,
                color: 'white'
            },
            showarrow: false
        });
    });
});

var layout = { 
    xaxis: { title: "predicted", side: 'top' },
    yaxis: { title: "actuals", nticks: 6, autosize: false, autorange: 'reversed' },
    annotations,
    width: 500, height: 500};

var Plot = require('plotly-notebook-js');
var table = require('text-table');

$$html$$ = Plot.createPlot([trace], layout).render();

0.8707865168539326
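
The cell above prints only the overall accuracy. Per-class precision, recall and F1 can be derived from the raw matrix returned by `C.getMatrix()`, assuming rows are actual classes and columns are predicted classes (this matches the axis titles used in the plot). A minimal sketch in plain JavaScript; `perClassF1` is a hypothetical helper, and the example matrix is made up for illustration, not the result from this notebook:

```javascript
// Hypothetical helper: per-class precision, recall and F1 from a square
// confusion matrix. Assumes rows = actual classes, columns = predicted classes.
function perClassF1(matrix) {
  return matrix.map((row, k) => {
    const truePositives = row[k];
    const actualTotal = row.reduce((sum, v) => sum + v, 0);           // row sum
    const predictedTotal = matrix.reduce((sum, r) => sum + r[k], 0);  // column sum
    const precision = predictedTotal === 0 ? 0 : truePositives / predictedTotal;
    const recall = actualTotal === 0 ? 0 : truePositives / actualTotal;
    const f1 = precision + recall === 0
      ? 0
      : (2 * precision * recall) / (precision + recall);
    return { precision, recall, f1 };
  });
}

// Made-up 3-class matrix, for illustration only.
const exampleMatrix = [
  [8, 2, 0],
  [1, 7, 2],
  [0, 1, 9]
];

const scores = perClassF1(exampleMatrix);
scores.forEach((s, k) => {
  console.log('class index ' + k + ': F1 = ' + s.f1.toFixed(3));
});
```

In the notebook, `perClassF1(M)` would apply the same calculation to the matrix computed above.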


#### Discussion

 - How does your accuracy compare with our unsupervised approaches?
 - Is this too good to be true, and should you be suspicious?
 - Can you think of how to get a better accuracy measurement?
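
One way to approach the last question is to score the classifier on rows it never saw during training. A minimal sketch of a shuffled hold-out split in plain JavaScript; `holdOutSplit`, the 0.25 test fraction and the toy rows are assumptions for illustration and are not part of the notebook's `./utils`:

```javascript
// Hypothetical helper: shuffles row indices (Fisher-Yates) and splits them
// into a training part and a held-out test part.
// `random` is injectable so the split can be made reproducible.
function holdOutSplit(rows, testFraction, random) {
  random = random || Math.random;
  const indices = rows.map((_, i) => i);
  for (let i = indices.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    const tmp = indices[i];
    indices[i] = indices[j];
    indices[j] = tmp;
  }
  const testSize = Math.round(rows.length * testFraction);
  return {
    test: indices.slice(0, testSize).map(i => rows[i]),
    train: indices.slice(testSize).map(i => rows[i])
  };
}

// Toy rows standing in for `dataset`.
const toyRows = [[1], [2], [3], [4], [5], [6], [7], [8]];
const split = holdOutSplit(toyRows, 0.25);
console.log(split.train.length, split.test.length);
```

With the wine data, this would mean calling `svm.train` on features and labels built from `split.train` only, then building the confusion matrix from `svm.predict` on `split.test`.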