Linear Support Vector Machines

César Souza edited this page Apr 6, 2015 · 1 revision

Despite the name, linear machines can be learned with or without kernels. If the problem is known to be non-linearly separable, you can still use any kernel that implements the ITransform interface to explicitly transform the input points into a kernel space and then perform linear classification in this expanded space.

  • Download the sample project containing all the source code below and the framework's NuGet packages.

Learning without kernels

In this example, we will create a linear SVM to model the binary AND function. The binary AND is an operation that takes two bit values and computes their logical conjunction: it returns zero whenever at least one of the inputs is zero, and one only when both inputs are one. This is a linearly separable problem, and as such it should be learned perfectly by a linear support vector machine.

// Create a simple binary AND
// classification problem:

double[][] problem =
{
    //             a    b    a AND b
    new double[] { 0,   0,     0    },
    new double[] { 0,   1,     0    },
    new double[] { 1,   0,     0    },
    new double[] { 1,   1,     1    },
};

// Get the two first columns as the problem
// inputs and the last column as the output

// input columns
double[][] inputs = problem.GetColumns(0, 1);

// output column
int[] outputs = problem.GetColumn(2).ToInt32();

// Plot the problem on screen
ScatterplotBox.Show("AND", inputs, outputs).Hold();

Binary AND classification problem

The AND problem is considered a linearly separable problem because we can easily draw a simple line that separates the red dots from the blue dots.
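To make this concrete, here is a small sketch in plain Python (outside the framework) showing one such line. The particular coefficients are just an illustrative choice of a separating line, not the ones the SVM will necessarily find:

```python
# One candidate separating line for AND: f(a, b) = a + b - 1.5.
# Points with f > 0 fall on the positive side of the line.
points = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [-1, -1, -1, +1]  # AND outputs mapped to -1/+1

f = lambda a, b: a + b - 1.5
predictions = [+1 if f(a, b) > 0 else -1 for a, b in points]

print(predictions)  # [-1, -1, -1, 1]
```

Any line with the same orientation that passes between the point (1, 1) and the other three points would work equally well.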

// However, SVMs expect the output value to be
// either -1 or +1. As such, we have to convert
// it so the vector contains { -1, -1, -1, +1 }:
//
outputs = outputs.Apply(x => x == 0 ? -1 : 1);


// Create a new linear-SVM for two inputs (a and b)
SupportVectorMachine svm = new SupportVectorMachine(inputs: 2);

// Create a L2-regularized L2-loss support vector classification
var teacher = new LinearDualCoordinateDescent(svm, inputs, outputs)
{
    Loss = Loss.L2,
    Complexity = 1000,
    Tolerance = 1e-5
};

// Learn the machine
double error = teacher.Run(computeError: true);

// Compute the machine's answers for the learned inputs
int[] answers = inputs.Apply(x => Math.Sign(svm.Compute(x)));

// Plot the results
ScatterplotBox.Show("SVM's answer", inputs, answers).Hold();

SVM's outputs for the AND classification problem

As we can see, the SVM's answers perfectly match the expected results for the AND function. A linear SVM is able to learn it without effort.

Learning with kernels

In this example, we will create a linear SVM to model the binary XOR function. The binary XOR is an operation that takes two bit values and computes their exclusive OR: it outputs one when the two values differ, and zero when they are the same. This is not a linearly separable problem, and as such we cannot expect it to be learned perfectly by a linear support vector machine. However, we will see that with a few extra steps we can still use kernels to transform the problem from non-linearly separable to linearly separable.

// Create a simple binary XOR
// classification problem:

double[][] problem =
{
    //             a    b    a XOR b
    new double[] { 0,   0,      0    },
    new double[] { 0,   1,      1    },
    new double[] { 1,   0,      1    },
    new double[] { 1,   1,      0    },
};

// Get the two first columns as the problem
// inputs and the last column as the output

// input columns
double[][] inputs = problem.GetColumns(0, 1);

// output column
int[] outputs = problem.GetColumn(2).ToInt32();

// Plot the problem on screen
ScatterplotBox.Show("XOR", inputs, outputs).Hold();

XOR binary classification problem

The XOR problem is considered a non-linearly separable problem because we cannot simply draw a line that separates the red dots from the blue dots.
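We can check this claim with a small sketch in plain Python (outside the framework), trying every line f(a, b) = w1·a + w2·b + b0 over a coarse grid of coefficients and verifying that none classifies all four XOR points correctly:

```python
import itertools

points = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [-1, +1, +1, -1]  # XOR outputs mapped to -1/+1

# Exhaustively try lines f(a, b) = w1*a + w2*b + b0 over a coarse
# grid of coefficients; none separates the four XOR points.
grid = [i / 2.0 for i in range(-8, 9)]  # -4.0, -3.5, ..., 4.0
separable = any(
    all((w1 * a + w2 * b + b0 > 0) == (t > 0)
        for (a, b), t in zip(points, labels))
    for w1, w2, b0 in itertools.product(grid, repeat=3)
)

print(separable)  # False
```

In fact, no coefficients work at all, not just those on the grid: adding the inequalities required by the two positive points gives w1 + w2 + 2·b0 > 0, while adding those required by the two negative points gives w1 + w2 + 2·b0 < 0, a contradiction.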

// However, SVMs expect the output value to be
// either -1 or +1. As such, we have to convert
// it so the vector contains { -1, +1, +1, -1 }:
//
outputs = outputs.Apply(x => x == 0 ? -1 : 1);


// Create a new linear-SVM for two inputs (a and b)
SupportVectorMachine svm = new SupportVectorMachine(inputs: 2);

// Create a L2-regularized L2-loss support vector classification
var teacher = new LinearDualCoordinateDescent(svm, inputs, outputs)
{
    Loss = Loss.L2,
    Complexity = 1000,
    Tolerance = 1e-5
};

// Learn the machine
double error = teacher.Run(computeError: true);

// Compute the machine's answers for the learned inputs
int[] answers = inputs.Apply(x => Math.Sign(svm.Compute(x)));

// Plot the results
ScatterplotBox.Show("SVM's answer", inputs, answers).Hold();

SVM's outputs for the XOR classification problem

As we can see, a linear SVM cannot learn the binary XOR function: the outputs the machine produces differ from what we expected. This happens because XOR is a non-linearly separable problem, and a linear support vector machine can only learn linearly separable problems. However, with the aid of kernels, we can attempt to transform this non-linearly separable problem into a linearly separable one. We will now see that, by using an explicit kernel expansion instead of the kernel trick, we do not have to give up the advantages of a linear classification model to do so.

// Use an explicit kernel expansion to transform the 
// non-linear classification problem into a linear one
//
// Create a quadratic kernel
Quadratic quadratic = new Quadratic(constant: 1);

// Project the inputs into a higher dimensionality space
double[][] expansion = inputs.Apply(quadratic.Transform);

At this point, we have transformed our initial 2-dimensional problem into a higher-dimensional classification problem. According to Cover's theorem, the probability of a problem being linearly separable increases when we project it into a higher-dimensional space through a non-linear transformation.
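To see why the expansion helps, here is a sketch in plain Python of one standard explicit feature map for the quadratic kernel k(x, y) = (x·y + c)² in two dimensions (Accord's Quadratic.Transform may order or scale the features differently, but the resulting space is equivalent). Note how the new a·b coordinate makes a linear rule for XOR possible:

```python
import math

# Explicit feature map of the quadratic kernel (x.y + c)^2 in 2D:
# phi(a, b) = [a^2, b^2, sqrt(2)ab, sqrt(2c)a, sqrt(2c)b, c]
def quadratic_expand(a, b, c=1.0):
    return [a * a, b * b, math.sqrt(2) * a * b,
            math.sqrt(2 * c) * a, math.sqrt(2 * c) * b, c]

points = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [-1, +1, +1, -1]  # XOR outputs mapped to -1/+1

# In the expanded space a single linear rule separates XOR; these
# weights implement f(a, b) = a + b - 2ab - 0.5 on the new axes.
sqrt2 = math.sqrt(2)
w = [0, 0, -2 / sqrt2, 1 / sqrt2, 1 / sqrt2, 0]
b0 = -0.5

preds = [+1 if sum(wi * zi for wi, zi in zip(w, quadratic_expand(a, b))) + b0 > 0
         else -1 for a, b in points]
print(preds)  # [-1, 1, 1, -1]
```

The dot product of two expanded vectors reproduces the kernel value exactly, which is what makes a linear machine in the expanded space equivalent to a quadratic-kernel machine in the original space.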

Now, all we have to do is learn a linear SVM that operates in this expanded, higher-dimensional space.

// Create a new linear-SVM for the transformed input space
svm = new SupportVectorMachine(inputs: expansion[0].Length);

// Create the same learning algorithm in the expanded input space
teacher = new LinearDualCoordinateDescent(svm, expansion, outputs)
{
    Loss = Loss.L2,
    Complexity = 1000,
    Tolerance = 1e-5
};

// Learn the machine
error = teacher.Run(computeError: true); 

// Compute the machine's answers for the learned inputs
answers = expansion.Apply(x => Math.Sign(svm.Compute(x)));

// Plot the results
ScatterplotBox.Show("SVM's answer", inputs, answers).Hold();

SVM's outputs for the XOR classification problem after the non-linear transformation

As we can see, after we apply a higher-dimensional transformation to our original input domain, the problem becomes linearly separable. As such, we can learn a linear machine that draws a hyperplane separating the blue dots from the red dots, even though they were not linearly separable in the original input space.

Learning a LIBLINEAR/LIBSVM problem

In this example, we will use the framework's Accord.IO module to create a sparse data reader able to read data written in LIBSVM's sparse sample format. The dataset chosen for this example is available at LibSVM's website.
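For reference, each line of that format holds a label followed by index:value pairs for the non-zero features only, with 1-based indices. The framework's SparseReader handles the parsing for you; a minimal sketch of what it does, in plain Python:

```python
# Parse one line of LIBSVM sparse format: "<label> <index>:<value> ..."
# Indices are 1-based; features not listed are implicitly zero.
def parse_libsvm_line(line, dimensions):
    parts = line.split()
    label = int(float(parts[0]))
    x = [0.0] * dimensions
    for item in parts[1:]:
        idx, val = item.split(":")
        x[int(idx) - 1] = float(val)
    return label, x

label, x = parse_libsvm_line("2 1:0.5 3:-1", 4)
print(label, x)  # 2 [0.5, 0.0, -1.0, 0.0]
```

Reading into a dense array, as `ReadToEnd(sparse: false, ...)` does below, simply fills in those implicit zeros.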

// Create a new LibSVM sparse format data reader
// to read the Wisconsin's Breast Cancer dataset
//
var reader = new SparseReader("breast-cancer_scale.txt");

int[] outputs; // Read the classification problem into dense memory
double[][] inputs = reader.ReadToEnd(sparse: false, labels: out outputs);

// The dataset has output labels as 4 and 2. We have to convert them
// into negative and positive labels so they can be properly processed.
//
outputs = outputs.Apply(x => x == 2 ? -1 : +1);

// Create a new linear-SVM for the problem dimensions
var svm = new SupportVectorMachine(inputs: reader.Dimensions);

// Create a learning algorithm for the problem's dimensions
var teacher = new LinearDualCoordinateDescent(svm, inputs, outputs)
{
    Loss = Loss.L2,
    Complexity = 1000,
    Tolerance = 1e-5
};

// Learn the classification
double error = teacher.Run();

// Compute the machine's answers for the learned inputs
int[] answers = inputs.Apply(x => Math.Sign(svm.Compute(x)));

// Create a confusion matrix to show the machine's performance
var m = new ConfusionMatrix(predicted: answers, expected: outputs);

// Show it onscreen
DataGridBox.Show(new ConfusionMatrixView(m));

SVM's confusion matrix for Wisconsin's Breast Cancer dataset