# Linear support vector machines

Despite the name, linear support vector machines can be learned with or without the use of kernels. If the problem is known to be non-linearly separable, you can still use any kernel that implements the ITransform interface to explicitly transform the input points into a kernel space, and then perform linear classification in this expanded space.

## Learning without kernels

In this example, we will create a linear SVM to model a binary AND function. The binary AND is an operation that takes two bit values and computes their logical conjunction: it returns zero whenever at least one of the inputs is zero, and one when both inputs are one. This is a linearly separable problem, and as such, should be perfectly learned by a linear support vector machine.

```
// Create a simple binary AND
// classification problem:

double[][] problem =
{
    //             a    b    a AND b
    new double[] { 0,   0,      0    },
    new double[] { 0,   1,      0    },
    new double[] { 1,   0,      0    },
    new double[] { 1,   1,      1    },
};

// Get the first two columns as the problem
// inputs and the last column as the output

// input columns
double[][] inputs = problem.GetColumns(0, 1);

// output column
int[] outputs = problem.GetColumn(2).ToInt32();

// Plot the problem on screen
ScatterplotBox.Show("AND", inputs, outputs).Hold();
```

The AND problem is considered a linearly separable problem because we can easily draw a simple line that separates the red dots from the blue dots.
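To make this concrete, here is a self-contained sketch, independent of the framework, using one hypothetical separating line, a + b = 1.5 (one of infinitely many that work): points above the line are classified as one, points below as zero.

```csharp
using System;

class AndSeparability
{
    // A candidate separating line: classify as 1 when a + b > 1.5
    public static int Classify(double a, double b) => (a + b > 1.5) ? 1 : 0;

    static void Main()
    {
        double[][] points = { new[] { 0.0, 0.0 }, new[] { 0.0, 1.0 },
                              new[] { 1.0, 0.0 }, new[] { 1.0, 1.0 } };
        int[] expected = { 0, 0, 0, 1 }; // the AND truth table

        // The line reproduces the AND labels for all four points
        for (int i = 0; i < points.Length; i++)
            Console.WriteLine("({0}, {1}) -> {2} (expected {3})",
                points[i][0], points[i][1],
                Classify(points[i][0], points[i][1]), expected[i]);
    }
}
```

The SVM learning algorithm below will find such a line automatically, typically the one with the largest margin between the two classes.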

```
// Create a L2-regularized L2-loss support vector classification algorithm
var teacher = new LinearDualCoordinateDescent()
{
    Loss = Loss.L2,
    Complexity = 1000,
    Tolerance = 1e-5
};

// Use the algorithm to learn the machine
var svm = teacher.Learn(inputs, outputs);

// Compute the machine's answers for the learned inputs
bool[] answers = svm.Decide(inputs);

// Convert to Int32 so we can plot:
int[] zeroOneAnswers = answers.ToZeroOne();

// Plot the results
ScatterplotBox.Show("SVM's answer", inputs, zeroOneAnswers).Hold();
```

As expected, the SVM's answers perfectly match the results of the AND function: a linear SVM learns this linearly separable problem without effort.

## Learning with kernels

In this example, we will create a linear SVM to model a binary XOR function. The binary XOR is an operation that takes two bit values and computes their exclusive OR: it outputs one when the two values differ, and zero when they are the same. This is not a linearly separable problem, and as such, we cannot expect it to be perfectly learned by a linear support vector machine. However, we will see that, with a few extra steps, we can still use kernels to transform this problem from non-linearly separable into linearly separable.

```
// Create a simple binary XOR
// classification problem:

double[][] problem =
{
    //             a    b    a XOR b
    new double[] { 0,   0,      0    },
    new double[] { 0,   1,      1    },
    new double[] { 1,   0,      1    },
    new double[] { 1,   1,      0    },
};

// Get the first two columns as the problem
// inputs and the last column as the output

// input columns
double[][] inputs = problem.GetColumns(0, 1);

// output column
int[] outputs = problem.GetColumn(2).ToInt32();

// Plot the problem on screen
ScatterplotBox.Show("XOR", inputs, outputs).Hold();
```

The XOR problem is considered a non-linearly separable problem because we cannot simply draw a line that separates the red dots from the blue dots.
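This claim can even be checked mechanically. The sketch below (a self-contained illustration, not part of the framework) brute-forces a coarse grid of candidate lines w1·a + w2·b + bias = 0 and verifies that none of them reproduces the XOR labels:

```csharp
using System;

class XorSeparability
{
    // Returns true if the decision rule w1*a + w2*b + bias > 0
    // reproduces the XOR labels on all four points.
    public static bool Separates(double w1, double w2, double bias)
    {
        double[][] points = { new[] { 0.0, 0.0 }, new[] { 0.0, 1.0 },
                              new[] { 1.0, 0.0 }, new[] { 1.0, 1.0 } };
        bool[] labels = { false, true, true, false }; // the XOR truth table

        for (int i = 0; i < points.Length; i++)
            if ((w1 * points[i][0] + w2 * points[i][1] + bias > 0) != labels[i])
                return false;
        return true;
    }

    static void Main()
    {
        // Brute-force search over a coarse grid of candidate lines
        bool found = false;
        for (double w1 = -2; w1 <= 2; w1 += 0.1)
            for (double w2 = -2; w2 <= 2; w2 += 0.1)
                for (double bias = -2; bias <= 2; bias += 0.1)
                    if (Separates(w1, w2, bias))
                        found = true;

        Console.WriteLine(found
            ? "Found a separating line (unexpected!)"
            : "No separating line found, as the theory predicts");
    }
}
```

The grid search is only illustrative; a short algebraic argument shows that no line at all can separate XOR, which is why the linear SVM below must fail.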

```
// Create a L2-regularized L2-loss support vector classification algorithm
var teacher = new LinearDualCoordinateDescent()
{
    Loss = Loss.L2,
    Complexity = 1000,
    Tolerance = 1e-5
};

// Use the learning algorithm to learn the machine
var svm = teacher.Learn(inputs, outputs);

// Compute the machine's answers for the learned inputs
bool[] answers = svm.Decide(inputs);

// Convert to Int32 so we can plot:
int[] zeroOneAnswers = answers.ToZeroOne();

// Plot the results
ScatterplotBox.Show("SVM's answer", inputs, zeroOneAnswers).Hold();
```
As we can see, a linear SVM cannot learn the binary XOR function: the output the machine produces differs from what we expected. This happens because XOR is a non-linearly separable problem, and a linear support vector machine can only learn linearly separable ones. However, with the aid of kernels, we can attempt to transform this non-linearly separable problem into a linearly separable one. We will now see that, by using explicit kernel expansions instead of the kernel trick, we do not have to give up the advantages of a linear classification model to do so.

```
// Use an explicit kernel expansion to transform the
// non-linear classification problem into a linear one
//
// Create a quadratic kernel
Quadratic quadratic = new Quadratic(constant: 1);

// Project the inputs into a higher dimensionality space
double[][] expansion = inputs.Apply(quadratic.Transform);
```

At this point, we have transformed our initial 2-dimensional problem into a higher-dimensional classification problem. According to Cover's theorem, the probability of a problem being linearly separable increases when it is projected into a higher-dimensional space via some non-linear transformation.
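To see why the expansion helps, here is a self-contained sketch (independent of the framework's Quadratic kernel) of one explicit quadratic feature map for 2-D inputs. In the expanded space, the hypothetical hyperplane a + b - 2ab = 0.5, chosen by hand for this illustration, separates the XOR classes thanks to the cross-term ab:

```csharp
using System;

class QuadraticExpansion
{
    // One explicit quadratic feature map for a 2-D point (a, b):
    // (a^2, b^2, sqrt(2)*a*b, sqrt(2)*a, sqrt(2)*b, 1)
    public static double[] Expand(double a, double b)
    {
        double s = Math.Sqrt(2);
        return new[] { a * a, b * b, s * a * b, s * a, s * b, 1 };
    }

    static void Main()
    {
        double[][] points = { new[] { 0.0, 0.0 }, new[] { 0.0, 1.0 },
                              new[] { 1.0, 0.0 }, new[] { 1.0, 1.0 } };
        bool[] xor = { false, true, true, false };

        // A hyperplane in the expanded space, with weights on the features
        // (a^2, b^2, sqrt(2)ab, sqrt(2)a, sqrt(2)b, 1). Expressed in the
        // original variables, it reads: a + b - 2ab - 0.5 > 0.
        double s = Math.Sqrt(2);
        double[] w = { 0, 0, -2 / s, 1 / s, 1 / s, -0.5 };

        for (int i = 0; i < points.Length; i++)
        {
            double[] phi = Expand(points[i][0], points[i][1]);
            double sum = 0;
            for (int j = 0; j < w.Length; j++)
                sum += w[j] * phi[j];
            Console.WriteLine("({0}, {1}) -> {2} (expected {3})",
                points[i][0], points[i][1], sum > 0, xor[i]);
        }
    }
}
```

A hyperplane that is linear in the expanded features is quadratic in the original inputs, which is exactly what a curved decision boundary requires.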

Now, all we have to do is to learn a linear SVM that can operate in this expanded higher-dimensionality space.

```
// Create the same learning algorithm in the expanded input space
teacher = new LinearDualCoordinateDescent()
{
    Loss = Loss.L2,
    Complexity = 1000,
    Tolerance = 1e-5
};

// Use the learning algorithm to learn the machine
svm = teacher.Learn(expansion, outputs);

// Compute the machine's answers for the learned inputs
bool[] answers = svm.Decide(expansion);

// Convert to Int32 so we can plot:
int[] zeroOneAnswers = answers.ToZeroOne();

// Plot the results
ScatterplotBox.Show("SVM's answer", inputs, zeroOneAnswers).Hold();
```

As we can see, after we apply a higher-dimensional transformation to our original input domain, the problem becomes linearly separable. As such, we can learn a linear machine that draws a hyperplane separating the blue dots from the red dots, even though they were not linearly separable in the original input space.

## Learning a LIBLINEAR/LIBSVM problem

In this example, we will use the framework's Accord.IO module to create a sparse data reader that is able to learn data written in LIBSVM's sparse sample data format. The dataset chosen for this example is available at LibSVM's website.
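In LIBSVM's sparse format, each line holds one sample: the class label comes first, followed by index:value pairs for the non-zero features only (feature indices start at 1). The lines below are illustrative samples in that format, not taken from the actual dataset:

```
2 1:5 3:1 4:1 7:3 9:1
4 1:8 2:10 3:10 5:7 6:10 8:7
```

Features that are absent from a line are implicitly zero, which is what makes the format efficient for high-dimensional, mostly-empty data.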

```
// Create a new LibSVM sparse format data reader
// to read the Wisconsin's Breast Cancer dataset
// (the file path below is an example placeholder)
var reader = new SparseReader("breast-cancer.libsvm");

// Read the sparse inputs and outputs from the file
var results = reader.ReadSparseToEnd();
Sparse<double>[] inputs = results.Item1;
double[] doubleOutputs = results.Item2;

// The dataset has output labels as 4 and 2. We have to convert them
// into negative and positive labels so they can be properly processed.
//
bool[] outputs = doubleOutputs.Apply(x => x == 2.0 ? false : true);

// Create a learning algorithm for Sparse data. The first generic argument
// of the learning algorithm below is the chosen kernel function, and the
// second is the type of inputs the machine should accept. Note that, using
// those interfaces, it is possible to define custom kernel functions that
// operate directly on double[], string[], graphs, trees or any object:
var teacher = new LinearDualCoordinateDescent<Linear, Sparse<double>>()
{
    Loss = Loss.L2,
    Complexity = 1000, // Create a hard-margin SVM
    Tolerance = 1e-5
};

// Use the learning algorithm to learn the machine
var svm = teacher.Learn(inputs, outputs);

// Compute the machine's answers for the learned inputs
bool[] answers = svm.Decide(inputs);

// Create a confusion matrix to show the machine's performance
var m = new ConfusionMatrix(predicted: answers, expected: outputs);

// Show it onscreen
DataGridBox.Show(new ConfusionMatrixView(m)).Hold();
```
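As a plain sketch of what the confusion matrix summarizes (independent of the framework's ConfusionMatrix class, and using made-up labels rather than the breast-cancer results), the four cells can be tallied directly from the predicted and expected labels:

```csharp
using System;

class ConfusionCounts
{
    // Tally the confusion-matrix cells { TP, TN, FP, FN }
    // from predicted vs. expected boolean labels
    public static int[] Tally(bool[] predicted, bool[] expected)
    {
        int tp = 0, tn = 0, fp = 0, fn = 0;
        for (int i = 0; i < predicted.Length; i++)
        {
            if (predicted[i] && expected[i]) tp++;
            else if (!predicted[i] && !expected[i]) tn++;
            else if (predicted[i] && !expected[i]) fp++;
            else fn++;
        }
        return new[] { tp, tn, fp, fn };
    }

    static void Main()
    {
        // A tiny hypothetical example
        bool[] predicted = { true, true, false, false, true };
        bool[] expected  = { true, false, false, false, true };

        int[] c = Tally(predicted, expected);
        Console.WriteLine("TP={0} TN={1} FP={2} FN={3}", c[0], c[1], c[2], c[3]);
        Console.WriteLine("Accuracy = {0}", (c[0] + c[1]) / (double)predicted.Length);
    }
}
```

Accuracy is simply (TP + TN) divided by the total number of samples; the off-diagonal cells FP and FN show which kind of mistakes the machine makes.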

Help improve this wiki! These pages can be edited by anyone who would like to contribute examples and documentation to the framework.

Have you found this software useful? Consider donating US$ 10 so it can get even better! This software is completely free and will always stay free. Enjoy!
