How to use

César Souza edited this page Oct 18, 2017 · 12 revisions

Loading data

If you plan to load data into your application (which is most likely the case), consider adding a reference to the Accord.IO library to your project. This library provides data readers for formats such as Excel, comma-separated values, MATLAB matrix files, the LibSVM and LibLinear data formats, and others. If you are interested in loading sounds into your application, consider adding a reference to the Accord.Audio.Formats library.

Alternatively, if you would like to test your algorithms against some well-known datasets or data collections, you can also add a reference to the Accord.DataSets library to your project. This library provides classes that can automatically download popular data collections from the web and convert them into formats used across the framework. For example, using dataset classes, you can effortlessly obtain the Free Spoken Digits Dataset, the Iris dataset, the Pendigits dataset, MNIST, and many others.

Tables

Excel and Excel compatible files (such as .csv)

To import a table from Excel or .csv files, you can use the ExcelReader class. To load the contents of "Sheet1" located in "worksheet.xlsx", use:

DataTable table = new ExcelReader("worksheet.xlsx").GetWorksheet("Sheet1");

To convert a DataTable to a framework matrix, use:

double[,] matrix = table.ToMatrix();

To convert a DataTable to a jagged array, use:

double[][] jagged = table.ToJagged();

Note: If you plan to read files from the latest versions of Microsoft Excel in .xls or .xlsx formats, please be sure to have the Microsoft Access Database Engine 2010 Redistributable installed; it is available from the Microsoft website. To use ACE in both 32-bit and 64-bit applications, you need to install both redistributables. To install them both, use the /passive command line switch, which prevents the installation from failing when it detects that another version of the component is already installed. After downloading the executables, run the following two commands:

C:\Users\You\Downloads\AccessDatabaseEngine.exe /passive
C:\Users\You\Downloads\AccessDatabaseEngine_x64.exe /passive

Matrices

The framework can load any MATLAB-compatible .mat file using the MatReader class. It can also parse matrices written in MATLAB, Octave, Mathematica and C#/.NET formats directly from text. Examples are shown below.

// From free-text format
double[,] a = Matrix.Parse(@"1 2
                             3 4");

// From MATLAB/Octave format
double[,] b = Matrix.Parse("[1 2; 3 4]", OctaveMatrixFormatProvider.InvariantCulture);

// From C# multi-dimensional array format
string str = @"double[,] matrix = 
               {
                  { 1, 2 },
                  { 3, 4 },
                  { 5, 6 },
               }";

double[,] c = Matrix.Parse(str, CSharpMatrixFormatProvider.InvariantCulture);

Images

Images can be loaded in the standard .NET Framework way. However, you might be interested in converting images to matrices and vice versa; in this case, you can use the classes in the Accord.Imaging.Converters namespace.

Otherwise, if you would like to apply image processing filters to standard images such as Lena Söderberg's picture, you can use the TestImages dataset.

Sounds

Sounds can be loaded from files or recorded on-the-fly using a capture device. For examples of how to record audio on-the-fly, please refer to the audio recording, beat detection and FFT sample applications.

If you are looking for a quick way to load audio samples into your application, please refer to the Signal.FromFile method.

If you would like to use an audio database in your machine learning applications, please see the FreeSpokenDigitsDataset documentation page.

Video

Video capturing is done using AForge.NET.

Manipulating matrices

The framework provides matrix manipulation routines through extension methods. Just import the Accord.Math namespace into your source file and all common .NET datatypes will be extended with several extension methods related to mathematics. Please see the Mathematics page for more examples and details.

[Figure: Matrix operations in the Accord.NET Framework through extension methods.]

One common task in matrix manipulation is to decompose a matrix into various forms. Some examples of the decompositions supported by the framework are listed below. Those decompositions can be used to solve linear systems, compute matrix inverses and pseudo-inverses and extract other useful information about data.

| Decomposition | Multidimensional | Jagged |
| --- | --- | --- |
| Cholesky | (double) (float) (decimal) | (double) (float) (decimal) |
| Eigenvalue (EVD) | (double) (float) | |
| Generalized Eigenvalue [1] | (double) | |
| Nonnegative Factorization | (double) | |
| LU | (double) (float) (decimal) | (double) (float) (decimal) |
| QR | (double) (float) (decimal) | |
| Singular value (SVD) | (double) (float) | |
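
As a hedged illustration of how a decomposition can be used to solve a linear system and to compute an inverse, the sketch below wraps the SingularValueDecomposition class from the Accord.Math.Decompositions namespace. The DecompositionSketch class and its method names are hypothetical, and the Solve and Inverse members are assumed to behave as described in the framework's API reference, so please check them against the version you have installed.

```csharp
using Accord.Math.Decompositions;

// Hypothetical helper, not part of the framework.
public static class DecompositionSketch
{
    // Solves A * x = b through the singular value decomposition of A.
    public static double[] SolveWithSvd(double[,] a, double[] b)
    {
        var svd = new SingularValueDecomposition(a);
        return svd.Solve(b);
    }

    // Computes the inverse (or pseudo-inverse) of A from its decomposition.
    public static double[,] InverseWithSvd(double[,] a)
    {
        var svd = new SingularValueDecomposition(a);
        return svd.Inverse();
    }
}
```

For instance, with A = { { 4, 1 }, { 2, 3 } } and b = { 1, 2 }, SolveWithSvd should return values close to x = { 0.1, 0.6 }, since 4(0.1) + 0.6 = 1 and 2(0.1) + 3(0.6) = 2.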

Data preprocessing

Before attempting to train a machine learning model, it is good practice to preprocess, normalize and clean your data. One of the simplest ways to normalize data is to transform it to Z-scores, that is, to subtract the mean of each column and divide by its standard deviation. To transform your data to Z-scores, you can use the following method:

double[][] scores = Accord.Statistics.Tools.ZScores(inputs);

If you would only like to subtract the mean from your data, you can use the Center method:

double[][] centered = Accord.Statistics.Tools.Center(inputs);

And to divide by the standard deviation, you can use the Standardize method:

double[][] standard = Accord.Statistics.Tools.Standardize(inputs);
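
To make the arithmetic behind these transformations concrete, here is a minimal sketch in plain C# that does not use the framework at all. The ZScoreSketch class is hypothetical, and the choice of the sample standard deviation (dividing by n - 1) is an assumption that may or may not match the framework's default.

```csharp
using System;

// Hypothetical helper, not part of the framework.
public static class ZScoreSketch
{
    // Returns a copy of 'data' in which every column has zero mean and
    // unit sample standard deviation (n - 1 in the denominator).
    public static double[][] ZScores(double[][] data)
    {
        int rows = data.Length;
        int cols = data[0].Length;

        var mean = new double[cols];
        var std = new double[cols];

        // Column means
        for (int j = 0; j < cols; j++)
        {
            for (int i = 0; i < rows; i++)
                mean[j] += data[i][j];
            mean[j] /= rows;
        }

        // Column sample standard deviations
        for (int j = 0; j < cols; j++)
        {
            double sum = 0;
            for (int i = 0; i < rows; i++)
            {
                double d = data[i][j] - mean[j];
                sum += d * d;
            }
            std[j] = Math.Sqrt(sum / (rows - 1));
        }

        // Center (subtract the mean), then standardize (divide by the deviation)
        var result = new double[rows][];
        for (int i = 0; i < rows; i++)
        {
            result[i] = new double[cols];
            for (int j = 0; j < cols; j++)
                result[i][j] = (data[i][j] - mean[j]) / std[j];
        }

        return result;
    }
}
```

The sketch assumes at least two rows and no constant columns; a constant column has zero deviation and would produce NaN values.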

Learning from input and output pairs

The framework adopts an interface similar to Python's Scikit-learn package. If you would like to learn a new classifier or regression model that is able to map a set of given inputs to a set of corresponding outputs, first identify the algorithm that you would like to use (SVMs are a good initial choice) and then create it:

// As an example, we will try to learn a decision machine 
// that can replicate the "exclusive-or" logical function:

double[][] inputs =
{
    new double[] { 0, 0 }, // the XOR function takes two booleans
    new double[] { 0, 1 }, // and computes their exclusive or: the
    new double[] { 1, 0 }, // output is true only if the two booleans
    new double[] { 1, 1 }  // are different
};

int[] xor = // this is the output of the xor function
{
    0, // 0 xor 0 = 0 (inputs are equal)
    1, // 0 xor 1 = 1 (inputs are different)
    1, // 1 xor 0 = 1 (inputs are different)
    0, // 1 xor 1 = 0 (inputs are equal)
};

// Now, we can create the sequential minimal optimization teacher
var learn = new SequentialMinimalOptimization<Gaussian>()
{
    UseComplexityHeuristic = true,
    UseKernelEstimation = true
};

And then you will be able to call the universal .Learn() method which is common for all learning algorithms:

// And then we can obtain a trained SVM by calling its Learn method
SupportVectorMachine<Gaussian> svm = learn.Learn(inputs, xor);

The .Learn() method uses the learning algorithm you have chosen to create a new machine learning model. In the case of SequentialMinimalOptimization, this creates a new SupportVectorMachine object. The nice thing about the framework is that you do not even need to know how an SVM works to be able to use it (although this would be highly advisable if you would like to use it for something other than a toy example). All classification models in the framework implement the .Decide() method, which can be used to obtain class predictions for any new data you would like to present to the model:

// Finally, we can obtain the decisions predicted by the machine:
bool[] prediction = svm.Decide(inputs);

Different models can predict different kinds of data. If you need to predict bool[] vectors, then Support Vector Machines should be your first choice. If you need to predict int[] class labels, then you are invited to take a look at multi-class and multi-label Support Vector Machines.
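
For int[] class labels, a minimal sketch might look like the following. It assumes that the MulticlassSupportVectorLearning class and its Learner property behave as described in the framework's API reference; the MulticlassSketch wrapper and its method name are hypothetical.

```csharp
using Accord.MachineLearning.VectorMachines;
using Accord.MachineLearning.VectorMachines.Learning;
using Accord.Statistics.Kernels;

// Hypothetical helper, not part of the framework.
public static class MulticlassSketch
{
    // Trains a multi-class SVM on (inputs, outputs) and returns the class
    // labels that the trained machine predicts for the same inputs.
    public static int[] LearnAndDecide(double[][] inputs, int[] outputs)
    {
        var teacher = new MulticlassSupportVectorLearning<Gaussian>()
        {
            // Each binary sub-problem is solved with the same SMO
            // algorithm used in the XOR example
            Learner = (param) => new SequentialMinimalOptimization<Gaussian>()
            {
                UseKernelEstimation = true
            }
        };

        MulticlassSupportVectorMachine<Gaussian> machine = teacher.Learn(inputs, outputs);

        return machine.Decide(inputs);
    }
}
```

The class labels in outputs are assumed to be integers starting at zero (0, 1, 2, ...).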

If you would prefer ready-made examples that you can simply download, run by pressing F5, and watch solve some classification problems, please refer to the Classification (SVMs), Classification (Naive Bayes), Classification (Decision Trees) (https://github.com/accord-net/framework/wiki/Sample-applications#classification-decision-trees), or Handwriting Recognition with SVMs sample applications to get up and running in no time.

More examples for classification problems are also given in the Classification page here in the wiki.

Finding similarity groups in data

Let's say that you would like to separate your data into different groups. It looks like you are trying to build a classifier for your data, except that you do not really know in advance the real class labels for each of the data groups you are trying to identify. In those cases, we need to use a machine learning algorithm that is able to learn the different classes for the data in an unsupervised manner, i.e. without being told which classes or features it should look for.

Unsupervised classification algorithms are also known as clustering algorithms. Clustering algorithms receive input data from your classification problem, but try to make as few assumptions as possible about its expected output. If you know in advance how many data groups should be present in your data, you might want to take a look at clustering algorithms such as K-Means, K-Medoids or Binary-Split.
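
As a minimal sketch of the case where the number of groups is known, the code below clusters observations with the KMeans class from the Accord.MachineLearning namespace. It assumes the Learn and Decide members follow the same pattern as the supervised example above; the KMeansSketch wrapper is hypothetical.

```csharp
using Accord.MachineLearning;

// Hypothetical helper, not part of the framework.
public static class KMeansSketch
{
    // Splits the observations into k groups and returns, for each
    // observation, the index of the cluster it was assigned to.
    public static int[] Cluster(double[][] observations, int k)
    {
        // Create the algorithm with the number of clusters known in advance
        var kmeans = new KMeans(k);

        // Learn the cluster centroids from the data (no labels are given)
        KMeansClusterCollection clusters = kmeans.Learn(observations);

        // Assign each observation to its nearest centroid
        return clusters.Decide(observations);
    }
}
```

Because K-Means starts from randomly chosen centroids, the numbering of the resulting clusters can differ between runs.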

In situations where you do not know in advance how many classes should be expected from your data, but you have some idea about how spread out your data clusters should be, you might want to take a look at the MeanShift class.
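
A hedged sketch of this second case is shown below. It assumes that MeanShift exposes Kernel and Bandwidth properties and that UniformKernel lives in the Accord.Statistics.Distributions.DensityKernels namespace, as described in the framework's API reference; the MeanShiftSketch wrapper is hypothetical.

```csharp
using Accord.MachineLearning;
using Accord.Statistics.Distributions.DensityKernels;

// Hypothetical helper, not part of the framework.
public static class MeanShiftSketch
{
    // Groups the observations without fixing the number of clusters.
    // The bandwidth expresses how spread out each cluster is expected to be.
    public static int[] Cluster(double[][] observations, double bandwidth)
    {
        var meanShift = new MeanShift()
        {
            Kernel = new UniformKernel(), // uniform density kernel
            Bandwidth = bandwidth
        };

        // Learn a partitioning of the data; the number of clusters is
        // discovered by the algorithm rather than given by the caller
        MeanShiftClusterCollection clustering = meanShift.Learn(observations);

        // Predict a group label for each observation
        return clustering.Decide(observations);
    }
}
```

A larger bandwidth tends to merge nearby groups, while a smaller one tends to split the data into more clusters.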

For more details and examples of clustering algorithms, please see the Clustering page in this wiki.

Measuring performance and visualizing your results

Once you have learned your model using the .Learn() method and obtained its predictions for your input data using the .Decide() method, you might want to be able to check how far those predictions were from the values you were expecting. For supervised classification algorithms such as SVMs, NaiveBayes or DecisionTrees, you can use
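
One simple check, sketched here in plain C# without relying on any framework class, is to count how many predictions match the expected labels and to tally the four cells of a binary confusion matrix. The EvaluationSketch helper and its method names are hypothetical and only illustrate the idea.

```csharp
using System;

// Hypothetical helper, not part of the framework.
public static class EvaluationSketch
{
    // Fraction of predictions that match the expected labels.
    public static double Accuracy(bool[] expected, bool[] predicted)
    {
        CheckLengths(expected, predicted);

        int correct = 0;
        for (int i = 0; i < expected.Length; i++)
            if (expected[i] == predicted[i])
                correct++;

        return (double)correct / expected.Length;
    }

    // Counts of a binary confusion matrix, returned in the order
    // { true positives, false negatives, false positives, true negatives }.
    public static int[] ConfusionCounts(bool[] expected, bool[] predicted)
    {
        CheckLengths(expected, predicted);

        int tp = 0, fn = 0, fp = 0, tn = 0;
        for (int i = 0; i < expected.Length; i++)
        {
            if (expected[i] && predicted[i]) tp++;
            else if (expected[i] && !predicted[i]) fn++;
            else if (!expected[i] && predicted[i]) fp++;
            else tn++;
        }

        return new[] { tp, fn, fp, tn };
    }

    private static void CheckLengths(bool[] expected, bool[] predicted)
    {
        if (expected.Length != predicted.Length)
            throw new ArgumentException("Both arrays must have the same length.");
        if (expected.Length == 0)
            throw new ArgumentException("At least one sample is required.");
    }
}
```

With the XOR example, the expected values would be the xor array converted to bool[] and the predicted values would be the output of svm.Decide(inputs).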

Chart controls

Data-binding

Some framework objects can be data-bound to WPF or Windows Forms controls. Examples are all the statistical analysis classes (PCA, LDA, PLS, KPCA, ...), statistical distributions and hypothesis tests.

Specifying statistical distributions

Testing your hypothesis

Persisting models to disk
