Marc Claesen edited this page Oct 11, 2013 · 19 revisions

Welcome to the EnsembleSVM wiki! This wiki is intended as a programmer's guide: we will try to introduce you to the main elements of the framework to get you started with prototyping your own ensemble algorithms. If you are looking for example use cases or a manual of the tools provided with EnsembleSVM, please visit the EnsembleSVM website.

The key class in this library is SVMEnsemble, a binary ensemble classifier using SVM base models. An ensemble outputs a vector of decision values: one per base model. The base model type is SVMModel, which works like LIBSVM's svm_model with some C++ sprinkled on top.

The framework provides a great deal of flexibility in aggregating the decision values yielded by ensembles. Several predefined schemes are included, which are documented on the MultistagePipe page. That page also contains a straightforward guide to implementing new aggregation pipelines.
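To make the idea of aggregating per-model decision values concrete, here is a minimal sketch of majority voting over a plain `std::vector<double>`. The function names `positive_vote_fraction` and `majority_vote` are hypothetical and are not part of EnsembleSVM. Majority voting is used only as a well-known aggregation rule, and the sketch does not claim to mirror how MultistagePipe implements its schemes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of EnsembleSVM: the fraction of base models
// whose decision value is strictly positive.
double positive_vote_fraction(const std::vector<double>& decision_values) {
    assert(!decision_values.empty());
    std::size_t positives = 0;
    for (double value : decision_values) {
        if (value > 0.0) ++positives;
    }
    return static_cast<double>(positives) / decision_values.size();
}

// Majority vote: predict the positive class when more than half of the
// base models vote positive.
bool majority_vote(const std::vector<double>& decision_values) {
    return positive_vote_fraction(decision_values) > 0.5;
}
```

With three base models producing `{0.7, -0.2, 1.3}`, two of the three vote positive, so `majority_vote` returns `true`.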

Facilities provided by EnsembleSVM

ThreadPool: built using C++11's standard threading facilities. It allows simple parallel execution of a predefined function with variable arguments. EnsembleSVM uses this thread pool in the esvm-train and esvm-predict tools, which perform embarrassingly parallel tasks.
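The general technique can be sketched with nothing but the C++11 standard library. The class below is an illustration under assumptions: the name `SimplePool`, its `submit` method and the helper `parallel_squares` are all hypothetical, and the interface of EnsembleSVM's own ThreadPool may well differ.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical minimal pool that runs one fixed function with varying
// arguments. It does not mirror EnsembleSVM's ThreadPool interface.
template <typename Result, typename... Args>
class SimplePool {
public:
    SimplePool(std::function<Result(Args...)> fun, std::size_t num_threads)
        : fun_(std::move(fun)), stop_(false) {
        assert(num_threads > 0);
        for (std::size_t i = 0; i < num_threads; ++i) {
            workers_.emplace_back([this] { work(); });
        }
    }

    // Lets the workers drain the queue, then joins them.
    ~SimplePool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stop_ = true;
        }
        cv_.notify_all();
        for (std::thread& worker : workers_) worker.join();
    }

    // Schedules fun(args...) and returns a future for its result.
    std::future<Result> submit(Args... args) {
        auto task = std::make_shared<std::packaged_task<Result()>>(
            std::bind(fun_, std::move(args)...));
        std::future<Result> result = task->get_future();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push([task] { (*task)(); });
        }
        cv_.notify_one();
        return result;
    }

private:
    // Worker loop: wait for a job, run it outside the lock, repeat.
    void work() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stop_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // stop requested and queue drained
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();
        }
    }

    std::function<Result(Args...)> fun_;
    bool stop_;
    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> jobs_;
    std::vector<std::thread> workers_;
};

// Usage sketch: square the integers 0..n-1 in parallel, keeping input order.
std::vector<int> parallel_squares(int n, std::size_t num_threads) {
    SimplePool<int, int> pool([](int x) { return x * x; }, num_threads);
    std::vector<std::future<int>> futures;
    for (int i = 0; i < n; ++i) futures.push_back(pool.submit(i));
    std::vector<int> results;
    for (std::future<int>& f : futures) results.push_back(f.get());
    return results;
}
```

Because every job is independent of the others, no synchronization is needed beyond the job queue itself, which is what makes embarrassingly parallel tasks such a natural fit for this pattern. On some platforms the program must be linked against the threading library (for example with `-pthread`).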

SelectiveFactory: a meta-factory which delegates construction of derived objects to the appropriate factory based on some criterion. We use it mainly for deserialization, particularly to provide automated, transparent deserialization of the internal Pipeline classes.
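The meta-factory idea can be sketched as a registry that reads a type tag from a stream and hands the rest of the stream to whichever factory was registered for that tag. Everything in this sketch is hypothetical: `TagDispatchFactory`, the `Shape` hierarchy and the text format are illustrative assumptions, and the criterion and interface used by EnsembleSVM's SelectiveFactory may be different.

```cpp
#include <cassert>
#include <functional>
#include <istream>
#include <map>
#include <memory>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical class hierarchy, used only to have something to deserialize.
struct Shape {
    virtual ~Shape() {}
    virtual double area() const = 0;
};

struct Square : Shape {
    double side;
    explicit Square(double s) : side(s) {}
    double area() const override { return side * side; }
};

struct Rect : Shape {
    double width, height;
    Rect(double w, double h) : width(w), height(h) {}
    double area() const override { return width * height; }
};

// Meta-factory: reads a type tag and delegates construction to the factory
// registered for that tag.
template <typename Base>
class TagDispatchFactory {
public:
    typedef std::function<std::unique_ptr<Base>(std::istream&)> Factory;

    void add(const std::string& tag, Factory factory) {
        factories_[tag] = std::move(factory);
    }

    std::unique_ptr<Base> read(std::istream& is) const {
        std::string tag;
        if (!(is >> tag)) throw std::runtime_error("missing type tag");
        auto it = factories_.find(tag);
        if (it == factories_.end()) {
            throw std::runtime_error("unknown type tag: " + tag);
        }
        return it->second(is);
    }

private:
    std::map<std::string, Factory> factories_;
};

// Registers one factory per derived type.
TagDispatchFactory<Shape> make_shape_factory() {
    TagDispatchFactory<Shape> factory;
    factory.add("square", [](std::istream& is) -> std::unique_ptr<Shape> {
        double side = 0.0;
        is >> side;
        return std::unique_ptr<Shape>(new Square(side));
    });
    factory.add("rect", [](std::istream& is) -> std::unique_ptr<Shape> {
        double width = 0.0, height = 0.0;
        is >> width >> height;
        return std::unique_ptr<Shape>(new Rect(width, height));
    });
    return factory;
}

// Usage sketch: the caller never names the concrete type.
double area_from_text(const std::string& text) {
    std::istringstream is(text);
    return make_shape_factory().read(is)->area();
}
```

The caller only ever deals with the base type, which is what makes the deserialization transparent: supporting a new derived class means registering one more factory, with no change to the reading code.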

Pipeline: processing pipelines which model highly generic data analysis workflows. In brief, a Pipeline is used to process a generic input a into a generic output b. Neither the dimensions of a and b nor their types need to match.
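A rough way to picture an input and output that differ in both type and dimension is to chain typed stages. The sketch below is an assumption-laden illustration, not the library's Pipeline classes: `Stage`, `chain`, `average`, `is_positive` and `make_example_pipeline` are hypothetical names built on `std::function`.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stage type: any callable that maps an In to an Out.
template <typename In, typename Out>
using Stage = std::function<Out(In)>;

// Chains two stages. The output type of the first stage must equal the
// input type of the second, which the compiler checks.
template <typename A, typename B, typename C>
Stage<A, C> chain(Stage<A, B> first, Stage<B, C> second) {
    return [first, second](A input) -> C {
        return second(first(std::move(input)));
    };
}

// Example stage: reduces a vector of values to their mean (0 if empty).
double average(const std::vector<double>& values) {
    if (values.empty()) return 0.0;
    double sum = 0.0;
    for (double v : values) sum += v;
    return sum / values.size();
}

// Example stage: thresholds a single value at zero.
bool is_positive(double value) { return value > 0.0; }

// A pipeline from a vector of doubles (input a) to a single bool (output b).
Stage<std::vector<double>, bool> make_example_pipeline() {
    Stage<std::vector<double>, double> mean_stage = average;
    Stage<double, bool> threshold_stage = is_positive;
    return chain(mean_stage, threshold_stage);
}
```

Here the input is a vector of doubles of arbitrary length and the output is a single `bool`, so both the type and the dimension change along the way.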