deleting and testing #4092

Closed
7 changes: 2 additions & 5 deletions applications/classification/random_fourier_classification.cpp
@@ -1,10 +1,7 @@
 /*
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 3 of the License, or
- * (at your option) any later version.
+ * This software is distributed under BSD 3-clause license (see LICENSE file).
  *
- * Written (W) 2013 Evangelos Anagnostopoulos
+ * Authors: Björn Esser, Evangelos Anagnostopoulos
  */
 #include <shogun/base/init.h>
 #include <shogun/features/RandomFourierDotFeatures.h>
8 changes: 2 additions & 6 deletions benchmarks/hasheddoc_benchmarks.cpp
@@ -1,11 +1,7 @@
 /*
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 3 of the License, or
- * (at your option) any later version.
+ * This software is distributed under BSD 3-clause license (see LICENSE file).
  *
- * Written (W) 2013 Evangelos Anagnostopoulos
- * Copyright (C) 2013 Evangelos Anagnostopoulos
+ * Authors: Evangelos Anagnostopoulos
  */
 
 #include <shogun/base/init.h>
7 changes: 2 additions & 5 deletions benchmarks/sparse_test.cpp
@@ -1,10 +1,7 @@
 /*
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 3 of the License, or
- * (at your option) any later version.
+ * This software is distributed under BSD 3-clause license (see LICENSE file).
  *
- * Written (W) 2013 Soumyajit De
+ * Authors: Soeren Sonnenburg, Pan Deng, Soumyajit De, Björn Esser
  */
 
 #include <shogun/lib/common.h>

Review comment (Member), on the removed `Written (W) 2013 Soumyajit De` line: I would like to not see these changes in the PR. They are unrelated.
1 change: 0 additions & 1 deletion cmake/FindMetaExamples.cmake
@@ -40,7 +40,6 @@ function(get_excluded_meta_examples)
 IF(NOT HAVE_LAPACK)
   LIST(APPEND EXCLUDED_META_EXAMPLES
     regression/linear_ridge_regression.sg
-    clustering/gmm.sg
     distance/mahalanobis.sg
   )
 ENDIF()
2 changes: 1 addition & 1 deletion data
Submodule data updated 31 files
+0 −0 testsuite/meta/binary/averaged_perceptron.dat
+0 −0 testsuite/meta/binary/kernel_support_vector_machine.dat
+0 −0 testsuite/meta/binary/linear_discriminant_analysis.dat
+0 −0 testsuite/meta/binary/linear_support_vector_machine.dat
+0 −0 testsuite/meta/binary/sgd_svm.dat
+0 −0 testsuite/meta/binary/svm_oneclass.dat
+0 −0 testsuite/meta/clustering/gaussian_mixture_models.dat
+0 −0 testsuite/meta/converter/independent_component_analysis_fast.dat
+0 −0 testsuite/meta/converter/independent_component_analysis_ff_sep.dat
+0 −0 testsuite/meta/converter/independent_component_analysis_jade.dat
+0 −0 testsuite/meta/converter/independent_component_analysis_jedi_sep.dat
+0 −0 testsuite/meta/converter/independent_component_analysis_sobi.dat
+0 −0 testsuite/meta/gaussian_process/classifier.dat
+0 −0 testsuite/meta/gaussian_process/regression.dat
+0 −0 testsuite/meta/gaussian_process/sparse_regression.dat
+0 −0 testsuite/meta/multiclass/cartree.dat
+0 −0 testsuite/meta/multiclass/chaid_tree.dat
+0 −0 testsuite/meta/multiclass/ecoc_random.dat
+0 −0 testsuite/meta/multiclass/gaussian_naive_bayes.dat
+0 −0 testsuite/meta/multiclass/gmnp_svm.dat
+0 −0 testsuite/meta/multiclass/k_nearest_neighbours.dat
+0 −0 testsuite/meta/multiclass/large_margin_nearest_neighbours.dat
+0 −0 testsuite/meta/multiclass/linear.dat
+0 −0 testsuite/meta/multiclass/linear_discriminant_analysis.dat
+0 −0 testsuite/meta/multiclass/logistic_regression.dat
+0 −0 testsuite/meta/multiclass/quadratic_discriminant_analysis.dat
+0 −0 testsuite/meta/multiclass/random_forest.dat
+0 −0 testsuite/meta/multiclass/relaxed_tree.dat
+0 −0 testsuite/meta/multiclass/shareboost.dat
+0 −0 testsuite/meta/multiclass/support_vector_machine.dat
+0 −0 testsuite/meta/regression/kernel_ridge_regression_nystrom.dat
@@ -26,27 +26,27 @@ Example
 
 Imagine we have files with training and test data. We create CDenseFeatures (here 64 bit floats aka RealFeatures) and :sgclass:`CBinaryLabels` as
 
-.. sgexample:: kernel_svm.sg:create_features
+.. sgexample:: kernel_support_vector_machine.sg:create_features
 
 In order to run :sgclass:`CLibSVM`, we need to initialize a kernel like :sgclass:`CGaussianKernel` with training features, and some parameters like :math:`C` and epsilon, the (optional) residual convergence parameter.
 
-.. sgexample:: kernel_svm.sg:set_parameters
+.. sgexample:: kernel_support_vector_machine.sg:set_parameters
 
 We create an instance of the :sgclass:`CLibSVM` classifier by passing it the regularization coefficient, kernel and labels.
 
-.. sgexample:: kernel_svm.sg:create_instance
+.. sgexample:: kernel_support_vector_machine.sg:create_instance
 
 Then we train and apply it to test data, which here gives :sgclass:`CBinaryLabels`.
 
-.. sgexample:: kernel_svm.sg:train_and_apply
+.. sgexample:: kernel_support_vector_machine.sg:train_and_apply
 
 We can extract :math:`\alpha` and :math:`b`.
 
-.. sgexample:: kernel_svm.sg:extract_weights_bias
+.. sgexample:: kernel_support_vector_machine.sg:extract_weights_bias
 
 Finally, we can evaluate test performance via e.g. :sgclass:`CAccuracyMeasure`.
 
-.. sgexample:: kernel_svm.sg:evaluate_accuracy
+.. sgexample:: kernel_support_vector_machine.sg:evaluate_accuracy
 
 ----------
 References
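For reviewers cross-checking the renamed page against the API it documents, here is a minimal C++ sketch of the same workflow, written against the classic C-prefixed Shogun API. The inline toy data and the parameter values (kernel width, C, epsilon) are illustrative assumptions, not taken from this PR:

```cpp
#include <shogun/base/init.h>
#include <shogun/features/DenseFeatures.h>
#include <shogun/labels/BinaryLabels.h>
#include <shogun/kernel/GaussianKernel.h>
#include <shogun/classifier/svm/LibSVM.h>
#include <shogun/evaluation/ContingencyTableEvaluation.h>

using namespace shogun;

int main()
{
    init_shogun_with_defaults();

    // Toy data in Shogun layout: columns are examples, rows are dimensions.
    SGMatrix<float64_t> X(2, 4);
    SGVector<float64_t> y(4);
    for (index_t i = 0; i < 4; ++i)
    {
        X(0, i) = i;
        X(1, i) = (i % 2 == 0) ? 1.0 : -1.0;
        y[i] = (i % 2 == 0) ? 1.0 : -1.0;
    }
    auto* features = new CDenseFeatures<float64_t>(X);
    auto* labels = new CBinaryLabels(y);

    // Gaussian kernel plus C and epsilon, mirroring the page's parameters.
    auto* kernel = new CGaussianKernel(features, features, 2.0);
    auto* svm = new CLibSVM(1.0, kernel, labels);
    svm->set_epsilon(1e-3);

    // Train, then apply (to the training features, for brevity).
    svm->train();
    auto* predictions = svm->apply_binary(features);

    // Extract alpha and b, then evaluate accuracy.
    SGVector<float64_t> alphas = svm->get_alphas();
    float64_t b = svm->get_bias();
    auto* eval = new CAccuracyMeasure();
    SG_SPRINT("accuracy=%f, b=%f\n", eval->evaluate(predictions, labels), b);

    SG_UNREF(eval);
    SG_UNREF(predictions);
    SG_UNREF(svm);
    exit_shogun();
    return 0;
}
```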
@@ -24,23 +24,23 @@ Example
 
 We create CDenseFeatures (here 64 bit floats aka RealFeatures) and :sgclass:`CBinaryLabels` from files with training and test data.
 
-.. sgexample:: lda.sg:create_features
+.. sgexample:: linear_discriminant_analysis.sg:create_features
 
 We create an instance of the :sgclass:`CLDA` classifier and set features and labels. By default, Shogun automatically chooses the decomposition method based on whether :math:`N \leq D` or :math:`N > D`.
 
-.. sgexample:: lda.sg:create_instance
+.. sgexample:: linear_discriminant_analysis.sg:create_instance
 
 Then we train and apply it to test data, which here gives :sgclass:`CBinaryLabels`.
 
-.. sgexample:: lda.sg:train_and_apply
+.. sgexample:: linear_discriminant_analysis.sg:train_and_apply
 
 We can extract weights :math:`{\bf w}`.
 
-.. sgexample:: lda.sg:extract_weights
+.. sgexample:: linear_discriminant_analysis.sg:extract_weights
 
 We can evaluate test performance via e.g. :sgclass:`CAccuracyMeasure`.
 
-.. sgexample:: lda.sg:evaluate_accuracy
+.. sgexample:: linear_discriminant_analysis.sg:evaluate_accuracy
 
 ----------
 References
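A corresponding C++ sketch of the LDA workflow under the same API assumptions; the regularization constant gamma, the toy data, and applying the model back to its training points are all illustrative:

```cpp
#include <shogun/base/init.h>
#include <shogun/features/DenseFeatures.h>
#include <shogun/labels/BinaryLabels.h>
#include <shogun/classifier/LDA.h>

using namespace shogun;

int main()
{
    init_shogun_with_defaults();

    // Toy data: two separable point clouds (columns are examples).
    SGMatrix<float64_t> X(2, 6);
    SGVector<float64_t> y(6);
    for (index_t i = 0; i < 6; ++i)
    {
        float64_t cls = (i < 3) ? -1.0 : 1.0;
        X(0, i) = cls * 5 + i;
        X(1, i) = cls * 5 - i;
        y[i] = cls;
    }
    auto* features = new CDenseFeatures<float64_t>(X);
    auto* labels = new CBinaryLabels(y);

    // gamma is a small regularizer added to the covariance estimate.
    auto* lda = new CLDA(1e-4, features, labels);
    lda->train();

    auto* predictions = lda->apply_binary(features);
    SGVector<float64_t> w = lda->get_w(); // the weight vector from the page

    SG_UNREF(predictions);
    SG_UNREF(lda);
    exit_shogun();
    return 0;
}
```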
@@ -26,27 +26,27 @@ Example
 
 Imagine we have files with training and test data. We create CDenseFeatures (here 64 bit floats aka RealFeatures) and :sgclass:`CBinaryLabels` as
 
-.. sgexample:: linear_svm.sg:create_features
+.. sgexample:: linear_support_vector_machine.sg:create_features
 
 In order to run :sgclass:`CLibLinear`, we need to initialize some parameters like :math:`C` and epsilon, the residual convergence parameter of the solver.
 
-.. sgexample:: linear_svm.sg:set_parameters
+.. sgexample:: linear_support_vector_machine.sg:set_parameters
 
 We create an instance of the :sgclass:`CLibLinear` classifier by passing it the regularization coefficient, features and labels. Here we set the solver type to L2-regularized classification. There are many other solver types in :sgclass:`CLibLinear` to choose from.
 
-.. sgexample:: linear_svm.sg:create_instance
+.. sgexample:: linear_support_vector_machine.sg:create_instance
 
 Then we train and apply it to test data, which here gives :sgclass:`CBinaryLabels`.
 
-.. sgexample:: linear_svm.sg:train_and_apply
+.. sgexample:: linear_support_vector_machine.sg:train_and_apply
 
 We can extract :math:`{\bf w}` and :math:`b`.
 
-.. sgexample:: linear_svm.sg:extract_weights_bias
+.. sgexample:: linear_support_vector_machine.sg:extract_weights_bias
 
 We can evaluate test performance via e.g. :sgclass:`CAccuracyMeasure`.
 
-.. sgexample:: linear_svm.sg:evaluate_accuracy
+.. sgexample:: linear_support_vector_machine.sg:evaluate_accuracy
 
 ----------
 References
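A matching C++ sketch for the CLibLinear workflow; the solver type, C, epsilon and the toy data are illustrative choices, not prescribed by the page:

```cpp
#include <shogun/base/init.h>
#include <shogun/features/DenseFeatures.h>
#include <shogun/labels/BinaryLabels.h>
#include <shogun/classifier/svm/LibLinear.h>

using namespace shogun;

int main()
{
    init_shogun_with_defaults();

    SGMatrix<float64_t> X(2, 6);
    SGVector<float64_t> y(6);
    for (index_t i = 0; i < 6; ++i)
    {
        y[i] = (i % 2 == 0) ? 1.0 : -1.0;
        X(0, i) = y[i] * (i + 1);
        X(1, i) = y[i];
    }
    auto* features = new CDenseFeatures<float64_t>(X);
    auto* labels = new CBinaryLabels(y);

    // C, features and labels; solver set to L2-regularized classification.
    auto* svm = new CLibLinear(1.0, features, labels);
    svm->set_liblinear_solver_type(L2R_L2LOSS_SVC);
    svm->set_epsilon(1e-3);
    svm->train();

    auto* predictions = svm->apply_binary(features);
    SGVector<float64_t> w = svm->get_w(); // w and b, as in the page
    float64_t b = svm->get_bias();

    SG_UNREF(predictions);
    SG_UNREF(svm);
    exit_shogun();
    return 0;
}
```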
@@ -10,7 +10,7 @@ Multiple kernel learning (MKL) is based on convex combinations of arbitrary kernels
 
 where :math:`\beta_k > 0`, :math:`\sum_{k=1}^{K} \beta_k = 1`, :math:`K` is the number of sub-kernels, :math:`\bf{k}` is a combined kernel, :math:`{\bf k}_i` is an individual kernel and :math:`{x_i}_i` are the training data.
 
-Classification is done by using Support Vector Machines (SVM). See :doc:`linear_svm` for more details. Optimal :math:`\alpha` and :math:`b` for SVM and :math:`\beta` are determined via training.
+Classification is done by using Support Vector Machines (SVM). See :doc:`linear_support_vector_machine` for more details. Optimal :math:`\alpha` and :math:`b` for SVM and :math:`\beta` are determined via training.
 
 See :cite:`sonnenburg2006large` for more details.
 
@@ -20,35 +20,35 @@ Example
 
 Imagine we have files with training and test data. We create CDenseFeatures (here 64 bit floats aka RealFeatures) and :sgclass:`CBinaryLabels` as
 
-.. sgexample:: mkl.sg:create_features
+.. sgexample:: multiple_kernel_learning.sg:create_features
 
 Then we create individual kernels like :sgclass:`CPolyKernel` and :sgclass:`CGaussianKernel`, which will later be combined in one :sgclass:`CCombinedKernel`.
 
-.. sgexample:: mkl.sg:create_kernel
+.. sgexample:: multiple_kernel_learning.sg:create_kernel
 
 We create an instance of :sgclass:`CCombinedKernel` and append the :sgclass:`CKernel` objects.
 
-.. sgexample:: mkl.sg:create_combined_train
+.. sgexample:: multiple_kernel_learning.sg:create_combined_train
 
 We create an object of :sgclass:`CMKLClassification` and provide the combined kernel and labels before training it.
 
-.. sgexample:: mkl.sg:train_mkl
+.. sgexample:: multiple_kernel_learning.sg:train_mkl
 
 After training, we can extract :math:`\beta`, SVM coefficients :math:`\alpha` and :math:`b`.
 
-.. sgexample:: mkl.sg:extract_weights
+.. sgexample:: multiple_kernel_learning.sg:extract_weights
 
 We update the :sgclass:`CCombinedKernel` object for testing data.
 
-.. sgexample:: mkl.sg:create_combined_test
+.. sgexample:: multiple_kernel_learning.sg:create_combined_test
 
 We set the updated kernel and predict :sgclass:`CBinaryLabels` for test data.
 
-.. sgexample:: mkl.sg:mkl_apply
+.. sgexample:: multiple_kernel_learning.sg:mkl_apply
 
 Finally, we can evaluate test performance via e.g. :sgclass:`CAccuracyMeasure`.
 
-.. sgexample:: mkl.sg:evaluate_accuracy
+.. sgexample:: multiple_kernel_learning.sg:evaluate_accuracy
 
 ----------
 References
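A hedged C++ sketch of the MKL workflow: for simplicity it combines two Gaussian kernels rather than the page's CPolyKernel/CGaussianKernel mix, and the C values, kernel widths and toy data are assumptions:

```cpp
#include <shogun/base/init.h>
#include <shogun/features/DenseFeatures.h>
#include <shogun/labels/BinaryLabels.h>
#include <shogun/kernel/CombinedKernel.h>
#include <shogun/kernel/GaussianKernel.h>
#include <shogun/classifier/mkl/MKLClassification.h>

using namespace shogun;

int main()
{
    init_shogun_with_defaults();

    SGMatrix<float64_t> X(2, 6);
    SGVector<float64_t> y(6);
    for (index_t i = 0; i < 6; ++i)
    {
        y[i] = (i % 2 == 0) ? 1.0 : -1.0;
        X(0, i) = y[i] * (i + 1);
        X(1, i) = -y[i] * i;
    }
    auto* features = new CDenseFeatures<float64_t>(X);
    auto* labels = new CBinaryLabels(y);

    // Two sub-kernels of different widths; MKL learns their weights beta.
    auto* combined = new CCombinedKernel();
    combined->append_kernel(new CGaussianKernel(10, 0.5));
    combined->append_kernel(new CGaussianKernel(10, 4.0));
    combined->init(features, features);

    auto* mkl = new CMKLClassification();
    mkl->set_interleaved_optimization_enabled(false);
    mkl->set_C(1.0, 1.0);
    mkl->set_kernel(combined);
    mkl->set_labels(labels);
    mkl->train();

    // beta (sub-kernel weights), alpha and b, as described in the page.
    SGVector<float64_t> beta = combined->get_subkernel_weights();
    SGVector<float64_t> alphas = mkl->get_alphas();
    float64_t b = mkl->get_bias();

    auto* predictions = mkl->apply_binary(features);

    SG_UNREF(predictions);
    SG_UNREF(mkl);
    exit_shogun();
    return 0;
}
```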
@@ -20,27 +20,27 @@ Example
 
 We start by creating CDenseFeatures (here 64 bit floats aka RealFeatures) as
 
-.. sgexample:: gmm.sg:create_features
+.. sgexample:: gaussian_mixture_models.sg:create_features
 
 We initialize :sgclass:`CGMM`, passing the desired number of mixture components.
 
-.. sgexample:: gmm.sg:create_gmm_instance
+.. sgexample:: gaussian_mixture_models.sg:create_gmm_instance
 
 We provide training features to the :sgclass:`CGMM` object, train it using the EM algorithm and sample data points from the trained model.
 
-.. sgexample:: gmm.sg:train_sample
+.. sgexample:: gaussian_mixture_models.sg:train_sample
 
 We extract parameters like :math:`\pi`, :math:`\mu_i` and :math:`\Sigma_i` for any component from the trained model.
 
-.. sgexample:: gmm.sg:extract_params
+.. sgexample:: gaussian_mixture_models.sg:extract_params
 
 We obtain the log-likelihoods of belonging to each cluster and of being generated by this model.
 
-.. sgexample:: gmm.sg:cluster_output
+.. sgexample:: gaussian_mixture_models.sg:cluster_output
 
 We can also use the Split-Merge Expectation-Maximization algorithm :cite:`ueda2000smem` for training.
 
-.. sgexample:: gmm.sg:training_smem
+.. sgexample:: gaussian_mixture_models.sg:training_smem
 
 ----------
 References
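A C++ sketch of the GMM workflow under the same API assumptions; the one-dimensional blob data and the component count are illustrative:

```cpp
#include <shogun/base/init.h>
#include <shogun/features/DenseFeatures.h>
#include <shogun/clustering/GMM.h>

using namespace shogun;

int main()
{
    init_shogun_with_defaults();

    // Toy 1D data from two well-separated blobs (columns are examples).
    SGMatrix<float64_t> X(1, 20);
    for (index_t i = 0; i < 20; ++i)
        X(0, i) = (i < 10) ? 0.1 * i : 10.0 + 0.1 * i;
    auto* features = new CDenseFeatures<float64_t>(X);

    // Two mixture components, trained with EM.
    auto* gmm = new CGMM(2);
    gmm->set_features(features);
    gmm->train_em();

    // Sample from the trained model and inspect pi, mu_i, Sigma_i.
    SGVector<float64_t> sample = gmm->sample();
    SGVector<float64_t> pi = gmm->get_coef();       // mixing coefficients
    SGVector<float64_t> mu0 = gmm->get_nth_mean(0); // mean of component 0
    SGMatrix<float64_t> sigma0 = gmm->get_nth_cov(0);

    // Log-likelihoods of cluster membership for one point.
    SGVector<float64_t> ll = gmm->cluster(mu0);

    // Alternatively: gmm->train_smem() for split-merge EM.
    SG_UNREF(gmm);
    exit_shogun();
    return 0;
}
```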
@@ -15,19 +15,19 @@ Example
 Given a dataset which we assume consists of linearly mixed signals, we create CDenseFeatures
 (RealFeatures, here 64 bit float values).
 
-.. sgexample:: ica_fast.sg:create_features
+.. sgexample:: independent_component_analysis_fast.sg:create_features
 
 We create the :sgclass:`CFastICA` instance and set its parameters for the iterative optimization.
 
-.. sgexample:: ica_fast.sg:set_parameters
+.. sgexample:: independent_component_analysis_fast.sg:set_parameters
 
 Then we apply ICA, which gives the unmixed signals.
 
-.. sgexample:: ica_fast.sg:apply_convert
+.. sgexample:: independent_component_analysis_fast.sg:apply_convert
 
 We can also extract the estimated mixing matrix.
 
-.. sgexample:: ica_fast.sg:extract
+.. sgexample:: independent_component_analysis_fast.sg:extract
 
 ----------
 References
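A C++ sketch of the FastICA workflow; the synthetic mixed signals and the iteration cap are illustrative assumptions:

```cpp
#include <shogun/base/init.h>
#include <shogun/features/DenseFeatures.h>
#include <shogun/converter/ica/FastICA.h>
#include <shogun/mathematics/Math.h>

using namespace shogun;

int main()
{
    init_shogun_with_defaults();

    // Two linearly mixed source signals (columns are samples).
    const index_t n = 200;
    SGMatrix<float64_t> X(2, n);
    for (index_t i = 0; i < n; ++i)
    {
        float64_t s1 = CMath::sin(0.1 * i);      // smooth source
        float64_t s2 = (i % 7 < 3) ? 1.0 : -1.0; // square-ish source
        X(0, i) = 0.7 * s1 + 0.3 * s2;           // mixed observations
        X(1, i) = 0.2 * s1 + 0.8 * s2;
    }
    auto* mixed = new CDenseFeatures<float64_t>(X);

    auto* ica = new CFastICA();
    ica->set_max_iter(200); // iterative optimization parameter from the page

    // Unmix the signals and extract the estimated mixing matrix.
    auto* unmixed = (CDenseFeatures<float64_t>*)ica->apply(mixed);
    SGMatrix<float64_t> mixing_matrix = ica->get_mixing_matrix();

    SG_UNREF(unmixed);
    SG_UNREF(ica);
    exit_shogun();
    return 0;
}
```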
@@ -15,39 +15,39 @@ We'll use as example a classification problem solvable by using :sgclass:`CMKLClassification`
 For the sake of brevity, we'll skip the initialization of features, kernels and so on
 (see :doc:`../regression/multiple_kernel_learning` for a more complete example of MKL usage).
 
-.. sgexample:: cross_validation_mkl_weight_storage.sg:create_classifier
+.. sgexample:: cross_validation_multiple_kernel_learning_weights_storage.sg:create_classifier
 
 Firstly, we initialize a splitting strategy :sgclass:`CStratifiedCrossValidationSplitting`, which is needed
 to divide the dataset into folds, and an evaluation criterion :sgclass:`CAccuracyMeasure`, to evaluate the
 performance of the trained models. Secondly, we create the :sgclass:`CCrossValidation` instance.
 We also set the number of cross-validation runs.
 
-.. sgexample:: cross_validation_mkl_weight_storage.sg:create_cross_validation
+.. sgexample:: cross_validation_multiple_kernel_learning_weights_storage.sg:create_cross_validation
 
 To also observe the partial folds' results, we create a cross-validation observer :sgclass:`CParameterObserverCV`
 and register it with the :sgclass:`CCrossValidation` instance.
 
-.. sgexample:: cross_validation_mkl_weight_storage.sg:create_observer
+.. sgexample:: cross_validation_multiple_kernel_learning_weights_storage.sg:create_observer
 
 Finally, we evaluate the model and get the results (aka a :sgclass:`CCrossValidationResult` instance).
 
-.. sgexample:: cross_validation_mkl_weight_storage.sg:evaluate_and_get_result
+.. sgexample:: cross_validation_multiple_kernel_learning_weights_storage.sg:evaluate_and_get_result
 
 We get the mean of all the evaluation results and their standard deviation.
 
-.. sgexample:: cross_validation_mkl_weight_storage.sg:get_results
+.. sgexample:: cross_validation_multiple_kernel_learning_weights_storage.sg:get_results
 
 Using the observer we registered before, we can get more information about individual cross-validation runs and folds, such as the kernel weights.
 We get the :sgclass:`CMKLClassification` machine used during the first run and trained on the first fold.
 
-.. sgexample:: cross_validation_mkl_weight_storage.sg:get_fold_machine
+.. sgexample:: cross_validation_multiple_kernel_learning_weights_storage.sg:get_fold_machine
 
 Then, from the trained machine, we get the weights :math:`\mathbf{w}` of its kernels.
 
-.. sgexample:: cross_validation_mkl_weight_storage.sg:get_weights
+.. sgexample:: cross_validation_multiple_kernel_learning_weights_storage.sg:get_weights
 
 ----------
 References
 ----------
 
-:wiki:`Cross-validation_(statistics)`
+:wiki:`Cross-validation_(statistics)`
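A C++ sketch of the cross-validation skeleton only. To stay within API calls we are confident of, it uses a CLibLinear machine instead of MKL and omits the CParameterObserverCV wiring; the result accessors get_mean()/get_std_dev() and the toy data are assumptions:

```cpp
#include <shogun/base/init.h>
#include <shogun/features/DenseFeatures.h>
#include <shogun/labels/BinaryLabels.h>
#include <shogun/classifier/svm/LibLinear.h>
#include <shogun/evaluation/CrossValidation.h>
#include <shogun/evaluation/StratifiedCrossValidationSplitting.h>
#include <shogun/evaluation/ContingencyTableEvaluation.h>

using namespace shogun;

int main()
{
    init_shogun_with_defaults();

    SGMatrix<float64_t> X(2, 20);
    SGVector<float64_t> y(20);
    for (index_t i = 0; i < 20; ++i)
    {
        y[i] = (i % 2 == 0) ? 1.0 : -1.0;
        X(0, i) = y[i] * (1.0 + 0.1 * i);
        X(1, i) = y[i];
    }
    auto* features = new CDenseFeatures<float64_t>(X);
    auto* labels = new CBinaryLabels(y);
    auto* machine = new CLibLinear(1.0, features, labels);

    // Stratified 5-fold splitting and accuracy as evaluation criterion.
    auto* splitting = new CStratifiedCrossValidationSplitting(labels, 5);
    auto* criterion = new CAccuracyMeasure();
    auto* cross = new CCrossValidation(machine, features, labels, splitting, criterion);
    cross->set_num_runs(2);

    // Evaluate and read out mean/stddev over runs and folds.
    auto* result = (CCrossValidationResult*)cross->evaluate();
    SG_SPRINT("mean=%f stddev=%f\n", result->get_mean(), result->get_std_dev());

    SG_UNREF(result);
    SG_UNREF(cross);
    exit_shogun();
    return 0;
}
```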
@@ -4,7 +4,7 @@ Gaussian Process Classifier
 
 Application of Gaussian processes in binary and multi-class classification.
 `See Gaussian process regression cookbook
-<http://shogun.ml/cookbook/latest/examples/gaussian_processes/gaussian_process_regression.html>`_
+<http://shogun.ml/cookbook/latest/examples/gaussian_process/regression.html>`_
 and :cite:`Rasmussen2005GPM` for more information on Gaussian processes.
 
 -------
@@ -13,12 +13,12 @@ Example
 
 Imagine we have files with training and test data. We create CDenseFeatures (here 64 bit floats aka RealFeatures) and :sgclass:`CMulticlassLabels` as
 
-.. sgexample:: gaussian_process_classifier.sg:create_features
+.. sgexample:: classifier.sg:create_features
 
 To fit the input (training) data :math:`\mathbf{X}`, we have to choose appropriate :sgclass:`CMeanFunction`
 and :sgclass:`CKernel`. Here we use a basic :sgclass:`CConstMean` and a :sgclass:`CGaussianKernel` with chosen width parameter.
 
-.. sgexample:: gaussian_process_classifier.sg:create_appropriate_kernel_and_mean_function
+.. sgexample:: classifier.sg:create_appropriate_kernel_and_mean_function
 
 We need to specify the inference method to find the posterior distribution of the function values :math:`\mathbf{f}`.
 Here we choose to perform the Laplace approximation inference method with an instance of :sgclass:`CMultiLaplaceInferenceMethod` (see Chapter 18.2 in :cite:`barber2012bayesian` for a detailed introduction)
@@ -27,20 +27,20 @@ the training features, the mean function, the labels and an instance of :sgclass:
 to specify the distribution of the targets/labels as above.
 Finally we create an instance of the :sgclass:`CGaussianProcessClassification` classifier.
 
-.. sgexample:: gaussian_process_classifier.sg:create_instance
+.. sgexample:: classifier.sg:create_instance
 
 Then we can train the model and evaluate the predictive distribution.
 We get predicted :sgclass:`CMulticlassLabels`.
 
-.. sgexample:: gaussian_process_classifier.sg:train_and_apply
+.. sgexample:: classifier.sg:train_and_apply
 
 We can extract the probabilities:
 
-.. sgexample:: gaussian_process_classifier.sg:extract_the_probabilities
+.. sgexample:: classifier.sg:extract_the_probabilities
 
 We can evaluate test performance via e.g. :sgclass:`CMulticlassAccuracy`.
 
-.. sgexample:: gaussian_process_classifier.sg:evaluate_accuracy
+.. sgexample:: classifier.sg:evaluate_accuracy
 
 ----------
 References
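A C++ sketch of the multi-class GP classification pipeline described above; the three-class toy data and the kernel width are illustrative assumptions:

```cpp
#include <shogun/base/init.h>
#include <shogun/features/DenseFeatures.h>
#include <shogun/labels/MulticlassLabels.h>
#include <shogun/kernel/GaussianKernel.h>
#include <shogun/machine/gp/ConstMean.h>
#include <shogun/machine/gp/MultiLaplaceInferenceMethod.h>
#include <shogun/machine/gp/SoftMaxLikelihood.h>
#include <shogun/classifier/GaussianProcessClassification.h>

using namespace shogun;

int main()
{
    init_shogun_with_defaults();

    // Toy 3-class data (columns are examples, labels 0, 1, 2).
    SGMatrix<float64_t> X(2, 9);
    SGVector<float64_t> y(9);
    for (index_t i = 0; i < 9; ++i)
    {
        y[i] = i / 3;
        X(0, i) = 5.0 * (i / 3);
        X(1, i) = i % 3;
    }
    auto* features = new CDenseFeatures<float64_t>(X);
    auto* labels = new CMulticlassLabels(y);

    // Gaussian kernel and constant mean, as in the page.
    auto* kernel = new CGaussianKernel(10, 2.0);
    auto* mean = new CConstMean();
    auto* likelihood = new CSoftMaxLikelihood();

    // Laplace approximation for the multi-class posterior.
    auto* inference = new CMultiLaplaceInferenceMethod(kernel, features, mean, labels, likelihood);
    auto* gpc = new CGaussianProcessClassification(inference);

    gpc->train();
    auto* predictions = gpc->apply_multiclass(features);
    SGVector<float64_t> probabilities = gpc->get_probabilities(features);

    SG_UNREF(predictions);
    SG_UNREF(gpc);
    exit_shogun();
    return 0;
}
```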
@@ -24,32 +24,32 @@ Example
 
 Imagine we have files with training and test data. We create `CDenseFeatures` (here 64 bit floats aka RealFeatures) and :sgclass:`CRegressionLabels` as:
 
-.. sgexample:: gaussian_process_regression.sg:create_features
+.. sgexample:: regression.sg:create_features
 
 To fit the input (training) data :math:`\mathbf{X}`, we have to choose an appropriate :sgclass:`CMeanFunction` and :sgclass:`CKernel` and instantiate them. Here we use a basic :sgclass:`CZeroMean` and a :sgclass:`CGaussianKernel` with chosen width parameter.
 
-.. sgexample:: gaussian_process_regression.sg:create_appropriate_kernel_and_mean_function
+.. sgexample:: regression.sg:create_appropriate_kernel_and_mean_function
 
 We need to specify the inference method to find the posterior distribution of the function values :math:`\mathbf{f}`. Here we choose to perform exact inference with an instance of :sgclass:`CExactInferenceMethod` and pass it the chosen kernel, the training features, the mean function, the labels and an instance of :sgclass:`CGaussianLikelihood`, to specify the distribution of the targets/labels as above. Finally we create a CGaussianProcessRegression instance to be trained.
 
-.. sgexample:: gaussian_process_regression.sg:create_instance
+.. sgexample:: regression.sg:create_instance
 
 Then we can train the model and evaluate the predictive distribution. We get predicted :sgclass:`CRegressionLabels`.
 
-.. sgexample:: gaussian_process_regression.sg:train_and_apply
+.. sgexample:: regression.sg:train_and_apply
 
 We can compute the predictive variances as
 
-.. sgexample:: gaussian_process_regression.sg:compute_variance
+.. sgexample:: regression.sg:compute_variance
 
 The prediction above is based on arbitrarily set hyperparameters :math:`\boldsymbol{\theta}`: kernel width :math:`\tau`, kernel scaling :math:`\gamma` and observation noise :math:`\sigma^2`. We can also learn these parameters by optimizing the marginal likelihood :math:`p(\mathbf{y}|\mathbf{X}, \boldsymbol{\theta})` w.r.t. :math:`\boldsymbol{\theta}`.
 To do this, we define a :sgclass:`CGradientModelSelection`, passing to it a :sgclass:`CGradientEvaluation` with its own :sgclass:`CGradientCriterion`, specifying the gradient scheme and direction. Then we can follow the gradient and apply the chosen :math:`\boldsymbol{\theta}` back to the CGaussianProcessRegression instance.
 
-.. sgexample:: gaussian_process_regression.sg:optimize_marginal_likelihood
+.. sgexample:: regression.sg:optimize_marginal_likelihood
 
 Finally, we evaluate the :sgclass:`CMeanSquaredError` and the (negative log) marginal likelihood for the optimized hyperparameters.
 
-.. sgexample:: gaussian_process_regression.sg:evaluate_error_and_marginal_likelihood
+.. sgexample:: regression.sg:evaluate_error_and_marginal_likelihood
 
 ----------
 References
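A C++ sketch of the GP regression pipeline; the noisy sine data and the kernel width are illustrative, and hyperparameter learning via CGradientModelSelection is omitted here:

```cpp
#include <shogun/base/init.h>
#include <shogun/features/DenseFeatures.h>
#include <shogun/labels/RegressionLabels.h>
#include <shogun/kernel/GaussianKernel.h>
#include <shogun/machine/gp/ZeroMean.h>
#include <shogun/machine/gp/ExactInferenceMethod.h>
#include <shogun/machine/gp/GaussianLikelihood.h>
#include <shogun/regression/GaussianProcessRegression.h>
#include <shogun/evaluation/MeanSquaredError.h>
#include <shogun/mathematics/Math.h>

using namespace shogun;

int main()
{
    init_shogun_with_defaults();

    // Noisy samples of a 1D sine (columns are examples).
    SGMatrix<float64_t> X(1, 20);
    SGVector<float64_t> y(20);
    for (index_t i = 0; i < 20; ++i)
    {
        X(0, i) = 0.3 * i;
        y[i] = CMath::sin(0.3 * i) + 0.05 * CMath::randn_double();
    }
    auto* features = new CDenseFeatures<float64_t>(X);
    auto* labels = new CRegressionLabels(y);

    // Zero mean, Gaussian kernel and Gaussian observation noise.
    auto* kernel = new CGaussianKernel(10, 1.0);
    auto* mean = new CZeroMean();
    auto* likelihood = new CGaussianLikelihood();
    auto* inference = new CExactInferenceMethod(kernel, features, mean, labels, likelihood);
    auto* gpr = new CGaussianProcessRegression(inference);

    // Train, predict, and compute per-point predictive variances.
    gpr->train();
    auto* predictions = gpr->apply_regression(features);
    SGVector<float64_t> variances = gpr->get_variance_vector(features);

    auto* mse = new CMeanSquaredError();
    SG_SPRINT("MSE=%f\n", mse->evaluate(predictions, labels));

    SG_UNREF(mse);
    SG_UNREF(predictions);
    SG_UNREF(gpr);
    exit_shogun();
    return 0;
}
```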