Merge pull request #3219 from Saurabh7/ldacb
lda cookbook
karlnapf committed May 30, 2016
2 parents 5610543 + 0c22016 commit 8932637
Showing 3 changed files with 86 additions and 1 deletion.
2 changes: 1 addition & 1 deletion data
51 changes: 51 additions & 0 deletions doc/cookbook/source/examples/classifier/lda.rst
@@ -0,0 +1,51 @@
=============================
Linear Discriminant Analysis
=============================

LDA learns a linear classifier by finding a projection matrix that maximally discriminates the provided classes. The learned linear classification rule is optimal under the assumption that both classes are Gaussian distributed with equal covariance. To find a linear separation :math:`{\bf w}` in training, the between-class variance is maximized and the within-class variance is minimized.
The projection matrix is computed by maximizing the following objective:

.. math::

    J({\bf w})=\frac{{\bf w^T} {\bf S_B} {\bf w}}{{\bf w^T} {\bf S_W} {\bf w}}

where :math:`{\bf S_B}` is the between-class scatter matrix and :math:`{\bf S_W}` is the within-class scatter matrix.
This derivation of LDA requires the within-class scatter matrix to be invertible. That condition is violated when there are fewer data points than dimensions. In this case, SVD is used to compute the projection matrix using an orthonormal basis :math:`{\bf Q}`:

.. math::

    {\bf W} := {\bf Q} {\bf{W^\prime}}

See Chapter 16 in :cite:`barber2012bayesian` for a detailed introduction.
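
For intuition, the objective can be checked numerically on a toy problem. The following is a minimal pure-Python sketch, not Shogun's implementation: it assumes two classes in two dimensions, takes :math:`{\bf S_B}` as the outer product of the difference of class means (one common two-class convention), and uses the standard closed-form maximizer :math:`{\bf w} \propto {\bf S_W}^{-1}({\bf m}_a - {\bf m}_b)`. All function names and data points are hypothetical.

```python
def mean(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]


def scatter(points, centre):
    """2x2 scatter matrix: sum of outer products of the centred points."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - centre[0], p[1] - centre[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_direction(class_a, class_b):
    """Return (w, S_B, S_W) with w = S_W^{-1} (m_a - m_b)."""
    m_a, m_b = mean(class_a), mean(class_b)
    s_a, s_b = scatter(class_a, m_a), scatter(class_b, m_b)
    # Within-class scatter: sum of the per-class scatter matrices.
    s_w = [[s_a[i][j] + s_b[i][j] for j in range(2)] for i in range(2)]
    # Between-class scatter (assumed two-class form): outer product of mean difference.
    diff = [m_a[0] - m_b[0], m_a[1] - m_b[1]]
    s_between = [[diff[i] * diff[j] for j in range(2)] for i in range(2)]
    det = s_w[0][0] * s_w[1][1] - s_w[0][1] * s_w[1][0]
    if det == 0:
        raise ValueError("S_W is singular; the closed form does not apply")
    # Closed-form inverse of a 2x2 matrix applied to the mean difference.
    w = [(s_w[1][1] * diff[0] - s_w[0][1] * diff[1]) / det,
         (-s_w[1][0] * diff[0] + s_w[0][0] * diff[1]) / det]
    return w, s_between, s_w


def objective(w, s_between, s_w):
    """Evaluate J(w) = (w^T S_B w) / (w^T S_W w)."""
    def quad(m):
        return sum(w[i] * m[i][j] * w[j] for i in range(2) for j in range(2))
    return quad(s_between) / quad(s_w)


# Hypothetical toy data: two well-separated classes.
class_a = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0), (2.0, 1.0)]
class_b = [(6.0, 5.0), (7.0, 8.0), (8.0, 6.0), (7.0, 7.0)]

w, s_between, s_w = fisher_direction(class_a, class_b)
best = objective(w, s_between, s_w)

# Projections of each class onto w; a good direction keeps them apart.
proj_a = [w[0] * p[0] + w[1] * p[1] for p in class_a]
proj_b = [w[0] * p[0] + w[1] * p[1] for p in class_b]
```

Comparing ``best`` with ``objective`` evaluated at other directions, such as the coordinate axes, shows that no other direction scores higher on this data. The sketch only covers the case where :math:`{\bf S_W}` is invertible.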

-------
Example
-------

We create :sgclass:`CDenseFeatures` (here 64-bit floats, aka RealFeatures) and :sgclass:`CBinaryLabels` from files with training and test data.

.. sgexample:: lda.sg:create_features

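We create an instance of the :sgclass:`CLDA` classifier and set features and labels. By default, Shogun automatically chooses the decomposition method based on whether :math:`N \leq D` or :math:`N > D`, where :math:`N` is the number of training data points and :math:`D` is the number of dimensions.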

.. sgexample:: lda.sg:create_instance

Then we train and apply it to test data, which here gives :sgclass:`CBinaryLabels`.

.. sgexample:: lda.sg:train_and_apply

We can extract weights :math:`{\bf w}`.

.. sgexample:: lda.sg:extract_weights
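
To illustrate how a weight vector defines a linear classification rule, here is a minimal pure-Python sketch. It assumes a decision function of the form :math:`{\bf w}^T {\bf x} + b` with a scalar bias :math:`b`. The bias, the function name, and the numbers are hypothetical and are not taken from the Shogun example above.

```python
def predict_binary(w, bias, samples):
    """Label each sample +1 or -1 from the sign of w.x + bias."""
    labels = []
    for x in samples:
        score = sum(wi * xi for wi, xi in zip(w, x)) + bias
        labels.append(1 if score >= 0 else -1)
    return labels


# Hypothetical weights and bias for a 2-D problem.
w = [1.0, -1.0]
bias = 0.5
samples = [(2.0, 1.0), (0.0, 3.0), (1.0, 2.0)]

predicted = predict_binary(w, bias, samples)
```

Samples on the positive side of the hyperplane receive the label +1 and the rest receive -1. Treating a score of exactly zero as +1 is an arbitrary choice made for this sketch.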

We can evaluate test performance, for example with :sgclass:`CAccuracyMeasure`.

.. sgexample:: lda.sg:evaluate_accuracy
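
As a rough illustration of what an accuracy measure reports, the sketch below computes the fraction of predicted labels that match the ground truth. It is a plain-Python stand-in with hypothetical names and data, not the :sgclass:`CAccuracyMeasure` implementation.

```python
def accuracy(predicted, truth):
    """Fraction of positions where the predicted label equals the true label."""
    if len(predicted) != len(truth):
        raise ValueError("label vectors must have the same length")
    if not truth:
        raise ValueError("cannot compute accuracy on empty labels")
    correct = sum(1 for p, t in zip(predicted, truth) if p == t)
    return correct / len(truth)


# Hypothetical binary labels in {-1, +1}.
labels_predict = [1, -1, -1, 1]
labels_test = [1, -1, 1, 1]

acc = accuracy(labels_predict, labels_test)
```

Here three of the four predictions agree with the ground truth, so ``acc`` is 0.75.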

----------
References
----------
:wiki:`Linear_discriminant_analysis`

.. bibliography:: ../../references.bib
    :filter: docname in docnames
34 changes: 34 additions & 0 deletions examples/meta/src/classifier/lda.sg
@@ -0,0 +1,34 @@
CSVFile f_feats_train("../../data/classifier_binary_2d_linear_features_train.dat")
CSVFile f_feats_test("../../data/classifier_binary_2d_linear_features_test.dat")
CSVFile f_labels_train("../../data/classifier_binary_2d_linear_labels_train.dat")
CSVFile f_labels_test("../../data/classifier_binary_2d_linear_labels_test.dat")

#![create_features]
RealFeatures features_train(f_feats_train)
RealFeatures features_test(f_feats_test)
BinaryLabels labels_train(f_labels_train)
BinaryLabels labels_test(f_labels_test)
#![create_features]

#![create_instance]
LDA lda()
lda.set_features(features_train)
lda.set_labels(labels_train)
#![create_instance]

#![train_and_apply]
lda.train()
BinaryLabels labels_predict = lda.apply_binary(features_test)
#![train_and_apply]

#![extract_weights]
RealVector w = lda.get_w()
#![extract_weights]

#![evaluate_accuracy]
AccuracyMeasure eval()
real accuracy = eval.evaluate(labels_predict, labels_test)
#![evaluate_accuracy]

#additional integration testing variables
RealVector output = labels_predict.get_labels()
