Merge pull request #3219 from Saurabh7/ldacb
lda cookbook
Showing 3 changed files with 86 additions and 1 deletion.
Submodule data updated 2 files:

- +12 −0 testsuite/meta/classifier/LDA.dat
- +10 −0 testsuite/meta/clustering/hierarchical.dat
=============================
Linear Discriminant Analysis
=============================

LDA learns a linear classifier by finding a projection matrix that maximally discriminates the provided classes. The learned linear classification rule is optimal under the assumption that both classes are Gaussian distributed with equal covariance. To find the linear separation :math:`{\bf w}`, training maximizes the between-class variance and minimizes the within-class variance. The projection matrix is computed by maximizing the following objective:

.. math::

    J({\bf w})=\frac{{\bf w^T} S_B {\bf w}}{{\bf w^T} S_W {\bf w}}

where :math:`S_B` is the between-class scatter matrix and :math:`S_W` is the within-class scatter matrix.
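For two classes, maximizing :math:`J({\bf w})` has the well-known closed-form solution :math:`{\bf w} \propto S_W^{-1}(\mu_1 - \mu_0)`. The following is an illustrative NumPy sketch of that solution, not the Shogun API:

```python
# Illustrative sketch (plain NumPy, not the Shogun API): the two-class
# LDA direction is w ∝ S_W^{-1} (mu_1 - mu_0).
import numpy as np

def lda_direction(X0, X1):
    """Return the unit-norm LDA direction for two classes.

    X0, X1: (n_samples, n_features) arrays, one per class.
    """
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter: sum of the per-class scatter matrices
    S_W = (X0 - mu0).T @ (X0 - mu0) + (X1 - mu1).T @ (X1 - mu1)
    w = np.linalg.solve(S_W, mu1 - mu0)
    return w / np.linalg.norm(w)

# Two synthetic Gaussian classes with equal covariance
rng = np.random.default_rng(0)
X0 = rng.normal([0.0, 0.0], 0.5, size=(50, 2))
X1 = rng.normal([2.0, 2.0], 0.5, size=(50, 2))
w = lda_direction(X0, X1)
# Projections of the two classes onto w are well separated
print((X0 @ w).mean(), (X1 @ w).mean())
```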
The above derivation of LDA requires the within-class scatter matrix to be invertible. This condition, however, is violated when there are fewer data points than dimensions. In this case, SVD is used to compute the projection matrix using an orthonormal basis :math:`{\bf Q}`:

.. math::

    {\bf W} := {\bf Q} {\bf{W^\prime}}

See Chapter 16 in :cite:`barber2012bayesian` for a detailed introduction.
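The small-sample trick can be sketched as follows. This is a hedged NumPy illustration (not Shogun's internals): restrict the problem to the span of the data via an orthonormal basis :math:`{\bf Q}` from an SVD, solve LDA in the reduced space, and map back with :math:`{\bf W} = {\bf Q}{\bf W^\prime}`:

```python
# Sketch of the small-sample (N < D) case in plain NumPy, not Shogun's
# internals: solve LDA in the span of the data, then map back via Q.
import numpy as np

rng = np.random.default_rng(1)
n, d = 10, 50                          # fewer samples than dimensions
X0 = rng.normal(0.0, 1.0, size=(n, d))
X1 = rng.normal(1.0, 1.0, size=(n, d))

# Orthonormal basis Q of the span of the centered data; restricted to
# this span, the within-class scatter is a small, tractable matrix.
X = np.vstack([X0, X1])
_, _, Vt = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)
Q = Vt.T                               # (d, r) with orthonormal columns

# Two-class LDA in the reduced space
Z0, Z1 = X0 @ Q, X1 @ Q
mu0, mu1 = Z0.mean(axis=0), Z1.mean(axis=0)
S_W = (Z0 - mu0).T @ (Z0 - mu0) + (Z1 - mu1).T @ (Z1 - mu1)
# A tiny ridge keeps the reduced scatter matrix invertible
w_prime = np.linalg.solve(S_W + 1e-8 * np.eye(S_W.shape[0]), mu1 - mu0)

w = Q @ w_prime                        # map back: W = Q W'
```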

-------
Example
-------

We create :sgclass:`CDenseFeatures` (here 64-bit floats, aka RealFeatures) and :sgclass:`CBinaryLabels` from files with training and test data.

.. sgexample:: lda.sg:create_features

We create an instance of the :sgclass:`CLDA` classifier and set features and labels. By default, Shogun automatically chooses the decomposition method based on whether :math:`N \leq D` or :math:`N > D`.

.. sgexample:: lda.sg:create_instance

Then we train and apply it to test data, which here gives :sgclass:`CBinaryLabels`.

.. sgexample:: lda.sg:train_and_apply

We can extract the weights :math:`{\bf w}`.

.. sgexample:: lda.sg:extract_weights

We can evaluate test performance via e.g. :sgclass:`CAccuracyMeasure`.

.. sgexample:: lda.sg:evaluate_accuracy

----------
References
----------

:wiki:`Linear_discriminant_analysis`

.. bibliography:: ../../references.bib
    :filter: docname in docnames
CSVFile f_feats_train("../../data/classifier_binary_2d_linear_features_train.dat")
CSVFile f_feats_test("../../data/classifier_binary_2d_linear_features_test.dat")
CSVFile f_labels_train("../../data/classifier_binary_2d_linear_labels_train.dat")
CSVFile f_labels_test("../../data/classifier_binary_2d_linear_labels_test.dat")

#![create_features]
RealFeatures features_train(f_feats_train)
RealFeatures features_test(f_feats_test)
BinaryLabels labels_train(f_labels_train)
BinaryLabels labels_test(f_labels_test)
#![create_features]

#![create_instance]
LDA lda()
lda.set_features(features_train)
lda.set_labels(labels_train)
#![create_instance]

#![train_and_apply]
lda.train()
BinaryLabels labels_predict = lda.apply_binary(features_test)
#![train_and_apply]

#![extract_weights]
RealVector w = lda.get_w()
#![extract_weights]

#![evaluate_accuracy]
AccuracyMeasure eval()
real accuracy = eval.evaluate(labels_predict, labels_test)
#![evaluate_accuracy]

# additional integration testing variables
RealVector output = labels_predict.get_labels()