Commit: add least squares regression cookbook
Showing 4 changed files with 66 additions and 31 deletions.
Submodule data updated: 8 files
doc/cookbook/source/examples/regression/least_squares_regression.rst (38 additions, 0 deletions)
@@ -0,0 +1,38 @@
========================
Least Squares Regression
========================

A linear regression model can be defined as :math:`y_i = {\bf w}\cdot{\bf x}_i`, where :math:`y_i` is the predicted value, :math:`{\bf x}_i` is a feature vector and :math:`{\bf w}` is the weight vector. We aim to find the linear function that best explains the data, i.e. the :math:`{\bf w}` that minimizes the error, or loss, function :math:`E({\bf w})`:

.. math::

    E({\bf w}) = \sum_{i=1}^N (y_i - {\bf w}\cdot{\bf x}_i)^2

Differentiating :math:`E({\bf w})` with respect to :math:`{\bf w}` and equating to zero gives the minimizing :math:`{\bf w}` in closed form:

.. math::

    {\bf w} = \left(\sum_{i=1}^N {\bf x}_i{\bf x}_i^T\right)^{-1}\left(\sum_{i=1}^N y_i{\bf x}_i\right)
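The closed-form solution above can be illustrated outside of Shogun with a short NumPy sketch (a hypothetical stand-alone example with synthetic data, not part of the cookbook listing):

```python
import numpy as np

# Synthetic data: labels generated by a known weight vector plus small noise
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
X = rng.uniform(-1.0, 1.0, size=(200, 2))      # rows are the feature vectors x_i
y = X @ true_w + 0.01 * rng.normal(size=200)   # labels y_i

# w = (sum_i x_i x_i^T)^{-1} (sum_i y_i x_i), i.e. solve (X^T X) w = X^T y
w = np.linalg.solve(X.T @ X, X.T @ y)
```

Solving the normal equations with `np.linalg.solve` avoids forming the explicit matrix inverse, which is numerically preferable; the recovered `w` should be close to `true_w`.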

-------
Example
-------

Imagine we have files with training and test data. We create :sgclass:`CDenseFeatures` (here 64-bit floats, aka RealFeatures) and :sgclass:`CRegressionLabels` as

.. sgexample:: least_squares_regression.sg:create_features

We create an instance of the :sgclass:`CLeastSquaresRegression` classifier, passing it training data and labels.

.. sgexample:: least_squares_regression.sg:create_instance

Then we train the :sgclass:`CLeastSquaresRegression` instance and apply it to the test data.

.. sgexample:: least_squares_regression.sg:train_and_apply

After prediction, we can output the predicted :math:`y_i` and compute the mean squared error using :sgclass:`CMeanSquaredError`.

.. sgexample:: least_squares_regression.sg:ouput_and_calculate_error
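For orientation, the mean squared error reported here is the average squared residual between predictions and true labels; a minimal NumPy sketch with made-up numbers (independent of the Shogun API):

```python
import numpy as np

# Hypothetical predictions and ground-truth labels, for illustration only
labels_predict = np.array([1.1, 1.9, 3.2])
labels_test = np.array([1.0, 2.0, 3.0])

# Mean squared error: (1/N) * sum_i (yhat_i - y_i)^2
mse = np.mean((labels_predict - labels_test) ** 2)
# mse is 0.02 for these values: (0.01 + 0.01 + 0.04) / 3
```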

----------
References
----------

:wiki:`Ordinary_least_squares`
@@ -0,0 +1,27 @@
CSVFile f_feats_train("../data/regression_1d_linear_features_train.dat")
CSVFile f_feats_test("../data/regression_1d_linear_features_test.dat")
CSVFile f_labels_train("../data/regression_1d_linear_labels_train.dat")
CSVFile f_labels_test("../data/regression_1d_linear_labels_test.dat")

#![create_features]
RealFeatures features_train(f_feats_train)
RealFeatures features_test(f_feats_test)
RegressionLabels labels_train(f_labels_train)
RegressionLabels labels_test(f_labels_test)
#![create_features]

#![create_instance]
LeastSquaresRegression lsr(features_train, labels_train)
#![create_instance]

#![train_and_apply]
lsr.train()
Labels labels_predict = lsr.apply(features_test)
#![train_and_apply]

#![ouput_and_calculate_error]
MeanSquaredError mse()
real error = mse.evaluate(labels_predict, labels_test)
RealVector output = labels_predict.get_values()
#![ouput_and_calculate_error]
examples/undocumented/python_modular/regression_least_squares_modular.py (0 additions, 30 deletions)
This file was deleted.