Merge pull request #3248 from Saurabh7/larscb
lars cookbook
Showing 2 changed files with 98 additions and 0 deletions.
doc/cookbook/source/examples/regression/least_angle_regression.rst (51 additions, 0 deletions)
@@ -0,0 +1,51 @@
=======================
Least Angle Regression
=======================

Least Angle Regression (LARS) is an algorithm used to fit a linear regression model. LARS is similar to forward stagewise regression, but less greedy: instead of including variables at each step, the estimated parameters are increased in a direction equiangular to each one's correlation with the residual. LARS can be used to solve LASSO, i.e. :math:`\ell_1`-regularized least-squares regression:

.. math::

    \min_{\beta}\ \|X^T\beta - y\|^2 + \lambda\|\beta\|_{1},\qquad \|\beta\|_1 = \sum_i|\beta_i|,

where :math:`X` is the feature matrix with explanatory features and :math:`y` is the dependent variable to be predicted.
Pre-processing of :math:`X` and :math:`y` is needed to ensure the correctness of this algorithm: :math:`X` needs to be normalized (each feature should have zero mean and unit norm) and :math:`y` needs to be centered (its mean should be zero).
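These two pre-processing requirements can be illustrated outside Shogun; the following is a minimal NumPy sketch with made-up toy data (the array shapes and random data are assumptions for illustration only):

```python
import numpy as np

# Hypothetical toy data: 5 samples with 3 explanatory features.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
y = rng.normal(size=5)

# Normalize X: subtract each feature's mean, then scale each feature to unit norm.
X = X - X.mean(axis=0)
X = X / np.linalg.norm(X, axis=0)

# Center y: subtract its mean.
y = y - y.mean()
```

After these steps each column of `X` has zero mean and unit norm, and `y` has zero mean, which is exactly what :sgclass:`CPruneVarSubMean` and :sgclass:`CNormOne` accomplish in the Shogun example below.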

-------
Example
-------

Imagine we have files with training and test data. We create :sgclass:`CDenseFeatures` (here 64 bit floats aka RealFeatures) and :sgclass:`CRegressionLabels` as

.. sgexample:: least_angle_regression.sg:create_features

To normalize and center the features, we create instances of the preprocessors :sgclass:`CPruneVarSubMean` and :sgclass:`CNormOne` and apply them to the feature matrices.

.. sgexample:: least_angle_regression.sg:preprocess_features

We create an instance of :sgclass:`CLeastAngleRegression` by choosing to disable the LASSO solution, setting the maximum :math:`\ell_1` norm :math:`\lambda` of :math:`\beta`, and setting the training data and labels.

.. sgexample:: least_angle_regression.sg:create_instance

Then we train the regression model and apply it to test data to get the predicted :sgclass:`CRegressionLabels`.

.. sgexample:: least_angle_regression.sg:train_and_apply

After training, we can extract :math:`{\bf w}`.

.. sgexample:: least_angle_regression.sg:extract_w

Finally, we can evaluate the :sgclass:`CMeanSquaredError`.

.. sgexample:: least_angle_regression.sg:evaluate_error
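The quantity that :sgclass:`CMeanSquaredError` reports is the average squared difference between predicted and true labels. A plain-Python sketch with made-up label values (the numbers are illustrative, not from the cookbook's data files):

```python
# Hypothetical predicted and ground-truth regression labels.
labels_predict = [1.1, 1.9, 3.2]
labels_test = [1.0, 2.0, 3.0]

# Mean squared error: average of the squared residuals.
mse = sum((p - t) ** 2 for p, t in zip(labels_predict, labels_test)) / len(labels_test)
```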

----------
References
----------
:wiki:`Least-angle_regression`

:wiki:`Stepwise_regression`
@@ -0,0 +1,47 @@
CSVFile f_feats_train("../../data/regression_1d_linear_features_train.dat") | ||
CSVFile f_feats_test("../../data/regression_1d_linear_features_test.dat") | ||
CSVFile f_labels_train("../../data/regression_1d_linear_labels_train.dat") | ||
CSVFile f_labels_test("../../data/regression_1d_linear_labels_test.dat") | ||

#![create_features]
RealFeatures features_train(f_feats_train) | ||
RealFeatures features_test(f_feats_test) | ||
RegressionLabels labels_train(f_labels_train) | ||
RegressionLabels labels_test(f_labels_test) | ||
#![create_features] | ||

#![preprocess_features]
PruneVarSubMean SubMean() | ||
NormOne Normalize() | ||
SubMean.init(features_train) | ||
SubMean.apply_to_feature_matrix(features_train) | ||
SubMean.apply_to_feature_matrix(features_test) | ||
Normalize.init(features_train) | ||
Normalize.apply_to_feature_matrix(features_train) | ||
Normalize.apply_to_feature_matrix(features_test) | ||
#![preprocess_features] | ||

#![create_instance]
real lambda1 = 0.01
LeastAngleRegression lars(False)
lars.set_features(features_train)
lars.set_labels(labels_train)
lars.set_max_l1_norm(lambda1)
#![create_instance]

#![train_and_apply]
lars.train()
RegressionLabels labels_predict = lars.apply_regression(features_test)
#![train_and_apply]

#![extract_w]
RealVector weights = lars.get_w()
#![extract_w]

#![evaluate_error]
MeanSquaredError eval()
real mse = eval.evaluate(labels_predict, labels_test) | ||
#![evaluate_error] | ||

# integration testing variables
RealVector output = labels_test.get_labels() | ||