
Issue #2747 - Using linalg everywhere #4058

Closed
wants to merge 4 commits

Conversation

@ckousik (Contributor) commented on Dec 29, 2017:

This PR is a work in progress that will transition Shogun to using linalg for linear algebra operations. Minimal unit tests will be written for the required classes.
Addresses #2747.

@lisitsyn (Member) commented on Jan 1, 2018:

Thanks for the patch!

Please take a look at the Travis CI status; the tests are apparently failing.

@karlnapf can you take a look?

@ckousik (Contributor, author) commented on Jan 1, 2018:

@lisitsyn I have looked through the Travis build. The failure is caused by the style checker. I'll push another commit shortly to fix it. Sorry :(

point[0] = 0;
cov(0,0) = 1.0;

CGaussian *gauss = new CGaussian(mean, cov, FULL);
Review comment (Member):

you could do auto gaussian = some<CGaussian>(mean, cov, FULL)

SGMatrix<float64_t> cov(2,2);

mean[0] = mean[1] = 0;
// Random variables are independent
Review comment (Member):

no need for this comment

cov(1,1) = 1.0;

auto gauss = new CGaussian(mean, cov, FULL);
// Find the log of the distribution function at the mean
Review comment (Member):

no need for comment

// Find the log of the distribution function at the mean
// position.
float64_t log_pdf = gauss->compute_log_PDF(mean);
EXPECT_NEAR(log_pdf, -1.83787706641, 1e-8);
Review comment (Member):
float64 is 16 digits, not just 8


auto mean = gauss->get_mean();
auto cov = gauss->get_cov();
EXPECT_NEAR(mean[0], 0.0, 1e-1);
Review comment (Member):

I would rather compute the mean by hand and check whether it is the same here.

CGaussian *gauss = new CGaussian();
auto train_features = new CDenseFeatures<float64_t>(data);
gauss->train(train_features);

Review comment (Member):
The comments above also hold here.

@karlnapf left a review:

Great work! Thanks for the patch.
I made a few comments; let me know if you need help addressing them.

@@ -0,0 +1,149 @@
/*
* Copyright (c) The Shogun Machine Learning Toolbox
* Written (w) 2014 Parijat Mazumdar
Review comment (Member):

you can put your name here :)

@ckousik (Contributor, author) commented on Jan 6, 2018:

@karlnapf I am getting precision errors when testing with eps set to 1e-16. I checked the DotFeatures unit tests and they only check to a tolerance of 1e-8.

@ckousik (Contributor, author) commented on Jan 6, 2018:

@karlnapf I'm also concerned about the usefulness of these tests. The Gaussian mean and covariance matrices are generated from DotFeatures, and so this feels like a more elaborate test for DotFeatures than for Gaussian.

TEST(Gaussian, train_univariate)
{
float64_t eps = 1e-8;
sg_rand->set_seed(1);
Review comment (Member):
Why do you need a fixed seed when you test the estimation? It should be OK every time, no?

auto mean = gauss->get_mean();
auto cov = gauss->get_cov();

for (int32_t i = 0; i < sample_size; i++)
Review comment (Member):
for (auto i : range(sample_size)) ... minor

SGMatrix<float64_t> data(1, 500);

int64_t sample_size = 500;
float64_t mn = 0.0, cv = 0.0;
Review comment (Member):
I don't like these variable names. What about something meaningful, like mean and covariance?

Review comment (Member):
mu and Sigma?

Review comment (Member):
or "mean_est" or so


SGMatrix<float64_t> data(2, sample_size);

for(int32_t i = 0; i < 2; i++){
Review comment (Member):
@vigsterkr don't we have ways to fill arrays with random gaussian numbers?

linalg::zero(sample_cov);

for (int32_t i = 0; i < sample_size; i++)
train_features->add_to_dense_vec(1.0, i, sample_mean.vector, 2);
Review comment (Member):
You are right, this should not be in here. I think there are linalg methods for this.

@karlnapf left a review:
You are right, the tests need some work.

Actually, you KNOW the true mean and covariance, so why don't you test against those if you are using EXPECT_NEAR with a low epsilon?

If you really want to check the exact values, I would rather compute the mean and cov for small examples by hand (pen and paper!) and then compare against that. Your examples/tests are too big.

One thing you could test for is buffer overflows/underflows with really large amounts of data (and make sure those cases work).

@karlnapf (Member) commented:

But this is already improving; let's iterate a few more times and this will be good to merge.
THANKS! :)

@ckousik closed this on Mar 16, 2019.