
Refactored the CLibLinear test codes and reduced them from 1300 lines to 470. #4208

Merged: 9 commits into shogun-toolbox:develop, Apr 4, 2018

Conversation

FaroukY (Contributor) commented Mar 17, 2018:

Regarding issue #4207.

FaroukY (Contributor, Author) commented Mar 17, 2018:

This pull request seems to have duplicated the commits from the previous pull request. The only file that should change here is the last one (LibLinear_unittest.cc); all the other commits are from other pull requests. I'll fix that in the next revision if one is needed.

vigsterkr (Member):

@FaroukY yeah, I think you forked your branch feature_fix_lib_linear_doc not from develop but from the other fix's branch. Just check out develop, pull the latest changes, cherry-pick 505f6ab onto it, and force-push to your remote feature_fix_lib_linear_doc; this way you don't even need to create a new PR ;)
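The branch-repair workflow suggested above can be sketched end-to-end in a throwaway repository. Everything below is illustrative: the commit contents are placeholders, and the real fix commit (505f6ab) is simulated by a local commit.

```shell
set -e
# work in a scratch repo so nothing real is touched
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email "dev@example.com" && git config user.name "dev"

# simulate the up-to-date develop branch
echo base > base.txt && git add base.txt && git commit -qm "base commit"
git branch -m develop

# the fix commit we want to keep (stands in for 505f6ab)
git checkout -qb feature_fix_lib_linear_doc
echo fix > LibLinear_unittest.cc && git add LibLinear_unittest.cc
git commit -qm "refactor liblinear tests"
fix_sha=$(git rev-parse HEAD)

# rebuild the feature branch from develop and cherry-pick only the fix;
# on the real remote this would be followed by `git push --force`
git checkout -q develop
git branch -qD feature_fix_lib_linear_doc
git checkout -qb feature_fix_lib_linear_doc
git cherry-pick "$fix_sha" >/dev/null
git log --oneline
```

The final log shows only the base commit plus the cherry-picked fix, which is exactly the state that lets the existing PR be force-updated instead of opening a new one.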

SGMatrix<float64_t> train_matrix = train_feats->get_feature_matrix();
SGMatrix<float64_t>::transpose_matrix(train_matrix.matrix,
train_matrix.num_rows, train_matrix.num_cols);
CLibLinear* ll = new CLibLinear();
Member:

You can use some in many of these cases, so you don't need to call SG_UNREF :)
For example:
auto ll = some<CLibLinear>();

Contributor Author:

Done

SG_UNREF(train_feats);
CBinaryLabels* pred = NULL;
float64_t liblin_accuracy;
CContingencyTableEvaluation* eval = new CContingencyTableEvaluation();
Member:

same here as above:
auto eval = some<CContingencyTableEvaluation>();

Contributor Author:

Done

ground_truth = new CBinaryLabels(labels);
ll->set_liblinear_solver_type(liblinear_solver_type);
ll->train();
pred = ll->apply_binary(test_feats);
Member:

this could be simply

auto pred = ll->apply_binary(test_feats);

Member:

and then you can remove the declaration of CBinaryLabels* pred = NULL;

Contributor Author:

Done

{
labels[i] = (i < num_samples/2) ? 1.0 : -1.0;
}
CLibLinear* ll = new CLibLinear();
Member:

same as above... this code shares quite a lot of lines with train_with_solver... we could combine them :)

Contributor Author:

I did think about this, but when I did it, it had a lot of if statements and it wasn't very readable (for a test case). Are we sure we wanna do this?

*/

index_t num_samples = 50;
CMath::init_random(5);
Member:

use sg_rand->set_seed plz

Contributor Author:

Done

Contributor Author:

Done


#include <shogun/classifier/svm/LibLinear.h>
#include <shogun/features/DataGenerator.h>
#include <shogun/features/DenseFeatures.h>
#include <shogun/evaluation/ContingencyTableEvaluation.h>
#include <shogun/mathematics/Math.h>
#include <gtest/gtest.h>
#include <map>
#include <string>
#include <vector>

using namespace shogun;

#ifdef HAVE_LAPACK
Member:

I think this can go, as it was required for the Gaussian blob generator, but that's not LAPACK-only anymore...

Contributor Author:

I used the string and vector includes in the tests, so I will only delete the <map> include; is that okay?

Member:

lapack can still go imo, no?

karlnapf (Member):

This needs a rebase as I fixed some missing SG_REF recently.
Should be pretty minimal.

FaroukY (Contributor, Author) commented Mar 25, 2018:

@karlnapf Going through the diff, the only difference I see in this file is the SG_REF(pred). Is that safe to assume, or am I missing another SG_REF?

FaroukY added a commit to FaroukY/shogun that referenced this pull request Mar 25, 2018
FaroukY (Contributor, Author) commented Mar 25, 2018:

Okay, so I verified it, rebased, and addressed the problems.

vigsterkr (Member):

@FaroukY OK, in order to merge this you need to remove the unrelated commits from this PR.

FaroukY (Contributor, Author) commented Mar 29, 2018:

@vigsterkr Removed all the unrelated commits. This should be okay to merge now.

SG_UNREF(test_feats);
SG_UNREF(ground_truth);
}
//Helper that tests can call to drastically reduce code
Member:

I would remove the comment and let the code speak for itself :) minor

karlnapf (Member) commented Apr 2, 2018:

The liblinear test segfaults under clang.
Could you pls run it with valgrind and make sure it neither leaks nor does uninitialised memory reads?
https://travis-ci.org/shogun-toolbox/shogun/jobs/359675411#L4561

Pls check those things yourself next time, it is all there....

karlnapf (Member) commented Apr 2, 2018:

The Windows build passed, but only because it doesn't have LAPACK installed and you still have the test guarded, so it is not executed.

FaroukY (Contributor, Author) commented Apr 3, 2018:

@karlnapf @vigsterkr Got rid of the memory leak. It seems like Travis timed out on two jobs, but the remaining ones passed. Any ideas?

karlnapf (Member) commented Apr 3, 2018:

clang passed, so the tests executed there were fine. Good.
The timeout is not your fault.

I think this is OK to be merged soon.
I made a few more comments inline.

}
void generate_data_l1_simple()
{
generate_data_simple("L1_SIMPLE");
Member:

let's not do this string argument to determine the behaviour of the function. Just split it a bit further into helpers.

Contributor Author:

Removed it and simplified the functions with extra helpers

{
generate_data_simple("L2_SIMPLE");
}
void generate_data(std::string type) //Type either "L1" or "L2"
Member:

as said, this big function should be split into multiple helpers, and pls no string arguments to determine behaviour

Contributor Author:

Done

SG_UNREF(pred);
}
SGMatrix<float64_t> data =
CDataGenerator::generate_gaussians(num_samples, 2, 2);
Member:

could you pls check whether this class is guarded with lapack, and if not, remove the lapack guard around the test?

Contributor Author:

It doesn't use lapack, so I removed the guard.

TEST(LibLinear,train_L2R_L2LOSS_SVC)
{
LIBLINEAR_SOLVER_TYPE liblinear_solver_type = L2R_L2LOSS_SVC;
/*
Member:

the formatting of this comment is weird. Also, the comment is superfluous: if you name your variables nicely (they are), the code is self-explaining and we don't need this comment, as it doesn't add any information

We have to transpose the data if its l2. If it is l1, then leave it as it is (Since this is the data of l1 originally)
*/
index_t num_samples = 10;
std::vector<int32_t> x{0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9};
Member:

you can initialise SGVector in the same way btw

Contributor Author:

Done, I replaced vector with SGVector and removed both includes for string and vector

if (type=="L1_SIMPLE")
train_data(x[i],y[i])=z[i];
else
train_data(y[i], x[i])=z[i]; //transpose
Member:

this is quite cryptic, why not do a transpose_matrix call?

Contributor Author:

Done, I replaced this with transpose_matrix after the training data is filled in.

SGMatrix<float64_t>::transpose_matrix(train_matrix.matrix,
train_matrix.num_rows, train_matrix.num_cols);

SG_UNREF(train_feats);
Member:

why removing the old one?

Member:

also CDenseFeatures<ST>::get_transposed()

Member:

I think you don't need your own method for this, just use the above call

Contributor Author:

I removed the transpose_feats_matrix() method and replaced it with calls to get_transposed().

ll->train();
pred = ll->apply_binary(test_feats);
SG_REF(pred);
/*
Member:

minor: could you make those 3 lines one line?

Contributor Author:

Not sure which 3 lines. Do you mean the if statement?

Member:

the comment ... but it is minor

void initialize_train_and_test_and_ground_simple()
{
SGMatrix<float64_t> train_data(2, 10);
SGMatrix<float64_t> test_data(2, 10); //Always 2x10, doesn't matter l1 or l2
Member:

pls remove this comment, it doesnt help

Contributor Author:

Done

}
void generate_data_l1_simple()
{
initialize_train_and_test_and_ground_simple();
Member:

what about removing the _simple suffix?

Contributor Author:

There is already initialize_train_and_test_and_ground() for the normal data, so I used the _simple suffix to indicate that this is the simple data initializer (which is entirely different from the non-simple one).

Contributor Author:

Actually I agree, I got rid of both initialize_train_and_test_and_ground and initialize_train_and_test_and_ground_simple. Thanks!

@@ -6,7 +6,6 @@
*
* Written (W) 2014 pl8787
Member:

while we are at it, could you make the license BSD? Just copy it from another file and keep the old authors (and add yourself)

Contributor Author:

Done, changed it to BSD license.

ll->set_bias_enabled(biasEnable);
ll->set_features(train_feats);
if (C_value)
ll->set_C(0.1,0.1); //Only in the case of L2R_L1LOSS_SVC_DUAL
Member:

the comment doesn't help.
There are also formatting issues.

Contributor Author:

Fixed

}
else
{
for(int i=0;i<t_w.vlen;i++)
Member:

for (auto i : range(t_w.vlen)) ... but this is minor

Contributor Author:

Changed it.

liblin_accuracy = eval->evaluate(pred, ground_truth);
SG_REF(ground_truth);
}
void generate_data_l1()
Member:

why this wrapper, you could just call the thing inside :)

Contributor Author:

Done

}
void generate_data_l2()
{
initialize_train_and_test_and_ground();
Member:

same here, the wrapper doesn't help

Contributor Author:

Good point, did some renaming to get rid of initialize_train_and_test_and_ground() and initialize_train_and_test_and_ground_simple()

}
void generate_data_l2_simple()
{
initialize_train_and_test_and_ground_simple();
Member:

formatting is all over the place here

Contributor Author:

Fixed

}

TEST(LibLinear,simple_set_train_L2R_LR)
/*
* --------------------------------
Member:

remove this

Contributor Author:

Done

karlnapf (Member) left a review comment:

Cool! One more round and this should be ready!
Thanks!

…t_transposed, and removed entirely two functions since they were unnecessairy wrappers
FaroukY (Contributor, Author) commented Apr 3, 2018:

@karlnapf Done.

* Written (W) 2014 pl8787
* Authors:
* Written (W) 2014 pl8787
* Refactored (R) 2018 Elfarouk Yasser
Member:

just add yourself as an author in the standard way... it makes later processing of headers easier

Contributor Author:

Fixed

//Transpose the train_feats matrix
auto old_train_feats = train_feats;
train_feats = train_feats->get_transposed();
SG_UNREF(old_train_feats);
Member:

why the unref here?

Contributor Author:

@karlnapf Doesn't get_transposed() create a new transposed matrix? So I'm just deallocating the old matrix (that I transposed)? Or does get_transposed() act in place?

Member:

I guess I would just delete the old one from the same scope that you created it in (tearDown or whatever), but this is also minor.

Make sure to run the test with valgrind to see whether the memory is fine btw

Contributor Author:

Ohh I see, get_transposed() handles the memory deallocation inside of it. So no need for SG_UNREF :)

Member:

I don't think it does...

Contributor Author:

Yeah, I got a bit confused. Since I always store the transposed matrix in train_feats, I need to deallocate the old memory before I leave the function or else we leak. TearDown will then release the transposed matrix (which is now stored in train_feats).

{
initialize_train_and_test_and_ground_simple();
generate_data_l2_simple();
//transpose train_feats matrix
Member:

the comment is useless ... super minor :)

Contributor Author:

Removed it :)

karlnapf (Member) commented Apr 3, 2018:

OK, cool! This is much better. Let's wait for the CI and then merge it!

FaroukY (Contributor, Author) commented Apr 3, 2018:

@karlnapf I re-added the SG_UNREFs before the get_transposed() calls since, after rereading, get_transposed() doesn't handle the memory deallocation. The SG_UNREF for the transposed matrix will be called in TearDown, so it should be okay now.

FaroukY (Contributor, Author) commented Apr 3, 2018:

The last CI run passed; let's wait for CI on the latest commit and then merge it :)

karlnapf (Member) commented Apr 4, 2018:

Did you run valgrind to check the memory usage of this test? No leaks, no uninitialized reads?

FaroukY (Contributor, Author) commented Apr 4, 2018:

@karlnapf
Valgrind says no leaks:

https://ibb.co/iLR5ux

karlnapf (Member) commented Apr 4, 2018:

Great!
Thanks so much for this clean-up. Very useful

@karlnapf karlnapf merged commit 63594a4 into shogun-toolbox:develop Apr 4, 2018
ktiefe pushed a commit to ktiefe/shogun that referenced this pull request Jul 30, 2019
… to 470. (shogun-toolbox#4208)

* Reduced the test lines using a few design patterns
* Fixed the required changes discussed in pr shogun-toolbox#4208
* Remove LAPACK requirement since it wasn't needed
* added BSD license, removed transpose function and replaced it with get_transposed, and removed entirely two functions since they were unnecessary wrappers
* Fixed BSD authors
* Removed unnecessairy comments