Minimize Neural Layer Unit Test Redundancies #4424
Conversation
.gitignore
Outdated
@@ -291,6 +291,10 @@ $RECYCLE.BIN/
 # Windows shortcuts
 *.lnk

+# CMake specific
Sorry, this would be CLion specific; will fix.
we usually don't add such specifics to the gitignore. Devs do that locally
Thanks, this is a really useful cleanup! We really appreciate you getting into this.
I made a few comments that should be discussed/changed. We can then iterate, do a few changes, and merge it.
Thanks again!
protected:
	void SetUp()
	{
		CMath::init_random(100);
can't we move way more of the shared code into the fixture?
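(For illustration, a minimal sketch of what that could look like; the fixture name and members here are hypothetical, not this PR's code:)

```cpp
// Illustrative sketch only: shared setup moves out of the individual
// tests into the fixture's SetUp().
class NeuralLayerTestFixture : public ::testing::Test
{
protected:
	void SetUp() override
	{
		CMath::init_random(100); // the shared seeding from above
		// other setup repeated across tests (input layer, random batch,
		// parameter vectors) could be initialized here as members
	}
};
```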
tests/unit/utils/Utils.h
Outdated
@@ -55,4 +58,44 @@ SGStringList<T> generateRandomStringData(
	return strings;
}

struct NeuralLayerTestUtil
this code is so specific, I suggest we move it into a helper file within neuralnets
tests/unit/utils/Utils.h
Outdated
struct NeuralLayerTestUtil
{
	template <typename T>
	static SGMatrix<T> create_rand_sgmat(
I think this is not a function we should have in neuralnet unittests. Rather, this should be somewhere where other random number generators sit, like `Random.h` or something more appropriate. Random normal matrices are used in a lot of places in shogun.
tests/unit/utils/Utils.h
Outdated
auto data_batch = SGMatrix<T>(num_rows, num_cols);
for (int32_t i = 0; i < data_batch.num_rows * data_batch.num_cols; i++)
{
	data_batch[i] = CMath::random(lower_bound, upper_bound);
actually, there should also be a more efficient way to generate an array of random normals. Check the array filling procedures in `Random.h`.
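(Rough sketch of the fill-then-scale idea, reusing the names from the snippet above. It assumes one of the `fill_array*` procedures fills a `float64_t` buffer with uniforms in `[0, 1)`, as discussed further down; check `Random.h` for the exact variant and signature:)

```cpp
// Sketch under assumptions: fill_array_co is taken to fill `len`
// uniforms in [0, 1) using the vectorized path.
auto data_batch = SGMatrix<float64_t>(num_rows, num_cols);
const int32_t len = num_rows * num_cols;
sg_rand->fill_array_co(data_batch.matrix, len); // vectorized uniform fill
for (int32_t i = 0; i < len; i++)               // then rescale to the bounds
	data_batch[i] = lower_bound + (upper_bound - lower_bound) * data_batch[i];
```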
tests/unit/utils/Utils.h
Outdated
{
	layer->initialize_neural_layer(layers, input_indices);
	SGVector<float64_t> params(layer->get_num_parameters());
	SGVector<bool> param_regularizable(layer->get_num_parameters());
reg_params
SGMatrix<float64_t> x;
CNeuralInputLayer* input;
std::tie(x, input) =
	NeuralLayerTestUtil::create_rand_input_layer<float64_t>(
it is nice to pull this kind of stuff out!
using namespace shogun;

class NeuralSoftmaxLayerTest : public ::testing::Test
this fixture is almost identical to the other one. Is there a reason to have this second one?
Yep will do so - I'll add a common fixture with the utility functions in `NeuralLayerUtil` right inside the `unit/neuralnets` folder.
I had 2
Addressed all your comments.
Ok cool, let me check this.
{
public:
	template <typename T>
	SGMatrix<T> create_rand_sgmat(
see my earlier comment: this is not nn specific but should be a global shogun option
tests/unit/utils/Utils.h
Outdated
#include <shogun/lib/SGMatrix.h>
#include <shogun/lib/SGStringList.h>
#include <shogun/lib/SGVector.h>
#include "shogun/lib/SGMatrix.h"
why this?
I believed that it's standard convention: system headers use `<>` vs. project files `""`. Mainly useful since it helps differentiate project includes from system includes. I can revert if Shogun follows a different convention regarding this.
I had made that change earlier when I added `NeuralLayerUtil`. Decided to leave it in since I felt it's a style improvement.
yes but in order to compile the shogun tests we add the include path via `-I`, so it by definition is a system include. Check out the other headers ... we are doing this everywhere :)
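(To spell that out, a tiny illustration using a header from this very diff:)

```cpp
// Both forms resolve via the -I include path the test build adds;
// the angle-bracket form is the existing shogun convention.
#include <shogun/lib/SGMatrix.h>  // project style, used throughout
#include "shogun/lib/SGMatrix.h"  // compiles too, but not the convention
```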
Oh ok, thanks for clarifying. I should also reverse the `""` includes I've done in the other classes then.
I'll just revert my changes on this file then. If needed, even the `#pragma once` can be added later.
tests/unit/utils/Utils.h
Outdated
@@ -3,12 +3,11 @@
 *
 * Authors: Giovanni De Toni, Heiko Strathmann
 */
-#ifndef __UTILS_H__
-#define __UTILS_H__
+#pragma once
we will need to discuss whether we allow this...
@vigsterkr @lisitsyn ?
	int32_t num_rows, int32_t num_cols, T lower_bound, T upper_bound) const
{
	auto data_v = SGVector<T>(num_rows * num_cols);
	data_v.random(lower_bound, upper_bound);
e.g. we could have a random method in SGMatrix (because we have one in vector already)
this code is a bit too much noise for my taste, especially in a test
This is almost good to go. Nice work! :)
Just the one comment about random matrices.
I would suggest to add SGMatrix::random, and then do an optional elementwise multiplication using linalg
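(A hypothetical usage sketch of that proposal. `SGMatrix::random` does not exist yet at this point, and the `linalg::add(A, B, alpha, beta)` overload computing `alpha*A + beta*B` is assumed:)

```cpp
// Fill with uniforms in [0, 1), then rescale elementwise to [lo, hi).
const index_t num_rows = 3, num_cols = 4;
const float64_t lo = -1.0, hi = 1.0;

SGMatrix<float64_t> m(num_rows, num_cols);
m.random(); // proposed argument-free fill

SGMatrix<float64_t> shift(num_rows, num_cols);
shift.set_const(lo);

// elementwise map to [lo, hi): m_scaled = (hi - lo) * m + shift
auto m_scaled = linalg::add(m, shift, hi - lo, 1.0);
```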
Force-pushed 957bb76 to 1edd4cb
Addressed comments:
- Further reduced a number of redundancies
- Added `shogun::linalg` ops wherever possible
src/shogun/lib/SGMatrix.cpp
Outdated
template <class T>
void SGMatrix<T>::random(T min_value, T max_value)
{
	for (index_t idx = 0; idx < num_rows * num_cols; ++idx)
pls, instead of a loop, use the `CRandom::fill_array*` functions that in some situations can be much much faster
I was making it consistent with `SGVector`, but sure, yes I noticed it uses SIMD instructions. I think `SGVector::random` should be changed accordingly too; not sure whether that should be done in this or a different PR.
There are two catches here though:

1. `SGMatrix::random` should support the following types: `int32/64`, `uint32/64`, `float32/64/128`. `fill_array*` doesn't support `int32/64` and `float32/128`; we will have to default back to the for loop implementation above for these types.
2. The `uint32/64` variants return a random 32- or 64-bit integer (not within any range), and the `float64` variant returns a float between `[0, 1)`. Additional processing steps will be needed to scale them to `[min, max]`. Finally, the `uint` version only vectorizes random number generation if certain strict conditions are met, else it falls back to a for loop.
Yes, you are exactly right: this needs template specialization, and a loop fallback is fine imo.
Let's do it for `SGMatrix::random` in this PR. I suggest we keep the interface without arguments, i.e. `SGMatrix::random()` and `SGMatrix::random_normal()`, and then the caller is responsible for scaling things to the appropriate bounds using linalg.
Very nicely thought through! :)
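(A rough sketch of that agreed shape, not the PR's final code; `fill_array_co` and the `matrix` data pointer are assumptions based on this discussion:)

```cpp
// Generic fallback: plain loop of scalar draws in [0, 1).
template <class T>
void SGMatrix<T>::random()
{
	for (index_t i = 0; i < num_rows * num_cols; ++i)
		matrix[i] = static_cast<T>(CMath::random(0.0, 1.0));
}

// Fast path for the types fill_array* supports, e.g. float64:
// vectorized array fill with uniforms in [0, 1).
template <>
void SGMatrix<float64_t>::random()
{
	sg_rand->fill_array_co(matrix, num_rows * num_cols);
}
```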
Thanks, though I'd like to clarify (2) again. I get the `SGMatrix::random()` function, which would just fill the array with a uniform random distribution (using `fill_array*`). For `random_normal` I would need to further apply something like the Box-Muller transform, right? Possibly using `shogun::linalg` ops.
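(For reference, the Box-Muller idea being discussed, as a standalone sketch rather than shogun code: two independent uniforms in (0, 1) map to two independent standard normals.)

```cpp
#include <cmath>

// Box-Muller: (u1, u2) uniform in (0, 1) -> (z0, z1) standard normal.
inline void box_muller(double u1, double u2, double& z0, double& z1)
{
	const double two_pi = 2.0 * 3.14159265358979323846;
	const double r = std::sqrt(-2.0 * std::log(u1)); // requires u1 > 0
	z0 = r * std::cos(two_pi * u2);
	z1 = r * std::sin(two_pi * u2);
}
```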
@saatvikshah1994
we have two options here:
- Make a loop over `CRandom::random_normal` inside `SGMatrix::random` (since `CRandom` doesn't support SIMD random normal generation)
- Implement a `random_normal_array` which
  a. has a default implementation with a loop for now
  b. first uses `fill_array` and then re-uses the Box-Muller transform code that is already present (would need some refactoring)
I suggest we keep things simple for now:
- offer `CRandom::random_normal_array` and use a loop for now
- if you want: make another PR once this is merged where you do option 2b from above

We definitely only want `SGMatrix::random_normal` to be a wrapper, i.e. no transforms or other complicated things in there.
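(In code, that plan might look roughly like this; signatures are illustrative, and `std_normal_distrib` is assumed to be `CRandom`'s existing scalar normal draw:)

```cpp
// Loop-based array variant in CRandom (the "keep it simple" option).
void CRandom::random_normal_array(float64_t* array, int32_t len)
{
	for (int32_t i = 0; i < len; ++i)
		array[i] = std_normal_distrib(); // existing scalar draw (assumed name)
}

// SGMatrix::random_normal stays a thin wrapper: no transforms in here.
template <>
void SGMatrix<float64_t>::random_normal()
{
	sg_rand->random_normal_array(matrix, num_rows * num_cols);
}
```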
@karlnapf thanks, this makes sense.
I had one suggestion though: I feel that the entire `random()` and `random_normal()` impl should go in its own separate (issue + PR) combo. Reason being that (1) the implementation is less related to the immediate issue of minimizing the test redundancies and (2) the changes should ideally be propagated to `SGVector` as well; overall a significant portion of this PR would then mix up two disparate issues. For this PR, I could just use the original `NeuralLayerFixture::create_rand_matrix`, which would then be replaced in the second PR.
I'm up for whichever way you lean towards.
Apart from the loop in random, this is ready to be merged
	}
}
shogun::linalg::add_vector(
	shogun::linalg::matrix_prod(weights, x), biases, A_ref);
you could store this in a variable as it is not optimized out or anything anyways
Not sure what you mean? `add_vector` itself stores its result in the position denoted by `A_ref` and returns void.
sorry I just mean to not nest linalg calls for readability. minor
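(For example, the same computation as the snippet above, just with the intermediate product named:)

```cpp
auto Wx = shogun::linalg::matrix_prod(weights, x);
shogun::linalg::add_vector(Wx, biases, A_ref);
```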
#include <shogun/neuralnets/NeuralInputLayer.h>
#include <shogun/neuralnets/NeuralSoftmaxLayer.h>

#include <memory>
do you use anything from this?
no will remove
Yes I agree!
This turned out to be a bit more effort than I first thought! ;)
Let's keep your original function then, and feel free to take a stab at the other one you mentioned!
Thanks for the nice discussion!
Force-pushed 9011c8e to d2e478e
protected:
	void SetUp() final
	{
		CMath::init_random(SEED);
do we need to set the seed? would prefer not to (portability)
Not clear to me why this makes it less portable? I kept it to ensure consistency on each run + it was being done earlier :). You've approved the PR so should I change it still?
Different environments -> different random numbers for the same seed
If all the tests pass without it, I would prefer without ... let me know.
unit tests are one binary, so if you change the state of libshogun, all other tests might be affected. Even if not, it is better practice to only ensure something if it is needed.
Yes, all approved otherwise! :)
Looks pretty good to me!
Thanks for this, really useful
Once this is merged, I'll need to sift through the remaining Neural*Layer tests and apply similar changes.
great!
Ah sorry, there is one thing: you need to rebase against develop, since you have a `data` submodule commit in there. See the first changed file in the diff.
Run clang-format on diff
Suggested improvements + linalg usage
Addressed review comments
Remove random seed
…ayer_test_redundancies
Force-pushed 9d148fc to 7b5dcc4
Thanks for this.
The CI output looks correct, but the buildbot seems to have failed on multiple machines here: http://buildbot.shogun-toolbox.org:8080/#/ although the errors reported seem arbitrary.
Yep, CI is indeed fine. For buildbot, just check the builder list for cases that were green until we merged this PR (last build); they are all fine. The broken builds need fixing, most of them at least, with the recently broken ones having a higher priority. It is usually a tiny bit of work to figure out what causes errors, but we usually have logs to see which commit broke them.
* Fixed redundancies in RectifiedLinear and Softmax Layer unit tests
* Add common fixture class + formatting
* Remove random seed
I've currently made changes for the RectifiedLinearLayer and the Softmax layer. Since it's my first task, I wanted to be sure these changes look alright before fixing the remaining Neural*Layer tests.
Addresses #4233
I could probably also make the fixes needed to address #4232 as part of this PR, if that's alright with you all?