Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add tests for command-line and Python bindings (nca) #1369

Merged
merged 8 commits into from
May 23, 2018
Merged

Add tests for command-line and Python bindings (nca) #1369

merged 8 commits into from
May 23, 2018

Conversation

yaswagner
Copy link
Contributor

Tests for the NCA binding, to insure its inputs and outputs work correctly.
Added a nca_test.cpp file under maint_test.

Copy link
Contributor

@daiviknema daiviknema left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi there, thanks for the contribution! It looks like a number of checks are failing. This is because the minimum version of Armadillo supported by mlpack is 6.500. Unfortunately, arma::approx_equal is not available till 6.700. You could replace all calls to approx_equal with CheckMatrices which is available in mlpack/tests/test_tools.hpp, or write a loop by hand. That should help pass a few checks.

Copy link
Member

@rcurtin rcurtin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi there Yasmine,

Thanks for the contribution. The tests you've written look good to me. It would be nice to also add another few simple tests that look a lot like the ones you've already written, if you are willing:

  • Make sure that if input and labels are given, that an exception is thrown in mlpackMain() if the matrices have different numbers of points.

  • When tolerance is changed, we should check that the results are different. We could also do this for linear_scan, batch_size, and some of the L-BFGS-specific parameters. But some of those L-BFGS parameters like max_line_search_trials won't actually be guaranteed (or even likely) to cause a difference in the output, so I don't think we should worry about all of them.

Let me know what you think. If you are able to add the other tests, that is great, and if not we can just handle the single issue with approx_equal() and make sure the tests pass and then merge.

Thanks again! 👍


// Check that the output matrices are different.
BOOST_REQUIRE_EQUAL(arma::approx_equal(
CLI::GetParam<arma::mat>("output"), output, "absdiff", 0), false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid the use of approx_equal() here you could also do BOOST_REQUIRE_GT(arma::accu(CLI::GetParam<arma::mat>("output") != output), 0);

Copy link
Member

@rcurtin rcurtin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I didn't realize that this one was waiting on me!

Anyway, I think the tests look fine, so I will go ahead and merge this in 4 days to leave time for other comments. If you'd like to add your name to the list of contributors in src/mlpack/core.hpp and COPYRIGHT.txt, we can merge that too.

Also, if you want some mlpack stickers for your laptop, I'd be happy to mail them to you---just send me an email with your mailing address and I'll get them in the mail early next week (I am out of town over the weekend).

If you have a chance to look into the issues I mentioned (specifically the random seed issue), it would be great, but if not I'll handle it during merge. I think the tests are fine, that is just a minor issue to fix.

Thanks again for the contribution! 👍

/**
* Ensure that output is different when the tolerance is different.
*/
BOOST_AUTO_TEST_CASE(NCADiffferentToleranceTest)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a really minor spelling issue here with Diffferent but I can fix it when I do the merge, unless you get to it first. :)

// Set parameters and set normalize to true.
SetInputParam("input", std::move(x));
SetInputParam("labels", std::move(labels));
SetInputParam("normalize", true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, there is a little bit of trickiness here. NCA by default uses the SGD optimizer, and the SGD optimizer depends on the random seed since it chooses batches randomly. In this case, we want to ensure that normalization makes a difference, but as the code currently is, it's likely to give different results simply by virtue of the fact that the order in which batches are visited for SGD will be different.

I think a workaround we can use is for this type of test, we can set linear_scan to true, so that we will always visit points in the same order. (The linear_scan test is fine as-is, since there all we want to know is that we get a different result there. In fact, it might be useful to add another test where we run the same data twice with linear_scan set, and we should expect the same result.)

@zoq
Copy link
Member

zoq commented May 15, 2018

Perhaps, it's a good idea to edit the message above, and to send a private message.

Copy link
Member

@zoq zoq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, just pointed out some minor style issue.

" 1.0 0.0 -1.0 1.0 0.0 -1.0 ";
arma::Row<size_t> labels = " 0 0 0 1 1 1 ";

// Set parameters with a small bath_size.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor issue; small batch_size.

BOOST_FIXTURE_TEST_SUITE(NCAMainTest, NCATestFixture);

/**
* Ensure that, when labels are implicitily given with input,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pedantic style issue; missing space at the beginning, this should align with the first * from the first line.

" 1.0 0.0 -1.0 1.0 0.0 -1.0 ";
arma::Row<size_t> labels2 = " 0 0 0 1 1 1 ";

// Set parameters using the same input but set linear_scan flag to false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pedantic style issue number 2, missing . at the end.


/**
* Ensure that using a different value of max_iteration
* results in a different output matrix
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment above.

@yaswagner yaswagner changed the title #1152 Add tests for command-line and Python bindings Add tests for command-line and Python bindings (nca) May 23, 2018
@rcurtin rcurtin merged commit d857114 into mlpack:master May 23, 2018
@rcurtin
Copy link
Member

rcurtin commented May 23, 2018

Thanks again for the contribution! I fixed a handful of things during the merge:

  • f5bee6e, 7d12b2b: added an explicit test for the label size to nca_main.cpp---the previous test would only succeed when compiled in debug mode.
  • 93fa628: fixed a memory error I found with valgrind while running the tests (though that wasn't related to any code you wrote, I just found it and fixed it)
  • 6ad2eaf: minor issue, the labels should have had 100 labels not 3 :)
  • e18d61d: I applied linear_scan to the rest of the tests that used SGD

Let me know if I broke anything or changed it in a way you don't like; I think the changes are mostly pretty minor though. 👍

@yaswagner yaswagner deleted the 1152 branch May 25, 2018 23:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants