Recompute impostors only during Shuffle() call #1494

Open · wants to merge 3 commits into master
manish7294 (Member) commented Aug 14, 2018

See #1490.

@rcurtin

Great! Thank you for looking into this. Let me know what you think of the comments I left. Just a couple of sanity checks are left from my end; you can do them or I can, up to you:

  1. This should speed up LMNN for SGD-like optimizers significantly. Just need to run a test or two to make sure that's the case.
  2. Need to run the LMNNLowRankAccuracyBBSGDTest many times with different random seeds and make sure it never hangs.

If you can do those, great; if not, I'll get to them when I have a chance.

Thanks again for looking into this, I think it is a good solution. 👍

norm = newNorm.elem(ordering);
for (size_t i = 0; i < ordering.n_elem; i++)
if(!linearScan)

rcurtin (Member) commented Aug 14, 2018

Tiny style issue---don't forget a space here please. :)


Outdated review threads: src/mlpack/methods/lmnn/lmnn_function_impl.hpp, src/mlpack/methods/lmnn/lmnn_main.cpp
manish7294 (Member) commented Sep 5, 2018

@rcurtin I think it's ready for review. Please take a look when you find some time.


@rcurtin

Sorry it took me a little while to review this. I actually found (strangely) that the tests failed on my system but not on Travis. I'm still unclear on why the tests don't fail on Travis, but in any case what I found was that the LMNNSeparableObjectiveTest and LMNNSeparableEvaluateWithGradientTest failed:

$ bin/mlpack_test -t LMNNTest
Running 19 test cases...
/home/ryan/src/mlpack-manish/src/mlpack/tests/lmnn_test.cpp(214): fatal error: in "LMNNTest/LMNNSeparableObjectiveTest": difference{0.0152284} between lmnnfn.Evaluate(coordinates, 0, 1){1.6000000000000001} and 1.576{1.5760000000000001} exceeds 1e-05%
/home/ryan/src/mlpack-manish/src/mlpack/tests/lmnn_test.cpp(297): fatal error: in "LMNNTest/LMNNSeparableEvaluateWithGradientTest": difference{0.0152284} between objective{1.6000000000000001} and 1.576{1.5760000000000001} exceeds 1e-05%

*** 2 failures are detected in the test module "mlpackTest"

After I dug into it for a while, I noticed that the distance matrix is not actually properly initialized for this test. So the situation was: an LMNNFunction was initialized and Evaluate() was immediately called---but impostors are not recalculated there, so the distance matrix was never filled, and the value of distance(l, i) used on line 419 was some random uninitialized memory value.

I was able to resolve this by using the overload of Impostors() that takes a distance matrix in the constructor of LMNNFunction, but that isn't the only way to solve the issue, so feel free to do it differently. Could you take a look and check that my solution is correct, please?

I've found, though, that running LMNN with the bbsgd optimizer on the vc2 dataset still seems to fail even with this update. I need to think about it a bit... could the pruning we're using be causing problems? I think that is it, but I need to think of a solution. I'll update this when I think of something...
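Roughly, the pattern of the bug and of the fix looks like this (a toy sketch with made-up names, not mlpack's actual API):

#include <armadillo>

// If the cache is only filled in Shuffle(), an Evaluate() call issued
// before any Shuffle() reads uninitialized memory; filling the cache in
// the constructor avoids that.
class ToyFunction
{
 public:
  explicit ToyFunction(const arma::mat& dataset) : dataset(dataset)
  {
    // The fix: initialize the cache up front, so Evaluate() is valid
    // even if Shuffle() is never called.
    RecomputeCache();
  }

  void Shuffle()
  {
    // ... reorder points here, then refresh the cache ...
    RecomputeCache();
  }

  double Evaluate(const size_t i) const { return cachedDistances(i); }

 private:
  void RecomputeCache()
  {
    // Stand-in for the real impostor/distance computation.
    cachedDistances = arma::sum(arma::square(dataset), 0).t();
  }

  arma::mat dataset;
  arma::vec cachedDistances;
};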

arma::cube newEvalOld = evalOld;
arma::vec newlastTransformationIndices = lastTransformationIndices;
arma::mat newMaxImpNorm = maxImpNorm;
arma::vec newNorm = norm;

rcurtin (Member) commented Sep 7, 2018

You could use std::move() here to accelerate these in most cases, I think.

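For reference, something like this is what I mean (a sketch of the suggestion, not the actual patch):

// std::move() transfers the underlying memory instead of deep-copying it.
// The moved-from objects are left empty, so this is only safe because they
// are reassigned from the reordered copies immediately afterwards (e.g.
// norm = newNorm.elem(ordering) above).
arma::cube newEvalOld = std::move(evalOld);
arma::vec newlastTransformationIndices = std::move(lastTransformationIndices);
arma::mat newMaxImpNorm = std::move(maxImpNorm);
arma::vec newNorm = std::move(norm);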

@@ -384,7 +384,7 @@ BOOST_AUTO_TEST_CASE(LMNNLBFGSSimpleDatasetTest)
   lmnn.LearnDistance(outputMatrix);
   // Ensure that the objective function is better now.
-  LMNNFunction<> lmnnfn(dataset, labels, 1, 0.6, 1);
+  LMNNFunction<> lmnnfn(dataset, labels, 1, 0.6, 1, false);

rcurtin (Member) commented Sep 7, 2018

I think maybe it would be useful to add a test case where an SGD-like optimizer is used but linearScan is set to true, just to make sure we check every code path. You could just adapt the existing tests. Do you think that's reasonable?

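For illustration, something along these lines could work (everything here is a sketch: the sixth constructor argument is assumed to be the linearScan flag, and the toy data and optimizer setup just mirror the existing simple-dataset tests):

// Hypothetical adapted test: exercise the SGD-style code path with
// linearScan explicitly set to true.
BOOST_AUTO_TEST_CASE(LMNNSGDLinearScanSimpleDatasetTest)
{
  // Two trivially separable classes (illustrative data).
  arma::mat dataset("-0.1 -0.1 -0.1 0.1 0.1 0.1;"
                    " 1.0  0.0 -1.0 1.0 0.0 -1.0");
  arma::Row<size_t> labels("0 0 0 1 1 1");

  // Same constructor call as the diff above, but with linearScan = true.
  LMNNFunction<> lmnnfn(dataset, labels, 1, 0.6, 1, true);

  arma::mat coordinates = arma::eye<arma::mat>(2, 2);
  const double initObj = lmnnfn.Evaluate(coordinates);

  // Run an SGD-like optimizer, as the existing tests do.
  mlpack::optimization::StandardSGD sgd;
  sgd.Optimize(lmnnfn, coordinates);

  // Ensure that the objective function is better now.
  BOOST_REQUIRE_LE(lmnnfn.Evaluate(coordinates), initObj);
}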

rcurtin (Member) commented Sep 7, 2018

Thought before I go to bed (I think it is right): distance(l, i) is currently used as a substitute for metric.Evaluate(transformedDataset.col(i), transformedDataset.col(impostors(l, i))) under the condition (iteration - 1) % range == 0. But this may not work right anymore, since impostors are only calculated in Shuffle()! We can only use distance() as a speedup when the transformation matrix is the same as it was when impostors were calculated---so I wonder, maybe a replacement condition that could work is if (transformationDiff == 0.0)? Let me know what you think.

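Concretely, something like this (a fragment only; the names follow the discussion above, not the actual patch):

// The cached distance(l, i) is only valid while the transformation matrix
// matches the one in effect when impostors were last computed in Shuffle();
// transformationDiff measures that change, so a zero diff is the condition
// under which the cache may be reused.
double eval;
if (transformationDiff == 0.0)
  eval = distance(l, i);
else
  eval = metric.Evaluate(transformedDataset.col(i),
                         transformedDataset.col(impostors(l, i)));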

rcurtin (Member) commented Sep 14, 2018

Here's a patch that seems to work: https://gist.github.com/rcurtin/ecde14f3f9e459e4fb5d756052738466

It's possible it could be cleaned up a little. But it fixes the issue I described above. Let me know what you think.

