
LMNN distance learning #1407

Merged: 46 commits merged into mlpack:master on Jul 4, 2018

Conversation

manish7294 (Member) commented May 20, 2018:

An implementation of LMNN (large margin nearest neighbor) distance learning.

manish7294 added some commits May 20, 2018

rcurtin (Member) left a comment:

I took a look over the code; it looks like a great start. Just some minor comments from my end. Do you want to add some unit tests for it (e.g., simple toy problems)?

math::ClearAlias(labels);

dataset = std::move(newDataset);
labels = std::move(newLabels);

rcurtin (Member) commented May 21, 2018:

Each time the dataset is shuffled, I believe we have to recalculate TargetNeighbors().

rcurtin (Member) commented May 21, 2018:

Or, on second thought, it's not necessary to recalculate the target neighbors, but we do need to apply the same shuffling to the indices. I guess an easy way to do it for testing is just to call TargetNeighbors() again with the shuffled data (but that is very slow and we shouldn't use it as a long-term solution).

manish7294 (Author) commented May 22, 2018:

Ahh! I missed this. Thanks for pointing it out. You're right, a good strategy needs to be devised.

manish7294 (Author) commented May 22, 2018:

I read the ShuffleData() code, and I think that instead of calling ShuffleData() directly we can use its internal code (since that gives us the shuffle ordering) and just re-map the target neighbors to the new ordering's indices. What do you think?

rcurtin (Member) commented May 22, 2018:

I think that's fine; alternately, you could rewrite ShuffleData() so it uses variadic templates to process any number of matrices that need to be reordered.
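For reference, a minimal sketch of that variadic idea (the names here are hypothetical, not mlpack's actual ShuffleData() signature): generate one random column ordering and apply it to every matrix passed in.

    #include <mlpack/core.hpp>

    // Base case: no matrices left to reorder.
    inline void ApplyOrdering(const arma::uvec& /* ordering */) { }

    // Apply one shared random ordering to any number of matrices
    // (e.g., the dataset and the labels).
    template<typename MatType, typename... Rest>
    void ApplyOrdering(const arma::uvec& ordering, MatType& m, Rest&... rest)
    {
      MatType tmp = m.cols(ordering);  // Reorder columns by the ordering.
      m = std::move(tmp);
      ApplyOrdering(ordering, rest...);
    }

    template<typename... MatTypes>
    void ShuffleAll(const size_t numCols, MatTypes&... matrices)
    {
      const arma::uvec ordering =
          arma::shuffle(arma::regspace<arma::uvec>(0, numCols - 1));
      ApplyOrdering(ordering, matrices...);
    }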

manish7294 (Author) commented May 23, 2018:

I think using variadics alone won't work: in the case of target neighbors, the matrix stores indices of points, so a shuffle changes the values of all its elements too, not just the column order. So just reordering the columns of the target neighbors matrix according to the shuffle ordering won't produce the correct target neighbors matrix.

}
}

return cost;

rcurtin (Member) commented May 21, 2018:

This looks correct to me.

}
}

return cost;

rcurtin (Member) commented May 21, 2018:

This also looks correct to me.

}

//! Calculate outer product of vector.
arma::mat OuterProduct(arma::vec& diff)

rcurtin (Member) commented May 21, 2018:

I think it's unnecessary to have this function (I think it's just fine to write it in-place), but if you do want to include it, I'd suggest inlining it.
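If it is kept, the inlined version would just be something like:

    //! Calculate the outer product of a vector (sketch of the inlined form).
    inline arma::mat OuterProduct(const arma::vec& diff)
    {
      return diff * diff.t();
    }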

// Calculate gradient due to target neighbors.
arma::vec cij = dataset.col(i) - dataset.col(targetNeighbors(j, i));
gradient += (1 - regularization) * OuterProduct(cij);
}

rcurtin (Member) commented May 21, 2018:

I think that you could also move the gradient += (1 - regularization) * OuterProduct(cij) outside the loop, and just take the sum of all cijs inside the loop.
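One way to realize that (a sketch reusing dataset, targetNeighbors, regularization, gradient, and the loop index i from the snippet above, with k target neighbors): collect the cij vectors as columns and form the sum of their outer products with a single matrix product.

    // Sketch: sum_j cij * cij^T == C * C^T, where the columns of C are the
    // cij vectors, so one matrix product replaces k separate outer products.
    arma::mat cijCols(dataset.n_rows, k);
    for (size_t j = 0; j < k; j++)
      cijCols.col(j) = dataset.col(i) - dataset.col(targetNeighbors(j, i));
    gradient += (1 - regularization) * (cijCols * cijCols.t());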

//! Compute gradient over whole dataset.
template<typename MetricType>
void LMNNFunction<MetricType>::Gradient(const arma::mat& coordinates,
                                        arma::mat& gradient)

rcurtin (Member) commented May 21, 2018:

This assumes the same impostors as Evaluate() (or, I guess, the real assumption is that Gradient() will always be called after Evaluate() with the same coordinates). I think that is true in most cases, but if you are having trouble getting the algorithm to converge correctly, you could always try manually recalculating impostors here. In addition, if you implement an EvaluateWithGradient() function, an optimizer can make use of that and you can save on the calculation of impostors there. (See http://mlpack.org/docs/mlpack-git/doxygen/optimizertutorial.html, section "The FunctionType API", if you haven't already)
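To illustrate the shape of that API, a toy FunctionType sketch with a combined EvaluateWithGradient() (the quadratic objective here is only an example, not LMNN's):

    #include <mlpack/core.hpp>

    class ToyFunction
    {
     public:
      // Returns the objective and fills the gradient in one pass, so any
      // shared work (like LMNN's impostor search) is done only once.
      double EvaluateWithGradient(const arma::mat& coordinates,
                                  arma::mat& gradient)
      {
        gradient = 2 * coordinates;                   // d/dM of sum(M % M).
        return arma::accu(coordinates % coordinates); // Sum of squares.
      }
    };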

manish7294 (Author) commented May 22, 2018:

You're absolutely correct; that's the real reason behind avoiding re-calculation of the impostors. Thanks for mentioning EvaluateWithGradient(), I totally missed its significance earlier. :)

arma::cx_vec eigval;
arma::cx_mat eigvec;

arma::eig_gen(eigval, eigvec, iterate);

rcurtin (Member) commented May 21, 2018:

iterate has to be symmetric (since we are optimizing M), so you can use eig_sym() here which will make your life a bit easier.
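That is, a sketch of the symmetric version, with real-valued outputs:

    // For a symmetric matrix, eig_sym() returns real eigenvalues and
    // eigenvectors, so no complex types are needed.
    arma::vec eigval;
    arma::mat eigvec;
    arma::eig_sym(eigval, eigvec, iterate);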

manish7294 (Author) commented May 22, 2018:

Yeah, it will definitely help :)

template<typename DecomposableFunctionType>
double StandardSGD::Optimize(
    DecomposableFunctionType& function,
    arma::mat& iterate)

rcurtin (Member) commented May 21, 2018:

I think that this will work, but I'm not sure how Projection() will work for L_BFGS. I suspect that maybe you could use a custom update policy for SGD to get the projection in there (instead of this overload). I'm not sure of the best way to do it for L-BFGS, I'd have to think about it a bit...

manish7294 (Author) commented May 22, 2018:

What do you think about overloading the update policy's Update()? Should we keep one overload for each of the three types, or just one (like the vanilla update)?
And in the case of L-BFGS, I currently see a LineSearch overload as an option, but it may not be the best one.

rcurtin (Member) commented May 22, 2018:

Hmm, so it depends on whether or not we plan to go with this approach in the long term. In the short term, for testing, we can do something really ugly and at each Evaluate() call, cast the const arma::mat& coordinates to an arma::mat& then project it. That can work for testing but it is, in my view, far too much of a hack to actually commit long-term.

Alternately, I could see overloading a few optimizers to handle the projection step and then using a static_assert() to ensure that an overloaded optimizer is used, or, modifying all of the optimizers to have a PostStepPolicy or something like this that allows some action to be taken after a step. (That last idea would be a lot of work for just one FunctionType though.)
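The short-term hack would look roughly like this inside Evaluate() (Projection() is the hypothetical projection step; for testing only):

    // TESTING-ONLY HACK: cast away constness, then project the coordinates
    // back onto the feasible (positive semidefinite) set.
    arma::mat& mutableCoordinates = const_cast<arma::mat&>(coordinates);
    Projection(mutableCoordinates);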

manish7294 (Author) commented May 23, 2018:

Let's use the casting hack for now. As you said, we may not even be using the projection as we move further.

manish7294 added some commits May 22, 2018

manish7294 (Author) commented May 22, 2018:

Sure, I will add some tests shortly.

rcurtin (Member) commented May 22, 2018:

Sounds good---once the tests are there, we can know whether or not LMNN is working right. We can try to reproduce the results in the paper, and see how long it takes to run. Then we could switch out the M matrix for L^T L and this means there's no need for a projection step or any of the ugly hacks necessary for that, and then we can see if that's faster. Let me know if you think that's a reasonable plan, and if you already had something else in mind we could do that instead. :)
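For reference, the reason the L^T L reparameterization needs no projection step, written out:

    % With M = L^T L, the distance is a squared norm, hence M is positive
    % semidefinite by construction for any L.
    d_M(x_i, x_j) = (x_i - x_j)^T M (x_i - x_j)
                  = (x_i - x_j)^T L^T L (x_i - x_j)
                  = \| L (x_i - x_j) \|_2^2 \ge 0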

manish7294 (Author) commented May 23, 2018:

That sounds like a well-put plan. I am more than happy to go with it :)

manish7294 added some commits May 23, 2018

manish7294 (Author) commented May 24, 2018:

@rcurtin I made a quick comparison of Shogun's LMNN with our current implementation on the dataset given in this notebook, with k = 1.

Here is a graph for original dataset - link
Here is result by shogun - link
Here is result by current implementation using L-BFGS optimizer - link

rcurtin (Member) commented May 24, 2018:

I think maybe those results are the same (or very similar)---it seems to me like the Shogun result is plotting the "stretched" circle instead of the stretched points, whereas the result you have plots the stretched points. So I think that this LMNN implementation is working fine. 👍

Do you want to add (maybe just as a comment to the PR or something like this) some simple benchmarking of how long it takes to run LMNN on different datasets with different k values when compiled without debugging symbols? That will give us some reference values for the future as we improve the code.

LMNNFunction<> lmnnfn(dataset, labels, 1, 0.6);

double initObj = lmnnfn.Evaluate(arma::eye<arma::mat>(2, 2));
double finalObj = lmnnfn.Evaluate(outputMatrix);

rcurtin (Member) commented May 24, 2018:

Hmm, this checks that the objective function is better, which is correlated to the result we want, but what we'd really like to see is that the nearest neighbor of each point now has the same label. So maybe we should check that explicitly?

manish7294 (Author) commented May 26, 2018:

I think, "final nearest neighbors having same labels" shouldn't be necessary condition. As that would be the ideal case. What do you think?
Maybe we can do that check for this dataset explicitly.

rcurtin (Member) commented Jun 4, 2018:

Oops, didn't respond to this one, sorry about that. I think maybe it is worth checking that KNN accuracy improves for this very simple dataset, but you are right, in general that may not always be the case.

manish7294 (Author) commented Jun 4, 2018:

I made a run on this dataset. We are getting 100% accuracy with this.

rcurtin (Member) commented Jun 4, 2018:

Sounds good, we should write that into the test either by manual distance calculations or using the KNN object.
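A sketch of that check with the KNN object (assuming outputMatrix, dataset, and labels from the test above):

    // Transform the data, then require that each point's nearest neighbor
    // (excluding itself) has the same label.
    arma::mat transformed = outputMatrix * dataset;
    mlpack::neighbor::KNN knn(transformed);
    arma::Mat<size_t> neighbors;
    arma::mat distances;
    knn.Search(1, neighbors, distances);
    for (size_t i = 0; i < labels.n_elem; ++i)
      BOOST_REQUIRE_EQUAL(labels[neighbors(0, i)], labels[i]);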

manish7294 (Author) commented May 24, 2018:

@rcurtin One thing is still bugging me. Currently we are getting good enough results with L-BFGS, but with SGD the results are not good. Do you have any idea what could be happening?

rcurtin (Member) commented May 24, 2018:

How many points are you using for testing? If the dataset is a tiny 6-point example dataset, then the SGD gradients for each mini-batch (or for each point, if the batch size is 1) are likely to point in opposite directions, and the optimizer may have a harder time converging. I'd suggest making a larger dataset (hundreds or thousands of points perhaps) and trying that; I think maybe it will work better.

Alternately, the step size may just need to be tuned. You could print the objective at every iteration to see what is going on too---if the step size is too large, it may diverge.

manish7294 (Author) commented May 24, 2018:

Right, that may be the cause. With a dataset just larger than 6 points (iris), the results were better, but still not as good as L-BFGS. I will report back after checking once more.

manish7294 (Author) commented May 25, 2018:

I took a look at the objective after every iteration and at the final objective; to my surprise, SGD's final objective always comes out to be the highest among all the values. I don't know what's going on.

rcurtin (Member) commented May 25, 2018:

That's strange, can you paste the output? SGD should be minimizing...

manish7294 (Author) commented May 26, 2018:

Here is a result using SGD on iris_train with k = 3. I kept maxIterations at 200; the results are similar with higher maxIterations too, just with a very large objective value.

75.8449
39.1785
152.473
67.7149
258.934
103.073
73.6216
423.22
74.7894
Final Objective 498.009
rcurtin (Member) commented May 26, 2018:

Looks like the learning rate is too large and it is bouncing around minima. Try reducing the learning rate significantly... if you reduce it too much, it will be very slow to reduce the objective, but if you don't reduce it enough, it'll decrease for a bunch of steps and then may explode back to large values. This is a disadvantage of SGD, sometimes it needs to be tuned. :)
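For example, a sketch of trying a much smaller step size (the constructor arguments here are assumed to be step size, batch size, and maximum iterations, following mlpack's StandardSGD; lmnnFunction and coordinates stand in for the objects in the test):

    // If the objective still explodes, reduce the step size further; if it
    // decreases too slowly, increase it a bit.
    StandardSGD sgd(0.0001, 1, 100000);
    sgd.Optimize(lmnnFunction, coordinates);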

manish7294 (Author) commented May 26, 2018:

@rcurtin you're right; here's a result with the same dataset as above using SGD: link

manish7294 (Author) commented May 26, 2018:

Judging by the figure, I think the target neighbors are somehow also being pushed away in the SGD case.

manish7294 added some commits May 28, 2018

rcurtin (Member) commented May 29, 2018:

(sorry, I was away over the weekend and it was a 3-day weekend here)

The figure looks ok, but not perfect. Is that figure generated after the step size is tuned, or is that from the SGD output you gave earlier?

manish7294 (Author) commented May 29, 2018:

The figure is generated after tuning.

rcurtin (Member) commented May 29, 2018:

Hm, ok, what was the output of SGD after tuning the step size? Did it converge to a better minimum?

rcurtin (Member) commented May 29, 2018:

You can also compare the objective function value with the one generated by L-BFGS.

manish7294 (Author) commented May 29, 2018:

Yes, the final objective value is around 2.7 compared to the starting 3.6, which is close to the L-BFGS one.

rcurtin (Member) commented May 29, 2018:

With SGD, the gradient can be really noisy with so few points. When I look at the figure, I think the results are okay, and if it's converging to a similar minimum as L-BFGS then I think that it's working right. It might also be worth checking whether SGD can converge to as good a minimum as L-BFGS on larger datasets like iris or vc2 or others; if it does, I'm reasonably convinced that the implementation is correct and we can start moving on to optimization, if you agree.

manish7294 changed the title from "[WIP] LMNN distance learning" to "LMNN distance learning" on Jun 22, 2018

rcurtin (Member) left a comment:

Thanks for the hard work. The new tests look great and I think we're nearly ready to merge; just a couple more things. (Sorry that I did not think of a few of them in the last review; I don't expect to have anything more like that.)

//! Range after which impostors need to be recalculated.
size_t range;
//! Holds pre-calculated cij.
arma::mat p_cij;

rcurtin (Member) commented Jun 22, 2018:

Ah, I missed this one earlier, it is just a tiny style issue---we try to avoid underscores in variable names, maybe you can change this to pCij or something?

manish7294 (Author) commented Jun 23, 2018:

Done!

PARAM_MATRIX_IN_REQ("input", "Input dataset to run LMNN on.", "i");
PARAM_UROW_IN("labels", "Labels for input dataset.", "l");
PARAM_INT_IN("num_targets", "Number of target neighbors to use for each "
"datapoint.", "k", 1);

rcurtin (Member) commented Jun 22, 2018:

We could just call this k like the KNN programs.

manish7294 (Author) commented Jun 23, 2018:

Done!

"the same will look like: "
"\n\n"
"$ bin/mlpack_lmnn -i iris.csv -l iris_labels.txt -k 3 -O bbsgd -o "
"output.csv --verbose"

rcurtin (Member) commented Jun 22, 2018:

Examples look good to me. Can you use the PROGRAM_CALL() macro though? Then it will display correctly from Python too:

PROGRAM_CALL("mlpack_lmnn", "input", "iris", "labels", "iris_labels", "num_targets", 3, "optimizer", "bbsgd", "output", "output", "verbose");

(I think that is right.)

manish7294 (Author) commented Jun 23, 2018:

Done!

PARAM_UROW_IN("labels", "Labels for input dataset.", "l");
PARAM_INT_IN("num_targets", "Number of target neighbors to use for each "
"datapoint.", "k", 1);
PARAM_MATRIX_OUT("output", "Output matrix for learned distance matrix.", "o");

rcurtin (Member) commented Jun 22, 2018:

I had an idea for two other nice options that should be straightforward to add---

  • transformed_data: an output option to save the transformed output data
  • print_accuracy: print the kNN accuracy before and after optimization

What do you think?

manish7294 (Author) commented Jun 23, 2018:

Done!

SetInputParam("input", std::move(inputData));
SetInputParam("labels", std::move(labels));
SetInputParam("optimizer", std::string("bbsgd"));
SetInputParam("batch_delta", std::move(10.0));

rcurtin (Member) commented Jun 22, 2018:

No need to std::move() a number, but I guess it doesn't make a difference either way. :)

manish7294 (Author) commented Jun 23, 2018:

Done!

* Ensure that different value of regularization results in a
* different output matrix.
*/
BOOST_AUTO_TEST_CASE(LMNNDiffRegularizationTest)

rcurtin (Member) commented Jun 22, 2018:

Note that for any of the Diff tests like this, if you use an SGD-type optimizer it'll be different just by virtue of points being visited in a random and different order. So I think we'd need to set linear_scan to true.

manish7294 (Author) commented Jun 23, 2018:

Done!

manish7294 (Author) commented Jun 25, 2018:

Ah! I just realized there was no parameter allowing the user to pass an initial learning point, which loses all the significance of low-rank optimization. I have made some changes regarding it; hopefully they will suffice.

rcurtin (Member) left a comment:

Everything looks good to me and I'm looking forward to getting this merged. :) I'll wait 7 days since this is a lot of code, in case anyone else would like to review it.

I'll also open a handful of issues about various optimizations, etc., that could make it even faster, and we can handle those as we are able.


for (size_t j = 0; j < k; j++)
Map(labels(neighbors(j, i))) +=
1 / std::pow(distances(j, i) + 1, 2);

rcurtin (Member) commented Jun 26, 2018:

This is a minor style issue, but could you add {} around the inner part of the for loop here for readability? i.e.

    for (size_t j = 0; j < k; j++)
    {
      Map(labels(neighbors(j, i))) +=
          1 / std::pow(distances(j, i) + 1, 2);
    }

manish7294 (Author) commented Jun 26, 2018:

Sure :)

1e20);
PARAM_INT_IN("range", "Number of iterations after which impostors needs to be "
"recalculated", "R", 1);
PARAM_INT_IN("seed", "Random seed. If 0, 'std::time(NULL)' is used.", "s", 0);

rcurtin (Member) commented Jun 26, 2018:

This is a really huge number of parameters. I think it's nice for helping us debug it, so we can leave it in for now, but eventually we should prune down the number of parameters to something smaller. So, e.g., these last handful of L-BFGS-specific parameters we can just leave out. If a user wants to work at that fine-grained a level of detail, they can write C++ for themselves. Let me know what you think, and if you want to leave them in for now I'll just open an issue to simplify them and we can handle that later.

manish7294 (Author) commented Jun 26, 2018:

Right, this eventually got so big that I had a hard time finding letters for all of them :)
It would be good enough if we can just avoid all the basic optimizer-specific parameters; I guess there are many like beta1, beta2, etc., and they will probably be used only occasionally.

rcurtin (Member) commented Jun 26, 2018:

Ok---if you want to simplify before merge, feel free; otherwise we can open an issue.

manish7294 (Author) commented Jun 27, 2018:

Done!

rcurtin (Member) commented Jun 27, 2018:

I think it's fine to remove num_basis, armijo_constant, and wolfe, if you like, since these don't usually need to be tuned for L-BFGS either (in fact L-BFGS doesn't tend to need tuning much at all).


if (CLI::HasParam("distance"))
{
distance = std::move(CLI::GetParam<arma::mat>("distance"));

rcurtin (Member) commented Jun 26, 2018:

I guess if a user wants to learn a low-rank distance, they would pass a low-rank matrix here. It might also be good to add a --rank parameter, which would create an initial low-rank matrix with randu() or something like this.
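A rough sketch of that --rank option (the parameter name is hypothetical):

    // If a rank is given but no initial distance, start from a random
    // low-rank transformation of size rank x dimensionality.
    if (!CLI::HasParam("distance") && CLI::HasParam("rank"))
    {
      const size_t rank = (size_t) CLI::GetParam<int>("rank");
      distance = arma::randu<arma::mat>(rank, dataset.n_rows);
    }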

manish7294 (Author) commented Jun 26, 2018:

That sounds fair enough.

manish7294 (Author) commented Jun 27, 2018:

Done!

* Ensure that when we pass a valid initial learning point, we get
* output of the same dimensions.
*/
BOOST_AUTO_TEST_CASE(LMNNValidDistanceTest)

rcurtin (Member) commented Jun 26, 2018:

Want to also add a test for when a low-rank initial distance matrix is passed, just to make sure there is no fatal error or anything?

manish7294 (Author) commented Jun 26, 2018:

Yes, there is an LMNNInvalidDistanceTest as well, just below it. I think that's what you're referring to.

rcurtin (Member) commented Jun 26, 2018:

Right, the invalid distance test is good, but should we have two cases?

  1. The given distance matrix is low-rank, so we can do low-rank learning and get back a learned distance matrix of the same size.
  2. The given distance matrix is of some completely incorrect size, so we should either get an exception or a square distance matrix.

manish7294 (Author) commented Jun 26, 2018:

Sorry if I am not getting the point, but I believe those are exactly the ideas the valid and invalid distance tests are based on: the valid one covers the first case, and the invalid one the second (checking for a square matrix). Please correct me if I am off the mark.

rcurtin (Member) commented Jun 26, 2018:

Oh, sorry! You are exactly right. I had misread the valid distance test. I thought that in this test, you were passing in a square matrix.

Do you think it would be worthwhile to then just add a third test to ensure that we can pass a distance matrix of size inputData.n_rows by inputData.n_rows?

manish7294 (Author) commented Jun 26, 2018:

No worries! I will add it.

manish7294 (Author) commented Jun 27, 2018:

Done!

CheckGradient(lmnnfn, coordinates);
}
}

rcurtin (Member) commented Jun 26, 2018:

I think it would be a good idea to check that we can run LMNN with a low-rank distance matrix and we still get reasonable results. You could create a high-dimensional random dataset (maybe two Gaussians or something, one for each class) and then run LMNN twice, once with a low-rank matrix and once with a normal matrix, and then make sure the KNN accuracies are comparable at the end of the two runs. What do you think?

You could do that before merge or we can open a ticket for it, I am not picky.
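A sketch of generating such a dataset (two well-separated Gaussians, one per class; the sizes here are arbitrary):

    // 10-dimensional dataset: class 0 ~ N(0, I), class 1 ~ N(5, I).
    const size_t dim = 10, n = 500;
    arma::mat dataset(dim, 2 * n);
    dataset.cols(0, n - 1) = arma::randn(dim, n);
    dataset.cols(n, 2 * n - 1) = arma::randn(dim, n) + 5.0;
    arma::Row<size_t> labels(2 * n);
    labels.subvec(0, n - 1).fill(0);
    labels.subvec(n, 2 * n - 1).fill(1);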

manish7294 (Author) commented Jun 26, 2018:

Sure, I will do that.

manish7294 (Author) commented Jun 27, 2018:

Done!

zoq (Member) left a comment:

Really interesting; I will take a closer look at the code over the next few days.

*/
Constraints(const arma::mat& dataset,
            const arma::Row<size_t>& labels,
            size_t k);

zoq (Member) commented Jun 26, 2018:

Should we use const here as well, to be consistent with the rest of the codebase?

manish7294 (Author) commented Jun 26, 2018:

Yes, that will be good, as we will not be changing k directly while using this class anyway.


/**
* Calculates k similar labeled nearest neighbors for a batch of dataset and
* stores them into the passed matrix.

zoq (Member) commented Jun 26, 2018:

Looks like an extra space here.

const arma::mat& dataset,
const arma::Row<size_t>& labels);

//! Access the value of k.

zoq (Member) commented Jun 26, 2018:

Do you think we could improve the comment here? E.g., "Get the number of target neighbors (k)."

manish7294 (Author) commented Jun 27, 2018:

Done!


// Re-map neighbors to their index.
for (size_t j = 0; j < neighbors.n_elem; j++)
neighbors(j) = indexDiff[i].at(neighbors(j));

zoq (Member) commented Jun 26, 2018:

Just an idea, but could this directly use outputMatrix and avoid the extra copy at the end?

manish7294 (Author) commented Jun 26, 2018:

I think we can do that, but it will be a bit messy.

// Re-map neighbors to their index.
for (size_t j = 0; j < neighbors.n_elem; j++)
  outputMatrix(j % k, indexSame[i](j / k)) = indexSame[i].at(neighbors(j));

Shall we do this?

zoq (Member) commented Jun 27, 2018:

Your call, not sure we would see any performance improvements.

manish7294 (Author) commented Jun 28, 2018:

Then I think it's better as it is.

manish7294 added some commits Jun 27, 2018


// Keeping tolerance very high.
// BOOST_REQUIRE_CLOSE(Accuracy1, Accuracy2, 2.0);
}

// Comprehensive gradient tests by Ryan Curtin.

rcurtin (Member) commented Jun 27, 2018:

Ha, technically, I think it was Marcus who originally wrote these and then I adapted them. :)

manish7294 (Author) commented Jun 28, 2018:

Then both should be there :)

double Accuracy2 = KnnAccuracy(outputMatrix * dataset, labels, 1);

// Keeping tolerance very high.
// BOOST_REQUIRE_CLOSE(Accuracy1, Accuracy2, 2.0);

rcurtin (Member) commented Jun 27, 2018:

I was thinking of checking something more like, e.g., check that both Accuracy1 and Accuracy2 are greater than some small value like 50%. The concern is that if there is a problem with the low-rank LMNN, the accuracy would end up near 0. It's fine to leave this commented out for now; let's discuss more in #1447.
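That is, something like:

    // Both the full-rank and low-rank runs should comfortably beat a weak
    // baseline, even if they aren't close to each other.
    BOOST_REQUIRE_GT(Accuracy1, 50.0);
    BOOST_REQUIRE_GT(Accuracy2, 50.0);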

manish7294 added some commits Jun 28, 2018

ShikharJ (Member) commented Jun 30, 2018:

It might also help to squash/fixup the commits into one, so that it's easier to review?

manish7294 (Author) commented Jun 30, 2018:

@ShikharJ Yeah, that could simplify things, but sorry, I can't do that as of now, at least not until Wednesday (I think the merge date is somewhere in that period). Sorry for any inconvenience; please bear with it :)


template<typename MetricType>
LMNNFunction<MetricType>::LMNNFunction(const arma::mat& dataset,
const arma::Row<size_t>& labels,

zoq (Member) commented Jul 2, 2018:

Another picky style comment from my side: do you mind aligning the method parameters with the first one? If you like, you can also put the first parameter on the next line.

*/
template<typename GradType>
double EvaluateWithGradient(const arma::mat& transformation,
const size_t begin,

zoq (Member) commented Jul 2, 2018:

Can you check the method parameter alignment?

* @tparam GradType The type of the gradient out-param.
* @param transformation Transformation matrix of Mahalanobis distance.
* @param begin Index of the initial point to use for objective function.
* @param batchSize Number of points to use for objective function.

zoq (Member) commented Jul 2, 2018:

This should come after the gradient parameter.

* @tparam GradType The type of the gradient out-param.
* @param transformation Transformation matrix of Mahalanobis distance.
* @param begin Index of the initial point to use for objective function.
* @param batchSize Number of points to use for objective function.

zoq (Member) commented Jul 2, 2018:

See comment below.

/**
* Constructor for LMNNFunction class.
*
* @param data Dataset for which metric is calculated.

zoq (Member) commented Jul 2, 2018:

Looks like the comments don't match the actual parameters.

namespace mlpack {
namespace lmnn {


zoq (Member) commented Jul 2, 2018:

Perhaps you can comment on what this class does?

manish7294 (Author) commented Jul 4, 2018:

Done!

for (size_t i = 0; i < uniqueLabels.n_elem; i++)
{
// Store same and diff indices.
indexSame[i] = arma::find(labels == uniqueLabels[i]);

zoq (Member) commented Jul 2, 2018:

I'm wondering if it makes sense to store the result of labels == uniqueLabels[i] so that it can be reused on the next line.

manish7294 (Author) commented Jul 4, 2018:

Doing so would need something like set subtraction or an AND operation like (labels && !(previousResult)), which I think would do nothing good here, as it would be more or less the same cost.

rcurtin (Member) commented Jul 4, 2018:

I think that as we solve #1445 this code will be removed, so I don't think we need to worry about it for now.


template<typename MetricType>
inline void Constraints<MetricType>::Precalculate(
const arma::Row<size_t>& labels)

zoq (Member) commented Jul 2, 2018:

This should be indented twice.

manish7294 (Author) commented Jul 4, 2018:

Done!

rcurtin merged commit fd59d03 into mlpack:master on Jul 4, 2018

5 checks passed: Memory Checks; Static Code Analysis Checks (build finished); Style Checks (build finished); continuous-integration/appveyor/pr (AppVeyor build succeeded); continuous-integration/travis-ci/pr (Travis CI build passed).
rcurtin (Member) commented Jul 4, 2018:

Thanks again for the hard work here, I'm glad to add it to mlpack. :) I'll be releasing a bugfix version 3.0.3 next week, but I won't include this---we can include this for a 3.1.0 release, hopefully after some of the LMNN acceleration issues are handled.
