
FEAT Large Margin Nearest Neighbor implementation #8602

Open · wants to merge 201 commits into base: main

Conversation

@johny-c johny-c commented Mar 17, 2017

Reference Issue

What does this implement/fix? Explain your changes.

Hi! This is an implementation of the Large Margin Nearest Neighbor algorithm that follows the original Matlab implementation by the author (K. Weinberger); see also my docstrings. As far as I know there is no prior implementation of this version of the algorithm in Python, and no version at all in scikit-learn. The implementation lives in a single file (sklearn/neighbors/lmnn.py). It could go directly into sklearn/neighbors/classification.py, but I thought that would make things too cluttered. It is basically a subclass of KNeighborsClassifier with a lot of added functionality. I also added an example in examples/neighbors/plot_lmnn_classification.py, which is basically a copy of the simple nearest neighbors example in examples/neighbors/plot_classification.py.

Any other comments?

I ran all the scikit-learn tests I could find and the code now passes all of them. There is still a TODO in the code which I think is a minor issue (the algorithm works anyway): during fitting, I need to pass a possibly changed attribute (n_neighbors) to the superclass and then call the superclass's fit method. I would appreciate any comments and/or help to finalize this. Thanks!
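
For readers skimming the thread, a minimal usage sketch of the estimator as described above; the import path and parameter names are taken from this PR's description and example scripts and may not match the final API:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors.lmnn import LargeMarginNearestNeighbor  # module path proposed in this PR

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Learn a linear transformation that pulls each point's same-class neighbors
# closer while pushing differently labeled "impostors" away by a large margin.
lmnn = LargeMarginNearestNeighbor(n_neighbors=3, max_iter=50, random_state=42)
lmnn.fit(X_train, y_train)

# Since the estimator subclasses KNeighborsClassifier, it can classify
# directly in the learned metric space.
print(lmnn.score(X_test, y_test))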

@johny-c johny-c changed the title to [MRG] Large Margin Nearest Neighbor implementation Mar 17, 2017

johny-c commented Mar 18, 2017

Hi all! I've been trying to nail down why some Travis tests fail, and I think it is a bug in the fmin_l_bfgs_b implementation of an old SciPy version. I've opened an issue there (scipy/scipy#7196).

Apart from that, the rest of my pull request was passing all other tests. On my last commit, one AppVeyor test is now failing on something completely unrelated to my code, namely sklearn.decomposition.tests.test_pca.test_singular_values (https://ci.appveyor.com/project/raghavrv/scikit-learn/build/1.0.217/job/iittcikl6hic3bdg). Am I missing something?


johny-c commented Mar 18, 2017

From the last commit and builds, I can confirm that the failing Travis tests are isolated and caused by the fmin_l_bfgs_b implementation in this old SciPy version. I guess this issue is not going to be dealt with by the SciPy devs, since the newer versions work fine. Is there a standard way of dealing with such issues? Can I do something? I would appreciate any comments and/or help.

@jnothman (Member)

Sorry I don't have much time to handle this. Usually we would backport functionality from a later version for users that have an older supported version by adding something in sklearn.utils.fixes. You might want to look at old scipy documentation, e.g. https://docs.scipy.org/doc/scipy-0.9.0/reference/generated/scipy.optimize.fmin_l_bfgs_b.html.
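
For illustration, such a backport usually takes the form of a version-gated wrapper in sklearn/utils/fixes.py. The wrapper name, version threshold and keyword handling below are hypothetical; the actual workaround depends on what scipy/scipy#7196 concludes.

import scipy
from scipy.optimize import fmin_l_bfgs_b

_SCIPY_VERSION = tuple(int(p) for p in scipy.__version__.split('.')[:2])

def _fmin_l_bfgs_b_compat(func, x0, **kwargs):
    # Delegate to scipy, adjusting arguments on old releases where the
    # behaviour differs (hypothetical example: drop a keyword that old
    # versions do not support and fall back to the default behaviour).
    if _SCIPY_VERSION < (0, 18):
        kwargs.pop('callback', None)
    return fmin_l_bfgs_b(func, x0, **kwargs)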

@jnothman jnothman left a comment


This looks like an interesting contribution, although I don't know a great deal about its use. Sorry that the core devs are currently quite backlogged and short of time.

sklearn/neighbors/lmnn.py (two review comments, outdated and resolved)
@glemaitre (Member)

I used this package when playing with metric learning.
I don't know whether it would be interesting to actually have a module in sklearn?


jnothman commented Mar 19, 2017 via email


GaelVaroquaux commented Mar 19, 2017 via email

@cmarmo cmarmo left a comment


Hi @johny-c and happy new year to you too! Thanks a lot for your effort!
Your PR is labeled for review now. Let's wait one or two days before pinging.
In the meantime I have checked the examples (nice figures indeed!): they throw some warnings. Also, we are moving examples (one step at a time) to a notebook style to make them more readable (see for example this one); if you have some time to reorganize them, I think this could improve understanding.
Thanks again for your patience!

Comment on lines 6 to 7
An example comparing nearest neighbors classification with and without
Large Margin Nearest Neighbor.
Member

What about
"This example compares nearest neighbors...."

Author

Done.

# Put the result into a color plot
Z = Z.reshape(xx.shape)
plt.figure()
plt.pcolormesh(xx, yy, Z, cmap=cmap_light, alpha=.8)
Member

Checking the rendered example, this line produces a DeprecationWarning:

/home/circleci/project/examples/neighbors/plot_lmnn_classification.py:75: MatplotlibDeprecationWarning: shading='flat' when X and Y have the same dimensions as C is deprecated since 3.3.  Either specify the corners of the quadrilaterals with X and Y, or pass shading='auto', 'nearest' or 'gouraud', or set rcParams['pcolor.shading'].  This will become an error two minor releases later.
  plt.pcolormesh(xx, yy, Z, cmap=cmap_light, alpha=.8)

Do you mind fixing it? Thanks!

Author

Done.
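
For the record, the fix likely amounts to passing an explicit shading mode, which silences this deprecation warning on matplotlib >= 3.3. A self-contained illustration with synthetic data (not the example's own variables):

import numpy as np
import matplotlib.pyplot as plt

xx, yy = np.meshgrid(np.linspace(0, 1, 50), np.linspace(0, 1, 50))
Z = np.sin(np.pi * xx) * np.cos(np.pi * yy)

# shading='auto' lets matplotlib pick the right mode when X, Y and C have
# the same shape, instead of the deprecated implicit 'flat' behaviour.
plt.pcolormesh(xx, yy, Z, alpha=.8, shading='auto')
plt.show()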

Dimensionality Reduction with Large Margin Nearest Neighbor
===========================================================

Sample usage of Large Margin Nearest Neighbor for dimensionality reduction.
Member

Perhaps this sentence has become redundant now?

Author

Removed.

Comment on lines 73 to 75
lmnn = LargeMarginNearestNeighbor(n_neighbors=n_neighbors, n_components=2,
                                  max_iter=20, verbose=1,
                                  random_state=random_state)
Member

The rendered example shows a ConvergenceWarning. Do you think it would be possible to fix it without increasing the execution time too much?

Author

When running the file locally, I get the following error that seems to be coming from joblib:

  File "plot_lmnn_dim_reduction.py", line 53, in <module>
    faces = datasets.fetch_olivetti_faces()
  File "/work/chiotell/projects/scikit-learn/sklearn/utils/validation.py", line 63, in inner_f
    return f(*args, **kwargs)
  File "/work/chiotell/projects/scikit-learn/sklearn/datasets/_olivetti_faces.py", line 120, in fetch_olivetti_faces
    faces = joblib.load(filepath)
  File "/home/johny/soft/miniconda3/envs/scikit/lib/python3.6/site-packages/joblib/numpy_pickle.py", line 585, in load
    obj = _unpickle(fobj, filename, mmap_mode)
  File "/home/johny/soft/miniconda3/envs/scikit/lib/python3.6/site-packages/joblib/numpy_pickle.py", line 504, in _unpickle
    obj = unpickler.load()
  File "/home/johny/soft/miniconda3/envs/scikit/lib/python3.6/pickle.py", line 1050, in load
    dispatch[key[0]](self)
  File "/home/johny/soft/miniconda3/envs/scikit/lib/python3.6/pickle.py", line 1338, in load_global
    klass = self.find_class(module, name)
  File "/home/johny/soft/miniconda3/envs/scikit/lib/python3.6/pickle.py", line 1388, in find_class
    __import__(module, level=0)
ModuleNotFoundError: No module named 'sklearn.externals.joblib.numpy_pickle'

I tried with multiple versions of joblib from 0.12.2 to 1.0 and the same error appears. Any ideas?

Member

I can't reproduce the error with either Python 3.9 or 3.6 ... have you tried cleaning and reinstalling? joblib is no longer in externals ...

Author

I found it. The problem was with my scikit_learn_data directory. For some reason (probably because it was created by an old scikit-learn version) it was causing pickle to look for sklearn.externals.joblib instead of joblib. I deleted its contents and now it works. I also removed the ConvergenceWarning by passing a larger tolerance parameter.
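
For reference, the change described above presumably amounts to something like the snippet below, i.e. the quoted call from the example plus a larger tol so the optimizer converges within max_iter=20 iterations. The exact tol value and the placeholder variables are assumptions:

from sklearn.neighbors.lmnn import LargeMarginNearestNeighbor  # module path proposed in this PR

n_neighbors, random_state = 3, 0  # placeholder values; the example defines its own
lmnn = LargeMarginNearestNeighbor(n_neighbors=n_neighbors, n_components=2,
                                  max_iter=20, tol=1e-3, verbose=1,
                                  random_state=random_state)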

Large Margin Nearest Neighbor Illustration
==========================================

An example illustrating the goal of learning a distance metric that maximizes
Member

What about
"This example illustrates...."

Author

Done.


# Annotate the impostors (1, 4, 5, 7)
imp_centroid = (X[1] + X[4] + X[5] + X[7]) / 4
imp_arrow_dict = dict(facecolor='black', color='gray', arrowstyle='->',
Member

A UserWarning is thrown by matplotlib (see the rendered example). Do you mind fixing it? Thanks!

Author

Fixed.

@agramfort (Member)

@bellet and @wdevazelhes any opinion on this one? As authors of metric_learn I am wondering about the pros and cons (if any) of the proposed implementation. API-wise, does it lead to comparable usage?

@wdevazelhes (Contributor)

@bellet and @wdevazelhes any opinion on this one? As authors of metric_learn I am wondering about the pros and cons (if any) of the proposed implementation. API-wise, does it lead to comparable usage?

@agramfort yes, API-wise the general parameters (n_components, max_iter, tol, verbose, random_state) match the general API from metric-learn (and the one from sklearn.neighbors.NeighborhoodComponentsAnalysis).

There are just some parameters specific to LMNN that don't match metric-learn's LMNN implementation (n_neighbors vs k, and push_loss_weight vs regularization). However, I find the names in this PR (n_neighbors and push_loss_weight) more explicit, so maybe we could change them in metric-learn after this PR is merged, to ensure the same naming in both.

As a matter of fact, I think this implementation has a lot of cool features that are not yet in metric-learn's implementation, for instance the optimization with scipy.optimize (in metric-learn we do a manual gradient descent) and a chunked version for finding impostors. So I'm thinking it would be cool to integrate it in metric-learn too, what do you think @johny-c @bellet @perimosocordiae ?

Also just one remark: in metric-learn (and in sklearn.neighbors.NeighborhoodComponentsAnalysis) we added an "auto" initialization that can use LDA when possible; it seemed to work well in practice. @johny-c maybe it could be reused here too?
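
To make the naming differences concrete, here is a side-by-side sketch of the two constructors. The metric-learn names (k, regularization) and this PR's names (n_neighbors, push_loss_weight) come from the comment above; the values and the import path of the PR's estimator are assumptions:

from metric_learn import LMNN
from sklearn.neighbors.lmnn import LargeMarginNearestNeighbor  # assumed module path

# metric-learn's naming: k neighbors, `regularization` balancing pull vs push.
ml_lmnn = LMNN(k=3, regularization=0.5, max_iter=1000)

# This PR's naming: n_neighbors, with push_loss_weight playing the analogous role.
sk_lmnn = LargeMarginNearestNeighbor(n_neighbors=3, push_loss_weight=0.5,
                                     max_iter=50, random_state=0)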


johny-c commented Jan 19, 2021

Hi @wdevazelhes , thanks for the comments ;) Sure, integrating into metric-learn would be fine by me. I guess the PR could get merged here first, because it has already been reviewed by a bunch of people, and then it would be a simple copy into metric-learn.
Regarding LDA initialization, yes, it would be easy to add as an option. I'm not sure about an auto option though. I would keep PCA as the default.
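
For illustration, an init option along those lines could look roughly like the hypothetical helper below, mirroring how NeighborhoodComponentsAnalysis builds its LDA initialization; this is not code from the PR:

from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def _init_transformation(X, y, n_components, init="pca"):
    # Return an (n_components, n_features) matrix to initialize the linear map.
    if init == "pca":
        # Default: principal directions of the training data.
        return PCA(n_components=n_components).fit(X).components_
    if init == "lda":
        # LDA yields at most n_classes - 1 discriminative directions.
        lda = LinearDiscriminantAnalysis(n_components=n_components).fit(X, y)
        return lda.scalings_.T[:n_components]
    raise ValueError("Unknown init: %r" % init)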


bellet commented Jan 19, 2021

+1 for also integrating this implementation in metric-learn, especially if we can show some clear performance improvement (which I think is the case). Our current implementation is actually difficult to maintain and we do have plans to refactor it (see scikit-learn-contrib/metric-learn#210). So @johny-c it would be great if you have a bit of time at some point to contribute your implementation there as well :-) This can be done now in parallel to this (integration to metric-learn should be fast), or after this has been merged as you suggested if you prefer to do it this way.


johny-c commented Jan 19, 2021

Hi @bellet , sure but the next 2-3 weeks are a bit busy for me. I can open a PR in metric-learn afterwards. If in the meantime, it gets approval for merge here, it would help so we know how/when to finalize it.


bellet commented Jan 19, 2021

Hi @bellet , sure but the next 2-3 weeks are a bit busy for me. I can open a PR in metric-learn afterwards. If in the meantime, it gets approval for merge here, it would help so we know how/when to finalize it.

That would be great, thanks!

Base automatically changed from master to main January 22, 2021 10:49
@cmarmo cmarmo removed this from the 1.0 milestone Jun 3, 2021

cmarmo commented Jun 3, 2021

Hi @johny-c I saw that you submitted your PR to metric-learn: I think this is the right move ... as apparently the review load for 1.0 is too high. Thank you very much for your patience and your work!

@cmarmo cmarmo added the labels "Needs Decision - Close" (Requires decision for closing) and "Move to scikit-learn-extra" (This PR should be moved to the scikit-learn-extras repository) Dec 13, 2021
@glemaitre glemaitre requested review from glemaitre and removed request for glemaitre July 29, 2022 13:20
@glemaitre glemaitre self-assigned this Jul 29, 2022
@glemaitre glemaitre changed the title [MRG] Large Margin Nearest Neighbor implementation FEAT Large Margin Nearest Neighbor implementation Jul 29, 2022
@glemaitre (Member)

I wanted to give a hand by bringing this PR to main. I assume it is still worth having in scikit-learn. On the way, I wanted to:

  • use the new parameter validation framework
  • update the docstring
  • improve code linked to the work done for the pairwise computation
  • refactor the code with NCA (this would be done later, once this PR is merged into its own branch and we have strong tests)

I was pretty unfamiliar with the original paper, and I saw that, as mentioned earlier, we do not use the hinge loss but instead the squared hinge loss, and use L-BFGS on the resulting differentiable loss. I could not find original code or a paper deriving this solver, so it is a bit difficult to review.

I wanted to write a couple of tests to ensure that the minimization goes as expected. The first thing I wanted to do was to compare the squared hinge loss minimization vs. the hinge loss minimization as shown in the JMLR paper. I quickly tried the metric-learn implementation on the illustration example and I got quite different results.

With metric-learn, I get the following results:

[figures: embeddings obtained with metric-learn]

Basically, I set a very small tolerance and let the algorithm converge (tol=1e-12).
On the other hand, the current implementation gives the following (with a low tolerance as well):

[figure: embedding obtained with this PR's implementation]

I was wondering if @bellet or @johny-c have any insight regarding the difference in the solutions found. Do you think that it is only due to the loss used?

@glemaitre (Member)

Thinking a bit more about the problem, using the squared hinge should have more impact on the "push" loss. The quadratic part will penalize impostors more, and I assume the solution will therefore try harder to increase the distance between the in-class nearest neighbors and the impostors.
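
For concreteness, a plain NumPy illustration of the two penalties under discussion (not the PR's code); violation is the amount by which an impostor enters the margin, with positive values meaning a violation:

import numpy as np

def hinge(violation):
    # Penalty used in the original JMLR formulation: linear in the violation.
    return np.maximum(violation, 0.0)

def squared_hinge(violation):
    # Penalty used in this PR: differentiable everywhere (L-BFGS friendly)
    # and quadratic, so large violations dominate the push term.
    return np.maximum(violation, 0.0) ** 2

v = np.array([-1.0, 0.5, 2.0])
print(hinge(v))          # [0.   0.5  2. ]
print(squared_hinge(v))  # [0.   0.25 4. ]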

@glemaitre (Member)

While I am not really against using the hinge loss, I still think that we should actually implement both losses and then pick the most appropriate solver.

@glemaitre glemaitre removed their assignment Nov 24, 2022
Labels: module:neighbors, module:utils, Move to scikit-learn-extra (This PR should be moved to the scikit-learn-extras repository), Needs Decision - Close (Requires decision for closing), New Feature
Projects: Sprint Paris 2019 (In progress)