
[MRG] ENH: Optional positivity constraints on the dictionary and sparse code #6374

Merged: 3 commits into scikit-learn:master from jakirkham:dict_pos_constrt on Jun 21, 2018

Conversation

@jakirkham (Contributor) commented Feb 17, 2016

Allows a positivity constraint to be set during dictionary learning. This should be very similar to the one provided by Mairal et al. in SPAMS. However, I may need guidance in making sure this works correctly in all cases.

Todo:

  • Positivity constraint for dictionary.
    • Provide support for positivity constraint on dictionary.
    • API documentation for using positivity constraint.
    • Tests demonstrating the positivity constraint works correctly.
  • Positivity constraint for sparse code.
    • Provide support for positivity constraint on sparse code.
    • API documentation for using positivity constraint.
    • Raise exception when positivity constraint is set incorrectly (when using orthogonal matching pursuit).
    • Tests demonstrating the positivity constraint works correctly.
  • Add an example of using this functionality.

@GaelVaroquaux (Member) commented:
In sparse_encode, you cannot ignore the positive argument for algorithms that do not support it. You need either to support it (for threshold it is just a matter of setting the negative values to zero) or to raise an error.

In addition, you need a test for all the features that you have added, here testing positive support for all algorithms.
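For reference, a minimal sketch of what the thresholding route could look like. This is illustrative, not scikit-learn's actual internals; the function name and the (n_samples, n_features) / (n_components, n_features) shape conventions are assumptions:

    import numpy as np

    def threshold_encode(X, dictionary, alpha, positive=False):
        # Correlate the samples with the dictionary atoms.
        cov = np.dot(X, dictionary.T)
        # Soft-threshold the correlations at alpha.
        code = np.sign(cov) * np.maximum(np.abs(cov) - alpha, 0)
        if positive:
            # Enforce the constraint by zeroing the negative coefficients.
            np.clip(code, 0, None, out=code)
        return code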

@jakirkham force-pushed the dict_pos_constrt branch 4 times, most recently from c33c2f2 to dec16c1, on February 17, 2016 at 14:21
@jakirkham (Contributor, Author) commented:
Added positivity support for the threshold case and added a ValueError for the orthogonal matching pursuit case.

Added tests that verify the positivity constraint is met for the sparse code in a variety of cases. They also verify that the ValueError for the orthogonal matching pursuit case is raised appropriately.

Additionally, renamed the positive argument to be clearer in some cases. This constraint ended up needing to be included as part of the SparseCodingMixin to get the right behavior from transform.

It would be nice to extend this to the dictionary as well; this is something that Mairal et al. do in SPAMS. I know this will come down to some sort of thresholding on the dictionary, but I am a little unclear as to where. If you have any thoughts, please share.
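A sketch of the guard for the unsupported case; the function name and message wording here are illustrative:

    def check_positive_coding(algorithm, positive):
        # Orthogonal matching pursuit has no nonnegative variant in this
        # implementation, so fail loudly instead of ignoring the flag.
        if positive and algorithm == 'omp':
            raise ValueError(
                "Positive constraint not supported for the 'omp' "
                "coding method.")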

@jakirkham force-pushed the dict_pos_constrt branch 4 times, most recently from c80e19e to fc931aa, on February 18, 2016 at 03:23
@jakirkham (Contributor, Author) commented:
So, I have added something that I think will correctly constrain the dictionary to positive values, and it does appear to work. However, it may not be the best approach or could have errors, so feedback is definitely welcome here. I am still not sure whether something is missing.
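Roughly, the constraint amounts to a projection step inside the block coordinate descent dictionary update, as in the following sketch. It is heavily simplified from _update_dict (no R2 bookkeeping, no dead-atom handling) and assumes code of shape (n_samples, n_components), dictionary of shape (n_components, n_features), and data Y of shape (n_samples, n_features):

    import numpy as np

    def update_dict_atom(dictionary, k, Y, code, positive=False):
        # Residual with atom k's own contribution added back in.
        R = Y - np.dot(code, dictionary)
        R += np.outer(code[:, k], dictionary[k])
        # Least-squares update for atom k.
        atom = np.dot(code[:, k], R)
        if positive:
            # Project the updated atom onto the nonnegative orthant.
            np.clip(atom, 0, None, out=atom)
        # Renormalize so atoms stay unit norm.
        norm = np.linalg.norm(atom)
        if norm > 1e-12:
            atom /= norm
        dictionary[k] = atom
        return dictionary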

@jakirkham changed the title from "ENH: Allow for a positivity constraint in dictionary learning methods" to "WIP, ENH: Allow for a positivity constraint in dictionary learning methods" Feb 18, 2016
@jakirkham changed the title from "WIP, ENH: Allow for a positivity constraint in dictionary learning methods" to "ENH: Allow for a positivity constraint in dictionary learning methods" Feb 18, 2016
@jakirkham changed the title from "ENH: Allow for a positivity constraint in dictionary learning methods" to "ENH: Allow for positivity constraints on the dictionary and sparse code" Feb 18, 2016
@jakirkham changed the title from "ENH: Allow for positivity constraints on the dictionary and sparse code" to "ENH: Optional positivity constraints on the dictionary and sparse code" Feb 18, 2016
@jakirkham (Contributor, Author) commented:
Thus far, using this function on more complex datasets seems to yield the right results.

    @@ -304,7 +321,7 @@ def sparse_encode(X, dictionary, gram=None, cov=None, algorithm='lasso_lars',

     def _update_dict(dictionary, Y, code, verbose=False, return_r2=False,
    -                 random_state=None):
    +                 random_state=None, dict_positive=False):
Inline review comment (Contributor):

positive=False

@jakirkham (Contributor, Author) replied:
I had thought about this, but was concerned it might be unclear. If you would prefer it, I can make the change.

@jakirkham (Contributor, Author) replied:
This has been changed.

@arthurmensch (Contributor) commented:
We need an example showing the value of enforcing positivity on the dictionary / code. It would be great to compare to NMF estimators. I guess modifying the faces decomposition example is the easiest way, though it would be interesting to have a different use case.

@jakirkham (Contributor, Author) commented:
So, I am using this primarily in image processing of calcium imaging data from the mouse brain (various regions). While there are many cases of using NMF in this area (and we tried it ourselves), we found dictionary learning with positivity constraints performed better. Even on artificial data (used for testing), I get the wrong answer without this positivity constraint on the dictionary.

The reason I added this is that it was a feature provided in SPAMS by Mairal et al. However, that package has gone without a release for a year and a half. I have tried patching it to fix various bugs that have crept up (NumPy support, better linking to the BLAS, etc.), but this solution is untenable, especially as I am now looking for Python 3 support and a new version of NumPy is on the horizon. It seems the original authors are likely exploring new, interesting problems, which makes sense; unfortunately, this is a problem from the maintenance side. I suppose one could argue about licensing too, but this seems secondary to having a working dependency stack.

@arthurmensch (Contributor) commented:
I am sure there are some use cases where it performs better than NMF. However, there is no point in providing new functionality to the end user without advertising it with an example. Do you have any ideas? I'll look into that as well.

@jakirkham (Contributor, Author) commented:
Ok, sure, I'll try to come up with an example. It might not be today, but I will at least try tomorrow or this weekend.

@jakirkham force-pushed the dict_pos_constrt branch 4 times, most recently from a1c1f53 to 7594fbf, on February 22, 2016 at 02:25
@jakirkham (Contributor, Author) commented:
For an example, take a look at this ( http://nbviewer.jupyter.org/gist/jakirkham/c5622e41843e6dafbcb7 ). I modified an example that I found in the scikit-learn docs and imposed the positivity constraints in different combinations. Thoughts?

@amueller (Member) commented Oct 7, 2016

Sorry for our slow replies, @jakirkham. Unfortunately, I don't think we have maintainers who use these methods a lot, and we are swamped with many PRs. We need to figure something out, though :-/

@jnothman (Member) commented Nov 1, 2016

Ping @vene

@vene (Member) commented Nov 3, 2016

(Caveat: I don't really have any experience with practical application of dictionary learning whatsoever...)

Maybe just show the last slice of the example (positive dictionary + code), since it looks like the components are a bit sparser than NMF's.

Another idea is to check whether it works as a topic model for text data, comparably to how NMF does.

@ogrisel (Member) left a comment:

LGTM but disclaimer: I am not familiar with the SPAMS implementation. Maybe @arthurmensch is more knowledgeable.

It would be great to update the faces decomposition example with a new entry with a positivity constraint on the dictionary components.

        Whether to enforce positivity when finding the code.

    fit_positive : bool
        Whether to enforce positivity when finding the dictionary.
Inline review comment (Member):
Why not name this code_positive and dict_positive instead?

    n_components = 8

    dico = MiniBatchDictionaryLearning(
        n_components, transform_algorithm='lasso_lars', random_state=0,
Inline review comment (Member):
If the test is not too long to run, it would be great to also check with other values of transform_algorithm.

        dict_positive=True,
        code_positive=True)
    assert_true((dictionary >= 0).all())
    assert_true((code >= 0).all())
Inline review comment (Member):
And maybe also test other combinations, such as (dict_positive=True and code_positive=False) and (dict_positive=False and code_positive=True).
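A sketch of those combinations, written against the parameter names as finally merged (positive_dict / positive_code, scikit-learn >= 0.20) and using the coordinate descent algorithms, which support the positivity flag:

    import itertools
    import numpy as np
    from sklearn.decomposition import MiniBatchDictionaryLearning

    rng = np.random.RandomState(0)
    X = rng.randn(12, 10)
    # Exercise all four (dict, code) positivity combinations.
    for pos_dict, pos_code in itertools.product([False, True], repeat=2):
        dico = MiniBatchDictionaryLearning(
            n_components=5, fit_algorithm='cd', transform_algorithm='lasso_cd',
            random_state=0, positive_dict=pos_dict, positive_code=pos_code)
        code = dico.fit(X).transform(X)
        if pos_dict:
            assert (dico.components_ >= 0).all()
        if pos_code:
            assert (code >= 0).all()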


@jakirkham force-pushed the dict_pos_constrt branch 4 times, most recently from 401c65f to 52d41af, on June 2, 2018 at 19:07
@jakirkham (Contributor, Author) commented:
Thanks for your feedback, @ogrisel.

Have pushed some changes that hopefully address your comments and have squashed the history a bit. Tests appear to be passing. 🎉

For the most part, this just relies on whatever positivity constraint the underlying algorithm exposes and surfaces it to the user. The exceptions are the thresholding case, where the handling is custom, and orthogonal matching pursuit, where we just error out (as it is unsupported), following @GaelVaroquaux's advice earlier in the thread. So I would hope knowledge of SPAMS is not required, but it would be nice to have someone with that expertise take a look if they have time.

As for an example (though we did look at this case in person), I have the faces with the positivity constraints in a notebook, which shows a few different results depending on these constraints. I would be happy to add them if they would be useful. We could also play with different data, like the digits set as you suggested, or something else we deem appropriate, though I would suggest either selecting one or two of these to add to the existing gallery or breaking them out onto another doc page to avoid giving users too much to ingest at one time.

Thoughts on any/all of this would be welcome. 😄

@ogrisel (Member) commented Jun 6, 2018

@jakirkham yes! Please add the MBDL case with positive code and dict to the existing example for the faces decomposition. Maybe using the red-blue color map with white centered at zero for all those face plots would better highlight the differences between the decompositions.
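A sketch of that plotting tweak, assuming components and image_shape come from the faces decomposition example; symmetric limits pin white at zero in the diverging colormap:

    import matplotlib.pyplot as plt

    def plot_components(components, image_shape):
        # Symmetric color limits so white sits exactly at zero.
        vmax = max(components.max(), -components.min())
        fig, axes = plt.subplots(2, 4, figsize=(8, 4))
        for comp, ax in zip(components, axes.ravel()):
            ax.imshow(comp.reshape(image_shape), cmap=plt.cm.RdBu,
                      vmin=-vmax, vmax=vmax, interpolation='nearest')
            ax.set_xticks(())
            ax.set_yticks(())
        plt.show()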

@ogrisel (Member) commented Jun 6, 2018

Maybe breaking out the positivity constraints on a simpler dataset like digits into a standalone example is a good idea too. I'll let you experiment and judge which is best.

@ogrisel (Member) commented Jun 6, 2018

Also, I am wondering if it would not be better to rename "dict_positive" to "positive_dict" and "code_positive" to "positive_code". You are the native English speaker though :)

Provides an option for dictionary learning to positively constrain the dictionary and the sparse code. This is useful in applications of dictionary learning where the data is known to be positive (e.g. images), but where the sparsity constraint of dictionary learning is better suited to factorizing the data than other positively constrained factorization techniques like NMF, which may not be similarly sparse.

Ensures that when the positivity constraint is applied, the dictionary and/or code end up having only positive values in the respective results, depending on which of them is positively constrained.

Shows the various positivity constraints on dictionary learning and what their results look like using a red-to-blue color map. These are included in the examples and also in the docs below dictionary learning. All of these use the Olivetti faces as a training set.
@jakirkham (Contributor, Author) commented:
Thanks for the suggestions, @ogrisel.

Agree that positive_dict/positive_code sounds better. Have updated the API accordingly.

The red/blue color map with faces is a good idea. Have included some code for this in plot_faces_decomposition.py in a separate section. Also have included these images in the docs under the "Generic dictionary learning" section with some text beforehand.

Please let me know what you think. :)
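For illustration, a minimal usage sketch of the renamed API (assuming scikit-learn >= 0.20):

    import numpy as np
    from sklearn.decomposition import MiniBatchDictionaryLearning

    rng = np.random.RandomState(0)
    X = np.abs(rng.randn(100, 30))  # toy nonnegative data

    est = MiniBatchDictionaryLearning(
        n_components=8, fit_algorithm='cd', transform_algorithm='lasso_lars',
        random_state=0, positive_dict=True, positive_code=True)
    code = est.fit(X).transform(X)

    assert (est.components_ >= 0).all()  # dictionary atoms are nonnegative
    assert (code >= 0).all()             # sparse code is nonnegative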

@jakirkham (Contributor, Author) commented:
Have added a screenshot of what this doc section looks like below.

[Screenshot: dictionary learning docs section]

@jakirkham changed the title from "ENH: Optional positivity constraints on the dictionary and sparse code" to "[MRG] ENH: Optional positivity constraints on the dictionary and sparse code" Jun 13, 2018
@jakirkham (Contributor, Author) commented:
Any other thoughts?

@ogrisel merged commit 6ce497c into scikit-learn:master on Jun 21, 2018
@ogrisel (Member) commented Jun 21, 2018

Thank you very much @jakirkham! That is a great contribution.

@ogrisel (Member) commented Jun 21, 2018

I merged a bit too quickly: I forgot to ask for an entry in whats_new.rst and we also need the .. versionadded:: 0.20 annotation for the new options in the docstrings of the public functions / methods.
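That is, each new option's docstring entry gains the directive, roughly like this sketch (signature abridged, wording illustrative):

    def sparse_encode(X, dictionary, positive=False, **kwargs):
        """Sparse coding (abridged).

        Parameters
        ----------
        positive : bool, default=False
            Whether to enforce positivity when finding the encoding.

            .. versionadded:: 0.20
        """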

@jakirkham (Contributor, Author) commented:
Thanks @ogrisel. Glad to have this in. 😄

No worries. Does PR ( #11341 ) address this?

@jakirkham deleted the dict_pos_constrt branch on June 21, 2018 at 15:19
@GaelVaroquaux (Member) commented Jun 21, 2018 via email

@jakirkham (Contributor, Author) replied:
Of course. Thanks for the help reviewing.
