
[MRG+1] Fix lower_bound_ not equal to max lower bound in mixture models when n_init > 1 #10870

Merged

Conversation


@ageron ageron commented Mar 25, 2018

Reference Issues/PRs

Fixes #10869

What does this implement/fix? Explain your changes.

Just set the lower_bound_ to be equal to the max_lower_bound at the end of the loop over the initializations (at the end of BaseMixture.fit()).
Also fix the test_init() function that was supposed to catch this bug. I do this by looping over multiple random states rather than just trying one (which had a 50% chance of wrongly succeeding).
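
For reference, a minimal sketch of the intended behavior (hypothetical helper names, not the actual BaseMixture code): the best lower bound is tracked across all initializations and reported at the end of fit().

import numpy as np

def fit_over_inits_sketch(run_one_init, n_init):
    # run_one_init() is a hypothetical helper: it runs EM from a fresh
    # initialization and returns (final_lower_bound, fitted_params).
    max_lower_bound = -np.inf
    best_params = None
    for _ in range(n_init):
        lower_bound, params = run_one_init()  # local, per-initialization value
        if lower_bound > max_lower_bound:
            max_lower_bound, best_params = lower_bound, params
    # The bug: the attribute previously kept the last initialization's lower
    # bound; the fix reports the best one found across all initializations.
    return max_lower_bound, best_params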

Any other comments?

Thanks for making such a great library! :)


@jnothman jnothman left a comment


I think this is right, but it makes the repeated setting of lower_bound_ look inexplicable (unless it were intended for debugging a crash!). Can we use a local lower_bound instead?

@ageron ageron force-pushed the gaussian_mixture_lower_bound_fix branch from a0bffd9 to 1b9f54e on March 26, 2018 12:34

ageron commented Mar 26, 2018

Hi @jnothman ,
Good point. I just made lower_bound a local variable instead.

@ageron ageron force-pushed the gaussian_mixture_lower_bound_fix branch from 5957ea2 to 1b9f54e on March 26, 2018 14:22

ageron commented Mar 26, 2018

There were a couple of traps, but I finally got this working. Here's the new behavior:

  • When warm_start is False or it is the first time we fit the model, then:
    • self.lower_bound_ is the max lower bound found across all n_init initializations.
  • When warm_start is True and it is not the first call to the fit() method, then:
    • self.lower_bound_ is the max lower bound found during this call to the fit() method (even if it is lower than the max lower bound found earlier).
      • This is useful if you train a model on a first dataset, then you train it on another dataset: the lower bound may be lower after the second call to fit(), and that's expected.
    • Moreover, self.converged_ is True if and only if the lower bound at some iteration is within tol of the previous lower bound. At the first iteration, prev_lower_bound is initialized to self.lower_bound_ (the max lower bound from the previous run); see the sketch below.
      • I had to implement this logic so that test_monotonic_likelihood() still passes: it uses warm_start=True and max_iter=1, so the only way it can detect convergence is by looking at self.lower_bound_ from the previous iteration. There's a tiny risk that the algorithm has in fact not converged, but just happened to stumble upon a very close lower bound estimate. Maybe prev_lower_bound should be set to self.lower_bound_ only when max_iter=1, and to -np.infty otherwise?
        Wdyt?
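
A rough sketch of the convergence bookkeeping described above, using hypothetical helper names (this is not the actual base.py code):

import numpy as np

def em_run_sketch(one_em_iteration, previous_fit_lower_bound,
                  warm_start, max_iter, tol):
    # one_em_iteration() is a hypothetical callback returning the lower bound
    # after one E-step/M-step pair; previous_fit_lower_bound stands for
    # self.lower_bound_ from the previous call to fit() (-inf on the first).
    converged = False
    # A warm-started run compares its first iteration against the lower bound
    # reached by the previous fit(), so that a user-driven loop with
    # max_iter=1 can still detect convergence.
    lower_bound = previous_fit_lower_bound if warm_start else -np.inf
    for n_iter in range(1, max_iter + 1):
        prev_lower_bound = lower_bound
        lower_bound = one_em_iteration()
        if abs(lower_bound - prev_lower_bound) < tol:
            converged = True
            break
    return lower_bound, converged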


@jnothman jnothman left a comment


I think this logic looks reasonable. Do attribute docstrings need improvement?

rand_data = RandomData(np.random.RandomState(random_state), scale=1)
n_components = rand_data.n_components
X = rand_data.X['full']
for random_state in range(100):
Member


Could you double check that this runs quite quickly?

Contributor Author


It runs in 4 seconds on my laptop. Is that sufficiently quick? If not, we could reduce this to less than a second by iterating just 25 times. The probability for the unfixed code to pass this test would be 1/2^25, which is roughly 3e-8. Tell me what you prefer.

Member


Yes, 4s for this kind of test is excessively long, IMO. And if the unfixed code only has a chance of 3e-8 of passing (did you get that the right way around?), then we can run it for fewer iterations than that. Alternatively, we can reduce the size of X if that's not actually related to the property being tested.

Contributor Author


Okay, reducing to 25 iterations. I think I got the calculation right: the unfixed code sets lower_bound_ to the lower bound of the last initialization, which has a 50% chance of being higher than the lower bound of the first initialization. So in order to pass the updated test, the unfixed code would need to win that 50% chance 25 times in a row, and 0.5^25 ≈ 3e-8. In other words, the new test would catch the bug in the unfixed code.
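
For illustration, a hedged sketch of this kind of looping test (not the PR's exact test_init code; the test suite's RandomData helper is replaced here by a plain random array):

import numpy as np
from sklearn.mixture import GaussianMixture

def test_init_sketch():
    # With the fix, more initializations can only match or improve the
    # reported lower bound. Repeating over 25 seeds means the unfixed code
    # (which reports the last init's bound) passes with probability ~0.5^25.
    for random_state in range(25):
        rng = np.random.RandomState(random_state)
        X = rng.rand(50, 5)
        # max_iter=1 keeps the test fast; ConvergenceWarning is expected
        # and harmless here.
        gm1 = GaussianMixture(n_components=2, n_init=1, max_iter=1,
                              random_state=random_state).fit(X)
        gm2 = GaussianMixture(n_components=2, n_init=10, max_iter=1,
                              random_state=random_state).fit(X)
        assert gm2.lower_bound_ >= gm1.lower_bound_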

Member


Sure, thanks. A smaller dataset would also be fine.

@@ -191,6 +191,7 @@ def fit(self, X, y=None):
X = _check_X(X, self.n_components, ensure_min_samples=2)
self._check_initial_parameters(X)


Member


Rm blank line

Contributor Author


Oops, indeed. Fixing this now.


ageron commented Mar 27, 2018

I just updated the documentation; I hope it's clear.

I also changed the logic slightly: the previous lower_bound_ is now only used to check for convergence when warm_start is True and max_iter is 1. Indeed, if max_iter > 1, it's safer not to compare against the previous fit's lower bound at the first iteration, in case the dataset has changed.


ageron commented Apr 9, 2018

Hi @jnothman, is there anything else you need from me to fix this bug, or are we okay to merge?


@jnothman jnothman left a comment


Can that new logic be tested?

This requires a second review before merge

@jnothman
Member

Please add an entry to the change log at doc/whats_new/v0.20.rst. Like the other entries there, please reference this pull request with :issue: and credit yourself (and other contributors if applicable) with :user:


ageron commented Apr 11, 2018

Thanks for your feedback, @jnothman. I just updated doc/whats_new/v0.20.rst, as requested.

@jnothman
Member

I also asked: Can that new logic be tested?


ageron commented Apr 11, 2018

Hi @jnothman, I just added the tests for the new logic. In short:

  1. I added a test that convergence is properly detected when warm_start=True, with different values of max_iter (1, 2, and 50); a sketch of this kind of test is shown after this list.
  2. I added a test that convergence is never detected at the first iteration when warm_start=True and max_iter > 1. This ensures that no convergence will be detected by mistake at the first iteration if the dataset is changed.
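
For illustration, a sketch of the first kind of test (this mirrors the idea; the dataset and exact parameters are made up and need not match the PR's test):

import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.RandomState(0)
X = rng.rand(100, 2)
for max_iter in (1, 2, 50):
    gmm = GaussianMixture(n_components=2, warm_start=True,
                          max_iter=max_iter, random_state=0)
    # Keep refitting on the same data until convergence is flagged;
    # ConvergenceWarning may be emitted on the early, non-converged fits.
    for _ in range(200):
        gmm.fit(X)
        if gmm.converged_:
            break
    assert gmm.converged_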


ageron commented May 4, 2018

Hi @jnothman ,
I've updated this PR because there were some conflicting changes in sklearn/mixture/base.py regarding n_iter_.
Could you please confirm that everything looks good to you and merge this PR if it does?
Thanks!


@jnothman jnothman left a comment


This needs a review by someone else still.

@@ -26,6 +26,8 @@ random sampling procedures.
- :class:`linear_model.OrthogonalMatchingPursuit` (bug fix)
- :class:`metrics.roc_auc_score` (bug fix)
- :class:`metrics.roc_curve` (bug fix)
- :class:`mixture.BayesianGaussianMixture` (bug fix)
Member


Remind me: does this PR actually change the prediction? If it's only affecting the attribute value, I think we should leave it out of here.

Contributor Author


Yes, it may change the prediction in some very rare cases. Consider:

gm = GaussianMixture(warm_start=True, max_iter=10)
gm.fit(X1)
gm.fit(X2)

If X2 is different from X1, it does not make sense to use the previous value of lower_bound_ as a starting point. The previous implementation would do that, and therefore it might wrongly detect convergence after the first iteration (this is unlikely, but possible, in particular if X2 is very similar to X1, or if tol is large). This PR fixes this. Thus, the fixed algorithm might converge to a better solution and produce different (better) predictions.

You could consider these as separate issues: (1) wrong lower_bound_ value, and (2) wrong convergence detection logic. However, they're both caused by the same few lines of code, so I fixed them both in this one PR.

Member


Thanks for the explanation

@@ -212,8 +214,8 @@ Model evaluation and meta-estimators
:issue:`9304` by :user:`Breno Freitas <brenolf>`.

- Add `return_estimator` parameter in :func:`model_selection.cross_validate` to
return estimators fitted on each split. :issue:`9686` by :user:`Aurélien Bellet
<bellet>`.
return estimators fitted on each split. :issue:`9686` by
Member


Argh! Please revert all changes unrelated to the present fix!

Please do not change unrelated things. It makes your contribution harder to review and may introduce merge conflicts to other pull requests.

Contributor Author


Okay, sorry about that; makes sense (FYI, I was trying to make the file fit within 80 characters per line).

    gmm.fit(X)
    if gmm.converged_:
        break
assert_true(gmm.converged_)
Member


Minor: since moving to pytest, we're trying to avoid such assert_* functions. Use a bare assert instead.

Contributor Author


Okay, fixing this now.

@jnothman jnothman changed the title Fix lower_bound_ not equal to max lower bound in mixture models when n_init > 1 [MRG+1] Fix lower_bound_ not equal to max lower bound in mixture models when n_init > 1 May 6, 2018
which the model has the largest likelihood or lower bound. Within each
trial, the method iterates between E-step and M-step for `max_iter`
times until the change of likelihood or lower bound is less than
`tol`, otherwise, a `ConvergenceWarning` is raised.
If `warm_start` is `True`, then `n_init` is ignored and a single
Member


should be double backticks everywhere....

Contributor Author


Oh right, just fixed this, thanks.

# lower_bound_ is very close to the lower_bound_ after the previous call
# to the fit method.
# Unlikely, but possible and problematic, so we might as well avoid it.
rng = np.random.RandomState(0)
Member


Sorry I'm being slow. I don't see where this case is handled and I don't understand the test. Why do we always reset with max_iter > 1 now?

Contributor Author


No worries, let me explain. The scenario I'm trying to avoid, which could happen today (without this PR), is: you use warm_start=True and run gm.fit(X1) followed by gm.fit(X2), where X1 and X2 are different datasets, and if you are unlucky the first iteration of gm.fit(X2) happens to compute a lower bound very close to the final lower bound of gm.fit(X1), so the algorithm thinks it has converged when in fact it should have continued to iterate. Sure, this is quite unlikely with the default tol and if X1 and X2 are very different, but it might be dangerously likely if X1 and X2 are very similar but not identical (or if tol is high).
Since it is hard to find two datasets X1 and X2 where this scenario occurs, I test this scenario by setting tol to infinity.
This case is handled on line 217 of base.py: we start the iterations with lower_bound = (-np.infty if do_init or self.max_iter > 1 else self.lower_bound_). So we only continue from the final lower bound of the last call to fit() if warm_start is True and max_iter == 1. The assumption is that people who use warm_start=True and max_iter = 1 are certainly doing this to manually run the training loop themselves on a single dataset, but if they are using max_iter > 1, it is unclear whether they are running consecutive calls to fit() on the same dataset or not, so we should err on the safe side.
So in short there are two issues in the current implementation: (1) if n_init > 1, the lower bound is the one from the last initialization, not from the best initialization, and (2) there is a risk of false convergence if warm_start is True. Since they are both due to the same few lines of code, I fixed them both in this one PR.
Hope this helps.
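
To make the scenario concrete, a hedged sketch (illustrative only; X1, X2 and the parameter values are made up):

import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.RandomState(0)
X1 = rng.rand(100, 2)
X2 = rng.rand(100, 2) + 10.0  # a clearly different dataset

gm = GaussianMixture(n_components=2, warm_start=True,
                     max_iter=5, tol=np.inf, random_state=0)
gm.fit(X1)
gm.fit(X2)
# With the safeguard described above (compare against the previous fit's
# lower bound only when max_iter == 1), the second fit does not stop at its
# first iteration just because tol is huge; without it, it may report
# convergence immediately even though X2 differs from X1.
print(gm.converged_, gm.n_iter_)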


jnothman commented May 29, 2018 via email


ageron commented May 30, 2018

@jnothman, thanks for your feedback. I did not know that switching datasets between warm_start fits was an abnormal use; in fact, I have often done this after updating a dataset with new data, to avoid starting training from scratch. How else can this be done? In any case, I feel that this will require more discussion, so if you like I can split this PR in two: one part will just focus on fixing the original bug (incorrect lower_bound_ when n_init > 1) and the other on the risk when switching datasets. Sounds good?


jnothman commented May 31, 2018 via email


@jnothman jnothman left a comment


I think the only concerning part here is that max_iter controls multiple things now. If it's not hard to split that from this PR, it would help us ensure that at least the bug is fixed for release.


ageron commented May 31, 2018

Sure @jnothman, it shouldn't be too hard; I'll split the PR (probably this weekend).


ageron commented Jun 5, 2018

Hi @jnothman, I updated this PR to keep only the fix for the original bug, i.e., lower_bound_ was set to the lower bound of the last initialization (when n_init > 1) rather than to the max lower bound across all initializations.
I'll file a separate issue and a separate PR for the wrong convergence detection when consecutive fits with warm_start=True use different datasets.


@jnothman jnothman left a comment


Thanks. LGTM.


ageron commented Jun 19, 2018

Hi there @amueller, this PR needs a second review, whenever you have the chance.

@jnothman jnothman added this to the 0.20 milestone Jun 20, 2018
@GaelVaroquaux
Member

I resolved the conflicts. I will merge once the tests pass.

@GaelVaroquaux
Member

All tests pass aside from AppVeyor, which is lagging behind as an effect of the sprint.

Merging!

@GaelVaroquaux GaelVaroquaux merged commit beb2aa0 into scikit-learn:master Jul 16, 2018
@GaelVaroquaux
Member

Thank you!!!


ageron commented Jul 19, 2018

Thanks to all the reviewers! 👍


Successfully merging this pull request may close these issues.

In Gaussian mixtures, when n_init > 1, the lower_bound_ is not always the max