Update to 0.18.1 #39

Closed
jschueller wants to merge 3 commits

Conversation

@jschueller

No description provided.

@conda-forge-linter

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@jakirkham
Member

Will need similar changes for np111.

@jschueller
Author

@amueller @lesteve After successfully backporting scikit-learn/scikit-learn@3933130 we still got the error reported by @jakirkham on OSX:

======================================================================
FAIL: sklearn.manifold.tests.test_t_sne.test_n_iter_without_progress
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/travis/miniconda3/conda-bld/scikit-learn_1490014369633/_t_env/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/Users/travis/miniconda3/conda-bld/scikit-learn_1490014369633/_t_env/lib/python2.7/site-packages/sklearn/manifold/tests/test_t_sne.py", line 584, in test_n_iter_without_progress
    "last 2 episodes. Finished.", out)
AssertionError: 'did not make any progress during the last 2 episodes. Finished.' not found in '[t-SNE] Computing pairwise distances...\n[t-SNE] Computed conditional probabilities for sample 100 / 100\n[t-SNE] Mean sigma: 0.765549\n[t-SNE] Iteration 25: error = 13.5941929, gradient norm = 0.0787243\n[t-SNE] Iteration 50: error = 13.3894499, gradient norm = 0.0770499\n[t-SNE] Iteration 75: error = 17.2619819, gradient norm = 0.2986331\n[t-SNE] Iteration 100: error = 16.4814787, gradient norm = 0.0060466\n[t-SNE] KL divergence after 100 iterations with early exaggeration: 16.481479\n[t-SNE] Iteration 125: error = 2.3524479, gradient norm = 0.0015101\n[t-SNE] Iteration 150: error = 2.2885583, gradient norm = 0.0013097\n[t-SNE] Iteration 175: error = 2.2739044, gradient norm = 0.0012655\n[t-SNE] Iteration 200: error = 2.2700430, gradient norm = 0.0012547\n[t-SNE] Iteration 225: error = 2.2689857, gradient norm = 0.0012518\n[t-SNE] Iteration 250: error = 2.2686934, gradient norm = 0.0012510\n[t-SNE] Iteration 275: error = 2.2686124, gradient norm = 0.0012508\n[t-SNE] Iteration 300: error = 2.2685899, gradient norm = 0.0012507\n[t-SNE] Iteration 325: error = 2.2685837, gradient norm = 0.0012507\n[t-SNE] Iteration 350: error = 2.2685820, gradient norm = 0.0012507\n[t-SNE] Iteration 375: error = 2.2685815, gradient norm = 0.0012507\n[t-SNE] Iteration 400: error = 2.2685814, gradient norm = 0.0012507\n[t-SNE] Iteration 425: error = 2.2685813, gradient norm = 0.0012507\n[t-SNE] Iteration 450: error = 2.2685813, gradient norm = 0.0012507\n[t-SNE] Iteration 475: error = 2.2685813, gradient norm = 0.0012507\n[t-SNE] Iteration 500: error = 2.2685813, gradient norm = 0.0012507\n[t-SNE] Iteration 525: error = 2.2685813, gradient norm = 0.0012507\n[t-SNE] Iteration 550: error = 2.2685813, gradient norm = 0.0012507\n[t-SNE] Iteration 575: error = 2.2685813, gradient norm = 0.0012507\n[t-SNE] Iteration 600: error = 2.2685813, gradient norm = 0.0012507\n[t-SNE] Iteration 625: error = 2.2685813, gradient norm = 0.0012507\n[t-SNE] Iteration 650: error = 2.2685813, gradient norm = 0.0012507\n[t-SNE] Iteration 675: error = 2.2685813, gradient norm = 0.0012507\n[t-SNE] Iteration 700: error = 2.2685813, gradient norm = 0.0012507\n[t-SNE] Iteration 725: error = 2.2685813, gradient norm = 0.0012507\n[t-SNE] Iteration 725: error difference 0.000000. Finished.\n[t-SNE] Error after 725 iterations: 16.481479\n'



----------------------------------------------------------------------

cc @ocefpaf

@lesteve
Member

lesteve commented Mar 20, 2017

I guess nobody has looked at it since. I will try to reproduce it, but I don't have an OSX VM handy right now.

For the record, I don't think it is a good idea to merge this PR. If we want 0.18.1 to be accessible on conda-forge, we should ship exactly the same code as the release on PyPI and skip the offending tests just for the purposes of releasing to conda-forge.

@jschueller
Author

I opened an issue for this:
scikit-learn/scikit-learn#8618

@lesteve
Member

lesteve commented Mar 20, 2017

I opened an issue for this:
scikit-learn/scikit-learn#8618

Thanks for this!

# detect constant columns
constant_column = np.var(Q, 0) < 1.e-12
w[constant_column] = 0  # cancel the regularization for the intercept
- w[v == 0] = 0
Member (review comment on the patch hunk above)

Rather than backporting a fix, I would be in favour of skipping the two problematic tests on OSX and merging this PR. This way 0.18.1 will be available on conda-forge while the code changes will be only minor (test changes only) compared to the 0.18.1 release available on PyPI.
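
For reference, the constant-column test in the hunk above can be exercised standalone; this is only an illustrative sketch, with made-up values for Q and w rather than the surrounding ridge code:

import numpy as np

# Made-up example: the first column of Q is constant, i.e. the direction
# corresponding to the intercept.
Q = np.array([[1.0, 0.5],
              [1.0, -0.2],
              [1.0, 0.9]])
w = np.array([0.7, 0.3])

constant_column = np.var(Q, 0) < 1.e-12  # detect constant columns
w[constant_column] = 0  # cancel the regularization for the intercept
print(constant_column)  # [ True False]
print(w)                # [0.  0.3]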

@jakirkham
Member

Not to dup myself, but simply to refresh after a long hiatus: we typically patch in conda-forge to fix something that is broken so that we can release it, especially if a fix is already available. While I understand that is not exactly the same as where it was tagged, I would much rather ensure quality products go into the conda-forge channel than relax the quality controls. If what we are saying is that this isn't up to snuff and we don't want to patch it to be so, maybe we can wait a bit for the dust to settle on the fixes.

@jschueller
Author

You mean you don't want to release it as is?

@lesteve
Member

lesteve commented Mar 21, 2017

Not to dup myself, but simply to refresh after a long hiatus

Yeah I do remember this conversation.

While I understand that is not exactly the same as where it was tagged, I would much rather ensure quality products go into the conda-forge channel than relax the quality controls

I completely understand that skipping tests is not something that should be encouraged in general. It is very unfortunate that we did not fix these test failures before releasing 0.18.1 and we will certainly do our best for the next release to publish a release candidate on conda-forge in order to make sure all the tests pass prior to the release.

I think it is fair to say that 0.18.1 as it was released on PyPI is considered a quality product by the scikit-learn developers. Skipping these two tests only on OSX before releasing on conda-forge seems the best option as far as I am concerned.

I am more than happy to hear other scikit-learn developers' opinions on this, e.g. @amueller @ogrisel @jnothman.

@ogrisel
Contributor

ogrisel commented Mar 21, 2017

I agree that the macosx test failure is probably revealing a bug (either in sklearn or openblas, it's hard to tell without investigating further), and this bug should be fixed before releasing scikit-learn 0.19.

This test failure could not be detected on the upstream scikit-learn CI infrastructure because it does not happen when numpy is linked against OSX Accelerate instead of openblas.

But sklearn 0.18.1 is released (source on pypi) and I think binary distributions such as conda-forge should reflect the matching source releases (possibly disabling this test to avoid noisy reports).

Note that this bug was probably already present in 0.17 but we did not test that part of the code at that point. So technically this bug is already present in the current state of conda-forge.
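
As a side note for anyone trying to reproduce this: one way to check which BLAS/LAPACK numpy is linked against is the snippet below (the exact output format varies between numpy versions).

import numpy as np

# Print numpy's build configuration, including the BLAS/LAPACK libraries it
# was linked against (e.g. openblas vs. OSX Accelerate).
np.__config__.show()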

@jschueller
Author

jschueller commented Mar 21, 2017

@ogrisel jakirkham's point is less about the second issue than about the first one we encountered, for which a fix is already available (scikit-learn/scikit-learn#8154). IMHO, if it fixes things, why not pack it in? That does not mean actively backporting every fix from upstream. Also, it is easier to backport a simple patch than to create a patch ourselves that disables the corresponding test : )

@amueller
Contributor

There are probably 20-50 more minor fixes in master that are not in 0.18.1. Why not ship those too? Because it will be impossible for us to find out what people have actually installed if the conda-forge version is a custom patch.
We are behind on releases and we should have done 0.19 (or 0.18.2) by now, but we haven't.

How do you decide what to ship? I would imagine all projects have a constant stream of bug-fixes in master, and you cannot synchronize with all of them.

@ogrisel
Contributor

ogrisel commented Mar 21, 2017

If different distributions ship different variants of the code with the same version number that will make upstream maintenance / issue tracking and user support too complex.

One should be able to tell our users: make sure to use version "X.Y.Z" to have bug XXX fixed.

@jakirkham
Member

We don't love patching things either. However, sometimes it is necessary; in a few cases, we have patches that make little sense outside of conda. It is worth noting that if people do get this test failure, they might come back to bug us about it.

In any event, I don't want to be pedantic about this. Especially as it seems the consensus is not to patch. If someone could help clarify how we can skip only the troublesome tests, let's do it and move forward.

@lesteve
Member

lesteve commented Mar 22, 2017

In any event, I don't want to be pedantic about this

I am really really glad we can agree even if we approach the problem from different angles and with different biases. I think the simplest thing is to do something like this for the two test functions that fail:

import sys

from sklearn.utils.testing import SkipTest  # only needed if not already imported in the test module

def test_that_fails():
    if sys.platform == 'darwin':
        raise SkipTest('This test is known to fail on OSX')
    # body of the test function as it was previously
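
Applied to the failing test from the traceback above, that would look something like this sketch (the test body itself is elided; the issue reference is the one opened earlier in this thread):

import sys

from sklearn.utils.testing import SkipTest

def test_n_iter_without_progress():
    if sys.platform == 'darwin':
        # known failure on OSX with openblas, see scikit-learn/scikit-learn#8618
        raise SkipTest('This test is known to fail on OSX')
    # original body of the test, unchanged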

@jschueller
Author

Yes, I removed the assert, so the result is the same, unless you want to include it upstream.

@lesteve
Member

lesteve commented Mar 23, 2017

yes, I removed the assert, so the result is the same, unless you want to include it upstream.

Sorry, I'm not very familiar with conda recipes: does the [osx] comment mean that the patch is only applied on OSX?

I think what is left to do is to skip the test rather than patching sklearn/linear_model/ridge.py.
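
Edit: reading up on conda-build selectors, a trailing # [osx] comment does restrict a recipe line to OSX builds, so a patches entry like the hypothetical meta.yaml fragment below would only be applied there:

source:
  patches:
    # hypothetical patch name; applied only when building on OSX
    - skip-tsne-tests.patch  # [osx]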

@lesteve
Member

lesteve commented Mar 23, 2017

I opened #40 in order to show what the ideal PR would be from a scikit-learn developer's point of view. In case I did not screw up something related to the conda recipe, I would be in favour of merging #40.

@jschueller jschueller closed this Mar 28, 2017
@jschueller jschueller deleted the 0181 branch March 28, 2017 11:05
@jschueller
Author

jschueller commented Mar 28, 2017

Superseded by #40.
