
CLN: Silence test output #1507

Merged
merged 3 commits into statsmodels:master from bashtage:silence-tests on Apr 3, 2014

Conversation

@bashtage (Contributor) commented Mar 24, 2014

A simple patch that makes minimal changes to the tests to silence almost all output.
Most fixes simply changed options so that nothing is printed (e.g. disp=False); a few
others trapped errors.

A new class was introduced in tools/testing.py that allows general output in a test
to be silenced. It would be preferable for code to raise warnings rather than simply
print, so that the warnings could be captured instead.

@jseabold (Member) commented Mar 24, 2014

Thanks.

Note to self. For the ARIMA models, it looks like disp no longer maps to iprint of l_bfgs_b, so these can now be 0 instead of -1, so that's fine.

I think the l_bfgs_b warning that you silence will be fixed by this, which I should have just pulled out. jseabold@f0944d4

There definitely shouldn't be any print statements anywhere. This shouldn't have been merged like that, so you can just delete them rather than commenting them out as far as I'm concerned.

For the redirection class, is the idea that there are stray print statements so you can't just use the catch_warnings context manager in the warnings package? If so, I'd prefer to just get rid of the print statements.

@bashtage (Contributor, Author) commented Mar 24, 2014

The print statements are in the parallel code - they essentially warn that joblib is not installed. It seems reasonable to inform a user who is using the parallel code that they should install joblib, so conversion to a warning would probably be most reasonable.
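
(A minimal sketch of that conversion, with a placeholder message; the actual change made in this PR appears in the diff further down.)

import warnings

try:
    import joblib
except ImportError:
    warnings.warn("joblib is not installed; falling back to serial execution.",
                  UserWarning)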

There are loads of Python 2.x print statements in the examples. They cause a ton of parsing errors on the 3.x build but obviously aren't test failures. I suppose those should probably be replaced with

from __future__ import print_function
print('blah blah')

Other general cleanups are now possible since Python 2.6+ is the minimum, e.g. using '{0}'.format(a) instead of '%s' % a

@jseabold (Member) commented Mar 24, 2014

I'd change the print statement to a warning in the parallel code.

Yeah, we have an open issue on moving to one code base for Python 2.x - 3.x. I don't know if it's possible, trivial, or worth it to point 2to3 at the examples rather than just cleaning them up. Using print_function is fine by me, though.

@jseabold (Member) commented Mar 24, 2014

I think we agreed Python 2.6 is now the minimum target. Many projects have moved to this. I'm trying to always use the newer string formatting, but I've been very inconsistent with this in the past.

@josef-pkt (Member) commented Mar 24, 2014

I haven't started to use string format yet. It looks a bit annoying to have to add the position integer in Python 2.6, and to change all of them if we insert a new value.

@bashtage (Contributor, Author) commented Mar 24, 2014

You can use named arguments, which are often clearer in intent:

'The {animal} is in the {location}'.format(animal='bear', location='woods')

Then you can add arguments at will. You can also insert positional arguments in any order:

'The {0} is in the {1}'.format('bear', 'woods')

can then become

'The {0} and the {2} are in the {1}'.format('bear', 'woods', 'bobcat')

@josef-pkt (Member) commented Mar 24, 2014

When we run 2to3 on the examples, almost the only changes are the print statements.
I use the examples after 2to3 on python 3 to try out things.

It would be very good to have the documentation examples compatible with both py2 and py3.

@bashtage (Contributor, Author) commented Mar 24, 2014

I think at this point the policy for new code, and the goal for existing code, is that submitted code works on both.

@josef-pkt (Member) commented Mar 24, 2014

I agree for text (where I also use dictionaries) but

>>> '%s = %6.2f' % ('phi', 3.5)
'phi =   3.50'

>>> '{0} = {1:6.2f}'.format('phi', 3.5)
'phi =   3.50'

@bashtage (Contributor, Author) commented Mar 24, 2014

Those look pretty much the same to me. 1 extra keystroke on the first, and 3 on the second. The {} notation is much clearer for spotting the location of the formatted text in a longer string, IMO.

@bashtage (Contributor, Author) commented Mar 24, 2014

And as long as I'm listing things that should be different, it would be better if warnings could be specific, rather than just raising a generic Warning.

@jseabold (Member) commented Mar 24, 2014

Feel free to add to statsmodels/tools/sm_exceptions.py. There are others scattered throughout the code, we're just not very systematic ATM.
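
(A minimal sketch of what such specific warnings could look like in sm_exceptions.py; ConvergenceWarning and ModuleNotAvailableWarning are mentioned later in this thread, but the base class and docstrings here are assumptions.)

# statsmodels/tools/sm_exceptions.py (sketch)
class ConvergenceWarning(UserWarning):
    """Warns that an optimizer did not converge."""
    pass

class ModuleNotAvailableWarning(UserWarning):
    """Warns that an optional dependency is missing."""
    pass

# usage: warnings.warn("mle did not converge", ConvergenceWarning)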

@josef-pkt (Member) commented Mar 24, 2014

using % is mostly habit, but without named placeholders I have to reindex the terms
(which is at least annoying in examples with interactive use):

>>> '%s_%s = %6.2f' % ('rlm', 'phi', 3.5)
'rlm_phi =   3.50'

>>> '{0}_{1} = {2:6.2f}'.format('rlm', 'phi', 3.5)
'rlm_phi =   3.50'

@bashtage (Contributor, Author) commented Mar 24, 2014

It would probably be best to have sm_warnings as well as sm_exceptions. I do think centrally collecting them is a good idea so that common structure can be extracted.

@josef-pkt (Member) commented Mar 24, 2014

I think the module sm_exceptions also has the warnings. I don't think the module will get large enough to split.
GEE uses ConvergenceWarning.

I don't know if we ever checked that UserWarnings are always raised.

@bashtage (Contributor, Author) commented Mar 24, 2014

When a Warning is raised in a test, it should be caught using pandas.util.testing.assert_produces_warning so it doesn't pollute the build.
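
(A self-contained sketch of the usage, assuming a pandas version that ships assert_produces_warning; in the tests the with-block body would be the model call that is expected to warn.)

import warnings
from pandas.util.testing import assert_produces_warning

with assert_produces_warning(UserWarning):
    warnings.warn("expected and checked, so it never reaches the build log", UserWarning)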

@josef-pkt (Member) commented Mar 24, 2014

I think we are still using numpy for most testing utility functions (assert_raises, assert_warns) and context managers to catch warnings.
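
(For comparison, a sketch of the numpy and stdlib equivalents mentioned here.)

import warnings
from numpy.testing import assert_warns

# assert that a call emits a specific warning class
assert_warns(UserWarning, warnings.warn, "check me", UserWarning)

# or just record warnings so they do not reach the test output
with warnings.catch_warnings(record=True):
    warnings.simplefilter("always")
    warnings.warn("silenced", UserWarning)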

@bashtage (Contributor, Author) commented Mar 24, 2014

assert_produces_warning is a context manager that also checks for a specific warning. As far as I can tell it is simpler to use than assert_warns in complex code.

@coveralls commented Mar 24, 2014

Coverage remained the same when pulling 6cd81dd on bashtage:silence-tests into 1861e75 on statsmodels:master.

@jseabold (Member) commented Mar 26, 2014

This looks OK to merge to me, provided that the pandas imports are available in our minimum versions. Given that we're bumping this, that should be fine.

@@ -56,11 +59,15 @@ def parallel_func(func, n_jobs, verbose=5):
         import multiprocessing
         n_jobs = multiprocessing.cpu_count()
     except (ImportError, NotImplementedError):
-        print "multiprocessing not installed. Cannot run in parallel."
+        import warnings
+        warnings.warn(module_unavialable_doc.format('multiprocessing'),

@josef-pkt (Member) commented Mar 26, 2014

typo or is it really called module_unavialable_doc

@bashtage (Contributor, Author) commented Mar 26, 2014

It shouldn't have a misspelling - probably a persistent typo.

@@ -1891,7 +1892,8 @@ def test_small_data():
     # in start params regression.
     res = mod.fit(trend="nc", disp=0, start_params=[.1,.1,.1,.1])
     mod = ARIMA(y, (1, 0, 2))
-    res = mod.fit(disp=0, start_params=[.1, .1, .1, .1])
+    with assert_produces_warning(Warning):
+        res = mod.fit(disp=0, start_params=[.1, .1, .1, .1])

@josef-pkt (Member) commented Mar 26, 2014

are we just catching a warning, or are we asserting that a warning is raised?
I thought the former.

@bashtage (Contributor, Author) commented Mar 26, 2014

Both.

pass


module_unavialable_doc = """

@josef-pkt (Member) commented Mar 26, 2014

avialable ?

@@ -388,8 +390,9 @@ def setupClass(cls):
         res2.probit()
         cls.res2 = res2
         fit = Probit(data.endog, data.exog).fit
-        cls.res1 = fit(method="basinhopping", disp=0, niter=5,
-                       minimizer={'method' : 'L-BFGS-B', 'tol' : 1e-8})
+        with assert_produces_warning(RuntimeWarning):

@josef-pkt (Member) commented Mar 26, 2014

catch or assert warning ?
I think it's catch.
with catch_warnings: ....

@josef-pkt (Member) commented Mar 26, 2014

maybe we can leave it as assert, but if the optimizers improve, it might go away.

@jseabold (Member) commented Mar 26, 2014

From what @bashtage is saying (and my recollection from pandas testing), it's a context manager that catches the warning, so you can test that the right warning and message is raised, not just any one. I don't know all the details here.

@josef-pkt (Member) commented Mar 26, 2014

catch and assert are two different things.
If we assert, then it means the warning is intentional. (assert includes catch)
If we catch, then we just silence test noise, but it's not a bug if it doesn't raise a warning (on some machines, or different scipy versions.)

@bashtage (Contributor, Author) commented Mar 26, 2014

This will catch the warning and then assert it is the one expected. Essentially it wraps 2 lines into one - the catch and then the check.
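
(Roughly the two steps it wraps into one, shown as a runnable sketch with a placeholder warning in place of the model call.)

import warnings

with warnings.catch_warnings(record=True) as w:   # 1) catch
    warnings.simplefilter("always")
    warnings.warn("from the code under test", UserWarning)
assert len(w) == 1 and issubclass(w[-1].category, UserWarning)   # 2) check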

@josef-pkt (Member) commented Apr 3, 2014

Asserting that we raise a warning is useful in "real" cases, but in some edge cases they are not raised across all versions and machines. In those cases it's better to just check and assert the final results instead of intermediate warnings.

@bashtage (Contributor, Author) commented Apr 3, 2014

Ideally there would be an assert_raises_if that would allow the raise under known conditions, and otherwise would assert that nothing is raised. In the unit root branch I am skipping these on Python 2.6, since assert_raises depends on unittest, which changed in 2.7.
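
(A sketch of what such a hypothetical helper could look like; assert_raises_if does not exist, and the name and behaviour here are purely illustrative.)

from contextlib import contextmanager

@contextmanager
def assert_raises_if(condition, error_class):
    """Require error_class inside the block when condition holds;
    otherwise run the block and let any exception fail the test normally."""
    if condition:
        try:
            yield
        except error_class:
            return
        raise AssertionError("%s was not raised" % error_class.__name__)
    else:
        yield

# e.g. with assert_raises_if(sys.version_info[:2] == (2, 6), ValueError): ...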

@josef-pkt (Member) commented Apr 3, 2014

assert_raises from numpy should work across all Python versions, doesn't it? We used it for a long time; it should depend on nose, not on Python's unittest.

"Known conditions" can be too messy.
Changes in scipy, different LAPACK/BLAS versions, 64- versus 32-bit, different "exotic" Debian versions. It's already a lot of work chasing "real" test failures.

@bashtage (Contributor, Author) commented Apr 3, 2014

It doesn't work as a context manager on 2.6. It works otherwise. This seems to be because it calls nose which in turn uses unittest. I need the context manager since I am testing a property assignment.

@jseabold (Member) commented Apr 3, 2014

I don't think we should ever use it to test for warnings from other packages. We want to use it to test the warnings that we are explicitly raising. Regardless, I fixed this issue in master.

@josef-pkt (Member) commented Mar 26, 2014

Looks good to me (with comments)

If we run into problems with earlier versions of pandas, then we can fix it once we have decided on what the minimum pandas actually is.

Kevin Sheppard added some commits Mar 24, 2014

Kevin Sheppard
ENH: Silence test output
A simple patch that makes minimal changes to tests that silences almost all output.
Most fixes were simply changing options to not output (e.g. disp=False), and a few
others trapped errors.

A new class was introduced in tools/testing.py that allows general output in a test
to be silenced.  It would be preferred for code to only raise warnings and not simply
print, so that the warning could be captured instead.
Kevin Sheppard
ENH: Converted parallel message to warning
parallel_func was printing a warning, but not raising one. Converted to
a ModuleNotAvailableWarning.

Removed the 'catch all' class that was trapping this test.  Catching all text is
generally a bad idea.
Kevin Sheppard
ENH: Separated warnings from other code into a new file
sm_exceptions.py now holds all custom warnings and errors. ConvergenceWarning
was moved to this new module, and imports were corrected.

@bashtage (Contributor, Author) commented Apr 2, 2014

This one looks ready to me.

Removing the noise makes it clearer what the different levels of testing are on the different versions. It also highlights that some tests need to be re-enabled, since the minimum versions are now always high enough.

BTW, is there a minimum pandas version? 0.11 or 0.12? The tests are all using 0.12 -- 0.11 had some new errors.

@jseabold (Member) commented Apr 3, 2014

I voted that we bump to 0.12.0, though if it's something simple I don't mind fixing it for 0.11.0. It won't be as much of an issue as the compatibility code we had to keep around for the 0.6.0-0.8.0 transition, when the timeseries stuff was in flux.

I'll go ahead and merge this. Thanks.

jseabold added a commit that referenced this pull request Apr 3, 2014

@jseabold merged commit c818e6f into statsmodels:master Apr 3, 2014

1 check passed: the Travis CI build passed

@josef-pkt added the PR label Apr 14, 2014

PierreBdR pushed a commit to PierreBdR/statsmodels that referenced this pull request Sep 2, 2014
