CLN: Silence test output #1507

Merged
merged 3 commits into statsmodels:master from bashtage:silence-tests on Apr 3, 2014


@bashtage
Contributor

A simple patch that makes minimal changes to the tests to silence almost all output.
Most fixes simply change options so that nothing is printed (e.g. disp=False), and a few
others trap errors.

A new class was introduced in tools/testing.py that allows general output in a test
to be silenced. It would be preferred for code to only raise warnings and not simply
print, so that the warning could be captured instead.
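
Roughly, the idea is a small helper like the sketch below (names are illustrative, not the actual class in tools/testing.py), which swaps sys.stdout for a throwaway buffer:

import sys
from contextlib import contextmanager
try:
    from StringIO import StringIO  # Python 2
except ImportError:
    from io import StringIO        # Python 3

@contextmanager
def silence_output():
    # redirect stdout while the block runs, then restore it
    saved = sys.stdout
    sys.stdout = StringIO()
    try:
        yield
    finally:
        sys.stdout = saved

A test would then wrap the noisy call: with silence_output(): res = mod.fit(...).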

@jseabold
Member

Thanks.

Note to self. For the ARIMA models, it looks like disp no longer maps to iprint of l_bfgs_b, so these can now be 0 instead of -1, so that's fine.

I think the l_bfgs_b warning that you silence will be fixed by this, which I should have just pulled out. jseabold@f0944d4

There definitely shouldn't be any print statements anywhere. This shouldn't have been merged like that, so you can just delete them rather than commenting them out as far as I'm concerned.

For the redirection class, is the idea that there are stray print statements so you can't just use the catch_warnings context manager in the warnings package? If so, I'd prefer to just get rid of the print statements.
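
(For reference, the stdlib pattern I mean; mod is just a placeholder for whatever the test builds:)

import warnings

with warnings.catch_warnings():
    warnings.simplefilter('ignore')  # suppress any warnings raised in the block
    res = mod.fit(disp=0)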

@bashtage
Contributor

The print statements are in the parallel code - they essentially act as warnings that joblib is not installed. It seems reasonable to inform a user of the parallel code that they should get joblib, so converting them to real warnings would probably be most reasonable.

There are loads of 2.x prints in the examples. They cause a ton of parsing errors on the 3.x build but obviously aren't failures. I suppose those should probably be replaced with

from __future__ import print_function
print('blah blah')

Other general clean-ups are now possible since Python 2.6+ is the minimum, e.g. using '{0}'.format(a) instead of '%s' % a.

@jseabold
Member

I'd change the print statement to a warning in the parallel code.

Yeah, we have an open issue on moving to one code base for Python 2.x - 3.x. I don't know if it's possible, trivial, or worth it to point the 2to3 at the examples rather than just cleaning them up. Using print_function is fine by me though.

@jseabold
Member

I think we agreed Python 2.6 is now the minimum target. Many projects have moved to this. I'm trying to always use the newer string formatting, but I've been very inconsistent with this in the past.

@josef-pkt
Member

I haven't started to use string format yet. It looks a bit annoying to have to add the position integer in Python 2.6, and to change all of them if we insert a new value.

@bashtage
Contributor

You can use named arguments, which are often clearer in intention

'The {animal} is in the {location}'.format(animal='bear', location='woods')

Then you can add new values at will. You can also insert them in any order:

'The {0} is in the {1}'.format('bear', 'woods')

can then become

'The {0} and the {2} are in the {1}'.format('bear', 'woods', 'bobcat')
@josef-pkt
Member

When we run 2to3 on the examples, almost the only changes are the print statements.
I use the examples after 2to3 on python 3 to try out things.

It would be very good to have the documentation examples compatible with both py2 and py3.

@bashtage
Contributor

I think at this point the policy for new code, and the goal for existing code, is that it works on both.

@josef-pkt
Member

I agree for text (where I also use dictionaries) but

>>> '%s = %6.2f' % ('phi', 3.5)
'phi =   3.50'

>>> '{0} = {1:6.2f}'.format('phi', 3.5)
'phi =   3.50'
@bashtage
Contributor

Those look pretty much the same to me. 1 extra keystroke on the first, and 3 on the second. The {} notation is much clearer for spotting the location of the formatted text in a longer string, IMO.

@bashtage
Contributor

And as long as I'm listing things that should be different, it would be better if warnings could be specific, rather than just raising a generic Warning.

@jseabold
Member

Feel free to add to statsmodels/tools/sm_exceptions.py. There are others scattered throughout the code, we're just not very systematic ATM.
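
For example, a sketch of the pattern (ConvergenceWarning lives in sm_exceptions after this PR; the raising code below is illustrative):

# statsmodels/tools/sm_exceptions.py
class ConvergenceWarning(UserWarning):
    pass

# somewhere in model code (illustrative)
import warnings
from statsmodels.tools.sm_exceptions import ConvergenceWarning

if not converged:  # hypothetical flag from the optimizer
    warnings.warn("Optimization failed to converge.", ConvergenceWarning)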

@josef-pkt
Member

Using % is mostly habit, but without named placeholders I have to reindex the terms
(which is at least annoying in examples with interactive use)

>>> '%s_%s = %6.2f' % ('rlm', 'phi', 3.5)
'rlm_phi =   3.50'

>>> '{0}_{1} = {2:6.2f}'.format('rlm', 'phi', 3.5)
'rlm_phi =   3.50'
@bashtage
Contributor

Would probably be best to have sm_warnings as well as sm_exceptions. I do think centrally collecting them is a good idea so that common structure can be extracted.

@josef-pkt
Member

I think the module sm_exceptions also has the warnings. I don't think the module will get large enough to split.
GEE uses ConvergenceWarning

I don't know if we ever checked that UserWarnings always raise.

@bashtage
Contributor

When a Warning is raised in a test, it should be caught using pandas.util.testing.assert_produces_warning so it doesn't pollute the build.
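
A sketch of the usage, assuming the pandas helper keeps its current form (mod is a placeholder):

from pandas.util.testing import assert_produces_warning
from statsmodels.tools.sm_exceptions import ConvergenceWarning

with assert_produces_warning(ConvergenceWarning):
    res = mod.fit(disp=0)  # fails if no ConvergenceWarning is emitted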

@josef-pkt
Member

I think we are still using numpy for most testing utility functions
(assert_raises, assert_warns) and context managers to catch warnings.
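
For comparison, the numpy utilities are plain function calls rather than context managers (the model and the failing arguments below are placeholders):

from numpy.testing import assert_warns, assert_raises

assert_warns(ConvergenceWarning, mod.fit, disp=0)
assert_raises(ValueError, mod.fit, method='no-such-optimizer')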

@bashtage
Contributor

assert_produces_warning is a context manager that also checks for a specific warning. As far as I can tell it is simpler to use than assert_warns in complex code.

@coveralls

Coverage Status

Coverage remained the same when pulling 6cd81dd on bashtage:silence-tests into 1861e75 on statsmodels:master.

@jseabold
Member

This looks ok to merge to me, provided that the pandas imports are available in our minimum version. Given that we're bumping this, that should be fine.

@josef-pkt josef-pkt and 1 other commented on an outdated diff Mar 26, 2014
statsmodels/tools/parallel.py
@@ -56,11 +59,15 @@ def parallel_func(func, n_jobs, verbose=5):
import multiprocessing
n_jobs = multiprocessing.cpu_count()
except (ImportError, NotImplementedError):
- print "multiprocessing not installed. Cannot run in parallel."
+ import warnings
+ warnings.warn(module_unavialable_doc.format('multiprocessing'),
@josef-pkt
josef-pkt Mar 26, 2014 Member

typo or is it really called module_unavialable_doc

@bashtage
bashtage Mar 26, 2014 Contributor

It shouldn't be misspelled - probably a persistent typo.

@josef-pkt josef-pkt commented on the diff Mar 26, 2014
statsmodels/tsa/tests/test_arima.py
@@ -1891,7 +1892,8 @@ def test_small_data():
# in start params regression.
res = mod.fit(trend="nc", disp=0, start_params=[.1,.1,.1,.1])
mod = ARIMA(y, (1, 0, 2))
- res = mod.fit(disp=0, start_params=[.1, .1, .1, .1])
+ with assert_produces_warning(Warning):
+ res = mod.fit(disp=0, start_params=[.1, .1, .1, .1])
@josef-pkt
josef-pkt Mar 26, 2014 Member

are we just catching a warning, or are we asserting that a warning is raised?
I thought the former.

@bashtage
bashtage Mar 26, 2014 Contributor

Both.

@josef-pkt josef-pkt commented on an outdated diff Mar 26, 2014
statsmodels/tools/sm_exceptions.py
class PerfectSeparationError(Exception):
pass
+
+# Warning
+class ModuleUnavailableWarning(Warning):
+ pass
+
+
+module_unavialable_doc = """
@josef-pkt
josef-pkt Mar 26, 2014 Member

avialable ?

@josef-pkt josef-pkt commented on the diff Mar 26, 2014
statsmodels/discrete/tests/test_discrete.py
@@ -388,8 +390,9 @@ def setupClass(cls):
res2.probit()
cls.res2 = res2
fit = Probit(data.endog, data.exog).fit
- cls.res1 = fit(method="basinhopping", disp=0, niter=5,
- minimizer={'method' : 'L-BFGS-B', 'tol' : 1e-8})
+ with assert_produces_warning(RuntimeWarning):
@josef-pkt
josef-pkt Mar 26, 2014 Member

catch or assert warning ?
I think it's catch.
with catch_warnings: ....

@josef-pkt
josef-pkt Mar 26, 2014 Member

maybe we can leave it as assert, but if the optimizers improve, it might go away.

@jseabold
jseabold Mar 26, 2014 Member

From what @bashtage is saying (and my recollection from pandas testing), it's a context manager that catches the warning, so you can test that the right warning and message are raised, not just any warning. I don't know all the details here.

@josef-pkt
josef-pkt Mar 26, 2014 Member

catch and assert are two different things.
If we assert, then it means the warning is intentional. (assert includes catch)
If we catch, then we just silence test noise, but it's not a bug if it doesn't raise a warning (on some machines, or different scipy versions.)
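
The two patterns side by side (a sketch; mod and its arguments are placeholders):

# catch only: silence the noise; passes whether or not a warning appears
import warnings
with warnings.catch_warnings():
    warnings.simplefilter('ignore')
    res = mod.fit(disp=0)

# assert: fails unless the expected warning is actually raised
with assert_produces_warning(RuntimeWarning):
    res = mod.fit(disp=0)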

@bashtage
bashtage Mar 26, 2014 Contributor

This will catch the warning and then assert it is the one expected. Essentially it wraps 2 lines into one - the catch and then the check.

@bashtage
bashtage Mar 26, 2014 Contributor

I am assuming that a warning produced is intentional, and that unintentional warnings should not be produced in tests

@jseabold
jseabold Apr 3, 2014 Member

Ah, we don't need this. I didn't understand the issue at first.

@jseabold
jseabold Apr 3, 2014 Member

This is fixed in #1127 I'm pretty sure. Specifically jseabold@f0944d4

We'll merge that soon, so we can take out this assert.

@bashtage
bashtage Apr 3, 2014 Contributor

This is one good reason to catch warnings -- it makes it explicit that they should exist and if something changes that removes the need for the warning, then it is a conscious decision.

@josef-pkt
josef-pkt Apr 3, 2014 Member

Asserting that we raise a warning is useful in "real" cases, but in some edge cases they are not raised across all versions and machines. In those cases it's better to just check and assert the final results instead of intermediate warnings.

@bashtage
bashtage Apr 3, 2014 Contributor

Ideally there would be an 'assert_raises_if' that would allow the raise under known conditions, and otherwise assert that nothing is raised. In the unit root branch I am skipping these on Python 2.6, since assert_raises depends on unittest, which changed in 2.7.
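
No such helper exists in numpy or pandas as far as I know; a hypothetical sketch of the idea, written for warnings since that is the case under discussion:

import warnings
from contextlib import contextmanager

@contextmanager
def assert_warns_if(condition, warning_class):
    # hypothetical helper: require the warning only when `condition` holds,
    # otherwise require that nothing is raised at all
    with warnings.catch_warnings(record=True) as log:
        warnings.simplefilter('always')
        yield
    raised = any(issubclass(w.category, warning_class) for w in log)
    if condition:
        assert raised, 'expected %s was not raised' % warning_class.__name__
    else:
        assert not log, 'unexpected warnings: %r' % [str(w.message) for w in log]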

@josef-pkt
josef-pkt Apr 3, 2014 Member

assert_raises from numpy should work across all Python versions, doesn't it? We used it for a long time; it should depend on nose, not on python's unittest.

"Known conditions" can be too messy. Changes in scipy, different LAPACK/BLAS versions, 64- versus 32-bit, different "exotic" Debian versions. It's already a lot of work chasing "real" test failures.

@bashtage
bashtage Apr 3, 2014 Contributor

It doesn't work as a context manager on 2.6. It works otherwise. This seems to be because it calls nose which in turn uses unittest. I need the context manager since I am testing a property assignment.
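
For context, the property case needs the context-manager form because an assignment is a statement, not a callable (the attribute and arguments below are hypothetical):

from numpy.testing import assert_raises

# the function form works when the failing call is a plain callable
assert_raises(ValueError, mod.fit, method='no-such-solver')

# a property assignment needs the context-manager form (2.7+ only)
with assert_raises(ValueError):
    res.trend = 'not-a-valid-trend'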


@jseabold
jseabold Apr 3, 2014 Member

I don't think we should ever use it to test for warnings from other packages. We want to use it to test the warnings that we are explicitly raising. Regardless, I fixed this issue in master.

@josef-pkt
Member

Looks good to me (with comments)

If we run into problems with earlier versions of pandas, then we can fix it once we have decided on what the minimum pandas actually is.

Kevin Sheppard added some commits Mar 24, 2014
Kevin Sheppard ENH: Silence test output
A simple patch that makes minimal changes to tests that silences almost all output.
Most fixes were simply changing options to not output (e.g. disp=False), and a few
others trapped errors.

A new class was introduced in tools/testing.py that allows general output in a test
to be silenced.  It would be preferred for code to only raise warnings and not simply
print, so that the warning could be captured instead.
5381b4c
Kevin Sheppard ENH: Converted parallel message to warning
The parallel function was printing a warning message, but not raising one.  Converted to
a ModuleUnavailableWarning.

Removed the 'catch all' class that was trapping this test's output.  Catching all output is
generally a bad idea.
fb40f14
Kevin Sheppard ENH: Separated warnings from other code into a new file
sm_exceptions.py now holds all custom warnings and errors. ConvergenceWarning
was moved to this new module, and imports were corrected.
07d725b
@bashtage
Contributor
bashtage commented Apr 2, 2014

This one looks ready to me.

Removing the noise makes it clearer what level of testing happens on the different versions. It also highlights that some tests need to be re-enabled, since the minimum versions are now always high enough.

BTW, is there a minimum pandas? 0.11 or 0.12? The tests are all using 0.12 -- 0.11 had some new errors.

@jseabold
Member
jseabold commented Apr 3, 2014

I voted that we bump to 0.12.0, though if it's something simple I don't mind fixing it for 0.11.0. It won't be as much of an issue as the compatibility code we had to keep around for 0.6.0-0.8.0 transition when the timeseries stuff was in flux.

I'll go ahead and merge this. Thanks.

@jseabold jseabold merged commit c818e6f into statsmodels:master Apr 3, 2014

1 check passed: The Travis CI build passed
@josef-pkt josef-pkt added the PR label Apr 14, 2014