BUG: patsy raises with zero constraint #89

josef-pkt · 2016-07-08T20:12:48Z

Why does patsy raise a ValueError in this case?

It looks like a perfectly valid degenerate case.
We are reusing t_test which uses patsy's constraint for other things that pure constraints.
statsmodels/statsmodels#3096

patsy\constraint.py", line 62:

        if np.any(np.all(self.coefs == 0, axis=1)):
            raise ValueError("can't test a constant constraint")

>>> res.t_test(np.zeros((1, 7)))
Traceback (most recent call last):
  File "<pyshell#9>", line 1, in <module>
    res.t_test(np.zeros((1, 7)))
  File "E:\josef_new_notebook\git\statsmodels_py34_pr\statsmodels\base\model.py", line 1204, in t_test
    LC = DesignInfo(names).linear_constraint(r_matrix)
  File "C:\Users\josef\Downloads\WinPython-64bit-3.4.4.2\python-3.4.4.amd64\lib\site-packages\patsy\design_info.py", line 536, in linear_constraint
    return linear_constraint(constraint_likes, self.column_names)
  File "C:\Users\josef\Downloads\WinPython-64bit-3.4.4.2\python-3.4.4.amd64\lib\site-packages\patsy\constraint.py", line 404, in linear_constraint
    return LinearConstraint(variable_names, coefs)
  File "C:\Users\josef\Downloads\WinPython-64bit-3.4.4.2\python-3.4.4.amd64\lib\site-packages\patsy\constraint.py", line 62, in __init__
    raise ValueError("can't test a constant constraint")
ValueError: can't test a constant constraint

>>> res.t_test(1e-20 * np.ones((1, 7)))
<class 'statsmodels.stats.contrast.ContrastResults'>
                             Test for Constraints                             
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
c0          1.534e-18   5.21e-19      2.945      0.004    4.97e-19    2.57e-18
==============================================================================

(In my opinion patsy should not enforce statistics, but can warn instead.)

The text was updated successfully, but these errors were encountered:

Closes: pydatagh-89

njsmith · 2016-07-09T00:21:06Z

See #90

I don't quite understand what you're trying to do here, but I guess I agree that there's no reason to rule out degenerate constraints like this.

There's another similar case that we also error out on right now: constraint matrices with zero rows. Currently the patch in #90 alters patsy to allow for constraints to have a row with all-zero coefficients, but it still doesn't allow for constraints with no rows at all. But semantically, I guess this would be an "always true" constraint, similar to 0*x1 + 0*x2 = 0, so maybe we should allow both? Do you have any opinion on this?

CC: @kshedden

josef-pkt · 2016-07-09T07:22:24Z

I never ran into the empty, zero rows constraints. But in many cases, the consistent empty array handling is convenient.

In the case of t_test, after commenting out the zero rows check in patsy:

repr and string summary of test results raises an exception, but all the computation look like they work correctly based on np.linalg and dot behavior with empty arrays.

So, from the statsmodels side the zero row exception could also be removed. (we have to fix the string summary to fully support it)

>>> tt = resnp.t_test(np.ones((0, 7)))
>>> import pprint
>>> pprint.pprint(vars(tt))
{'df_denom': 78.0,
 'dist': <scipy.stats._continuous_distns.t_gen object at 0x0000000007D35CF8>,
 'dist_args': (78.0,),
 'distribution': 't',
 'effect': array([], dtype=float64),
 'pvalue': array([], shape=(0, 0), dtype=float64),
 'sd': array([], shape=(0, 0), dtype=float64),
 'statistic': array([], shape=(0, 0), dtype=float64),
 'tvalue': array([], shape=(0, 0), dtype=float64)}

>>> tt = resnp.t_test((np.ones((0, 7)), []))
>>> pprint.pprint(vars(tt))
{'df_denom': 78.0,
 'dist': <scipy.stats._continuous_distns.t_gen object at 0x0000000007D35CF8>,
 'dist_args': (78.0,),
 'distribution': 't',
 'effect': array([], dtype=float64),
 'pvalue': array([], shape=(0, 0), dtype=float64),
 'sd': array([], shape=(0, 0), dtype=float64),
 'statistic': array([], shape=(0, 0), dtype=float64),
 'tvalue': array([], shape=(0, 0), dtype=float64)}

the above is for OLS with just numpy arrays, but it works the same way with formulas.

josef-pkt · 2016-07-09T07:41:37Z

asides:
I was thinking of skipping the patsy detour in t_test with fully specified numpy constraint matrices to avoid any ValueErrors and checks, but decided so far to keep it because it enforces some "required" attributes across models. ("required" = not required for computation but part of minimum common features in internal API)

all zeros constraint, that I ran in before but, AFAIR, wasn't needed in final design for that (constrained estimation):
statsmodels/statsmodels#1704
ValueError if names not defined:
statsmodels/statsmodels#436 (comment)

njsmith added a commit to njsmith/patsy that referenced this issue Jul 9, 2016

Allow degenerate linear constraints

12836b9

Closes: pydatagh-89

njsmith mentioned this issue Jul 9, 2016

Allow degenerate linear constraints #90

Merged

njsmith closed this as completed in #90 Oct 25, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

BUG: patsy raises with zero constraint #89

BUG: patsy raises with zero constraint #89

josef-pkt commented Jul 8, 2016

njsmith commented Jul 9, 2016

josef-pkt commented Jul 9, 2016

josef-pkt commented Jul 9, 2016

BUG: patsy raises with zero constraint #89

BUG: patsy raises with zero constraint #89

Comments

josef-pkt commented Jul 8, 2016

njsmith commented Jul 9, 2016

josef-pkt commented Jul 9, 2016

josef-pkt commented Jul 9, 2016