ENH: Add wrapper for MixedLM #2008

Merged
merged 15 commits into from Sep 26, 2014

Projects

None yet

4 participants

@jseabold
Member

Addresses at least partially #1990 and #1992.

@kshedden Do you want to have a look at the changes? Maybe just try updating the notebooks and see if the new names make sense and also bang on the naming with the wrappers (pandas-in and pandas-out for all attributes) and the summary table. I wasn't totally sure what was going on all the time, but I think I kept all your intentions.

This still won't handle missing values correctly AFAICT, and some of the stuff in __init__ and from_formula is redundant with stuff that already does happen and some of the stuff that should happen up the inheritance chain. It would also be nice to see if this can be vectorized/take advantage of the existing grouping functionality, but I'll likely leave these for later.

Last thing I want to change in this. likeval vs llf. It looks like there might be a case where they could be different? Should we have llf and penalized_llf? What's the pattern in the other models with regularization now?

@coveralls

Coverage Status

Coverage increased (+0.0%) when pulling f094073 on jseabold:mixed-lm-wrapper into a4e4eb8 on statsmodels:master.

@coveralls

Coverage Status

Coverage increased (+0.0%) when pulling 818e90c on jseabold:mixed-lm-wrapper into a4e4eb8 on statsmodels:master.

@josef-pkt josef-pkt commented on an outdated diff Sep 26, 2014
statsmodels/regression/mixed_linear_model.py
@@ -1726,7 +1749,8 @@ def __init__(self, model, params, cov_params):
super(MixedLMResults, self).__init__(model, params,
normalized_cov_params=cov_params)
- def bse_fe(self):
+ @cache_readonly
+ def standard_errors_fe(self):
@josef-pkt
josef-pkt Sep 26, 2014 Member

for consistency with current models, we should stick with bse
(don't start changing naming convention in one model only.)

@kshedden
Contributor

I looked over it, not very carefully though. It looks good to me. I will
leave these sorts of decisions up to you two.

Regarding standard_errors and bse, can we retain both for now, then
deprecate bse when the time comes.

I believe we are currently using bse in mice because it is more generically
available.

On Fri, Sep 26, 2014 at 12:04 AM, Skipper Seabold notifications@github.com
wrote:

Addresses at least partially #1990
#1990 and #1992
#1992.

@kshedden https://github.com/kshedden Do you want to have a look at the
changes? Maybe just try updating the notebooks and see if the new names
make sense and also bang on the naming with the wrappers (pandas-in and
pandas-out for all attributes) and the summary table. I wasn't totally sure
what was going on all the time, but I think I kept all your intentions.

This still won't handle missing values correctly AFAICT, and some of the
stuff in init and from_formula is redundant with stuff that already
does happen and some of the stuff that should happen up the inheritance
chain. It would also be nice to see if this can be vectorized/take
advantage of the existing grouping functionality, but I'll likely leave
these for later.

Last thing I want to change in this. likeval vs llf. It looks like there
might be a case where they could be different? Should we have llf and
penalized_llf? What's the pattern in the other models with regularization

now?

You can merge this Pull Request by running

git pull https://github.com/jseabold/statsmodels mixed-lm-wrapper

Or view, comment on, or merge it at:

#2008
Commit Summary

  • FIXUP/WIP: Refactor to use generic data handling
  • ENH: Allow more general param_names in data handling
  • ENH: Allow attachment of generic names.
  • ENH: Refactor to allow wrappers to work.
  • ENH: Handle the generic 2d case.
  • ENH: Wrap 2d attributes
  • REF: Return DataFrame instead of dict.
  • TST: Fix tests for refactor
  • STY: Rename attributes

File Changes

Patch Links:


Reply to this email directly or view it on GitHub
#2008.

@josef-pkt josef-pkt commented on the diff Sep 26, 2014
statsmodels/regression/mixed_linear_model.py
@@ -2013,3 +2034,16 @@ def profile_re(self, re_ix, num_low=5, dist_low=1., num_high=5,
model.exog_re_li = exog_re_li_save
return likev
+
+
+class MixedLMResultsWrapper(base.LikelihoodResultsWrapper):
+ _attrs = {'bse_re': ('generic_columns', 'exog_re_names'),
+ 'bse_fe': ('generic_columns', 'xnames'),
@josef-pkt
josef-pkt Sep 26, 2014 Member

see comment above where you have renamed bse

@josef-pkt josef-pkt commented on the diff Sep 26, 2014
statsmodels/base/data.py
@@ -251,6 +253,15 @@ def xnames(self):
return list(xnames)
return None
+ @property
+ def param_names(self):
@josef-pkt
josef-pkt Sep 26, 2014 Member

I think this is missing an s to match the name params, i.e. params_names

@josef-pkt
Member

Medium careful read.

It looks good overall, we need the params_names. However, my guess is that they need to be wired into the generic framework of inherited methods in the results.

currently exog_names, xnames needs to match the length of params

Does t_test work?
MixedLM might still have similar problems as GEE had

@jseabold jseabold changed the title from ENH: Add wraper for MixedLM to ENH: Add wrapper for MixedLM Sep 26, 2014
@jseabold
Member

I think it has the same problems as GEE re: grouping and missing.

@jseabold
Member

I want to deprecate bse in 0.6, but I'm hesitant because I don't want people to have warnings showing on such prominent code for 18 months.

@jseabold
Member

t_test does not work, but then again it never worked because df_resid isn't defined.

@josef-pkt
Member

No way we change bse in 0.6. If we change it we have to change it at the beginning of a release cycle.

consistency is more important than fancier names.

If you want to rename something as fundamental as bse, then it needs discussion.
standard_error is not a good name. My guess would be something like se_params in analogy to cov_params.

@josef-pkt
Member

You could merge this (minus the (re)naming)

It will get us partway towards params_names but it will take more refactoring, which we don't want to do before 0.6 is out. (We are patching exog_names in many places to make patsy not barf in t_test.)

@coveralls

Coverage Status

Coverage increased (+0.0%) when pulling c42b2f8 on jseabold:mixed-lm-wrapper into a4e4eb8 on statsmodels:master.

@jseabold jseabold added this to the 0.6 milestone Sep 26, 2014
@jseabold jseabold merged commit 1a08319 into statsmodels:master Sep 26, 2014
@jseabold jseabold deleted the jseabold:mixed-lm-wrapper branch Sep 26, 2014
This was referenced Sep 26, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment