
memoize, last argument wins, how to attach sandwich to Results? #276

Closed
josef-pkt opened this issue May 17, 2012 · 4 comments

@josef-pkt

commented May 17, 2012

I don't know how sandwich covariance matrices that require arguments should be attached to the results. This is relevant for cluster/groups and panel/groups.

It would be good to save the covariance for further use in the bse calculation, summary(), and tests.
Users might call cov_cluster with different group definitions.

possibilities

  1. disallow redefining groups: groups is an attribute of data, for example res.set_groups(groups) then it's just a cached property

  2. save last call only: store last result, if user calls cov_cluster again with different groups then we reset the cached version

  3. memoize: cache with argument (groups) as a key.

3) looks too messy, because we wouldn't know which covariance to use for other calculations, so it is out.

1) has the problem that the user can reset the groups. If we have a setter method, then we can force recalculation when the groups are reset (empty the cache). This would have the same behavior as 2) but without an argument to the method (it would be just a cached attribute).
    -> I'm currently leaning towards this, together with set_options(use_cov=xxx) from below
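
A minimal sketch of the setter idea, with groups stored on the results object and the cache emptied whenever they are reset. `ToyResults`, `_cluster_meat` and `n_computed` are made up for illustration and are not statsmodels API; the "covariance" is just a toy scalar (squared residual sums per group).

```python
from collections import defaultdict


class ToyResults:
    """Hypothetical results object; not the statsmodels API."""

    def __init__(self, resid):
        self.resid = list(resid)
        self._groups = None
        self._cache = {}
        self.n_computed = 0  # counts real computations, for illustration only

    def set_groups(self, groups):
        # Resetting the groups empties the cache and so forces recalculation.
        self._groups = list(groups)
        self._cache.clear()

    @property
    def cov_cluster(self):
        if self._groups is None:
            raise ValueError("call set_groups(groups) first")
        if "cov_cluster" not in self._cache:
            self._cache["cov_cluster"] = self._cluster_meat()
        return self._cache["cov_cluster"]

    def _cluster_meat(self):
        # Toy stand-in for the sandwich "meat": squared residual sums per group.
        self.n_computed += 1
        sums = defaultdict(float)
        for g, r in zip(self._groups, self.resid):
            sums[g] += r
        return sum(s * s for s in sums.values())


res = ToyResults([1.0, -2.0, 0.5, 0.5])
res.set_groups(["a", "a", "b", "b"])
first = res.cov_cluster    # computed
second = res.cov_cluster   # served from the cache
res.set_groups(["a", "b", "a", "b"])
third = res.cov_cluster    # recomputed after the groups were reset
```

Because the cached attribute takes no argument, other methods (bse, summary, tests) always know which covariance to use: the one for the current groups.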

related question: arguments everywhere or letting user set a default?

for example

res.set_options(use_cov='hac')
Then in the calls to t_test, f_test and summary we can use cov_hac instead of the standard OLS cov_params.
One possible problem arises when we return only numbers, because then the user has to remember which cov is used. That's not a problem for t_test, f_test and summary, because we can add the type ('hac', 'HC0', ...) to the return. But tvalues, pvalues, ... would depend on the use_cov setting without reminding the user.

alternative: user has to specify which cov should be used as argument in t_test, f_test and summary
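
To make the two alternatives concrete, here is a rough sketch in which a per-instance default and an explicit argument coexist, and the covariance type travels with the test result. Everything here (`ToyResults`, `TTestResult`, the `cov_type` argument, the variance numbers) is hypothetical toy code for a single parameter, not the statsmodels API.

```python
import math
from typing import NamedTuple, Optional


class TTestResult(NamedTuple):
    tvalue: float
    cov_type: str  # the label travels with the number


class ToyResults:
    """Hypothetical single-parameter results object; not the statsmodels API."""

    def __init__(self, param, variances):
        self.param = param
        self._variances = dict(variances)  # e.g. {"nonrobust": ..., "hac": ...}
        self._use_cov = "nonrobust"

    def set_options(self, use_cov):
        if use_cov not in self._variances:
            raise ValueError(f"unknown cov type: {use_cov!r}")
        self._use_cov = use_cov

    def t_test(self, cov_type: Optional[str] = None) -> TTestResult:
        # An explicit argument overrides the instance-wide default.
        name = self._use_cov if cov_type is None else cov_type
        return TTestResult(self.param / math.sqrt(self._variances[name]), name)

    @property
    def tvalues(self):
        # Bare number: depends on use_cov, with no reminder to the user.
        return self.t_test().tvalue


res = ToyResults(param=2.0, variances={"nonrobust": 1.0, "hac": 4.0})
default = res.t_test()
res.set_options(use_cov="hac")
robust = res.t_test()
explicit = res.t_test(cov_type="nonrobust")
```

The labeled return covers the t_test/f_test/summary case; `tvalues` shows the remaining concern, since it silently changes after `set_options`.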


Standard usage will in most cases have fixed groups, so I expect that resetting groups will not happen very often.

@jseabold


commented May 17, 2012

I can't think of a case where I'd want to change the groups. Can you give an example? I was playing with gretl today and I noticed they have a global option for setting HAC covariance everywhere, but they also let you set it when you do the estimation.

@josef-pkt


commented May 17, 2012

The main case I can think of is nested clusters:

firm level time series, do we cluster by firm, industry or geography?
school level time series, do we cluster by school or type of school or district?
cross country time series, do we cluster by country or by development level?

For short panels, time should also be treated as a group (unweighted aggregate instead of Bartlett kernel): in this case time and cross-section would form two groups. It's not clear whether a user wants to use both or only one of them.

Stata requires the vcv to be specified with regress, and uses globals (last estimate) and memoizes for repeated calls to regress with the same model, AFAIU from reading slowly through the User's Guide.
(I don't know how much Stata recalculates if you call regress repeatedly with different vcv options.)

Being able to set the cov options only once (for a result instance) would be convenient; for example, the Stock/Watson textbook uses HC by default unless otherwise specified.

@josef-pkt


commented May 17, 2012

One argument in favor of setting a result option in this case is that most users that use specific robust standard errors will be aware of and used to it, so I guess they won't get confused if

res.set_options(use_cov='hac')

changes many of their later results.

@josef-pkt


commented Sep 20, 2014

I'm closing this. I have settled on creating new instances in the "official" usage.
cov_type is chosen in the fit method, or can be chosen with get_robustcov_results from an already existing instance.
The latter is not yet implemented for models other than OLS because it requires automatic results creation (cloning with adjustments), which is not possible for most models.
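
For readers who want to see the "new instance" idea in isolation, here is a toy sketch under stated assumptions: the results object is immutable, and asking for another covariance type returns an adjusted clone instead of changing state. `ToyResults` and `with_cov` are hypothetical names and the variances are made-up numbers; this is not how statsmodels implements it.

```python
import math
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ToyResults:
    """Hypothetical immutable single-parameter results; not the statsmodels API."""

    param: float
    variances: tuple  # pairs of (cov_type, variance); made-up toy data
    cov_type: str = "nonrobust"

    @property
    def bse(self):
        # Standard error under the covariance type fixed for this instance.
        return math.sqrt(dict(self.variances)[self.cov_type])

    def with_cov(self, cov_type):
        # Hypothetical helper: clone with an adjustment, leave self untouched.
        if cov_type not in dict(self.variances):
            raise ValueError(f"unknown cov type: {cov_type!r}")
        return replace(self, cov_type=cov_type)


res = ToyResults(param=2.0, variances=(("nonrobust", 1.0), ("hac", 4.0)))
robust = res.with_cov("hac")
```

Since each instance carries exactly one covariance type, tvalues, pvalues and similar bare numbers are unambiguous, which sidesteps the "user has to remember which cov is used" concern raised earlier in the thread.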

@josef-pkt josef-pkt closed this Sep 20, 2014
