missing in extra data (example sandwiches, robust covariances) #1220

Open
josef-pkt opened this Issue Dec 9, 2013 · 5 comments


@josef-pkt
Member

See the comments in #863.

The sandwich calculations require additional arrays like groups or time.

If patsy or statsmodels (with missing='drop') removes rows with missing values, the user's extra arrays no longer match up with the cleaned endog and exog.

Currently the user needs to drop those rows beforehand to get consistent, matching data for all variables (with pandas, that is just a dropna()).
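A minimal sketch of that manual workaround, using only pandas. The column names (`y`, `x`, `firm`) and the data are made up for illustration; the point is that the group labels are cleaned together with the model variables so that all arrays keep the same rows.

```python
import numpy as np
import pandas as pd

# Illustrative data: "firm" holds the cluster labels needed by the sandwich.
df = pd.DataFrame({
    "y": [1.0, 2.0, np.nan, 4.0, 5.0, 6.0],
    "x": [0.5, np.nan, 1.5, 2.0, 2.5, 3.0],
    "firm": [0, 0, 1, 1, 2, 2],
})

# Drop incomplete rows across the model variables AND the extra array at once,
# so that endog, exog and groups all refer to the same observations.
clean = df.dropna(subset=["y", "x", "firm"])

endog = clean["y"]
exog = clean[["x"]]
groups = clean["firm"]

# All three now have the same length and the same index.
assert len(endog) == len(exog) == len(groups)
assert endog.index.equals(groups.index)
```

The aligned `groups` can then be passed along with the cleaned `endog` and `exog`, for example through `cov_kwds={"groups": groups}` when requesting `cov_type="cluster"` in `fit`.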

Instead, we can drop the rows internally when missing='drop' is set and we have the index of the dropped rows.

@josef-pkt josef-pkt referenced this issue Jan 29, 2014
Closed

Gee #1314

@josef-pkt
Member

We should just add a helper function that checks and converts or adjusts extra arrays for missing values, based on the stored information about which rows have been dropped or kept.
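One possible shape for such a helper, as a rough sketch only. The function name `align_extra_array` and the boolean `keep_mask` argument are hypothetical and not existing statsmodels API; the sketch assumes the model has stored a mask of the rows it kept.

```python
import numpy as np


def align_extra_array(extra, keep_mask):
    """Hypothetical helper: trim an extra array to the rows the model kept.

    extra     : array-like with one entry (or row) per observation
    keep_mask : boolean array over the ORIGINAL rows, True where the row
                survived missing='drop' (assumed to be stored by the model)
    """
    extra = np.asarray(extra)
    keep_mask = np.asarray(keep_mask, dtype=bool)
    n_orig = keep_mask.shape[0]
    n_kept = int(keep_mask.sum())

    if extra.shape[0] == n_kept:
        # Already matches the cleaned data (the user dropped rows themselves).
        return extra
    if extra.shape[0] == n_orig:
        # Matches the original data: apply the same row selection.
        return extra[keep_mask]
    raise ValueError(
        "extra array has %d rows, expected %d (original) or %d (after drop)"
        % (extra.shape[0], n_orig, n_kept)
    )


keep = [True, False, True, True]
print(align_extra_array([0, 0, 1, 1], keep))
```

This sketch does not look for missing values inside `extra` itself; that case would need a separate check.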

@josef-pkt josef-pkt added this to the 0.6 milestone Aug 12, 2014
@josef-pkt
Member

Adding this to 0.6; if it is not easy to add, we drop it from the 0.6 milestone again.

@jseabold
Member
jseabold commented Oct 9, 2014

Is there an example that demonstrates the problem somewhere for a test?

@jseabold
Member

Should be closed by #2034. A test case here would be helpful. Re-open if still present.

@jseabold jseabold closed this Oct 15, 2014
@josef-pkt
Member

Reopening.

#2034 fixed the handling of extra arrays in model.__init__, but cov_type and cov_kwds are fit arguments and don't have the missing-value connection yet.
http://stackoverflow.com/questions/33712400/statsmodels-ols-clustered-standard-errors-not-accepting-series-from-df

We can add the check to the cov_type handling for the case where the original data and the cov arrays have matching pandas indices and there are no extra missing values in the cov_kwds Series or DataFrame. In all other cases we don't have enough information to adjust the cov_kwds data and need to raise an exception.
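A rough sketch of what that check could look like. The function name `align_cov_kwd` and its arguments are hypothetical, not existing statsmodels API; it assumes the index of the original data and the index of the rows kept after missing='drop' are both available, and that the index is unique.

```python
import pandas as pd


def align_cov_kwd(extra, original_index, kept_index):
    """Hypothetical check for a cov_kwds entry such as groups or time.

    Returns the rows of `extra` that the model kept, or raises if there is
    not enough information to align the data safely.
    """
    if not isinstance(extra, (pd.Series, pd.DataFrame)):
        # Plain arrays carry no index, so we cannot tell which rows to drop.
        raise ValueError("need a pandas Series or DataFrame to align on index")
    if not extra.index.equals(original_index):
        raise ValueError("index of extra data does not match the original data")

    aligned = extra.loc[kept_index]
    if aligned.isna().to_numpy().any():
        # Extra missing values would require dropping further model rows.
        raise ValueError("extra data has missing values in rows the model kept")
    return aligned


original_index = pd.RangeIndex(5)
kept_index = pd.Index([0, 2, 4])  # rows that survived missing='drop'
groups = pd.Series([1, 1, 2, 2, 3])
print(align_cov_kwd(groups, original_index, kept_index).tolist())
```

Raising in every other case keeps the behavior conservative: the data is adjusted only when the indices make the row correspondence unambiguous.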

@josef-pkt josef-pkt reopened this Nov 15, 2015