missing in extra data (example sandwiches, robust covariances) #1220
see comments #863
The sandwich calculations require additional arrays like groups or time.
If patsy or statsmodels
Currently the user needs to drop rows to get consistent matching data for all variables. (it's just a
Instead we can drop internally, if
#2034 fixed the handling of extra arrays in
We can add the check to the cov_type handling for the case where the original and the cov arrays have matching pandas indices and there are no extra missing values in the cov_kwds Series or DataFrame. In all other cases we don't have enough information to adjust the cov_kwds data and need to raise an exception.
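A rough sketch of what such a check could look like, using only pandas. The helper name `align_cov_data` and its arguments are hypothetical and not part of statsmodels; it assumes we know the index of the original data and the index of the rows that survived missing-value handling.

```python
import pandas as pd


def align_cov_data(cov_data, original_index, kept_index):
    """Hypothetical helper, not a statsmodels function.

    Restrict cov_data to the rows the model kept, or raise if we
    cannot tell which rows those are.
    """
    # Plain arrays carry no labels, so we cannot know which rows to drop.
    if not isinstance(cov_data, (pd.Series, pd.DataFrame)):
        raise ValueError("cov data has no pandas index, cannot align")

    # The cov data must be labelled exactly like the original data.
    if not cov_data.index.equals(original_index):
        raise ValueError("cov data index does not match the original data")

    # Keep only the rows that survived missing-value handling.
    aligned = cov_data.loc[kept_index]

    # Extra missing values in the cov data itself cannot be handled here.
    if aligned.isna().to_numpy().any():
        raise ValueError("cov data has missing values in rows used by the model")

    return aligned
```

With a frame `df` and `kept = df.dropna().index`, calling `align_cov_data(df["g"], df.index, kept)` would return the group labels for the retained rows only, while a bare NumPy array or a Series with additional NaNs would raise.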
@jbrockmendel No, I think not.
However, the problem is that we now get additional arrays or pandas equivalents in the fit cov_kwds for panel and cluster cov_types. Those are currently not checked for whether the original data had rows removed because of missing-value handling. Those arrays also never go through
I moved it to 0.10 because it will be a bit messy and users can work around it by calling dropna() on their initial DataFrame. (I still think users are better off calling dropna themselves than pushing it into the model.)
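For reference, a minimal sketch of that workaround with cluster-robust standard errors. The data and the column names (`y`, `x`, `firm`) are made up for illustration; the point is only that the model data and the groups come from the same cleaned frame.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Made-up example data: 4 clusters of 3 observations, one missing regressor.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "y": rng.normal(size=12),
    "x": rng.normal(size=12),
    "firm": np.repeat([0, 1, 2, 3], 3),
})
df.loc[2, "x"] = np.nan

# Drop incomplete rows once, up front, over every column that is used,
# including the one that only enters through cov_kwds.
clean = df.dropna(subset=["y", "x", "firm"])

# Model data and groups now have the same rows; pass groups as a plain array.
res = smf.ols("y ~ x", data=clean).fit(
    cov_type="cluster",
    cov_kwds={"groups": clean["firm"].to_numpy()},
)
print(res.bse)
```

Including the groups column in `subset` matters: a missing value there would not be caught by the formula handling, since the column is not part of the formula.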