See the comments in #863.
The sandwich calculations require additional arrays like groups or time.
If patsy or statsmodels missing='drop' removes the rows with missing values, then the user's extra arrays no longer match up with the cleaned endog and exog.
Currently the user needs to drop those rows beforehand to get consistent, matching data for all variables (it's just a dropna() when using pandas).
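The current workaround could look roughly like the sketch below. The column names (y, x, firm) and the data are made up for illustration; the point is only that one dropna() call over all model variables keeps endog, exog and the extra array aligned.

```python
import numpy as np
import pandas as pd

# Illustrative data only; "firm" stands in for the groups array used by
# a cluster-robust (sandwich) covariance.
df = pd.DataFrame({
    "y": [1.0, 2.0, np.nan, 4.0, 5.0, 6.0],
    "x": [0.5, np.nan, 1.5, 2.0, 2.5, 3.0],
    "firm": [1, 1, 2, 2, 3, 3],
})

# Drop every row that has a missing value in any model variable, once,
# so that all columns lose the same rows.
clean = df.dropna(subset=["y", "x"])

endog = clean["y"]
exog = clean[["x"]]
groups = clean["firm"]  # same length and index as endog and exog
```

After this, groups can be passed along with endog and exog without a length mismatch.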
Instead, we can drop the rows internally if missing='drop' and we have the index of the dropped rows.
We should just add a helper function that checks and converts or adjusts extra arrays for missing values, based on the stored information about which rows have been dropped or kept.
Adding this to the 0.6 milestone; if it is not easy to add, we drop it from the 0.6 milestone again.
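A minimal sketch of what such a helper might look like, assuming the information about kept rows is available as a boolean mask over the original rows. The name adjust_extra_array and the mask argument are hypothetical and not part of the statsmodels API.

```python
import numpy as np


def adjust_extra_array(extra, kept_mask):
    """Hypothetical helper: align an extra array with the cleaned data.

    extra     : array-like, e.g. groups or time
    kept_mask : boolean array over the ORIGINAL rows, True where the row
                survived missing='drop' (assumed to be stored somewhere)
    """
    extra = np.asarray(extra)
    kept_mask = np.asarray(kept_mask, dtype=bool)
    nobs_orig = kept_mask.shape[0]
    nobs_kept = int(kept_mask.sum())

    if extra.shape[0] == nobs_orig:
        # Same length as the original data: drop the same rows.
        return extra[kept_mask]
    if extra.shape[0] == nobs_kept:
        # Assumption: the user already dropped the rows themselves.
        return extra
    raise ValueError(
        "extra array has %d rows, expected %d (original) or %d (cleaned)"
        % (extra.shape[0], nobs_orig, nobs_kept)
    )
```

For example, with kept_mask = [True, False, True, True], a groups array [1, 1, 2, 2] would be reduced to [1, 2, 2].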
Is there an example that demonstrates the problem somewhere for a test?
Should be closed by #2034. A test case here would be helpful. Re-open if still present.
#2034 fixed the handling of extra arrays in model.__init__, but cov_type, cov_kwds are fit arguments, and don't have the missing value connection yet.
We can add the check to the cov_type handling for the case where the original data and the cov arrays have matching pandas indices, and there are no extra missing values in the cov_kwds Series or DataFrame. In all other cases we don't have enough information to adjust the cov_kwds data and need to raise an exception.
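A rough sketch of that check, using only pandas. The function name align_cov_array and its arguments are hypothetical; it assumes the original index and the index of the kept rows are both available, and additionally assumes the original index is unique so that label-based selection is unambiguous.

```python
import pandas as pd


def align_cov_array(extra, orig_index, kept_index):
    """Hypothetical check for a cov_kwds array such as groups.

    Subsets `extra` to the kept rows if its index matches the original
    data; raises in every case where the alignment is not determined.
    """
    if not isinstance(extra, (pd.Series, pd.DataFrame)):
        raise ValueError("cannot align: extra array has no pandas index")
    if not orig_index.is_unique:
        # Assumption of this sketch, not stated in the thread.
        raise ValueError("cannot align: original index is not unique")
    if not extra.index.equals(orig_index):
        raise ValueError("cannot align: index differs from original data")

    aligned = extra.loc[kept_index]
    # Missing values in rows that were NOT dropped are "extra" missing
    # values that we cannot handle here.
    if aligned.isna().to_numpy().any():
        raise ValueError("extra array has additional missing values")
    return aligned


orig_index = pd.RangeIndex(5)
kept_index = pd.Index([0, 2, 4])  # rows 1 and 3 were dropped
groups = pd.Series([1, 1, 2, 2, 3], index=orig_index)
aligned_groups = align_cov_array(groups, orig_index, kept_index)
```

A plain numpy array, a mismatched index, or a NaN in a kept row all raise a ValueError in this sketch, which corresponds to the "not enough information" case.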