FAQ: get_prediction fails with regression models after calling remove_data #6887
Correct exog dimension when data has been removed closes statsmodels#6887
Not sure what the right behavior here is. It happens to work after you remove data because you call ... However, if you run
import numpy as np
import statsmodels.api as sm
# toy data
endog = [i + np.random.normal(scale=0.1) for i in range(100)]
exog = [i for i in range(100)]
# fit
model = sm.OLS(endog, exog, weights=[1 for _ in range(100)]).fit()
model.remove_data()
# Broken now
model.get_prediction(1).predicted_mean
model.get_prediction([[1]]).predicted_mean
BUG: Correct dimension when data removed
This is not supposed to work. get_prediction adds inferential statistics. predicted_mean itself does not use cov_params, but it is just a call to results.predict, which the user can call instead. AFAICS, the failure is caused by a missing attribute. The fix for this was to keep wresid, wexog, and wendog just to compute scale, which defeats the purpose of remove_data.
Updated: if model.scale is called before remove_data, then wendog and wexog are not needed for get_prediction([1]).predicted_mean after remove_data.
reopen as FAQ
Edit
get_prediction computes inferential statistics and will only work after remove_data if inferential attributes, specifically scale in this case, have been cached before data is removed. If cached attributes are accessed, e.g. by summary(), then they will be in the cache and still be available after remove_data.
See comment #6887 (comment) below.
Describe the bug
The conditional expression at L160 in v0.11 regression._prediction.py raises an AttributeError for a simple model (exog dim=1) if remove_data has been called on the underlying model.
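The failure mode is that the shape check consults the model's stored exog, which is None once the data has been removed. One way to avoid that dependency is to infer the orientation of a 1-D input from the number of fitted parameters. The helper below is hypothetical (as_2d_exog and n_params are names introduced here for illustration) and is not claimed to be the change made in statsmodels.

```python
import numpy as np


def as_2d_exog(exog, n_params):
    """Hypothetical helper: reshape new exog without touching model.exog."""
    exog = np.asarray(exog, dtype=float)
    if exog.ndim == 1:
        if n_params == 1:
            # One regressor: each element is a separate observation.
            exog = exog[:, None]
        else:
            # Several regressors: the vector is a single observation.
            exog = exog[None, :]
    # Scalars (ndim == 0) become a single 1x1 observation here.
    return np.atleast_2d(exog)


print(as_2d_exog(1, n_params=1).shape)          # (1, 1)
print(as_2d_exog([1, 2, 3], n_params=1).shape)  # (3, 1)
print(as_2d_exog([1, 2, 3], n_params=3).shape)  # (1, 3)
```

The design choice is that the parameter vector survives remove_data, so its length is a safe source of shape information where the stored design matrix is not.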
Code Sample, a copy-pastable example if possible
Expected Output
AttributeError: 'NoneType' object has no attribute 'ndim'
Output of import statsmodels.api as sm; sm.show_versions()
INSTALLED VERSIONS
Python: 3.6.10.final.0
OS: Linux 3.10.0-862.14.4.el7.YAHOO.20180927.19.x86_64 #1 SMP Thu Sep 27 18:19:45 UTC 2018 x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
statsmodels
Installed: 0.11.0 (/grid/6/tmp/yarn-local/usercache/rrymer/appcache/application_1594696455454_937110/container_e24_1594696455454_937110_01_000002/python36-dependencies.zip/statsmodels)
Required Dependencies
cython: Not installed
numpy: 1.16.4 (/grid/6/tmp/yarn-local/usercache/rrymer/appcache/application_1594696455454_937110/container_e24_1594696455454_937110_01_000002/python36-dependencies.zip/numpy)
scipy: 1.5.0 (/opt/python/lib/python3.6/site-packages/scipy)
pandas: 1.0.0 (/grid/6/tmp/yarn-local/usercache/rrymer/appcache/application_1594696455454_937110/container_e24_1594696455454_937110_01_000002/python36-dependencies.zip/pandas)
dateutil: 2.8.1 (/grid/6/tmp/yarn-local/usercache/rrymer/appcache/application_1594696455454_937110/container_e24_1594696455454_937110_01_000002/jup3.zip/dateutil)
patsy: 0.5.1 (/grid/6/tmp/yarn-local/usercache/rrymer/appcache/application_1594696455454_937110/container_e24_1594696455454_937110_01_000002/python36-dependencies.zip/patsy)
Optional Dependencies
matplotlib: 3.2.2 (/opt/python/lib/python3.6/site-packages/matplotlib)
backend: module://ipykernel.pylab.backend_inline
cvxopt: Not installed
joblib: 0.15.1 (/opt/python/lib/python3.6/site-packages/joblib)
Developer Tools
IPython: 6.5.0 (/grid/6/tmp/yarn-local/usercache/rrymer/appcache/application_1594696455454_937110/container_e24_1594696455454_937110_01_000002/jup3.zip/IPython)