
BUG: Correct dimension when data removed #6888

Merged · merged 2 commits into statsmodels:master from bashtage:gh-6887 on Jul 17, 2020

Conversation

bashtage
Member

Correct exog dimension when data has been removed

closes #6887

Notes:

  • It is essential that you add a test when making code changes. Tests are not
    needed for doc changes.
  • When adding a new function, test values should usually be verified in another package (e.g., R/SAS/Stata).
  • When fixing a bug, you must add a test that would produce the bug in master and
    then show that it is fixed with the new code.
  • New code additions must be well formatted. Changes should pass flake8. On Linux or macOS, you can
    verify your changes are well formatted by running
    git diff upstream/master -u -- "*.py" | flake8 --diff --isolated
    
    assuming flake8 is installed. This command is also available on Windows
    via the Windows Subsystem for Linux once flake8 is installed in the
    local Linux environment. While passing this test is not required, it is good practice and helps
    improve code quality in statsmodels.
  • Docstring additions must render correctly, including escapes and LaTeX.

Correct exog dimension when data has been removed

closes statsmodels#6887
@coveralls

coveralls commented Jul 17, 2020

Coverage Status

Coverage increased (+0.02%) to 88.043% when pulling 0c2887a on bashtage:gh-6887 into 3614385 on statsmodels:master.

@@ -157,7 +157,8 @@ def get_prediction(self, exog=None, transform=True, weights=None,
         row_labels = None
 
         exog = np.asarray(exog)
-        if exog.ndim == 1 and (self.model.exog.ndim == 1 or
+        if exog.ndim == 1 and (self.model.exog is None or
Member
This doesn't work if exog is None; we need to check whether exog had only one column.

Member Author
The entire remove-data-and-still-predict path is fundamentally broken, since this information isn't available once the data has been removed. I think this probably must raise in this case.

Member
We don't need model.exog for predict; remove_data was designed to allow predict without carrying around all the original data.

Member Author

Except that it is needed to handle edge cases like this. Once it has been removed, the correct decision cannot be inferred here.

Member

As I mentioned on the mailing list, all we need to know is whether there is only one exog variable or more than one.
It would be df_model, except that doesn't count the const. In models without extra params, len(self.params) is sufficient.
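The suggestion above can be sketched in isolation (hypothetical standalone names; the real check lives inside get_prediction):

```python
import numpy as np

# When model.exog has been removed, the number of coefficients in
# res.params still tells us how a 1-d exog should be interpreted.
params = np.array([1.5])             # single-regressor model, no constant
exog = np.asarray([0.0, 1.0, 2.0])   # 1-d exog passed to predict

if exog.ndim == 1 and params.shape[0] == 1:
    # exactly one regressor: treat the 1-d array as n observations
    exog = exog[:, None]

fitted = exog @ params               # shapes (3, 1) @ (1,) -> (3,)
print(fitted)
```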

Member Author

Now using params, since the exog needs to go through np.dot.

Fix prediction by retaining required attributes
Make private attribute private
@bashtage bashtage merged commit 70058d8 into statsmodels:master Jul 17, 2020
@bashtage bashtage deleted the gh-6887 branch July 17, 2020 15:49
@bashtage bashtage added this to the 0.12 milestone Jul 27, 2020
# GH6887
endog = [i + np.random.normal(scale=0.1) for i in range(100)]
exog = [i for i in range(100)]
model = OLS(endog, exog, weights=[1 for _ in range(100)]).fit()
Member

OLS shouldn't have weights
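As the comment notes, a weights argument belongs to WLS, not OLS. A hedged sketch of the distinction (with unit weights the two estimators coincide; values here are illustrative):

```python
import numpy as np
from statsmodels.regression.linear_model import OLS, WLS

rng = np.random.default_rng(0)
exog = np.arange(100.0)
endog = exog + rng.normal(scale=0.1, size=100)

# WLS is the model that takes weights; with all-ones weights it
# reduces to plain OLS, so the two fits agree.
res_wls = WLS(endog, exog, weights=np.ones(100)).fit()
res_ols = OLS(endog, exog).fit()
print(np.allclose(res_wls.params, res_ols.params))
```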

Successfully merging this pull request may close these issues.

FAQ: get_prediction fails with regression models after calling remove_data