You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
There is a bug in class Hampel within statsmodels/robust/norms.py in defining piecewise function over z in Hampel.rho method.
It results in incorrect values for the Hampel.rho method outcomes.
The rho method uses _subset method to calculate boolean indicator arrays for t1, t2, and t3 pieces, with the last piece t4 defining the whole range needed for the Hampel's $\rho$ function being calculated in the rho method. This last piece is incorrect in the calculation of the boolean indicator array, I show in the code below.
Note that I can't reproduce the issue perfectly, since Hampel.rho does not rely on Hampel._subset entirely in defining piecewise boolean indicator arrays, as opposed to Hampel.psi or Hampel.weights.
Code Sample, a copy-pastable example if possible
importstatsmodels.apiassm# create hampel norm with default parameters a, b, champel=sm.robust.norms.Hampel(a=2., b=4., c=8.)
# create specially crafted t1, t2, t3t1, t2, t3=hampel._subset([1, 3, 5, 9])
# calculate t4 as in `Hampel.rho` (line 639 calculates that implicitly, without assigning variable `t4`)t4= (1-t1+t2+t3)
# t4 gives `array([0, 2, 2, 1])`
Note: As you can see, there are many issues on our GitHub tracker, so it is very possible that your issue has been posted before. Please check first before submitting so that we do not have to handle and close duplicates.
Note: Please be sure you are using the latest released version of statsmodels, or a recent build of main. If your problem has been fixed in an unreleased version, you might be able to use main until a new release occurs.
Note: If you are using a released version, have you verified that the bug exists in the main branch of this repository? It helps the limited resources if we know problems exist in the current main branch so that they do not need to check whether the code sample produces a bug in the next release.
If the issue has not been resolved, please file it in the issue tracker.
Expected Output
The implicit t4 that is calculated "on the fly" in Hampel.rho should result in an array array([0, 0, 0, 1]) to pick only the last piecewise boolean indicator array filter.
In fact, it is obvious the "on the fly" calculation in Hampel.rho line 639 should be
(1- (t1+t2+t3)) *a* (b+c-a))
rather than the existing
(1-t1+t2+t3) *a* (b+c-a))
Output of import statsmodels.api as sm; sm.show_versions()
INSTALLED VERSIONS
Python: 3.8.10.final.0
OS: Linux 5.10.102.1-microsoft-standard-WSL2 #1 SMP Wed Mar 2 00:30:59 UTC 2022 x86_64
byteorder: little
LC_ALL: None
LANG: C.UTF-8
Describe the bug
There is a bug in
class Hampel
within statsmodels/robust/norms.py in defining piecewise function overz
inHampel.rho
method.It results in incorrect values for the
Hampel.rho
method outcomes.The$\rho$ function being calculated in the
rho
method uses_subset
method to calculate boolean indicator arrays fort1
,t2
, andt3
pieces, with the last piecet4
defining the whole range needed for the Hampel'srho
method. This last piece is incorrect in the calculation of the boolean indicator array, I show in the code below.Note that I can't reproduce the issue perfectly, since
Hampel.rho
does not rely onHampel._subset
entirely in defining piecewise boolean indicator arrays, as opposed toHampel.psi
orHampel.weights
.Code Sample, a copy-pastable example if possible
Note: As you can see, there are many issues on our GitHub tracker, so it is very possible that your issue has been posted before. Please check first before submitting so that we do not have to handle and close duplicates.
Note: Please be sure you are using the latest released version of
statsmodels
, or a recent build ofmain
. If your problem has been fixed in an unreleased version, you might be able to usemain
until a new release occurs.Note: If you are using a released version, have you verified that the bug exists in the main branch of this repository? It helps the limited resources if we know problems exist in the current main branch so that they do not need to check whether the code sample produces a bug in the next release.
If the issue has not been resolved, please file it in the issue tracker.
Expected Output
The implicit
t4
that is calculated "on the fly" inHampel.rho
should result in an arrayarray([0, 0, 0, 1])
to pick only the last piecewise boolean indicator array filter.In fact, it is obvious the "on the fly" calculation in
Hampel.rho
line 639 should berather than the existing
Output of
import statsmodels.api as sm; sm.show_versions()
INSTALLED VERSIONS
Python: 3.8.10.final.0
OS: Linux 5.10.102.1-microsoft-standard-WSL2 #1 SMP Wed Mar 2 00:30:59 UTC 2022 x86_64
byteorder: little
LC_ALL: None
LANG: C.UTF-8
statsmodels
Installed: 0.13.5 (/home/kamil/.cache/pypoetry/virtualenvs/smps-aJ6sAQBH-py3.8/lib/python3.8/site-packages/statsmodels)
Required Dependencies
cython: Not installed
numpy: 1.22.3 (/home/kamil/.cache/pypoetry/virtualenvs/smps-aJ6sAQBH-py3.8/lib/python3.8/site-packages/numpy)
scipy: 1.8.0 (/home/kamil/.cache/pypoetry/virtualenvs/smps-aJ6sAQBH-py3.8/lib/python3.8/site-packages/scipy)
pandas: 1.4.1 (/home/kamil/.cache/pypoetry/virtualenvs/smps-aJ6sAQBH-py3.8/lib/python3.8/site-packages/pandas)
dateutil: 2.8.2 (/home/kamil/.cache/pypoetry/virtualenvs/smps-aJ6sAQBH-py3.8/lib/python3.8/site-packages/dateutil)
patsy: 0.5.3 (/home/kamil/.cache/pypoetry/virtualenvs/smps-aJ6sAQBH-py3.8/lib/python3.8/site-packages/patsy)
Optional Dependencies
matplotlib: 3.5.1 (/home/kamil/.cache/pypoetry/virtualenvs/smps-aJ6sAQBH-py3.8/lib/python3.8/site-packages/matplotlib)
backend: module://matplotlib_inline.backend_inline
cvxopt: Not installed
joblib: 1.1.0 (/home/kamil/.cache/pypoetry/virtualenvs/smps-aJ6sAQBH-py3.8/lib/python3.8/site-packages/joblib)
Developer Tools
IPython: 8.1.1 (/home/kamil/.cache/pypoetry/virtualenvs/smps-aJ6sAQBH-py3.8/lib/python3.8/site-packages/IPython)
jinja2: 3.0.3 (/home/kamil/.cache/pypoetry/virtualenvs/smps-aJ6sAQBH-py3.8/lib/python3.8/site-packages/jinja2)
sphinx: Not installed
pygments: 2.11.2 (/home/kamil/.cache/pypoetry/virtualenvs/smps-aJ6sAQBH-py3.8/lib/python3.8/site-packages/pygments)
pytest: Not installed
virtualenv: Not installed
The text was updated successfully, but these errors were encountered: