sandbox kernels, problems with inDomain #1239

Closed
josef-pkt opened this issue Dec 16, 2013 · 2 comments

@josef-pkt
Member

commented Dec 16, 2013

For the Gaussian kernel, KDEUnivariate.evaluate works because the kernel has no bounds.

For the triangular kernel I get exceptions, see below.
The first is that inDomain is not vectorized, which is possibly by design.
The second is that inDomain returns a list, which causes a failure in density.

Note: there wasn't a problem for the "smoothconf" in PR #1233


update

OK, fixed KDEUnivariate.evaluate/CustomKernel.density in #1240.
However, I don't think we can vectorize evaluate for bounded-support kernels, because inDomain would not return a rectangular array: each point might have a different number of neighbors.
So we would need some other structure to vectorize this (zero weights, sparse, ...). I doubt it's worth vectorizing just inDomain.
For convenience, we could add an internal loop to KDEUnivariate.evaluate for bounded kernels.
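
To make the trade-off above concrete, here is a minimal sketch of the two options for a triangular kernel on [-1, 1]: a per-point loop that filters neighbors, and a "zero weights" variant that keeps a rectangular array. This is not the statsmodels code; the function names (`tri_kernel`, `density_loop`, `density_masked`) and the plain 1/(n*h) normalization are assumptions made for illustration only.

```python
import numpy as np

def tri_kernel(u):
    """Triangular kernel on [-1, 1]; returns 0 outside the support."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 1.0 - np.abs(u), 0.0)

def density_loop(data, points, h):
    """Hypothetical loop version: filter the neighbors of each point separately.

    The number of in-domain neighbors differs from point to point, so the
    filtered arrays cannot be stacked into one rectangular array.
    """
    data = np.asarray(data, dtype=float)
    out = []
    for x in np.atleast_1d(points):
        u = (data - x) / h
        inside = np.abs(u) <= 1.0          # neighbors inside the kernel domain
        out.append((1.0 - np.abs(u[inside])).sum() / (len(data) * h))
    return np.array(out)

def density_masked(data, points, h):
    """Hypothetical vectorized version: out-of-domain observations get weight 0."""
    data = np.asarray(data, dtype=float)
    points = np.atleast_1d(np.asarray(points, dtype=float))
    u = (data[None, :] - points[:, None]) / h   # shape (n_points, n_obs)
    return tri_kernel(u).sum(axis=1) / (len(data) * h)
```

The masked version builds an (n_points, n_obs) array, so it trades memory for speed; the loop version stays closer to the existing inDomain filtering.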


>>> kde2.fit(kernel='tri', fft=False)
>>> kde2.density[:5]
array([ 0.,  0.,  0.,  0.,  0.])
>>> kde2.evaluate(kde2.support[:5])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "e:\josef\eclipsegworkspace\statsmodels-git\statsmodels-all-new2_py27\statsmodels\statsmodels\nonparametric\kde.py", line 259, in evaluate
    return self.kernel.density(self.endog, point)
  File "e:\josef\eclipsegworkspace\statsmodels-git\statsmodels-all-new2_py27\statsmodels\statsmodels\sandbox\nonparametric\kernels.py", line 189, in density
    xs = self.inDomain( xs, xs, x )[0]
  File "e:\josef\eclipsegworkspace\statsmodels-git\statsmodels-all-new2_py27\statsmodels\statsmodels\sandbox\nonparametric\kernels.py", line 176, in inDomain
    filtered = filter(isInDomain, zip(xs, ys))
  File "e:\josef\eclipsegworkspace\statsmodels-git\statsmodels-all-new2_py27\statsmodels\statsmodels\sandbox\nonparametric\kernels.py", line 171, in isInDomain
    return u >= self.domain[0] and u <= self.domain[1]
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

>>> kde2.support[0]
-4.3733738207323043
>>> kde2.evaluate(kde2.support[0])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "e:\josef\eclipsegworkspace\statsmodels-git\statsmodels-all-new2_py27\statsmodels\statsmodels\nonparametric\kde.py", line 259, in evaluate
    return self.kernel.density(self.endog, point)
  File "e:\josef\eclipsegworkspace\statsmodels-git\statsmodels-all-new2_py27\statsmodels\statsmodels\sandbox\nonparametric\kernels.py", line 190, in density
    if xs.ndim == 1:
AttributeError: 'list' object has no attribute 'ndim'
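
Both tracebacks can be reproduced in miniature outside statsmodels. The sketch below assumes a domain of (-1, 1) and uses hypothetical helpers (`is_in_domain_scalar`, `in_domain`) that only mimic the shape of the sandbox code. It shows why chaining comparisons with `and` fails for arrays, and how a boolean mask returns arrays (which have `ndim`) instead of a list.

```python
import numpy as np

DOMAIN = (-1.0, 1.0)   # assumed support of a bounded kernel

def is_in_domain_scalar(x_obs, x, h):
    """Scalar-only check, similar in spirit to isInDomain.

    If x_obs or x is an array with more than one element, `and` needs a
    single truth value and NumPy raises ValueError.
    """
    u = (x_obs - x) / h
    return u >= DOMAIN[0] and u <= DOMAIN[1]

def in_domain(xs, ys, x, h):
    """Hypothetical array-returning filter for a single evaluation point x."""
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    u = (xs - x) / h
    mask = (u >= DOMAIN[0]) & (u <= DOMAIN[1])   # elementwise, no ambiguity
    return xs[mask], ys[mask]                    # ndarrays, so .ndim exists
```

This still handles one evaluation point at a time; it only avoids the list return value that breaks the `xs.ndim` check.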
@josef-pkt

Member Author

commented Dec 16, 2013

Possibly introduced by a4d722c,
which would mean evaluate hasn't worked for bounded-support kernels for some time.
I cannot check right now because I have a detached HEAD in my checkout.

My guess is that there is no test coverage for the other kernels.

I have a CSV file with weighted-density results from Stata that I could check this against.

@josef-pkt

Member Author

commented Dec 18, 2013

Fixed and unit tested, but not vectorized, in PR #1240, merged in c0a62a0.

@josef-pkt josef-pkt closed this Dec 18, 2013
