KDEUnivariate with weights #823

Closed
jseabold opened this issue May 9, 2013 · 3 comments

@jseabold
Member

commented May 9, 2013

What should evaluate do in the case of weighted KDE?

import numpy as np
import matplotlib.pyplot as plt
from statsmodels.nonparametric.kde import KDEUnivariate
# two well-separated modes, 100 observations each
bimodal = np.concatenate([10+np.random.randn(100), 30+np.random.randn(100)])
kde = KDEUnivariate(bimodal)
x = np.linspace(0, 40, 256)
# weight 1 on the first mode, weight 0 on the second
w1 = np.concatenate([np.ones(100), np.zeros(100)])
kde.fit(weights=w1, fft=False)
plt.plot(x, kde.evaluate(x))

Compare this to

plt.plot(kde.support, kde.density)

Should evaluate take a non-optional weights argument, etc.?

@josef-pkt

Member

commented May 9, 2013

I think it is a bug that evaluate doesn't take weights
into account.

The way I understand the code after browsing a bit:

fit() sets the bandwidth (without taking weights into account) and
calculates the density for given points.

evaluate() just calls the kernel, which has the bandwidth from fit(), and
evaluates without taking weights into account.

I haven't looked at weighted KDE; however, I think we can interpret
weights like the weights in RLM, which downweight certain observations.
Then we want to evaluate at a point assuming the weight on that point
is one, but based on a density that uses the weighted data points.

In the 0-1 case above, observations with weight zero would be
effectively dropped when evaluating the density.
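To make that interpretation concrete, here is a minimal NumPy sketch of a weighted Gaussian KDE evaluated at arbitrary points. It is not the statsmodels implementation: the helper name `weighted_gaussian_kde` and the fixed bandwidth `bw=1.0` are assumptions for illustration only.

```python
import numpy as np


def weighted_gaussian_kde(points, data, weights, bw):
    """Evaluate a weighted Gaussian KDE at `points` (hypothetical helper)."""
    points = np.atleast_1d(np.asarray(points, dtype=float))
    data = np.asarray(data, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Standardized distance from every evaluation point to every observation.
    z = (points[:, None] - data[None, :]) / bw
    kernel = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    # Weighted average of kernel values; zero-weight observations drop out.
    return kernel @ weights / (weights.sum() * bw)


rng = np.random.default_rng(0)
bimodal = np.concatenate([10 + rng.standard_normal(100),
                          30 + rng.standard_normal(100)])
w1 = np.concatenate([np.ones(100), np.zeros(100)])
x = np.linspace(0, 40, 256)

# bw=1.0 is an arbitrary illustrative bandwidth, not a data-driven choice.
dens_weighted = weighted_gaussian_kde(x, bimodal, w1, bw=1.0)
# Same estimate computed from the first 100 observations only.
dens_first_mode = weighted_gaussian_kde(x, bimodal[:100], np.ones(100), bw=1.0)
```

With 0-1 weights the two results coincide, which is the "effectively dropped" behavior described above.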


To support it with fft, we would just need to adjust the weights/counts at the grid points, I think.
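A rough sketch of that idea, under my own assumptions: bin the observations onto an equally spaced grid using their weights instead of plain counts, then convolve the weighted counts with the kernel via a zero-padded FFT. The helper name `binned_weighted_kde`, the nearest-grid-point binning, and the fixed bandwidth are hypothetical and do not reflect how statsmodels does it.

```python
import numpy as np


def binned_weighted_kde(data, weights, bw, grid):
    """Approximate weighted Gaussian KDE on an equally spaced grid (hypothetical helper)."""
    data = np.asarray(data, dtype=float)
    weights = np.asarray(weights, dtype=float)
    n = grid.size
    dx = grid[1] - grid[0]
    # Weighted counts: each observation adds its weight to the nearest grid point.
    idx = np.clip(np.rint((data - grid[0]) / dx).astype(int), 0, n - 1)
    counts = np.bincount(idx, weights=weights, minlength=n)
    # Gaussian kernel sampled at every possible grid offset, -(n-1)*dx .. (n-1)*dx.
    offsets = dx * np.arange(-(n - 1), n)
    kernel = np.exp(-0.5 * (offsets / bw) ** 2) / (bw * np.sqrt(2.0 * np.pi))
    # Linear convolution through a zero-padded FFT.
    size = counts.size + kernel.size - 1
    conv = np.fft.irfft(np.fft.rfft(counts, size) * np.fft.rfft(kernel, size), size)
    # The slice n-1 .. 2n-2 of the full convolution lines up with the grid.
    return conv[n - 1:2 * n - 1] / weights.sum()


rng = np.random.default_rng(0)
bimodal = np.concatenate([10 + rng.standard_normal(100),
                          30 + rng.standard_normal(100)])
w1 = np.concatenate([np.ones(100), np.zeros(100)])
grid = np.linspace(0, 40, 256)

dens_fft = binned_weighted_kde(bimodal, w1, 1.0, grid)
```

Only the binning step changes relative to the unweighted case; the convolution itself is untouched, which is why adjusting the counts at the grid points should be enough.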


reported at http://comments.gmane.org/gmane.comp.python.pystatsmodels/10970

@josef-pkt

Member

commented Oct 2, 2013

KDEUnivariate.fit has the weights argument, but it's not listed under Parameters:
http://statsmodels.sourceforge.net/devel/generated/statsmodels.nonparametric.kde.KDEUnivariate.fit.html

This should be considered a bug.

@josef-pkt

Member

commented Oct 2, 2013

Looks like the reference results in the unit tests are from Stata, which allows fweights, aweights, and iweights in kdensity.

@josef-pkt josef-pkt closed this in c0a62a0 Dec 18, 2013

PierreBdR pushed a commit to PierreBdR/statsmodels that referenced this issue Sep 2, 2014
