any offsets? #173
Comments
@ric70x7 thanks for using the package :) Unfortunately, not really. :/ Could you tell me a bit about why an offset or exposure might be useful? I would like to figure out how to prioritize such a feature. Thanks :)
That is perfect. Having exposure in the Poisson model solves my current problem. I'm interested in point processes for epidemiology.
@ric70x7 great, let me know if this works for you.
It is working well so far. Thanks.
Great!!
Hi again. I just noticed that there is an issue with the exposure parameter: when I declare it as anything other than a constant across all observations, I get -infinite log-likelihoods. The fitted values seem to make sense anyway. Here is an example:

```python
import pygam

def some_rate(x):
    ...

X = np.random.uniform(0, 100, 50)
```

This doesn't happen if I make the exposure constant, for example:

```python
gam.fit(X, Y, exposure=20 * np.ones_like(population))
```
Oh wow, this is a great example. I'm taking a look.
@ric70x7 I see now that the issue comes from using exposure vs. offset, and the necessity for Poisson observations to be integers.

When using exposure, we divide the counts by the exposure and model that (and weight each sample by the exposure to compensate for the actual variance). This turns the counts into rates, which gives an equivalent model, but now the exact likelihood cannot be evaluated, because the Poisson distribution requires integer observations. pyGAM rescales the rates back into counts when computing the log-likelihood, but I forgot to round and cast to integer, so small numerical errors caused the pmf function to believe these rescaled counts were non-integers.

Anyway, this could all be avoided if everything were modeled with offsets...
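The failure mode described above can be sketched with scipy (the numbers below are hypothetical, not pyGAM's internals): a discrete Poisson pmf is zero, and its log-pmf is -inf, at any non-integer point, so even a tiny round-off error from the rate-to-count rescaling produces -infinite log-likelihoods, while rounding restores finite values.

```python
import numpy as np
from scipy.stats import poisson

mu = 5.0
exact = np.array([3.0, 7.0, 12.0])      # true integer counts (hypothetical)
perturbed = exact + 1e-12               # tiny float error from rescaling rates back to counts

# Non-integer points have zero Poisson probability mass, so log-pmf is -inf
print(poisson.logpmf(perturbed, mu))
# Rounding back to integers gives finite log-likelihood contributions
print(poisson.logpmf(np.round(perturbed), mu))
```

This is exactly why the fitted values still looked sensible: the fitting itself works on the rates, and only the likelihood evaluation hits the non-integer check.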
I'm going to cut a new version; you should see this issue fixed in it.
Cool, @ric70x7 can you try the new version?
Yep, this works. Thanks!
Hi! I currently need to include exposure in a binomial model. Is that implemented (I couldn't find it) and, if not, is there a workaround?
Hi FeysJan. I don't know if it is implemented like that, but you can fit a binomial model using a LogisticGAM and define the weights according to the exposure and number of positive cases. For example, if you have 3 positives and 5 trials, this would be passed as y = 1 with a weight of 3/5 and y = 0 with a weight of 2/5.

I wrote a function that organizes the data for you. Here is the link:

https://github.com/disarm-platform/disarm-gears/blob/develop/disarm_gears/util/binomial_to_bernoulli.py

(Just remove what you don't need, because it has other things that are needed for my current project.)

It works like this:

```python
import numpy as np
import pygam
# assuming binomial_to_bernoulli is importable from the linked module
from disarm_gears.util.binomial_to_bernoulli import binomial_to_bernoulli

# Some toy data
num_data = 20
n_trials = np.random.randint(10, 15, num_data)
X = np.random.normal(0, .8, num_data)
X.sort()
rate = 3 + 5 * X
n_positive = (1 / (1 + np.exp(-rate)) * n_trials).astype(int)

# Turn your binomial observations into Bernoulli observations
y, w, newX = binomial_to_bernoulli(n_positive=n_positive, n_trials=n_trials, X=X)

m = pygam.LogisticGAM()
m.fit(X=newX[:, None], y=y, weights=w)
m.predict_mu(X)  # this returns the fitted rate
```
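The conversion described above can be sketched with plain numpy (a hypothetical reimplementation for illustration, not the actual disarm_gears helper): each binomial observation of `n_positive` successes in `n_trials` expands into two Bernoulli rows, y = 1 weighted by `n_positive / n_trials` and y = 0 weighted by `(n_trials - n_positive) / n_trials`.

```python
import numpy as np

def binomial_to_bernoulli_sketch(n_positive, n_trials, X):
    """Expand binomial observations into weighted Bernoulli observations."""
    n_positive = np.asarray(n_positive, dtype=float)
    n_trials = np.asarray(n_trials, dtype=float)
    X = np.asarray(X)
    # One y=1 row and one y=0 row per original observation
    y = np.concatenate([np.ones_like(n_positive), np.zeros_like(n_positive)])
    w = np.concatenate([n_positive / n_trials,
                        (n_trials - n_positive) / n_trials])
    newX = np.concatenate([X, X])
    return y, w, newX

# The 3-positives-in-5-trials example from the comment:
y, w, newX = binomial_to_bernoulli_sketch([3], [5], [0.2])
# y = [1., 0.], w = [0.6, 0.4], newX = [0.2, 0.2]
```

The weighted Bernoulli likelihood is then proportional to the original binomial likelihood, which is why the LogisticGAM fit recovers the same rate.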
Thanks Ricardo, I appreciate the code. I’m looking at what you sent me to see if that solves my problems.
I had a hunch I may be able to convert to a LogisticGAM.
Jan
Hi,
First of all, this is a great package!
Is it possible to declare an offset or exposure variable? Meaning: a regressor with coefficient fixed to 1.
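For a log-link model such as Poisson, the two framings of this request coincide: an exposure e is exactly an offset log(e), i.e. a regressor whose coefficient is fixed to 1. A minimal numpy sketch with hypothetical numbers:

```python
import numpy as np

# Hypothetical linear predictor f(x) and exposures
eta = np.array([0.3, -1.2, 2.0])
e = np.array([10.0, 20.0, 5.0])

# Exposure formulation: expected count = exposure * exp(f(x))
mu_exposure = e * np.exp(eta)
# Offset formulation: log(e) enters the linear predictor with coefficient 1
mu_offset = np.exp(eta + np.log(e))

print(np.allclose(mu_exposure, mu_offset))  # True
```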