ENH: new ordered models, add generalized ordered logit and generalized ordered link models #8444

Open
josef-pkt opened this issue Oct 15, 2022 · 2 comments


@josef-pkt
Member

josef-pkt commented Oct 15, 2022

see comment #8442 (comment) and following

It's also a cumulative link model, where params differ by choice P(y < k) = F(x beta_k)

My guess is that the only or main difference from OrderedModel is to split up params in _bounds and linpred, so that the thresholds are computed with category-level-specific params_{k} and params_{k-1}

loglike_obs just uses _bounds method
score and hessian are currently computed by numdiff

not checked yet:
I guess there is a computational problem when the integration _bounds are not increasing.
I guess we need to directly impose nonnegativity of probabilities, prob = max(F(upp) - F(low), 0).
Do we still get sum(probs) = 1 if the non-negativity constraint is binding?
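
A minimal numpy sketch of the question above, not the OrderedModel code. It assumes a logit link and the parameterization P(y <= k) = F(a_k - x beta_k); the function name `gologit_probs` and all numbers are made up for illustration. Unclipped differences telescope to F_K - F_0 = 1, so clipping a negative probability to zero pushes the row sum above 1 unless the probabilities are renormalized.

```python
import numpy as np

def logistic_cdf(z):
    return 1.0 / (1.0 + np.exp(-z))

def gologit_probs(exog, thresholds, params_k, clip=False):
    """P(y = k) as differences of category-specific cumulative probabilities.

    exog: (nobs, k_exog), thresholds: (n_cut,), params_k: (n_cut, k_exog)
    Assumed parameterization: P(y <= k) = F(a_k - x beta_k).
    """
    cum = logistic_cdf(thresholds - exog @ params_k.T)  # (nobs, n_cut)
    nobs = exog.shape[0]
    # pad with F = 0 below the first cut and F = 1 above the last cut
    cum = np.column_stack([np.zeros(nobs), cum, np.ones(nobs)])
    probs = np.diff(cum, axis=1)
    if clip:
        probs = np.maximum(probs, 0)
    return probs

exog = np.array([[0.0], [3.0]])
thresholds = np.array([0.0, 0.5])
params_k = np.array([[0.0], [1.0]])  # slopes differ by cut point

probs = gologit_probs(exog, thresholds, params_k)
probs_clipped = gologit_probs(exog, thresholds, params_k, clip=True)

print(probs)                      # second row has a negative middle probability
print(probs.sum(axis=1))          # rows still sum to 1 without clipping
print(probs_clipped.sum(axis=1))  # second row sums to more than 1 after clipping
```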

Williams 2016 has section 5 on p. 18 about negative probabilities, sounds like they are not clipped to zero

Williams, Richard. “Understanding and Interpreting Generalized Ordered Logit Models.” The Journal of Mathematical Sociology 40, no. 1 (January 2, 2016): 7–20. https://doi.org/10.1080/0022250X.2015.1112384.

@josef-pkt
Member Author

josef-pkt commented Oct 15, 2022

another thought

gologit corresponds to mnlogit in the sense of common exog but choice specific params

I guess a more general version would be the analogy to conditional mnlogit, i.e. common params but choice specific exog
This encompasses gologit but not the other way around.

For common exog, as in gologit, we would be replicating exog k times if we compose the design matrix as [exog_common, exog_k1, exog_k2, ...], so the number of choices and of non-common exog should not be too large.

Is this correct? How can we know when creating the design matrix which exog_k belong to an observation?
???
It has been a long time since I looked at this for variants of discrete choice models.

I guess what this means is that we cannot create a single design matrix; we need to keep several different exog_k and have a linear predictor for each k (one params, many exog_k)
e.g.
prob(y = k) = f(b * x_k) or = F(a_k + b x_k) - F(a_{k-1} + b x_{k-1})

So, I have the option to either have one exog and choice specific params (including overlapping params) or have several exog, common and choice specific ones.
I guess those two are equivalent in generality if I allow for arbitrary mappings/selection of params -> params_k or exog_all -> exog_k.
Everything is just collected inside linpred_k, i.e. params_k * exog_k
_bounds needs to call choice specific linpred(..., choice=k), similar to choice specific threshold/constants.
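
A small sketch of the equivalence discussed above, under assumed names (`make_exog_k`, `linpred`, and the array shapes are hypothetical, not existing statsmodels API). It writes full gologit as "common params, choice-specific exog" by placing exog in the column block of choice k, and checks that this gives the same linear predictor as one exog with choice-specific params.

```python
import numpy as np

rng = np.random.default_rng(0)
nobs, k_exog, n_choices = 5, 2, 3
exog = rng.normal(size=(nobs, k_exog))
params_by_choice = rng.normal(size=(n_choices, k_exog))  # beta_k, one row per choice

# one stacked parameter vector (beta_0, beta_1, beta_2) shared by all choices
params = params_by_choice.ravel()

def make_exog_k(exog, choice, n_choices):
    """Put exog into the column block of `choice`, zeros elsewhere."""
    nobs, k_exog = exog.shape
    exog_k = np.zeros((nobs, n_choices * k_exog))
    exog_k[:, choice * k_exog:(choice + 1) * k_exog] = exog
    return exog_k

def linpred(params, exog_list, choice):
    """Choice-specific linear predictor: common params, choice-specific exog."""
    return exog_list[choice] @ params

exog_list = [make_exog_k(exog, k, n_choices) for k in range(n_choices)]

for k in range(n_choices):
    # same result as one exog with choice-specific params
    assert np.allclose(linpred(params, exog_list, k), exog @ params_by_choice[k])
```

The replicated exog_k arrays are mostly zeros, which is the memory cost mentioned above when the number of choices is large.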

possible implementation for using params_k and one exog:

params_k = params[mask_k]
params_e = np.zeros(len(params))
params_e[mask_e_k] = params_k
where masks can be boolean or integer index arrays

This requires full dot product params_e * exog_all where params_e can have many zeros, but we don't need to duplicate exog data arrays.
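
The mask idea could look roughly like the following sketch. The helper name `expand_params` and the example layout are hypothetical. One assumption differs from the pseudocode above: params_e is sized by the number of columns of exog_all, so that the dot product with exog_all is defined.

```python
import numpy as np

def expand_params(params, mask_k, mask_e_k, k_exog_all):
    """Scatter the parameters of choice k into a vector aligned with exog_all.

    mask_k selects the entries of params used by choice k,
    mask_e_k selects the columns of exog_all they multiply.
    Masks can be boolean arrays or integer index arrays.
    """
    params_k = params[mask_k]
    params_e = np.zeros(k_exog_all)
    params_e[mask_e_k] = params_k
    return params_e

# hypothetical layout: exog_all has columns [x0, x1, x2]
# params = [c, d1, d2, e2]: c is common, d1/d2 are choice-specific slopes on x1,
# e2 multiplies x2 and is only used by the second choice
params = np.array([1.0, 2.0, 3.0, 4.0])
exog_all = np.array([[1.0, 0.5, -1.0],
                     [1.0, 2.0, 0.0]])

params_e_1 = expand_params(params, np.array([0, 1]), np.array([0, 1]), 3)
params_e_2 = expand_params(params, np.array([0, 2, 3]), np.array([0, 1, 2]), 3)

linpred_1 = exog_all @ params_e_1  # x2 gets a zero coefficient for choice 1
linpred_2 = exog_all @ params_e_2
```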

simplest case: full gologit
mask_e_k is all true, or includes all indices (i.e. skip if None)
params_k = params.reshape(-1, k_exog)[k] i.e. just the kth slice

edit
another possibility: use linalg instead of masks
params_k = C_k dot params

This would make it easier to include linear restrictions on parameters, i.e. a mapping from restricted to full parameters.
In the simplest case it would be just an identity matrix in a submatrix of C_k
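
A sketch of the linalg variant, with hypothetical helper names and made-up numbers. The first matrix is the simplest case, an identity submatrix that picks out the kth block of params. The second shows how a restriction (one slope common to all choices, one slope choice-specific) fits the same params_k = C_k dot params form.

```python
import numpy as np

def selection_matrix(choice, k_exog, n_choices):
    """C_k for full gologit: identity in the column block of `choice`."""
    C = np.zeros((k_exog, n_choices * k_exog))
    C[:, choice * k_exog:(choice + 1) * k_exog] = np.eye(k_exog)
    return C

k_exog, n_choices = 2, 3
params = np.arange(6.0)  # stacked (beta_0, beta_1, beta_2)
C_1 = selection_matrix(1, k_exog, n_choices)
params_1 = C_1 @ params  # same as params.reshape(-1, k_exog)[1]

def partial_matrix(choice, n_choices):
    """C_k for a restricted model: params_r = [b_common, g_0, ..., g_{K-1}].

    Maps params_r to [b_common, g_choice].
    """
    C = np.zeros((2, 1 + n_choices))
    C[0, 0] = 1.0           # common slope
    C[1, 1 + choice] = 1.0  # choice-specific slope
    return C

params_r = np.array([10.0, 1.0, 2.0, 3.0])
params_r_2 = partial_matrix(2, n_choices) @ params_r
```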

@josef-pkt josef-pkt changed the title ENH: new order models, add generalized ordered logit and generalized ordered link models ENH: new ordered models, add generalized ordered logit and generalized ordered link models Oct 16, 2022
@josef-pkt
Member Author

Greene, Hensher Primer, section 7.2
has a discussion of generalized logit and an alternative specification that can be used to impose monotonicity in the integration limits to avoid negative probabilities
