Skip to content

Add Multinomial GLM #3149

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
jachymb opened this issue Feb 21, 2025 · 1 comment
Open

Add Multinomial GLM #3149

jachymb opened this issue Feb 21, 2025 · 1 comment

Comments

@jachymb
Copy link

jachymb commented Feb 21, 2025

We already have binomial_logit_glm and categorical_logit_glm

Multinomial distribution generalizes both Binomial and Categorical, is still in the exponential family and the logit (softmax) parametrization seem natural - we already have multinomial_logit. So multinomial_logit_glm is a missing piece to be completed.

I'm actually using this in an application (modelling counts of sold products each day), where I simply iterate multinomial_logit, but it would be cool to have an optimized single-call function.

Related issue: #1964

@jachymb
Copy link
Author

jachymb commented Mar 13, 2025

This is my STAN implementation, trying to avoid for-loops for better vectorization

real multinomial_logit_glm_lpmf(array[,] int y, matrix x, row_vector alpha, matrix beta) {
    // beta and alpha have 1 less DOF than columns. See: https://mc-stan.org/docs/stan-users-guide/regression.html#identifiability

    int num_categories = cols(beta);
    int num_events = rows(x); // one event = one independent multinomial draw

    vector[num_categories] ones = rep_vector(1, num_categories);  // used for matrix row-sum

    matrix[num_events, num_categories] y_mat = to_matrix(y);
    matrix[num_events, num_categories] predictors = x * beta;
    predictors += rep_matrix(alpha, num_events);  // this actually corresponds to the scalar alpha in univariate GLMs.  Maybe also allow matrix alpha as parameter overload

    real lp = 0;

    lp += sum(lgamma(y_mat * ones + 1));  // log multinomial numerator
    lp -= sum(lgamma(y_mat + 1)); // log multinomial denominator
    // note: we call lgamma over reals for coding convenience. But is it maybe better to call it specialized over ints?

    lp += sum(y_mat .* predictors); // log softmax numerator. == trace(y_mat * predictors'). see [#3161]
    lp -= sum(log(exp(predictors) * ones)' * y_mat);    // log softmax denominator.
    // Can we have log_softmax or at least log_sum_exp or to run row-wise over a matrix? That would feel better

    return lp;
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant