
Implement Conditional logistic regression models #560

Closed
jon-mellon opened this issue Nov 28, 2018 · 21 comments

@jon-mellon

jon-mellon commented Nov 28, 2018

Following up on https://twitter.com/paulbuerkner/status/1067809413581414401

It would be great if brms could support conditional logit models. They are used in many settings, such as analyzing conjoint experiments, purchase decisions, or voting behavior.

As an example, let's suppose that consumers (i) enter a store and can choose one brand of cereal. Different stores have different inventory at different times, so the choice set is not always the same. We'll say that consumers weigh up each brand's nutrition, price, and advertising when making their decision, and we want to know what weight they put on each. The data might look something like:

[image: example choice dataset in long format]
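A minimal base-R sketch of the long-format structure described above (all names and values here are hypothetical, for illustration only):

```r
# Hypothetical long-format choice data: one row per (consumer, brand) pair,
# with `choice` marking the brand actually picked in that choice situation.
# Note that the choice sets can differ across consumers.
cereal <- data.frame(
  consumer    = c(1, 1, 1, 2, 2),
  brand       = c("kellogs", "general_mills", "store_brand",
                  "kellogs", "store_brand"),
  choice      = c(0, 1, 0, 1, 0),
  nutrition   = c(2, 4, 1, 2, 1),
  price       = c(5, 5, 1, 5, 1),
  advertising = c(0.9, 0.3, 0.0, 0.9, 0.0)
)
# sanity check: exactly one chosen alternative per choice situation
tapply(cereal$choice, cereal$consumer, sum)  # each consumer's choices sum to 1
```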

They're definitely possible in Stan in some form:
https://www.rdocumentation.org/packages/rstanarm/versions/2.17.4/topics/stan_clogit

Here's the original McFadden paper on the models:
https://eml.berkeley.edu/reprints/mcfadden/zarembka.pdf

I think the key extra piece of information that needs to be given is an identifier that defines one choice situation.

brm(formula = choice|chid(consumer) ~ price + nutrition + advertising)

We might then imagine that older people are more susceptible to advertising, so we would interact age with advertising:

brm(formula = choice|chid(consumer) ~ price + nutrition + advertising:age + advertising)

or that the slope of advertising varies across states:

brm(formula = choice|chid(consumer) ~ price + nutrition + advertising + (1+advertising|state), data)

I wonder if we could also have brand (i.e. particular choice) specific random intercepts:
brm(formula = choice|chid(consumer) ~ price + nutrition + advertising + (1|brand), data)

@paul-buerkner
Owner

paul-buerkner commented Jan 30, 2019

I realized that it is already possible to fit conditional logistic models in brms, although the syntax is a little bit verbose. Here is an example:

library(brms)
# choice data in **wide** format
data("Fishing", package = "mlogit")

bform <- bf(
  mode ~ 1,
  nlf(mubeach ~ bprice * price.beach + bcatch * catch.beach),
  nlf(mupier ~ bpier + bprice * price.pier + bcatch * catch.pier),
  nlf(muboat ~ bboat + bprice * price.boat + bcatch * catch.boat),
  nlf(mucharter ~ bcharter + bprice * price.charter + bcatch * catch.charter),
  bpier + bboat + bcharter + bprice + bcatch ~ 1,
  family = categorical(refcat = NA)
)

nlpars <- c("bpier", "bboat", "bcharter", "bprice", "bcatch")
bprior <-  set_prior("normal(0, 5)", nlpar = nlpars)

fit <- brm(formula = bform, data = Fishing, 
           prior = bprior, chains = 2, cores = 2)
summary(fit)

The above model will only work in brms 2.7.2 or higher.

@paul-buerkner
Owner

I updated my model in the above post, which now produces the expected results. Based on this syntax, it should be possible to specify all kinds of conditional logistic models.

@jon-mellon I would love to hear your opinion on this.

@paul-buerkner paul-buerkner added this to the brms 2.7.0++ milestone Jan 30, 2019
@jon-mellon
Author

Thanks for doing this. I'll take a closer look soon, but my first question is: does this approach allow the choice sets to vary across cases? E.g., in one case someone is choosing between charter, pier, and boat, and in another they are choosing between charter and beach. Or would this require that the respondent is choosing between all options in every case?

The use case I'm thinking of using this for is voting for political parties across many countries and years. So the actual parties change between countries and over time, but facts about the parties (their party family, their economic positions, their previous vote share etc) can be measured comparably across all settings.

In this case, it's important that I don't have voters in Sweden choosing between Republicans and Democrats and vice-versa.

@paul-buerkner
Owner

The approach does not natively support varying choice sets, but one may try to trick it into supporting them by adding highly negative offsets to impossible categories.

Suppose some persons did not see the "boat" option; then we could "deactivate" it as follows (this requires the latest version of brms as of January 31st, 2019):

library(brms)
# choice data in **wide** format
data("Fishing", package = "mlogit")

Fishing$noboat <- 0
# the first 50 persons did not have the "boat" option
Fishing$noboat[1:50] <- -100

bform2 <- bf(
  mode ~ 1,
  nlf(mubeach ~ bprice * price.beach + bcatch * catch.beach),
  nlf(mupier ~ bpier + bprice * price.pier + bcatch * catch.pier),
  nlf(muboat ~ bboat + bprice * price.boat + bcatch * catch.boat + noboat),
  nlf(mucharter ~ bcharter + bprice * price.charter + bcatch * catch.charter),
  bpier + bboat + bcharter + bprice + bcatch ~ 1,
  family = categorical(refcat = NA)
)

nlpars <- c("bpier", "bboat", "bcharter", "bprice", "bcatch")
bprior2 <-  set_prior("normal(0, 5)", nlpar = nlpars)

fit2 <- brm(formula = bform2, data = Fishing,
           prior = bprior2, chains = 2, cores = 2)
summary(fit2)

# p(boat) is 0 for the first 50 observations
round(fitted(fit2, newdata = Fishing[1:100, ])[, 1, ], 5)

@paul-buerkner paul-buerkner removed this from the brms 2.7.0++ milestone Feb 16, 2019
@paul-buerkner
Owner

Although the syntax shown above allows fitting conditional logit models, I am not sure I would recommend it as an easy way to actually specify those models. I will have to think more about a less verbose solution (related to your initial proposal) that works well within the brms framework. Accordingly, I will leave this issue open for now.

@jon-mellon
Author

Thanks, I think that makes sense. I will also try to see how closely the current version can replicate conditional logit results. The main thing I'm not sure about is whether variation in the x variables for deactivated options could still provide information to the fitting process in a way that might skew the results.
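As a quick numerical check of the offset trick (my own sketch, not from the thread): with a -100 offset added to a category's linear predictor, its softmax probability is numerically zero, and the remaining categories' probabilities are essentially unchanged no matter what x values the deactivated option takes, so the leakage should be negligible at this offset magnitude:

```r
# softmax with the usual max-subtraction for numerical stability
softmax <- function(eta) {
  z <- exp(eta - max(eta))
  z / sum(z)
}

# two linear-predictor vectors that differ only in the covariate contribution
# of the "deactivated" boat option (0.8 vs 5.0), both offset by -100
eta1 <- c(beach = 0.5, pier = 1.2, boat = 0.8 - 100, charter = 0.3)
eta2 <- c(beach = 0.5, pier = 1.2, boat = 5.0 - 100, charter = 0.3)

p1 <- softmax(eta1)
p2 <- softmax(eta2)

p1[["boat"]] < 1e-40               # TRUE: boat is effectively impossible
max(abs(p1[-3] - p2[-3])) < 1e-12  # TRUE: other categories are unaffected
```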

@paul-buerkner
Owner

paul-buerkner commented Jun 28, 2019 via email

@jon-mellon
Author

jon-mellon commented Jun 28, 2019

Just realized that I was still using 2.9.0. I've tried 2.9.2 from GitHub and it seems to be working now.

@jon-mellon
Author

jon-mellon commented Jun 29, 2019

I've built a simulation of a multinomial choice process as a benchmark (so I can be sure what the correct answer is), and the mlogit package seems able to recover something close to the true result (note that in reality the size of the choice set would vary across instances). Ideally, I would want the intercepts to be drawn from a distribution and partially pooled, rather than being fixed effects, and to have higher-level random intercepts as well.

library(dplyr)
library(mlogit)
library(reshape2)


n.intercepts <- 10
n.obs <- 1000
choices.per.time <- 4
price.coef <- -0.4
intercept.sd <- 1

mode.intercepts <- rnorm(n.intercepts, sd = intercept.sd)
choice.sets <- t(replicate(n.obs, sample(LETTERS[1:n.intercepts], choices.per.time)))
prices <- matrix(rnorm(n.intercepts * n.obs), ncol = n.intercepts)  # one price draw per option per observation
names(mode.intercepts) <- LETTERS[1:n.intercepts]
colnames(prices) <-  LETTERS[1:n.intercepts]

intercept.choice.set <- matrix(mode.intercepts[choice.sets], ncol = choices.per.time, nrow = n.obs, byrow = FALSE)
price.choice.set <- t(mapply(function(x,y) x[match(y, LETTERS)], 
                             split(prices, 1:nrow(prices)), 
                             split(choice.sets, 1:nrow(choice.sets))))

xb <- intercept.choice.set + price.choice.set * price.coef
exb <- exp(xb)

choice.set.probs <- t(mapply(function(x,y) y / x, rowSums(exb), split(exb, 1:nrow(exb))))
choices <- t(apply(choice.set.probs, 1, rmultinom, size = 1, n = 1))
choices <- mapply(function(x,y) x[y], split(choice.sets, 1:nrow(choice.sets)), apply(choices==1, 1, which))

colnames(prices) <- paste0("price", colnames(prices))

notpresent <- t(sapply(split(choice.sets, 1:nrow(choice.sets)), function(x) as.numeric(!LETTERS[1:n.intercepts] %in% x )))
colnames(notpresent) <- paste0("no", LETTERS[1:n.intercepts])

data <- data.frame(choices, notpresent, prices)
long <- melt(data.frame(id = 1:nrow(choice.sets), choice.sets), id.var = "id")
long <- data.frame(long, melt(price.choice.set))

long$variable <- NULL

data$id <- 1:nrow(data)

long$Var1 <- NULL
long$Var2 <- NULL

long <- long %>% arrange(id)
long$mode <- data$choices[match(long$id, data$id)]
long$choice <- long$mode==long$value
colnames(long) <- c("id", "alts", "price", "chosen", "choice")

summary(mlogit(data = long, formula = choice ~ price, shape = "long", chid.var = "id", alt.var = "alts"))

One bit of the code I'm currently not clear on for running the brms version of the model is this part:

bpier + bboat + bcharter + bprice + bcatch ~ 1,

Do I need to put all fixed parts of the model in there?

@paul-buerkner
Owner

I haven't looked at your above code in detail, but yes, your fixed effects are indeed specified as non-linear parameters, on which there is a lot of documentation available.
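For instance (a sketch only, building on the Fishing example above and assuming a hypothetical `consumer` grouping variable not present in that data), the `~ 1` formulas for the non-linear parameters can be extended with group-level terms to get partially pooled slopes:

```r
# Sketch: requires brms; `consumer` is a hypothetical grouping variable.
# The price sensitivity bprice now varies by consumer and is partially pooled.
bform3 <- bf(
  mode ~ 1,
  nlf(mubeach ~ bprice * price.beach + bcatch * catch.beach),
  nlf(mupier ~ bpier + bprice * price.pier + bcatch * catch.pier),
  nlf(muboat ~ bboat + bprice * price.boat + bcatch * catch.boat),
  nlf(mucharter ~ bcharter + bprice * price.charter + bcatch * catch.charter),
  bpier + bboat + bcharter + bcatch ~ 1,
  bprice ~ 1 + (1 | consumer),  # group-level intercepts on a non-linear parameter
  family = categorical(refcat = NA)
)
```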

@dhalpern

dhalpern commented Jul 3, 2019

I tried turning the above model into Stan code and it seems that the priors don't get included for some reason. Is this a bug? Or is it something weird about categorical nonlinear parameters?

Here is the code I ran:

bform <- bf(
    mode ~ 1,
    nlf(mubeach ~ bprice * price.beach + bcatch * catch.beach),
    nlf(mupier ~ bpier + bprice * price.pier + bcatch * catch.pier),
    nlf(muboat ~ bboat + bprice * price.boat + bcatch * catch.boat),
    nlf(mucharter ~ bcharter + bprice * price.charter + bcatch * catch.charter),
    bpier + bboat + bcharter + bprice + bcatch ~ 1,
    family = categorical(refcat = NA)
)

nlpars <- c("bpier", "bboat", "bcharter", "bprice", "bcatch")
bprior <-  set_prior("normal(0, 5)", nlpar = nlpars)

make_stancode(formula = bform, data = Fishing, prior = bprior)

And here is the resulting Stan code:

// generated with brms 2.9.0
functions {
}
data {
  int<lower=1> N;  // number of observations
  int<lower=2> ncat;  // number of categories
  int Y[N];  // response variable
  int<lower=1> K_bpier;  // number of population-level effects
  matrix[N, K_bpier] X_bpier;  // population-level design matrix
  int<lower=1> K_bboat;  // number of population-level effects
  matrix[N, K_bboat] X_bboat;  // population-level design matrix
  int<lower=1> K_bcharter;  // number of population-level effects
  matrix[N, K_bcharter] X_bcharter;  // population-level design matrix
  int<lower=1> K_bprice;  // number of population-level effects
  matrix[N, K_bprice] X_bprice;  // population-level design matrix
  int<lower=1> K_bcatch;  // number of population-level effects
  matrix[N, K_bcatch] X_bcatch;  // population-level design matrix
  // covariate vectors
  vector[N] C_mubeach_1;
  vector[N] C_mubeach_2;
  // covariate vectors
  vector[N] C_mupier_1;
  vector[N] C_mupier_2;
  // covariate vectors
  vector[N] C_muboat_1;
  vector[N] C_muboat_2;
  // covariate vectors
  vector[N] C_mucharter_1;
  vector[N] C_mucharter_2;
  int prior_only;  // should the likelihood be ignored?
}
transformed data {
}
parameters {
  vector[K_bpier] b_bpier;  // population-level effects
  vector[K_bboat] b_bboat;  // population-level effects
  vector[K_bcharter] b_bcharter;  // population-level effects
  vector[K_bprice] b_bprice;  // population-level effects
  vector[K_bcatch] b_bcatch;  // population-level effects
}
transformed parameters {
}
model {
  vector[N] nlp_bpier = X_bpier * b_bpier;
  vector[N] nlp_bboat = X_bboat * b_bboat;
  vector[N] nlp_bcharter = X_bcharter * b_bcharter;
  vector[N] nlp_bprice = X_bprice * b_bprice;
  vector[N] nlp_bcatch = X_bcatch * b_bcatch;
  vector[N] mubeach;
  vector[N] mupier;
  vector[N] muboat;
  vector[N] mucharter;
  // linear predictor matrix
  vector[ncat] mu[N];
  for (n in 1:N) {
    // compute non-linear predictor
    mubeach[n] = nlp_bprice[n] * C_mubeach_1[n] + nlp_bcatch[n] * C_mubeach_2[n];
    // compute non-linear predictor
    mupier[n] = nlp_bpier[n] + nlp_bprice[n] * C_mupier_1[n] + nlp_bcatch[n] * C_mupier_2[n];
    // compute non-linear predictor
    muboat[n] = nlp_bboat[n] + nlp_bprice[n] * C_muboat_1[n] + nlp_bcatch[n] * C_muboat_2[n];
    // compute non-linear predictor
    mucharter[n] = nlp_bcharter[n] + nlp_bprice[n] * C_mucharter_1[n] + nlp_bcatch[n] * C_mucharter_2[n];
    mu[n] = [mubeach[n], mupier[n], muboat[n], mucharter[n]]';
  }
  // priors including all constants
  // likelihood including all constants
  if (!prior_only) {
    for (n in 1:N) {
      target += categorical_logit_lpmf(Y[n] | mu[n]);
    }
  }
}
generated quantities {
}

@paul-buerkner
Owner

Thanks! This was indeed a bug which should now be fixed in the github version of brms.

@franzsf

franzsf commented Aug 21, 2019

On this particular issue -- is it currently possible for the brms models (fit per the verbose syntax examples above) to predict responses when new options are included, assuming we have values for the new option's characteristics?

Similar to the end of this vignette predicting market shares with a new technology in the heating dataset; in the fishing context it'd be a new mode like "helicopter", with a certain price and catch ratio.

@paul-buerkner
Owner

paul-buerkner commented Aug 21, 2019 via email

@franzsf

franzsf commented Aug 21, 2019

In the case of the heating vignette, the new option has a known installation cost (ic) and operating cost (oc); the coefficients on those predictors have been estimated through the model, so we do know something about the new option, right? That allows predictions of how market shares might change if the new option were added to the mix.

@paul-buerkner
Owner

paul-buerkner commented Aug 21, 2019 via email

@mariapartelova

I'd like to kindly ask for your help. Following on from the example from Jon Mellon, I have a similar one, but with panel data. This means each consumer could visit the store several times over time, and thus faced {1,..,n} choice occasions. On each store visit (= choice occasion), they chose one alternative; the choice set didn't vary in my case. Please find below an example of the dataset:

consumer  brand              choice occasion  choice  nutrition  price  advertising
1         kellogs            1                1       2          5      0.9
1         general mills      1                0       4          5      0.3
1         walmart own brand  1                0       1          1      0
1         kellogs            2                0       2          5      0.9
1         general mills      2                0       4          5      0.3
1         walmart own brand  2                1       1          1      0
2         kellogs            1                0       2          5      0.9
2         general mills      1                1       4          5      0.3
2         walmart own brand  1                0       1          1      0

For instance, consumer 1 went to the store twice and she/he chose kellogs on the first visit and walmart own brand on the second visit. Consumer 2 went to the store once and she/he chose general mills.

Would it be possible to estimate random intercepts for brands, but also random intercepts for individual consumers, in this case with brms? Is there any example or description of how to structure the dataset that feeds into this model? (For instance, ChoiceModelR needs very specifically formatted input - the data needs to be in long format, alternatives need to be integers, the choice needs to be stored on the first row of each choice occasion, etc. Unfortunately, I couldn't find any similar description/dataset example for brms.)

Thank you very very much for any advice. I have read the vignettes and tried to search for this issue extensively and I'm really sorry if I'm overlooking something.

@paul-buerkner
Owner

Hi, please ask brms related questions on Stan discourse (https://discourse.mc-stan.org/) using the Interfaces - brms tag.

@SravyaKamalapuram

Hi,
I am new to brms and am trying to solve a mode choice problem - categorical logit with panel data. The code above with "nlf" works well to get the population level effects. But I was just wondering if I could get the effects at person level. For example, in the dataset above a consumer visits a store multiple times. Can we edit the "nlf" formulae to add the effect at the consumer level?

@paul-buerkner
Owner

I suggest you ask brms related questions on https://discourse.mc-stan.org where there is a greater audience to both read and answer your question.

@paul-buerkner
Owner

I will close this issue to reduce the load of the brms issue tracker, as I am unlikely to work on this myself. If someone wants to work on this feature, please write here and I am happy to reopen it.
