predict.bife Error in contrasts #9

Open
IsadoraBM opened this issue Aug 8, 2021 · 5 comments

@IsadoraBM

Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : 
  contrasts can be applied only to factors with 2 or more levels

predict.bife with X_new fails when I pass a new data frame that holds every variable except the independent variable of interest constant at its mean or mode. It works on an iris example but not once I move to my own data.

testbife <- iris %>%
  mutate(`Long sepal` = if_else(Sepal.Length > 5, "Yes", "No") %>% as_factor(.)) %>%
  bife(`Long sepal` ~ Sepal.Width + Petal.Length + Petal.Width | Species, data = ., "logit")
predictBife <- bife:::predict.bife(testbife, type = "response") 
predictBife %<>% as.list() %>% stack()
irisdf_bife <- new_data(testbife, "Petal.Length[all]")
predictBife <- bife:::predict.bife(testbife, type = "response",
                                   X_new = irisdf_bife)

This works, but the following grid, built from my own dataset:

agedf <- new_data(model = LogitFE, terms= c("Age[all]"))

when passed to:

predictBife <- bife:::predict.bife(LogitFE, 
                                   type = "response",
                                   X_new = agedf)

produces the contrasts error.
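
For context, the error message itself comes from base R's model-matrix construction and is not specific to bife. The sketch below is a minimal base-R illustration, using a made-up data frame `d` (not the data from this issue), of how a factor with a single level triggers it and how to list the factor columns that have fewer than two levels.

```r
# Hypothetical toy data: the factor f has only one level.
d <- data.frame(y = c(0, 1, 0), f = factor(c("a", "a", "a")))

# Building a design matrix from a one-level factor raises the contrasts error.
msg <- tryCatch(
  {
    model.matrix(y ~ f, data = d)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Diagnostic: count the levels of every factor column and flag those with fewer than two.
n_levels <- vapply(Filter(is.factor, d), nlevels, integer(1))
too_few <- names(n_levels)[n_levels < 2]
print(too_few)
```

Running the same diagnostic on the data frame passed as X_new may help locate the column responsible.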

@dczarnowske

Hello,

Without a minimal example that reproduces the error, we are not able to help.

@IsadoraBM
Author

Here is a subset of my data, bife_df.zip (it had to be zipped to upload), and a reduced version of my model. It shows that I cannot reproduce the iris example on my own data: I cannot obtain predicted probabilities for an X_new that holds all non-focal variables constant at their means or modes, as is usual when graphing predicted probabilities.

zdfmini <- read_rds("bife_df.R")
logitAge <-bife(`Tradeoff refused` ~ Age+ Education +
                              Gender+Media+ Class  +  Religious+ 
                              Year| Country, data = zdfmini, "logit")
predictBife <- bife:::predict.bife(logitAge, type = "response") 
agedf <- ggeffects::new_data(model = logitAge, 
                             terms= c("Age[all]", "Tradeoff refused",
                                      "Year", "Country"))
agedf$`Tradeoff refused` <- if_else(agedf$`Tradeoff refused`==0,
                                    "Made tradeoff", "Tradeoff refused") %>%
  fct_relevel("Made tradeoff", "Tradeoff refused")

str(agedf)

# Returns NULL:
predictBifeage <- predict(logitAge,newdata = agedf, type = "response")
# Contrasts error:
predictBifeage <- bife:::predict.bife(logitAge, 
                                   type = "response",
                                   X_new = agedf)
# Refitting on a data frame that varies only along Age and country-year also fails,
# but that is what "predictions at different values of the focal term(s)" requires;
# ggeffects::ggpredict returns exactly that and lets me graph it.
# Fitting on a dataset that forces non-focal terms to their means fails, which is
# probably what drives the predict.bife() error.
logitAge2 <-bife(`Tradeoff refused` ~ Age+ Education +
                  Gender+Media+ 
                  Class  +  Religious+ 
                  Year| Country, data = agedf, "logit")

@IsadoraBM
Author

IsadoraBM commented Aug 13, 2021

I was told this was not minimal, so here is another, with region-fixed effects.

library(ggeffects)
library(bife)
library(tidyverse) # provides %>%, mutate(), if_else() and the gss_cat data (forcats)
data("gss_cat")
gss_cat2 <- gss_cat %>% mutate(Nonwhite = if_else(race == "White", 0, 1), 
                               Region = rep(state.division, length = nrow(gss_cat)))
testbife <- bife(Nonwhite ~ denom+ tvhours+ partyid+ rincome+ age+ year|Region, data = gss_cat2, "logit")
agedfbife <- new_data(testbife, terms = "age")
dim(agedfbife) #9x7
bife:::predict.bife(testbife, type = "response") #works 
bife:::predict.bife(testbife, type = "response", X_new = agedfbife) #doesn't work

@dczarnowske

dczarnowske commented Aug 13, 2021

From my point of view this is not a bife issue. The function new_data drops all the unused levels of the factor variables. Since bife estimated coefficients for the other levels as well, this causes the error.

Or to say it differently, the regressor matrix used by bife is different from the regressor matrix used by predict, and that does not work.

library(bife)
library(ggeffects)
library(tidyverse)
data("gss_cat")
gss_cat2 <- gss_cat %>% mutate(Nonwhite = if_else(race == "White", 0, 1), 
                               Region = rep(state.division, length = nrow(gss_cat)))
testbife <- bife(Nonwhite ~ denom+ tvhours+ partyid+ rincome+ age+ year | Region, data = gss_cat2, "logit")
agedfbife <- new_data(testbife, terms = "age")

# Example; new_data drops unused levels
levels(gss_cat2$denom)
levels(agedfbife$denom) 

# Re-assign unused levels
levels(agedfbife$denom) <- levels(gss_cat2$denom)
levels(agedfbife$partyid) <- levels(gss_cat2$partyid)
levels(agedfbife$rincome) <- levels(gss_cat2$rincome)

# Works now
predict(testbife, type = "response", X_new = agedfbife)
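
One caveat worth checking with the `levels(x) <- ...` form: in base R it renames levels by position, so if the grid's factor holds only the level it actually uses, that value can be silently relabelled to the first level of the full set. The sketch below is a base-R illustration on made-up data (`train`, `grid`, and the helper `restore_levels` are hypothetical names, not part of bife or ggeffects) that rebuilds each factor with `factor(..., levels = ...)` so the values stay put. It assumes unordered factors.

```r
# Hypothetical toy data standing in for the training data and the prediction grid.
train <- data.frame(denom = factor(c("baptist", "catholic", "other", "catholic")))
grid <- data.frame(denom = factor("catholic")) # only one level survives

# Positional assignment renames the existing level: "catholic" becomes "baptist".
relabelled <- grid$denom
levels(relabelled) <- levels(train$denom)

# Rebuilding the factor keeps the value and restores the full level set.
restored <- factor(as.character(grid$denom), levels = levels(train$denom))

# Hypothetical helper: apply the same rebuild to every factor column
# that the new data shares with the reference (training) data.
restore_levels <- function(new, reference) {
  for (nm in intersect(names(new), names(reference))) {
    if (is.factor(reference[[nm]])) {
      new[[nm]] <- factor(as.character(new[[nm]]),
                          levels = levels(reference[[nm]]))
    }
  }
  new
}

grid_fixed <- restore_levels(grid, train)
print(as.character(relabelled))
print(as.character(grid_fixed$denom))
```

Applied to the example above, this would amount to something like `agedfbife <- restore_levels(agedfbife, gss_cat2)` before calling predict, though that has not been run against bife here.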

@IsadoraBM
Author

IsadoraBM commented Aug 14, 2021

Yes, this does work on gss_cat. Now I have to figure out why it still produces contrasts errors for my data. Levels dropped by new_data are not a problem for base glm, and building the data frame by hand would have produced the same issue, since the usual manual method is to set the mode level (newdf$VarX <- "mode") without reassigning all the other unused levels. I'll let them know of the issue and your solution, though. Thanks for workshopping it.
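
A likely reason base glm tolerates the dropped levels: predict for lm/glm objects rebuilds the model frame using the factor levels stored in the fitted object (`fit$xlevels`), so new data only has to contain values that belong to the original level set. The sketch below shows this on simulated data; the data frame and variable names are made up for illustration.

```r
set.seed(1)

# Hypothetical training data with a three-level factor g.
train <- data.frame(
  y = rbinom(60, 1, 0.5),
  x = rnorm(60),
  g = factor(rep(c("a", "b", "c"), 20))
)
fit <- glm(y ~ x + g, data = train, family = binomial())

# The fitted object remembers the full level set of every factor regressor.
print(fit$xlevels)

# New data whose factor carries only the single level it uses.
new <- data.frame(x = c(-1, 0, 1), g = factor("b"))
print(nlevels(new$g))

# predict() re-expands g to the stored levels, so no contrasts error occurs.
p <- predict(fit, newdata = new, type = "response")
print(p)
```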
