Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scoping problem with "combined_risk" in nc_mboostLSS #49

Closed
Almond-S opened this issue Jan 31, 2018 · 9 comments
Closed

Scoping problem with "combined_risk" in nc_mboostLSS #49

Almond-S opened this issue Jan 31, 2018 · 9 comments
Assignees
Labels

Comments

@Almond-S
Copy link

attr(model1, "combined_risk") seems to always relate to the last fitted model instead of model1.
And perhaps related to this issue, the combined_risk cannot be returned when fitting multiple models with mclapply:

library(gamboostLSS)
###~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~###
### Scoping problem with "combined_risk" in noncyclical mboostLSS 
### Particularly, prohibiting the access to "combined_risk" after fitting multiple models via mclapply 
###~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~###

### Use first example of ?mboostLSS but with mclapply over the methods cyclic and noncyclic:

# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
### Data generating process:
set.seed(1907)
x1 <- rnorm(1000)
x2 <- rnorm(1000)
x3 <- rnorm(1000)
x4 <- rnorm(1000)
x5 <- rnorm(1000)
x6 <- rnorm(1000)
mu    <- exp(1.5 +1 * x1 +0.5 * x2 -0.5 * x3 -1 * x4)
sigma <- exp(-0.4 * x3 -0.2 * x4 +0.2 * x5 +0.4 * x6)
y <- numeric(1000)
for( i in 1:1000)
 y[i] <- rnbinom(1, size = sigma[i], mu = mu[i])
dat <- data.frame(x1, x2, x3, x4, x5, x6, y)
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


### linear models with y ~ . for both components: 20 boosting iterations fitted via mclapply
model <- mclapply(c("cyclic", "noncyclic"), function(method) glmboostLSS(y ~ ., families = NBinomialLSS(), data = dat, 
                                                                      control = boost_control(mstop = 20),method = method, center = TRUE))
# only consider the noncyclic model[[2]] (as the cyclic does not have "combined_risk") 
model_nc <- model[[2]]

# Cannot access combined_risk
try(attr(model_nc, "combined_risk")())

### If there was another model previously fitted, the combined_risk of this other model is returned instead:
# e.g. model instead x1 as response and mstop = 30  
model_0 <- mboostLSS(x1 ~ ., families = GaussianLSS(), data = dat, 
                    control = boost_control(mstop = 30),method = "noncyclic")

try(length(attr(model_nc, "combined_risk")()))
# => combined_risk of model_0 (of length 30+2) is returned instead of the one from model_nc (length 20+2)


### And if we now fit model_nc seperately, the combined_risk of model_0 is replaced !!! 
model_1 <- glmboostLSS(y ~ ., families = NBinomialLSS(), data = dat, 
                      control = boost_control(mstop = 20),method = "noncyclic", center = TRUE)

length(attr(model_0, "combined_risk")())
# => now, length is 22 !
@hofnerb
Copy link
Member

hofnerb commented Apr 16, 2018

@ja-thomas can you please look at this?

@ja-thomas
Copy link
Member

Problem is here:

combined_risk <<- combined_risk

I'm not sure why I had to write this (horrible) line... I need to check

@ja-thomas
Copy link
Member

Hmmmm, I played around with this some more. I kind of remember why I did it. Since assignment for the parent.frame was not working correctly and gamboostLSS is writing to the .GlobaleEnv anyways.

This is pretty bad design from my part, but I didn't find a solution at the time and unfortunately I'm still not sure how to find the correct environment to write the combined_risk to. This seems to have to do with the fact that we assign functions (with environment) to attributes of objects.

tldr.: This sucks, but I have no idea how to fix it :(

@hofnerb
Copy link
Member

hofnerb commented Jun 22, 2018

Thanks @ja-thomas for digging into this.

Without having had another look I think the issue is that some functions are defined outside of the relevant environment and hence do not share the same parent environments. Thus, the combined_risk is written to the wrong place.

Of note, I do not think that gamboostLSS writes to the global environment in other places. We define global variables but only ro get rid of NOTES in R CMD check. Actually, these variables should be defined at the time they are used inside another environment than the global environment.

@Almond-S I am not sure if we should really close this issue. It still isn't solved. Of course it has not the highest priority but it should stay on the list of open issues. Perhaps I'll have some spare time on a train ride can look into this (or someone else has a good idea how to fix this).

@hofnerb hofnerb reopened this Jun 22, 2018
@hofnerb
Copy link
Member

hofnerb commented Jun 22, 2018

With @ja-thomas hints that was easy to fix. combined_risk simply needed to be defined within mboostLSS_fit. I'll add some checks to see if it really works and close the issue afterwards.

@Almond-S
Copy link
Author

Thank you both so much!

I had all my simulation studies showing apparently corrupt crossvalidation results, which did not fit to the results with prior gamboostLSS versions, and think that this should have been the problem. And this was kind of the last issue prohibiting me from finishing a project on FDboostLSS for functional responses.

So many thanks again!!

@ja-thomas
Copy link
Member

Thanks @hofnerb you're a genius :)

@hofnerb
Copy link
Member

hofnerb commented Jun 22, 2018

You are more than welcome. Thanks for locating the reason in the first place!

@Almond-S Is it currently sufficient for you to have the bug fixed on github or should we make a quick release to CRAN as well?

@Almond-S
Copy link
Author

I think, having it fixed on gitHub should be completely fine. But if it won't, for whatever reason, I'd get back to you. Thanks again!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants