Nan #63
TODO for the cv function: please do NOT re-shuffle folds if there is one that has either all left-censored outputs or all right-censored outputs. In that case, please just stop with an informative error that tells the user they need to manually specify the foldid argument. Here is some similar code in one of my other packages: https://github.com/tdhock/penaltyLearning/blob/master/R/IntervalRegression.R#L189
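The check requested above could be sketched roughly as follows. This is only an illustration, not iregnet's code: `check_folds` is a hypothetical helper, and it assumes the outputs are a two-column matrix of (lower, upper) limits in which a lower limit of `-Inf` or `NA` marks a left-censored row and an upper limit of `Inf` or `NA` marks a right-censored row.

```r
# Hypothetical helper, not part of iregnet.
# y: two-column numeric matrix of (lower, upper) limits.
# Assumption: lower limit -Inf/NA = left censored,
#             upper limit  Inf/NA = right censored.
check_folds <- function(y, foldid) {
  stopifnot(is.matrix(y), ncol(y) == 2, length(foldid) == nrow(y))
  is_left  <- is.na(y[, 1]) | y[, 1] == -Inf
  is_right <- is.na(y[, 2]) | y[, 2] == Inf
  # One logical per fold: TRUE if every row in that fold is left (right) censored.
  all_left  <- as.vector(tapply(is_left,  foldid, all))
  all_right <- as.vector(tapply(is_right, foldid, all))
  fold_names <- sort(unique(foldid))
  bad <- fold_names[all_left | all_right]
  if (length(bad) > 0) {
    # Stop instead of re-shuffling, and tell the user what to do.
    stop("fold(s) ", paste(bad, collapse = ", "),
         " contain only left censored or only right censored outputs; ",
         "please specify the foldid argument manually")
  }
  invisible(TRUE)
}
```

The same test could be applied to the training portion of each split rather than to the held-out fold, depending on which set the model actually needs to be fit on.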
Also, @anujkhare, can you please give @theadityasam push access to your repo so he can create future branches/PRs in your repo? (I think it would make it simpler to keep track of the project if everything is in your repo.)
Okay, won't re-shuffle the folds. Also, for the censored data error message: if I implement the check in the R code itself, I'll have to go through the dataset twice, once to check whether the model can be fit and then again while assigning the censorship types in get_censoring_types of iregnet_fit.cpp, which might affect the run time of iregnet. I believe it would be better to write a new piece of R code that checks for the censorship condition and assigns the censorship type, similar to what is done in
Checking in R code will not result in any significant slowdown if you use vector operations.
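As a rough illustration of what a vectorized pass might look like, the sketch below assigns a censoring type to every row using only whole-column comparisons, with no explicit loop. The function name `censoring_types`, the type labels, and the `Inf`/`NA` encoding of missing limits are assumptions made for this example, not iregnet's actual interface.

```r
# Hypothetical helper, not part of iregnet.
# y: two-column numeric matrix of (lower, upper) limits.
# Assumption: -Inf/NA lower limit = left censored,
#              Inf/NA upper limit = right censored,
#              equal finite limits = uncensored ("none").
censoring_types <- function(y) {
  stopifnot(is.matrix(y), ncol(y) == 2)
  lower_missing <- is.na(y[, 1]) | y[, 1] == -Inf
  upper_missing <- is.na(y[, 2]) | y[, 2] == Inf
  types <- rep("interval", nrow(y))
  # FALSE & NA is FALSE, so rows with a missing limit never match here.
  types[!lower_missing & !upper_missing & y[, 1] == y[, 2]] <- "none"
  types[lower_missing] <- "left"
  types[upper_missing] <- "right"  # rows with both limits missing end up "right"
  types
}

y <- cbind(c(-Inf, 1, 2, 0), c(3, Inf, 2, 5))
table(censoring_types(y))
```

Each comparison touches the column once, so the extra cost in R is a handful of linear passes over the data.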
Okay, will write a new R snippet for the check.
Try debugging using print statements or gdb: https://tdhock.github.io/blog/2019/gdb/
@tdhock @theadityasam investigated the cases where NaNs are still produced. I think that they're cases where the unregularized solution does not exist. survreg also produces The ideal behavior is that we should fit till the lambda value that is well-behaved, throw a warning for the next lambda that produces a NaN, and then return an @theadityasam - anything to add? We should make the corresponding changes for this.
There is a reference to this in section 2.3 of this paper about glmnet. For the cases where the number of covariates (p) > number of samples (n), the unregularized solution is undefined ( They ignore solutions for
That is interesting. Do you have any concrete p>n examples that can be coded as tests? It makes sense to me to only consider large lambda values in that case, and still return an iregnet object, with a warning.
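The behavior discussed here (keep the well-behaved part of the path, warn at the first lambda that yields a NaN, and still return a result) might look roughly like the sketch below. It is not iregnet's implementation: `truncate_path` is a hypothetical helper, and it assumes the coefficients are stored as a matrix with one column per lambda, ordered from the largest lambda to the smallest.

```r
# Hypothetical helper, not part of iregnet.
# beta:   (p x n_lambda) coefficient matrix, one column per lambda.
# lambda: numeric vector, assumed sorted from largest to smallest.
truncate_path <- function(beta, lambda) {
  stopifnot(is.matrix(beta), ncol(beta) == length(lambda))
  # Columns containing any NaN, NA, or infinite coefficient.
  bad <- which(colSums(!is.finite(beta)) > 0)
  if (length(bad) == 0) {
    return(list(beta = beta, lambda = lambda))
  }
  first_bad <- bad[1]
  warning("NaN produced at lambda = ", lambda[first_bad],
          "; returning the ", first_bad - 1,
          " solution(s) for larger lambda values")
  keep <- seq_len(first_bad - 1)
  list(beta = beta[, keep, drop = FALSE], lambda = lambda[keep])
}
```

Everything after the first bad column is dropped as well, on the assumption that later solutions are warm-started from earlier ones and so cannot be trusted once one of them has failed.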
Closing since this has been merged into the cv branch: #54.
The NaN issue will be fixed and the cv function will be made functional.