Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NAN Error #62

Closed
theadityasam opened this issue May 27, 2019 · 6 comments
Closed

NAN Error #62

theadityasam opened this issue May 27, 2019 · 6 comments

Comments

@theadityasam
Copy link
Collaborator

So currently I believe the NAN issue occurs when the data is either completely right censored or left truncated. I'll be introducing an error message in the get_censoring_types function of iregnet_fit.cpp whenever this condition is met and then will be testing it in various cases to see if the NAN error still occurs. If it does then I need to discuss if we have missed any other cause for the error.

@theadityasam
Copy link
Collaborator Author

Hii @anujkhare @tdhock
I have written the error message and it's working great for the main iregnet function.
For cv, as I had previously explained, in case we end up with a fold which produces the NAN error, we need to shuffle the folds again. Today i tried out many methods like counting the total number of left, right and interval censored and then splitting it into the folds accordingly but this will seriously cost the runtime of the function.
Can we plan for this over a call whenever you are free?

@tdhock
Copy link
Collaborator

tdhock commented May 29, 2019

hi @theadityasam it would be great to have discussions like this one in the context of a PR where you can make references to specific commits / lines of code

anyway that is great that you got an error message working... can you please add a unit test using expect_error?

I think it would be better to stop with an error in R code before calling the C++ routine.

same thing for the cv function ... please do NOT re-shuffle folds if there is one that has either all left censored outputs or all right censored outputs. in that case please just stop with an informative error that tells the user they need to manually specify the foldid argument. here is some similar code in one of my other packages, https://github.com/tdhock/penaltyLearning/blob/master/R/IntervalRegression.R#L189

@theadityasam
Copy link
Collaborator Author

theadityasam commented May 29, 2019

Hiii Toby,
I created a new branch called nan wherein first I'll fix the nan error and test it for various dataset
https://github.com/theadityasam/iregnet/tree/nan
I had made a pretty simple change to fix the issue, this is temporary and will make the change to the R code rather than the C++ code as you said - Link.
I'll include the unit test soon.
As for cv function, I'll include the error message as you said, but is there any problem in reshuffling to try and fit the fold?

@tdhock
Copy link
Collaborator

tdhock commented May 29, 2019

great to see that you are comitting to the nan branch -- please open a PR for that and let's move the discussion there.

yes I think there is a problem in reshuffling in the cv function. to me if the first random fold assignment does not work then there may be a problem with the data, and we should inform the user. do you agree @anujkhare ?

@theadityasam
Copy link
Collaborator Author

@tdhock @anujkhare
I've create a pull request #63
We can move the conversation over there

@tdhock
Copy link
Collaborator

tdhock commented May 29, 2019

great

@tdhock tdhock closed this as completed May 29, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants