DMatrix is now an AbstractMatrix #136
I'm admittedly still somewhat confused here, so confirmation of what I'm about to say would be appreciated. Following the discussion here, it seems that our existing sparse matrix Furthermore, I've had to "hack" the
Of these I favor 1 as it seems like by far the simplest approach and does not compromise the utility of the
There were discussions on XGBoost's side about restoring 0s from sparse matrix input, but we weren't sure whether an ML library needs to perform this kind of data manipulation (which is not trivial in the existing code base). The argument for it seems to be getting stronger.
I can't really say how necessary it is, in most cases whatever data was used to construct the My goal for this PR was to ensure that the I'll clean up docs and stuff and then this should be ready to merge, thanks for the clarification @trivialfis.
This PR should now be complete. I'll wait approximately 24 hours to merge it to give people a chance to object or comment.
This is added by virtue of the new `XGDMatrixGetDataCSR` method in libxgboost 1.7, which is now required. Unit tests for `DMatrix` constructors have been vastly improved: they now check that the matrices were actually constructed correctly rather than merely having the same shape.

The behavior I'm seeing in the presence of null values doesn't make sense to me, so this issue will have to be resolved before this gets merged.
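
For readers unfamiliar with the CSR layout, here is a minimal, language-agnostic sketch in plain Python of why the choice between "unstored entry means missing" and "unstored entry means 0" matters, and why a value-level test catches what a shape-only test cannot. The helper names `csr_to_dense` and `same_values` are hypothetical and are not part of XGBoost or XGBoost.jl; the sketch only assumes the standard CSR triple of row pointers, column indices, and stored values.

```python
import math


def csr_to_dense(indptr, indices, data, n_cols, fill=math.nan):
    """Expand CSR arrays into a list of rows, using `fill` for unstored entries.

    Hypothetical helper for illustration only, not an XGBoost API.
    """
    rows = []
    for r in range(len(indptr) - 1):
        row = [fill] * n_cols
        # Stored entries of row r occupy the half-open range indptr[r]:indptr[r + 1].
        for k in range(indptr[r], indptr[r + 1]):
            row[indices[k]] = data[k]
        rows.append(row)
    return rows


def same_values(a, b):
    """Elementwise equality that treats NaN as equal to NaN."""
    if len(a) != len(b):
        return False
    for row_a, row_b in zip(a, b):
        if len(row_a) != len(row_b):
            return False
        for x, y in zip(row_a, row_b):
            both_nan = math.isnan(x) and math.isnan(y)
            if not (both_nan or x == y):
                return False
    return True


# A 2x3 matrix with three stored entries.
indptr, indices, data = [0, 2, 3], [0, 2, 1], [1.0, 2.0, 3.0]

# Interpretation A: unstored entries are missing (NaN).
as_missing = csr_to_dense(indptr, indices, data, n_cols=3)
# Interpretation B: unstored entries are zeros.
as_zeros = csr_to_dense(indptr, indices, data, n_cols=3, fill=0.0)
```

Both results have the same shape, so a shape-only test would accept either one; comparing values with `same_values(as_missing, as_zeros)` distinguishes them.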