More robust DMatrix creation from a sparse matrix #1606

Merged
merged 5 commits into from
Sep 25, 2016

khotilov
Member

Addresses #1583

The new "explicit" XGDMatrixCreateFromCS*Ex functions are pretty much duplicates of the old ones, with an extra argument, a slight code change, and slightly safer types. The old ones are marked as \deprecated in their docs.
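To illustrate why an extra argument can help, here is a minimal Python sketch. It assumes the extra argument is an explicit column count (the later comment in this thread about num_col suggests so), and that a value of 0 means "infer from the data". The helper name `csr_num_col` is hypothetical and is not part of the xgboost API.

```python
def csr_num_col(indices, num_col=0):
    """Return the column count for a CSR matrix (hypothetical helper).

    indices: column index of every stored (non-zero) entry.
    num_col: explicit column count; 0 is assumed to mean "infer from data".
    """
    # Without extra information, the width can only be guessed
    # from the largest column index that actually appears.
    inferred = max(indices) + 1 if indices else 0
    if num_col == 0:
        return inferred
    if num_col < inferred:
        raise ValueError(
            "num_col=%d but the data uses column %d" % (num_col, inferred - 1)
        )
    return num_col


# A 2 x 4 matrix whose last column is entirely zero:
#   [[1, 0, 2, 0],
#    [0, 3, 0, 0]]
indptr = [0, 2, 3]
indices = [0, 2, 1]
data = [1.0, 2.0, 3.0]

inferred_cols = csr_num_col(indices)     # 3: the trailing zero column is lost
explicit_cols = csr_num_col(indices, 4)  # 4: the caller states the true width
```

The CSR arrays alone carry no record of trailing all-zero columns, so only the caller can supply the true width.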

I've updated the Python interface, but I need to think a bit more about R (I would like to make it handle 64-bit indexing properly). I didn't touch the JVM package.

@tqchen
Member

tqchen commented Sep 23, 2016

https://travis-ci.org/dmlc/xgboost/builds/162198885 See the lint and lightweight test errors. Note that the lightweight test runs without sklearn or other external dependencies.

@tqchen
Member

tqchen commented Sep 24, 2016

There is still another lint problem in c_api.cc. One final step to go!

@khotilov
Member Author

I guess I just have to install python2 to make the linter work locally for me.

@CodingCat
Member

I will push the JVM changes after this is merged.

@khotilov
Member Author

I've just updated the R package to use the new interface. Long vector support would need to be a separate issue.

@tqchen
Member

tqchen commented Sep 25, 2016

Thanks, some final comments

  • Redirect the old functions to the new function, so implementation won't be duplicated
  • Fix the indentation in the function parameters
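The first point above can be sketched as follows. This is a Python illustration of the delegation pattern only, not the actual C API code: both function names are hypothetical stand-ins, and the convention that `num_col=0` means "infer the width" is an assumption.

```python
import warnings


def create_from_csr_ex(indptr, indices, data, num_col):
    """Hypothetical 'explicit' constructor; returns (num_row, num_col)."""
    num_row = max(len(indptr) - 1, 0)
    inferred = max(indices) + 1 if indices else 0
    if num_col == 0:
        # Assumed convention: 0 means "infer the width from the data".
        num_col = inferred
    if num_col < inferred:
        raise ValueError("num_col is smaller than the largest column index")
    return (num_row, num_col)


def create_from_csr(indptr, indices, data):
    """Deprecated entry point: holds no logic of its own, only delegates."""
    warnings.warn(
        "create_from_csr is deprecated; use create_from_csr_ex",
        DeprecationWarning,
        stacklevel=2,
    )
    return create_from_csr_ex(indptr, indices, data, num_col=0)
```

With this layout, a fix in the new function automatically applies to callers of the old one.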

@khotilov
Member Author

Is that better?
The current code uses several different function parameter indentation styles, though.

@tqchen
Member

tqchen commented Sep 25, 2016

Thanks! This is merged.

@chleibig

Thank you very much for this PR! I am using the Python bindings and noticed that creating a DMatrix from sparse matrices works correctly thanks to this PR. However, if the DMatrix is created from a sparse matrix file (tested with the libsvm text format, which is supported according to the API docs), num_col is not set correctly when the last feature column is all zeros, even though feature_names are provided and could be used to set it correctly. In particular, it seems that the XGDMatrixCreateFromFile function needs an update similar to the changes introduced here for the XGDMatrixCreateFromCS*Ex functions.
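The file-based case described above can be reproduced in miniature. The sketch below is not xgboost's parser: `libsvm_num_col` is a hypothetical helper, and zero-based feature indices are assumed. It only shows why a width inferred from the highest index seen falls short when the last column never appears in the file.

```python
def libsvm_num_col(lines):
    """Infer the column count from libsvm-style lines (hypothetical helper).

    Each line looks like 'label idx:val idx:val ...'; indices are assumed
    to be zero-based here.
    """
    max_index = -1
    for line in lines:
        fields = line.split()
        for field in fields[1:]:  # fields[0] is the label
            index = int(field.split(":", 1)[0])
            max_index = max(max_index, index)
    return max_index + 1


# Four features are declared, but feature 3 is zero in every row,
# so it never shows up in the sparse text representation.
rows = ["1 0:1.0 2:2.0", "0 1:3.0"]
feature_names = ["f0", "f1", "f2", "f3"]

inferred = libsvm_num_col(rows)  # 3
expected = len(feature_names)    # 4
```

A loader that accepted an explicit column count, or fell back to `len(feature_names)`, could close this gap.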

@lock lock bot locked as resolved and limited conversation to collaborators Jan 18, 2019
4 participants