Handling samples with undefined feature values? #39

Closed
0x7f opened this issue Feb 15, 2016 · 1 comment

0x7f commented Feb 15, 2016

First, thanks for ranger. Keep up the good work!

My issue: I have trouble using ranger with sparse data, i.e. when some samples lack values for certain continuous variables/features. At the moment I set those values to 0.0, but of course this produces wrong results. From reading the code, handling such samples does not seem to be possible at the moment, right? At first I was confused by the sparse data feature in the Data class, but that turns out to be a feature for the GenABEL library. I mean sparse in the sense of sparse feature matrices, as in e.g. scipy.

Is this feature planned? If not, could you maybe sketch the solution, so I could help out with a patch?


0x7f commented Feb 26, 2016

Never mind, I misunderstood how sklearn handles sparse matrices in the first place. I read the original paper and sklearn's code, and it's all about imputation.
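
For anyone landing here with the same question, here is a minimal sketch of the imputation idea, assuming undefined values are encoded as NaN and are filled with the per-column median before the data is passed to ranger. The helper name `impute_median` and the 0.0 fallback for fully empty columns are hypothetical choices for illustration, not part of ranger or sklearn.

```python
import math
from statistics import median


def impute_median(rows):
    """Return a copy of rows with NaN entries replaced by the column median.

    Hypothetical helper: the median is computed over the observed
    (non-NaN) values of each column only.
    """
    n_cols = len(rows[0])
    fills = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if not math.isnan(r[j])]
        # Assumption: a column with no observed values falls back to 0.0.
        fills.append(median(observed) if observed else 0.0)
    return [
        [fills[j] if math.isnan(v) else v for j, v in enumerate(r)]
        for r in rows
    ]


nan = float("nan")
samples = [
    [1.0, nan, 3.0],
    [2.0, 5.0, nan],
    [4.0, 7.0, 9.0],
]
imputed = impute_median(samples)
```

Unlike hard-coding 0.0, the fill value here comes from the values actually observed for that feature, so it stays within the feature's range. Whether the median is the right statistic depends on the data.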

0x7f closed this as completed Feb 26, 2016