Handling samples with undefined feature values? #39

0x7f · 2016-02-15T18:47:55Z

First, thanks for ranger. Keep up the good work!

My issue: I have trouble using ranger with sparse data, i.e. when some samples have no value at all for certain continuous variables/features. At the moment I set those values to 0.0, but this of course produces wrong results. From checking the code, handling such samples does not seem to be possible at the moment, right? At first I was confused by the sparse data feature in the Data class, but that turns out to be a feature for the GenABEL library. I mean sparse in the sense of sparse feature matrices, e.g. as in scipy.

Is this feature planned? If not, could you maybe sketch the solution, so I could help out with a patch?


0x7f · 2016-02-26T17:51:33Z

Never mind, I misunderstood how sklearn handles sparse matrices in the first place. I read the original paper and sklearn's code, and it's all about imputation.
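For anyone landing here with the same question, here is a minimal sketch of what imputation can look like before the data is handed to a forest. It uses column-median replacement, which is only one simple choice and not necessarily the scheme used by sklearn or described in the paper. The function name `impute_column_median` and the toy matrix `X` are made up for illustration, and undefined values are assumed to be encoded as NaN.

```python
import math
from statistics import median


def impute_column_median(rows):
    """Return a copy of rows with NaN entries replaced by the column median."""
    n_cols = len(rows[0])
    medians = []
    for j in range(n_cols):
        # Only values that are actually defined contribute to the median.
        observed = [r[j] for r in rows if not math.isnan(r[j])]
        # Assumption: fall back to 0.0 if a column has no defined value at all.
        medians.append(median(observed) if observed else 0.0)
    return [
        [medians[j] if math.isnan(v) else v for j, v in enumerate(r)]
        for r in rows
    ]


nan = float("nan")
# Toy data: three samples, three continuous features, NaN marks "undefined".
X = [
    [1.0, nan, 3.0],
    [2.0, 5.0, nan],
    [nan, 7.0, 9.0],
]
X_imputed = impute_column_median(X)
print(X_imputed)
```

Unlike filling with 0.0, the replacement value here comes from the observed distribution of each feature, so an undefined entry is not mistaken for a genuinely small measurement.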

0x7f closed this as completed Feb 26, 2016
