DET has only one leaf prior to pruning every time #515
For some reason, the DET gives me only one leaf node, both before and after pruning, on several datasets I have tried. One of the datasets is attached:
It contains 896 observations of the MNIST digit 5.
Any help towards solving this issue will be greatly appreciated.
Thanks in advance.
Output from terminal:
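Diagnosis: the log negative error of a node t in a DET is defined as
R(t) = log(|t|^2 / (N^2 V_t)),
where |t| is the number of points in the node, N is the total number of points, and V_t is the volume of the node.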
At the first level of this tree, the volume of the node is the entire volume spanned by the data, i.e., V is the product of the widths of all dimensions. But some dimensions have width 0 in this dataset, so V = 0 and R(t) = inf.
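To make the failure concrete, here is a minimal sketch of the formula above in plain Python. It is not mlpack's implementation; the function names and the choice to work in log space are assumptions made for illustration.

```python
import math

def log_volume(widths):
    """Log of the product of per-dimension widths (hypothetical helper)."""
    total = 0.0
    for w in widths:
        if w == 0.0:
            # A single zero-width dimension makes the volume 0, so log V = -inf.
            return -math.inf
        total += math.log(w)
    return total

def log_negative_error(node_count, total_count, widths):
    """R(t) = log(|t|^2 / (N^2 V_t)), evaluated in log space."""
    return (2.0 * math.log(node_count)
            - 2.0 * math.log(total_count)
            - log_volume(widths))

# Root node: all 896 points, every dimension has nonzero width.
finite_error = log_negative_error(896, 896, [1.0, 2.0])
# Same node, but one dimension is constant (width 0).
degenerate_error = log_negative_error(896, 896, [1.0, 0.0])
```

With a zero-width dimension the root's error is already infinite, which is consistent with the tree never splitting past a single leaf.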
I don't yet know how I want to handle this problem for the mlpack code; I need to review the paper and maybe send Pari an email or something depending on what I can come up with.
A quick solution is to add tiny bits of noise to your data points, or to drop any dimensions that have zero range (i.e. where all of the rows have 0 in that dimension).
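As a rough sketch of the second workaround, the snippet below drops every dimension whose range is zero before the data is handed to the DET. It uses plain Python lists of rows; the function name and the row-major layout are assumptions for illustration, not part of mlpack.

```python
def drop_zero_range_dimensions(rows):
    """Return (reduced_rows, kept_indices), dropping constant columns.

    `rows` is a list of equal-length lists, one per observation.
    """
    if not rows:
        return [], []
    n_dims = len(rows[0])
    kept = []
    for j in range(n_dims):
        column = [row[j] for row in rows]
        # Keep the dimension only if its values actually vary.
        if max(column) != min(column):
            kept.append(j)
    reduced = [[row[j] for j in kept] for row in rows]
    return reduced, kept

# Column 0 is all zeros (like a blank border pixel in MNIST).
data = [[0, 1, 5], [0, 2, 5], [0, 3, 6]]
reduced, kept = drop_zero_range_dimensions(data)
```

Returning the kept indices makes it possible to map any splits in the resulting tree back to the original dimensions.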
I'll keep digging and let you know what I think of.
I talked with Pari and we decided that the best idea was just to ignore the zero-variance dimensions in the log negative error calculation. This change has been made in 4e069ab and should fix your issue, so there should be no more need to add noise. Let me know if it doesn't and we can reopen the ticket. Thanks for reporting the issue! :)
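For readers curious about what ignoring zero-variance dimensions might look like, here is a small Python sketch of the idea. It is only an illustration under assumed names and is not the code from the commit mentioned above: dimensions with zero width are skipped when accumulating the log volume, so the error stays finite.

```python
import math

def log_volume_ignoring_flat_dims(widths):
    """Sum log-widths over dimensions with nonzero width only (hypothetical)."""
    return sum(math.log(w) for w in widths if w > 0.0)

def log_negative_error_ignoring_flat_dims(node_count, total_count, widths):
    """R(t) = log(|t|^2 / (N^2 V_t)), with V_t taken over varying dimensions."""
    return (2.0 * math.log(node_count)
            - 2.0 * math.log(total_count)
            - log_volume_ignoring_flat_dims(widths))

# One dimension has width 0; it no longer forces the error to infinity.
error_with_flat_dim = log_negative_error_ignoring_flat_dims(896, 896, [4.0, 0.0])
```

In this sketch a constant dimension contributes nothing to the volume, which is equivalent to having dropped it from the data beforehand.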