When training the k-d tree-based Local Outlier Factor (KDLOF), if the highest-variance column has a heavy tail such that the median is the smallest value in the column, the partition produces one group with no members. This throws the "Neighborhood cannot be empty" exception in the Neighborhood factory method.
The fix is simple: make the partition operation inclusive of the column split value.
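To illustrate the failure mode, here is a minimal Python sketch rather than the library's actual code. The `partition` helper and the sample column are hypothetical, and it assumes the original partition sent only values strictly less than the split to the left group.

```python
from statistics import median


def partition(values, split, inclusive):
    """Split values into (left, right) groups around a split value.

    Hypothetical helper: `inclusive` decides whether values equal to
    the split go to the left group (True) or the right group (False).
    """
    if inclusive:
        left = [v for v in values if v <= split]
        right = [v for v in values if v > split]
    else:
        left = [v for v in values if v < split]
        right = [v for v in values if v >= split]
    return left, right


# Heavy-tailed column: the median equals the smallest value.
column = [0, 0, 0, 0, 0, 5, 9]
split = median(column)

# Strict comparison: nothing is smaller than the minimum, so left is empty.
left_strict, right_strict = partition(column, split, inclusive=False)

# Inclusive comparison: values equal to the split land on the left.
left_incl, right_incl = partition(column, split, inclusive=True)

print(len(left_strict), len(right_strict))
print(len(left_incl), len(right_incl))
```

With the strict comparison the left group has no members, which is the condition that would trigger the empty-neighborhood exception. With the inclusive comparison both groups are populated for this column.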
This turned out to be an interesting fix because it gave me the opportunity to rethink how node splitting is done in our base k-d tree implementation. It turns out we can save a lot of computation and still get fat splits (tight bounds) if we pick the axis as the column with the longest range (instead of the highest variance) and the split value as the midrange (instead of the median).
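The range-and-midrange rule can be sketched as follows. This is an illustrative Python version under stated assumptions, not the implementation from the library: `choose_split` and the sample data are hypothetical names introduced here. Finding the minimum and maximum of a column takes one pass with no sorting, which is where the savings over a variance and median computation would come from.

```python
def choose_split(samples):
    """Return (axis, value) for a node split.

    Hypothetical helper: axis is the column with the widest range and
    value is that column's midrange, (min + max) / 2.
    """
    n_cols = len(samples[0])
    best_axis, best_range, best_value = 0, -1.0, 0.0
    for axis in range(n_cols):
        col = [row[axis] for row in samples]
        lo, hi = min(col), max(col)
        if hi - lo > best_range:
            best_axis = axis
            best_range = hi - lo
            best_value = (lo + hi) / 2.0
    return best_axis, best_value


# Column 0 has a heavy tail: its median is also its minimum.
samples = [
    [0.0, 1.0],
    [0.0, 2.0],
    [0.0, 3.0],
    [0.0, 2.5],
    [10.0, 1.5],
]

axis, value = choose_split(samples)

# Inclusive partition around the midrange.
left = [row for row in samples if row[axis] <= value]
right = [row for row in samples if row[axis] > value]

print(axis, value, len(left), len(right))
```

Whenever the chosen column has a nonzero range, its minimum falls at or below the midrange and its maximum falls above it (setting aside floating-point rounding between nearly equal values), so neither group ends up empty.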