Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Templatize splitting procedure for BinarySpaceTree #228

Closed
rcurtin opened this issue Dec 29, 2014 · 4 comments
Closed

Templatize splitting procedure for BinarySpaceTree #228

rcurtin opened this issue Dec 29, 2014 · 4 comments
Assignees
Milestone

Comments

@rcurtin
Copy link
Member

rcurtin commented Dec 29, 2014

Reported by rcurtin on 6 Aug 42549474 01:43 UTC
Right now the default BinarySpaceTree splitting procedure (to form the two children) is to find the largest dimension and split it in half. That's not the only way to make a binary space tree. So the SplitNode() method should be pulled out and templatized (in some manner).

@rcurtin rcurtin self-assigned this Dec 29, 2014
@rcurtin rcurtin added this to the mlpack 1.1.0 milestone Dec 29, 2014
@rcurtin rcurtin closed this as completed Dec 29, 2014
@rcurtin
Copy link
Member Author

rcurtin commented Dec 30, 2014

Commented by yashdv on 15 Jul 44187601 04:20 UTC
I have edited the template for BinarySpaceTree to include an additional parameter SplitType.

The current splitting procedure is now implemented by the HalfSplit class. I have removed GetSplitIndex from BinarySpaceTree, but not SplitNode because it is used by the constructors when building the tree recursively.
Additionally, there is some code in SplitNode which sets some tree variables, so we don't want that part removed.

We could shift the code in SplitNode in the constructors but it would lead to code duplication. The splitting part in SplitNode is now done via SplitType's methods.

SplitType does not need any information about the tree except for the dataset it is using. I have also included bound, because it helps in the splitting process (eg: to calculate splitDimension).

Regarding the kind of interface SplitType needs to have, I think it is essential to provide methods that tell us about the split dimension, split column and whether the points where actually split.

Any suggestions on a better interface/implementation are welcome :)

@rcurtin
Copy link
Member Author

rcurtin commented Dec 30, 2014

Commented by yashdv on 9 Sep 44187613 15:24 UTC
For now I have placed half_split.hpp and half_split_impl.hpp in the same directory as binary_space_tree.hpp. Is that the appropriate location ? Thanks.

@rcurtin
Copy link
Member Author

rcurtin commented Dec 30, 2014

Commented by yashdv on 14 Aug 44206044 16:27 UTC
Removed internal members from HalfSplit and made member functions static to make the class more simpler.

@rcurtin
Copy link
Member Author

rcurtin commented Dec 30, 2014

Commented by rcurtin on 24 Jun 44229558 00:56 UTC
Hi Yash,

Thank you for the patch. I've changed the name of HalfSplit to MeanSplit and committed it in r16373. Sorry it took so long.

Thanks,

Ryan

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant