New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Universal B tree implementation #746
Changes from 1 commit
2930a75
fe5f7e9
8c5a97d
c502ea8
9bd634a
2c5eb57
3aee6e8
f17843f
9bcd066
3cd0d34
43a065b
94f4fa2
d3563e9
616f117
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -528,11 +528,12 @@ class BinarySpaceTree | |
* @param oldFromNew Vector which will be filled with the old positions for | ||
* each new point. | ||
*/ | ||
size_t PerformSplit(MatType& data, | ||
const size_t begin, | ||
const size_t count, | ||
const typename UBTreeSplit<BoundType<MetricType>, | ||
MatType>::SplitInfo& splitInfo); | ||
size_t PerformSplit( | ||
MatType& data, | ||
const size_t begin, | ||
const size_t count, | ||
const typename UBTreeSplit<BoundType<MetricType>, | ||
MatType>::SplitInfo& splitInfo); | ||
|
||
/** | ||
* An overload for the universal B tree. For the first time the function | ||
|
@@ -548,12 +549,13 @@ class BinarySpaceTree | |
* @param oldFromNew Vector which will be filled with the old positions for | ||
* each new point. | ||
*/ | ||
size_t PerformSplit(MatType& data, | ||
const size_t begin, | ||
const size_t count, | ||
const typename UBTreeSplit<BoundType<MetricType>, | ||
MatType>::SplitInfo& splitInfo, | ||
std::vector<size_t>& oldFromNew); | ||
size_t PerformSplit( | ||
MatType& data, | ||
const size_t begin, | ||
const size_t count, | ||
const typename UBTreeSplit<BoundType<MetricType>, | ||
MatType>::SplitInfo& splitInfo, | ||
std::vector<size_t>& oldFromNew); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I fixed spaces. But I am not sure that you do not mean the 557th line. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's fine as-is now, thanks for the fix. :) |
||
|
||
/** | ||
* Update the bound of the current node. This method does not take into | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -38,6 +38,33 @@ | |
namespace mlpack { | ||
namespace bound { | ||
|
||
/** | ||
* The CellBound class describes a bound that consists of a number of | ||
* hyperrectangles. These hyperrectangles do not overlap each other. The bound | ||
* is limited by an outer hyperrectangle and two addresses, the lower address | ||
* and the high address. Thus, the bound contains all points included between | ||
* the lower and the high addresses. The class caches the minimum bounding | ||
* rectangle, the lower and the high addresses and the hyperrectangles | ||
* that are described by the addresses. | ||
* | ||
* The notion of addresses is described in the following paper. | ||
* @code | ||
* @inproceedings{bayer1997, | ||
* author = {Bayer, Rudolf}, | ||
* title = {The Universal B-Tree for Multidimensional Indexing: General | ||
* Concepts}, | ||
* booktitle = {Proceedings of the International Conference on Worldwide | ||
* Computing and Its Applications}, | ||
* series = {WWCA '97}, | ||
* year = {1997}, | ||
* isbn = {3-540-63343-X}, | ||
* pages = {198--209}, | ||
* numpages = {12}, | ||
* publisher = {Springer-Verlag}, | ||
* address = {London, UK, UK}, | ||
* } | ||
* @endcode | ||
*/ | ||
template<typename MetricType = metric::LMetric<2, true>, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could you add some documentation here for the class itself? You can reuse what you've written at the top of the file. It might also be useful to point out that you are caching the minimum bounding rectangle of the points here. |
||
typename ElemType = double> | ||
class CellBound | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you don't mind, I'd like to add some documentation on what exactly is going on in this function. I spent a long time reading it, and it appears that really all we are doing (from a high level) is reordering the bits in the floating point number. So in the one-dimensional case, we have a number of the form
but your code transforms it roughly to
(but not exactly, since some modification of the mantissa may be necessary). In the multi-dimensional case, after we transform the representation, we have to interleave the bits of the new representation across all of the elements in the address vector. Is that correct? If so I can add those comments (or you can add them if you like, maybe it is better if you add them since you are more familiar with the code).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, it's correct. In one dimensional case this transform preserves the ordering (lower addresses correspond lower points). The function looks like
DiscreteHilbertValue::CalculateValue()
. I introduced that since it is faster than recursive calculation (indeed,RecursiveHilbertValue
was slower thanDiscreteHilbertValue
).I'll add the comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, thanks. Sorry my intuition was so wrong with the recursive Hilbert value calculation, I think my ideas wasted a lot of your time there. :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no problem:) I was not sure of
DiscreteHilbertValue
and that was an interesting experiment.